Published August 24, 2024 | Version 1.0
Dataset Restricted

4 Bars Monophonic Melodies Dataset (Pitch Sequence)

Description

This dataset is designed for applications in music information retrieval, algorithmic composition, and machine learning tasks involving symbolic music data. It consists of a collection of unique 4-bar monophonic melodies represented as MIDI pitch sequences, each accompanied by thirteen attributes obtained with computational methods.

The dataset has been generated using the Resolv system's pipelines, starting from the full version of the Lakh MIDI Dataset (a collection of 176,581 unique MIDI files). At the moment it can be downloaded only upon request (just send an email). The full article of this work contains all the details on how the attributes have been obtained and on the implementation of the pipelines used for the generation.

Melodies have been quantized to 4 steps per quarter note and only 4/4 time signatures have been considered, hence each melody consists of N = 64 steps (4 bars × 4 quarter notes × 4 steps). Each step is either a number in the range [21, 108], the MIDI pitches available on a standard piano, or a token in the set {128, 129}, for hold note and note off events respectively. No additional performance features (e.g., dynamics, duration, or timing) are included, making this dataset a purely pitch-based collection.
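
The step encoding described above can be illustrated with a small decoder. This is only a sketch under stated assumptions, not the Resolv implementation: it assumes that a pitch token marks a note onset, that a hold token (128) extends the most recent onset by one step, and that a note off token (129) ends it. The function name `decode_melody` and the toy melody are hypothetical.

```python
HOLD_NOTE = 128  # token: the previous pitch keeps sounding for one more step
NOTE_OFF = 129   # token: no note is played at this step
MIN_PITCH, MAX_PITCH = 21, 108  # pitch range of a standard piano


def decode_melody(pitch_seq):
    """Convert a step sequence into (pitch, start_step, length_in_steps) notes.

    Assumption: a pitch token is an onset and hold tokens extend it.
    """
    notes = []
    current = None  # [pitch, start, length] of the note currently sounding
    for step, token in enumerate(pitch_seq):
        if MIN_PITCH <= token <= MAX_PITCH:
            if current is not None:
                notes.append(tuple(current))
            current = [token, step, 1]
        elif token == HOLD_NOTE:
            # A hold with no active note is treated as silence (assumption).
            if current is not None:
                current[2] += 1
        elif token == NOTE_OFF:
            if current is not None:
                notes.append(tuple(current))
                current = None
        else:
            raise ValueError(f"unexpected token {token} at step {step}")
    if current is not None:
        notes.append(tuple(current))
    return notes


# Toy 64-step melody: C4 for a quarter note, D4 for an eighth note, then silence.
melody = [60, 128, 128, 128, 62, 128] + [NOTE_OFF] * 58
print(decode_melody(melody))  # prints [(60, 0, 4), (62, 4, 2)]
```

With 4 steps per quarter note, a length of 4 steps corresponds to one quarter note and a full melody always spans 64 steps.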

Three datasets (train, validation and test) are provided as TFRecord files divided into 8 shards. The records use TensorFlow's SequenceExample format, in which the feature_lists field contains the pitch sequence as a list of integers and the context field contains its attributes.
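
To make the record layout concrete, the sketch below builds a toy SequenceExample with the same two fields. It is an illustration under assumptions rather than the code used to generate the dataset: it assumes each step is stored as a single int64 feature under the pitch_seq key and that attributes are stored as floats. The helper `make_sequence_example` and the attribute value are hypothetical placeholders.

```python
import tensorflow as tf


def make_sequence_example(pitch_seq, attributes):
    """Build a SequenceExample: pitches in feature_lists, attributes in context."""
    context = tf.train.Features(feature={
        key: tf.train.Feature(float_list=tf.train.FloatList(value=[value]))
        for key, value in attributes.items()
    })
    # Assumption: one int64 feature per step of the melody.
    steps = [
        tf.train.Feature(int64_list=tf.train.Int64List(value=[pitch]))
        for pitch in pitch_seq
    ]
    feature_lists = tf.train.FeatureLists(feature_list={
        "pitch_seq": tf.train.FeatureList(feature=steps),
    })
    return tf.train.SequenceExample(context=context, feature_lists=feature_lists)


# Toy 64-step melody: one held C4 followed by silence.
toy_melody = [60] + [128] * 15 + [129] * 48
# The attribute value is an arbitrary placeholder, not a computed one.
example = make_sequence_example(toy_melody, {"note_density": 0.5})
serialized_example = example.SerializeToString()
```

The resulting `serialized_example` has the same shape as one record read from a TFRecord shard, so it can be used to try out parsing code without access to the restricted files.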

The table below shows the numbers of unique melodies contained in the three datasets.

                      | Train      | Validation | Test
Total unique melodies | 10,126,676 | 70,908     | 22,265

 

The following attributes are computed for each melody:

Attribute Name | SequenceExample Context Key | Description
Toussaint Metrical Complexity | toussaint | A metric that measures the degree of syncopation in rhythm patterns.
Note Density | note_density | Measures the density of note onsets within the melody.
Pitch Range | pitch_range | An indicator of how wide or narrow the melody is in terms of its pitch content.
Contour | contour | Measures the degree to which the melody moves up or down.
Note Change Ratio | note_change_ratio | The number of note changes normalized to the total number of steps N.
Dynamic Range | dynamic_range | The difference between the maximum and minimum note velocities.
Longest Repetitive Section | len_longest_rep_section | The length of the longest repetitive section in the melody normalized to the total number of steps N. A repetitive section is defined as a note that consecutively repeats at least r = 4 times.
Repetitive Section Ratio | repetitive_section_ratio | The ratio between the total number of repetitive sections and a normalization factor N/r = 64/4 = 16.
Hold Note Steps Ratio | ratio_hold_note_steps | The ratio between the number of steps where a note is held and the total number of steps N.
Note Off Steps Ratio | ratio_note_off_steps | The ratio between the number of steps where no note is played and the total number of steps N.
Unique Notes Ratio | unique_notes_ratio | The ratio of unique notes, defined with respect to the total number of MIDI pitches considered (88) and the total number of steps N.
Unique Bigrams Ratio | unique_bigrams_ratio | The ratio of unique bigrams in the melody to the total number of steps N.
Unique Trigrams Ratio | unique_trigrams_ratio | The ratio of unique trigrams in the melody to the total number of steps N.
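
A few of the attributes above can be sketched directly from their descriptions. The code below is one possible reading, not the implementation documented in the full article. In particular, it assumes that a repetitive section is counted over consecutive note onsets, ignoring hold and note off steps. All function names are hypothetical.

```python
HOLD_NOTE, NOTE_OFF = 128, 129
N_STEPS = 64      # N in the table
MIN_REPEATS = 4   # r in the table


def hold_note_steps_ratio(seq):
    """Share of steps that carry the hold token."""
    return seq.count(HOLD_NOTE) / len(seq)


def note_off_steps_ratio(seq):
    """Share of steps where no note is played."""
    return seq.count(NOTE_OFF) / len(seq)


def repetitive_sections(seq, r=MIN_REPEATS):
    """Lengths of runs in which the same pitch is struck at least r times in a row.

    Assumption: runs are measured over onsets only (hold/off steps are skipped).
    """
    onsets = [t for t in seq if t not in (HOLD_NOTE, NOTE_OFF)]
    runs, run_len, prev = [], 0, None
    for pitch in onsets:
        if pitch == prev:
            run_len += 1
        else:
            if run_len >= r:
                runs.append(run_len)
            run_len, prev = 1, pitch
    if run_len >= r:
        runs.append(run_len)
    return runs


def len_longest_rep_section(seq):
    """Longest repetitive section, normalized to the number of steps."""
    return max(repetitive_sections(seq), default=0) / len(seq)


def repetitive_section_ratio(seq, r=MIN_REPEATS):
    """Number of repetitive sections divided by N / r."""
    return len(repetitive_sections(seq, r)) / (len(seq) / r)


# Toy melody: C4 struck 5 times, 3 hold steps, D4 struck 4 times with rests.
seq = [60] * 5 + [HOLD_NOTE] * 3 + [62, NOTE_OFF] * 4 + [NOTE_OFF] * 48
print(hold_note_steps_ratio(seq))     # 3 / 64
print(note_off_steps_ratio(seq))      # 52 / 64
print(len_longest_rep_section(seq))   # 5 / 64
print(repetitive_section_ratio(seq))  # 2 / 16
```

Values computed this way may differ from the stored context values if the pipelines define onsets or repetitions differently, so the stored attributes remain the reference.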

 

To access the content of a SequenceExample, use tf.io.parse_single_sequence_example, for instance:

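import tensorflow as tf

# serialized_example: one serialized SequenceExample read from a TFRecord shard
context, sequence = tf.io.parse_single_sequence_example(
    serialized_example,
    context_features={
        "toussaint": tf.io.FixedLenFeature([], dtype=tf.float32, default_value=0.0),
        "note_density": tf.io.FixedLenFeature([], dtype=tf.float32, default_value=0.0),
    },
    # sequence_features must be a dict; int64 is assumed for the integer pitch sequence
    sequence_features={
        "pitch_seq": tf.io.FixedLenSequenceFeature([], dtype=tf.int64),
    },
)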
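
For training, the same parsing call can be wrapped in a tf.data pipeline that reads every shard of a split. The following is a minimal sketch, assuming that all thirteen attributes are stored as float32 context features and that pitches are stored as int64. The function names and the file pattern in the usage comment are hypothetical, since the shard file names are not stated here.

```python
import tensorflow as tf

# Context keys listed in the attribute table.
CONTEXT_KEYS = [
    "toussaint", "note_density", "pitch_range", "contour",
    "note_change_ratio", "dynamic_range", "len_longest_rep_section",
    "repetitive_section_ratio", "ratio_hold_note_steps",
    "ratio_note_off_steps", "unique_notes_ratio",
    "unique_bigrams_ratio", "unique_trigrams_ratio",
]


def parse_melody(serialized_example):
    """Parse one record into (pitch_seq, attributes)."""
    context, sequence = tf.io.parse_single_sequence_example(
        serialized_example,
        context_features={
            # Assumption: every attribute is a scalar float32.
            key: tf.io.FixedLenFeature([], dtype=tf.float32, default_value=0.0)
            for key in CONTEXT_KEYS
        },
        sequence_features={
            "pitch_seq": tf.io.FixedLenSequenceFeature([], dtype=tf.int64),
        },
    )
    return sequence["pitch_seq"], context


def load_split(file_pattern, batch_size=32):
    """Read all shards matching file_pattern and return batches of melodies."""
    files = tf.data.Dataset.list_files(file_pattern, shuffle=False)
    dataset = tf.data.TFRecordDataset(files)
    dataset = dataset.map(parse_melody, num_parallel_calls=tf.data.AUTOTUNE)
    return dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)


# Usage (hypothetical path): train = load_split("path/to/train-*.tfrecord")
```

Because every melody has exactly 64 steps, plain batching works without padding. If an attribute turns out to be stored with a different type, the corresponding entry in context_features needs to be adjusted.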

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Additional details

Software

Repository URL
https://github.com/resolv-libs
Programming language
Python
Development Status
WIP