rdb_100_claraprints
Description
This dataset extends the Rondo DB dataset_rdb_100_20200115.json.gz (checksum 49540f7855bed26cdaa28ef038d16321) with the following information:
- Chord extraction with algorithm Chordino as defined in Mauch, Matthias and Dixon, Simon. Approximate Note Transcription for the Improved Identification of Difficult Chords, Proc. of the 11th International Society for Music Information Retrieval Conference (ISMIR 2010), 2010 and available as a Vamp plugin (accessible in Python) here: http://www.isophonics.net/nnls-chroma
- Chord extraction with algorithm Crema as defined in McFee, Brian, Juan Pablo Bello. Structured training for large-vocabulary chord recognition. In ISMIR, 2017 and available here: https://github.com/bmcfee/crema
- Melody extraction with algorithm Melodia as defined in J. Salamon and E. Gómez. Melody Extraction from Polyphonic Music Signals using Pitch Contour Characteristics, IEEE Transactions on Audio, Speech and Language Processing, 20(6):1759-1770, Aug. 2012 and demonstrated here: https://github.com/justinsalamon/melodia_python_tutorial/blob/master/melodia_python_tutorial.ipynb
- Melody extraction with algorithm Piptrack as defined in https://librosa.github.io/librosa/generated/librosa.core.piptrack.html
Chord and melody extraction are performed on the first 120 seconds of referenced recordings.
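The checksum quoted above for the source Rondo DB file can be checked before working with the data. The sketch below assumes the checksum is an MD5 digest (it is 32 hexadecimal characters long, but the description does not name the algorithm); the function names are illustrative only.

```python
import hashlib

# Checksum quoted in the description; assumed here to be an MD5 digest.
EXPECTED_MD5 = "49540f7855bed26cdaa28ef038d16321"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Compare a file's digest with the expected one, ignoring letter case."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_expected("dataset_rdb_100_20200115.json.gz")` on a local copy of the source file would then return `True` only if the download is intact and the MD5 assumption holds.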
This dataset uses the format JAMS as described in E. J. Humphrey, J. Salamon, O. Nieto, J. Forsyth, R. M. Bittner, & J. P. Bello. JAMS: A JSON Annotated Music Specification for Reproducible MIR Research. In ISMIR (pp. 591-596), October 2014 and here: https://github.com/marl/jams.
Here is an example of a document in this JSON file:
{
"chord": [
{
"data": {
"annotation_metadata": {
"annotator": {},
"annotation_tools": "nnls-chroma:chordino",
"version": "",
"annotation_rules": "",
"data_source": "program",
"corpus": "",
"validation": "",
"curator": {
"name": "",
"email": ""
}
},
"sandbox": {},
"data": [
{
"time": 0.185759637,
"confidence": null,
"duration": null,
"value": "N"
},
// a lot of data
]
}
},
{
"data": [
{
"annotation_metadata": {
"annotator": {},
"annotation_tools": "CREMA 0.1.0",
"version": "d65ffd9.0",
"annotation_rules": "",
"data_source": "program",
"corpus": "",
"validation": "",
"curator": {
"name": "",
"email": ""
}
},
"data": [
{
"time": 0.0,
"confidence": 0.2663787007331848,
"duration": 0.09287981859410431,
"value": "G:min"
}
// a lot of data
],
"namespace": "chord",
"time": 0,
"duration": 118.1,
"sandbox": {}
}
]
}
],
"melody": [
{
"data": [
{
"value": [
-201.7408905029297,
-199.42369079589844,
-199.42369079589844,
// a lot of data
],
"time": [
0.023219954648526078,
0.026122448979591838,
0.029024943310657598,
// a lot of data
]
}
],
"annotation_metadata": {}
},
{
"data": [
{
"value": [
402.3492126464844,
403.3961181640625,
393.2862548828125,
// a lot of data
],
"time": [
0.023219954648526078,
0.026122448979591838,
0.029024943310657598,
// a lot of data
]
}
],
"annotation_metadata": {}
}
],
"file_metadata": {
"version": "1.0",
"identifiers": {
"youtube_id": "0RUhgsuDDe8",
"rondodb_piece_id": 82,
"rondodb_movement_id": null
},
"artist": "Ludwig Van Beethoven",
"title": "Ludwig Van Beethoven: Symphony No.5 in C minor, Op. 67",
"release": null,
"duration": 120
},
"sandbox": {
"catalogue_number": "Op. 67",
"composer": {
"birth_date": "1770-12-17",
"death_date": "1827-03-26",
"name": "Ludwig Van Beethoven",
"rdb_id_people": 13
},
"composition_year": 1808,
"name": "Symphony",
"number": "5",
"piece_full_name": "Symphony No.5 in C minor, Op. 67",
"piece_full_name_with_composer": "Ludwig Van Beethoven: Symphony No.5 in C minor, Op. 67",
"rdb_id_piece": 82,
"recordings": [
{
"is_live": false,
"start_at": 0,
"url": "https://www.youtube.com/watch?v=0RUhgsuDDe8",
"claraprints": {
"120s_chords_chordino": "ejdmmfmfmeahcifmmmfmfmfmffmcmfmfmfmfmcmeimcceicmjihffmfmmffmfmlejdmmfmfmeahcimimmibmfmf",
"120s_chords_crema": "ejdbfmfmfmeahcimmdmmmfmffmcmfmfmfmceceimccjlmmimmimfmjdbfmfmfmelfhcdnammdmmmfmf",
"120s_melody_piptrack": "szsxr$$pyry$yy$t$ptxx$rywxx$ryrs$zsotrrt$t%qt$t$rrqxrbqrrqxqosz$sqxy$$xxwxq$szqqxpzsxrszsqxqtpwpwpw$xyryxt$tpzzyrs$xz$xqtzsxq$oyqxtszq$t$obtxtx$qb%ry$yryrbzszwxyrp$ostqxxbryxxbysqxz$yrq%y$t%yrryryrp$rbt$xtw$bzqxqsz%wwxszq$xtszz$stqstszqxqst$t$t$tsz$tsz$tsz$t$t$zst$xrtttptt$yyrrt$ywxq$wytt$wyr$tsqt$xqtt$bqt$rrszr$sqrzzqpwosqsqxqr$szsqxxtpwtxstqstszszw$s",
"120s_melody_melodia": "sxr$sqx$oyy%qrbqx$qzqbz$tt$$wsqtpwpwywxyqxywpz$xqobxbozsqxqyoqxqq%otrzszbxt$troxobosoyzqsobxtxos$t$t$t$t$t$zsxrbxwsqtbqxybzbxwqxszw$sq",
"30s_chords_chordino": "ejdmmfmfmeahcifmmmfmf",
"30s_chords_crema": "ejdbfmfmfmeahcimmdmm",
"30s_melody_piptrack": "szsxr$$pyry$yy$t$ptxx$rywxx$ryrs$zsotrrt$t%qt$t$rrqxrbqrrqxqosz$sqxy$$xxwxq$szqqxpzsxrszsqxqtpwpwpw$xyryxt$tpzzyrs$xz$xqtzsxq$oyqxtszq$t$obtxtx$qb%ry$yryrbzszwxyrp$ostqxxbryxxbysqxz$yrq%y$t%yrryryrp$rbt$xtw$bzqxqsz%wwxszq$xtszz$stqstszqxqst$t$t$tsz$tsz$tsz$t$t$zst$xrtttptt$yyrrt$ywxq$wytt$wyr$tsqt$xqtt$bqt$rrszr$sqrzzqpwosqsqxqr$szsqxxtpwtxstqstszszw$s",
"30s_melody_melodia": "sxr$sqx$oyy%qrbqx$qzq"
}
},
// 4 other recordings
],
"tonality": "C Minor",
"wikipedia_url": "https://en.wikipedia.org/wiki/Symphony_No._5_(Beethoven)"
}
},
The `chord` and `melody` sections contain the result of the automatic extraction according to each algorithm.
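Each melody entry in the example stores parallel `time` and `value` arrays, with pitch values that look like frequencies in Hz. One of the entries contains negative values; in Melodia's output convention, non-positive values mark frames judged unvoiced. The sketch below assumes that convention applies here and that values are in Hz. It pairs each time with its pitch and converts voiced frames to MIDI note numbers. The function name is hypothetical.

```python
import math


def voiced_midi_notes(melody_item):
    """Return (time, midi_note) pairs for the voiced frames of one melody item.

    Assumptions: `melody_item` is one element of a melody "data" list with
    parallel "time" and "value" arrays, values are in Hz, and non-positive
    values mark unvoiced frames.
    """
    notes = []
    for t, hz in zip(melody_item["time"], melody_item["value"]):
        if hz is None or hz <= 0:
            continue  # skip frames treated as unvoiced
        midi = 69 + 12 * math.log2(hz / 440.0)  # A4 = 440 Hz = MIDI note 69
        notes.append((t, round(midi, 2)))
    return notes
```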
The `sandbox` section contains the original data from the source dataset (Rondo DB), plus a `claraprints` section in each recording that holds claraprints for different durations and algorithms. For example, `30s_chords_chordino` is the claraprint for the first 30 seconds of this recording using the Chordino algorithm.
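As a rough sketch of how these keys could be read, the code below splits a claraprint key into duration, feature, and algorithm, and walks the recordings of one parsed document. It assumes every key follows the `<seconds>s_<feature>_<algorithm>` pattern seen in the example and that the document has already been loaded into a Python dict; both helper names are hypothetical.

```python
def parse_claraprint_key(key):
    """Split a key such as "30s_chords_chordino" into (seconds, feature, algorithm)."""
    duration, feature, algorithm = key.split("_", 2)
    if not duration.endswith("s"):
        raise ValueError(f"unexpected duration prefix in {key!r}")
    return int(duration[:-1]), feature, algorithm


def iter_claraprints(document):
    """Yield (url, seconds, feature, algorithm, claraprint) for every recording."""
    for recording in document.get("sandbox", {}).get("recordings", []):
        for key, value in recording.get("claraprints", {}).items():
            seconds, feature, algorithm = parse_claraprint_key(key)
            yield recording.get("url"), seconds, feature, algorithm, value
```

Filtering the yielded tuples on `seconds == 30` and `algorithm == "crema"`, for instance, would give the 30-second Crema claraprint of each recording.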
The original fields of the sandbox data are described on https://www.rondodb.com/about and are copied below:
This dataset contains 100 such works. Here is a detailed description of each field:
• rdb_id_piece (100/100): This is the unique piece ID of this work in Rondo DB. If you prefix it with https://www.rondodb.com/piece/ you will reach its official page on Rondo DB.
• name (100/100): The raw name of the piece. As most classical pieces do not have titles, this is generally the form of the piece, as here: "Symphony".
• number (37/100): Applicable to pieces with numbering. Only 32 of them in this dataset. Otherwise the field is absent.
• piece_full_name (100/100): Generally accepted name for the given piece. Contains catalog information, subname and tonality.
• piece_full_name_with_composer (100/100): Same as piece_full_name but prefixed with the composer's first and last name and a colon.
• catalog_number (79/100): The full designation under which this piece is generally catalogued. Despite the name of the field, this is not a number but text, such as "Op. 37".
• tonality (57/100): The full tonality (key and mode) of the given piece. Flats and sharps are written like "E-flat" or "C-sharp".
• composition_year (91/100): The four-digit year in which the piece was composed.
• wikipedia_url (82/100): The full URL of the English Wikipedia page describing this work.
• rdb_id_movement (26/100): The ID of a movement on Rondo DB. By prefixing this ID with URL https://www.rondodb.com/movement/ you will go to the unique movement page on Rondo DB. A "movement" can also be a "part" in a piece edited with several parts, like a book of preludes, études, or arias in an opera.
• movement_full_name (26/100): The name of a movement, which can contain the number of the movement, like in "5. Andaluza". The movement name is part of the piece_full_name and piece_full_name_with_composer as described above.
composer (100/100) is a nested object and contains the following fields:
• rdb_id_people (100/100): The Rondo DB ID of this composer. By prefixing this ID with URL https://www.rondodb.com/people/ you will go to the unique composer page on Rondo DB.
• birth_date (100/100): The date of birth in format YYYY-MM-DD.
• death_date (100/100): The date of death in format YYYY-MM-DD.
• name (100/100): The full name of the composer like "Heitor Villa-Lobos".
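The ID fields above map to Rondo DB pages by simple prefixing. A minimal sketch, assuming the `sandbox` object has been parsed into a Python dict and that missing IDs are either absent or null (the helper name is hypothetical):

```python
PIECE_URL = "https://www.rondodb.com/piece/"
MOVEMENT_URL = "https://www.rondodb.com/movement/"
PEOPLE_URL = "https://www.rondodb.com/people/"


def rondodb_links(sandbox):
    """Build Rondo DB page URLs from the IDs found in one sandbox object."""
    links = {}
    if sandbox.get("rdb_id_piece") is not None:
        links["piece"] = f"{PIECE_URL}{sandbox['rdb_id_piece']}"
    if sandbox.get("rdb_id_movement") is not None:
        links["movement"] = f"{MOVEMENT_URL}{sandbox['rdb_id_movement']}"
    composer_id = (sandbox.get("composer") or {}).get("rdb_id_people")
    if composer_id is not None:
        links["composer"] = f"{PEOPLE_URL}{composer_id}"
    return links
```

For the Beethoven example, where `rdb_id_piece` is 82 and `rdb_id_people` is 13, this yields a piece link and a composer link but no movement link.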
recordings (100/100) contains a list of five recording objects. Each recording contains the following fields:
• is_live (500/500) indicates whether the recording was performed live. A live recording can contain applause before or after the piece, an orchestra tuning, or noise during the performance.
• start_at (500/500) the time, in seconds, at which the piece actually starts. The music begins less than a second after this point. The instruction given to the annotator was: "place the cursor just before the piece actually starts".
• url (500/500) the full URL of the YouTube video. At the time of the creation of the dataset, all recordings were publicly available.
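The counts in parentheses, such as (79/100) or (500/500), can be recomputed from the data. The sketch below assumes they count the objects in which a field is present and non-null, and that the `sandbox` objects have been collected into a Python list; the function names are hypothetical.

```python
def field_coverage(records, field):
    """Return (present, total): how many records carry a non-null `field`."""
    present = sum(1 for record in records if record.get(field) is not None)
    return present, len(records)


def all_recordings(sandboxes):
    """Flatten the `recordings` lists of several sandbox objects into one list."""
    return [rec for sandbox in sandboxes for rec in sandbox.get("recordings", [])]
```

For example, `field_coverage(all_recordings(sandboxes), "url")` would return the pair behind a count like (500/500), under the assumptions above.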
Files
(170.6 MB)
| Name | Size |
|---|---|
| md5:ba83e135c180cd01f4b3758addac43d7 | 170.6 MB |
Additional details
Related works
- Continues
- Dataset: https://www.rondodb.com/datasets/dataset_rdb_100_20200115.json.gz (URL)