Dataset Open Access

Expanded Groove MIDI Dataset

Callender, Lee; Hawthorne, Curtis; Engel, Jesse

The Expanded Groove MIDI Dataset (E-GMD) is a large dataset of human drum performances, with audio recordings annotated in MIDI. E-GMD contains 444 hours of audio from 43 drum kits and is an order of magnitude larger than similar datasets. It is also the first human-performed drum dataset with annotations of velocity.

Additional information is available on the Magenta website: The Expanded Groove MIDI Dataset

If you use the E-GMD dataset in your work, please cite the paper where it was introduced:

Lee Callender, Curtis Hawthorne, and Jesse Engel. "Improving Perceptual Quality of Drum Transcription with the Expanded Groove MIDI Dataset." 2020. arXiv:2004.00188.

You can also use the following BibTeX entry:

@misc{callender2020improving,
    title={Improving Perceptual Quality of Drum Transcription with the Expanded Groove MIDI Dataset},
    author={Lee Callender and Curtis Hawthorne and Jesse Engel},
    year={2020},
    eprint={2004.00188},
    archivePrefix={arXiv},
    primaryClass={cs.SD}
}

Please also make sure to specify which version of the dataset you are using.

Files (96.4 GB)

Name               Size      MD5
e-gmd-v1.0.0.csv   9.6 MB    a212cb4b1aec205ef1882d9f9bb6150a
e-gmd-v1.0.0.zip   96.4 GB   510af23329b0472a8349d1aaf8fb98dd
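
Because the archive is large, it may be worth checking a finished download against the MD5 values listed above before unpacking it. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `verify_download` are illustrative and not part of any E-GMD tooling, and the sketch assumes the files keep their original names.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing above.
EXPECTED_MD5 = {
    "e-gmd-v1.0.0.csv": "a212cb4b1aec205ef1882d9f9bb6150a",
    "e-gmd-v1.0.0.zip": "510af23329b0472a8349d1aaf8fb98dd",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use flat even for the 96.4 GB archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's MD5 matches the listed value for its name.

    Hypothetical helper: raises KeyError if the file name is not one of
    the two names in the listing.
    """
    path = Path(path)
    return md5_of_file(path) == EXPECTED_MD5[path.name]
```

For example, `verify_download("e-gmd-v1.0.0.zip")` would return `True` only if the local file is byte-identical to the published one. A `False` result usually points to an interrupted or corrupted download.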
