Published April 1, 2020 | Version 1.0.0
Dataset Open

Expanded Groove MIDI Dataset

  • 1. Google
  • 2. Google Research

Description

The Expanded Groove MIDI Dataset (E-GMD) is a large dataset of human drum performances, with audio recordings annotated in MIDI. E-GMD contains 444 hours of audio from 43 drum kits and is an order of magnitude larger than similar datasets. It is also the first human-performed drum dataset with annotations of velocity.

Additional information is available on the Magenta website: The Expanded Groove MIDI Dataset

If you use the E-GMD dataset in your work, please cite the paper where it was introduced:

Lee Callender, Curtis Hawthorne, and Jesse Engel. "Improving Perceptual Quality of Drum Transcription with the Expanded Groove MIDI Dataset." 2020. arXiv:2004.00188.

You can also use the following BibTeX entry:

@misc{callender2020improving,
    title={Improving Perceptual Quality of Drum Transcription with the Expanded Groove MIDI Dataset},
    author={Lee Callender and Curtis Hawthorne and Jesse Engel},
    year={2020},
    eprint={2004.00188},
    archivePrefix={arXiv},
    primaryClass={cs.SD}
}

Please also make sure to specify which version of the dataset you are using.

Files

e-gmd-v1.0.0.csv

Files (96.4 GB)

Size      MD5 checksum
9.6 MB    a212cb4b1aec205ef1882d9f9bb6150a
96.4 GB   510af23329b0472a8349d1aaf8fb98dd
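
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using only Python's standard library. The function names `md5_of_file` and `verify_download` are illustrative and not part of any E-GMD tooling, and the file is read in chunks on the assumption that the larger archive will not fit comfortably in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 digest matches the published checksum."""
    # Normalise the expected value so stray whitespace or case does not matter.
    return md5_of_file(path) == expected_md5.strip().lower()
```

To check a download, call `verify_download` with the local path where the file was saved (a path you choose yourself) and the checksum published for that file, for example `verify_download(local_path, "510af23329b0472a8349d1aaf8fb98dd")` for the 96.4 GB file. A result of `False` suggests the file should be downloaded again.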