Dataset Open Access

OpenMIC-2018

Humphrey, Eric J.; Durand, Simon; McFee, Brian

The OpenMIC-2018 dataset is made available through a collaboration between Spotify and MARL@NYU. Spotify also sponsored the cost of annotation; its contributions to open-source research can be found online at its developer site, engineering blog, and public GitHub.

If you use this dataset, please cite the following work:

Humphrey, Eric J., Durand, Simon, and McFee, Brian. "OpenMIC-2018: An Open Dataset for Multiple Instrument Recognition." In Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018.

The dataset is made available by Spotify AB under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. The full terms of this license are included alongside this dataset.

This dataset contains the following:

  • 10-second snippets of audio, in a directory format like 'audio/{0:.3}/{0}.ogg'.format(sample_key), where the subdirectory is the first three characters of the sample key
  • VGGish features as JSON objects, in a directory format like 'vggish/{0:.3}/{0}.json'.format(sample_key)
  • MD5 checksums for each OGG and JSON file
  • Anonymized individual responses, in 'openmic-2018-individual-responses.csv'
  • Aggregated labels, in 'openmic-2018-aggregated-labels.csv'
  • Track metadata, with licenses for each audio recording, in 'openmic-2018-metadata.csv'
  • A Python-friendly NPZ file of features and labels, 'openmic-2018.npz'
  • Sample partitions for train and test, in 'partitions/*.txt'
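
The layout listed above can be navigated with a couple of small helpers. What follows is a minimal sketch, not part of the dataset's own tooling: the function names `sample_paths` and `read_partition` are hypothetical, the subdirectory is assumed to be the first three characters of the sample key, and partition files are assumed to hold one sample key per line.

```python
from pathlib import Path


def sample_paths(root, sample_key):
    """Return (audio_path, vggish_path) for one sample key.

    Assumption: files are grouped into subdirectories named after the
    first three characters of the sample key.
    """
    root = Path(root)
    prefix = sample_key[:3]
    audio = root / "audio" / prefix / f"{sample_key}.ogg"
    vggish = root / "vggish" / prefix / f"{sample_key}.json"
    return audio, vggish


def read_partition(path):
    """Read sample keys from a partition file.

    Assumption: the file holds one sample key per line; blank lines
    are skipped.
    """
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

A typical use would be to read the keys from one of the files under 'partitions/' and map each key to its audio and feature paths with `sample_paths`, where `root` is wherever the archive was unpacked.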

Files (2.6 GB)

Name: openmic-2018-v1.0.0.tgz
Size: 2.6 GB
MD5: e4ccf187e2bb5ab2e115416e8aafe7f4
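
Before unpacking, the downloaded archive can be checked against the MD5 digest in the listing above. The sketch below uses only Python's standard `hashlib` module; the helper names `md5_of_file` and `verify_md5` are hypothetical, and the same approach should apply to the per-file checksums shipped for the OGG and JSON files, whose exact file format is not described here.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a multi-gigabyte archive never has
        # to fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Return True if the file's MD5 digest matches `expected`.

    `expected` may carry an 'md5:' prefix; anything up to the first
    colon is dropped before comparing.
    """
    expected_hex = expected.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected_hex
```

For the archive listed here, the check would be `verify_md5("openmic-2018-v1.0.0.tgz", "md5:e4ccf187e2bb5ab2e115416e8aafe7f4")`, assuming the file sits in the current working directory.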
