Dataset Open Access

OpenMIC-2018

Humphrey, Eric J.; Durand, Simon; McFee, Brian

The OpenMIC-2018 dataset is made available through a collaboration between Spotify and MARL@NYU. Spotify also sponsored the cost of annotation; its contributions to open-source research can be found online at the developer site, engineering blog, and public GitHub.

If you use this dataset, please cite the following work:

Humphrey, Eric J., Durand, Simon, and McFee, Brian. "OpenMIC-2018: An Open Dataset for Multiple Instrument Recognition." In Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018.

The dataset is made available by Spotify AB under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. The full terms of this license are included alongside this dataset.

This dataset contains the following:

  • 10-second snippets of audio, in a directory format like 'audio/{0:.3}/{0}.ogg'.format(sample_key)
  • VGGish features as JSON objects, in a directory format like 'vggish/{0:.3}/{0}.json'.format(sample_key)
  • MD5 checksums for each OGG and JSON file
  • Anonymized individual responses, in 'openmic-2018-individual-responses.csv'
  • Aggregated labels, in 'openmic-2018-aggregated-labels.csv'
  • Track metadata, with licenses for each audio recording, in 'openmic-2018-metadata.csv'
  • A Python-friendly NPZ file of features and labels, 'openmic-2018.npz'
  • Sample partitions for train and test, in 'partitions/*.txt'
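The directory layout described above can be sketched as a small path helper. This is a minimal illustration, not part of the dataset's own tooling: it assumes the subdirectory name is the first three characters of the sample key, and the function name `sample_paths`, the root folder name `openmic-2018`, and the example key are hypothetical. The key is sliced explicitly rather than relying on a format specifier.

```python
def sample_paths(sample_key, root="openmic-2018"):
    """Build the audio and VGGish paths for one sample key.

    Assumptions (not confirmed by the dataset description):
    - `root` is wherever the archive was extracted;
    - the subdirectory is the first three characters of the key.
    """
    prefix = sample_key[:3]  # explicit slice of the key
    audio = "{}/audio/{}/{}.ogg".format(root, prefix, sample_key)
    vggish = "{}/vggish/{}/{}.json".format(root, prefix, sample_key)
    return audio, vggish


if __name__ == "__main__":
    # Illustrative key only; real keys come from the CSV or partition files.
    for path in sample_paths("000046_3840"):
        print(path)
```

Building both paths from the same key keeps each audio snippet paired with its feature file.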

Files (2.6 GB)
Name: openmic-2018-v1.0.0.tgz
Size: 2.6 GB
MD5: e4ccf187e2bb5ab2e115416e8aafe7f4
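After downloading, the archive can be checked against the MD5 checksum listed for it. The following is a minimal sketch using only Python's standard library; the function names `md5_of_file` and `archive_is_intact` are hypothetical, and the local file path is assumed to be wherever the archive was saved.

```python
import hashlib

# Checksum copied from the file listing for openmic-2018-v1.0.0.tgz.
EXPECTED_MD5 = "e4ccf187e2bb5ab2e115416e8aafe7f4"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so a multi-gigabyte archive is never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

A call such as `archive_is_intact("openmic-2018-v1.0.0.tgz")` returns `True` only if the download matches the published checksum. The same `md5_of_file` helper could presumably be reused for the per-file checksums shipped with the dataset, although their file format is not described here.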