Dataset Open Access

OpenMIC-2018

Humphrey, Eric J.; Durand, Simon; McFee, Brian

The OpenMIC-2018 dataset is made available through a collaboration between Spotify and MARL@NYU. Spotify also sponsored the cost of annotation; its contributions to open-source research can be found online at its developer site, engineering blog, and public GitHub.

If you use this dataset, please cite the following work:

Humphrey, Eric J., Durand, Simon, and McFee, Brian. "OpenMIC-2018: An Open Dataset for Multiple Instrument Recognition." In Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018.

The dataset is made available by Spotify AB under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. The full terms of this license are included alongside this dataset.

This dataset contains the following:

  • 10-second snippets of audio, in a directory format like 'audio/{0:.3}/{0}.ogg'.format(sample_key)
  • VGGish features as JSON objects, in a directory format like 'vggish/{0:.3}/{0}.json'.format(sample_key)
  • MD5 checksums for each OGG and JSON file
  • Anonymized individual responses, in 'openmic-2018-individual-responses.csv'
  • Aggregated labels, in 'openmic-2018-aggregated-labels.csv'
  • Track metadata, with licenses for each audio recording, in 'openmic-2018-metadata.csv'
  • A Python-friendly NPZ file of features and labels, 'openmic-2018.npz'
  • Sample partitions for train and test, in 'partitions/*.txt'
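The directory formats in the list above can be sketched as a small path helper. This is a minimal illustration, not part of the dataset's own tooling: it assumes the first path component is meant to be the first three characters of the sample key, and both the function name `sample_paths` and the example key are hypothetical.

```python
def sample_paths(sample_key: str) -> dict:
    """Return the relative audio and VGGish paths for one sample.

    Assumption: files are grouped into subdirectories named after the
    first three characters of the sample key.
    """
    shard = sample_key[:3]
    return {
        "audio": f"audio/{shard}/{sample_key}.ogg",
        "vggish": f"vggish/{shard}/{sample_key}.json",
    }


# Hypothetical sample key, used only to show the shape of the result.
paths = sample_paths("000046_3840")
print(paths["audio"])
print(paths["vggish"])
```

Joining these relative paths onto the directory where the archive was extracted should give the location of each file, provided the layout assumption holds.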

Files (2.6 GB)

Name: openmic-2018-v1.0.0.tgz
Size: 2.6 GB
MD5: e4ccf187e2bb5ab2e115416e8aafe7f4
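Because the archive is large, it may be worth checking the download against the MD5 digest listed for it before extracting. The following is a minimal sketch using only Python's standard `hashlib`; the helper names `md5_of_file` and `archive_is_intact` are illustrative, and the local file path is whatever name the download was saved under.

```python
import hashlib

# Digest published alongside openmic-2018-v1.0.0.tgz in the file listing.
EXPECTED_MD5 = "e4ccf187e2bb5ab2e115416e8aafe7f4"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so the whole archive never has to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Return True when the file's digest matches the expected one."""
    return md5_of_file(path) == expected.lower()
```

For example, `archive_is_intact("openmic-2018-v1.0.0.tgz")` would return `False` for a truncated or corrupted download, in which case the file should be fetched again.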
