Dataset Open Access

OpenMIC-2018

Humphrey, Eric J.; Durand, Simon; McFee, Brian

The OpenMIC-2018 dataset is made available through a collaboration between Spotify and MARL@NYU. Additionally, the cost of annotation was sponsored by Spotify, whose contributions to open-source research can be found online at the developer site, engineering blog, and public GitHub.

If you use this dataset, please cite the following work:

Humphrey, Eric J., Durand, Simon, and McFee, Brian. "OpenMIC-2018: An Open Dataset for Multiple Instrument Recognition." In Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018.

The dataset is made available by Spotify AB under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. The full terms of this license are included alongside this dataset.

This dataset contains the following:

  • 10-second snippets of audio, in a directory format like 'audio/{0:.3}/{0}.ogg'.format(sample_key)
  • VGGish features as JSON objects, in a directory format like 'vggish/{0:.3}/{0}.json'.format(sample_key)
  • MD5 checksums for each OGG and JSON file
  • Anonymized individual responses, in 'openmic-2018-individual-responses.csv'
  • Aggregated labels, in 'openmic-2018-aggregated-labels.csv'
  • Track metadata, with licenses for each audio recording, in 'openmic-2018-metadata.csv'
  • A Python-friendly NPZ file of features and labels, 'openmic-2018.npz'
  • Sample partitions for train and test, in 'partitions/*.txt'
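The directory layout above can be sketched in Python as follows. This is a minimal illustration, not code shipped with the dataset: it assumes the subdirectory is the first three characters of the sample key (what `'{0:.3}'.format(sample_key)` produces), and the function name `sample_paths`, the root folder `openmic-2018`, and the example key are hypothetical.

```python
from pathlib import PurePosixPath


def sample_paths(sample_key, root="openmic-2018"):
    """Return (audio_path, vggish_path) for one sample key.

    Assumption: files are grouped into subdirectories named after the
    first three characters of the sample key. The default root folder
    name is also an assumption; pass the actual extraction directory.
    """
    prefix = sample_key[:3]
    base = PurePosixPath(root)
    audio = base / "audio" / prefix / f"{sample_key}.ogg"
    vggish = base / "vggish" / prefix / f"{sample_key}.json"
    return str(audio), str(vggish)


# Hypothetical sample key, used only to show the resulting layout.
audio_path, vggish_path = sample_paths("000046_3840")
print(audio_path)
print(vggish_path)
```

Keeping the prefix logic in one helper means a different layout only needs to be corrected in a single place.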

Files (2.6 GB)

Name: openmic-2018-v1.0.0.tgz
Size: 2.6 GB
MD5: e4ccf187e2bb5ab2e115416e8aafe7f4

