OpenMIC-2018
Description
The OpenMIC-2018 dataset is made available through a collaboration between Spotify and MARL@NYU. The cost of annotation was sponsored by Spotify, whose contributions to open-source research can be found online at its developer site, engineering blog, and public GitHub.
If you use this dataset, please cite the following work:
Humphrey, Eric J., Durand, Simon, and McFee, Brian. "OpenMIC-2018: An Open Dataset for Multiple Instrument Recognition." In Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018.
The dataset is made available by Spotify AB under a Creative Commons Attribution 4.0 International (CC BY 4.0) license. The full terms of this license are included alongside this dataset.
This dataset contains the following:
- 10-second snippets of audio, in a directory format like 'audio/{0:.3}/{0}.ogg'.format(sample_key)
- VGGish features as JSON objects, in a directory format like 'vggish/{0:.3}/{0}.json'.format(sample_key)
- MD5 checksums for each OGG and JSON file
- Anonymized individual responses, in 'openmic-2018-individual-responses.csv'
- Aggregated labels, in 'openmic-2018-aggregated-labels.csv'
- Track metadata, with licenses for each audio recording, in 'openmic-2018-metadata.csv'
- A Python-friendly NPZ file of features and labels, 'openmic-2018.npz'
- Sample partitions for train and test, in 'partitions/*.txt'
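The layout above can be navigated with a few lines of Python. The following is a minimal sketch, not part of the dataset's own tooling. It assumes that the subdirectory name is the first three characters of the sample key and that each partition file lists one sample key per line. The helper names `sample_paths` and `read_partition`, the root directory name, and the example key are all hypothetical.

```python
from pathlib import Path


def sample_paths(root, sample_key):
    """Return (audio_path, vggish_path) for one sample key.

    Assumption: files are grouped into subdirectories named after the
    first three characters of the sample key.
    """
    root = Path(root)
    prefix = sample_key[:3]
    audio = root / "audio" / prefix / f"{sample_key}.ogg"
    vggish = root / "vggish" / prefix / f"{sample_key}.json"
    return audio, vggish


def read_partition(path):
    """Read sample keys from a partition file.

    Assumption: one sample key per line; blank lines are skipped.
    """
    with open(path, encoding="utf-8") as fh:
        return [line.strip() for line in fh if line.strip()]


if __name__ == "__main__":
    # "openmic-2018" and "000046_3840" are placeholders for illustration.
    audio, vggish = sample_paths("openmic-2018", "000046_3840")
    print(audio)
    print(vggish)
```

Keeping path construction in one helper means that, if the directory layout turns out to differ from the assumption, only `sample_paths` needs to change.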
Files
File (MD5 checksum) | Size |
---|---|
md5:e4ccf187e2bb5ab2e115416e8aafe7f4 | 2.6 GB |
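After downloading, the archive can be checked against the MD5 checksum listed above. This is a minimal sketch using only Python's standard `hashlib` module. The function names are hypothetical, and the archive path is whatever name the download was saved under.

```python
import hashlib

# Checksum as listed for the 2.6 GB download on this page.
EXPECTED_MD5 = "e4ccf187e2bb5ab2e115416e8aafe7f4"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that a multi-gigabyte archive is never held in memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected
```

A `False` result from `verify_archive` usually indicates an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.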
Additional details
References
- Humphrey, Eric J., Durand, Simon, and McFee, Brian. "OpenMIC-2018: An Open Dataset for Multiple Instrument Recognition." In Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018.