There is a newer version of this record available.

Dataset Open Access

Medley-solos-DB: a cross-collection dataset for musical instrument recognition

Lostanlen, Vincent; Cella, Carmine-Emanuele; Bittner, Rachel; Essid, Slim

Citation Style Language JSON Export

  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.1344103", 
  "title": "Medley-solos-DB: a cross-collection dataset for musical instrument recognition", 
  "issued": {
    "date-parts": [
  "abstract": "<p>Medley-solos-DB<br>\n=============<br>\nVersion 1.0, February 2019.<br>\n&nbsp;</p>\n\n<p>&nbsp;</p>\n\n<p>Created By<br>\n--------------</p>\n\n<p>Vincent Lostanlen (1), Carmine-Emanuele Cella (2), Rachel Bittner (3), Slim Essid&nbsp;(4).<br>\n<br>\n(1): New York University<br>\n(2): UC Berkeley<br>\n(3): Spotify, Inc.<br>\n(4): T&eacute;l&eacute;com ParisTech</p>\n\n<p>&nbsp;</p>\n\n<p><br>\nDescription<br>\n---------------</p>\n\n<p>&nbsp;</p>\n\n<p>Medley-solos-DB is a cross-collection dataset for automatic musical instrument recognition in solo recordings. It consists of a training set of 3-second audio clips, which are extracted from the MedleyDB dataset of Bittner et al. (ISMIR 2014), as well as a test set of 3-second clips, which are extracted from the solosDB dataset of Essid et al. (IEEE TASLP 2009). Each of these clips contains a single instrument among a taxonomy of&nbsp;eight: clarinet, distorted electric guitar, female singer,&nbsp;flute,&nbsp;piano,&nbsp;tenor saxophone,&nbsp;trumpet,&nbsp;and&nbsp;violin.</p>\n\n<p>Medley-solos-DB is the dataset used in the musical instrument recognition benchmarks of Lostanlen and Cella&nbsp;(ISMIR 2016) and And&eacute;n et al. (IEEE TSP 2019).</p>\n\n<p>&nbsp;</p>\n\n<p>[1] V. Lostanlen, C.E. Cella. Deep convolutional networks on the pitch spiral for musical instrument recognition. Proceedings of the International Society for Music Information Retrieval Conference&nbsp;(ISMIR), 2016.</p>\n\n<p>[2] J. And&eacute;n, V. Lostanlen, S. Mallat. Joint time-frequency scattering. IEEE Transactions on Signal Processing, 2019, to appear.</p>\n\n<p>&nbsp;</p>\n\n<p><br>\nData Files<br>\n--------------</p>\n\n<p>Medley-solos-DB&nbsp;contains 21572&nbsp;audio clips as WAV files, sampled at 44.1&nbsp;kHz, with a single channel (mono), at a bit depth of 32. 
Every audio clip has a fixed duration of&nbsp;2972 milliseconds, that is, 65536 discrete-time samples.</p>\n\n<p>Every audio file has a name of the form:</p>\n\n<p>Medley-solos-DB_SUBSET-INSTRUMENTID_UUID.wav</p>\n\n<p>&nbsp;</p>\n\n<p>For example:</p>\n\n<p>Medley-solos-DB_test-0_0a282672-c22c-59ff-faaa-ff9eb73fc8e6.wav</p>\n\n<p>corresponds to the snippet whose universally unique identifier (UUID) is&nbsp;0a282672-c22c-59ff-faaa-ff9eb73fc8e6, contains clarinet sounds (clarinet has instrument id equal to 0), and belongs to the test set.</p>\n\n<p>&nbsp;</p>\n\n<p><br>\nMetadata Files<br>\n-------------------</p>\n\n<p>The&nbsp;Medley-solos-DB_metadata is a CSV file containing 21572 rows (one for each audio clip) and five&nbsp;columns:</p>\n\n<p>1. subset: either &quot;training&quot;, &quot;validation&quot;, or &quot;test&quot;</p>\n\n<p>2. instrument: tag in the MedleyDB taxonomy, such as&nbsp;&quot;clarinet&quot;, &quot;distorted electric guitar&quot;, etc.</p>\n\n<p>3. instrument id: integer from 0 to 7. There is a one-to-one correspondence&nbsp;between &quot;instrument&quot; (string format) and &quot;instrument id&quot; (integer). We provide both for convenience.</p>\n\n<p>4. track id: integer from 0 to 226. The track and artist names are anonymized.</p>\n\n<p>5. UUID: universally unique identifier. Randomly assigned and different for every row.</p>\n\n<p>&nbsp;</p>\n\n<p>The list of instrument classes is:</p>\n\n<p>0. clarinet</p>\n\n<p>1. distorted electric guitar</p>\n\n<p>2. female singer</p>\n\n<p>3. flute</p>\n\n<p>4. piano</p>\n\n<p>5. tenor saxophone</p>\n\n<p>6. trumpet</p>\n\n<p>7. 
violin</p>\n\n<p>&nbsp;</p>\n\n<p><br>\nPlease acknowledge Medley-solos-DB&nbsp;in academic research<br>\n---------------------------------------------------------------------------------</p>\n\n<p>When Medley-solos-DB&nbsp;is used for academic research, we would highly appreciate it if&nbsp;scientific publications of works partly based on this dataset cite the following publication:</p>\n\n<p>V. Lostanlen, C.E. Cella. Deep convolutional networks on the pitch spiral for musical instrument recognition. Proceedings of the International Society for Music Information Retrieval Conference&nbsp;(ISMIR), 2016.</p>\n\n<p>The creation of this dataset was supported by ERC InvariantClass grant&nbsp;320959.</p>\n\n<p>&nbsp;</p>\n\n<p><br>\nConditions of Use<br>\n------------------------</p>\n\n<p>Dataset created by Vincent Lostanlen, Rachel Bittner, and Slim Essid, as a derivative work of MedleyDB and solosDB.</p>\n\n<p>The Medley-solos-DB&nbsp;dataset is offered free of charge under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license:<br>\n</p>\n\n<p>The dataset and its contents are made available on an &quot;as is&quot; basis and without warranties of any kind, including without limitation satisfactory quality and conformity, merchantability, fitness for a particular purpose, accuracy or&nbsp;completeness, or absence of errors. 
Subject to any liability that may not be excluded or limited by law, the authors are&nbsp;not liable for, and expressly exclude&nbsp;all liability for, loss or damage however and whenever caused to anyone by any use of the Medley-solos-DB&nbsp;dataset or any part of it.</p>\n\n<p>&nbsp;</p>\n\n<p><br>\nFeedback<br>\n-------------</p>\n\n<p>Please help us improve Medley-solos-DB&nbsp;by sending your feedback to:<br>\</p>\n\n<p>In case of a problem, please include as many details as possible.</p>\n\n<p>&nbsp;</p>\n\n<p>&nbsp;</p>\n\n<p>Acknowledgement<br>\n-------------------------<br>\nWe thank all artists, recording engineers, curators, and annotators of both MedleyDB and solosDB.</p>", 
  "author": [
      "family": "Lostanlen, Vincent"
      "family": "Cella, Carmine-Emanuele"
      "family": "Bittner, Rachel"
      "family": "Essid, Slim"
  "id": "1344103", 
  "event-place": "New York, NY, USA", 
  "version": "1.0", 
  "type": "dataset", 
  "event": "International Society for Music Information Retrieval (ISMIR)"
                  All versions   This version
Views             2,022          1,124
Downloads         1,849          613
Data volume       8.4 TB         3.0 TB
Unique views      1,692          1,017
Unique downloads  1,108          387
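
The file naming scheme given in the dataset description (Medley-solos-DB_SUBSET-INSTRUMENTID_UUID.wav) can be decoded without opening the metadata CSV. The following is a minimal sketch in Python, not code distributed with the dataset. The names `INSTRUMENTS`, `ClipName`, and `parse_clip_name` are hypothetical, and the sketch assumes that SUBSET in file names takes the same three values as the "subset" column ("training", "validation", "test") and that UUIDs are written in lowercase hexadecimal, as in the example file name.

```python
import re
from typing import NamedTuple

# Instrument classes in the order listed in the description (ids 0 to 7).
INSTRUMENTS = (
    "clarinet",
    "distorted electric guitar",
    "female singer",
    "flute",
    "piano",
    "tenor saxophone",
    "trumpet",
    "violin",
)

# Pattern for Medley-solos-DB_SUBSET-INSTRUMENTID_UUID.wav
# Assumption: SUBSET uses the same strings as the metadata "subset" column.
_NAME_RE = re.compile(
    r"^Medley-solos-DB_(?P<subset>training|validation|test)"
    r"-(?P<instrument_id>[0-7])"
    r"_(?P<uuid>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\.wav$"
)


class ClipName(NamedTuple):
    """Fields recovered from a clip file name (hypothetical helper type)."""
    subset: str
    instrument_id: int
    instrument: str
    uuid: str


def parse_clip_name(filename: str) -> ClipName:
    """Split a clip file name into subset, instrument id, instrument, and UUID."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"not a Medley-solos-DB clip name: {filename!r}")
    instrument_id = int(match["instrument_id"])
    return ClipName(
        subset=match["subset"],
        instrument_id=instrument_id,
        instrument=INSTRUMENTS[instrument_id],
        uuid=match["uuid"],
    )


if __name__ == "__main__":
    # Example file name taken from the dataset description.
    print(parse_clip_name(
        "Medley-solos-DB_test-0_0a282672-c22c-59ff-faaa-ff9eb73fc8e6.wav"
    ))
```

Because "instrument" and "instrument id" are in one-to-one correspondence, the tuple lookup is enough to recover the class name. Names that do not follow the scheme raise a ValueError rather than being silently skipped.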

