There is a newer version of this record available.

Dataset Open Access

Medley-solos-DB: a cross-collection dataset for musical instrument recognition

Lostanlen, Vincent; Cella, Carmine-Emanuele; Bittner, Rachel; Essid, Slim


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">music</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">instrument</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">audio</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">machine listening</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">music information retrieval</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">timbre</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">machine learning</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">classification</subfield>
  </datafield>
  <controlfield tag="005">20200124192550.0</controlfield>
  <controlfield tag="001">1344103</controlfield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="d">August 7-11, 2016</subfield>
    <subfield code="g">ISMIR</subfield>
    <subfield code="p">7</subfield>
    <subfield code="a">International Society for Music Information Retrieval</subfield>
    <subfield code="c">New York, NY, USA</subfield>
    <subfield code="n">PS3</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Ircam</subfield>
    <subfield code="a">Cella, Carmine-Emanuele</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Spotify Inc.</subfield>
    <subfield code="0">(orcid)0000-0001-7757-2232</subfield>
    <subfield code="a">Bittner, Rachel</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Télécom ParisTech</subfield>
    <subfield code="a">Essid, Slim</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1276169</subfield>
    <subfield code="z">md5:5da9775d2b9bbcc351eccb9740031474</subfield>
    <subfield code="u">https://zenodo.org/record/1344103/files/Medley-solos-DB_metadata.csv</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">6663778174</subfield>
    <subfield code="z">md5:4464cfa1fadfef441d370110932f46aa</subfield>
    <subfield code="u">https://zenodo.org/record/1344103/files/Medley-solos-DB.zip</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="y">Conference website</subfield>
    <subfield code="u">https://wp.nyu.edu/ismir2016/</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2018-09-28</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="o">oai:zenodo.org:1344103</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">New York University</subfield>
    <subfield code="0">(orcid)0000-0003-0580-1651</subfield>
    <subfield code="a">Lostanlen, Vincent</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Medley-solos-DB: a cross-collection dataset for musical instrument recognition</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">320959</subfield>
    <subfield code="a">Invariant Representations for High-Dimensional Signal Classifications</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;Medley-solos-DB&lt;br&gt;
=============&lt;br&gt;
Version 1.0, February 2019.&lt;br&gt;
&amp;nbsp;&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;Created By&lt;br&gt;
--------------&lt;/p&gt;

&lt;p&gt;Vincent Lostanlen (1), Carmine-Emanuele Cella (2), Rachel Bittner (3), Slim Essid&amp;nbsp;(4).&lt;br&gt;
&lt;br&gt;
(1): New York University&lt;br&gt;
(2): UC Berkeley&lt;br&gt;
(3): Spotify, Inc.&lt;br&gt;
(4): T&amp;eacute;l&amp;eacute;com ParisTech&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;&lt;br&gt;
Description&lt;br&gt;
---------------&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;Medley-solos-DB is a cross-collection dataset for automatic musical instrument recognition in solo recordings. It consists of a training set of 3-second audio clips, which are extracted from the MedleyDB dataset of Bittner et al. (ISMIR 2014), as well as a test set of 3-second clips, which are extracted from the solosDB dataset of Essid et al. (IEEE TASLP 2009). Each of these clips contains a single instrument among a taxonomy of&amp;nbsp;eight: clarinet, distorted electric guitar, female singer,&amp;nbsp;flute,&amp;nbsp;piano,&amp;nbsp;tenor saxophone,&amp;nbsp;trumpet,&amp;nbsp;and&amp;nbsp;violin.&lt;/p&gt;

&lt;p&gt;Medley-solos-DB is the dataset used in the musical instrument recognition benchmarks of Lostanlen and Cella&amp;nbsp;(ISMIR 2016) and And&amp;eacute;n et al. (IEEE TSP 2019).&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;[1] V. Lostanlen, C.E. Cella. Deep convolutional networks on the pitch spiral for musical instrument recognition. Proceedings of the International Society for Music Information Retrieval Conference&amp;nbsp;(ISMIR), 2016.&lt;/p&gt;

&lt;p&gt;[2] J. And&amp;eacute;n, V. Lostanlen, S. Mallat. Joint time-frequency scattering. IEEE Transactions on Signal Processing. 2019, to appear.&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;&lt;br&gt;
Data Files&lt;br&gt;
--------------&lt;/p&gt;

&lt;p&gt;Medley-solos-DB&amp;nbsp;contains 21572&amp;nbsp;audio clips as WAV files, sampled at 44.1&amp;nbsp;kHz, with a single channel (mono), at a bit depth of 32. Every audio clip has a fixed duration of&amp;nbsp;2972 milliseconds, that is, 65536 discrete-time samples.&lt;/p&gt;

&lt;p&gt;Every audio file has a name of the form:&lt;/p&gt;

&lt;p&gt;Medley-solos-DB_SUBSET-INSTRUMENTID_UUID.wav&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;p&gt;Medley-solos-DB_test-0_0a282672-c22c-59ff-faaa-ff9eb73fc8e6.wav&lt;/p&gt;

&lt;p&gt;corresponds to the snippet whose universally unique identifier (UUID) is&amp;nbsp;0a282672-c22c-59ff-faaa-ff9eb73fc8e6, contains clarinet sounds (clarinet has instrument id equal to 0), and belongs to the test set.&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;&lt;br&gt;
Metadata Files&lt;br&gt;
-------------------&lt;/p&gt;

&lt;p&gt;The&amp;nbsp;Medley-solos-DB_metadata is a CSV file containing 21572 rows (one for each audio clip) and five&amp;nbsp;columns:&lt;/p&gt;

&lt;p&gt;1. subset: either &amp;quot;training&amp;quot;, &amp;quot;validation&amp;quot;, or &amp;quot;test&amp;quot;&lt;/p&gt;

&lt;p&gt;2. instrument: tag in the MedleyDB taxonomy, such as&amp;nbsp;&amp;quot;clarinet&amp;quot;, &amp;quot;distorted electric guitar&amp;quot;, etc.&lt;/p&gt;

&lt;p&gt;3. instrument id: integer from 0 to 7. There is a one-to-one correspondence&amp;nbsp;between &amp;quot;instrument&amp;quot; (string format) and &amp;quot;instrument id&amp;quot; (integer). We provide both for convenience.&lt;/p&gt;

&lt;p&gt;4. track id: integer from 0 to 226. The track and artist names are anonymized.&lt;/p&gt;

&lt;p&gt;5. UUID: universally unique identifier. Assigned at random and different for every row.&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;The list of instrument classes is:&lt;/p&gt;

&lt;p&gt;0. clarinet&lt;/p&gt;

&lt;p&gt;1. distorted electric guitar&lt;/p&gt;

&lt;p&gt;2. female singer&lt;/p&gt;

&lt;p&gt;3. flute&lt;/p&gt;

&lt;p&gt;4. piano&lt;/p&gt;

&lt;p&gt;5. tenor saxophone&lt;/p&gt;

&lt;p&gt;6. trumpet&lt;/p&gt;

&lt;p&gt;7. violin&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;&lt;br&gt;
Please acknowledge Medley-solos-DB&amp;nbsp;in academic research&lt;br&gt;
---------------------------------------------------------------------------------&lt;/p&gt;

&lt;p&gt;When Medley-solos-DB&amp;nbsp;is used for academic research, we would highly appreciate it if scientific publications of works partly based on this dataset cite the following publication:&lt;/p&gt;

&lt;p&gt;V. Lostanlen, C.E. Cella. Deep convolutional networks on the pitch spiral for musical instrument recognition. Proceedings of the International Society for Music Information Retrieval Conference&amp;nbsp;(ISMIR), 2016.&lt;/p&gt;

&lt;p&gt;The creation of this dataset was supported by ERC InvariantClass grant&amp;nbsp;320959.&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;&lt;br&gt;
Conditions of Use&lt;br&gt;
------------------------&lt;/p&gt;

&lt;p&gt;Dataset created by Vincent Lostanlen, Rachel Bittner, and Slim Essid, as a derivative work of MedleyDB and solosDB.&lt;/p&gt;

&lt;p&gt;The Medley-solos-DB&amp;nbsp;dataset is offered free of charge under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license:&lt;br&gt;
https://creativecommons.org/licenses/by/4.0/&lt;/p&gt;

&lt;p&gt;The dataset and its contents are made available on an &amp;quot;as is&amp;quot; basis and without warranties of any kind, including without limitation satisfactory quality and conformity, merchantability, fitness for a particular purpose, accuracy or&amp;nbsp;completeness, or absence of errors. Subject to any liability that may not be excluded or limited by law, the authors are&amp;nbsp;not liable for, and expressly exclude&amp;nbsp;all liability for, loss or damage however and whenever caused to anyone by any use of the Medley-solos-DB&amp;nbsp;dataset or any part of it.&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;&lt;br&gt;
Feedback&lt;br&gt;
-------------&lt;/p&gt;

&lt;p&gt;Please help us improve Medley-solos-DB&amp;nbsp;by sending your feedback to:&lt;br&gt;
vincent.lostanlen@nyu.edu&lt;/p&gt;

&lt;p&gt;In case of a problem, please include as many details as possible.&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;Acknowledgement&lt;br&gt;
-------------------------&lt;br&gt;
We thank all artists, recording engineers, curators, and annotators of both MedleyDB and solosDB.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.1344102</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.1344103</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>
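
The file naming scheme given in the record's Data Files section (Medley-solos-DB_SUBSET-INSTRUMENTID_UUID.wav) can be unpacked programmatically. The following is a minimal Python sketch, not part of the dataset's own tooling: the names `parse_clip_name`, `ClipInfo`, and `INSTRUMENTS` are hypothetical, and the pattern assumes that the subset is a lowercase word and that the UUID uses the 8-4-4-4-12 hexadecimal layout seen in the record's example.

```python
import re
from typing import NamedTuple

# Instrument classes in the order listed in the record (id 0 to 7).
INSTRUMENTS = (
    "clarinet",
    "distorted electric guitar",
    "female singer",
    "flute",
    "piano",
    "tenor saxophone",
    "trumpet",
    "violin",
)

# Assumption: SUBSET is a lowercase word and UUID is 8-4-4-4-12 hex digits,
# as in the example file name given in the record.
_CLIP_NAME = re.compile(
    r"Medley-solos-DB_(?P<subset>[a-z]+)-(?P<instrument_id>[0-7])_"
    r"(?P<uuid>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
    r"\.wav"
)


class ClipInfo(NamedTuple):
    """Fields recovered from a clip file name (hypothetical helper type)."""
    subset: str
    instrument_id: int
    instrument: str
    uuid: str


def parse_clip_name(filename: str) -> ClipInfo:
    """Split a clip file name into subset, instrument id, instrument, UUID."""
    match = _CLIP_NAME.fullmatch(filename)
    if match is None:
        raise ValueError(f"not a Medley-solos-DB clip name: {filename!r}")
    instrument_id = int(match.group("instrument_id"))
    return ClipInfo(
        subset=match.group("subset"),
        instrument_id=instrument_id,
        instrument=INSTRUMENTS[instrument_id],
        uuid=match.group("uuid"),
    )
```

Applied to the record's own example, `parse_clip_name("Medley-solos-DB_test-0_0a282672-c22c-59ff-faaa-ff9eb73fc8e6.wav")` yields the subset "test", instrument id 0, and the instrument "clarinet", in line with the description above. Names that do not follow the pattern raise `ValueError` rather than being silently mislabeled.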