Dataset Open Access

Groove2Groove MIDI Dataset: synthetic accompaniments in 3k styles

Ondřej Cífka; Umut Şimşekli; Gaël Richard


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">musical styles</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">parallel corpus</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">music</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">MIDI</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">accompaniments</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">chord charts</subfield>
  </datafield>
  <controlfield tag="005">20210426173032.0</controlfield>
  <controlfield tag="001">3958000</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">LTCI, Télécom Paris, Institut Polytechnique de Paris</subfield>
    <subfield code="a">Umut Şimşekli</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">LTCI, Télécom Paris, Institut Polytechnique de Paris</subfield>
    <subfield code="0">(orcid)0000-0002-4960-0010</subfield>
    <subfield code="a">Gaël Richard</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">236114569</subfield>
    <subfield code="z">md5:c407de7b3676267660c88dc6ee351c79</subfield>
    <subfield code="u">https://zenodo.org/record/3958000/files/groove2groove-data-v1.0.0.tar.gz</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2020-08-29</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="p">user-ieee</subfield>
    <subfield code="p">user-ismir</subfield>
    <subfield code="o">oai:zenodo.org:3958000</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">LTCI, Télécom Paris, Institut Polytechnique de Paris</subfield>
    <subfield code="0">(orcid)0000-0002-6268-6445</subfield>
    <subfield code="a">Ondřej Cífka</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Groove2Groove MIDI Dataset: synthetic accompaniments in 3k styles</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-ieee</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-ismir</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">765068</subfield>
    <subfield code="a">New Frontiers in Music Information Processing</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by-nc/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution Non Commercial 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;The&amp;nbsp;&lt;em&gt;Groove2Groove MIDI Dataset&lt;/em&gt;&amp;nbsp;is a parallel corpus of synthetic MIDI accompaniments in almost 3000 different styles,&amp;nbsp;created as described in the paper&amp;nbsp;&lt;em&gt;&lt;a href="https://doi.org/10.1109/TASLP.2020.3019642"&gt;Groove2Groove: One-Shot Accompaniment Style Transfer with Supervision from Synthetic Data&lt;/a&gt;&lt;/em&gt;&amp;nbsp;[&lt;a href="https://groove2groove.telecom-paris.fr/data/paper.pdf"&gt;pdf&lt;/a&gt;]. See the &lt;code&gt;README.md&lt;/code&gt; file or the&amp;nbsp;&lt;em&gt;&lt;a href="https://groove2groove.telecom-paris.fr/#Dataset"&gt;Groove2Groove website&lt;/a&gt;&lt;/em&gt; for more information.&lt;/p&gt;

&lt;p&gt;The dataset is split into the following sections:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;&lt;code&gt;train&lt;/code&gt;&amp;nbsp;contains 5744 MIDI files in 2872 styles (exactly 2 files per style). Each file contains 252 measures&amp;nbsp;following a 2-measure count-in.&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;val&lt;/code&gt;&amp;nbsp;and&amp;nbsp;&lt;code&gt;test&lt;/code&gt;&amp;nbsp;each contain 1200 files in 40 styles (exactly 30 files per style, 16 bars per file after the count-in). The sets of styles are disjoint from each other and from those in&amp;nbsp;&lt;code&gt;train&lt;/code&gt;.&lt;/li&gt;
	&lt;li&gt;&lt;code&gt;itest&lt;/code&gt;&amp;nbsp;is generated from the same chord charts as&amp;nbsp;&lt;code&gt;test&lt;/code&gt;, but in 40 styles from the training set.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Chord charts for all MIDI files are provided in the ABC format&amp;nbsp;and the Band-in-a-Box (MGU) format. Each chord chart corresponds to at least 2 MIDI files in different styles.&lt;/p&gt;

&lt;p&gt;The code used to automate Band-in-a-Box is available in the &lt;a href="https://github.com/cifkao/pybiab"&gt;pybiab&lt;/a&gt; package.&lt;/p&gt;

&lt;p&gt;If you use the data in your research, please reference the paper (not just&amp;nbsp;the Zenodo record):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;@article{groove2groove,
  author={Ond\v{r}ej C\'{i}fka and Umut \c{S}im\c{s}ekli and Ga\"{e}l Richard},
  title={{Groove2Groove}: One-Shot Music Style Transfer with Supervision from Synthetic Data},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  publisher={IEEE},
  year={2020},
  volume={28},
  pages={2638--2650},
  doi={10.1109/TASLP.2020.3019642},
  url={https://doi.org/10.1109/TASLP.2020.3019642}
}&lt;/code&gt;&lt;/pre&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isSupplementTo</subfield>
    <subfield code="a">10.1109/TASLP.2020.3019642</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.3957999</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.3958000</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>
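
The 856 field of the record above lists the download URL, the file size in bytes (subfield `s`) and an MD5 checksum (subfield `z`). As a minimal sketch, assuming the record has been saved as MARC21 XML and the archive has already been downloaded, the following Python reads those values and checks a local file against the checksum. The helper names `file_entries` and `md5_matches` are illustrative and not part of any Zenodo tooling.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace used by the MARC21 "slim" XML schema (taken from the record above).
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def file_entries(marc_xml):
    """Return a list of (url, size_bytes, md5_hex) tuples, one per 856 field."""
    root = ET.fromstring(marc_xml)
    entries = []
    for field in root.findall(".//marc:datafield[@tag='856']", MARC_NS):
        # Map subfield codes ("u", "s", "z") to their text content.
        sub = {s.get("code"): (s.text or "").strip()
               for s in field.findall("marc:subfield", MARC_NS)}
        size = int(sub["s"]) if "s" in sub else None
        # The checksum is stored as "md5:<hex digest>"; keep only the digest.
        md5_hex = sub["z"].split(":", 1)[-1] if "z" in sub else None
        entries.append((sub.get("u"), size, md5_hex))
    return entries


def md5_matches(path, expected_md5, chunk_size=1 << 20):
    """Hash the file at `path` in chunks and compare with the expected digest."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.lower()
```

With the record stored in a string `marc_xml`, `file_entries(marc_xml)` yields the URL, size and digest of `groove2groove-data-v1.0.0.tar.gz`, and `md5_matches(local_path, digest)` reports whether the downloaded copy at a path of your choosing is intact.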