Dataset Open Access

Groove2Groove MIDI Dataset: synthetic accompaniments in 3k styles

Ondřej Cífka; Umut Şimşekli; Gaël Richard

Citation Style Language JSON Export

  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.3958000", 
  "language": "eng", 
  "title": "Groove2Groove MIDI Dataset: synthetic accompaniments in 3k styles", 
  "issued": {
    "date-parts": [
  "abstract": "<p>The&nbsp;<em>Groove2Groove MIDI Dataset</em>&nbsp;is a parallel corpus of synthetic MIDI accompaniments in almost 3000 different styles,&nbsp;created as described in the paper&nbsp;<em><a href=\"\">Groove2Groove: One-Shot Music Style Transfer with Supervision from Synthetic Data</a></em>&nbsp;[<a href=\"\">pdf</a>]. See the <code></code> file or the&nbsp;<em><a href=\"\">Groove2Groove website</a></em> for more information.</p>\n\n<p>The dataset is split into the following sections:</p>\n\n<ul>\n\t<li><code>train</code>&nbsp;contains 5744 MIDI files in 2872 styles (exactly 2 files per style). Each file contains 252 measures&nbsp;following a 2-measure count-in.</li>\n\t<li><code>val</code>&nbsp;and&nbsp;<code>test</code>&nbsp;each contain 1200 files in 40 styles (exactly 30 files per style, 16 bars per file after the count-in). The sets of styles are disjoint from each other and from those in&nbsp;<code>train</code>.</li>\n\t<li><code>itest</code>&nbsp;is generated from the same chord charts as&nbsp;<code>test</code>, but in 40 styles from the training set.</li>\n</ul>\n\n<p>Chord charts for all MIDI files are provided in the ABC format&nbsp;and the Band-in-a-Box (MGU) format. Each chord chart corresponds to at least 2 MIDI files in different styles.</p>\n\n<p>The code used to automate Band-in-a-Box is available in the <a href=\"\">pybiab</a> package.</p>\n\n<p>If you use the data in your research, please reference the paper (not just&nbsp;the Zenodo record):</p>\n\n<pre><code>@article{groove2groove,\n  author={Ond\\v{r}ej C\\'{i}fka and Umut \\c{S}im\\c{s}ekli and Ga\\\"{e}l Richard},\n  title={{Groove2Groove}: One-Shot Music Style Transfer with Supervision from Synthetic Data},\n  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},\n  publisher={IEEE},\n  year={2020},\n  volume={28},\n  pages={2638--2650},\n  doi={10.1109/TASLP.2020.3019642},\n  url={}\n}</code></pre>", 
  "author": [
    {
      "family": "Ond\u0159ej C\u00edfka"
    }, 
    {
      "family": "Umut \u015eim\u015fekli"
    }, 
    {
      "family": "Ga\u00ebl Richard"
    }
  ], 
  "version": "1.0.0", 
  "type": "dataset", 
  "id": "3958000"
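
The split sizes quoted in the abstract can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the dataset tooling: the names `SPLITS`, `COUNT_IN_MEASURES`, `expected_files` and `train_measures_with_count_in` are hypothetical and chosen only for illustration, and the numbers are copied from the description above. The `itest` split is left out because the abstract does not state its file count.

```python
# Split sizes as stated in the dataset description (styles and files per style).
# All names here are illustrative assumptions, not part of any released code.
SPLITS = {
    "train": {"styles": 2872, "files_per_style": 2, "measures": 252},
    "val": {"styles": 40, "files_per_style": 30, "measures": 16},
    "test": {"styles": 40, "files_per_style": 30, "measures": 16},
}

# The description states a 2-measure count-in for the training files.
COUNT_IN_MEASURES = 2


def expected_files(split: str) -> int:
    """Number of MIDI files implied by styles x files per style."""
    info = SPLITS[split]
    return info["styles"] * info["files_per_style"]


def train_measures_with_count_in() -> int:
    """Total measures in one training file, count-in included."""
    return SPLITS["train"]["measures"] + COUNT_IN_MEASURES


if __name__ == "__main__":
    for name in SPLITS:
        print(name, expected_files(name))
    print("train measures per file (with count-in):", train_measures_with_count_in())
```

The products agree with the file counts given in the abstract (5744 for `train`, 1200 each for `val` and `test`).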
                  All versions  This version
Views             466           466
Downloads         150           150
Data volume       35.4 GB       35.4 GB
Unique views      390           390
Unique downloads  138           138

