Dataset Open Access

Groove2Groove MIDI Dataset: synthetic accompaniments in 3k styles

Ondřej Cífka; Umut Şimşekli; Gaël Richard

The Groove2Groove MIDI Dataset is a parallel corpus of synthetic MIDI accompaniments in almost 3000 different styles, created as described in the paper Groove2Groove: One-Shot Music Style Transfer with Supervision from Synthetic Data. See the README.md file or the Groove2Groove website for more information.

The dataset is split into the following sections:

  • train contains 5744 MIDI files in 2872 styles (exactly 2 files per style). Each file contains 252 measures following a 2-measure count-in.
  • val and test each contain 1200 files in 40 styles (exactly 30 files per style, 16 bars per file after the count-in). The sets of styles are disjoint from each other and from those in train.
  • itest is generated from the same chord charts as test, but in 40 styles from the training set.
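The file counts listed above can be checked after unpacking the archive. The following is a minimal sketch, not part of the dataset's own tooling: it assumes each section is a directory named after the split (train, val, test) directly under the extraction root and that MIDI files use the .mid extension. Neither assumption is confirmed by this record, so adjust the paths to match the README.md. The helper names count_midi_files and check_splits are hypothetical.

```python
from pathlib import Path

# File counts stated in the dataset description (train: 2872 styles x 2 files,
# val/test: 40 styles x 30 files). itest is omitted because its file count
# is not stated explicitly.
EXPECTED_COUNTS = {"train": 5744, "val": 1200, "test": 1200}


def count_midi_files(split_dir):
    """Count files with a .mid extension anywhere below split_dir.

    Returns 0 if the directory does not exist.
    """
    split_dir = Path(split_dir)
    if not split_dir.is_dir():
        return 0
    return sum(
        1
        for path in split_dir.rglob("*")
        if path.is_file() and path.suffix.lower() == ".mid"
    )


def check_splits(root, expected=EXPECTED_COUNTS):
    """Return {split: (found, expected)} for every split whose count differs.

    ASSUMPTION: each split lives in a directory named after it under root.
    """
    mismatches = {}
    for split, want in expected.items():
        found = count_midi_files(Path(root) / split)
        if found != want:
            mismatches[split] = (found, want)
    return mismatches
```

Calling check_splits with the extraction directory returns an empty dictionary when every listed split has the stated number of files; any entry in the result points to a split worth inspecting by hand.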

Chord charts for all MIDI files are provided in the ABC format and the Band-in-a-Box (MGU) format. Each chord chart corresponds to at least 2 MIDI files in different styles.

The code used to automate Band-in-a-Box is available in the pybiab package.

If you use the data in your research, please reference the paper (not just the Zenodo record):

@article{groove2groove,
  author={Ond\v{r}ej C\'{i}fka and Umut \c{S}im\c{s}ekli and Ga\"{e}l Richard},
  title={{Groove2Groove}: One-Shot Music Style Transfer with Supervision from Synthetic Data},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  publisher={IEEE},
  year={2020},
  volume={28},
  pages={2638--2650},
  doi={10.1109/TASLP.2020.3019642},
  url={https://doi.org/10.1109/TASLP.2020.3019642}
}

Files (236.1 MB)

groove2groove-data-v1.0.0.tar.gz (236.1 MB)
md5:c407de7b3676267660c88dc6ee351c79
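The MD5 checksum listed for the archive can be used to confirm that a download is complete. A minimal sketch using only the Python standard library follows; the function names md5_of_file and archive_is_intact are hypothetical, and the archive path is whatever location the file was saved to.

```python
import hashlib

# Checksum published on this record for groove2groove-data-v1.0.0.tar.gz.
EXPECTED_MD5 = "c407de7b3676267660c88dc6ee351c79"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, archive_is_intact("groove2groove-data-v1.0.0.tar.gz") should return True for an undamaged download placed in the current directory. Reading in chunks keeps memory use low for an archive of this size.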