Pop-K: Augmented MIDI Dataset for Learning Constrained Modern Pop Melodies
Description
Pop-K MIDI Dataset
The Pop-K MIDI Dataset is an open collection of modern pop melodies developed for training and testing symbolic music models in a constrained musical domain. The dataset contains 305,815 files augmented from a base dataset of 8-bar vocal lead, chords, and bass melody tracks. An accompanying model trained on this dataset can be found on GitHub.
The dataset was created to evaluate how limited training data can be scaled via augmentation to efficiently train a model to generate a specific musical style. The melodies were transposed to C major and A minor, and timing information was normalized to 120 BPM at a 96-tick resolution. In total, the dataset contains approximately 1,360 hours of musical notation.
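The duration figure and the key normalization can be sanity-checked with a few lines of arithmetic. The sketch below is illustrative only and rests on assumptions the description does not state: 4/4 time (four quarter-note beats per bar), "96-tick resolution" meaning 96 ticks per quarter note, and a hypothetical helper `semitone_shift` that picks the smallest transposition to C (major) or A (minor). It is not the procedure used to build the dataset.

```python
# Figures taken from the dataset description.
FILES = 305_815
BARS_PER_FILE = 8
BPM = 120

# Assumptions (not stated in the description).
BEATS_PER_BAR = 4      # assumes 4/4 time
TICKS_PER_BEAT = 96    # assumes "96-tick resolution" = ticks per quarter note

beats_per_file = BARS_PER_FILE * BEATS_PER_BAR
seconds_per_file = beats_per_file * 60 / BPM          # 16.0 seconds per file
ticks_per_file = beats_per_file * TICKS_PER_BEAT      # 3072 ticks per file
total_hours = FILES * seconds_per_file / 3600         # about 1359.2 hours


def semitone_shift(tonic_pc: int, is_minor: bool) -> int:
    """Hypothetical helper: smallest shift in semitones (range -6..+5)
    that moves a tonic pitch class (0 = C, ..., 11 = B) to C for major
    keys or to A for minor keys."""
    target = 9 if is_minor else 0
    return (target - tonic_pc + 6) % 12 - 6


if __name__ == "__main__":
    print(f"{seconds_per_file:.0f} s per file, {ticks_per_file} ticks per file")
    print(f"total: {total_hours:.1f} hours")
    print("D major ->", semitone_shift(2, False), "semitones")
    print("E minor ->", semitone_shift(4, True), "semitones")
```

Under these assumptions each file lasts 16 seconds, which gives roughly 1,359 hours across all files, in line with the stated figure.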
License
The Pop-K MIDI Dataset is licensed under the Creative Commons Attribution-NonCommercial (CC BY-NC) license. While efforts have been made to augment and transform the original melodies, some segments may still resemble the source material.
Files
| Checksum | Size |
|---|---|
| md5:25decb80e7c1e4395694e90aa6f39f76 | 56.2 MB |
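The MD5 checksum in the listing can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module; the function names `md5_of_file` and `verify_download` are illustrative, and any file path passed in is the reader's own, since the listing does not show the archive's name.

```python
import hashlib

# Checksum copied from the file listing above.
EXPECTED_MD5 = "25decb80e7c1e4395694e90aa6f39f76"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_download` with the path of the downloaded archive returns `True` only if the file matches the listed checksum.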
Additional details
Software
- Repository URL: https://github.com/patchbanks/Pop-K