Published March 8, 2023 | Version 0.1
Dataset Open

Slakh2100-16k for YourMT3

  • 1. C4DM, Queen Mary University of London

Description

About this version:

This is a variant of the Slakh2100 dataset (Manilow, 2019), resampled to '16 kHz-mono-16-bit-wav' format. We redistribute this data as part of the YourMT3 project. The license for redistribution is attached. Note that the 'omitted' directory of the slakh2100-redux version is reduced in this version. Thus, we have three splits: train, validation and test.
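The stated audio format can be checked on a downloaded file. The following is a minimal sketch using only the Python standard library; the function names and the dictionary keys are illustrative choices, not part of YourMT3 or MIRData.

```python
import wave

# Expected format, taken from the description above: 16 kHz, mono, 16-bit PCM.
EXPECTED = {"sample_rate": 16000, "channels": 1, "sample_width_bytes": 2}


def read_wav_format(path):
    """Return sample rate, channel count and sample width of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
        }


def matches_expected_format(path):
    """Return True if the file at `path` is 16 kHz, mono, 16-bit."""
    return read_wav_format(path) == EXPECTED
```

Calling `matches_expected_format` on any wav file from the extracted data (the path depends on where the data was unpacked) should return `True` if the file follows the format described above.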

MIRData integration:

Like the previous version of Slakh, this version will also be integrated into the MIRData (Bittner, 2019) project for convenient use. For this purpose, we provide an index file in JSON format. The code for the customized MIRData is included in our YourMT3 project.
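Before using the customized MIRData code, it can help to look at the index file directly. The sketch below loads a JSON file and reports the type of each top-level entry. It makes no assumption about the key names inside the index, and `summarize_index` is an illustrative helper, not a MIRData or YourMT3 function.

```python
import json


def summarize_index(path):
    """Load a JSON index file and report its top-level layout.

    Returns a mapping from each top-level key to the type name of its value.
    No assumption is made about which keys the index contains.
    """
    with open(path, "r", encoding="utf-8") as f:
        index = json.load(f)
    if isinstance(index, dict):
        return {key: type(value).__name__ for key, value in index.items()}
    # The root is not an object (e.g. a list); report its type only.
    return {"<root>": type(index).__name__}
```

For example, `summarize_index("slakh_index_2100-yourmt3-16k.json")` lists the top-level keys of the index file provided with this record, assuming it has been downloaded to the working directory.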

Citing YourMT3:

@misc{sungkyun_chang_2022_7470191,
  author = {Sungkyun Chang and Simon Dixon and Emmanouil Benetos},
  title = {{YourMT3: a toolkit for training multi-task and multi-track music transcription model for everyone}},
  month = dec,
  year = 2022,
  note = {{(Poster) Presented at DMRN+17: Digital Music Research Network One-day Workshop 2022}},
  publisher = {Zenodo},
  doi = {10.5281/zenodo.7470191},
  url = {https://doi.org/10.5281/zenodo.7470191}
}

 

About Slakh2100:

The Synthesized Lakh (Slakh) Dataset is a dataset of multi-track audio and aligned MIDI for music source separation and multi-instrument automatic transcription. Individual MIDI tracks are synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments, and the resulting audio is mixed together to make musical mixtures. This release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying, aligned MIDI files, synthesized from 187 instrument patches categorized into 34 classes, totaling 145 hours of mixture data.

Citing Slakh & MIRData:

@inproceedings{manilow2019cutting,
  title={Cutting Music Source Separation Some {Slakh}: A Dataset to Study the Impact of Training Data Quality and Quantity},
  author={Manilow, Ethan and Wichern, Gordon and Seetharaman, Prem and Le Roux, Jonathan},
  booktitle={Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
  year={2019},
  organization={IEEE}
}
@inproceedings{bittner_fuentes_2019,
  title={mirdata: Software for Reproducible Usage of Datasets},
  author={Bittner, Rachel M and Fuentes, Magdalena and Rubinstein, David and Jansson, Andreas and Choi, Keunwoo and Kell, Thor},
  booktitle={International Society for Music Information Retrieval (ISMIR) Conference},
  year={2019}
}

Acknowledgement:

We thank the Zenodo team for allowing us additional storage.

Files

slakh_index_2100-yourmt3-16k.json

Files (82.8 GB)

Checksum                              Size
md5:c44f9bcba07b3c6ddeaf604f45dc61c5  82.8 GB
md5:e6cbe693ae7ca1e5d8fa71abedb94e6c  70 Bytes
md5:fab898bd82827ddc4c3e4dbd7b7fcbd9  9.0 MB
md5:e40ed8e694b9c79b0a6baa9b680f8d6b  75 Bytes
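A downloaded file can be checked against the md5 checksums listed above. The following is a minimal sketch using Python's `hashlib`; the helper names are illustrative, and the file is read in chunks so that the 82.8 GB file does not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:c44f...'.

    The optional 'md5:' prefix used in the listing above is stripped first.
    """
    expected_hex = expected.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected_hex
```

For the largest file, `verify_md5(local_path, "md5:c44f9bcba07b3c6ddeaf604f45dc61c5")` should return `True` after a complete download, where `local_path` is wherever the file was saved.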

Additional details

Related works

Is derived from
Dataset: 10.5281/zenodo.4599666 (DOI)