Published October 20, 2019 | Version slack2100-redux
Dataset Open

Slakh2100

  • 1. Northwestern University
  • 2. Mitsubishi Electric Research Labs

Description

Introduction:

The Synthesized Lakh (Slakh) Dataset is a dataset of multi-track audio and aligned MIDI for music source separation and multi-instrument automatic transcription. Individual MIDI tracks are synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments, and the resulting audio is mixed together to make musical mixtures. This release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying, aligned MIDI files, synthesized from 187 instrument patches categorized into 34 classes, totaling 145 hours of mixture data.

 

At a Glance:

  • The dataset comes as a series of directories named like TrackXXXXX, where XXXXX is a number between 00001 and 02100. This number is the ID of the track. Each Track directory contains exactly 1 mixture, a variable number of audio files for each source that made the mixture, and the MIDI files that were used to synthesize each source. The directory structure is shown here.
  • All audio in Slakh2100 is distributed in the .flac format. Scripts to batch convert are here.
  • All audio is mono and was rendered at 44.1kHz, 16-bit (CD quality) before being converted to .flac.
  • Slakh2100 is a 105 GB download. Unzipped and converted to .wav, Slakh2100 is almost 500 GB. Please plan accordingly.
  • Each mixture has a variable number of sources, with a minimum of 4 sources per mix.
  • Every mix has at least 1 instance of each of the following instrument types: Piano, Guitar, Drums, Bass.
  • metadata.yaml has detailed information about each source. Details about the metadata are here.
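The track naming scheme described above can be sketched in Python. This is a minimal sketch using only the standard library. The helper names `track_dir_name` and `find_tracks` are hypothetical (they are not part of the Slakh support code), and the sketch assumes only what is stated above: directories named TrackXXXXX, with IDs between 00001 and 02100.

```python
import re
from pathlib import Path

# Matches directory names such as "Track00001"; the five digits are the track ID.
TRACK_PATTERN = re.compile(r"^Track(\d{5})$")
MIN_ID, MAX_ID = 1, 2100  # ID range stated in the dataset description


def track_dir_name(track_id: int) -> str:
    """Build the directory name for a track ID (hypothetical helper)."""
    if not MIN_ID <= track_id <= MAX_ID:
        raise ValueError(f"track ID must be in [{MIN_ID}, {MAX_ID}], got {track_id}")
    return f"Track{track_id:05d}"


def find_tracks(directory) -> dict:
    """Map track ID to path for each TrackXXXXX directory directly inside `directory`."""
    tracks = {}
    for entry in sorted(Path(directory).iterdir()):
        match = TRACK_PATTERN.match(entry.name)
        if match is None or not entry.is_dir():
            continue  # skip files and anything not named like a track
        track_id = int(match.group(1))
        if MIN_ID <= track_id <= MAX_ID:
            tracks[track_id] = entry
    return tracks
```

Calling `find_tracks` on a directory that holds track folders returns a dictionary keyed by integer track ID, which can then be used to open the mixture, source audio, and MIDI files for each track.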

 

Helpful Links:

For more information, see www.slakh.com.

Support code for Slakh: Available here.

Code to render Slakh data: Available in this repo.

See the dataset at a glance, and info about metadata.yaml.

A tiny subset of Slakh2100, called BabySlakh, is also available for prototyping and debugging.

 

Important Info about Splits:

The original release of Slakh2100 was found to have many duplicate MIDI files, some of which are present in more than one of the train/test/validation splits. Even though each song is rendered with a random set of synthesizers, the same versions of the songs appear more than once: same MIDI, different audio files. This can be an issue for some uses of Slakh2100, e.g., automatic transcription.

The version of Slakh hosted here on Zenodo contains a directory called omitted, to which the tracks that have duplicated MIDI files have been moved. We recommend that you do not use track directories in the omitted directory if you plan on training an automatic transcription system. This version of Slakh is called Slakh2100-redux. For information about how to create other splits from this version, see https://github.com/ethman/slakh-utils/tree/master/splits
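The recommendation above can be sketched as a small filter that collects track directories while skipping anything inside omitted. This is a minimal sketch, not part of slakh-utils. The function name `usable_track_dirs` is hypothetical, and where the omitted directory sits relative to the dataset root is an assumption, so the sketch skips a directory with that name at any depth.

```python
from pathlib import Path


def usable_track_dirs(root, omitted_name="omitted"):
    """Return Track* directories under `root`, excluding those inside `omitted_name`.

    Assumption: duplicated tracks live somewhere below a directory literally
    named "omitted"; its exact location under `root` is not assumed.
    """
    root = Path(root)
    kept = []
    for path in sorted(root.rglob("Track*")):
        if not path.is_dir():
            continue  # ignore stray files whose names start with "Track"
        parent_parts = path.relative_to(root).parts[:-1]
        if omitted_name in parent_parts:
            continue  # track was moved aside because its MIDI is duplicated
        kept.append(path)
    return kept
```

For example, `usable_track_dirs("/data/slakh2100")` (a placeholder path) would return every track directory in the remaining splits and none of the omitted ones.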

 

Citing Slakh:

If you use Slakh2100 or generate data using the same method, we ask that you cite it using the following BibTeX entry:

@inproceedings{manilow2019cutting,
  title={Cutting Music Source Separation Some {Slakh}: A Dataset to Study the Impact of Training Data Quality and Quantity},
  author={Manilow, Ethan and Wichern, Gordon and Seetharaman, Prem and Le Roux, Jonathan},
  booktitle={Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
  year={2019},
  organization={IEEE}
}

 

Files (104.3 GB)

md5:f4b71b6c45ac9b506f59788456b3f0c4 (104.3 GB)
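
Because the download is large, it may be worth checking it against the MD5 checksum listed above before unpacking. The following is a minimal sketch using Python's standard `hashlib` module. The function names and the archive path in the usage note are hypothetical; only the checksum value comes from this page.

```python
import hashlib

# MD5 checksum listed for the download on this page.
EXPECTED_MD5 = "f4b71b6c45ac9b506f59788456b3f0c4"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use constant, even for a ~100 GB archive.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

A call such as `matches_expected("path/to/downloaded_archive")` (a placeholder path) returns `True` only if the file on disk matches the published checksum.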