Divide and Remaster (DnR)

Petermann, Darius; Wichern, Gordon; Wang, Zhong-Qiu; Le Roux, Jonathan

doi:10.5281/zenodo.6949108

Published October 17, 2021 | Version 2.0

1. Indiana University, Department of Intelligent Systems Engineering
2. Mitsubishi Electric Research Laboratories

Introduction:

Divide and Remaster (DnR) is a source separation dataset for training and testing algorithms that separate a monaural audio signal into speech, music, and sound effects/background stems. The dataset is composed of artificial mixtures built from audio in LibriSpeech, the Free Music Archive (FMA), and the Freesound Dataset 50k (FSD50K). We introduce it as part of the Cocktail Fork Problem paper.

At a Glance:

The size of the unzipped dataset is ~200GB
Each mixture is 60 seconds long and sources are not fully overlapped
Audio is encoded as 32-bit .wav files at a sampling rate of 44.1 kHz
The data is split into training `tr` (3406 mixtures), validation `cv` (487 mixtures), and testing `tt` (973 mixtures) subsets
The directory for each mixture contains four .wav files (mix.wav, music.wav, speech.wav, and sfx.wav) and annots.csv, which contains the metadata for the original audio used to compose the mixture (transcriptions for speech, sound classes for sfx, and genre labels for music)
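
The layout described above can be sanity-checked after extraction with a short script. The following is only a sketch: it assumes that `tr`, `cv`, and `tt` are subdirectories of the extracted dataset root and that each contains one directory per mixture, which this page does not state explicitly. The function names are hypothetical helpers, not part of any DnR tooling.

```python
from pathlib import Path

# Subset names and per-mixture file names as listed in "At a Glance".
SUBSETS = ("tr", "cv", "tt")
EXPECTED_FILES = ("mix.wav", "music.wav", "speech.wav", "sfx.wav", "annots.csv")


def count_mixtures(root):
    """Count mixture directories per subset (assumed layout: root/<subset>/<mixture>/)."""
    counts = {}
    for subset in SUBSETS:
        subset_dir = Path(root) / subset
        if subset_dir.is_dir():
            counts[subset] = sum(1 for p in subset_dir.iterdir() if p.is_dir())
        else:
            counts[subset] = 0
    return counts


def find_incomplete_mixtures(root):
    """Map each mixture directory that lacks expected files to the missing names."""
    incomplete = {}
    for subset in SUBSETS:
        subset_dir = Path(root) / subset
        if not subset_dir.is_dir():
            continue
        for mixture_dir in sorted(p for p in subset_dir.iterdir() if p.is_dir()):
            missing = [
                name for name in EXPECTED_FILES
                if not (mixture_dir / name).is_file()
            ]
            if missing:
                incomplete[str(mixture_dir)] = missing
    return incomplete
```

If the assumed layout holds, `count_mixtures` should report 3406, 487, and 973 mixtures for `tr`, `cv`, and `tt`, and `find_incomplete_mixtures` should return an empty dictionary.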

Other Resources:

Demo examples and additional information are available at: https://cocktail-fork.github.io/

For more details about the data generation process, see the code used to generate the dataset: https://github.com/darius522/dnr-utils

Fix Introduced in V2:

V1 contained errors in the speech annotations, which in some cases did not match their associated audio utterances. This was caused by some of the speech utterances being truncated to fit the length of their associated DnR mixtures, while their transcriptions were reported in their entirety. To address the issue, we no longer allow any utterance to be truncated during the DnR creation process. Since the number of test-set mixtures is determined such that we exhaust all utterances from the LibriSpeech TEST-CLEAN set twice, the test set grew in V2 (from 652 to 973 mixtures). To maintain the same split proportions, the training and validation sets were enlarged accordingly (to 3406 and 487 mixtures, respectively).

All results in the camera-ready paper have also been updated to reflect these changes and were obtained using DnR V2.

In this version, we also split the dataset into smaller chunks to ease the download process.
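
As a quick check on the V2 split sizes quoted above, the proportions and total audio duration can be worked out from the mixture counts and the 60-second mixture length. This is a small illustrative calculation; the helper names are made up for the example.

```python
# V2 mixture counts per subset and the mixture length, as stated on this page.
SPLIT_SIZES = {"tr": 3406, "cv": 487, "tt": 973}
MIXTURE_SECONDS = 60


def split_proportions(sizes):
    """Return each subset's share of the total number of mixtures."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


def total_hours(sizes, seconds_per_mixture=MIXTURE_SECONDS):
    """Return the total duration of all mixtures, in hours."""
    return sum(sizes.values()) * seconds_per_mixture / 3600


if __name__ == "__main__":
    for name, share in split_proportions(SPLIT_SIZES).items():
        print(f"{name}: {share:.1%}")
    print(f"total: {total_hours(SPLIT_SIZES):.1f} hours")
```

With these counts, the split comes out at roughly 70% training, 10% validation, and 20% testing, for about 81 hours of mixture audio in total.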

Contact and Support:

Have an issue, concern, or question about DnR? If so, please open an issue here.

For any other inquiries, feel free to shoot an email at: firstname.lastname@gmail.com, my name is Darius Petermann ;)

Download:

DnR V2 is split into smaller ~10 GB chunks to ease the download process. First, download all the chunks into a single directory:

dnr_v2.tar.gz.00
dnr_v2.tar.gz.01
dnr_v2.tar.gz.02
dnr_v2.tar.gz.03
dnr_v2.tar.gz.04
dnr_v2.tar.gz.05
dnr_v2.tar.gz.06
dnr_v2.tar.gz.07
dnr_v2.tar.gz.08
dnr_v2.tar.gz.09
dnr_v2.tar.gz.10

From the same directory, run the following to concatenate all the chunks into a single .tar.gz file:

cat dnr_v2.tar.gz.* >dnr_v2.tar.gz

Finally, untar the resulting file:

tar -xf dnr_v2.tar.gz
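
On systems where `cat` and `tar` are not available, the same two steps can be approximated in Python. The sketch below is not part of the official instructions: `join_chunks` and `extract_archive` are hypothetical helper names, and the code assumes the chunks keep their original two-digit suffixes so that sorting by name gives the right order.

```python
import shutil
import tarfile
from pathlib import Path


def join_chunks(directory, archive_name="dnr_v2.tar.gz"):
    """Concatenate <archive_name>.00, .01, ... into one archive, in name order."""
    directory = Path(directory)
    chunks = sorted(directory.glob(archive_name + ".[0-9][0-9]"))
    if not chunks:
        raise FileNotFoundError(f"no chunks named {archive_name}.NN in {directory}")
    target = directory / archive_name
    with open(target, "wb") as out:
        for chunk in chunks:
            with open(chunk, "rb") as src:
                # Stream the chunk in blocks so it never has to fit in memory.
                shutil.copyfileobj(src, out)
    return target


def extract_archive(archive, destination):
    """Extract the joined archive into destination, creating it if needed."""
    Path(destination).mkdir(parents=True, exist_ok=True)
    # "r:*" lets tarfile detect the compression on its own.
    with tarfile.open(archive, "r:*") as tar:
        if hasattr(tarfile, "data_filter"):
            # Safer extraction filter, available in recent Python releases.
            tar.extractall(destination, filter="data")
        else:
            tar.extractall(destination)
```

A typical call would be `extract_archive(join_chunks("."), ".")` from the directory holding the chunks. Note that the joined archive takes as much disk space as all the chunks combined, on top of the ~200GB needed for the extracted data.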

Citation:

If you use DnR, please cite our paper, in which we introduce the dataset as part of the Cocktail Fork Problem:

@INPROCEEDINGS{petermann2021cfp,
  author={Petermann, Darius and Wichern, Gordon and Wang, Zhong-Qiu and Le Roux, Jonathan},
  booktitle={ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, 
  title={The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks}, 
  year={2022},
  volume={},
  number={},
  pages={526-530},
  doi={10.1109/ICASSP43922.2022.9746005}
}

Files (116.4 GB)

Name	MD5	Size
dnr_v2.tar.gz.00	c4d200e054e2b80b82bdaf70bd9927a2	10.7 GB
dnr_v2.tar.gz.01	f5ad60624a69c1760de6289a0e4f391a	10.7 GB
dnr_v2.tar.gz.02	41cd14f12b90a6347afac1e3fad8506a	10.7 GB
dnr_v2.tar.gz.03	cf5ec720d68f90ef461fe1713118913b	10.7 GB
dnr_v2.tar.gz.04	8f56a37b39ce65df6aca66fc07159d77	10.7 GB
dnr_v2.tar.gz.05	b6d4bc7a2592bf09e5985beb2492cbd9	10.7 GB
dnr_v2.tar.gz.06	23e9ce5861fa33b1667ba303cf759e0e	10.7 GB
dnr_v2.tar.gz.07	8e8c115fcd083bc4215e51e36709e6a1	10.7 GB
dnr_v2.tar.gz.08	8c3d65ebf197abb5054edfc8e6f77eb8	10.7 GB
dnr_v2.tar.gz.09	eabe1367b91c12c3ccb9d160ddcf2aff	10.7 GB
dnr_v2.tar.gz.10	e06ae1858761341365579de3a1228dc1	9.0 GB
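
Because each chunk is large, it may be worth checking the downloads against the MD5 digests listed above before concatenating them. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the digests are copied from the listing.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing on this page.
EXPECTED_MD5 = {
    "dnr_v2.tar.gz.00": "c4d200e054e2b80b82bdaf70bd9927a2",
    "dnr_v2.tar.gz.01": "f5ad60624a69c1760de6289a0e4f391a",
    "dnr_v2.tar.gz.02": "41cd14f12b90a6347afac1e3fad8506a",
    "dnr_v2.tar.gz.03": "cf5ec720d68f90ef461fe1713118913b",
    "dnr_v2.tar.gz.04": "8f56a37b39ce65df6aca66fc07159d77",
    "dnr_v2.tar.gz.05": "b6d4bc7a2592bf09e5985beb2492cbd9",
    "dnr_v2.tar.gz.06": "23e9ce5861fa33b1667ba303cf759e0e",
    "dnr_v2.tar.gz.07": "8e8c115fcd083bc4215e51e36709e6a1",
    "dnr_v2.tar.gz.08": "8c3d65ebf197abb5054edfc8e6f77eb8",
    "dnr_v2.tar.gz.09": "eabe1367b91c12c3ccb9d160ddcf2aff",
    "dnr_v2.tar.gz.10": "e06ae1858761341365579de3a1228dc1",
}


def md5_of_file(path, block_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it block by block."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def find_bad_chunks(directory, expected=EXPECTED_MD5):
    """Return the names of chunks that are missing or whose digest does not match."""
    bad = []
    for name, want in sorted(expected.items()):
        path = Path(directory) / name
        if not path.is_file() or md5_of_file(path) != want:
            bad.append(name)
    return bad
```

Calling `find_bad_chunks(".")` from the download directory returns an empty list when all eleven chunks are present and intact; any name it returns is a candidate for re-downloading.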
