Published November 18, 2020 | Version v1
Dataset Open

Test dataset for separation of speech, traffic sounds, wind noise, and general sounds

Description

The dataset was generated as part of the paper:
Deep Complex U-Net Ensemble for Outdoor Urban Sound Source Separation,
K. Arendt, A. Szumaczuk, B. Jasik, P. Masztalski, K. Piaskowski, M. Matuszewski, K. Nowicki, P. Zborowski.

It contains various sounds from Audio Set [1] and spoken utterances from the VCTK [2] and DNS [3] datasets.

Contents:
sr_8k/
    mix_clean/
    s1/
    s2/
    s3/
    s4/
sr_16k/
    mix_clean/
    s1/
    s2/
    s3/
    s4/
sr_48k/
    mix_clean/
    s1/
    s2/
    s3/
    s4/

Each directory contains 512 audio samples at a different sampling rate (sr_8k - 8 kHz, sr_16k - 16 kHz, sr_48k - 48 kHz).
The audio samples differ between sampling rates, as they were generated randomly and separately for each rate.
Each directory contains 5 subdirectories:
- mix_clean - mixed sources,
- s1 - source #1 (general sounds),
- s2 - source #2 (speech),
- s3 - source #3 (traffic sounds),
- s4 - source #4 (wind noise).
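
As an illustration of the layout above, the following minimal sketch collects, for one sampling rate, the mixture and its four sources. It assumes that the files are WAV files and that a mixture and its sources share the same file name across the five subdirectories; neither is stated in this description. The function name list_examples and its parameters are hypothetical.

```python
from pathlib import Path

# Subdirectory names as listed in the dataset description.
SUBDIRS = ("mix_clean", "s1", "s2", "s3", "s4")


def list_examples(root, rate_dir="sr_16k", pattern="*.wav"):
    """Return one dict per example, mapping subdirectory name to file path.

    Assumption (not confirmed by the dataset description): a mixture and
    its sources share the same file name in all five subdirectories.
    """
    base = Path(root) / rate_dir
    examples = []
    for mix_path in sorted((base / "mix_clean").glob(pattern)):
        paths = {name: base / name / mix_path.name for name in SUBDIRS}
        # Keep only examples for which every expected file is present.
        if all(p.exists() for p in paths.values()):
            examples.append(paths)
    return examples
```

If the file naming differs from this assumption, only the path construction inside the loop needs to change.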

The sound mixtures were generated by adding s2, s3, and s4 to s1 at SNRs ranging from -10 to 10 dB relative to s1.
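
The mixing step can be sketched as follows. This is not the authors' generation code; it is a minimal illustration that assumes the SNR is defined as the power ratio of the added source to s1, i.e. 10*log10(P_source / P_s1), and that all signals have equal length. How the SNR values were drawn from the -10 to 10 dB range is not specified here, so they are passed in as parameters. All function names are hypothetical.

```python
import math


def power(x):
    """Mean power of a signal given as a sequence of floats."""
    return sum(v * v for v in x) / len(x)


def scale_to_snr(reference, source, snr_db):
    """Scale `source` so that 10*log10(P_source / P_reference) == snr_db.

    Assumption: the SNR is measured as source power relative to the
    reference (s1) power; the opposite sign convention is also possible.
    """
    target_power = power(reference) * 10 ** (snr_db / 10)
    gain = math.sqrt(target_power / power(source))
    return [gain * v for v in source]


def mix(s1, others, snrs_db):
    """Add each source in `others` to s1 at the given SNR relative to s1."""
    mixture = list(s1)
    for src, snr_db in zip(others, snrs_db):
        scaled = scale_to_snr(s1, src, snr_db)
        mixture = [m + v for m, v in zip(mixture, scaled)]
    return mixture
```

For example, mix(s1, [s2, s3, s4], [-3.0, 5.0, 0.0]) would add the three sources at -3, 5, and 0 dB relative to s1 under the stated convention.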


REFERENCES:

[1] Jort F. Gemmeke, Daniel P. W. Ellis, Dylan Freedman,
    Aren Jansen, Wade Lawrence, R. Channing Moore,
    Manoj Plakal, and Marvin Ritter, “Audio set: An ontology
    and human-labeled dataset for audio events,” in
    Proc. IEEE ICASSP 2017, New Orleans, LA, 2017.

[2] Christophe Veaux, Junichi Yamagishi, and Kirsten MacDonald,
    “CSTR VCTK corpus: English multi-speaker
    corpus for CSTR voice cloning toolkit, [sound],”
    https://doi.org/10.7488/ds/1994, University of Edinburgh.
    The Centre for Speech Technology Research
    (CSTR). 2017.

[3] Chandan K. A. Reddy, Ebrahim Beyrami, Harishchandra
    Dubey, Vishak Gopal, Roger Cheng, Ross Cutler,
    Sergiy Matusevych, Robert Aichner, Ashkan Aazami,
    Sebastian Braun, Puneet Rana, Sriram Srinivasan, and
    Johannes Gehrke, “The interspeech 2020 deep noise
    suppression challenge: Datasets, subjective speech
    quality and testing framework,” 2020.

Files

multi-dcunet-uss-test-data.zip (2.3 GB)
md5:d733081cfdc3d0f1c601ed6ec511bf94
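
The MD5 checksum listed for the archive can be used to verify a download. The sketch below is one way to do this with the Python standard library; the function name md5_of_file and the local file path are hypothetical, and the file is read in chunks so the 2.3 GB archive does not have to fit in memory.

```python
import hashlib

# Checksum as listed on this page for multi-dcunet-uss-test-data.zip.
EXPECTED_MD5 = "d733081cfdc3d0f1c601ed6ec511bf94"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (assumes the archive was saved under this name in the current directory):
# assert md5_of_file("multi-dcunet-uss-test-data.zip") == EXPECTED_MD5
```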