Test dataset for separation of speech, traffic sounds, wind noise, and general sounds

Krzysztof Arendt; Artur Szumaczuk; Bartłomiej Jasik; Karol Piaskowski; Piotr Masztalski; Mateusz Matuszewski; Konrad Nowicki; Paweł Zborowski

doi:10.5281/zenodo.4279220

Published November 18, 2020 | Version v1

Dataset Open

1. Samsung R&D Institute Poland

The dataset was generated as part of the paper:
Deep Complex U-Net Ensemble for Outdoor Urban Sound Source Separation,
K. Arendt, A. Szumaczuk, B. Jasik, P. Masztalski, K. Piaskowski, M. Matuszewski, K. Nowicki, P. Zborowski.

It contains various sounds from Audio Set [1] and spoken utterances from the VCTK [2] and DNS [3] datasets.

Contents:
sr_8k/
    mix_clean/
    s1/
    s2/
    s3/
    s4/
sr_16k/
    mix_clean/
    s1/
    s2/
    s3/
    s4/
sr_48k/
    mix_clean/
    s1/
    s2/
    s3/
    s4/

Each top-level directory contains 512 audio samples at its own sampling rate (sr_8k - 8 kHz, sr_16k - 16 kHz, sr_48k - 48 kHz).
The audio samples differ between sampling rates because each set was generated randomly and independently.
Each top-level directory contains 5 subdirectories:
- mix_clean - mixed sources,
- s1 - source #1 (general sounds),
- s2 - source #2 (speech),
- s3 - source #3 (traffic sounds),
- s4 - source #4 (wind noise).
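
The layout described above can be walked programmatically. The following is a minimal sketch, not part of the original dataset tooling: the function names and the root directory name passed in are hypothetical, and no file extension is assumed, since the description does not state one.

```python
from pathlib import Path

# Directory names and meanings as listed in the dataset description.
SAMPLING_RATES = {"sr_8k": 8000, "sr_16k": 16000, "sr_48k": 48000}
SUBDIRS = {
    "mix_clean": "mixed sources",
    "s1": "general sounds",
    "s2": "speech",
    "s3": "traffic sounds",
    "s4": "wind noise",
}

def expected_dirs(root):
    """List every <sampling rate>/<subdirectory> path under `root`."""
    root = Path(root)
    return [root / sr / sub for sr in SAMPLING_RATES for sub in SUBDIRS]

def count_files(root):
    """Map each existing expected directory to the number of files in it."""
    root = Path(root)
    counts = {}
    for directory in expected_dirs(root):
        if directory.is_dir():
            key = directory.relative_to(root).as_posix()
            counts[key] = sum(1 for p in directory.iterdir() if p.is_file())
    return counts
```

Calling `count_files` on the extracted archive root (a path you choose) gives a quick check that every subdirectory is present and holds the expected number of files.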

The sound mixtures were generated by adding s2, s3, and s4 to s1 at SNRs ranging from -10 to 10 dB with respect to s1.
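
The mixing step can be illustrated with a short sketch. This is not the authors' generation code. It assumes that the level of each added source is defined as 10*log10(P_source / P_s1), that levels are drawn uniformly from the stated range, and that the signals are non-silent; none of these details are given in the description. Plain Python lists are used to keep the example dependency-free, and all function names are hypothetical.

```python
import math
import random

def power(samples):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in samples) / len(samples)

def scale_to_level(reference, source, level_db):
    """Scale `source` so that 10*log10(P_source / P_reference) == level_db.

    Assumption: the level is measured relative to the reference (s1).
    """
    ratio = power(reference) / power(source)
    gain = math.sqrt(ratio * 10.0 ** (level_db / 10.0))
    return [gain * v for v in source]

def mix_sources(s1, others, rng, low_db=-10.0, high_db=10.0):
    """Add each source in `others` to s1 at a random level in [low_db, high_db]."""
    mixture = list(s1)
    scaled_sources = []
    for source in others:
        level_db = rng.uniform(low_db, high_db)  # assumed uniform sampling
        scaled = scale_to_level(s1, source, level_db)
        scaled_sources.append(scaled)
        mixture = [m + v for m, v in zip(mixture, scaled)]
    return mixture, scaled_sources

# Synthetic stand-ins for s1 and for s2, s3, s4 (not real audio).
rng = random.Random(0)
s1 = [rng.gauss(0.0, 1.0) for _ in range(8000)]
others = [[rng.gauss(0.0, 0.3) for _ in range(8000)] for _ in range(3)]
mixture, scaled = mix_sources(s1, others, rng)
levels_db = [10.0 * math.log10(power(s) / power(s1)) for s in scaled]
```

After running the sketch, every entry of `levels_db` lies within the -10 to 10 dB range, and `mixture` is the sample-wise sum of s1 and the three scaled sources.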

REFERENCES:

[1] Jort F. Gemmeke, Daniel P. W. Ellis, Dylan Freedman, Aren Jansen, Wade Lawrence, R. Channing Moore, Manoj Plakal, and Marvin Ritter, “Audio Set: An ontology and human-labeled dataset for audio events,” in Proc. IEEE ICASSP 2017, New Orleans, LA, 2017.

[2] Christophe Veaux, Junichi Yamagishi, and Kirsten MacDonald, “CSTR VCTK corpus: English multi-speaker corpus for CSTR voice cloning toolkit, [sound],” University of Edinburgh, The Centre for Speech Technology Research (CSTR), 2017. https://doi.org/10.7488/ds/1994

[3] Chandan K. A. Reddy, Ebrahim Beyrami, Harishchandra Dubey, Vishak Gopal, Roger Cheng, Ross Cutler, Sergiy Matusevych, Robert Aichner, Ashkan Aazami, Sebastian Braun, Puneet Rana, Sriram Srinivasan, and Johannes Gehrke, “The INTERSPEECH 2020 deep noise suppression challenge: Datasets, subjective speech quality and testing framework,” 2020.

Files

multi-dcunet-uss-test-data.zip (2.3 GB)
md5:d733081cfdc3d0f1c601ed6ec511bf94
