Auditory Scene Analysis dataset (Multichannel universal sound separation & polyphonic audio classification)
Description
We constructed a new dataset for multichannel universal sound separation (USS) and polyphonic audio classification tasks. The dataset is designed to reflect a variety of conditions, including moving sources with temporal onsets and offsets. For the foreground sound sources, signals from 13 audio classes were selected from open-source databases (Pixabay, FSD50K, LibriSpeech, MUSDB18, and VocalSound). These signals were resampled to 16 kHz and then zero-padded or cropped to 4 seconds. Each sound source has a 75% probability of being a moving source, with speeds ranging from 0 to 3 m/s. Each mixture contains 2 to 4 foreground sound sources plus one background noise signal from the diffuse TAU-SNoise dataset, at a signal-to-noise ratio (SNR) ranging from 6 to 30 dB. The simulations were conducted using gpuRIR. Room width and length were set between 5 and 8 meters and room height between 3 and 4 meters, with reverberation times ranging from 0.2 to 0.6 seconds. These parameters were sampled from uniform distributions. We simulated spatialized sound sources using a 4-channel tetrahedral microphone array with a radius of 4.2 cm. The procedure for dataset generation and details about class configuration and durations of audio clips are provided in the paper. This dataset poses a significant challenge for separation tasks due to the inclusion of moving sources, onset and offset conditions, overlapped in-class sources, and noisy reverberant environments.
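The parameter ranges in the description can be summarized as a small sampling routine. The following is a minimal sketch, not the authors' generation code: the names `SceneConfig`, `sample_scene`, and `pad_or_crop` are hypothetical, the uniform draw over the number of sources is an assumption, and so is the choice to pad at the end and crop from the start of a clip. The actual procedure is the one documented in the paper.

```python
from __future__ import annotations

import random
from dataclasses import dataclass

SAMPLE_RATE = 16_000  # Hz, as stated in the description
CLIP_SECONDS = 4      # clip length in seconds, as stated in the description


@dataclass
class SceneConfig:
    """Hypothetical container for one simulated scene."""
    room_size: tuple[float, float, float]  # width, length, height in meters
    rt60: float                            # reverberation time in seconds
    snr_db: float                          # SNR of the background noise in dB
    source_speeds: list[float]             # m/s per source; 0.0 means static


def sample_scene(rng: random.Random) -> SceneConfig:
    """Draw one scene configuration from the ranges given in the description."""
    # Assumption: the source count is drawn uniformly from {2, 3, 4}.
    n_sources = rng.randint(2, 4)
    speeds = [
        # 75% chance that a source moves, with speed in [0, 3] m/s.
        rng.uniform(0.0, 3.0) if rng.random() < 0.75 else 0.0
        for _ in range(n_sources)
    ]
    return SceneConfig(
        room_size=(
            rng.uniform(5.0, 8.0),  # width
            rng.uniform(5.0, 8.0),  # length
            rng.uniform(3.0, 4.0),  # height
        ),
        rt60=rng.uniform(0.2, 0.6),
        snr_db=rng.uniform(6.0, 30.0),
        source_speeds=speeds,
    )


def pad_or_crop(samples: list[float],
                length: int = SAMPLE_RATE * CLIP_SECONDS) -> list[float]:
    """Return exactly `length` samples.

    Assumption: zeros are appended at the end and cropping keeps the start.
    """
    if len(samples) >= length:
        return samples[:length]
    return samples + [0.0] * (length - len(samples))


scene = sample_scene(random.Random(0))  # fixed seed for repeatability
clip = pad_or_crop([0.1] * 10_000)      # short clip padded to 4 s at 16 kHz
```

A plain list keeps the sketch dependency-free. A real pipeline would more likely operate on arrays and pass the sampled room size and reverberation time to the room simulator.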
Files
Name | Size |
---|---|
ASA_20k_4s_nspk2-4.zip (md5:2608279e515791fee2568ce66d1fb438) | 34.9 GB |
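Because the archive is large, it may be worth checking the download against the listed MD5 checksum before extracting it. The sketch below uses only the Python standard library. The helper names `md5_of_file` and `verify_archive` are hypothetical, and the local path is an assumption about where the file was saved.

```python
import hashlib
from pathlib import Path

# Checksum listed for ASA_20k_4s_nspk2-4.zip on this page.
EXPECTED_MD5 = "2608279e515791fee2568ce66d1fb438"


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in 1 MiB chunks so the archive is never fully in memory."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path: Path, expected: str = EXPECTED_MD5) -> bool:
    """Return True if the file's MD5 matches the expected hex digest."""
    return md5_of_file(path) == expected.lower()


# Assumption: the archive was saved to the current working directory.
archive = Path("ASA_20k_4s_nspk2-4.zip")
if archive.exists():
    print("checksum OK" if verify_archive(archive) else "checksum mismatch")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the archive again is the simplest fix.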
Additional details
Dates
- Available: 2024-09-12
Software
- Repository URL: https://github.com/donghoney0416/DeFTMamba