Biodenoising validation
Description
Biodenoising_validation is a benchmark dataset for animal vocalization denoising. It contains 62 pairs of clean animal vocalizations and noise excerpts.
We list the data sources in the clean.csv and noise.csv files.
The dataset is provided at two sample rates: 16000 Hz and 44100 Hz. The subfolder for each sample rate contains the clean, noise, and noisy subfolders, along with metadata on the data sources.
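As a quick sanity check after extracting the archive, the layout described above can be verified with a short script. This is a minimal sketch: the sample-rate folder names `16000` and `44100` and the helper name `missing_subfolders` are assumptions made for illustration, so adjust them to match the extracted archive.

```python
from pathlib import Path

# Assumed names of the per-sample-rate folders; not confirmed by the dataset page.
SAMPLE_RATE_DIRS = ("16000", "44100")
# Subfolders named in the description.
SPLITS = ("clean", "noise", "noisy")


def missing_subfolders(root):
    """Return the expected '<sample_rate>/<split>' folders that do not exist under root."""
    root = Path(root)
    return [
        (Path(sr) / split).as_posix()
        for sr in SAMPLE_RATE_DIRS
        for split in SPLITS
        if not (root / sr / split).is_dir()
    ]
```

An empty list from `missing_subfolders("path/to/extracted/archive")` means that every expected folder was found.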
Methodology
We programmatically create mixtures by pairing vocalizations with noise at random signal-to-noise ratios (SNRs) drawn from a uniform distribution between -5 and 10 dB (2.8 dB on average). To ensure reproducibility, we start with a fixed seed that controls the SNR of the mixtures. The samples are between 1 and 60 seconds long (20.14 seconds on average). We split the vocalizations and noises into two lists: underwater (11 vocalizations and 26 noises) and terrestrial (51 vocalizations and 20 noises). For each case, we sort the vocalizations and the noise samples by duration and pair them in that order, e.g. matching the longest calls with the longest noises.
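The procedure above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the code used to build the dataset: the function names, the default seed, and the choice to loop or trim the noise to the length of the vocalization are all assumptions. The description also does not say how lists of unequal length are handled, so this sketch simply stops at the shorter list.

```python
import math
import random


def mean_power(signal):
    """Mean squared amplitude of a sequence of float samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Scale the noise so the clean-to-noise power ratio equals snr_db, then add it."""
    # Assumption: loop or trim the noise to match the clean excerpt length.
    reps = math.ceil(len(clean) / len(noise))
    noise = (list(noise) * reps)[: len(clean)]
    # Solve 10 * log10(p_clean / (gain**2 * p_noise)) == snr_db for gain.
    # Note: silent (all-zero) noise would cause a division by zero here.
    gain = math.sqrt(mean_power(clean) / (mean_power(noise) * 10 ** (snr_db / 10)))
    scaled = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled)]
    return noisy, scaled


def pair_by_duration(vocalizations, noises):
    """Pair (name, samples) items longest-with-longest, stopping at the shorter list."""
    def by_length(item):
        return len(item[1])

    voc = sorted(vocalizations, key=by_length, reverse=True)
    noi = sorted(noises, key=by_length, reverse=True)
    return list(zip(voc, noi))


def make_mixtures(vocalizations, noises, seed=0, snr_range=(-5.0, 10.0)):
    """Create one mixture per pair, with the SNR drawn uniformly from snr_range."""
    rng = random.Random(seed)  # fixed seed so the SNR sequence is reproducible
    mixtures = []
    for (v_name, v), (n_name, n) in pair_by_duration(vocalizations, noises):
        snr_db = rng.uniform(*snr_range)
        noisy, _ = mix_at_snr(v, n, snr_db)
        mixtures.append((v_name, n_name, snr_db, noisy))
    return mixtures
```

In this sketch, `make_mixtures` would be called once for the underwater lists and once for the terrestrial lists, mirroring the split described above.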
Citation
Miron, Marius, Sara Keen, Jen-Yu Liu, Benjamin Hoffman, Masato Hagiwara, Olivier Pietquin, Felix Effenberger, Maddie Cusimano, "Biodenoising: animal vocalization denoising without access to clean data,"
License
This dataset is provided for educational purposes only, and the material contained in it should not be used for any commercial purpose without the express permission of the copyright holders.
Contact
info@mariusmiron.com
Files
biodenoising_validation_1.0.zip (706.2 MB)
md5:05648b477b73f0e71cf98441a98630ef
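The listed MD5 checksum can be used to confirm that the download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `file_md5` and `verify_archive` are chosen for illustration, and the archive path is assumed to be wherever the file was saved.

```python
import hashlib

# Checksum listed for biodenoising_validation_1.0.zip.
EXPECTED_MD5 = "05648b477b73f0e71cf98441a98630ef"


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return file_md5(path) == expected
```

For example, `verify_archive("biodenoising_validation_1.0.zip")` should return `True` for an intact download saved under that name in the working directory.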
Additional details
Additional titles
- Alternative title
- Biodenoising: animal vocalization denoising without access to clean data
Software
- Repository URL
- https://github.com/earthspecies/biodenoising-datasets
- Programming language
- Python