NIGENS general sound events database
- 1. Technische Universität Berlin
Description
NIGENS (Neural Information Processing group GENeral sounds) is a database for sound-related modeling in computational auditory scene analysis, particularly sound event detection. It emerged from the Two!Ears project.
It contains 1017 wav files of various lengths (between 1 s and 5 min), comprising 4 h 46 min of sound material in total. Most sounds are provided with 32-bit precision and a 44100 Hz sampling rate. The files contain sound events in isolation, i.e. without superposition of ambient or other foreground sources.
Fourteen distinct sound classes are included: alarm, crying baby, crash, barking dog, running engine, burning fire, footsteps, knocking on door, female and male speech, female and male scream, ringing phone, piano. Additionally, there is the general (“anything else”) class. Care has been taken to select sound classes representing different features, like noise-like or pronounced, discrete or continuous.
The general class is a pool of sound events different from the 14 distinguished target sound classes, containing sounds that are as heterogeneous as possible (303 in total). For example, it includes nature sounds such as wind, rain, or animals, sounds from human-made environments such as honks, doors, or guns, as well as human sounds like coughs. These sounds are intended both as “disturbance” sound events (superposing) and as counterexamples to target sound classes.
Wav files are accompanied by annotation (.txt) files that include perceptual on- and offset times of the file's sound events.
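As an illustration of how such annotations might be consumed, here is a minimal Python sketch. The file layout is an assumption, not something stated in this description: the sketch supposes one sound event per line, with the onset time followed by the offset time in seconds, separated by whitespace. The function names parse_onsets_offsets and total_event_duration are hypothetical helpers introduced only for this example; check an actual annotation file before relying on it.

```python
from typing import List, Tuple


def parse_onsets_offsets(text: str) -> List[Tuple[float, float]]:
    """Parse annotation text into (onset, offset) pairs.

    ASSUMPTION (not confirmed by the database description): each non-empty
    line holds one event as "<onset> <offset>" in seconds, separated by
    spaces or tabs.
    """
    events = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) < 2:
            raise ValueError(f"line {line_no}: expected onset and offset")
        onset, offset = float(fields[0]), float(fields[1])
        if offset < onset:
            raise ValueError(f"line {line_no}: offset precedes onset")
        events.append((onset, offset))
    return events


def total_event_duration(events: List[Tuple[float, float]]) -> float:
    """Sum the lengths of all annotated events."""
    return sum(offset - onset for onset, offset in events)
```

To use it on a file, read the annotation file's contents as text and pass the string to parse_onsets_offsets; the resulting pairs can then be used to cut the active event segments out of the matching wav file.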
You are free to use this database non-commercially under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 license.
If you use this data set, please cite as:
Ivo Trowitzsch, Jalil Taghia, Youssef Kashef, and Klaus Obermayer (2019). The NIGENS general sound events database. Technische Universität Berlin, Tech. Rep. arXiv:1902.08314 [cs.SD]
In [1], we have developed and analyzed a robust binaural sound event detection training scheme using NIGENS. In [2], we have extended it to join sound event detection and localization through spatial segregation.
[1] Trowitzsch, I., Mohr, J., Kashef, Y., Obermayer, K. (2017). Robust detection of environmental sounds in binaural auditory scenes. IEEE/ACM Transactions on Audio, Speech, and Language Processing 25(6).
[2] Trowitzsch, I., Schymura, C., Kolossa, D., Obermayer, K. (2019). Joining Sound Event Detection and Localization Through Spatial Segregation. Accepted for publication in IEEE/ACM Transactions on Audio, Speech, and Language Processing. DOI: 10.1109/TASLP.2019.2958408. E-Preprint: arXiv:1904.00055 [cs.SD].
Files
NIGENS.zip (2.2 GB)
md5:939fb4893015cc1434ad47bbd0ceb6b9
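The listed MD5 checksum can be used to confirm that the archive downloaded intact. The following is a minimal Python sketch using only the standard library. The helper names md5_of_file and matches_listed_checksum are hypothetical, and the local path of the downloaded archive is whatever you chose when saving it.

```python
import hashlib

# Checksum as listed for NIGENS.zip on this record.
EXPECTED_MD5 = "939fb4893015cc1434ad47bbd0ceb6b9"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks
    so that a multi-gigabyte archive is never held in memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Compare a file's MD5 digest with the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

Calling matches_listed_checksum with the path to your local copy of NIGENS.zip should return True for an uncorrupted download; a False result suggests the transfer should be repeated.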
Additional details
Related works
- Cites
  - 10.5281/zenodo.597348 (DOI)
  - 10.1109/taslp.2017.2690573 (DOI)
- Is cited by
  - https://github.com/nigroup/Supplementaries-to-TASLP-SELD-Spatial-Segregation (URL)
  - arXiv:1902.08314 (arXiv)
  - arXiv:1904.00055 (arXiv)