DCASE 2022 Task 5: Few-shot Bioacoustic Event Detection Development Set
Creators
- 1. Queen Mary University of London (QMUL)
- 2. University of Konstanz & Max Planck Institute of Animal Behavior
- 3. Biotopia, Naturkundemuseum Bayern
- 4. AGH University of Science and Technology
- 5. University of Oxford
- 6. Syracuse University
- 7. University of Salford
- 8. University of Surrey
- 9. La Salle, Universitat Ramon Llull
- 10. Centre National de la Recherche Scientifique (CNRS)
- 11. Tilburg University & Naturalis Biodiversity Centre
Description
General Description:
The development set for task 5 of DCASE 2022 "Few-shot Bioacoustic Event Detection" consists of 192 audio files acquired from different bioacoustic sources. The dataset is split into training and validation sets.
Multi-class annotations are provided for the training set, with positive (POS), negative (NEG) and unknown (UNK) values for each class. UNK indicates uncertainty about a class.
Single-class annotations are provided for the validation set: events of the class of interest are marked as positive (POS) or unknown (UNK).
This version (3):
* fixes issues with annotations from HB set
Folder Structure:
Development_Set.zip
|_Development_Set/
|__Training_Set/
|___JD/
|____*.wav
|____*.csv
|___HT/
|____*.wav
|____*.csv
|___BV/
|____*.wav
|____*.csv
|___MT/
|____*.wav
|____*.csv
|___WMW/
|____*.wav
|____*.csv
|__Validation_Set/
|___HB/
|____*.wav
|____*.csv
|___PB/
|____*.wav
|____*.csv
|___ME/
|____*.wav
|____*.csv
Development_Set_Annotations.zip has the same structure but contains only the *.csv files.
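The folder layout above can be traversed with a few lines of Python. The following is a minimal sketch, not an official loader: the function name `list_recordings` is hypothetical, and it assumes that each annotation file shares its file stem with the recording it describes (e.g. `x.wav` and `x.csv`), which the listing above suggests but does not state.

```python
from pathlib import Path


def list_recordings(root):
    """Yield (split, subset, wav_path, csv_path) for each recording under root.

    `root` is the extracted Development_Set/ folder. csv_path is None when no
    annotation file with the same stem is found (assumed naming convention).
    """
    root = Path(root)
    for split in ("Training_Set", "Validation_Set"):
        # Sub-folders such as JD/, HT/, HB/ sit one level below each split.
        for wav in sorted((root / split).glob("*/*.wav")):
            candidate = wav.with_suffix(".csv")
            csv_path = candidate if candidate.exists() else None
            yield split, wav.parent.name, wav, csv_path
```

Calling `list(list_recordings("Development_Set"))` after unzipping should give one tuple per audio file, which makes it easy to check the per-folder recording counts reported in the statistics.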
## Dataset statistics
Some statistics on this dataset are as follows, split between training and validation set and their sub-folders:
-----------------------------------------------------
TRAINING SET
-----------------------------------------------------
Number of audio recordings | 174
Total duration | 21 hours
Total classes | 47
Total events | 14229
-----------------------------------------------------
TRAINING SET/BV
-----------------------------------------------------
Number of audio recordings | 5
Total duration | 10 hours
Total classes | 11
Total events | 9026
Ratio event/duration | 0.04
Sampling rate | 24000 Hz
-----------------------------------------------------
TRAINING SET/HT
-----------------------------------------------------
Number of audio recordings | 5
Total duration | 5 hours
Total classes | 5
Total events | 611
Ratio event/duration | 0.05
Sampling rate | 6000 Hz
-----------------------------------------------------
TRAINING SET/JD
-----------------------------------------------------
Number of audio recordings | 1
Total duration | 10 mins
Total classes | 1
Total events | 357
Ratio event/duration | 0.06
Sampling rate | 22050 Hz
-----------------------------------------------------
TRAINING SET/MT
-----------------------------------------------------
Number of audio recordings | 2
Total duration | 1 hour and 10 mins
Total classes | 4
Total events | 1294
Ratio event/duration | 0.04
Sampling rate | 8000 Hz
-----------------------------------------------------
TRAINING SET/WMW
-----------------------------------------------------
Number of audio recordings | 161
Total duration | 4 hours and 40 mins
Total classes | 26
Total events | 2941
Ratio event/duration | 0.24
Sampling rate | various sampling rates
-----------------------------------------------------
-----------------------------------------------------
VALIDATION SET
-----------------------------------------------------
Number of audio recordings | 18
Total duration | 5 hours and 57 minutes
Total classes | 5
Total events | 1077
-----------------------------------------------------
VALIDATION SET/HB
-----------------------------------------------------
Number of audio recordings | 10
Total duration | 2 hours and 38 minutes
Total classes | 1
Total events | 712
Ratio event/duration | 0.7
Sampling rate | 44100 Hz
-----------------------------------------------------
VALIDATION SET/PB
-----------------------------------------------------
Number of audio recordings | 6
Total duration | 3 hours
Total classes | 2
Total events | 292
Ratio event/duration | 0.003
Sampling rate | 44100 Hz
-----------------------------------------------------
VALIDATION SET/ME
-----------------------------------------------------
Number of audio recordings | 2
Total duration | 20 minutes
Total classes | 2
Total events | 73
Ratio event/duration | 0.01
Sampling rate | 44100 Hz
-----------------------------------------------------
Annotation structure
Each line of the annotation csv represents an event in the audio file. The column descriptions are as follows:
TRAINING SET
---------------------
Audiofilename, Starttime, Endtime, CLASS_1, CLASS_2, ...CLASS_N
VALIDATION SET
---------------------
Audiofilename, Starttime, Endtime, Q
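As an illustration of the column layout, the sketch below reads an annotation file and collects the time spans of events carrying a given label for every class column. It works for both layouts because any column other than the three fixed ones is treated as a class column (including `Q` in the validation set). The function name `read_events` is hypothetical, the sample rows are invented for demonstration, and the code assumes that `Starttime` and `Endtime` are decimal numbers.

```python
import csv
import io

FIXED_COLUMNS = {"Audiofilename", "Starttime", "Endtime"}


def read_events(csv_file, label="POS"):
    """Return {class_column: [(start, end), ...]} for rows marked with `label`."""
    # skipinitialspace tolerates headers written as "Audiofilename, Starttime, ...".
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    events = {}
    for row in reader:
        for column, value in row.items():
            if column in FIXED_COLUMNS or value != label:
                continue
            span = (float(row["Starttime"]), float(row["Endtime"]))
            events.setdefault(column, []).append(span)
    return events


# Hypothetical rows in the training-set layout, not taken from the dataset.
sample = io.StringIO(
    "Audiofilename,Starttime,Endtime,CLASS_1,CLASS_2\n"
    "a.wav,0.50,1.25,POS,NEG\n"
    "a.wav,2.00,2.40,UNK,POS\n"
)
print(read_events(sample))
```

Passing `label="UNK"` returns the uncertain events instead, which can be useful when deciding how to treat them during training or evaluation.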
Classes
DCASE2022_task5_training_set_classes.csv and DCASE2022_task5_validation_set_classes.csv map each class code to its class name for all classes in the Development set.
DCASE2022_task5_training_set_classes.csv
---------------------
dataset, class_code, class_name
DCASE2022_task5_validation_set_classes.csv
---------------------
dataset, recording, class_code, class_name
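A lookup from class code to class name can be built from either file. The sketch below is one possible approach under stated assumptions: `load_class_names` is a hypothetical helper, the sample rows are invented placeholders rather than real codes from the dataset, and the choice of key columns is left to the caller because the validation file has an extra `recording` column.

```python
import csv
import io


def load_class_names(csv_file, key_columns=("dataset", "class_code")):
    """Map a tuple of the chosen key columns to the class_name of each row."""
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    return {
        tuple(row[key] for key in key_columns): row["class_name"]
        for row in reader
    }


# Placeholder rows in the validation-file layout (values are made up).
sample = io.StringIO(
    "dataset, recording, class_code, class_name\n"
    "XX, rec_01, C1, example class A\n"
    "XX, rec_02, C2, example class B\n"
)
names = load_class_names(sample, key_columns=("dataset", "recording", "class_code"))
print(names[("XX", "rec_01", "C1")])
```

For the training file, the default `key_columns` should be enough, since that file has no `recording` column.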
Evaluation Set
The Evaluation set for this task will be released on the 1st of June 2022.
Open Access:
This dataset is available under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Contact info:
Please send any feedback or questions to:
Ines Nolasco - i.dealmeidanolasco@qmul.ac.uk
Files (4.5 GB)
MD5 checksum | Size |
---|---|
md5:abce1818ba10436971bad0b6a3464aa6 | 1.5 kB |
md5:0c05ff0c9e1662ff8958c4c812abffdb | 802 Bytes |
md5:cf4d3540c6c78ac2b3df2026c4f1f7ea | 4.5 GB |
md5:4d1b14db6fde54366ffea0210dbfa57e | 229.1 kB |
md5:6cda1fd2ffd93ab0622e9a786d9696fb | 6.5 kB |