DCASE 2024 Task 5: Few-shot Bioacoustic Event Detection Development Set
Creators
1. Naturalis Biodiversity Center
2. Queen Mary University of London
3. Syracuse University
4. AGH University of Krakow
5. University of Salford
6. University of Konstanz
7. Max Planck Institute of Animal Behavior
8. Biotopia
9. Naturkundemuseum Bayern
10. University of Oxford
11. University of Surrey
12. La Salle, Universitat Ramon Llull
13. Centre National de la Recherche Scientifique
14. Tilburg University
Description
General Description:
The development set for task 5 of DCASE 2024 "Few-shot Bioacoustic Event Detection" consists of 217 audio files acquired from different bioacoustic sources. The dataset is split into training and validation sets.
Multi-class annotations are provided for the training set, with positive (POS), negative (NEG) and unknown (UNK) values for each class. UNK indicates uncertainty about a class.
Single-class (class of interest) annotations are provided for the validation set, with each event marked as positive (POS) or unknown (UNK) for the class of interest.
Folder Structure:
Development_set.zip
|_Development_Set/
|__Training_Set/
|___JD/
|____*.wav
|____*.csv
|___HT/
|____*.wav
|____*.csv
|___BV/
|____*.wav
|____*.csv
|___MT/
|____*.wav
|____*.csv
|___WMW/
|____*.wav
|____*.csv
|__Validation_Set/
|___HB/
|____*.wav
|____*.csv
|___PB/
|____*.wav
|____*.csv
|___ME/
|____*.wav
|____*.csv
|___PB24/
|____*.wav
|____*.csv
|___RD/
|____*.wav
|____*.csv
|___PW/
|____*.wav
|____*.csv
Development_set_annotations.zip has the same structure but contains only the *.csv files.
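As a minimal sketch of how the folder layout above might be traversed, the following Python function collects every *.wav file under a root folder and looks for an annotation file next to it. The function name pair_audio_with_annotations is hypothetical, and the pairing rule (a *.csv with the same base name as the *.wav) is an assumption that is not stated in this description; check it against the extracted archive.

```python
from pathlib import Path


def pair_audio_with_annotations(root):
    """Return a sorted list of (wav_path, csv_path or None) pairs found under root.

    Assumption: an annotation file shares the base name of its recording,
    e.g. 'rec.wav' is annotated by 'rec.csv' in the same folder.
    """
    pairs = []
    for wav in sorted(Path(root).rglob("*.wav")):
        candidate = wav.with_suffix(".csv")  # assumed naming convention
        pairs.append((wav, candidate if candidate.exists() else None))
    return pairs
```

Calling pair_audio_with_annotations("Development_Set") on the extracted archive would list the recordings of both Training_Set and Validation_Set; entries whose second element is None are recordings for which no matching annotation file was found under the assumed naming rule.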
Dataset statistics
The tables below summarize the dataset, broken down by training and validation set and by their sub-folders:
-----------------------------------------------------
TRAINING SET
-----------------------------------------------------
Number of audio recordings | 174
Total duration | 21 hours
Total classes | 47
Total events | 14229
-----------------------------------------------------
TRAINING SET/BV
-----------------------------------------------------
Number of audio recordings | 5
Total duration | 10 hours
Total classes | 11
Total events | 9026
Sampling rate | 24000 Hz
-----------------------------------------------------
TRAINING SET/HT
-----------------------------------------------------
Number of audio recordings | 5
Total duration | 5 hours
Total classes | 5
Total events | 611
Sampling rate | 6000 Hz
-----------------------------------------------------
TRAINING SET/JD
-----------------------------------------------------
Number of audio recordings | 1
Total duration | 10 mins
Total classes | 1
Total events | 357
Sampling rate | 22050 Hz
-----------------------------------------------------
TRAINING SET/MT
-----------------------------------------------------
Number of audio recordings | 2
Total duration | 1 hour and 10 mins
Total classes | 4
Total events | 1294
Sampling rate | 8000 Hz
-----------------------------------------------------
TRAINING SET/WMW
-----------------------------------------------------
Number of audio recordings | 161
Total duration | 4 hours and 40 mins
Total classes | 26
Total events | 2941
Sampling rate | various sampling rates
-----------------------------------------------------
-----------------------------------------------------
VALIDATION SET
-----------------------------------------------------
Number of audio recordings | 43
Total duration | 49 hours and 57 minutes
Total classes | 7
Total events | 3504
-----------------------------------------------------
VALIDATION SET/HB
-----------------------------------------------------
Number of audio recordings | 10
Total duration | 2 hours and 38 minutes
Total classes | 1
Total events | 712
Sampling rate | 44100 Hz
-----------------------------------------------------
VALIDATION SET/PB
-----------------------------------------------------
Number of audio recordings | 6
Total duration | 3 hours
Total classes | 2
Total events | 292
Sampling rate | 44100 Hz
-----------------------------------------------------
VALIDATION SET/ME
-----------------------------------------------------
Number of audio recordings | 2
Total duration | 20 minutes
Total classes | 2
Total events | 73
Sampling rate | 44100 Hz
-----------------------------------------------------
VALIDATION SET/PB24
-----------------------------------------------------
Number of audio recordings | 4
Total duration | 2 hours
Total classes | 2
Total events | 350
Sampling rate | 44100 Hz
-----------------------------------------------------
VALIDATION SET/RD
-----------------------------------------------------
Number of audio recordings | 6
Total duration | 18 hours
Total classes | 1
Total events | 1372
Sampling rate | 48000 Hz
-----------------------------------------------------
VALIDATION SET/PW
-----------------------------------------------------
Number of audio recordings | 15
Total duration | 24 hours
Total classes | 1
Total events | 705
Sampling rate | 96000 Hz
-----------------------------------------------------
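The per-folder figures above can be cross-checked against the set-level totals. The short Python sketch below copies the recording and event counts from the tables and sums them; the dictionary layout and the helper name totals are illustrative choices, not part of the dataset.

```python
# Recording and event counts copied from the tables above.
training = {
    "BV": {"recordings": 5, "events": 9026},
    "HT": {"recordings": 5, "events": 611},
    "JD": {"recordings": 1, "events": 357},
    "MT": {"recordings": 2, "events": 1294},
    "WMW": {"recordings": 161, "events": 2941},
}
validation = {
    "HB": {"recordings": 10, "events": 712},
    "PB": {"recordings": 6, "events": 292},
    "ME": {"recordings": 2, "events": 73},
    "PB24": {"recordings": 4, "events": 350},
    "RD": {"recordings": 6, "events": 1372},
    "PW": {"recordings": 15, "events": 705},
}


def totals(subsets):
    """Sum the recording and event counts over all sub-folders of one set."""
    return {key: sum(s[key] for s in subsets.values())
            for key in ("recordings", "events")}


training_totals = totals(training)
validation_totals = totals(validation)
print(training_totals)    # {'recordings': 174, 'events': 14229}
print(validation_totals)  # {'recordings': 43, 'events': 3504}
```

The two recording totals add up to the 217 audio files mentioned in the general description.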
Annotation structure
Each line of an annotation CSV file represents an event in the audio file. The columns are as follows:
TRAINING SET
---------------------
Audiofilename, Starttime, Endtime, CLASS_1, CLASS_2, ...CLASS_N
VALIDATION SET
---------------------
Audiofilename, Starttime, Endtime, Q
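To illustrate the column layout above, here is a minimal Python sketch that reads an annotation file and keeps the events carrying a given label in one class column. The function name events_with_label and the sample rows are made up for illustration; the sketch also assumes that the files have a header row with the column names listed above and that Starttime and Endtime are numeric.

```python
import csv
import io


def events_with_label(csv_text, class_column, label="POS"):
    """Return (filename, start, end) for rows whose class_column equals label."""
    # skipinitialspace tolerates headers written as "Audiofilename, Starttime, ..."
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    events = []
    for row in reader:
        if row[class_column].strip() == label:
            events.append(
                (row["Audiofilename"], float(row["Starttime"]), float(row["Endtime"]))
            )
    return events


# Made-up validation-style rows (single class-of-interest column "Q").
sample = """Audiofilename,Starttime,Endtime,Q
rec.wav,1.50,2.25,POS
rec.wav,4.00,4.50,UNK
rec.wav,9.10,9.80,POS
"""

positives = events_with_label(sample, "Q")
unknowns = events_with_label(sample, "Q", label="UNK")
```

For a training-set file, the same function can be called once per class column (CLASS_1, CLASS_2, and so on), with label set to "POS", "NEG" or "UNK" as needed.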
Classes
DCASE2024_task5_training_set_classes.csv and DCASE2024_task5_validation_set_classes.csv each provide a table mapping class codes to class names for all classes in the Development set. DCASE2024_task5_validation_set_classes.csv additionally provides a column of recording names.
DCASE2024_task5_training_set_classes.csv
---------------------
dataset, class_code, class_name
DCASE2024_task5_validation_set_classes.csv
---------------------
dataset, recording, class_code, class_name
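A class table with the columns listed above could be turned into a lookup from class code to class name along the following lines. This is a sketch only: the function name load_class_names is hypothetical, the sample codes and names are placeholders rather than real entries, and a header row with the listed column names is assumed.

```python
import csv
import io


def load_class_names(csv_text):
    """Map (dataset, class_code) to class_name from a class table given as text."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    return {(row["dataset"], row["class_code"]): row["class_name"] for row in reader}


# Placeholder rows in the training-set layout; not real dataset entries.
sample_table = """dataset,class_code,class_name
XX,CODE_A,example species A
XX,CODE_B,example species B
YY,CODE_A,example species C
"""

class_names = load_class_names(sample_table)
```

Keying on the pair (dataset, class_code) rather than on class_code alone avoids relying on codes being unique across sub-folders, which this description does not state. The extra recording column of the validation table is simply ignored by this lookup.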
Evaluation Set
The Evaluation set for this task will be released on 1 June 2024.
Open Access:
This dataset is available under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Contact info:
Please send any feedback or questions to:
Burooj Ghani - burooj.ghani@naturalis.nl | Ines Nolasco - i.dealmeidanolasco@qmul.ac.uk
Alternatively, join us on Slack: task-fewshot-bio-sed