BioDCASE 2025 Task 2 : Development set
Authors/Creators
- Jean-Labadye, Lucie (Data curator) [1, 2, 3]
- Parcerisas, Clea (Data curator) [4]
- Miller, Brian (Data manager) [5]
- Carvaillo, Paul [6]
- Dubus, Gabriel [1]
- Farrugia, Nicolas [3]
- Gros-Martial, Anatole [7]
- Marmoret, Axel [3]
- Moummad, Ilyass [8]
- Napoli, Andrea [9]
- Nguyen Hong Duc, Paul [10]
- Raumer, Pierre-Yves [11]
- Schall, Elena [4]
- White, Ellen [9]
- Adam, Olivier [1]
- Roch, Marie A. [12]
- White, Paul [9]
- Cazau, Dorian (Project manager) [2]

1. Sorbonne Université
2. École Nationale Supérieure de Techniques Avancées
3. IMT Atlantique
4. Flanders Marine Institute
5. Australian Antarctic Division
6. France Marine Renewable Energies
7. Centre d'Etudes Biologiques de Chizé
8. National Institute for Research in Computer and Control Sciences
9. University of Southampton
10. Curtin University
11. University of Western Brittany
12. San Diego State University
Description
General description
- 6591 audio files (6004 in the train set + 587 in the validation set) totaling 1880 hours of recordings from 11 different deployments, organized in site-year datasets (e.g., kerguelen-2015);
- 11 CSV annotation files named after each corresponding site-year dataset. See "Annotation structure" for more details.
Folder structure
biodcase_development_set.zip
|___biodcase_development_set/
    |____train/
        |____annotations/
            |____site_year1.csv
            |____site_year2.csv
            |____...
        |____audio/
            |____site-year1/
                |____*.wav
            |____site-year2/
                |____...
    |____validation/
        |____annotations/
            |____site_year1.csv
            |____site_year2.csv
            |____...
        |____audio/
            |____site-year1/
                |____*.wav
            |____site-year2/
                |____...
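As a rough illustration of the layout above, the following Python sketch pairs each annotation CSV of a split with its audio folder. The function name `list_site_years` is hypothetical, and the matching rule is an assumption: the listing shows underscores in CSV names and hyphens in folder names, so the sketch ignores both characters when comparing names.

```python
from pathlib import Path


def _key(name):
    # Assumption: CSV stems and audio folder names agree once '-' and '_' are ignored.
    return name.replace("_", "").replace("-", "").lower()


def list_site_years(root, split):
    """Return {csv_stem: (csv_path, [wav_paths])} for one split ('train' or 'validation')."""
    split_dir = Path(root) / split
    audio_dirs = {_key(d.name): d for d in (split_dir / "audio").iterdir() if d.is_dir()}
    pairs = {}
    for csv_path in sorted((split_dir / "annotations").glob("*.csv")):
        audio_dir = audio_dirs.get(_key(csv_path.stem))
        # An empty list means no audio folder matched this CSV under the assumed rule.
        wavs = sorted(audio_dir.glob("*.wav")) if audio_dir else []
        pairs[csv_path.stem] = (csv_path, wavs)
    return pairs
```

Calling `list_site_years("biodcase_development_set", "train")` after unzipping the archive would then give one entry per site-year dataset, provided the naming assumption holds.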
Annotation structure
Each annotated sound event is defined by the tuple (dataset, filename, annotation, annotator, low_frequency, high_frequency, start_datetime, end_datetime), where annotation is the class label and takes exactly one value in {bma, bmb, bmz, bmd, bpd, bp20, bp20plus}. These labels are a more machine-readable version of the list {BmA, BmB, BmZ, BmD, BpD, Bp20, Bp20plus} described below in the section "Calls description". The annotator field holds the short name of the expert annotator who produced the annotation file. There is a single annotator per dataset, but the same annotator may have annotated several datasets.
As calls may overlap, the setup is multi-class and multi-label: one file, or one segment of a file, is likely to contain several classes.
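To make the multi-label aspect concrete, here is a minimal Python sketch that collects the set of labels found in each audio file. It assumes the CSV files are comma-separated with a header row whose column names match the tuple fields above; the function name `labels_per_file` is hypothetical and the sample rows are made up for illustration, not taken from the dataset.

```python
import csv
import io
from collections import defaultdict


def labels_per_file(csv_file):
    """Return {filename: set of annotation labels} from an open annotation CSV."""
    labels = defaultdict(set)
    for row in csv.DictReader(csv_file):
        labels[row["filename"]].add(row["annotation"])
    return dict(labels)


# Made-up rows for illustration only (assumed header, fictional dataset and annotator).
SAMPLE = (
    "dataset,filename,annotation,annotator,low_frequency,high_frequency,"
    "start_datetime,end_datetime\n"
    "demo2015,2015-01-01T00-00-00_000.wav,bma,xx,25.0,28.0,"
    "2015-01-01T00:00:10.000000+00:00,2015-01-01T00:00:20.000000+00:00\n"
    "demo2015,2015-01-01T00-00-00_000.wav,bp20,xx,15.0,30.0,"
    "2015-01-01T00:00:15.000000+00:00,2015-01-01T00:00:16.000000+00:00\n"
)

result = labels_per_file(io.StringIO(SAMPLE))
```

In this toy input the two events overlap in time within the same file, so that file carries two labels at once.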
Please note that the evaluation metric is the 1D IoU (intersection over union) over the temporal axis: the frequency bounds are provided but will not be part of the final evaluation.
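For reference, a temporal IoU between a predicted and an annotated event can be sketched as below. This only illustrates the interval arithmetic: the matching rules and any thresholds used by the official evaluation are not specified here, and the function name `iou_1d` is hypothetical.

```python
def iou_1d(pred, ref):
    """Temporal IoU of two (start, end) intervals expressed in seconds."""
    pred_start, pred_end = pred
    ref_start, ref_end = ref
    # Overlap length is zero when the intervals do not intersect.
    intersection = max(0.0, min(pred_end, ref_end) - max(pred_start, ref_start))
    union = (pred_end - pred_start) + (ref_end - ref_start) - intersection
    return intersection / union if union > 0 else 0.0
```

For example, intervals (0, 10) and (5, 15) overlap for 5 s out of a 15 s union, giving an IoU of 1/3.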
Please note that 7 classes are provided, but the evaluation will only take 3 into account, as calls can be grouped by similarity according to this table:
| bma | bmb | bmz | bmd | bpd | bp20 | bp20plus |
| ABZ call | ABZ call | ABZ call | Downsweep | Downsweep | Bp call | Bp call |
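The grouping in the table can be written as a simple lookup. This is a sketch: the dictionary name `EVAL_CLASS` and the helper `to_eval_class` are hypothetical, and the three target names simply reuse the wording of the table.

```python
# Grouping copied from the table above; target names reuse its wording.
EVAL_CLASS = {
    "bma": "ABZ call",
    "bmb": "ABZ call",
    "bmz": "ABZ call",
    "bmd": "Downsweep",
    "bpd": "Downsweep",
    "bp20": "Bp call",
    "bp20plus": "Bp call",
}


def to_eval_class(label):
    """Map one of the 7 annotation labels to its evaluation class."""
    return EVAL_CLASS[label.strip().lower()]
```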
Calls description
BmA, BmB, and BmZ calls are specific to blue whales (Balaenoptera musculus intermedia, Bm), while Bp20 and Bp20Plus calls are characteristic of fin whales (Balaenoptera physalus quoyi, Bp). Both species also produce downsweeps.
As described by Miller et al. (see "Related works"), BmA calls consist of a constant-frequency tone between 25 and 28 Hz, without additional units. BmB calls are similar but followed by a partial or full inter-tone downsweep. BmZ calls contain two tonal units: A (higher frequency) and C (lower frequency). Occasionally, a B downsweep unit appears between them, forming a "Z" shape on spectrograms.
Bp20 and Bp20Plus vocalizations are pulsed calls with peak energy at 20 Hz (Bp20) and additional energy at higher frequencies (80–100 Hz) in Bp20Plus.
Downsweeps are characterized by a continuous frequency modulation from f₁ to f₂, where f₁ > f₂. Example spectrograms are available here, with more detailed references in Miller et al.
Dataset statistics
Train set
| Dataset | Number of audio recordings | Total audio duration (h) | Positive duration (h:m.s) | Total sound events |
| ballenyisland2015 | 205 | 204 | 02:46.32 | 2222 |
| casey2014 | 194 | 194 | 14:12.00 | 6866 |
| elephantisland2013 | 2247 | 187 | 16:06.58 | 21966 |
| elephantisland2014 | 2595 | 216 | 28:08.25 | 20964 |
| greenwich2015 | 190 | 32 | 02:03.11 | 1128 |
| kerguelen2005 | 200 | 200 | 03:31.37 | 2960 |
| maudrise2014 | 200 | 83 | 05:42.45 | 2360 |
| rosssea2014 | 176 | 176 | 08:51.00 | 104 |
| TOTAL | 6004 | 1292 | 72:40.20 | 58570 |
Validation set
| Dataset | Number of audio recordings | Total audio duration (h) | Positive duration (h:m.s) | Total sound events |
| casey2017 | 187 | 187 | 06:06.42 | 3263 |
| kerguelen2014 | 200 | 200 | 11:23.02 | 8822 |
| kerguelen2015 | 200 | 200 | 07:23.30 | 5542 |
| TOTAL | 587 | 587 | 24:53.14 | 17627 |
A more complete version of this table is available here, with more statistics within the different classes and more information on the recording deployments.
Evaluation set
The evaluation set for this task will be released on June 1, 2025.
Data collection and curation
Original data were collected, curated, annotated and published by Miller, B.S., The IWC-SORP/SOOS Acoustic Trends Working Group., Balcazar, N. et al. An open access dataset for developing automated detectors of Antarctic baleen whale sounds and performance evaluation of two commonly used detectors. Sci Rep 11, 806 (2021). https://doi.org/10.1038/s41598-020-78995-8 (see the "Related works" section).
Minor changes were made to the original data and annotation set to fit the challenge format:
- adopting more consistent naming conventions, particularly for file paths. All .wav filenames now follow the format "YYYY-mm-ddTHH-MM-SS_fff.wav", and the same convention is used in the CSV files under the "filename" column. Additionally, the start and end timestamps of the temporal annotation boxes are formatted as "YYYY-mm-ddTHH:MM:SS.ffffff+zz:zz", all in Zulu time,
- pooling together call-type-specific Raven tables into one CSV file per subdataset, named after the annotated site-year,
- resampling all audio files to 250 Hz, the lowest original sample rate, shared by several subdatasets. This standardizes sample rates, and because the vocalizations of interest are very low-frequency calls (15–120 Hz), no information is lost: a 250 Hz sampling rate gives a Nyquist frequency of 125 Hz.
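The naming conventions above can be parsed with the Python standard library, as in the sketch below. Two assumptions are made that the text does not state explicitly: that the filename encodes the recording start time in UTC, and that "fff" is a millisecond field. The helper names `file_start` and `event_offsets` are hypothetical.

```python
from datetime import datetime, timezone


def file_start(filename):
    """Parse 'YYYY-mm-ddTHH-MM-SS_fff.wav' into an aware datetime (assumed UTC)."""
    stem = filename.rsplit(".", 1)[0]
    # %f accepts 1 to 6 digits and pads on the right, so '500' is read as 0.5 s.
    naive = datetime.strptime(stem, "%Y-%m-%dT%H-%M-%S_%f")
    return naive.replace(tzinfo=timezone.utc)


def event_offsets(filename, start_datetime, end_datetime):
    """Return (start, end) of an annotation in seconds relative to the file start."""
    t0 = file_start(filename)
    start = datetime.fromisoformat(start_datetime)
    end = datetime.fromisoformat(end_datetime)
    return (start - t0).total_seconds(), (end - t0).total_seconds()
```

With made-up values, `event_offsets("2015-02-04T03-00-00_000.wav", "2015-02-04T03:00:10.500000+00:00", "2015-02-04T03:00:12.000000+00:00")` returns `(10.5, 12.0)`, which at 250 Hz would correspond to samples 2625 to 3000.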
Open access
Data used in this study are publicly available under a Creative Commons 4.0 Attribution licence. They are attributed to Miller, B.S., The IWC-SORP/SOOS Acoustic Trends Working Group., Balcazar, N. et al. They can be accessed via the Australian Antarctic Data Centre at https://data.aad.gov.au/metadata/records/AcousticTrends_BlueFinLibrary
Contact info
Please send any feedback or questions to:
Dorian CAZAU (dorian.cazau@ensta.fr) | Lucie JEAN-LABADYE (lucie.jean-labadye@sorbonne-universite.fr) | Cléa PARCERISAS (clea.parcerisas@vliz.be)
Files
| Name | Size |
|---|---|
| biodcase_development_set.zip (md5:f7899c93aff78a7fa12f7232cde5aa79) | 5.6 GB |
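After downloading, the archive can be checked against the MD5 checksum listed above. The following is a minimal sketch using Python's `hashlib`; the helper name `md5_of` and the local path in the usage note are assumptions.

```python
import hashlib

# Checksum as listed for biodcase_development_set.zip on this page.
EXPECTED_MD5 = "f7899c93aff78a7fa12f7232cde5aa79"


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Assuming the archive sits in the current directory, `md5_of("biodcase_development_set.zip") == EXPECTED_MD5` should evaluate to `True` for an intact download.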
Additional details
Related works
- Is described by
- Publication: 10.1038/s41598-020-78995-8 (DOI)