Voxaboxen Data

Hoffman, Benjamin; Mahon, Louis; James, Logan; Cusimano, Maddie; Hagiwara, Masato; Woolley, Sarah; Pietquin, Olivier

doi:10.5281/zenodo.15507508

Published May 24, 2025 | Version v1

Dataset Open

1. McGill University

Overview

Detecting the sounds produced by animals is the foundation of bioacoustics research. This task must often be performed using noisy recordings that include many overlapping sounds from multiple individuals. Identifying each individual acoustic unit is necessary for a diversity of tasks, including species recognition and population estimation, which are critical to research on topics such as ecology and conservation.

This dataset consists of eight real component datasets for evaluating bioacoustic sound event detection performance, as well as six synthetic component datasets. Each component dataset consists of several audio recordings. Annotations consist of the start and stop times of each event of interest, as well as a class label.

Component datasets

This dataset consists of eight real component datasets which are used to evaluate bioacoustic sound event detection performance. Seven of these datasets are derived from data that appeared in previous publications. For the license and original citation for each component dataset, please see the license file for each dataset. If you are using a component dataset, please cite our paper (see below) in addition to the original work. Dataset characteristics are summarized below:

Dataset	N. Files (train/val/test)	N. Classes	Dur. (hr) (train/val/test)	N. Events (train/val/test)	Mean event dur. (sec)	Location	Taxa
Anuraset (AnSet)	967/322/323	10	16.09/5.37/5.37	4279/1893/1635	6.23	Brazil	Anura
BirdVox-10h (BV10)	5/5/5	1	6.00/2.00/2.00	4196/1064/3764	0.15	New York, USA	Passeriformes
Hawaii Birds (HawB)	379/126/130	9	30.48/10.05/10.35	33372/11209/11132	1.11	Hawaii, USA	Aves
Humpback (HbW)	388/125/129	1	8.08/2.60/2.69	2952/959/865	0.99	North Pacific Ocean	Megaptera novaeangliae
Katydids (Katy)	16/5/6	1	2.66/0.83/1.00	7434/1550/2977	0.17	Panama	Tettigoniidae
Meerkat (MT)	2/2/2	1	0.76/0.25/0.25	773/269/252	0.15	South Africa	Suricata suricatta
Powdermill (Pow)	44/14/19	6	3.67/1.17/1.58	5138/2276/2505	1.11	Pennsylvania, USA	Passeriformes
Overlapping Zebra Finch (OZF)	46/6/13	1	0.77/0.10/0.22	5514/1246/1744	0.11	Laboratory	Taeniopygia castanotis

This dataset also includes six synthetic component datasets, named Overlapping Zebra Finch Synthetic (OZF Synthetic) x, where x is one of [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]. The value of x is the ratio of the number of overlapping call pairs to the number of calls. All six synthetic component datasets share the same characteristics, which were designed to mirror those of the real OZF dataset:

Dataset	N. Files (train/val/test)	N. Classes	Dur. (hr) (train/val/test)	N. Events (train/val/test)	Mean event dur. (sec)	Location	Taxa
OZF Synthetic x	65	1	1.08	5514/1246/1744	0.13	Synthetic	Taeniopygia castanotis
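
As an illustration of the overlap ratio x described above, the following minimal sketch counts overlapping call pairs and divides by the number of calls. It assumes that two calls overlap when their time intervals intersect (strict inequality); the function name `overlap_ratio` and the example events are hypothetical, and the exact procedure used to generate the synthetic datasets may differ.

```python
from itertools import combinations


def overlap_ratio(events):
    """Return (number of overlapping call pairs) / (number of calls).

    `events` is a list of (begin_s, end_s) tuples. Two calls are treated as
    overlapping when their intervals intersect; this is an assumption, not
    necessarily the definition used to build the dataset.
    """
    if not events:
        return 0.0
    n_overlapping_pairs = sum(
        1
        for (b1, e1), (b2, e2) in combinations(events, 2)
        if b1 < e2 and b2 < e1
    )
    return n_overlapping_pairs / len(events)


# Hypothetical events: calls 1 and 2 overlap, calls 4 and 5 overlap.
example_events = [
    (0.00, 0.10),
    (0.05, 0.15),
    (0.30, 0.40),
    (0.50, 0.60),
    (0.55, 0.65),
]
example_ratio = overlap_ratio(example_events)  # 2 pairs / 5 calls
```

The pairwise comparison is quadratic in the number of calls, which is acceptable for a single recording but may be slow for very long ones.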

Dataset details

Each dataset includes three info files: train_info.csv, val_info.csv, and test_info.csv, which define the splits into train, validation, and test sets. Each info file has the following columns, where each row corresponds to one audio file:

- fn: Unique filename associated with the audio file.
- audio_fp: Relative filepath to the audio file.
- selection_table_filepath: Relative filepath to the annotations, which are in the form of a Raven selection table.

Selection tables are tab-separated `.txt` files. Each row corresponds to one annotated audio event (typically, a vocalization). Each selection table has (at least) the following columns:

- Begin Time (s): The time the event starts in the audio file
- End Time (s): The time the event ends in the audio file
- Annotation: A label (e.g. species) associated with the audio event

Some selection tables have no rows. This indicates that no events of interest occur in the corresponding recording.
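
The layout above can be read with the Python standard library alone. The sketch below is one possible way to do it, not the loader used by the associated code. The function names `read_selection_table` and `load_split` are hypothetical, the column names are taken from the description above and should be checked against the actual files, and the `root` argument assumes that relative filepaths are resolved against a directory you choose.

```python
import csv
from pathlib import Path


def read_selection_table(path):
    """Read a tab-separated selection table into (begin_s, end_s, label) tuples.

    A table with a header but no rows yields an empty list, meaning that no
    events of interest occur in the recording.
    """
    events = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            events.append(
                (
                    float(row["Begin Time (s)"]),
                    float(row["End Time (s)"]),
                    row["Annotation"],
                )
            )
    return events


def load_split(info_csv, root=".", table_column="selection_table_filepath"):
    """Map each `fn` in an info file to its audio path and annotated events.

    `root` is the directory against which relative filepaths are resolved
    (an assumption; adjust it to match where the archive was extracted).
    """
    split = {}
    with open(info_csv, newline="") as f:
        for row in csv.DictReader(f):
            split[row["fn"]] = {
                "audio_path": Path(root) / row["audio_fp"],
                "events": read_selection_table(Path(root) / row[table_column]),
            }
    return split
```

For example, `load_split("train_info.csv", root=".")` would return a dictionary keyed by filename, assuming the info file and the paths it references sit under the current directory.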

Associated papers and code

If you use this data please cite the associated paper, which is currently available at https://arxiv.org/abs/2503.02389.

The code associated with the paper can be found at https://github.com/earthspecies/voxaboxen.

If you use any of the datasets other than OZF and OZF Synthetic, please also cite the original work (found in the license file for each dataset).

Files

Files (22.4 GB)

Name	MD5	Size
AnSet.zip	1ee4e9f506ad13ac252a231d3c8dac3e	7.2 GB
BV.zip	214cc0e4743641a682ee25030ee8e5bd	1.2 GB
HawB.zip	9b5cc24a6f377bfd17c777cf3bad8bcc	8.6 GB
HbW.zip	2f185b86e25f80df0a634d9520cc850d	529.7 MB
Katy.zip	643871706f49b70f58c70dae3f048414	2.4 GB
MT.zip	c82ee21ca7fe906e71abd2789ea462ad	24.4 MB
OZF.zip	12f85aded4f0f94c3cabb1f7f1a5048f	352.2 MB
OZF_synthetic.zip	2043ea73143d08ab221c408e430fcabd	639.4 MB
Pow.zip	1485f3860c7fad14ca542e149f7aebd8	1.4 GB
README.md	2f4d19cadc5928411b9064bbf75089b2	4.5 kB
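
Because several archives are multiple gigabytes, it can be worth checking a download against the MD5 value listed for it. The following sketch uses Python's `hashlib` and reads the file in chunks so that large archives do not need to fit in memory. The helper names `md5_of_file` and `verify_md5` are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected_hex):
    """Return True if the file's MD5 digest matches the expected hex string."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For example, assuming MT.zip has been downloaded to the current directory, `verify_md5("MT.zip", "c82ee21ca7fe906e71abd2789ea462ad")` should return True for an intact download.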

Additional details

Is described by: Preprint: arXiv:2503.02389 (arXiv)

Repository URL: https://github.com/earthspecies/voxaboxen
Programming language: Python

