Published February 9, 2019 | Version 1.0
Dataset Open

BirdVox-scaper-10k: a synthetic dataset for multilabel species classification of flight calls from 10-second audio recordings

  • 1. Forest Hills High School
  • 2. Cornell Lab of Ornithology
  • 3. New York University

Description

BirdVox-scaper-10k: a synthetic dataset for multilabel species classification of flight calls from 10-second audio recordings
=============================================================================================
Version 1.0, September 2019.

 

Created By
-------------

Elizabeth Mendoza (1), Vincent Lostanlen (2, 3, 4), Justin Salamon (3, 4), Andrew Farnsworth (2), Steve Kelling (2), and Juan Pablo Bello (3, 4).

 

(1): Forest Hills High School, New York, NY, USA
(2): Cornell Lab of Ornithology, Cornell University, Ithaca, NY, USA
(3): Center for Urban Science and Progress, New York University, New York, NY, USA
(4): Music and Audio Research Lab, New York University, New York, NY, USA

https://wp.nyu.edu/birdvox

 

Description
--------------

The BirdVox-scaper-10k dataset contains 9983 artificial soundscapes. Each soundscape lasts exactly ten seconds and contains one or several avian flight calls from up to 30 different species of New World warblers (Parulidae). Alongside each audio file, we include an annotation file describing the start time and end time of each flight call in the corresponding soundscape, as well as the species of warbler it belongs to.
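As an illustration of how such annotations could feed a multilabel classifier, the sketch below turns a list of annotated events into a binary presence vector. The on-disk format is an assumption here (one event per line, tab-separated start time, end time, and species label), and the helper names `read_events` and `multilabel_vector` as well as the labels `species_a`, `species_b`, `species_c` are hypothetical placeholders, not identifiers from the dataset.

```python
def read_events(lines):
    """Parse 'start<TAB>end<TAB>label' lines into (float, float, str) tuples.

    The tab-separated layout is an assumption, not a documented format.
    """
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        start, end, label = line.split("\t")
        events.append((float(start), float(end), label))
    return events


def multilabel_vector(events, classes):
    """Return a 0/1 list marking which classes occur at least once."""
    present = {label for _, _, label in events}
    return [int(c in present) for c in classes]


# Hypothetical annotation content for one 10-second soundscape.
example = ["0.50\t0.62\tspecies_a", "3.00\t3.10\tspecies_c"]
classes = ["species_a", "species_b", "species_c"]
print(multilabel_vector(read_events(example), classes))
```

The vector only records presence or absence per soundscape; the start and end times remain available in the parsed events if finer temporal resolution is needed.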

To synthesize the soundscapes in BirdVox-scaper-10k, we mixed natural sounds from various pre-recorded sources. First, we extracted isolated recordings of flight calls containing little or no background noise from the CLO-43SD dataset [1]. Second, we extracted 10-second "empty" acoustic scenes from the BirdVox-DCASE-20k dataset [2]. These acoustic scenes contain various sources of real-world background noise, including biophony (insects) and anthropophony (vehicles), yet are guaranteed to be devoid of any flight calls. Finally, we "filled" each acoustic scene by mixing it with flight calls sampled at random.
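The mixing step described above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the authors' actual pipeline: the function name `mix_calls`, the `snr_db` parameter, the RMS-based gain, and the uniformly random onset are all choices made for this example, and the signals are synthetic stand-ins for real recordings.

```python
import numpy as np


def mix_calls(background, calls, sr, snr_db=10.0, rng=None):
    """Add each (label, waveform) call to a copy of `background` at a random onset.

    Returns the mixture and a sorted list of (start_s, end_s, label) tuples.
    """
    rng = np.random.default_rng() if rng is None else rng
    mixture = np.asarray(background, dtype=np.float64).copy()
    # Reference level: RMS of the empty scene, measured before any call is added.
    bg_rms = np.sqrt(np.mean(mixture ** 2)) + 1e-12
    events = []
    for label, call in calls:
        call = np.asarray(call, dtype=np.float64)
        # Pick an onset so that the call fits entirely inside the scene.
        onset = int(rng.integers(0, len(mixture) - len(call) + 1))
        # Scale the call to the requested call-to-background ratio (assumed).
        call_rms = np.sqrt(np.mean(call ** 2)) + 1e-12
        gain = bg_rms / call_rms * 10 ** (snr_db / 20)
        mixture[onset:onset + len(call)] += gain * call
        events.append((onset / sr, (onset + len(call)) / sr, label))
    return mixture, sorted(events)


# Synthetic stand-ins: 10 s of low-level noise and two short tone bursts.
sr = 8000  # arbitrary sample rate for the example
rng = np.random.default_rng(0)
scene = 0.01 * rng.standard_normal(10 * sr)
t = np.arange(int(0.1 * sr)) / sr
bursts = [("species_a", np.sin(2 * np.pi * 3000 * t)),
          ("species_b", np.sin(2 * np.pi * 2000 * t))]
mixed, annotation = mix_calls(scene, bursts, sr, rng=rng)
```

Returning the event list together with the mixture mirrors the pairing of audio and annotation files in the dataset: the ground truth is known exactly because the soundscape is synthesized.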

Although BirdVox-scaper-10k does not consist of natural recordings, we have taken several measures to ensure the plausibility of each synthesized soundscape, from both qualitative and quantitative standpoints.

The BirdVox-scaper-10k dataset can be used, among other things, for the research, development, and testing of bioacoustic classification models.

For details on the hardware of ROBIN recording units, we refer the reader to [2].

[1] J. Salamon and J. Bello. Fusing shallow and deep learning for bioacoustic bird species classification. Proc. IEEE ICASSP, 2017.

[2] V. Lostanlen, J. Salamon, A. Farnsworth, S. Kelling, and J. Bello. BirdVox-full-night: a dataset and benchmark for avian flight call detection. Proc. IEEE ICASSP, 2018.

[3] J. Salamon, J. P. Bello, A. Farnsworth, M. Robbins, S. Keen, H. Klinck, and S. Kelling. Towards the Automatic Classification of Avian Flight Calls for Bioacoustic Monitoring. PLoS One, 2016.

@inproceedings{lostanlen2018icassp,
  title = {BirdVox-full-night: a dataset and benchmark for avian flight call detection},
  author = {Lostanlen, Vincent and Salamon, Justin and Farnsworth, Andrew and Kelling, Steve and Bello, Juan Pablo},
  booktitle = {Proc. IEEE ICASSP},
  year = {2018},
  publisher = {IEEE},
  address = {Calgary, Canada},
  month = {April},
}

Files

BirdVox-scaper-10k.zip (8.3 GB)
md5:2cf5b68ef340f55d48b055d76fab9ddd
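After downloading BirdVox-scaper-10k.zip, the listed MD5 checksum can be compared against the local copy. The sketch below uses only Python's standard `hashlib` module; the helper name `md5_of_file` and the local path in the commented usage line are assumptions made for this example.

```python
import hashlib

# Checksum as listed for BirdVox-scaper-10k.zip.
EXPECTED_MD5 = "2cf5b68ef340f55d48b055d76fab9ddd"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Chunked reads avoid loading the whole archive into memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (hypothetical local path):
# assert md5_of_file("BirdVox-scaper-10k.zip") == EXPECTED_MD5
```

A mismatch usually indicates an incomplete or corrupted download, in which case the archive should be fetched again before extraction.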