Published March 24, 2018 | Version v2
Dataset Open

BirdVox-DCASE-20k: a dataset for bird audio detection in 10-second clips

  • 1. Cornell Lab of Ornithology
  • 2. New York University


Version 2.0, March 2018.

Created By

Vincent Lostanlen (1, 2, 3), Justin Salamon (2, 3), Andrew Farnsworth (1), Steve Kelling (1), and Juan Pablo Bello (2, 3).

(1): Cornell Lab of Ornithology (CLO)
(2): Center for Urban Science and Progress, New York University
(3): Music and Audio Research Lab, New York University



The BirdVox-DCASE-20k dataset contains 20,000 ten-second audio recordings. These recordings come from ROBIN autonomous recording units placed near Ithaca, NY, USA, in the fall of 2015. They were captured on the night of September 23rd, 2015, by six different sensors, originally numbered 1, 2, 3, 5, 7, and 10.

Out of these 20,000 recordings, 10,017 (50.09%) contain at least one bird vocalization (either song, call, or chatter).

The dataset is a derivative work of the BirdVox-full-night dataset [1], containing almost as much data but formatted into ten-second excerpts rather than ten-hour full night recordings.

In addition, the BirdVox-DCASE-20k dataset is provided as a development set in the context of the "Bird Audio Detection" challenge, organized by DCASE (Detection and Classification of Acoustic Scenes and Events) and the IEEE Signal Processing Society.

The dataset can be used, among other things, for the development and evaluation of bioacoustic classification models.

We refer the reader to [1] for details on the distribution of the data and [2] for details on the hardware of ROBIN recording units.

[1] V. Lostanlen, J. Salamon, A. Farnsworth, S. Kelling, and J. P. Bello. "BirdVox-full-night: a dataset and benchmark for avian flight call detection", Proc. IEEE ICASSP, 2018.

[2] J. Salamon, J. P. Bello, A. Farnsworth, M. Robbins, S. Keen, H. Klinck, and S. Kelling. "Towards the Automatic Classification of Avian Flight Calls for Bioacoustic Monitoring", PLoS One, 2016.


Data Files

The wav folder contains the recordings as WAV files, sampled at 44.1 kHz, with a single channel (mono). The original sample rate was 24 kHz.
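One way to check that a downloaded file matches this format is to read its header with Python's standard wave module. The following is a minimal sketch, not part of the dataset tooling: the function names are illustrative, and the demo reads a synthetic clip of silence (assumed 16-bit) rather than a real recording.

```python
import io
import wave


def describe_wav(source):
    """Return (sample_rate_hz, n_channels, duration_s) for a WAV path or file object."""
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def make_silent_wav(rate=44100, seconds=10):
    """Build an in-memory 16-bit mono WAV of silence (a stand-in for a real clip)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample; the real bit depth is an assumption
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * seconds)
    buffer.seek(0)
    return buffer


rate, channels, duration = describe_wav(make_silent_wav())
print(rate, channels, duration)
```

On a real download, the path of a file in the wav folder can be passed to describe_wav in place of the synthetic buffer.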

The name of each WAV file is a random 128-bit UUID (Universally Unique Identifier) string. It is therefore independent of the origin of the recording in BirdVox-full-night, both in terms of time (UTC hour at the start of the excerpt) and space (location of the sensor).
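When scanning the wav folder, it can be useful to confirm that every file name parses as a UUID before using it as a key. The sketch below relies only on the standard uuid module; the helper name is hypothetical, the example name is generated on the spot, and the exact formatting of the real file names (with or without hyphens) is an assumption that uuid.UUID tolerates either way.

```python
import uuid
from pathlib import Path


def is_uuid_name(filename):
    """Return True if the file stem (name without extension) parses as a UUID."""
    try:
        uuid.UUID(Path(filename).stem)
    except ValueError:
        return False
    return True


example_name = f"{uuid.uuid4()}.wav"  # randomly generated, not a real dataset file
print(is_uuid_name(example_name))
print(is_uuid_name("sensor01_2015-09-23.wav"))
```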

The origin of each 10-second excerpt is known by the challenge organizers, but not disclosed to the participants.


Metadata Files

A table containing a binary label "hasbird" associated with every recording in BirdVox-DCASE-20k is available on the website of the DCASE "Bird Audio Detection" challenge.

These labels were automatically derived from the annotations of avian flight call events in the BirdVox-full-night dataset.
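As a rough illustration of how such a table might be consumed, the sketch below parses a small made-up CSV and computes the share of positive labels. The column name "itemid", the "0"/"1" encoding, and the CSV layout are assumptions, since only the label name "hasbird" is stated here; the identifiers are placeholders.

```python
import csv
import io


def load_labels(csv_file, id_column="itemid", label_column="hasbird"):
    """Map each recording identifier to True/False from a label table."""
    reader = csv.DictReader(csv_file)
    return {row[id_column]: row[label_column] == "1" for row in reader}


# Tiny made-up table; identifiers are placeholders, not real recordings.
sample = io.StringIO("itemid,hasbird\nclip-a,1\nclip-b,0\nclip-c,1\n")
labels = load_labels(sample)
positive_rate = sum(labels.values()) / len(labels)
print(labels["clip-b"], round(positive_rate, 3))
```

Applied to the full table, the same ratio should come out at 10,017 / 20,000, that is, about 50.09%.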

If your evaluation procedure requires the precise timestamps of each avian flight call (at a fine time scale of 50 ms), and is agnostic to non-flight call avian vocalizations (e.g. geese, crows, owls, etc.), we suggest using the BirdVox-full-night dataset rather than BirdVox-DCASE-20k.

On the other hand, if your evaluation procedure encompasses all avian vocalizations, and is performed at a coarse time scale of 10 seconds, then BirdVox-DCASE-20k is the appropriate dataset.
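The relationship between the two time scales can be illustrated by collapsing event timestamps into one binary label per 10-second window. This is only a sketch and not the authors' actual derivation procedure: the function name, the half-open window convention, and the timestamps are assumptions, and the real labels also account for non-flight call vocalizations.

```python
def clip_labels(event_times_s, total_duration_s, clip_duration_s=10.0):
    """Return one boolean per clip: True if any event falls in [start, start + clip_duration)."""
    n_clips = int(total_duration_s // clip_duration_s)
    labels = [False] * n_clips
    for t in event_times_s:
        index = int(t // clip_duration_s)
        # Ignore events outside the span covered by complete clips.
        if 0 <= index < n_clips:
            labels[index] = True
    return labels


events = [3.25, 3.30, 27.95]  # made-up event times, in seconds
print(clip_labels(events, total_duration_s=40.0))
```

Two events 50 ms apart in the first window yield a single positive clip, which is the information lost when moving from the fine to the coarse time scale.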

The annotation campaign of avian flight calls in BirdVox-full-night was performed by Andrew Farnsworth and lasted 102 hours.

The additional annotation campaign of non-flight call avian vocalizations was performed by Vincent Lostanlen and lasted 10 hours.

The accuracy of the labeling is estimated to be somewhere between 99.5% (100 mislabelings) and 99.95% (10 mislabelings).
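These bounds follow directly from the number of recordings: accuracy is one minus the fraction of mislabeled clips. A short check, with an illustrative function name:

```python
def label_accuracy(n_recordings, n_mislabeled):
    """Fraction of correctly labeled recordings."""
    return 1 - n_mislabeled / n_recordings


n_recordings = 20_000
for n_mislabeled in (100, 10):
    print(n_mislabeled, f"{label_accuracy(n_recordings, n_mislabeled):.2%}")
```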

Please Acknowledge BirdVox-DCASE-20k in Academic Research

When BirdVox-DCASE-20k is used for academic research, we would highly appreciate it if scientific publications of works partly based on this dataset cite the following publication:

V. Lostanlen, J. Salamon, A. Farnsworth, S. Kelling, and J. P. Bello. "BirdVox-full-night: a dataset and benchmark for avian flight call detection", Proc. IEEE ICASSP, 2018.

@inproceedings{lostanlen2018icassp,
  title = {BirdVox-full-night: a dataset and benchmark for avian flight call detection},
  author = {Lostanlen, Vincent and Salamon, Justin and Farnsworth, Andrew and Kelling, Steve and Bello, Juan Pablo},
  booktitle = {Proc. IEEE ICASSP},
  year = {2018},
  publisher = {IEEE},
  venue = {Calgary, Canada},
  month = {April},
}

The creation of this dataset was supported by NSF grants 1125098 (BIRDCAST) and 1633259 (BIRDVOX), a Google Faculty Award, the Leon Levy Foundation, and two anonymous donors.


Conditions of Use

Dataset created by Vincent Lostanlen, Justin Salamon, Andrew Farnsworth, Steve Kelling, and Juan Pablo Bello.

The BirdVox-DCASE-20k dataset is offered free of charge under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

The dataset and its contents are made available on an "as is" basis and without warranties of any kind, including without limitation satisfactory quality and conformity, merchantability, fitness for a particular purpose, accuracy or completeness, or absence of errors. Subject to any liability that may not be excluded or limited by law, Cornell Lab of Ornithology is not liable for, and expressly excludes all liability for, loss or damage however and whenever caused to anyone by any use of the BirdVox-DCASE-20k dataset or any part of it.



Please help us improve BirdVox-DCASE-20k by sending your feedback to:
* Vincent Lostanlen: for feedback regarding data pre-processing,
* Andrew Farnsworth: for feedback regarding data collection and ornithology, or
* Dan Stowell: for feedback regarding the DCASE "Bird Audio Detection" challenge.

In case of a problem, please include as many details as possible.



We thank Jessie Barry, Ian Davies, Tom Fredericks, Jeff Gerbracht, Sara Keen, Holger Klinck, Anne Klingensmith, Ray Mack, Peter Marchetto, Ed Moore, Matt Robbins, Ken Rosenberg, and Chris Tessaglia-Hymes for designing autonomous recording units and collecting data.
We acknowledge that the land on which the data was collected is the unceded territory of the Cayuga nation, which is part of the Haudenosaunee (Iroquois) confederacy.



Files (16.5 GB)

