There is a newer version of the record available.

Published October 28, 2017 | Version v1
Dataset Open

BirdVox-70k: a dataset for avian flight call detection

  • 1. Cornell Lab of Ornithology, Cornell University, Ithaca, NY, USA
  • 2. Music and Audio Research Lab, New York University, New York, NY, USA

Description

BirdVox-70k
=========
Version 1.0, October 2017.

Created By
----------

Vincent Lostanlen (1, 2, 3), Justin Salamon (2, 3), Andrew Farnsworth (1), Steve Kelling (1), and Juan Pablo Bello (2, 3).

(1): Cornell Lab of Ornithology (CLO)
(2): Center for Urban Science and Progress, New York University
(3): Music and Audio Research Lab, New York University

https://wp.nyu.edu/birdvox


Description
-----------

The BirdVox-70k dataset contains six audio recordings, each about ten hours long. These recordings come from ROBIN autonomous recording units placed near Ithaca, NY, USA during the fall of 2015. They were captured on the night of September 23rd, 2015, by six different sensors, originally numbered 1, 2, 3, 5, 7, and 10.

Andrew Farnsworth used the Raven software to pinpoint every avian flight call in time and frequency. He found 35402 flight calls in total. He estimates that about 25 different species of passerines (thrushes, warblers, and sparrows) are present in these recordings. Species are not labeled in BirdVox-70k, but it is possible to tell thrushes apart from warblers and sparrows by looking at the center frequencies of their calls. The annotation process took 102 hours.
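
As a rough illustration of separating calls by center frequency, the sketch below splits a list of center frequencies into a lower and a higher band. The function name and the 5000 Hz cutoff used in the example are assumptions made for illustration only; this README does not state a cutoff, so pick one that suits your own analysis.

```python
def split_by_center_frequency(center_freqs_hz, cutoff_hz):
    """Split center frequencies (in Hz) into a lower and a higher band.

    cutoff_hz is supplied by the caller; no value is prescribed by the dataset.
    """
    low_band, high_band = [], []
    for freq in center_freqs_hz:
        if freq < cutoff_hz:
            low_band.append(freq)
        else:
            high_band.append(freq)
    return low_band, high_band


# Made-up center frequencies and an illustrative (hypothetical) cutoff.
low_band, high_band = split_by_center_frequency(
    [2800.0, 7200.0, 3100.0, 8050.0], cutoff_hz=5000.0
)
print(low_band, high_band)
```

Keeping the cutoff as a required argument avoids baking an unverified threshold into the code.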

The dataset can be used, among other things, for the research, development, and testing of bioacoustic classification models, including the reproduction of the results reported in [1].

For details on the hardware of ROBIN recording units, we refer the reader to [2].

[1] V. Lostanlen, J. Salamon, A. Farnsworth, S. Kelling, J. Bello. BirdVox-70k: a dataset and benchmark for avian flight call detection, submitted, 2018.

[2] J. Salamon, J. P. Bello, A. Farnsworth, M. Robbins, S. Keen, H. Klinck, and S. Kelling. Towards the Automatic Classification of Avian Flight Calls for Bioacoustic Monitoring. PLoS One, 2016.


Data Files
----------

The BirdVox-70k_full-night-audio folder contains the recordings as FLAC files, sampled at 24 kHz, with a single channel (mono).
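
Because the recordings are sampled at 24 kHz, a time expressed in seconds maps to a sample index by multiplying by 24000. The sketch below computes the sample range of a short clip centered on a given time. The function name and the 0.5 s clip duration are illustrative assumptions, not something prescribed by the dataset.

```python
SAMPLE_RATE_HZ = 24000  # sampling rate of the FLAC files, as stated above


def clip_sample_range(center_time_s, clip_duration_s, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return (start, stop) sample indices of a clip centered on center_time_s.

    The start index is clamped at 0, so clips near the beginning of a
    recording come out shorter. The stop index is not clamped, because the
    length of the recording is not known here.
    """
    half = clip_duration_s / 2.0
    start = max(0, int(round((center_time_s - half) * sample_rate_hz)))
    stop = int(round((center_time_s + half) * sample_rate_hz))
    return start, stop


# Illustrative values: a 0.5 s clip around t = 12.5 s.
start, stop = clip_sample_range(center_time_s=12.5, clip_duration_s=0.5)
print(start, stop, stop - start)
```

The resulting indices can be passed to any audio reader that supports partial reads, so that a ten-hour file never has to be loaded into memory at once.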


Metadata Files
--------------

The BirdVox-70k_annotations folder contains CSV files, where each row corresponds to a different location in the time-frequency domain (columns "Center Time (s)" and "Center Freq (Hz)").
/!\ CAUTION: in addition to the 35402 flight calls, Andrew Farnsworth pinpointed 29 artificial beeps produced by the recording device itself. These beeps are labeled as "alarm" instead of "flight call". When collecting positive samples for avian flight call detection, make sure to filter out the rows corresponding to alarms.
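
A minimal sketch of the alarm filtering described above, using only the Python standard library. The name of the column that holds the label is not stated in this README, so the sketch makes no assumption about it and instead drops any row in which some field equals "alarm". The function name and the sample rows (including the "Label" header) are made up for illustration.

```python
import csv
import io


def read_flight_calls(csv_file):
    """Return (center_time_s, center_freq_hz) pairs, skipping alarm rows."""
    calls = []
    for row in csv.DictReader(csv_file):
        # Assumption: an alarm row carries the literal text "alarm" in one field.
        is_alarm = any(
            isinstance(value, str) and value.strip().lower() == "alarm"
            for value in row.values()
        )
        if is_alarm:
            continue
        calls.append((float(row["Center Time (s)"]), float(row["Center Freq (Hz)"])))
    return calls


# Tiny made-up example; the "Label" column name is hypothetical.
example = io.StringIO(
    "Center Time (s),Center Freq (Hz),Label\n"
    "12.5,7200.0,flight call\n"
    "30.0,3000.0,alarm\n"
    "47.25,2800.0,flight call\n"
)
calls = read_flight_calls(example)
print(calls)
```

With a real annotation file, pass an open file handle (opened with newline="") instead of the in-memory example.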

The approximate GPS coordinates of the sensors (latitudes and longitudes rounded to two decimal places) and the UTC timestamps corresponding to the start of the recording for each sensor are included as CSV files in the main directory.


Please Acknowledge BirdVox-70k in Academic Research
------------------------------------------------------

When BirdVox-70k is used for academic research, we would highly appreciate it if scientific publications of works partly based on this dataset cited the following publication:

V. Lostanlen, J. Salamon, A. Farnsworth, S. Kelling, J. Bello, “BirdVox-70k: a dataset and benchmark for avian flight call detection”, submitted.

The creation of this dataset was supported by NSF grants 1125098 (BIRDCAST) and 1633259 (BIRDVOX), a Google Faculty Award, the Leon Levy Foundation, and two anonymous donors.


Conditions of Use
-----------------

Dataset created by Vincent Lostanlen, Justin Salamon, Andrew Farnsworth, Steve Kelling, and Juan Pablo Bello.
 
The BirdVox-70k dataset is offered free of charge under the terms of the Creative Commons CC0 1.0 Universal License:
https://creativecommons.org/publicdomain/zero/1.0/
 
The dataset and its contents are made available on an "as is" basis and without warranties of any kind, including without limitation satisfactory quality and conformity, merchantability, fitness for a particular purpose, accuracy or completeness, or absence of errors. Subject to any liability that may not be excluded or limited by law, CLO is not liable for, and expressly excludes all liability for, loss or damage however and whenever caused to anyone by any use of the BirdVox-70k dataset or any part of it.


Feedback
--------

Please help us improve BirdVox-70k by sending your feedback to:
vincent.lostanlen@gmail.com and af27@cornell.edu

In case of a problem, please include as many details as possible.

 

Acknowledgements
-------------------
Jessie Barry, Ian Davies, Tom Fredericks, Jeff Gerbracht, Sara Keen, Holger Klinck, Anne Klingensmith, Ray Mack, Peter Marchetto, Ed Moore, Matt Robbins, Ken Rosenberg, and Chris Tessaglia-Hymes.

Files

BirdVox-70k_full-nights.zip (5.6 GB)
md5:8db662f19607ad9d38d76b5f37171aeb
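
The MD5 checksum listed above can be used to confirm that the archive downloaded intact. The following is a minimal sketch using Python's standard hashlib module. The helper names are hypothetical, and the local path passed to archive_is_intact is whatever location the archive was saved to.

```python
import hashlib
import io

# Checksum taken from the file listing above.
EXPECTED_MD5 = "8db662f19607ad9d38d76b5f37171aeb"


def md5_of_stream(stream, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a binary stream, reading it in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected_md5=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    with open(path, "rb") as archive:
        return md5_of_stream(archive) == expected_md5


# Self-contained demonstration on in-memory bytes (not the real archive).
demo_digest = md5_of_stream(io.BytesIO(b"hello"))
print(demo_digest)
```

Reading in chunks keeps memory use low, which matters for a 5.6 GB archive.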