There is a newer version of the record available.

Published December 19, 2021 | Version 1.0
Dataset Open

BirdVox-296h: a large-scale dataset for detection and classification of flight calls

  • 1. Cornell Lab of Ornithology
  • 2. LS2N
  • 3. Adobe Research
  • 4. New York University

Description

BirdVox 296 hours dataset (BirdVox-296h)
====================================

Version 1.0, December 2021.


Created By
----------

Andrew Farnsworth (1), Benjamin Mark Van Doren (1), Steve Kelling (1), Vincent Lostanlen (2), Justin Salamon (3), Aurora Cramer (4), Juan Pablo Bello (4)

(1): Cornell Lab of Ornithology (CLO)
(2): Laboratoire des Sciences du Numérique de Nantes (LS2N), CNRS
(3): Adobe Research
(4): New York University

https://wp.nyu.edu/birdvox


Description
---------------

The BirdVox-296h dataset contains 148 audio recordings, each two hours in duration, for a total of 296 hours. These recordings come from ROBIN autonomous recording units placed near Ithaca, NY, USA, during fall 2015. They were captured on the night of September 23rd, 2015, by nine different sensors, originally numbered 1, 2, 3, 4, 5, 6, 7, 8, and 10.

Ornithologist Andrew Farnsworth used the Raven software to pinpoint and label every avian flight call in time and frequency. He found 26138 sound events, of which 21546 are flight calls from Passeriformes. Of those, 13385 are identifiable in terms of family, and 8669 are identifiable in terms of both family and species. The annotation process took over 600 hours.

The dataset can be used, among other things, for the research, development and testing of machine listening models for bird migration monitoring.


Data Files
------------

The BirdVox-296h_wav folder contains 148 recordings as WAV files, sampled at 24 kHz, with a single channel (mono). Each recording lasts exactly two hours and is named according to the following format:

YYYY-MM-DD_hh-mm-ss_unitUU.wav

Here, Y stands for year, M for month, D for day, h for hour, m for minute, and s for second. This timestamp corresponds to the start time of the recording, expressed in Coordinated Universal Time (UTC).

The field UU contains two digits corresponding to the identifier of the autonomous recording unit (i.e., bioacoustic sensor). UU is one of 01, 02, 03, 04, 05, 06, 07, 08, or 10. Note that 09 is absent from the list because sensor 09 failed during the acquisition campaign.
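
As an illustration of the naming convention and audio format described above, here is a minimal Python sketch. The function names parse_recording_name and wav_summary, as well as the example file name in the comment, are hypothetical and not part of the dataset. The sketch assumes the files are PCM-encoded WAV, which is what the standard-library wave module can read.

```python
import re
import wave
from datetime import datetime, timezone

# Matches names of the form YYYY-MM-DD_hh-mm-ss_unitUU.wav,
# e.g. the hypothetical "2015-09-23_00-00-00_unit01.wav".
NAME_PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2})_unit(\d{2})\.wav$"
)


def parse_recording_name(filename):
    """Return (start time in UTC, two-digit sensor identifier)."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    start = datetime.strptime(match.group(1), "%Y-%m-%d_%H-%M-%S")
    # The README states that the timestamp is expressed in UTC.
    return start.replace(tzinfo=timezone.utc), match.group(2)


def wav_summary(path):
    """Return (channels, sample rate in Hz, duration in seconds).

    Only the WAV header is read, so this stays cheap on large files.
    """
    with wave.open(path, "rb") as wav_file:
        channels = wav_file.getnchannels()
        rate = wav_file.getframerate()
        frames = wav_file.getnframes()
    return channels, rate, frames / rate
```

For a recording that matches the description above, wav_summary should report 1 channel, a sample rate of 24000 Hz, and a duration of 7200 seconds.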


Metadata Files
-------------------

The BirdVox-296h_csv-annotations folder contains CSV files, one for each audio file. The columns of each CSV file are:

ID,Time (s),Frequency (Hz),Taxonomy Code,Fine Label,Medium Label,Coarse Label


"Taxonomy Code" is compliant with the BirdVoxClassify software: github.com/BirdVox/BirdVoxClassify

"Fine Label", "Medium Label", and "Coarse Label" most often correspond to species, family and order respectively.
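
The following Python sketch shows one possible way to load an annotation file and count events per label. It assumes that each CSV file starts with a header row containing exactly the column names listed above, and that the "Time (s)" and "Frequency (Hz)" fields are always numeric. Neither assumption has been verified against the files. The function names read_annotations and count_by are hypothetical.

```python
import csv
from collections import Counter

# Column names as listed in this README.
COLUMNS = [
    "ID",
    "Time (s)",
    "Frequency (Hz)",
    "Taxonomy Code",
    "Fine Label",
    "Medium Label",
    "Coarse Label",
]


def read_annotations(path):
    """Read one annotation CSV into a list of dictionaries.

    Assumes a header row with the column names above; raises
    ValueError if any of them is missing.
    """
    with open(path, newline="", encoding="utf-8") as csv_file:
        reader = csv.DictReader(csv_file)
        missing = set(COLUMNS) - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        rows = []
        for row in reader:
            # Convert the two numeric fields; the labels stay as strings.
            row["Time (s)"] = float(row["Time (s)"])
            row["Frequency (Hz)"] = float(row["Frequency (Hz)"])
            rows.append(row)
    return rows


def count_by(rows, column):
    """Count annotated events per distinct value of a column."""
    return Counter(row[column] for row in rows)
```

For example, count_by(rows, "Coarse Label") would give the number of events per coarse label in one recording.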


The BirdVox-296h_gps-coordinates.csv file contains the approximate GPS coordinates of all nine sensors (latitudes and longitudes rounded to two decimal places).


Conditions of Use
-----------------

Dataset created by Andrew Farnsworth, Steve Kelling, Vincent Lostanlen, Justin Salamon, Aurora Cramer, and Juan Pablo Bello.

The BirdVox-296h dataset is offered free of charge under the terms of the Creative Commons Attribution 4.0 International (CC BY 4.0) license:
https://creativecommons.org/licenses/by/4.0/

The dataset and its contents are made available on an "as is" basis and without warranties of any kind, including without limitation satisfactory quality and conformity, merchantability, fitness for a particular purpose, accuracy or completeness, or absence of errors. Subject to any liability that may not be excluded or limited by law, Cornell Lab of Ornithology is not liable for, and expressly excludes all liability for, loss or damage however and whenever caused to anyone by any use of the BirdVox-296h dataset or any part of it.


Feedback
-------------

Please help us improve BirdVox-296h by sending your feedback to:
vincent.lostanlen@ls2n.fr and af27@cornell.edu

In case of a problem, please include as many details as possible.


Acknowledgements
--------------------------

Jessie Barry, Ian Davies, Tom Fredericks, Jeff Gerbracht, Sara Keen, Holger Klinck, Anne Klingensmith, Ray Mack, Peter Marchetto, Ed Moore, Matt Robbins, Ken Rosenberg, and Chris Tessaglia-Hymes.

We acknowledge that the land on which the data was collected is the unceded territory of the Cayuga nation, which is part of the Haudenosaunee (Iroquois) confederacy.

Files

BirdVox-296h_csv-annotations.zip

Files (36.3 GB)

Size         Checksum
451.6 kB     md5:fbc7b36cbe0b0d8e407dbb97a0c9bed6
204 Bytes    md5:0e00f9c420d231bb5061e32d143376ec
36.3 GB      md5:b7beea5fcd1488421b356612bf45c99c