There is a newer version of the record available.

Published February 13, 2020 | Version 1.0
Dataset Open

BirdVox-ANAFCC: A dataset for American Northeast Avian Flight Call Classification

  • 1. New York University
  • 2. Cornell Lab of Ornithology
  • 3. Old Bird, Inc.
  • 4. Adobe, Inc.

BirdVox-ANAFCC: A dataset for American Northeast Avian Flight Call Classification
===============================================================
Version 1.0, May 2020.

https://wp.nyu.edu/birdvox


Description
---------------

BirdVox-ANAFCC is a dataset of short audio waveforms, each of them containing a flight call from one of 14 birds of North America: four American sparrows, one cardinal, two thrushes, and seven New World warblers.
* American Tree Sparrow (ATSP)
* Chipping Sparrow (CHSP)
* Savannah Sparrow (SAVS)
* White-throated Sparrow (WTSP)
* Rose-breasted Grosbeak (RBGR)
* Gray-cheeked Thrush (GCTH)
* Swainson's Thrush (SWTH)
* American Redstart (AMRE)
* Bay-breasted Warbler (BBWA)
* Black-throated Blue Warbler (BTBW)
* Canada Warbler (CAWA)
* Common Yellowthroat (COYE)
* Mourning Warbler (MOWA)
* Ovenbird (OVEN)

It also contains other sounds that are often confused with one of the species above. These "confounding factors" encompass flight calls from other species of birds, vocalizations from non-avian animals, and some machine beeps.

BirdVox-ANAFCC results from an aggregation of various smaller datasets, integrated under a common taxonomy. For more details on this taxonomy, we refer the reader to [1]:

[1] J. Cramer, V. Lostanlen, J. Salamon, A. Farnsworth, J. Bello. Chirping up the right tree: Incorporating biological taxonomies into deep bioacoustic classifiers. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020.


Data Files
------------
BirdVox-ANAFCC contains the recordings as HDF5 files, sampled at 22,050 Hz, with a single channel (mono). Each HDF5 file contains flight call vocalizations of a particular species. The name of each HDF5 file follows the format: `<data-source>_<taxonomy-code>_original.h5`. The name of the HDF5 dataset in each file is "waveforms", with the corresponding key for each audio recording varying in format depending on the data source.
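As an illustration of the naming convention and file layout described above, here is a minimal Python sketch. It rests on several assumptions that the description does not confirm: that the taxonomy code in a file name contains no underscore, that "waveforms" behaves as an HDF5 group holding one array per recording key, and that the third-party `h5py` package is installed for the reading step. The function names and the example file name are hypothetical.

```python
from pathlib import Path

SAMPLE_RATE_HZ = 22050  # sampling rate stated in the dataset description
NAME_SUFFIX = "_original.h5"


def parse_h5_name(filename):
    """Split '<data-source>_<taxonomy-code>_original.h5' into its two fields."""
    name = Path(filename).name
    if not name.endswith(NAME_SUFFIX):
        raise ValueError(f"unexpected file name: {filename!r}")
    # Assumption: the taxonomy code has no underscore, so the last
    # underscore before the suffix separates the two fields.
    data_source, sep, taxonomy_code = name[: -len(NAME_SUFFIX)].rpartition("_")
    if not sep or not data_source or not taxonomy_code:
        raise ValueError(f"unexpected file name: {filename!r}")
    return data_source, taxonomy_code


def iter_waveforms(path):
    """Yield (key, samples) pairs from the "waveforms" entry of one file."""
    import h5py  # third-party; imported lazily so name parsing works without it

    with h5py.File(path, "r") as h5_file:
        waveforms = h5_file["waveforms"]  # assumed to be a group of arrays
        for key in waveforms:
            yield key, waveforms[key][()]  # [()] reads the whole array


def duration_seconds(n_samples):
    """Convert a number of samples to seconds at the dataset's sampling rate."""
    return n_samples / SAMPLE_RATE_HZ
```

For example, `parse_h5_name("ExampleSource_1.1.1_original.h5")` separates the hypothetical data source `ExampleSource` from the code `1.1.1`. Since key formats vary by data source, it is safer to iterate over the keys than to construct them.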

 

Metadata Files
---------------
`taxonomy.yaml` details the three-level taxonomy structure used in this dataset, reflected in three-number codes which largely follow "<order>.<family>.<species>". Additionally, at any level of the taxonomy, the numeric code "0" is reserved for "other" and the code "X" refers to "unknown". For example, 1.1.0 corresponds to an American Sparrow of a species outside of our scope of interest, and 1.1.X corresponds to an American Sparrow of unknown species. At the top level (order), the "other" codes (0.\*.\*) deviate from the order-family-species scheme in order to capture a variety of other out-of-scope sounds, including anthropophony, non-avian biophony, and biophony of avians outside of the scope of interest.
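The code convention can be made concrete with a short Python sketch that splits a code into its three dot-separated positions and maps "0" and "X" to "other" and "unknown". The function name and the neutral level labels (`top`, `middle`, `species`) are hypothetical and do not come from `taxonomy.yaml`. The sketch does not interpret the special 0.\*.\* codes beyond marking their top level as "other".

```python
LEVEL_NAMES = ("top", "middle", "species")  # hypothetical labels for the 3 positions


def parse_taxonomy_code(code):
    """Map a code such as '1.1.0' to a dict with one entry per taxonomy level.

    Each value is a positive integer, "other" (code 0), or "unknown" (code X).
    """
    parts = code.split(".")
    if len(parts) != len(LEVEL_NAMES):
        raise ValueError(f"expected three dot-separated fields, got {code!r}")
    parsed = {}
    for name, part in zip(LEVEL_NAMES, parts):
        if part == "X":
            parsed[name] = "unknown"
        elif part.isdigit():
            number = int(part)
            parsed[name] = "other" if number == 0 else number
        else:
            raise ValueError(f"invalid field {part!r} in code {code!r}")
    return parsed
```

With this sketch, `parse_taxonomy_code("1.1.0")` marks the species level as "other", while `parse_taxonomy_code("1.1.X")` marks it as "unknown", matching the two American Sparrow examples in the text.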


Please acknowledge BirdVox-ANAFCC in academic research
--------------------------------------------------------------------------

When BirdVox-ANAFCC is used for academic research, we would greatly appreciate it if scientific publications based in part on this dataset cited the following publication:

J. Cramer, V. Lostanlen, J. Salamon, A. Farnsworth, J. Bello. Chirping up the right tree: Incorporating biological taxonomies into deep bioacoustic classifiers. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020.

The creation of this dataset was supported by NSF grants 1125098 (BIRDCAST) and 1633259 (BIRDVOX), a Google Faculty Award, the Leon Levy Foundation, and two anonymous donors.

 

Conditions of Use
----------------------

Dataset created by Jason Cramer, Vincent Lostanlen, Bill Evans, Andrew Farnsworth, Justin Salamon, and Juan Pablo Bello.
 
The BirdVox-ANAFCC dataset is offered free of charge under the terms of the Creative Commons Attribution 4.0 International License:
https://creativecommons.org/licenses/by/4.0/
 
The dataset and its contents are made available on an "as is" basis and without warranties of any kind, including without limitation satisfactory quality and conformity, merchantability, fitness for a particular purpose, accuracy or completeness, or absence of errors. Subject to any liability that may not be excluded or limited by law, the authors are not liable for, and expressly exclude all liability for, loss or damage however and whenever caused to anyone by any use of the BirdVox-ANAFCC dataset or any part of it.


Feedback
-------------

Please help us improve BirdVox-ANAFCC by sending your feedback to:
vincent.lostanlen@gmail.com and jtc440@nyu.edu

In case of a problem, please include as many details as possible.


Acknowledgement
--------------------------
We thank Jessie Barry, Ian Davies, Tom Fredericks, Jeff Gerbracht, Sara Keen, Holger Klinck, Anne Klingensmith, Ray Mack, Peter Marchetto, Ed Moore, Matt Robbins, Ken Rosenberg, and Chris Tessaglia-Hymes.

We thank contributors and maintainers of the Macaulay Library and the Xeno-Canto website.

We acknowledge that the land on which the data was collected is the unceded territory of the Cayuga nation, which is part of the Haudenosaunee (Iroquois) confederacy.

Files (2.3 GB)

Checksum  Size
md5:de234c1eea42ddd8b9f208bdec1d9ad1  8.3 kB
md5:4d05edbb4dc7bd43aeec01df9c828e11  8.3 kB
md5:d65742877de1a2db341e1b65f47180cf  3.8 MB
md5:9b7d4837a15b87d58485c8af4dabcf2e  8.3 kB
md5:ccd2f898397cfd7c07a89ea0a49b06f1  8.3 kB
md5:5c46e0909402e9a32416854a41c0689a  8.3 kB
md5:b93d2dca1404080281f2f3d4643716aa  8.3 kB
md5:87ecca74194019d62bb625b06c7e2785  5.0 MB
md5:a98c2247c158fbdbacab5decd19a5bc0  8.3 kB
md5:4c640eeac747746a103664fd8d91b462  8.3 kB
md5:99ca398379c45c560c827a6677f00e29  8.3 kB
md5:57acc47ce7261e7ffc07d10e7c9e526e  61.9 MB
md5:3537f60d3539ad2baf528c4ef6071269  4.6 MB
md5:beefe7f6204dabd9de7a0ec5e8cfc2b8  917.4 kB
md5:05bcfa90618cd58a4c9f3aff6a8d102e  2.4 MB
md5:ebd1c2b175a818817808f5b35f3c612a  191.7 kB
md5:d0c6771fe7a1d9915ba555b8d4151226  883.1 kB
md5:e1326661a97415e641e6ecf089234920  4.9 MB
md5:1738e3cdd0f065ea872c134cb485fc37  142.6 MB
md5:c6aaf3f1c06e645b916a8c8cfec0ab97  167.7 MB
md5:39f81850133d5aaace6d42c86a5ad007  12.1 MB
md5:bce537e0449b484881ae220fc628a445  52.7 MB
md5:0d6b8da38b9bae43772419194dc4c3de  59.3 MB
md5:293ebc21a935022fe971c1305170fb3e  245.7 kB
md5:57b040016dd8ba5cb23c6634148b161a  28.5 MB
md5:5f7eec249fc0e039014fb77154a263cc  18.4 MB
md5:bd74f000e524e131a1d8d79fdbe8eaf2  45.9 MB
md5:89e57a9bbad84ba5830c40dbf41a578c  127.0 MB
md5:dbec2fa71e368a5907f00e8b4403c80c  151.1 MB
md5:1ac357a26fe23c4835604d602ef579a8  70.4 MB
md5:87a4de7e6583889279b953db5e80bc7a  46.8 kB
md5:8a7cd408b5fd4453c9469ea33a369c02  72.0 kB
md5:61a4548b024e71275d7c30b68cef3190  235.9 MB
md5:ff3436c66220858a7971f696a6c9ca1d  93.7 MB
md5:0b806bfcf003ee07859755833e3221cb  33.3 MB
md5:822ed2d2fe6b993492d412ebaec0959e  20.1 MB
md5:f582bc8278f7c6e4206d2664f337b9ef  79.0 MB
md5:b4eaf89ab37b3d49a9d6c9e705f17e77  11.1 MB
md5:197ce7304899c5d435bb82df7169aa3a  79.7 MB
md5:76acb58ce39eb64a529eb692b6e89fca  49.5 kB
md5:61c219f1babf1100f64f4a3957960055  827.8 MB
md5:1be0ead7a17b34da171d8edd7496dd3b  4.1 MB
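
The MD5 checksums in the file listing can be used to check that a download completed without corruption. The following Python sketch, which uses only the standard library, computes a file's MD5 digest in chunks and compares it with a listed value. The function names are hypothetical, and the path passed in is whatever local file the reader has downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use low for the larger HDF5 files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a listed value such as 'md5:de23...'."""
    # Drop the optional 'md5:' prefix used in the listing.
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A mismatch most likely indicates an incomplete or corrupted download, in which case the file should be fetched again.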