Published January 21, 2022 | Version 1.0
Dataset Open

BirdVox-25SD: a dataset of flight calls with species annotations

  • 1. Cornell Lab of Ornithology
  • 2. LS2N
  • 3. Adobe Research
  • 4. New York University

Description

BirdVox 25 Species Dataset (BirdVox-25SD)
=============
Version 1.0, Jan 2021.

Created By
----------

Andrew Farnsworth (1), Benjamin Mark Van Doren (1), Steve Kelling (1), Vincent Lostanlen (2), Justin Salamon (3), Aurora Cramer (4), Juan Pablo Bello (4)

(1): Cornell Lab of Ornithology (CLO)
(2): Laboratoire des Sciences du Numérique de Nantes (LS2N), CNRS
(3): Adobe Research
(4): New York University

https://wp.nyu.edu/birdvox


Description
-----------

The BirdVox 25 Species Dataset (BirdVox-25SD) contains 26,124 audio clips of avian flight calls, each ranging from about 150 ms to 500 ms in duration. The clips are extracted from the BirdVox-296h dataset using the corresponding annotations. The recordings come from ROBIN autonomous recording units, placed near Ithaca, NY, USA during the 2015 migration season (August - November).

The dataset can be used, among other things, for the research,
development and testing of bioacoustic classification models.

For details on the hardware of ROBIN recording units, we refer the reader to [1].

[1] J. Salamon, J. P. Bello, A. Farnsworth, M. Robbins, S. Keen, H. Klinck, and S. Kelling. Towards the Automatic Classification of Avian Flight Calls for Bioacoustic Monitoring. PLoS One, 2016.


Changes from BirdVox-14SD
----------------------------

This dataset builds upon the BirdVox 14 Species Dataset (BirdVox-14SD), adding ~12,000 audio clips and annotations. The annotation taxonomy has been expanded to add a new order, a new family, and 11 new species. Additionally, the audio clips are more accurately aligned to the annotation times.

For backwards compatibility with the BirdVox-14SD taxonomy, we include the file `birdvox25sd-to-birdvox14sd-taxonomy-code-map.csv` which maps BirdVox-25SD taxonomy codes to BirdVox-14SD taxonomy codes.


Taxonomic Annotations
-----------------------

Classification annotations for each flight call are given at three taxonomic levels: order, family, and species. These annotations are condensed into a three-number code that largely follows the format `{order}.{family}.{species}`. The specific numeric codes are:

* Order
    * 1.\*.\* - Passeriformes
    * 2.\*.\* - Pelecaniformes
* Family
    * 1.1.\* - American Sparrow
    * 1.2.\* - Cardinals
    * 1.3.\* - Thrushes
    * 1.4.\* - New World warblers
    * 2.1.\* - Herons
* Species
    * 1.1.1  - American tree sparrow (ATSP)
    * 1.1.2  - Chipping sparrow (CHSP)
    * 1.1.3  - Savannah sparrow (SAVS)
    * 1.1.4  - White-throated sparrow (WTSP)
    * 1.1.5  - Song sparrow (SOSP)
    * 1.2.1  - Rose-breasted grosbeak (RBGR)
    * 1.3.1  - Gray-cheeked thrush (GCTH)
    * 1.3.2  - Swainson's thrush (SWTH)
    * 1.3.3  - Hermit thrush (HETH)
    * 1.3.4  - Veery (VEER)
    * 1.3.5  - Wood thrush (WOTH)
    * 1.4.1  - American redstart (AMRE)
    * 1.4.2  - Bay-breasted warbler (BBWA)
    * 1.4.3  - Black-throated blue warbler (BTBW)
    * 1.4.4  - Canada warbler (CAWA)
    * 1.4.5  - Common yellowthroat (COYE)
    * 1.4.6  - Mourning warbler (MOWA)
    * 1.4.7  - Ovenbird (OVEN)
    * 1.4.8  - Black-and-white warbler (BAWW)
    * 1.4.9  - Cape May warbler (CMWA)
    * 1.4.10 - Chestnut-sided warbler (CSWA)
    * 1.4.11 - Northern parula (NOPA)
    * 1.4.12 - Wilson's warbler (WIWA)
    * 1.4.13 - Yellow-rumped warbler (YRWA)
    * 2.1.1  - Green heron (GRHE)

Additionally, at any level of the taxonomy, the numeric code "0" is reserved for "other" and the code "X" refers to "unknown". For example, 1.1.0 corresponds to an American Sparrow with a species outside of our scope of interest, and 1.1.X corresponds to an American Sparrow of unknown species. At the top level (order), the "other" codes (0.\*.\*) deviate from the order-family-species structure in order to capture a variety of other out-of-scope sounds, including anthropophony, non-avian biophony, and biophony of avians outside of the scope of interest. Please refer to `BirdVox-296h_taxonomy.yaml` in BirdVox-296h for the details of this taxonomy structure.
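
As an illustration of the code scheme described above, the following minimal sketch splits a taxonomy code into its three levels and tests it against wildcard patterns such as `1.4.*`. The names `TaxonomyCode`, `parse_taxonomy_code`, `describe_level`, and `matches` are hypothetical helpers and are not shipped with the dataset. The sketch only separates the fields; it does not attempt to interpret the special `0.*.*` codes, for which `BirdVox-296h_taxonomy.yaml` remains the reference.

```python
from typing import NamedTuple

OTHER = "0"    # reserved code for "other" (outside the scope of interest)
UNKNOWN = "X"  # reserved code for "unknown"


class TaxonomyCode(NamedTuple):
    """Hypothetical container for the three levels of a code."""
    order: str
    family: str
    species: str


def parse_taxonomy_code(code: str) -> TaxonomyCode:
    """Split a code such as '1.4.10' into (order, family, species)."""
    parts = code.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected three dot-separated fields, got {code!r}")
    for part in parts:
        # Each field is either a non-negative integer or the letter X.
        if part != UNKNOWN and not part.isdigit():
            raise ValueError(f"invalid field {part!r} in code {code!r}")
    return TaxonomyCode(*parts)


def describe_level(value: str) -> str:
    """Classify one field as 'other', 'unknown', or an in-scope label."""
    if value == OTHER:
        return "other"
    if value == UNKNOWN:
        return "unknown"
    return "in scope"


def matches(code: str, pattern: str) -> bool:
    """Return True if code matches a pattern such as '1.4.*'."""
    code_parts = parse_taxonomy_code(code)
    pattern_parts = pattern.split(".")
    if len(pattern_parts) != 3:
        raise ValueError(f"expected three dot-separated fields, got {pattern!r}")
    return all(p == "*" or p == c for p, c in zip(pattern_parts, code_parts))
```

For instance, `matches("1.4.10", "1.4.*")` holds because Chestnut-sided warbler belongs to the New World warblers, whereas `matches("2.1.1", "1.*.*")` does not because Green heron is not in Passeriformes.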


Data Files
------------

BirdVox-25SD contains the recordings as HDF5 files, sampled at 22,050 Hz, with a single channel (mono). Each HDF5 file contains flight call vocalizations of a particular species. The name of each HDF5 file follows the format: `BirdVox-25SD-v1pt0_{taxonomy_code}_original.h5`. The name of the HDF5 dataset in each file is "waveforms", with the corresponding key for each audio recording following the format: `unit-{unit_num}`.
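
The file layout described above could be read along the following lines. This is a sketch under stated assumptions, not an official loader: it assumes that the third-party `h5py` package is installed, that "waveforms" can be iterated as an HDF5 group holding one array per clip, and that the taxonomy code appears in the file name as written in the annotations (the separator used inside file names is not stated here). All function names are hypothetical.

```python
import re
from pathlib import Path

SAMPLE_RATE_HZ = 22050  # sampling rate stated in the dataset description

# The pattern accepts any code without underscores, since the exact
# spelling of the code inside file names is an assumption.
_FILENAME_RE = re.compile(r"^BirdVox-25SD-v1pt0_(?P<code>[^_]+)_original\.h5$")


def filename_for(taxonomy_code: str) -> str:
    """Build the documented file name for a given taxonomy code."""
    return f"BirdVox-25SD-v1pt0_{taxonomy_code}_original.h5"


def taxonomy_code_from(path) -> str:
    """Extract the taxonomy code from a file name of the documented form."""
    match = _FILENAME_RE.match(Path(path).name)
    if match is None:
        raise ValueError(f"unexpected file name: {path!r}")
    return match.group("code")


def duration_ms(num_samples: int) -> float:
    """Convert a number of samples to a duration in milliseconds."""
    return 1000.0 * num_samples / SAMPLE_RATE_HZ


def load_waveforms(path):
    """Return a dict mapping each clip key to its waveform array.

    Assumes "waveforms" is a group with one entry per clip; adapt this
    if the actual file structure differs.
    """
    import h5py  # third-party dependency, imported only when needed

    with h5py.File(path, "r") as f:
        group = f["waveforms"]
        return {key: group[key][()] for key in group}
```

At 22,050 Hz, a clip of 11,025 samples lasts 500 ms, which is the upper end of the duration range quoted in the description.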


Conditions of Use
----------------------

Dataset created by Andrew Farnsworth, Benjamin Mark Van Doren, Steve Kelling, Vincent Lostanlen, Justin Salamon, Aurora Cramer, and Juan Pablo Bello.

The BirdVox-25SD dataset is offered free of charge under the terms of the Creative Commons Attribution 4.0 International License.

The dataset and its contents are made available on an "as is" basis and without warranties of any kind, including without limitation satisfactory quality and conformity, merchantability, fitness for a particular purpose, accuracy or completeness, or absence of errors. Subject to any liability that may not be excluded or limited by law, CLO is not liable for, and expressly excludes all liability for, loss or damage however and whenever caused to anyone by any use of the BirdVox-25SD dataset or any part of it.


Feedback
-----------

Please help us improve BirdVox-25SD by sending your feedback to:
vincent.lostanlen@gmail.com and auroracramer@nyu.edu

In case of a problem, please include as many details as possible.


Acknowledgements
------------------------
Jessie Barry, Ian Davies, Tom Fredericks, Jeff Gerbracht, Sara Keen, Holger Klinck, Anne Klingensmith, Ray Mack, Peter Marchetto, Ed Moore, Matt Robbins, Ken Rosenberg, and Chris Tessaglia-Hymes.

We acknowledge that the land on which the data was collected is the unceded territory of the Cayuga nation, which is part of the Haudenosaunee (Iroquois) confederacy.

The creation of this dataset was supported by NSF grant 1633259 (BIRDVOX).

Files (2.4 GB)

Checksum (MD5) Size
md5:eb06c73765eeb69b98bd6a74c75faf07 98.1 kB
md5:d63b0da648802a723a05077dfc008ce3 11.6 MB
md5:7b5ed7259cef0eb324b33b130950f356 369.4 kB
md5:2197ca002cf0668b96a5feb5f6f4ee1c 13.2 MB
md5:15cdedbc1ac25d057b3105701f4430f1 1.1 MB
md5:c79a9fe15aa04e39b813ec39e240435c 459.7 kB
md5:587b11d02d2d8e67cfd6738e9a270742 158.7 MB
md5:21b874e8d32bea53d0145f4b9bc03867 7.1 MB
md5:59de32f785dead4e96f262332e4882f6 4.3 MB
md5:850ca9c50028322a82dc2e615fcdebe9 19.0 MB
md5:f5a1836c326f5dce0d2f8f4c0dee65e3 3.5 MB
md5:71be9b76827150716432199b98b2840a 98.1 kB
md5:d511eb27d5eca416238e76ec224fe179 2.3 MB
md5:115242ef47e76d7a006abec0f20137b0 5.2 MB
md5:a5211ec6018629af81697babf5f4871a 27.7 MB
md5:5bd385d96f7da283d0329241db941408 10.9 MB
md5:081910b78d9e5f2076367eb8f834c92d 13.8 MB
md5:c8c8c4611286b6817e710f9d4b089f28 10.5 MB
md5:4f6ebdec3b3d45b6f2e6b3b5c595d342 27.6 MB
md5:c8c051fc55551ecf5b8e739d2f95c325 35.8 MB
md5:803826e50d61cbb4a0762b055971d446 1.3 MB
md5:16e1e0f35c8ed921f8e93b366cf055a1 91.2 MB
md5:9acb371ad5221217a68219fdfcbd59d7 2.6 MB
md5:ec54ccc2dab017984320cb0038d5a5d4 40.7 MB
md5:b4a7105fa2f7b465caad93b9a0c98945 98.1 kB
md5:cf98fc28e978fb3c599941ee09a3b4e5 10.1 MB
md5:7b268bbe7c80743937ae68a30ff991f0 13.9 MB
md5:0ebff44749dce010b0639196baa84b3f 312.1 MB
md5:29a124d50f27d2dc57d6406ed0b97e00 5.0 MB
md5:3be32928a785a567bcbddad16bc0360f 73.8 MB
md5:61c3a36bbbf251427a47cdd0fef5669e 33.0 MB
md5:b4fa095491f419e1582c245d2889a8fc 112.9 MB
md5:4365ad379979bf01642ac95775980761 20.6 MB
md5:aef305098edbb44d6e666dc914a8b4ac 3.7 MB
md5:e5c6a9cff29f49e7ba94e3cc3644d941 3.0 MB
md5:b66436246ef81e2b48fc4336ddf3c0b2 2.4 MB
md5:ba4e6c15db18d9c5a446d7947b8c2ceb 1.9 MB
md5:57b0bf5c0520e3da77e67b3eb5ff8dfb 13.6 MB
md5:d2f3ed312003f98359b22be07ab63999 11.3 MB
md5:6a69edde81c45686730c9d43362369d4 17.4 MB
md5:969de8ce9493b74a820193c3d48b2d3d 1.5 MB
md5:39296de18abdb74dc50d332264160aa4 26.3 MB
md5:55fc8f599f6424f87742de027d5c4561 1.4 MB
md5:a4653633ec65ec8391f9685fc7077fbf 149.8 MB
md5:b012ebd651a02fdecb49e04a51f92314 4.0 MB
md5:d4c505ebc3f5621a776ca0864a8b2473 23.9 MB
md5:540380d1931e87e34429050945f19526 207.1 MB
md5:c330f8d4db20a8c4f4c02d43f7b627f5 741.9 MB
md5:ea249ef5f013c9fe2ca320d38d3cb40c 1.0 MB
md5:72e99bcec74699f4e3bd1e4f2c24a4dc 825.6 kB
md5:6d2aa8d15d8967ab6395dceaeafc99d0 279.0 kB
md5:caffac69e46edad879be466b5b842ba5 94.2 MB
md5:91a525d4b1a3b9d29aeb08735917e604 954 Bytes
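
The `md5:` checksums listed above can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `matches_listed_checksum` are hypothetical, and since the listing above does not show which file each checksum belongs to, that pairing has to be taken from the dataset record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so that large files need little memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file against a listed checksum, with or without 'md5:'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A mismatch usually indicates an incomplete or corrupted download, in which case the file should be fetched again.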

Additional details

Related works

References
Dataset: 10.5281/zenodo.5856260 (DOI)
Dataset: 10.5281/zenodo.3667094 (DOI)