A collection of fully-annotated soundscape recordings from the Northeastern United States
This collection contains 285 hour-long soundscape recordings annotated by expert ornithologists, who provided 50,760 bounding-box labels for 81 different bird species from the Northeastern USA. The data were recorded in 2017 in the Sapsucker Woods bird sanctuary in Ithaca, NY, USA. Parts of this collection have been used as test data in the 2019, 2020 and 2021 BirdCLEF competitions, and it is primarily useful for training and evaluating machine learning algorithms.
As part of the Sapsucker Woods Acoustic Monitoring Project (SWAMP), the K. Lisa Yang Center for Conservation Bioacoustics at the Cornell Lab of Ornithology deployed 30 first-generation SWIFT recorders in the surrounding bird sanctuary area in Ithaca, NY, USA. This ongoing study aims to investigate the vocal activity patterns and seasonally changing diversity of local bird species. The data are also used to assess the impact of noise pollution on the behavior of birds.
The microphones had a sensitivity of -44 (+/-3) dB re 1 V/Pa. Their frequency response was not measured, but is assumed to be flat (+/- 2 dB) in the frequency range 100 Hz to 7.5 kHz. The analog signal was amplified by 33 dB and digitized (16-bit resolution) using an analog-to-digital converter (ADC) with a clipping level of +/- 0.9 V. The units recorded continuously (24 h/day) to 1-hour uncompressed WAVE files at 48 kHz; for this collection, the files were converted to FLAC and resampled to 32 kHz. Parts of this dataset have previously been used in the 2019, 2020 and 2021 BirdCLEF competitions.
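The hardware figures above can be combined into a rough estimate of the sound pressure level at which a recording would clip. The following is a minimal sketch, not a calibration procedure from the project. It assumes that the 0.9 V clipping level refers to the peak voltage at the ADC input after the 33 dB gain, and it uses the conventional 20 µPa reference pressure for airborne sound. The function name is hypothetical.

```python
import math

# Values taken from the description above.
SENSITIVITY_DB = -44.0   # microphone sensitivity, dB re 1 V/Pa
GAIN_DB = 33.0           # analog gain before the ADC
CLIP_VOLTS = 0.9         # ADC clipping level (assumed to be peak volts after gain)

# Assumption: standard reference pressure for sound in air.
REF_PRESSURE_PA = 20e-6


def clipping_spl_db(sensitivity_db, gain_db, clip_volts):
    """Estimate the peak sound pressure level (dB re 20 µPa) at which the ADC clips."""
    # Overall sensitivity of microphone plus amplifier, in volts per pascal.
    volts_per_pascal = 10 ** ((sensitivity_db + gain_db) / 20)
    # Peak pressure that produces exactly the clipping voltage.
    peak_pascal = clip_volts / volts_per_pascal
    return 20 * math.log10(peak_pascal / REF_PRESSURE_PA)


if __name__ == "__main__":
    level = clipping_spl_db(SENSITIVITY_DB, GAIN_DB, CLIP_VOLTS)
    print(f"Estimated clipping level: {level:.1f} dB re 20 µPa (peak)")
```

Under these assumptions the nominal values give roughly 104 dB re 20 µPa (peak). The +/-3 dB sensitivity tolerance shifts this estimate by the same amount in the opposite direction.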
Sampling and annotation protocol
We subsampled data for this collection by randomly selecting one 1-hour file from one of the 30 different recording units for each hour of one day per week between February and August 2017. We excluded recordings that were shorter than one hour or did not contain any bird vocalization. Annotators were asked to box every bird call they could recognize, ignoring those that were too faint or unidentifiable. Raven Pro software was used to annotate the data. The provided labels contain full bird calls that are boxed in time and frequency. Annotators were allowed to combine multiple consecutive calls of one species into one bounding box label if pauses between calls were shorter than five seconds. We use eBird species codes as labels, following the 2021 eBird taxonomy (Clements list).
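The subsampling scheme described above can be sketched as follows. This is an illustration only, not the script used to build the collection. It assumes that the sampled day falls on a fixed weekday (every seventh day from the start date) and that recording units are numbered 1 to 30; the function name, the seed, and the example dates are hypothetical. The exclusion of short or empty recordings is not modeled.

```python
import random
from datetime import date, timedelta


def sample_schedule(start, end, n_units=30, seed=0):
    """Pick one recording unit at random for each hour of one day per week.

    Returns a list of (day, hour, unit) tuples. The weekly step and the
    unit numbering are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    schedule = []
    day = start
    while day <= end:
        for hour in range(24):
            # One randomly chosen unit supplies the 1-hour file for this hour.
            unit = rng.randint(1, n_units)
            schedule.append((day, hour, unit))
        day += timedelta(days=7)
    return schedule


if __name__ == "__main__":
    # Hypothetical date range within the study period.
    plan = sample_schedule(date(2017, 2, 1), date(2017, 2, 28))
    print(len(plan), "candidate files;", "first entry:", plan[0])
```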
Files in this collection
Audio recordings can be accessed by downloading and extracting the “soundscape_data.zip” file. Soundscape recording filenames contain a sequential file ID, the recording date and a timestamp in UTC. As an example, the file “SSW_001_20170225_010000Z.flac” has sequential ID 001 and was recorded on Feb 25th 2017 at 01:00:00 UTC. Ground truth annotations are listed in “annotations.csv”, where each line specifies the corresponding filename, start and end time in seconds, low and high frequency in Hertz, and an eBird species code. These species codes can be mapped to the scientific and common names of a species using the “species.csv” file. The approximate recording location (longitude and latitude) can be found in the “recording_location.txt” file.
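The filename convention above can be parsed with a short helper. The sketch below is based only on the single example filename given here, so the regular expression is an assumption about the general pattern, and the function names are hypothetical. The second helper converts an annotation's start or end time (seconds from the beginning of the file) into an absolute UTC time.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed pattern, inferred from "SSW_001_20170225_010000Z.flac".
FILENAME_PATTERN = re.compile(r"^SSW_(\d+)_(\d{8})_(\d{6})Z\.flac$")


def parse_soundscape_filename(name):
    """Return (file_id, recording_start) for a soundscape filename."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unexpected filename format: {name!r}")
    file_id, day, clock = match.groups()
    # The trailing "Z" marks the timestamp as UTC.
    start = datetime.strptime(day + clock, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return file_id, start


def annotation_time(name, offset_seconds):
    """Convert an annotation offset in seconds into an absolute UTC time."""
    _, start = parse_soundscape_filename(name)
    return start + timedelta(seconds=offset_seconds)


if __name__ == "__main__":
    example = "SSW_001_20170225_010000Z.flac"
    print(parse_soundscape_filename(example))
    print(annotation_time(example, 90.5))
```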
Compiling this extensive dataset was a major undertaking, and we are very thankful to the domain experts who helped to collect and manually annotate the data for this collection (individual contributors in alphabetical order): Jessie Barry, Sarah Dzielski, Cullen Hanks, Robert Koch, Jim Lowe, Jay McGowan, Ashik Rahaman, Yu Shiu, Laurel Symes, and Matt Young.