Large-scale, Stratified, Fully Annotated Acoustic Forest Soundscape Dataset of Avian Vocalizations from Eastern North America
Authors/Creators
- 1. Dartmouth College
- 2. Department of Forest and Wildlife Ecology, University of Wisconsin–Madison, Madison, WI, US
- 3. Cornell University
- 4. Environmental Science and Studies Programs, Colby-Sawyer College, New London, NH
- 5. University of Windsor
- 6. US Department of Interior, National Park Service, Northeast Temperate Network, 54 Elm St, Woodstock, VT
- 7. Department of Biological Sciences, Dartmouth College, Hanover, New Hampshire, US
- 8. Center for Avian Population Studies, Cornell Lab of Ornithology, Cornell University, Ithaca, NY, USA
- 9. K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY, USA
Description
This dataset contains 1302 10-minute soundscape recordings annotated by expert ornithologists, yielding approximately 176,000 vocalizations labelled for 88 bird species from the Northeastern USA. The data were recorded in 2022 and 2023 at 104 sites in four parks: Acadia National Park, Bar Harbor, Maine (ACAD, n=31); Hubbard Brook Experimental Forest, in White Mountains National Forest, New Hampshire (HBEF, n=24); Katahdin Woods and Waters National Monument, Millinocket, Maine (KAWW, n=20); and Marsh-Billings-Rockefeller National Historical Park, Woodstock, Vermont (MABI, n=25). These datasets are intended to facilitate reproducible research and to support the development and evaluation of automated bioacoustic analysis methods across ecology and machine learning. These annotated datasets can also be used to address other ecological and behavioral questions, including geographic variation in bird song types and song production, the temporal relationships between intra- and interspecific vocalizing individuals, and the phenology of singing behavior across species.
Data collection
We deployed SwiftOne recorders at 104 sites across four parks in the northeastern USA in 2022 and 2023. The microphone sensitivity of the recording unit was -25 dB (±3 dB) re 1 V/Pa. The microphone's frequency response was not measured, but is assumed to be flat (±2 dB) in the frequency range 100 Hz to 10 kHz. The recordings were made at a sampling rate of 32 kHz. The analog signal was amplified by 33 dB and digitized at 16-bit resolution (analog-to-digital converter (ADC) clipping level of ±0.9 V). This ongoing study aims to investigate the vocal activity patterns and seasonally changing diversity of local bird species. Recordings were collected for 5 hours in the morning (05:00 - 10:00 local time) and 1 hour in the evening (19:30 - 20:30) as uncompressed 1-hour WAV files at 32 kHz, which were then converted to FLAC. We then extracted 1302 10-minute recordings from this collection.
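The recorder settings above can be combined into a rough conversion from sample values to sound pressure level. The following is a minimal sketch, not a calibration procedure from the dataset authors. It assumes the nominal values quoted above (ignoring the ±3 dB sensitivity tolerance), a flat frequency response, and that the ADC clipping voltage corresponds to digital full scale; the function names are illustrative.

```python
import math

# Nominal values quoted in the data collection description.
MIC_SENSITIVITY_DB = -25.0   # dB re 1 V/Pa (stated tolerance: +/-3 dB)
GAIN_DB = 33.0               # analog gain applied before the ADC
ADC_CLIP_V = 0.9             # ADC clipping level in volts
P_REF = 20e-6                # standard reference pressure in air: 20 micropascals


def full_scale_pressure_pa():
    """Peak pressure (Pa) that just reaches the ADC clipping level."""
    # Volts at the ADC input per pascal at the microphone.
    volts_per_pa = 10 ** ((MIC_SENSITIVITY_DB + GAIN_DB) / 20)
    return ADC_CLIP_V / volts_per_pa


def sample_to_spl_db(sample, bits=16):
    """Approximate instantaneous level (dB re 20 uPa) of one signed integer sample.

    Assumption: digital full scale (2**(bits-1)) maps to the clipping voltage.
    """
    if sample == 0:
        return float("-inf")
    full_scale = 2 ** (bits - 1)
    pressure = abs(sample) / full_scale * full_scale_pressure_pa()
    return 20 * math.log10(pressure / P_REF)


# Level at which the recording chain would clip, under the assumptions above.
clip_spl_db = 20 * math.log10(full_scale_pressure_pa() / P_REF)
print(f"Approximate clipping level: {clip_spl_db:.1f} dB re 20 uPa (peak)")
```

Under these nominal values the clipping point works out to roughly 85 dB re 20 µPa (peak); the stated ±3 dB sensitivity tolerance carries over directly to this figure.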
Sampling and annotation protocol
We provide a collection of 10-minute recordings as three datasets: DatasetSIMR, DatasetMABI, and DatasetACAD. For the DatasetSIMR dataset, we selected 429 10-minute recordings corresponding to point-count surveys conducted concurrently in the field between 05:00 and 10:00 at 104 sites across all four parks in 2022 and 2023. For the ACAD (n = 396) and MABI (n = 477) datasets, we extracted 10-minute recordings starting 40 minutes after local sunrise on clear days. These clear days were selected during the peak breeding season (15 May to 7 July) with the goal of selecting approximately 20 mornings per site. A few sites have fewer than 20 recordings due to equipment failure. For the MABI dataset, 477 10-minute recordings were collected at 24 sites in 2022. For the ACAD dataset, 396 10-minute recordings were collected at 23 sites, with recordings in 2022 (2 sites) and 2023 (21 sites).
Annotators drew an annotation box around every vocalization they could recognize, ignoring those that were too faint or unidentifiable. Raven Pro 1.6 software was used to annotate the data. The provided labels contain full bird vocalizations, boxed in time and frequency space. Annotators were allowed to combine multiple consecutive calls of one species into one bounding box label if pauses between calls were shorter than two seconds. We used the standard four-letter code for bird species in accordance with the 65th AOU supplement (Chesser et al., 2025).
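The two-second grouping rule can be illustrated with a short sketch. This is not how the annotations were produced (annotators applied the rule by hand and at their discretion); it only shows what merging same-species boxes separated by pauses shorter than two seconds looks like. The `Box` class and `merge_boxes` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, replace


@dataclass
class Box:
    species: str     # four-letter species code
    start: float     # seconds
    end: float       # seconds
    low_hz: float
    high_hz: float


def merge_boxes(boxes, max_gap_s=2.0):
    """Merge consecutive same-species boxes whose pause is shorter than max_gap_s."""
    merged = []
    last_by_species = {}
    for box in sorted(boxes, key=lambda b: (b.species, b.start)):
        prev = last_by_species.get(box.species)
        if prev is not None and box.start - prev.end < max_gap_s:
            # Extend the previous box in time and frequency to cover this call.
            prev.end = max(prev.end, box.end)
            prev.low_hz = min(prev.low_hz, box.low_hz)
            prev.high_hz = max(prev.high_hz, box.high_hz)
        else:
            new_box = replace(box)  # copy so the input list is left untouched
            merged.append(new_box)
            last_by_species[box.species] = new_box
    return sorted(merged, key=lambda b: b.start)


calls = [
    Box("AMRO", 0.0, 1.0, 2000.0, 4000.0),
    Box("AMRO", 2.5, 3.0, 1800.0, 4200.0),   # 1.5 s pause: merged with the first
    Box("BCCH", 1.2, 1.8, 3000.0, 7000.0),
    Box("AMRO", 6.0, 7.0, 2000.0, 4000.0),   # 3.0 s pause: kept separate
]
merged_calls = merge_boxes(calls)
```

Because merging was optional, a single box in the annotation files may cover one call or several, which matters when counting vocalizations.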
Files in this collection
Please read the ReadMe.txt file for a complete description of each file included in the collection. All 10-minute recordings for each dataset can be accessed by downloading and extracting the corresponding recordings zip file (e.g., DatasetSIMR_Recordings.zip). The recording filenames contain a file ID, site (recording location), date, and timestamp in EST. As an example, the file “6001.41.01x.ACAD3002_20220519_050000.flac” has file ID 6001 and was recorded at site "ACAD3002" on 19 May 2022 at 05:41:00 EST.
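A filename of this form can be split into its parts with a regular expression. The sketch below is based only on the single example above. In particular, it assumes that the second dot-separated field ("41") is the minute offset of the 10-minute clip within the 1-hour source file, which is consistent with the stated 05:41:00 start time but is not confirmed here; the meaning of the third field ("01x") is not described, so it is kept as an opaque tag. ReadMe.txt remains the authoritative reference.

```python
import re
from datetime import datetime, timedelta

# Pattern inferred from one example filename; treat it as an assumption.
FILENAME_RE = re.compile(
    r"^(?P<file_id>\d+)\.(?P<minute>\d{2})\.(?P<tag>[^.]+)\."
    r"(?P<site>[^_]+)_(?P<date>\d{8})_(?P<time>\d{6})\.flac$"
)


def parse_recording_filename(name):
    """Split a recording filename into file ID, site, and timestamps (EST)."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"Unrecognized recording filename: {name!r}")
    hour_start = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return {
        "file_id": int(match["file_id"]),
        "site": match["site"],
        "tag": match["tag"],                      # meaning not documented here
        "hour_file_start": hour_start,            # start of the 1-hour source file
        # Assumption: the second field is a minute offset into the hour file.
        "clip_start": hour_start + timedelta(minutes=int(match["minute"])),
    }


info = parse_recording_filename("6001.41.01x.ACAD3002_20220519_050000.flac")
print(info["file_id"], info["site"], info["clip_start"].isoformat())
```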
Ground truth annotation text files generated using Raven Pro 1.6 for each 10-minute recording in each dataset can be downloaded and extracted from the corresponding annotations zip file (e.g., DatasetSIMR_Annotations.zip). Each row of an annotation text file represents one boxed vocalization, specifying the start and end time in seconds, the low and high frequency in Hertz, a 4-letter AOU species code, and the corresponding recording_filename. See "data_dictionary.csv" for a description of each field in the text file. The species codes can be mapped to the scientific and common names of a species using the “species.csv” file.
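As a rough illustration of working with these files, the sketch below reads a tab-delimited selection table and counts boxes per species. The time and frequency column names are Raven Pro's usual defaults; the species column name ("Species") and the sample rows are assumptions made for this example, so check "data_dictionary.csv" for the actual headers before using it.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the style of a Raven Pro selection table (tab-delimited).
SAMPLE_TABLE = (
    "Selection\tBegin Time (s)\tEnd Time (s)\tLow Freq (Hz)\tHigh Freq (Hz)\tSpecies\n"
    "1\t1.20\t3.45\t2100.0\t6800.0\tAMRO\n"
    "2\t4.00\t4.60\t3200.0\t7500.0\tBCCH\n"
    "3\t7.10\t9.00\t2000.0\t6500.0\tAMRO\n"
)


def read_annotations(handle, species_col="Species"):
    """Read one annotation file into a list of dicts with numeric fields."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        rows.append({
            "species": row[species_col].strip(),
            "start_s": float(row["Begin Time (s)"]),
            "end_s": float(row["End Time (s)"]),
            "low_hz": float(row["Low Freq (Hz)"]),
            "high_hz": float(row["High Freq (Hz)"]),
        })
    return rows


rows = read_annotations(io.StringIO(SAMPLE_TABLE))
counts = Counter(r["species"] for r in rows)
print(counts.most_common())
```

For a real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="")`.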
The spatial information for each site (longitude and latitude) and the number of 10-minute files are provided in the “site_metadata.txt” file. Each row of "recordings_metadata.csv" describes one 10-minute recording with associated information: recording_filename, annotation_filename, a unique item identifier (itemID), sub-collection identifier (subcollection_ID: DatasetACAD, DatasetMABI, DatasetSIMR), site identifier (siteID), date (YYYYMMDD), time (HHMM), year (YYYY), and park identifier (Park_ID). See "data_dictionary.csv" for further description of fields across CSV files. We also provide a test audio file designed for pre-deployment microphone testing (AudioTestFileForRecorders.wav). It contains pure tones from 500 to 9,500 Hz at 1-kHz intervals, with five amplitude levels per frequency.
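A quick sanity check after download is to count recordings per sub-collection in "recordings_metadata.csv". The sketch below uses the field names given above (subcollection_ID, siteID, Park_ID, itemID); the sample rows and their filenames are invented placeholders, and the exact header spelling should be confirmed against "data_dictionary.csv".

```python
import csv
import io
from collections import Counter

# Invented placeholder rows; only the column names come from the description.
SAMPLE_METADATA = (
    "recording_filename,itemID,subcollection_ID,siteID,Park_ID\n"
    "example_a.flac,1,DatasetACAD,ACAD3002,ACAD\n"
    "example_b.flac,2,DatasetMABI,MABI0001,MABI\n"
    "example_c.flac,3,DatasetACAD,ACAD3002,ACAD\n"
)


def count_by_subcollection(handle):
    """Return the number of recordings listed for each subcollection_ID."""
    reader = csv.DictReader(handle)
    return Counter(row["subcollection_ID"] for row in reader)


per_dataset = count_by_subcollection(io.StringIO(SAMPLE_METADATA))
print(dict(per_dataset))
```

Run against the real file, the counts would be expected to match the totals stated in the sampling protocol (429 for DatasetSIMR, 396 for DatasetACAD, 477 for DatasetMABI, 1302 overall).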
Acknowledgements
Compiling this extensive dataset was a significant undertaking, and we are grateful to the domain experts who helped collect and manually annotate the data for this collection (individual contributors in alphabetical order): Gillian Audier, Kyle Burton, Jack Bushong, Brooke A. Goodman, Alexander Harris, Brian Hofstetter, Lakshmi Meghana Kesanapalli, Ethan Reilly, and Ed Sharron.
Files (27.7 GB)
| MD5 checksum | Size |
|---|---|
| 23700d79087b544ce4b9024041e97ef3 | 2.2 MB |
| efe0c2b34edccf81eae100ba6703149d | 2.8 kB |
| 3a5d13bf8fdf95e5465560aae3fb468f | 1.4 MB |
| 28bcd75a93625a77e2d1c76175c41650 | 8.4 GB |
| 9842aacd5ca3f7a1223cb764b197a5fb | 3.1 MB |
| 9103e2b4fff5870af1ca4cf06a1751d6 | 10.2 GB |
| 87240d5658565367980758e3ab6b87c4 | 1.3 MB |
| 6c68fe0a05b9ccd8c0f6ae05a041f701 | 9.1 GB |
| b208cdac0be10e60fedcdac68e3be957 | 4.4 kB |
| d870505cd74af383e3aa65b0f66f928d | 176.2 kB |
| 9c8c4dab492a906929a43a2299de4ef1 | 3.9 kB |
| 39fe3a7acc7ba70d2ba8d85927aa3a80 | 5.7 kB |
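The MD5 checksums listed for each file can be used to confirm that a download completed without corruption, which is worth doing for the multi-gigabyte archives. A minimal sketch using Python's standard library follows; the function names are illustrative, and the file path and checksum in the usage comment are placeholders to be replaced with a downloaded file and its listed value.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a listed checksum, with or without an 'md5:' prefix."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Usage (placeholders):
# ok = matches_checksum("path/to/downloaded_file", "md5:<checksum from the listing>")
```

Reading in chunks keeps memory use constant regardless of archive size.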