Published September 14, 2022 | Version 1
Dataset Open

A collection of fully-annotated soundscape recordings from the Southwestern Amazon Basin

  • 1. College of Agriculture and Life Sciences, Cornell University
  • 2. K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University


This collection contains 21 hour-long soundscape recordings annotated with 14,798 bounding-box labels for 132 bird species from the Southwestern Amazon Basin. The data were recorded in 2019 at the Inkaterra Reserva Amazonica, Madre de Dios, Peru. Parts of this collection were featured as test data in the 2020 BirdCLEF competition, and the collection is intended primarily for training and evaluating machine learning algorithms.

Data collection

These acoustic data were collected at the Inkaterra Reserva Amazonica (ITRA) between January 14th and February 2nd, 2019, during the rainy season. ITRA is a 2 km² lowland rainforest reserve on the banks of the Madre de Dios river, approximately 20 km east of the frontier town of Puerto Maldonado. The region's extraordinary biodiversity is threatened by accelerating rates of deforestation, degradation, and fragmentation, which are driven primarily by expanding road networks, mining, agriculture, and an increasing population. The data from this site were collected as part of a study designed to assess spatio-temporal variation in avian species richness and vocal activity levels across intact, degraded, and edge forest, and between different days at the same point locations.

Ten SWIFT recording units, provided by the K. Lisa Yang Center for Conservation Bioacoustics at the Cornell Lab of Ornithology, were placed at separate sites spanning edge habitat, degraded forest, and intact forest within the reserve. These omnidirectional recorders were set to record uncompressed WAVE files continuously for the duration of their deployment, at a sampling rate of 48 kHz. The microphones had a sensitivity of -44 (+/- 3) dB re 1 V/Pa. Their frequency response was not measured but is assumed to be flat (+/- 3 dB) in the frequency range 100 Hz to 7.5 kHz. The analog signal was amplified by 35 dB and digitized (16-bit resolution) using an analog-to-digital converter (ADC) with a clipping level of +/- 0.9 V. For this collection, recordings were resampled to 32 kHz and converted to FLAC. Recorders were placed at a consistent height of approximately 1.5 m above the ground. To minimize background noise, all sites used for data analysis were located at least 450 m from the river.
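
The sensitivity, gain, and clipping level given above imply an approximate sound pressure at which the recorders clip. The following is a rough sketch of that arithmetic in Python, not a calibration procedure. It assumes that the 0.9 V clipping level applies to the amplified signal at the ADC input, that a full-scale sample corresponds to that level, and that resampling and FLAC conversion did not rescale the signal; none of this is stated explicitly in this description, and the function names are illustrative.

```python
import math

SENSITIVITY_DB = -44.0  # microphone sensitivity, dB re 1 V/Pa (from the description)
GAIN_DB = 35.0          # analog gain before the ADC, dB (from the description)
CLIP_VOLTS = 0.9        # ADC clipping level, volts (from the description)
P_REF = 20e-6           # standard reference pressure in air, Pa


def volts_per_pascal(sensitivity_db=SENSITIVITY_DB, gain_db=GAIN_DB):
    """Voltage at the ADC input per pascal of sound pressure."""
    return 10 ** ((sensitivity_db + gain_db) / 20)


def sample_to_pascal(sample, clip_volts=CLIP_VOLTS):
    """Convert a sample normalized to [-1, 1] into pressure in Pa.

    Assumes full scale (+/- 1.0) corresponds to the clipping voltage.
    """
    return sample * clip_volts / volts_per_pascal()


def clipping_spl_db(clip_volts=CLIP_VOLTS):
    """Peak sound pressure level (dB re 20 uPa) at which the signal clips."""
    peak_pascal = clip_volts / volts_per_pascal()
    return 20 * math.log10(peak_pascal / P_REF)


if __name__ == "__main__":
    print(f"Clipping level: {clipping_spl_db():.1f} dB re 20 uPa (peak)")
```

Under these assumptions the recorders clip at a peak level of about 102 dB re 20 µPa. Because the sensitivity carries a tolerance of +/- 3 dB, any absolute level derived this way is uncertain by at least that amount.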

Sampling and annotation protocol

A total of 21 dawn hours, from 05:00 to 06:00 PET (10:00 to 11:00 UTC), representing 7 of the 10 sites on three randomly selected dates, were manually annotated. Many neotropical bird species sing almost exclusively during the dawn hour, so this time window was selected to maximize the number of species present in the recordings. A single annotator boxed every bird call he could identify and ignored those that were too faint. Raven Pro software was used to annotate the data. The provided labels contain full bird calls that are boxed in time and frequency. The annotator was allowed to combine multiple consecutive calls of one species into one bounding-box label if pauses between calls were shorter than five seconds. In this collection, we use eBird species codes as labels, following the 2021 eBird taxonomy (Clements list). Parts of this dataset have previously been featured in the 2020 BirdCLEF competition.
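
The five-second merging rule can be illustrated with a short Python sketch. This is not the annotator's procedure, since the merging was done by hand and was optional; it only shows what the rule means for the resulting boxes. The `Box` type, the `merge_calls` function, and the species codes in the example are hypothetical, and the sketch assumes that calls of other species occurring in between do not prevent a merge.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    species: str    # species label (placeholder codes in this sketch)
    start: float    # start time in seconds
    end: float      # end time in seconds
    low_hz: float   # lower frequency bound in Hz
    high_hz: float  # upper frequency bound in Hz


def merge_calls(boxes: List[Box], max_pause: float = 5.0) -> List[Box]:
    """Merge same-species boxes separated by pauses shorter than max_pause."""
    merged: List[Box] = []
    last_index = {}  # species -> index of its most recent box in `merged`
    for box in sorted(boxes, key=lambda b: b.start):
        idx = last_index.get(box.species)
        if idx is not None and box.start - merged[idx].end < max_pause:
            prev = merged[idx]
            # Extend the earlier box so it encloses both calls.
            merged[idx] = Box(
                prev.species,
                prev.start,
                max(prev.end, box.end),
                min(prev.low_hz, box.low_hz),
                max(prev.high_hz, box.high_hz),
            )
        else:
            last_index[box.species] = len(merged)
            merged.append(box)
    return merged


if __name__ == "__main__":
    calls = [
        Box("spec1", 0.0, 1.0, 1000.0, 2000.0),
        Box("spec2", 2.0, 3.0, 500.0, 800.0),
        Box("spec1", 4.0, 5.0, 1500.0, 3000.0),    # 3 s pause: merged
        Box("spec1", 12.0, 13.0, 1000.0, 2000.0),  # 7 s pause: kept separate
    ]
    for box in merge_calls(calls):
        print(box)
```

One practical consequence is that a single label may span several calls and the silences between them, so label counts are not call counts.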

Files in this collection

Audio recordings can be accessed by downloading and extracting the “” file. Soundscape recording filenames contain a sequential file ID, the recording site, the date, and a timestamp in UTC. As an example, the file “PER_001_S01_20190116_100007Z.flac” has sequential ID 001 and was recorded at site S01 on January 16th, 2019 at 10:00:07 UTC. Ground truth annotations are listed in “annotations.csv”, where each line specifies the corresponding filename, start and end time in seconds, low and high frequency in Hertz, and an eBird species code. These species codes can be mapped to the scientific and common names of a species with the “species.csv” file. Unidentifiable calls have been marked with “????” and are included in the ground truth annotations. The approximate recording location and a short habitat description for all sites can be found in the “recording_location.txt” file.
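
The filename convention described above can be parsed with a few lines of Python. This is a minimal sketch: the regular expression is inferred from the single example filename (three-digit ID, site written as "S" plus two digits), so other files may need a looser pattern. The function name and the returned keys are illustrative, and local time is derived with a fixed UTC-5 offset for Peru Time (PET).

```python
import re
from datetime import datetime, timedelta, timezone

# Peru Time is UTC-5 all year (no daylight saving time).
PET = timezone(timedelta(hours=-5), "PET")

# Pattern inferred from the example "PER_001_S01_20190116_100007Z.flac".
FILENAME_RE = re.compile(
    r"^PER_(?P<file_id>\d{3})_(?P<site>S\d{2})_"
    r"(?P<date>\d{8})_(?P<time>\d{6})Z\.flac$"
)


def parse_filename(name):
    """Split a soundscape filename into file ID, site, and start time."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"Unexpected filename format: {name!r}")
    start_utc = datetime.strptime(
        match["date"] + match["time"], "%Y%m%d%H%M%S"
    ).replace(tzinfo=timezone.utc)
    return {
        "file_id": match["file_id"],
        "site": match["site"],
        "start_utc": start_utc,
        "start_local": start_utc.astimezone(PET),
    }


if __name__ == "__main__":
    info = parse_filename("PER_001_S01_20190116_100007Z.flac")
    print(info["file_id"], info["site"])
    print(info["start_utc"].isoformat())
    print(info["start_local"].isoformat())
```

Adding an annotation's start time in seconds to `start_utc` (for example with `timedelta(seconds=...)`) gives the absolute time of a labeled call.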


We would like to thank the Inkaterra Association (ITA) staff for providing logistical support and excellent field station facilities, particularly Noe Huaraca, Dennis Osorio, and Kevin Jiménez Gonzales, who helped set up recorders. Noe Huaraca, John Fitzpatrick, Fernando Angulo, Will Sweet, Ken Rosenburg, and Alex Wiebe helped identify unknown vocalizations. Funding for equipment was provided by the K. Lisa Yang Center for Conservation Bioacoustics at the Cornell Lab of Ornithology, with support from Innóvate Perú, CORBIDI, and the Inkaterra Association. Travel expenses were funded by the Cornell Lab of Ornithology.



Files (2.4 GB)
