Published November 20, 2023 | Version v1
Dataset Open

Rana sierrae annotated aquatic soundscapes (2022)

  • 1. University of Pittsburgh

Description

This dataset is associated with the following manuscript, which describes the data collection and annotation methodology in detail:

Lapp, S., Smith, T. C., Wilhelm, A., Knapp, R., Kitzes, J. In press. Aquatic soundscape recordings reveal diverse vocalizations and nocturnal activity of an endangered frog. The American Naturalist.

Rana sierrae (the Sierra Nevada yellow-legged frog) is an endangered species residing in high-elevation lakes in the Sierra Nevada mountains. The species is highly aquatic and, unlike most amphibians, primarily vocalizes while underwater. As a result, its vocalizations have rarely been recorded and its vocal repertoire is not well studied.

This dataset consists of underwater soundscape recordings with 1236 annotations of R. sierrae vocalizations. We annotated five distinct vocalization types of R. sierrae, only two of which have been previously documented for this species. Besides the calls of R. sierrae, the recordings also contain stridulation sounds (not annotated), which were most likely produced by members of the family Corixidae or other aquatic invertebrates that stridulate underwater.

Notes

Audio files are provided in .mp3 format and are compatible with any program that supports this format. For instance, audio can be played and viewed as a spectrogram in the free Audacity software. Annotation files are provided in two formats:

(1) Raven annotation tables, which can be opened along with audio files in Raven Lite (free) or Raven Pro (paid) software to view, create, and manipulate annotation boxes on spectrograms. Even without Raven Lite or Raven Pro, the annotation files can be loaded, explored, and manipulated using the free and open-source Python package OpenSoundscape.

(2) A .csv file containing a table of 0/1 (absent/present, also known as "one-hot") labels for each call type for 2-second audio segments. This file can be opened with text or table editors, or loaded into Python using the pandas package.
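As a rough illustration of working with both annotation formats in Python, the sketch below reads a Raven-style table and a one-hot label table with pandas. The example contents are made up: the time and frequency column names follow Raven's usual tab-delimited export, while the "annotation" column name and the "file", "start_time", and "end_time" columns are assumptions that should be checked against the files in the archive.

```python
import io
import pandas as pd

# Hypothetical Raven selection table (tab-delimited text). The "annotation"
# column name is an assumption about how call types are stored.
raven_text = (
    "Selection\tBegin Time (s)\tEnd Time (s)\tLow Freq (Hz)\tHigh Freq (Hz)\tannotation\n"
    "1\t0.50\t0.90\t300.0\t1200.0\tA\n"
    "2\t3.10\t3.25\t400.0\t900.0\tC\n"
    "3\t7.00\t7.60\t350.0\t1100.0\tA\n"
)

# Hypothetical one-hot label table: one row per 2-second segment,
# one 0/1 column per call type. Column names are assumptions.
onehot_text = """file,start_time,end_time,A,B,C,D,E
rec_001.mp3,0.0,2.0,1,0,0,0,0
rec_001.mp3,2.0,4.0,0,0,1,0,0
rec_001.mp3,4.0,6.0,0,0,0,0,0
rec_001.mp3,6.0,8.0,1,0,1,0,0
"""

# Raven tables are plain tab-separated text, so pandas can read them directly.
boxes = pd.read_csv(io.StringIO(raven_text), sep="\t")
boxes["duration_s"] = boxes["End Time (s)"] - boxes["Begin Time (s)"]
annotations_per_type = boxes["annotation"].value_counts()

# Index the one-hot table by segment so that only label columns remain.
labels = pd.read_csv(
    io.StringIO(onehot_text), index_col=["file", "start_time", "end_time"]
)
clips_per_type = labels.sum()                        # segments containing each call type
clips_with_any_call = int(labels.any(axis=1).sum())  # segments with at least one call

print(annotations_per_type)
print(clips_per_type)
print(clips_with_any_call)
```

To use real data, replace the `io.StringIO(...)` arguments with paths to the extracted files.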

Funding provided by: National Science Foundation
Crossref Funder Registry ID: https://ror.org/021nxhr62
Award Number:

Methods

The audio in this dataset is a set of 672 10-second audio files recorded at 15-minute intervals over the course of 7 days by a single underwater audio recorder. The recorder, an AudioMoth version 1.2.0 in an underwater case, was deployed approximately 0.5 m from the shoreline, on the bottom of a lake in the Sierra Nevada in which R. sierrae breed and overwinter.
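The file count follows directly from the recording schedule. The short check below, which uses only the figures stated above, also gives the total amount of audio in the dataset.

```python
from datetime import timedelta

deployment = timedelta(days=7)        # length of the deployment
interval = timedelta(minutes=15)      # time between the starts of recordings
clip_length = timedelta(seconds=10)   # duration of each audio file

# Dividing one timedelta by another with // yields an integer count.
n_files = deployment // interval
total_audio = n_files * clip_length

print(n_files)       # 672
print(total_audio)   # 1:52:00
```

In other words, the recordings sample 10 of every 900 seconds, or about 1.1% of the deployment period.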

The annotations of the five call types correspond to the descriptions in the associated manuscript:

A: primary vocalization ("meow"), described in Vredenburg et al. 2007

B: stuttered vocalization, also described in Vredenburg et al. 2007

C: chuck, double/triple chuck calls

D: short downward single note

E: frequency-modulated call

X: could not determine if sound is R. sierrae or not; these were excluded from training and validation of the CNN in the manuscript

Files were annotated by Sam Lapp in Raven Pro, listening through closed-back headphones while viewing the spectrogram. Only calls that could be both heard and seen on the spectrogram were annotated. This dataset also contains one-hot labels (0/1 per class per audio clip) for 2-second segments of audio. To generate these labels, we considered R. sierrae vocalizations to be present in a 2-second sample if any R. sierrae annotation overlapped with the sample by at least 0.2 seconds, or if more than 50% of an annotation box overlapped in time with the sample. A notebook in the associated GitHub repository demonstrates how the Raven annotations were converted to one-hot labels.
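The overlap rule can be sketched in a few lines of Python. This is an illustrative reimplementation under stated assumptions, not the code from the notebook in the GitHub repository: the function names, the example annotation times, and the use of non-overlapping 2-second windows are all hypothetical.

```python
def overlap_seconds(a_start, a_end, b_start, b_end):
    """Length in seconds of the time overlap between two intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def clip_has_call(clip_start, clip_end, annotations,
                  min_overlap_s=0.2, min_fraction=0.5):
    """Return 1 if any annotation satisfies either overlap criterion, else 0.

    annotations: iterable of (begin, end) times in seconds.
    """
    for begin, end in annotations:
        overlap = overlap_seconds(clip_start, clip_end, begin, end)
        duration = end - begin
        # Criterion 1: at least 0.2 s of the annotation falls inside the clip.
        if overlap >= min_overlap_s:
            return 1
        # Criterion 2: more than 50% of the annotation falls inside the clip,
        # which lets very short calls count even if they are under 0.2 s long.
        if duration > 0 and overlap / duration > min_fraction:
            return 1
    return 0


# Hypothetical annotations (begin, end) within one 10-second file.
example_annotations = [(1.9, 2.5), (5.0, 5.1), (7.95, 8.25)]

# Assumed non-overlapping 2-second windows covering the file.
clip_starts = [0.0, 2.0, 4.0, 6.0, 8.0]
clip_labels = [clip_has_call(s, s + 2.0, example_annotations) for s in clip_starts]
print(clip_labels)  # [0, 1, 1, 0, 1]
```

In this example the first annotation spans a clip boundary but is labeled only in the second clip, where it overlaps by 0.5 seconds. The 0.1-second call in the third clip is counted through the 50% criterion. To produce per-class labels, the same function would be applied separately to the annotations of each call type.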

Works Cited

Vredenburg VT, Bingham R, Knapp R, Morgan JAT, Moritz C, Wake D. 2007. Concordant molecular and phenotypic data delineate new taxonomy and conservation priorities for the endangered mountain yellow-legged frog. Journal of Zoology 271:361–374.

Files

rana_sierrae_2022.zip

Files (41.0 MB)

Name Size
md5:9888ea63210107511addc5b6523ab702
41.0 MB
md5:acc2292826127e2d6cea2e1e156525d4
3.6 kB
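After downloading, a file's integrity can be checked against the listed md5 checksum. The following is a minimal sketch using Python's standard hashlib module. The helper name `md5_of_file` is hypothetical, and the demonstration hashes a temporary file rather than the archive itself.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a small temporary file standing in for a download.
payload = b"example bytes standing in for a downloaded file"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    tmp_path = tmp.name

computed = md5_of_file(tmp_path)
expected = hashlib.md5(payload).hexdigest()
os.remove(tmp_path)

print(computed == expected)  # True
```

For the real download, compare the result of `md5_of_file` on the local file with the hexadecimal string that follows "md5:" in the corresponding entry of the file list.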

Additional details

Related works