Published July 6, 2024 | Version 3.0.1
Dataset Open

Worldwide Soundscapes project metadata

  • 1. Institut National de Recherche pour l'Agriculture, l'Alimentation et l'Environnement
  • 2. The Fish Listener
  • 3. Environment & Climate Change Canada
  • 4. Institut de Recherche pour le Développement
  • 5. Chinese Academy of Sciences
  • 6. Westlake University


The Worldwide Soundscapes project is a global, open inventory of spatio-temporally replicated passive acoustic monitoring meta-datasets (i.e. meta-data collections). This Zenodo entry comprises the data tables that constitute its (meta-)database, as well as their description.

An overview of all sampling sites, along with a demonstration collection containing selected recordings, is available in the corresponding project on ecoSound-web.

The audio recording criteria justifying inclusion into the meta-database are:

  • Stationary (no transects, towed sensors or microphones mounted on cars)
  • Passive (unattended, no human disturbance by the recordist)
  • Ambient (no directional microphone or triggered recordings)
  • Spatially and/or temporally replicated (i.e. multiple sites sampled at the same time and/or multiple days - covering the same daytime - sampled at the same site)

The individual columns of the provided data tables are described below. The tables are linked through primary keys; joining them yields the full database. Only validated collections are included in the data shared here.
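As a rough illustration of how the tables could be joined, the following sketch uses small in-memory rows in place of the real files. All row values are invented, the variable names (`collections`, `collection_sites`, `sites`) and the helper `join_collections_to_sites` are hypothetical, and it assumes that `dataset_ID` in the link table refers to `collection_id` in the collections table.

```python
# Toy rows standing in for the real tables; every value is invented for illustration.
collections = [
    {"collection_id": 1, "name": "Example forest survey"},
    {"collection_id": 2, "name": "Example reef survey"},
]
collection_sites = [  # link table: one row per (collection, site) pair
    {"dataset_ID": 1, "site_ID": 10},
    {"dataset_ID": 1, "site_ID": 11},
    {"dataset_ID": 2, "site_ID": 20},
]
sites = [
    {"site_ID": 10, "site_name": "F01", "realm_1": "Terrestrial"},
    {"site_ID": 11, "site_name": "F02", "realm_1": "Terrestrial"},
    {"site_ID": 20, "site_name": "R01", "realm_1": "Marine"},
]


def join_collections_to_sites(collections, collection_sites, sites):
    """Inner-join the three tables on their keys and return flat rows."""
    # Assumption: dataset_ID in the link table refers to collection_id.
    collections_by_id = {row["collection_id"]: row for row in collections}
    sites_by_id = {row["site_ID"]: row for row in sites}
    joined = []
    for link in collection_sites:
        collection = collections_by_id.get(link["dataset_ID"])
        site = sites_by_id.get(link["site_ID"])
        if collection is None or site is None:
            continue  # skip dangling keys, as an inner join would
        joined.append({**collection, **site})
    return joined


rows = join_collections_to_sites(collections, collection_sites, sites)
for row in rows:
    print(row["name"], row["site_name"], row["realm_1"])
```

The same pattern would extend to the deployments table by matching its `dataset_ID` column, again under the assumption that it points to `collection_id`.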

Changes from version 2.0.0

Expanded the database, removed protected_area field from collections table. Added secondary realm, biome, and functional group fields in sites table.

In deployments table: Removed start_date_min and start_date_max to only use unequivocal start_date, removed end_date_min and end_date_max to only use unequivocal end_date. Accordingly, removed variable_duration_days and duration_days. Added high_pass_filter_Hz, bit_depth, channels, and exact_recordings fields. Renamed comments to contributor_comments and recording_time to operation_mode.


  • collection_id: unique integer, primary key
  • name: name of the dataset. If it is repeated, incremental integers should be used in the "subset" column to differentiate them.
  • ecoSound-web_link: link of validated meta-collection on ecoSound-web
  • primary_contributors: full names of people deemed corresponding contributors who are responsible for the dataset
  • secondary_contributors: full names of people who are not primary contributors but who have significantly contributed to the dataset, and who could be contacted for in-depth analyses
  • date_added: date when the dataset was added (YYYY-MM-DD)
  • URL_open_recordings: internet link for openly-available recordings from this collection
  • URL_project: internet link for further information about the corresponding project
  • DOI_publication: Digital Object Identifiers of corresponding publications
  • core_realm_IUCN: The main, core realm of the dataset according to IUCN Global Ecosystem Typology (v2.0):
  • medium: the physical medium the microphone is situated in
  • locality: optional free text about the locality
  • spatial_selection: spatial selection criteria that were used to determine in which locations to record sound (ecotone, elevated spot, etc.) - any deviations from randomness
  • temporal_exclusion: environmental exclusion criteria that were used to determine which recording days or times to discard
  • freshwater_recordist_position: position of the recordist relative to the microphone during sampling (only for freshwater)
  • contributor_comments: free-text field for comments by the primary contributors


  • dataset_ID: primary key of collections table
  • site_ID: primary key of sites table


  • site_ID: unique integer, primary key
  • site_name: internal name or code of sampling site as used in respective projects
  • latitude_numeric: site's numeric degrees of latitude
  • longitude_numeric: site's numeric degrees of longitude
  • blurred_coordinates: whether latitude and longitude coordinates are inaccurate, boolean. Coordinates may be blurred with random offsets, rounding, snapping, etc. Indicate the blurring method inside the comments field
  • topography_m: vertical position of the microphone relative to sea level, in meters. For sites on land: elevation. For marine sites: depth (negative). Only indicate if the values were measured by the collaborator.
  • freshwater_depth_m: microphone depth, only used for sites inside freshwater bodies that also have an elevation value above the sea level
  • realm_1: Ecosystem type: main realm according to IUCN GET
  • realm_2: additional, secondary realm (optional)
  • biome_1: Ecosystem type: main biome according to IUCN GET
  • biome_2: additional, secondary biome (optional)
  • functional_group_1: Ecosystem type: main functional group according to IUCN GET
  • functional_group_2: additional, secondary functional group (optional)
  • contributor_comments: free text field for contributor comments


  • dataset_ID: primary key of collections table
  • deployment: identical subscript letters to denote rows that belong to the same deployment. For instance, you may use different operation times and schedules for different target taxa within one deployment.
  • subset_site_ID: site IDs where the deployment was conducted, given only if the deployment did not cover all sites of the corresponding collection
  • start_date: date of deployment start
  • start_time_mixed: deployment start local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset). Corresponds to the recording start time for continuous recording deployments. If multiple start times were used, you should mention the latest start time (corresponds to the earliest daytime from which all recorders are active). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
  • permanent: whether the deployment is permanent, boolean
  • end_date: date of deployment end (date when last scheduled operation starts)
  • end_time_mixed: deployment end local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). Corresponds to the recording end time for continuous recording deployments.
  • operation_mode: continuous: recording takes place from the deployment start date-time to deployment end date-time.
    periodical: recording takes place periodically (i.e., with duty cycle) from the deployment start date-time to deployment end date-time.
    scheduled: recording takes place during scheduled daily time intervals (optionally with duty cycle).
  • duty_cycle_minutes: duty cycle of the recording (i.e. the fraction of minutes when it is recording), written as "recording(minutes)/period(minutes)". Empty if no duty cycle is used. For example: "1/6" if the recorder is active for 1 minute and standing by for 5 minutes
  • operation_start_time_mixed: only for scheduled recordings: start local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
  • operation_duration_minutes: only for scheduled recordings: duration of operation in minutes, if constant
  • operation_end_time_mixed: only for scheduled recordings: end local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). Only required if durations are variable. Do not use when end times are ambiguous (for instance, if a recording could be 1 hour or 25 hours long because the end is on the next day). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
  • high_pass_filter_Hz: frequency of the high-pass filter of the recorder if applied, in Hz. Otherwise, write "none". This may be called a "low-cut" filter too.
  • bit_depth: sampling bit depth of the recordings. Often constant for a particular recorder
  • channels: number of recorded audio channels
  • sampling_frequency_kHz: frequency at which the microphone signal was sampled by the recorder, in kHz (sounds up to half that frequency, the Nyquist frequency, can be recorded)
  • recorder: recorder used for deployment
  • microphone: microphone used for deployment
  • target_taxa: main IUCN animal taxa that were studied with this deployment, using the exact IUCN Red List names, separated by commas. Only genera, families, orders, and classes are accepted. Empty if there was no taxonomic focus (i.e., general soundscapes were the study focus).
  • contributor_comments: free text field for contributor comments
  • exact_recordings: whether the deployment data here have been superseded by inserting more exact recording date-time ranges into the meta-collection on ecoSound-web
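The mixed time fields and the duty cycle field described above are stored as text, so they need parsing before analysis. The following is a minimal sketch of one way to do this. The function names (`parse_mixed_time`, `parse_duty_cycle`, `nyquist_khz`) and the returned tuple layouts are hypothetical. It also assumes that all four solar daytimes are accepted in every mixed time field, and that offsets are whole minutes, as in the "sunrise-60" example.

```python
import re

_CLOCK = re.compile(r"^(\d{1,2}):(\d{2})$")
_SOLAR = re.compile(r"^(sunrise|sunset|noon|midnight)\s*([+-]\s*\d+)?$")


def parse_mixed_time(value):
    """Parse an HH:MM clock time or a solar daytime with an optional offset.

    Returns ("clock", minutes_after_midnight) or ("solar", event, offset_minutes).
    """
    text = value.strip().lower()
    clock = _CLOCK.match(text)
    if clock:
        hours, minutes = int(clock.group(1)), int(clock.group(2))
        if hours > 23 or minutes > 59:
            raise ValueError(f"invalid clock time: {value!r}")
        return ("clock", hours * 60 + minutes)
    solar = _SOLAR.match(text)
    if solar:
        offset = solar.group(2)
        # Assumption: offsets are whole minutes, e.g. "sunrise-60".
        offset_minutes = int(offset.replace(" ", "")) if offset else 0
        return ("solar", solar.group(1), offset_minutes)
    raise ValueError(f"unrecognised mixed time: {value!r}")


def parse_duty_cycle(value):
    """Parse "recording/period" in minutes; return (recording, period, fraction)."""
    if value is None or not value.strip():
        return None  # empty field: no duty cycle is used
    recording_text, period_text = value.split("/")
    recording, period = float(recording_text), float(period_text)
    if period <= 0 or recording > period:
        raise ValueError(f"invalid duty cycle: {value!r}")
    return (recording, period, recording / period)


def nyquist_khz(sampling_frequency_khz):
    """Highest recordable frequency: half the sampling frequency."""
    return sampling_frequency_khz / 2


print(parse_mixed_time("sunrise-60"))
print(parse_mixed_time("06:30"))
print(parse_duty_cycle("1/6"))
print(nyquist_khz(48))
```

Keeping solar times symbolic rather than converting them to clock times avoids needing site coordinates and dates at parse time; that conversion could be done later, once the deployment rows are joined to their sites.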



Files (3.1 MB)


Additional details

Related works

Is published in
Preprint: 10.1101/2024.04.10.588860 (DOI)