There is a newer version of the record available.

Published December 8, 2022 | Version 1.0.0
Dataset Open

Worldwide Soundscapes project meta-data

  • 1. Westlake University
  • 2. The Fish Listener
  • 3. Environment & Climate Change Canada
  • 4. Institut de Recherche pour le Développement
  • 5. Chinese Academy of Sciences

Description

The Worldwide Soundscapes project is a global, open inventory of spatio-temporally replicated soundscape datasets. This Zenodo entry comprises the data tables that constitute its (meta-)database, as well as their description.

An overview of all sampling sites, together with a demonstration collection containing selected recordings, can be found in the corresponding project on ecoSound-web. More information on the project can be found here and on ResearchGate.

The audio recording criteria justifying inclusion into the meta-database are:

  • Stationary (no transects, towed sensors or microphones mounted on cars)
  • Passive (unattended, no human disturbance by the recordist)
  • Ambient (no spatial or temporal focus on a particular species or direction)
  • Spatially and/or temporally replicated (multiple sites sampled at one or more common daytimes, or multiple days sampled at one or more common sites)

The columns of each data table are described below. The tables are linked through primary keys, so joining them yields a relational database.

datasets

  • dataset_id: incremental integer, primary key
  • name: name of the dataset. If a name is repeated, incremental integers should be used in the "subset" column to differentiate the datasets.
  • subset: incremental integer that can be used to distinguish datasets with identical names
  • collaborators: full names of people deemed responsible for the dataset, separated by commas
  • contributors: full names of people who are not the main collaborators but who have significantly contributed to the dataset, and who could be contacted for in-depth analyses, separated by commas.
  • date_added: when the dataset was added (DD/MM/YYYY)
  • URL_open_recordings: if recordings (even only some) from this dataset are openly available, indicate the internet link where they can be found.
  • URL_project: internet link for further information about the corresponding project
  • DOI_publication: DOIs of corresponding publications, separated by commas
  • core_realm_IUCN: The core realm of the dataset. Datasets may have multiple realms, but the main one should be listed. Datasets may contain sampling sites from different realms in the "sites" sheet. IUCN Global Ecosystem Typology (v2.0): https://global-ecosystems.org/
  • medium: the physical medium the microphone is situated in
  • protected_area: Whether the sampling sites were situated in protected areas or not, or only some.
  • GADM0: For datasets on land or in territorial waters, Global Administrative Database level0
    https://gadm.org/
  • GADM1: For datasets on land or in territorial waters, Global Administrative Database level1
    https://gadm.org/
  • GADM2: For datasets on land or in territorial waters, Global Administrative Database level2
    https://gadm.org/
  • IHO: For marine locations, the sea area that encompasses all the sampling locations according to the International Hydrographic Organization. Map here: https://www.arcgis.com/home/item.html?id=44e04407fbaf4d93afcb63018fbca9e2
  • locality: optional free text about the locality
  • latitude_numeric_region: study region approximate centroid latitude in WGS84 decimal degrees
  • longitude_numeric_region: study region approximate centroid longitude in WGS84 decimal degrees
  • sites_number: number of sites sampled
  • year_start: starting year of the sampling
  • year_end: ending year of the sampling
  • deployment_schedule: description of the sampling schedule, provisional
  • temporal_recording_selection: list environmental exclusion criteria that were used to determine which recording days or times to discard
  • high_pass_filter_Hz: frequency of the high-pass filter of the recorder, in Hz
  • variable_sampling_frequency: whether the sampling frequency varies within the dataset. If it does, write "NA" in the sampling_frequency_kHz column of this sheet and give the frequencies in the sampling_frequency_kHz column of the deployments sheet
  • sampling_frequency_kHz: frequency at which the microphone was sampled, in kHz (only sounds up to half that frequency can be recorded)
  • variable_recorder:
  • recorder: recorder model used
  • microphone: microphone used
  • freshwater_recordist_position: position of the recordist relative to the microphone during sampling (only for freshwater)
  • collaborator_comments: free-text field for comments by the collaborators
  • validated: This cell is checked if the contents of all sheets are complete and have been found to be coherent and consistent with our requirements.
  • validator_name: name of person doing the validation
  • validation_comments: validators: please insert the date when someone was contacted
  • cross-check: this cell is checked if the collaborators confirm the spatial and temporal data after checking the corresponding site maps, deployment and operation time graphs found at https://drive.google.com/drive/folders/1qfwXH_7dpFCqyls-c6b8RZ_fbcn9kXbp?usp=share_link
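
Several of the datasets columns described above use simple text conventions (DD/MM/YYYY dates, comma-separated names, "NA" for a variable sampling frequency). The following is a minimal sketch of how such values might be parsed in Python. The function names and the example row are hypothetical and are not part of the project; only the column names come from the description above.

```python
from datetime import date, datetime
from typing import List, Optional


def parse_date_added(text: str) -> date:
    """Parse a date_added value written as DD/MM/YYYY."""
    return datetime.strptime(text.strip(), "%d/%m/%Y").date()


def split_names(text: str) -> List[str]:
    """Split a comma-separated field (e.g. collaborators) into clean names."""
    return [part.strip() for part in text.split(",") if part.strip()]


def nyquist_khz(sampling_frequency_khz: str) -> Optional[float]:
    """Highest recordable frequency in kHz, or None if the field is "NA" or empty."""
    value = sampling_frequency_khz.strip()
    if value == "" or value.upper() == "NA":
        return None
    return float(value) / 2.0


# Hypothetical row, invented for illustration only.
row = {
    "date_added": "08/12/2022",
    "collaborators": "Jane Doe, John Smith",
    "sampling_frequency_kHz": "48",
}
added = parse_date_added(row["date_added"])
names = split_names(row["collaborators"])
limit = nyquist_khz(row["sampling_frequency_kHz"])
```

When sampling_frequency_kHz is "NA", the per-deployment values in the deployments sheet would have to be used instead.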

datasets-sites

  • dataset_ID: primary key of datasets table
  • dataset_name: lookup field
  • site_ID: primary key of sites table
  • site_name: lookup field
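
As an illustration of how the datasets-sites table links the other two tables, here is a minimal sketch of the join using only the Python standard library. The rows are invented placeholders standing in for the CSV contents; with the real files they would presumably be read with csv.DictReader. Note that, as documented here, the key is spelled dataset_id in the datasets table and dataset_ID in the linking table.

```python
from collections import defaultdict

# Hypothetical rows standing in for the three CSV tables.
datasets = [
    {"dataset_id": "1", "name": "Example forest"},
    {"dataset_id": "2", "name": "Example reef"},
]
sites = [
    {"site_ID": "1001", "site_name": "A"},
    {"site_ID": "1002", "site_name": "B"},
    {"site_ID": "1003", "site_name": "C"},
]
datasets_sites = [
    {"dataset_ID": "1", "site_ID": "1001"},
    {"dataset_ID": "1", "site_ID": "1002"},
    {"dataset_ID": "2", "site_ID": "1003"},
]

# Index sites by their primary key for fast lookup.
sites_by_id = {site["site_ID"]: site for site in sites}

# Follow each link row from a dataset to one of its sites.
site_names_by_dataset = defaultdict(list)
for link in datasets_sites:
    site = sites_by_id[link["site_ID"]]
    site_names_by_dataset[link["dataset_ID"]].append(site["site_name"])

# One joined record per dataset, listing the names of its sites.
joined = [
    {"name": d["name"], "site_names": site_names_by_dataset[d["dataset_id"]]}
    for d in datasets
]
```

A linking table is used because the relation is many-to-many in principle: one dataset covers several sites, and a site may be shared between datasets.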

sites

  • site_ID: unique site IDs, larger than 1000 for compatibility with ecoSound-web
  • site_name: name or code of sampling site as used in respective projects
  • latitude_numeric: exact numeric degrees coordinates of latitude
  • longitude_numeric: exact numeric degrees coordinates of longitude
  • topography_m: elevation for sites on land, depth (negative values) for marine sites, in meters
  • freshwater_depth_m
  • realm: Ecosystem type according to IUCN GET  https://global-ecosystems.org/
  • biome: Ecosystem type according to IUCN GET  https://global-ecosystems.org/
  • functional_group: Ecosystem type according to IUCN GET  https://global-ecosystems.org/
  • comments
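
The constraints stated for the sites table (site IDs larger than 1000, numeric coordinates) lend themselves to a simple consistency check. The sketch below is a hypothetical helper, not the project's own validation procedure, and it assumes that the coordinates are decimal degrees, with latitude in [-90, 90] and longitude in [-180, 180].

```python
from typing import Dict, List


def check_site(row: Dict[str, str]) -> List[str]:
    """Return a list of problems found in one sites row (empty if none)."""
    problems = []
    if int(row["site_ID"]) <= 1000:
        problems.append("site_ID should be larger than 1000")
    latitude = float(row["latitude_numeric"])
    longitude = float(row["longitude_numeric"])
    # Assumed ranges for decimal-degree coordinates.
    if not -90.0 <= latitude <= 90.0:
        problems.append("latitude out of range")
    if not -180.0 <= longitude <= 180.0:
        problems.append("longitude out of range")
    return problems


# Hypothetical rows, invented for illustration only.
good = {"site_ID": "1001", "latitude_numeric": "30.1", "longitude_numeric": "120.2"}
bad = {"site_ID": "12", "latitude_numeric": "95", "longitude_numeric": "10"}
good_problems = check_site(good)
bad_problems = check_site(bad)
```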

deployments

  • dataset_ID: primary key of datasets table
  • dataset_name: lookup field
  • deployment: use identical subscript letters to denote rows that belong to the same deployment. For instance, you may use different operation times and schedules for different target taxa within one deployment.
  • start_date_min: earliest date of deployment start, double-click cell to get date-picker
  • start_date_max: latest date of deployment start, if applicable (only used when recorders were deployed over several days), double-click cell to get date-picker
  • start_time_mixed: deployment start local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). Corresponds to the recording start time for continuous recording deployments. If multiple start times were used, you should mention the latest start time (corresponds to the earliest daytime from which all recorders are active). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
  • permanent: is the deployment permanent (in which case it would be ongoing and the end date or duration would be unknown)?
  • variable_duration_days: whether the duration of the deployment (in days) is variable
  • duration_days: deployment duration per recorder (use the minimum if variable)
  • end_date_min: earliest date of deployment end, only needed if duration is variable, double-click cell to get date-picker
  • end_date_max: latest date of deployment end, only needed if duration is variable, double-click cell to get date-picker
  • end_time_mixed: deployment end local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). Corresponds to the recording end time for continuous recording deployments.
  • recording_time: does the recording last from the deployment start time to the end time (continuous) or at scheduled daily intervals (scheduled)? Note: we consider recordings with duty cycles to be continuous.
  • operation_start_time_mixed: scheduled recording start local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
  • operation_duration_minutes: duration of operation in minutes, if constant
  • operation_end_time_mixed: scheduled recording end local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
  • duty_cycle_minutes: duty cycle of the recording (i.e. the fraction of minutes when it is recording), written as "recording(minutes)/period(minutes)". For example: "1/6" if the recorder is active for 1 minute and standing by for 5 minutes.
  • sampling_frequency_kHz: only indicate the sampling frequency if it is variable within a particular dataset so that we need to code different frequencies for different deployments
  • recorder
  • subset_sites: If the deployment was not done in all the sites of the corresponding dataset, site IDs can be indicated here, separated by commas
  • comments
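
The "mixed" time columns and the duty_cycle_minutes column described above follow small text formats ("HH:MM", a solar daytime with an optional offset in minutes such as "sunrise-60", and "recording/period"). The following minimal sketch shows one way these could be parsed. The function names and return conventions are hypothetical, and it assumes that offsets are written without spaces, as in the "sunrise-60" example.

```python
import re
from typing import Tuple

_CLOCK = re.compile(r"^(\d{1,2}):(\d{2})$")
_SOLAR = re.compile(r"^(sunrise|sunset|noon|midnight)([+-]\d+)?$")


def parse_mixed_time(text: str) -> Tuple[str, int]:
    """Parse a *_time_mixed value.

    Returns ("clock", minutes after 00:00) for HH:MM values, or
    (solar daytime, offset in minutes) for values such as "sunrise-60".
    """
    value = text.strip().lower()
    clock = _CLOCK.match(value)
    if clock:
        hours, minutes = int(clock.group(1)), int(clock.group(2))
        if hours > 23 or minutes > 59:
            raise ValueError(f"invalid clock time: {text!r}")
        return ("clock", hours * 60 + minutes)
    solar = _SOLAR.match(value)
    if solar:
        offset = int(solar.group(2)) if solar.group(2) else 0
        return (solar.group(1), offset)
    raise ValueError(f"unrecognised time value: {text!r}")


def parse_duty_cycle(text: str) -> Tuple[int, int]:
    """Parse "recording/period" into (recording minutes, period minutes)."""
    recording, period = (int(part) for part in text.strip().split("/"))
    if period <= 0 or not 0 < recording <= period:
        raise ValueError(f"invalid duty cycle: {text!r}")
    return (recording, period)


before_sunrise = parse_mixed_time("sunrise-60")
morning = parse_mixed_time("06:30")
recording, period = parse_duty_cycle("1/6")
standby = period - recording  # minutes on standby in each period
```

Keeping the recording and period minutes as separate integers, rather than reducing them to a fraction, preserves the period length ("1/6" and "10/60" describe different schedules).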

Files

Worldwide Soundscapes - datasets-sites.csv

Files (1.5 MB)

  • md5:7fc68bbb9acae585667a4c91d6244310 (308.3 kB)
  • md5:5eacedc6416bd5c67ca329b669d88f12 (158.4 kB)
  • md5:7b60647e2b97956c3ed5655f1e176e1c (93.4 kB)
  • md5:559f96e38f17f44f2946137470d737c5 (893.9 kB)
  • md5:dea02c91a710c4fa0fc3b2d4962eb39b (11.4 kB)
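
The MD5 checksums listed for the files can be used to check that a download is intact. The following is a minimal sketch using Python's hashlib. The helper names are hypothetical, and the path passed to them would be wherever the downloaded file was saved.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file against a listed checksum such as "md5:7fc6...".

    The "md5:" prefix is optional and the comparison ignores case.
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, matches_listing(path_to_download, "md5:dea02c91a710c4fa0fc3b2d4962eb39b") would be expected to return True for an intact copy of the smallest file.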