Worldwide Soundscapes project metadata and analysis scripts
Description
The Worldwide Soundscapes project is a global, open inventory of spatio-temporally replicated passive acoustic monitoring meta-datasets (i.e. meta-data collections). This Zenodo entry comprises the data tables that constitute its (meta-)database, as well as their description. Additionally, R scripts are provided to replicate the analysis published in [placeholder].
The overview of all sampling sites and timelines can be found on the corresponding project on ecoSound-web, as well as a demonstration collection containing selected recordings. The recordings of this collection were annotated and analysed to explore macro-ecological trends.
The audio recording criteria justifying inclusion into the meta-database are:
- Stationary (no transects, towed sensors or microphones mounted on cars)
- Passive (unattended, no human disturbance by the recordist)
- Ambient (no directional microphone or triggered recordings, non-experimental conditions)
- Spatially and/or temporally replicated (i.e. multiple sites sampled at the same time and/or multiple days - covering the same daytime - sampled at the same site)
The individual columns of the provided data tables are described in the following. Data tables are linked through primary keys; joining them will result in a database. The data shared here only includes validated collections.
Changes from version 4.0.0
Added link to the published synthesis.
Meta-database CSV files
collections
- collection_id: unique integer, primary key
- name: name of the dataset. If it is repeated, incremental integers should be used in the "subset" column to differentiate them.
- ecoSound-web_link: link of validated meta-collection on ecoSound-web
- primary_contributors: full names of people deemed corresponding contributors who are responsible for the dataset
- secondary_contributors: full names of people who are not primary contributors but who have significantly contributed to the dataset, and who could be contacted for in-depth analyses
- date_added: date when the dataset was added (YYYY-MM-DD)
- URL_open_recordings: internet link for openly-available recordings from this collection
- URL_project: internet link for further information about the corresponding project
- DOI_publication: Digital Object Identifiers of corresponding publications
- core_realm_IUCN: The main, core realm of the dataset according to IUCN Global Ecosystem Typology (v2.0): https://global-ecosystems.org/
- medium: the physical medium the microphone is situated in
- locality: optional free text about the locality
- contributor_comments: free-text field for comments by the primary contributors
collections-sites
- dataset_ID: primary key of collections table
- site_ID: primary key of sites table
sites
- site_ID: unique integer, primary key
- site_name: internal name or code of sampling site as used in respective projects
- latitude_numeric: site's numeric degrees of latitude
- longitude_numeric: site's numeric degrees of longitude
- blurred_coordinates: whether latitude and longitude coordinates are inaccurate, boolean. Coordinates may be blurred with random offsets, rounding, snapping, etc. Indicate the blurring method inside the comments field.
- topography_m: vertical position of the microphone relative to sea level, in meters. For sites on land: elevation. For marine sites: depth (negative). Only indicate if the values were measured by the collaborator.
- freshwater_depth_m: microphone depth, only used for sites inside freshwater bodies that also have an elevation value above the sea level
- realm: Ecosystem type: main realm according to IUCN GET https://global-ecosystems.org/
- biome: Ecosystem type: main biome according to IUCN GET https://global-ecosystems.org/
- functional_group: Ecosystem type: main functional group according to IUCN GET https://global-ecosystems.org/
- contributor_comments: free text field for contributor comments
- GADM_0: Global ADMinistrative Database level 0 classification of terrestrial site or marine site that is within territorial waters. Source: https://gadm.org/download_world.html
- IHO: International Hydrographic Organization classification of marine site. Source: https://marineregions.org/downloads.php
- WDPA: World Database on Protected Areas classification of the site. Source: https://www.protectedplanet.net/en/thematic-areas/wdpa?tab=WDPA
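The linkage between the collections, collections-sites, and sites tables can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's own analysis code. The sample rows are hypothetical; only the column names are taken from the descriptions above, and the sketch assumes that dataset_ID in collections-sites holds the same values as collection_id in collections.

```python
import csv
import io

# Hypothetical sample rows; only the column names come from the table descriptions.
COLLECTIONS_CSV = """collection_id,name
1,Example forest collection
2,Example reef collection
"""
COLLECTIONS_SITES_CSV = """dataset_ID,site_ID
1,10
1,11
2,12
"""
SITES_CSV = """site_ID,site_name,latitude_numeric,longitude_numeric
10,F1,-1.5,101.2
11,F2,-1.6,101.3
12,R1,20.1,-87.4
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def join_collections_to_sites(collections, links, sites):
    """Inner-join collections and sites through the collections-sites link table."""
    # Assumption: dataset_ID in the link table matches collection_id in collections.
    collections_by_id = {row["collection_id"]: row for row in collections}
    sites_by_id = {row["site_ID"]: row for row in sites}
    joined = []
    for link in links:
        collection = collections_by_id.get(link["dataset_ID"])
        site = sites_by_id.get(link["site_ID"])
        if collection is None or site is None:
            continue  # skip links whose keys are missing from either table
        joined.append({
            "collection_id": collection["collection_id"],
            "collection_name": collection["name"],
            "site_ID": site["site_ID"],
            "site_name": site["site_name"],
            "latitude_numeric": float(site["latitude_numeric"]),
            "longitude_numeric": float(site["longitude_numeric"]),
        })
    return joined


rows = join_collections_to_sites(
    read_rows(COLLECTIONS_CSV),
    read_rows(COLLECTIONS_SITES_CSV),
    read_rows(SITES_CSV),
)
for row in rows:
    print(row["collection_name"], row["site_name"])
```

To use the real tables, the in-memory strings could be replaced by opening the downloaded CSV files with `csv.DictReader`.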
deployments
- dataset_ID: primary key of collections table
- deployment: identical subscript letters to denote rows that belong to the same deployment. For instance, you may use different operation times and schedules for different target taxa within one deployment.
- subset_site_ID: If the deployment was not done in all the sites of the corresponding collection, site IDs where the deployment was conducted
- start_date: date of deployment start
- start_time_mixed: deployment start local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset). Corresponds to the recording start time for continuous recording deployments. If multiple start times were used, you should mention the latest start time (corresponds to the earliest daytime from which all recorders are active). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
- permanent: whether the deployment is permanent, boolean
- end_date: date of deployment end (date when last scheduled operation starts)
- end_time_mixed: deployment end local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). Corresponds to the recording end time for continuous recording deployments.
- operation_mode:
  - continuous: recording takes place from the deployment start date-time to deployment end date-time.
  - periodical: recording takes place periodically (i.e., with duty cycle) from the deployment start date-time to deployment end date-time.
  - scheduled: recording takes place during scheduled daily time intervals (optionally with duty cycle)
- duty_cycle_minutes: duty cycle of the recording (i.e. the fraction of minutes when it is recording), written as "recording(minutes)/period(minutes)". Empty if no duty cycle is used. For example: "1/6" if the recorder is active for 1 minute and standing by for 5 minutes
- operation_start_time_mixed: only for scheduled recordings: start local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
- operation_duration_minutes: only for scheduled recordings: duration of operation in minutes, if constant
- operation_end_time_mixed: only for scheduled recordings: end local time, either in HH:MM format or a choice of solar daytimes (sunrise, sunset, noon, midnight). Only required if durations are variable. Do not use when end times are ambiguous (for instance, if a recording could be 1 hour or 25 hours long because the end is on the next day). If applicable, positive or negative offsets from solar times can be mentioned (For example: if data are collected one hour before sunrise, this will be "sunrise-60")
- high_pass_filter_Hz: frequency of the high-pass filter of the recorder if applied, in Hz. Otherwise, write "none". This may be called a "low-cut" filter too.
- bit_depth: sampling bit depth of the recordings. Often constant for a particular recorder
- channels: number of recorded audio channels
- sampling_frequency_kHz: frequency at which the microphone signal was sampled by the recorder (only sounds up to half that frequency, the Nyquist frequency, can be recorded)
- recorder: recorder used for deployment
- microphone: microphone used for deployment
- target_taxa: main IUCN animal taxa that were studied with this deployment, using the exact IUCN Red list names (http://www.iucnredlist.org/), separated by commas. Only genera, families, orders, and classes are accepted. Empty if there was no taxonomic focus (i.e., general soundscapes were the study focus).
- contributor_comments: free text field for contributor comments
- exact_recordings: whether the deployment data here have been superseded by inserting more exact recording date-time ranges into the meta-collection on ecoSound-web
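The mixed time fields (start_time_mixed, end_time_mixed, operation_start_time_mixed, operation_end_time_mixed) and the duty_cycle_minutes field need some parsing before they can be analysed. The following is a minimal sketch, assuming values look like the examples given above ("HH:MM", "sunrise-60", "1/6"). The function names and the returned tuple layouts are hypothetical, and real entries may contain variants that this sketch does not handle.

```python
import re

# Clock times such as "06:30" and solar daytimes with an optional offset in minutes.
_CLOCK = re.compile(r"^(\d{1,2}):(\d{2})$")
_SOLAR = re.compile(r"^(sunrise|sunset|noon|midnight)\s*([+-]\s*\d+)?$")


def parse_mixed_time(value):
    """Return ("clock", minutes_after_midnight) or ("solar", event, offset_minutes)."""
    text = value.strip().lower()
    clock = _CLOCK.match(text)
    if clock:
        hours, minutes = int(clock.group(1)), int(clock.group(2))
        if hours > 23 or minutes > 59:
            raise ValueError(f"invalid clock time: {value!r}")
        return ("clock", hours * 60 + minutes)
    solar = _SOLAR.match(text)
    if solar:
        # "sunrise-60" means 60 minutes before sunrise, so the offset is -60.
        offset = int(solar.group(2).replace(" ", "")) if solar.group(2) else 0
        return ("solar", solar.group(1), offset)
    raise ValueError(f"unrecognised time value: {value!r}")


def parse_duty_cycle(value):
    """Parse "recording/period" in minutes; return None when no duty cycle is used."""
    text = value.strip()
    if not text:
        return None
    recording, period = (float(part) for part in text.split("/"))
    if period <= 0 or not 0 < recording <= period:
        raise ValueError(f"invalid duty cycle: {value!r}")
    # Third element is the fraction of time spent recording.
    return (recording, period, recording / period)


print(parse_mixed_time("06:30"))
print(parse_mixed_time("sunrise-60"))
print(parse_duty_cycle("1/6"))
```

Solar daytimes are returned symbolically because converting them to clock times would require the site coordinates and the date, which is outside the scope of this sketch.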
recordings (partial download from ecoSound-web)
- recording_id: primary key of the recordings table
- collection_id: ID of the collection the recording belongs to
- name: name of the recording
- site_id: ID of the site the recording belongs to
- recorder_id: ID of the recorder used for the recording (internal ecoSound-web code)
- microphone_id: ID of the microphone used for the recording (internal ecoSound-web code)
- recording_gain: recording gain applied for amplifying the audio signal, in decibels
- duty_cycle_recording: fraction of the recording period when the recorder is actively recording audio
- duty_cycle_period: period of the duty cycle, i.e., time between the starts of two subsequent recordings
- note: comments (contains the target taxon)
- file_date: date of the recording start
- file_time: local time of the recording start
- sampling_rate: audio sampling rate in Hz
- bitdepth: depth in bits for each audio sample
- channel_num: number of channels
- duration: duration of the recording in seconds. Note: duty-cycled recordings cover only a proportion of this duration
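Because duty-cycled recordings cover only a proportion of the stated duration, the actually recorded time can be estimated from the duty cycle columns. The sketch below is one possible reading, not a documented formula: it assumes that duty_cycle_recording and duty_cycle_period are expressed in the same time unit (as in the "recording/period" notation of the deployments table), so that their ratio is the recorded proportion. The function name is hypothetical.

```python
def recorded_seconds(duration, duty_cycle_recording=None, duty_cycle_period=None):
    """Estimate the seconds of audio actually recorded within `duration` seconds.

    Assumption: both duty cycle values share one time unit, so that
    duty_cycle_recording / duty_cycle_period is the recorded proportion.
    Missing or empty duty cycle values are treated as continuous recording.
    """
    if duty_cycle_recording in (None, "") or duty_cycle_period in (None, ""):
        return float(duration)
    recording = float(duty_cycle_recording)
    period = float(duty_cycle_period)
    if period <= 0 or not 0 < recording <= period:
        raise ValueError("duty cycle values must satisfy 0 < recording <= period")
    return float(duration) * recording / period


# A 600 s recording span with 1 unit of recording per 6-unit period.
print(recorded_seconds(600, 1, 6))
# No duty cycle: the whole duration counts.
print(recorded_seconds(600))
```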
affiliations
- affiliation_id: primary key of affiliations table
- lab_research_group: Laboratory or research group name
- department_school_institute: department, school, or institute name
- university_institution: University or institution name
- street_address: street address
- region_state_province_city: region, state, province, or city name
- postal_code: postal code
- country: country name
primary_contributors
- First_name: First, given name, anonymised when contributor is technically accepted but has not yet given publication authorisation
- Last_name: Last, family name, anonymised when contributor is technically accepted but has not yet given publication authorisation
- ORCiD
- affiliation_IDs: primary keys of the corresponding affiliations in the affiliations table, separated by commas
- first_tier_position: Author position in first-tier
- publication_agreement: Has contributor explicitly agreed to share her/his meta-data in the collaboration agreement?
- co_author_first_synthesis: Has contributor confirmed co-authorship intention in the collaboration agreement?
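Since affiliation_IDs holds several keys in a single comma-separated field, it has to be split before it can be matched against the affiliations table. A minimal sketch with hypothetical sample data (the names, IDs, and institutions are invented for illustration; only the column names come from the descriptions above):

```python
def split_ids(value):
    """Split a comma-separated ID field into a list of trimmed, non-empty strings."""
    return [part.strip() for part in value.split(",") if part.strip()]


# Hypothetical affiliations table, keyed by affiliation_id.
affiliations = {
    "1": {"university_institution": "Example University", "country": "Exampleland"},
    "3": {"university_institution": "Sample Institute", "country": "Samplestan"},
}

# Hypothetical contributor row.
contributor = {"First_name": "Ada", "Last_name": "Example", "affiliation_IDs": "1, 3"}

# Look up each referenced affiliation, ignoring IDs absent from the table.
contributor_affiliations = [
    affiliations[affiliation_id]
    for affiliation_id in split_ids(contributor["affiliation_IDs"])
    if affiliation_id in affiliations
]

for affiliation in contributor_affiliations:
    print(affiliation["university_institution"], affiliation["country"])
```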
The following columns describe the contributor's role in the project according to the CRediT taxonomy.
Auxiliary files for reproducing analysis
R scripts
- acoustic analysis.R: reproduces the result of the soundscape case studies
- metadata analysis.R: reproduces the metadata analysis results in the publication
Data from the demonstration collection (download from ecoSound-web)
- demo_recordings.csv: metadata of the recordings, see recordings table
- demo_sites.csv: metadata of the sampling locations, see sites table
- demo_tags.csv: data describing annotations made in demonstration recordings for the biophony, anthropophony, geophony, and unknown sound sources
- spectrograms.zip: contains PNG format spectrograms used in generating Figure 5
Externally sourced data
- GET_areas_2.1.1.csv: raw data obtained from Keith et al. 2023 (https://doi.org/10.5281/zenodo.10081251), then summarized in QGIS to obtain areas per functional group
- Havlik_sites.csv: data obtained from Havlik et al. 2022 supplementary material (https://www.frontiersin.org/articles/10.3389/fmars.2022.919418), originally named "Data Sheet 1.CSV"
- Sugai_sites_updated.csv: data obtained from Sugai et al. 2019 (https://doi.org/10.1093/biosci/biy147), personal communication with permission
- taxonomy.csv: raw data obtained from IUCN Red List for all animal taxa (https://www.iucnredlist.org/)
- topography_range_latitude.csv: raw topography from GEBCO sub-ice data (https://www.gebco.net/data_and_products/gridded_bathymetry_data/), summarised by bins of 10 latitudinal rows
Additional details
Related works
- Is published in
- Publication: 10.1111/geb.70021 (DOI)