Leveraging tropical reef, bird and unrelated sounds for superior transfer learning in marine bioacoustics
Description
Supplementary 2
Repository Overview
This is v2 of the repository. Check you are using the latest version of this repo under the versions tab.
Citation
Please cite the following paper if you use any material from this repository: 'Leveraging tropical reef, bird, and unrelated sounds for superior transfer learning in marine bioacoustics', Williams et al. (2024): https://doi.org/10.48550/arXiv.2404.16436.
ReefSet
Contents
The full version of ReefSet used in Williams et al. (2024). This folder includes:
- 57,084 WAV files that make up ReefSet_v1.0. Each file is 1.88 seconds in length, sampled at 16 kHz. All files have an associated label.
- reefset_annotations.json, which contains the associated label, file ID, filename, data sharer, dataset and recorder type used for each sample in ReefSet_v1.0. This information is also indicated in the filename of each file.
- reefset_labels_by_dataset.csv, which provides a table containing the counts of each label class within each dataset.
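As a quick sanity check on the figures above, the following sketch works out how many samples each clip holds and how much audio ReefSet contains in total. It uses only the numbers stated in this description (57,084 files, 1.88 seconds each, 16 kHz); the variable names are illustrative.

```python
# Figures taken from the ReefSet description above.
NUM_FILES = 57_084
CLIP_SECONDS = 1.88
SAMPLE_RATE_HZ = 16_000

# Samples in one clip: duration multiplied by sample rate.
samples_per_clip = round(CLIP_SECONDS * SAMPLE_RATE_HZ)

# Total audio across the whole dataset, in hours.
total_hours = NUM_FILES * CLIP_SECONDS / 3600

print(samples_per_clip)        # 30080
print(round(total_hours, 2))   # 29.81
```

Each clip is therefore 30,080 samples long, and the full dataset amounts to just under 30 hours of audio.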
ReefSet Contribution Guidelines
We welcome contributions to continue expanding ReefSet:
- Please prepare your data with a matching JSON file and WAV files sampled at 16 kHz. While recordings should ideally be 1.88 seconds in length, flexibility with duration may be possible.
- Ensure your files have a strong label that follows the format detailed under the 'Dataset' heading in the research article and in the supplementary tables. The same format can be seen in the JSON file. Take care to listen to existing label classes so that you do not introduce a new label for a sound type already present within ReefSet. For example, if your data is an entirely new sound type, please create a new label; however, if your data matches an existing sound type (e.g., bioph_growl), please use the existing label.
- Contact ben.williams.20@ucl.ac.uk and abram@conservationmetrics.com to contribute.
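Before getting in touch, it may help to check that your WAV files match the format described above. The following is a minimal sketch using Python's standard `wave` module, not an official ReefSet tool: the function name `check_wav` and the 0.01 second tolerance are assumptions made for illustration. Note that the `wave` module reads PCM WAV files, so other encodings may need a different reader.

```python
import wave

EXPECTED_SAMPLE_RATE = 16_000   # Hz, as stated in the guidelines
EXPECTED_DURATION_S = 1.88      # seconds, the ideal clip length


def check_wav(path, tolerance_s=0.01):
    """Return a list of problems found in one WAV file (empty if none).

    tolerance_s is an illustrative choice, not a ReefSet requirement.
    """
    problems = []
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    if rate != EXPECTED_SAMPLE_RATE:
        problems.append(
            f"sample rate is {rate} Hz, expected {EXPECTED_SAMPLE_RATE} Hz"
        )
    if abs(duration - EXPECTED_DURATION_S) > tolerance_s:
        problems.append(
            f"duration is {duration:.3f} s, expected about {EXPECTED_DURATION_S} s"
        )
    return problems
```

Calling `check_wav("my_clip.wav")` on a hypothetical file returns an empty list when the file is 16 kHz and close to 1.88 seconds long. Because the guidelines allow some flexibility in duration, a duration message is something to mention when you make contact rather than a hard failure.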
SurfPerch
The pretrained SurfPerch_v1.0 folder contains the SurfPerch model and associated files, optimized for feature embedding extraction and/or fine-tuning for tropical reef and similar underwater acoustic data.
Getting Started with SurfPerch
See the tutorial in this repo or visit the Perch GitHub repository for details on how to start using the model.
The tutorial.zip contains:
- The notebook SurfPerch Demo with Calling in Our Corals.ipynb, providing a full tutorial on using the SurfPerch model with new data. The sample data is taken from the Google Arts and Culture project 'Calling in Our Corals'. The notebook can be used locally or, for greatest ease, uploaded to Google Drive for use as a Colab notebook.
- The SurfPerch Demo with Calling in Our Corals Data folder, which contains sample data from three different Calling in Our Corals datasets for use with the notebook. The notebook contains a cell to copy the sample data from the Perch team's public GCS bucket. If the GCS bucket is no longer accessible, you can upload this folder to your Google Drive instead, ensuring it is unzipped. Further instructions are available in the notebook.
- The SurfPerch Demo with Calling in Our Corals Output folder, which contains the exact outputs (embeddings, audio samples labeled during the search, inference results CSV) generated by our run-through of the tutorial. These can be used to reproduce the figures presented at the end of the tutorial.
Note:
Over time, as the standard Colab environment and Perch package update, the installation and imports may begin to fail. Check the following for updates:
- The SurfPerch GitHub for an updated version of this notebook
- The original NeurIPS Workshop Tutorial for updates to the installation or imports
- The Perch GitHub repository for open or closed issues on installs and imports.
- Alternatively, maintained versions of the notebooks (.ipynb files) needed to run this workflow can be accessed from the Perch GitHub repository. These can be installed to your local machine, instead of Colab, by following instructions in the ReadMe on the Perch GitHub.
License
All files contained in this directory are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
Additional details
Software
- Programming language: Python
- Development status: Active