Data from: BioEncoder: a metric learning toolkit for comparative organismal biology
Description
BioEncoder: a metric learning toolkit for comparative organismal biology
Abstract - In the realm of biological image analysis, deep learning (DL) has become a core toolkit, e.g., for segmentation and classification. However, conventional DL methods are challenged by large biodiversity datasets characterized by unbalanced classes and hard-to-distinguish phenotypic differences between them. Here we present BioEncoder, a user-friendly toolkit for metric learning, which overcomes these challenges by focussing on learning relationships between individual data points rather than on the separability of classes. BioEncoder is released as a Python package, created for ease of use and flexibility across diverse datasets. It features taxon-agnostic data loaders, custom augmentation options, and simple hyperparameter adjustments through text-based configuration files. The toolkit's significance lies in its potential to unlock new research avenues in biological image analysis while democratizing access to advanced deep metric learning techniques. BioEncoder focuses on the urgent need for toolkits bridging the gap between complex DL pipelines and practical applications in biological research.
Dataset - This data repository includes two things: a snapshot of the BioEncoder package (BioEncoder-main.zip, version 1.0.0, downloaded from https://github.com/agporto/BioEncoder on 2024-07-19 at 17:20), and the damselfly dataset used for the case study presented in the paper (bioencoder_data.zip). The dataset archive also encompasses the configuration files and the final model checkpoints from the case study, as well as a script to reproduce the results and figures presented in the paper.
How to use - Get started by consulting the GitHub repository for information on how to install BioEncoder, then download the data archive and run the script. Some parts of the script can be executed using the provided model checkpoints; for other parts, the training routine needs to be run.
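Before running the script, it can help to confirm that the downloaded archive is intact by comparing it against the MD5 checksum listed for it under Files. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `verify_and_extract` are illustrative and are not part of BioEncoder, and the archive path, checksum, and target directory are whatever you pass in.

```python
import hashlib
import zipfile
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_and_extract(archive, expected_md5, target_dir):
    """Extract `archive` into `target_dir` only if its MD5 matches `expected_md5`.

    `expected_md5` may be given with or without the leading "md5:" label.
    """
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]

    actual = md5_of_file(archive)
    if actual != expected:
        raise ValueError(
            f"checksum mismatch for {archive}: got {actual}, expected {expected}"
        )

    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return target
```

Call `verify_and_extract` with the path of the downloaded archive, the checksum shown for that file, and a directory of your choice. A `ValueError` indicates a corrupted or incomplete download, in which case the archive should be downloaded again.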
Files
BioEncoder-data.zip
Total size: 2.5 GB
MD5 checksum | Size |
---|---|
md5:48d36b385fe871698d53d44c534f98fd | 2.5 GB |
md5:a875d1b98c23358bae11b4f404ea5ecc | 1.5 MB |
Additional details
Funding
Dates
- Created: 2024-03-28 (version 0.1.0 released)
- Updated: 2024-07-19 (version 1.0.0 released)
Software
- Repository URL: https://github.com/agporto/BioEncoder
- Programming language: Python
- Development Status: Active