Published October 18, 2024 | Version 1.1.0
Dataset Open

Seatizen Atlas image dataset

  • 1. Ifremer DOI, La Réunion, France
  • 2. UMR Marbec, IRD, France
  • 3. INRIA Zenith, Montpellier, France

Description

This repository contains the resources and tools for accessing and utilizing the annotated images within the Seatizen Atlas dataset, as described in the paper Seatizen Atlas: a collaborative dataset of underwater and aerial marine imagery.

Download the Dataset

This annotated dataset is a subset of a larger dataset composed of labeled and unlabeled images. For information about the whole dataset, please visit the Zenodo repository and follow the download instructions provided there.

If you are interested in training AI models using this dataset, you can directly access the processed version on Hugging Face.
This version is already split into training, validation and test sets, and includes only the classes with more than 200 annotations for more robust model training.
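
The class filter described above can be sketched as follows. This is a minimal illustration, not the authors' processing pipeline: it assumes the annotations are available as rows with one 0/1 column per class, and the function names and toy rows are hypothetical.

```python
from collections import Counter


def count_annotations(rows, class_names):
    """Count how many rows (images) carry a positive label for each class."""
    counts = Counter()
    for row in rows:
        for name in class_names:
            if int(row[name]) == 1:
                counts[name] += 1
    return counts


def frequent_classes(rows, class_names, min_count=200):
    """Return the classes with strictly more than `min_count` annotations."""
    counts = count_annotations(rows, class_names)
    return [name for name in class_names if counts[name] > min_count]


# Toy data (hypothetical): 250 images with Sand, 50 of which also show Fish.
rows = [{"Sand": 1, "Fish": 1 if i < 50 else 0} for i in range(250)]
kept = frequent_classes(rows, ["Sand", "Fish"])  # only Sand exceeds 200
```

With real data, `rows` could come from `csv.DictReader` over the annotation file, provided its columns match the assumed layout.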

An example of a trained model based on this dataset is DinoVdeau.

Scientific Publication

If you use this dataset in your research, please consider citing the associated paper:

@article{contini2025seatizen,
  title={Seatizen Atlas: a collaborative dataset of underwater and aerial marine imagery},
  author={Contini, Matteo and Illien, Victor and Julien, Mohan and Ravitchandirane, Mervyn and Russias, Victor and Lazennec, Arthur and Chevrier, Thomas and Rintz, Cam Ly and Carpentier, L{\'e}anne and Gogendeau, Pierre and others},
  journal={Scientific Data},
  volume={12},
  number={1},
  pages={67},
  year={2025},
  publisher={Nature Publishing Group UK London}
}

For detailed information about the dataset and experimental results, please refer to the paper cited above.

Overview

The Seatizen Atlas dataset includes 14,492 images with multilabel annotations and 1,200 images with instance segmentation annotations. These images can be used to train and evaluate AI models for marine biodiversity research. The annotations follow standards from the Global Coral Reef Monitoring Network (GCRMN).

Annotation Details

  • Annotation Types:
      • Multilabel Convention: Identifies all observed classes in an image.
      • Instance Segmentation: Highlights contours of each instance for each class.
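
To make the two annotation types concrete, here is a small sketch of how each might be represented in code. The record layout (`label`, `contour`), the four-class subset, and the function names are assumptions for illustration; they do not describe the dataset's actual file format.

```python
# Subset of the dataset's class names, for illustration only.
CLASSES = ["Acropora Branching", "Sand", "Fish", "Sea Urchin"]


def to_multilabel(observed, classes=CLASSES):
    """Encode the set of observed classes as a 0/1 vector aligned with `classes`."""
    return [1 if name in observed else 0 for name in classes]


def classes_from_instances(instances):
    """Collapse instance annotations into the set of classes they contain."""
    return {inst["label"] for inst in instances}


# Hypothetical instance segmentation record: one contour (list of x, y points)
# per instance, so two fish yield two separate entries.
instances = [
    {"label": "Fish", "contour": [(10, 10), (40, 12), (38, 30)]},
    {"label": "Fish", "contour": [(60, 50), (90, 55), (85, 70)]},
    {"label": "Sand", "contour": [(0, 80), (100, 80), (100, 100), (0, 100)]},
]

# The multilabel view keeps only presence or absence of each class.
vector = to_multilabel(classes_from_instances(instances))
```

The multilabel vector records that Fish and Sand are present, while the instance records also keep how many fish there are and where each one lies.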

List of Classes

Algae

  1. Algal Assemblage
  2. Algae Halimeda
  3. Algae Coralline
  4. Algae Turf

Coral

  1. Acropora Branching
  2. Acropora Digitate
  3. Acropora Submassive
  4. Acropora Tabular
  5. Bleached Coral
  6. Dead Coral
  7. Gorgonian
  8. Living Coral
  9. Non-acropora Millepora
  10. Non-acropora Branching
  11. Non-acropora Encrusting
  12. Non-acropora Foliose
  13. Non-acropora Massive
  14. Non-acropora Coral Free
  15. Non-acropora Submassive

Seagrass

  1. Syringodium Isoetifolium
  2. Thalassodendron Ciliatum

Habitat

  1. Rock
  2. Rubble
  3. Sand

Other Organisms

  1. Thorny Starfish
  2. Sea Anemone
  3. Ascidians
  4. Giant Clam
  5. Fish
  6. Other Starfish
  7. Sea Cucumber
  8. Sea Urchin
  9. Sponges
  10. Turtle

Custom Classes

  1. Blurred
  2. Homo Sapiens
  3. Human Object
  4. Trample
  5. Useless
  6. Waste

These classes reflect the biodiversity and variety of habitats captured in the Seatizen Atlas dataset, providing valuable resources for training AI models in marine biodiversity research.
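
For scripts that need to group predictions or annotations by category, the class list above can be transcribed into a lookup table. The sketch below is one possible arrangement; the variable names are hypothetical and the spelling of each class follows this page, which may differ from the column names in the annotation files.

```python
# Class names transcribed from the list above, grouped by category.
TAXONOMY = {
    "Algae": [
        "Algal Assemblage", "Algae Halimeda", "Algae Coralline", "Algae Turf",
    ],
    "Coral": [
        "Acropora Branching", "Acropora Digitate", "Acropora Submassive",
        "Acropora Tabular", "Bleached Coral", "Dead Coral", "Gorgonian",
        "Living Coral", "Non-acropora Millepora", "Non-acropora Branching",
        "Non-acropora Encrusting", "Non-acropora Foliose",
        "Non-acropora Massive", "Non-acropora Coral Free",
        "Non-acropora Submassive",
    ],
    "Seagrass": ["Syringodium Isoetifolium", "Thalassodendron Ciliatum"],
    "Habitat": ["Rock", "Rubble", "Sand"],
    "Other Organisms": [
        "Thorny Starfish", "Sea Anemone", "Ascidians", "Giant Clam", "Fish",
        "Other Starfish", "Sea Cucumber", "Sea Urchin", "Sponges", "Turtle",
    ],
    "Custom Classes": [
        "Blurred", "Homo Sapiens", "Human Object", "Trample", "Useless", "Waste",
    ],
}

# Reverse lookup: class name -> category.
CATEGORY_OF = {
    name: category for category, names in TAXONOMY.items() for name in names
}
```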

Usage Notes

The annotated images are available for non-commercial use. Users are requested to cite the related publication in any resulting works. A GitHub repository has been set up to facilitate data reuse and sharing: GitHub Repository.

Code Availability

All related code for data processing, downloading, and AI model training can be found in the project's GitHub repositories.

Acknowledgements

This dataset and associated research have been supported by several organizations, including the Seychelles Islands Foundation, Réserve Naturelle Marine de la Réunion, and Monaco Explorations, among others.

For any questions or collaboration inquiries, please contact seatizen.ifremer@gmail.com.

Files

20241016_132819_multilabel_annotations.csv

Files (35.8 GB)

Checksum                              Size
md5:a61e61b653c7d3698dd1f4567f4e6ea8  2.2 MB
md5:ebc6f7e67548bae68cb0c8d1aac7bf2a  351.1 MB
md5:25e51ec8492c4c16505e2b7516f578fd  3.3 kB
md5:75f9111c98040e6aebde0ebcb76e389a  35.5 GB
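
Since the largest file is 35.5 GB, it can be worth checking a download against the MD5 values listed above. The following is a minimal sketch using Python's standard `hashlib`; the helper names are hypothetical, and the local file path is left to the reader.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with a listed value, with or without the 'md5:' prefix."""
    expected_hex = expected.split(":", 1)[-1].lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum(local_path, "md5:75f9111c98040e6aebde0ebcb76e389a")` should return `True` for an intact copy of the 35.5 GB file, where `local_path` is wherever that file was saved. Reading in chunks keeps memory use constant regardless of file size.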

Additional details

Identifiers

URN
urn:seatizen_atlas_dataset

Related works