Published January 11, 2026 | Version v1.2.6
Dataset Open

Seatizen Atlas

  • 1. IFREMER, DOI, France
  • 2. IRD
  • 3. INRIA
  • 4. COOOL

Description

This deposit offers a comprehensive collection of geospatial and metadata files that constitute the Seatizen Atlas dataset, facilitating the management and analysis of spatial information.

To navigate through the data, you can use an interface available at seatizenmonitoring.ifremer.re, which provides a condensed CSV file tailored to your choice of metadata and the selected area.
To retrieve the associated images, you will need to use a script that extracts the relevant frames. A brief tutorial is available here: Tutorial.
All the scripts for processing sessions, creating the geopackage, and generating files can be found here: SeatizenDOI github repository.
All our CSV files are also available in Parquet format.

The repository includes:
  • seatizen_atlas_db.gpkg: geopackage file that stores extensive geospatial data, allowing for efficient management and analysis of spatial information.

  • session_doi.csv: a CSV file listing all sessions published on Zenodo. This file contains the following columns:

    • session_name: identifies the session.
    • session_doi: indicates the URL of the session.
    • place: indicates the location of the session.
    • date: indicates the date of the session.
    • raw_data: indicates whether the session contains raw data or not.
    • processed_data: indicates whether the session contains processed data.

  • metadata_images.csv: a CSV file describing all metadata for each image published in open access. This file contains the following columns:

    • OriginalFileName: indicates the original name of the photo.
    • FileName: indicates the name of the photo adapted to the naming convention adopted by the Seatizen team (i.e., YYYYMMDD_COUNTRYCODE-optionalplace_device_session-number_originalimagename).
    • relative_file_path: indicates the path of the image in the deposit.
    • frames_doi: indicates the DOI of the version where the image is located.
    • GPSLatitude: indicates the latitude of the image (if available).
    • GPSLongitude: indicates the longitude of the image (if available).
    • GPSAltitude: indicates the depth of the frame (if available).
    • GPSRoll: indicates the roll of the image (if available).
    • GPSPitch: indicates the pitch of the image (if available).
    • GPSTrack: indicates the track of the image (if available).
    • GPSDatetime: indicates when the frame was taken (if available).
    • GPSFix: indicates GNSS quality levels (if available).
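As an illustration of how the position columns might be used, the sketch below selects images inside a latitude/longitude box using only the Python standard library. The sample rows, file names, and the `images_in_bbox` helper are hypothetical; only the column names FileName, GPSLatitude, and GPSLongitude come from the description above. Rows without coordinates are skipped, since positions are only given "if available".

```python
import csv
import io

# Hypothetical sample rows; a real run would open metadata_images.csv instead.
SAMPLE_CSV = """FileName,GPSLatitude,GPSLongitude
img_a.jpeg,-21.17,55.28
img_b.jpeg,,
img_c.jpeg,-20.90,55.50
"""


def images_in_bbox(rows, lat_min, lat_max, lon_min, lon_max):
    """Return the FileName of every row whose position lies inside the box."""
    selected = []
    for row in rows:
        lat = row.get("GPSLatitude")
        lon = row.get("GPSLongitude")
        # Positions are optional, so skip rows with an empty coordinate.
        if not lat or not lon:
            continue
        if lat_min <= float(lat) <= lat_max and lon_min <= float(lon) <= lon_max:
            selected.append(row["FileName"])
    return selected


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
in_box = images_in_bbox(rows, -21.2, -21.1, 55.2, 55.3)
print(in_box)
```

To run this against the real file, replace the `io.StringIO(SAMPLE_CSV)` argument with a file handle opened on metadata_images.csv.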

  • metadata_multilabel_predictions.csv: a CSV file describing all predictions from the latest multilabel model, with georeferenced data. This file contains the following columns:

    • FileName: indicates the name of the photo adapted to the naming convention adopted by the Seatizen team (i.e., YYYYMMDD_COUNTRYCODE-optionalplace_device_session-number_originalimagename).
    • frames_doi: indicates the DOI of the version where the image is located.
    • GPSLatitude: indicates the latitude of the image (if available).
    • GPSLongitude: indicates the longitude of the image (if available).
    • GPSAltitude: indicates the depth of the frame (if available).
    • GPSRoll: indicates the roll of the image (if available).
    • GPSPitch: indicates the pitch of the image (if available).
    • GPSTrack: indicates the track of the image (if available).
    • GPSFix: indicates GNSS quality levels (if available).
    • prediction_doi: refers to a specific AI model prediction on the current image (if available).
    • A column for each class predicted by the AI model.
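The FileName convention quoted above (YYYYMMDD_COUNTRYCODE-optionalplace_device_session-number_originalimagename) can be split into its parts. The following is a minimal sketch, not the Seatizen team's own parser: it assumes that the first four fields never contain an underscore and that the country code ends at the first hyphen. The example file name and the `parse_frame_name` helper are hypothetical.

```python
import datetime as dt
from typing import NamedTuple, Optional


class FrameName(NamedTuple):
    date: dt.date
    country_code: str
    place: Optional[str]
    device: str
    session_number: str
    original_image_name: str


def parse_frame_name(file_name: str) -> FrameName:
    """Split a FileName into its fields (assumed layout, see lead-in)."""
    # Split at most four times so underscores in the original image name survive.
    parts = file_name.split("_", 4)
    if len(parts) != 5:
        raise ValueError(f"unexpected file name layout: {file_name!r}")
    date_str, location, device, session_number, original = parts
    # The place is optional: "REU" alone yields an empty place.
    country_code, _, place = location.partition("-")
    return FrameName(
        date=dt.datetime.strptime(date_str, "%Y%m%d").date(),
        country_code=country_code,
        place=place or None,
        device=device,
        session_number=session_number,
        original_image_name=original,
    )


# Hypothetical file name built to follow the convention.
parsed = parse_frame_name("20230105_REU-hermitage_ASV-1_02_frame_0001.jpeg")
print(parsed)
```

Names that do not follow the assumed layout raise a ValueError rather than returning partially filled fields.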

  • metadata_multilabel_annotation.csv: a CSV file listing the subset of all the images that are annotated, along with their annotations. This file contains the following columns:

    • FileName: indicates the name of the photo.
    • frame_doi: indicates the DOI of the version where the image is located.
    • relative_file_path: indicates the path of the image in the deposit.
    • annotation_date: indicates the date when the image was annotated.
    • A column for each class with values:
      • 1: if the class is present.
      • 0: if the class is absent.
      • -1: if the class was not annotated.
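Because -1 marks a class that was not annotated, it should not be counted as an absence. The sketch below tallies the three states per class with the standard library. The sample rows and the class names Class_A and Class_B are hypothetical; the metadata column names and the 1 / 0 / -1 encoding come from the list above.

```python
import csv
import io
from collections import Counter

# Hypothetical sample; a real run would open metadata_multilabel_annotation.csv.
SAMPLE_CSV = """FileName,frame_doi,relative_file_path,annotation_date,Class_A,Class_B
img_a.jpeg,doi-1,path/a,2024-01-01,1,0
img_b.jpeg,doi-1,path/b,2024-01-01,1,-1
img_c.jpeg,doi-1,path/c,2024-01-02,0,-1
"""

# Columns that describe the image rather than a class.
METADATA_COLUMNS = {"FileName", "frame_doi", "relative_file_path", "annotation_date"}
STATES = {1: "present", 0: "absent", -1: "not_annotated"}


def tally_annotations(rows):
    """Count present / absent / not_annotated values for every class column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            if column in METADATA_COLUMNS:
                continue
            # int(float(...)) also accepts values stored as "1.0" or "-1.0".
            state = STATES[int(float(value))]
            counts.setdefault(column, Counter())[state] += 1
    return counts


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
counts = tally_annotations(rows)
for class_name, counter in counts.items():
    print(class_name, dict(counter))
```

Keeping "not_annotated" as its own state makes it possible to compute class frequencies over annotated images only.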

  • darwincore_multilabel_annotations.zip: a Darwin Core Archive (DwC-A) file listing the subset of all the images that are annotated, along with their annotations.
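A GeoPackage such as seatizen_atlas_db.gpkg is an SQLite database, and the GeoPackage standard requires a gpkg_contents table that lists its layers. The sketch below reads that table with Python's built-in sqlite3 module. The layer names inside the Seatizen file are not documented here, so the example runs against a small stand-in database created on the fly; `list_gpkg_layers` and "demo_layer" are hypothetical names.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def list_gpkg_layers(path):
    """Return (table_name, data_type) pairs from the gpkg_contents table."""
    with closing(sqlite3.connect(path)) as conn:
        query = "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
        return conn.execute(query).fetchall()


# Stand-in database so the sketch runs without downloading the real file.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.gpkg")
    with closing(sqlite3.connect(demo_path)) as conn:
        conn.execute(
            "CREATE TABLE gpkg_contents "
            "(table_name TEXT PRIMARY KEY, data_type TEXT NOT NULL)"
        )
        conn.execute("INSERT INTO gpkg_contents VALUES ('demo_layer', 'features')")
        conn.commit()
    layers = list_gpkg_layers(demo_path)

print(layers)
```

With the downloaded deposit, calling `list_gpkg_layers("seatizen_atlas_db.gpkg")` should list the layers actually stored in the file.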

Scientific Publication

If you use this dataset in your research, please consider citing the associated paper:

@article{contini2025seatizen,
title={Seatizen Atlas: a collaborative dataset of underwater and aerial marine imagery},
author={Contini, Matteo and Illien, Victor and Julien, Mohan and Ravitchandirane, Mervyn and Russias, Victor
and Lazennec, Arthur and Chevrier, Thomas and Rintz, Cam Ly and Carpentier, L{\'e}anne and Gogendeau, Pierre and others},
journal={Scientific Data},
volume={12},
number={1},
pages={67},
year={2025},
publisher={Nature Publishing Group UK London}
}

For detailed information about the dataset and experimental results, please refer to the paper cited above.

Files

Files (7.8 GB)

Checksum and size of each file:
  • md5:27ceca697716aeda9778717f6add139d: 7.5 MB
  • md5:966f12a83f441b37df13e1a974748508: 784.0 MB
  • md5:4c1df2e9595fa619b3c32a047a668bea: 103.5 MB
  • md5:380d190ad4ff50d88a18dc6c4de8bbfd: 8.0 MB
  • md5:bdfadbbfedc6fc334ee63415d69032a4: 200.2 kB
  • md5:41457c8c92b3dc54d91173a23a32f661: 848.2 MB
  • md5:187a7625fbb0b332d7660f97e5e07104: 82.7 MB
  • md5:a029a27d5405a4ec168d1cc2b0071c46: 6.0 GB
  • md5:7c5a3d75385077bd30e0da22fa21ec4c: 58.0 kB
  • md5:ce77769bc217413900dc68bcac0b868c: 10.3 kB
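The md5 values listed for the files can be used to check a download for corruption. The following is a minimal sketch using Python's hashlib; the helper names `md5_of_file` and `matches_listing` are hypothetical, and the demonstration runs on a small temporary file rather than on a file from this deposit.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use low for multi-gigabyte files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Demonstration on a temporary file with known content.
payload = b"seatizen demo payload"
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.bin")
    with open(demo_path, "wb") as handle:
        handle.write(payload)
    listed = "md5:" + hashlib.md5(payload).hexdigest()
    ok = matches_listing(demo_path, listed)
    bad = matches_listing(demo_path, "md5:" + "0" * 32)

print(ok, bad)
```

For a real check, pass the path of a downloaded file together with the corresponding md5 entry from the file listing.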

Additional details

Identifiers

URN
urn:seatizen-atlas