SPARUS-LD: Gilthead seabream (Sparus aurata) landmark detection dataset

Talijančić, Igor; Segvic-Bubic, Tanja; Zuvic, Luka; Institute of Oceanography and Fisheries

doi:10.5281/zenodo.17131026

Published April 24, 2025 | Version v2

Dataset Open

1. Institute of Oceanography and Fisheries

Contributors

Contact person:

Talijančić, Igor²

Data curator:

Šarić, Josip¹

1. University of Ljubljana
2. Institute of Oceanography and Fisheries

The dataset contains 2052 high-resolution images of Gilthead seabream specimens, each annotated with 18 landmarks (keypoints). Each specimen is also assigned to one of three classes according to its origin: wild, farmed, or farm-associated.

Landmarks

Short descriptions of the 18 labeled landmarks are as follows:

1. anterior tip of snout at the upper jaw
2. vertical point above the most anterior point in the eye
3. anterior insertion of the dorsal fin
4. last spiny ray of the dorsal fin
5. posterior insertion of the dorsal fin
6. dorsal point at the least depth of the caudal peduncle
7. posterior body extremity
8. ventral point at the least depth of the caudal peduncle
9. posterior insertion of the anal fin
10. anterior insertion of the anal fin
11. insertion of the pelvic fin
12. ventral tip of the insertion of the operculum on the lateral profile
13. point of maximum extension of the operculum on the lateral profile
14. anterior extremity of the lateral line on the head profile
15. dorsal insertion of the pectoral fin
16. ventral insertion of the pectoral fin
17. the most anterior point in the eye
18. the most posterior point in the eye

Subsets

The dataset is split into three subsets for training, validation and testing. The distribution of images according to the subset and the origin is given in the table below.

Origin	Training	Validation	Test
Wild	419	97	190
Farmed	411	100	324
Farm-associated	254	51	206
Total	1084	248	720
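The totals in the table can be cross-checked with a few lines of Python. The sketch below copies the counts from the table by hand (it does not read the dataset) and also derives per-origin totals, which the table does not list. The variable names are illustrative only.

```python
# Counts copied from the table above: origin -> (training, validation, test)
counts = {
    "wild": (419, 97, 190),
    "farmed": (411, 100, 324),
    "farm-associated": (254, 51, 206),
}

# Column-wise sums give the number of images per subset.
subset_totals = [sum(column) for column in zip(*counts.values())]

# Row-wise sums give the number of images per origin.
origin_totals = {origin: sum(row) for origin, row in counts.items()}

# The grand total should match the stated dataset size.
grand_total = sum(subset_totals)

# Share of each subset in the whole dataset, in table order.
subset_fractions = [total / grand_total for total in subset_totals]
```

The per-origin shares differ between subsets (for example, farmed specimens make up a larger part of the test set than of the training set), which may matter when comparing per-class results across subsets.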

File structure

Dataset file structure is illustrated below.

├── SPARUS-LD
│   ├── res_5184x3456
│   │   ├── train_landmark_configuration_expert.TPS
│   │   ├── val_landmark_configuration_expert.TPS
│   │   ├── test_landmark_configuration_expert.TPS
│   │   ├── test_landmark_configuration_novice.TPS
│   │   ├── test_landmark_configuration_machine.TPS
│   │   ├── train
│   │   │   ├── *.JPG
│   │   ├── test
│   │   │   ├── *.JPG
│   │   ├── val
│   │   │   ├── *.JPG

Each subset is accompanied by a single TPS file containing the landmark coordinates annotated by the expert and a directory with the corresponding images. Image files follow the naming convention {origin}_{id}.JPG, where origin can be wild, farmed, or farm-assoc, and id ranges from 00001 to 02052. This naming scheme allows the specimen’s origin to be automatically inferred from the image name.
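As a minimal sketch of how the origin could be inferred from an image name, the function below splits a name of the form {origin}_{id}.JPG at the last underscore. The function name parse_image_name and the example file names are hypothetical; only the naming convention itself comes from the description above.

```python
from pathlib import Path

# Origin prefixes listed in the naming convention above.
KNOWN_ORIGINS = {"wild", "farmed", "farm-assoc"}


def parse_image_name(filename):
    """Split '{origin}_{id}.JPG' into (origin, numeric id).

    The split is made at the last underscore, so the hyphen in
    'farm-assoc' is left untouched.
    """
    stem = Path(filename).stem  # drops directories and the extension
    origin, separator, id_text = stem.rpartition("_")
    if not separator or origin not in KNOWN_ORIGINS or not id_text.isdigit():
        raise ValueError(f"unexpected image name: {filename!r}")
    return origin, int(id_text)
```

For example, parse_image_name("train/farm-assoc_00042.JPG") would return ("farm-assoc", 42). Names that do not follow the convention raise a ValueError instead of being guessed at.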

For the test set, two additional TPS files are provided alongside the expert annotation: one with landmark coordinates annotated by the novice annotator and one with coordinates predicted by the machine (our deep learning model).

Each TPS file specifies the landmark coordinates for all images of its subset. TPS is one of the standard formats in geometric morphometrics. It is a plain text file, so it can be read and edited with any regular text editor (e.g. notepad, gedit). It can also be read into Python structures using the py-tps library. In our case, the TPS file structure is the following:

LM=18
w_1 h_1
w_2 h_2
...
w_18 h_18
IMAGE=image1_name
ID=image1_id
SCALE=image1_scale

LM=18
w_1 h_1
w_2 h_2
...
w_18 h_18
IMAGE=image2_name
ID=image2_id
SCALE=image2_scale

The line LM=18 marks the beginning of a new TPS record describing one specimen and the corresponding image. Each of the following 18 lines w_i h_i gives the coordinates of one landmark, where w_i is the horizontal position in pixels measured from the left edge of the image and h_i is the vertical position in pixels measured from the bottom edge. Many image processing libraries place the origin at the top-left corner, so you may need to convert the vertical coordinates to be measured from the top edge. The next two lines, IMAGE=... and ID=..., give the corresponding image name and image id. Finally, the line SCALE=... specifies the image scale expressed in pixels/cm, which enables conversion from pixel measurements to real-world metric measurements.
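The record layout described above can be read with a short hand-written parser. The sketch below is not the py-tps API and not the authors' code; the names TpsRecord, parse_tps, to_top_left and distance_cm are hypothetical. It assumes whitespace-separated coordinates, KEY=value lines exactly as shown, and a scale in pixels/cm as stated in the description.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TpsRecord:
    # (w, h) pairs in pixels; h is measured from the bottom edge.
    landmarks: List[Tuple[float, float]] = field(default_factory=list)
    image: str = ""
    id: str = ""
    scale: Optional[float] = None  # pixels per cm, per the description


def parse_tps(text):
    """Parse TPS text into a list of TpsRecord objects."""
    records, current, remaining = [], None, 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            key = key.upper()
            if key == "LM":  # start of a new record
                current, remaining = TpsRecord(), int(value)
                records.append(current)
            elif current is None:
                raise ValueError(f"{key} line before any LM line")
            elif key == "IMAGE":
                current.image = value
            elif key == "ID":
                current.id = value
            elif key == "SCALE":
                current.scale = float(value)
        elif current is not None and remaining > 0:
            w, h = (float(v) for v in line.split()[:2])
            current.landmarks.append((w, h))
            remaining -= 1
        else:
            raise ValueError(f"unexpected line: {line!r}")
    return records


def to_top_left(landmarks, image_height):
    """Convert bottom-origin heights to top-origin rows (assumes y_top = H - y)."""
    return [(w, image_height - h) for w, h in landmarks]


def distance_cm(p, q, scale_px_per_cm):
    """Distance between two landmarks in cm, assuming SCALE is pixels/cm."""
    return math.dist(p, q) / scale_px_per_cm
```

If the directory name res_5184x3456 denotes width x height, which is an assumption here, image_height would be 3456 for to_top_left. Whether the flip should be H - y or H - 1 - y depends on the pixel-centre convention of the annotation tool, so it is worth checking a few landmarks visually before relying on either.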

Additional Metadata

A supplementary CSV file (sparusld_per_specimen_metadata.csv) provides per-specimen metadata with the following columns: image, subset, origin, population, year_sampled, latitude, and longitude. This file enables linking each image to its respective subset, specimen origin, sampling population, and collection coordinates.
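A minimal sketch of reading the metadata file with the Python standard library follows. It assumes a comma-delimited file whose header row uses exactly the column names listed above; the delimiter, the encoding and the spelling of the values are not stated in the description. The helper names load_metadata and count_by are hypothetical.

```python
import csv
from collections import Counter

# Column names as listed in the description above.
EXPECTED_COLUMNS = [
    "image", "subset", "origin", "population",
    "year_sampled", "latitude", "longitude",
]


def load_metadata(lines):
    """Read metadata rows from an open text file (or any iterable of lines).

    Returns a list of dicts keyed by column name; all values stay strings.
    """
    reader = csv.DictReader(lines)
    missing = set(EXPECTED_COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)


def count_by(rows, *keys):
    """Count rows per combination of the given columns."""
    return Counter(tuple(row[key] for key in keys) for row in rows)


# Possible usage, assuming the file sits in the working directory:
# with open("sparusld_per_specimen_metadata.csv", newline="", encoding="utf-8") as f:
#     rows = load_metadata(f)
# print(count_by(rows, "subset", "origin"))
```

Counting by subset and origin in this way should reproduce the distribution table in the Subsets section, which makes it a quick consistency check after download.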

Code repository

Using this dataset, we developed a new deep learning method for automated landmark detection in Gilthead seabream. Details are available in our GitHub repository: https://github.com/jsaric/sparus-ld

Files

Files (12.0 GB)

Name	MD5	Size
iso_meta.xml	273fe8c83f670a24a7592bb8159bb98b	21.5 kB
SPARUS-LD.zip	05e34077cbc727c97dac2c555f0ff646	12.0 GB
sparusld_per_specimen_metadata.csv	3d45b5c1ac85a63644b56fe6cc8433a4	125.5 kB
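The MD5 checksums listed for the files can be used to check that a download is complete and uncorrupted. The sketch below uses Python's hashlib and reads the file in chunks, which matters for the 12.0 GB archive. The function names and the assumption that files keep their published names locally are illustrative.

```python
import hashlib

# Checksums copied from the file listing above.
PUBLISHED_MD5 = {
    "iso_meta.xml": "273fe8c83f670a24a7592bb8159bb98b",
    "SPARUS-LD.zip": "05e34077cbc727c97dac2c555f0ff646",
    "sparusld_per_specimen_metadata.csv": "3d45b5c1ac85a63644b56fe6cc8433a4",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, name):
    """Compare a local file against the checksum published for `name`."""
    return md5_of_file(path) == PUBLISHED_MD5[name]


# Possible usage after downloading the archive:
# print(matches_published("SPARUS-LD.zip", "SPARUS-LD.zip"))
```

A mismatch most often indicates an interrupted or truncated download, in which case fetching the file again is the simplest remedy.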

Additional details

Related works

Is part of: Journal article: 10.3354/aei00294 (DOI); Journal article: 10.3389/fmars.2021.694627 (DOI)

Funding

Croatian Science Foundation
Enhancing Environmental Performance of Net-Pen Marine Aquaculture HRZZ-IP-2022-10-7232

Dates

Collected: 2015/2021
Time period during which all specimens were collected.

Software

Repository URL: https://github.com/jsaric/sparus-ld
Programming language: Python
Development Status: Active

Biodiversity

Life stage: Adult
Sample size unit: 2052
Locality: Eastern Adriatic Sea
Country: Croatia
Species: Gilthead seabream (Sparus aurata)

Audiovisual core

Capture device: Canon EOS 600D digital camera
Subject orientation: Lateral images of the left side of each gilthead seabream were taken. Body posture and fins were teased into a neutral position with needles to minimize the arching effect.
