Published March 25, 2025 | Version 1.0.0
Dataset Open

RipAID: Rip current Annotated Image Dataset

Description

Training dataset

RipAID is a dataset designed for training Artificial Intelligence applications that automate rip current detection in RGB images. It includes oblique images captured by SIRENA beach video-monitoring systems, along with the corresponding annotations in several formats (XML, JSON, TXT). RipAID covers two microtidal sandy beaches, with varying fields of view (8 cameras), rip current characteristics, and meteo-oceanic and lighting conditions. The RipAID dataset contains two classes, 'rip_current' and 'doubt', labelled with oriented bounding boxes.

Technical details

RipAID version 1.0.0 is packaged as a compressed file (RipAID_v1.0.0.zip). A total of 2815 RGB images are shared in PNG format, together with the corresponding annotations in several formats (XML, JSON, TXT) and the README file in PDF format.

Data preprocessing

The RipAID dataset comprises snapshot images at their original resolution (1280✕960 px) from two SIRENA beach monitoring systems. No further preprocessing was performed. Refer to the README file for more information.

Data splitting

Researchers should consistently document their splitting method and rationale in publications to ensure reproducibility and facilitate comparisons.
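
One way to make a split easy to document is to derive it from a fixed seed. The sketch below is only an illustration, not a split recommended by the dataset authors: the file names, the seed and the 70/15/15 ratios are all assumptions. Because RipAID is unevenly distributed across stations, cameras and seasons, a grouped or stratified split may suit some studies better.

```python
import random


def split_dataset(image_names, seed=42, train=0.7, val=0.15):
    """Return a reproducible train/val/test split of image file names."""
    names = sorted(image_names)   # sort first so the input order does not matter
    rng = random.Random(seed)     # local generator, leaves global state untouched
    rng.shuffle(names)
    n_train = int(len(names) * train)
    n_val = int(len(names) * val)
    return {
        "train": names[:n_train],
        "val": names[n_train:n_train + n_val],
        "test": names[n_train + n_val:],   # whatever remains
    }


# Hypothetical file names; the real naming scheme is described in the README.
names = [f"image_{i:04d}.png" for i in range(2815)]
splits = split_dataset(names)
print({part: len(items) for part, items in splits.items()})
```

Reporting the seed, the ratios and the sorting rule alongside the results is enough for others to regenerate the same partition.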

Classes, labels and annotations

The RipAID dataset has been labelled manually using the 'Computer Vision Annotation Tool' (CVAT). Two classes are differentiated and labelled using oriented bounding boxes: 'rip_current' and 'doubt'. The 'rip_current' label denotes a clearly identifiable rip current, while the 'doubt' label is assigned to features whose classification as rip currents is uncertain. The 'doubt' category has been included as a preventive measure to ensure a conservative approach. The README file contains further details on the criteria used to define bounding boxes.

Label        Description
rip_current  Clearly identifiable rip current, with defined lateral edges, and neck and/or head observable.
doubt        Plausible rip current, considering factors such as incoming wave patterns, disruption in the wave-breaking front, the presence of a defined neck, or other relevant hydrodynamic features.

Annotations were exported from CVAT in three different formats: (i) CVAT for images (XML); (ii) COCO (JSON); (iii) Ultralytics YOLO-OBB (TXT). The diverse annotation formats offered in RipAID simplify the interaction with the dataset.
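
As an illustration of the TXT format, the sketch below parses one annotation line. It assumes the usual Ultralytics YOLO-OBB layout, in which each line holds a class index followed by four corner points (x1 y1 x2 y2 x3 y3 x4 y4) normalised to the range 0 to 1. The example line is made up, and the mapping from class index to 'rip_current' or 'doubt' is not assumed here; check the exported files and the README for it.

```python
IMAGE_WIDTH, IMAGE_HEIGHT = 1280, 960   # RipAID image resolution in pixels


def parse_obb_line(line, width=IMAGE_WIDTH, height=IMAGE_HEIGHT):
    """Parse one YOLO-OBB line into (class_index, four corner points in pixels)."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    class_index = int(fields[0])
    coords = [float(value) for value in fields[1:]]
    # Pair up the normalised values and scale them to pixel coordinates.
    corners = [
        (coords[i] * width, coords[i + 1] * height)
        for i in range(0, 8, 2)
    ]
    return class_index, corners


# Made-up annotation line, for illustration only.
example = "0 0.25 0.5 0.5 0.5 0.5 0.75 0.25 0.75"
class_index, corners = parse_obb_line(example)
print(class_index, corners)
```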

Parameters

RGB values or any transformation in the colour space can be used as parameters.
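
As one example of a colour-space transformation, the sketch below converts a single 8-bit RGB pixel to HSV with the Python standard library. The choice of HSV and the function name are assumptions made for illustration; the dataset does not prescribe any particular transformation.

```python
import colorsys


def pixel_features(r, g, b):
    """Return RGB and HSV values for one 8-bit pixel, all scaled to [0, 1]."""
    rn, gn, bn = r / 255.0, g / 255.0, b / 255.0
    h, s, v = colorsys.rgb_to_hsv(rn, gn, bn)
    return {"r": rn, "g": gn, "b": bn, "h": h, "s": s, "v": v}


# White foam-like pixel: zero saturation, full brightness.
white = pixel_features(255, 255, 255)
# Pure blue pixel: hue two thirds of the way around the colour wheel.
blue = pixel_features(0, 0, 255)
print(white, blue)
```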

Data sources

A SIRENA system consists of a set of RGB cameras mounted at the top of buildings on the beachfront. These cameras take oblique pictures of the beach, with overlapping fields of view, at 7.5 FPS during the first 10 minutes of each hour in daylight hours. From these pictures, different products are generated, including snapshots, which correspond to the frame of the video at the 5th minute. In the Balearic Islands, SIRENA stations are managed by the Balearic Islands Coastal Observing and Forecasting System (SOCIB), and are mounted at the top of hotels facing the coastline. The present dataset includes snapshots from 8 different cameras of the SIRENA systems operating since 2011 at Cala Millor and Son Bou beaches, located on the islands of Mallorca and Menorca (Balearic Islands, Spain), respectively. The latest and historical SIRENA images are all available through the Beamon app viewer (https://apps.socib.es/beamon).

Data quality

The images in the RipAID dataset are unevenly distributed across SIRENA stations, cameras, and seasons, as a result of rip current occurrence and the collection strategy. Users should be aware of this variability. Additionally, despite expert labelling, the inherent variability of rip currents can lead to labelling ambiguity, which is important to consider. Further details are available in the README file.

Image resolution

The resolution of the images in RipAID is 1280✕960 pixels.

Spatial coverage

The RipAID version 1.0.0 contains data from two SIRENA beach video-monitoring stations, encompassing two microtidal sandy beaches in the Balearic Islands, Spain. These are: Cala Millor (clm) and Son Bou (snb). 

SIRENA station  Longitude (°E)  Latitude (°N)
clm             3.383           39.596
snb             4.077           39.898

Contact information

For further technical inquiries or additional information about the annotated dataset, please contact jsoriano@socib.es.

Notes (English)

We are extremely grateful to the General Directorate of Emergencies and Home Affairs of the Balearic Islands ('Direcció General d’Emergències i Interior') and the Balearic Islands lifeguard teams for their invaluable collaboration in creating this rip current dataset. We hope that this collaborative effort will enhance beach safety and contribute to the development of decision-making tools for beach and emergency management. We extend our gratitude to the iMagine project (https://www.imagine-ai.eu/) managers and partners for their contributions in creating open-access image repositories for AI-based image analysis services.

Files

README_RipAID_v1.0.0.pdf

Files (6.5 GB)

Checksum                              Size
md5:04b12f7c1c90b78649d96e427bad3d7e  2.6 MB
md5:6e2ae609f9dbb25dd3cb442a7cd6203a  6.4 GB
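
The MD5 checksums listed above can be used to confirm that a download is complete. The sketch below shows one way to do this in Python. The function names are hypothetical, and pairing the 6.4 GB checksum with RipAID_v1.0.0.zip is an assumption based on the file sizes, since the listing does not show file names.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, matches_checksum("RipAID_v1.0.0.zip", "md5:6e2ae609f9dbb25dd3cb442a7cd6203a") should return True for an intact copy of the archive, if the assumed pairing is right. Reading in chunks keeps memory use low for the multi-gigabyte file.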

Additional details

Funding

European Commission
iMagine - Imaging data and services for aquatic science 101058625

Dates

Created
2025-03
First version