Published August 1, 2021 | Version 1.0.0
Dataset Open

MARIDA: Marine Debris Archive


MARIne Debris Archive (MARIDA) is a marine debris-oriented dataset built on Sentinel-2 satellite images. It also includes various sea features that co-exist with marine debris. MARIDA is primarily designed for the weakly supervised pixel-level semantic segmentation task.

Citation: Kikaki K, Kakogeorgiou I, Mikeli P, Raitsos DE, Karantzalos K (2022) MARIDA: A benchmark for Marine Debris detection from Sentinel-2 remote sensing data. PLoS ONE 17(1): e0262247.

For the quick start guide visit


The dataset contains:

i. 1381 patches (256 x 256 pixels) organized by unique dates and S2 tiles. Each patch is provided along with the corresponding masks of pixel-level annotated classes (*_cl) and confidence levels (*_conf). Patches are given in GeoTIFF format.

ii. Shapefile data in WGS 84 / UTM projection, with file names following the scheme s2_dd-mm-yy_ttt, where s2 denotes the S2 sensor, dd the day, mm the month, yy the year and ttt the S2 tile. Each shapefile includes the class of each annotation along with its confidence level and the marine debris report description.

iii. Train, Validation and Test split for evaluating machine learning algorithms.

iv. The assigned multi-labels for each patch (labels_mapping.txt).
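
The naming scheme s2_dd-mm-yy_ttt described in item ii can be split into its parts with a short parser. The following is a minimal sketch, not part of the dataset's own tooling: the names PatchName and parse_name are hypothetical, the two-digit year is assumed to fall in the 2000s, and the tile is assumed to be a single alphanumeric token.

```python
import re
from datetime import date
from typing import NamedTuple


class PatchName(NamedTuple):
    """Parts of a name that follows the s2_dd-mm-yy_ttt scheme (hypothetical helper type)."""
    sensor: str
    acquired: date
    tile: str


# Assumption: day and month may appear with or without a leading zero,
# and the tile is one alphanumeric token that ends at the next "_" or ".".
_NAME_RE = re.compile(
    r"^(?P<sensor>s2)_(?P<dd>\d{1,2})-(?P<mm>\d{1,2})-(?P<yy>\d{2})_(?P<tile>[A-Za-z0-9]+)",
    re.IGNORECASE,
)


def parse_name(name: str) -> PatchName:
    """Split a file or folder name into sensor, acquisition date and S2 tile."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"name does not follow s2_dd-mm-yy_ttt: {name!r}")
    # Assumption: two-digit years refer to 20yy.
    year = 2000 + int(match["yy"])
    acquired = date(year, int(match["mm"]), int(match["dd"]))
    return PatchName(match["sensor"].upper(), acquired, match["tile"].upper())
```

Anything after the tile token (a patch index or a file extension, for instance) is ignored by this sketch, so the same function can be applied to shapefile names and to names that merely start with the scheme.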

The mapping between Digital Numbers and Classes is:

1: Marine Debris
2: Dense Sargassum
3: Sparse Sargassum
4: Natural Organic Material
5: Ship
6: Clouds
7: Marine Water
8: Sediment-Laden Water
9: Foam
10: Turbid Water
11: Shallow Water
12: Waves
13: Cloud Shadows
14: Wakes
15: Mixed Water
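
As an illustration of how this mapping might be used, the sketch below counts annotated pixels per class in a class mask that has already been read into nested lists of integers. The names CLASS_NAMES and class_pixel_counts are hypothetical, and values outside 1 to 15 (for example, a 0 for unannotated pixels) are simply skipped, which is an assumption about the masks rather than something stated here.

```python
from collections import Counter

# Digital Number -> class name, copied from the mapping above.
CLASS_NAMES = {
    1: "Marine Debris",
    2: "Dense Sargassum",
    3: "Sparse Sargassum",
    4: "Natural Organic Material",
    5: "Ship",
    6: "Clouds",
    7: "Marine Water",
    8: "Sediment-Laden Water",
    9: "Foam",
    10: "Turbid Water",
    11: "Shallow Water",
    12: "Waves",
    13: "Cloud Shadows",
    14: "Wakes",
    15: "Mixed Water",
}


def class_pixel_counts(mask):
    """Count pixels per class name in a 2-D mask given as rows of integers.

    Assumption: any value not listed in CLASS_NAMES marks an unannotated
    pixel and is left out of the result.
    """
    counts = Counter(int(value) for row in mask for value in row)
    return {
        CLASS_NAMES[dn]: n
        for dn, n in sorted(counts.items())
        if dn in CLASS_NAMES
    }
```

For a real patch, the *_cl mask would first be read with a GeoTIFF reader of your choice and converted to rows of integers before being passed in.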

The mapping between Digital Numbers and Confidence level is:

1: High
2: Moderate
3: Low
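
Because lower numbers mean higher confidence, a confidence mask can be used to keep only the more trustworthy annotations. The sketch below is one possible way to do this on masks held as nested lists; keep_confident, max_level and ignore_value are hypothetical names, and using 0 as the value for discarded pixels is an assumption.

```python
# Digital Number -> confidence level, copied from the mapping above.
CONFIDENCE_LEVELS = {1: "High", 2: "Moderate", 3: "Low"}


def keep_confident(class_mask, conf_mask, max_level=1, ignore_value=0):
    """Return a copy of class_mask with low-confidence pixels blanked out.

    A pixel is kept when its confidence number lies between 1 and max_level,
    so max_level=1 keeps only "High" and max_level=2 keeps "High" and "Moderate".
    Every other pixel is set to ignore_value (assumed to mean "no annotation").
    """
    if max_level not in CONFIDENCE_LEVELS:
        raise ValueError(f"max_level must be one of {sorted(CONFIDENCE_LEVELS)}")
    return [
        [
            cls if 1 <= conf <= max_level else ignore_value
            for cls, conf in zip(class_row, conf_row)
        ]
        for class_row, conf_row in zip(class_mask, conf_mask)
    ]
```

Whether to train on all annotations or only on the high-confidence ones is a design choice left to the user; this helper only makes the second option easy to try.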

The mapping between Digital Numbers and marine debris Report existence is:

1: Very close
2: Away
3: No
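
The report mapping can be encoded in the same spirit, for example to select annotations that are backed by a marine debris report. In the sketch below the enum DebrisReport, the function with_nearby_report and the record key "report" are all hypothetical; the attribute names actually used in the shapefiles are not given here and should be checked before adapting it.

```python
from enum import IntEnum


class DebrisReport(IntEnum):
    """Digital Number -> marine debris report existence, from the mapping above."""
    VERY_CLOSE = 1
    AWAY = 2
    NO = 3


def with_nearby_report(records, key="report"):
    """Keep annotation records whose report value is 'Very close'.

    Each record is assumed to be a dict-like object holding the report
    Digital Number under `key` (a placeholder attribute name).
    """
    return [
        record
        for record in records
        if DebrisReport(int(record[key])) is DebrisReport.VERY_CLOSE
    ]
```

A value outside 1 to 3 raises ValueError when converted to DebrisReport, which surfaces unexpected codes instead of silently dropping them.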


The final uncompressed dataset requires 4.38 GB of storage.


Files (1.2 GB)


Additional details

Related works

Is supplemented by
Software: 10.5281/zenodo.5152216 (DOI)