Dataset Open Access
MARIne Debris Archive (MARIDA) is a marine-debris-oriented dataset built on Sentinel-2 satellite images. It also includes various sea features that co-exist with marine debris. MARIDA is primarily intended for weakly supervised pixel-level semantic segmentation.
Citation: Kikaki K, Kakogeorgiou I, Mikeli P, Raitsos DE, Karantzalos K (2022) MARIDA: A benchmark for Marine Debris detection from Sentinel-2 remote sensing data. PLoS ONE 17(1): e0262247. https://doi.org/10.1371/journal.pone.0262247
For the quick start guide, visit marine-debris.github.io.
The dataset contains:
i. 1381 patches (256 x 256 pixels), organized by unique dates and S2 tiles. Each patch comes with the corresponding masks of pixel-level annotated classes (*_cl) and confidence levels (*_conf). Patches are provided in GeoTIFF format.
ii. Shapefile data in WGS 84 / UTM projection, with file names following the scheme s2_dd-mm-yy_ttt, where s2 denotes the S2 sensor, dd the day, mm the month, yy the year, and ttt the S2 tile. Each shapefile records the class of every annotation along with its confidence level and the marine debris report description.
iii. Train, Validation and Test split for evaluating machine learning algorithms.
iv. The assigned multi-labels for each patch (labels_mapping.txt).
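The naming conventions in the list above can be handled with a few lines of Python. The following is a minimal sketch, not part of the dataset's official tooling. The function names parse_scene_name and mask_paths are hypothetical, as is the example name used in the docstring. It assumes that matching is case-insensitive, that day and month may be written with one or two digits, and that the *_cl and *_conf suffixes are appended to the patch file stem with the extension kept.

```python
import re
from datetime import date
from pathlib import Path

# Pattern for the s2_dd-mm-yy_ttt scheme. Assumptions: matching is
# case-insensitive, day and month may have one or two digits, and the
# tile identifier is alphanumeric.
NAME_RE = re.compile(r"^s2_(\d{1,2})-(\d{1,2})-(\d{2})_([0-9a-z]+)$", re.IGNORECASE)


def parse_scene_name(stem):
    """Split a name such as 's2_12-01-17_16PCC' into (acquisition date, tile)."""
    match = NAME_RE.match(stem)
    if match is None:
        raise ValueError(f"name does not follow s2_dd-mm-yy_ttt: {stem!r}")
    day, month, year, tile = match.groups()
    # Sentinel-2 imagery starts in 2015, so two-digit years map to 20yy.
    return date(2000 + int(year), int(month), int(day)), tile.upper()


def mask_paths(patch_path):
    """Return the (*_cl, *_conf) paths expected next to a patch file.

    Assumption: the suffix is appended to the patch stem and the extension
    is kept, e.g. 'x.tif' -> 'x_cl.tif' and 'x_conf.tif'.
    """
    patch = Path(patch_path)
    return (
        patch.with_name(f"{patch.stem}_cl{patch.suffix}"),
        patch.with_name(f"{patch.stem}_conf{patch.suffix}"),
    )
```

Returning a datetime.date rather than raw strings makes it straightforward to sort or group scenes chronologically. Invalid calendar dates raise a ValueError from the date constructor.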
The mapping between Digital Numbers and Classes is:
1: Marine Debris
2: Dense Sargassum
3: Sparse Sargassum
4: Natural Organic Material
7: Marine Water
8: Sediment-Laden Water
10: Turbid Water
11: Shallow Water
13: Cloud Shadows
15: Mixed Water
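As a rough illustration of how this mapping might be applied, the sketch below counts pixels per class in a small grid of Digital Numbers. It uses only the standard library and a hand-made toy grid in place of a real *_cl mask; reading the GeoTIFF itself would need a raster library and is not shown. The names CLASS_NAMES and count_classes, and the "Unmapped" label for values missing from the mapping, are choices made for this sketch rather than part of the dataset.

```python
from collections import Counter

# Digital Number -> class name, copied from the mapping listed above.
CLASS_NAMES = {
    1: "Marine Debris",
    2: "Dense Sargassum",
    3: "Sparse Sargassum",
    4: "Natural Organic Material",
    7: "Marine Water",
    8: "Sediment-Laden Water",
    10: "Turbid Water",
    11: "Shallow Water",
    13: "Cloud Shadows",
    15: "Mixed Water",
}


def count_classes(mask):
    """Count pixels per class name in a 2-D grid of Digital Numbers.

    Values absent from CLASS_NAMES are grouped under 'Unmapped'
    (a label chosen for this sketch, not defined by the dataset).
    """
    counts = Counter()
    for row in mask:
        for value in row:
            counts[CLASS_NAMES.get(int(value), "Unmapped")] += 1
    return dict(counts)


# Tiny hand-made grid standing in for a *_cl mask.
toy_mask = [
    [0, 1, 1],
    [7, 7, 0],
]
print(count_classes(toy_mask))
```

Grouping unknown values under a single label, instead of raising an error, keeps the function usable on masks that contain Digital Numbers not covered by the mapping.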
The mapping between Digital Numbers and Confidence level is:
The mapping between Digital Numbers and marine debris Report existence is:
1: Very close
The final uncompressed dataset requires 4.38 GB of storage.