Dataset Open Access

Sentinel-2 reference cloud masks generated by an active learning method

Louis Baetens; Olivier Hagolle

Citation Style Language JSON Export

  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.1460961", 
  "language": "eng", 
  "title": "Sentinel-2 reference cloud masks generated by an active learning method", 
  "issued": {
    "date-parts": [
  "abstract": "<p><strong>Reference classifications generated with Active Learning for Cloud Detection (ALCD)</strong></p>\n\n<p>This data set provides reference cloud masks for 38 Sentinel-2 scenes. These reference masks were created with the ALCD tool, developed by Louis Baetens under the direction of Olivier Hagolle at CESBIO/CNES [1]. They were created to validate the cloud masks generated by the MAJA software [2].</p>\n\n<p>- The `Reference_dataset` directory contains 31 scenes selected in 2017 or 2018.<br>\n- The `Hollstein` directory contains 7 scenes that were used to validate the ALCD tool by comparison to manually generated reference images kindly provided by Hollstein et al. [3].<br>\nOne of these scenes is present in both directories. For the validation of MAJA, the &quot;Hollstein&quot; scenes were not used because they were acquired before Sentinel-2 was operational, when observations were less frequent.</p>\n\n<p><strong># Description of the data structure</strong><br>\nThe name of each scene directory is the name of the corresponding Sentinel-2 L1C product.<br>\nEach scene directory contains three sub-directories.<br>\n- `Classification`<br>\n- `Samples`<br>\n- `Statistics`</p>\n\n<p><strong># Description of the files</strong><br>\n- `Classification/classification_map.tif` --- the main product, which is the classified scene. 7 classes are available. Each one is represented with a different integer.<br>\n0: no_data.<br>\n1: not used.<br>\n2: low clouds.<br>\n3: high clouds.<br>\n4: cloud shadows.<br>\n5: land.<br>\n6: water.<br>\n7: snow.</p>\n\n<p>- `Classification/confidence_enhanced.tif` --- enhanced confidence map of the classification. 
The values are between 0 and 255 (coded on 1 byte).<br>\nBecause the classification map was produced by a Random Forest algorithm, the original confidence map gives, for each pixel, the proportion of votes for the majority class.<br>\nA median filter has been applied to this confidence map. Finally, the value was saved on 1 byte, which is why it ranges from 0 to 255.</p>\n\n<p>- `Classification/contours.png` --- the contours of the classes from the classification map, overlaid on the scene. The color depends on the class.<br>\nGreen: low and high clouds. Yellow: cloud shadows. Blue: water. Purple: snow.</p>\n\n<p>- `Classification/used_parameters.json` --- the parameters that were used to classify the scene. It includes the tile code, the cloudy and clear dates, along with their product reference.</p>\n\n<p>- `Samples/` --- this directory contains all the shapefiles, one per class.</p>\n\n<p>- `Statistics/k_fold_summary.json` --- results of the 10-fold cross-validation on the scene.<br>\n5 metrics are computed, in the order given in &quot;metrics_names&quot;. &quot;all_metrics&quot; is a list of the 10 folds, with the 5 metrics in that same order for each fold.<br>\n&quot;means&quot; and &quot;stds&quot; are the means and standard deviations over the 10 folds.</p>\n\n<p><br>\n<strong># References</strong></p>\n\n<p>[1] Baetens, L.; Desjardins, C.; Hagolle, O. Validation of Copernicus Sentinel-2 Cloud Masks Obtained from MAJA, Sen2Cor, and FMask Processors Using Reference Cloud Masks Generated with a Supervised Active Learning Procedure. <em>Remote Sens.</em> <strong>2019</strong>, <em>11</em>, 433.</p>\n\n<p>[2] Hagolle, O.; Huc, M.; Villa Pascual, D.; Dedieu, G. A multi-temporal method for cloud detection, applied to FORMOSAT-2, VEN&micro;S, LANDSAT and SENTINEL-2 images. <em>Remote Sens. Environ.</em> <strong>2010</strong>, <em>114</em> (8), 1747-1755.</p>\n\n<p>[3] Hollstein, A.; Segl, K.; Guanter, L.; Brell, M.; Enesco, M. 
Ready-to-Use Methods for the Detection of Clouds, Cirrus, Snow, Shadow, Water and Clear Sky Pixels in Sentinel-2 MSI Images. <em>Remote Sens.</em> <strong>2016</strong>, <em>8</em>, 666.</p>", 
  "author": [
      "family": "Louis Baetens"
      "family": "Olivier Hagolle"
  "type": "dataset", 
  "id": "1460961"
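
The class codes of `classification_map.tif` and the layout of `Statistics/k_fold_summary.json` described in the abstract can be exercised with a small Python sketch. It is only an illustration under stated assumptions: reading the GeoTIFF (for example with rasterio or GDAL) is left out, so the pixels are passed in as a plain sequence of integers; the helper names `class_fractions`, `cloud_fraction` and `fold_means` are hypothetical and not part of ALCD; and treating codes 2 and 3 together as "cloud" is one possible choice, not a rule stated by the authors.

```python
from collections import Counter
from statistics import mean

# Integer codes as listed in the dataset description.
CLASS_NAMES = {
    0: "no_data",
    1: "not used",
    2: "low clouds",
    3: "high clouds",
    4: "cloud shadows",
    5: "land",
    6: "water",
    7: "snow",
}

# Assumption for this sketch: low and high clouds both count as cloud.
CLOUD_CODES = {2, 3}


def class_fractions(pixels):
    """Share of each class among valid pixels (code 0, no_data, is excluded)."""
    counts = Counter(pixels)
    valid = sum(n for code, n in counts.items() if code != 0)
    if valid == 0:
        return {}
    return {
        CLASS_NAMES.get(code, f"unknown_{code}"): n / valid
        for code, n in sorted(counts.items())
        if code != 0
    }


def cloud_fraction(pixels):
    """Share of valid pixels labelled as low or high clouds."""
    valid = [p for p in pixels if p != 0]
    if not valid:
        return 0.0
    return sum(p in CLOUD_CODES for p in valid) / len(valid)


def fold_means(summary):
    """Recompute per-metric means from 'all_metrics'.

    'summary' is the parsed content of k_fold_summary.json: 'metrics_names'
    gives the metric order, 'all_metrics' holds one list of values per fold.
    """
    names = summary["metrics_names"]
    folds = summary["all_metrics"]
    return {name: mean(fold[i] for fold in folds) for i, name in enumerate(names)}
```

With a real scene, `pixels` would be the flattened raster and `summary` the result of `json.load` on the statistics file; the values returned by `fold_means` can then be compared against the stored "means" entry as a sanity check.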

