Dataset Open Access

Sentinel-2 reference cloud masks generated by an active learning method

Louis Baetens; Olivier Hagolle


JSON-LD (schema.org) Export

{
  "inLanguage": {
    "alternateName": "eng", 
    "@type": "Language", 
    "name": "English"
  }, 
  "description": "<p>&nbsp;<strong>Reference classifications generated with Active Learning for Cloud Detection (ALCD)</strong></p>\n\n<p>This data set provides a reference cloud mask data set for 38 Sentinel-2 scenes. These reference masks have been created with the ALCD tool, developed by Louis Baetens, under the direction of Olivier Hagolle at CESBIO/CNES[1]. They were created to validate the cloud masks generated by the MAJA software [2].</p>\n\n<p>- The `Reference_dataset` directory contains 31 scenes selected in 2017 or 2018.<br>\n- The `Hollstein` directory contains 7 scenes that were used to validate the ALCD tool by comparison to manually generated reference images kindlyprovided by Hollstein et al[3]<br>\nOne of these scenes is present in both directories. For the validation of MAJA, the &quot;Hollstein&quot; scenes were not used because of their acquisition at a time period when Sentinel-2 was not yet operational, with a degraded repetitivity of observations.</p>\n\n<p><strong># Description of the data structure</strong><br>\nThe name of each scene directory is the name of the corresponding Sentinel-2 L1C product.<br>\nIn the scene directory, three sub-directories can be found.<br>\n- `Classification`<br>\n- `Samples`<br>\n- `Statistics`</p>\n\n<p><strong># Description of the files</strong><br>\n- `Classification/classification_map.tif` --- the main product, which is the classified scene. 7 classes are available. Each one is represented with a different integer.<br>\n0: no_data.<br>\n1: not used.<br>\n2: low clouds.<br>\n3: high clouds.<br>\n4: clouds shadows.<br>\n5: land.<br>\n6: water.<br>\n7: snow.</p>\n\n<p>- `Classification/confidence_enhanced.tif` --- enhanced confidence map of the classification. 
The values are between 0 and 255 (coded on 1 byte).<br>\nThe original confidence map is, for each pixel, the proportion of votes for the majority class, since the classification map was created with a Random Forest algorithm.<br>\nA median filter has been applied to this confidence map. Finally, the value was saved on 1 byte, which is why it lies between 0 and 255.</p>\n\n<p>- `Classification/contours.png` --- the contours of the classes from the classification map, overlaid on the scene. The color depends on the class.<br>\nGreen: low and high clouds. Yellow: cloud shadows. Blue: water. Purple: snow.</p>\n\n<p>- `Classification/used_parameters.json` --- the parameters that were used to classify the scene. It includes the tile code and the cloudy and clear dates, along with their product references.</p>\n\n<p>- `Samples/` --- this directory contains all the shapefiles, one per class.</p>\n\n<p>- `Statistics/k_fold_summary.json` --- results of the 10-fold cross-validation on the scene.<br>\n5 metrics are computed, in the order given in &quot;metrics_names&quot;. &quot;all_metrics&quot; is a list of the 10 folds, with the 5 metrics in that same order for each fold.<br>\n&quot;means&quot; and &quot;stds&quot; are the means and standard deviations over the 10 folds.</p>\n\n<p><br>\n<strong># References</strong></p>\n\n<p>[1] Baetens, L.; Desjardins, C.; Hagolle, O. Validation of Copernicus Sentinel-2 Cloud Masks Obtained from MAJA, Sen2Cor, and FMask Processors Using Reference Cloud Masks Generated with a Supervised Active Learning Procedure. <em>Remote Sens.</em> <strong>2019</strong>, <em>11</em>, 433.</p>\n\n<p>[2] Hagolle, O.; Huc, M.; Villa Pascual, D.; Dedieu, G. A multi-temporal method for cloud detection, applied to FORMOSAT-2, VEN&micro;S, LANDSAT and SENTINEL-2 images. <em>Remote Sensing of Environment</em> <strong>2010</strong>, <em>114</em> (8), 1747-1755.</p>\n\n<p>[3] Hollstein, A.; Segl, K.; Guanter, L.; Brell, M.; Enesco, M. 
Ready-to-Use Methods for the Detection of Clouds, Cirrus, Snow, Shadow, Water and Clear Sky Pixels in Sentinel-2 MSI Images. <em>Remote Sens.</em> <strong>2016</strong>, <em>8</em>, 666.</p>", 
  "license": "https://creativecommons.org/licenses/by/4.0/legalcode", 
  "creator": [
    {
      "affiliation": "CESBIO/CNES", 
      "@type": "Person", 
      "name": "Louis Baetens"
    }, 
    {
      "affiliation": "CESBIO/CNES", 
      "@id": "https://orcid.org/0000-0003-2358-0493", 
      "@type": "Person", 
      "name": "Olivier Hagolle"
    }
  ], 
  "url": "https://zenodo.org/record/1460961", 
  "datePublished": "2018-10-12", 
  "keywords": [
    "Sentinel-2", 
    "Cloud mask", 
    "Validation"
  ], 
  "@context": "https://schema.org/", 
  "distribution": [
    {
      "contentUrl": "https://zenodo.org/api/files/ae8ac2a1-35db-4869-b577-b0cf918d121b/SENTINEL_2_reference_cloud_masks_Baetens_Hagolle.tgz", 
      "encodingFormat": "tgz", 
      "@type": "DataDownload"
    }
  ], 
  "identifier": "https://doi.org/10.5281/zenodo.1460961", 
  "@id": "https://doi.org/10.5281/zenodo.1460961", 
  "@type": "Dataset", 
  "name": "Sentinel-2 reference cloud masks generated by an active learning method"
}
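
The class codes and the layout of `k_fold_summary.json` described above can be illustrated with a small, hedged Python sketch. It is not part of the ALCD tool: the helper names `cloud_mask` and `fold_means`, the choice to count only codes 2 and 3 as cloud, and the toy inputs (including the metric names `metric_a` and `metric_b`) are assumptions made for illustration. Reading the actual GeoTIFF files would need a raster library and is left out here.

```python
import json
from statistics import mean

# Integer codes as listed in the dataset description.
CLASS_NAMES = {
    0: "no_data", 1: "not used", 2: "low clouds", 3: "high clouds",
    4: "cloud shadows", 5: "land", 6: "water", 7: "snow",
}

# Assumption: "cloud" means low or high clouds only (shadows excluded).
CLOUD_CODES = {2, 3}


def cloud_mask(class_rows):
    """Map a 2-D grid of class codes to True (cloud), False (other) or None (no_data)."""
    return [[None if code == 0 else code in CLOUD_CODES for code in row]
            for row in class_rows]


def fold_means(summary):
    """Recompute the per-metric mean over the folds stored in "all_metrics"."""
    folds = summary["all_metrics"]
    return {name: mean(fold[i] for fold in folds)
            for i, name in enumerate(summary["metrics_names"])}


# Made-up toy inputs; real files hold 10 folds and 5 metrics per fold.
toy_map = [[0, 2, 5], [3, 4, 7]]
toy_summary = json.loads(
    '{"metrics_names": ["metric_a", "metric_b"],'
    ' "all_metrics": [[0.5, 1.0], [1.0, 0.5]]}'
)

print(cloud_mask(toy_map))
print(fold_means(toy_summary))
```

With a real scene, the recomputed means could be compared against the "means" entry of the same file as a consistency check.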
Statistics (all versions / this version)
Views: 2,083 / 2,082
Downloads: 716 / 716
Data volume: 168.0 GB / 168.0 GB
Unique views: 1,851 / 1,850
Unique downloads: 430 / 430
