Dataset Open Access

Sentinel-2 reference cloud masks generated by an active learning method

Louis Baetens; Olivier Hagolle


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Louis Baetens</dc:creator>
  <dc:creator>Olivier Hagolle</dc:creator>
  <dc:date>2018-10-12</dc:date>
  <dc:description> Reference classifications generated with Active Learning for Cloud Detection (ALCD)

This data set provides reference cloud masks for 38 Sentinel-2 scenes. The masks were created with the ALCD tool, developed by Louis Baetens under the direction of Olivier Hagolle at CESBIO/CNES [1], in order to validate the cloud masks generated by the MAJA software [2].

- The `Reference_dataset` directory contains 31 scenes selected in 2017 or 2018.
- The `Hollstein` directory contains 7 scenes that were used to validate the ALCD tool by comparison with manually generated reference images kindly provided by Hollstein et al. [3].

One of these scenes is present in both directories. The "Hollstein" scenes were not used to validate MAJA because they were acquired before Sentinel-2 was fully operational, when the revisit frequency of observations was degraded.

# Description of the data structure
The name of each scene directory is the name of the corresponding Sentinel-2 L1C product.
Each scene directory contains three sub-directories:
- `Classification`
- `Samples`
- `Statistics`
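
The layout above can be checked programmatically once the archive is unpacked. The following is a minimal sketch, assuming the data set has been extracted to a local directory; the function names `missing_subdirs` and `list_scenes` are hypothetical helpers, not part of ALCD.

```python
from pathlib import Path

# Sub-directory names as listed in the description above.
EXPECTED_SUBDIRS = ("Classification", "Samples", "Statistics")


def missing_subdirs(scene_dir):
    """Return the expected sub-directories that are absent from scene_dir."""
    scene_dir = Path(scene_dir)
    return [name for name in EXPECTED_SUBDIRS if not (scene_dir / name).is_dir()]


def list_scenes(root):
    """Map each scene directory name (the L1C product name) to its missing sub-directories."""
    root = Path(root)
    return {p.name: missing_subdirs(p) for p in sorted(root.iterdir()) if p.is_dir()}
```

For example, `list_scenes("Reference_dataset")` would return one entry per scene, with an empty list for every complete scene directory.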

# Description of the files
- `Classification/classification_map.tif` --- the main product, which is the classified scene. 7 classes are available. Each one is represented with a different integer.
0: no_data.
1: not used.
2: low clouds.
3: high clouds.
4: cloud shadows.
5: land.
6: water.
7: snow.
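
The integer codes listed above could be handled as follows. This is only a sketch: it works on a plain 2-D grid of integers, which in practice would be read from `classification_map.tif` with a raster library (not shown here). The choice of which classes count as "cloud" is an assumption that depends on the use case, and the names `AlcdClass`, `class_histogram` and `cloud_mask` are hypothetical.

```python
from collections import Counter
from enum import IntEnum


class AlcdClass(IntEnum):
    """Integer codes of classification_map.tif, as listed above."""
    NO_DATA = 0
    NOT_USED = 1
    LOW_CLOUDS = 2
    HIGH_CLOUDS = 3
    CLOUD_SHADOWS = 4
    LAND = 5
    WATER = 6
    SNOW = 7


# Assumption: only low and high clouds are treated as "cloud" by default.
CLOUD_CODES = {int(AlcdClass.LOW_CLOUDS), int(AlcdClass.HIGH_CLOUDS)}


def class_histogram(rows):
    """Count pixels per class name in a 2-D grid of integer codes."""
    counts = Counter(code for row in rows for code in row)
    return {AlcdClass(code).name: n for code, n in counts.items()}


def cloud_mask(rows, include_shadows=False):
    """Return a grid of booleans, True where the pixel is cloud (optionally shadow)."""
    flagged = set(CLOUD_CODES)
    if include_shadows:
        flagged.add(int(AlcdClass.CLOUD_SHADOWS))
    return [[code in flagged for code in row] for row in rows]
```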

- `Classification/confidence_enhanced.tif` --- enhanced confidence map of the classification. The values are between 0 and 255 (coded on 1 byte).
The classification map was created with a Random Forest algorithm, so the original confidence map gives, for each pixel, the proportion of votes for the majority class.
A median filter was applied to this confidence map. Finally, the values were saved on 1 byte, which is why they lie between 0 and 255.
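
The three steps described above (vote proportion, median filter, storage on one byte) can be illustrated with the sketch below. It is not the ALCD implementation: the window size, the border handling and the rounding rule are all assumptions made for illustration, and the function names are hypothetical.

```python
from collections import Counter
from statistics import median


def vote_confidence(votes):
    """Proportion of trees voting for the majority class, in [0, 1]."""
    counts = Counter(votes)
    return counts.most_common(1)[0][1] / len(votes)


def median_filter(grid, radius=1):
    """Median over a (2*radius+1)-wide window, clipped at the image borders.

    Assumption: the window size used by ALCD is not stated in the description.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            window = [
                grid[i][j]
                for i in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for j in range(max(0, c - radius), min(n_cols, c + radius + 1))
            ]
            row.append(median(window))
        out.append(row)
    return out


def to_byte(confidence):
    """Scale a confidence in [0, 1] to an integer in 0..255 (one byte).

    Assumption: linear scaling with rounding to the nearest integer.
    """
    return round(confidence * 255)
```

With this convention, a pixel on which every tree agrees is stored as 255, and lower values indicate more disagreement among the trees.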

- `Classification/contours.png` --- the contours of the classes from the classification map, overlaid on the scene. Each class has its own contour color.
Green: low and high clouds. Yellow: cloud shadows. Blue: water. Purple: snow.

- `Classification/used_parameters.json` --- the parameters that were used to classify the scene. It includes the tile code, the cloudy and clear dates, along with their product reference.

- `Samples/` --- this directory contains all the shapefiles, one per class.

- `Statistics/k_fold_summary.json` --- results of the 10-fold cross-validation on the scene.
5 metrics are computed, in the order given in "metrics_names". "all_metrics" is a list of the 10 folds, each holding the 5 metrics in that same order.
"means" and "stds" are the means and standard deviations of each metric over the 10 folds.

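The relation between the "all_metrics", "means" and "stds" fields of `Statistics/k_fold_summary.json` can be sketched as follows. The example assumes the four fields are top-level keys laid out as described; the metric names and values are hypothetical, and whether the file uses the population or the sample standard deviation is not stated, so the population form is used here as an assumption.

```python
import json
from statistics import mean, pstdev


def summarize_folds(summary):
    """Recompute per-metric mean and standard deviation from "all_metrics"."""
    names = summary["metrics_names"]
    folds = summary["all_metrics"]  # one list of metric values per fold
    columns = list(zip(*folds))     # regroup the values by metric
    return {
        name: {"mean": mean(col), "std": pstdev(col)}
        for name, col in zip(names, columns)
    }


# Hypothetical content with 2 metrics and 3 folds, for illustration only;
# the real file holds 5 metrics and 10 folds.
example = json.loads("""
{
  "metrics_names": ["metric_a", "metric_b"],
  "all_metrics": [[0.90, 0.80], [0.92, 0.84], [0.94, 0.82]]
}
""")
stats = summarize_folds(example)
```

Comparing such recomputed values with the stored "means" and "stds" is one way to confirm the ordering of the metrics in a given file.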

# References

[1] Baetens, L.; Desjardins, C.; Hagolle, O. Validation of Copernicus Sentinel-2 Cloud Masks Obtained from MAJA, Sen2Cor, and FMask Processors Using Reference Cloud Masks Generated with a Supervised Active Learning Procedure. Remote Sens. 2019, 11, 433.

[2] Hagolle, O.; Huc, M.; Villa Pascual, D.; Dedieu, G. A multi-temporal method for cloud detection, applied to FORMOSAT-2, VENµS, LANDSAT and SENTINEL-2 images. Remote Sens. Environ. 2010, 114 (8), 1747-1755.

[3] Hollstein, A.; Segl, K.; Guanter, L.; Brell, M.; Enesco, M. Ready-to-Use Methods for the Detection of Clouds, Cirrus, Snow, Shadow, Water and Clear Sky Pixels in Sentinel-2 MSI Images. Remote Sens. 2016, 8, 666.</dc:description>
  <dc:identifier>https://zenodo.org/record/1460961</dc:identifier>
  <dc:identifier>10.5281/zenodo.1460961</dc:identifier>
  <dc:identifier>oai:zenodo.org:1460961</dc:identifier>
  <dc:language>eng</dc:language>
  <dc:relation>doi:10.5281/zenodo.1460960</dc:relation>
  <dc:relation>url:https://zenodo.org/communities/remote-sensing</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>https://creativecommons.org/licenses/by/4.0/legalcode</dc:rights>
  <dc:subject>Sentinel-2</dc:subject>
  <dc:subject>Cloud mask</dc:subject>
  <dc:subject>Validation</dc:subject>
  <dc:title>Sentinel-2 reference cloud masks generated by an active learning method</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>dataset</dc:type>
</oai_dc:dc>