Published June 29, 2021 | Version 1.0
Dataset Open

PixBox Sentinel-2 pixel collection for CMIX

  • 1. Brockmann Consult GmbH

Description

The PixBox-S2-CMIX dataset was used as a validation reference within the first Cloud Masking Inter-comparison eXercise (CMIX), conducted within the Committee on Earth Observation Satellites (CEOS) Working Group on Calibration & Validation (WGCV) in 2019. The PixBox-S2-CMIX pixel collection existed prior to CMIX; it was collected in 2018.

The overarching idea of PixBox is a quantitative assessment of the quality of a pixel classification produced by an automated algorithm or procedure. Pixel classification is defined as assigning a certain number of attributes to an image pixel, such as cloud, clear sky, water, land, inland water, flooded, snow etc. Such pixel classification attributes are typically used to guide higher-level processing.
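As a rough illustration of such a quantitative assessment, the following Python sketch compares the per-pixel labels of an algorithm with expert reference labels and derives the overall accuracy. The class names, the sample labels, and the function names are hypothetical; they are not taken from the PixBox dataset or its tooling.

```python
from collections import Counter


def confusion_counts(reference, predicted):
    """Count (reference, predicted) label pairs for two equally long sequences."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    return Counter(zip(reference, predicted))


def overall_accuracy(reference, predicted):
    """Fraction of pixels whose predicted label equals the reference label."""
    counts = confusion_counts(reference, predicted)
    total = sum(counts.values())
    agree = sum(n for (ref, pred), n in counts.items() if ref == pred)
    return agree / total if total else 0.0


# Hypothetical labels for five pixels (not real PixBox data).
reference = ["cloud", "clear", "cloud", "clear", "cloud"]
predicted = ["cloud", "clear", "clear", "clear", "cloud"]

counts = confusion_counts(reference, predicted)
accuracy = overall_accuracy(reference, predicted)
```

The pair counts form a confusion matrix, from which per-class measures can be derived in the same way as the overall accuracy.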

PixBox datasets are produced as follows: one or more trained, experienced experts manually classify pixels of an image sensor into a pre-defined, detailed set of classes. These typically cover different cloud transparencies, cloud shadow, and the condition of the underlying surface (e.g. “semi-transparent clouds over snow”, “clouds over bright scattering water”). A typical collected dataset includes several tens of thousands of pixels because it has to be representative of all classes and of various observation and environmental conditions, such as climate zones and sun illumination. Quality control of the collected pixels is important in order to detect misclassifications and systematic errors. An auto-associative neural network is trained for this purpose.

The PixBox-S2-CMIX dataset is a pixel collection containing 17,351 pixels manually collected from 29 Sentinel-2 A & B Level 1C products. The dataset is spatially, temporally, and thematically well distributed. 

 

PixBox-S2-CMIX dataset

The PixBox-S2-CMIX dataset consists of two main ZIP files, one holding the pixel collection and its description, and another with all Sentinel-2 L1C data used. The dataset is structured as follows:

  • PixBox-S2-CMIX.zip
    • The collected features (CSV file).
    • A description of all categories and classes, incl. links to the Sentinel-2 L1C products used.
  • Sentinel-2_L1C.zip
    • 29 zipped Sentinel-2 Level 1C (L1C) products [1], used to produce the dataset.

Files 

pixbox_sentinel2_cmix_20180425.csv - This file contains all collected pixel information in CSV format. All collected classes are stored as integer values. A description of the categories and the mapping from integers to class names are given in the additional description file.

pixbox_sentinel2_cmix_20180425_description.txt - This file gives a clear description of the categories and classes. It can be used to convert the class ID numbers, stored in the CSV, to class strings. Additionally, it links the satellite product ID, given in the CSV, to the Sentinel-2 L1C product names.
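The conversion from class IDs to class strings could look roughly like the Python sketch below. It rests on several assumptions: the column names, the comma delimiter, the sample rows, and the ID-to-name mapping are invented for illustration. The actual categories, IDs, and names have to be taken from the description file.

```python
import csv
import io

# Hypothetical mapping; the real IDs and names are listed in
# pixbox_sentinel2_cmix_20180425_description.txt.
CLOUD_CLASSES = {0: "clear", 1: "cloud"}


def decode_rows(csv_file, column, id_to_name):
    """Yield each CSV row with the integer in `column` replaced by its class name."""
    for row in csv.DictReader(csv_file):
        class_id = int(row[column])
        # Keep unmapped IDs visible instead of silently dropping them.
        row[column] = id_to_name.get(class_id, f"unknown({class_id})")
        yield row


# Stand-in for the real CSV; the column names are assumptions.
sample = io.StringIO("pixel_id,cloud_class\n1,1\n2,0\n3,7\n")
decoded = list(decode_rows(sample, "cloud_class", CLOUD_CLASSES))
```

For the real file, the `io.StringIO` stand-in would be replaced by the opened CSV, with one mapping per category as defined in the description file.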

29 Sentinel-2 L1C products in ZIP format.

 

References

[1] Copernicus Sentinel data 2017/2018

Files (22.0 GB)

Size      Checksum
1.5 MB    md5:c279032077e046518fa7e9618be9d81f
22.0 GB   md5:621c1185569e1976e12e7acc032273ce
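
A downloaded archive can be checked against the MD5 checksums listed above. The following Python sketch is one possible way to do this; the function name and the chunk size are arbitrary choices, and the local file path is whatever location the archive was saved to.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, read in chunks.

    Chunked reading keeps memory use low for large archives.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with a listed value such as 'md5:<hex>'."""
    expected_hex = expected.removeprefix("md5:").strip().lower()
    return md5_of_file(path) == expected_hex
```

Calling `matches_checksum` with the path of a downloaded ZIP file and the corresponding `md5:` string from the file list returns `True` if the download is intact. `str.removeprefix` requires Python 3.9 or later.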

Additional details

Related works

Is supplemented by
Dataset: 10.5281/zenodo.5040271 (DOI)