Training and validation data for artificial neural networks using three-dimensional partial convolutions to fill gaps in satellite image time series
Description
This dataset contains training and validation data for artificial neural networks using three-dimensional partial convolutions to fill gaps in satellite image time series. The data have been derived from Sentinel-5P total column carbon monoxide observations, using the offline processing stream.
Preprocessing
The following operations have been applied to the original S5P imagery:
- Images have been resampled to 0.1 by 0.1 degree spatial resolution
- Pixels with quality assessment value less than or equal to 0.5 have been set to NA
- Images have been aggregated by day of observation
- Images have been cropped to -60 to 60 degrees latitude
- Images have been divided into spatiotemporal blocks of size 128 x 128 pixels and 16 days
The imagery was recorded between 2021-01-01 and 2021-11-25. Note that both the training and the validation blocks have been randomly sampled from all available blocks.
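The quality masking and blocking steps above can be sketched as follows. This is a minimal illustration with NumPy, not the code used to produce the dataset: the function names, the (time, rows, cols) array layout, the use of NaN for NA, and the choice to drop incomplete edge blocks are all assumptions. The toy cube is much smaller than the real grid, which at 0.1 degree resolution between -60 and 60 degrees latitude would have 1200 rows (and 3600 columns if all longitudes are covered).

```python
import numpy as np

def mask_low_quality(values, qa, threshold=0.5):
    """Set pixels whose quality value is <= threshold to NaN (hypothetical helper)."""
    out = values.astype(float).copy()
    out[qa <= threshold] = np.nan
    return out

def split_into_blocks(cube, block_xy=128, block_t=16):
    """Split a (time, rows, cols) cube into non-overlapping blocks.

    Assumption: incomplete blocks at the edges are dropped.
    Returns a dict keyed by (row_index, col_index, time_index).
    """
    n_t, n_rows, n_cols = cube.shape
    blocks = {}
    for ti in range(n_t // block_t):
        for ri in range(n_rows // block_xy):
            for ci in range(n_cols // block_xy):
                blocks[(ri, ci, ti)] = cube[
                    ti * block_t:(ti + 1) * block_t,
                    ri * block_xy:(ri + 1) * block_xy,
                    ci * block_xy:(ci + 1) * block_xy,
                ]
    return blocks

# Toy data: 32 daily images of 256 x 384 pixels with random quality values.
rng = np.random.default_rng(0)
cube = rng.random((32, 256, 384))
qa = rng.random((32, 256, 384))

masked = mask_low_quality(cube, qa)
blocks = split_into_blocks(masked)
```

Each entry of `blocks` has shape (16, 128, 128), matching the block size stated above with time as the first axis.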
Data Format and Naming Conventions
Input and output data blocks are stored as GeoTIFF files, where bands represent time. The files follow these naming conventions:
- Files starting with X represent input measurements for training, where artificial gaps have been added.
- Files starting with Y represent true measurements without artificially added gaps (but still containing gaps in many cases).
- Files starting with MASK contain binary masks of the input data, where pixels with valid measurements are 1 and all others are 0.
- Files starting with VALMASK contain a binary mask where only pixels that are available in Y but not in X are 1. This mask is used for validation on artificially removed pixels only.
Numbers in filenames encode spatial and temporal block indexes.
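The relationship between the four file types can be illustrated with a small sketch. It assumes that a block has already been read into NumPy arrays of shape (time, rows, cols) and that missing values are represented as NaN; how nodata is actually encoded in the GeoTIFF files should be checked when reading them. The function names and the RMSE metric are illustrative choices, not part of the dataset.

```python
import numpy as np

def input_mask(x):
    """MASK: 1 where the input block X has a valid measurement, else 0."""
    return np.isfinite(x).astype(np.uint8)

def validation_mask(x, y):
    """VALMASK: 1 only where Y is available but X is not."""
    return (np.isfinite(y) & ~np.isfinite(x)).astype(np.uint8)

def masked_rmse(pred, y, valmask):
    """RMSE restricted to artificially removed pixels (hypothetical metric)."""
    sel = valmask.astype(bool)
    if not sel.any():
        return float("nan")
    return float(np.sqrt(np.mean((pred[sel] - y[sel]) ** 2)))

# Toy block with one time step and 2 x 2 pixels.
y = np.array([[[1.0, 2.0], [np.nan, 4.0]]])  # true data, one original gap
x = y.copy()
x[0, 0, 1] = np.nan                          # artificially added gap

mask = input_mask(x)
valmask = validation_mask(x, y)
pred = np.full_like(y, 3.0)                  # dummy prediction
rmse = masked_rmse(pred, y, valmask)
```

In this toy case only the artificially removed pixel contributes to the error; the original gap in Y is excluded because no true value exists there.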
In addition, the dataset contains predictions for the validation blocks in the `predictions` directory. The subfolders contain output from different models:
- mean refers to simple block-wise mean predictions.
- timeseries refers to simple linear time series interpolation.
- gapfill refers to the method proposed in [1].
- stmra refers to the method proposed in [2].
- STpconv refers to predictions based on an artificial neural network with three-dimensional partial convolutions.
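To make the two simplest baselines concrete, the following sketch shows one plausible reading of "block-wise mean" and "linear time series interpolation". It is not the code that produced the `mean` and `timeseries` predictions. It assumes blocks are NumPy arrays of shape (time, rows, cols) with NaN for gaps, and the function names are hypothetical. Note that `np.interp` holds the first and last valid values constant outside the observed time range, which is one of several possible choices for the series ends.

```python
import numpy as np

def predict_block_mean(x):
    """Fill every gap with the mean of all valid pixels in the block."""
    out = x.copy()
    valid = np.isfinite(x)
    if valid.any():
        out[~valid] = x[valid].mean()
    return out

def predict_linear_in_time(x):
    """Fill gaps by per-pixel linear interpolation along the time axis (axis 0).

    Pixels without any valid observation in the block stay NaN.
    """
    out = x.copy()
    t = np.arange(x.shape[0])
    for r in range(x.shape[1]):
        for c in range(x.shape[2]):
            series = x[:, r, c]
            valid = np.isfinite(series)
            if valid.any():
                out[~valid, r, c] = np.interp(t[~valid], t[valid], series[valid])
    return out

# Toy block: 4 time steps, 1 x 2 pixels; the second pixel is never observed.
x = np.array([
    [[1.0, np.nan]],
    [[np.nan, np.nan]],
    [[3.0, np.nan]],
    [[np.nan, np.nan]],
])

mean_pred = predict_block_mean(x)
lin_pred = predict_linear_in_time(x)
```

The toy example also shows a limitation of the time series baseline: a pixel with no valid observation in the block cannot be interpolated, whereas the block-wise mean always yields a value as long as the block contains at least one valid pixel.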
References
[1] Gerber, F., de Jong, R., Schaepman, M. E., Schaepman-Strub, G., & Furrer, R. (2018). Predicting missing values in spatio-temporal remote sensing data. IEEE Transactions on Geoscience and Remote Sensing, 56(5), 2841-2853.
[2] Appel, M., & Pebesma, E. (2020). Spatiotemporal multi-resolution approximations for analyzing global environmental data. Spatial Statistics, 38, 100465.
Files
| Name | Size |
|---|---|
| STpconv_data.zip (md5:bffcae6b9118ea61726f35b5dae0f4bf) | 2.0 GB |
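The MD5 checksum listed for STpconv_data.zip can be used to check that a download is complete. The sketch below uses only the Python standard library; the function names and the local file path in the usage comment are assumptions.

```python
import hashlib

# Checksum as listed for STpconv_data.zip.
EXPECTED_MD5 = "bffcae6b9118ea61726f35b5dae0f4bf"

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected checksum."""
    return md5_of_file(path) == expected

# Usage (hypothetical local path):
# verify_download("STpconv_data.zip")
```

Reading in chunks keeps memory use low, which matters for a 2.0 GB archive.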