SEN2VENµS, a dataset for the training of Sentinel-2 super-resolution algorithms
- 1. CESBIO, Université de Toulouse, CNES, CNRS, INRAE, IRD, UT3
1 Description
SEN2VENµS is an open dataset for the super-resolution of Sentinel-2 images by leveraging simultaneous acquisitions with the VENµS satellite. The dataset is composed of 10m and 20m cloud-free surface reflectance patches from Sentinel-2, with their reference spatially-registered surface reflectance patches at 5 meters resolution acquired on the same day by the VENµS satellite. This dataset covers 29 locations with a total of 132 955 patches of 256x256 pixels at 5 meters resolution, and can be used for the training of super-resolution algorithms to bring spatial resolution of 8 of the Sentinel-2 bands down to 5 meters.
Changelog with respect to version 1.0.0 (https://zenodo.org/records/6514159)
- All patches are now stored in individual GeoTIFF files with proper geo-referencing, grouped in zip files per site and per category,
- The dataset now includes 20 meter resolution SWIR bands B11 and B12 from Sentinel-2 (L2A from Theia). Note that there is no HR reference for those bands, since the VENµS sensor has no SWIR band.
2 Files organization
The dataset is composed of separate sub-datasets embedded in separate zip files, one for each site, as described in table 1. Note that there might be slight variations in the number of patches and the number of pairs with respect to version 1.0.0, due to an incorrect count of samples in the previous version (an empty tensor was still accounted for).
Table 1: Number of patches and pairs for each site, along with VENµS viewing zenith angle
Site | Number of patches | Number of pairs | VENµS Zenith Angle |
---|---|---|---|
FR-LQ1 | 4888 | 18 | 1.795402 |
NARYN | 3813 | 24 | 5.010906 |
FGMANAUS | 129 | 4 | 7.232127 |
MAD-AMBO | 1442 | 18 | 14.788115 |
ARM | 15859 | 39 | 15.160683 |
BAMBENW2 | 9018 | 34 | 17.766533 |
ES-IC3XG | 8822 | 34 | 18.807686 |
ANJI | 2312 | 14 | 19.310494 |
ATTO | 2258 | 9 | 22.048651 |
ESGISB-3 | 6057 | 19 | 23.683871 |
ESGISB-1 | 2891 | 12 | 24.561609 |
FR-BIL | 7105 | 30 | 24.802892 |
K34-AMAZ | 1384 | 20 | 24.982675 |
ESGISB-2 | 3067 | 13 | 26.209776 |
ALSACE | 2653 | 16 | 26.877071 |
LERIDA-1 | 2281 | 5 | 28.524780 |
ESTUAMAR | 911 | 12 | 28.871947 |
SUDOUE-5 | 2176 | 20 | 29.170244 |
KUDALIAR | 7269 | 20 | 29.180855 |
SUDOUE-6 | 2435 | 14 | 29.192055 |
SUDOUE-4 | 935 | 7 | 29.516127 |
SUDOUE-3 | 5363 | 14 | 29.998115 |
SO1 | 12018 | 36 | 30.255978 |
SUDOUE-2 | 9700 | 27 | 31.295256 |
ES-LTERA | 1701 | 19 | 31.971764 |
FR-LAM | 7299 | 22 | 32.054056 |
SO2 | 738 | 22 | 32.218481 |
BENGA | 5857 | 28 | 32.587334 |
JAM2018 | 2564 | 18 | 33.718953 |
Each site zip file contains a subfolder with the site name. This subfolder contains secondary zip files for each date, named after the pair id: {site_name}_{acquisition_date}_{mgrs_tile}. For each date, 5 zip files are available, as shown in table 2. Each zip file contains a subfolder {bands}/{resolution}/ in which one GeoTIFF file per patch is stored, with the following naming convention: {site_name}_{idx}_{acquisition_date}_{mgrs_tile}_{bands}_{resolution}.tif. Pixel values are encoded as 16-bit signed integers and should be converted back to floating point surface reflectance by dividing every value by 10 000 upon reading.
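The naming convention and the reflectance scaling described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset tooling: the helper names `parse_patch_name` and `to_reflectance` are hypothetical, the example file name uses invented idx, date and tile values, and the raw values are given as a plain list rather than read from a GeoTIFF.

```python
from dataclasses import dataclass

# Stated in the text: stored integers are reflectance multiplied by 10 000.
REFLECTANCE_SCALE = 10_000


@dataclass(frozen=True)
class PatchName:
    site_name: str
    idx: str
    acquisition_date: str
    mgrs_tile: str
    bands: str
    resolution: str


def parse_patch_name(filename: str) -> PatchName:
    """Split a patch file name into the six fields of the naming convention."""
    basename = filename.rsplit("/", 1)[-1]
    if not basename.endswith(".tif"):
        raise ValueError(f"not a .tif patch name: {filename!r}")
    stem = basename[: -len(".tif")]
    # Split from the right so a site name containing "_" would stay whole.
    parts = stem.rsplit("_", 5)
    if len(parts) != 6:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return PatchName(*parts)


def to_reflectance(raw_values):
    """Convert stored 16-bit integers to floating point surface reflectance."""
    return [value / REFLECTANCE_SCALE for value in raw_values]


# Hypothetical file name: the idx, date and tile values are invented.
example = parse_patch_name("FR-LQ1_000001_20180405_31TCJ_b2b3b4b8_10m.tif")
print(example.site_name, example.bands, example.resolution)
print(to_reflectance([0, 5000, 10000]))
```

With a raster library, the same division would be applied to the whole array after reading it as floating point.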
Table 2: Naming convention for zip files associated to each date.
File | Content |
---|---|
{id}_05m_b2b3b4b8.zip | 5m patches (\(256\times256\) pix.) for S2 B2, B3, B4 and B8 (from VENµS) |
{id}_10m_b2b3b4b8.zip | 10m patches (\(128\times128\) pix.) for S2 B2, B3, B4 and B8 (from Sentinel-2) |
{id}_05m_b5b6b7b8a.zip | 5m patches (\(256\times256\) pix.) for S2 B5, B6, B7 and B8A (from VENµS) |
{id}_20m_b5b6b7b8a.zip | 20m patches (\(64\times64\) pix.) for S2 B5, B6, B7 and B8A (from Sentinel-2) |
{id}_20m_b11b12.zip | 20m patches (\(64\times64\) pix.) for S2 B11 and B12 (from Sentinel-2) |
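The patch sizes in table 2 are consistent with a common ground footprint: each patch covers the same area at every resolution, so a super-resolution model maps 10m inputs to 5m targets with a factor of 2, and 20m inputs with a factor of 4. The short sketch below checks this arithmetic; the dictionary and function names are illustrative only.

```python
# (pixel size in metres, patch width in pixels), as listed in table 2.
PATCH_GRID = {"05m": (5, 256), "10m": (10, 128), "20m": (20, 64)}


def footprint_m(resolution: str) -> int:
    """Ground width of a patch in metres: pixel size times patch width."""
    pixel_size, width = PATCH_GRID[resolution]
    return pixel_size * width


def upsampling_factor(source: str, target: str = "05m") -> int:
    """Integer factor between the source and target pixel sizes."""
    return PATCH_GRID[source][0] // PATCH_GRID[target][0]


for resolution in PATCH_GRID:
    print(resolution, footprint_m(resolution), "m")
print("10m -> 5m factor:", upsampling_factor("10m"))
print("20m -> 5m factor:", upsampling_factor("20m"))
```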
Each site zip file comes with a master index.csv file (CSV, Comma Separated Values), with one row for each pair sampled in the given site. Columns are named after the {bands}_{resolution} pattern, and contain the full path to the corresponding GeoTIFF within the corresponding zip file:
{site}_{acquisition_date}_{mgrs_tile}_{bands}_{resolution}.zip/{bands}/{resolution}/{site}_{idx}_{acquisition_date}_{mgrs_tile}_{bands}_{resolution}.tif
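As an illustration of how the index could be consumed, the sketch below reads rows with the standard `csv` module and splits each path into its zip archive and the member inside that archive. It assumes a comma delimiter and a header row holding the {bands}_{resolution} column names; the sample content is hypothetical, with SITE, DATE, TILE and 1 standing in for real values, and the helper names are not part of the dataset.

```python
import csv
import io


def split_index_path(path):
    """Split '<archive>.zip/<member>' into (archive, member)."""
    marker = ".zip/"
    position = path.find(marker)
    if position < 0:
        raise ValueError(f"no zip archive component in {path!r}")
    archive = path[: position + len(".zip")]
    member = path[position + len(marker):]
    return archive, member


def read_index(stream, delimiter=","):
    """Return one dict per row, mapping column name to (archive, member)."""
    reader = csv.DictReader(stream, delimiter=delimiter)
    return [
        {column: split_index_path(path) for column, path in row.items()}
        for row in reader
    ]


# Hypothetical one-row index; real files hold one row per sampled pair.
SAMPLE_INDEX = (
    "b2b3b4b8_10m,b2b3b4b8_05m\n"
    "SITE_DATE_TILE_b2b3b4b8_10m.zip/b2b3b4b8/10m/SITE_1_DATE_TILE_b2b3b4b8_10m.tif,"
    "SITE_DATE_TILE_b2b3b4b8_05m.zip/b2b3b4b8/05m/SITE_1_DATE_TILE_b2b3b4b8_05m.tif\n"
)

rows = read_index(io.StringIO(SAMPLE_INDEX))
archive, member = rows[0]["b2b3b4b8_10m"]
print(archive)
print(member)
```

The member part could then be opened with `zipfile.ZipFile(archive).open(member)`, provided the archive's internal layout matches the path given in the index.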
3 Licensing
3.1 Sentinel-2 patches
3.1.1 Copyright
Value-added data processed by CNES for the Theia data centre www.theia-land.fr using Copernicus products. The processing uses algorithms developed by Theia's Scientific Expertise Centres. Note: Copernicus Sentinel-2 Level 1C data is subject to this license: https://theia.cnes.fr/atdistrib/documents/TC_Sentinel_Data_31072014.pdf
3.1.2 Licence
Files *_b2b3b4b8_10m.tif, *_b5b6b7b8a_20m.tif and *_b11b12_20m.tif are distributed under the original licence of the Sentinel-2 Theia L2A products, which is the Etalab Open Licence Version 2.0 (see Footnotes).
3.2 VENµS patches
3.2.1 Copyright
Value-added data processed by CNES for the Theia data centre www.theia-land.fr using VENµS satellite imagery from CNES and Israeli Space Agency. The processing uses algorithms developed by Theia's Scientific Expertise Centres.
3.2.2 Licence
Files *_b2b3b4b8_05m.tif and *_b5b6b7b8a_05m.tif are distributed under the original licence of the VENµS products, which is Creative Commons BY-NC 4.0.
3.3 Remaining files
All remaining files are distributed under the Creative Commons BY 4.0 licence.
4 Note to users
Note that even if the SEN2VENµS dataset is sorted by sites and by pairs, we strongly encourage users to apply the full set of machine learning best practices when using it: keeping separate pairs (or even sites) for testing purposes, and randomizing patches across sites and pairs in the training and validation sets.
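One way to follow this advice is to hold out whole pairs for testing and shuffle the remaining patches. The sketch below is only an example of such a split, not a prescribed protocol: the function name, the 20% test fraction, the fixed seed and the toy pair ids are all assumptions made for illustration.

```python
import random


def split_by_pair(patches, test_fraction=0.2, seed=0):
    """Hold out whole pairs for testing, then shuffle the training patches.

    `patches` is a list of (pair_id, patch_path) tuples.
    """
    pair_ids = sorted({pair_id for pair_id, _ in patches})
    rng = random.Random(seed)
    rng.shuffle(pair_ids)
    n_test = max(1, round(len(pair_ids) * test_fraction))
    test_pairs = set(pair_ids[:n_test])
    train = [patch for patch in patches if patch[0] not in test_pairs]
    test = [patch for patch in patches if patch[0] in test_pairs]
    # Mix patches across pairs (and sites) in the training set.
    rng.shuffle(train)
    return train, test


# Toy data: 10 hypothetical pairs with 3 patches each.
toy_patches = [
    (f"pair{p:02d}", f"pair{p:02d}_patch{i}.tif")
    for p in range(10)
    for i in range(3)
]
train_set, test_set = split_by_pair(toy_patches)
print(len(train_set), len(test_set))
```

Passing site names instead of pair ids as the first tuple element gives a site-level hold-out with the same function.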
5 Citing
Please cite the following data paper (published in MDPI Data) and the Zenodo link when publishing work derived from this dataset:
Michel, J.; Vinasco-Salinas, J.; Inglada, J.; Hagolle, O. SEN2VENµS, a Dataset for the Training of Sentinel-2 Super-Resolution Algorithms. Data 2022, 7, 96. https://doi.org/10.3390/data7070096
Footnotes:
https://theia.cnes.fr/atdistrib/documents/Licence-Theia-CNES-Sentinel-ETALAB-v2.0-en.pdf
Files (139.5 GB)
MD5 checksum | Size |
---|---|
a5e7d8529843c24a1ee8fd9bb6ba83b1 | 2.8 GB |
99c1cd576f15e7b5f2ab12fcf5fb52ba | 2.4 GB |
e053ca18e6c7560fb4c3ac5d96895e4e | 16.6 GB |
5fd54493f9af08171a5efaafa1f80bec | 2.1 GB |
a61e7fe0da5ea5def9053083478cd4e8 | 9.8 GB |
78fca1574a71fb98d0c631fec8019c03 | 5.9 GB |
deafdf545acd6ed0ddf868ce623d5d31 | 9.3 GB |
2a1ea3a6fc1627fc865e83bea9c535b7 | 1.8 GB |
96c9b62f28f2699aa11badb5adf7d38a | 3.1 GB |
65b5dab4e25c105edc7c54f43d103766 | 3.1 GB |
c973683f0fb354e7fb0126af7b30533e | 6.1 GB |
422713627defb78ee519eff2dc6537c3 | 889.8 MB |
8b8e2bf764721c8b4a4ee88c76d3ca96 | 121.2 MB |
ba5e37e45f7ec6fb0a261386609e909f | 7.2 GB |
73f4eb6a322fd14ffac371dd0202ba18 | 7.9 GB |
bf2933619a1aa27308a3b5f63757f7ca | 5.2 GB |
556b325631af667469058947b3bc91be | 2.7 GB |
19fdf2eac804fd9314393f26b6f3c62e | 1.3 GB |
5e80cb602de2c51f3a6fb3bd69c9ce1c | 7.9 GB |
7a6951f1c29a0eb61ab05e137bfbf5ea | 2.4 GB |
1acd0359cb65c21490beb7a835581e42 | 1.5 GB |
7c7e63cec125dad00f5b61bb4082d90f | 4.0 GB |
9e5a6e161774fa75e4fb48d0a08e5f9b | 12.9 GB |
69b47a2618c902287047174b461d9d18 | 789.7 MB |
416dcc0a2892c36f3567217ac70cf7ff | 10.5 GB |
877d6ee65c264b4fdfff917235a25efd | 5.7 GB |
ade5b9a74cb3360f864d525a664821d4 | 954.8 MB |
f8f754a954f0d99bee1fc8b71ce8d466 | 2.3 GB |
47a2a0dc11f29f6a145cdd301c6c576b | 2.6 GB |
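Since the archives are large, it can be worth checking a download against its MD5 checksum before unpacking. The following is a minimal sketch using Python's `hashlib`; the function names are hypothetical, and the expected value to pass is the checksum listed for the archive that was downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

Reading in chunks keeps memory use constant, which matters for archives of several gigabytes.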
Additional details
Related works
- Is documented by
- Journal article: 10.3390/data7070096 (DOI)