Published January 17, 2025 | Version 2.0.0
Dataset Open

SEN2VENµS, a dataset for the training of Sentinel-2 super-resolution algorithms

  • 1. CESBIO, Université de Toulouse, CNES, CNRS, INRAE, IRD, UT3

Description

1 Description

SEN2VENµS is an open dataset for the super-resolution of Sentinel-2 images by leveraging simultaneous acquisitions with the VENµS satellite. The dataset is composed of 10m and 20m cloud-free surface reflectance patches from Sentinel-2, with their reference spatially-registered surface reflectance patches at 5 meters resolution acquired on the same day by the VENµS satellite. This dataset covers 29 locations with a total of 132 955 patches of 256x256 pixels at 5 meters resolution, and can be used for the training of super-resolution algorithms to bring spatial resolution of 8 of the Sentinel-2 bands down to 5 meters.

Changelog with respect to version 1.0.0 (https://zenodo.org/records/6514159)

  • All patches are now stored in individual GeoTIFF files with proper geo-referencing, grouped in zip files per site and per category,
  • The dataset now includes 20 meter resolution SWIR bands B11 and B12 from Sentinel-2 (L2A from Theia). Note that there is no HR reference for those bands, since the VENµS sensor has no SWIR band.

2 Files organization

The dataset is composed of separate sub-datasets embedded in separate zip files, one for each site, as described in Table 1. Note that the number of patches and the number of pairs may differ slightly from version 1.0.0, due to an incorrect sample count in the previous version (an empty tensor was still counted).

Table 1: Number of patches and pairs for each site, along with VENµS viewing zenith angle

Site Number of patches Number of pairs VENµS Zenith Angle
FR-LQ1 4888 18 1.795402
NARYN 3813 24 5.010906
FGMANAUS 129 4 7.232127
MAD-AMBO 1442 18 14.788115
ARM 15859 39 15.160683
BAMBENW2 9018 34 17.766533
ES-IC3XG 8822 34 18.807686
ANJI 2312 14 19.310494
ATTO 2258 9 22.048651
ESGISB-3 6057 19 23.683871
ESGISB-1 2891 12 24.561609
FR-BIL 7105 30 24.802892
K34-AMAZ 1384 20 24.982675
ESGISB-2 3067 13 26.209776
ALSACE 2653 16 26.877071
LERIDA-1 2281 5 28.524780
ESTUAMAR 911 12 28.871947
SUDOUE-5 2176 20 29.170244
KUDALIAR 7269 20 29.180855
SUDOUE-6 2435 14 29.192055
SUDOUE-4 935 7 29.516127
SUDOUE-3 5363 14 29.998115
SO1 12018 36 30.255978
SUDOUE-2 9700 27 31.295256
ES-LTERA 1701 19 31.971764
FR-LAM 7299 22 32.054056
SO2 738 22 32.218481
BENGA 5857 28 32.587334
JAM2018 2564 18 33.718953

 

Each site zip file contains a subfolder with the site name. This subfolder contains secondary zip files for each date, named after the pair id, which follows this convention: {site_name}_{acquisition_date}_{mgrs_tile}. For each date, 5 zip files are available, as shown in Table 2. Each zip file contains a subfolder {bands}/{resolution}/ in which one GeoTIFF file per patch is stored, with the following naming convention: {site_name}_{idx}_{acquisition_date}_{mgrs_tile}_{bands}_{resolution}.tif. Pixel values are encoded as 16-bit signed integers and should be converted back to floating-point surface reflectance by dividing every value by 10 000 upon reading.
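The two conventions above (patch file naming and reflectance scaling) can be sketched in a few lines of Python. This is only an illustration using the standard library: the helper names parse_patch_name and to_reflectance are hypothetical, and the parser assumes that no field of the file name contains an underscore, which the description does not guarantee.

```python
from typing import List, NamedTuple, Sequence

# Stated in the description: stored integer / 10 000 = surface reflectance.
REFLECTANCE_SCALE = 10_000


class PatchName(NamedTuple):
    """Fields of {site_name}_{idx}_{acquisition_date}_{mgrs_tile}_{bands}_{resolution}.tif"""
    site_name: str
    idx: str
    acquisition_date: str
    mgrs_tile: str
    bands: str
    resolution: str


def parse_patch_name(filename: str) -> PatchName:
    """Split a patch file name into its six fields.

    Assumption: fields never contain '_' themselves, so a plain split works.
    """
    stem, dot, ext = filename.rpartition(".")
    if not dot or ext.lower() != "tif":
        raise ValueError(f"not a .tif file name: {filename!r}")
    parts = stem.split("_")
    if len(parts) != 6:
        raise ValueError(f"expected 6 underscore-separated fields: {filename!r}")
    return PatchName(*parts)


def to_reflectance(stored: Sequence[int]) -> List[float]:
    """Convert stored 16-bit integer values to floating-point reflectance."""
    return [value / REFLECTANCE_SCALE for value in stored]
```

When reading patches with a raster library, the same division would typically be applied to the whole array at once after casting it to a floating-point type.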

Table 2: Naming convention for zip files associated to each date.

File Content
{id}_05m_b2b3b4b8.zip 5m patches (256x256 pix.) for S2 B2, B3, B4 and B8 (from VENµS)
{id}_10m_b2b3b4b8.zip 10m patches (128x128 pix.) for S2 B2, B3, B4 and B8 (from Sentinel-2)
{id}_05m_b5b6b7b8a.zip 5m patches (256x256 pix.) for S2 B5, B6, B7 and B8A (from VENµS)
{id}_20m_b5b6b7b8a.zip 20m patches (64x64 pix.) for S2 B5, B6, B7 and B8A (from Sentinel-2)
{id}_20m_b11b12.zip 20m patches (64x64 pix.) for S2 B11 and B12 (from Sentinel-2)
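One point in Table 2 is worth spelling out: the patch sizes are such that every product has the same side length on the ground (256 px at 5 m, 128 px at 10 m and 64 px at 20 m all span 1280 m). The sketch below builds the five zip names for a pair id following the pattern of Table 2 and computes that side length. The names PRODUCTS, zip_names and side_lengths_m are hypothetical helpers, not part of the dataset.

```python
from typing import Dict, List, Tuple

# (resolution tag, bands tag, pixel size in metres, patch width in pixels),
# transcribed from Table 2.
PRODUCTS: Tuple[Tuple[str, str, int, int], ...] = (
    ("05m", "b2b3b4b8", 5, 256),
    ("10m", "b2b3b4b8", 10, 128),
    ("05m", "b5b6b7b8a", 5, 256),
    ("20m", "b5b6b7b8a", 20, 64),
    ("20m", "b11b12", 20, 64),
)


def zip_names(pair_id: str) -> List[str]:
    """Return the five zip file names of Table 2 for one pair id."""
    return [f"{pair_id}_{res}_{bands}.zip" for res, bands, _, _ in PRODUCTS]


def side_lengths_m() -> Dict[str, int]:
    """Ground side length of a patch, in metres, for each product."""
    return {f"{res}_{bands}": metres * pixels for res, bands, metres, pixels in PRODUCTS}
```

The resulting scale factors are 2 between the 10 m inputs and their 5 m reference, and 4 between the 20 m inputs and their 5 m reference.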

 

Each site zip file comes with a master index.csv CSV (Comma Separated Values) file, with one row for each pair sampled in the given site. Columns are named after the {bands}_{resolution} pattern, and contain the full path to the corresponding GeoTIFF within the corresponding zip file:

{site}_{acquisition_date}_{mgrs_tile}_{bands}_{resolution}.zip/{bands}/{resolution}/{site}_{idx}_{acquisition_date}_{mgrs_tile}_{bands}_{resolution}.tif
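A path of this form mixes two levels: the name of the per-date zip file and the member inside it. The following sketch separates the two and applies the split to every cell of an index file. It assumes that index.csv is comma-delimited with a header row, which should be checked against the actual file; split_index_path and iter_index_rows are hypothetical helper names.

```python
import csv
from typing import Dict, IO, Iterator, Tuple


def split_index_path(path: str) -> Tuple[str, str]:
    """Split '<archive>.zip/<member>' into (archive name, member path)."""
    marker = ".zip/"
    pos = path.find(marker)
    if pos < 0:
        raise ValueError(f"no '.zip/' component in {path!r}")
    archive = path[: pos + len(".zip")]
    member = path[pos + len(marker):]
    return archive, member


def iter_index_rows(stream: IO[str]) -> Iterator[Dict[str, Tuple[str, str]]]:
    """Yield one dict per row, mapping each column name to (archive, member).

    Assumption: comma-delimited text with a header row of column names.
    """
    for row in csv.DictReader(stream):
        yield {column: split_index_path(path) for column, path in row.items()}
```

With the archive name and member path in hand, the standard zipfile module can open the member directly (zipfile.ZipFile(...).open(member)) without extracting the whole archive.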

3 Licensing

3.1 Sentinel-2 patches

3.1.1 Copyright

Value-added data processed by CNES for the Theia data centre www.theia-land.fr using Copernicus products. The processing uses algorithms developed by Theia's Scientific Expertise Centres. Note: Copernicus Sentinel-2 Level 1C data is subject to this license: https://theia.cnes.fr/atdistrib/documents/TC_Sentinel_Data_31072014.pdf

3.1.2 Licence

Files *_b2b3b4b8_10m.tif, *_b5b6b7b8a_20m.tif and *_b11b12_20m.tif are distributed under the original licence of the Sentinel-2 Theia L2A products, which is the Etalab Open Licence Version 2.0 (footnote 2).

3.2 VENµS patches

3.2.1 Copyright

Value-added data processed by CNES for the Theia data centre www.theia-land.fr using VENµS satellite imagery from CNES and Israeli Space Agency. The processing uses algorithms developed by Theia's Scientific Expertise Centres.

3.2.2 Licence

Files *_b2b3b4b8_05m.tif and *_b5b6b7b8a_05m.tif are distributed under the original licence of the VENµS products, which is Creative Commons BY-NC 4.0 (footnote 3).

3.3 Remaining files

All remaining files are distributed under the Creative Commons BY 4.0 licence (footnote 4).

4 Note to users

Note that even though the SEN2VENµS dataset is sorted by sites and by pairs, we strongly encourage users to apply the full set of machine learning best practices when using it: keeping randomly selected pairs (or even sites) separate for testing purposes, and randomizing patches across sites and pairs in the training and validation sets.
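As an illustration of this advice, the sketch below holds out whole pairs for testing and shuffles the remaining patches. It is one possible way to do it, not a procedure prescribed by the dataset: the function name split_by_pair, the (pair_id, patch_path) record layout, the default 20 % test fraction and the seed are all assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical record layout: (pair_id, patch_path).
Patch = Tuple[str, str]


def split_by_pair(
    patches: Sequence[Patch],
    test_fraction: float = 0.2,  # arbitrary example value
    seed: int = 0,
) -> Tuple[List[Patch], List[Patch]]:
    """Hold out randomly chosen pairs for testing, shuffle the rest.

    No pair contributes patches to both the training and the test set.
    """
    pair_ids = sorted({pair_id for pair_id, _ in patches})
    if len(pair_ids) < 2:
        raise ValueError("need at least two pairs to hold one out")
    rng = random.Random(seed)
    rng.shuffle(pair_ids)
    # Keep at least one pair on each side of the split.
    n_test = min(len(pair_ids) - 1, max(1, round(len(pair_ids) * test_fraction)))
    held_out = set(pair_ids[:n_test])
    train = [p for p in patches if p[0] not in held_out]
    test = [p for p in patches if p[0] in held_out]
    rng.shuffle(train)  # mix patches across sites and pairs
    return train, test
```

Passing the site name instead of the pair id as the first element of each record gives a split that holds out entire sites. A validation set can be obtained by applying the same function once more to the training records.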

5 Citing

Please cite the following data paper (published in MDPI Data) and Zenodo link when publishing work derived from this dataset:

Michel, J.; Vinasco-Salinas, J.; Inglada, J.; Hagolle, O. SEN2VENµS, a Dataset for the Training of Sentinel-2 Super-Resolution Algorithms. Data 2022, 7, 96. https://doi.org/10.3390/data7070096

10.5281/zenodo.14603764

Footnotes:

1. https://pytorch.org/
2. https://theia.cnes.fr/atdistrib/documents/Licence-Theia-CNES-Sentinel-ETALAB-v2.0-en.pdf
3. https://creativecommons.org/licenses/by-nc/4.0/
4. https://creativecommons.org/licenses/by/4.0/

Files

Files (139.5 GB)

md5:a5e7d8529843c24a1ee8fd9bb6ba83b1 2.8 GB
md5:99c1cd576f15e7b5f2ab12fcf5fb52ba 2.4 GB
md5:e053ca18e6c7560fb4c3ac5d96895e4e 16.6 GB
md5:5fd54493f9af08171a5efaafa1f80bec 2.1 GB
md5:a61e7fe0da5ea5def9053083478cd4e8 9.8 GB
md5:78fca1574a71fb98d0c631fec8019c03 5.9 GB
md5:deafdf545acd6ed0ddf868ce623d5d31 9.3 GB
md5:2a1ea3a6fc1627fc865e83bea9c535b7 1.8 GB
md5:96c9b62f28f2699aa11badb5adf7d38a 3.1 GB
md5:65b5dab4e25c105edc7c54f43d103766 3.1 GB
md5:c973683f0fb354e7fb0126af7b30533e 6.1 GB
md5:422713627defb78ee519eff2dc6537c3 889.8 MB
md5:8b8e2bf764721c8b4a4ee88c76d3ca96 121.2 MB
md5:ba5e37e45f7ec6fb0a261386609e909f 7.2 GB
md5:73f4eb6a322fd14ffac371dd0202ba18 7.9 GB
md5:bf2933619a1aa27308a3b5f63757f7ca 5.2 GB
md5:556b325631af667469058947b3bc91be 2.7 GB
md5:19fdf2eac804fd9314393f26b6f3c62e 1.3 GB
md5:5e80cb602de2c51f3a6fb3bd69c9ce1c 7.9 GB
md5:7a6951f1c29a0eb61ab05e137bfbf5ea 2.4 GB
md5:1acd0359cb65c21490beb7a835581e42 1.5 GB
md5:7c7e63cec125dad00f5b61bb4082d90f 4.0 GB
md5:9e5a6e161774fa75e4fb48d0a08e5f9b 12.9 GB
md5:69b47a2618c902287047174b461d9d18 789.7 MB
md5:416dcc0a2892c36f3567217ac70cf7ff 10.5 GB
md5:877d6ee65c264b4fdfff917235a25efd 5.7 GB
md5:ade5b9a74cb3360f864d525a664821d4 954.8 MB
md5:f8f754a954f0d99bee1fc8b71ce8d466 2.3 GB
md5:47a2a0dc11f29f6a145cdd301c6c576b 2.6 GB
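The md5 checksums listed above can be used to verify a download. The sketch below hashes a file in chunks, which matters for archives of several gigabytes, and compares the result with a listed value. The function names are hypothetical, and since the listing above does not say which checksum belongs to which archive, the expected value has to be supplied by the caller.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal md5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file against a listed checksum such as 'md5:a5e7...'."""
    expected = listed[len("md5:"):] if listed.startswith("md5:") else listed
    return md5_of_file(path) == expected.strip().lower()
```

A mismatch usually indicates an incomplete or corrupted download, in which case the archive should be fetched again before unzipping.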

Additional details

Related works

Is documented by
Journal article: 10.3390/data7070096 (DOI)