Published June 10, 2024 | Version 1.0.0
Dataset Open

BreizhSR: multi-temporal cross-sensor super-resolution of satellite imagery

  • 1. Institut de Recherche en Informatique et Systèmes Aléatoires
  • 2. Image Processing Laboratory (IPL), University of Valencia
  • 3. Institut national de l’information géographique et forestière
  • 4. Conservatoire National des Arts et Métiers
  • 5. Centre d'Etudes et De Recherche en Informatique et Communications
  • 6. Laboratoire en Sciences et Technologies de l'Information Géographique pour la ville intelligente et les territoires durables

Description

BreizhSR, a super-resolution Sentinel-2 to SPOT-6/7 dataset 

1. Dataset motivation

BreizhSR is a dataset targeting super-resolution of the RGB bands of Sentinel-2 images by providing time series colocated in space and time with SPOT-6/7 acquisitions. It is composed of cloud-free Sentinel-2 time series (visible bands at 10 m resolution) and SPOT-6/7 pansharpened color images resampled to 2.5 m resolution. The study area is the region of Brittany (Breizh in the local language), located on the northwestern coast of France, with an oceanic climate. The dataset covers about 35 000 km², mostly agricultural areas (about 80 %). All acquisitions are from 2018.

2. Dataset organization

The dataset folder follows the structure detailed below:

BreizhSR
├── dataset_test.pkl
├── dataset_train.pkl
├── README.md
├── x
├── x_test
├── y
└── y_test

The README.md file contains the same information as this description.

Image patches are stored in the x and x_test folders for Sentinel-2 patches, and in the y and y_test folders for ground truth SPOT patches. Subfolders are named with an integer identifier (e.g. 8355) that denotes the series. For example, the SPOT patch corresponding to the S2 series x/8355 is in subfolder y/8355.
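The pairing rule above can be sketched in a few lines of Python. This is a minimal illustration, not code from the dataset authors: the function name `paired_series` and its `test` flag are hypothetical, and the sketch only matches subfolder names, since the file format inside each subfolder is not specified here.

```python
from pathlib import Path


def paired_series(root, test=False):
    """Map each series identifier to its (Sentinel-2, SPOT) subfolders.

    Hypothetical helper: assumes the layout x/<id> and y/<id>
    (or x_test/<id> and y_test/<id>) described in this section.
    """
    root = Path(root)
    suffix = "_test" if test else ""
    x_dir = root / f"x{suffix}"
    y_dir = root / f"y{suffix}"

    pairs = {}
    for sub in sorted(x_dir.iterdir()):
        # Keep only subfolders whose name is an integer series identifier.
        if sub.is_dir() and sub.name.isdigit():
            target = y_dir / sub.name
            # Skip series that have no matching SPOT subfolder.
            if target.is_dir():
                pairs[int(sub.name)] = (sub, target)
    return pairs
```

Calling `paired_series("BreizhSR")` would return a dictionary keyed by series identifier, and `paired_series("BreizhSR", test=True)` would do the same for the test folders.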

This organization and additional metadata are described in two pandas DataFrames: dataset_train.pkl and dataset_test.pkl. These files are DataFrames serialized with Python's pickle protocol. The columns available in these DataFrames are described in the table below.

Column             Description
x                  Latitude of the center point (expressed in Lambert 93 CRS)
y                  Longitude of the center point (expressed in Lambert 93 CRS)
wkt                Area of interest geometry in well-known text format
spot6_name         Path to the SPOT ground truth
sen2_acquisitions  Paths to the Sentinel-2 input series
dates_sen2         Acquisition dates for the Sentinel-2 images
dates_spot6        Acquisition date for the SPOT ground truth
split              `train` or `test`
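As a hedged sketch of how the index files might be read, the snippet below loads one pickle with `pandas.read_pickle` and checks that the columns listed above are present. The helper name `load_index` and the constant `EXPECTED_COLUMNS` are illustrative, and the cell types (for instance whether `sen2_acquisitions` holds a list of paths) are not stated in this description. Because unpickling can execute arbitrary code, only load files obtained from a trusted source.

```python
import pandas as pd

# Column names as listed in the description of the index files.
EXPECTED_COLUMNS = [
    "x", "y", "wkt", "spot6_name",
    "sen2_acquisitions", "dates_sen2", "dates_spot6", "split",
]


def load_index(pkl_path):
    """Load a pickled DataFrame and verify the documented columns exist.

    Hypothetical helper, not part of the dataset or its repository.
    """
    df = pd.read_pickle(pkl_path)
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns in {pkl_path}: {missing}")
    return df
```

For example, `load_index("BreizhSR/dataset_train.pkl")` would return the training index, from which each row's `spot6_name` and `sen2_acquisitions` entries point to the ground truth and input series.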

3. Data collection and preprocessing

Sentinel-2

The Sentinel-2 constellation consists of twin satellites, launched by the European Space Agency (ESA) in 2015 and 2017, that cover all of Earth's land surfaces every five days at the equator. Level-2A images of the BreizhSR dataset are gathered via the THEIA platform, which employs the MAJA pre-processing algorithm to obtain atmospherically corrected ground reflectance. To match the SPOT-6 spectral characteristics, only the RGB bands at 10-meter spatial resolution (B4, B3, and B2) are used. The images were collected for the nine tiles covering the Brittany region from the 1st of April 2018 to the 31st of August 2018, keeping only images with a cloud cover under 5 %. Since the SPOT-6 data was acquired in the summer of 2018, the Sentinel-2 time period was chosen to include images from before and after the SPOT-6 acquisitions while staying within similar seasonal and climate conditions.

Sentinel-2 tiles are cropped into 3x74x74 patches. The dataset is preprocessed with a min-max normalization that uses the 2nd and 98th percentiles as estimates of the minimum and maximum values of the Sentinel-2 data, which limits the influence of outliers caused by artifacts such as clouds and their shadows.
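The percentile-based normalization can be illustrated with NumPy. This is a sketch under stated assumptions rather than the authors' preprocessing code: it computes the percentiles per band on a single patch (the description does not say whether statistics are taken per patch, per tile, or over the whole dataset), and clipping the result to [0, 1] is also an assumption. The function name `percentile_minmax` is hypothetical.

```python
import numpy as np


def percentile_minmax(image, low=2.0, high=98.0):
    """Min-max normalize a (bands, height, width) array using percentiles.

    Assumption: statistics are computed per band over the given array,
    and values outside the percentile range are clipped to [0, 1].
    """
    image = np.asarray(image, dtype=np.float32)
    # One (low, high) pair per band; shapes are (bands, 1, 1) for broadcasting.
    lo, hi = np.percentile(image, [low, high], axis=(1, 2), keepdims=True)
    # Guard against a zero range on constant bands.
    scaled = (image - lo) / np.maximum(hi - lo, 1e-6)
    return np.clip(scaled, 0.0, 1.0)
```

Using percentiles instead of the raw minimum and maximum means that a few very bright cloud pixels or very dark shadow pixels saturate at 0 or 1 instead of compressing the range of every other pixel.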

SPOT-6/7

Orthorectified SPOT data released under the Licence Ouverte are collected from the DINAMIS platform. Multispectral images at 6 m resolution are pansharpened with the 1.5 m panchromatic band as reference, using the RCS algorithm of the Orfeo ToolBox, which is similar to the Brovey pansharpening algorithm. The pansharpened tiles are preprocessed with a min-max normalization and downsampled to 2.5 m resolution, and patches are finally cropped with dimensions 3x296x296.
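The patch sizes and resolutions given in this section imply that each Sentinel-2 patch and its SPOT counterpart cover the same ground footprint, with a super-resolution factor of 4. The short check below works through that arithmetic; the constant and function names are illustrative only.

```python
# Patch sizes (pixels) and resolutions (meters) as stated in this section.
S2_SIZE, S2_RES = 74, 10.0
SPOT_SIZE, SPOT_RES = 296, 2.5


def footprint_m(size_px, resolution_m):
    """Side length on the ground, in meters, of a square patch."""
    return size_px * resolution_m


s2_side = footprint_m(S2_SIZE, S2_RES)        # 74 * 10.0 = 740.0 m
spot_side = footprint_m(SPOT_SIZE, SPOT_RES)  # 296 * 2.5 = 740.0 m
scale = SPOT_SIZE // S2_SIZE                  # 296 / 74 = 4
```

Both patches therefore span 740 m per side, so a model trained on this dataset maps a 74x74 input to a 296x296 output.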

4. License

SPOT images and the Sentinel-2 Theia L2A products are released under the Licence Ouverte 2.0 from the French government. This dataset contains modified Copernicus Sentinel data from 2018, made available under free access by EU law. Other files in the dataset are licensed under Creative Commons Attribution 4.0 (CC BY 4.0).

Acknowledgements

We thank GDR IASIS for funding this work under the SESURE project, the DINAMIS consortium, CNES/Airbus and IGN for access to the SPOT-6 data, and ESA for access to Sentinel-2 data. During this research, Simon Donike received a European scholarship to take part in the Master Copernicus in Digital Earth, an Erasmus Mundus Joint Master Degree (EMJMD). We thank Dirk Tiede (Uni. Salzburg) for his help and feedback on BreizhSR. This work was performed using HPC resources from GENCI–IDRIS (grant 2022-AD011013003).

Files

Files (26.7 GB)

md5:56b6a21ea4864264e19af489661f0e09
26.7 GB

Additional details

Software

Repository URL: https://github.com/aimiokab/MISR-S2
Programming language: Python
Development Status: Active