Published June 2025 | Version v1.1
Dataset Open

The WorldStrat Dataset: Open High-Resolution Satellite Imagery With Paired Multi-Temporal Low-Resolution

  • 1. University College London, WhyHow Ltd
  • 2. WhyHow Ltd
  • 3. Oxford University, WhyHow Ltd
  • 1. University College London
  • 2. WhyHow Ltd
  • 3. University of Oxford

Description

What is this dataset?

Nearly 10,000 km² of free high-resolution satellite imagery, with matched low-resolution imagery, covering unique locations chosen to ensure stratified representation of all types of land use across the world: from agriculture to ice caps, from forests to multiple urbanization densities.

These locations are also enriched with sites that are typically under-represented in ML datasets: sites of humanitarian interest, illegal mining sites, and settlements of persons at risk.

Each high-resolution image (1.5 m/pixel) comes with multiple temporally matched low-resolution images (10 m/pixel) from the freely accessible Sentinel-2 satellites.
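
To make the resolution gap concrete, here is a small Python sketch of the arithmetic implied by the two pixel sizes quoted above. The 1.5 km tile side used in the usage note is a hypothetical figure chosen only for illustration; it is not taken from the dataset.

```python
# Ground sampling distances quoted in the description above.
HR_M_PER_PX = 1.5   # high-resolution imagery, metres per pixel
LR_M_PER_PX = 10.0  # Sentinel-2 imagery, metres per pixel


def scale_factor(lr_m_per_px=LR_M_PER_PX, hr_m_per_px=HR_M_PER_PX):
    """Linear upscaling factor needed to go from low to high resolution."""
    return lr_m_per_px / hr_m_per_px


def hr_pixels_per_lr_pixel(lr_m_per_px=LR_M_PER_PX, hr_m_per_px=HR_M_PER_PX):
    """Number of high-resolution pixels covered by one low-resolution pixel."""
    return scale_factor(lr_m_per_px, hr_m_per_px) ** 2


def tile_side_px(side_m, m_per_px):
    """Side length in pixels of a square tile of `side_m` metres (illustrative helper)."""
    return round(side_m / m_per_px)
```

With these values the linear factor is 10 / 1.5, roughly 6.7, so one Sentinel-2 pixel covers roughly 44 high-resolution pixels. A hypothetical 1.5 km tile would be `tile_side_px(1500, HR_M_PER_PX)` pixels wide at high resolution and `tile_side_px(1500, LR_M_PER_PX)` pixels wide at low resolution.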

We accompany this dataset with a paper, a datasheet for datasets, and an open-source Python package to: rebuild or extend the WorldStrat dataset, train and infer baseline algorithms, and learn with abundant tutorials, all compatible with the popular eo-learn toolbox.

Why make this?

We hope to foster broad-spectrum applications of ML to satellite imagery, and possibly to obtain from free public low-resolution Sentinel-2 imagery the same power of analysis that costly private high-resolution imagery allows. We illustrate this specific point by training and releasing several highly compute-efficient baselines on the task of Multi-Frame Super-Resolution.
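
This record does not describe the released baselines. As a rough illustration of what the Multi-Frame Super-Resolution task takes in and produces, the sketch below shows a deliberately naive stand-in, not the authors' method: it averages several low-resolution revisits of the same site and resamples the result to the high-resolution grid with nearest-neighbour lookup. The function name, the array layout, and the use of NaN to mask invalid pixels are all assumptions made for this example.

```python
import numpy as np


def fuse_and_upsample(lr_frames, out_height, out_width):
    """Naive multi-frame baseline: temporal mean, then nearest-neighbour resampling.

    lr_frames: array-like of shape (T, H, W) or (T, H, W, C), one entry per revisit.
    Returns an array of shape (out_height, out_width[, C]).
    """
    frames = np.asarray(lr_frames, dtype=np.float64)
    if frames.ndim not in (3, 4):
        raise ValueError("expected shape (T, H, W) or (T, H, W, C)")

    # Fuse the revisits; NaN entries (e.g. pixels masked by the caller) are ignored.
    fused = np.nanmean(frames, axis=0)

    # Map every output row/column back to its source row/column.
    in_h, in_w = fused.shape[:2]
    rows = np.minimum((np.arange(out_height) * in_h) // out_height, in_h - 1)
    cols = np.minimum((np.arange(out_width) * in_w) // out_width, in_w - 1)
    return fused[rows][:, cols]
```

For example, a 3 x 3 pixel patch at 10 m/pixel spans 30 m, which corresponds to 20 x 20 pixels at 1.5 m/pixel, so `fuse_and_upsample(frames, 20, 20)` would produce an output on the high-resolution grid. A learned model would replace both steps; this version only fixes the input and output shapes.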

Licences

  • The high-resolution Airbus imagery is distributed, with authorization from Airbus, under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).
  • The labels, Sentinel-2 imagery, and trained weights are released under Creative Commons Attribution 4.0 International (CC BY 4.0).
  • The source code (to be released shortly on GitHub) is distributed under the 3-Clause BSD license.

Files

hr_dataset.zip

Files (104.1 GB)

MD5 checksum                        Size
5ae09bb3557ce131242a133d9758d9e7    40.8 GB
515f38e333bf06e79ac523fb2eab588d    11.4 GB
d97b8d86da83f7e51f2d3205509e4a7b    39.3 kB
e90ecfa4bf838ace0b51dea1031b5ed1    26.4 GB
7aa1878a37d22a6c7c4b84b022a14ad7    25.5 GB
1a66ac42b9a688be18debd0d95633fa1    18.4 MB
874612b59bbf7987f7de7edd48a30c70    306.9 kB
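
Since the archives are tens of gigabytes, it is worth checking each download against its MD5 checksum before unpacking. The following is a minimal sketch using only the Python standard library. The function names are illustrative and not part of the WorldStrat package, and the pairing of a checksum with a given file name should be confirmed on the record page, because the listing above does not preserve it.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to keep memory use flat."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Compare a file's MD5 digest with an expected value such as 'md5:5ae0...'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Usage would look like `verify_md5(downloaded_path, checksum_from_listing)`, where both arguments are supplied by the reader. A `False` result suggests a truncated or corrupted download that should be fetched again.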

Additional details

Related works

Is new version of
Dataset: 10.5281/zenodo.6810792 (DOI)

Funding

European Space Agency
Query Planet Project 4000124792/18/I-BG CCN3

Dates

Created: 2022-07-13 (first upload to Zenodo)
Accepted: 2022-09-16 (accepted at NeurIPS 2022)
Updated: 2025-06 (updated to v1.1)

Software

Repository URL: https://github.com/worldstrat/worldstrat
Programming language: Python
Development status: Active