The WorldStrat Dataset: Open High-Resolution Satellite Imagery With Paired Multi-Temporal Low-Resolution
What is this dataset?
Nearly 10,000 km² of free high-resolution and matched low-resolution satellite imagery of unique locations which ensure stratified representation of all types of land-use across the world: from agriculture to ice caps, from forests to multiple urbanization densities.
Those locations are also enriched with typically under-represented locations in ML datasets: sites of humanitarian interest, illegal mining sites, and settlements of persons at risk.
Each high-resolution image (1.5 m/pixel) comes with multiple temporally-matched low-resolution images from the freely accessible lower-resolution Sentinel-2 satellites (10 m/pixel).
We accompany this dataset with a paper, datasheet for datasets and an open-source Python package to: rebuild or extend the WorldStrat dataset, train and infer baseline algorithms, and learn with abundant tutorials, all compatible with the popular EO-learn toolbox.
Why make this?
We hope to foster broad-spectrum applications of ML to satellite imagery, and possibly develop the same power of analysis allowed by costly private high-resolution imagery from free public low-resolution Sentinel2 imagery. We illustrate this specific point by training and releasing several highly compute-efficient baselines on the task of Multi-Frame Super-Resolution.
- The high-resolution Airbus imagery is distributed, with authorization from Airbus, under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).
- The labels, Sentinel2 imagery, and trained weights are released under Creative Commons with Attribution 4.0 International (CC BY 4.0).
- The source code (will be shortly released on GitHub) under 3-Clause BSD license.
||39.3 kB||Preview Download|
||14.6 MB||Preview Download|
||282.6 kB||Preview Download|
||2.6 MB||Preview Download|