Published July 23, 2021 | Version v1
Dataset Open

Estimation of Air Pollution with Remote Sensing Data: Revealing Greenhouse Gas Emissions from Space

  • 1. University of St. Gallen

Description

This dataset contains remote sensing data from the ESA Copernicus missions Sentinel-2 and Sentinel-5P (tropospheric NO2 column density) for the 2018-2020 timespan. The satellite measurements cover ~3100 locations in Europe and ~100 on the US West Coast, each with a size of 1.2×1.2 km. The locations are selected such that each measurement is centered on an air quality measurement station on the ground (from the European Environment Agency or the US Environmental Protection Agency, measuring NO2). This makes it possible to analyze spatiotemporally aligned remote sensing and ground-based measurements.

The 13 Sentinel-2 bands are upsampled (bilinear) to 10 m resolution and cropped to 120×120 pixels. For some locations, multiple Sentinel-2 images are available. The images are stored as binary numpy `.npy` files, organized into directories by location.
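As a rough illustration of reading one such image, the sketch below loads a `.npy` file and normalizes it to a (height, width, bands) layout. The axis order of the stored arrays is not stated here, so the function accepts both channels-first and channels-last input; that handling, the helper name `load_s2_patch`, and the synthetic demo file are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np

N_BANDS = 13  # number of Sentinel-2 bands, per the description above
PATCH = 120   # pixels per side, per the description above


def load_s2_patch(path):
    """Load one Sentinel-2 patch and return it as (height, width, bands)."""
    img = np.load(path)
    if img.shape == (PATCH, PATCH, N_BANDS):
        return img
    if img.shape == (N_BANDS, PATCH, PATCH):
        # Assumption: channels-first storage; move the bands to the last axis.
        return np.moveaxis(img, 0, -1)
    raise ValueError(f"unexpected patch shape: {img.shape}")


# Demo on a synthetic patch; the file name is hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_patch.npy")
    np.save(demo_path, np.zeros((N_BANDS, PATCH, PATCH), dtype=np.uint16))
    patch = load_s2_patch(demo_path)

print(patch.shape)
```

For real data, the path passed to `load_s2_patch` would come from one of the sample CSV files rather than from a temporary directory.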

The Sentinel-5P data was pre-processed by mapping the measurements from consecutive satellite overpasses onto a common rectangular grid of 0.05°×0.05° (~5×5 km) across Europe. To harmonize the Sentinel-2 (10 m to 60 m, upscaled to 10 m) and Sentinel-5P (5×3.5 km, rescaled to 5×5 km) imaging resolutions, the Sentinel-5P data is linearly interpolated to 10 m resolution and cropped to 120×120 pixels around the locations of interest. Additionally, all measurements with a QA flag (qa_value) below 75 were discarded, following ESA recommendations. The Sentinel-5P data are stored as `.netcdf` files, organized by location. For each location, three such files are available, containing averaged Sentinel-5P measurements at different temporal frequencies (2018-2020, quarterly, monthly).
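The quality filtering and temporal averaging described above are already applied to the published files. Purely to illustrate the idea, the sketch below masks low-quality pixels and averages the remaining values over overpasses using synthetic numbers. The function name `average_no2`, the array layout, and the demo values are assumptions; this is not the authors' processing code.

```python
import numpy as np

QA_THRESHOLD = 75  # threshold quoted in the description above


def average_no2(no2, qa, threshold=QA_THRESHOLD):
    """Average NO2 over overpasses (axis 0), ignoring low-quality pixels.

    no2, qa: arrays of shape (overpasses, height, width).
    Returns an array of shape (height, width); pixels without any
    valid overpass are NaN.
    """
    no2 = np.asarray(no2, dtype=float)
    valid = np.asarray(qa) >= threshold
    total = np.where(valid, no2, 0.0).sum(axis=0)
    count = valid.sum(axis=0)
    out = np.full(total.shape, np.nan)
    # Divide only where at least one valid measurement exists.
    np.divide(total, count, out=out, where=count > 0)
    return out


# Synthetic demo: two overpasses over a 1x2 pixel grid.
demo_no2 = [[[1.0, 2.0]], [[3.0, 4.0]]]
demo_qa = [[[80, 50]], [[90, 60]]]
mean_no2 = average_no2(demo_no2, demo_qa)
print(mean_no2)
```

In the demo, the first pixel passes the quality check on both overpasses and is averaged, while the second never passes and stays NaN.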

The `samples_{frequency}_{area}.csv` files provide a list of observations with the corresponding file paths to a (cloud-free) Sentinel-2 image, the Sentinel-5P measurement, and the average NO2 concentration measurement by the EEA or EPA ground station. These files can be used for easy data-loading.
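A minimal data-loading sketch along these lines is shown below, using only the Python standard library. The column names `s2_path`, `s5p_path`, and `no2`, as well as the example row, are hypothetical placeholders; check the header of the actual CSV files before relying on them.

```python
import csv
import io
import os


def read_samples(csv_file, data_root, path_columns):
    """Read a samples CSV and prefix the given path columns with data_root."""
    rows = []
    for row in csv.DictReader(csv_file):
        for col in path_columns:
            row[col] = os.path.join(data_root, row[col])
        rows.append(row)
    return rows


# Hypothetical columns and values, for illustration only.
demo_csv = io.StringIO(
    "s2_path,s5p_path,no2\n"
    "loc_a/img.npy,loc_a/s5p.netcdf,17.5\n"
)
samples = read_samples(demo_csv, "data", ["s2_path", "s5p_path"])
print(samples[0]["s2_path"], samples[0]["no2"])
```

With a real file, `demo_csv` would be replaced by an open file handle, for example `open("samples_monthly_eea.csv", newline="")`, and `data_root` by the directory holding the extracted archives.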

Content

The data is organized into the following files:

  • README.md - this file
  • sentinel-2-eea.tar.gz [33.1GB]
  • sentinel-5p-eea.tar.gz [80.1GB]
  • samples_2018_2020_eea.csv 
  • samples_quarterly_eea.csv
  • samples_monthly_eea.csv
  • sentinel-2-epa.tar.gz [0.15GB]
  • sentinel-5p-epa.tar.gz [1.8GB]
  • samples_2018_2020_epa.csv
  • samples_quarterly_epa.csv
  • samples_monthly_epa.csv

Acknowledgement

If you use this data set, please cite our publication:

Scheibenreif, L., Mommert, M., Borth, D., "Estimation of Air Pollution with Remote Sensing Data: Revealing Greenhouse Gas Emissions from Space", Tackling Climate Change with Machine Learning workshop at ICML 2021.

Please refer to this publication for additional information on the data set.

This data set contains modified Copernicus Sentinel data acquired in 2018-2020, processed by ESA.

 

Responsible Author

Linus Scheibenreif
University of St. Gallen, Institute of Computer Science
Chair Artificial Intelligence and Machine Learning
linus.scheibenreif ( at ) unisg.ch

Files

Files (115.3 GB)

  • md5:bf9a77cd450bc9617d48c8f0da73283b (509.7 kB)
  • md5:08c24c23d7e1c7abcaf43c10915e4213 (14.8 kB)
  • md5:175fa4eba48e9b370d0cda81bbae515c (10.8 MB)
  • md5:f70b6748c978552ce563556e04e20279 (578.3 kB)
  • md5:5b61be061252f50d4b3a2c40a2f99533 (3.6 MB)
  • md5:5feccbe32c2563b3f2ed6d18cd573c05 (196.4 kB)
  • md5:732d7b22c142a30c78cb0f1ebbd995e6 (33.1 GB)
  • md5:37c4fe5ed8136ed475d6843dbc275f85 (150.7 MB)
  • md5:a90774dc8eb671cc1d12888f400e8788 (80.1 GB)
  • md5:7d9c34962879bd2327aa48b07d6f5019 (1.8 GB)
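
The MD5 checksums listed above can be used to check a download for corruption. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5sum` and `verify` and the demo file are illustrative assumptions. Files are read in chunks so that the multi-gigabyte archives do not need to fit in memory.

```python
import hashlib
import os
import tempfile


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:abc...'."""
    return md5sum(path) == expected.split(":")[-1].lower()


# Demo on a small temporary file; real use would pass a downloaded archive
# and the matching checksum from the list above.
demo_bytes = b"example payload"
with tempfile.TemporaryDirectory() as tmp:
    demo_file = os.path.join(tmp, "demo.bin")
    with open(demo_file, "wb") as fh:
        fh.write(demo_bytes)
    ok = verify(demo_file, "md5:" + hashlib.md5(demo_bytes).hexdigest())
    bad = verify(demo_file, "md5:" + "0" * 32)

print(ok, bad)
```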