Published September 9, 2025 | Version v3
Dataset Open

EEA Air Quality In-Situ Measurement Station Data

Description

Introduction

This dataset is a value-added product based on 'Up-to-date air quality station measurements', administered by the European Environment Agency (EEA) and collected by its member states. The original hourly measurement data (NO2, SO2, O3, PM10, PM2.5 in µg/m³) was reshaped, gap-filled, and aggregated to different temporal resolutions, making it ready to use in time series analysis or spatial interpolation tasks.

Reproducible code for accessing and processing this data, along with demonstration notebooks, can be found on GitHub.

Accessing and pre-processing hourly data

Hourly data was retrieved through the API of the EEA Air Quality Download Service. Measurements (one file per station and pollutant) were joined to create a single time series per station with observations for multiple pollutants. As PM2.5 data is sparse but correlates well with PM10, gap-filling was performed according to the methods described in Horálek et al., 2023¹. Validity and verification flags from the original data were passed on to allow quality filtering. Reproducible computational notebooks using the R programming language are available for the data access and the gap-filling procedure.
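The joining and gap-filling steps can be illustrated with a minimal Python sketch. It is not the procedure from the R notebooks: the input column names `Pollutant` and `Concentration`, the station code `ST1`, and the values are hypothetical, and the gap-filling is a deliberately simplified stand-in (ordinary least squares on PM10 alone) rather than the method of Horálek et al.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format input: one row per station, hour and pollutant,
# mimicking the per-station, per-pollutant files before joining.
long = pd.DataFrame({
    "Air.Quality.Station.EoI.Code": ["ST1"] * 8,
    "Start": pd.to_datetime(["2020-01-01 00:00", "2020-01-01 01:00",
                             "2020-01-01 02:00", "2020-01-01 03:00"] * 2),
    "Pollutant": ["PM10"] * 4 + ["PM2.5"] * 4,
    "Concentration": [20.0, 30.0, 40.0, 50.0, 10.0, 15.0, np.nan, 25.0],
})

# Join: one row per station and hour, one column per pollutant.
wide = long.pivot(index=["Air.Quality.Station.EoI.Code", "Start"],
                  columns="Pollutant", values="Concentration").reset_index()
wide.columns.name = None

# Simplified gap-filling (assumption): fit PM2.5 = intercept + slope * PM10
# on hours where both pollutants were measured.
both = wide.dropna(subset=["PM10", "PM2.5"])
slope, intercept = np.polyfit(both["PM10"].to_numpy(),
                              both["PM2.5"].to_numpy(), deg=1)

# Fill only where PM2.5 is missing and PM10 is available, and flag those rows.
missing = wide["PM2.5"].isna() & wide["PM10"].notna()
wide["filled_PM2.5"] = missing
wide.loc[missing, "PM2.5"] = intercept + slope * wide.loc[missing, "PM10"]
```

The boolean `filled_PM2.5` column mirrors the flag of the same name in the hourly data, so measured and supplemented values stay distinguishable.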

Temporal aggregates

Data was aggregated to three coarser temporal resolutions: day, month, and year. Coverage (the ratio of non-missing values) was calculated for each pollutant and temporal increment, and a coverage threshold of 75% was applied to generate reliable aggregates. All pollutants were aggregated by their arithmetic mean. Additionally, two pollutants were aggregated using a percentile method, which has been shown to be more appropriate for mapping applications. PM10 was summarized using the 90.41th percentile. Daily O3 was further summarized as the maximum of the 8-hour running mean. Based on these daily maxima, monthly and annual O3 was aggregated using their 93.15th percentile. For more details, refer to the reproducible computational notebook on temporal aggregation.
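The aggregation logic can be sketched in Python with pandas. This is an illustration under stated assumptions, not the code from the aggregation notebook: the series is synthetic, aggregates with coverage below 0.75 are set to missing (whether the threshold is inclusive is an assumption), and the 8-hour running mean requires at least 6 of 8 valid hours and is assigned to the day on which the window ends (both assumptions). The percentile levels are consistent with 35 (PM10) and 25 (O3) permitted exceedance days per 365-day year; that interpretation is not stated in the source.

```python
import numpy as np
import pandas as pd

# Percentile levels implied by 35 and 25 permitted exceedance days per year.
pm10_q = 1 - 35 / 365   # about 0.9041
o3_q = 1 - 25 / 365     # about 0.9315

# Synthetic hourly O3 series for one station: three days of data.
idx = pd.date_range("2020-01-01", periods=72, freq=pd.Timedelta(hours=1))
o3 = pd.Series(np.r_[np.full(24, 40.0), np.arange(24, dtype=float),
                     np.full(24, 60.0)], index=idx)
o3.iloc[48:60] = np.nan  # first 12 hours of day 3 are missing

# Daily mean with coverage; means below the 75% threshold are masked.
daily = pd.DataFrame({
    "O3": o3.resample("D").mean(),
    "cov.day_O3": o3.resample("D").count() / 24,
})
daily.loc[daily["cov.day_O3"] < 0.75, "O3"] = np.nan

# Trailing 8-hour running mean, then the daily maximum of that running mean.
run8 = o3.rolling(window=8, min_periods=6).mean()
max8h = run8.resample("D").max()

# Percentile of the daily maxima (a whole year in practice, three days here).
o3_percentile = max8h.quantile(o3_q)
```

With real data the same pattern would be applied per station (for example via `groupby`) and repeated for monthly and annual periods.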

Data columns

| column | hourly | daily | monthly | annual | description |
|---|---|---|---|---|---|
| Air.Quality.Station.EoI.Code | x | x | x | x | Unique station ID |
| Countrycode | x | x | x | x | Two-letter ISO country code |
| Start | x | | | | Start time of (hourly) measurement period |
| <Pollutant> | x | x | x | x | One of NO2; SO2; O3; O3_max8h_93.15; PM10; PM10_90.41; PM2.5 in µg/m³ |
| Validity_<Pollutant> | x | | | | Validity flag of the respective pollutant |
| Verification_<Pollutant> | x | | | | Verification flag of the respective pollutant |
| filled_PM2.5 | x | | | | Flag indicating if PM2.5 value is measured or supplemented through gap-filling (boolean) |
| year | | x | x | x | Year (2015-2023) |
| cov.year_<Pollutant> | | x | | x | Data coverage throughout the year (0-1) |
| month | | x | x | | Month (1-12) |
| cov.month_<Pollutant> | | x | x | | Data coverage throughout the month (0-1) |
| doy | | x | | | Day of year (0-366) |
| cov.day_<Pollutant> | | x | | | Data coverage throughout the day (0-1) |
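The flag columns can be used to mask questionable hourly values before analysis. The sketch below is hypothetical: it assumes the EEA convention in which positive validity codes denote valid measurements and verification code 1 denotes fully verified data. Neither meaning is stated in this description, so the codes should be checked against the EEA documentation. The helper `mask_flagged`, the station code, and the values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Small synthetic stand-in for the hourly table (values are made up).
hourly = pd.DataFrame({
    "Air.Quality.Station.EoI.Code": ["ST1"] * 4,
    "Start": pd.date_range("2020-01-01", periods=4,
                           freq=pd.Timedelta(hours=1)),
    "NO2": [12.0, 15.0, 999.0, 14.0],
    "Validity_NO2": [1, 1, -1, 2],
    "Verification_NO2": [1, 3, 1, 1],
})


def mask_flagged(df, pollutant, verified_only=False):
    """Return a copy in which values failing the flag check are set to NaN.

    Assumption: validity > 0 means valid, verification == 1 means verified.
    """
    out = df.copy()
    ok = out[f"Validity_{pollutant}"] > 0
    if verified_only:
        ok = ok & (out[f"Verification_{pollutant}"] == 1)
    out.loc[~ok, pollutant] = np.nan
    return out


valid = mask_flagged(hourly, "NO2")
strict = mask_flagged(hourly, "NO2", verified_only=True)
```

Masking instead of dropping rows keeps the hourly time axis regular, which simplifies later coverage calculations.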

 

Station metadata

The table below lists station-level metadata, including the type and area of each measurement station and its coordinates. These attributes are static and do not vary over time. They are included directly in the daily, monthly, and annual aggregates. To limit the file size of the hourly data, they are stored separately (in a file named "EEA_stations_meta_table.parquet") and can be joined via the station ID.

| column | description |
|---|---|
| Air.Quality.Station.EoI.Code | Unique station ID (required for join) |
| Countrycode | Two-letter ISO country code |
| Station.Type | One of "background", "industrial", or "traffic" |
| Station.Area | One of "urban", "suburban", "rural", "rural-nearcity", "rural-regional", "rural-remote" |
| Longitude & Latitude | Geographic coordinates of the station |
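Joining the station metadata onto hourly data could look like the following pandas sketch. The data frames are small synthetic stand-ins with hypothetical station codes and values. In practice they would be read from the hourly files and from "EEA_stations_meta_table.parquet", and the assumption that `Longitude` and `Latitude` are stored as two separate columns should be verified against the actual file.

```python
import pandas as pd

KEY = "Air.Quality.Station.EoI.Code"

# Synthetic stand-in for hourly measurements (several rows per station).
hourly = pd.DataFrame({
    KEY: ["ST1", "ST1", "ST2"],
    "Countrycode": ["AT", "AT", "DE"],
    "Start": pd.to_datetime(["2020-01-01 00:00", "2020-01-01 01:00",
                             "2020-01-01 00:00"]),
    "NO2": [12.0, 15.0, 30.0],
})

# Synthetic stand-in for the station metadata table (one row per station).
stations = pd.DataFrame({
    KEY: ["ST1", "ST2"],
    "Countrycode": ["AT", "DE"],
    "Station.Type": ["background", "traffic"],
    "Station.Area": ["rural", "urban"],
    "Longitude": [16.37, 13.40],
    "Latitude": [48.21, 52.52],
})

# Left join on the station ID. Countrycode is dropped from the metadata
# because it is already present in the hourly table. validate= raises an
# error if a station ID is duplicated in the metadata.
joined = hourly.merge(stations.drop(columns="Countrycode"),
                      on=KEY, how="left", validate="many_to_one")
```

A left join keeps every hourly record, including any whose station is absent from the metadata table.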

Parquet file format

This dataset is shipped in Parquet files. Parquet is a relatively new and very memory-efficient format that differs from traditional tabular file formats (e.g. CSV) in that it is binary and cannot be opened and displayed by common spreadsheet software (e.g. MS Excel, LibreOffice).

Daily, monthly, and annual data files are small (at most about 200 MB) and are stored in a single file each. They are written in GeoParquet format, making them ready to use in, for example, a GIS (via download) or a cloud environment (via URL).

Hourly data is much larger (3.7 GB) and is therefore partitioned by `Countrycode` (one file per country) so that smaller subsets can be read. Users should open the files with an Apache Arrow implementation, for example in Python, R, C++, or another supported language. Reading the data there is straightforward (see the code samples below). Hourly data can be downloaded as individual files or all together in a zipped archive.

 

R code:

# required libraries
library(arrow)
library(sf)


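# read air quality data (station metadata is already included in the annual file)
aq = read_parquet("airquality.no2.o3.so2.pm10.pm2p5_4.annual_pnt_20150101_20231231_eu_epsg.3035_v20240718.parquet")
# convert to an sf object and assign the CRS
# (st_set_crs() only labels the coordinates, it does not reproject them)
aq = st_as_sf(aq) |> st_set_crs(4326)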

 

Python code:   

# required library
import geopandas as gpd

# read air quality
aq = gpd.read_parquet("airquality.no2.o3.so2.pm10.pm2p5_4.annual_pnt_20150101_20231231_eu_epsg.3035_v20240718.parquet")

Files

airquality.no2.o3.pm10.pm2p5_1.hourly_pnt_20150101_20231231_eu_epsg.3035_v20240718.zip

Files (8.9 GB)

| MD5 checksum | Size |
|---|---|
| md5:847fc33c7570b810fab5e44c3a3aace4 | 4.0 GB |
| md5:2a63b2dd9b0a795692cc7e5ac6d2b706 | 204.0 MB |
| md5:8a372c7c4dd40d6f4a3708082b31521a | 12.8 MB |
| md5:6a08eccd6b355683f34fda77512ce038 | 1.5 MB |
| md5:ef469e9e205574ce2d06c3a21bec5f06 | 2.4 MB |
| md5:f9c3b8ccfb2d3e4c71b56a810f982814 | 5.0 MB |
| md5:8dbb1290927180fb5254b1d28d6074bd | 346.6 MB |
| md5:d4ad2d2acdf6ad34bdeb460b9afd7636 | 38.6 MB |
| md5:726cc97959d28c3fe5b081bae2892f44 | 66.5 MB |
| md5:f33545802a4fd222e20f3c6ad9625f7b | 21.6 MB |
| md5:52c5080eeb1714bac414bd1e0c06d610 | 34.5 MB |
| md5:6e0b1096fd81fe18f329178cd3ddef59 | 3.9 MB |
| md5:d2d20f5943f8687dbc51b7b32949d3f5 | 102.0 MB |
| md5:2fc83a1a17ff10037a88f31dd0df1269 | 908.4 MB |
| md5:f563cacc3d18a00946c487115cd2f154 | 8.9 MB |
| md5:807c3497d4cfec0c880291aeadeb9245 | 3.9 MB |
| md5:528f5e7e9db2caf267cddcceff003804 | 437.7 MB |
| md5:68007b99c0b8a2055b7c6b1a52960b01 | 90.8 MB |
| md5:dfec651c7ec016d790fffd4a73c45dae | 513.4 MB |
| md5:6bba0b66e88130983372a06fe926f290 | 145.4 MB |
| md5:5ba2dd961e3365b2abb8d66dce771821 | 23.1 MB |
| md5:b4ad37092b6e548bec747412f1b875a0 | 24.8 MB |
| md5:522ee4d60b274e91029aabc98bca0896 | 21.5 MB |
| md5:626ae8e51f141668e3829d9ce0874d74 | 19.8 MB |
| md5:3045581455094955b28e5cdbd0832d82 | 22.5 MB |
| md5:32c173b59c09e5d7cf1b09f3c9edb2c5 | 617.2 MB |
| md5:0ab034685c72edf6a565a1a645657908 | 22.1 MB |
| md5:b8fe6fcf9560959c8bf067f8b4f81891 | 6.1 MB |
| md5:9bbe848110547bf752999793a40fc1bd | 6.7 MB |
| md5:be5328838ad4c8d17d12e0ff49794f61 | 3.6 MB |
| md5:29e0fd7de2bc9cac983d40f61734bead | 25.8 MB |
| md5:3f68a08df2b3f5ebb2990c27a693adc1 | 4.4 MB |
| md5:ae9060b8d08ecec48b0413a71f006a9b | 75.9 MB |
| md5:c17fd7c1e227c72c4f79a4b3f6bb2b44 | 115.9 MB |
| md5:70849a755a9838327e80a60b42c3fa73 | 398.9 MB |
| md5:913fec0c1693a8189ddc7d9baee2fc79 | 58.3 MB |
| md5:6b42e2346e6cb69c487a61c26fa9063e | 120.2 MB |
| md5:b07eb04770f29b9ce1e8ea6a2a5b25ba | 17.8 MB |
| md5:8d0344dbd5c4586290d0f84eb2fcb0ef | 100.3 MB |
| md5:379191e549148973f055e20d3009fc9b | 7.2 MB |
| md5:0db2a4e41db7b151b98a7c7b074cc20f | 51.7 MB |
| md5:6eb72151b184254b5d0cca2480ad694c | 241.2 MB |
| md5:12f48f4ae27c1aa424e5c4bf7822aae9 | 22.6 MB |
| md5:cb6bc4a45bf4bbc8b85a327716c33f12 | 150.2 kB |

Additional details

Funding

European Commission
OEMC - Open-Earth-Monitor Cyberinfrastructure 101059548

Software

Repository URL: https://github.com/Open-Earth-Monitor/UseCase_AIRCON/tree/WP4_insitu
Programming language: R, Python
Development status: Active

References

  • Horálek, J., Vlasáková, L., Schreiberová, M., Marková, J., Schneider, P., Kurfürst, P., Tognet, F., Schovánková, J., Vlček, O., Damašková, D., 2022. European air quality maps for 2020. PM10, PM2.5, Ozone, NO2, NOx and Benzo(a)pyrene spatial estimates and their uncertainties. (No. ETC HE Report 2022/12).