
Published July 21, 2021 | Version 1.0.1
Dataset Open

Britain Breathing 2016-2019 Air Quality and Meteorological Dataset

  • 1. University of Manchester
  • 2. Chinese University of Hong Kong

Description

This dataset is a collection of daily mean and maximum values for a range of air quality and meteorological measurements and model forecasts for the UK for the years 2016-2019, inclusive. It contains temperature, relative humidity, and pressure data, downloaded from the Met Office MIDAS archives via the MEDMI server (https://www.data-mashup.org.uk/). Daily pollen measurements for the UK were also downloaded from the MEDMI server. The dataset also includes PM10, PM2.5, NO2, NOx (as NO2), O3, and SO2 measurements from the DEFRA AURN network, together with model forecasts of the same pollutants made using the EMEP model.

The paper describing this dataset is available here: https://www.nature.com/articles/s41597-022-01135-6

The tools used to download and process these measurement datasets are available here: https://zenodo.org/record/4545257

The dataset is designed for use with the region estimator toolset, available in this repository: https://github.com/UoMResearchIT/region_estimators

Emissions over the UK for the EMEP model runs were generated using the NAEI 2016 UK emission dataset, available in NetCDF form here: https://zenodo.org/record/3997165#.X9KUBF6nzUI. The run scripts and operational inputs for EMEP are available here: https://zenodo.org/record/3997301#.X9KUAF6nzUI and https://zenodo.org/record/3997271#.X9KV1F6nzUI.

The dataset is presented in CSV format, as three files:

  1. turing_aq_daily_met_pollen_pollution_original_data.csv: original data (described below)
  2. turing_aq_daily_met_pollen_pollution_with_imputation_data.csv: original plus imputed data (described below)
  3. site_location_data.csv: location metadata (site_id, latitude, longitude, postcode area)

 

The columns intended to be used as indexes are:

  • timestamp,
    • date of measurements on that row
  • site_id,
    • measurement site ID, corresponding to sites in the three networks:
      • AURN [indicated by AQ],
      • MIDAS [indicated by WEATHER],
      • or pollen [indicated by POLLEN].
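As an illustration of how the two index columns might be used, the sketch below reads one of the data files into a dictionary keyed by (timestamp, site_id) and maps a site ID to its network. It uses only the Python standard library. How the AQ, WEATHER, and POLLEN indicators are embedded in site_id is not specified here, so the substring check is an assumption, as are the helper names network_of and load_indexed.

```python
import csv

# Indicator -> network, as listed in the description above. The way the
# indicator is embedded in site_id is an assumption: this sketch only checks
# that the indicator text appears somewhere in the ID. The longer indicators
# are tested first so that the short "AQ" cannot shadow them.
NETWORK_TAGS = (("WEATHER", "MIDAS"), ("POLLEN", "pollen"), ("AQ", "AURN"))


def network_of(site_id):
    """Return the network name for a site ID, or None if no indicator is found."""
    for tag, network in NETWORK_TAGS:
        if tag in site_id:
            return network
    return None


def load_indexed(csv_file):
    """Read an open CSV file into a dict keyed by (timestamp, site_id).

    Values are left as strings; the timestamp format is not assumed.
    """
    rows = {}
    for row in csv.DictReader(csv_file):
        key = (row.pop("timestamp"), row.pop("site_id"))
        rows[key] = row
    return rows
```

A file would be opened with open(path, newline="") and passed to load_indexed. Holding a whole data file in memory this way is only one option; iterating over csv.DictReader row by row avoids it.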

The data columns are:

  • O3, PM10, PM2.5, NO2, NOXasNO2, SO2, 
    • daily mean and maximum values in ug/m3 (all with "_max", "_mean", and "_flag" tags)
    • AURN measurement data
  • O3_EMEP, NO2_EMEP, SO2_EMEP, NOXasNO2_EMEP, PM2.5_EMEP, PM10_EMEP,
    • daily mean and maximum values in ug/m3 (all with "_max" and "_mean" tags)
    • EMEP model forecasts
  • alnus, ambrosia, artemisia, betula, corylus, fraxinus, platanus, poaceae, quercus, salix, ulmus, urtica,
    • daily pollen grain counts
  • temperature, relativehumidity, pressure,
    • daily mean and maximum values in degC, %, and hPa (all with "_max", "_mean", and "_flag" tags).
    • Met Office measurement data
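The tag convention above can be used to group a file's header into variables and their statistics. The following is a minimal sketch, assuming the tags are appended to the variable name as suffixes (as the quoted "_max", "_mean", and "_flag" forms suggest); the helper names split_column and group_columns are illustrative only.

```python
# Tags described above. Treating them as name suffixes is an assumption.
KNOWN_TAGS = ("_max", "_mean", "_flag")


def split_column(name):
    """Split a column name into (variable, tag); tag is None if untagged."""
    for tag in KNOWN_TAGS:
        if name.endswith(tag):
            return name[: -len(tag)], tag[1:]
    return name, None


def group_columns(header):
    """Map each variable to the list of tags found for it in the header."""
    groups = {}
    for name in header:
        variable, tag = split_column(name)
        groups.setdefault(variable, []).append(tag)
    return groups
```

Untagged columns, such as the index columns, come back with a tag of None, so they can be told apart from tagged data columns without hard-coding a list of names.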

The "_flag" columns indicate data points that have been partially or fully imputed. Their values lie in the range 0-1 and give the fraction of the hourly values within that day that are imputed (0 = none, 1 = all 24 hourly data points are imputed). No imputation is done in the original dataset, so its "_flag" values are always zero (the number of hourly data points used to calculate the daily mean and maximum is not recorded in this dataset).
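To make the flag arithmetic concrete: a flag of 0.25 corresponds to 6 of the 24 hourly values being imputed. The sketch below converts a flag to a count of imputed hours and applies a simple filter on it. The 0.5 threshold, the treatment of an empty cell as unusable, and the column name pattern "<variable>_flag" are all assumptions made for illustration, not part of the dataset definition.

```python
HOURS_PER_DAY = 24


def imputed_hours(flag):
    """Convert a flag (fraction of the day imputed, 0-1) to a number of hours."""
    return round(float(flag) * HOURS_PER_DAY)


def mostly_measured(row, variable, max_imputed_fraction=0.5):
    """Return True if at most max_imputed_fraction of the day was imputed.

    `row` is a dict of strings, e.g. from csv.DictReader. A missing or empty
    flag is treated as unusable (an assumption of this sketch).
    """
    flag = row.get(f"{variable}_flag", "")
    if flag == "":
        return False
    return float(flag) <= max_imputed_fraction
```

In the original (non-imputed) file every flag is zero, so a filter of this kind is only informative for the file that includes imputed data.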

The station location metadata includes longitude, latitude, and UK postcode area data. Where sites lie outside the UK, the postcode is replaced with a regional indicator (here: Republic of Ireland (ROI)).
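The location metadata can be attached to measurement rows through site_id. The sketch below builds a lookup from site_id to its metadata row and checks for the regional indicator. The exact header of the postcode column is not given here, so it is passed in as a parameter, and the assumption that the indicator appears literally as "ROI" should be checked against the file.

```python
import csv


def load_sites(csv_file):
    """Read the site metadata from an open CSV file into a dict keyed by site_id."""
    return {row["site_id"]: row for row in csv.DictReader(csv_file)}


def outside_uk(site, postcode_column):
    """Return True if the site's postcode field holds the ROI indicator.

    `postcode_column` is whatever header the postcode area column has in
    site_location_data.csv; the literal value "ROI" is an assumption.
    """
    return site[postcode_column].strip() == "ROI"
```

With the lookup in hand, the latitude and longitude of any measurement row can be found from its site_id without merging the full tables.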

 

Please cite the following paper if you use this dataset: Reani, M., Lowe, D., Gledson, A., Topping, D., & Jay, C. (2022). UK daily meteorology, air quality, and pollen measurements for 2016–2019, with estimates for missing data. Scientific Data, 9(1), 43. https://doi.org/10.1038/s41597-022-01135-6

Notes

This release corrects a postcode entry in site_location_data.csv. All other data are the same as in version 1.0.

Files

Files (230.5 MB)

  • md5:300ddfb7599c32679e18c13fcddffe9d (27.1 kB)
  • md5:6f486c463b543ddcd3223f8cc96ff018 (113.4 MB)
  • md5:7339e1abc84df71e393d15b297ca01d6 (117.0 MB)
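A downloaded file can be checked against the MD5 checksums listed above. The sketch below hashes a file in chunks with the standard hashlib module and tests whether the digest is one of the three listed values. Because the listing does not pair checksums with file names, the check is against the set as a whole; the helper names md5_of and matches_listing are illustrative.

```python
import hashlib

# The three checksums listed above. The listing does not say which file each
# one belongs to, so they are kept as an unordered set.
LISTED_MD5 = {
    "300ddfb7599c32679e18c13fcddffe9d",
    "6f486c463b543ddcd3223f8cc96ff018",
    "7339e1abc84df71e393d15b297ca01d6",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path):
    """Return True if the file's MD5 digest is one of the listed checksums."""
    return md5_of(path) in LISTED_MD5
```

Reading in chunks keeps memory use flat for the two larger files, which are over 100 MB each.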

Additional details

Related works

Cites
Software: 10.5281/zenodo.3997301 (DOI)
Dataset: 10.5281/zenodo.3997271 (DOI)
Dataset: 10.5281/zenodo.3997165 (DOI)