There is a newer version of the record available.

Published January 14, 2021 | Version 1.0
Dataset Open

Britain Breathing 2016-2019 Air Quality and Meteorological Regional Estimates Dataset

  • 1. University of Manchester
  • 2. Chinese University of Hong Kong

Description

This dataset is a collection of estimated daily mean and maximum values for a range of air quality and meteorological measurements and model forecasts for UK postcode districts (e.g. 'AB') for the years 2016-2019, inclusive.

The data were produced using a 'concentric regions' method to estimate each measurement for every region, as follows. If measurements exist within the region, the mean of those measurements is used. If not, a ring of neighbouring postcode regions is selected and the mean of their measurement values is used. If no measurement sites or data are found in the first ring, the process continues with the next ring of postcode district regions, working outwards until one or more sensors are found in a ring. As well as the estimated measurements, the number of rings required to find site data and make each estimate is also published.
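The ring search described above can be sketched roughly as follows. This is an illustrative outline only, not the code from the region_estimators repository: the function name `estimate_region`, the adjacency dictionary `neighbours`, and the choice to pool all site values in a ring before averaging are assumptions made for this example.

```python
from statistics import mean


def estimate_region(region_id, neighbours, site_values):
    """Return (estimate, rings) for one region, or (None, None) if no data is reachable.

    neighbours:  dict mapping a region id to the set of adjacent region ids
                 (hypothetical input; the real method derives this from geometry).
    site_values: dict mapping a region id to a list of measurement values from
                 sites located inside that region.
    """
    visited = {region_id}
    ring = {region_id}  # ring 0 is the region itself
    rings = 0
    while ring:
        # Assumption: pool every site value found in the current ring.
        values = [v for r in sorted(ring) for v in site_values.get(r, [])]
        if values:
            return mean(values), rings
        # No data in this ring: step outwards to the unvisited neighbours.
        next_ring = set()
        for r in ring:
            next_ring |= neighbours.get(r, set()) - visited
        visited |= next_ring
        ring = next_ring
        rings += 1
    return None, None


# Toy layout with made-up region ids and values, for illustration only.
neighbours = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "D"}, "D": {"B", "C"}}
site_values = {"B": [10.0], "C": [14.0, 18.0], "D": [100.0]}

print(estimate_region("A", neighbours, site_values))  # (14.0, 1)
print(estimate_region("D", neighbours, site_values))  # (100.0, 0)
```

In this toy layout region A has no sites of its own, so its estimate comes from the first ring (B and C), and region D is never consulted because the search stops at the first ring that contains data.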

The meteorological, pollen and air quality measurement data used to make the regional estimations can be found at this Zenodo archive. The data there contain temperature, relative humidity, and pressure data, downloaded from the Met Office MIDAS archives via the MEDMI server (https://www.data-mashup.org.uk/). Daily pollen measurements for the UK were also downloaded from the MEDMI server. The archive also includes PM10, PM2.5, NO2, NOx (as NO2), O3, and SO2 measurements from the DEFRA AURN network, together with model forecasts of the same pollutants made using the EMEP model.

The code used to make the estimations is available in this repository: https://github.com/UoMResearchIT/region_estimators

The dataset is presented in CSV format, as five files:

  1. postcode_district_data.csv: location metadata (region_id, geometry, description, Population, Nearest Postcode Areas, Country)
  2. turing_regional_estimates_aq_daily_met_pollen_pollution_imputed_data.csv: uses imputed site data (timestamp, region_id, ...[measurement name, rings]) ('rings' is the number of rings required to make the estimation)
  3. turing_regional_estimates_aq_daily_met_pollen_pollution_original_data.csv: uses original site data (timestamp, region_id, ...[measurement name, rings]) ('rings' is the number of rings required to make the estimation)
  4. turing_regional_estimates_aq_loc_type_daily_imputed_data.csv: uses imputed site data. Air quality regional estimates are calculated using specific AQ site location types* separately. (To prevent, for example, 'Traffic Urban' type sites being used to estimate 'non-traffic' or rural regions.)
  5. turing_regional_estimates_aq_loc_type_daily_original_data.csv: uses original data. Air quality regional estimates are calculated using specific AQ site location types* separately. (To prevent, for example, 'Traffic Urban' type sites being used to estimate 'non-traffic' or rural regions.)

* Air quality site types: 

  • Industrial: comprises 'urban industrial' (9 sites) and 'suburban industrial' (2 sites)
  • 'Rural background' (14 sites)
  • 'Urban background' (48 sites)
  • 'Urban traffic' (47 sites)
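As one possible way of working with the estimate files listed above, the sketch below keeps only rows whose estimate was made within a chosen number of rings. The column names `temperature` and `temperature_rings`, and the sample rows themselves, are invented for illustration; check the header of the actual CSV files for the real column names before adapting this.

```python
import csv
import io


def rows_within_rings(csv_file, rings_column, max_rings):
    """Return the rows whose value in rings_column is at most max_rings.

    Rows with an empty rings value (no estimate available) are skipped.
    """
    kept = []
    for row in csv.DictReader(csv_file):
        raw = (row.get(rings_column) or "").strip()
        if raw and float(raw) <= max_rings:
            kept.append(row)
    return kept


# Made-up sample in the general shape (timestamp, region_id, measurement, rings).
SAMPLE = """timestamp,region_id,temperature,temperature_rings
2016-01-01,AB,4.2,0
2016-01-01,ZE,5.1,3
2016-01-01,HS,,
"""

local = rows_within_rings(io.StringIO(SAMPLE), "temperature_rings", 1)
print([row["region_id"] for row in local])
```

To use this with a downloaded file, pass an open file handle (for example `open(path, newline="")`) in place of the in-memory sample.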

Files

postcode_district_data.csv