Published May 17, 2025 | Version v7
Dataset Open

Dataset - High-resolution mapping of wood burning heat sources using Energy Performance Certificates: A case study of England and Wales

  • 1. University College London

Description

This repository contains open data and code to replicate the analysis in the manuscript "High-resolution mapping of wood burning appliance hotspots using Energy Performance Certificates: A case study of England and Wales". 

To recreate the analysis on your local device, please carry out the following steps:

  1. Clone the GitHub repository (available at: https://github.com/UCL-Wellcome-Trust-Air-Pollution/EPC_mapping_project_code) to your local device, or download the 'Code.zip' archive and unzip it in your project directory. Please ensure that the directory containing the R Project file is your root directory.

  2. Download the 'Data.tar' file and extract it in the R Project directory. The data should end up in a folder called 'Data' in the root directory. All non-EPC data is provided under the UK Open Government Licence version 3.0. EPC data is provided under licence from the Department for Levelling Up, Housing and Communities: https://epc.opendatacommunities.org/docs/copyright.

  3. Download the main EPC data to your local device and extract it (see below for detailed instructions on how to do this). For Windows users, the 'Scripts' folder of the repository contains a .bat file which can be used to extract the data. Note that this file requires 7-Zip to be installed and added to the system path. Otherwise, the .tar file can be extracted manually.

  4. In an R console, run 'renv::restore()'. This should install all the packages needed to replicate the analysis. On Linux/macOS operating systems, there may be errors when re-installing specific packages from the renv lockfile. If this happens, install the affected package manually from source (install.packages("package_name", type = "source")), then run 'renv::snapshot()' and 'renv::restore()' again. Some packages (e.g. "sf") also require additional system dependencies on Linux. Please install these dependencies before running 'renv::restore()'.

  5. Once the project library has been successfully restored, run the 'run.R' file in the 'Scripts' folder of the repository. You may need to change the 'path_data_epc_folders' variable to the path of the extracted EPC data folders on your local device (see step 3). The full pipeline should now run.

  6. Once you have run the pipeline for the first time, you should see a file called 'data_epc_raw.parquet' in the 'Data/raw/epc_data' folder. If you run the pipeline again, you will be told that the raw EPC data .parquet file already exists and given the option to skip merging the raw data files.
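The steps above can be sketched in R as follows. This is a minimal sketch, not part of the repository: 'check_project_layout' is a hypothetical helper that only reports whether the folders and files named in the steps are present, and the commented-out calls assume that 'Data.tar' sits in the R Project root and that 'run.R' can be run with 'source()'.

```r
# Hypothetical helper (not part of the repository): report which of the
# files and folders named in the steps above exist under `root`.
check_project_layout <- function(root = ".") {
  expected <- c(
    data_folder = file.path(root, "Data"),
    run_script  = file.path(root, "Scripts", "run.R"),
    merged_epc  = file.path(root, "Data", "raw", "epc_data",
                            "data_epc_raw.parquet")
  )
  # Returns a named logical vector, one entry per expected path.
  vapply(expected, file.exists, logical(1))
}

# Typical use from the R Project root (left commented out because these
# calls need the downloaded files and the project's renv lockfile):
# utils::untar("Data.tar")                 # step 2: extract the data archive
# renv::restore()                          # step 4: install locked packages
# print(check_project_layout("."))         # confirm the layout
# source(file.path("Scripts", "run.R"))    # step 5: run the pipeline
```

If 'merged_epc' is reported as TRUE, the pipeline has already produced the merged .parquet file, which is the situation described in step 6.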

Files

EPC_mapping_project_code-main.zip

Files (1.4 GB)

Checksum Size
md5:49afed8c150f5092273d0022e7a744ca 1.4 GB
md5:e32a1c4e1bbfa916586bcd4ae2e45d19 170.5 kB
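
The MD5 checksums above can be used to confirm that a download completed without corruption. The following is a minimal sketch using 'tools::md5sum()' from base R; 'verify_md5' is a hypothetical helper, and the local path in the commented example is an assumption, since the listing does not state which checksum belongs to which file.

```r
# Hypothetical helper: compare a file's MD5 digest with an expected value.
# Accepts the expected digest with or without the "md5:" prefix.
verify_md5 <- function(path, expected) {
  expected <- sub("^md5:", "", tolower(expected))
  actual <- unname(tools::md5sum(path))  # NA if the file cannot be read
  identical(actual, expected)
}

# Example (commented out; "Data.tar" is an assumed local file name and the
# pairing of this checksum with that file is also an assumption):
# verify_md5("Data.tar", "md5:49afed8c150f5092273d0022e7a744ca")
```

A result of FALSE means either that the file is missing or that its contents differ from the published version, in which case downloading it again is the simplest remedy.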

Additional details

Funding

Wellcome Trust
Using novel data linkages to quantify the health impact of non-fossil fuel air pollution 225195/Z/22/Z

Software

Repository URL
https://github.com/UCL-Wellcome-Trust-Air-Pollution/EPC_mapping_project_code
Programming language
R
Development Status
Active