There is a newer version of the record available.

Published February 13, 2025 | Version v5
Dataset Open

Dataset - High-resolution mapping of wood burning heat sources using Energy Performance Certificates: A case study of England and Wales

  • 1. University College London

Description

This repository contains open data and code to replicate the analysis in the manuscript "High-resolution mapping of wood burning appliance hotspots using Energy Performance Certificates: A case study of England and Wales". 

To recreate the analysis on your local device, please carry out the following steps:

  1. Clone the GitHub repository (available at: https://github.com/UCL-Wellcome-Trust-Air-Pollution/EPC_mapping_project_code) to your local device, or download the 'Code.zip' archive and unzip it in your project directory. Please ensure you use the directory containing the R Project file as your root directory.

  2. Download the 'Data.tar' file and extract it in the R Project directory. The data should then be in a folder called 'Data' in the root directory. All non-EPC data is provided under the UK Open Government Licence version 3.0. EPC data is provided under licence from the Department for Levelling Up, Housing and Communities: https://epc.opendatacommunities.org/docs/copyright.

  3. Download the main EPC data to your local device and unzip it (see below for detailed instructions on how to do this). For Windows users, the 'Scripts' folder of the repository contains a .bat file which can be used to unzip the data. Note that this file requires 7-Zip to be installed and added to the system PATH. Otherwise, the .tar file can be extracted manually.

  4. Run the 'run.R' file in the 'Scripts' folder of the repository. You may need to change the 'path_data_epc_folders' variable to the path of the unzipped EPC data folders on your local device (see step 3). The full pipeline should now run.

  5. Once you have run the pipeline for the first time, you should see a file called 'data_epc_raw.parquet' in the 'Data/raw/epc_data' folder. If you run the pipeline again, you will be prompted that the raw EPC data .parquet file already exists, and you have the option to skip the merging of raw data files.
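The extraction in step 2 and the check described in step 5 can be sketched in base R as follows. This is a minimal illustration, not part of the repository: the helper names `extract_data_archive` and `raw_epc_parquet_exists` are hypothetical, and the sketch assumes 'Data.tar' sits in the R Project root and unpacks into a top-level 'Data' folder, as the steps above describe.

```r
# Hypothetical helper: extract 'Data.tar' into the project root and confirm
# that the expected 'Data' folder is present afterwards.
extract_data_archive <- function(tar_path = "Data.tar", root_dir = ".") {
  if (!file.exists(tar_path)) {
    stop("Archive not found: ", tar_path)
  }
  utils::untar(tar_path, exdir = root_dir)
  data_dir <- file.path(root_dir, "Data")
  if (!dir.exists(data_dir)) {
    stop("Expected a 'Data' folder in ", root_dir, " after extraction.")
  }
  invisible(data_dir)
}

# Hypothetical helper: report whether the merged raw EPC file from a previous
# pipeline run already exists (the file named in step 5).
raw_epc_parquet_exists <- function(root_dir = ".") {
  file.exists(
    file.path(root_dir, "Data", "raw", "epc_data", "data_epc_raw.parquet")
  )
}
```

Calling `extract_data_archive()` from the project root, followed by `raw_epc_parquet_exists()` after a pipeline run, gives a quick check that the folder layout matches what 'run.R' expects. The 'path_data_epc_folders' variable still needs to be set as described in step 4.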

Files

Code.zip

Files (1.9 GB)

Name Size
md5:1668cadf39da2a68f06f73fc196acff9 82.3 kB
md5:34d7b5f83f7add368318003557a3efa2 1.9 GB
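The MD5 checksums listed above can be used to confirm that a download completed intact. The sketch below uses `tools::md5sum` from base R. The helper name `verify_md5` is hypothetical, and the pairing of each checksum with a file name is an assumption based on file size (the listing does not name the files), so treat the commented usage lines as illustrative only.

```r
# Hypothetical helper: compare a file's MD5 checksum with an expected value.
verify_md5 <- function(path, expected_md5) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  actual <- unname(tools::md5sum(path))
  identical(tolower(actual), tolower(expected_md5))
}

# Illustrative usage. The file-to-checksum pairing is assumed, not stated
# in the listing:
# verify_md5("Code.zip", "1668cadf39da2a68f06f73fc196acff9")
# verify_md5("Data.tar", "34d7b5f83f7add368318003557a3efa2")
```

A result of `FALSE` suggests the download is incomplete or corrupted and should be repeated before extracting the archive.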

Additional details

Funding

Wellcome Trust
Using novel data linkages to quantify the health impact of non-fossil fuel air pollution 225195/Z/22/Z

Software

Repository URL
https://github.com/UCL-Wellcome-Trust-Air-Pollution/EPC_mapping_project_code
Programming language
R
Development Status
Active