Dataset - High-resolution mapping of wood burning heat sources using Energy Performance Certificates: A case study of England and Wales
Description
This repository contains open data and code to replicate the analysis in the manuscript "High-resolution mapping of wood burning appliance hotspots using Energy Performance Certificates: A case study of England and Wales".
To recreate the analysis on your local device, please carry out the following steps:
1. Clone the GitHub repository (available at: https://github.com/UCL-Wellcome-Trust-Air-Pollution/EPC_mapping_project_code) to your local device, or download the codebase from the 'Code.zip' file and unzip it in your project directory. Use the directory containing the R Project as your root directory.
2. Download the 'Data.tar' file and extract it in the R Project directory. The data should end up in a folder called 'Data' in the root directory. All non-EPC data is provided under the UK Open Government Licence version 3.0. EPC data is provided under licence from the Department for Levelling Up, Housing and Communities: https://epc.opendatacommunities.org/docs/copyright.
3. Download the main EPC data to your local device and extract it (see below for detailed instructions on how to do this). For Windows users, the 'Scripts' folder of the repository contains a .bat file which can be used to extract the data. This file requires 7-Zip to be installed and added to the system path. Otherwise, the .tar file can be extracted manually.
4. In an R console, run 'renv::restore()'. This should install all the packages needed to replicate the analysis. Some packages (e.g. "sf") require additional system dependencies on Linux; install these before running 'renv::restore()'. On Linux/macOS, there may be errors when re-installing specific packages from the renv lockfile. If this happens, install the package manually from source ('install.packages("package_name", type = "source")'), then run 'renv::snapshot()' and 'renv::restore()' again.
5. Once the project library has been restored successfully, run the 'run.R' file in the 'Scripts' folder of the repository. You may need to change the 'path_data_epc_folders' variable to the path of the extracted EPC data folders on your local device (see step 3). The full pipeline should now run.
6. After the first pipeline run, you should see a file called 'data_epc_raw.parquet' in the 'Data/raw/epc_data' folder. If you run the pipeline again, you will be notified that the raw EPC data .parquet file already exists and given the option to skip the merging of raw data files.
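Before restoring packages and running the pipeline, it can help to confirm that the project root looks the way the steps above describe. The following is a minimal sketch in base R, not part of the repository: the helper names `check_project_layout` and `raw_epc_parquet_exists` are hypothetical, and the paths are taken only from the description above ('Data', 'Scripts/run.R', 'Data/raw/epc_data/data_epc_raw.parquet').

```r
# Hypothetical helper (not part of the repository): report which of the
# expected top-level items are present under the project root.
check_project_layout <- function(root = ".") {
  expected <- c(
    data_dir    = file.path(root, "Data"),
    scripts_dir = file.path(root, "Scripts"),
    run_script  = file.path(root, "Scripts", "run.R")
  )
  present <- file.exists(expected)  # TRUE for existing files and directories
  names(present) <- names(expected)
  present
}

# Hypothetical helper: TRUE if the merged raw EPC file from an earlier
# pipeline run already exists, in which case the merge can be skipped.
raw_epc_parquet_exists <- function(root = ".") {
  file.exists(
    file.path(root, "Data", "raw", "epc_data", "data_epc_raw.parquet")
  )
}

# Example: run from the directory that contains the R Project file.
layout <- check_project_layout(".")
if (!all(layout)) {
  message(
    "Missing from project root: ",
    paste(names(layout)[!layout], collapse = ", ")
  )
}
```

If every entry is `TRUE`, the remaining steps are the documented ones: `renv::restore()` to install the project library, then running 'Scripts/run.R'. The check itself only reads the file system and changes nothing.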
Files
EPC_mapping_project_code-main.zip
Total size: 1.4 GB
MD5 checksum | Size |
---|---|
49afed8c150f5092273d0022e7a744ca | 1.4 GB |
e32a1c4e1bbfa916586bcd4ae2e45d19 | 170.5 kB |
Additional details
Funding
Software
- Repository URL: https://github.com/UCL-Wellcome-Trust-Air-Pollution/EPC_mapping_project_code
- Programming language: R
- Development Status: Active