Published May 25, 2022 | Version v1.0.0
Dataset Open

A subsection of England and Wales EPC households, joined with PPD data, used for simulation modelling

  • 1. Centre for Net Zero

Description

If you want to give feedback on this dataset, or wish to request it in another format (e.g. CSV), please fill out this survey here. We are a not-for-profit research organisation keen to see how others use our open models and tools, so all feedback is appreciated! It's a short form that takes 5 minutes to complete.

Important Note: Before downloading this dataset, please read the License and Software Attribution section at the bottom.

This dataset accompanies Centre for Net Zero's report "Hitting the Target". In that work, we simulate a range of interventions to model the situations in which we believe the UK will meet its target of 600,000 heat pump installations per year by 2028. For full modelling assumptions and findings, read our report on our website.

The code for running our simulation is open source here.

This dataset contains over 9 million households that have been address-matched between Energy Performance Certificates (EPC) data and Price Paid Data (PPD). The code for our address matching is here. Since both source datasets are published under the Open Government Licence (OGL), this dataset is too. We select and model specific columns from the various source datasets, as set out in the methodology section of our report, to simplify and clean up the data for academic use. License information is also available in the appendix of our report above.

The EPC data loaders can be found here (the data is here) and the rest of the schemas and data download locations can be found here.

Note that this dataset is not regularly maintained or updated. It is correct as of January 2022. The data was curated and tested using dbt via this GitHub repository and would be simple to rerun on the latest data.

The schema / data dictionary for this data can be found here.

Our recommended way of loading this data is in Python. After downloading all "parts" of the dataset to a folder, you can run:

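```
import pandas as pd

# Reads every Parquet part in the folder into a single DataFrame.
# Requires a Parquet engine such as pyarrow or fastparquet.
data = pd.read_parquet("path/to/data/folder/")
```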
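With over 9 million rows, loading every column at once may use more memory than necessary. The following is a minimal sketch of reading the parts one at a time and keeping only the columns you need. The helper names `list_parts` and `iter_part_frames` are hypothetical, and the sketch assumes the downloaded parts carry a `.parquet` extension, which you should check against your own download.

```python
from pathlib import Path


def list_parts(folder, suffix=".parquet"):
    """Return the part files in `folder` in a stable, sorted order.

    Assumption: the downloaded parts end in `.parquet`.
    """
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix == suffix
    )


def iter_part_frames(folder, columns=None):
    """Yield one DataFrame per part file, reading only `columns` if given."""
    # Imported here so the file listing above works without pandas installed.
    import pandas as pd  # also needs a Parquet engine such as pyarrow

    for part in list_parts(folder):
        yield pd.read_parquet(part, columns=columns)
```

For example, `for frame in iter_part_frames("path/to/data/folder/", columns=["address_cluster_id"]): ...` would process each part in turn, assuming the data contains a column of that name (see the schema linked above for the actual column names).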

 

Licenses and software attribution:

For EPC, PPD and UK House Price Index data:

For the EPC data, we are permitted to republish this provided we state that all researchers who download this dataset must follow these copyright restrictions. We do not explicitly release any Royal Mail address data; instead, we use these fields to generate a pseudonymised "address_cluster_id", which reflects a unique combination of the address lines and postcodes, as well as other metadata. Under ICO and GDPR guidelines, this still counts as personal data, but we have taken measures to pseudonymise it as far as possible to fulfil our obligations as a data processor. You must read this carefully before downloading the data, and ensure that you are using it for the research purposes as determined by this copyright notice.
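For readers unfamiliar with pseudonymised identifiers, the following is a minimal sketch of how an ID of this general kind could be derived with a keyed hash. It is not the method used to build this dataset. The function name, the normalisation rules, the example address and the key are all hypothetical and chosen only for illustration.

```python
import hashlib
import hmac


def pseudonymise_address(address_lines, postcode, secret_key):
    """Return a stable pseudonymous ID for an address (hypothetical scheme)."""
    # Normalise case and whitespace so trivially different spellings match.
    parts = [" ".join(line.upper().split()) for line in address_lines]
    parts.append("".join(postcode.upper().split()))
    message = "|".join(parts).encode("utf-8")
    # HMAC-SHA256: the same address always maps to the same ID, but the ID
    # cannot be recomputed from an address without the secret key.
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


key = b"example-key-not-a-real-secret"  # placeholder, not a real secret
id_a = pseudonymise_address(["1 Example Street", "Exampletown"], "AB1 2CD", key)
id_b = pseudonymise_address(["1  example street", "EXAMPLETOWN"], "ab12cd", key)
id_c = pseudonymise_address(["2 Example Street", "Exampletown"], "AB1 2CD", key)
```

Here `id_a` and `id_b` are equal because both inputs normalise to the same address, while `id_c` differs because the address differs.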

Contains HM Land Registry data © Crown copyright and database right 2021. This data is licensed under the Open Government Licence v3.0.

Contains OS data © Crown copyright and database right 2022.

Contains Office for National Statistics data licensed under the Open Government Licence v.3.0.

The OGL v3.0 license states that we are free to:

  • copy, publish, distribute and transmit the Information;
  • adapt the Information;
  • exploit the Information commercially and non-commercially for example, by combining it with other Information, or by including it in your own product or application.

However, where we do any of the above, we must:

  • acknowledge the source of the Information in your product or application by including or linking to any attribution statement specified by the Information Provider(s) and, where possible, provide a link to this licence.

You can see more information here.

For XOServe Off Gas Postcodes:

This dataset has been released openly for all uses here.

For the address matching:

GNU Parallel: O. Tange (2018): GNU Parallel 2018, March 2018, https://doi.org/10.5281/zenodo.1146014

Notes

If you have any questions regarding this dataset, please email research@centrefornetzero.org. For broader questions about Centre for Net Zero, including questions about licenses/copyright, please email info@centrefornetzero.org.

Files (314.0 MB)

  • md5:683b39f25c8e1c31f8f2a6d96cacc550 (24.2 MB)
  • md5:6f9a485d41369fbdc55387b695853d58 (24.2 MB)
  • md5:1bb3a1b780bac05ba53a3eeee3a4dc0d (24.1 MB)
  • md5:0dc6cb744f6dc668b2ade72e0da611bb (24.2 MB)
  • md5:abad69ead0829a6488e6235f6787312d (24.2 MB)
  • md5:bede0b051208f8eb9e3d3165b12ea012 (24.1 MB)
  • md5:56354977e41f125c94753a90ec223688 (24.1 MB)
  • md5:5f00132834347e091d11af7552c1e08c (24.1 MB)
  • md5:fa9abd3255a4c4d465013c1a83c5f15a (24.2 MB)
  • md5:216bcf313d476edccd23547a7cffc10d (24.2 MB)
  • md5:f6fd06e4f15463f0bdf4cbe41f7a349b (24.2 MB)
  • md5:a2282840c78dbdce85db6a9cf65b169a (24.2 MB)
  • md5:e971994e68f7c73bb5d6765b946865eb (24.1 MB)
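
The MD5 checksums in the file listing can be used to confirm that each part downloaded intact. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `verify_parts` are hypothetical, and because the listing here does not pair checksums with file names, the sketch only checks that each file's checksum appears somewhere in the expected set.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_parts(folder, expected_md5s):
    """Map each file name in `folder` to True if its MD5 is an expected one."""
    expected = {value.lower() for value in expected_md5s}
    return {
        path.name: md5_of_file(path) in expected
        for path in sorted(Path(folder).iterdir())
        if path.is_file()
    }
```

To use it, copy the checksums from the listing (without the `md5:` prefix) into a list and call `verify_parts("path/to/data/folder/", checksums)`. Any file mapped to `False` should be downloaded again.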