SynPop-DE: Synthetic population of 40 million German households using generative neural networks
Description
This is the data repository for SynPop-DE (Synthetic Population of Germany), accompanying the publication "SynPop-DE: Synthetic population of 40 million German households using generative neural networks".
Introduction: Household microdata combining socio-demographic, housing, income and expenditure attributes are a core resource for many studies in quantitative social science, such as modelling the household-level impacts of the energy transition. Yet no such data are openly available for Germany’s full population. SynPop-DE provides a synthetic population of 40,235,916 households and their 81,629,116 members in all 400 German districts, calibrated to the 2022 census, with 34 attributes per household. Synthetic households are generated by estimating the joint attribute distribution of the German Household Budget Survey with a two-stage machine learning architecture: an autoencoder first compresses the high-dimensional categorical data into a continuous latent space, and a generative adversarial network then learns to sample new records from this representation. These records are then aligned with census marginals for all German districts using iterative proportional updating to ensure spatial representativeness. Validation along three dimensions confirms that the model learns attribute relationships and generates synthetic households that reproduce the statistical properties of the survey data (fidelity), support downstream analyses with accuracy comparable to the original survey (utility), and do not disclose individual respondents (privacy). The dataset is openly available at https://synpop.de.
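The alignment step can be illustrated with a minimal sketch of iterative proportional updating on a toy example. This is not the authors' implementation: the function name `ipu`, the incidence-matrix layout and the toy targets are assumptions made for illustration. Each household carries a weight, and the weights are repeatedly rescaled so that the weighted counts match household-level and person-level control totals at the same time.

```python
def ipu(incidence, targets, max_iterations=500, tolerance=1e-10):
    """Reweight households so weighted counts match the control totals.

    incidence[i][j] is how often control category j occurs in household i
    (1/0 for household-level categories, a person count for person-level ones).
    targets[j] is the control total for category j.
    """
    weights = [1.0] * len(incidence)
    for _ in range(max_iterations):
        largest_adjustment = 0.0
        for j, target in enumerate(targets):
            weighted_total = sum(w * row[j] for w, row in zip(weights, incidence))
            if weighted_total == 0:
                continue  # no household contributes to this category
            factor = target / weighted_total
            largest_adjustment = max(largest_adjustment, abs(factor - 1.0))
            # Only households that contain category j are rescaled.
            weights = [w * factor if row[j] > 0 else w
                       for w, row in zip(weights, incidence)]
        if largest_adjustment < tolerance:
            break
    return weights


# Toy data (hypothetical): three household types in one district.
# Columns: single-person household, multi-person household, adults, children.
incidence = [
    [1, 0, 1, 0],  # one adult living alone
    [0, 1, 2, 0],  # two adults, no children
    [0, 1, 2, 2],  # two adults, two children
]
# Targets chosen so that the weights 40, 35 and 25 satisfy all four controls.
targets = [40, 60, 160, 50]

weights = ipu(incidence, targets)
```

In this toy case the controls are mutually consistent, so the weights settle on values that reproduce every target. With real census marginals the controls may conflict, and the procedure then only approximates them.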
Data: The current version of the SynPop-DE data is always available at https://synpop.de.
Code: The code to reproduce the data can be found here: https://gitlab.pik-potsdam.de/metab/synpopde
Files
regional_populations.zip
Total size: 17.5 GB
| Checksum | Size |
|---|---|
| md5:d92d101e4633b7ea02d91ad825e72b50 | 8.4 GB |
| md5:56e7e30ccb942533c657af65d25f552e | 9.1 GB |
Additional details
Related works
- Is supplement to: Preprint, 10.31235/osf.io/zha8v_v1 (DOI)
Funding
- Federal Ministry for Economic Affairs and Climate Action (03EI5248C)
Software
- Repository URL: https://gitlab.pik-potsdam.de/metab/synpopde
- Development Status: Active