Published February 22, 2021 | Version v1
Dataset Open

Hybrid gridded demographic data for China, 1979-2100

  • 1. Tsinghua University
  • 2. China University of Geosciences (Beijing)

Description

This is a hybrid gridded dataset of demographic data for China from 1979 to 2100. It provides yearly population counts for 21 five-year age groups, split by gender, at a 0.5-degree grid resolution.

The historical part (1979-2020) of this dataset combines the NASA SEDAC Gridded Population of the World version 4 (GPWv4, UN WPP-Adjusted Population Count) with gridded population from the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP, Histsoc gridded population data).

The projection part (2010-2100) of this dataset is resampled directly from the data of Chen et al. published in Scientific Data.

This dataset covers 31 provincial-level administrative divisions of China: 22 provinces, 5 autonomous regions, and 4 municipalities directly under the control of the central government. Taiwan, Hong Kong, and Macao are excluded because of missing data.

Method - demographic fractions by age and gender in 1979-2020

Age- and gender-specific demographic data by grid cell for each province in China are derived by combining the historical gridded demographic data for 1979-2020 with the national population census data provided by the National Bureau of Statistics of China.

To combine the national population census data with the historical demographics, we constructed provincial fractions of the population in each age group and gender from the fourth, fifth and sixth national population censuses, which are applied to the years 1979-1990, 1991-2000 and 2001-2020, respectively. The provincial fractions are computed as:

\( f_{\mathrm{year,province,age,gender}} = \begin{cases} \mathrm{POP}_{\mathrm{1990,province,age,gender}}^{4^{\mathrm{th}}\mathrm{census}} / \mathrm{POP}_{\mathrm{1990,province}}^{4^{\mathrm{th}}\mathrm{census}}, & 1979 \le \mathrm{year} \le 1990 \\ \mathrm{POP}_{\mathrm{2000,province,age,gender}}^{5^{\mathrm{th}}\mathrm{census}} / \mathrm{POP}_{\mathrm{2000,province}}^{5^{\mathrm{th}}\mathrm{census}}, & 1991 \le \mathrm{year} \le 2000 \\ \mathrm{POP}_{\mathrm{2010,province,age,gender}}^{6^{\mathrm{th}}\mathrm{census}} / \mathrm{POP}_{\mathrm{2010,province}}^{6^{\mathrm{th}}\mathrm{census}}, & 2001 \le \mathrm{year} \le 2020 \end{cases} \)

Where:

-    \( f_{\mathrm{year,province,age,gender}} \) is the fraction of a province's population in a given age group and gender, taken from the national census that applies to the given year within 1979-2020.

-    \( \mathrm{POP}_{\mathrm{year,province,age,gender}}^{X^{\mathrm{th}}\mathrm{census}} \) is the population of a given age group and gender in each province from the Xth national census.

-    \( \mathrm{POP}_{\mathrm{year,province}}^{X^{\mathrm{th}}\mathrm{census}} \) is the total population for all ages and both genders in each province from the Xth national census.
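The piecewise definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`census_year_for`, `provincial_fractions`) and the toy counts are hypothetical and are not part of the dataset or the authors' code.

```python
# Year windows and census reference years, following the piecewise definition above.
CENSUS_WINDOWS = [
    (1979, 1990, 1990),  # fourth census
    (1991, 2000, 2000),  # fifth census
    (2001, 2020, 2010),  # sixth census
]


def census_year_for(year):
    """Return the census reference year whose fractions apply to `year`."""
    for first, last, census_year in CENSUS_WINDOWS:
        if first <= year <= last:
            return census_year
    raise ValueError(f"year {year} is outside 1979-2020")


def provincial_fractions(census_counts):
    """Divide each (age group, gender) count by the provincial total."""
    total = sum(census_counts.values())
    return {key: count / total for key, count in census_counts.items()}


# Made-up counts for one province and two age groups (not real census data).
toy_counts = {
    ("0-4", "male"): 30, ("0-4", "female"): 20,
    ("5-9", "male"): 25, ("5-9", "female"): 25,
}
fractions = provincial_fractions(toy_counts)
```

By construction the fractions of one province sum to 1, so multiplying them by a total population redistributes that total across age and gender groups without changing it.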

Method - demographic totals by age and gender in 1979-2020

The yearly gridded population for 1979-1999 is taken from the ISIMIP Histsoc gridded population data, and that for 2000-2020 from the GPWv4 demographic data adjusted by the UN WPP (UN WPP-Adjusted Population Count, v4.11, https://beta.sedac.ciesin.columbia.edu/data/set/gpw-v4-population-count-adjusted-to-2015-unwpp-country-totals-rev11), which combines the spatial distribution of demographics from GPWv4 with the temporal trends from the UN WPP to improve accuracy. These two gridded time series are simply joined at the cut-over date to give a single dataset - historical demographic data covering 1979-2020.
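The join at the cut-over date can be pictured as follows. This is a minimal sketch, assuming each source is held as a mapping from year to a grid (plain numbers stand in for grids here); the function name `join_at_cutover` and the values are hypothetical.

```python
CUTOVER_YEAR = 2000  # first year taken from GPWv4, per the text above


def join_at_cutover(isimip_by_year, gpw_by_year, cutover=CUTOVER_YEAR):
    """Take years before `cutover` from ISIMIP and the remaining years from GPWv4."""
    joined = {y: v for y, v in isimip_by_year.items() if y < cutover}
    joined.update({y: v for y, v in gpw_by_year.items() if y >= cutover})
    return dict(sorted(joined.items()))


# Toy per-year values standing in for whole grids (made-up numbers).
isimip = {1998: 10.0, 1999: 11.0, 2000: 12.0}
gpw = {1999: 10.5, 2000: 12.4, 2001: 13.0}
history = join_at_cutover(isimip, gpw)
```

No blending or smoothing is applied in this sketch, so the series can show a step between 1999 and 2000 wherever the two sources disagree.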

Next, each grid cell of the historical demographic data is assigned to a province by using a gridded provincial code lookup dataset and a name lookup table. The age- and gender-specific fractions of that province were then multiplied by the historical population of the grid cell to obtain the total population by age and gender per grid cell for China in 1979-2020.
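One way to picture this step is an array lookup with NumPy. The sketch below assumes an integer province code per grid cell and one fraction per province for a single age and gender group; the grids, codes and fractions are all made up, and the authors' actual implementation is not described in the source.

```python
import numpy as np

# Hypothetical 2x3 grid of province codes (0 = outside the study area).
province_code = np.array([[1, 1, 2],
                          [0, 2, 2]])

# Total population per cell for one year (made-up numbers).
population = np.array([[100.0, 200.0, 50.0],
                       [0.0, 80.0, 20.0]])

# Fraction of one age/gender group for each province code (made-up numbers).
fraction_by_code = {1: 0.04, 2: 0.05}

# Build a lookup array indexed by province code, then map it onto the grid.
lookup = np.zeros(max(fraction_by_code) + 1)
for code, fraction in fraction_by_code.items():
    lookup[code] = fraction
fraction_grid = lookup[province_code]

# Population of the chosen age/gender group in each cell.
group_population = population * fraction_grid
```

Every cell in a province shares the same fraction, so the spatial pattern within a province comes entirely from the gridded totals.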

Method - demographic totals and fractions by age and gender in 2010-2100

The gridded population count data for 2010-2100 under different shared socioeconomic pathway (SSP) scenarios are drawn from Chen et al., published in Scientific Data, at a resolution of 1 km (~0.008333 degree). We resampled the data to 0.5 degree by summing the population counts of all 1 km cells within each 0.5-degree cell to obtain the future population per cell.
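Aggregating counts to a coarser grid amounts to a block sum. The sketch below assumes the fine grid aligns exactly with the coarse cell edges and contains no missing values; the function name `aggregate_counts` and the toy grid are hypothetical, and the source does not state which tool the authors used.

```python
import numpy as np


def aggregate_counts(fine, factor):
    """Sum non-overlapping factor x factor blocks of a 2-D count grid."""
    rows, cols = fine.shape
    if rows % factor or cols % factor:
        raise ValueError("grid shape must be a multiple of the factor")
    blocks = fine.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.sum(axis=(1, 3))


# 0.5 degree / (1/120 degree) = 60 fine cells along each edge of a coarse cell.
FACTOR = 60
fine = np.ones((120, 180))  # toy grid: one person per fine cell
coarse = aggregate_counts(fine, FACTOR)
```

Summing rather than averaging is what keeps the quantity a population count, so the total over the domain is unchanged by the resampling.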

This previously published dataset also provides the age- and gender-specific population of each province, so we calculated the fraction of each age and gender group at the provincial level. We then multiplied these fractions by the gridded population count to get the total population per age group per cell for each gender.

Note that the projected population data from Chen et al.'s dataset cover 2010-2020, and the historical population in our dataset also covers 2010-2020. The two datasets may differ over this overlapping period because the original population data come from different sources and are calculated with different methods.

Disclaimer

This dataset is a hybrid of different datasets with independent methodologies. Spatial or temporal consistency across dataset boundaries cannot be guaranteed.

Files

Files (4.1 GB)

Checksum (Size)
md5:69e68e2a2537d273d7a9eacb606cca88 (204.8 MB)
md5:b3c185cad2e292ab2f654ef01384cc5a (307.1 MB)
md5:1c5f23da71f61fc5539c06ec568e3426 (307.1 MB)
md5:57a76acd9609bbd0c3f866a6182881b1 (307.1 MB)
md5:611ac1b76251fe5a720b8529b41a008a (307.1 MB)
md5:307c1ca63e4fc7af9a3fdba5bbd58740 (307.1 MB)
md5:c2db2589ab33eb5da5577173cbaa646d (307.1 MB)
md5:2b7ea888acfdf2c38fe9aa13f1405062 (204.8 MB)
md5:d7ad62a51c73e6eb22590db6e819a1f2 (307.1 MB)
md5:3f246c5d420082cac281e6a5acee8d1f (307.1 MB)
md5:0eaf8363df3a319f5bff91b76393d076 (307.1 MB)
md5:0eaf8363df3a319f5bff91b76393d076 (307.1 MB)
md5:a15b46fdbee1804deef511a9356e27a6 (307.1 MB)
md5:2d33803943037b188ffeed9cd7c3f665 (307.1 MB)
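
A downloaded file can be checked against the MD5 values listed above with the Python standard library. This is a minimal sketch; the function names are hypothetical, and the path passed in is whatever local file name the download was saved under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:69e6...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use low, which matters for files of several hundred megabytes.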

Additional details

Related works

References
Journal article: 10.1038/s41597-020-0421-y (DOI)
Dataset: 10.5281/zenodo.3768003 (DOI)
