Published February 22, 2026 | Version v1.1
Dataset Open

Synthetic data Using Population Profiles for cardiOvascular Risk facTors (SUPPORT) dataset

Description

The SUPPORT (Synthetic data Using Population Profiles for cardiOvascular Risk facTors) dataset is a large-scale resource comprising 777,358,492 synthetic individuals aged 35–84 across seven geographic regions of Mainland China (Central, East, North, Northeast, Northwest, South, and Southwest) for 2020. Each synthetic individual has a detailed profile of sociodemographic attributes and major cardiovascular disease (CVD) risk factors, including blood pressure, cholesterol levels, body mass index, and history of diabetes. The population was constructed using iterative proportional fitting, multivariate normal distribution sampling, and multiple imputation, integrating data from China's Seventh National Population Census (2020), the Global Burden of Disease (GBD) study, and numerous health surveys. Technical validation against census statistics and independent cohorts, including the China Kadoorie Biobank, confirmed that the dataset accurately replicates marginal sociodemographic distributions and adequately approximates the cardiovascular risk profiles of real-world populations. The R scripts used to generate and validate the synthetic dataset are publicly available on GitHub (https://github.com/pkuepi/SUPPORT). SUPPORT is open-source and can be extended with additional attributes. This resource enables individual-level modeling of CVD and is publicly available at https://doi.org/10.5281/zenodo.17406895.
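For readers unfamiliar with iterative proportional fitting, the following is a minimal two-dimensional sketch of the general technique in Python. It is not the authors' implementation (their R scripts in the GitHub repository are the reference), and the seed table, targets, and the function name `ipf` are hypothetical toy values chosen only for illustration.

```python
def ipf(seed, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Generic 2-D iterative proportional fitting (illustrative only).

    Rescales a positive seed table until its row and column sums
    match the given marginal targets.
    """
    table = [row[:] for row in seed]  # work on a copy of the seed
    for _ in range(max_iter):
        # Step 1: scale each row so it sums to its target.
        for i, target in enumerate(row_targets):
            total = sum(table[i])
            if total > 0:
                table[i] = [v * target / total for v in table[i]]
        # Step 2: scale each column so it sums to its target.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in table)
            if total > 0:
                for row in table:
                    row[j] *= target / total
        # Columns now match exactly; stop once the rows match too.
        max_err = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if max_err < tol:
            break
    return table


# Hypothetical toy inputs, NOT taken from the SUPPORT data.
seed = [[1.0, 2.0], [3.0, 4.0]]   # prior joint counts (assumed)
row_targets = [40.0, 60.0]        # e.g. totals for two age groups
col_targets = [55.0, 45.0]        # e.g. totals for two sexes
fitted = ipf(seed, row_targets, col_targets)
```

The fitted table keeps the association structure of the seed while reproducing both sets of marginal totals, which is the property that lets census margins constrain a synthetic population.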

Provinces, autonomous regions, and municipalities in the seven regions of Mainland China*:

  • Central: Henan, Hubei, Hunan
  • East: Shanghai, Jiangsu, Zhejiang, Anhui, Fujian, Jiangxi, Shandong
  • North: Beijing, Tianjin, Hebei, Shanxi, Inner Mongolia
  • Northeast: Liaoning, Jilin, Heilongjiang
  • Northwest: Shaanxi, Gansu, Qinghai, Ningxia, Xinjiang
  • South: Guangdong, Guangxi, Hainan
  • Southwest: Chongqing, Sichuan, Guizhou, Yunnan, Xizang (Tibet)

* Note: The synthetic population refers to the population of the 31 provinces, autonomous regions and municipalities of the Chinese mainland, excluding residents in Hong Kong SAR, Macao SAR and Taiwan region.
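When aggregating or filtering by region, the grouping above can be encoded as a lookup table. The sketch below is in Python and simply transcribes the list; whether the data files label regions or provinces with these exact strings is an assumption that should be checked against the files themselves. The names `REGIONS` and `PROVINCE_TO_REGION` are illustrative.

```python
# Region membership transcribed from the list above.
REGIONS = {
    "Central": ["Henan", "Hubei", "Hunan"],
    "East": ["Shanghai", "Jiangsu", "Zhejiang", "Anhui",
             "Fujian", "Jiangxi", "Shandong"],
    "North": ["Beijing", "Tianjin", "Hebei", "Shanxi", "Inner Mongolia"],
    "Northeast": ["Liaoning", "Jilin", "Heilongjiang"],
    "Northwest": ["Shaanxi", "Gansu", "Qinghai", "Ningxia", "Xinjiang"],
    "South": ["Guangdong", "Guangxi", "Hainan"],
    "Southwest": ["Chongqing", "Sichuan", "Guizhou", "Yunnan", "Xizang"],
}

# Reverse lookup: province-level unit -> region.
PROVINCE_TO_REGION = {
    province: region
    for region, provinces in REGIONS.items()
    for province in provinces
}

# Sanity check against the note above: 31 province-level units in total.
n_units = len(PROVINCE_TO_REGION)
```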

Files

Files (7.1 GB)

MD5 checksum                          Size
md5:c7bd3db9c0949bbfcb5b895cefe97db5  558.6 MB
md5:c9c7e61b0bd65a2e7da1ea67934fbb99  557.0 MB
md5:f19bbcafb35a24e57edc333c5059c3c7  1.1 GB
md5:e76e1ec6cd999c38f68a68b81e524cec  1.1 GB
md5:31a5ce7ff94c51bd5a7c78380eb539cf  446.6 MB
md5:8f4ebfdcac9d79eac87a9abab78599e8  457.5 MB
md5:c42cf593e6eba9127538bac2bee0f38f  309.3 MB
md5:13e7894a1f31066a799bddff76c1f5bc  305.3 MB
md5:64e337f16b120b3654d6fbfea064306f  244.3 MB
md5:57546dbf365ab13c75023c8fcc512379  255.6 MB
md5:3273df5b3051f9d0e6b03936b4157047  387.4 MB
md5:906dfec569b0c1c2c4df4e3213989d58  414.8 MB
md5:d6c4b9b41b49bf49a96139c5f92e31e5  497.6 MB
md5:6b5fa7e7c7d021cad8ea14d697dcd5aa  513.8 MB
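
Because the files are large, it may be worth verifying each download against its listed MD5 checksum. The following is a minimal Python sketch using only the standard library; the helper names and the example path are hypothetical, and the file is read in chunks so it never has to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Usage (the path is a placeholder; substitute the downloaded file):
# ok = matches_listed_md5("path/to/downloaded_file",
#                         "md5:c7bd3db9c0949bbfcb5b895cefe97db5")
```

A mismatch usually indicates an incomplete or corrupted download, in which case the file should be fetched again.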

Additional details

Funding

National Natural Science Foundation of China
82373662

Dates

Updated
2026-02-22

Software

Repository URL
https://github.com/pkuepi/SUPPORT
Programming language
R
Development Status
Active