Published September 17, 2025 | Version v2
Poster Open

AI-driven environmental epidemiology: A generalisable framework to predict cardiovascular health using fair and explainable machine learning

Description

The interplay between environmental and socioeconomic factors and health is not yet well understood, but is increasingly important in the context of climate change. Machine learning has the power to uncover these complex relationships, but it is underutilised in environmental epidemiology. We present a generalisable, explainable machine learning framework for predicting continuous cardiovascular health outcomes from environmental, demographic, and clinical data. The framework incorporates model comparison, SHAP-based feature attribution, and subgroup-based fairness evaluation.
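As a rough illustration of what subgroup-based fairness evaluation can look like for a continuous outcome, the minimal sketch below computes the mean absolute error of each model within each subgroup and reports the largest gap between subgroups. It is not the poster's implementation: the function names `subgroup_mae` and `fairness_gap`, the choice of mean absolute error as the metric, the subgroup labels, the model names, and all numbers are assumptions made up for this example.

```python
from collections import defaultdict


def subgroup_mae(y_true, y_pred, groups):
    """Return the mean absolute error per subgroup as {group: mae}."""
    abs_errors = defaultdict(list)
    for truth, pred, group in zip(y_true, y_pred, groups):
        abs_errors[group].append(abs(truth - pred))
    return {g: sum(errs) / len(errs) for g, errs in abs_errors.items()}


def fairness_gap(per_group_error):
    """Largest difference in error between any two subgroups."""
    values = list(per_group_error.values())
    return max(values) - min(values)


# Toy, made-up values for illustration only (not data from the study).
y_true = [120.0, 130.0, 140.0, 150.0]
groups = [
    "low_deprivation",
    "low_deprivation",
    "high_deprivation",
    "high_deprivation",
]
# Hypothetical predictions from two models being compared.
predictions = {
    "linear_regression": [122.0, 128.0, 150.0, 140.0],
    "gradient_boosting": [121.0, 129.0, 144.0, 146.0],
}

for model_name, y_pred in predictions.items():
    per_group = subgroup_mae(y_true, y_pred, groups)
    print(model_name, per_group, "gap:", fairness_gap(per_group))
```

Reporting the per-subgroup error alongside the gap keeps two questions separate: how well a model predicts overall, and whether its errors are concentrated in one subgroup.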

We illustrate the approach with two large-scale datasets: (1) gridded environmental and socioeconomic data to predict cardiovascular mortality, and (2) individual-level cohort data (n > 200,000) from the NAKO health study to predict systolic blood pressure and BMI. 

We find that machine learning models outperform traditional regression in predicting cardiovascular mortality, especially in deprived areas. Environmental features, such as water coverage, also emerge as important predictors.

This work was funded by the Helmholtz Association's Initiative and Networking Fund (INF): ZT-I-PF-5-42, and the Health Data Research UK-Turing Wellcome Studentship Transition Fund: G109427.

Files

2025-05 Helmholtz AI conference poster Claire final.pdf

Files (248.6 MB)