Time-Series ICU Patient Deterioration Predictor
Authors/Creators
Description
This repository presents a dual-architecture machine learning system for early detection of clinical deterioration in intensive care unit (ICU) patients. The system compares gradient-boosted decision trees (LightGBM) with temporal convolutional networks (TCN) to model complementary aspects of physiological risk using routinely collected clinical observations. Three NEWS2-derived deterioration outcomes are considered: maximum risk level attained during the ICU stay (max_risk), median sustained risk level across the stay (median_risk), and the proportion of time spent in a high-risk state (pct_time_high).
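As a rough illustration of how the three outcomes relate to one another, the sketch below collapses a single stay's sequence of risk levels into the three targets. It is not taken from the repository: the function name `summarise_stay`, the integer encoding of risk bands (with 3 meaning high), and the use of the share of observations as a stand-in for the share of time are all assumptions made for this example.

```python
from statistics import median

HIGH = 3  # assumed integer code for the NEWS2 "high" risk band (hypothetical encoding)


def summarise_stay(risk_levels):
    """Collapse one ICU stay's per-observation risk levels into three targets.

    Assumes observations are roughly evenly spaced, so the share of
    observations at HIGH approximates the share of time spent at HIGH.
    """
    if not risk_levels:
        raise ValueError("risk_levels must contain at least one observation")
    n_high = sum(1 for level in risk_levels if level == HIGH)
    return {
        "max_risk": max(risk_levels),
        "median_risk": median(risk_levels),
        "pct_time_high": n_high / len(risk_levels),
    }


# Illustrative (made-up) stay with eight observations
example = summarise_stay([0, 1, 1, 3, 3, 2, 1, 0])
```

In this made-up stay the patient reaches the high band twice, so `max_risk` is 3 while `median_risk` stays low, which is the kind of divergence between acute and sustained risk that the three targets are meant to separate.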
Models are trained and evaluated using the PhysioNet MIMIC-IV Clinical Demo v2.2 dataset via two distinct feature-engineering pipelines. The TCN operates on high-resolution timestamp-level temporal features (96-hour windows, 171 features) to capture short-term physiological instability, while the LightGBM model uses patient-level aggregated tabular features (40 features) to characterise longer-term exposure to risk. Comparative evaluation indicates complementary performance profiles: LightGBM exhibits superior calibration and regression fidelity for sustained risk estimation, while the TCN shows stronger sensitivity and discrimination for acute deterioration events. Performance is assessed using ROC-AUC, Brier score, and R², alongside interpretability analyses based on SHAP values and saliency methods.
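For readers less familiar with the three metrics, the following dependency-free sketch shows what each one measures. It uses the standard textbook definitions rather than the repository's evaluation code, and the function names are illustrative only; a real evaluation would more likely rely on an established metrics library.

```python
def roc_auc(y_true, y_score):
    """Probability that a random positive is scored above a random negative
    (ties count as half), computed by direct pairwise comparison."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy labels and scores, not results from this project
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
brier = brier_score(labels, scores)
```

ROC-AUC rewards correct ranking regardless of the probability values themselves, whereas the Brier score penalises poorly calibrated probabilities, which is why a model can lead on one and trail on the other.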
The end-to-end pipeline includes clinically validated NEWS2 preprocessing (including CO₂ retainer logic, Glasgow Coma Scale mapping, and supplemental oxygen protocols), comprehensive feature engineering, model training with hyperparameter optimisation, robust metric evaluation, and a command-line inference interface supporting batch prediction and per-patient lookup. Overall, the system demonstrates physiologically plausible predictive behaviour, clinically meaningful interpretability, and a reproducible workflow suitable for extension to full clinical datasets or downstream deployment contexts.
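To make the CO₂ retainer and supplemental oxygen logic concrete, here is a minimal sketch of the oxygen-related NEWS2 sub-scores. The thresholds follow the published NEWS2 chart (SpO₂ Scale 1 for most patients, Scale 2 for patients with hypercapnic respiratory failure) and should be checked against that chart before any use. The function names and arguments are hypothetical and do not reflect the repository's actual preprocessing code.

```python
def spo2_score(spo2, on_oxygen, co2_retainer):
    """NEWS2 SpO2 sub-score for an oxygen saturation given in percent.

    co2_retainer=True selects SpO2 Scale 2 (target range 88-92%),
    otherwise Scale 1 is used.
    """
    if not co2_retainer:
        # Scale 1: lower saturation always scores higher
        if spo2 >= 96:
            return 0
        if spo2 >= 94:
            return 1
        if spo2 >= 92:
            return 2
        return 3
    # Scale 2: low saturations score as before, shifted downwards
    if spo2 <= 83:
        return 3
    if spo2 <= 85:
        return 2
    if spo2 <= 87:
        return 1
    # 88-92% is on target; 93% or more on room air also scores 0
    if spo2 <= 92 or not on_oxygen:
        return 0
    # Above target while receiving oxygen: over-oxygenation is penalised
    if spo2 <= 94:
        return 1
    if spo2 <= 96:
        return 2
    return 3


def supplemental_o2_score(on_oxygen):
    """NEWS2 scores 2 points for any supplemental oxygen, 0 on room air."""
    return 2 if on_oxygen else 0
```

The asymmetry in Scale 2 is the reason a retainer flag is needed at all: a saturation of 98% scores 0 for a typical patient but 3 for a CO₂ retainer receiving oxygen.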
This repository contains code and documentation only; no patient-level clinical data are redistributed. Users must obtain the MIMIC-IV dataset directly from PhysioNet and comply with its data use requirements. The work is intended for research and educational use.
| Target Outcome | Best-Performing Model | Key Metric(s) | Notes |
|---|---|---|---|
| max_risk | TCN | ROC-AUC = 0.923 | Strong acute deterioration detection |
| median_risk | LightGBM | ROC-AUC = 0.872; Brier Score = 0.065 | Superior sustained risk calibration |
| pct_time_high | LightGBM | R² = 0.793; RMSE = 0.038 | Higher fidelity estimation of high-risk exposure |
Files
tcn_architecture_detailed.png
Total size: 446.2 MB
| MD5 checksum | Size |
|---|---|
| 8d008ab5190a127277c6aaff03357136 | 169.0 kB |
| 2a8567afeb993aa46443926a7ed31a41 | 446.0 MB |
Additional details
Related works
- Is supplement to
- Software: https://github.com/SimonYip22/Time-Series-ICU-Patient-Deterioration-Predictor
Software
- Repository URL
- https://github.com/SimonYip22/Time-Series-ICU-Patient-Deterioration-Predictor
- Programming language
- Python
- Development Status
- Active
References
- Johnson, A., Bulgarelli, L., Pollard, T., Gow, B., Moody, B., Horng, S., Celi, L. A., & Mark, R. (2024). MIMIC-IV (version 3.1). PhysioNet. https://doi.org/10.13026/kpb9-mt58
- Johnson, A.E.W., Bulgarelli, L., Shen, L., et al. (2023). MIMIC-IV, a freely accessible electronic health record dataset. Sci Data, 10, 1. https://doi.org/10.1038/s41597-022-01899-x
- Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., … & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation, 101(23), e215–e220.