Published February 4, 2026 | Version 1.0.0
Software Open

Time-Series ICU Patient Deterioration Predictor

Description

This repository presents a dual-architecture machine learning system for early detection of clinical deterioration in intensive care unit (ICU) patients. The system compares gradient-boosted decision trees (LightGBM) with temporal convolutional networks (TCN) to model complementary aspects of physiological risk using routinely collected clinical observations. Three NEWS2-derived deterioration outcomes are considered: maximum risk level attained during the ICU stay (max_risk), median sustained risk level across the stay (median_risk), and the proportion of time spent in a high-risk state (pct_time_high).
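As a rough illustration of how the three outcome definitions relate to one another, the sketch below derives them from a single stay's sequence of per-timestamp risk levels. The ordinal coding (0 = low to 3 = high), the choice of level 3 as the "high-risk state", the equal weighting of observations, and the function name `summarise_stay` are all assumptions made for illustration; the repository's own definitions may differ.

```python
from statistics import median

# Assumption: risk levels are coded 0=low, 1=low-medium, 2=medium, 3=high.
HIGH_RISK_LEVEL = 3


def summarise_stay(risk_levels):
    """Collapse one ICU stay's per-timestamp risk levels into three outcomes."""
    if not risk_levels:
        raise ValueError("stay has no risk observations")
    n_high = sum(1 for level in risk_levels if level >= HIGH_RISK_LEVEL)
    return {
        "max_risk": max(risk_levels),
        "median_risk": median(risk_levels),
        # Assumes equally spaced observations; irregular sampling would
        # need each observation weighted by the time it covers.
        "pct_time_high": n_high / len(risk_levels),
    }


example = summarise_stay([0, 1, 1, 2, 3, 3, 1, 0])
```

In this toy stay the patient peaks at the highest level but spends most of the stay at lower levels, which is why the three outcomes can diverge and are modelled separately.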

Models are trained and evaluated on the PhysioNet MIMIC-IV Clinical Demo v2.2 dataset using two distinct feature-engineering pipelines. The TCN operates on high-resolution timestamp-level temporal features (96-hour windows, 171 features) to capture short-term physiological instability, while the LightGBM model uses patient-level aggregated tabular features (40 features) to characterise longer-term exposure to risk. Comparative evaluation indicates complementary performance profiles: LightGBM exhibits superior calibration and regression fidelity for sustained risk estimation, while the TCN shows stronger sensitivity and discrimination for acute deterioration events. Performance is assessed using ROC-AUC, Brier score, and R², alongside interpretability analyses based on SHAP values and saliency methods.
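For readers unfamiliar with the three evaluation metrics, the following is a minimal standard-library sketch of what each one measures. It is not the repository's evaluation code, which presumably relies on library implementations (an assumption); the function names are illustrative.

```python
def roc_auc(y_true, y_score):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    # Count pairwise wins; ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 labels."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot; 1.0 is perfect, 0.0 matches predicting the mean."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, probs)
brier = brier_score(labels, probs)
```

ROC-AUC reflects ranking quality only, whereas the Brier score also penalises poorly calibrated probabilities, so a model can lead on one and trail on the other.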

The end-to-end pipeline includes clinically validated NEWS2 preprocessing (including CO₂ retainer logic, Glasgow Coma Scale mapping, and supplemental oxygen protocols), comprehensive feature engineering, model training with hyperparameter optimisation, robust metric evaluation, and a command-line inference interface supporting batch prediction and per-patient lookup. Overall, the system demonstrates physiologically plausible predictive behaviour, clinically meaningful interpretability, and a reproducible workflow suitable for extension to full clinical datasets or downstream deployment contexts.
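To make the three named NEWS2 preprocessing rules concrete, the sketch below scores oxygen saturation, supplemental oxygen, and consciousness. The thresholds follow the published NEWS2 chart as commonly cited and should be checked against the Royal College of Physicians source before any use. The function names, the integer-percent SpO₂ input, and the rule that only a GCS total of 15 counts as "Alert" are assumptions for illustration, not the repository's confirmed implementation.

```python
def spo2_score(spo2, on_oxygen, co2_retainer):
    """NEWS2 SpO2 sub-score; scale 2 is used for CO2 retainers."""
    if not co2_retainer:
        # Scale 1: standard target saturation.
        if spo2 >= 96:
            return 0
        if spo2 >= 94:
            return 1
        if spo2 >= 92:
            return 2
        return 3
    # Scale 2: target saturation 88-92% for hypercapnic patients.
    if spo2 <= 83:
        return 3
    if spo2 <= 85:
        return 2
    if spo2 <= 87:
        return 1
    if spo2 <= 92 or not on_oxygen:
        return 0
    # Above target while on supplemental oxygen is penalised.
    if spo2 <= 94:
        return 1
    if spo2 <= 96:
        return 2
    return 3


def oxygen_score(on_oxygen):
    """Any supplemental oxygen adds 2 points."""
    return 2 if on_oxygen else 0


def consciousness_score(gcs_total):
    """Assumption: GCS 15 maps to Alert (0); anything lower scores 3."""
    return 0 if gcs_total == 15 else 3
```

The same saturation reading can therefore score very differently depending on retainer status: 90% scores 3 on scale 1 but 0 on scale 2, which is why this logic has to run before any risk labels are derived.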

This repository contains code and documentation only; no patient-level clinical data are redistributed. Users must obtain the MIMIC-IV dataset directly from PhysioNet and comply with its data use requirements. The work is intended for research and educational use.

| Target Outcome | Best-Performing Model | Key Metric(s) | Notes |
|---|---|---|---|
| max_risk | TCN | ROC-AUC = 0.923 | Strong acute deterioration detection |
| median_risk | LightGBM | ROC-AUC = 0.872; Brier Score = 0.065 | Superior sustained risk calibration |
| pct_time_high | LightGBM | R² = 0.793; RMSE = 0.038 | Higher fidelity estimation of high-risk exposure |

 

Files

tcn_architecture_detailed.png

Files (446.2 MB)

| Size | Checksum |
|---|---|
| 169.0 kB | md5:8d008ab5190a127277c6aaff03357136 |
| 446.0 MB | md5:2a8567afeb993aa46443926a7ed31a41 |

Additional details

References

  • Johnson, A., Bulgarelli, L., Pollard, T., Gow, B., Moody, B., Horng, S., Celi, L. A., & Mark, R. (2024). MIMIC-IV (version 3.1). PhysioNet. https://doi.org/10.13026/kpb9-mt58
  • Johnson, A.E.W., Bulgarelli, L., Shen, L., et al. (2023). MIMIC-IV, a freely accessible electronic health record dataset. Sci Data, 10, 1. https://doi.org/10.1038/s41597-022-01899-x
  • Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., … & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation, 101(23), e215–e220.