Title:
Heart Failure Prediction using MIMIC-IV and MIMIC-IV-ED Datasets

Description:
This repository provides the full implementation for the study “Predicting Heart Failure Using MIMIC-IV and MIMIC-IV-ED: A Comparative Study of Machine Learning and Deep Learning Models.”
It includes Python scripts and Jupyter notebooks for model training and evaluation.

Dataset Information:

Source: PhysioNet’s MIMIC-IV v2.2 and MIMIC-IV-ED v2.0 databases (2008–2019)

Access: Requires credentialed PhysioNet access and completion of CITI training.

Data used: 17,892 patient records with structured features (vitals, lab values, demographics) and one unstructured admission note per patient.

Code Information:

ML_models.ipynb: implements Random Forest, Logistic Regression, Naïve Bayes, Decision Tree, and AdaBoost models.

DL_models.ipynb: implements CNN, DNN, and LSTM networks with TensorFlow/Keras.

Preprocessing includes Winsorization, KNN imputation, and Z-score normalization.
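
The three preprocessing steps can be sketched roughly as follows. This is a minimal illustration, not the notebooks' actual code: the function name `preprocess`, the 1st/99th percentile clipping bounds, and `n_neighbors=5` are assumptions chosen for the example, and the input is synthetic.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.preprocessing import StandardScaler


def preprocess(X, lower_pct=1.0, upper_pct=99.0, n_neighbors=5):
    """Winsorize, KNN-impute, and z-score a 2-D numeric array (NaN = missing).

    Hypothetical helper; the percentile bounds and neighbour count are
    illustrative defaults, not values taken from the study.
    """
    X = np.asarray(X, dtype=float)

    # Winsorization: clip each column to its own percentile bounds.
    # nanpercentile ignores missing values, and np.clip leaves NaN untouched.
    lo = np.nanpercentile(X, lower_pct, axis=0)
    hi = np.nanpercentile(X, upper_pct, axis=0)
    X = np.clip(X, lo, hi)

    # KNN imputation: fill each missing value from the nearest rows.
    X = KNNImputer(n_neighbors=n_neighbors).fit_transform(X)

    # Z-score normalization: zero mean and unit variance per column.
    return StandardScaler().fit_transform(X)


# Synthetic demo data with one extreme outlier and one missing value.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))
X_demo[0, 0] = 50.0
X_demo[5, 1] = np.nan
X_clean = preprocess(X_demo)
```

In a real train/test workflow, the clipping bounds, imputer, and scaler would normally be fitted on the training split only and then applied to the test split, to avoid leaking test information.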

Usage Instructions:

1. Download both .ipynb notebooks and place them in the same directory.

2. Update the dataset path in each notebook so that it points to your local copy of the data.

3. Run the notebooks sequentially:

   a. ML_models.ipynb for machine learning models

   b. DL_models.ipynb for deep learning models

4. Evaluation metrics (accuracy, precision, recall, F1, specificity, AUC) will be printed automatically.
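
For reference, the threshold-based metrics listed above can be derived from confusion-matrix counts, as in the sketch below. The function name `binary_metrics` and the toy labels are hypothetical and not taken from the notebooks. AUC is omitted because it needs predicted scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Compute common binary-classification metrics from 0/1 labels.

    Hypothetical helper for illustration; 1 is treated as the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)  # also called sensitivity
    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "specificity": safe_div(tn, tn + fp),  # true-negative rate
    }


# Toy labels: 4 positives and 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

Specificity is computed by hand here because scikit-learn does not ship a dedicated specificity function; it equals the recall of the negative class.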

Requirements:
Python 3.10+
Libraries: pandas, numpy, scikit-learn, tensorflow, keras, matplotlib, seaborn, transformers, imbalanced-learn, scipy.

Methodology Summary:

Data preprocessing: label encoding, outlier correction (Winsorization), missing value imputation (KNN), standardization (Z-score).

Feature extraction: structured EHR features combined with BERT embeddings of the admission notes.

Model training across ML and DL architectures.

Ablation study comparing structured-only vs. structured+text features.

Performance metrics and confidence intervals computed via bootstrap resampling.
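
A bootstrap confidence interval for a metric can be sketched as below. This assumes the percentile bootstrap with 1,000 resamples and a 95% interval; the study's exact bootstrap variant and settings are not stated here, so those choices, the function names, and the synthetic labels are illustrative only.

```python
import numpy as np


def bootstrap_ci(y_true, y_pred, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for metric(y_true, y_pred).

    Hypothetical helper; returns (point_estimate, lower, upper).
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    rng = np.random.default_rng(seed)
    n = len(y_true)

    stats = np.empty(n_boot)
    for b in range(n_boot):
        # Resample patients with replacement, keeping label/prediction pairs.
        idx = rng.integers(0, n, size=n)
        stats[b] = metric(y_true[idx], y_pred[idx])

    lower, upper = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return metric(y_true, y_pred), float(lower), float(upper)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return float(np.mean(y_true == y_pred))


# Synthetic labels: predictions agree with the truth about 80% of the time.
demo_rng = np.random.default_rng(42)
y_true_demo = demo_rng.integers(0, 2, size=500)
flip = demo_rng.random(500) < 0.2
y_pred_demo = np.where(flip, 1 - y_true_demo, y_true_demo)

point, ci_low, ci_high = bootstrap_ci(y_true_demo, y_pred_demo, accuracy)
```

Any function that takes `(y_true, y_pred)` and returns a number can be passed as `metric`, so the same helper works for precision, recall, or F1.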

Citation:
If you use this code or data processing pipeline, please cite:
Teoh, J. R., Khin Wee, L., Hasikin, K., Wong, J. H. D., Ng, W. L., Lee, K. W., Kiew, L. V., & Wu, X. (2025). Predicting Heart Failure Using MIMIC-IV and MIMIC-IV-ED: A Comparative Study of Machine Learning and Deep Learning Models. Zenodo. https://doi.org/10.5281/zenodo.17535222

License:
MIT License (open for academic and non-commercial use).