Published April 20, 2026 | Version 1.0
Dataset Open

Systematically Fused Multimodal Dataset and Explainable ML Results for Alzheimer's Risk Prediction

Authors/Creators

Description

This repository contains the datasets and cross-validation results associated with our research on predicting Alzheimer's risk. 

*DISCLAIMER: Please note that the primary dataset provided here does not contain real, continuous patient records. It is a systematic, synthetically fused dataset created by mathematically combining metabolic indicators (from a diabetes dataset) and cognitive biometric data (from the DARWIN handwriting dataset) strictly for research, experimental validation, and academic purposes. It is not intended for direct clinical application.*

The repository includes:
1. Advanced Fused Dataset (Primary): The systematically generated dataset combining diabetes health indicators with cognitive handwriting features, used for training the predictive models.
2. DARWIN Raw Dataset: The original public handwriting dataset used as a foundation for cognitive feature extraction.
3. Model Cross-Validation Results: Excel files detailing the 5-fold cross-validation performance metrics (Accuracy, F1-Score, RMSE, etc.) for our XGBoost and Linear Regression models.
4. Feature Importance Analysis: Files containing SHAP values and feature importance scores that explain the contribution of specific metabolic and cognitive features to the model's predictions.
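For readers who want to sanity-check the reported metrics, the 5-fold protocol and the three named scores can be sketched with the Python standard library alone. This is a minimal illustration, not the authors' pipeline. The function names (`k_fold_indices`, `accuracy`, `f1_score`, `rmse`), the shuffling seed, and the unstratified split are all assumptions, because the record does not state how the folds were drawn.

```python
import math
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation.

    Hypothetical helper: the seed and the plain (unstratified) shuffle are
    assumptions, not settings taken from the dataset record.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread samples as evenly as possible: the first n % k folds get one extra.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```

In use, each model would be fitted on `train_idx`, scored on `test_idx`, and the per-fold scores averaged. Comparing such averages against the Excel files is only meaningful if the same split and preprocessing are reproduced.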

This data is provided to ensure full reproducibility of the results presented in our IEEE manuscript and to aid future research in explainable AI for clinical decision support systems.

Files

advanced_fused_dataset.csv

Files (1.1 MB)

MD5 checksum                          Size
md5:928c7b6a01af3326905a5b4d1d7342de  308.6 kB
md5:202937dece4d933216e9161ffe56097d  740.5 kB
md5:af06ea62e683aba9cc0caf94ac737307  589 Bytes
md5:58b2190763ea69554cd6a7b2f577cf01  1.3 kB
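The MD5 checksums in the file listing can be used to confirm that a download is intact. The sketch below is one way to do that with Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listed_checksum` are hypothetical. Because the listing here does not show which checksum belongs to which file name, that pairing has to be read from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed):
    """Compare a file's digest with a listed value such as 'md5:<hex digest>'.

    Hypothetical helper: accepts the value with or without the 'md5:' prefix.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

To check a download, pass the local path of the file together with the checksum string shown next to that file on the record page. A result of `False` suggests a corrupted or incomplete download.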

Additional details