Published March 1, 2026 | Version v1
Preprint Open

DURG-EduAI: A Multi-Task Machine Learning Framework for Student Academic Performance Prediction, Result Classification, and Dropout Risk Assessment in Indian Higher Education

Description

DURG-EduAI is a large-scale multi-task machine learning framework developed for academic performance prediction, result classification, dropout risk assessment, subject benchmarking, and early warning generation in Indian higher education.

The system is trained on a novel dataset of 248,139 student examination records collected from Hemchand Yadav University, Durg, Chhattisgarh, spanning undergraduate and postgraduate programs (2016–2025). The framework transforms raw examination HTML records into structured feature representations and applies gradient-boosted tree ensembles (XGBoost) for predictive modeling.
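The step from raw examination HTML to structured features can be sketched as follows. This is a minimal illustration only: the table layout, the column names (`Subject`, `Marks`, `Max`), the feature names, and the helper `extract_features` are all assumptions, since the record does not describe the actual HTML schema or feature set. Model fitting with XGBoost is omitted.

```python
from html.parser import HTMLParser


class MarksTableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_features(html):
    """Turn one (hypothetical) result page into a flat feature dictionary."""
    parser = MarksTableParser()
    parser.feed(html)
    header, *body = parser.rows
    records = [dict(zip(header, row)) for row in body]
    marks = [int(r["Marks"]) for r in records]
    max_marks = [int(r["Max"]) for r in records]
    return {
        "n_subjects": len(records),
        "total_marks": sum(marks),
        "percentage": 100.0 * sum(marks) / sum(max_marks),
        "min_subject_pct": min(100.0 * m / mx for m, mx in zip(marks, max_marks)),
    }


# Invented sample page; not taken from the dataset.
SAMPLE_HTML = """
<table>
  <tr><th>Subject</th><th>Marks</th><th>Max</th></tr>
  <tr><td>Physics</td><td>62</td><td>100</td></tr>
  <tr><td>Chemistry</td><td>48</td><td>100</td></tr>
  <tr><td>Mathematics</td><td>35</td><td>50</td></tr>
</table>
"""

features = extract_features(SAMPLE_HTML)
print(features)
```

A dictionary of this kind, one per examination record, is the sort of tabular input a gradient-boosted tree model expects.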

Key performance results include:

• SGPA regression: R² = 0.9969, MAE = 0.079
• Result classification (PASS / ATKT / FAIL): F1 of roughly 0.99 or higher across classes
• Dropout risk stratification (Low / Medium / High): near-perfect classification on engineered labels
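For readers unfamiliar with the metrics listed above, the sketch below shows how R², MAE, and macro-averaged F1 are conventionally computed. The toy values are invented for illustration and are not drawn from the paper. Whether the reported F1 is macro-averaged or per-class is not stated in this record, so the macro variant here is an assumption.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data, invented for this example.
sgpa_true = [6.0, 7.0, 8.0, 9.0]
sgpa_pred = [6.1, 6.9, 8.1, 8.9]
result_true = ["PASS", "PASS", "ATKT", "FAIL"]
result_pred = ["PASS", "ATKT", "ATKT", "FAIL"]

r2 = r2_score(sgpa_true, sgpa_pred)
mae = mean_absolute_error(sgpa_true, sgpa_pred)
f1 = macro_f1(result_true, result_pred)
print(f"R2={r2:.4f}  MAE={mae:.3f}  macro-F1={f1:.4f}")
```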

The system integrates five modules into a unified inference pipeline capable of generating a complete student risk report from a single structured examination record.
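One way to picture a unified pipeline of this kind is a small object that calls each module on the same record and assembles the outputs into a report. The sketch below is hypothetical throughout: the record format, the `RiskReport` fields, the pass mark, the cohort means, and the stub rules standing in for trained models are all invented, and none of it reflects the released implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, float]  # hypothetical record: subject name -> percentage


@dataclass
class RiskReport:
    predicted_sgpa: float
    result_class: str
    dropout_risk: str
    weak_subjects: List[str]
    warnings: List[str]


@dataclass
class Pipeline:
    predict_sgpa: Callable[[Record], float]
    classify_result: Callable[[Record], str]
    assess_dropout_risk: Callable[[Record], str]
    benchmark_subjects: Callable[[Record], List[str]]

    def report(self, record: Record) -> RiskReport:
        risk = self.assess_dropout_risk(record)
        weak = self.benchmark_subjects(record)
        # Early-warning step: turn the other modules' outputs into messages.
        warnings = []
        if risk != "Low":
            warnings.append(f"Dropout risk is {risk}")
        warnings.extend(f"Below benchmark in {s}" for s in weak)
        return RiskReport(
            self.predict_sgpa(record),
            self.classify_result(record),
            risk,
            weak,
            warnings,
        )


# Placeholder rules standing in for trained models (all values invented).
PASS_MARK = 40.0
COHORT_MEAN = {"Physics": 55.0, "Chemistry": 50.0, "Mathematics": 60.0}


def stub_sgpa(record):
    return round(sum(record.values()) / len(record) / 10.0, 2)


def stub_result(record):
    failed = sum(v < PASS_MARK for v in record.values())
    return "PASS" if failed == 0 else "ATKT" if failed <= 2 else "FAIL"


def stub_risk(record):
    return {"PASS": "Low", "ATKT": "Medium", "FAIL": "High"}[stub_result(record)]


def stub_benchmark(record):
    return [s for s, v in record.items() if v < COHORT_MEAN.get(s, 0.0)]


pipeline = Pipeline(stub_sgpa, stub_result, stub_risk, stub_benchmark)
report = pipeline.report({"Physics": 62.0, "Chemistry": 35.0, "Mathematics": 70.0})
print(report)
```

Passing the modules in as callables keeps the report logic independent of how each model is trained or loaded.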

To the best of our knowledge, this is the first publicly released multi-task ML system trained on real multi-program Indian university examination data at this scale.

Important note: Dropout risk labels are engineered proxy indicators derived from examination outcomes, not confirmed longitudinal withdrawal records. The high predictive performance therefore partly reflects how readily institutional grading rules can be recovered from the input features.
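The caveat above can be made concrete with a toy example: when a label is a deterministic function of examination outcomes, even a very simple model that sees those outcomes reproduces it almost perfectly. The labelling rule, its thresholds, and the 1-nearest-neighbour classifier below are hypothetical and are not the authors' labelling scheme or model.

```python
def proxy_risk(failed_subjects, sgpa):
    """Hypothetical engineered label: a fixed rule over exam outcomes."""
    if failed_subjects >= 3:
        return "High"
    if failed_subjects >= 1 or sgpa < 5.0:
        return "Medium"
    return "Low"


def nearest_neighbour_predict(train, x):
    """Return the label of the training point closest to x."""
    def sq_dist(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    _, label = min(train, key=lambda item: sq_dist(item[0], x))
    return label


# Training grid: 0-5 failed subjects, SGPA from 4.0 to 9.5 in steps of 0.5.
train = [
    ((f, 4.0 + 0.5 * k), proxy_risk(f, 4.0 + 0.5 * k))
    for f in range(6)
    for k in range(12)
]

# Held-out points: same failure counts, SGPA shifted by 0.1 off the grid.
test_points = [(f, 4.1 + 0.5 * k) for f in range(6) for k in range(11)]

correct = sum(
    nearest_neighbour_predict(train, x) == proxy_risk(*x) for x in test_points
)
accuracy = correct / len(test_points)
print(f"held-out accuracy on rule-derived labels: {accuracy:.2f}")
```

Here the classifier scores 100% on unseen points, yet it has only recovered the rule that generated the labels; it says nothing about whether those students actually withdrew.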

The trained models and inference pipeline are publicly available at:
https://huggingface.co/collections/sameerbanchhor-work/durg-edu-ai

Files

DURG_EduAI_Paper.pdf (1.1 MB)
md5:afac7b80fd37a2ed85c0411bac1454ba

Additional details

Software

Programming language
Python