Published June 3, 2026 | Version v2
Report Open

Intelligent Workforce Analytics: Predicting Employee Attrition Through Machine Learning

Authors/Creators

Description

Employee attrition is one of the most consequential and underestimated costs in modern technology organizations. Most HR departments track headcount and exit-interview data but lack the predictive infrastructure to identify at-risk employees before their decision to resign becomes irreversible. This study addresses that gap by developing a production-grade machine learning pipeline for attrition risk prediction and applying it to a workforce dataset modeled on Palo Alto Networks.

Using a dataset of 1,470 employee records containing demographic, compensation, satisfaction, and engagement attributes, we built an ML pipeline that combines eight engineered behavioral features, multi-model evaluation across six classifiers, and a systematic threshold-optimization protocol. On a held-out test set of 294 employees, our champion model, Logistic Regression with an optimized decision threshold of 0.30, achieved a ROC-AUC of 0.7844, an attrition recall of 72.34%, and a Macro F1 of 0.5995; its mean cross-validation F1 was 0.8500.
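
To illustrate what a decision-threshold comparison of this kind involves, the following is a minimal sketch rather than the report's actual protocol. It computes attrition recall and Macro F1 at two candidate thresholds using only the Python standard library. The labels and probabilities in `Y_TRUE` and `Y_PROB` are invented toy values, and the function names are hypothetical; only the 0.30 threshold comes from the abstract.

```python
# Toy data (invented for illustration): 1 = employee left, 0 = employee stayed.
Y_TRUE = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical predicted attrition probabilities from some classifier.
Y_PROB = [0.62, 0.41, 0.33, 0.35, 0.28, 0.22, 0.15, 0.12, 0.08, 0.05]


def confusion_counts(y_true, y_prob, threshold):
    """Return (tp, fp, tn, fn) when predicting attrition if prob >= threshold."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def f1_score(tp, fp, fn):
    """F1 for one class, given its true positives, false positives, false negatives."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def evaluate_threshold(y_true, y_prob, threshold):
    """Attrition recall and Macro F1 (mean of per-class F1) at one threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1_attrition = f1_score(tp, fp, fn)
    # For the "stayed" class the roles swap: tn are its hits, fn its false alarms.
    f1_stayed = f1_score(tn, fn, fp)
    return {
        "threshold": threshold,
        "recall": recall,
        "macro_f1": (f1_attrition + f1_stayed) / 2,
    }


# Compare the conventional 0.50 cutoff with the 0.30 cutoff named in the abstract.
results = [evaluate_threshold(Y_TRUE, Y_PROB, t) for t in (0.50, 0.30)]
for row in results:
    print(row)
```

On imbalanced data, lowering the threshold typically trades some precision for higher recall on the minority (attrition) class, which is why the cutoff is often tuned rather than left at 0.50.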

A key finding of this study is that linear decision boundaries, when combined with rigorous behavioral feature engineering and SMOTETomek resampling, outperform complex ensemble methods on small, imbalanced HR datasets. This counterintuitive result has direct implications for how organizations should approach predictive HR analytics in data-scarce conditions. The pipeline delivers a three-tier risk scoring framework (High / Medium / Low Risk) and a deployable Streamlit dashboard, enabling department-level attrition monitoring and what-if scenario exploration.
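
A three-tier scoring step like the one described above could be sketched as follows. This is an illustration under stated assumptions, not the report's implementation: the cutoffs `medium_cutoff=0.30` and `high_cutoff=0.60` are hypothetical (the abstract does not give tier boundaries), as are the function names and the sample records.

```python
from collections import Counter


def risk_tier(probability, medium_cutoff=0.30, high_cutoff=0.60):
    """Map a predicted attrition probability to a risk tier.

    The cutoff values are assumptions chosen for illustration only.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability >= high_cutoff:
        return "High Risk"
    if probability >= medium_cutoff:
        return "Medium Risk"
    return "Low Risk"


def tier_counts_by_department(records):
    """Count employees per risk tier within each department.

    `records` is an iterable of (department, probability) pairs.
    """
    counts = {}
    for department, probability in records:
        counts.setdefault(department, Counter())[risk_tier(probability)] += 1
    return counts


# Invented sample records: (department, predicted attrition probability).
sample_records = [
    ("Engineering", 0.72),
    ("Engineering", 0.35),
    ("Sales", 0.10),
    ("Sales", 0.64),
    ("Sales", 0.29),
]

summary = tier_counts_by_department(sample_records)
for department, tiers in summary.items():
    print(department, dict(tiers))
```

A per-department tally of this kind is the sort of aggregate a monitoring dashboard could display; in practice the tier boundaries would need to be calibrated against the model's score distribution.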

Keywords: Employee Attrition Prediction, Logistic Regression, Feature Engineering, SMOTETomek, HR Analytics, Class Imbalance, Threshold Optimization, ROC-AUC, Workforce Risk Scoring, Palo Alto Networks

Files

Employee_Attrition_Research_Paper.pdf

Files (1.1 MB)

Checksum Size
md5:c026a9e3591e9b1dec888b24d644b310 3.6 kB
md5:cf1a6eae451819514f6930fb158e7c5e 6.3 kB
md5:f7da5f8a548542ab28d2b28701af5ecf 988.1 kB
md5:6cd08f555ef6427525e46b59a151aec4 1.1 kB
md5:913478ca9e7ad9953797f3061e5da725 3.4 kB
md5:803062b3d4d54c2a623b58ea0d84073e 488 Bytes
md5:06356c1c8290d74bcb11f927b51ff798 100.3 kB
md5:3f6c607b124f4c2fd0d6a483ca1ba72a 9.7 kB
md5:5a2df1a654d1bbd31a5ccc33ea1ee5e9 174 Bytes

Additional details