Application of Machine Learning Models for Patients Health Insurance Cost Prediction

Dr. Annwesha Banerjee Majumder

doi:10.35940/ijsce.D3685.15040925

Published September 30, 2025 | Version CC-BY-NC-ND 4.0

Journal article Open

Dr. Annwesha Banerjee Majumder (Contact person)¹

1. Assistant Professor, Department of Information Technology, JIS College of Engineering, Kalyani (West Bengal), India.

Contributors

Contact person:

Dr. Annwesha Banerjee Majumder¹

Researchers:

1. Assistant Professor, Department of Information Technology, JIS College of Engineering, Kalyani (West Bengal), India.
2. Associate Professor, Department of Information Technology, JIS College of Engineering, Kalyani (West Bengal), India.
3. Department of Information Technology, JIS College of Engineering, Kalyani (West Bengal), India.

Abstract: This study examines the use of machine learning models to predict health insurance costs from personal characteristics. The dataset included demographic variables such as age, sex, BMI, number of children, smoking status, and region. Several machine learning methods, including Linear Regression, Random Forest, and Gradient Boosting, were compared on how well they estimated insurance costs. After the dataset was preprocessed by scaling numerical features and encoding categorical variables, the regression models were trained and evaluated with k-fold cross-validation. Performance was measured using the coefficient of determination (R²), mean absolute error (MAE), and root mean squared error (RMSE). In the experiments, Gradient Boosting outperformed both Random Forest and Linear Regression.
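The evaluation protocol named in the abstract (k-fold cross-validation scored with MAE, RMSE, and R²) can be illustrated with a minimal sketch. This is not the authors' code: the record does not give the number of folds, the dataset, or any implementation details. The function names, the fold count, and the toy numbers below are all hypothetical. A single-feature least-squares line stands in for the models in the study, and the scaling and encoding steps are omitted to keep the sketch dependency-free.

```python
import math
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def kfold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds, no shuffling."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def fit_simple_linear(x, y):
    """Closed-form least squares for y = a + b*x; returns a predict function."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return lambda xs: [a + b * xi for xi in xs]


def cross_validate(x, y, k):
    """Fit on each training split, score on the held-out fold, average the scores."""
    scores = []
    for train, test in kfold_indices(len(x), k):
        predict = fit_simple_linear([x[i] for i in train], [y[i] for i in train])
        y_true = [y[i] for i in test]
        y_pred = predict([x[i] for i in test])
        scores.append({"mae": mae(y_true, y_pred),
                       "rmse": rmse(y_true, y_pred),
                       "r2": r2(y_true, y_pred)})
    return {name: mean(s[name] for s in scores) for name in scores[0]}


if __name__ == "__main__":
    # Invented toy values (age, annual charge); NOT taken from the paper's dataset.
    ages = [19, 23, 28, 33, 37, 41, 46, 52, 58, 63]
    charges = [1800, 2400, 3100, 4700, 5200, 6300, 7900, 9800, 11500, 13900]
    print(cross_validate(ages, charges, k=5))
```

Averaging the per-fold scores, as done here, is one common convention; pooling all held-out predictions before scoring is another, and the record does not say which the study used.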

Files

D368515040925.pdf (976.0 kB), md5:dea1dcc2f2ad6523335b9c3c369f1c8f

Additional details

DOI: 10.35940/ijsce.D3685.15040925
EISSN: 2231-2307

Accepted: 2025-09-15

Manuscript Received on 05 August 2025 | Revised Manuscript Received on 06 September 2025 | Manuscript Accepted on 15 September 2025 | Manuscript published on 30 September 2025.

