Published May 12, 2022 | Version v2
Journal article Open

Osteoporosis Pre-screening using Ensemble Machine Learning in Postmenopausal Korean Women

  • 1. Insilicogen, Inc.
  • 2. AIDX, Inc.
  • 3. Catholic Kwandong University College of Medicine, International St. Mary's Hospital

Description

Osteoporosis is a degenerative disease associated with postmenopausal aging, so early diagnosis is vital. This study used data from the Korea National Health and Nutrition Examination Surveys to predict a patient’s risk of osteoporosis with machine learning algorithms. Data from 1,431 postmenopausal women aged 40–69 years were used, with 20 features affecting osteoporosis selected by feature importance and recursive feature elimination. Random forest (RF), AdaBoost, and gradient boosting (GBM) algorithms were each used to train three models: Model A on checkup features, Model B on survey features, and Model C on both checkup and survey features. Of the three, Model C performed best, with an accuracy of 0.832 for RF, 0.849 for AdaBoost, and 0.829 for GBM, and an area under the receiver operating characteristic curve (AUROC) of 0.919 for RF, 0.921 for AdaBoost, and 0.908 for GBM. By combining multiple feature selection methods, the ensemble models reached an AUROC of 0.921 with AdaBoost, which is 0.1–0.2 higher than the best-performing models from recent studies. The model can be developed further into a practical medical tool for the early diagnosis of osteoporosis after menopause.
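
The workflow in the description (recursive feature elimination down to 20 features, then RF, AdaBoost, and GBM scored by accuracy and AUROC) can be illustrated with a minimal scikit-learn sketch. This is not the authors' code: the data are synthetic and generated by `make_classification`, and the function name `run_sketch`, the 40 starting features, the 80/20 split, and all hyperparameters are assumptions chosen only for illustration. The split into Models A, B, and C is not reproduced, and the scores it prints say nothing about the study's results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.feature_selection import RFE
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split


def run_sketch(seed=0):
    """Train three ensemble classifiers on RFE-selected features (hypothetical setup)."""
    # Synthetic stand-in for the survey data: 1,431 rows as in the description,
    # 40 candidate features (an assumption; the real candidate count is not stated).
    X, y = make_classification(
        n_samples=1431, n_features=40, n_informative=10, random_state=seed
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )

    # Recursive feature elimination ranked by random-forest feature importance,
    # dropping 5 features per round until 20 remain.
    selector = RFE(
        RandomForestClassifier(n_estimators=50, random_state=seed),
        n_features_to_select=20,
        step=5,
    )
    selector.fit(X_train, y_train)
    X_train_sel = selector.transform(X_train)
    X_test_sel = selector.transform(X_test)

    models = {
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        "AdaBoost": AdaBoostClassifier(n_estimators=100, random_state=seed),
        "GBM": GradientBoostingClassifier(n_estimators=100, random_state=seed),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_train_sel, y_train)
        # AUROC needs a continuous score, so use the positive-class probability.
        proba = model.predict_proba(X_test_sel)[:, 1]
        results[name] = {
            "accuracy": accuracy_score(y_test, model.predict(X_test_sel)),
            "auroc": roc_auc_score(y_test, proba),
        }
    return results


if __name__ == "__main__":
    for name, scores in run_sketch().items():
        print(f"{name}: accuracy={scores['accuracy']:.3f}, AUROC={scores['auroc']:.3f}")
```

Fitting the selector on the training split only keeps the held-out rows from influencing which features are kept, which would otherwise inflate the reported scores.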

Files

supplementary_file_osteoporosis.zip (260.3 kB)
md5:54ea7821e0c73a4e27162673d064d8a4