Published March 1, 2022 | Version v1
Journal article Open

Model optimisation of class imbalanced learning using ensemble classifier on over-sampling data

  • 1. Institut Teknologi dan Bisnis Kalbis

Description

Data imbalance is a common problem in the application of machine learning and data mining, and it often affects the very classes that matter most. Two approaches to this problem are the data-level approach and the algorithm-level approach. This study aims to find the best model for the pap smear dataset by combining a data-level approach with an algorithm-level approach to address data imbalance. Laboratory datasets are mostly small and imbalanced, and in almost every case the minority class is the most important one. The data-level approach used in this study is over-sampling with the synthetic minority oversampling technique-nominal (SMOTE-N) and adaptive synthetic-nominal (ADASYN-N) algorithms. The algorithm-level approach is an ensemble classifier using AdaBoost and bagging, with the classification and regression tree (CART) as the base learner. In the experimental results, the best model in terms of accuracy, precision, recall, and F-measure was obtained using ADASYN-N with AdaBoost and CART.
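To make the data-level step more concrete, the following is a minimal sketch of SMOTE-N-style over-sampling for nominal features. It is not the authors' implementation. It assumes Hamming distance for the nearest-neighbour search (the published SMOTE-N formulation is usually described with a value difference metric instead), it balances the minority class up to the size of the largest class, and the function name `smote_n_sketch`, the feature values, and the class labels are all hypothetical. ADASYN-N and the AdaBoost/bagging ensembles are not covered here.

```python
import random
from collections import Counter


def hamming(a, b):
    """Number of positions where two nominal feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def smote_n_sketch(X, y, minority_label, k=3, seed=0):
    """Over-sample the minority class until it matches the largest class.

    Each synthetic row takes, per feature, the most common value among a
    randomly chosen minority row and its k nearest minority neighbours.
    Assumption: Hamming distance stands in for a value difference metric.
    """
    rng = random.Random(seed)
    minority = [row for row, label in zip(X, y) if label == minority_label]
    if not minority:
        raise ValueError("no rows carry the minority label")
    target = max(Counter(y).values())
    n_new = target - len(minority)

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base row (the base itself excluded).
        others = [row for row in minority if row is not base]
        neighbours = sorted(others, key=lambda row: hamming(base, row))[:k]
        group = [base] + neighbours
        # Majority vote per feature; ties resolve to the value seen first,
        # which favours the base row.
        new_row = tuple(Counter(col).most_common(1)[0][0] for col in zip(*group))
        synthetic.append(new_row)

    X_res = list(X) + synthetic
    y_res = list(y) + [minority_label] * n_new
    return X_res, y_res


# Toy nominal data (hypothetical values, not the pap smear dataset).
X = [
    ("small", "smooth"), ("small", "smooth"), ("small", "rough"),
    ("medium", "smooth"), ("small", "smooth"), ("medium", "smooth"),
    ("large", "rough"), ("large", "smooth"),
]
y = ["normal"] * 6 + ["abnormal"] * 2

X_res, y_res = smote_n_sketch(X, y, minority_label="abnormal", k=1)
print(Counter(y_res))
```

The resampled rows would then be passed to the ensemble classifier. Over-sampling should be applied to the training split only, so that synthetic rows do not leak into the evaluation data.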

Files

27 21410 1570730186.pdf

Size: 500.2 kB
md5:94558fd9779e62cc6fd500895efecfa5