Published February 29, 2020 | Version v1
Journal article Open

Hepatitis Patient Classification using Random Forest Algorithms with Cost-Sensitive Method

  • 1. Department of Computer Science - Postgraduate Programs STMIK Nusa Mandiri, Jakarta, Indonesia.
  • 2. Department of Information Systems Management, BINUS Graduate Program – Master of Information Systems Management, Bina Nusantara University, Jakarta, Indonesia.
  • 1. Publisher

Description

Hepatitis is a common public health problem worldwide, affecting populations in almost every country. Machine learning has been widely used to classify various diseases, including hepatitis. In this research, the Random Forest algorithm is applied to a dataset of patients with hepatitis to classify whether a patient will live or die. The dataset contains missing values and imbalanced classes; an unequal number of samples for healthy and sick patients is common in disease datasets. We replace missing values with the mean and median, and to deal with the class imbalance we use cost-sensitive methods that impose a penalty on misclassification. A manual feature selection process is also carried out to find features that can be removed while maintaining accuracy and classification quality. The model is validated with 10-fold cross-validation, and the Random Forest parameters are tuned to find the best classification result. This research prioritizes classification results while accounting for the small amount of data and the class imbalance, so that hepatitis patients can be classified more successfully and accurately. The accuracy obtained is 85.80%.
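The two preprocessing ideas named in the description, filling missing values with the mean or median and penalizing errors on the minority class, can be illustrated with a minimal sketch. It is not the authors' implementation. The helper names (`impute`, `balanced_class_weights`, `weighted_error`), the toy values, and the "balanced" weighting rule (n_samples / (n_classes * class_count)) are assumptions chosen for illustration; the description does not state which cost scheme was used.

```python
from collections import Counter
from statistics import mean, median


def impute(column, strategy="median"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in column]


def balanced_class_weights(labels):
    """Assumed cost scheme: weight = n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so errors on them cost more.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def weighted_error(y_true, y_pred, weights):
    """Share of total class weight that falls on misclassified samples."""
    total = sum(weights[t] for t in y_true)
    wrong = sum(weights[t] for t, p in zip(y_true, y_pred) if t != p)
    return wrong / total


# Hypothetical toy data: one numeric feature with gaps, imbalanced labels.
feature = [1.0, None, 0.7, 4.6, None, 0.9]
labels = ["live", "live", "live", "live", "die", "die"]

filled = impute(feature, strategy="median")
weights = balanced_class_weights(labels)

# A classifier that always predicts the majority class looks acceptable
# on plain error rate but is penalized heavily once costs are applied.
always_live = ["live"] * len(labels)
plain_error = sum(t != p for t, p in zip(labels, always_live)) / len(labels)
cost_error = weighted_error(labels, always_live, weights)

print("imputed feature:", filled)
print("class weights:", weights)
print("plain error:", plain_error, "cost-weighted error:", cost_error)
```

With a library such as scikit-learn, per-class weights of this kind can be passed through the `class_weight` parameter of `RandomForestClassifier`, and 10-fold validation can be run with `cross_val_score(..., cv=10)`. Whether the authors used that library is not stated here.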

Files

C5903029320.pdf

Files (472.8 kB)

md5:08406bad4da11e3ddbd47944033d18b4

Additional details

Related works

Is cited by
Journal article: 2249-8958 (ISSN)

Subjects

ISSN
2249-8958
Retrieval Number
C5903029320 /2020©BEIESP