Published April 30, 2021 | Version v1
Journal article Open

Towards Optimization of Malware Detection using Chi-square Feature Selection on Ensemble Classifiers

  • 1. Dept. of Computer Science, Joseph Ayo Babalola, Ikeji-Arakeji, Osun State, Nigeria.
  • 2. School of Computing, Federal University of Technology, Akure, Nigeria.
  • 1. Publisher

Description

The proliferation of malware variants is probably the greatest problem in PC security, and protecting information in the form of source code against unauthorized access is a central issue in computer security. In recent times, machine learning has been extensively researched for malware detection, and ensemble techniques have been established to be highly effective in terms of detection accuracy. This paper proposes a framework that combines Chi-square as the feature selection method with eight ensemble learning classifiers on five base learners: K-Nearest Neighbors, Naïve Bayes, Support Vector Machine, Decision Trees, and Logistic Regression. Among the base learners, K-Nearest Neighbors returns the highest accuracy: 95.37% with Chi-square feature selection and 87.89% without feature selection. Among the ensembles, the Extreme Gradient Boosting classifier returns the highest accuracy: 97.407% with Chi-square feature selection and 91.72% without feature selection. Across the seven evaluation measures, Extreme Gradient Boosting leads when Chi-square feature selection is used, and Random Forest leads when the ensemble methods are used without feature selection. The study results show that the tree-based ensemble model is compelling for malware classification.
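To make the Chi-square feature selection step concrete, the following is a minimal sketch, not the authors' implementation. It assumes categorical (here binary) features and scores each one with the Pearson chi-square statistic computed from its contingency table against the class label, then keeps the k highest-scoring features. The feature names, the toy data, and the helpers chi_square_score and select_top_k are hypothetical and chosen only for illustration.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic between one categorical feature and the labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))   # joint counts per (value, class) cell
    feature_totals = Counter(feature)          # row totals
    label_totals = Counter(labels)             # column totals
    score = 0.0
    for value, value_count in feature_totals.items():
        for cls, cls_count in label_totals.items():
            # Count expected in this cell if feature and label were independent.
            expected = value_count * cls_count / n
            score += (observed[(value, cls)] - expected) ** 2 / expected
    return score


def select_top_k(columns, labels, k):
    """Rank features by chi-square score and return the k best names plus all scores."""
    scores = {name: chi_square_score(values, labels) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: 1 = malware, 0 = benign.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "calls_crypto_api": [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly tracks the label
    "packed":           [1, 1, 1, 0, 1, 0, 0, 0],  # weakly associated
    "has_icon":         [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}

selected, scores = select_top_k(columns, labels, k=2)
print(selected)
print(scores)
```

In this toy example the feature that is independent of the label scores zero and is dropped, while the two features associated with the label are kept and could then be passed to any of the base learners or ensembles named above.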

Files

D23590410421.pdf (1.1 MB)
md5:a25462ec085ab9c9946ec411cb79c4be

Additional details

Related works

Is cited by
Journal article: 2249-8958 (ISSN)

Subjects

ISSN
2249-8958
Retrieval Number
100.1/ijeat.D23590410421