Journal article Open Access

Towards Optimization of Malware Detection using Chi-square Feature Selection on Ensemble Classifiers

Fadare Oluwaseun Gbenga; Adetunmbi Adebayo Olusola; Oyinloye Oghenerukevwe Eloho; Mogaji Stephen Alaba

Sponsor(s)
Blue Eyes Intelligence Engineering and Sciences Publication(BEIESP)

The proliferation of malware variants is probably the greatest problem in computer security, and protecting information in the form of source code against unauthorized access is a central issue in the field. In recent times, machine learning has been extensively researched for malware detection, and ensemble techniques have been established to be highly effective in terms of detection accuracy. This paper proposes a framework that combines Chi-square feature selection with eight ensemble learning classifiers built on five base learners: K-Nearest Neighbors, Naïve Bayes, Support Vector Machine, Decision Trees, and Logistic Regression. Among the base learners, K-Nearest Neighbors returns the highest accuracy: 95.37% with Chi-square feature selection and 87.89% without feature selection. Among the ensembles, the Extreme Gradient Boosting classifier returns the highest accuracy: 97.407% with Chi-square feature selection and 91.72% without feature selection. Across the seven evaluation measures, the Extreme Gradient Boosting classifier leads when Chi-square feature selection is applied, and Random Forest leads among the ensemble methods without feature selection. The results show that tree-based ensemble models are effective for malware classification.
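To make the feature selection step concrete, the following is a minimal sketch of Chi-square scoring for categorical features, assuming the textbook Pearson statistic on a feature-versus-class contingency table. The abstract does not state which implementation or dataset the authors used; the function names `chi2_score` and `select_k_best` and the toy data are hypothetical and serve only as illustration.

```python
from collections import Counter


def chi2_score(feature, labels):
    """Pearson chi-square statistic between one categorical feature and the class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))  # joint counts (feature value, class)
    feature_totals = Counter(feature)         # marginal counts per feature value
    label_totals = Counter(labels)            # marginal counts per class
    score = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            # Count expected if the feature and the class were independent.
            expected = f_count * c_count / n
            score += (observed[(f, c)] - expected) ** 2 / expected
    return score


def select_k_best(rows, labels, k):
    """Return the indices of the k highest-scoring features and all scores."""
    n_features = len(rows[0])
    scores = [chi2_score([row[j] for row in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


# Toy data (made up, not from the paper): 6 samples, 3 binary features.
# Label 1 stands for "malware", label 0 for "benign".
rows = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
labels = [1, 1, 1, 0, 0, 0]

selected, scores = select_k_best(rows, labels, k=1)
print(selected, scores)
```

In this toy data the first feature coincides with the label, so it receives the largest score and is the one retained. In a pipeline of the kind the abstract describes, only the retained columns would then be passed to the base learners and ensemble classifiers.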

Files (1.1 MB)
D23590410421.pdf (1.1 MB), md5:a25462ec085ab9c9946ec411cb79c4be