Towards Optimization of Malware Detection using Chi-square Feature Selection on Ensemble Classifiers
Creators
- 1. Dept. of Computer Science, Joseph Ayo Babalola University, Ikeji-Arakeji, Osun State, Nigeria.
- 2. School of Computing, Federal University of Technology, Akure, Nigeria.
Contributors
- 1. Publisher
Description
The proliferation of malware variants is probably the greatest problem in computer security, and protecting information in the form of source code against unauthorized access is a central issue in the field. In recent times, machine learning has been extensively researched for malware detection, and ensemble techniques have been established to be highly effective in terms of detection accuracy. This paper proposes a framework that combines Chi-square as the feature selection method with eight ensemble learning classifiers built on five base learners: K-Nearest Neighbors, Naïve Bayes, Support Vector Machine, Decision Trees, and Logistic Regression. Among the base learners, K-Nearest Neighbors returns the highest accuracy: 95.37% with Chi-square feature selection and 87.89% without feature selection. Among the ensembles, the Extreme Gradient Boosting Classifier achieves the highest accuracy: 97.407% with Chi-square feature selection and 91.72% without feature selection. Across the seven evaluation measures, the Extreme Gradient Boosting Classifier leads when Chi-square feature selection is used, and Random Forest leads among the ensemble methods without feature selection. The study results show that tree-based ensemble models are compelling for malware classification.
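The Chi-square feature selection step named in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `chi2_scores` and `select_k_best`, the three toy features, and the four labelled samples are all hypothetical and chosen only to make the arithmetic easy to follow. The sketch assumes non-negative feature values (for example, counts of API calls) and scores each feature by comparing its per-class totals with the totals expected if the feature were independent of the class. The ensemble classification stage is not shown.

```python
from collections import defaultdict


def chi2_scores(X, y):
    """Return one chi-square score per feature.

    Assumes non-negative feature values (e.g. counts).
    observed[c][j] = sum of feature j over the samples of class c
    expected[c][j] = (share of samples in class c) * (total of feature j)
    """
    n_samples = len(X)
    n_features = len(X[0])
    class_counts = defaultdict(int)
    observed = defaultdict(lambda: [0.0] * n_features)
    totals = [0.0] * n_features

    for row, label in zip(X, y):
        class_counts[label] += 1
        for j, value in enumerate(row):
            observed[label][j] += value
            totals[j] += value

    scores = []
    for j in range(n_features):
        score = 0.0
        for label, count in class_counts.items():
            expected = totals[j] * count / n_samples
            if expected > 0:  # skip features that are zero everywhere
                score += (observed[label][j] - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_k_best(X, y, k):
    """Keep the k features with the highest chi-square scores.

    Returns the kept column indices (in original order) and the reduced data.
    """
    scores = chi2_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]


# Hypothetical toy data: rows are samples, columns are three made-up features.
X_toy = [
    [3, 1, 0],
    [4, 1, 0],
    [0, 1, 2],
    [0, 1, 3],
]
y_toy = ["malware", "malware", "benign", "benign"]

kept_columns, X_reduced = select_k_best(X_toy, y_toy, k=2)
print(chi2_scores(X_toy, y_toy))
print(kept_columns, X_reduced)
```

In the toy data the middle feature takes the same value in every sample, so it carries no class information and is the one discarded. The reduced matrix would then be passed to the base learners or ensemble classifiers; in practice a library routine such as scikit-learn's `SelectKBest` with `chi2` would likely replace the hand-written scoring.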
Files
Name | Size | Checksum |
---|---|---|
D23590410421.pdf | 1.1 MB | md5:a25462ec085ab9c9946ec411cb79c4be |
Additional details
Related works
- Is cited by
- Journal article: 2249-8958 (ISSN)
Subjects
- ISSN: 2249-8958
- Retrieval Number: 100.1/ijeat.D23590410421