Towards Continuous Enrichment of Cyber Threat Intelligence: A Study on a Honeypot Dataset

Arnolnt Spyros; Angelos Papoutsis; Ilias Koritsas; Notis Mengidis; Christos Iliou; Dimitris Kavallieros; Theodora Tsikrika; Stefanos Vrochidis; Ioannis Kompatsiaris

doi:10.1109/CSR54599.2022.9850295

Published August 16, 2022 | Version v1

Conference paper Open

1. Information Technologies Institute, CERTH, Thessaloniki, Greece

Cyber Threat Intelligence helps organisations fight cyber threats by continuously providing information about the threat landscape, so that they can design their defences strategically and support decision making. In this context, honeypots are a widespread solution for gathering intelligence about threat actors. However, honeypots do not inherently provide information about the origin of threat groups, their resources, capabilities, and potential impact. Thus, we propose an approach that classifies threats as highly or less abusive based on their behavioural characteristics, using four ensemble machine learning algorithms applied to security incidents identified in a rule-based manner on a deployed honeypot. After preprocessing and hyperparameter tuning, the four examined models, Random Forest Classifier (RFC), Adaptive Boosting Classifier (AdaBoost), Light Gradient Boosting Machine (LGBM) and Extreme Gradient Boosting (XGBoost), achieve good results, with RFC and LGBM achieving the best recall (84% and 83%, respectively) and LGBM and XGBoost the best AUC (91% and 90%, respectively).
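For readers unfamiliar with the two metrics reported above, the following minimal sketch shows how recall and AUC can be computed for a binary "highly abusive" versus "less abusive" classifier. It is an illustration only, not the authors' evaluation code: the labels and scores are made-up toy values, the encoding of "highly abusive" as 1 and the 0.5 decision threshold are assumptions, and the function names `recall` and `roc_auc` are hypothetical helpers written for this example.

```python
from typing import Sequence


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of truly positive samples that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("recall is undefined without positive samples")
    return tp / (tp + fn)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via its pairwise-ranking interpretation:
    the probability that a random positive sample scores higher than a
    random negative one, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed, not from the paper): 1 = highly abusive, 0 = less abusive.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

# Assumed decision threshold of 0.5 to turn scores into hard predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"recall = {recall(y_true, y_pred):.4f}")
print(f"AUC    = {roc_auc(y_true, scores):.4f}")
```

Recall depends on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason the two metrics can favour different models.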

Notes

This is the accepted version of the paper. The final version is available at https://ieeexplore.ieee.org/document/9850295
