Conference paper Open Access

Towards Selecting Informative Content for Cyber Threat Intelligence

Panos Panagiotou; Christos Iliou; Konstantinos Apostolou; Theodora Tsikrika; Stefanos Vrochidis; Periklis Chatzimisios; Ioannis Kompatsiaris

Cyber security professionals increasingly need tools that automatically extract Cyber Threat Intelligence (CTI) from publicly available blogs and news sources. When such sources are used, an important part of the CTI extraction process is content selection, which filters out pages that do not contain CTI-related information. For this task, we apply supervised machine learning-based text classification techniques, trained on a new dataset created for the purposes of this work. Furthermore, we show in practice the importance of good content selection in a commonly used CTI extraction pipeline by inspecting the results of the Named Entity Recognition (NER) step that normally follows.
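
As a rough illustration of content selection as supervised text classification, the sketch below trains a small multinomial Naive Bayes classifier and uses it to keep only pages predicted to be CTI-related. It is not the paper's method: the abstract does not name the classifiers or features used. The training sentences, the labels "cti" and "other", and the names NaiveBayesFilter and select_cti_pages are all hypothetical and exist only for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesFilter:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary is the union of all tokens seen during training.
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            # Start from the log prior of the class.
            score = math.log(self.doc_counts[c] / total_docs)
            for tok in tokenize(text):
                if tok in self.vocab:  # tokens unseen in training are skipped
                    count = self.word_counts[c][tok]
                    score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[c] = score
        return max(scores, key=scores.get)


# Hypothetical toy training set; the paper's dataset is not reproduced here.
train_texts = [
    "new ransomware exploits a vulnerability in vpn servers",
    "malware campaign uses phishing emails to steal credentials",
    "company announces quarterly earnings and new office opening",
    "review of the latest smartphone camera and battery life",
]
train_labels = ["cti", "cti", "other", "other"]

model = NaiveBayesFilter().fit(train_texts, train_labels)


def select_cti_pages(pages):
    """Keep only the pages the classifier labels as CTI-related."""
    return [page for page in pages if model.predict(page) == "cti"]
```

In a pipeline of the kind the abstract describes, only the pages returned by select_cti_pages would be passed on to the NER step, so pages without CTI-related content would not contribute entities downstream.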

This is the accepted version of the paper. The final version of the paper can be found at