Published July 4, 2025 | Version v1
Conference paper Restricted

Uncovering Hidden Threats: Automated, Machine Learning-based Discovery & Extraction of Cyber Threat Intelligence from Online Sources

  • 1. Sphynx Hellas
  • 2. Sphynx Analytics
  • 3. Sphynx Technology Solutions AG

Description

The cyber-threat landscape is expanding constantly and rapidly, overwhelming human analysts as they try to keep track of the latest threats. This affects both the organisations that produce threat intelligence for consumption by third parties and the end consumers of that intelligence, who want, for example, to configure proactive defences to protect their infrastructure. This paper presents a novel Machine Learning-based solution that discovers and ingests Cyber Threat Intelligence (CTI) data from unstructured online sources, such as dark web forums, social media and online chatrooms, and outputs a stream of standardised, structured STIX CTI data. Further, a proof-of-concept is developed and assessed, both to investigate its effectiveness with real-life data sources and to offer insights into the large amount of potentially useful, threat-intelligence-relevant information that lies unused in online sources, and into the positive impact that discovering and structuring this information in a standardised, easily shareable manner can have in providing cyber defenders with an up-to-date and comprehensive view of the threat landscape.
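
To make the idea of turning unstructured text into structured STIX output concrete, the following is a minimal sketch only. It is not the paper's method: the Machine Learning-based extraction is replaced here by two simple regular expressions (an assumption made purely for illustration), and the function name `text_to_stix_bundle` and the sample forum post are hypothetical. The objects are built as plain dictionaries carrying the required properties of STIX 2.1 `vulnerability` and `indicator` objects.

```python
import json
import re
import uuid
from datetime import datetime, timezone

# Assumption: regexes stand in for the paper's ML-based extraction step.
CVE_RE = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)
IPV4_RE = re.compile(
    r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
)


def _now():
    """Return the current UTC time as a STIX timestamp with millisecond precision."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


def text_to_stix_bundle(text):
    """Hypothetical helper: extract CVE IDs and IPv4 addresses into a STIX 2.1 bundle."""
    ts = _now()
    objects = []
    # One vulnerability object per distinct CVE identifier found in the text.
    for cve in sorted({m.upper() for m in CVE_RE.findall(text)}):
        objects.append({
            "type": "vulnerability",
            "spec_version": "2.1",
            "id": f"vulnerability--{uuid.uuid4()}",
            "created": ts,
            "modified": ts,
            "name": cve,
            "external_references": [{"source_name": "cve", "external_id": cve}],
        })
    # One indicator object per distinct IPv4 address found in the text.
    for ip in sorted(set(IPV4_RE.findall(text))):
        objects.append({
            "type": "indicator",
            "spec_version": "2.1",
            "id": f"indicator--{uuid.uuid4()}",
            "created": ts,
            "modified": ts,
            "pattern": f"[ipv4-addr:value = '{ip}']",
            "pattern_type": "stix",
            "valid_from": ts,
        })
    return {"type": "bundle", "id": f"bundle--{uuid.uuid4()}", "objects": objects}


if __name__ == "__main__":
    # Hypothetical forum post, invented for this example.
    post = "Selling exploit for cve-2024-12345, C2 at 10.0.0.5 and 10.0.0.5"
    print(json.dumps(text_to_stix_bundle(post), indent=2))
```

A production pipeline would also need source discovery, relevance classification and deduplication across posts, none of which this sketch attempts.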

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Additional details

Related works

Is referenced by
Publication: 10.3390/electronics12081789 (DOI)

Funding

European Commission
PHOENI2X - A EUROPEAN CYBER RESILIENCE FRAMEWORK WITH ARTIFICIAL INTELLIGENCE-ASSISTED ORCHESTRATION & AUTOMATION FOR BUSINESS CONTINUITY, INCIDENT RESPONSE & INFORMATION EXCHANGE (101070586)