Combining text analysis techniques with unsupervised machine learning methodologies for improved software vulnerability management
Creators
- 1. CERTH-ITI
Description
Software vulnerability management is a prominent research area for security analysts and researchers. One of its main pillars is the grouping of vulnerabilities with similar characteristics, so that security analysts can organize prevention and mitigation actions more efficiently. To this end, this study proposes automated grouping of vulnerabilities from their technical descriptions, based on unsupervised machine learning techniques such as Latent Dirichlet Allocation and K-means combined with text analysis techniques. Results on a large vulnerability dataset (over 100,000 vulnerabilities) confirmed that clustering vulnerabilities from their descriptions could improve the homogeneity of vulnerability groups and simplify vulnerability management procedures.
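As a rough illustration of the general idea of grouping vulnerabilities by their descriptions, the following minimal sketch vectorizes a few descriptions with TF-IDF and clusters them with K-means, using only the Python standard library. It is not the authors' implementation: the four sample descriptions, the stopword list, the function names, and the deterministic farthest-point initialization are all assumptions made for this example, and the Latent Dirichlet Allocation step is omitted for brevity.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list, chosen only for this toy example.
STOPWORDS = {"a", "an", "the", "in", "of", "to", "via", "and",
             "allows", "attacker", "attackers", "remote"}

# Invented sample descriptions (not taken from the study's dataset).
descriptions = [
    "SQL injection in the login form allows remote attackers to execute arbitrary SQL commands",
    "Buffer overflow in the image parser allows attackers to cause a denial of service via a crafted file",
    "SQL injection via the search parameter allows attackers to read database contents",
    "Heap buffer overflow in the font parser via a crafted file",
]


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Return one L2-normalized TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed inverse document frequency.
        vec = [tf[t] * (math.log((1 + n) / (1 + df[t])) + 1.0) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain K-means with a deterministic farthest-point initialization."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Next centroid: the vector farthest from its nearest chosen centroid.
        centroids.append(
            max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


labels = kmeans(tfidf_vectors(descriptions), k=2)
for label, text in zip(labels, descriptions):
    print(label, text)
```

In this toy input the two SQL injection descriptions share one cluster label and the two buffer overflow descriptions share the other. On a dataset of the size described above, a library implementation of TF-IDF, K-means, and Latent Dirichlet Allocation would be the more practical choice.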