5G and Related Network Infrastructure CVE-Annotated Dataset: Distinguishing 5G Native, LTE, Auxiliary to 5G, and Non-5G Vulnerabilities

D'Alterio, Francesco; Bernardini, Andrea; Sagratella, Leonardo

doi:10.5281/zenodo.17450053

Published November 18, 2025 | Version v1

Dataset Open

1. Fondazione "Ugo Bordoni"

Dataset Description

The dataset was generated using the source code available at https://doi.org/10.5281/zenodo.17572825 and was subsequently manually annotated.

The dataset starts from CVEs selected with a keyword whitelist of 5G-related terms. It includes 1,531 annotated CVE entries retrieved from the NIST NVD, covering the years 2019 through 2025, each classified under one of four labels:

5g: if the vulnerability directly impacts 5G infrastructures, protocols, or specific 5G components, it receives the "5g" label, indicating direct relevance to 5G security.
auxiliary: if the vulnerability has indirect implications for 5G systems, such as those affecting shared infrastructure, common protocols, or components that bridge LTE and 5G networks, it is labeled "auxiliary".
lte: if the vulnerability does not directly affect 5G networks but is specific to LTE, it is classified "lte", representing legacy 4G vulnerabilities without 5G implications.
no5G: if the vulnerability demonstrates no relationship to 5G technology, either directly or indirectly, the "no5g" category is assigned.

In the following table, the frequency of the labels is presented:

Label	Frequency
5g	255
auxiliary	169
lte	95
no5G	1012

The dataset exhibits a significant class imbalance: the no5G class accounts for 1,012 of the 1,531 entries (about 66%), while the lte class accounts for only 95 (about 6%). This imbalance reflects the real-world distribution of vulnerabilities but may pose challenges for machine learning model training and evaluation.
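The class shares implied by the frequency table can be recomputed with a few lines of Python. This is a minimal sketch: the helper name `label_shares` is hypothetical, and the label list is rebuilt from the published frequencies rather than read from the CSV file.

```python
from collections import Counter


def label_shares(labels):
    """Return {label: (count, share)} for an iterable of label strings."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, n / total) for label, n in counts.items()}


# Frequencies as published in the table; not read from the CSV itself.
published = {"5g": 255, "auxiliary": 169, "lte": 95, "no5G": 1012}
labels = [label for label, n in published.items() for _ in range(n)]

shares = label_shares(labels)
for label, (n, share) in shares.items():
    print(f"{label:<10} {n:>5} {share:6.1%}")
```

The same helper could be applied to the Multiclass column once the file is loaded, which gives a quick check that the local copy matches the published frequencies.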

To address the class imbalance and facilitate binary classification tasks, a balanced subset is also provided, marked in an additional column of the CSV file. This balanced subset comprises 255 samples for the 5G class and 255 samples for the no5G class, totaling 510 entries.

Technical info (English)

Dataset Characteristics

Size: 1,531 CVE vulnerability records
Time period: January 1, 2019 - July 1, 2025
Format: CSV (Comma-Separated Values)
Encoding: UTF-8
Data collection date: July 11, 2025

Data Structure

The dataset is organized into 8 columns:

1. CVE ID

Name: CVE-ID
Type: String
Description: Unique vulnerability identifier according to CVE standard
Format: CVE-YYYY-NNNN, where the sequence number has four or more digits (e.g., CVE-2023-43239)
Purpose: Traceability and unique vulnerability reference
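As an illustration of the identifier format, the sketch below validates a CVE ID and splits it into year and sequence number. The function name `parse_cve_id` is hypothetical; the pattern assumes the standard CVE syntax of a four-digit year followed by a sequence number of at least four digits.

```python
import re

# "CVE-", a four-digit year, and a sequence number of four or more digits.
CVE_ID = re.compile(r"CVE-(\d{4})-(\d{4,})")


def parse_cve_id(value):
    """Return (year, sequence) for a well-formed CVE ID, or None otherwise."""
    match = CVE_ID.fullmatch(value.strip())
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_cve_id("CVE-2023-43239"))
print(parse_cve_id("CVE-2023-12"))
```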

2. CVE Description

Name: Description
Type: String
Description: Detailed technical description of the vulnerability, providing information such as vulnerability type, affected components, and potential impact
Format: Free text
Purpose: Provide a human-readable description of the vulnerability
Language: English, specialized technical terminology

3. CPE (Common Platform Enumeration)

Name: CPE
Type: String
Description: Standardized identifiers of the vulnerable platforms/products (there can be multiple values)
Format: cpe:2.3:part:vendor:product:version:..., cpe:2.3:...
Purpose: Precise identification of the affected system or component
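A CPE 2.3 formatted string has eleven colon-separated attributes after the `cpe:2.3` prefix, so a cell of this column can be split into named fields. The sketch below rests on two assumptions: that multiple values in a cell are separated by a comma followed by the next `cpe:2.3:` prefix (as the Format line suggests), and that colons inside a field are escaped with a backslash. The function names and the example vendor and product are hypothetical.

```python
import re

# Attribute order of a CPE 2.3 formatted string after the "cpe:2.3" prefix.
CPE_FIELDS = ("part", "vendor", "product", "version", "update", "edition",
              "language", "sw_edition", "target_sw", "target_hw", "other")


def parse_cpe(cpe):
    """Split one CPE 2.3 formatted string into a dict of named attributes."""
    # Split on colons that are not escaped with a backslash.
    parts = re.split(r"(?<!\\):", cpe.strip())
    if len(parts) != 13 or parts[:2] != ["cpe", "2.3"]:
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return dict(zip(CPE_FIELDS, parts[2:]))


def split_cpe_column(value):
    """Parse a cell holding one or more CPE strings (assumed comma-separated)."""
    items = re.split(r",\s*(?=cpe:2\.3:)", value.strip())
    return [parse_cpe(item) for item in items if item]


# Placeholder vendor and product names, not taken from the dataset.
cell = ("cpe:2.3:o:examplevendor:example_os:1.0:*:*:*:*:*:*:*, "
        "cpe:2.3:a:examplevendor:example_app:2.1:*:*:*:*:*:*:*")
parsed = split_cpe_column(cell)
print([(p["part"], p["product"], p["version"]) for p in parsed])
```

If the file uses a different separator between CPE values, only the pattern in `split_cpe_column` would need to change.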

4. CWE (Common Weakness Enumeration)

Name: CWE
Type: String
Description: CWE (Common Weakness Enumeration) identifier that classifies the type of weakness related to the given vulnerability according to MITRE's standardized taxonomy
Format: CWE-[number] (e.g., CWE-79, CWE-89, CWE-787)
Purpose: Categorize and identify the nature of the vulnerability according to a standardized hierarchical classification, facilitating the identification of common patterns, searching for similar vulnerabilities, and implementing appropriate mitigations

5. CVSS (Common Vulnerability Scoring System)

Name: CVSS
Type: Float in the range [0.0, 10.0]
Description: CVSS (Common Vulnerability Scoring System) score that provides a numerical assessment of the severity of a vulnerability based on its intrinsic characteristics.
Format: Decimal number with one decimal place precision
Purpose: Summarize the main characteristics of a vulnerability as a numeric severity score, which can then be expressed as a qualitative risk level (low, medium, high, or critical)
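The mapping from score to risk level can be sketched as follows, using the qualitative severity rating scale of CVSS v3.x (which also defines "none" for a score of 0.0). The function name `cvss_severity` is hypothetical.

```python
def cvss_severity(score):
    """Map a CVSS v3.x score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "none"
    if score < 4.0:
        return "low"       # 0.1 - 3.9
    if score < 7.0:
        return "medium"    # 4.0 - 6.9
    if score < 9.0:
        return "high"      # 7.0 - 8.9
    return "critical"      # 9.0 - 10.0


for score in (0.0, 3.9, 6.1, 7.5, 9.8):
    print(score, cvss_severity(score))
```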

6. CVSS Vector

Name: CVSS-Vector
Type: String
Description: CVSS (Common Vulnerability Scoring System) vector string that describes the metric characteristics of the vulnerability according to the CVSS v3.1 standard
Format: CVSS:3.1/AV:[N|A|L|P]/AC:[L|H]/PR:[N|L|H]/UI:[N|R]/S:[U|C]/C:[N|L|H]/I:[N|L|H]/A:[N|L|H]
Purpose: Provide a standardized and detailed representation of vulnerability characteristics to enable automatic CVSS score calculation and objective vulnerability comparison
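To illustrate how a score can be derived from a vector string, the sketch below parses a CVSS v3.1 vector and computes the base score with the metric weights and rounding rule of the CVSS v3.1 specification. The function names are hypothetical, only the eight base metrics are handled, and no claim is made that the dataset's CVSS column was produced this way.

```python
import math

# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required depends on whether Scope is Unchanged or Changed.
PR_WEIGHTS = {"U": {"N": 0.85, "L": 0.62, "H": 0.27},
              "C": {"N": 0.85, "L": 0.68, "H": 0.5}}
BASE_METRICS = {"AV", "AC", "PR", "UI", "S", "C", "I", "A"}


def parse_vector(vector):
    """Split 'CVSS:3.1/AV:N/...' into a {metric: value} dict."""
    prefix, *pairs = vector.strip().split("/")
    if prefix != "CVSS:3.1":
        raise ValueError(f"unsupported vector prefix: {prefix!r}")
    metrics = dict(pair.split(":", 1) for pair in pairs)
    missing = BASE_METRICS - metrics.keys()
    if missing:
        raise ValueError(f"missing base metrics: {sorted(missing)}")
    return metrics


def roundup(value):
    """Round up to one decimal place as defined in CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score for a vector string."""
    m = parse_vector(vector)
    changed = m["S"] == "C"
    iss = 1 - ((1 - WEIGHTS["C"][m["C"]])
               * (1 - WEIGHTS["I"][m["I"]])
               * (1 - WEIGHTS["A"][m["A"]]))
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
                      * PR_WEIGHTS[m["S"]][m["PR"]] * WEIGHTS["UI"][m["UI"]])
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10.0))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))  # 9.8
```

Comparing `base_score` on the CVSS-Vector column with the CVSS column is one way to spot rows where the two fields disagree, for example because the score was assigned under a different CVSS version.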

7. Multiclass label for 5G, LTE, Auxiliary, Non-5G classes

Name: Multiclass
Type: String
Description: Final classification label
Format: [5g | lte | auxiliary | no5g]
Purpose: Enable multiclass classification of vulnerabilities or systems based on their associated network technologies

8. Binary label for 5G/no5G

Name: Label
Type: String
Description: Binary classification label (with N/A values) marking the entries of the balanced dataset
Format: [5g | no5g | N/A], where "5g" is used for 5G network-related vulnerabilities, "no5g" for vulnerabilities not correlated to 5G networks, and "N/A" for entries not considered in the balanced dataset
Purpose: Binary classification for machine learning algorithms
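Extracting the balanced subset amounts to keeping the rows whose Label value is not N/A. The sketch below uses only the standard library and assumes that the CSV header uses the column names listed above and that excluded entries carry the literal string "N/A" or an empty cell. The function name `balanced_subset` is hypothetical, and the sample rows are placeholders, not records from the dataset.

```python
import csv
import io

# Placeholder rows, NOT taken from the dataset; only the column names follow the text.
SAMPLE = """CVE-ID,Description,CPE,CWE,CVSS,CVSS-Vector,Multiclass,Label
CVE-2099-0001,placeholder,,CWE-787,7.5,,5g,5g
CVE-2099-0002,placeholder,,CWE-79,6.1,,lte,N/A
CVE-2099-0003,placeholder,,CWE-89,9.8,,no5g,no5g
"""


def balanced_subset(stream, label_column="Label", missing=("", "N/A")):
    """Return the rows of a CSV text stream that belong to the balanced subset."""
    reader = csv.DictReader(stream)
    return [row for row in reader
            if (row[label_column] or "").strip() not in missing]


subset = balanced_subset(io.StringIO(SAMPLE))
print([(row["CVE-ID"], row["Label"]) for row in subset])
```

For the real file, the stream would come from `open("dataset_cve_5G_network.csv", newline="", encoding="utf-8")`; according to the description, the result should then contain 510 rows.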

Notes (English)

This work was partially supported by the SERICS project (PE00000014) under the NRRP MUR program, funded by the EU-NGEU.

Files

Name	Size	Checksum
dataset_cve_5G_network.csv	17.1 MB	md5:87014c06645606f79b5a0bceb78de043
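The published MD5 checksum can be used to confirm that a download is intact. A minimal sketch, assuming the file has been saved locally under its original name; the helper name `md5_of_file` is hypothetical.

```python
import hashlib

# Checksum as published on the record page.
EXPECTED_MD5 = "87014c06645606f79b5a0bceb78de043"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected=EXPECTED_MD5):
    """Compare the file's digest with the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `is_intact("dataset_cve_5G_network.csv")` should return `True` for an unmodified copy of the file.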

Additional details

Continues: Dataset: 10.5281/zenodo.16736495 (DOI)
Is derived from: Software: 10.5281/zenodo.17572825 (DOI)

