Published March 20, 2025 | Version v1
Dataset Open

SAFE-NID: Self-Attention with Normalizing-Flow Encodings for Network Intrusion Detection Dataset

  • 1. SRI International
  • 2. United States Military Academy
  • 3. United States Department of Defense

Description

These datasets provide packet-level labels for the payloads in the CIC-IDS-2017 and UNSW-NB15 network intrusion detection datasets. The data processing is discussed in full in our Transactions on Machine Learning Research paper, "SAFE-NID: Self-Attention with Normalizing-Flow Encodings for Network Intrusion Detection." Code for additional processing and experimentation is available in the repository listed under Software below.

The UNSW-NB15 dataset contains over 50 million non-empty payloads from nine attack classes, plus benign background traffic. The CIC-IDS-2017 dataset contains over 30 million non-empty payloads from fourteen attack classes, plus benign background traffic. Both datasets are highly imbalanced, with 20 to 25 times more benign packets than malicious ones.
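To check the class imbalance on any subset of the data you load, a minimal sketch along these lines may help. It assumes the per-packet labels are already in memory as a sequence and that benign traffic carries the label "benign". Both the function name and that label value are hypothetical and should be adapted to the actual file layout.

```python
from collections import Counter


def benign_to_malicious_ratio(labels, benign_label="benign"):
    """Return (# benign packets) / (# packets with any other label).

    `benign_label` is an assumed placeholder; replace it with whatever
    value marks benign traffic in the files you load.
    """
    counts = Counter(labels)
    benign = counts[benign_label]
    malicious = sum(counts.values()) - benign
    if malicious == 0:
        raise ValueError("no malicious packets in the given labels")
    return benign / malicious
```

On a representative sample of either dataset, the description above suggests a value of roughly 20 to 25.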

Files

Files (61.9 GB)

Checksum Size
md5:4be18a6a8eb5869254c5283486f52703 49.3 GB
md5:fb2f0025942749f00d38aad230d4c97a 12.7 GB
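Because the files are large, it is worth verifying each download against the MD5 digests listed above. The following is a minimal sketch using only the Python standard library. The function names are illustrative and are not part of the dataset's tooling, and the path you pass in depends on where you saved the archive.

```python
import hashlib

# MD5 digests copied from the file listing on this record.
LISTED_MD5 = {
    "4be18a6a8eb5869254c5283486f52703",  # 49.3 GB file
    "fb2f0025942749f00d38aad230d4c97a",  # 12.7 GB file
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte archives never have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """Return True if the file's MD5 equals one of the listed digests."""
    return md5_of_file(path) in LISTED_MD5
```

A result of False from `matches_listed_checksum` usually indicates a truncated or corrupted download, in which case the file should be fetched again.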

Additional details

Software

Repository URL
https://github.com/SRI-CSL/trinity-packet
Programming language
Python

References

  • Moustafa, Nour, and Jill Slay. "UNSW-NB15: A Comprehensive Data Set for Network Intrusion Detection Systems (UNSW-NB15 Network Data Set)." 2015 Military Communications and Information Systems Conference (MilCIS). IEEE, 2015.
  • Sharafaldin, Iman, Arash Habibi Lashkari, and Ali A. Ghorbani. "Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization." Proceedings of the 4th International Conference on Information Systems Security and Privacy (ICISSP), 2018, pp. 108-116.
  • Matejek, Brian, et al. "SAFE-NID: Self-Attention with Normalizing-Flow Encodings for Network Intrusion Detection." Transactions on Machine Learning Research, 2025, https://openreview.net/forum?id=hDywd5AbIM.