SAFE-NID: Self-Attention with Normalizing-Flow Encodings for Network Intrusion Detection Dataset
Authors/Creators
Description
These datasets provide packet-level labels for the payloads in the CIC-IDS-2017 and UNSW-NB15 network intrusion detection datasets. The data processing is discussed in full in our Transactions on Machine Learning Research paper, "SAFE-NID: Self-Attention with Normalizing-Flow Encodings for Network Intrusion Detection". Code for additional processing and experimentation is available in the repository listed under Software below. The UNSW-NB15 dataset contains over 50 million non-empty payloads from nine attack classes, along with benign background traffic. The CIC-IDS-2017 dataset contains over 30 million non-empty payloads from fourteen attack classes, along with benign background traffic. Both datasets are highly imbalanced, with 20 to 25 times more benign packets than malicious ones.
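Because of this imbalance, a classifier trained on raw packet counts may favor the benign class. One common mitigation is inverse-frequency class weighting. The sketch below is only an illustration and is not taken from the SAFE-NID paper or repository: the function name and the packet counts are hypothetical, with the counts chosen to mimic a 20:1 benign-to-malicious ratio.

```python
def inverse_frequency_weights(counts):
    """Return per-class weights total / (n_classes * class_count).

    `counts` maps a class label to its number of packets.
    Rarer classes receive proportionally larger weights.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Hypothetical counts (NOT the real dataset statistics), 20:1 imbalance.
counts = {"benign": 2_000_000, "malicious": 100_000}
weights = inverse_frequency_weights(counts)

# The malicious class is weighted 20 times more heavily than the benign class.
ratio = weights["malicious"] / weights["benign"]
print(weights, ratio)
```

These weights could then be passed to a weighted loss function; whether weighting, resampling, or another strategy suits these datasets is a design choice this record does not specify.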
Files
(61.9 GB)
| Checksum | Size |
|---|---|
| md5:4be18a6a8eb5869254c5283486f52703 | 49.3 GB |
| md5:fb2f0025942749f00d38aad230d4c97a | 12.7 GB |
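Given the size of these files, it may be worth checking a download against its listed MD5 checksum before processing it. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are hypothetical, and the local file path is a placeholder because the file names are not given in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte files are never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with a listed value.

    Accepts the value with or without the leading 'md5:' prefix.
    """
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Usage (placeholder path; substitute the actual downloaded file):
# ok = matches_checksum("path/to/downloaded_file",
#                       "md5:4be18a6a8eb5869254c5283486f52703")
```

A mismatch usually indicates an incomplete or corrupted download rather than a problem with the data itself.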
Additional details
Software
- Repository URL: https://github.com/SRI-CSL/trinity-packet
- Programming language: Python
References
- Moustafa, Nour, and Jill Slay. "UNSW-NB15: A Comprehensive Data Set for Network Intrusion Detection Systems (UNSW-NB15 Network Data Set)." 2015 Military Communications and Information Systems Conference (MilCIS), IEEE, 2015.
- Sharafaldin, Iman, Arash Habibi Lashkari, and Ali A. Ghorbani. "Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization." Proceedings of the 4th International Conference on Information Systems Security and Privacy (ICISSP), 2018, pp. 108-116.
- Matejek, Brian, et al. "SAFE-NID: Self-Attention with Normalizing-Flow Encodings for Network Intrusion Detection." Transactions on Machine Learning Research, 2025. https://openreview.net/forum?id=hDywd5AbIM