AIT Netflow Data Set
Creators
- AIT Austrian Institute of Technology
Description
AIT Netflow Data Sets
This repository contains labeled synthetic netflows suitable for evaluation of intrusion detection systems, federated learning, and alert aggregation. The netflows are generated from the packet captures contained in the AIT-LDS-v2.0. A detailed description of that dataset is available in [1]. The packet captures were collected from eight testbeds that were built at the Austrian Institute of Technology (AIT) following the approach by [2]. Please cite these papers if the data is used for academic publications.
In brief, each dataset corresponds to a testbed representing a small enterprise network that includes a mail server, file share, WordPress server, VPN, firewall, and other components. Normal user behavior is simulated to generate background noise over a time span of 4-6 days. At some point during that period, a sequence of attack steps is launched against the network. The attack sequence comprises the following steps:
- Scans (nmap, WPScan, dirb)
- Webshell upload (CVE-2020-24186)
- Password cracking (John the Ripper)
- Privilege escalation
- Remote command execution
- Data exfiltration (DNSteal)
This repository contains the following files:
- <testbed>_netflows.zip: CSV files of labeled TCP and UDP netflows for each testbed.
- README.md: Instructions on how to reproduce the generation and labeling of the netflows from the AIT-LDS-v2.0. Running the Python scripts is only necessary if you want to extend or change the labeling procedure.
- 1_format_dataset_info.ipynb: Generates the tables necessary for labeling (see README.md).
- 2_label_logs.ipynb: Labels the netflows (see README.md).
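As a quick sanity check after download, the labeled CSV files can be read directly from an archive without unpacking it. The following is a minimal sketch using only the Python standard library. It assumes that each CSV has a header row and that the label is stored in a column named `label`; that column name is a placeholder and not confirmed by this description, so check the actual CSV header (or README.md) before relying on it.

```python
import csv
import io
import zipfile
from collections import Counter


def count_labels(zip_path, label_column="label"):
    """Count rows per label value across all CSV files in one archive.

    `label_column` is an ASSUMED column name (hypothetical); inspect the
    CSV header of the real files and pass the correct name.
    """
    counts = Counter()
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            # Skip anything in the archive that is not a CSV file.
            if not name.lower().endswith(".csv"):
                continue
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                for row in csv.DictReader(text):
                    # Rows without the assumed column are counted separately.
                    counts[row.get(label_column, "<missing>")] += 1
    return counts
```

Calling `count_labels` on one of the `<testbed>_netflows.zip` archives returns a `Counter` that maps each label value to its number of flows, which gives a first impression of the class balance in that testbed.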
Acknowledgements: Partially funded by the FFG projects INDICAETING (868306) and DECEPT (873980), and the EU projects GUARD (833456) and PANDORA (SI2.835928).
If you use the dataset, please cite the following publications:
[1] M. Landauer, F. Skopik, M. Frank, W. Hotwagner, M. Wurzenberger, and A. Rauber. "Maintainable Log Datasets for Evaluation of Intrusion Detection Systems". IEEE Transactions on Dependable and Secure Computing, vol. 20, no. 4, pp. 3466-3482.
[2] M. Landauer, F. Skopik, M. Wurzenberger, W. Hotwagner, and A. Rauber. "Have it Your Way: Generating Customized Log Datasets With a Model-Driven Simulation Testbed". IEEE Transactions on Reliability, vol. 70, no. 1, pp. 402-415, March 2021. doi: 10.1109/TR.2020.3031317.
Files (273.7 MB)
MD5 checksum | Size |
---|---|
ebcd97a6049926090238e88520089798 | 10.4 kB |
52eedcedbdc4391c575dd2606bc893cc | 31.5 kB |
d3024ef6033fbd45982464f6b6f50984 | 25.9 MB |
840224491fb31d548e7ee0ecbd951db5 | 31.2 MB |
4143e85bea156333b463ebd5548fa443 | 2.1 kB |
290aa5b8b0a9d0515b7e99329c467364 | 20.8 MB |
e217db1c0307a8966db9e79609789fcb | 25.7 MB |
25c5a62572d37f088353cf7bbd58f590 | 37.8 MB |
3a73afcaef00a1bbe53f1c39060c2f17 | 34.6 MB |
2c45193c0babc8ec45f356f3e39118a6 | 41.2 MB |
2c10183cf7aca03e9fb80408a360f34c | 56.4 MB |
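The MD5 checksums in the file listing can be used to confirm that a download completed without corruption. The sketch below computes the MD5 digest of a local file in chunks with Python's standard `hashlib` module and compares it with an expected value. The helper names `md5_of_file` and `matches_checksum` are illustrative only, and which checksum belongs to which file has to be taken from the repository page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's MD5 digest with an expected hex string."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `matches_checksum(local_path, "<md5 from the listing>")` returns `True` only if the local file is byte-identical to the published one. MD5 is adequate here for detecting transfer errors, but it is not a safeguard against deliberate tampering.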