Published June 14, 2023 | Version v1
Dataset Open

Network traffic datasets created by Single Flow Time Series Analysis

  • 1. Czech Technical University in Prague
  • 2. CESNET, a.l.e.

Description

These datasets were created for the paper "Network Traffic Classification Based on Single Flow Time Series Analysis" by Josef Koumar, Karel Hynek, and Tomáš Čejka, published at the 19th International Conference on Network and Service Management (CNSM) 2023. If you use the datasets, please cite:

J. Koumar, K. Hynek and T. Čejka, "Network Traffic Classification Based on Single Flow Time Series Analysis," 2023 19th International Conference on Network and Service Management (CNSM), Niagara Falls, ON, Canada, 2023, pp. 1-7, doi: 10.23919/CNSM59352.2023.10327876.

This Zenodo repository contains 23 datasets created from 15 well-known published datasets, which are cited in the table below. Each dataset contains 69 features created by time series analysis of single flow time series. The features are described in detail in the file feature_description.pdf.
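Several of the CSV files are multiple gigabytes in size, so it can help to inspect the columns before loading a whole file. The following is a minimal sketch using only the Python standard library. It assumes the files are comma-separated with a header row, which this record does not state explicitly; the function name `peek_csv` is hypothetical, and the actual column names should be checked against feature_description.pdf.

```python
import csv
from itertools import islice


def peek_csv(path, n_rows=5):
    """Return (header, first n_rows data rows) without reading the whole file.

    Assumption: the file is comma-separated and its first line is a header.
    """
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        header = next(reader)               # first line: column names
        rows = list(islice(reader, n_rows))  # stop after n_rows data rows
    return header, rows


if __name__ == "__main__":
    # "botnet_binary.csv" is one of the files in this record; adjust the path
    # to wherever the file was downloaded.
    header, rows = peek_csv("botnet_binary.csv", n_rows=3)
    print(len(header), "columns")
    for row in rows:
        print(row)
```

Because the reader is consumed lazily, only the first few lines are read from disk, regardless of the file size.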

 

The following table describes each dataset file:

| File name | Detection problem | Citation of original raw dataset |
|---|---|---|
| botnet_binary.csv | Binary detection of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014. |
| botnet_multiclass.csv | Multi-class classification of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014. |
| cryptomining_design.csv | Binary detection of cryptomining; the design part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022. |
| cryptomining_evaluation.csv | Binary detection of cryptomining; the evaluation part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022. |
| dns_malware.csv | Binary detection of malware DNS | Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021. |
| doh_cic.csv | Binary detection of DoH | Mohammadreza MontazeriShatoori et al. Detection of DoH tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020. |
| doh_real_world.csv | Binary detection of DoH | Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022. |
| dos.csv | Binary detection of DoS | Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019. |
| edge_iiot_binary.csv | Binary detection of IoT malware | Mohamed Amine Ferrag et al. Edge-IIoTset: A new comprehensive realistic cyber security dataset of IoT and IIoT applications: Centralized and federated learning, 2022. |
| edge_iiot_multiclass.csv | Multi-class classification of IoT malware | Mohamed Amine Ferrag et al. Edge-IIoTset: A new comprehensive realistic cyber security dataset of IoT and IIoT applications: Centralized and federated learning, 2022. |
| https_brute_force.csv | Binary detection of HTTPS Brute Force | Jan Luxemburk et al. HTTPS Brute-force dataset with extended network flows, November 2020. |
| ids_cic_binary.csv | Binary detection of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018. |
| ids_cic_multiclass.csv | Multi-class classification of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018. |
| ids_unsw_nb_15_binary.csv | Binary detection of intrusion in IDS | Nour Moustafa and Jill Slay. UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In 2015 Military Communications and Information Systems Conference (MilCIS), pages 1–6. IEEE, 2015. |
| ids_unsw_nb_15_multiclass.csv | Multi-class classification of intrusion in IDS | Nour Moustafa and Jill Slay. UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In 2015 Military Communications and Information Systems Conference (MilCIS), pages 1–6. IEEE, 2015. |
| iot_23.csv | Binary detection of IoT malware | Sebastian Garcia et al. IoT-23: A labeled dataset with malicious and benign IoT network traffic, January 2020. More details: https://www.stratosphereips.org/datasets-iot23 |
| ton_iot_binary.csv | Binary detection of IoT malware | Nour Moustafa. A new distributed architecture for evaluating AI-based security systems at the edge: Network TON_IoT datasets. Sustainable Cities and Society, 72:102994, 2021. |
| ton_iot_multiclass.csv | Multi-class classification of IoT malware | Nour Moustafa. A new distributed architecture for evaluating AI-based security systems at the edge: Network TON_IoT datasets. Sustainable Cities and Society, 72:102994, 2021. |
| tor_binary.csv | Binary detection of TOR | Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017. |
| tor_multiclass.csv | Multi-class classification of TOR | Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017. |
| vpn_iscx_binary.csv | Binary detection of VPN | Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related Features. In ICISSP, pages 407–414, 2016. |
| vpn_iscx_multiclass.csv | Multi-class classification of VPN | Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related Features. In ICISSP, pages 407–414, 2016. |
| vpn_vnat_binary.csv | Binary detection of VPN | Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022. |
| vpn_vnat_multiclass.csv | Multi-class classification of VPN | Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022. |

 

Notes

This research was funded by the Ministry of Interior of the Czech Republic, grant No. VJ02010024: Flow-Based Encrypted Traffic Analysis, and by the Grant Agency of the CTU in Prague, grant No. SGS23/207/OHK3/3T/18, funded by the MEYS of the Czech Republic.

Files

Files (36.7 GB)

| MD5 checksum | Size |
|---|---|
| md5:9c5c889ba2e005e326d7e563382e4768 | 236.3 MB |
| md5:24098dd542dc0899bc101de76cf2e508 | 118.6 MB |
| md5:8b3c8bac2a3ddcf9f475af91dbb1f153 | 1.7 GB |
| md5:d60ec607ad715b322de9aaf322032467 | 921.5 MB |
| md5:8af3b551e14b15116263c1d879367152 | 7.5 MB |
| md5:4de72e9d705d057d4c72d06dbfd69a1e | 1.1 GB |
| md5:553333d3c12a6c210a8128252db765d7 | 5.5 GB |
| md5:ef2125ff0677ec5be9454261c6a23d01 | 2.6 GB |
| md5:539e43986d55a0c63827d39a373675bb | 1.5 GB |
| md5:0789e1b55877e36e555d55e7a2eee20d | 1.5 GB |
| md5:6e35857d0fd928b6f78d590b7c2681e8 | 267.6 kB |
| md5:d81389e280f4cf2904b28826064b96d2 | 816.0 MB |
| md5:6371628e3290b453e8dcd42849a47d83 | 2.3 GB |
| md5:ace7d4d77eace8271759af80ad59ca26 | 2.3 GB |
| md5:e58648235d8866f41cf3939002531ca4 | 3.1 GB |
| md5:3b7105473af5a49be92dbe5932db95be | 3.1 GB |
| md5:cdf2757df73d7642f23ff5d63458b064 | 2.3 GB |
| md5:98ec489567682322d79040e7fc9f9f99 | 3.6 GB |
| md5:b7c769d59981d3c1e3f078f0ac72c04a | 3.6 GB |
| md5:e95f06ab64f3346857f5e8a1e5a8de89 | 106.3 MB |
| md5:8fb51e42262d64f333524c899eec8b8d | 101.7 MB |
| md5:a668b406c54c0bf4013ad3b4e9be7171 | 133.2 MB |
| md5:06f5fd6a9657d7725fc2f6cb8c27de01 | 20.7 MB |
| md5:b503a11a0a552e6e6dc8375ea9d2a0c4 | 35.0 MB |
| md5:538bd16e22c3af4d95acf0968478d6ed | 32.2 MB |
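
The MD5 checksums listed above can be used to confirm that a large download arrived intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listing` are hypothetical, and the expected checksum must be taken from the record's entry for the specific file that was downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Both the local path and the checksum are placeholders to be replaced
    # with the downloaded file and its listed checksum.
    ok = matches_listing("downloaded_file.csv", "md5:<checksum from the listing>")
    print("checksum OK" if ok else "checksum mismatch; consider re-downloading")
```

MD5 is used here only because it is the checksum the record publishes; it detects transfer corruption but is not a defence against deliberate tampering.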

Additional details

References

  • S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
  • Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022.
  • Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021.
  • Mohammadreza MontazeriShatoori et al. Detection of DoH tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020.
  • Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022.
  • Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019.
  • Mohamed Amine Ferrag et al. Edge-IIoTset: A new comprehensive realistic cyber security dataset of IoT and IIoT applications: Centralized and federated learning, 2022.
  • Jan Luxemburk et al. HTTPS Brute-force dataset with extended network flows, November 2020.
  • Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018.
  • Nour Moustafa and Jill Slay. UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set). In 2015 Military Communications and Information Systems Conference (MilCIS), pages 1–6. IEEE, 2015.
  • Sebastian Garcia et al. IoT-23: A labeled dataset with malicious and benign IoT network traffic, January 2020. More details: https://www.stratosphereips.org/datasets-iot23
  • Nour Moustafa. A new distributed architecture for evaluating AI-based security systems at the edge: Network TON_IoT datasets. Sustainable Cities and Society, 72:102994, 2021.
  • Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017.
  • Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related Features. In ICISSP, pages 407–414, 2016.
  • Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022.