Dataset Open Access

HTTPS Brute-force dataset with extended network flows

Jan Luxemburk; Karel Hynek; Tomas Cejka

We are publishing a dataset that we created for designing a detector of brute-force attacks in HTTPS. The dataset consists of extended network flows that we captured with the flow exporter ipfixprobe. Apart from traditional fields such as source and destination IP addresses and ports, each flow contains information (size, direction, inter-packet time, TCP flags) about up to the first 100 packets. Packet sizes are taken from the transport layer (TCP, UDP); packets with zero payload (e.g., TCP ACKs) are ignored.
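As an illustration of the per-packet information described above, the following minimal Python sketch builds such a sequence from a list of observed packets. It is not the exporter's implementation: the names `PacketInfo` and `first_packets`, the direction encoding, the millisecond unit, and the choice to measure inter-packet time between recorded (non-empty) packets are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

MAX_PACKETS = 100  # "up to the first 100 packets" per the description


@dataclass
class PacketInfo:
    size: int        # transport-layer payload size in bytes
    direction: int   # assumed encoding: 1 = client to server, -1 = server to client
    ipt_ms: int      # inter-packet time; unit (ms) is an assumption
    tcp_flags: int   # raw TCP flags byte


def first_packets(
    packets: Iterable[Tuple[int, int, int, int]], limit: int = MAX_PACKETS
) -> List[PacketInfo]:
    """Keep per-packet info for the first `limit` packets with non-zero payload.

    Each input tuple is (timestamp_ms, payload_size, direction, tcp_flags).
    """
    records: List[PacketInfo] = []
    prev_ts = None
    for ts, size, direction, flags in packets:
        if size == 0:
            continue  # zero-payload packets (e.g., TCP ACKs) are ignored
        # Assumption: time is measured from the previously recorded packet.
        ipt = 0 if prev_ts is None else ts - prev_ts
        records.append(PacketInfo(size, direction, ipt, flags))
        prev_ts = ts
        if len(records) == limit:
            break
    return records


# Hypothetical flow: a request, a bare ACK, and a response.
example = first_packets([(0, 517, 1, 0x18), (5, 0, -1, 0x10), (12, 1400, -1, 0x18)])
```

In this example the bare ACK is dropped, so `example` holds two records, and a flow with more than 100 non-empty packets would be truncated to the first 100.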

We publish three files:

  • flows.csv, which contains raw flow data.
  • aggregated_flows.csv, which contains aggregated flows.
  • samples.csv, which contains samples with extracted features. This data can be used for training a machine-learning classification model.

 

All IP addresses, source ports, and TLS SNIs are SHA-256-hashed. The CLASS column is 0 for benign samples and 1 for brute-force samples.
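A short Python sketch of what working with hashed identifiers and the CLASS column might look like. Only the column name CLASS comes from the description; the column name `SRC_IP`, the sample rows, and the hashing details (UTF-8 string input, no salt, hexadecimal output) are assumptions, since the exact hashing procedure is not stated here.

```python
import csv
import hashlib
import io
from collections import Counter


def sha256_hex(value: str) -> str:
    """Hash a string identifier with SHA-256 (assumed: UTF-8 input, no salt)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Tiny in-memory stand-in for one of the CSV files; SRC_IP is a hypothetical
# column name, and the addresses are documentation-range examples.
SAMPLE_CSV = (
    "SRC_IP,CLASS\n"
    f"{sha256_hex('192.0.2.10')},0\n"
    f"{sha256_hex('198.51.100.7')},1\n"
    f"{sha256_hex('198.51.100.7')},1\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# CLASS is 0 for benign samples and 1 for brute-force samples.
class_counts = Counter(int(row["CLASS"]) for row in rows)

# Hashing is deterministic, so equal addresses still map to equal values
# and can be grouped or counted without revealing the original address.
unique_sources = {row["SRC_IP"] for row in rows}
```

Because the same input always yields the same hash, per-source grouping remains possible on the published data even though the original addresses cannot be read back.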


Brute-force data
The brute-force data were generated with three popular attack tools: Ncrack, THC Hydra, and Patator. Attacks were performed against these applications:

  • WordPress
  • Joomla
  • MediaWiki
  • Ghost
  • Grafana
  • Discourse
  • PhpBB
  • OpenCart
  • Redmine
  • Nginx
  • Apache

The SCENARIO column indicates which tool and application were used to generate the sample.

Benign data
Benign data consist of eight captures from a backbone network. For these samples, the SCENARIO column indicates the individual capture.
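Since SCENARIO identifies the tool and application for brute-force samples and the capture for benign samples, one possible use is to hold out whole scenarios when evaluating a classifier, so that test samples come from a tool, application, or capture not seen in training. The sketch below shows such a split in Python; the scenario values and the helper `split_by_scenario` are hypothetical, and only the column names SCENARIO and CLASS come from the description.

```python
import csv
import io

# In-memory stand-in for samples.csv; the scenario labels are made up.
SAMPLE_CSV = """SCENARIO,CLASS
tool_a_app_x,1
tool_b_app_y,1
capture_1,0
capture_2,0
"""


def split_by_scenario(rows, held_out):
    """Split rows into (train, test), putting held-out scenarios in test."""
    train = [r for r in rows if r["SCENARIO"] not in held_out]
    test = [r for r in rows if r["SCENARIO"] in held_out]
    return train, test


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Hold out one brute-force scenario and one benign capture.
train, test = split_by_scenario(rows, held_out={"tool_b_app_y", "capture_2"})
```

Splitting on SCENARIO rather than on individual rows avoids placing samples from the same attack run or capture on both sides of the split.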

 

 

 

Files (209.1 MB)

  • brute-force-dataset.zip (209.1 MB), md5:37467d0fddf27eeeeae1388800f0d4d8