Dataset Open Access

HTTPS Brute-force dataset with extended network flows

Jan Luxemburk; Karel Hynek; Tomas Cejka

We are publishing a dataset that we created for designing a detector of brute-force attacks in HTTPS. The dataset consists of extended network flows that we captured with the flow exporter ipfixprobe. Apart from traditional fields such as source and destination IP addresses and ports, each flow contains information (size, direction, inter-packet time, TCP flags) about up to the first 100 packets. Packet sizes are taken from the transport layer (TCP, UDP); packets with zero payload (e.g., TCP ACKs) are ignored.

We publish three files:

  • flows.csv, which contains raw flow data.
  • aggregated_flows.csv, which contains aggregated flows.
  • samples.csv, which contains samples with extracted features. This data can be used for training a machine-learning classification model.


All IP addresses, source ports, and TLS SNIs are SHA-256-hashed. The CLASS column is 0 for benign samples and 1 for brute-force samples.
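
Because the identifying fields are hashed, a known value can only be matched against the dataset by hashing it the same way. The sketch below shows one way this might look. It assumes the hash was computed over the plain UTF-8 text of the value with no salt, which the description does not confirm; the helper name sha256_hex and the example address are hypothetical.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Return the hex SHA-256 digest of a string encoded as UTF-8."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical lookup value (a documentation-range address).
# ASSUMPTION: the dataset hashed the textual form of each value without a salt.
digest = sha256_hex("192.0.2.1")
print(len(digest))  # a SHA-256 hex digest is always 64 characters long
```

If the digests produced this way never match any value in the files, the assumption about the input encoding is likely wrong and the exact hashing procedure would need to be confirmed with the authors.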

Brute-force data
The brute-force data were generated with three popular attack tools: Ncrack, THC-Hydra, and Patator. Attacks were performed against these applications:

  • WordPress
  • Joomla
  • MediaWiki
  • Ghost
  • Grafana
  • Discourse
  • PhpBB
  • OpenCart
  • Redmine
  • Nginx
  • Apache

The SCENARIO column indicates which tool and application were used to generate the sample.

Benign data
Benign data consist of eight captures from a backbone network. The SCENARIO column indicates the individual capture.
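
A quick way to check how the samples are distributed is to count rows per CLASS and SCENARIO pair. The following is a minimal sketch using only the standard library. It assumes the files are comma-separated with a header row that contains columns named exactly CLASS and SCENARIO; the function name and the inline rows (including the scenario labels and the n_packets column) are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import Counter


def count_by_class_and_scenario(lines):
    """Count rows per (CLASS, SCENARIO) pair from an iterable of CSV lines."""
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[(row["CLASS"], row["SCENARIO"])] += 1
    return counts


# Made-up rows for illustration only; real scenario labels may differ.
demo = io.StringIO(
    "n_packets,SCENARIO,CLASS\n"
    "12,scenario_a,1\n"
    "30,scenario_a,1\n"
    "8,scenario_b,0\n"
)
for (cls, scenario), n in sorted(count_by_class_and_scenario(demo).items()):
    print(cls, scenario, n)
```

To run this on the published data, pass an open file instead of the in-memory buffer, for example open("samples.csv", newline="").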




Files (209.1 MB)


Cite as