Published March 1, 2020 | Version 1.0.0-alpha
Dataset Open

Topological Botnet Detection

  • 1. Harvard University
  • 2. Cornell University

Description

Large-scale topological botnet detection datasets for graph machine learning and network security, containing 4 different types of synthetic botnet topologies and 2 different real botnet topologies overlaid onto large real-world background traffic communication networks.

Each dataset contains a specific botnet topology, with 960 graphs in total, randomly split into train/val/test sets. Both nodes and edges are labeled to indicate whether they belong to the botnet (evil) community. Learning tasks can target node-level prediction, detecting whether each node is a botnet node, or recovery of the whole botnet community by also predicting whether each edge belongs to the original botnet.
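Since both the node and the edge labels are binary, the same scoring routine can serve either task. The following is a minimal sketch in plain Python, not the evaluation API from the accompanying repository; the function name `binary_metrics` and the convention that label 1 marks botnet membership are assumptions made for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Precision, recall and F1 for the positive (botnet) class.

    Assumption: labels are 0 (background) or 1 (botnet). The same function
    can be applied to per-node labels or to per-edge labels.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example: five nodes, three of which are botnet nodes.
node_labels = [0, 1, 1, 0, 1]
node_preds = [0, 1, 0, 0, 1]
precision, recall, f1 = binary_metrics(node_labels, node_preds)
```

Because botnet nodes are typically a small fraction of each graph, metrics focused on the positive class such as these tend to be more informative than plain accuracy.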

There are no unique features on nodes or edges, as the detection is purely based on topological properties within the graph (topological discovery).
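Because the graphs carry no node or edge attributes, a model has to be given placeholder inputs derived from the topology alone. Two common choices are a constant feature for every node or a simple structural feature such as node degree. The sketch below illustrates both in plain Python; the function names, the edge-list representation, and the treatment of edges as undirected are assumptions for illustration and do not describe the dataset's storage format.

```python
from collections import Counter


def constant_features(num_nodes):
    """One identical feature per node, so only topology can drive predictions."""
    return [[1.0] for _ in range(num_nodes)]


def degree_features(num_nodes, edges):
    """Node degree computed from an edge list.

    Assumption: `edges` is a list of (u, v) pairs with node ids in
    range(num_nodes), and each pair is treated as an undirected edge.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return [degree[i] for i in range(num_nodes)]


# Toy star graph: node 1 is connected to nodes 0, 2 and 3.
toy_edges = [(0, 1), (1, 2), (1, 3)]
toy_degrees = degree_features(4, toy_edges)
```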

 

Notes

For automatic data downloading, an evaluation API, and a GNN (graph neural network) training pipeline, see https://github.com/jzhou316/botnet-detection

Files

Files (21.6 GB)

Checksum                                Size
md5:523e1374ad14081135f1af7458624109    3.7 GB
md5:567ed05fd6dcb5aea70b7dd6700fd619    3.4 GB
md5:7aae9cfd589d2d4766ae209ac73755f8    3.9 GB
md5:764d79fa0978a7f7eb354ad01a0dc10b    3.5 GB
md5:3e88036a2de6c24e6d556337afcc6418    3.4 GB
md5:09ad97bcc909f1c3425738502ccf8b56    3.6 GB
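
With files of several gigabytes each, it may be worth checking a download against the MD5 checksum listed above before using it. The following is a minimal sketch using Python's standard `hashlib` module; the helper names and the local file path in the usage note are hypothetical, since the file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("downloaded_file", "md5:523e1374ad14081135f1af7458624109")` would return `True` only if the hypothetical local file `downloaded_file` is an intact copy of the first file in the list.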