
Dataset Open Access

SAPPAN: Combined Network and Host Data

Cermak, Milan; Obrecht, Mischa

The data were acquired from a small simulated environment consisting of one Windows host (host data collection) and a router that observes all network traffic passing to the host. Two attack scenarios were performed in this environment, and data relevant to these attacks were extracted and further processed. In the first case, the attack scenario was based on a vulnerability in the Drupal web application that enabled the download and execution of malicious code providing a remote shell to the attacker. In the second case, the scenario was based on an old version of the Samba file-sharing service that was vulnerable to the EternalBlue attack, which allows the attacker to execute commands and obtain a remote shell.

The dataset is divided into separate directories according to the attacks contained. For the Drupal vulnerability scenario, datasets from a failed and a successful attempt to exploit the vulnerability are included. Four datasets were created during the individual phases of the SMB file-sharing vulnerability scenario. Each directory contains a normalized network traffic capture and the corresponding host data in preformatted JSON.


Drupal Vulnerability Scenario

The attack scenario is based on an old Drupal server (v 8.5.0) with the known vulnerability CVE-2018-7600 (also called Drupalgeddon). An attacker exploits this vulnerability to run code remotely and gain access to the vulnerable server via a remote shell. The connection is realized by the Meterpreter trojan of type python/meterpreter/reverse_tcp. The binary is created by the Metasploit generator msfvenom and obfuscated using the attacker's custom obfuscation technique to bypass Windows antivirus. The created binary file is delivered to the victim host through remote code execution in Drupal: the injected "finger" command downloads the payload from the payload delivery and C2 server. The attacker then launches the trojan using additional commands injected through the Drupal vulnerability. Once launched, it automatically establishes a connection with the attacker (remote shell) through the payload delivery and C2 server. As a result, the attacker gains full access to the system and can execute any commands (in the scenario, only the "whoami" command is executed).

Two datasets were generated during the scenario and its preparation. The first was obtained during the preparatory work, when the server's defense mechanisms blocked the attacker's attempt to download the file (the "MpCmdRun.exe" command was used instead of the "finger" command). The second dataset contains a complete attack performed after modifying the executed commands to overcome these defense mechanisms.

Samba File Sharing Vulnerability Scenario

The attack scenario is based on an unpatched Windows 7 host with the known vulnerability CVE-2017-0144 (also called EternalBlue). The scenario is divided into four parts covering the individual phases of the attack and the failed exploitation attempts. In the first phase, the attacker scans the open ports on the client device and verifies whether the SMB file sharing service is vulnerable to the EternalBlue attack. In the second phase, the attacker unsuccessfully tries to exploit the vulnerability using a standard Metasploit module; this attempt does not result in a remote connection. In the third phase, a specialized exploit is used to attack the service using previously known credentials. In the fourth phase, the attacker uses another script that makes the scenario more complex and enables the attack to be performed without credentials.

For each mentioned phase, a separate dataset was generated, capturing all events in the form of packet traces and corresponding host data.


Dataset Features

The network traffic is provided as standard PCAP files containing all captured packets, including the complete application layer.
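As a rough illustration of how such a capture could be inspected without third-party tools, the following sketch walks the records of a classic libpcap file using only the Python standard library. It assumes the files use the classic pcap layout (not pcapng), which the dataset description does not state explicitly; the function name `iter_pcap_records` is hypothetical and not part of the dataset.

```python
import struct

# Magic bytes of the classic libpcap format mapped to
# (struct byte-order prefix, timestamp fraction units per second).
_PCAP_MAGICS = {
    b"\xd4\xc3\xb2\xa1": ("<", 1_000_000),      # little-endian, microseconds
    b"\xa1\xb2\xc3\xd4": (">", 1_000_000),      # big-endian, microseconds
    b"\x4d\x3c\xb2\xa1": ("<", 1_000_000_000),  # little-endian, nanoseconds
    b"\xa1\xb2\x3c\x4d": (">", 1_000_000_000),  # big-endian, nanoseconds
}


def iter_pcap_records(data: bytes):
    """Yield (timestamp_in_seconds, packet_bytes) for each record.

    Assumption: `data` holds a classic libpcap file (24-byte global
    header followed by 16-byte record headers), not a pcapng file.
    """
    if data[:4] not in _PCAP_MAGICS:
        raise ValueError("not a classic pcap file (possibly pcapng)")
    endian, units = _PCAP_MAGICS[data[:4]]
    offset = 24  # skip the global header
    while offset + 16 <= len(data):
        # Record header: seconds, fraction, captured length, original length.
        ts_sec, ts_frac, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset
        )
        offset += 16
        yield ts_sec + ts_frac / units, data[offset:offset + incl_len]
        offset += incl_len
```

The function takes the file content as bytes, so a capture could be passed in after reading it in binary mode. For protocol-level analysis, a dedicated tool such as Wireshark is likely a better fit.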

The raw host data were reduced to contain only the following attributes:

  • event_id - unique identifier of the event, assigned by a preprocessor
  • event_type - a type of the event
  • time_created - time when the sensor recorded the event
  • event_data - event type-specific payload
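The four attributes above suggest a simple way to load and summarize the host data. The following is a minimal sketch, assuming each host data file holds either a single JSON object or a JSON array of event objects; the exact file layout is not specified here, and the helper names and sample values are hypothetical.

```python
import json
from collections import Counter

# Attributes listed in the dataset description.
REQUIRED_KEYS = {"event_id", "event_type", "time_created", "event_data"}


def load_events(text: str) -> list:
    """Parse host events from JSON text and check the expected attributes.

    Assumption: the text is a JSON array of event objects or one object.
    """
    events = json.loads(text)
    if isinstance(events, dict):
        events = [events]
    for event in events:
        missing = REQUIRED_KEYS - event.keys()
        if missing:
            raise ValueError(f"event is missing attributes: {sorted(missing)}")
    return events


def count_by_type(events: list) -> Counter:
    """Count how many events of each event_type were recorded."""
    return Counter(event["event_type"] for event in events)


# Invented sample values for illustration only; real values will differ.
SAMPLE = """[
  {"event_id": 1, "event_type": "type_a", "time_created": "t1", "event_data": {}},
  {"event_id": 2, "event_type": "type_b", "time_created": "t2", "event_data": {}},
  {"event_id": 3, "event_type": "type_a", "time_created": "t3", "event_data": {}}
]"""

summary = count_by_type(load_events(SAMPLE))
```

Since event_data is specific to each event type, any further processing would probably need to branch on event_type first.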