Cyber Attack Manifestations - Log Data Set (CAM-LDS)
Authors/Creators
- 1. Austrian Institute of Technology
Description
This repository contains synthetic log data and network traffic generated as consequences of cyber attacks. The data set was collected in the virtual test environment AttackBed at the AIT Austrian Institute of Technology. It includes attacks corresponding to 13 tactics and 81 distinct techniques from MITRE ATT&CK. The data is labeled with technique identifiers and is suitable for evaluating log and alert interpretation approaches. The network topology of AttackBed is based on Linux and represents a small enterprise, including security zones (Internet, DMZ, LAN), file shares, a video surveillance system, a repository server, DNS, a firewall, user workstations, etc. Our GitHub page provides scripts for using this data set for LLM-based attack interpretation, along with some prompt-response samples. For more information on the generation of the data and a detailed description of all attack scenarios, please see our publication [1]. Please cite that publication if you use the data.
Note that the data set focuses on attack manifestations; therefore, no simulation of normal/benign user behavior was active during collection. All logs and alerts generated during the attack intervals are thus either consequences of cyber attacks or part of idle system activity. In each simulation run we collect logs (audit, authentication, Apache access and error, syslog, package management, cron, ZoneMinder, FTP, Puppet, Docker, Nextcloud, mail, system performance metrics, etc.) and netflows, as well as alerts from host-based (Wazuh) and network-based (Suricata) intrusion detection systems. Due to their large size, we provide the network packet captures of the CAM-LDS in a separate repository.
The data set comprises seven scenarios, each representing an attack chain. Some scenarios involve variants of certain attack steps, which are simulated separately. The following enumeration provides an overview of all scenarios and some highlighted attack steps (lists are not exhaustive).
- Scenario 1: Video Server Exploit
- Scans (dnsenum, nmap, nikto, ffuf, linpeas), Exploits (ZoneMinder/CVE-2023-26035), Privilege Escalation (logrotate race condition, PwnKit/CVE-2021-4034, reverse shells, local accounts), Persistence (PAM, SSH key, useradd), Discovery (credentials, devices)
- 18 simulation runs (6 variants for Privilege Escalation and 3 variants for Persistence)
- Scenario 2: Linux Malware
- Command and Control (implant, rootkit), Discovery (processes, configurations, credentials, policies), Exfiltration (archives), Scans (nmap)
- 2 simulation runs (2 variants for Command and Control)
- Scenario 3: Lateral Movement
- Brute-force Login (hydra), Credential Sniffing (tcpdump), Collection (repositories), Lateral Movement (shared files, malicious class, rebuild package), Impact (ransomware, deletion)
- 6 simulation runs (2 variants for Initial Access and 3 variants for Lateral Movement)
- Scenario 4: Network Attack
- Port Knocking, Sudo Caching, Modify Firewall Rules
- 1 simulation run
- Scenario 5: Network Sniffing
- Network Sniffing (bettercap), Re-use Access Token
- 1 simulation run
- Scenario 6: Attack on Client
- Command and Control (malicious file, malicious plugin, screensharing), Exfiltration (cron), Collection (keylogger, xclip)
- 5 simulation runs (2 variants for Command and Control and 2 variants for Persistence, plus 1 variant for Collection)
- Scenario 7: Docker Attack
- Scans (dnsenum, smtp-user-enum), Brute-force Login (hydra), Exploits (Nextcloud), Docker Container Escape, Exfiltration (credentials)
- 1 simulation run
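The run counts listed above are consistent with simulating every combination of variants once per scenario. The following minimal sketch checks that arithmetic. It assumes (this is an inference from the listed numbers, not a statement from the data set documentation) that the number of runs is the product of the variant counts, plus one additional standalone run for the Collection variant in Scenario 6. The `SCENARIOS` table and the `run_count` helper are hypothetical names introduced for illustration.

```python
from math import prod

# Variant counts per scenario as listed above. The second entry is the number
# of additional standalone runs (only Scenario 6 lists one, for Collection).
# ASSUMPTION: runs = product of variant counts + standalone runs.
SCENARIOS = {
    1: ([6, 3], 0),  # Privilege Escalation x Persistence
    2: ([2], 0),     # Command and Control
    3: ([2, 3], 0),  # Initial Access x Lateral Movement
    4: ([], 0),
    5: ([], 0),
    6: ([2, 2], 1),  # Command and Control x Persistence, plus Collection
    7: ([], 0),
}


def run_count(variant_counts, extra_runs):
    """Number of simulation runs under the product assumption.

    prod([]) is 1, so a scenario without variants yields a single run.
    """
    return prod(variant_counts) + extra_runs


counts = {sid: run_count(v, extra) for sid, (v, extra) in SCENARIOS.items()}
total_runs = sum(counts.values())
print(counts, total_runs)
```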
The repository is structured as follows. Each zip-archive represents one simulation run. The name of the archive identifies the scenario and variant: scenario_<Scenario-ID>_<Optional-Variants>. For example, scenario_1_autostart_localaccount is a recording of Scenario 1 with the autostart variant for Privilege Escalation and the localaccount variant for Persistence. The zip-archives contain a list of directories that correspond to the hosts from which the data was collected (e.g., firewall or video server). Relevant subdirectories and files are as follows.
- <host>/logs: System log files, network traffic, and alerts collected from that host.
- <host>/configs: Configuration files of system services on that host.
- <host>/facts.json: Summary of system configurations of that host in JSON format.
- attacker/logs/attackmate.json: Attack execution logs with time-based ground truth labels (MITRE ATT&CK techniques).
- attacker/logs/output.log: Output of attack step executions (e.g., scan results).
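The naming convention described above can be parsed mechanically, for example to select all runs of one scenario. The following is a minimal sketch, not a tool shipped with the data set. `RunName` and `parse_run_name` are hypothetical names, and the sketch assumes that variant names themselves contain no underscores and that archives carry a `.zip` extension.

```python
from pathlib import Path
from typing import NamedTuple, Tuple


class RunName(NamedTuple):
    scenario_id: int
    variants: Tuple[str, ...]


def parse_run_name(name: str) -> RunName:
    """Split 'scenario_<Scenario-ID>_<Optional-Variants>' into its parts.

    ASSUMPTION: variants are separated by underscores and contain none.
    """
    stem = Path(name).name
    if stem.endswith(".zip"):
        stem = stem[: -len(".zip")]
    prefix, _, rest = stem.partition("_")
    if prefix != "scenario" or not rest:
        raise ValueError(f"not a simulation run archive: {name!r}")
    scenario_id, *variants = rest.split("_")
    return RunName(int(scenario_id), tuple(variants))


print(parse_run_name("scenario_1_autostart_localaccount.zip"))
print(parse_run_name("scenario_4.zip"))
```

Archives that do not follow the pattern, such as the manifestations archives, raise a `ValueError` in this sketch so that they are not silently treated as simulation runs.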
As an alternative to the attackmate.json file, the labels of each scenario/variant/step are available here.
We also provide two separate archives where the system logs of individual attack steps have been extracted by correlating their timestamps with the time intervals of each step:
- manifestations_raw: Contains all logs extracted for specific attack steps.
- manifestations_filtered: Contains extracted logs that are direct consequences of attacks; in particular, logs that remain after filtering of recurring events and processes known to correspond to normal system activity. Logs that are difficult to relate to attack steps (collectd.log) or related to network traffic (eve.json) are only available in the raw manifestations.
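The two extraction stages described above can be sketched as follows: first assign each log event to the attack step whose time interval contains its timestamp, then drop lines that match patterns of known normal activity. This is an illustrative sketch only, not the authors' extraction code. The `Step` type, both function names, the half-open interval convention, the timestamps, the log lines, and the `^CRON:` pattern are all assumptions made for the example.

```python
import re
from datetime import datetime
from typing import NamedTuple


class Step(NamedTuple):
    name: str
    start: datetime
    end: datetime


def extract_by_step(events, steps):
    """Map each (timestamp, line) event to the steps whose interval covers it.

    ASSUMPTION: intervals are half-open, i.e. start <= timestamp < end.
    Events outside every interval are discarded.
    """
    extracted = {step.name: [] for step in steps}
    for timestamp, line in events:
        for step in steps:
            if step.start <= timestamp < step.end:
                extracted[step.name].append(line)
    return extracted


def filter_known(lines, benign_patterns):
    """Keep only lines that match none of the benign regular expressions."""
    compiled = [re.compile(pattern) for pattern in benign_patterns]
    return [line for line in lines if not any(c.search(line) for c in compiled)]


# Made-up intervals and log lines, for illustration only.
steps = [
    Step("1_autostart_localaccount-1", datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 5)),
    Step("1_autostart_localaccount-2", datetime(2024, 1, 1, 10, 5), datetime(2024, 1, 1, 10, 7)),
]
events = [
    (datetime(2024, 1, 1, 10, 1), "sshd: Failed password for root"),
    (datetime(2024, 1, 1, 10, 5, 30), "CRON: session opened for user root"),
    (datetime(2024, 1, 1, 10, 6), "sudo: COMMAND=/usr/bin/id"),
    (datetime(2024, 1, 1, 11, 0), "event outside of any step"),
]

raw = extract_by_step(events, steps)
filtered = {name: filter_known(lines, [r"^CRON:"]) for name, lines in raw.items()}
print(raw)
print(filtered)
```

In this sketch `raw` plays the role of manifestations_raw and `filtered` the role of manifestations_filtered; the actual filter rules used for the data set are described in the publication [1].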
After extraction, we group the logs by step, sequence of steps, and technique. Each manifestations archive therefore contains the following directories:
- steps: Contains the logs from each step of the attack chain, e.g., 1_autostart_localaccount-5 contains the logs that are generated at the fifth step of Scenario 1 with autostart and localaccount variants.
- sequences: Contains the logs from sequences of steps, where sequences are built from successive steps with identical techniques. Note that some logs are duplicated since multiple techniques may be assigned to single steps.
- techniques: Contains the logs corresponding to certain attack techniques, referenced by attack ID. Note that some logs are duplicated since multiple techniques may be assigned to single steps.
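The grouping into sequences and techniques can be illustrated with a short sketch. It assumes that "identical techniques" means that two successive steps carry exactly the same set of technique labels, which is one plausible reading and not confirmed by the description. The function names and the step-to-technique assignments below are hypothetical; the technique IDs are real MITRE ATT&CK identifiers but are used here for illustration only and are not taken from the data set labels.

```python
from collections import defaultdict
from itertools import groupby


def build_sequences(steps):
    """Merge successive steps that carry the same set of techniques.

    steps: ordered list of (step_name, set_of_technique_ids).
    ASSUMPTION: 'identical techniques' means equal label sets.
    """
    return [
        [name for name, _ in group]
        for _, group in groupby(steps, key=lambda step: frozenset(step[1]))
    ]


def group_by_technique(steps):
    """Collect step names per technique; multi-label steps appear repeatedly."""
    by_technique = defaultdict(list)
    for name, techniques in steps:
        for technique in sorted(techniques):
            by_technique[technique].append(name)
    return dict(by_technique)


# Hypothetical labels for the first steps of one run.
steps = [
    ("1_autostart_localaccount-1", {"T1595"}),
    ("1_autostart_localaccount-2", {"T1595"}),
    ("1_autostart_localaccount-3", {"T1190"}),
    ("1_autostart_localaccount-4", {"T1068", "T1548"}),
]

sequences = build_sequences(steps)
techniques = group_by_technique(steps)
print(sequences)
print(techniques)
```

The fourth step appears under both of its techniques in `techniques`, which mirrors the duplication of logs noted above.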
If you use the dataset, please cite the following publication:
[1] M. Landauer, W. Hotwagner, T. Boenke, F. Skopik, M. Wurzenberger. CAM-LDS: Cyber Attack Manifestations for Automatic Interpretation of System Logs and Security Alerts. [PDF]
Files (7.4 GB)
| MD5 checksum | Size |
|---|---|
| cfc14140b2e396a60989b8379eafca1f | 213.8 MB |
| 9cbce6fd08044d298335a77eacd2e83f | 535.9 MB |
| 95ba3c1ff5ab98613631df3301fad442 | 178.0 MB |
| 3449b840fc5acb675a6581f0e4d0cd0e | 177.4 MB |
| 5953fe50f084aa1adbc14ec50eee5ff4 | 176.6 MB |
| aef4d07305b30b37010780250becba07 | 173.9 MB |
| fbfaa124a628b73486702cbf31d33564 | 173.7 MB |
| d9da77c3e5fb2e50d068dde0069b7026 | 173.6 MB |
| eaa091433ff9002b6b85bf2fa0080ba7 | 174.0 MB |
| 874eb73fc20470f53b629ab00388826b | 173.4 MB |
| e3c0c22fb072e9add09bb62ab8e0853d | 175.7 MB |
| f53684af8dee51574577424937e01a5c | 176.2 MB |
| 8a5a5e30d3e0895646560bc96f5cb07b | 174.3 MB |
| 3a241884c830495d00b2243bf9070ad8 | 174.7 MB |
| f1c441a222148634afc3afa3100281c0 | 178.0 MB |
| e7e5bd19cda62a5bfb9a68525715cd78 | 175.3 MB |
| 2788b802adc9ed6f009b0d1f5b03c37e | 173.5 MB |
| 24b2faf18e8810b989184b5b84218c5f | 173.0 MB |
| a3a30007ea1e87bc894698db9f4bfdca | 172.9 MB |
| 514017c08291ac392cb3b6557711b139 | 173.1 MB |
| c33aca73d6962414bd6c2c3f82b934c4 | 160.5 MB |
| 322b249ab429a1718e97f0086a321645 | 161.0 MB |
| 0837edd952382f1d917d5491775cd0af | 206.8 MB |
| 5c9618023dcf6b44a69eb877d6135392 | 203.4 MB |
| 845049f82c502e255ff93eda15ff4c81 | 197.8 MB |
| bfd7e44cdb2d038a9deab800350e4b98 | 206.6 MB |
| d6e6ae004c8cc253838a5256274141b0 | 202.7 MB |
| 7b818ded4dc38d0e4dd717e7d97135c1 | 198.8 MB |
| f1bf6ecd793c6c51f922ee43e33d7cf4 | 201.5 MB |
| c56fce722a7b714eba004d6fd6319879 | 202.8 MB |
| aedd7c0b51700b16c04539f10784d5d4 | 202.3 MB |
| 1fb5d4047c39eb3664ac96a05d7e4add | 198.6 MB |
| 5ee9684d14d15c924f885faa16b9a95b | 199.7 MB |
| 1b8ed751081f0221a7153919addc8da2 | 364.3 MB |
| c1f7938bbbfec05bbf3e29604b69b479 | 447.1 MB |
| 88122133a82079e07e7ed1b782bc203f | 160.1 MB |
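The MD5 checksums listed above can be used to verify that an archive was downloaded completely. The following is a minimal sketch using Python's standard `hashlib` module. The function names and the demo file name are hypothetical, and the expected value must be taken from the listing entry that belongs to the downloaded file.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Compare a file's MD5 with an expected value, with or without 'md5:'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of(path) == expected


# Self-contained demonstration on a small file written by this sketch.
# For a real archive, pass its path and the checksum from the listing.
with open("md5_demo.bin", "wb") as handle:
    handle.write(b"hello")
print(verify("md5_demo.bin", "md5:5d41402abc4b2a76b9719d911017c592"))
```

Reading in chunks keeps memory use constant, which matters for archives of several hundred megabytes. MD5 is adequate here for detecting transfer errors, not as a protection against deliberate tampering.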