Published March 4, 2026 | Version v2
Dataset Open

Cyber Attack Manifestations - Log Data Set (CAM-LDS)

  • 1. Austrian Institute of Technology

Description

This repository contains synthetic log data and network traffic generated as consequences of cyber attacks. The data set was collected in the virtual test environment AttackBed at the AIT Austrian Institute of Technology. It includes attacks corresponding to 13 tactics and 81 distinct techniques from MITRE ATT&CK. The data is labeled with technique identifiers and is suitable for evaluating log and alert interpretation approaches. The network topology of AttackBed is based on Linux and represents a small enterprise, including security zones (Internet, DMZ, LAN), file shares, a video surveillance system, a repository server, DNS, a firewall, user workstations, etc. Check out our GitHub page for scripts to utilize this data set for LLM-based attack interpretation and some prompt-response samples. For more information on the generation of the data and a detailed description of all attack scenarios, please check out our publication [1]. Please cite that publication if you use the data.

Note that the data set focuses on attack manifestations; therefore, no simulation of normal/benign user behavior was active during collection. All logs and alerts generated during the attack intervals are thus consequences of cyber attacks or part of idle system activities. In each simulation run we collect logs (audit, authentication, Apache access and error, syslog, package management, cron, ZoneMinder, FTP, Puppet, Docker, Nextcloud, mail, system performance metrics, etc.) and netflows as well as alerts from host-based (Wazuh) and network-based (Suricata) intrusion detection systems. Due to their large size, we provide the network packet captures of the CAM-LDS in a separate repository.

The data set comprises seven scenarios, each representing an attack chain. Some scenarios involve variants of certain attack steps, which are simulated separately. The following enumeration provides an overview of all scenarios and some highlighted attack steps (lists are not exhaustive).

  • Scenario 1: Video Server Exploit
    • Scans (dnsenum, nmap, nikto, ffuf, linpeas), Exploits (Zoneminder/CVE-2023-26035), Privilege Escalation (logrotate race condition, PwnKit/CVE-2021-4034, reverse shells, local accounts), Persistence (PAM, SSH key, useradd), Discovery (credentials, devices)
    • 18 simulation runs (6 variants for Privilege Escalation and 3 variants for Persistence)
  • Scenario 2: Linux Malware
    • Command and Control (implant, rootkit), Discovery (processes, configurations, credentials, policies), Exfiltration (archives), Scans (nmap)
    • 2 simulation runs (2 variants for Command and Control)
  • Scenario 3: Lateral Movement
    • Brute-force Login (hydra), Credential Sniffing (tcpdump), Collection (repositories), Lateral Movement (shared files, malicious class, rebuild package), Impact (ransomware, deletion)
    • 6 simulation runs (2 variants for Initial Access and 3 variants for Lateral Movement)
  • Scenario 4: Network Attack
    • Port Knocking, Sudo Caching, Modify Firewall Rules
    • 1 simulation run
  • Scenario 5: Network Sniffing
    • Network Sniffing (bettercap), Re-use Access Token
    • 1 simulation run
  • Scenario 6: Attack on Client
    • Command and Control (malicious file, malicious plugin, screensharing), Exfiltration (cron), Collection (keylogger, xclip)
    • 5 simulation runs (2 variants for Command and Control and 2 variants for Persistence, plus 1 variant for Collection)
  • Scenario 7: Docker Attack
    • Scans (dnsenum, smtp-user-enum), Brute-force Login (hydra), Exploits (Nextcloud), Docker Container Escape, Exfiltration (credentials)
    • 1 simulation run
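
The run counts listed above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming that independent variant dimensions are fully combined (e.g., 6 x 3 for Scenario 1) and that the extra Collection variant in Scenario 6 adds a single run; the variable names are illustrative and not part of the data set.

```python
# Runs per scenario, derived from the variant counts stated in the overview.
# Assumption: variant dimensions are combined as a full cross product.
runs_per_scenario = {
    1: 6 * 3,      # Privilege Escalation x Persistence
    2: 2,          # Command and Control
    3: 2 * 3,      # Initial Access x Lateral Movement
    4: 1,
    5: 1,
    6: 2 * 2 + 1,  # Command and Control x Persistence, plus one Collection run
    7: 1,
}

total_runs = sum(runs_per_scenario.values())
print(total_runs)
```

Under these assumptions the total is 34 simulation runs, which together with the two manifestations archives described further down would account for the 36 entries in the file listing.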

The repository is structured as follows. Each zip-archive represents one simulation run. The name of the archive determines the scenario and variant: scenario_<Scenario-ID>_<Optional-Variants>. For example, scenario_1_autostart_localaccount is a recording of Scenario 1 with the autostart variant for Privilege Escalation and the localaccount variant for Persistence. The zip-archives contain a list of directories that correspond to the hosts from which the data was collected (e.g., firewall or video server). Relevant subdirectories and files are as follows.

  • <host>/logs: System log files, network traffic, and alerts collected from that host.
  • <host>/configs: Configuration files of system services on that host.
  • <host>/facts.json: Summary of system configurations of that host in JSON format.
  • attacker/logs/attackmate.json: Attack execution logs with time-based ground truth labels (MITRE ATT&CK techniques).
  • attacker/logs/output.log: Output of attack step executions (e.g., scan results).
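
The naming scheme and directory layout described above lend themselves to a small helper. The sketch below is illustrative only: it assumes that variant names are separated by underscores and contain no underscores themselves (as in scenario_1_autostart_localaccount), and that host directories sit at the root of each archive. The names RunName, parse_archive_name and list_hosts are hypothetical and do not come from the data set or its scripts.

```python
import zipfile
from typing import NamedTuple, Tuple


class RunName(NamedTuple):
    scenario_id: str
    variants: Tuple[str, ...]


def parse_archive_name(filename: str) -> RunName:
    """Split 'scenario_<Scenario-ID>_<Optional-Variants>' into its parts."""
    stem = filename[:-4] if filename.lower().endswith(".zip") else filename
    parts = stem.split("_")
    if len(parts) < 2 or parts[0] != "scenario" or not parts[1]:
        raise ValueError(f"not a scenario archive name: {filename!r}")
    return RunName(scenario_id=parts[1], variants=tuple(parts[2:]))


def list_hosts(archive) -> list:
    """Return the top-level directory names of an archive.

    Assumption: each top-level directory corresponds to one host
    (e.g., firewall) or to the attacker.
    """
    with zipfile.ZipFile(archive) as zf:
        return sorted({name.split("/")[0] for name in zf.namelist() if "/" in name})


print(parse_archive_name("scenario_1_autostart_localaccount.zip"))
```

If an archive wraps its content in an additional root folder, list_hosts would return that folder instead, so the result should be checked against the actual archive before relying on it.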

As an alternative to the attackmate.json file, you can find the labels of each scenario/variant/step here.

We also provide two separate archives where the system logs of individual attack steps have been extracted by correlating their timestamps with the time intervals of each step:

  • manifestations_raw: Contains all logs extracted for specific attack steps.
  • manifestations_filtered: Contains extracted logs that are direct consequences of attacks; in particular, logs that remain after filtering out recurring events and processes known to correspond to normal system activity. Logs that are difficult to relate to attack steps (collectd.log) or that are related to network traffic (eve.json) are only available in the raw manifestations.
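
The extraction idea behind these two archives (correlating log timestamps with step intervals, then filtering known background activity) can be sketched as follows. This is not the authors' extraction code: the tuple-based input format, the half-open interval convention, the function names and the example pattern are all assumptions made for illustration, and the actual structure of attackmate.json is not reproduced here.

```python
import re
from datetime import datetime


def assign_to_steps(events, steps):
    """Map log lines to attack steps by timestamp.

    events: iterable of (timestamp, line) pairs
    steps:  list of (step_id, start, end) with half-open intervals [start, end)
    Lines outside every interval are dropped.
    """
    by_step = {step_id: [] for step_id, _, _ in steps}
    for ts, line in events:
        for step_id, start, end in steps:
            if start <= ts < end:
                by_step[step_id].append(line)
    return by_step


def drop_known_benign(lines, benign_patterns):
    """Remove lines that match any pattern regarded as normal system activity."""
    compiled = [re.compile(p) for p in benign_patterns]
    return [ln for ln in lines if not any(c.search(ln) for c in compiled)]


# Made-up example data; none of these lines come from the data set.
t = lambda s: datetime.fromisoformat(f"2026-01-01T{s}")
steps = [("step-1", t("10:00:00"), t("10:00:30")),
         ("step-2", t("10:00:30"), t("10:01:00"))]
events = [(t("10:00:05"), "sshd[99]: Failed password for root"),
          (t("10:00:10"), "systemd-timesyncd[7]: Synchronized to time server"),
          (t("10:00:30"), "sudo: session opened for user root"),
          (t("10:02:00"), "kernel: idle")]

raw = assign_to_steps(events, steps)
filtered = {sid: drop_known_benign(lines, [r"systemd-timesyncd"])
            for sid, lines in raw.items()}
print(filtered)
```

In this sketch, raw plays the role of manifestations_raw and filtered the role of manifestations_filtered; which events count as normal activity is a choice that has to be made per environment.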

After extraction, we group the logs by step, sequence of steps, and technique. Each manifestations archive therefore contains the following directories:

  • steps: Contains the logs from each step of the attack chain, e.g., 1_autostart_localaccount-5 contains the logs that are generated at the fifth step of Scenario 1 with autostart and localaccount variants.
  • sequences: Contains the logs from sequences of steps, where sequences are built from successive steps with identical techniques. Note that some logs are duplicated since multiple techniques may be assigned to single steps.
  • techniques: Contains the logs corresponding to certain attack techniques, referenced by attack ID. Note that some logs are duplicated since multiple techniques may be assigned to single steps.
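
The grouping into sequences and techniques can be illustrated with a short sketch. It reflects one possible reading of the description above, namely that a sequence is a maximal run of consecutive steps sharing a technique, which is also why a step labeled with several techniques shows up more than once. The function names are hypothetical, and the step-to-technique assignments in the example are made up (the identifiers themselves are regular MITRE ATT&CK technique IDs).

```python
from collections import defaultdict


def build_sequences(steps):
    """Group successive steps that share a technique.

    steps: ordered list of (step_id, set_of_technique_ids)
    Returns a list of (technique_id, [step_ids]) in the order the runs end.
    """
    sequences = []
    open_runs = {}  # technique id -> step ids of the run currently in progress
    for step_id, techniques in steps:
        # Close every run whose technique is not continued by this step.
        for tech in list(open_runs):
            if tech not in techniques:
                sequences.append((tech, open_runs.pop(tech)))
        # Extend or start a run for each technique of this step.
        for tech in sorted(techniques):
            open_runs.setdefault(tech, []).append(step_id)
    for tech in sorted(open_runs):
        sequences.append((tech, open_runs[tech]))
    return sequences


def group_by_technique(steps):
    """Collect all step ids per technique, regardless of adjacency."""
    grouped = defaultdict(list)
    for step_id, techniques in steps:
        for tech in sorted(techniques):
            grouped[tech].append(step_id)
    return dict(grouped)


# Made-up labels for four steps of a hypothetical run.
steps = [(1, {"T1595"}), (2, {"T1595", "T1046"}), (3, {"T1046"}), (4, {"T1059"})]
print(build_sequences(steps))
print(group_by_technique(steps))
```

Step 2 carries two techniques in this example and therefore appears in two sequences and under two techniques, mirroring the duplication noted in the list above.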

If you use the dataset, please cite the following publication:

[1] M. Landauer, W. Hotwagner, T. Boenke, F. Skopik, M. Wurzenberger. CAM-LDS: Cyber Attack Manifestations for Automatic Interpretation of System Logs and Security Alerts. [PDF]

Files


Files (7.4 GB)

Checksum  Size
md5:cfc14140b2e396a60989b8379eafca1f  213.8 MB
md5:9cbce6fd08044d298335a77eacd2e83f  535.9 MB
md5:95ba3c1ff5ab98613631df3301fad442  178.0 MB
md5:3449b840fc5acb675a6581f0e4d0cd0e  177.4 MB
md5:5953fe50f084aa1adbc14ec50eee5ff4  176.6 MB
md5:aef4d07305b30b37010780250becba07  173.9 MB
md5:fbfaa124a628b73486702cbf31d33564  173.7 MB
md5:d9da77c3e5fb2e50d068dde0069b7026  173.6 MB
md5:eaa091433ff9002b6b85bf2fa0080ba7  174.0 MB
md5:874eb73fc20470f53b629ab00388826b  173.4 MB
md5:e3c0c22fb072e9add09bb62ab8e0853d  175.7 MB
md5:f53684af8dee51574577424937e01a5c  176.2 MB
md5:8a5a5e30d3e0895646560bc96f5cb07b  174.3 MB
md5:3a241884c830495d00b2243bf9070ad8  174.7 MB
md5:f1c441a222148634afc3afa3100281c0  178.0 MB
md5:e7e5bd19cda62a5bfb9a68525715cd78  175.3 MB
md5:2788b802adc9ed6f009b0d1f5b03c37e  173.5 MB
md5:24b2faf18e8810b989184b5b84218c5f  173.0 MB
md5:a3a30007ea1e87bc894698db9f4bfdca  172.9 MB
md5:514017c08291ac392cb3b6557711b139  173.1 MB
md5:c33aca73d6962414bd6c2c3f82b934c4  160.5 MB
md5:322b249ab429a1718e97f0086a321645  161.0 MB
md5:0837edd952382f1d917d5491775cd0af  206.8 MB
md5:5c9618023dcf6b44a69eb877d6135392  203.4 MB
md5:845049f82c502e255ff93eda15ff4c81  197.8 MB
md5:bfd7e44cdb2d038a9deab800350e4b98  206.6 MB
md5:d6e6ae004c8cc253838a5256274141b0  202.7 MB
md5:7b818ded4dc38d0e4dd717e7d97135c1  198.8 MB
md5:f1bf6ecd793c6c51f922ee43e33d7cf4  201.5 MB
md5:c56fce722a7b714eba004d6fd6319879  202.8 MB
md5:aedd7c0b51700b16c04539f10784d5d4  202.3 MB
md5:1fb5d4047c39eb3664ac96a05d7e4add  198.6 MB
md5:5ee9684d14d15c924f885faa16b9a95b  199.7 MB
md5:1b8ed751081f0221a7153919addc8da2  364.3 MB
md5:c1f7938bbbfec05bbf3e29604b69b479  447.1 MB
md5:88122133a82079e07e7ed1b782bc203f  160.1 MB
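
The MD5 checksums in the listing above can be used to verify a download. The following is a minimal sketch using Python's standard hashlib module; the function names are hypothetical, and since the listing does not state which checksum belongs to which archive, the expected value has to be taken from the entry of the file that was actually downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a listing entry such as 'md5:<hex digest>'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as matches_listing("some_archive.zip", "md5:...") would then return True only if the local file is byte-identical to the published one; the path shown here is a placeholder. Reading in chunks keeps memory use low for archives of several hundred megabytes.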

Additional details

Funding

European Commission
MIRANDA - Monitoring, Investigation and Response to cyber-attacks with an Adaptive digital twiN moDel for Agile services over the computing continuum 101168144