Published May 9, 2025 | Version 1.0.0
Dataset Open

S7 Data Modification Attacks using an Industrial Control System Testbed

Description

This dataset contains data modification attacks on the S7 protocol. Three data sources are provided:

  • Network packets
  • Logs from the Siemens PLC 1512 (Battery power plant) and the PLC 1516 (PV power plant)
  • Process data from the Control Center

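The Apache Parquet files are compressed with Gzip. Read them with pandas using the PyArrow engine:

import pandas as pd

# engine="pyarrow" requires the pyarrow package; the Gzip compression is handled transparently
logs_df = pd.read_parquet("logs_dataset.parquet", engine="pyarrow")
packets_df = pd.read_parquet("packets_dataset.parquet", engine="pyarrow")
process_df = pd.read_parquet("process_dataset.parquet", engine="pyarrow")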

The attack scripts and a prototype Intrusion Detection System (IDS) are provided in the S7 Attacks GitHub repository.

Packet Data Source

Features:

  • timestamp
  • packet: binary representation of the network packet after the potential data modification attack
  • label: normal or data_modification
  • datamod_changes: list of changes to the original packet performed by the attack script
  • s7_datablock: parsed user-defined section of the S7 BSEND/BRECV packet

The S7 Attacks repository includes the datablock definitions, which are needed to extract the process values from the user-defined section of the S7 BSEND/BRECV packet.
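As a quick sanity check on the label feature described above, the following sketch counts each label and flags any value outside the two documented ones. The helper name summarize_labels is hypothetical, and it assumes the labels are stored as plain strings.

```python
from collections import Counter

# The two label values documented for the packet data source.
KNOWN_LABELS = {"normal", "data_modification"}


def summarize_labels(labels):
    """Count each label and list any value outside the documented set.

    `labels` can be any iterable of strings, for example the
    packets_df["label"] column (assumed to hold plain strings).
    """
    counts = Counter(labels)
    unexpected = sorted(set(counts) - KNOWN_LABELS)
    return dict(counts), unexpected
```

With the DataFrame loaded as shown earlier, a call such as summarize_labels(packets_df["label"]) would give the class balance between normal and modified packets.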

Process Data Source

Features:

  • archive_recv_ts
  • in_pv_temp_air
  • in_pv_wind_speed
  • in_pv_poa_direct
  • in_pv_poa_diffuse
  • in_pv_cell_temperature
  • in_pv_inverter_ac_power
  • in_pv_inverter_dc_power
  • in_batt_state_of_charge
  • in_batt_voltage
  • in_batt_current
  • in_batt_actual_charge_power
  • in_batt_temperature
  • state_batt_voltage
  • state_batt_current
  • state_batt_state_of_charge
  • state_batt_stored_energy
  • out_pv_on_off
  • out_pv_target_power
  • out_batt_on_off
  • out_batt_target_power

The process data was collected from the control center at a 1-second interval. archive_recv_ts is the timestamp at which the record was saved in the database.
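Because the records are expected roughly once per second, missing samples can be located by comparing consecutive archive_recv_ts values. The sketch below is one way to do this; the helper name find_gaps and the 500 ms tolerance are assumptions, and it presumes the timestamps are sorted datetime-like values.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps,
              expected=timedelta(seconds=1),
              tolerance=timedelta(milliseconds=500)):
    """Return (previous, current) pairs spaced further apart than expected.

    `timestamps` must be sorted in ascending order. The tolerance is an
    assumed allowance for database write jitter, not a documented value.
    """
    gaps = []
    previous = None
    for current in timestamps:
        if previous is not None and current - previous > expected + tolerance:
            gaps.append((previous, current))
        previous = current
    return gaps
```

For example, find_gaps(process_df["archive_recv_ts"]) would list the places where one or more samples appear to be missing, provided the column holds datetime values.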

Every signal is prefixed with in, out or state:

  • in: Monitoring signal sent from power plant to control center
  • out: Control signal sent from control center to power plant
  • state: Calculated values from the power flow algorithm derived from the in_batt_* values
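The prefix convention above makes it easy to separate monitoring, control, and calculated signals. A minimal sketch follows; the helper name group_by_prefix is hypothetical, and columns without one of the three prefixes (such as archive_recv_ts) are collected under "other".

```python
# The three signal prefixes documented for the process data source.
SIGNAL_PREFIXES = ("in", "out", "state")


def group_by_prefix(columns):
    """Group column names by their in/out/state prefix."""
    groups = {prefix: [] for prefix in SIGNAL_PREFIXES}
    groups["other"] = []  # e.g. archive_recv_ts
    for name in columns:
        prefix = name.split("_", 1)[0]
        key = prefix if prefix in SIGNAL_PREFIXES else "other"
        groups[key].append(name)
    return groups
```

Calling group_by_prefix(process_df.columns) and then selecting process_df[groups["out"]] would, for instance, isolate the control signals sent from the control center.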

Logs Data Source

Features:

  • server_recv_ts: timestamp when log message was received by rsyslog
  • device_ts: timestamp when log message was sent by the PLC (starts at 2015-05-18)
  • hostname: name of the PLC
  • field_name: Actual_Charge_Power (Battery), Target_Charge_Power (Battery), OnOff (PV), Poa_direct (PV), Poa_diffuse (PV), Inverter_ac_power (PV), Inverter_dc_power (PV), Cell_Temperature (PV), Temp_Air (PV)
  • old_value: value before tripping the threshold 
  • new_value: value after tripping the threshold
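Since device_ts starts at 2015-05-18 while server_recv_ts reflects the actual collection time, the two clocks differ by a large offset. The sketch below estimates that offset as the median difference between the two timestamps. It assumes the offset is roughly constant over the capture and that both columns hold datetime values of the same timezone awareness; the helper name estimate_clock_offset is hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median


def estimate_clock_offset(server_times, device_times):
    """Estimate server_recv_ts minus device_ts as a single timedelta.

    The median is used so that a few delayed log deliveries do not skew
    the estimate. A constant offset is an assumption, not a documented
    property of the PLC clocks.
    """
    diffs = [
        (server - device).total_seconds()
        for server, device in zip(server_times, device_times)
    ]
    return timedelta(seconds=median(diffs))
```

Adding the estimated offset to device_ts yields device timestamps on approximately the same time axis as the packet and process data.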

The PLCs of the battery and PV power plants emit a log message only when a process value goes above or below a defined threshold.
The process values were extracted from the log messages with regex rules.
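To illustrate what regex-based extraction of this kind can look like, here is a minimal sketch. The message layout ("<field> changed from <old> to <new>") is purely hypothetical, because the real PLC log format and the rules actually used are not described here; only the output keys mirror the features listed above.

```python
import re

# HYPOTHETICAL message layout, chosen for illustration only.
LOG_PATTERN = re.compile(
    r"(?P<field_name>\w+) changed from "
    r"(?P<old_value>-?\d+(?:\.\d+)?) to "
    r"(?P<new_value>-?\d+(?:\.\d+)?)"
)


def parse_log_message(message):
    """Extract field_name, old_value and new_value, or None if no match."""
    match = LOG_PATTERN.search(message)
    if match is None:
        return None
    return {
        "field_name": match.group("field_name"),
        "old_value": float(match.group("old_value")),
        "new_value": float(match.group("new_value")),
    }
```

Returning None for non-matching messages keeps unrelated syslog lines from being mistaken for threshold events.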

Funding

This research is supported in part by funding from the topic Engineering Secure Systems of the Helmholtz Association (HGF) and by KASTEL Security Research Labs (structure 46.23.02).

Files

Files (202.3 MB)

Checksum and size of each file:

  • md5:28b94fef326526f46df96402be80e149 (123.1 kB)
  • md5:ebbdee3de97e25cf4020d8552f431762 (201.8 MB)
  • md5:ebc3e048e0319f044ea47cc08e498999 (359.7 kB)
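The MD5 checksums listed above can be used to verify a download. The following sketch computes the digest of a local file in chunks so that the larger file does not have to fit in memory; the helper name md5_of_file is hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Compare the returned string with the listed value after removing the md5: prefix; a mismatch indicates an incomplete or corrupted download.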

Additional details

Related works

Is described by
Publication: 10.1145/3679240.3734645 (DOI)

Dates

Collected
2025-02-19