S7 Data Modification Attacks using an Industrial Control System Testbed
Creators
Description
This dataset contains data modification attacks on the S7 protocol. Three data sources are provided:
- Network packets
- Logs from the Siemens PLC 1512 (Battery power plant) and the PLC 1516 (PV power plant)
- Process data from the Control Center
The Apache Parquet files are compressed with Gzip. Read them with pandas using the PyArrow engine:
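import pandas as pd

# The Gzip codec is recorded in the Parquet metadata, so no extra option is needed.
logs_df = pd.read_parquet("logs_dataset.parquet", engine="pyarrow")
packets_df = pd.read_parquet("packets_dataset.parquet", engine="pyarrow")
process_df = pd.read_parquet("process_dataset.parquet", engine="pyarrow")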
The attack scripts and a prototype Intrusion Detection System (IDS) are provided in the S7 Attacks GitHub repository.
Packet Data Source
Features:
- timestamp
- packet: binary representation of the network packet after the potential data modification attack
- label: normal or data_modification
- datamod_changes: list of changes to the original packet performed by the attack script
- s7_datablock: parsed user-defined section of the S7 BSEND/BRCV packet
The S7 Attacks repository includes the datablock definitions, which are needed to extract the process values from the user-defined section of the S7 BSEND/BRCV packet.
Process Data Source
Features:
- archive_recv_ts
- in_pv_temp_air
- in_pv_wind_speed
- in_pv_poa_direct
- in_pv_poa_diffuse
- in_pv_cell_temperature
- in_pv_inverter_ac_power
- in_pv_inverter_dc_power
- in_batt_state_of_charge
- in_batt_voltage
- in_batt_current
- in_batt_actual_charge_power
- in_batt_temperature
- state_batt_voltage
- state_batt_current
- state_batt_state_of_charge
- state_batt_stored_energy
- out_pv_on_off
- out_pv_target_power
- out_batt_on_off
- out_batt_target_power
The process data was collected from the control center at a 1-second interval. archive_recv_ts is the timestamp at which the record was saved in the database.
Every signal is prefixed with in, out or state:
- in: Monitoring signal sent from power plant to control center
- out: Control signal sent from control center to power plant
- state: Values calculated by the power flow algorithm from the in_batt_* values
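The prefix convention above makes it easy to split the process columns into monitoring, control, and calculated signals. The following is a minimal sketch using a subset of the column names from the feature list; the helper name group_by_prefix is an illustrative assumption and is not part of the dataset or the repository.

```python
from collections import defaultdict

# Column names copied from the feature list above (subset for brevity).
columns = [
    "archive_recv_ts",
    "in_pv_temp_air",
    "in_batt_voltage",
    "state_batt_voltage",
    "out_pv_on_off",
    "out_batt_target_power",
]

PREFIXES = ("in", "out", "state")


def group_by_prefix(names):
    """Map each signal prefix to the column names that carry it (hypothetical helper)."""
    groups = defaultdict(list)
    for name in names:
        prefix, _, rest = name.partition("_")
        # Columns such as archive_recv_ts carry no signal prefix and are skipped.
        if prefix in PREFIXES and rest:
            groups[prefix].append(name)
    return dict(groups)


signal_groups = group_by_prefix(columns)
```

With the process data loaded into a DataFrame, the same helper could be called on its column index, and a selection such as process_df[signal_groups["out"]] would then return only the control signals.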
Logs Data Source
Features:
- server_recv_ts: timestamp when log message was received by rsyslog
- device_ts: timestamp when log message was sent by the PLC (starts at 2015-05-18)
- hostname: name of the PLC
- field_name: Actual_Charge_Power (Battery), Target_Charge_Power (Battery), OnOff (PV), Poa_direct (PV), Poa_diffuse (PV), Inverter_ac_power (PV), Inverter_dc_power (PV), Cell_Temperature (PV), Temp_Air (PV)
- old_value: value before tripping the threshold
- new_value: value after tripping the threshold
The PLCs of the battery and PV power plants emit a log message only when a process value rises above or falls below a defined threshold.
The process values were extracted by parsing the log messages with regex rules.
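To illustrate how old_value and new_value relate to a threshold, here is a minimal sketch of threshold-crossing detection over a series of samples. The actual PLC logic and threshold values are not described here, so the function threshold_events, the sample values, and the threshold of 50.0 are all assumptions chosen for illustration.

```python
def threshold_events(values, threshold):
    """Return (old_value, new_value) pairs where consecutive samples cross threshold."""
    events = []
    for old, new in zip(values, values[1:]):
        was_above = old > threshold
        is_above = new > threshold
        # An event is recorded only when the value changes sides of the threshold.
        if was_above != is_above:
            events.append((old, new))
    return events


# Hypothetical samples and threshold, not taken from the dataset.
samples = [10.0, 12.0, 55.0, 60.0, 40.0, 41.0]
events = threshold_events(samples, threshold=50.0)
```

Under this assumed logic, samples that stay on one side of the threshold produce no event, which matches the sparse, event-driven character of the log data source.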
Funding
This research is supported in part by funding from the topic Engineering Secure Systems of the Helmholtz Association (HGF) and by KASTEL Security Research Labs (structure 46.23.02).
Files
Total size: 202.3 MB
Checksum | Size |
---|---|
md5:28b94fef326526f46df96402be80e149 | 123.1 kB |
md5:ebbdee3de97e25cf4020d8552f431762 | 201.8 MB |
md5:ebc3e048e0319f044ea47cc08e498999 | 359.7 kB |
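A downloaded file can be checked against its listed MD5 checksum. The sketch below uses only the Python standard library; the function names md5_of_file and matches_checksum are illustrative assumptions, and the path argument stands for wherever the file was saved locally.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or as plain hex."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use flat, which matters for the largest file in the listing (201.8 MB).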
Additional details
Related works
- Is described by: Publication, 10.1145/3679240.3734645 (DOI)
Dates
- Collected: 2025-02-19