IoT-Zoo Network Traffic 8 hour capture
Description
This dataset consists of 34269178 network packet samples extracted from the IoT-Zoo testbed. It represents a 28800-second (8-hour) run of a heterogeneous IoT environment, featuring 43 distinct device profiles spanning the Urban Observatory, Industrial, e-Health, and Smart Farming domains.
Technical Specifications
The dataset is the result of a synchronized fusion between two network analysis engines (Scapy and Tshark), providing a high-dimensional view of each packet. Unlike flow-based datasets, this is a packet-level collection, where each row represents an individual network frame.
Dataset Characteristics
- Total Samples: 34269178 packets.
- Total Features: 17 columns.
- Trace Duration: 28800 seconds.
- Device Heterogeneity: Covers telemetry from multiple domains with preserved temporal dynamics.
- Application Semantics: Includes structured payloads (JSON/XML) replayed from real-world datasets.
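The figures above can be checked without loading all rows into memory. The following is a minimal sketch that streams a CSV and reports row count, column count, and trace duration. It assumes the file has a header row with a `frame.time_epoch` column as listed in the schema; the in-memory sample and its values are invented for illustration.

```python
import csv
import io


def summarize_capture(lines):
    """Stream CSV lines and return (row_count, column_count, duration_seconds)."""
    reader = csv.DictReader(lines)
    rows = 0
    first_ts = None
    last_ts = None
    for row in reader:
        rows += 1
        ts = float(row["frame.time_epoch"])
        # Track min/max rather than first/last in case rows are not sorted.
        if first_ts is None or ts < first_ts:
            first_ts = ts
        if last_ts is None or ts > last_ts:
            last_ts = ts
    columns = len(reader.fieldnames or [])
    duration = (last_ts - first_ts) if rows else 0.0
    return rows, columns, duration


# Tiny in-memory stand-in for the real file (values are made up).
sample = io.StringIO(
    "pkt_index,frame.time_epoch,frame.len\n"
    "0,1700000000.000,74\n"
    "1,1700000000.250,66\n"
    "2,1700000001.500,120\n"
)
rows, columns, duration = summarize_capture(sample)
print(rows, columns, duration)
```

For the extracted dataset, the same function could be called on an open file handle, for example `open("capture.csv", newline="")`, where the file name is hypothetical.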
Column Definitions (Schema)
- pkt_index: Unique sequential identifier for each packet.
- ip_ttl: Time To Live (TTL) value from the IP header; decremented by 1 at each router hop.
- tcp_seq: TCP sequence number, used to order the segments of a TCP stream.
- tcp_flags_str: Human-readable TCP flag mnemonics (e.g., PA, S, A) extracted via Scapy.
- frame.time_epoch: High-precision Unix timestamp of arrival.
- frame.len: The total length of the Ethernet frame in bytes.
- ip_src / ip_dst: Source and Destination IPv4 addresses.
- ip_proto: Protocol number from the IP header, identifying the encapsulated protocol (e.g., 6 for TCP, 17 for UDP).
- tcp.src_port / dst_port: Layer 4 source and destination ports (e.g., 1883 for MQTT).
- tcp_flags_hex: Raw TCP flags in hexadecimal format (0x00000000), which can be converted directly to an integer for numerical Machine Learning input.
- _ws.col.protocol: Application layer protocol identified via Tshark's deep packet inspection (e.g., MQTT, NTP, DNS, RTSP).
- mqtt.topic: The publish/subscribe channel of the MQTT message, which identifies the origin topic and device. Only populated for MQTT packets; empty otherwise.
- mqtt.msgtype: MQTT message type.
- mqtt.qos: MQTT Quality of Service level, ranging from 0 to 2.
- mqtt.len: Length of MQTT payload in bytes.
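As an illustration of how the flag columns in the schema relate, the sketch below converts a `tcp_flags_hex` string to an integer and renders the set bits as mnemonics similar to those in `tcp_flags_str`. The bit positions are the standard TCP header flag bits. That empty cells mark rows without a TCP or MQTT layer is an assumption, and the helper names are hypothetical.

```python
# Standard TCP flag bits, lowest bit first.
TCP_FLAG_BITS = [
    (0x01, "F"),  # FIN
    (0x02, "S"),  # SYN
    (0x04, "R"),  # RST
    (0x08, "P"),  # PSH
    (0x10, "A"),  # ACK
    (0x20, "U"),  # URG
    (0x40, "E"),  # ECE
    (0x80, "C"),  # CWR
]


def flags_hex_to_int(value):
    """Convert a string such as '0x00000018' to an int.

    Returns None for an empty cell (assumed to mean no TCP layer).
    """
    value = (value or "").strip()
    return int(value, 16) if value else None


def flags_int_to_str(flags):
    """Render the set bits as mnemonics, lowest bit first."""
    return "".join(name for bit, name in TCP_FLAG_BITS if flags & bit)


def optional_int(value):
    """Parse an integer column such as mqtt.qos; empty cells become None."""
    value = (value or "").strip()
    return int(value) if value else None


flags = flags_hex_to_int("0x00000018")
print(flags, flags_int_to_str(flags))
```

Keeping the integer form preserves every bit for a model, while the mnemonic form is easier to read when inspecting individual packets.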
Intended Use
This CSV is ready for downstream Machine Learning tasks such as:
- Anomaly Detection: Using frame.len and inter-arrival times (IAT) derived from frame.time_epoch to identify volumetric or timing-based attacks.
- Protocol Classification: Leveraging _ws.col.protocol and tcp_flags_hex for identifying IoT-specific behaviors.
- Security Research: Serving as a baseline for legitimate IoT traffic patterns in heterogeneous environments.
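The inter-arrival time (IAT) feature mentioned for anomaly detection is not a column in the schema, so it has to be derived. The following is a minimal sketch of one way to do this per source address, using the `ip_src` and `frame.time_epoch` columns from the schema. It assumes rows are already in capture order; the sample rows are invented.

```python
import csv
import io
from collections import defaultdict


def inter_arrival_times(rows):
    """Return {ip_src: [iat_seconds, ...]} for rows given in time order."""
    last_seen = {}
    iats = defaultdict(list)
    for row in rows:
        src = row["ip_src"]
        ts = float(row["frame.time_epoch"])
        if src in last_seen:
            # Gap between this packet and the previous one from the same source.
            iats[src].append(ts - last_seen[src])
        last_seen[src] = ts
    return dict(iats)


# Made-up rows standing in for the real capture.
sample = io.StringIO(
    "frame.time_epoch,ip_src,frame.len\n"
    "100.0,10.0.0.1,74\n"
    "100.25,10.0.0.2,66\n"
    "100.5,10.0.0.1,120\n"
    "101.25,10.0.0.2,66\n"
    "102.0,10.0.0.1,90\n"
)
iats = inter_arrival_times(csv.DictReader(sample))
print(iats)
```

Grouping by source address is only one choice; grouping by address pair or by MQTT topic would follow the same pattern with a different key.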
Files
capture_8hours.zip
Total size: 7.6 GB
| Checksum | Size |
|---|---|
| md5:8aec6c18cc598624a3bc26f974a23bf7 | 7.2 GB |
| md5:62c4080d87b019c6f025f1e48f271fa4 | 346.7 MB |
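Since the listing gives an MD5 checksum for each file, a download can be verified before extraction. The sketch below hashes a file in chunks so that a multi-gigabyte archive does not need to fit in memory. The function names are hypothetical, and the listing does not state which checksum belongs to which file name, so the expected value must be taken from the entry for the file actually downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed checksum, with or without 'md5:' prefix."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as `matches_listing("capture_8hours.zip", "md5:...")` would return `True` only when the local file is byte-identical to the published one; the path here assumes the archive was saved in the working directory.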
Additional details
Related works
- Is variant form of: Dataset, DOI 10.5281/zenodo.19370194
Software
- Programming language: Python