Published April 10, 2026 | Version v1
Dataset Open

IoT-Zoo Network Traffic 8 hour capture

  • 1. Universidade Federal do Rio Grande do Sul
  • 2. Federal University of Pampa

Description

This dataset contains 34,269,178 network packets extracted from the IoT-Zoo testbed. It covers a 28,800-second (8-hour) run of a heterogeneous IoT environment with 43 distinct device profiles spanning the Urban Observatory, Industrial, e-Health, and Smart Farming domains.

Technical Specifications

The dataset merges the synchronized per-packet output of two network analysis engines, Scapy and Tshark, so that each packet carries fields from both. Unlike flow-based datasets, this is a packet-level collection: each row represents an individual network frame.

Dataset Characteristics

  • Total Samples: 34,269,178 packets.
  • Total Features: 17 columns.
  • Trace Duration: 28,800 seconds (8 hours).
  • Device Heterogeneity: Covers telemetry from multiple domains with preserved temporal dynamics.
  • Application Semantics: Includes structured payloads (JSON/XML) replayed from real-world datasets.
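
As a quick sanity check on the figures above, the total packet count and trace duration imply an average packet rate and an average gap between packets. The short sketch below only does that arithmetic; the constant and function names are illustrative and not part of the dataset.

```python
# Figures taken from "Dataset Characteristics"; names are illustrative only.
TOTAL_PACKETS = 34_269_178   # Total Samples
TRACE_SECONDS = 28_800       # Trace Duration


def mean_packet_rate(packets: int, seconds: float) -> float:
    """Average number of packets per second over the whole trace."""
    return packets / seconds


rate = mean_packet_rate(TOTAL_PACKETS, TRACE_SECONDS)
mean_gap_ms = 1000.0 / rate  # average time between consecutive packets

print(f"{TRACE_SECONDS / 3600:.0f} h, {rate:.1f} packets/s, {mean_gap_ms:.3f} ms mean gap")
# prints: 8 h, 1189.9 packets/s, 0.840 ms mean gap
```

These are trace-wide averages; the actual rate at any moment depends on the device mix and is not implied by these numbers.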

Column Definitions (Schema)

  • pkt_index: Unique sequential identifier for each packet.
  • ip_ttl: Time-to-live (TTL) value from the IP header; each router that forwards the packet decrements it by 1.
  • tcp_seq: TCP sequence number, used to order the TCP segments of a connection.
  • tcp_flags_str: Human-readable TCP flag mnemonics (e.g., PA, S, A) extracted via Scapy.
  • frame.time_epoch: High-precision Unix timestamp of arrival.
  • frame.len: The total length of the Ethernet frame in bytes.
  • ip_src / ip_dst: Source and Destination IPv4 addresses.
  • ip_proto: Protocol number from the IP header, identifying the protocol carried in the IP payload (e.g., 6 for TCP).
  • tcp.src_port / dst_port: Layer 4 source and destination ports (e.g., 1883 for MQTT).
  • tcp_flags_hex: Raw TCP flags in hexadecimal format (0x00000000), optimized for numerical Machine Learning input.
  • _ws.col.protocol: Application layer protocol identified via Tshark's deep packet inspection (e.g., MQTT, NTP, DNS, RTSP).
  • mqtt.topic: Publish/subscribe topic of the MQTT message, identifying the origin topic and device. Populated only for MQTT packets; empty otherwise.
  • mqtt.msgtype: MQTT message type.
  • mqtt.qos: MQTT Quality of Service level, from 0 to 2.
  • mqtt.len: Length of MQTT payload in bytes.
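
The schema above can be read row by row with Python's standard csv module. The following is a minimal sketch, not the authors' tooling: it assumes the CSV header uses the column names exactly as listed, that non-TCP rows leave the TCP fields empty, and that the hexadecimal flags follow the standard TCP bit layout. The sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Scapy prints TCP flag letters starting from the least significant bit.
TCP_FLAG_LETTERS = "FSRPAUECN"


def decode_tcp_flags(hex_value: str) -> str:
    """Convert a hex flags string such as '0x00000018' to mnemonics ('PA')."""
    if not hex_value:
        return ""  # assumption: non-TCP rows leave the field empty
    bits = int(hex_value, 16)
    return "".join(
        letter for i, letter in enumerate(TCP_FLAG_LETTERS) if bits & (1 << i)
    )


def parse_row(row: dict) -> dict:
    """Cast a few string fields of one CSV row to typed values."""
    decoded = decode_tcp_flags(row["tcp_flags_hex"])
    return {
        "pkt_index": int(row["pkt_index"]),
        "time_epoch": float(row["frame.time_epoch"]),
        "frame_len": int(row["frame.len"]),
        "flags": decoded,
        "flags_match": decoded == row["tcp_flags_str"],  # cross-check both columns
        "mqtt_topic": row["mqtt.topic"] or None,         # empty for non-MQTT rows
    }


# Made-up sample rows using a subset of the columns (NOT real dataset content).
SAMPLE = """\
pkt_index,frame.time_epoch,frame.len,tcp_flags_str,tcp_flags_hex,mqtt.topic
0,1700000000.0001,74,S,0x00000002,
1,1700000000.0004,74,SA,0x00000012,
2,1700000000.0200,120,PA,0x00000018,sensors/demo
3,1700000000.5000,90,,,
"""

packets = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
for p in packets:
    print(p["pkt_index"], p["frame_len"], p["flags"], p["mqtt_topic"])
```

For the real file, the same parse_row function could be applied to a csv.DictReader opened on the extracted CSV, processing rows one at a time rather than loading all packets into memory.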

Intended Use

This CSV is ready for downstream Machine Learning tasks such as:

  • Anomaly Detection: Using frame.len and inter-arrival times (IAT) derived from frame.time_epoch to identify volumetric or timing-based attacks.
  • Protocol Classification: Leveraging _ws.col.protocol and tcp_flags_hex to identify IoT-specific behaviors.
  • Security Research: Serving as a baseline for legitimate IoT traffic patterns in heterogeneous environments.
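
Inter-arrival time (IAT) is not stored as a column, but it can be derived from consecutive timestamps. The sketch below shows one possible way to do so per source address. It assumes rows arrive in capture order and that the header names match the schema (ip_src and frame.time_epoch); the function name, addresses, and timestamps are illustrative and not taken from the dataset.

```python
import csv
import io
from collections import defaultdict


def inter_arrival_times(rows, key="ip_src", ts="frame.time_epoch"):
    """Return {key value: [gaps in seconds between consecutive packets]}.

    Assumes `rows` is an iterable of dicts in capture order.
    """
    last_seen = {}
    gaps = defaultdict(list)
    for row in rows:
        t = float(row[ts])
        src = row[key]
        if src in last_seen:
            gaps[src].append(t - last_seen[src])
        last_seen[src] = t
    return dict(gaps)


# Made-up rows for illustration only (NOT real dataset content).
SAMPLE = """\
ip_src,frame.time_epoch
10.0.0.1,100.0
10.0.0.2,100.25
10.0.0.1,100.5
10.0.0.1,102.0
"""

gaps = inter_arrival_times(csv.DictReader(io.StringIO(SAMPLE)))
print(gaps)
```

A source that appears only once produces no gap and is therefore absent from the result. With real epoch timestamps, parsing into float keeps roughly sub-microsecond resolution; decimal.Decimal is an option if exact differences matter.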


Files

capture_8hours.zip

Files (7.6 GB)

  • 7.2 GB, md5:8aec6c18cc598624a3bc26f974a23bf7
  • 346.7 MB, md5:62c4080d87b019c6f025f1e48f271fa4
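
The MD5 checksums listed above can be used to confirm that a download completed intact. The following is a minimal sketch using Python's hashlib; the function names are illustrative, and which local file corresponds to which checksum is left to the reader, since the listing does not pair file names with digests.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.split(":")[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Example call (the path is a placeholder for wherever the file was saved):
# matches_checksum("path/to/downloaded_file", "md5:8aec6c18cc598624a3bc26f974a23bf7")
```

Reading in chunks keeps memory use constant, which matters for a multi-gigabyte archive.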

Additional details

Related works

Is variant form of
Dataset: 10.5281/zenodo.19370194. (DOI)

Software

Programming language
Python