Published October 2, 2025 | Version v1
Dataset Open

CESNET-QUICEXT-25: A year-long QUIC traffic dataset with EXTended attributes

  • 1. Czech Education and Scientific Network
  • 2. TU Dresden

Description

This dataset captures the evolution of QUIC traffic in an ISP network over a year-long collection period. Its goal is to provide a basis for studies of QUIC deployment as well as experiments in encrypted traffic classification.

Collected at: CESNET3 network - https://www.cesnet.cz/en/sit-cesnet3-eng
Sampling rate: uniform 1:100
Software used:

Parts of the dataset (specifically the July 2024 and April 2025 data) were used in the following publication:

Waiting for QUIC: Passive Measurements to Understand QUIC Deployments. Jonas Mücke et al. 2025. Proc. ACM Netw. 3, CoNEXT4, Article 41 (September 2025), 26 pages. https://doi.org/10.1145/3768988

We are preparing a short paper that describes the dataset in more detail. Until its publication, please cite the above-mentioned paper by Mücke et al. when using the dataset.

Dataset structure

The dataset is organized into 12 per-month ZIP files covering the period from June 2024 to May 2025. Each ZIP file contains a Parquet file for every day of the month, except for several dates affected by data outages, which are listed below. Each sample in the dataset represents a bidirectional flow record describing an observed QUIC connection, with the available fields detailed below.
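As a rough illustration of the layout described above, the following sketch lists and extracts the daily Parquet files from one downloaded monthly ZIP using only the Python standard library. The function names are hypothetical, and the names and folder layout of the Parquet members inside each ZIP are not stated in this record, so the code simply selects every member ending in ".parquet".

```python
import zipfile
from pathlib import Path


def list_daily_parquet_files(zip_path):
    """Return the sorted names of the Parquet members in one monthly ZIP."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".parquet")
        )


def extract_month(zip_path, target_dir):
    """Extract the Parquet members of a monthly ZIP and return their paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    members = list_daily_parquet_files(zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        # Only the Parquet members are extracted; other files are skipped.
        archive.extractall(target, members=members)
    return [target / name for name in members]
```

Each returned path can then be passed to a Parquet reader of your choice, for example pandas.read_parquet, which needs pyarrow or fastparquet installed.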

Available fields

DST_IP: An anonymized identifier of the destination host
DST_IP_SUBNET: An anonymized identifier of the destination host subnet (a /24 prefix for IPv4 and a /64 prefix for IPv6)
DST_IP_VERSION: IP version (IPv4 or IPv6)
DST_ASN: Autonomous System Number of the destination host
DST_COUNTRY: Country of the destination host, derived from a geolocation database
DST_PORT: Destination port
PROTOCOL: Protocol used (UDP for all samples)
TIME_FIRST: Time of the first packet
TIME_LAST: Time of the last packet
DURATION: Duration of the flow in seconds
FLOW_END_REASON: Flow termination reason, using values assigned by IANA
BYTES: Number of bytes transmitted from client to server
BYTES_REV: Number of bytes transmitted from server to client
PACKETS: Number of packets sent from client to server
PACKETS_REV: Number of packets sent from server to client
QUIC_VERSION: QUIC version from the first server long-header packet
QUIC_CLIENT_VERSION: QUIC version from the first client long-header packet
QUIC_TOKEN_LENGTH: Token length from an Initial or Retry packet
QUIC_MULTIPLEXED: Indicates whether multiplexing occurred (value > 0 if at least two distinct QUIC_OSCID values were observed)
QUIC_ZERO_RTT: Number of 0-RTT packets observed in the flow
QUIC_OCCID: Original client Connection ID from the first client packet
QUIC_OSCID: Original server Connection ID from the first client packet
QUIC_SCID: Server Connection ID
QUIC_RETRY_SCID: Server Connection ID from a Retry packet
QUIC_SNI: Server Name Indication domain
QUIC_USER_AGENT: User-Agent string, if available in an Initial packet
QUIC_TLS_EXT_TYPE: List of TLS extensions used
QUIC_TLS_EXT_LEN: Corresponding lengths of the listed TLS extensions
QUIC_PACKETS: Sequence of QUIC long-header packet types observed in the flow
PPI: Packet sequence represented as [[inter-packet times], [packet directions], [packet sizes]]
PPI_LEN: Number of packets in the PPI sequence
PPI_DURATION: Duration of the PPI sequence in seconds
PPI_ROUNDTRIPS: Number of roundtrips in the PPI sequence
PHIST_SRC_SIZES: Histogram of packet sizes from client to server
PHIST_DST_SIZES: Histogram of packet sizes from server to client
PHIST_SRC_IPT: Histogram of inter-packet times from client to server
PHIST_DST_IPT: Histogram of inter-packet times from server to client
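
The PPI field packs three parallel sequences into one value. The sketch below shows one way to unpack it into per-packet records, assuming the value has already been parsed into nested Python lists. The direction encoding (1 for client to server, -1 for server to client) and the example values are assumptions made for illustration; this record does not specify them, so check the actual data before relying on them.

```python
from typing import List, NamedTuple


class Packet(NamedTuple):
    ipt: float      # inter-packet time, in the unit stored in the dataset
    direction: int  # ASSUMPTION: 1 = client to server, -1 = server to client
    size: int       # packet size in bytes


def unpack_ppi(ppi) -> List[Packet]:
    """Turn [[times], [directions], [sizes]] into one record per packet."""
    ipts, directions, sizes = ppi
    if not (len(ipts) == len(directions) == len(sizes)):
        raise ValueError("PPI sub-sequences must have equal length")
    return [Packet(t, d, s) for t, d, s in zip(ipts, directions, sizes)]


def bytes_per_direction(packets):
    """Sum packet sizes per direction under the encoding assumed above."""
    forward = sum(p.size for p in packets if p.direction == 1)
    reverse = sum(p.size for p in packets if p.direction == -1)
    return forward, reverse


# Made-up values, not taken from the dataset.
example_ppi = [[0, 12, 3], [1, -1, 1], [1252, 1200, 80]]
packets = unpack_ppi(example_ppi)
```

The number of unpacked records should equal PPI_LEN for the same flow.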

Missing data

  • 31.10.2024, with 30.10.2024 and 1.11.2024 showing reduced data volume.
  • 11.12.2024–14.12.2024, with 10.12.2024 and 15.12.2024 showing reduced data volume.
  • A smaller data outage on 15.3.2025–16.3.2025.
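
For time-series analyses it can help to encode the affected dates listed above so they can be excluded or flagged. The following is a minimal sketch; the category labels and function names are hypothetical, and the dates are transcribed from the list above (day.month.year).

```python
from datetime import date, timedelta


def date_range(first, last):
    """Yield every date from first to last, inclusive."""
    day = first
    while day <= last:
        yield day
        day += timedelta(days=1)


# Dates transcribed from the list above.
OUTAGE_DAYS = {date(2024, 10, 31)} | set(
    date_range(date(2024, 12, 11), date(2024, 12, 14))
)
REDUCED_DAYS = {
    date(2024, 10, 30), date(2024, 11, 1),
    date(2024, 12, 10), date(2024, 12, 15),
}
SMALLER_OUTAGE_DAYS = set(date_range(date(2025, 3, 15), date(2025, 3, 16)))


def data_quality(day):
    """Classify a collection day using the hypothetical labels below."""
    if day in OUTAGE_DAYS:
        return "outage"
    if day in REDUCED_DAYS:
        return "reduced"
    if day in SMALLER_OUTAGE_DAYS:
        return "smaller_outage"
    return "ok"


def unaffected_days(first, last):
    """Return the days in [first, last] that appear in none of the lists."""
    return [d for d in date_range(first, last) if data_quality(d) == "ok"]
```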

Ethics

The privacy of users is of utmost importance to us. We emphasize that the dataset does not include client IP addresses; therefore, it is not possible to trace the identity of data subjects. Moreover, the dataset consists solely of flow records; no payload data is included apart from metadata available in QUIC handshakes. The data was collected on the basis of legitimate interest (i.e., not consent), among other things to ensure the further development of services provided to the scientific and research community. We further applied the following anonymization measures:

  • Destination IP addresses are hashed with a secret salt, transforming them into non-reversible identifiers.
  • Flow start times are clipped to the hour, with end times adjusted accordingly.
  • Source ports are omitted.
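
To make the first two measures concrete, here is a minimal sketch of how salted hashing and hour clipping could look. It is not the procedure actually used for this dataset: the hash construction (HMAC-SHA256 here), the salt handling, and the reading of "adjusted accordingly" as shifting the end time by the same offset as the start are all assumptions, and the function names are hypothetical.

```python
import hashlib
import hmac
from datetime import datetime


def anonymize_ip(ip, salt):
    """Map an IP address string to a keyed, non-reversible hex identifier.

    ASSUMPTION: HMAC-SHA256 stands in for the unspecified salted hash.
    """
    return hmac.new(salt, ip.encode("ascii"), hashlib.sha256).hexdigest()


def clip_flow_times(time_first, time_last):
    """Clip the start to the full hour and shift the end by the same offset.

    ASSUMPTION: shifting both timestamps keeps the flow duration unchanged.
    """
    clipped_first = time_first.replace(minute=0, second=0, microsecond=0)
    offset = time_first - clipped_first
    return clipped_first, time_last - offset


# Illustrative values only; the address is from a documentation range.
first, last = clip_flow_times(
    datetime(2024, 6, 1, 13, 47, 12), datetime(2024, 6, 1, 13, 47, 42)
)
identifier = anonymize_ip("192.0.2.10", b"example-secret-salt")
```

Under this reading, the same destination always maps to the same identifier, so flows can still be grouped per host without revealing the address.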

Files

Files (28.2 GB)

Size     Checksum
1.7 GB   md5:4b1fd8bcf5ddd143f7b350aa7c0d4814
1.5 GB   md5:be97737ce804412dbe42c90a69a8aa05
1.5 GB   md5:f9f6eb67dc539a9aadc16b5dc08f0951
2.8 GB   md5:839364e8ddb12755be5055853fd8ecfb
3.4 GB   md5:d0c5fdcea1f5c7bfb8cd239a8ef64116
3.5 GB   md5:b74636bdf04677876141212f65cee781
2.0 GB   md5:9a5d64d705678310f5b60aadc6e7ff27
3.1 GB   md5:4e7e0fae9a613fcaf38eec308797f419
2.2 GB   md5:1ba75841cdd69669df490196f0cb1ce7
2.0 GB   md5:acba163cbbe7b0257948c9996ee2db2b
2.3 GB   md5:3d67fb1743435d2fade34d3f4d87c347
2.2 GB   md5:7528eab9e82ca832b6f086be7f80d6d2
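
The md5 values listed above can be used to check that a download completed without corruption. A minimal sketch follows, with hypothetical function names; the file path and the checksum to compare against are supplied by the caller, since this listing does not show which checksum belongs to which file name.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use low for multi-gigabyte archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a listed value such as 'md5:4b1f...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A mismatch usually means the download was truncated or altered and should be repeated.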

Additional details

Dates

Collected: 2024-06-01 (start of data collection)
Collected: 2025-05-31 (end of data collection)