Published March 27, 2024 | Version 1.1
Dataset Open

A User DNS Fingerprint Dataset

  • 1. Czech Technical University in Prague
  • 2. University of South Bohemia in České Budějovice

Description

A user DNS fingerprint makes it possible to identify a specific network user without knowing their IP address. This approach is useful, for example, when examining the behavior of a monitored network user in more depth. In contrast to other studies, this work introduces a dataset for user identification based only on knowledge of the user's DNS fingerprint, which is created from previously sent DNS queries.
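To make the idea of a DNS fingerprint concrete, here is a minimal sketch under stated assumptions. It treats a fingerprint as a frequency count of queried domain names and matches an unknown query sample to the most similar known fingerprint by cosine similarity. The function names (build_fingerprint, cosine_similarity, identify) and the choice of cosine similarity are illustrative assumptions, not the method used to build this dataset.

```python
import math
from collections import Counter


def build_fingerprint(queries):
    """Hypothetical fingerprint: how often each domain name was queried."""
    # Normalize case and drop the trailing root dot so "Example.com." == "example.com".
    return Counter(q.lower().rstrip(".") for q in queries)


def cosine_similarity(fp_a, fp_b):
    """Cosine similarity between two domain-frequency fingerprints."""
    dot = sum(count * fp_b[domain] for domain, count in fp_a.items())
    norm_a = math.sqrt(sum(c * c for c in fp_a.values()))
    norm_b = math.sqrt(sum(c * c for c in fp_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty fingerprint matches nothing
    return dot / (norm_a * norm_b)


def identify(unknown_queries, known_fingerprints):
    """Return the known user whose fingerprint is most similar to the sample."""
    unknown = build_fingerprint(unknown_queries)
    return max(
        known_fingerprints,
        key=lambda user: cosine_similarity(unknown, known_fingerprints[user]),
    )
```

In this sketch, identification relies only on which domains were queried and how often; no IP address or network location is involved.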

We created a large dataset from the real network traffic of a metropolitan Internet service provider. It is derived from 2.3 billion DNS queries covering 6.2 million distinct domain names. Data collection ran for three months, from December 2023 to February 2024.

The dataset describes user activity in detail, in the form of overall daily activity statistics and detailed 24-hour activity statistics. Each dataset record contains 1137 classification attributes. A unique feature of this dataset is the classification of user activity by the categories of content the user accessed.
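As an illustration of what daily and 24-hour activity statistics could look like, the following sketch derives per-hour query counts and a small daily summary from query timestamps. It assumes Unix timestamps in UTC. The function names and the summary fields (total_queries, active_hours, peak_hour) are hypothetical and are not taken from the dataset's actual 1137 attributes.

```python
from collections import Counter
from datetime import datetime, timezone


def hourly_activity(timestamps):
    """Return a 24-element list: number of queries in each hour of the day (UTC)."""
    counts = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts in timestamps
    )
    return [counts[hour] for hour in range(24)]


def daily_summary(timestamps):
    """Hypothetical overall statistics derived from the hourly profile."""
    hours = hourly_activity(timestamps)
    total = sum(hours)
    return {
        "total_queries": total,
        "active_hours": sum(1 for count in hours if count > 0),
        # Hour with the most queries; None when there is no activity at all.
        "peak_hour": hours.index(max(hours)) if total else None,
    }
```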

The new dataset can be used to build machine learning models that identify a specific user without direct knowledge of their IP address or other network location information. It can also serve as a reference dataset for creating user DNS fingerprints.

Files

Files (420.8 MB)

Name         Size      Checksum
dataset.zip  420.8 MB  md5:8001d345487b0819b779ebef2279f855
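After downloading, the archive can be checked against the md5 checksum listed for dataset.zip. The sketch below uses Python's standard hashlib module and reads the file in chunks so the 420.8 MB archive does not have to fit in memory. The function names and the local path are assumptions for illustration.

```python
import hashlib

# Checksum as published on this record for dataset.zip.
EXPECTED_MD5 = "8001d345487b0819b779ebef2279f855"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling verify_download("dataset.zip") on the downloaded file should return True if the transfer was complete and uncorrupted. MD5 is adequate here as an integrity check, not as protection against deliberate tampering.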