Dataset Open Access

A web tracking data set of online browsing behavior of 2,148 users

Kulshrestha, Juhi; Oliveira, Marcos; Karacalik, Orkut; Bonnay, Denis; Wagner, Claudia

This anonymized data set consists of one month (October 2018) of web tracking data from 2,148 German users. For each user, the data contains the anonymized URL of each webpage the user visited, the domain of the webpage, and the category of the domain, drawn from 41 distinct categories. In total, these 2,148 users made 9,151,243 URL visits spanning 49,918 unique domains. For each user in our data set, we also have self-reported information (collected via a survey) about their gender and age.

We acknowledge the support of Respondi AG, which provided the web tracking and survey data free of charge for research purposes, with special thanks to François Erner and Luc Kalaora at Respondi for their insights and help with data extraction.

The data set is analyzed in the following paper: 

  • Kulshrestha, J., Oliveira, M., Karacalik, O., Bonnay, D., Wagner, C. "Web Routineness and Limits of Predictability: Investigating Demographic and Behavioral Differences Using Web Tracking Data." Proceedings of the International AAAI Conference on Web and Social Media. 2021. https://arxiv.org/abs/2012.15112

The code used to analyze the data is also available at https://github.com/gesiscss/web_tracking.

If you use data or code from this repository, please cite the paper above and the Zenodo link.

 

Files (217.9 MB)

Name                      Size      MD5
README.txt                1.3 kB    7759ca79ee86a1f4f6dfa1f17b897f87
web_tracking_code.zip     23.8 MB   884537d2b8c5894f466befb2830e9220
web_tracking_data.tar.gz  194.1 MB  475519cdb23aad093ccd86990cbaec09
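
The MD5 checksums in the file listing above can be used to confirm that a download completed without corruption. The following is a minimal sketch, assuming the three files were saved under their original names in a single directory; the function names md5_of_file and check_downloads are illustrative and are not part of the published code.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "README.txt": "7759ca79ee86a1f4f6dfa1f17b897f87",
    "web_tracking_code.zip": "884537d2b8c5894f466befb2830e9220",
    "web_tracking_data.tar.gz": "475519cdb23aad093ccd86990cbaec09",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_downloads(directory="."):
    """Map each expected file name to 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # Assumption: the downloads are in the current working directory.
    for name, status in check_downloads().items():
        print(f"{name}: {status}")
```

A file reported as "mismatch" was most likely truncated or altered in transit and should be downloaded again before use.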