Published December 30, 2020 | Version v1
Dataset Open

A web tracking data set of online browsing behavior of 2,148 users

  • 1. GESIS - Leibniz Institute for the Social Sciences, Germany
  • 2. Université Paris Nanterre, France

Description

This anonymized data set consists of one month (October 2018) of web tracking data from 2,148 German users. For each user, the data contain the anonymized URLs of the webpages the user visited, the domain of each webpage, and the category of each domain (41 distinct categories in total). Together, these 2,148 users made 9,151,243 URL visits spanning 49,918 unique domains. For each user, we also have self-reported gender and age, collected via a survey.
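As a rough illustration of how the headline counts above (users, URL visits, unique domains, categories) could be recomputed from a visit-level table, here is a minimal Python sketch. The column names `user_id`, `url`, `domain`, and `category`, the CSV layout, and the three sample rows are all assumptions made for illustration; the actual file layout is documented in the README and may differ.

```python
import csv
import io
from collections import Counter


def summarize_visits(rows):
    """Count users, visits, unique domains and categories in visit-level rows.

    Each row is expected to be a mapping with the hypothetical keys
    'user_id', 'domain' and 'category' (assumed names, not the real schema).
    """
    domains = set()
    categories = set()
    visits_per_user = Counter()
    for row in rows:
        domains.add(row["domain"])
        categories.add(row["category"])
        visits_per_user[row["user_id"]] += 1
    return {
        "users": len(visits_per_user),
        "visits": sum(visits_per_user.values()),
        "unique_domains": len(domains),
        "categories": len(categories),
        "visits_per_user": dict(visits_per_user),
    }


# Made-up sample rows, only to show the expected shape of the input.
SAMPLE = """user_id,url,domain,category
u1,a1f3,example.org,news
u1,9bc2,example.org,news
u2,77d0,shop.example,shopping
"""

summary = summarize_visits(csv.DictReader(io.StringIO(SAMPLE)))
print(summary)
```

Because the function only iterates over rows once, the same approach should also work when streaming a large file through `csv.DictReader` instead of loading it fully into memory.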

We acknowledge the support of Respondi AG, which provided the web tracking and survey data free of charge for research purposes, with special thanks to François Erner and Luc Kalaora at Respondi for their insights and help with data extraction.

The data set is analyzed in the following paper: 

  • Kulshrestha, J., Oliveira, M., Karacalik, O., Bonnay, D., Wagner, C. "Web Routineness and Limits of Predictability: Investigating Demographic and Behavioral Differences Using Web Tracking Data." Proceedings of the International AAAI Conference on Web and Social Media. 2021. https://arxiv.org/abs/2012.15112

The code used to analyze the data is also available at https://github.com/gesiscss/web_tracking.

If you use data or code from this repository, please cite the paper above and the Zenodo link.

 

Files

README.txt

Files (217.9 MB)

Size and checksum of each file:

  • 1.3 kB (md5:7759ca79ee86a1f4f6dfa1f17b897f87)
  • 23.8 MB (md5:884537d2b8c5894f466befb2830e9220)
  • 194.1 MB (md5:475519cdb23aad093ccd86990cbaec09)
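
The md5 values listed above can be used to check that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` and the local file path in the usage comment are hypothetical, and the file is read in chunks so that the largest file does not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Usage (the path is a placeholder for wherever the file was saved):
# matches_checksum("downloaded_file", "md5:475519cdb23aad093ccd86990cbaec09")
```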