Dataset Open Access

A web tracking data set of online browsing behavior of 2,148 users

Kulshrestha, Juhi; Oliveira, Marcos; Karacalik, Orkut; Bonnay, Denis; Wagner, Claudia


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nmm##2200000uu#4500</leader>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">online behavior</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">web browsing behavior</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">web tracking</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">panel data</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">demographics</subfield>
  </datafield>
  <controlfield tag="005">20210514014814.0</controlfield>
  <controlfield tag="001">4757574</controlfield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="d">2021</subfield>
    <subfield code="g">ICWSM</subfield>
    <subfield code="a">International AAAI Conference on Web and Social Media</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">GESIS - Leibniz Institute for the Social Sciences, Germany</subfield>
    <subfield code="0">(orcid)0000-0003-3407-5361</subfield>
    <subfield code="a">Oliveira, Marcos</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">GESIS - Leibniz Institute for the Social Sciences, Germany</subfield>
    <subfield code="a">Karacalik, Orkut</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Université Paris Nanterre, France</subfield>
    <subfield code="a">Bonnay, Denis</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">GESIS - Leibniz Institute for the Social Sciences, Germany</subfield>
    <subfield code="a">Wagner, Claudia</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1323</subfield>
    <subfield code="z">md5:7759ca79ee86a1f4f6dfa1f17b897f87</subfield>
    <subfield code="u">https://zenodo.org/record/4757574/files/README.txt</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">23818399</subfield>
    <subfield code="z">md5:884537d2b8c5894f466befb2830e9220</subfield>
    <subfield code="u">https://zenodo.org/record/4757574/files/web_tracking_code.zip</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">194055856</subfield>
    <subfield code="z">md5:475519cdb23aad093ccd86990cbaec09</subfield>
    <subfield code="u">https://zenodo.org/record/4757574/files/web_tracking_data.tar.gz</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="y">Conference website</subfield>
    <subfield code="u">https://www.icwsm.org/2021/index.html</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2020-12-30</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire_data</subfield>
    <subfield code="o">oai:zenodo.org:4757574</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">GESIS - Leibniz Institute for the Social Sciences, Germany</subfield>
    <subfield code="0">(orcid)0000-0002-4375-4641</subfield>
    <subfield code="a">Kulshrestha, Juhi</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">A web tracking data set of online browsing behavior of 2,148 users</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by-nc/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution Non Commercial 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;This anonymized data set consists of one month&amp;#39;s (October 2018) web tracking data of 2,148 German users. For each user, the data contains the anonymized URL of the webpage the user visited, the domain of the webpage, category of the domain, which provides 41 distinct categories. In total, these 2,148 users made 9,151,243 URL visits, spanning 49,918 unique domains. For each user in our data set, we have self-reported information (collected via a survey) about their gender and age.&lt;/p&gt;

&lt;p&gt;We acknowledge the support of Respondi AG, which provided the web tracking and survey data&amp;nbsp;free of charge for research purposes, with special thanks to Fran&amp;ccedil;ois Erner and Luc Kalaora at Respondi for their insights and help with data extraction.&lt;/p&gt;

&lt;p&gt;The&amp;nbsp;data set is analyzed in the following paper:&amp;nbsp;&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Kulshrestha, J., Oliveira, M., Karacalik, O., Bonnay, D., Wagner, C. &amp;quot;&lt;em&gt;Web Routineness and Limits of Predictability: Investigating Demographic and Behavioral Differences Using Web Tracking Data&lt;/em&gt;.&amp;quot; Proceedings of the International AAAI Conference on Web and Social Media. 2021.&amp;nbsp;&lt;a href="https://arxiv.org/abs/2012.15112"&gt;https://arxiv.org/abs/2012.15112&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The code used to analyze the data is also available at&amp;nbsp;&lt;a href="https://github.com/gesiscss/web_tracking"&gt;https://github.com/gesiscss/web_tracking&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you use data or code from this repository, please cite the paper above and the Zenodo link.&lt;/strong&gt;&lt;/p&gt;

</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.4383163</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.4757574</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">dataset</subfield>
  </datafield>
</record>
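
The 856 fields in the record above carry, for each deposited file, its size in bytes (subfield s), an MD5 checksum (subfield z), and its download URL (subfield u). The following is a minimal sketch, not part of the original record, of how those entries could be read from the export and used to check a downloaded file. The function names list_files and md5_matches are hypothetical, and the sketch assumes only the Python standard library.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace declared on the <record> element of the MARC21 export.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def list_files(marc_xml):
    """Return url, size, and md5 for every 856 field that carries a checksum.

    Hypothetical helper: 856 fields without an "md5:" value in subfield z
    (such as the conference-website link) are skipped.
    """
    root = ET.fromstring(marc_xml)
    files = []
    for field in root.findall(".//marc:datafield[@tag='856']", MARC_NS):
        subfields = {
            sf.get("code"): (sf.text or "")
            for sf in field.findall("marc:subfield", MARC_NS)
        }
        checksum = subfields.get("z", "")
        if not checksum.startswith("md5:"):
            continue
        files.append(
            {
                "url": subfields.get("u", ""),
                "size": int(subfields.get("s", "0")),
                "md5": checksum[len("md5:"):],
            }
        )
    return files


def md5_matches(path, expected_md5, chunk_size=1 << 20):
    """Compare the MD5 of a local file with the checksum from the record."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large archives do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.lower()
```

Assuming the export has been saved locally, list_files can be called on the file's contents, and md5_matches can then be run on each downloaded file with the matching "md5" value. A mismatch would suggest an incomplete or corrupted download.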