Published September 13, 2023 | Version v2
Dataset Open

Cookiescanner Dataset

  • 1. University of Bamberg

Description

This upload contains the dataset from the paper "Cookiescanner: An Automated Tool for Detecting and Evaluating GDPR Consent Notices on Websites" presented at ARES 2023.

Folder Structure
- 01_bert_classifier: The final BERT model, plus the datasets and Jupyter notebook used to train and evaluate it. The folder also contains a small Flask application and a Dockerfile for deploying the model as a web service.
- 02_raw_dataset: The 1,000 sampled scans. The results.json file contains the raw scan data; the subfolders contain the screenshots produced by the detection methods.
- 03_banner_detection: Contains the analysis CSV file, as well as a folder with the banner screenshots.
- 04_dark_patterns: Analysis files as well as the screenshots of banners with the dark patterns from the paper.
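
The internal layout of results.json is not documented on this page, so a reasonable first step is to inspect it before writing any analysis code. The following is a minimal sketch, assuming only that the file is valid JSON whose top level is either a list of records or an object mapping keys to records. The function name `summarize_json` is hypothetical and is not part of the Cookiescanner code base.

```python
import json
from collections import Counter


def summarize_json(path):
    """Return the top-level type, entry count, and field-name frequencies.

    Assumption: the top level is a list of records or a dict of records.
    Nothing here relies on specific Cookiescanner field names.
    """
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)

    if isinstance(data, dict):
        records = list(data.values())
    elif isinstance(data, list):
        records = data
    else:
        records = []

    # Count how often each field name occurs across dict-shaped records.
    field_counts = Counter()
    for record in records:
        if isinstance(record, dict):
            field_counts.update(record.keys())

    return {
        "top_level_type": type(data).__name__,
        "entries": len(records),
        "fields": dict(field_counts),
    }
```

After extracting the archive, calling `summarize_json("02_raw_dataset/results.json")` (path relative to the extracted folder, which is an assumption about how the archive unpacks) reports how many entries the file holds and which fields they carry.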

Source Code
For the source code of the scanner, please refer to https://github.com/UBA-PSI/cookiescanner.

Files

Cookiescanner_Dataset.zip (6.4 GB)
md5:ac7470d4f0de2db07538bbbbd90ce8a9
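
Because the archive is several gigabytes, it is worth checking the download against the MD5 checksum listed above before extracting it. The sketch below uses only the Python standard library and reads the file in chunks so that it does not need to fit in memory. The helper names `md5_of_file` and `verify_archive` are illustrative, and the local file name is assumed to match the name shown on this page.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this page for Cookiescanner_Dataset.zip.
EXPECTED_MD5 = "ac7470d4f0de2db07538bbbbd90ce8a9"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust if the download was saved elsewhere.
    archive = Path("Cookiescanner_Dataset.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case the archive should be fetched again.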

Additional details

Related works

Is cited by
Journal article: 10.1145/3600160.3605000 (DOI)
Preprint: 10.48550/arXiv.2309.06196 (DOI)