There is a newer version of the record available.

Published May 1, 2023 | Version v1
Working paper Open

Cookiescanner Data

Authors/Creators

  • 1. Anonymous

Description

Trustbus dataset for cookiescanner

# Foreword
Because the scans are large and take a long time to upload, we uploaded only a serialized sample of 100 scans. If the paper is accepted, we will upload all of the sampled scans and, if space is available, the databases with all scans.

# Folder Structure
- 00_source_code: The source code of cookiescanner
- 01_bert_classifier: The final BERT model, together with the datasets and the Jupyter notebook used to train and evaluate it. The folder also contains a small Flask application and a Dockerfile to deploy it as a web service.
- 02_raw_dataset: A serialized sample of 100 scans. The file results.json contains the raw scan data; the subfolders contain the screenshots from the detection methods.
- 03_banner_detection: The analysis CSV file and a folder with the banner screenshots.
- 04_dark_patterns: The analysis files.
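
As a quick sanity check after download, the folder layout listed above can be compared against the contents of the archive. The following is a minimal sketch, not part of the dataset: the helper name `missing_folders` and the local path `TRUSTBUS_DATASET.zip` are assumptions, and the check deliberately accepts the folders at any depth because the record does not say whether the archive wraps them in a top-level directory.

```python
import os
import zipfile
from pathlib import PurePosixPath

# Folder names taken from the "Folder Structure" list of this record.
EXPECTED_FOLDERS = [
    "00_source_code",
    "01_bert_classifier",
    "02_raw_dataset",
    "03_banner_detection",
    "04_dark_patterns",
]


def missing_folders(member_names, expected=EXPECTED_FOLDERS):
    """Return the expected folders that appear in none of the member paths.

    Hypothetical helper: a folder counts as present if it occurs as any
    path component, so a wrapping top-level directory does not matter.
    """
    seen = set()
    for name in member_names:
        # Zip member names always use forward slashes.
        seen.update(PurePosixPath(name).parts)
    return [folder for folder in expected if folder not in seen]


if __name__ == "__main__":
    archive_path = "TRUSTBUS_DATASET.zip"  # assumed local download path
    if os.path.exists(archive_path):
        with zipfile.ZipFile(archive_path) as archive:
            missing = missing_folders(archive.namelist())
        print("missing folders:", missing)
    else:
        print(f"{archive_path} not found")
```

An empty list means every documented folder was found somewhere in the archive.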

Files

TRUSTBUS_DATASET.zip (1.1 GB)
md5:fd51336cfab92a30efc1aabfd11534d6
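
The MD5 checksum listed for TRUSTBUS_DATASET.zip can be used to confirm that a download of this size completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the function name `md5_of_file` and the local file path are assumptions, not part of the dataset.

```python
import hashlib
import os

# Checksum as published on this record for TRUSTBUS_DATASET.zip.
EXPECTED_MD5 = "fd51336cfab92a30efc1aabfd11534d6"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, read in chunks.

    Hypothetical helper: chunked reading keeps memory use flat even
    for an archive of about 1.1 GB.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    archive_path = "TRUSTBUS_DATASET.zip"  # assumed local download path
    if os.path.exists(archive_path):
        actual = md5_of_file(archive_path)
        if actual == EXPECTED_MD5:
            print("checksum OK")
        else:
            print(f"checksum mismatch: {actual}")
    else:
        print(f"{archive_path} not found")
```

MD5 is suitable here only as an integrity check against transfer errors, not as protection against deliberate tampering.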