Dataset Open Access

Webis-Web-Errors-19

Kiesel, Johannes; Hubricht, Fabienne; Stein, Benno; Potthast, Martin

The Webis-Web-Errors-19 dataset comprises annotations for the 10,000 web page archives of the Webis-Web-Archive-17. Each page is annotated for whether it is (1) mostly advertisement, (2) cut off, (3) still loading, or (4) pornographic, and for the degree (not / a bit / very) to which it shows (5) pop-ups, (6) CAPTCHAs, or (7) error messages. If you use this dataset in your research, please cite it using this paper.

Files (681.6 kB)

Name, Size, Checksum
annotation-interface.png, 263.5 kB, md5:7aaa146411dbc0d770dbd319f00cd864
curation-interface.png, 101.7 kB, md5:ed0aa6eb30370b38f69265f463c59d5c
webis-web-archive-17-content-error-tags.txt, 4.1 kB, md5:862ffd4469c832922af95007ca4b3a44
webis-web-archive-17-content-errors.csv, 312.3 kB, md5:098d08ef7c0e0c69023b27951277caa1
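The MD5 checksums listed for the files can be used to check that a download is intact. The following is a minimal sketch, assuming the four files were saved under their listed names in a single local directory; the function names `md5_of` and `verify_downloads` are illustrative and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "annotation-interface.png": "7aaa146411dbc0d770dbd319f00cd864",
    "curation-interface.png": "ed0aa6eb30370b38f69265f463c59d5c",
    "webis-web-archive-17-content-error-tags.txt": "862ffd4469c832922af95007ca4b3a44",
    "webis-web-archive-17-content-errors.csv": "098d08ef7c0e0c69023b27951277caa1",
}


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name  # assumed local location of the download
        results[name] = path.is_file() and md5_of(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads().items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

Run from the download directory, the script prints one status line per file. A mismatch usually indicates an incomplete or corrupted download rather than a problem with the dataset itself.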
Statistic, All versions, This version
Views, 289, 194
Downloads, 172, 114
Data volume, 37.2 MB, 21.5 MB
Unique views, 238, 165
Unique downloads, 120, 81
