Published June 8, 2020 | Version cikm20-final
Dataset
Open
Webis-WebSeg-20
Creators
- 1. Bauhaus-Universität Weimar
- 2. Leipzig University
Description
The Webis-WebSeg-20 dataset comprises 42,450 crowdsourced segmentations for 8,490 web pages from the Webis-Web-Archive-17, that is, five per page. For each page, the segmentations of five crowd workers were fused into a single segmentation. If you use this dataset in your research, please cite the conference paper listed under Related works.
Files
(13.5 GB)
MD5 checksum | Size |
---|---|
e1c15c08939635ef26bb9694f31d7d12 | 2.7 kB |
16d1b7e858bddb9d6629f99745c9dd56 | 5.8 MB |
a06202b70dd114f0addd38a0485d6163 | 25.6 MB |
123b15e7d1eb6f77ffa3893fe926d7ec | 380.9 MB |
c2105bd4444502fced4b004d4d148673 | 7.8 MB |
119c60a663ee4384f5ca1c6bdafe4a1d | 1.1 GB |
868cda90bca002fad005b293b9939df9 | 12.0 GB |
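Because the largest archive is 12.0 GB, it may be worth checking each download against the MD5 checksums listed under Files before unpacking. The following is a minimal sketch using only the Python standard library. The function names `md5_of_file` and `verify_download` are illustrative, and the file path in the usage note is a placeholder, since the file names are not listed here.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks
    so that multi-gigabyte archives never have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Compare a file's MD5 digest with the expected checksum.
    The comparison ignores letter case and surrounding whitespace."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `verify_download("path/to/downloaded-file", "868cda90bca002fad005b293b9939df9")` would return `True` only if the placeholder path points to an intact copy of the 12.0 GB file. MD5 is adequate here for detecting transfer corruption, not for guarding against deliberate tampering.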
Additional details
Related works
- Is documented by
  - Conference paper: https://webis.de/publications.html#kiesel_2020b (URL)
- Is supplement to
  - Dataset: 10.5281/zenodo.1002203 (DOI)
- Is supplemented by
  - Software: https://github.com/webis-de/cikm20-web-page-segmentation-revisited-evaluation-framework-and-dataset (URL)
  - Dataset: 10.5281/zenodo.4146889 (DOI)