Published June 8, 2020 | Version cikm20-final
Dataset
Open
Webis-WebSeg-20
Authors/Creators
- 1. Bauhaus-Universität Weimar
- 2. Leipzig University
Description
The Webis-WebSeg-20 dataset comprises 42,450 crowdsourced segmentations for 8,490 web pages from the Webis-Web-Archive-17, five per page. For each page, the segmentations of the five crowd workers were fused into a single segmentation. If you use this dataset in your research, please cite the conference paper listed under Related works.
Files
(13.5 GB)
| MD5 checksum | Size |
|---|---|
| md5:e1c15c08939635ef26bb9694f31d7d12 | 2.7 kB |
| md5:16d1b7e858bddb9d6629f99745c9dd56 | 5.8 MB |
| md5:a06202b70dd114f0addd38a0485d6163 | 25.6 MB |
| md5:123b15e7d1eb6f77ffa3893fe926d7ec | 380.9 MB |
| md5:c2105bd4444502fced4b004d4d148673 | 7.8 MB |
| md5:119c60a663ee4384f5ca1c6bdafe4a1d | 1.1 GB |
| md5:868cda90bca002fad005b293b9939df9 | 12.0 GB |
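The MD5 checksums listed above can be used to check that a download completed without corruption. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are illustrative and not part of the dataset or its accompanying software, and the file path in the usage note is a placeholder because this record does not list the file names.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks.

    Chunked reading keeps memory use flat even for multi-gigabyte archives.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    # Drop an optional 'md5:' prefix as used in the file listing (hypothetical helper).
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_checksum("path/to/downloaded-file", "md5:868cda90bca002fad005b293b9939df9")` should return `True` for an intact copy of the 12.0 GB file, assuming the placeholder path is replaced with the actual download location.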
Additional details
Related works
- Is documented by
  - Conference paper: https://webis.de/publications.html#kiesel_2020b (URL)
- Is supplement to
  - Dataset: 10.5281/zenodo.1002203 (DOI)
- Is supplemented by
  - Software: https://github.com/webis-de/cikm20-web-page-segmentation-revisited-evaluation-framework-and-dataset (URL)
  - Dataset: 10.5281/zenodo.4146889 (DOI)