Published September 1, 2016 | Version v1
Dataset
Open
Wikidata Vandalism Corpus 2016 (WDVC-16)
- 1. Paderborn University
- 2. Bauhaus-Universität Weimar
Description
The Wikidata Vandalism Corpus 2016 (WDVC-16) is a corpus for evaluating automatic vandalism detectors for Wikidata. It was used in the WSDM Cup 2017. The corpus may be used free of charge for research purposes.
When using the data, please cite it as follows:
@inproceedings{heindorf2017overview,
author = {Stefan Heindorf and
Martin Potthast and
Gregor Engels and
Benno Stein},
title = {Overview of the Wikidata Vandalism Detection Task at {WSDM} Cup 2017},
booktitle = {{WSDM Cup 2017 Notebook Papers}},
url = {https://arxiv.org/abs/1712.05956},
year = {2017}
}
Files
(29.8 GB)
MD5 checksum | Size |
---|---|
7b02fe367256e236fb0602c6936b56d7 | 18.6 MB |
00118133c1a46bb01dbe10dbb818fc39 | 103.0 MB |
fcc792db769e366a4e79d4667501095e | 135.3 MB |
edfe4bc5ed72a622ea2920153f87499d | 384.4 MB |
1ba956055f4f8ef1f08db6645c528e19 | 488.6 MB |
ccfc0f901aa82c8f7fdd90b3da1f03c8 | 472.6 MB |
079ce7abe468c3f353bb34a874f5f250 | 463.2 MB |
65ee3674d685f716826dcfa3ededb9d5 | 505.0 MB |
1f48c419bb36ada070b332e7d9487482 | 603.6 MB |
61dbd58672650c80978e6e6c6418f6b0 | 618.1 MB |
95f04e8579301cab1e08e600731f9b92 | 1.1 GB |
e2ee6d7dfe308bd5cc0b1adde9cab04f | 1.4 GB |
4895a3d7c862da760aafbc849b51deaa | 1.5 GB |
d7e24cdda2bbb12ea34002e3ca047712 | 1.3 GB |
bd4a6b51d2848b50ee199ce16805b074 | 1.4 GB |
794f17d62bb8dfc514bd6a3b730b5a1b | 1.7 GB |
9f62a6405b986386dd12a759ce3092d5 | 1.3 GB |
749fc71a0316fb3bf021453be0669ad6 | 1.4 GB |
48a66e25778296faac6fbc9e988153b4 | 1.8 GB |
eed0b18a77f4dae6c2d2a6869eb1467c | 3.3 GB |
be7e40edd8b828a9b0709c42bb1cb40c | 3.5 GB |
b16cd532270459cc831b40746e604430 | 3.2 GB |
5aecd9a5253c0d7588a9cbd8fe78b26f | 20.5 MB |
53f94cdb1143a9fa91c88be6d3050fe9 | 4.6 MB |
004d1d33b3f47e6b0baa354dae3177d6 | 2.9 GB |
78dddb2187e2bcb10a21597428260f4f | 31.6 MB |
7dd3fe6ed9b966760b2ef6146b546e98 | 5.7 MB |
2b8056f0a5603ca84b205ad0e0f1a3f9 | 176.5 MB |
ccfc90e1810c6290f4954df838fbef91 | 45.9 MB |
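Each file in the list above comes with an MD5 checksum, which can be used to confirm that a multi-gigabyte download arrived intact. The following is a minimal sketch of such a check using Python's standard `hashlib` module. The function names `md5_of_file` and `verify_download` are illustrative and not part of the corpus, and any local file path passed to them is an assumption about where the file was saved.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never held fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 digest matches the listed checksum.

    `expected_md5` may be given with or without the "md5:" prefix.
    """
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `verify_download(local_path, "md5:7b02fe367256e236fb0602c6936b56d7")` would return `True` only if the file saved at the hypothetical `local_path` matches the first checksum in the list.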
Additional details
References
- Stefan Heindorf, Martin Potthast, Hannah Bast, Björn Buchhold, and Elmar Haussmann. WSDM Cup 2017: Vandalism Detection and Triple Scoring. In WSDM, pages 827-828. ACM, 2017.
- Stefan Heindorf, Martin Potthast, Gregor Engels, and Benno Stein. Overview of the Wikidata Vandalism Detection Task at WSDM Cup 2017. In WSDM Cup 2017 Notebook Papers, 2017.