Dataset Open Access
Potthast, Martin;
Stein, Benno;
Eiselt, Andreas;
Barrón-Cedeño, Alberto;
Rosso, Paolo
{
  "publisher": "Zenodo",
  "DOI": "10.5281/zenodo.3250123",
  "language": "eng",
  "title": "PAN Plagiarism Corpus 2010 (PAN-PC-10)",
  "issued": { "date-parts": [ [ 2010, 5, 1 ] ] },
  "abstract": "<p>This corpus is outdated. Please use its successor PAN-PC-11: https://doi.org/10.5281/zenodo.3250095</p>\n\n<p>The PAN plagiarism corpus 2010 (PAN-PC-10) is a corpus for the evaluation of automatic plagiarism detection algorithms. For research purposes the corpus can be used free of charge.</p>\n\n<p>The PAN-PC-10 contains documents in which artificial plagiarism has been inserted automatically as well as documents in which simulated plagiarism has been inserted manually. The former have been constructed using a so-called random plagiarist, a computer program which constructs plagiarism according to a number of parameters, while the latter have been obtained with crowdsourcing via Amazon's Mechanical Turk.</p>",
  "author": [
    { "family": "Potthast, Martin" },
    { "family": "Stein, Benno" },
    { "family": "Eiselt, Andreas" },
    { "family": "Barr\u00f3n-Cede\u00f1o, Alberto" },
    { "family": "Rosso, Paolo" }
  ],
  "type": "dataset",
  "id": "3250123"
}
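The metadata record above is plain JSON, so it can be read with any JSON parser. The following is a minimal Python sketch, assuming a trimmed copy of the record held in a string; the helper name `format_citation` and the citation layout it produces are illustrative assumptions, not a Zenodo API or a formal citation style.

```python
import json

# Trimmed copy of the record above: only the fields used below are kept.
# A raw string keeps the \uXXXX escapes intact for the JSON parser.
RECORD_JSON = r"""
{
  "publisher": "Zenodo",
  "DOI": "10.5281/zenodo.3250123",
  "title": "PAN Plagiarism Corpus 2010 (PAN-PC-10)",
  "issued": { "date-parts": [ [ 2010, 5, 1 ] ] },
  "author": [
    { "family": "Potthast, Martin" },
    { "family": "Stein, Benno" },
    { "family": "Eiselt, Andreas" },
    { "family": "Barr\u00f3n-Cede\u00f1o, Alberto" },
    { "family": "Rosso, Paolo" }
  ]
}
"""


def format_citation(record: dict) -> str:
    """Build a one-line citation string (hypothetical layout)."""
    # In this export each author's full name sits in the "family" field.
    authors = "; ".join(a["family"] for a in record["author"])
    # "date-parts" is a list of [year, month, day] lists; take the year.
    year = record["issued"]["date-parts"][0][0]
    doi_url = "https://doi.org/" + record["DOI"]
    return f'{authors} ({year}). {record["title"]}. {record["publisher"]}. {doi_url}'


record = json.loads(RECORD_JSON)
citation = format_citation(record)
print(citation)
```

The parser decodes the `\u00f3` and `\u00f1` escapes, so the fourth author is printed as "Barrón-Cedeño, Alberto".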
Metric | All versions | This version |
---|---|---|
Views | 625 | 626 |
Downloads | 426 | 426 |
Data volume | 388.3 GB | 388.3 GB |
Unique views | 571 | 572 |
Unique downloads | 191 | 191 |