Published May 5, 2016 | Version v1
Dataset
Open
PAN16 Author Identification: Clustering
Creators
- 1. 0000-0001-9033-2217
- 2. Universität Leipzig
Description
We provide a collection of up to 100 documents in which to identify authorship links and groups of documents by the same author. All documents are single-authored, written in the same language, and belong to the same genre. However, the topic and text length of the documents may vary. The number of distinct authors represented in the collection is not given.
More information: Link
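The relationship between the two outputs named above (groups of documents by the same author, and authorship links) can be illustrated with a minimal sketch. It assumes that a link is any unordered pair of documents placed in the same group. The function name `authorship_links` and the document names are hypothetical and are not taken from the dataset or its evaluation format.

```python
from itertools import combinations


def authorship_links(clusters):
    """Return every unordered pair of documents that share a cluster.

    `clusters` is a list of lists of document names (an assumed,
    illustrative representation, not the dataset's own file format).
    """
    links = set()
    for cluster in clusters:
        # Sort so that each pair is stored in one canonical order.
        for first, second in combinations(sorted(cluster), 2):
            links.add((first, second))
    return links


# Hypothetical document names, for illustration only.
clusters = [["doc1.txt", "doc3.txt", "doc4.txt"], ["doc2.txt"]]
links = authorship_links(clusters)
```

A group containing a single document yields no links, so a collection in which every document has a different author produces an empty link set.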
Files
Name | Size |
---|---|
pan16-author-clustering-test-and-training.zip (md5:711e95fed2a865a82faffcf77475c3e9) | 5.3 MB |
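The MD5 checksum listed for the archive can be used to check a download for corruption. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listed_checksum` are illustrative, and the local path passed to them is whatever location the archive was saved to.

```python
import hashlib

# Checksum as listed for pan16-author-clustering-test-and-training.zip.
EXPECTED_MD5 = "711e95fed2a865a82faffcf77475c3e9"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("pan16-author-clustering-test-and-training.zip")` should return `True` for an intact copy saved under that name in the working directory.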
Additional details
References
- Efstathios Stamatatos, Michael Tschuggnall, Ben Verhoeven, Walter Daelemans, Günther Specht, Benno Stein, and Martin Potthast. Clustering by Authorship Within and Across Documents. In Working Notes Papers of the CLEF 2016 Evaluation Labs, volume 1609 of CEUR Workshop Proceedings, September 2016. CEUR-WS.org. ISSN 1613-0073.