Kestemont, Mike
Tschuggnall, Michael
Stamatatos, Efstathios
Daelemans, Walter
Specht, Günther
Stein, Benno
Potthast, Martin
2018-09-10
<p>We provide a corpus comprising a set of cross-domain authorship attribution problems in each of the following five languages: English, French, Italian, Polish, and Spanish. Note that we deliberately avoid the term 'training corpus' because <strong>the sets of candidate authors of the development and the evaluation corpora do not overlap</strong>. Therefore, your approach should not be tailored to the particular candidate authors of the development corpus.</p>
<p>Each problem consists of a set of known fanfics by each candidate author and a set of unknown fanfics, located in separate folders. The file <code>problem-info.json</code>, found in the main folder of each problem, gives the name of the folder of unknown documents and the list of candidate author folder names.</p>
<p>The true author of each unknown document can be seen in the file <code>ground-truth.json</code>, also found in the main folder of each problem.</p>
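<p>As a rough illustration of the layout described above, the following Python sketch reads one problem folder. The JSON key names (<code>unknown-folder</code>, <code>candidate-authors</code>, <code>author-name</code>, <code>ground_truth</code>, <code>unknown-text</code>, <code>true-author</code>) are assumptions and are not stated in this description; check them against the actual files and adjust the constants if they differ. The function names are hypothetical.</p>

```python
import json
from pathlib import Path

# Assumed key names; verify against the real JSON files before relying on them.
UNKNOWN_FOLDER_KEY = "unknown-folder"
CANDIDATES_KEY = "candidate-authors"
AUTHOR_NAME_KEY = "author-name"
GROUND_TRUTH_KEY = "ground_truth"
UNKNOWN_TEXT_KEY = "unknown-text"
TRUE_AUTHOR_KEY = "true-author"


def _read_json(path):
    # All documents in the corpus are stated to be UTF-8 encoded.
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def load_problem(problem_dir):
    """Return (known, unknown): known maps author folder -> file paths,
    unknown is the list of file paths in the unknown-documents folder."""
    problem_dir = Path(problem_dir)
    info = _read_json(problem_dir / "problem-info.json")
    authors = [entry[AUTHOR_NAME_KEY] for entry in info[CANDIDATES_KEY]]
    known = {
        author: sorted(p for p in (problem_dir / author).iterdir() if p.is_file())
        for author in authors
    }
    unknown_dir = problem_dir / info[UNKNOWN_FOLDER_KEY]
    unknown = sorted(p for p in unknown_dir.iterdir() if p.is_file())
    return known, unknown


def load_ground_truth(problem_dir):
    """Return a dict mapping each unknown document name to its true author."""
    truth = _read_json(Path(problem_dir) / "ground-truth.json")
    return {
        item[UNKNOWN_TEXT_KEY]: item[TRUE_AUTHOR_KEY]
        for item in truth[GROUND_TRUTH_KEY]
    }
```

<p>Because the candidate authors of the development and evaluation corpora differ, the author names returned here should only be used within the problem they come from.</p>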
<p>In addition, to handle a collection of such problems, the file <code>collection-info.json</code> includes all relevant information. Specifically, for each problem it lists the main folder, the language (either <code>"en"</code>, <code>"fr"</code>, <code>"it"</code>, <code>"pl"</code>, or <code>"sp"</code>), and the encoding (always <code>UTF-8</code>) of its documents.</p>
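<p>A minimal Python sketch of iterating over a collection is given below. It assumes that <code>collection-info.json</code> is a JSON list with one object per problem and that the keys are named <code>problem-name</code>, <code>language</code>, and <code>encoding</code>; neither the list structure nor the key names are confirmed by this description. The function name <code>iter_problems</code> is hypothetical.</p>

```python
import json
from pathlib import Path

# Assumed key names; verify against the real collection-info.json.
PROBLEM_NAME_KEY = "problem-name"
LANGUAGE_KEY = "language"
ENCODING_KEY = "encoding"


def iter_problems(collection_dir):
    """Yield (problem_folder, language, encoding) for each listed problem."""
    collection_dir = Path(collection_dir)
    with open(collection_dir / "collection-info.json", encoding="utf-8") as handle:
        entries = json.load(handle)  # assumed to be a list of objects
    for entry in entries:
        yield (
            collection_dir / entry[PROBLEM_NAME_KEY],
            entry[LANGUAGE_KEY],
            entry[ENCODING_KEY],
        )
```

<p>The language code can then be used to select language-specific preprocessing for each problem.</p>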
<p>More information: <a href="https://pan.webis.de/clef18/pan18-web/authorship-attribution.html">Link</a></p>
new version: removed passwords inside packages
https://doi.org/10.5281/zenodo.3737849
oai:zenodo.org:3737849
eng
Zenodo
https://zenodo.org/communities/pan
https://doi.org/10.5281/zenodo.3737684
info:eu-repo/semantics/openAccess
PAN at CLEF 2018, Conference title: PAN at Conference and Labs of the Evaluation Forum 2018
author
authorship
attribution
fanfiction
pan
2018
PAN18 Author Identification: Attribution
info:eu-repo/semantics/other