Dataset Restricted Access
Stamatatos, Efstathios;
Daelemans, Walter;
Verhoeven, Ben;
Juola, Patrick;
López-López, Aurelio;
Potthast, Martin;
Stein, Benno
Title: PAN15 Author Identification: Verification
Publisher: Zenodo
DOI: 10.5281/zenodo.3737563
Published in: CLEF 2015 Labs and Workshops, Notebook Papers
Event: PAN at Conference and Labs of the Evaluation Forum 2015 (PAN at CLEF 2015)
Date: 2015-09-08
Language: eng
Type: dataset

We provide you with a training corpus that comprises a set of author verification problems in several languages/genres. Each problem consists of some (up to five) known documents by a single person and exactly one questioned document. All documents within a single problem instance will be in the same language. However, their genre and/or topic may differ significantly. The document lengths vary from a few hundred to a few thousand words.

The documents of each problem are located in a separate folder, the name of which (problem ID) encodes the language of the documents. The following list shows the available sub-corpora, including their language, type (cross-genre or cross-topic), code, and examples of problem IDs:

Language; Type; Code; Problem IDs
Dutch; Cross-genre; DU; DU001, DU002, DU003, etc.
English; Cross-topic; EN; EN001, EN002, EN003, etc.
Greek; Cross-topic; GR; GR001, GR002, GR003, etc.
Spanish; Cross-genre; SP; SP001, SP002, SP003, etc.

The ground truth data of the training corpus found in the file `truth.txt` include one line per problem with problem ID and the correct binary answer (Y means the known and the questioned documents are by the same author and N means the opposite). For example:

    EN001 N
    EN002 Y
    EN003 N
    ...
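The `truth.txt` format and the problem ID scheme described in the abstract can be read with a few lines of Python. This is a minimal sketch, not part of the dataset: the function names `parse_truth` and `describe_problem` and the `SUBCORPORA` table are hypothetical, and the sketch assumes that each ground-truth line holds exactly a problem ID and `Y` or `N` separated by whitespace, as in the example given there.

```python
# Language codes and corpus types as listed in the dataset description.
SUBCORPORA = {
    "DU": ("Dutch", "cross-genre"),
    "EN": ("English", "cross-topic"),
    "GR": ("Greek", "cross-topic"),
    "SP": ("Spanish", "cross-genre"),
}


def parse_truth(text):
    """Map each problem ID to True ('Y', same author) or False ('N')."""
    labels = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank or malformed lines (assumption: two fields per line).
        if len(parts) != 2 or parts[1] not in ("Y", "N"):
            continue
        labels[parts[0]] = parts[1] == "Y"
    return labels


def describe_problem(problem_id):
    """Return (language, type) from the two-letter prefix of a problem ID."""
    return SUBCORPORA[problem_id[:2]]


# The three example lines quoted in the description.
sample = "EN001 N\nEN002 Y\nEN003 N\n"
labels = parse_truth(sample)
```

To use this on the actual corpus, pass the contents of `truth.txt` to `parse_truth`. The description does not state the file's encoding, so that has to be checked against the downloaded data.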
| | All versions | This version |
|---|---|---|
| Views | 508 | 508 |
| Downloads | 39 | 39 |
| Data volume | 236.0 MB | 236.0 MB |
| Unique views | 384 | 384 |
| Unique downloads | 37 | 37 |