Published September 15, 2014 | Version v1 | Dataset | Open
                  PAN14 Author Identification: Verification
Creators
- 1. Universität Leipzig
- 2. Bauhaus-Universität Weimar
Description
We provide a training corpus comprising a set of author verification problems in several languages and genres. Each problem consists of a small number (up to five) of known documents by a single person and exactly one questioned document. All documents within a single problem instance are in the same language, and best efforts were made to ensure that within-problem documents are matched for genre, register, theme, and date of writing. Document lengths vary from a few hundred to a few thousand words.
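The problem structure described above can be sketched as a small data type. This is only an illustration: the class name `VerificationProblem`, its field names, and the assumption that every problem has at least one known document are hypothetical and are not taken from the corpus documentation, which does not specify an on-disk layout here.

```python
from dataclasses import dataclass
from typing import Tuple

# Upper bound stated in the description: "up to five" known documents.
MAX_KNOWN_DOCUMENTS = 5


@dataclass(frozen=True)
class VerificationProblem:
    """One author verification problem (hypothetical in-memory model)."""

    problem_id: str                    # hypothetical identifier
    language: str                      # all documents share one language
    known_documents: Tuple[str, ...]   # texts known to be by a single person
    questioned_document: str           # exactly one text of unknown authorship

    def __post_init__(self) -> None:
        # Assumption: at least one known document is always present.
        count = len(self.known_documents)
        if not 1 <= count <= MAX_KNOWN_DOCUMENTS:
            raise ValueError(
                f"expected 1 to {MAX_KNOWN_DOCUMENTS} known documents, got {count}"
            )


example = VerificationProblem(
    problem_id="example-001",
    language="en",
    known_documents=("first known text", "second known text"),
    questioned_document="text of disputed authorship",
)
```

A verification system would take one such problem and decide whether the questioned document was written by the author of the known documents.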
More information: Link
Files
      
        pan14-authorship-verification-test-and-training.zip
        
      
    
    
      
        Files
         (24.8 MB)
        
      
    
    | Name | Size | Download all | 
|---|---|---|
| md5:f7c504fd27f3c6e1f92b526fd15bc942 | 24.8 MB | Preview Download | 
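After downloading the archive, its integrity can be checked against the MD5 value listed above. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_expected` are hypothetical, as is the assumption that the archive was saved under its listed name in the current working directory.

```python
import hashlib
from pathlib import Path

# Checksum as listed in the Files table for this record.
EXPECTED_MD5 = "f7c504fd27f3c6e1f92b526fd15bc942"
# Assumed local path: the archive saved under its listed name.
ARCHIVE_PATH = Path("pan14-authorship-verification-test-and-training.zip")


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest with the expected hex string."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    if ARCHIVE_PATH.exists():
        status = "OK" if matches_expected(ARCHIVE_PATH) else "MISMATCH"
        print(f"{ARCHIVE_PATH}: {status}")
    else:
        print(f"{ARCHIVE_PATH} not found; download it first.")
```

MD5 is adequate here for detecting a corrupted or truncated download, but it is not a defense against deliberate tampering.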
Additional details
References
- Efstathios Stamatatos, Walter Daelemans, Ben Verhoeven, Martin Potthast, Benno Stein, Patrick Juola, Miguel A. Sanchez-Perez, and Alberto Barrón-Cedeño. Overview of the Author Identification Task at PAN 2014. In Linda Cappellato, Nicola Ferro, Martin Halvey, and Wessel Kraaij, editors, Working Notes Papers of the CLEF 2014 Evaluation Labs, September 2014. CEUR-WS.org. ISSN 1613-0073.