Dataset Open Access

Paderborn Genre Analysis Corpus 2012 (PaGA-12)

Baumann, Michael; Lettmann, Theodor; Stein, Benno


Citation Style Language JSON Export

{
  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.3250070", 
  "language": "deu", 
  "title": "Paderborn Genre Analysis Corpus 2012 (PaGA-12)", 
  "issued": {
    "date-parts": [
      [
        2012, 
        1, 
        1
      ]
    ]
  }, 
  "abstract": "<p>The Paderborn Genre Analysis 2012 corpus (PaGA-12) contains 1,639 HTML documents of 26 genres. All documents were collected from 2009-10-18 to 2009-11-20, and each document is manually assigned to exactly one genre. For each genre, the corpus provides at least 50 documents.</p>\n\n<p>All HTML documents contain German text only, and framesets have been removed. The corpus is delivered in the form of a MySQL database dump; the database structure is detailed in a README file delivered with the corpus.</p>", 
  "author": [
    {
      "family": "Baumann, Michael"
    }, 
    {
      "family": "Lettmann, Theodor"
    }, 
    {
      "family": "Stein, Benno"
    }
  ], 
  "type": "dataset", 
  "id": "3250070"
}
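
As a rough illustration of how the export above might be consumed, the following Python sketch parses an abridged copy of the record and assembles a one-line citation. The helper names format_author and format_citation and the citation layout are assumptions made for this example, not an official CSL style or a Zenodo API. The sketch also assumes, as in this record, that the whole name may be stored under "family" with no separate "given" field.

```python
import json

# Abridged copy of the record above; only the fields used below are kept.
CSL_RECORD = """
{
  "publisher": "Zenodo",
  "DOI": "10.5281/zenodo.3250070",
  "title": "Paderborn Genre Analysis Corpus 2012 (PaGA-12)",
  "issued": {"date-parts": [[2012, 1, 1]]},
  "author": [
    {"family": "Baumann, Michael"},
    {"family": "Lettmann, Theodor"},
    {"family": "Stein, Benno"}
  ],
  "type": "dataset"
}
"""


def format_author(name):
    """Return 'Family, Given'; tolerate records that pack both into 'family'."""
    family = name.get("family", "")
    given = name.get("given")
    return f"{family}, {given}" if given else family


def format_citation(record):
    """Build a plain one-line citation (hypothetical layout, not a CSL style)."""
    authors = "; ".join(format_author(a) for a in record["author"])
    # "date-parts" is a list of [year, month, day] lists; take the first year.
    year = record["issued"]["date-parts"][0][0]
    return (
        f"{authors} ({year}). {record['title']} [{record['type']}]. "
        f"{record['publisher']}. https://doi.org/{record['DOI']}"
    )


record = json.loads(CSL_RECORD)
citation = format_citation(record)
print(citation)
```

A full CSL processor would apply a real citation style; this sketch only shows which fields of the export carry the authors, year, title, publisher, and DOI.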
                  All versions   This version
Views             184            183
Downloads         19             19
Data volume       392.8 MB       392.8 MB
Unique views      166            165
Unique downloads  15             15
