Dataset Open Access
The GermaParl Corpus was prepared in the PolMine Project (http://polmine.github.io) and comprises all protocols of plenary sessions of the German Bundestag from 1996 to 2016. This version of the corpus is based on plain text documents issued by the German Bundestag. For a period between 2008 and 2010, plain text (.txt) files are not available; to fill the gap, PDF documents were processed instead. As part of the corpus preparation pipeline, the data was linguistically annotated with the TreeTagger and imported into the Corpus Workbench (CWB). See the GermaParl documentation website (http://polmine.github.io/GermaParl) for further information.
Name | MD5 | Size |
---|---|---|
germaparl_lda_speeches_100.rds | fabd9cf6bae3388b6301cf5dd2b129e0 | 164.6 MB |
germaparl_lda_speeches_150.rds | 985c89ae9c82c3017e82c1765a7ba067 | 182.9 MB |
germaparl_lda_speeches_175.rds | e3fde0215a249425c8c5c42e3fe6d9d5 | 194.1 MB |
germaparl_lda_speeches_200.rds | 0d49cdb6af4090a3fcc7e3a3b3b2ecc8 | 202.2 MB |
germaparl_lda_speeches_225.rds | cee3d0da503e2885ff8446f5d9dfc732 | 210.0 MB |
germaparl_lda_speeches_250.rds | fda8c5623a4715545e0bf7e93b7dcfa9 | 213.4 MB |
germaparl_lda_speeches_275.rds | db1023b94c54f745107a8711f03f602a | 221.6 MB |
germaparl_lda_speeches_300.rds | 74407eae89dcf460a23b12f0596cc895 | 227.4 MB |
germaparl_lda_speeches_350.rds | 3d340900040d1c0ac79588b25db57707 | 230.5 MB |
germaparl_lda_speeches_400.rds | 3ffda6509be6f3dad44c1c802b8f4199 | 244.2 MB |
germaparl_lda_speeches_450.rds | 6d87306d1ed13a92a8a5b51431c8b21b | 254.2 MB |
germaparl_v1.0.6.tar.gz | 4ac55082645ec0b8864c4a62ce9b749b | 958.5 MB |
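
The MD5 checksums listed with each file can be used to confirm that a download arrived intact. The following is a minimal sketch in Python, not part of the GermaParl tooling: the helper names `md5_of_file` and `verify_download` are hypothetical, only two of the listed checksums are copied in for brevity, and the files are assumed to have been saved under their original names. MD5 is used here only as an integrity check against transfer errors.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing (file name -> MD5 hex digest).
# Only two entries are shown; the others can be added the same way.
EXPECTED_MD5 = {
    "germaparl_v1.0.6.tar.gz": "4ac55082645ec0b8864c4a62ce9b749b",
    "germaparl_lda_speeches_100.rds": "fabd9cf6bae3388b6301cf5dd2b129e0",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that large archives do not have to fit into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the checksum
    listed for its file name; raise KeyError for unlisted names."""
    name = Path(path).name
    if name not in expected:
        raise KeyError(f"no checksum listed for {name}")
    return md5_of_file(path) == expected[name]
```

After downloading, for example, `germaparl_v1.0.6.tar.gz` into the working directory, calling `verify_download("germaparl_v1.0.6.tar.gz")` returns `True` only if the local copy matches the published checksum.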
Statistic | All versions | This version |
---|---|---|
Views | 4,141 | 1,852 |
Downloads | 4,162 | 3,835 |
Data volume | 4.0 TB | 3.4 TB |
Unique views | 2,842 | 1,422 |
Unique downloads | 2,247 | 1,995 |