There is a newer version of this record available.

Dataset Open Access

GermaParl Corpus of Plenary Protocols

Blaette, Andreas

The GermaParl Corpus was prepared as part of the PolMine Project (http://polmine.github.io) and comprises all protocols of plenary sessions in the German Bundestag from 1996 to 2016. This version of the corpus is based on plain text documents issued by the German Bundestag. Plain text (txt) files are not available for a period between 2008 and 2010; to fill this gap, PDF documents were processed instead. As part of the corpus preparation pipeline, the data was linguistically annotated (using the TreeTagger) and imported into the Corpus Workbench (CWB). See the GermaParl documentation website (http://polmine.github.io/GermaParl) for further information.

Files (3.3 GB)
Name                              Size        Checksum
germaparl_lda_speeches_100.rds    164.6 MB    md5:fabd9cf6bae3388b6301cf5dd2b129e0
germaparl_lda_speeches_150.rds    182.9 MB    md5:985c89ae9c82c3017e82c1765a7ba067
germaparl_lda_speeches_175.rds    194.1 MB    md5:e3fde0215a249425c8c5c42e3fe6d9d5
germaparl_lda_speeches_200.rds    202.2 MB    md5:0d49cdb6af4090a3fcc7e3a3b3b2ecc8
germaparl_lda_speeches_225.rds    210.0 MB    md5:cee3d0da503e2885ff8446f5d9dfc732
germaparl_lda_speeches_250.rds    213.4 MB    md5:fda8c5623a4715545e0bf7e93b7dcfa9
germaparl_lda_speeches_275.rds    221.6 MB    md5:db1023b94c54f745107a8711f03f602a
germaparl_lda_speeches_300.rds    227.4 MB    md5:74407eae89dcf460a23b12f0596cc895
germaparl_lda_speeches_350.rds    230.5 MB    md5:3d340900040d1c0ac79588b25db57707
germaparl_lda_speeches_400.rds    244.2 MB    md5:3ffda6509be6f3dad44c1c802b8f4199
germaparl_lda_speeches_450.rds    254.2 MB    md5:6d87306d1ed13a92a8a5b51431c8b21b
germaparl_v1.0.6.tar.gz           958.5 MB    md5:4ac55082645ec0b8864c4a62ce9b749b
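
The MD5 checksums in the file listing can be used to confirm that a download arrived intact. The following is a minimal Python sketch, assuming the files were saved locally under their original names. The helper names `md5_of_file` and `verify_download` are hypothetical and not part of any GermaParl or PolMine tooling. Only two of the listed checksums are included for brevity.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing on this record (subset for brevity).
EXPECTED_MD5 = {
    "germaparl_v1.0.6.tar.gz": "4ac55082645ec0b8864c4a62ce9b749b",
    "germaparl_lda_speeches_100.rds": "fabd9cf6bae3388b6301cf5dd2b129e0",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reading avoids loading files of several hundred MB at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the checksum listed for its name."""
    path = Path(path)
    want = expected.get(path.name)
    if want is None:
        raise KeyError(f"no checksum recorded for {path.name}")
    return md5_of_file(path) == want
```

Calling `verify_download("germaparl_v1.0.6.tar.gz")` on a local copy returns `True` only if the digest matches the listed value. A `False` result suggests a truncated or corrupted download.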
                    All versions    This version
Views               4,141           1,852
Downloads           4,162           3,835
Data volume         4.0 TB          3.4 TB
Unique views        2,842           1,422
Unique downloads    2,247           1,995
