There is a newer version of the record available.

Published April 6, 2020 | Version v1.0.6
Dataset Open

GermaParl Corpus of Plenary Protocols

  • 1. University of Duisburg-Essen

Description

The GermaParl Corpus was prepared in the PolMine Project (http://polmine.github.io) and comprises all protocols of plenary sessions in the German Bundestag from 1996 to 2016. This version of the corpus is based on plain text documents issued by the German Bundestag. For a period between 2008 and 2010, TXT files are not available; PDF documents were processed to fill the gap. As part of the corpus preparation pipeline, the data were linguistically annotated using the TreeTagger and imported into the Corpus Workbench (CWB). See the GermaParl documentation website (http://polmine.github.io/GermaParl) for further information.

Files (3.3 GB)

Checksum                                Size
md5:fabd9cf6bae3388b6301cf5dd2b129e0    164.6 MB
md5:985c89ae9c82c3017e82c1765a7ba067    182.9 MB
md5:e3fde0215a249425c8c5c42e3fe6d9d5    194.1 MB
md5:0d49cdb6af4090a3fcc7e3a3b3b2ecc8    202.2 MB
md5:cee3d0da503e2885ff8446f5d9dfc732    210.0 MB
md5:fda8c5623a4715545e0bf7e93b7dcfa9    213.4 MB
md5:db1023b94c54f745107a8711f03f602a    221.6 MB
md5:74407eae89dcf460a23b12f0596cc895    227.4 MB
md5:3d340900040d1c0ac79588b25db57707    230.5 MB
md5:3ffda6509be6f3dad44c1c802b8f4199    244.2 MB
md5:6d87306d1ed13a92a8a5b51431c8b21b    254.2 MB
md5:4ac55082645ec0b8864c4a62ce9b749b    958.5 MB
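
The MD5 checksums listed for the files can be used to confirm that a download completed without corruption. The following is a minimal sketch of such a check, not part of the record itself. The function names and the file name `germaparl_part.tar.gz` are hypothetical, since the record as shown here does not list file names. Only Python's standard `hashlib` module is assumed.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that large archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Hypothetical local file name; substitute the file actually downloaded
    # and the checksum listed next to it.
    local_file = "germaparl_part.tar.gz"
    listed = "md5:fabd9cf6bae3388b6301cf5dd2b129e0"
    if os.path.exists(local_file):
        print("checksum ok" if matches_checksum(local_file, listed) else "checksum MISMATCH")
    else:
        print(f"{local_file} not found; nothing to verify")
```

A mismatch usually means the download was truncated or altered in transit, in which case the file should be fetched again before it is unpacked.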