Published November 26, 2017 | Version v1
Dataset Open

EuroparlExtract - Comparable Corpora Extracted from the European Parliament Proceedings Parallel Corpus

  • 1. University of Innsbruck

Description

This dataset contains comparable translational corpora extracted from the European Parliament Proceedings Parallel Corpus (Europarl) v7, created by Philipp Koehn (see http://www.statmt.org/europarl/). The corpora were extracted with the EuroparlExtract corpus processing toolkit by Michael Ustaszewski (2017). EuroparlExtract is freely available under the MIT License (see https://github.com/mustaszewski/europarl-extract).

Files

Files (1.5 GB)

Size: 1.5 GB
md5: 9008f27f03d745b83c7c671c04010dfa
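
Because the download is large, it may be worth checking the file against the MD5 checksum listed above before unpacking it. The following is a minimal sketch, not part of the dataset or the EuroparlExtract toolkit. The helper names `md5_of_file` and `matches_published_checksum` are hypothetical, and the archive's file name is not given in this record, so the path must be supplied by the user.

```python
import hashlib

# Checksum as published in this record.
EXPECTED_MD5 = "9008f27f03d745b83c7c671c04010dfa"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that a 1.5 GB archive does not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_published_checksum` with the local path of the downloaded archive returns `True` when the download is intact. A `False` result suggests a truncated or corrupted download.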