Published November 26, 2017 | Version v1
Dataset
Open
EuroparlExtract - Comparable Corpora Extracted from the European Parliament Proceedings Parallel Corpus
Description
This dataset contains comparable translational corpora extracted from the European Parliament Proceedings Parallel Corpus (Europarl) v7, created by Philipp Koehn (see http://www.statmt.org/europarl/). The corpora were extracted with the EuroparlExtract corpus processing toolkit by Michael Ustaszewski (2017). EuroparlExtract is freely available under the MIT License (see https://github.com/mustaszewski/europarl-extract).
Files
(1.5 GB)
| Name | Size |
|---|---|
| md5:9008f27f03d745b83c7c671c04010dfa | 1.5 GB |
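The MD5 checksum listed above can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `matches_expected` are illustrative, and the local file path must be supplied by the user, since the archive name is not given in this record.

```python
import hashlib

# Checksum as published in the file listing of this record.
EXPECTED_MD5 = "9008f27f03d745b83c7c671c04010dfa"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so a 1.5 GB archive is never fully loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

Calling `matches_expected("path/to/downloaded_archive")` with the actual location of the downloaded file (the path shown here is a placeholder) returns `True` only if the digest equals the published value. A mismatch usually indicates an incomplete or corrupted download.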