Published July 1, 2019 | Version v1
Book chapter Open

Extracting and aligning multiword expressions from parallel corpora

Description

Bilingual lexicons of multiword expressions play a vital role in several natural language processing applications, such as machine translation and cross-language information retrieval, because they often characterize domain-specific vocabularies. Word alignment approaches are generally used to construct bilingual lexicons automatically from parallel corpora. In this chapter, we present three approaches to aligning multiword expressions from parallel corpora. We evaluate the bilingual lexicons produced by these approaches using two methods: a manual evaluation of the alignment quality, and an evaluation of the impact of this alignment on the translation quality of the phrase-based statistical machine translation system Moses. Our experiments show that integrating the bilingual lexicons of multiword expressions into the translation model improves the performance of Moses.

Files

10.pdf (554.7 kB)
md5:182df7718a51e5d01566336c9246db92
