Extracting and aligning multiword expressions from parallel corpora
Description
Bilingual lexicons of multiword expressions play a vital role in several natural language processing applications, such as machine translation and cross-language information retrieval, because they often characterize domain-specific vocabularies. Word alignment approaches are generally used to construct such lexicons automatically from parallel corpora. In this chapter, we present three approaches for aligning multiword expressions in parallel corpora. We evaluate the bilingual lexicons these approaches produce in two ways: a manual evaluation of alignment quality, and an evaluation of the impact of the alignments on the translation quality of Moses, a phrase-based statistical machine translation system. Our experiments show that integrating the bilingual lexicons of multiword expressions into the translation model improves the performance of Moses.
Files
Name | Size |
---|---|
10.pdf (md5:182df7718a51e5d01566336c9246db92) | 554.7 kB |
Additional details
Related works
- Has part
- 10.5281/zenodo.2579017 (DOI)