Conference paper Open Access

Multilingual Semantic Relatedness using lightweight machine translation

Barzegar, Siamak; Davis, Brian; Handschuh, Siegfried; Freitas, André

Distributional semantic models depend strongly on the size and quality of their reference corpora, which embed the commonsense knowledge necessary to build comprehensive models. While high-quality texts containing large-scale commonsense information, such as Wikipedia, are available in English, other languages may lack sufficient textual support to build distributional models. This paper proposes combining a lightweight (sloppy) machine translation (MT) model with an English Distributional Semantic Model (DSM) to provide higher-quality word vectors for languages other than English. Results show that the lightweight MT model yields significant improvements over language-specific distributional models. Additionally, the lightweight MT model outperforms more complex MT methods on the task of word-pair translation.
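
The pipeline described in the abstract (translate each word into English, then measure relatedness in an English vector space) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy dictionary stands in for the lightweight MT model, the three-dimensional vectors stand in for an English DSM, and cosine similarity is assumed as the relatedness measure. None of the names or values come from the paper.

```python
import math

# Hypothetical toy bilingual dictionary standing in for the lightweight MT model.
TOY_DICTIONARY = {"hund": "dog", "katze": "cat", "auto": "car"}

# Hypothetical toy vectors standing in for an English DSM.
TOY_ENGLISH_VECTORS = {
    "dog": [0.9, 0.8, 0.1],
    "cat": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}


def translate_word(word):
    """Word-level translation by dictionary lookup (illustrative stand-in for MT)."""
    return TOY_DICTIONARY[word.lower()]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def relatedness(word_a, word_b):
    """Translate both words into English, then compare their English vectors."""
    vec_a = TOY_ENGLISH_VECTORS[translate_word(word_a)]
    vec_b = TOY_ENGLISH_VECTORS[translate_word(word_b)]
    return cosine(vec_a, vec_b)
```

With these toy values, `relatedness("Hund", "Katze")` scores higher than `relatedness("Hund", "Auto")`, which is the kind of ranking a word-pair relatedness evaluation would check. A real system would replace the dictionary with an MT model and the toy vectors with vectors trained on a large English corpus.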

Files (463.2 kB)
2018-Accepted-ICSC-MultiLingual.pdf (463.2 kB, md5:2ff068318a232cd5e50aa9b2930fae4c)