Improving Wordnets for Under-Resourced Languages Using Machine Translation information

doi:10.5281/zenodo.2599952

Published January 12, 2018 | Version v1

Conference paper Open

1. Insight Centre for Data Analytics, National University of Ireland Galway

Wordnets are extensively used in natural language processing, but current approaches to manually building a wordnet from scratch involve large research groups working over long periods of time, resources that are typically not available for under-resourced languages. Even if wordnet-like resources are available for under-resourced languages, they are often not easily accessible, which can alter the results of applications using these resources. Our proposed method uses the expand approach to improve and generate wordnets with the help of machine translation. We apply our method to improve and extend wordnets for the Dravidian languages Tamil, Telugu, and Kannada, which are severely under-resourced. We report evaluation results for the generated wordnet senses in terms of precision for these languages. In addition, we carried out a manual evaluation of the translations for Tamil, which demonstrates that our approach can aid in improving wordnet resources for under-resourced Dravidian languages.

Files

Name	Size	Checksum
chakravarthi2018improving.pdf	244.9 kB	md5:85d25ce74378e4ceda641665839272b7

Additional details

ELEXIS – European Lexicographic Infrastructure 731015: European Commission

	All versions	This version
Views	100	100
Downloads	63	63
Data volume	16.2 MB	16.2 MB
