Published February 9, 2017 | Version v1
Book chapter Open

Enriching Slovene wordnet with domain-specific terms

  • 1. Dept. of Translation, Faculty of Arts, University of Ljubljana

Description

The paper describes an innovative approach to expanding the domain coverage of the Slovene wordnet (sloWNet) by exploiting multiple resources. In the experiment described here, we use a large monolingual Slovene corpus of texts from the domain of informatics as the source of terminology, and a parallel English-Slovene corpus and an online dictionary as bilingual resources to facilitate the mapping of terms to sloWNet. We first identify the core terms of the domain in English using Princeton University's WordNet 2.1 and translate them into Slovene using a bilingual lexicon produced from the parallel corpus. In the next step, we extract multi-word terms from the Slovene domain-specific corpus using a hybrid approach, and finally we match the term candidates to existing wordnet synsets. The proposed method appears to be a successful way to improve the domain coverage of the wordnet, as it yields abundant term candidates and exploits various multilingual resources.
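The translate-then-match steps in the description can be pictured with a minimal sketch. It is not the chapter's implementation: the synset identifiers, the lexicon entries, the term candidates, and the function names `translate_synsets` and `match_candidates` are all hypothetical, and the matching rule (a candidate is attached to a synset if any of its words is a known translation of one of that synset's literals) is an assumption chosen for brevity. Matching is done on exact surface forms, so a real system for an inflected language such as Slovene would presumably need lemmatization first.

```python
from collections import defaultdict

# Toy stand-ins for wordnet synsets; the ids and literals are hypothetical.
SYNSETS = {
    "eng-computer-n": ["computer", "computing machine"],
    "eng-network-n": ["network"],
}

# Hypothetical English-to-Slovene lexicon, of the kind that could be
# derived from a parallel corpus. Entries are illustrative only.
LEXICON = {
    "computer": ["računalnik"],
    "network": ["omrežje", "mreža"],
}


def translate_synsets(synsets, lexicon):
    """Map each Slovene translation to the synset ids it may express."""
    slovene_to_synsets = defaultdict(set)
    for synset_id, literals in synsets.items():
        for literal in literals:
            for translation in lexicon.get(literal, []):
                slovene_to_synsets[translation].add(synset_id)
    return slovene_to_synsets


def match_candidates(candidates, slovene_to_synsets):
    """Attach each multi-word candidate to the synsets of any word it contains.

    Assumption: exact surface-form matching, no lemmatization.
    """
    matches = {}
    for term in candidates:
        hits = set()
        for word in term.split():
            hits |= slovene_to_synsets.get(word, set())
        if hits:
            matches[term] = sorted(hits)
    return matches


index = translate_synsets(SYNSETS, LEXICON)
# Hypothetical multi-word term candidates from a domain corpus.
candidates = ["računalniško omrežje", "lokalno omrežje", "trdi disk"]
result = match_candidates(candidates, index)
print(result)
```

In this toy run, the two candidates containing "omrežje" are attached to the network synset, while "trdi disk" stays unmatched because none of its words appears in the lexicon.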

Files

3.pdf (144.1 kB)
md5:800cd1714b455f378392eaf6ab1626a7

Additional details

Related works

Is part of
10.5281/zenodo.283376 (DOI)