Published November 24, 2020 | Version v1
Conference paper Open

Interlinking Slovene Language Datasets

  • 1. ACDH-CH

Description

We present the current state of our work on interlinking the language data and linguistic information contained in different types of Slovene language resources. The resources we currently deal with are a lexical database (which also contains collocations and example sentences), a morphological lexicon, and the Slovene WordNet. We first transform the encoding of the original data into the OntoLex-Lemon model and map the different descriptors used in the original sources onto the LexInfo vocabulary. This harmonization step enables the interlinking of the various types of information included in the different resources by means of relations defined in OntoLex-Lemon. As a result, we obtain a partial merging of the information that was originally distributed over different resources, which leads to a cross-enrichment of the original data sources. The final goal of the presented work is to publish the linked and merged Slovene linguistic datasets in the Linguistic Linked Open Data cloud.
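
The two steps described above (harmonizing descriptors onto LexInfo, then interlinking through OntoLex-Lemon relations) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The base IRI, the source tags (`samostalnik`, `glagol`, `pridevnik`), the record layout, the synset identifier and the lemma-plus-part-of-speech matching rule are all assumptions made for illustration. Only the OntoLex-Lemon and LexInfo terms (`LexicalEntry`, `Form`, `LexicalConcept`, `canonicalForm`, `writtenRep`, `evokes`, `partOfSpeech`) are taken from the published vocabularies.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ONTOLEX = "http://www.w3.org/ns/lemon/ontolex#"
LEXINFO = "http://www.lexinfo.net/ontology/2.0/lexinfo#"
BASE = "http://example.org/sl/"  # hypothetical base IRI, not from the paper

# Hypothetical source descriptors mapped onto LexInfo part-of-speech values.
SOURCE_POS_TO_LEXINFO = {
    "samostalnik": "noun",
    "glagol": "verb",
    "pridevnik": "adjective",
}


def entry_iri(lemma, lexinfo_pos):
    """Build one IRI per (lemma, part of speech) pair."""
    return f"{BASE}entry/{lemma}_{lexinfo_pos}"


def harmonize(records):
    """Turn source lexicon records into OntoLex-Lemon style triples."""
    triples = set()
    for rec in records:
        pos = SOURCE_POS_TO_LEXINFO[rec["pos"]]
        entry = entry_iri(rec["lemma"], pos)
        form = entry + "/lemma"
        triples |= {
            (entry, RDF_TYPE, ONTOLEX + "LexicalEntry"),
            (entry, LEXINFO + "partOfSpeech", LEXINFO + pos),
            (entry, ONTOLEX + "canonicalForm", form),
            (form, RDF_TYPE, ONTOLEX + "Form"),
            (form, ONTOLEX + "writtenRep", f'"{rec["lemma"]}"@sl'),
        }
    return triples


def interlink(triples, synsets):
    """Link entries to synsets when lemma and part of speech both match.

    This matching rule is a simplifying assumption for the sketch.
    """
    links = set()
    for synset_id, syn in synsets.items():
        concept = f"{BASE}synset/{synset_id}"
        for lemma in syn["lemmas"]:
            entry = entry_iri(lemma, syn["pos"])
            if (entry, RDF_TYPE, ONTOLEX + "LexicalEntry") in triples:
                links.add((concept, RDF_TYPE, ONTOLEX + "LexicalConcept"))
                links.add((entry, ONTOLEX + "evokes", concept))
    return links


# Toy data standing in for the lexicon and WordNet sources.
lexicon = [
    {"lemma": "miza", "pos": "samostalnik"},
    {"lemma": "pisati", "pos": "glagol"},
]
synsets = {"syn-001": {"pos": "noun", "lemmas": ["miza"]}}  # hypothetical id

graph = harmonize(lexicon)
links = interlink(graph, synsets)
```

Because both sources end up keyed by the same harmonized part-of-speech values, the synset for `miza` can be attached to the matching lexical entry, while `pisati` stays unlinked because no synset in the toy data mentions it.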

Files

EURALEX2020_ProceedingsBook-p073-080.pdf (686.8 kB)
md5:bee7b09b47c3295009a06a521dcf88a7

Additional details

Funding

ELEXIS – European Lexicographic Infrastructure (grant no. 731015)
European Commission