Published May 11, 2020 | Version v1
Conference paper (open access)

Some Issues with Building a Multilingual Wordnet

  • 1. Nanyang Technological University
  • 2. National University of Ireland Galway
  • 3. Tallinn University of Technology

Description

In this paper we discuss the experience of bringing together over 40 different wordnets. We introduce some extensions to the GWA wordnet LMF format proposed in Vossen et al. (2016) and look at how this new information can be displayed. Notable extensions include confidence, corpus frequency, orthographic variants, lexicalized and non-lexicalized synsets and lemmas, and new parts of speech, among others. Many of these extensions already exist in multiple wordnets; the challenge was to find a compatible representation. To this end, we introduce a new version of the Open Multilingual Wordnet (Bond and Foster, 2013), which integrates a new set of tools that test the extensions introduced by this new format while also ensuring the integrity of the Collaborative Interlingual Index (CILI: Bond et al., 2016), so that the same new concept is not introduced through multiple projects.

Files

bond2020issues.pdf

Files (499.7 kB)

md5:e949d490a787190e211ccf224a216189

Additional details

Funding

ELEXIS – European Lexicographic Infrastructure (grant no. 731015)
European Commission