Published April 19, 2024 | Version v1
Book chapter Open

Representation of multiword expressions in the Bulgarian integrated lexicon for language technology

Description

The chapter introduces a representation model for multiword expressions from the perspective of integrated lexicons for Bulgarian. The lexicons considered are an inflectional lexicon, a valency lexicon, and a wordnet. We created a joint representation entry that incorporates morphology, valency potential, and lexical semantics through synonym sets. The mechanism selected for presenting all this information is catena-based, since the catena is tree-based and allows for better modeling of idiosyncratic elements. In addition, we propose a general typology of multiword expressions that focuses on fixedness and (dis)continuity. We believe that providing a unified representation of multiword expressions and common lexica would improve the performance of various natural language processing applications.

Files

440-GiouliBarbuMititelu-2024-4.pdf

Files (224.4 kB)

md5:a5fecebb24ac408dcd2502f5099fa912
224.4 kB

Additional details

Related works

Is part of
978-3-96110-470-3 (ISBN)
10.5281/zenodo.10949960 (DOI)