Published March 19, 2023 | Version v1
Conference paper Open

Methodological issues regarding the semi-automatic UD treebank creation of under-resourced languages: the case of Pomak

  • 1. Athena-Research and Innovation Center in Information, Communication and Knowledge Technologies

Description

Pomak is an endangered oral Slavic language of Thrace/Greece. We present a short description of its interesting morphological and syntactic features in the UD framework. Because the morphological annotation of the treebank takes advantage of existing resources, it requires a different methodological approach from the one adopted for syntactic annotation that has started from scratch. It also requires the option of obtaining morphological predictions/evaluation separately from the syntactic ones with state-of-the-art NLP tools. Active annotation is applied in various settings in order to identify the best model that would facilitate the ongoing syntactic annotation.

Files

GURT_Workshop-τελικό.pdf (160.2 kB)
md5:d625eb6a41f8deba340fd40ef121b049