Published March 19, 2023 | Version v1
Conference paper Open

Methodological issues regarding the semi-automatic UD treebank creation of under-resourced languages: the case of Pomak

  • 1. Athena-Research and Innovation Center in Information, Communication and Knowledge Technologies


Pomak is an endangered oral Slavic language of Thrace/Greece. We present a short description of its interesting morphological and syntactic features in the UD framework. Because the morphological annotation of the treebank takes advantage of existing resources, it requires a different methodological approach from the one adopted for syntactic annotation that has started from scratch. It also requires the option of obtaining morphological predictions/evaluation separately from the syntactic ones with state-of-the-art NLP tools. Active annotation is applied in various settings in order to identify the best model that would facilitate the ongoing syntactic annotation.


