Methodological issues regarding the semi-automatic UD treebank creation of under-resourced languages: the case of Pomak
Creators
- 1. Athena-Research and Innovation Center in Information, Communication and Knowledge Technologies
Description
Pomak is an endangered oral Slavic language of Thrace/Greece. We present a short description of its interesting morphological and syntactic features in the UD framework. Because the morphological annotation of the treebank takes advantage of existing resources, it requires a different methodological approach from the one adopted for syntactic annotation that has started from scratch. It also requires the option of obtaining morphological predictions/evaluation separately from the syntactic ones with state-of-the-art NLP tools. Active annotation is applied in various settings in order to identify the best model that would facilitate the ongoing syntactic annotation.
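To illustrate what scoring morphology separately from syntax can look like, here is a minimal Python sketch that compares a gold and a predicted CoNLL-U file and reports tagging metrics (UPOS, UFeats) apart from parsing metrics (UAS, LAS). It is not the evaluation procedure used for the Pomak treebank. The function names `read_conllu` and `score`, the helper `row`, and the toy tokens `w1` to `w3` are hypothetical placeholders, not Pomak data. The sketch assumes identical tokenisation in both files and compares the full DEPREL string, which is simpler than what the official CoNLL 2018 shared task evaluation script does.

```python
from typing import Dict, List

# Indices of the ten standard CoNLL-U columns.
ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC = range(10)


def read_conllu(text: str) -> List[List[str]]:
    """Return one 10-column row per regular token.

    Comments, blank lines, multiword-token ranges (e.g. 1-2) and
    empty nodes (e.g. 1.1) are skipped.
    """
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10 or not cols[ID].isdigit():
            continue
        rows.append(cols)
    return rows


def score(gold_text: str, pred_text: str) -> Dict[str, float]:
    """Token-level accuracy, reported per annotation layer."""
    gold, pred = read_conllu(gold_text), read_conllu(pred_text)
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and predicted files must have the same tokens")

    def acc(columns) -> float:
        # A token counts as correct only if all listed columns match.
        hits = sum(all(g[c] == p[c] for c in columns)
                   for g, p in zip(gold, pred))
        return hits / len(gold)

    return {
        "UPOS": acc((UPOS,)),         # morphology: part of speech
        "UFeats": acc((FEATS,)),      # morphology: feature bundle
        "UAS": acc((HEAD,)),          # syntax: head attachment only
        "LAS": acc((HEAD, DEPREL)),   # syntax: head plus relation label
    }


def row(*cols: str) -> str:
    """Join column values into one tab-separated CoNLL-U line."""
    return "\t".join(cols)


# Invented three-token sentence; the forms are placeholders.
GOLD = "\n".join([
    "# sent_id = toy-1",
    row("1", "w1", "w1", "NOUN", "_", "Case=Nom", "2", "nsubj", "_", "_"),
    row("2", "w2", "w2", "VERB", "_", "_", "0", "root", "_", "_"),
    row("3", "w3", "w3", "NOUN", "_", "Case=Acc", "2", "obj", "_", "_"),
])
# Prediction with one wrong feature value and one wrong relation label.
PRED = "\n".join([
    "# sent_id = toy-1",
    row("1", "w1", "w1", "NOUN", "_", "Case=Nom", "2", "nsubj", "_", "_"),
    row("2", "w2", "w2", "VERB", "_", "_", "0", "root", "_", "_"),
    row("3", "w3", "w3", "NOUN", "_", "Case=Nom", "2", "nmod", "_", "_"),
])

if __name__ == "__main__":
    for metric, value in score(GOLD, PRED).items():
        print(f"{metric}: {value:.3f}")
```

Keeping the two layers in separate entries of the result makes it possible to judge a tagger trained on existing morphological resources without that judgment being tied to the quality of a parser trained on a still-growing syntactic annotation.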
Files
Name | Size | Checksum |
---|---|---|
GURT_Workshop-τελικό.pdf | 160.2 kB | md5:d625eb6a41f8deba340fd40ef121b049 |