Published October 31, 2023 | Version v1
Book chapter Open

IDION-POMAK: resource of idiomatic Pomak for the human user and NLP

  • 1. Athena-Research and Innovation Center in Information, Communication and Knowledge Technologies
  • 2. Democritus University of Thrace


This article presents IDION-Pomak, a resource of verb multiword expressions (hereinafter: VMWEs). Pomak is an endangered, non-standardised language variety of the East South Slavic dialect continuum; its morphosyntactic features are outlined. Information on Pomak VMWEs has been collected via fieldwork. IDION is a web-based environment for the documentation of a wide range of VMWE properties. For Pomak VMWEs, the following information is encoded: lemma form of the VMWE, variants (if attested), definition in Pomak and translation in other languages, gloss, usage examples for 60 VMWEs, morphosyntactic analysis in the Universal Dependencies framework, as well as synonymous, opposite and causative/inchoative VMWEs and other verb alternations (if attested). The article also reports observations on Pomak VMWEs that are not encoded in IDION-Pomak; these concern the types of VMWEs found in the data (light verb constructions, idioms) and the occurrence of very similar VMWEs in Modern Greek. The contents of IDION-Pomak are openly available; they belong to a set of resources of Pomak (corpus, morphological and syntactic models, embeddings, lexica) that have been developed as a case study of the Philotis project, which provides technological support for the documentation of living languages.



