Published November 1, 2018 | Version v1
Conference paper Open

Challenges in Converting the Index Thomisticus Treebank into Universal Dependencies

  • 1. Università Cattolica del Sacro Cuore
  • 2. Università di Pavia
  • 3. Univerzita Karlova: Praha

Description

This paper describes the changes applied to the original process used to convert the Index Thomisticus Treebank, a corpus including texts in Medieval Latin by Thomas Aquinas, into the annotation style of Universal Dependencies. The changes are made both to harmonise the Universal Dependencies version of the Index Thomisticus Treebank with the two other available Latin treebanks and to fix errors and inconsistencies resulting from the original process. The paper details the treatment of different issues in PoS tagging, lemmatisation and assignment of dependency relations. Finally, it assesses the quality of the new conversion process by providing an evaluation against a gold standard.

Files

2018_Cecchini-Passarotti-et-alii_UDW18.pdf (239.3 kB)
md5:43f44e45ee99bf61d67156ad43dd3b5b

Additional details

Related works

Is part of
978-1-948087-78-0 (ISBN)

Funding

LiLa – Linking Latin. Building a Knowledge Base of Linguistic Resources for Latin (grant no. 769994)
European Commission