Typological Approach to Improve Dependency Parsing for Croatian Language
Creators
- 1. Faculty of Humanities and Social Sciences, University of Zagreb
Description
This article presents experiments with typological approaches based on syntactic structure, with the aim of identifying similar languages that can be combined with Croatian to improve the unlabeled and labeled attachment scores (UAS and LAS) when using a deep learning tool. Of the eight selected languages, drawn from different linguistic families and genera, we show that Slovene and Irish are the best candidates: both significantly improve dependency parsing results. Slovak is the only language that shows negative synergy when combined with Croatian. Neither typological approach presented in this study explains the synergy observed when the languages are combined: not the quantitative data on context-free grammar rules extracted from corpora with the Marsagram tool, nor the syntactic features from lang2vec language vectors. The traditional genealogical classification likewise explains neither the improvement provided by Irish nor the negative impact of Slovak on both metrics.
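For readers unfamiliar with the two metrics named in the description, the following is a minimal sketch of how UAS and LAS are conventionally computed. It is not the evaluation code used in the article. The function name `attachment_scores`, the token representation, and the toy sentence are all assumptions made for illustration. UAS counts tokens whose predicted head is correct; LAS additionally requires the dependency label to match.

```python
from typing import List, Tuple

# Hypothetical token representation: (head index, dependency label).
# Head index 0 denotes the root of the sentence.
Token = Tuple[int, str]


def attachment_scores(gold: List[Token], pred: List[Token]) -> Tuple[float, float]:
    """Return (UAS, LAS) for one or more sentences given as flat token lists."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must have the same length")
    if not gold:
        raise ValueError("cannot score an empty analysis")
    # UAS: the predicted head matches the gold head.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS: both the head and the dependency label match.
    labeled_hits = sum(g == p for g, p in zip(gold, pred))
    total = len(gold)
    return head_hits / total, labeled_hits / total


# Toy example (invented data): four tokens, one label error, one head error.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (3, "punct")]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")
```

In the toy example, three of four heads are correct and two of four tokens have both the correct head and label, so LAS can never exceed UAS.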
Files
Name | Size |
---|---|
2021.tlt-1.1.pdf (md5:0110575d092ec4538669a053b7b9c859) | 108.6 kB |