Cross-lingual Inflection as a Data Augmentation Method for Parsing
Authors/Creators
Abstract (English)
We propose a morphology-based method for low-resource (LR) dependency parsing. We train a morphological inflector for target LR languages, and apply it to related rich-resource (RR) treebanks to create cross-lingual (x-inflected) treebanks that resemble the target LR language. We use such inflected treebanks to train parsers in zero- (training on x-inflected treebanks) and few-shot (training on x-inflected and target language treebanks) setups. The results show that the method sometimes improves the baselines, but not consistently.
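The x-inflection step described in the abstract can be pictured with a minimal sketch. It assumes that the treebanks are in CoNLL-U format and that x-inflection rewrites only the FORM column while leaving the dependency annotation untouched; the abstract states neither. The names `x_inflect_conllu` and `toy_inflector` are hypothetical, and the toy lookup table merely stands in for a trained morphological inflector.

```python
from typing import Callable, Optional

# Hypothetical inflector interface (an assumption, not the paper's API):
# (lemma, UD feature string) -> inflected form in the target language, or None.
Inflector = Callable[[str, str], Optional[str]]


def x_inflect_conllu(conllu: str, inflect: Inflector) -> str:
    """Replace the FORM column of each word line with the inflector's output."""
    out = []
    for line in conllu.split("\n"):
        cols = line.split("\t")
        # Keep comments, blank lines, and multiword-token or empty-node rows as they are.
        if line.startswith("#") or len(cols) != 10 or not cols[0].isdigit():
            out.append(line)
            continue
        new_form = inflect(cols[2], cols[5])  # LEMMA and FEATS columns
        if new_form:
            cols[1] = new_form  # overwrite FORM; HEAD and DEPREL stay unchanged
        out.append("\t".join(cols))
    return "\n".join(out)


def toy_inflector(lemma: str, feats: str) -> Optional[str]:
    # Placeholder table with an invented form; a real setup would query a trained model.
    table = {("casa", "Number=Plur"): "casa-X"}
    return table.get((lemma, feats))


sample = "\n".join([
    "# text = las casas",
    "1\tlas\tel\tDET\t_\tDefinite=Def|Number=Plur\t2\tdet\t_\t_",
    "2\tcasas\tcasa\tNOUN\t_\tNumber=Plur\t0\troot\t_\t_",
])

x_inflected = x_inflect_conllu(sample, toy_inflector)
print(x_inflected)
```

Tokens for which the inflector returns nothing keep their original form, so the output remains a valid treebank with the same trees as the input.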
Other (English)
This work is supported by a 2020 Leonardo Grant for Researchers and Cultural Creators from the FBBVA, as well as by the European Research Council (ERC), under the European Union’s Horizon 2020 research and innovation programme (FASTPARSE, grant agreement No 714150). The work is also supported by ERDF/MICINN-AEI (SCANNER-UDC, PID2020-113230RB-C21), by Xunta de Galicia (ED431C 2020/11), and by Centro de Investigación de Galicia “CITIC”, which is funded by Xunta de Galicia, Spain, and the European Union (ERDF - Galicia 2014–2020 Program) through grant ED431G 2019/01.
Files
| Name | Size |
|---|---|
| MuñozOrtiz_2022_Cross_lingual_inflection_data_augmentation_method_parsing.pdf (md5:62c8cee738a7a48697a0165b3795a830) | 198.5 kB |
Additional details
Identifiers
- Handle: 2183/36647