Synergistic Effects of Vocabulary Augmentation and Script Transliteration on POS Tagging Accuracy in Low-Resource Languages
Description
Pretrained multilingual language models have become a common tool for transferring NLP capabilities to low-resource languages, often with adaptations. In this work, we study the performance, extensibility, and interaction of two such adaptations: vocabulary augmentation and script transliteration. Our evaluations on part-of-speech tagging, universal dependency parsing, and named entity recognition in nine diverse low-resource languages uphold the viability of these approaches while raising new questions about how to optimally adapt multilingual models to low-resource settings.
Research goal: Does combining vocabulary augmentation with script transliteration yield synergistic improvements in part-of-speech tagging accuracy for low-resource languages over using either adaptation alone?
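One possible way to make "synergistic" concrete is to compare the combined setting against both the best single adaptation and an additive expectation. The sketch below is an assumption for illustration only: the function name `synergy_gain`, its arguments, and the two-part definition are hypothetical and are not taken from the paper or the report.

```python
def synergy_gain(baseline, vocab_only, translit_only, combined):
    """Compare a combined adaptation against each single adaptation.

    All arguments are accuracies on the same test set (hypothetical inputs).
    Returns a tuple (gain_over_best_single, interaction):
      - gain_over_best_single: combined minus the better of the two
        single-adaptation accuracies.
      - interaction: combined minus the accuracy expected if the two
        individual gains over the baseline simply added up.
    """
    best_single = max(vocab_only, translit_only)
    gain_over_best_single = combined - best_single

    # Additive expectation: baseline plus each adaptation's individual gain.
    expected_additive = (
        baseline + (vocab_only - baseline) + (translit_only - baseline)
    )
    interaction = combined - expected_additive
    return gain_over_best_single, interaction


# Made-up placeholder accuracies, NOT results from the paper.
gain, interaction = synergy_gain(
    baseline=0.5, vocab_only=0.625, translit_only=0.75, combined=1.0
)
```

Under this reading, a positive `gain` means the combination beats either adaptation alone, and a positive `interaction` means the combination exceeds what the two individual gains would predict if they were independent. Other definitions of synergy are equally defensible.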
Autonomous synthesis report generated by Assignee Research. Tribunal consensus score: 7.5/10.
Notes
Files
| Name | Size | MD5 |
|---|---|---|
| paper.pdf | 82.2 kB | a20dcdb29e7497fdf173d3062aee3797 |