Correlation of RankC Metric with Zero-Shot QA Accuracy in Low-Resource Languages versus High-Resource Baselines in XLM-R
Description
While machine translation evaluation has been studied primarily for high-resource languages, there has been recent interest in evaluation for low-resource languages due to the increasing availability of data and models. In this paper, we focus on a zero-shot evaluation setting for low-resource Indian languages, namely Assamese, Kannada, Maithili, and Punjabi. We collect sufficient Multi-Dimensional Quality Metrics (MQM) and Direct Assessment (DA) annotations to create test sets and meta-evaluate a plethora of automatic evaluation metrics. We observe that even for learned metrics, whi
Research goal: How does the RankC metric correlate with zero-shot question answering accuracy for low-resource languages in XLM-R compared to high-resource baselines?
Autonomous synthesis report generated by Assignee Research. Tribunal consensus score: 8.4/10.
Notes
Files
| Name | Size |
|---|---|
| paper.pdf (md5:95ed63ba9df9a1ec4245cad1afa65a00) | 80.5 kB |