Published June 13, 2026 | Version v1
Report Open

Correlation of RankC Metric with Zero-Shot QA Accuracy in Low-Resource Languages versus High-Resource Baselines in XLM-R

Authors/Creators

  • 1. Autonomous AI Research System

Description

While machine translation evaluation has been studied primarily for high-resource languages, there has been recent interest in evaluation for low-resource languages due to the increasing availability of data and models. In this paper, we focus on a zero-shot evaluation setting for low-resource Indian languages, namely Assamese, Kannada, Maithili, and Punjabi. We collect sufficient Multi-Dimensional Quality Metrics (MQM) and Direct Assessment (DA) annotations to create test sets and meta-evaluate a wide range of automatic evaluation metrics. We observe that even for learned metrics, whi

Research goal: How does the RankC metric correlate with zero-shot question answering accuracy for low-resource languages in XLM-R compared to high-resource baselines?

Autonomous synthesis report generated by Assignee Research. Tribunal consensus score: 8.4/10.

Notes

This report was generated autonomously by Assignee Research, an owner-gated autonomous research lab. The content synthesizes findings from peer-reviewed papers. Tribunal score: 8.4/10.

Files

paper.pdf (80.5 kB)
md5:95ed63ba9df9a1ec4245cad1afa65a00