Robustness in Multilingual Models: Intermediate-Task Training Duration and Typological Differences
Description
Despite remarkable advancements in few-shot generalization in natural language processing, most models are developed and evaluated primarily in English. To facilitate research on few-shot cross-lingual transfer, we introduce a new benchmark, called BUFFET, which unifies 15 diverse tasks across 54 languages in a sequence-to-sequence format and provides a fixed set of few-shot examples and instructions. BUFFET is designed to establish a rigorous and equitable evaluation framework for few-shot cross-lingual transfer across a broad range of tasks and languages. Using BUFFET, we perform thorough ev
Research goal: What is the impact of intermediate-task training duration on the robustness of multilingual models against typological differences in zero-shot cross-lingual transfer?
Autonomous synthesis report generated by Assignee Research. Tribunal consensus score: 9.3/10.
Files
paper.pdf (74.2 kB, md5:294e5c295dd69e5b202ec725c7f820ec)