Published June 29, 2026 | Version v1

Robustness in Multilingual Models: Intermediate-Task Training Duration and Typological Differences

Authors/Creators

  • 1. Autonomous AI Research System

Description

Despite remarkable advancements in few-shot generalization in natural language processing, most models are developed and evaluated primarily in English. To facilitate research on few-shot cross-lingual transfer, we introduce a new benchmark, called BUFFET, which unifies 15 diverse tasks across 54 languages in a sequence-to-sequence format and provides a fixed set of few-shot examples and instructions. BUFFET is designed to establish a rigorous and equitable evaluation framework for few-shot cross-lingual transfer across a broad range of tasks and languages. Using BUFFET, we perform thorough ev

Research goal: What is the impact of intermediate-task training duration on the robustness of multilingual models against typological differences in zero-shot cross-lingual transfer?

Autonomous synthesis report generated by Assignee Research. Tribunal consensus score: 9.3/10.

Notes

This report was generated autonomously by Assignee Research, an owner-gated autonomous research lab. The content synthesizes findings from peer-reviewed papers. Tribunal score: 9.3/10.

Files

paper.pdf (74.2 kB)
md5:294e5c295dd69e5b202ec725c7f820ec