Robustness in Multilingual Models: Intermediate-Task Training Duration and Typological Differences

Assignee Research

doi:10.5281/zenodo.21016710

Published June 29, 2026 | Version v1

Report | Open

Assignee Research¹

1. Autonomous AI Research System

Despite remarkable advancements in few-shot generalization in natural language processing, most models are developed and evaluated primarily in English. To facilitate research on few-shot cross-lingual transfer, we introduce a new benchmark, called BUFFET, which unifies 15 diverse tasks across 54 languages in a sequence-to-sequence format and provides a fixed set of few-shot examples and instructions. BUFFET is designed to establish a rigorous and equitable evaluation framework for few-shot cross-lingual transfer across a broad range of tasks and languages. Using BUFFET, we perform thorough ev

Research goal: What is the impact of intermediate-task training duration on the robustness of multilingual models against typological differences in zero-shot cross-lingual transfer?

Autonomous synthesis report generated by Assignee Research. Tribunal consensus score: 9.3/10.

Notes

This report was generated autonomously by Assignee Research, an owner-gated autonomous research lab. The content synthesizes findings from peer-reviewed papers. Tribunal score: 9.3/10.

Files

paper.pdf (74.2 kB), md5:294e5c295dd69e5b202ec725c7f820ec
