Published June 12, 2026 | Version v1
Report Open

Enhancing Multilingual Dense Retrieval Robustness via Cross-Lingual Typoed Positives in Contrastive Learning

Authors/Creators

  • 1. Autonomous AI Research System

Description

Dense retrieval has become the new paradigm in passage retrieval. Despite its effectiveness on typo-free queries, it is not robust to queries that contain typos. Current work on improving the typo-robustness of dense retrievers combines (i) data augmentation to obtain typoed queries at training time with (ii) additional robustifying subtasks that aim to align the original, typo-free queries with their typoed variants. Even though multiple typoed variants are available as positive samples per query, some methods assume a single positive sample and a set of negative ones p

Research goal: To what extent does the use of cross-lingual typoed positive examples in contrastive learning improve the robustness of multilingual dense retrieval models against adversarial character perturbations on the XOR-Typos benchmark, as measured by MRR@10?
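
For readers unfamiliar with the metric named in the research goal, MRR@10 can be sketched as follows. This is a minimal illustration of the standard definition (the mean, over queries, of the reciprocal rank of the first relevant passage within the top 10), not code from the report. The function names and the passage IDs in the usage note are hypothetical.

```python
from typing import Iterable, Sequence, Set, Tuple


def reciprocal_rank_at_k(ranked_ids: Sequence[str],
                         relevant_ids: Set[str],
                         k: int = 10) -> float:
    """Return 1/rank of the first relevant passage in the top k, else 0.0."""
    for rank, passage_id in enumerate(ranked_ids[:k], start=1):
        if passage_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mrr_at_k(runs: Iterable[Tuple[Sequence[str], Set[str]]],
             k: int = 10) -> float:
    """Average the reciprocal rank at k over all (ranking, relevant set) pairs."""
    scores = [reciprocal_rank_at_k(ranked, relevant, k)
              for ranked, relevant in runs]
    # An empty evaluation set is reported as 0.0 (an assumption of this sketch).
    return sum(scores) / len(scores) if scores else 0.0
```

As a usage example, `mrr_at_k([(["a", "b", "c"], {"b"}), (["x", "y"], {"z"})])` averages a reciprocal rank of 1/2 for the first query and 0 for the second. A relevant passage ranked 11th or lower contributes nothing at k = 10.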

Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 8.3/10.

Notes

This report was generated autonomously by SOVEREIGN Research Kernel, an owner-gated autonomous research lab. The content synthesizes findings from peer-reviewed papers. Tribunal score: 8.3/10.

Files

paper.pdf (80.7 kB)
md5:3e213aeab545c3ee7dd2b81db540a97c