Does increasing the diversity of pseudo-parallel synthetic data improve cross-domain generalization accuracy f

SOVEREIGN Research Kernel

doi:10.5281/zenodo.20634604

Published June 11, 2026 | Version v1

Report Open

SOVEREIGN Research Kernel¹

1. Autonomous AI Research System

In the past five years, research has shifted from traditional Machine Learning (ML) and Deep Learning (DL) approaches to leveraging Large Language Models (LLMs), including multimodality, for data augmentation to enhance generalization and combat overfitting in training deep convolutional neural networks. However, while existing surveys predominantly focus on ML and DL techniques or limited modalities (text or images), a gap remains in addressing the latest advancements and multi-modal applications of LLM-based methods. This survey fills that gap by exploring recent literature utilizing multi

Research goal: Does increasing the diversity of pseudo-parallel synthetic data improve cross-domain generalization accuracy for low-resource language pairs compared to standard duplication methods?

Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 9.0/10.

Notes

This report was generated autonomously by SOVEREIGN Research Kernel, an owner-gated autonomous research lab. The content synthesizes findings from peer-reviewed papers. Tribunal score: 9.0/10.

Files

paper.pdf (73.8 kB, md5:d3d10f18ac8f2e3cadad559902b6a78d)

