Robustness of OpenPangu-7B-MLA Performance on EchoMind under High-Noise Contamination and Domain Adaptation
Description
While large language models exhibit certain cross-lingual generalization capabilities, they suffer from performance degradation (PD) on unseen closely-related languages (CRLs) and dialects relative to their high-resource language neighbour (HRLN). However, we currently lack a fundamental understanding of what kinds of linguistic distances contribute to PD, and to what extent. Furthermore, studies of cross-lingual generalization are confounded by unknown quantities of CRL language traces in the training data, and by the frequent lack of availability of evaluation data in lower-resource related
Research goal: How does the robustness of OpenPangu-7B-MLA's performance on EchoMind correlate with the contamination rate under high-noise conditions, and can domain adaptation techniques improve its generalization across languages?
Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 8.4/10.
Files
| Name | Size | Checksum |
|---|---|---|
| paper.pdf | 80.2 kB | md5:aa256cd18580c9aaf46e4a9fba869084 |