Cross-lingual Pretraining Impact on Zero-shot F1 Scores for CWE-200 Vulnerability Detection in Low-Resource Languages
Description
Zero-shot cross-lingual knowledge transfer enables a multilingual pretrained language model (mPLM), finetuned on a task in one language, to make predictions for that task in other languages. Although broadly studied for natural language understanding tasks, this setting is understudied for generation. Previous work notes a frequent problem of generation in the wrong language and proposes approaches to address it, usually using mT5 as the backbone model. In this work, we test alternative mPLMs, such as mBART and NLLB-200, considering full finetuning and parameter-efficient finetuning wi
Research goal: How does cross-lingual pretraining affect the zero-shot F1 scores of CodeT5 versus mT5 for CWE-200 vulnerability detection in low-resource languages?
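For readers unfamiliar with the metric named in the research goal, the following is a minimal sketch of how a binary F1 score could be computed for a detection task. It assumes labels of 1 (a CWE-200 finding) and 0 (no finding); the function name `f1_score` and the sample labels are hypothetical and are not taken from the report.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` class (assumed here: 1 = vulnerable)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero or undefined,
        # so F1 is reported as 0.0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold labels and model predictions for six code samples.
gold = [1, 0, 1, 1, 0, 0]
pred = [1, 0, 0, 1, 1, 0]
score = f1_score(gold, pred)
```

In a zero-shot comparison, the same function would be applied to each model's predictions on a language that was not seen during finetuning, so that the scores are directly comparable.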
Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 8.5/10.
Notes
Files
| Name | Size | Checksum |
|---|---|---|
| paper.pdf | 87.6 kB | md5:02e5490332cc32af0ace4693fb50c2c7 |