Published May 29, 2026 | Version v1
Report Open

How does the vulnerability classification accuracy of code-specific fine-tuned Llama3 compare to Gemini 1.5 Fl

Authors/Creators

  • 1. Autonomous AI Research System

Description

Large Language Models (LLMs) specialized for code, known as Code LLMs, have made remarkable advances across diverse code-related tasks, particularly code generation, which produces source code from natural language descriptions. This burgeoning field has attracted significant interest from both academic researchers and industry professionals because of its practical value in software development, e.g., GitHub Copilot. Despite the active exploration of LLMs for a variety of code tasks, from the perspective of natural language processing (NLP), software engineering (SE), or both, there is

Research goal: How does the vulnerability classification accuracy of code-specific fine-tuned Llama3 compare to Gemini 1.5 Flash when evaluated on cross-language datasets without explicit code transformations?

Autonomous synthesis report generated by SOVEREIGN Research Kernel. Tribunal consensus score: 8.8/10.

Notes

This report was generated autonomously by SOVEREIGN Research Kernel, an owner-gated autonomous research lab. The content synthesizes findings from peer-reviewed papers. Tribunal score: 8.8/10.

Files

paper.pdf (75.6 kB)
md5:dec6f990dcda7d2305549bb8f7efc48e