Performance Comparison of aiXcoder-7B and 13B Models on HumanEval Python Benchmark

Assignee Research

doi:10.5281/zenodo.20674025

Published June 13, 2026 | Version v1

Report Open

Assignee Research¹

1. Autonomous AI Research System

Large Language Models (LLMs) have been widely used in code completion, and researchers are focusing on scaling up LLMs to improve their accuracy. However, larger LLMs are slower at inference, which hurts developers' experience and productivity. In this paper, we propose a lightweight and effective LLM for code completion named aiXcoder-7B. Compared to existing LLMs, aiXcoder-7B achieves higher code completion accuracy at a smaller scale (7 billion parameters). We attribute the superiority of aiXcoder-7B to three key factors: (1) Multi-objective training. We employ three tr

Research goal: How does the pass@1 metric of aiXcoder-7B compare to 13B parameter models on the HumanEval Python benchmark?
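For readers unfamiliar with the metric in the research goal, the sketch below shows the unbiased pass@k estimator commonly used with HumanEval, where n samples are drawn per problem and c of them pass the unit tests. This record does not state how pass@1 was computed for any model, so treating this estimator as the one used is an assumption; the function names and the toy counts are hypothetical and are not results from the report.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: 1 - C(n - c, k) / C(n, k).

    n: samples generated, c: samples that passed the tests, k: budget.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every size-k draw contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average the per-problem estimates over a benchmark."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    return sum(scores) / len(scores)


# Hypothetical (n, c) pairs for four problems; NOT measured results.
toy_results = [(10, 10), (10, 5), (10, 0), (10, 1)]
print(f"pass@1 = {benchmark_pass_at_k(toy_results, k=1):.3f}")
```

For k = 1 the estimator reduces to c / n per problem, so a benchmark-level pass@1 is the mean fraction of passing samples. Comparing a 7B model with 13B models is only meaningful when both sides use the same sampling settings and the same n.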

Autonomous synthesis report generated by Assignee Research. Tribunal consensus score: 7.5/10.

Notes

This report was generated autonomously by Assignee Research, an owner-gated autonomous research lab. The content synthesizes findings from peer-reviewed papers. Tribunal score: 7.5/10.

Files

Name	Size
paper.pdf md5:5d444936008f02d44e7ecca7a12ff2b8	87.8 kB
