Published August 27, 2025 | Version v1
Dataset Open

Dataset for the ASE 2025 Challenge on Code Completion Context Collection Optimization

Contributors

Contact person:

  • 1. JetBrains
  • 2. JetBrains Research

Description

This dataset was used in the ASE 2025 Challenge on Code Completion Context Collection Optimization, organized by JetBrains and Mistral AI. It includes the ground-truth data and the associated repositories for all three phases (practice, public, and private) and for both tracks (Python and Kotlin). It also contains the solutions submitted in the private phase, along with the baseline solutions (BM25 and recent files), for further study.

  • Overall: 102 repositories, 1,746 revisions, and 1,764 completion points
  • Python: 47 points for Practice, 247 for Public, 394 for Private (total: 688 points from 52 repos)
  • Kotlin: 30 points for Practice, 400 for Public, 646 for Private (total: 1,076 points from 50 repos)
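
The per-phase counts above can be cross-checked against the stated totals with a few lines of Python. This is only a sketch: the dictionary layout and variable names are illustrative and say nothing about how the dataset itself is organized on disk.

```python
# Completion points per track and phase, copied from the list above.
# The layout is illustrative only; it is not the dataset's file format.
points = {
    "Python": {"practice": 47, "public": 247, "private": 394},
    "Kotlin": {"practice": 30, "public": 400, "private": 646},
}
# Repository counts per track, also copied from the list above.
repos = {"Python": 52, "Kotlin": 50}

# Sum the phases within each track, then sum across tracks.
track_totals = {track: sum(phases.values()) for track, phases in points.items()}
total_points = sum(track_totals.values())
total_repos = sum(repos.values())

print(track_totals)  # {'Python': 688, 'Kotlin': 1076}
print(total_points)  # 1764
print(total_repos)   # 102
```

The sums agree with the overall figures of 1,764 completion points and 102 repositories.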

Notes (English)

Python Track Results (Private Phase)

Team               Mellum  Codestral  Qwen   Average
NoMoreActimel      0.656   0.820      0.725  0.734
SpareCodeComplete  0.695   0.766      0.713  0.725
REALISE Lab        0.613   0.710      0.608  0.644
WSPR_NCSU          0.582   0.710      0.615  0.636
Baseline: BM25     0.585   0.659      0.585  0.610
SaNDwich&TEST      0.590   0.661      0.578  0.610
Baseline: Recent   0.576   0.657      0.587  0.606
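
The Average column appears to be the arithmetic mean of the three per-model scores. The page does not state this explicitly, so the sketch below treats it as an assumption. It recomputes the mean from the rounded Python-track values and reports the gap to the published figure. A gap of up to about 0.001 is plausible if the published averages were computed from unrounded scores, which is also an assumption.

```python
# Python-track rows copied from the table above:
# team -> (Mellum, Codestral, Qwen, published Average)
rows = {
    "NoMoreActimel": (0.656, 0.820, 0.725, 0.734),
    "SpareCodeComplete": (0.695, 0.766, 0.713, 0.725),
    "REALISE Lab": (0.613, 0.710, 0.608, 0.644),
    "WSPR_NCSU": (0.582, 0.710, 0.615, 0.636),
    "Baseline: BM25": (0.585, 0.659, 0.585, 0.610),
    "SaNDwich&TEST": (0.590, 0.661, 0.578, 0.610),
    "Baseline: Recent": (0.576, 0.657, 0.587, 0.606),
}

# Assumption: one unit in the third decimal place is an acceptable rounding gap.
TOLERANCE = 0.001


def recompute(table):
    """Return team -> (mean of the three model scores, gap to published average)."""
    result = {}
    for team, (mellum, codestral, qwen, published) in table.items():
        mean = (mellum + codestral + qwen) / 3
        result[team] = (mean, abs(mean - published))
    return result


checked = recompute(rows)
for team, (mean, gap) in checked.items():
    status = "ok" if gap < TOLERANCE else "check"
    print(f"{team:<18} mean={mean:.4f} gap={gap:.4f} {status}")
```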

Kotlin Track Results (Private Phase)

Team               Mellum  Codestral  Qwen   Average
SpareCodeComplete  0.723   0.769      0.753  0.748
NoMoreActimel      0.684   0.791      0.719  0.731
WSPR_NCSU          0.616   0.709      0.653  0.660
REALISE Lab        0.652   0.688      0.637  0.659
SaNDwich&TEST      0.633   0.658      0.613  0.635
Baseline: BM25     0.627   0.652      0.621  0.634
Wu Wei             0.624   0.648      0.609  0.627
Baseline: Recent   0.618   0.636      0.605  0.620

Files

private-submissions.zip

Files (9.4 GB)

Checksum                              Size
md5:8b28815623e1f883918e676bd3c86db0  163.8 kB
md5:90eaae1809a257ee65ef4827b8ab8e03  390.5 MB
md5:b3f65b9942a35f56612a3f9eeeb08adc  3.4 MB
md5:c868a0207f904e85fdcafa6707e54af8  4.7 GB
md5:f56e736c3ef15661d69a7600abf4eb29  2.3 MB
md5:2f4f173254eed1d16a2726562cbb7e14  2.1 GB
md5:347df89cde93fadab3d05013fffa643c  27.8 MB
md5:267b1cfa4930c613de7accbab652fb02  224.0 kB
md5:706933854b2079ce2fb5697a4ef56148  62.7 MB
md5:539055ea5db135317f707028f66f4857  2.1 MB
md5:776ceb4d7716370d720f057cc4dc264e  1.3 GB
md5:44700fefcf013ac11b5a804c96200992  1.2 MB
md5:def392464abb2b2df3b8083742c49d16  804.5 MB
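
The MD5 checksums listed above can be used to confirm that a downloaded file is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of` and `matches` are illustrative, and the path in the usage comment is a placeholder for wherever the file was saved; the expected digest should be the one shown for that file on the dataset page.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # Chunked reads keep memory use flat even for multi-gigabyte archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or plain hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of(path) == expected_hex


# Hypothetical usage; replace the path and checksum with the real pair:
# print(matches("downloads/some-archive.zip", "md5:<checksum from the page>"))
```

MD5 is adequate here for detecting corrupted or truncated downloads; it is not a defense against deliberate tampering.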

Additional details

Software

Repository URL: https://github.com/JetBrains-Research/ase2025-starter-kit
Programming language: Python
Development status: Inactive