Artifact of the paper: Tutoring LLM into a Better CUDA Optimizer

Brabec, Matyáš; Klepl, Jiří; Töpfer, Michal; Kruliš, Martin

doi:10.5281/zenodo.15580207

Published June 3, 2025 | Version v1

Software Open


1. Charles University

This repository contains the replication package for the paper titled "Tutoring LLM into a Better CUDA Optimizer" presented at Euro-Par 2025. The package contains the source files and supplementary scripts for a testing framework for evaluating LLM-generated computation kernels for three assignments: Game of Life simulation, histogram computation, and the k-Nearest Neighbors search. The package also contains all the collected measurements, the used LLM prompts and generated responses, the interactive scenarios presented in the paper, and all analyses conducted by the researchers.

The file structure of the package and further details are in the included README.md file. The package assumes a CUDA-accelerated Linux platform with GCC 13.2 or higher, NVCC 12.6 or higher, CMake 3.20 or higher, and Python 3.8 or higher.
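The version requirements above can be checked before building. The following is a minimal sketch, not part of the package: it assumes each tool is on the PATH and that the first `X.Y` pattern in its version output is the version number, which may not hold on every distribution. The helper names (`parse_version`, `meets_minimum`, `check_tool`) are hypothetical.

```python
import re
import shutil
import subprocess

# Minimum versions as stated in the package description.
# The commands used to query each tool are an assumption of this sketch.
REQUIREMENTS = {
    "gcc": (["gcc", "-dumpfullversion"], (13, 2)),
    "nvcc": (["nvcc", "--version"], (12, 6)),
    "cmake": (["cmake", "--version"], (3, 20)),
    "python3": (["python3", "--version"], (3, 8)),
}


def parse_version(text):
    """Return (major, minor) from the first 'X.Y' found in text, or None."""
    match = re.search(r"(\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def meets_minimum(found, required):
    """Compare (major, minor) tuples; a missing version never passes."""
    return found is not None and found >= required


def check_tool(name):
    """Return (ok, found_version) for one entry of REQUIREMENTS."""
    command, required = REQUIREMENTS[name]
    if shutil.which(command[0]) is None:
        return False, None
    result = subprocess.run(command, capture_output=True, text=True)
    found = parse_version(result.stdout + result.stderr)
    return meets_minimum(found, required), found


if __name__ == "__main__":
    for tool, (_, required) in REQUIREMENTS.items():
        ok, found = check_tool(tool)
        status = "ok" if ok else "MISSING or too old"
        print(f"{tool}: found {found}, need >= {required}: {status}")
```

Comparing versions as integer tuples avoids the pitfall of string comparison, where "12.10" would sort before "12.6".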

The default configuration of LLM prompting is set up for GPT-o3-mini, which was the state-of-the-art model for coding tasks at the time of writing; however, the package is prepared for reproducibility on more recent models. Collecting LLM-generated implementations and their subsequent evaluation are fully automated.

For a quick confirmation that the target platform is prepared for the evaluation, enter the following shell commands:

# Depending on the LLM assignment you want to test, one of the following:
cd framework/histogram              # Histogram base directory
cd framework/game-of-life/infrastructure  # Game of Life base directory
cd framework/knn                   # k-NN base directory

make      # Compiling the code
make run  # Example: run the baseline implementation

The commands should print the mean and standard deviation of the measured run times, together with the validation result for the baseline implementation. To re-evaluate the implementations generated by the GPT-o3-mini model, follow the instructions in the included README.md file (section Replication).
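For readers who want to cross-check the reported statistics by hand, the following sketch computes a mean and standard deviation from a list of timings. It assumes the sample (n-1) standard deviation; the description does not say which variant the framework uses. The function name `summarize` and the input values are made up for illustration.

```python
import statistics


def summarize(times):
    """Return (mean, sample standard deviation) of a list of timings."""
    if len(times) < 2:
        raise ValueError("need at least two recordings")
    return statistics.mean(times), statistics.stdev(times)


if __name__ == "__main__":
    recordings = [1.0, 2.0, 3.0]  # made-up values, not measured data
    mean, stdev = summarize(recordings)
    print(f"mean = {mean:.3f}, stdev = {stdev:.3f}")
```

If the framework reports the population standard deviation instead, `statistics.pstdev` would be the matching call.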

To reproduce the graphs presented in the paper, run the following shell commands (does not require re-evaluating the implementations):

cd measured-times
bash generate_all_graphs.sh

Files

Name	Size
artifact-Tutoring-LLMs-for-CUDA.zip (md5:1c0bd372aab10c95b7105de056f8be8d)	3.6 MB
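The MD5 checksum listed above can be used to confirm that the archive downloaded intact. The following is a minimal sketch using Python's standard `hashlib`; it assumes the archive was saved under its original name in the current directory unless a path is given on the command line. The function name `md5_of_file` is hypothetical.

```python
import hashlib
import os
import sys

# Checksum taken from the file listing of this record.
EXPECTED_MD5 = "1c0bd372aab10c95b7105de056f8be8d"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    default_name = "artifact-Tutoring-LLMs-for-CUDA.zip"
    path = sys.argv[1] if len(sys.argv) > 1 else default_name
    if not os.path.isfile(path):
        print(f"file not found: {path}")
    else:
        actual = md5_of_file(path)
        print("OK" if actual == EXPECTED_MD5 else f"MISMATCH: {actual}")
```

MD5 is adequate here for detecting a corrupted download; it is not a safeguard against deliberate tampering.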

Additional details

Is supplement to: Conference paper: 10.1007/978-3-031-99857-7_18 (DOI)

European Commission: ExtremeXP - EXPeriment driven and user eXPerience oriented analytics for eXtremely Precise outcomes and decisions (101093164)
Ministry of Education Youth and Sports: Natural and Anthropogenic Georisks (CZ.02.01.01/00/22_008/0004605)
Charles University: SVV 260 821
Charles University: GAUK 269723

Repository URL: https://github.com/matyas-brabec/2025-europar-llm
Programming language: CUDA, C++

