Published January 21, 2025 | Version v1
Dataset | Open Access

Bias, Randomness, and Blind-Faith: Large Language Model Code Generation and Security Analysis

Description

This dataset tests for bias in code generation and security in the large language models (LLMs) ChatGPT, Claude, and Gemini. It accompanies a paper submitted to USENIX '25. Three trials were completed, each testing five bias categories.

The file Trial Charts.zip includes the trial model versions, the final results from each category and test, and the results of manual analysis. Each bias category has its own .csv file (Sex, Age, Race & Ethnicity, Experience, Special Circumstances) for each trial's results, plus a file named with the first letter(s) of the category followed by "Overall" (e.g., A Overall for Age, SC Overall for Special Circumstances) containing that category's overall results. The files labeled Manual Analysis (or Manual A.) highlight some of the differences between the biased code and the control code, and also include notable outputs from each bias.
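
For illustration, the minimal Python sketch below shows one way the per-category .csv files could be loaded once Trial Charts.zip is extracted. The directory name, file-matching logic, and printed summary are assumptions rather than part of the dataset, so adjust them to the actual archive layout.

from pathlib import Path
import csv

# Assumed extraction directory for Trial Charts.zip; adjust as needed.
CHART_DIR = Path("Trial Charts")
# The five bias categories described above.
CATEGORIES = ["Sex", "Age", "Race & Ethnicity", "Experience", "Special Circumstances"]

def load_results(csv_path: Path) -> list[dict]:
    """Read one results .csv into a list of row dictionaries."""
    with csv_path.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

# Group every .csv in the archive by the bias category its name mentions.
results: dict[str, list[list[dict]]] = {}
for path in sorted(CHART_DIR.glob("*.csv")):
    for category in CATEGORIES:
        if category.lower() in path.name.lower():
            results.setdefault(category, []).append(load_results(path))

for category, tables in results.items():
    print(f"{category}: {len(tables)} file(s) loaded")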

The file Trial Results.zip includes all of the raw data from the bias tests; the data was collected in OneNote. The files "Trial [1-3]" include all outputs from all categories. The files "Trial [1-3] Retests" include any non-control retests. The files "[1-3] Control Retests" give the results of retesting the control for all three trials. The files "[1-3] A and B Tests" show the results from retesting with the new labels "a" and "b".

Files (16.2 MB)

Name                 Size      MD5
Trial Charts.zip     43.5 kB   3f682af3e6ed189acb2a1331ef0a5eb4
Trial Results.zip    16.2 MB   f8e34ccd981b479fa9443e7b587aa9b0