Can LLMs Replace Manual Annotation of Software Engineering Artifacts?
Required Libraries
The following libraries are required to run the scripts in this repository. You can install them using `pip`:
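```bash
pip install pandas numpy openai krippendorff scikit-learn seaborn matplotlib together anthropic google-generativeai
```
The scripts also import `argparse`, `json`, `time`, `random`, `copy`, and `statistics`. These are part of the Python standard library and do not need to be installed. Note that `sklearn` is installed under the package name `scikit-learn`.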
If you plan to query hosted models such as GPT-4 or Claude, make sure the client library for the corresponding API is installed:
- openai
- anthropic
- together
All experiments were run with Python 3.10.11.
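Before running anything, it can help to confirm that the environment matches the requirements above. The following is a minimal sketch, not part of the shared scripts: the `missing_modules` helper is a hypothetical name, and the list of import names is an assumption derived from the pip command (for example, `google-generativeai` is imported as `google.generativeai`).

```python
import importlib.util
import sys

# Import names for the third-party packages in the pip command above.
# Assumption: scikit-learn is imported as "sklearn" and
# google-generativeai as "google.generativeai".
REQUIRED_MODULES = [
    "pandas", "numpy", "openai", "krippendorff", "sklearn",
    "seaborn", "matplotlib", "together", "anthropic",
    "google.generativeai",
]


def missing_modules(names):
    """Return the names in `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised, for example, when a parent package such as "google" is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "(experiments used 3.10.11)")
    absent = missing_modules(REQUIRED_MODULES)
    print("Missing:", ", ".join(absent) if absent else "none")
```

An empty "Missing" list only shows that the modules can be found. It does not check package versions or API keys.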
Each dataset has its own folder containing process.py, heatmap.py, and ira_sample.py, along with the relevant data and plots.
File descriptions:
- data_result: This folder contains the dataset file and the few-shot samples. Running process.py writes all results to this folder. The folder already contains all the data and model-generated results as .jsonl files, so you do not need to run process.py to reproduce them.
- Plots: This folder contains the plots generated by heatmap.py and ira_sample.py.
- process.py: This script generates the model annotations for the given parameters. The commands for running it are listed below. Running it requires API keys from the respective model providers. All model-generated results are already available in the data_result folder.
- heatmap.py: This script generates the heatmaps shown in Figures 1-5 of the paper. The generated plots are stored in the "Plots" folder.
- ira_sample.py: This script generates the plots shown in Figures 7-10 of the paper. The generated plots are stored in the "Plots" folder.
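Since the shared results are stored as .jsonl files, they can be inspected without running process.py. The sketch below assumes only the standard JSON Lines layout (one JSON object per line). The helper names `read_jsonl` and `summarize` are hypothetical, and the fields inside each record are not documented here, so the code makes no assumption about them.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Read a JSON Lines file into a list, one parsed object per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def summarize(folder="data_result"):
    """Map each .jsonl file name in `folder` to the number of records it holds."""
    return {p.name: len(read_jsonl(p)) for p in sorted(Path(folder).glob("*.jsonl"))}


if __name__ == "__main__":
    # Run from inside a dataset folder so that "data_result" resolves.
    for name, count in summarize().items():
        print(f"{name}: {count} records")
```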
Commands for all datasets except Code Summarization:
Generating samples for different models:
python process.py --model gpt-4 --fewshot yes --openai_key xxxx --together_key xxxx --claude_key xxxx --google_key xxxx
python process.py --model gpt-3.5-turbo --fewshot yes --openai_key xxxx --together_key xxxx --claude_key xxxx --google_key xxxx
python process.py --model llama3 --fewshot yes --openai_key xxxx --together_key xxxx --claude_key xxxx --google_key xxxx
python process.py --model mixtral --fewshot yes --openai_key xxxx --together_key xxxx --claude_key xxxx --google_key xxxx
python process.py --model claude --fewshot yes --openai_key xxxx --together_key xxxx --claude_key xxxx --google_key xxxx
python process.py --model gemini --fewshot yes --openai_key xxxx --together_key xxxx --claude_key xxxx --google_key xxxx
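The six commands above differ only in the `--model` value, so they can be driven from a short script. This is a sketch under stated assumptions: the environment variable names in `KEY_ENV_VARS` are hypothetical (the scripts themselves only take the keys as command-line flags), and `build_command` and `run_all` are illustrative helpers, not part of the repository. By default it only prints the commands.

```python
import os
import subprocess
import sys

# Model names taken from the commands listed above.
MODELS = ["gpt-4", "gpt-3.5-turbo", "llama3", "mixtral", "claude", "gemini"]

# Hypothetical environment variable names; any secure source of keys works.
KEY_ENV_VARS = {
    "--openai_key": "OPENAI_API_KEY",
    "--together_key": "TOGETHER_API_KEY",
    "--claude_key": "ANTHROPIC_API_KEY",
    "--google_key": "GOOGLE_API_KEY",
}


def build_command(model, fewshot="yes", env=None):
    """Return the argument list for one process.py run."""
    env = os.environ if env is None else env
    command = [sys.executable, "process.py", "--model", model, "--fewshot", fewshot]
    for flag, variable in KEY_ENV_VARS.items():
        # Fall back to the "xxxx" placeholder when a key is not set.
        command += [flag, env.get(variable, "xxxx")]
    return command


def run_all(dry_run=True):
    """Print each command without its keys; execute only when dry_run is False."""
    for model in MODELS:
        command = build_command(model)
        print(" ".join(command[1:6]), "...")
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)
```

Passing the command as a list to `subprocess.run` avoids shell quoting issues, and `check=True` stops the batch at the first failing model.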
For Figures 1-5:
python heatmap.py
For Figures 7-10:
python ira_sample.py
Commands for the Code Summarization dataset:
python process.py --what accurate --model gpt-4 --fewshot yes --openai_key xxxx --together_key xxxx --claude_key xxxx --google_key xxxx
python process.py --what accurate --model gpt-3.5-turbo --fewshot yes --openai_key xxxx --together_key xxxx --claude_key xxxx --google_key xxxx
python process.py --what accurate --model llama3 --fewshot yes --openai_key xxxx --together_key xxxx --claude_key xxxx --google_key xxxx
python process.py --what accurate --model mixtral --fewshot yes --openai_key xxxx --together_key xxxx --claude_key xxxx --google_key xxxx
python process.py --what accurate --model claude --fewshot yes --openai_key xxxx --together_key xxxx --claude_key xxxx --google_key xxxx
python process.py --what accurate --model gemini --fewshot yes --openai_key xxxx --together_key xxxx --claude_key xxxx --google_key xxxx
For Figures 1-5:
python heatmap.py
For Figures 7-10:
python ira_sample.py
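The `--what` argument takes one of the following values: "accurate", "adequate", "concise", "similarity". The commands above use "accurate"; substitute another value as needed.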
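Covering every `--what` value for every model means 24 runs for Code Summarization. As a rough sketch, the command lines can be generated rather than typed by hand. The function name `summarization_commands` is hypothetical, and the keys are left as the `xxxx` placeholder used throughout this README.

```python
import itertools
import shlex

# Values listed above for --what and --model.
CRITERIA = ["accurate", "adequate", "concise", "similarity"]
MODELS = ["gpt-4", "gpt-3.5-turbo", "llama3", "mixtral", "claude", "gemini"]
KEY_FLAGS = ["--openai_key", "--together_key", "--claude_key", "--google_key"]


def summarization_commands(key_placeholder="xxxx"):
    """Return one shell command string per (criterion, model) pair."""
    commands = []
    for what, model in itertools.product(CRITERIA, MODELS):
        args = ["python", "process.py", "--what", what,
                "--model", model, "--fewshot", "yes"]
        for flag in KEY_FLAGS:
            args += [flag, key_placeholder]
        # shlex.join quotes arguments where the shell would need it.
        commands.append(shlex.join(args))
    return commands


if __name__ == "__main__":
    for line in summarization_commands():
        print(line)
```

The printed lines can be pasted into a shell after replacing the placeholder with real keys.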
For Figure 6:
python scatter.py
For Figures 12 and 13, first copy majority.py and probability.py outside the shared folders.
For Figure 12:
python probability.py
For Figure 13:
python majority.py
We also provide sample prompts for all datasets in Prompts.pdf.