Can Developers Prompt? A Controlled Experiment for Code Documentation Generation [Replication Package]
Description
Summary of Artifacts
This is the replication package for the paper titled 'Can Developers Prompt? A Controlled Experiment for Code Documentation Generation', presented at the 40th IEEE International Conference on Software Maintenance and Evolution (ICSME 2024), held October 6–11, 2024, in Flagstaff, AZ, USA.
Full Abstract
Large language models (LLMs) bear great potential for automating tedious development tasks such as creating and maintaining code documentation. However, it is unclear to what extent developers can effectively prompt LLMs to create concise and useful documentation. We report on a controlled experiment with 20 professionals and 30 computer science students tasked with code documentation generation for two Python functions. The experimental group freely entered ad-hoc prompts in a ChatGPT-like extension of Visual Studio Code, while the control group executed a predefined few-shot prompt. Our results reveal that professionals and students were unaware of or unable to apply prompt engineering techniques. Especially students perceived the documentation produced from ad-hoc prompts as significantly less readable, less concise, and less helpful than documentation from prepared prompts. Some professionals produced higher quality documentation by just including the keyword Docstring in their ad-hoc prompts. While students desired more support in formulating prompts, professionals appreciated the flexibility of ad-hoc prompting. Participants in both groups rarely assessed the output as perfect. Instead, they understood the tools as support to iteratively refine the documentation. Further research is needed to understand which prompting skills and preferences developers have and which support they need for certain tasks.
Author Information
Name | Affiliation | Email |
---|---|---|
Hans-Alexander Kruse | Universität Hamburg | hans-alexander.kruse@studium.uni-hamburg.de |
Tim Puhlfürß | Universität Hamburg | tim.puhlfuerss@uni-hamburg.de |
Walid Maalej | Universität Hamburg | walid.maalej@uni-hamburg.de |
Citation Information
@inproceedings{kruse-icsme-2024,
  author={Kruse, Hans-Alexander and Puhlf{\"u}r{\ss}, Tim and Maalej, Walid},
  booktitle={2024 IEEE International Conference on Software Maintenance and Evolution (ICSME)},
  title={Can Developers Prompt? A Controlled Experiment for Code Documentation Generation},
  year={2024},
  doi={tba},
}
Artifacts Overview
1. Preprint
The file kruse-icsme-2024-preprint.pdf is the preprint version of the official paper. You should read the paper in detail to understand the study, especially its methodology and results.
2. Results
The folder results includes two subfolders, explained in the following subsections.
Demographics RQ1 RQ2
The subfolder Demographics RQ1 RQ2 provides the Jupyter Notebook file evaluation.ipynb for analyzing (1) the experiment participants' submissions to the digital survey and (2) the ad-hoc prompts that the experimental group entered into their tool. Hence, this file provides demographic information about the participants and results for research questions 1 and 2. Please refer to the README file inside this subfolder for the installation steps of the Jupyter Notebook file.
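If you want to inspect the survey data outside of the notebook, the following is a minimal sketch, assuming a pandas-readable CSV export; the file name and the column used for the summary are placeholder assumptions, not the actual export headers.

```python
# Minimal sketch: loading a survey export with pandas.
# The CSV file name and the "experience_years" column are assumptions
# for illustration; evaluation.ipynb documents the actual file paths.
import pandas as pd

df = pd.read_csv("results/Demographics RQ1 RQ2/survey_results.csv")
print(df.shape)                               # participants x answer columns
print(df["experience_years"].value_counts())  # example demographic summary
```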
RQ2
The subfolder RQ2 contains further subfolders with Microsoft Excel files specific to the results of research question 2:
- The subfolder UEQ contains three copies of the official User Experience Questionnaire (UEQ) analysis Excel tool, filled with the data of all participants, only the students, and only the professionals, respectively (a sketch of the underlying UEQ transformation follows this list).
- The subfolder Open Coding contains three Excel files with the open-coding results for the free-text answers that participants could enter at the end of the survey to share additional positive and negative comments about their experience during the experiment. The Consensus file provides the finalized version of the open-coding process.
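If you want to recompute UEQ results without the Excel tool, the following hedged sketch shows the standard UEQ preprocessing step (answers on a 1–7 scale shifted to −3…+3); all file and column names are assumptions, and the official tool additionally handles item polarity and the item-to-scale mapping.

```python
# Hedged sketch: standard UEQ preprocessing in pandas.
# UEQ items are answered on a 1..7 scale and conventionally shifted to
# -3..+3 by subtracting 4. The file and column names ("ueq_01", ...)
# are assumptions; the official UEQ Excel tool also handles item
# polarity and the item-to-scale mapping, which this sketch omits.
import pandas as pd

df = pd.read_csv("ueq_answers.csv")
item_cols = [c for c in df.columns if c.startswith("ueq_")]
transformed = df[item_cols] - 4   # 1..7 -> -3..+3
print(transformed.mean())         # per-item means across participants
```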
3. Extension
The folder extension contains the code of the Visual Studio Code (VS Code) extension developed in this study to generate code documentation with predefined prompts. Please refer to the README file inside the folder for installation steps. Alternatively, you can install the deployed version of this tool, called Code Docs AI, via the VS Code Marketplace.
You can install the tool for generating code documentation with ad-hoc prompts directly via the VS Code Marketplace. We did not include the code of this extension in the replication package due to a license conflict (GPLv3 vs. MIT).
4. Survey
The folder survey contains PDFs of the digital survey in two versions:
- The file Survey.pdf contains the rendered version of the survey (how it was presented to participants).
- The file SurveyOptions.pdf is an export from the LimeSurvey web platform. Its main purpose is to provide the technical answer codes, e.g., AO01 and AO02, that correspond to the rendered answer texts, e.g., Yes and No. This helps if you want to analyze the CSV files inside the results folder (instead of using the Jupyter Notebook file), as the CSVs contain the answer codes, not the answer texts; the sketch below illustrates the mapping. Please note that an export issue caused page 9 to be almost blank. This problem is negligible, as the question on that page only contained one free-text answer field.
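For example, the mapping from answer codes to answer texts can be applied with pandas; this is a hedged sketch in which the CSV path is an assumption and the dictionary must be extended with the actual codes from SurveyOptions.pdf.

```python
# Hedged sketch: mapping LimeSurvey answer codes to answer texts.
# The CSV path is an assumption; AO01 -> Yes and AO02 -> No are the
# example codes mentioned above. Extend the dictionary with the full
# mapping documented in SurveyOptions.pdf.
import pandas as pd

code_to_text = {"AO01": "Yes", "AO02": "No"}

df = pd.read_csv("results/Demographics RQ1 RQ2/survey_results.csv")
df = df.replace(code_to_text)  # replaces matching codes in all columns
print(df.head())
```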
5. Appendix
The folder appendix provides additional material about the study:
- The subfolder tool_screenshots contains screenshots of both tools.
- The file few_shots.txt lists the few-shot examples used by the predefined-prompt tool (an illustrative sketch follows this list).
- The file test_functions.py lists the functions used in the experiment.
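To illustrate the mechanism only (this is not the study's actual prompt), the following hedged sketch shows how a few-shot prompt for docstring generation can be assembled; the example pair is a placeholder, while the real few shots are in few_shots.txt and the real target functions in test_functions.py.

```python
# Hedged sketch: assembling a few-shot prompt for docstring generation.
# The example pair below is a placeholder; the actual few shots used by
# the predefined-prompt tool are listed in appendix/few_shots.txt.

FEW_SHOTS = [
    (
        "def add(a, b):\n    return a + b",
        '"""Return the sum of a and b."""',
    ),
]

def build_prompt(target_function: str) -> str:
    """Concatenate an instruction, the few shots, and the target function."""
    parts = ["Write a concise Python docstring for the given function.\n"]
    for code, docstring in FEW_SHOTS:
        parts.append(f"Function:\n{code}\nDocstring:\n{docstring}\n")
    parts.append(f"Function:\n{target_function}\nDocstring:")
    return "\n".join(parts)

# The extension would send such a string to an LLM API; here we just print it.
print(build_prompt("def is_even(n):\n    return n % 2 == 0"))
```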
Revisions
Version | Changelog
---|---
1.0.0 | Initial upload |
1.1.0 | Add paper preprint. Update abstract. |
1.2.0 | Update replication package based on ICSME Artifact Track reviews |
1.2.1 | Update the Zenodo description |
1.3.0 | Update the Jupyter Notebook installation process |
License
See LICENSE file.
Files
kruse-icsme-2024-replication-package-v130.zip (5.3 MB)
md5:04a9ad3fe1dff61f38ea2b4bc6d0a4ae