"TODO: Fix the Mess Gemini Created": Towards Understanding GenAI-Induced Self-Admitted Technical Debt
Description
🧩 Dataset: SATD Types and AI Roles in Developer Comments
Overview
This dataset contains 81 source code comments annotated for both Self-Admitted Technical Debt (SATD) types and AI roles. It aims to explore how developers describe AI-assisted code in the context of technical debt, identifying both the type of debt and the role AI plays in that debt’s formation or resolution.
📂 Dataset
File: satd_and_ai_role_annotated_data.csv
This CSV file includes:
Developer comment text
Annotated SATD type (following Maldonado & Shihab, 2015)
Annotated AI role (derived through open coding)
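The three fields above lend themselves to a simple cross-tabulation of SATD type against AI role. The following is a minimal sketch using only the Python standard library. The column names `satd_type` and `ai_role` are hypothetical, since the CSV header is not documented here; check the actual header and adjust them.

```python
import csv
from collections import Counter
from typing import Iterable

# Hypothetical column names: the real CSV header may differ, so adjust these.
SATD_COL = "satd_type"
ROLE_COL = "ai_role"


def read_rows(lines: Iterable[str]) -> list[dict[str, str]]:
    """Parse CSV lines (first line is the header) into one dict per comment."""
    return list(csv.DictReader(lines))


def tally_pairs(
    rows: Iterable[dict[str, str]],
    satd_col: str = SATD_COL,
    role_col: str = ROLE_COL,
) -> Counter:
    """Count how often each (SATD type, AI role) pair occurs."""
    return Counter(
        (row[satd_col].strip(), row[role_col].strip()) for row in rows
    )


def load_annotations(path: str) -> list[dict[str, str]]:
    """Read the annotated CSV from disk; an open file is an iterable of lines."""
    with open(path, newline="", encoding="utf-8") as handle:
        return read_rows(handle)
```

With the file downloaded, `tally_pairs(load_annotations("satd_and_ai_role_annotated_data.csv")).most_common()` would list the most frequent type and role combinations, provided the assumed column names match the header.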
🏷️ Annotation of SATD Types
The SATD taxonomy follows:
E. d. S. Maldonado and E. Shihab, “Detecting and quantifying different types of self-admitted technical debt,” Proceedings of the 7th International Workshop on Managing Technical Debt (MTD), 2015.
Annotation guidelines were directly adapted from their replication package. You can find the official labeling instructions here: 👉 Labeling Tutorial (Maldonado et al.)
🤖 Annotation of AI Roles
AI role annotation was conducted using open coding, based on qualitative analysis of how developers describe AI’s influence in their comments.
The annotation instructions can be found in: 📄 AI Role Annotation Guide.pdf
The identified roles include:
Catalyst – AI triggers developer awareness or action.
Source – AI introduces or causes technical debt.
Mitigator – AI assists in resolving or reducing technical debt.
Neutral – AI is mentioned without direct impact on debt.
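When working with these labels programmatically, the four roles listed above can be treated as a closed set. The sketch below is one way to do that, not part of the dataset itself. The `AIRole` enum and `parse_role` helper are hypothetical names, and the assumption that the CSV spells the labels exactly as in the list (up to case and whitespace) is unverified.

```python
from enum import Enum


class AIRole(Enum):
    """The four AI roles named in the annotation scheme."""

    CATALYST = "Catalyst"    # AI triggers developer awareness or action
    SOURCE = "Source"        # AI introduces or causes technical debt
    MITIGATOR = "Mitigator"  # AI assists in resolving or reducing debt
    NEUTRAL = "Neutral"      # AI is mentioned without direct impact on debt


def parse_role(label: str) -> AIRole:
    """Map a raw label to an AIRole, ignoring case and surrounding whitespace."""
    cleaned = label.strip().lower()
    for role in AIRole:
        if role.value.lower() == cleaned:
            return role
    raise ValueError(f"unknown AI role label: {label!r}")
```

Raising on unknown labels, rather than silently defaulting to `NEUTRAL`, makes spelling variants in the annotation column visible early.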
Our manual analysis identified 15 instances of GenAI-Induced Self-Admitted Technical Debt (GIST).
Files (110.1 kB)
| Checksum | Size |
|---|---|
| md5:f2c45603ab7ce87b6cd62e10c223974f | 100.3 kB |
| md5:2bc0a8f58af398805a0293b962d00a7d | 9.8 kB |