Understanding Obsolete Knowledge Derived from Patches
Description
Project summary
In software development, developers use issue trackers and code repositories to record and manage issue reports and their patches. As software evolves, these systems accumulate a large number of patches from resolved issue reports. Mining software repositories has been an active research topic in recent decades, and researchers have mined rich knowledge from accumulated patches to assist various programming tasks, e.g., locating bugs and vulnerabilities. Despite these success stories, the knowledge derived from patches can be unreliable or become obsolete as software evolves. Such knowledge is misleading and harmful. However, to the best of our knowledge, no study has explored this problem, and many research questions remain open.
In this paper, we conduct the first empirical study on obsolete knowledge derived from patches. We manually analyze 396 issue reports and their corresponding patches from nine Apache projects, building a taxonomy of the knowledge types that can be learned from patches. Based on this taxonomy, we assess whether such knowledge remains valid in the context of the latest source files. Our results show that project‑specific knowledge is more common than domain knowledge and is also more prone to obsolescence; overall, 44.9% of the patches we examined contain obsolete knowledge. We further evaluate four popular large language models (LLMs) on generating patches for obsolete contexts, finding that they produce a substantial proportion of obsolete patches, particularly for knowledge categories such as API calls, annotations, and scripts. Finally, we define a modification ratio to quantify how the added lines of a patch change over time, revealing that in long‑lived projects, about 40% of patches are modified, and that recent patches tend to have lower modification ratios than older ones.
Our findings highlight the risks of obsolete knowledge in both patch‑based research and LLM‑based software engineering tools, and point to practical implications for data cleansing, benchmark evaluation, and the development of techniques to identify and mitigate outdated knowledge.
Our findings
RQ1. Which knowledge becomes obsolete? Based on whether the knowledge is specific to a project, we classify it into domain knowledge and project-specific knowledge. Project-specific knowledge is learned from 75% of the issue reports. The most frequent project-specific knowledge concerns writing documents and algorithms, and the most frequent domain knowledge concerns API calls (Finding 1). Knowledge about APIs and debugging is more stable, whereas knowledge about refactoring is likely to become obsolete. In total, 44.9% of our analyzed patches are obsolete (Finding 2).
| Category | Subcategory |
|---|---|
| K1. Domain Knowledge (99/396, 25%) | K1.1. API call (49/396, 12.37%) |
| | K1.2. Compile configuration (41/396, 10.35%) |
| | K1.3. Others (9/396, 2.27%) |
| K2. Project-specific Knowledge (297/396, 75%) | K2.1. Document (message) (55/396, 13.89%) |
| | K2.2. Algorithm (54/396, 13.64%) |
| | K2.3. Configuration (45/396, 11.36%) |
| | K2.4. Test (44/396, 11.11%) |
| | K2.5. Initialization (28/396, 7.07%) |
| | K2.6. Debug (19/396, 4.80%) |
| | K2.7. Refactoring (13/396, 3.28%) |
| | K2.8. Script (11/396, 2.78%) |
| | K2.9. Annotations (10/396, 2.53%) |
| | K2.10. Others (18/396, 4.55%) |
Table 1. The full taxonomy of knowledge.
The ratio of each category in Table 1 is computed as:

$$ratio_c = \frac{N_c}{N_{all}}$$

where $N_c$ is the number of knowledge items in category $c$, and $N_{all}$ is the total number of knowledge items learned from resolving issue reports, i.e., 396.
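As a quick check of the category ratio, the following minimal sketch recomputes the percentages from the counts listed in Table 1. The function names `category_ratio` and `group_total` are illustrative helpers, not part of the released artifacts.

```python
# Counts copied from Table 1; N_ALL is the number of analyzed issue reports.
N_ALL = 396

COUNTS = {
    "K1.1 API call": 49,
    "K1.2 Compile configuration": 41,
    "K1.3 Others": 9,
    "K2.1 Document (message)": 55,
    "K2.2 Algorithm": 54,
    "K2.3 Configuration": 45,
    "K2.4 Test": 44,
    "K2.5 Initialization": 28,
    "K2.6 Debug": 19,
    "K2.7 Refactoring": 13,
    "K2.8 Script": 11,
    "K2.9 Annotations": 10,
    "K2.10 Others": 18,
}


def category_ratio(n_c: int, n_all: int = N_ALL) -> float:
    """Return the category ratio N_c / N_all, expressed as a percentage."""
    return 100.0 * n_c / n_all


def group_total(prefix: str) -> int:
    """Sum the counts of all subcategories whose label starts with `prefix`."""
    return sum(n for label, n in COUNTS.items() if label.startswith(prefix))


if __name__ == "__main__":
    for label, n in COUNTS.items():
        print(f"{label}: {n}/{N_ALL} = {category_ratio(n):.2f}%")
    print(f"K1 total: {group_total('K1.')}, K2 total: {group_total('K2.')}")
```

The subcategory counts sum to the group totals reported in the table (99 for K1 and 297 for K2), which together account for all 396 issue reports.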
We manually classified the knowledge learned from resolving each issue report. The details are in: RQ1_modification_ratio_1.csv, RQ1_modification_ratio_0.csv
RQ2. How many LLM-generated patches are obsolete? All LLMs generate obsolete patches, with ratios ranging from 59.4% to 80.1% (Finding 3). LLMs tend to generate obsolete patches when the patches involve annotations, scripts, and API calls (Finding 4).
We collected 568 LLM-generated patches and manually inspected each one to determine whether it is obsolete. The results are in: RQ2_result.csv
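A per-model obsolete ratio of this kind can be tallied from manual labels as sketched below. This is only an illustration: the record layout (a model name paired with a boolean label) and the model names are assumptions, and the toy data is not taken from RQ2_result.csv, whose actual columns are not described here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def obsolete_ratios(records: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return, per model, the percentage of its patches labeled obsolete.

    `records` is a hypothetical layout: (model_name, is_obsolete) pairs,
    one pair per manually inspected patch.
    """
    totals: Dict[str, int] = defaultdict(int)
    obsolete: Dict[str, int] = defaultdict(int)
    for model, is_obsolete in records:
        totals[model] += 1
        if is_obsolete:
            obsolete[model] += 1
    return {model: 100.0 * obsolete[model] / totals[model] for model in totals}


if __name__ == "__main__":
    # Toy labels for illustration only; not the study's data.
    toy = [
        ("model_a", True), ("model_a", True), ("model_a", False), ("model_a", True),
        ("model_b", True), ("model_b", False),
    ]
    for model, ratio in sorted(obsolete_ratios(toy).items()):
        print(f"{model}: {ratio:.1f}% obsolete")
```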
RQ3. To what degree are patches modified? If a project has a long history, about 40% of its patches are modified (Finding 5). Recent patches have lower modification ratios than old patches (Finding 6).
We identified the modification ratios of the patches from nine projects: aries, calcite, cassandra, derby, flink, geode, hbase, hive, and nutch.
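The exact formula for the modification ratio is not given in this description. One plausible reading, assumed here, is the share of a patch's added lines that no longer appear verbatim in the latest version of the file. The sketch below implements that reading; `added_lines` and `modification_ratio` are hypothetical helper names, and the whitespace-insensitive line matching is a simplification.

```python
from typing import List


def added_lines(unified_diff: str) -> List[str]:
    """Collect the non-empty added lines of a unified diff, without the '+' marker.

    Simplification: lines starting with '+++' are treated as file headers,
    so added content that itself begins with '++' would be skipped.
    """
    lines = []
    for line in unified_diff.splitlines():
        if line.startswith("+") and not line.startswith("+++"):
            content = line[1:].strip()
            if content:
                lines.append(content)
    return lines


def modification_ratio(unified_diff: str, latest_source: str) -> float:
    """Assumed definition: fraction of added lines absent from the latest source."""
    added = added_lines(unified_diff)
    if not added:
        return 0.0
    current = {line.strip() for line in latest_source.splitlines()}
    changed = sum(1 for line in added if line not in current)
    return changed / len(added)


if __name__ == "__main__":
    diff = (
        "--- a/Foo.java\n"
        "+++ b/Foo.java\n"
        "@@ -1,2 +1,3 @@\n"
        " int x = 0;\n"
        "+int y = compute(x);\n"
        "+log(y);\n"
    )
    latest = "int x = 0;\nint y = computeV2(x);\nlog(y);\n"
    print(modification_ratio(diff, latest))
```

In the toy example, one of the two added lines was later changed, so half of the patch counts as modified under this assumed definition.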
Files
| Name | Size | MD5 |
|---|---|---|
| patches.zip | 374.1 kB | 4d64d53e36846bf797f5a270153fdea0 |