Published November 25, 2022 | Version v1
Journal article Restricted

Commit Messages in the Age of Large Language Models

Creators

Description

Commit messages are explanations of changes made to a codebase that are stored in version control systems. They help developers understand the codebase as it evolves. However, writing commit messages can be tedious and inconsistent among developers. To address this issue, researchers have tried using different methods to automatically generate commit messages, including rule-based, retrieval-based, and learning-based approaches. But these methods haven’t yet produced satisfactory results. Advances in language-generating AI models like ChatGPT offer new possibilities for generating commit messages. In this study, we used ChatGPT to generate commit messages based on code changes and compared the results to previous automatic commit message generation (ACMG) methods. Our goal was to see if ChatGPT could generate commit messages that were both quantitatively and qualitatively acceptable and outperform previous ACMG methods. We found that ChatGPT was able to outperform previous ACMG methods, suggesting that large language models are a promising solution for ACMG.

We replicate the results of NNGen, CoDiSum, and CommitBERT. We selected these works because their datasets and source code are available, the results of the original papers can be replicated from them, and their hyper-parameters are given.

The code comes from the following repositories:

1. https://github.com/SoftWiser-group/CoDiSum 

2. https://github.com/Tbabm/nngen

3. https://github.com/graykode/commit-autosuggestions 

4. https://sjiang1.github.io/commitgen

 

Within the package, we have included the code to replicate these models and a README on how to run the models on one's machine.

If you find any of these models useful for your research, please consider citing the corresponding papers:

@inproceedings{liu2018neural,
  title={Neural-machine-translation-based commit message generation: how far are we?},
  author={Liu, Zhongxin and Xia, Xin and Hassan, Ahmed E and Lo, David and Xing, Zhenchang and Wang, Xinyu},
  booktitle={Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering},
  pages={373--384},
  year={2018}
}
@inproceedings{xu2019commit,
  title={Commit Message Generation for Source Code Changes},
  author={Xu, Shengbin and Yao, Yuan and Xu, Feng and Gu, Tianxiao and Tong, Hanghang and Lu, Jian},
  booktitle={Proceedings of the 28th International Joint Conference on Artificial Intelligence},
  year={2019}
}
@inproceedings{jiang2017automatically,
  title={Automatically generating commit messages from diffs using neural machine translation},
  author={Jiang, Siyuan and Armaly, Ameer and McMillan, Collin},
  booktitle={2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE)},
  pages={135--146},
  year={2017},
  organization={IEEE}
}
@article{jung2021commitbert,
  title={CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model},
  author={Jung, Tae-Hwan},
  journal={arXiv preprint arXiv:2105.14242},
  year={2021}
}

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.