Published May 20, 2017 | Version v1
Preprint | Open

Classifying code comments in Java open-source software systems

  • 1. Delft University of Technology
  • 2. University of Zurich

Description

Code comments are a key software component containing information about the underlying implementation. Several studies have shown that code comments enhance the readability of the code. Nevertheless, not all comments have the same goal and target audience. In this paper, we investigate how six diverse Java OSS projects use code comments, with the aim of understanding their purpose. Through our analysis, we produce a taxonomy of source code comments; subsequently, we investigate how often each category occurs by manually classifying more than 2,000 code comments from the aforementioned projects. In addition, we conduct an initial evaluation of how to automatically classify code comments at line level into our taxonomy using machine learning; initial results are promising and suggest that an accurate classification is within reach.

Files

TUD-SERG-2017-017.pdf

Files (423.4 kB)

Name: TUD-SERG-2017-017.pdf
md5:c95d68e682d78fd3eba5ae6b517d9fc4
Size: 423.4 kB

Additional details

Funding

European Commission
SENECA - Software ENgineering in Enterprise Cloud Applications systems 642954