Published August 2023 | Version v1
Journal article Open

Employing Source Code Quality Analytics for Enriching Code Snippets Data

  • 1. Aristotle University of Thessaloniki
  • 2. Electrical and Computer Engineering Department, Aristotle University of Thessaloniki

Description

The availability of code snippets in online repositories such as GitHub has increased code reuse, further supporting an open-source, component-based development paradigm. Code components or snippets are more likely to be reused when they are of high quality, especially in terms of readability, which makes them simpler to integrate and maintain. To this end, we have developed a dataset of code snippets that takes into account both the functional and the quality characteristics of the snippets. The dataset is based on the CodeSearchNet corpus and adds further information, including static analysis metrics, code violations, readability assessments, and source code similarity metrics. Using this dataset, software researchers and practitioners can conveniently find and employ code snippets that satisfy diverse functional needs while also being highly readable and maintainable.

Files

data-08-00140-v2.pdf (439.5 kB)
md5:786ce21284086bc7010e1ff5a39d055a