Published January 1, 2020 | Version v1
Conference paper Open

Enhancing Code Refactoring Detection with Explanations from Commit Messages

  • 1. University of Notre Dame

Description

Our dataset is stored in CSV (.csv) format and covers four projects: Derby, Drools, Groovy, and Infinispan.

The data for each project contains five columns:

commit_id  (type: ID)
commit_date (type: DATE)
message (type: TEXT)
refactoring_class (type: TEXT)
refactoring_type (type: TEXT)
The last two columns represent the labels of our data.
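
The column layout above can be read with Python's standard csv module. The following is a minimal sketch, assuming each CSV has a header row using the five column names listed here. The `LabeledCommit` class, the `read_commits` helper, and the sample rows are hypothetical and only illustrate the schema; they are not taken from the dataset.

```python
import csv
import io
from dataclasses import dataclass

# Column names as listed in the description.
COLUMNS = ["commit_id", "commit_date", "message",
           "refactoring_class", "refactoring_type"]


@dataclass
class LabeledCommit:  # hypothetical record type, for illustration only
    commit_id: str
    commit_date: str  # kept as text: the date format is not stated in the description
    message: str
    refactoring_class: str
    refactoring_type: str


def read_commits(handle):
    """Parse an open CSV file object into a list of LabeledCommit records."""
    reader = csv.DictReader(handle)
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [
        LabeledCommit(**{c: (row[c] or "").strip() for c in COLUMNS})
        for row in reader
    ]


# Made-up rows (NOT from the dataset) so the sketch runs on its own.
# Assumption: the type cell is empty for "nonref" rows.
SAMPLE = """commit_id,commit_date,message,refactoring_class,refactoring_type
abc123,2015-03-01,"Rename Foo to Bar, update callers",ref,Rename Class
def456,2015-03-02,Fix typo in docs,nonref,
"""

commits = read_commits(io.StringIO(SAMPLE))
ref_messages = [c.message for c in commits if c.refactoring_class == "ref"]
print(ref_messages)
```

To read a real file, pass an open file object instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` points to one of the CSV files extracted from the archive.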

The refactoring_class column takes one of two values, "ref" or "nonref", indicating whether the commit message was identified as refactoring-related or non-refactoring-related.
The refactoring_type column records the refactoring types detected for commit messages in the "ref" class, 12 different types in total.
The 12 refactoring types found across the four projects are:

Encapsulate Field
Extract Class
Extract Method
Extract Subclass
Hide Method
Move Class
Move Field
Move Method
Add Parameter
Remove Parameter
Rename Class
Rename Method
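
The two label columns can be sanity-checked against the values listed above. The following is a minimal sketch, assuming each "ref" row carries exactly one type name spelled as in the list and that rows are available as dictionaries keyed by column name. How "nonref" rows fill refactoring_type is not stated, so that cell is not checked. The `check_labels` helper and the sample rows are hypothetical.

```python
from collections import Counter

# The 12 refactoring types named in the description.
REFACTORING_TYPES = frozenset({
    "Encapsulate Field", "Extract Class", "Extract Method", "Extract Subclass",
    "Hide Method", "Move Class", "Move Field", "Move Method",
    "Add Parameter", "Remove Parameter", "Rename Class", "Rename Method",
})
REFACTORING_CLASSES = frozenset({"ref", "nonref"})


def check_labels(rows):
    """Return (per-type counts for "ref" rows, list of problem descriptions)."""
    counts = Counter()
    problems = []
    for i, row in enumerate(rows):
        cls = row["refactoring_class"]
        rtype = row["refactoring_type"]
        if cls not in REFACTORING_CLASSES:
            problems.append(f"row {i}: unknown class {cls!r}")
        elif cls == "ref":
            # Assumption: one type name per cell, spelled as in the list above.
            if rtype in REFACTORING_TYPES:
                counts[rtype] += 1
            else:
                problems.append(f"row {i}: unknown type {rtype!r}")
        # "nonref" rows: the type cell is left unchecked (its content is unspecified).
    return counts, problems


# Made-up rows (NOT from the dataset) to exercise the check.
rows = [
    {"refactoring_class": "ref", "refactoring_type": "Extract Method"},
    {"refactoring_class": "ref", "refactoring_type": "Extract Method"},
    {"refactoring_class": "nonref", "refactoring_type": ""},
    {"refactoring_class": "ref", "refactoring_type": "Inline Method"},
]
counts, problems = check_labels(rows)
print(counts)
print(problems)
```

Any entry in `problems` would point to a row whose labels fall outside the values described here, which is useful to catch before training or evaluating a classifier on the data.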

Files

manualy_labeled_commits (goldset).zip

Size: 119.7 kB
md5:15d8fedcdef6f7f75df5c687c78cd791