Replication Package for a Study of Software Refactorings in Real-World Open-Source Java Projects
Description
This replication package accompanies a study on real-world refactorings in Java open-source projects. It contains:
- A list of predefined keywords, borrowed from previous studies, used for commit message comparison.
- A dataset of commits containing refactorings mined from six popular open-source Java applications: SpringFramework, Elasticsearch, Kafka, Hadoop, Tomcat, and JUnit4.
- A consolidated taxonomy of refactorings discovered across these projects.
The file keywords.csv lists the predefined keywords used for commit message filtering. The file master_replication_file.csv contains the refactoring-related commits, their associated refactoring types, and context-specific indicators. The file taxonomy.csv classifies the refactorings by edit type and specificity.
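As a rough illustration of keyword-based commit message filtering, the sketch below loads keywords and reports which ones occur in a message. It assumes that keywords.csv holds one keyword per row in its first column with no header row, and that matching is a case-insensitive substring test. Neither assumption is confirmed by this package, and the study's actual matching procedure may differ. The function names load_keywords and matching_keywords are hypothetical.

```python
import csv
import io


def load_keywords(csv_text):
    """Return lowercase keywords from the first column of CSV text.

    Assumption: one keyword per row, no header row. Blank rows are skipped.
    """
    reader = csv.reader(io.StringIO(csv_text))
    return [row[0].strip().lower() for row in reader if row and row[0].strip()]


def matching_keywords(message, keywords):
    """Return the keywords that occur in the message (case-insensitive substring match)."""
    lowered = message.lower()
    return [kw for kw in keywords if kw in lowered]


if __name__ == "__main__":
    # Assumption: keywords.csv sits in the current working directory.
    with open("keywords.csv", newline="", encoding="utf-8") as handle:
        keywords = load_keywords(handle.read())
    print(matching_keywords("Refactored the request parser", keywords))
```

Substring matching is the simplest choice and also catches inflected forms such as "refactored", at the cost of occasional false positives. A word-boundary or stemmed match would be a stricter alternative.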
The package is intended to support reproducibility and facilitate further research on refactoring detection, categorization, and tool support. Instructions for reproducing the results, such as cloning the repositories and inspecting the commits, are included in the README file. Each commit can be inspected by running git show <commit-hash> from the corresponding project's folder.
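The inspection step can be scripted. The following is a minimal sketch that builds one git show command per row of master_replication_file.csv. The column names project and commit_hash are assumptions for illustration only, as is the idea that each repository has been cloned into a folder named after its project; check the actual CSV header and the README before use.

```python
import csv
import io
import subprocess


def git_show_commands(csv_text, project_col="project", hash_col="commit_hash"):
    """Return (project, command) pairs, one per CSV row.

    Assumption: the column names are hypothetical and must be adapted
    to the real header of master_replication_file.csv.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row[project_col], ["git", "show", row[hash_col]]) for row in reader]


def show_commit(repo_dir, command):
    """Run the git command inside repo_dir and return its standard output."""
    result = subprocess.run(
        command, cwd=repo_dir, capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    with open("master_replication_file.csv", newline="", encoding="utf-8") as handle:
        commands = git_show_commands(handle.read())
    # Assumption: each project was cloned into a folder named after it.
    for project, command in commands[:1]:
        print(show_commit(project, command))
```

Passing the command as a list rather than a shell string avoids quoting problems if a field contains unexpected characters.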