How Do Code Refactoring Activities Impact Software Developers' Sentiments? - An Empirical Investigation Into GitHub Commits

1. Does the paper propose a new opinion mining approach?

No

2. Which opinion mining techniques are used (list all of them, clearly stating their name/reference)?

A tuned version of SentiStrength (Thelwall et al.), in which the polarity weights of specific terms (e.g., "block", "garbage") have been adjusted to fit the software engineering domain. The adjusted terms are available here: https://github.com/navdeep08/SentiRef/blob/master/EmotionLookupTable.txt
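
SentiStrength itself is distributed as a Java tool, but its lexicon-driven core is easy to sketch. The following is a minimal Python illustration, not the authors' implementation; it assumes the lookup table uses SentiStrength's tab-separated term/weight format, where a trailing '*' marks a prefix wildcard and weights run from -5 to -1 and 1 to 5.

```python
# Minimal sketch of SentiStrength-style lexicon scoring; NOT the authors'
# tool. Assumes tab-separated "term<TAB>weight" lines, with a trailing '*'
# acting as a prefix wildcard and weights in -5..-1 and 1..5.
import re

def load_lookup_table(path):
    """Parse (term, weight) pairs from an EmotionLookupTable-style file."""
    table = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.strip().split("\t")
            if len(parts) < 2:
                continue
            try:
                table.append((parts[0].lower(), int(parts[1])))
            except ValueError:
                continue  # skip malformed or non-integer lines
    return table

def term_weight(token, table):
    """Look up a token's weight, honoring '*' prefix wildcards."""
    for term, weight in table:
        if term.endswith("*") and token.startswith(term[:-1]):
            return weight
        if token == term:
            return weight
    return 0

def score_message(message, table):
    """SentiStrength-style dual score: (strongest positive, strongest negative)."""
    weights = [term_weight(t, table) for t in re.findall(r"[a-z']+", message.lower())]
    positive = max((w for w in weights if w > 0), default=1)
    negative = min((w for w in weights if w < 0), default=-1)
    return positive, negative
```

Under this reading, the tuning reduces to editing the weight next to a term: refactoring jargon such as "block" or "garbage" can be moved toward neutral so it no longer drags a commit message negative.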

3. Which opinion mining approaches in the paper are publicly available? Write down their name and links. If no approach is publicly available, leave it blank or None.

SentiStrength (http://sentistrength.wlv.ac.uk/) and the customization performed by the authors: https://github.com/navdeep08/SentiRef/blob/master/EmotionLookupTable.txt

4. What is the main goal of the whole study?

To present an empirical investigation into software developers’ sentiments extracted from refactoring-based commit messages recorded over multiple versions of a set of 60 GitHub projects.

5. What do the researchers want to achieve by applying the technique(s) (e.g., calculate the sentiment polarity of app reviews)?

To measure the sentiment polarity of refactoring-related commit messages.
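
SentiStrength reports a dual score per text; to talk about "polarity" per message, the two halves must be collapsed into one label. A small sketch of one common mapping rule follows; the exact rule is an assumption here, not taken from the paper.

```python
# Sketch: collapsing a SentiStrength dual (positive, negative) score into a
# single polarity label per commit message. The mapping rule is an assumed
# convention, not necessarily the paper's exact aggregation.

def polarity(pos: int, neg: int) -> str:
    """Map a (1..5, -1..-5) dual score to positive/negative/neutral."""
    if pos > abs(neg):
        return "positive"
    if abs(neg) > pos:
        return "negative"
    return "neutral"

# Illustrative messages with hand-assigned dual scores.
messages = {
    "Refactor: extract helper to simplify parsing": (2, -1),
    "Remove garbage code left after merge": (1, -3),
}
for msg, (pos, neg) in messages.items():
    print(f"{polarity(pos, neg):8s}  {msg}")
```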

6. Which dataset(s) the technique is applied on?

GitHub commit messages for commits implementing refactoring operations.
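
The paper identifies refactoring instances with dedicated analysis (see the SentiRef repository). As a rough, hypothetical stand-in for that step, the sketch below filters `git log` output by refactoring keywords; the keyword list is illustrative only and is not the authors' detection method.

```python
# Rough, hypothetical stand-in for refactoring-commit detection: a keyword
# filter over `git log` output. Run inside a cloned repository.
import subprocess

REFACTORING_KEYWORDS = ("refactor", "extract method", "rename", "move class",
                        "inline method", "pull up", "push down")

def refactoring_commit_messages(repo_path="."):
    """Yield (sha, message) for commits whose message suggests refactoring."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--pretty=format:%H%x1f%B%x1e"],
        capture_output=True, text=True, check=True,
    ).stdout
    for record in log.split("\x1e"):
        if "\x1f" not in record:
            continue
        sha, message = record.split("\x1f", 1)
        if any(kw in message.lower() for kw in REFACTORING_KEYWORDS):
            yield sha.strip(), message.strip()

if __name__ == "__main__":
    for sha, msg in refactoring_commit_messages():
        first_line = msg.splitlines()[0] if msg else ""
        print(sha[:8], first_line)
```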

7. Is/Are the dataset(s) publicly available online? If yes, please indicate their name and links.

SentiRef, a dataset of emotional scores identified in refactoring commit messages of GitHub projects. It consists of 3,171 commit messages covering 4,891 refactoring instances. https://github.com/navdeep08/SentiRef

8. Is the application context (dataset or application domain) different from that for which the technique was originally designed?

Yes. SentiStrength was originally designed for short social web texts rather than software engineering artifacts; to bridge this gap, the authors customize the SentiStrength dictionary for commit messages.

9. Is the performance (precision, recall, run-time, etc.) of the technique verified? If yes, how did they verify it and what are the results?

Not directly for the study as a whole. However, the customization of SentiStrength was verified on a randomly selected sample of 100 commit messages that the authors manually analyzed; they report a 4% increase in SentiStrength's precision after the tuning.
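
The reported check is straightforward to reproduce in spirit: compare the tool's labels with the manual labels on the 100-message sample and measure the fraction that agree (the paper reports this as precision). The labels in the sketch below are made up for illustration, not data from the paper.

```python
# Illustrative reconstruction of the verification step; labels are invented,
# not taken from the paper's 100-message sample.

def agreement(predicted, manual):
    """Fraction of tool labels matching manual labels (reported as precision)."""
    return sum(p == m for p, m in zip(predicted, manual)) / len(manual)

manual_labels = ["positive", "neutral", "negative", "neutral"]
before_tuning = ["positive", "negative", "negative", "positive"]
after_tuning  = ["positive", "neutral", "negative", "positive"]

print(f"agreement before tuning: {agreement(before_tuning, manual_labels):.2f}")
print(f"agreement after tuning:  {agreement(after_tuning, manual_labels):.2f}")
```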

10. Does the paper replicate the results of previous work? If yes, leave a summary of the findings (confirm/partially confirms/contradicts).

No

11. What success metrics are used?

None at the study level; precision is reported only for the SentiStrength tuning (see Q9).

12. Write down any other comments/notes here.

-