MarValous: Machine Learning Based Detection of Emotions in the Valence-arousal Space in Software Engineering Text
1. Does the paper propose a new opinion mining approach?
Yes
2. Which opinion mining techniques are used (list all of them, clearly stating their name/reference)?
MarValous (the new tool proposed in this paper); DEVA (M. Islam and M. Zibran. 2018. DEVA: Sensing Emotions in the Valence Arousal Space in Software Engineering Text. In SAC. 1536–1543).
3. Which opinion mining approaches in the paper are publicly available? Write down their name and links. If no approach is publicly available, leave it blank or None.
MarValous: https://figshare.com/s/a3308b7087df910db38f; DEVA: https://figshare.com/s/277026f0686f7685b79e
4. What is the main goal of the whole study?
To propose MarValous, an ML-based tool for identifying emotions in software engineering text.
5. What the researchers want to achieve by applying the technique(s) (e.g., calculate the sentiment polarity of app reviews)?
Same as the answer to question 4: identify emotions in software engineering text.
6. Which dataset(s) the technique is applied on?
A combination of the dataset of Islam and Zibran (used for DEVA; see the reference under question 2) and a processed version of the gold standard of Novielli et al. (N. Novielli, F. Calefato, and F. Lanubile. 2018. A Gold Standard for Emotion Annotation in Stack Overflow. In MSR. 14–17).
7. Is/Are the dataset(s) publicly available online? If yes, please indicate their name and links.
Yes. The combined dataset is available at https://figshare.com/s/a3308b7087df910db38f (the same link as the MarValous tool).
8. Is the application context (dataset or application domain) different from that for which the technique was originally designed?
No
9. Is the performance (precision, recall, run-time, etc.) of the technique verified? If yes, how did they verify it and what are the results?
Not verified in a strict sense. The authors compare DEVA and MarValous in terms of precision, recall, and F1. For depression (LALV) and relaxation (LAHV), MarValous consistently outperforms DEVA. For excitation (HAHV), MarValous achieves higher recall and F1 but lower precision. For stress (HALV), it achieves higher precision and F1 but lower recall.
10. Does the paper replicate the results of previous work? If yes, leave a summary of the findings (confirm/partially confirms/contradicts).
N/A
11. What success metrics are used?
N/A
12. Write down any other comments/notes here.
-
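Note on question 9: the per-class comparison described there rests on precision, recall, and F1 computed separately for each emotion. A minimal sketch of that computation follows. The function name per_class_scores and the toy labels are hypothetical, invented for illustration only; they are not taken from the paper or from either tool, and the paper's own evaluation procedure may differ.

```python
from collections import Counter

# Emotion classes named in question 9 (quadrants of the valence-arousal space).
CLASSES = ["excitation", "stress", "depression", "relaxation"]


def per_class_scores(gold, predicted, classes=CLASSES):
    """Return {class: (precision, recall, f1)} for two parallel label lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1  # correct prediction for class g
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was the true label but was missed
    scores = {}
    for c in classes:
        # Undefined ratios (zero denominator) are reported as 0.0 here.
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        scores[c] = (precision, recall, f1)
    return scores


# Toy labels, invented purely for illustration (not data from the paper).
gold = ["stress", "stress", "excitation", "relaxation", "depression", "excitation"]
pred = ["stress", "excitation", "excitation", "relaxation", "stress", "excitation"]
scores = per_class_scores(gold, pred)
for name, (p, r, f) in scores.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Running the same function on the predictions of two tools against one gold standard yields the kind of per-emotion table in which one tool can win on recall and F1 while losing on precision, as reported for excitation.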