Bootstrapping a Lexicon for Emotional Arousal in Software Engineering
1. Does the paper propose a new opinion mining approach?
Yes
2. Which opinion mining techniques are used (list all of them, clearly stating their name/reference)?
Rather than proposing a new opinion mining approach, the paper proposes a new lexicon for the detection of emotional arousal in the software engineering domain. The Software Engineering Arousal (SEA) lexicon is developed using a semiautomatic approach including bootstrapping from issue-tracking data.
3. Which opinion mining approaches in the paper are publicly available? Write down their name and links. If no approach is publicly available, leave it blank or None.
Software Engineering Arousal (SEA) Lexicon, a lexicon for mining emotional arousal from software engineering texts. The authors adopt a bootstrapping approach that jointly leverages word embeddings and human annotation. SEA can be downloaded from Figshare at https://doi.org/10.6084/m9.figshare.4781188.v1
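Note: as a rough illustration of how seed words could be expanded with word embeddings before human review, here is a minimal Python sketch. It is not the authors' pipeline: the tiny embedding table, the cosine-similarity ranking and the function names are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical 2-dimensional "embeddings", invented for this example only.
TOY_EMBEDDINGS = {
    "crash": [1.0, 0.1],
    "failure": [0.9, 0.2],
    "typo": [0.1, 1.0],
    "cosmetic": [0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def expand_seeds(seeds, embeddings, top_k=2):
    """Rank non-seed words by their best cosine similarity to any seed word.

    The returned words are only candidates; in a bootstrapping setup a
    human annotator would still accept or reject each one.
    """
    seed_vecs = [embeddings[s] for s in seeds if s in embeddings]
    if not seed_vecs:
        return []
    candidates = []
    for word, vec in embeddings.items():
        if word in seeds:
            continue
        best = max(cosine(vec, sv) for sv in seed_vecs)
        candidates.append((best, word))
    candidates.sort(reverse=True)
    return [word for _, word in candidates[:top_k]]
```

For example, `expand_seeds({"crash"}, TOY_EMBEDDINGS, top_k=1)` proposes the nearest neighbour of the seed in this toy table as a candidate for annotation.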
4. What is the main goal of the whole study?
Advancing the state of the art on emotion mining in software engineering by developing an SE-specific lexicon for mining emotional arousal from texts.
5. What the researchers want to achieve by applying the technique(s) (e.g., calculate the sentiment polarity of app reviews)?
Calculate the emotional arousal in issue descriptions and comments. Emotional arousal increases alertness and activation and improves software engineers’ performance. The role of arousal in productivity and burnout in software engineering motivates the authors' interest in developing tools for arousal detection. In this paper, the emotional arousal conveyed in issue descriptions and comments is used as a proxy for issue priority.
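Note: a lexicon-based arousal score for a piece of text can be sketched as below. This is a minimal illustration that assumes a simple mean over matched words; the aggregation actually used in the paper is not stated in this summary, and the lexicon entries and scores are invented rather than taken from SEA or the Warriner lexicon.

```python
import re

# Hypothetical toy lexicon (word -> arousal score). The values are made up
# for illustration and are NOT taken from the SEA or Warriner lexicons.
TOY_LEXICON = {
    "crash": 7.0,
    "urgent": 7.5,
    "blocker": 7.0,
    "minor": 3.0,
    "typo": 2.5,
}

def mean_arousal(text, lexicon):
    """Return the mean arousal of the lexicon words found in `text`.

    Returns None when no word of the text appears in the lexicon.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    scores = [lexicon[t] for t in tokens if t in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)
```

Calling `mean_arousal("Urgent: crash on startup", TOY_LEXICON)` averages the scores of "urgent" and "crash"; text with no lexicon words yields `None` rather than a misleading zero.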
6. Which dataset(s) the technique is applied on?
700,000 issue reports from the Apache Jira issue tracking system.
7. Is/Are the dataset(s) publicly available online? If yes, please indicate their name and links.
- Warriner lexicon: The authors studied the general-purpose lexicon by Warriner et al. [1], which contains arousal scores for roughly 14,000 English words: http://crr.ugent.be/archives/1003
- Jira dataset: From the Warriner lexicon, the authors select seed words potentially denoting high or low arousal in a software engineering context, also checking their frequencies in the data set by Ortu et al. [2], which contains 700,000 issue reports from the Apache Jira issue tracking system. This dataset was not specifically developed for sentiment analysis studies and does not contain any emotion/opinion labels. It is also used for the evaluation, i.e., to check whether the SEA lexicon better captures the emotional arousal in issue descriptions and comments, thus enabling the classification of issue priority.
[1] A. B. Warriner, V. Kuperman, and M. Brysbaert, “Norms of valence, arousal, and dominance for 13,915 English lemmas,” Behavior Research Methods, vol. 45, no. 4, pp. 1191–1207, 2013.
[2] M. Ortu, G. Destefanis, A. Murgia, M. Marchesi, R. Tonelli, and B. Adams, “The JIRA Repository Dataset: Understanding Social Aspects of Software Development,” The 11th International Conference on Predictive Models and Data Analytics in Software Engineering, pp. 1–4, 2015.
8. Is the application context (dataset or application domain) different from that for which the technique was originally designed?
NA
9. Is the performance (precision, recall, run-time, etc.) of the technique verified? If yes, how did they verify it and what are the results?
To evaluate SEA, the authors test the lexicon’s ability to differentiate between issue priorities in the data set by Ortu et al. Based on psychological literature linking urgency and emotional arousal, they assume that higher-priority issues are more urgent and that fixing them yields a higher reward in terms of system quality improvement. Thus, they assume that emotional arousal, as computed from the text of issue descriptions and comments, would help in classifying issue priority. The underlying assumption is that higher priority is associated with elevated emotional arousal.
10. Does the paper replicate the results of previous work? If yes, leave a summary of the findings (confirm/partially confirms/contradicts).
Yes, they compare the results of the correlation analysis between arousal and issue priority with previous results obtained using the Warriner general-purpose lexicon, reported by Mäntylä et al.: M. V. Mäntylä, B. Adams, D. Graziotin, and M. Ortu, “Mining Valence, Arousal, and Dominance – Possibilities for Detecting Burnout and Productivity?” (MSR 2016)
11. What success metrics are used?
Cohen's d between issue priority and arousal in the title, description, all comments, first comment, and last comment of the issue. t-test p-values between issue priorities and arousal in the title, description, all comments, first comment, and last comment of the issue.
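Note: the two metrics can be sketched in Python as follows, for two groups of arousal scores (for example, issues of two different priorities). This is an illustration under assumptions: it uses the pooled-standard-deviation form of Cohen's d and Welch's unequal-variance t statistic, and the summary does not say which variants the authors used. Turning the t statistic into a p-value needs a t distribution, which the standard library lacks; `scipy.stats.ttest_ind` is one option if SciPy is available.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(a, b):
    """Cohen's d for two samples, using the pooled sample standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)

def welch_t(a, b):
    """Welch's t statistic for two samples (no equal-variance assumption)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

# Made-up arousal scores for two hypothetical priority groups.
high_priority = [2.0, 4.0, 6.0]
low_priority = [1.0, 3.0, 5.0]
effect_size = cohens_d(high_priority, low_priority)
t_statistic = welch_t(high_priority, low_priority)
```

Reporting both values is useful because, with very large samples such as 700,000 issues, even tiny differences can reach statistical significance, while Cohen's d indicates how large the difference actually is.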
12. Write down any other comments/notes here.
-