Analyzing, Classifying, and Interpreting Emotions in Software Users' Tweets

1. Does the paper propose a new opinion mining approach?

Yes

2. Which opinion mining techniques are used (list all of them, clearly stating their name/reference)?

Naive Bayes (NB) and Support Vector Machines (SVM), trained on the built dataset using the tweets' bag-of-words representations as features. SentiStrength is used as a baseline for comparison.
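A minimal sketch (not the authors' implementation) of the setup described above: bag-of-words features feeding Naive Bayes and SVM classifiers, here via scikit-learn. The example tweets and labels are fabricated for illustration.

```python
# Bag-of-words features + NB and SVM classifiers, as in the paper's setup.
# The tweets/labels below are toy examples, not the study's dataset.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC

tweets = [
    "love the new update, works great",
    "app keeps crashing after the update",
    "finally fixed, thanks for the patch",
    "worst release ever, uninstalling",
]
labels = ["positive", "negative", "positive", "negative"]

vectorizer = CountVectorizer()        # bag-of-words term counts
X = vectorizer.fit_transform(tweets)

nb = MultinomialNB().fit(X, labels)   # Naive Bayes
svm = LinearSVC().fit(X, labels)      # Support Vector Machine

print(nb.predict(vectorizer.transform(["the update is great"]))[0])
```

In the actual study the same pipeline is trained on the 1000 annotated tweets rather than toy data.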

3. Which opinion mining approaches in the paper are publicly available? Write down their name and links. If no approach is publicly available, leave it blank or None.

SentiStrength: http://sentistrength.wlv.ac.uk

4. What is the main goal of the whole study?

The study aims to detect, classify, and interpret emotions in software users' tweets. Its objectives are to 1) identify the most effective techniques for detecting emotions and collective mood states in software-relevant tweets, and 2) investigate how such emotions correlate with specific software-related events.

5. What the researchers want to achieve by applying the technique(s) (e.g., calculate the sentiment polarity of app reviews)?

Compute the sentiment polarity of software-related tweets, and classify the emotions they express (Frustration, Dissatisfaction, Bug report, Satisfaction, Anticipation, Excitement).

6. Which dataset(s) the technique is applied on?

A dataset of 1000 tweets sampled from the Twitter feeds of a broad range of software systems. The tweets were manually annotated by two human annotators.

7. Is/Are the dataset(s) publicly available online? If yes, please indicate their name and links.

1000-software-related-tweets: http://seel.cse.lsu.edu/data/semotion17.zip

8. Is the application context (dataset or application domain) different from that for which the technique was originally designed?

Yes for SentiStrength. No for the ML approaches.

9. Is the performance (precision, recall, run-time, etc.) of the technique verified? If yes, how did they verify it and what are the results?

Yes, through ten-fold cross-validation on the built dataset. For binary sentiment polarity (positive/negative), the customized techniques outperform SentiStrength in precision, recall, and F-measure on both classes. SentiStrength achieves F-measures of 74% and 69% for the positive and negative classes, respectively; NB achieves 81% and 77%, and SVM 78% and 70%. For the classification of emotion type, performance ranges from 64% to 87% depending on the emotion.
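A hedged sketch of the evaluation protocol described above: ten-fold cross-validation with per-class precision, recall, and F-measure. The 20 toy tweets below are fabricated placeholders; the study uses its 1000 manually annotated tweets.

```python
# Ten-fold cross-validated precision/recall/F-measure for a bag-of-words
# Naive Bayes classifier, mirroring the paper's evaluation (toy data only).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import precision_recall_fscore_support

pos = ["great app", "love this update", "works perfectly", "awesome release",
       "very satisfied", "smooth and fast", "excellent support", "nice feature",
       "really helpful", "best tool ever"]
neg = ["keeps crashing", "terrible update", "full of bugs", "so slow now",
       "awful interface", "hate the redesign", "constant errors", "waste of time",
       "login is broken", "worst version yet"]
tweets = pos + neg
labels = ["positive"] * 10 + ["negative"] * 10

X = CountVectorizer().fit_transform(tweets)
pred = cross_val_predict(MultinomialNB(), X, labels, cv=10)  # ten folds

p, r, f, _ = precision_recall_fscore_support(
    labels, pred, labels=["positive", "negative"], zero_division=0)
for cls, scores in zip(["positive", "negative"], zip(p, r, f)):
    print(cls, "P/R/F:", [round(s, 2) for s in scores])
```

The same per-class breakdown applies when the labels are the six emotion categories instead of binary polarity.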

10. Does the paper replicate the results of previous work? If yes, leave a summary of the findings (confirm/partially confirms/contradicts).

No

11. What success metrics are used?

Precision, recall, and F-measure, reported per polarity class and per emotion category.

12. Write down any other comments/notes here.

-