Evaluating quality-in-use of FLOSS through analyzing user reviews

1. Does the paper propose a new opinion mining approach?

No

2. Which opinion mining techniques are used (list all of them, clearly stating their name/reference)?

Ad hoc approach: an improved version of the recursive neural tensor network (RNTN) proposed by Socher et al. (2013), used to analyze the sentiment strengths of aspects. The network is trained on the training set provided by Socher et al. (2013). Socher's method embeds each word in a sentence and assigns it a score; the sentiment strength of the whole sentence is read at the root node. The proposed improved method consists of two steps: (1) find all subjects, predicates, and objects in a sentence; (2) split them into different parts and find the root node of each part. If a part contains an aspect, the score at its root node is taken as that aspect's sentiment strength for the review. Ref.: R. Socher, A. Perelygin, J. Y. Wu, J. Chuang, C. D. Manning, A. Y. Ng, et al., "Recursive deep models for semantic compositionality over a sentiment treebank", Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1631-1642, 2013.
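
The aspect-level lookup in step (2) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Part` class, the `aspect_sentiment` function, and the numeric scores are hypothetical, and the sketch assumes the sentence has already been split into parts, each with a root-node score produced by some sentiment model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Part:
    """One part of a sentence (hypothetical structure for illustration)."""
    tokens: List[str]
    root_score: float  # assumed: sentiment score at this part's root node


def aspect_sentiment(parts: List[Part], aspects: List[str]) -> Dict[str, float]:
    """Map each aspect to the root score of the part that mentions it.

    Aspects that occur in no part are left out of the result. If an
    aspect occurs in several parts, the last matching part wins.
    """
    result: Dict[str, float] = {}
    for part in parts:
        words = {token.lower() for token in part.tokens}
        for aspect in aspects:
            if aspect.lower() in words:
                result[aspect] = part.root_score
    return result


# Made-up parts and scores, only to show the shape of the data.
parts = [
    Part(["the", "interface", "is", "great"], 0.9),
    Part(["startup", "is", "slow"], 0.2),
]
scores = aspect_sentiment(parts, ["interface", "startup", "price"])
```

Here "price" is absent from `scores` because no part mentions it.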

3. Which opinion mining approaches in the paper are publicly available? Write down their name and links. If no approach is publicly available, leave it blank or None.

See above

4. What is the main goal of the whole study?

Assessment of Quality-in-use (QU), i.e. evaluating the software quality from the user perspective for FLOSS projects.

5. What do the researchers want to achieve by applying the technique(s) (e.g., calculate the sentiment polarity of app reviews)?

Step 1: We use a natural language processing method to process user reviews and classify them into two categories: informative and noninformative reviews. We then apply a topic model to the informative reviews to extract the topics users are interested in, transform the topics into characteristics of a QU model, and calculate the weight of each characteristic.
Step 2: We extract aspects from reviews and analyze the sentiment strength of these aspects.
Step 3: We match aspects with characteristics and evaluate the QU of FLOSS using the aspect sentiment strengths and their characteristic weights. A Wilson interval is also used to penalize FLOSS projects with insufficient reviews.
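
A minimal Python sketch of the Step 3 idea follows. It is an illustration under assumptions, not the paper's formula: the extraction does not record how the Wilson interval or the weights are applied, so the sketch assumes that reviews are reduced to positive/non-positive counts and that QU is a weighted average of per-characteristic scores. The function names and the example numbers are hypothetical.

```python
import math
from typing import Dict


def wilson_lower_bound(positive: int, total: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a proportion.

    With few reviews the bound sits well below the raw proportion,
    which penalizes projects that have insufficient reviews.
    z = 1.96 corresponds to a 95% confidence level.
    """
    if total == 0:
        return 0.0
    p = positive / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom


def weighted_qu(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-characteristic scores (assumed aggregation)."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Same 90% positive rate, different review counts (made-up numbers).
few = wilson_lower_bound(9, 10)
many = wilson_lower_bound(90, 100)

qu = weighted_qu(
    {"usability": 0.8, "efficiency": 0.4},
    {"usability": 3.0, "efficiency": 1.0},
)
```

Both projects have the same raw proportion of positive reviews, but `few` is lower than `many`, so the project with only ten reviews is ranked more conservatively.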

6. Which dataset(s) is the technique applied on?

FLOSS projects' reviews

7. Is/Are the dataset(s) publicly available online? If yes, please indicate their name and links.

no

8. Is the application context (dataset or application domain) different from that for which the technique was originally designed?

It seems so, but implementation details are not provided.

9. Is the performance (precision, recall, run-time, etc.) of the technique verified? If yes, how did they verify it and what are the results?

no

10. Does the paper replicate the results of previous work? If yes, leave a summary of the findings (confirm/partially confirms/contradicts).

no

11. What success metrics are used?

---

12. Write down any other comments/notes here.

-