What would users change in my app? Summarizing app reviews for recommending software changes
1. Does the paper propose a new opinion mining approach?
Yes
2. Which opinion mining techniques are used (list all of them, clearly stating their name/reference)?
Intent Classifier: S. Panichella, A. Di Sorbo, E. Guzman, C. Visaggio, G. Canfora, and H. Gall. How can I improve my app? classifying user reviews for software maintenance and evolution. In Software Maintenance and Evolution (ICSME), 2015 IEEE International Conference on, pages 281–290, Sept 2015.
Topics Classification
3. Which opinion mining approaches in the paper are publicly available? Write down their name and links. If no approach is publicly available, leave it blank or None.
Intent Classifier: https://www.ifi.uzh.ch/en/seal/people/panichella/tools/ARdoc.html
SURF: https://www.ifi.uzh.ch/en/seal/people/panichella/tools/SURFTool.html
4. What is the main goal of the whole study?
To summarize thousands of reviews and generate an interactive, structured, and condensed agenda of recommended software changes
5. What do the researchers want to achieve by applying the technique(s) (e.g., calculate the sentiment polarity of app reviews)?
Intent Classifier: detecting sentences in user reviews that are important from a maintenance perspective
Topics Classification: extracting the topics
6. Which dataset(s) is the technique applied to?
3439 reviews from 17 different apps
7. Is/Are the dataset(s) publicly available online? If yes, please indicate their name and links.
No
8. Is the application context (dataset or application domain) different from that for which the technique was originally designed?
No
9. Is the performance (precision, recall, run-time, etc.) of the technique verified? If yes, how did they verify it and what are the results?
Intent Classifier: verified in the original paper
Topics Classification: evaluated by comparing the labels assigned by the NLP classifier with the labels assigned in the human oracle; achieved a global recall of 0.79, a global precision of 0.73, and a global F-measure of 0.76
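As a quick consistency check on the scores reported in the answer above, the sketch below recomputes the F-measure from the stated precision and recall. It assumes the reported "global F-measure" is the balanced F1 (the harmonic mean of precision and recall), which the answer does not state explicitly. The function name `f_measure` and its `beta` parameter are illustrative, not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both scores are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Global scores reported for the Topics Classification step.
reported_precision = 0.73
reported_recall = 0.79

# Assumption: the reported F-measure is the balanced F1 score.
f1 = f_measure(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}")
```

Under that assumption, the recomputed value rounds to the reported 0.76, so the three figures are mutually consistent.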
10. Does the paper replicate the results of previous work? If yes, leave a summary of the findings (confirm/partially confirms/contradicts).
No
11. What success metrics are used?
a survey to evaluate: (i) the suitability and robustness, (ii) the practical usefulness, and (iii) the quality of the summaries
12. Write down any other comments/notes here.
-