App store mining is not enough for app improvement

1. Does the paper propose a new opinion mining approach?

No

2. Which opinion mining techniques are used (list all of them, clearly stating their name/reference)?

Category classifier: SVM and Naive Bayes
Topic modeling: LDA
Sentiment analysis: Pattern (De Smedt, T., Daelemans, W. (2012). Pattern for Python. Journal of Machine Learning Research, 13, 2031–2035.)

3. Which opinion mining approaches in the paper are publicly available? Write down their name and links. If no approach is publicly available, leave it blank or None.

Pattern: https://github.com/clips/pattern

4. What is the main goal of the whole study?

to study how Twitter can provide complementary information to support mobile app development

5. What the researchers want to achieve by applying the technique(s) (e.g., calculate the sentiment polarity of app reviews)?

Category classifier: to classify reviews and tweets into fine-grained categories (feature requests, bug reports, others).
Topic modeling: to compare the topics extracted from tweets and app store reviews.
Sentiment analysis: to compare both the polarity and the subjectivity of app reviews and app-related tweets.

6. Which dataset(s) the technique is applied on?

30,793 apps with 4,867,870 relevant tweets

7. Is/Are the dataset(s) publicly available online? If yes, please indicate their name and links.

No

8. Is the application context (dataset or application domain) different from that for which the technique was originally designed?

Pattern: it was designed for tweets in general, not specifically for app-related tweets.

9. Is the performance (precision, recall, run-time, etc.) of the technique verified? If yes, how did they verify it and what are the results?

Category classifier: the authors randomly selected 4,500 tweets and 8,300 reviews across different apps, labeled them, and applied 10-fold cross-validation.
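For readers unfamiliar with the procedure, 10-fold cross-validation can be sketched roughly as below. This is only an illustration, not the paper's implementation: the helper names (`k_fold_indices`, `train_majority`, `cross_validate`) are hypothetical, and a majority-label baseline stands in for the SVM and Naive Bayes classifiers so that the sketch needs nothing beyond the Python standard library.

```python
import random
from collections import Counter


def k_fold_indices(n_items, k=10, seed=0):
    """Shuffle item indices and deal them into k nearly equal folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_majority(train_labels):
    """Hypothetical stand-in model: remember the most common training label."""
    return Counter(train_labels).most_common(1)[0][0]


def cross_validate(labels, k=10, seed=0):
    """Return one accuracy score per fold for the stand-in model."""
    scores = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held_out_set = set(held_out)
        # Train on every item that is not in the held-out fold.
        train_labels = [lab for i, lab in enumerate(labels) if i not in held_out_set]
        predicted = train_majority(train_labels)
        # Evaluate on the held-out fold only.
        correct = sum(1 for i in held_out if labels[i] == predicted)
        scores.append(correct / len(held_out))
    return scores
```

Each labeled item is held out exactly once, so averaging the per-fold scores gives an estimate based on the whole labeled sample. A real replication would replace `train_majority` with a text classifier trained on the review or tweet content.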

10. Does the paper replicate the results of previous work? If yes, leave a summary of the findings (confirm/partially confirms/contradicts).

No

11. What success metrics are used?

Category classifier: precision and recall
Topic modeling: human judgment
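As a reminder of what the classifier metrics measure, per-category precision and recall can be computed from gold and predicted labels roughly as follows. The function name `precision_recall` and the convention of returning 0.0 when a denominator is zero are assumptions made for this sketch, not details taken from the paper.

```python
def precision_recall(gold, predicted, category):
    """Return (precision, recall) for one category.

    gold and predicted are equal-length sequences of labels.
    A zero denominator yields 0.0 (an assumed convention).
    """
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g == category and p == category)
    false_pos = sum(1 for g, p in pairs if g != category and p == category)
    false_neg = sum(1 for g, p in pairs if g == category and p != category)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall
```

Precision is the share of items assigned to a category that truly belong to it; recall is the share of items truly in the category that the classifier found.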

12. Write down any other comments/notes here.

-