Using frame semantics for classifying and summarizing application store reviews
1. Does the paper propose a new opinion mining approach?
Yes
2. Which opinion mining techniques are used (list all of them, clearly stating their name/reference)?
MARC-2.0 (the authors' own implementation); Support Vector Machines (SVM) and Naive Bayes (NB); Hybrid TF and Hybrid TF.IDF (Inouye and Kalita 2011); SumBasic (Nenkova and Vanderwende 2005); and LexRank (Erkan and Radev 2004)
3. Which opinion mining approaches in the paper are publicly available? Write down their name and links. If no approach is publicly available, leave it blank or None.
MARC-2.0: https://github.com/seelprojects/MARC-2.0
4. What is the main goal of the whole study?
To investigate the performance of semantic frames in classifying informative user reviews into various categories of actionable software maintenance requests, and to evaluate the performance of multiple summarization algorithms in generating concise and representative summaries of informative reviews
5. What the researchers want to achieve by applying the technique(s) (e.g., calculate the sentiment polarity of app reviews)?
SVM and NB: to classify reviews into bug reports and feature requests. Hybrid TF, Hybrid TF.IDF, SumBasic, and LexRank: to summarize reviews.
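For illustration only, a minimal sketch of training SVM and NB classifiers to separate bug reports from feature requests, assuming scikit-learn and hypothetical toy data (this is not the authors' MARC-2.0 implementation):

```python
# Illustrative sketch only: toy data, not the paper's MARC-2.0 code.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

reviews = ["the app crashes when I open the camera", "please add a dark mode"]
labels = ["bug_report", "feature_request"]

svm_clf = make_pipeline(TfidfVectorizer(), LinearSVC())     # SVM classifier
nb_clf = make_pipeline(TfidfVectorizer(), MultinomialNB())  # Naive Bayes classifier

svm_clf.fit(reviews, labels)
nb_clf.fit(reviews, labels)

print(svm_clf.predict(["the app freezes on startup"]))  # predicted category
```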
6. Which dataset(s) the technique is applied on?
2912 reviews from 3 different sources
7. Is/Are the dataset(s) publicly available online? If yes, please indicate their name and links.
http://seel.cse.lsu.edu/data/emse18.zip
8. Is the application context (dataset or application domain) different from that for which the technique was originally designed?
The classifiers were retrained on the review data, and the summarization approaches are unsupervised.
9. Is the performance (precision, recall, run-time, etc.) of the technique verified? If yes, how did they verify it and what are the results?
For classification: 10-fold cross-validation. For summarization: comparing the automatically generated summaries against human-generated (ground-truth) summaries.
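A minimal sketch of a 10-fold cross-validation setup of the kind described above, assuming scikit-learn; the toy data merely stands in for the study's labeled reviews:

```python
# Illustrative sketch of 10-fold cross-validation over a review classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical stand-in data (the study uses its own labeled review set).
reviews = [f"app crashes on screen {i}" for i in range(10)] + \
          [f"please add feature {i}" for i in range(10)]
labels = ["bug_report"] * 10 + ["feature_request"] * 10

clf = make_pipeline(TfidfVectorizer(), LinearSVC())
scores = cross_validate(clf, reviews, labels, cv=10,
                        scoring=["precision_macro", "recall_macro", "f1_macro"])
print(scores["test_f1_macro"].mean())  # average F1 across the 10 folds
```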
10. Does the paper replicate the results of previous work? If yes, leave a summary of the findings (confirm/partially confirms/contradicts).
No
11. What success metrics are used?
For classification: precision, recall, and F1. For summarization: average term overlap between the human-generated (reference) summaries and the various automatically generated summaries. Both were evaluated on bag-of-words (BOW) and bag-of-frames (BOF) representations.
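A minimal sketch of the term-overlap idea, assuming a simple set-based overlap with whitespace tokenization (the paper's exact tokenization and normalization may differ) and hypothetical summary pairs:

```python
# Illustrative sketch of average term overlap between generated and reference summaries.
def term_overlap(generated: str, reference: str) -> float:
    gen_terms = set(generated.lower().split())
    ref_terms = set(reference.lower().split())
    return len(gen_terms & ref_terms) / len(ref_terms) if ref_terms else 0.0

# Hypothetical (generated, reference) summary pairs.
pairs = [
    ("app crashes on login", "the app crashes during login"),
    ("add dark mode option", "users request a dark mode"),
]
avg_overlap = sum(term_overlap(g, r) for g, r in pairs) / len(pairs)
print(f"average term overlap: {avg_overlap:.2f}")
```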
12. Write down any other comments/notes here.
-