Recommending new features from mobile app descriptions
1. Does the paper propose a new opinion mining approach?
Yes
2. Which opinion mining techniques are used (list all of them, clearly stating their name/reference)?
Similar App-based FEature Recommender (SAFER), which builds on an App Feature Extractor (AFE) component; the underlying techniques are linguistic rules, a Naïve Bayes classifier, and Latent Dirichlet Allocation (LDA) (see question 5).
3. Which opinion mining approaches in the paper are publicly available? Write down their name and links. If no approach is publicly available, leave it blank or None.
SAFER: http://oscar-lab.org/SAFER/
4. What is the main goal of the whole study?
to recommend features for new apps
5. What the researchers want to achieve by applying the technique(s) (e.g., calculate the sentiment polarity of app reviews)?
App Feature Extractor (AFE): uses linguistic rules and a Naïve Bayes classifier to automatically extract feature-describing sentences from app descriptions.
Similar App-based FEature Recommender (SAFER): uses Latent Dirichlet Allocation (LDA) to obtain the topic distribution of each app profile and identify similar apps, whose features are then recommended (see the sketch below).
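As a rough illustration only (not the authors' implementation), the sketch below shows how such a pipeline could be wired up, assuming scikit-learn's MultinomialNB for the AFE sentence classifier and gensim's LdaModel plus cosine similarity over topic distributions for the SAFER similar-app step; all data, names, and parameters are invented for the example.

```python
# Illustrative sketch only -- not the authors' code. Assumes scikit-learn and gensim.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from gensim import corpora, models
from gensim.matutils import cossim

# --- AFE-style step: classify sentences as feature-describing or not ---
train_sentences = ["share photos with friends", "copyright 2015 acme inc"]  # toy data
train_labels = [1, 0]  # 1 = describes a feature, 0 = does not

vectorizer = CountVectorizer()
clf = MultinomialNB().fit(vectorizer.fit_transform(train_sentences), train_labels)

def extract_features(description_sentences):
    """Keep only the sentences the classifier labels as feature-describing."""
    X = vectorizer.transform(description_sentences)
    return [s for s, y in zip(description_sentences, clf.predict(X)) if y == 1]

print(extract_features(["share your photos", "copyright notice"]))  # -> ['share your photos']

# --- SAFER-style step: LDA topic distributions to find similar apps ---
app_profiles = {
    "app_a": ["share photos", "apply filters", "sync to cloud"],
    "app_b": ["edit photos", "apply filters", "crop images"],
    "app_c": ["track running distance", "count calories"],
}
texts = [[w for f in feats for w in f.split()] for feats in app_profiles.values()]
dictionary = corpora.Dictionary(texts)
corpus = [dictionary.doc2bow(t) for t in texts]
lda = models.LdaModel(corpus, num_topics=2, id2word=dictionary, random_state=0)

def topic_vector(bow):
    """Full topic distribution of one app profile."""
    return lda.get_document_topics(bow, minimum_probability=0.0)

names = list(app_profiles)
target_vec = topic_vector(corpus[0])  # topic distribution of the target app
# Rank the other apps by topic-distribution similarity to the target app.
similar = sorted(
    ((cossim(target_vec, topic_vector(corpus[i])), names[i]) for i in range(1, len(names))),
    reverse=True,
)
print(similar)  # features of the most similar apps become candidate recommendations
```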
6. Which dataset(s) the technique is applied on?
8,359 apps from Google Play
7. Is/Are the dataset(s) publicly available online? If yes, please indicate their name and links.
Yes, at http://oscar-lab.org/SAFER/: (1) the training datasets for the feature classifier in AFE (for classifying whether a sentence describes a feature or not); (2) the Annotated Feature Dataset (AFD) for apps: 20 apps randomly selected from every category, with volunteers annotating their golden features (533 golden features in total).
8. Is the application context (dataset or application domain) different from that for which the technique was originally designed?
No
9. Is the performance (precision, recall, run-time, etc.) of the technique verified? If yes, how did they verify it and what are the results?
The elimination-recovery method: for each golden feature f in the golden feature set F of app A, f is eliminated from F and a feature recommender takes all remaining features F - {f} as input. After the recommender returns a ranked list of features, three volunteers check whether the eliminated feature f is hit in the ranked list; f is said to be hit when two or three of the volunteers find it to be described by one of the features in the ranked list (a sketch of this protocol follows below).
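The sketch below outlines this leave-one-out loop. In the study the hit judgement is made by the three volunteers, so it appears here only as a placeholder callable, and the top-k cut-off is an assumed parameter, not a value taken from the paper.

```python
# Sketch of the elimination-recovery protocol; the hit judgement is a placeholder.
def elimination_recovery(golden_features, recommend, is_hit, top_k=10):
    """golden_features: golden feature set F of one app.
    recommend(features) -> ranked list of recommended features.
    is_hit(eliminated, ranked) -> True when at least two of three volunteers
    judge the eliminated feature to be described by a feature in the list."""
    hits = 0
    for f in golden_features:
        remaining = [g for g in golden_features if g != f]  # F - {f}
        ranked = recommend(remaining)[:top_k]
        if is_hit(f, ranked):
            hits += 1
    return hits / len(golden_features)  # per-app hit ratio
```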
10. Does the paper replicate the results of previous work? If yes, leave a summary of the findings (confirm/partially confirms/contradicts).
No
11. What success metrics are used?
Hit Ratio and Normalized Discounted Cumulative Gain (NDCG) to evaluate the performance of feature recommendation; Precision, Recall, F-Measure, and Accuracy to evaluate the performance of AFE (see the metric sketch below).
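For reference, a minimal sketch of the two recommendation metrics, with illustrative inputs (the actual relevance judgements come from the volunteer annotation described above):

```python
import math

def hit_ratio(num_hits, num_eliminations):
    """Fraction of eliminated golden features recovered in the ranked lists."""
    return num_hits / num_eliminations

def ndcg_at_k(relevances, k):
    """relevances: relevance scores of the recommended features in ranked order
    (e.g. 1 if judged a hit, 0 otherwise). DCG is normalized by the DCG of the
    ideal ordering of the same scores."""
    rel = relevances[:k]
    dcg = sum(r / math.log2(i + 2) for i, r in enumerate(rel))
    ideal = sorted(rel, reverse=True)
    idcg = sum(r / math.log2(i + 2) for i, r in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

# Example: the only relevant feature appears at rank 3 of a top-5 list.
print(hit_ratio(1, 1))                 # 1.0
print(ndcg_at_k([0, 0, 1, 0, 0], 5))   # 1 / log2(4) = 0.5
```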
12. Write down any other comments/notes here.
-