On using machine learning to identify knowledge in API reference documentation

1. Does the paper propose a new opinion mining approach?

No

2. Which opinion mining techniques are used (list all of them, clearly stating their name/reference)?

k-NN, SVM, and an RNN with an LSTM layer

3. Which opinion mining approaches in the paper are publicly available? Write down their name and links. If no approach is publicly available, leave it blank or None.

None

4. What is the main goal of the whole study?

To study how well simple machine learning for text classification, without additional feature engineering or advanced natural language processing (NLP) techniques, can identify 12 knowledge types in API reference documentation.

5. What do the researchers want to achieve by applying the technique(s) (e.g., calculate the sentiment polarity of app reviews)?

To classify texts into 12 different knowledge types.
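To make the classification task concrete, here is a minimal sketch of multi-label text classification with k-NN over plain bag-of-words vectors. It is not the paper's implementation: the training sentences, the two label names, the cosine similarity measure, and the majority-vote threshold are all assumptions chosen for illustration.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Lower-case, whitespace-tokenised term counts (no feature engineering)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def knn_multilabel(train, text, k=3):
    """Return every label carried by more than half of the k nearest examples.

    `train` is a list of (text, set_of_labels) pairs.
    """
    query = bag_of_words(text)
    neighbours = sorted(
        train, key=lambda ex: cosine(query, bag_of_words(ex[0])), reverse=True
    )[:k]
    votes = Counter(label for _, labels in neighbours for label in labels)
    return {label for label, n in votes.items() if n > len(neighbours) / 2}


# Hypothetical toy data; the label names are illustrative only.
TRAIN = [
    ("returns the length of the list", {"functionality"}),
    ("returns the number of items in the list", {"functionality"}),
    ("you must close the file after use", {"directive"}),
    ("returns the file size and you must close the handle",
     {"functionality", "directive"}),
    ("do not call this method after close", {"directive"}),
]

predicted = knn_multilabel(TRAIN, "returns the size of the list", k=3)
print(predicted)
```

Because a text can receive several labels at once, the function returns a set rather than a single class, which mirrors the multi-label setting mentioned in the answer to question 9.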

6. Which dataset(s) is the technique applied to?

The CaDO dataset created by Maalej and Robillard (https://cado.informatik.uni-hamburg.de) and a new Python dataset consisting of 100 API documentation pages.

7. Is/Are the dataset(s) publicly available online? If yes, please indicate their name and links.

Yes: https://doi.org/10.5281/zenodo.3265783

8. Is the application context (dataset or application domain) different from that for which the technique was originally designed?

Retrained: the classifiers were trained on the study's own datasets.

9. Is the performance (precision, recall, run-time, etc.) of the technique verified? If yes, how did they verify it and what are the results?

Yes, using AUPRC (area under the precision-recall curve). The RNN identifies eight of the 12 knowledge types more accurately than traditional machine learning. When multiple knowledge types are considered at once (i.e., multi-label classification), the RNN also outperforms the traditional machine learning approaches.
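For readers unfamiliar with AUPRC, the sketch below computes average precision, a common step-wise estimate of the area under the precision-recall curve, separately for each knowledge type. The paper's exact computation is not stated here, so the estimator, the label names `type_a` and `type_b`, and the scores are assumptions for illustration only.

```python
def average_precision(y_true, scores):
    """Average precision: mean of precision@rank taken at each positive item.

    `y_true` holds 0/1 labels, `scores` the classifier's confidence per item.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank items from most to least confident.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            ap += hits / rank  # precision at this rank
    return ap / total_pos


# Hypothetical per-type ground truth and scores for four documentation units.
truth = {"type_a": [1, 0, 1, 0], "type_b": [0, 1, 0, 0]}
scores = {"type_a": [0.9, 0.8, 0.7, 0.1], "type_b": [0.2, 0.6, 0.1, 0.3]}

per_type = {t: average_precision(truth[t], scores[t]) for t in truth}
for knowledge_type, ap in per_type.items():
    print(f"{knowledge_type}: {ap:.3f}")
```

Reporting one value per knowledge type is what allows a statement such as "the RNN is more accurate on eight types": each type is scored independently and the two classifiers are compared type by type.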

10. Does the paper replicate the results of previous work? If yes, leave a summary of the findings (confirm/partially confirms/contradicts).

No

11. What success metrics are used?

N/A

12. Write down any other comments/notes here.

-