Published June 1, 2021 | Version v1
Other
Restricted
Supplementary Material for AntiCopyPaster
- 1. JetBrains Research
- 2. Higher School of Economics
- 3. JetBrains Research, Higher School of Economics
Description
This supplementary material contains additional information for the paper "AntiCopyPaster: Extracting Code Duplicates As Soon As They Are Introduced in the IDE".
- metrics.pdf lists and describes 117 metrics that were used as vector features in the development of AntiCopyPaster.
- dataset.pdf lists the projects used for gathering the code fragments to train the classifiers in the development of AntiCopyPaster. The file describes how both positive and negative examples were gathered and shows how many fragments were gathered from each project.
- models.pdf reports the results of comparing different models on the collected dataset. For the best model, Random Forest, it also includes additional experiments in different settings.