Published June 1, 2021 | Version v1
Other | Restricted

Supplementary Material for AntiCopyPaster

  • 1. JetBrains Research
  • 2. Higher School of Economics
  • 3. JetBrains Research, Higher School of Economics

Description

This supplementary material contains additional information for the paper "AntiCopyPaster: Extracting Code Duplicates As Soon As They Are Introduced in the IDE".

  1. metrics.pdf lists and describes the 117 metrics that make up the feature vector used in the development of AntiCopyPaster.
  2. dataset.pdf lists the projects from which code fragments were gathered to train the classifiers in the development of AntiCopyPaster. The file describes how both positive and negative examples were gathered and shows how many fragments came from each project.
  3. models.pdf reports the results of comparing different models on the collected dataset. For the best model, Random Forest, it also reports additional experiments in different settings.

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.