Published March 1, 2022 | Version v1
Dataset Open

Artifacts of the paper under review by TSE

Authors/Creators

  • 1. Anonymous

Description

This is the online repository of *Predictive Comment Updating with Heuristics and AST-Path-Based Neural Learning: A Two-Phase Approach*, a research paper under review at TSE. We release the source code and relevant data of Toper, the data used in our evaluation, and the experimental results.

  • Dataset

The dataset originates from Liu et al.'s ASE20 paper (i.e., Automating Just-In-Time Comment Updating) and was subsequently cleaned in Lin et al.'s ICPC21 paper (i.e., Automated Comment Update: How Far are We?). We classify its items into code-indicative and non-code-indicative ones and store them in the Data directory, in files named in the format [data category]_Items_[Dataset].jsonl. For example, All_Item_Test.jsonl contains all items in the test set (i.e., both code-indicative and non-code-indicative), whereas NCIU_Items_Test.json contains only the non-code-indicative items in the test set.
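As a minimal sketch of how these files could be read, the following Python snippet loads a JSONL file line by line. It assumes only that each non-empty line holds one JSON object; the helper names items_file and read_jsonl are hypothetical and not part of the released code, and the fields inside each item are not specified here, so inspect them before relying on any key. Check the actual file names in the Data directory before building paths from the pattern.

```python
import json
from pathlib import Path


def items_file(category, split, data_dir="Data"):
    """Build a path following the [data category]_Items_[Dataset].jsonl pattern.

    Hypothetical helper: verify the result against the real file names.
    """
    return Path(data_dir) / f"{category}_Items_{split}.jsonl"


def read_jsonl(path):
    """Return one parsed JSON object per non-empty line of the file."""
    items = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                items.append(json.loads(line))
    return items
```

For instance, read_jsonl(items_file("NCIU", "Test")) would return the list of non-code-indicative test items, provided a file with that exact name exists in Data.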

  • The Code-Indicative Update Classifier

We design a classifier to distinguish code-indicative updates from non-code-indicative ones. The replication package is available at Code/TypeClassifier.py. To obtain the classifier's results, run the following command:

python3 TypeClassifier.py -trainingFilePath FeaturesForClassifier/featuresForTrain.csv -testFilePath FeaturesForClassifier/featuresForTest.csv
  • Operation Path Extractor

The customized tool for extracting operation paths from the dataset is provided by previous studies. To obtain the preprocessed data, run the following command:

java -cp OperationPathExtractor.jar Extractor.App --data_dir path/to/data --input_name semi-finished/data/path --output_name path/to/store/data --num_threads 1

The preprocessed data are stored in Data/Preprocessed.
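If the extractor has to be run for several inputs, the invocation can be scripted. The sketch below only assembles the java command quoted above into an argument list and passes it to subprocess; the function names build_extractor_command and run_extractor are hypothetical, and the jar is assumed to be in the working directory. The flags are exactly those shown in the command above; no others are assumed.

```python
import subprocess


def build_extractor_command(data_dir, input_name, output_name,
                            num_threads=1, jar="OperationPathExtractor.jar"):
    """Assemble the extractor invocation as an argument list.

    Passing a list (rather than one string) avoids shell quoting issues.
    """
    return [
        "java", "-cp", jar, "Extractor.App",
        "--data_dir", str(data_dir),
        "--input_name", str(input_name),
        "--output_name", str(output_name),
        "--num_threads", str(num_threads),
    ]


def run_extractor(*args, **kwargs):
    """Run the extractor; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_extractor_command(*args, **kwargs), check=True)
```

Calling run_extractor("path/to/data", "semi-finished/data/path", "path/to/store/data") would be equivalent to the command line above, with the placeholder paths replaced by real ones.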
  • The Non-Code-Indicative Comment Updater

Our replication code is available in the Code directory, and detailed instructions for the command-line options are given in comment_update.py. Alternatively, you can simply execute the following command:

python3 comment_update.py -data_path path/to/data -gpu -use_features

 

Files

Code.zip

Files (196.1 MB)

Size  Checksum
487.9 kB  md5:94f2b30273942236959a7577961d040a
173.9 MB  md5:f1034526b33af30e79f339313a5cca78
21.7 MB  md5:e6faa0d30379488cf42749f2e41f9d11
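The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard hashlib module; the helper names md5_of_file and matches_checksum are hypothetical, and the mapping from each checksum to its file name should be taken from the record page, since it is not reproduced here.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's digest with a listed value; a 'md5:' prefix is tolerated."""
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, matches_checksum("Code.zip", "md5:...") with the checksum listed for that archive should return True for an uncorrupted download.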