Published March 16, 2022 | Version 1.0
Software Open

CodeBERT for Code Clone Detection: A Replication Study

  • 1. LUMS
  • 2. Singapore Management University

Description

This replication pack accompanies our paper "CodeBERT for Code Clone Detection: A Replication Study". It contains three datasets and the Python scripts needed to replicate the results of our CodeBERT clone detection experiments on them.

Folder Structure 

Dataset: the dataset folder contains all three datasets (BigCloneBench, AndroidDataset, and SemanticCloneBench).

CodeBERT (pre-trained): the CodeBERT_Replication_Pack/code_bert_classifier folder contains the Python notebooks used for the experiments with the pre-trained CodeBERT model on all three datasets. The pre-trained model weights are also in this folder.

CodeBERT (fine-tuned): the CodeBERT_Replication_Pack/code_bert_finetune folder contains the Python notebooks for fine-tuning CodeBERT and for the experiments with the fine-tuned model on two of the datasets (AndroidDataset and SemanticCloneBench).

Predictions: the predictions folder contains all outputs (predictions) produced by CodeBERT.

 

Files (575.6 MB)

CodeBERT for Code Clone Detection: A Replication Study.zip