The Artifact of the ESEC/FSE 2023 Paper Titled "Natural Language to Code: How Far are We?"
Description
In this online repository, we release the source code of each selected technique as well as the experiment results of each technique (stored in the Results.zip file). For each technique, we also provide our scripts to fine-tune it on the CodeSearchNet-Python dataset. For example, finetune.sh and inference.sh are used to fine-tune and evaluate CodeBERT, respectively, and they are located under "CodeBERT/CodeBERT".
Our evaluation dataset, CodeSearchNet, is a well-known benchmark and can be downloaded from its official webpage.
The code used to calculate the evaluation metrics is reused from CodeBLEU.
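For orientation, CodeBLEU as defined by its authors is a weighted sum of four component scores: n-gram match, weighted n-gram match, syntactic (AST) match, and data-flow match. The sketch below only illustrates that final combination step. The function name `combine_codebleu`, the equal weights, and the example component values are assumptions made for illustration; they are not taken from this artifact, and the component scores themselves would come from the reused CodeBLEU code.

```python
def combine_codebleu(ngram, weighted_ngram, syntax_match, dataflow_match,
                     weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four CodeBLEU components (each expected in [0, 1]).

    Hypothetical helper: equal weights are an assumption, not a setting
    confirmed by this artifact.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights should sum to 1")
    components = (ngram, weighted_ngram, syntax_match, dataflow_match)
    return sum(w * c for w, c in zip(weights, components))


# Made-up component scores, for illustration only.
score = combine_codebleu(0.10, 0.15, 0.40, 0.35)
print(f"CodeBLEU = {score:.1%}")
```

Because the syntactic and data-flow components are scored separately from exact token overlap, an output that does not parse can still receive a non-zero combined score.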
Below is a piece of code generated by CodeT5. In this case, CodeT5 generates the same statement repeatedly, which leads to a syntax error. Even so, the code itself implements part of the intended functionality, which is why it still achieves a CodeBLEU score of 24.9%.
def makeMimiLocal(filename):
    try:
        with open(filename, 'rb') as f:
            data = f.read()
    except IOError:
        data = b''
    data = data.decode('utf-8')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\x00', b'\x00')
    data = data.replace(b'\
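Degenerate outputs like the one above can be flagged automatically. The following is a minimal sketch, not part of the released scripts, that uses Python's standard `ast` module to test whether a generated snippet parses and counts how often its most frequent line repeats. The function name `check_generated_code` and the sample string are hypothetical.

```python
import ast
from collections import Counter


def check_generated_code(source):
    """Return (parses, most_repeated_line, repeat_count) for a code string."""
    try:
        ast.parse(source)
        parses = True
    except (SyntaxError, ValueError):
        # ValueError covers inputs such as strings containing null bytes
        # on Python versions that reject them before parsing.
        parses = False

    # Compare lines with surrounding whitespace removed, ignoring blank lines.
    lines = [line.strip() for line in source.splitlines() if line.strip()]
    if not lines:
        return parses, None, 0
    line, count = Counter(lines).most_common(1)[0]
    return parses, line, count


# Hypothetical sample mimicking a repeated statement that is cut off mid-string.
sample = (
    "def f(data):\n"
    + "    data = data.replace('a', 'a')\n" * 3
    + "    data = data.replace('\\"
)
parses, line, count = check_generated_code(sample)
print(parses, count, line)
```

A high repeat count combined with a failed parse is a simple signal that generation looped until it hit the output length limit.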
We also release the 100 randomly selected queries as well as the code generated by ChatGPT in chatGPT.jsonl.
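A .jsonl file conventionally stores one JSON object per line, so chatGPT.jsonl can be read with the standard library alone. The sketch below assumes that convention holds for this file. It deliberately does not assume any field names, since the schema is not described here; the helper names `parse_jsonl` and `load_jsonl` are hypothetical.

```python
import json


def parse_jsonl(lines):
    """Parse an iterable of text lines, one JSON value per non-empty line."""
    records = []
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines, e.g. a trailing newline at end of file
            records.append(json.loads(line))
    return records


def load_jsonl(path):
    """Read a JSON Lines file from disk and return its records as a list."""
    with open(path, encoding="utf-8") as handle:
        return parse_jsonl(handle)


# Example usage (path relative to where the artifact was unpacked):
# records = load_jsonl("chatGPT.jsonl")
# print(len(records))          # number of records in the file
# print(sorted(records[0]))    # inspect the field names of the first record
```

Inspecting the keys of the first record is a quick way to discover which fields hold the query and the generated code.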
Files (194.3 MB)
MD5 checksum | Size |
---|---|
md5:858a1cb4f6a7e298203b0d7a939d9732 | 36.1 kB |
md5:66d10ef1e23ef3374403a12df708bffd | 25.8 kB |
md5:3d71bab161fa71b0e8846951865303d0 | 15.7 MB |
md5:c62cac7d388b218b265abe751df294cd | 48.3 kB |
md5:4ce933cb5a6e73048b0327173a369f62 | 583.8 kB |
md5:f2fd8c9f36759f73e71b64511e3777eb | 40.4 kB |
md5:321d29ed814a4c4942cd80d2a195bce4 | 16.7 MB |
md5:a14178ef775b791abf1f99c09bd500f8 | 5.4 MB |
md5:f72989f31459053f6032008f74dfa4e6 | 155.8 MB |
md5:5cc05323c324f3beb0aae86c47d122a3 | 37.5 kB |
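The md5 values in the file listing can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names are hypothetical, and the file path in the usage comment is a placeholder to be replaced with whichever downloaded file corresponds to the checksum being checked.

```python
import hashlib


def md5_of_chunks(chunks):
    """Return the hex MD5 digest of an iterable of bytes objects."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def read_in_chunks(path, chunk_size=1 << 20):
    """Yield a file's contents in fixed-size chunks to keep memory use low."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def matches_listed_md5(path, listed):
    """Compare a file's MD5 with a listed value such as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_chunks(read_in_chunks(path)) == expected


# Example usage (placeholder path; pair it with the matching listed checksum):
# ok = matches_listed_md5("path/to/downloaded_file", "md5:<hex digest from the listing>")
```

Reading in chunks matters for the larger archives, which would otherwise be loaded into memory in full.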