Published December 10, 2024 | Version 0.1
Software
Open
On the Compression of Language Models for Code: An Empirical Study on CodeBERT
Description
This repository contains the data and scripts used in the paper "On the Compression of Language Models for Code: An Empirical Study on CodeBERT", accepted at the IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2025).
Repository Structure
The repository is structured as follows:
- analysis: this folder contains the Jupyter notebooks used to analyze the data and produce the figures and tables presented in the paper.
- Code-Code: this folder contains the code to fine-tune, compress, and evaluate CodeBERT on the vulnerability detection task. Refer to the README.md file in this folder for more details.
- Code-Text: this folder contains the code to fine-tune, compress, and evaluate CodeBERT on the code summarization task. Refer to the README.md file in this folder for more details.
- Text-Code: this folder contains the code to fine-tune, compress, and evaluate CodeBERT on the code search task. Refer to the README.md file in this folder for more details.
Setup
Install the required dependencies by running one of the following commands:
pip
pip install -r requirements.txt
conda
conda env create -f environment.yml
conda activate lm_compress
Next, refer to the README.md file in each of the Code-Code, Code-Text and Text-Code subfolders to download the datasets for each task.
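Before installing anything, it can help to confirm that the extracted archive has the layout described above. The following is a minimal sketch, not part of the repository: the function name `missing_entries` and the constant `EXPECTED_ENTRIES` are hypothetical, and the list assumes that `requirements.txt` and `environment.yml` sit at the repository root, which the setup commands suggest but do not state explicitly.

```python
from pathlib import Path

# Entries named in this description. The location of requirements.txt and
# environment.yml at the repository root is an assumption.
EXPECTED_ENTRIES = [
    "analysis",
    "Code-Code/README.md",
    "Code-Text/README.md",
    "Text-Code/README.md",
    "requirements.txt",
    "environment.yml",
]


def missing_entries(repo_root):
    """Return the expected entries that do not exist under repo_root."""
    root = Path(repo_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]
```

Calling `missing_entries(".")` from the extracted folder should return an empty list if the layout matches; any names it returns point to a missing folder or file.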
Files
| Name | Size |
|---|---|
| giordanoDaloisio/lm-compression-evaluation-0.1.zip (md5:7a1ba9729890daec3be6328b15366bec) | 44.7 MB |
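The MD5 checksum listed for the archive can be used to check that a download is complete. The sketch below uses only Python's standard `hashlib` module; the helper names `md5_of_file` and `archive_is_intact` are hypothetical and not part of the repository, and the local path to the downloaded zip is whatever you saved it as.

```python
import hashlib

# Checksum listed for the archive in the Files section of this record.
EXPECTED_MD5 = "7a1ba9729890daec3be6328b15366bec"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large file is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected_md5=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected_md5.lower()
```

For example, `archive_is_intact("lm-compression-evaluation-0.1.zip")` should return `True` for an undamaged download, assuming the file was saved under that name in the current directory.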
Additional details
Related works
- Is supplement to: Software, https://github.com/giordanoDaloisio/lm-compression-evaluation/tree/0.1