InStructGen: Intent-oriented Code Review Comment Generation via Pretrained Models and Structure Learning
Description
Resources related to the research work "InStructGen: Intent-oriented Code Review Comment Generation via Pretrained Models and Structure Learning"
There are seven packages:
- InStructGen.7z: The main code and materials for our project.
- models_1.7z: The separated models, including the implementation of baselines and graph-based structure learning (diff AST graph short/long).
- models_2.7z: The separated models, including part of the intent-oriented experiments on CRC-short (vanilla concatenation) and on CRC-long (all).
- models_3.7z: The separated models, including the other part of the intent-oriented experiments on CRC-short (line/span-grained diff and single side).
- raw_data.7z: The raw data fetched from GitHub using GraphQL APIs, including all content used in the experiments and other content that may be useful for follow-up research, such as commit messages.
- CRC-short: Our processed dataset with a shorter average token length. It includes multiple sub-datasets processed for each experiment.
- CRC-long: Our processed dataset with a longer average token length. It includes multiple sub-datasets processed for each experiment.
Due to the Zenodo upload limit, the CRC-long dataset and the models are split into volumes. The models are hosted in separate deposits:
- models_1: https://zenodo.org/record/7783939
- models_2: https://zenodo.org/record/7784101
- models_3: https://zenodo.org/record/7784105