Rationale Dataset and Analysis for the Commit Messages of the Linux Kernel Out-of-Memory Killer
- 1. Université de Montréal
- 2. University of Montreal
Description
These are the artifacts for the paper "Rationale Dataset and Analysis for the Commit Messages of the Linux Kernel Out-of-Memory Killer", accepted at ICPC 2024.
- get_commits_and_create_data_sheet.py downloads the commits and creates the data sheet.
- preprocess_and_create_batches.py pre-processes the commit messages and creates the batches for labelling.
- dataset_4_labels_anaonymous.csv is the dataset with the labels (YES/NO) assigned by three raters for the four categories (Decision, Rationale, Supporting Facts, Inapplicable).
- dataset_3_labels_merged.csv is the dataset obtained by 1) taking the union of the three raters' labels and 2) removing the sentences that at least two raters labelled Inapplicable, as explained in the paper.
- Anonymous_ICPC_notebook.ipynb contains the code for all the analyses, results and figures in the paper.
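The merging rule described for dataset_3_labels_merged.csv can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `merge_ratings`, the dictionary layout, and the use of the category names as keys are assumptions, and the actual CSV column names may differ. It also assumes that the Inapplicable label is dropped from the merged output, since the merged dataset has three labels.

```python
# Hypothetical sketch of the merge rule; names and data layout are assumptions.
CATEGORIES = ("Decision", "Rationale", "Supporting Facts")


def merge_ratings(ratings):
    """Merge the labels of three raters for one sentence.

    `ratings` is a list of dicts, one per rater, mapping each of the four
    categories to "YES" or "NO". Returns None when the sentence should be
    removed, otherwise a dict with the three merged labels.
    """
    # Remove the sentence when at least two raters labelled it Inapplicable.
    inapplicable_votes = sum(r["Inapplicable"] == "YES" for r in ratings)
    if inapplicable_votes >= 2:
        return None
    # Union: a category is YES when at least one rater said YES.
    return {
        category: "YES" if any(r[category] == "YES" for r in ratings) else "NO"
        for category in CATEGORIES
    }


example = [
    {"Decision": "YES", "Rationale": "NO", "Supporting Facts": "NO", "Inapplicable": "NO"},
    {"Decision": "NO", "Rationale": "YES", "Supporting Facts": "NO", "Inapplicable": "NO"},
    {"Decision": "NO", "Rationale": "NO", "Supporting Facts": "NO", "Inapplicable": "YES"},
]
print(merge_ratings(example))
```

In this sketch a single Inapplicable vote does not remove the sentence; only agreement between at least two raters does.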
Files
Total size: 12.7 MB

MD5 checksum | Size |
---|---|
b17b90fb15ac14201f53ea8b9b63e545 | 2.8 MB |
908232f3ca7d7614dc88defa37e814bb | 4.7 MB |
fc081c33176336bee6af64aff48754bb | 5.1 MB |
ec1f9c4750a33219bb277ae11a7a088c | 1.7 kB |
0ba8c0399b87209bd12b3919b3600e30 | 9.5 kB |
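A downloaded file can be checked against the MD5 checksums listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper name `md5_hex` is an assumption, and the listing does not say which checksum belongs to which file, so the expected value has to be taken from the record page for the file in question.

```python
import hashlib
import io


def md5_hex(stream, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a binary file-like object.

    The stream is read in chunks so that large files are not loaded
    into memory at once.
    """
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Self-contained demonstration on an in-memory stream.
print(md5_hex(io.BytesIO(b"hello")))

# For a downloaded file (hypothetical usage, path supplied by the reader):
# with open("dataset_3_labels_merged.csv", "rb") as f:
#     print(md5_hex(f))
```

If the printed digest differs from the one on the record page, the download is incomplete or corrupted and should be repeated.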