Rationale Dataset and Analysis for the Commit Messages of the Linux Kernel Out-of-Memory Killer
Description
These are the artifacts for the manuscript "Rationale Dataset and Analysis for the Commit Messages of the Linux Kernel Out-of-Memory Killer", submitted to ICPC 2024.
-get_commits_and_crerate_data_sheet.py is the source code responsible for downloading the commits and creating the data sheet
-preprocess_and_create_batches.py is the source code responsible for pre-processing the commit messages and creating the batches for labeling
-dataset_4_labels_anaonymous.csv is the dataset with the labelling (YES/NO) of three raters for the four categories (Decision, Rationale, Supporting Facts, Inapplicable)
-dataset_3_labels_merged.csv is the dataset obtained by 1) taking the union of the labels of the three raters and 2) removing the sentences that at least two raters labelled as Inapplicable, as explained in the manuscript.
-Anonymous_ICPC_notebook.ipynb is the source code responsible for all the analyses, results and figures in the manuscript.
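
The merging rule described for dataset_3_labels_merged.csv can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column layout of the CSV files is not shown here, so the function name `merge_labels` and the representation of one rater's labels as a dictionary mapping category to "YES"/"NO" are hypothetical. The manuscript and the notebook remain the reference for the actual procedure.

```python
from typing import Dict, List, Optional

# Category names as listed in the description above.
KEPT_CATEGORIES = ["Decision", "Rationale", "Supporting Facts"]
INAPPLICABLE = "Inapplicable"


def merge_labels(ratings: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Merge the labels that several raters gave to one sentence.

    `ratings` holds one dict per rater, mapping each of the four
    categories to "YES" or "NO" (hypothetical layout, for illustration).

    Returns None when at least two raters marked the sentence as
    Inapplicable (the sentence is dropped). Otherwise returns the union
    of the raters' labels for the three remaining categories: a category
    is "YES" if any rater said "YES".
    """
    inapplicable_votes = sum(1 for r in ratings if r[INAPPLICABLE] == "YES")
    if inapplicable_votes >= 2:
        return None

    return {
        category: "YES" if any(r[category] == "YES" for r in ratings) else "NO"
        for category in KEPT_CATEGORIES
    }
```

For example, a sentence labelled Decision by one rater, Rationale by another, and Inapplicable by the third would be kept with both Decision and Rationale set to "YES", since only one rater considered it Inapplicable.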
Files (12.7 MB)
MD5 checksum | Size |
---|---|
md5:b17b90fb15ac14201f53ea8b9b63e545 | 2.8 MB |
md5:908232f3ca7d7614dc88defa37e814bb | 4.7 MB |
md5:fc081c33176336bee6af64aff48754bb | 5.1 MB |
md5:ec1f9c4750a33219bb277ae11a7a088c | 1.7 kB |
md5:0ba8c0399b87209bd12b3919b3600e30 | 9.5 kB |
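
A downloaded file can be checked against the MD5 values listed above with Python's standard `hashlib` module. The sketch below is a minimal example; the helper names `md5_of_file` and `matches_listing` are hypothetical, and which checksum belongs to which file has to be taken from the record itself.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that multi-megabyte files do not
    have to be loaded into memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file's MD5 with a listed value such as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Calling `matches_listing` with the local path of a downloaded file and the corresponding `md5:` string from the listing returns `True` when the download is intact.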