There is a newer version of the record available.

Published October 30, 2023 | Version v1
Preprint Open

Rationale Dataset and Analysis for the Commit Messages of the Linux Kernel Out-of-Memory Killer

Description

These are the artifacts for the manuscript "Rationale Dataset and Analysis for the Commit Messages of the Linux Kernel Out-of-Memory Killer", submitted to ICPC 2024.

-get_commits_and_crerate_data_sheet.py is the script that downloads the commits and creates the data sheet.

-preprocess_and_create_batches.py is the script that pre-processes the commit messages and creates the batches for labelling.

-dataset_4_labels_anaonymous.csv is the dataset with the labelling (YES/NO) by three raters for the four categories (Decision, Rationale, Supporting Facts, Inapplicable).

-dataset_3_labels_merged.csv is the dataset obtained by 1) taking the union of the three raters' labels and 2) removing the sentences that at least two raters labelled as Inapplicable, as explained in the manuscript.

-Anonymous_ICPC_notebook.ipynb is the notebook that produces all the analyses, results, and figures in the manuscript.
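
The merging step that turns the four-label sheet into the three-label sheet can be sketched as follows. This is a minimal illustration only, not the code from the notebook: the row layout, the key names, and the toy sentences are all assumptions made for the example, and the real CSV columns may be organised differently.

```python
# Hypothetical in-memory layout: one dict per sentence, with the three raters'
# YES/NO votes stored as a tuple under each category name (an assumption).
KEPT_CATEGORIES = ("Decision", "Rationale", "Supporting Facts")
INAPPLICABLE = "Inapplicable"


def merge_labels(rows):
    """Union the raters' labels and drop sentences that at least two raters marked Inapplicable."""
    merged = []
    for row in rows:
        inapplicable_votes = sum(vote == "YES" for vote in row[INAPPLICABLE])
        if inapplicable_votes >= 2:
            continue  # at least two raters agreed the sentence is Inapplicable
        merged_row = {"sentence": row["sentence"]}
        for category in KEPT_CATEGORIES:
            # Union: the label is YES if any rater said YES.
            merged_row[category] = "YES" if "YES" in row[category] else "NO"
        merged.append(merged_row)
    return merged


# Toy rows invented for illustration; they are not taken from the dataset.
example = [
    {"sentence": "toy sentence A", "Decision": ("YES", "YES", "NO"),
     "Rationale": ("NO", "NO", "NO"), "Supporting Facts": ("NO", "NO", "NO"),
     "Inapplicable": ("NO", "NO", "NO")},
    {"sentence": "toy sentence B", "Decision": ("NO", "NO", "NO"),
     "Rationale": ("NO", "NO", "NO"), "Supporting Facts": ("NO", "NO", "NO"),
     "Inapplicable": ("YES", "YES", "NO")},
    {"sentence": "toy sentence C", "Decision": ("NO", "NO", "NO"),
     "Rationale": ("NO", "YES", "NO"), "Supporting Facts": ("NO", "NO", "NO"),
     "Inapplicable": ("YES", "NO", "NO")},
]

merged = merge_labels(example)
for row in merged:
    print(row)
```

In this sketch a single Inapplicable vote does not remove a sentence; only agreement between two or more raters does, which matches the rule stated in the description.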

Files (12.7 MB)

Checksum                                Size
md5:b17b90fb15ac14201f53ea8b9b63e545    2.8 MB
md5:908232f3ca7d7614dc88defa37e814bb    4.7 MB
md5:fc081c33176336bee6af64aff48754bb    5.1 MB
md5:ec1f9c4750a33219bb277ae11a7a088c    1.7 kB
md5:0ba8c0399b87209bd12b3919b3600e30    9.5 kB
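
The md5 values in the file listing can be used to check that a download is intact. The sketch below uses only Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and the listing does not say which checksum belongs to which file, so the pairing has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or as bare hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("downloaded_file", "md5:<hex>")` returns `True` only when the local file's digest equals the listed one, where `downloaded_file` stands for whichever local path the file was saved to.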