Published November 1, 2023 | Version v2
Conference proceeding Open

Rationale Dataset and Analysis for the Commit Messages of the Linux Kernel Out-of-Memory Killer

  • 1. Université de Montréal
  • 2. University of Montreal

Description

These are the artifacts for the paper "Rationale Dataset and Analysis for the Commit Messages of the Linux Kernel Out-of-Memory Killer", accepted at ICPC 2024.

-get_commits_and_create_data_sheet.py is the source code that downloads the commits and creates the data sheet.

-preprocess_and_create_batches.py is the source code that pre-processes the commit messages and creates the batches for labelling.

-dataset_4_labels_anaonymous.csv is the dataset with the labelling (YES/NO) by three raters for the four categories (Decision, Rationale, Supporting Facts, Inapplicable).

-dataset_3_labels_merged.csv is the dataset obtained by 1) taking the union of the three raters' labels and 2) removing the sentences that at least two raters labelled Inapplicable, as explained in the paper.

-Anonymous_ICPC_notebook.ipynb is the source code for all the analyses, results, and figures in the paper.
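
The merge that produces dataset_3_labels_merged.csv can be illustrated with a minimal sketch. This is not the authors' code: the row layout (one dict per rater and sentence), the key names `sentence_id` and `rater`, and the choice to drop the Inapplicable column from the output are all assumptions made for illustration; the actual CSV columns may differ.

```python
from collections import defaultdict

# Category names taken from the dataset description; the keys
# "sentence_id" and "rater" are hypothetical, not the real CSV headers.
KEPT_CATEGORIES = ("Decision", "Rationale", "Supporting Facts")


def merge_labels(rows):
    """Merge per-rater YES/NO labels into one record per sentence.

    rows: iterable of dicts, one per (sentence, rater), each holding
    "sentence_id", "rater", the three kept categories and "Inapplicable".
    """
    by_sentence = defaultdict(list)
    for row in rows:
        by_sentence[row["sentence_id"]].append(row)

    merged = []
    for sentence_id, ratings in by_sentence.items():
        # Drop the sentence if at least two raters marked it Inapplicable.
        inapplicable_votes = sum(r["Inapplicable"] == "YES" for r in ratings)
        if inapplicable_votes >= 2:
            continue
        # Union: a category is YES if any rater said YES.
        record = {"sentence_id": sentence_id}
        for category in KEPT_CATEGORIES:
            any_yes = any(r[category] == "YES" for r in ratings)
            record[category] = "YES" if any_yes else "NO"
        merged.append(record)
    return merged
```

Taking the union keeps every label that any rater assigned, so a sentence can carry several categories at once; the two-vote threshold only governs removal.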

Files

Files (12.7 MB)

MD5 checksum                          Size
md5:b17b90fb15ac14201f53ea8b9b63e545  2.8 MB
md5:908232f3ca7d7614dc88defa37e814bb  4.7 MB
md5:fc081c33176336bee6af64aff48754bb  5.1 MB
md5:ec1f9c4750a33219bb277ae11a7a088c  1.7 kB
md5:0ba8c0399b87209bd12b3919b3600e30  9.5 kB
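
The MD5 checksums in the listing can be used to check that a downloaded file is intact. The following is a minimal sketch using only the Python standard library. The function names `file_md5` and `matches_listing` are hypothetical helpers, and because the listing does not pair checksums with file names, the sketch only tests whether a file's digest appears among the five listed values.

```python
import hashlib

# MD5 digests copied from the file listing of this record.
LISTED_MD5 = {
    "b17b90fb15ac14201f53ea8b9b63e545",
    "908232f3ca7d7614dc88defa37e814bb",
    "fc081c33176336bee6af64aff48754bb",
    "ec1f9c4750a33219bb277ae11a7a088c",
    "0ba8c0399b87209bd12b3919b3600e30",
}


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path):
    """Return True if the file's MD5 digest is one of the listed checksums."""
    return file_md5(path) in LISTED_MD5
```

For example, `matches_listing("dataset_3_labels_merged.csv")` should return True for an uncorrupted copy of that file, assuming it was downloaded from this version of the record.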