Published April 26, 2018 | Version v1
Dataset Open

Data and material for: "Mining file histories: should we consider branches?"

  • 1. Delft University of Technology
  • 2. University of Zurich

Description

This repository is the online appendix of our paper:

Vladimir Kovalenko, Fabio Palomba, and Alberto Bacchelli. 2018. Mining file histories: should we consider branches? In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering (ASE 2018). Association for Computing Machinery, New York, NY, USA, 202–213. DOI: 10.1145/3238147.3238169

The results folder contains full results per project for RQ1 and for the reviewer recommendation part of RQ2, as well as lists of projects used for the evaluation of defect prediction and change recommendation algorithms. The rest of the results are provided in the paper.

The code we used to download and process the data (code reviews, repositories, and change recommendation) is located in the processor folder.

The processor depends on git2neo, a tool that loads Git metadata into Neo4j databases and retrieves the histories; it is described in Section III.B of the paper.
This release of git2neo is provided as-is to facilitate the evaluation and reproducibility of our work.

Files

ase2018-dataNmaterial.zip (660.3 kB)
md5:b533baca20676dfff80bb5305cc6ea6f

Additional details

Related works

Is supplement to
Conference paper: 10.1145/3238147.3238169 (DOI)

Funding

Data-driven Contemporary Code Review PP00P2_170529
Swiss National Science Foundation