Dataset Open Access

CROP: linking code reviews to source code changes

Matheus Paixao; Jens Krinke; Donggyun Han; Mark Harman

The Code Review Open Platform, a.k.a. CROP, is an open-source dataset of code review data. CROP collects code review information from open-source software systems and links this data to complete versions of the code base for each of these systems. CROP was first designed by Matheus Paixao as part of his PhD thesis in the CREST Centre at University College London. Dr. Jens Krinke, Donggyun Han and Prof. Mark Harman have also contributed to the first incarnation of CROP.

 

For a given software system, CROP contains code review data and a version of the code base for every revision ever submitted for review, including intermediate revisions made before merging and revisions that the system's developers abandoned. Each version is a complete snapshot of the system's code base, so every revision is fully buildable, compilable and testable.

 

By leveraging the data contained in CROP, software engineering researchers and practitioners can perform empirical studies to assess how effective the code review process is for different aspects of software development. Since CROP provides complete snapshots of the software system, these experiments can be enhanced by using a wide range of approaches for static and dynamic analysis.

 

Moreover, during code review, developers are constantly providing reasoning and rationale for the changes they make in the system, both when they submit code for review and when they inspect code from their peers. Thus, the data contained in CROP is a valuable source of knowledge regarding motivation for and explanation of software changes.

 

For more information on the CROP dataset, including its structure, technical details and publication history, please visit its official website at crop-repo.github.io.

Files (3.7 GB)

Name            Size      MD5
discussion.zip  117.2 MB  10035aa794c5619d9aa7dd50643dbf0f
git_repos.zip   3.5 GB    95b4e1feec9e3c532f999337fc994d0c
LICENSE         14.2 kB   d631c0c1641507d343fb8d2fbbb37a16
metadata.zip    14.3 MB   f3d508e0655b56f568ec15bd45008e64
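
Because git_repos.zip alone is 3.5 GB, it may be worth checking the downloads against the MD5 checksums listed with the files above before unpacking them. The following is a minimal sketch, not part of the CROP tooling: the function names `md5_of_file` and `verify_downloads` are hypothetical, and the script assumes the files were saved under their original names in a single local directory (here the current directory).

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "discussion.zip": "10035aa794c5619d9aa7dd50643dbf0f",
    "git_repos.zip": "95b4e1feec9e3c532f999337fc994d0c",
    "LICENSE": "d631c0c1641507d343fb8d2fbbb37a16",
    "metadata.zip": "f3d508e0655b56f568ec15bd45008e64",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each file name to True (match), False (mismatch) or None (missing)."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        results[name] = md5_of_file(path) == checksum if path.is_file() else None
    return results


if __name__ == "__main__":
    # "." is an assumed download location; change it as needed.
    labels = {True: "OK", False: "MISMATCH", None: "missing"}
    for name, status in verify_downloads(".").items():
        print(f"{name}: {labels[status]}")
```

Reading in chunks keeps memory use flat even for the multi-gigabyte archive. A file reported as MISMATCH was most likely truncated or corrupted in transit and should be downloaded again.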
