Published February 25, 2021 | Version v1
Dataset Open

Replication Package for the paper: "Predicting Design Impactful Changes in Modern Code Review: A Large-Scale Empirical Study"

Description

This is the replication package for the paper: "Predicting Design Impactful Changes in Modern Code Review: A Large-Scale Empirical Study", published at the 18th International Conference on Mining Software Repositories (MSR'21).

It contains all preliminary and final results of our empirical methodology, organized as follows:

  • description_and_detection_mechanisms.zip: This file contains detailed descriptions for all types of symptoms.
  • extracted_degradation_symptoms_per_revision.zip: This file contains the raw data on the symptom types extracted for each revision.
  • design_impactful_changes_per_revision.zip: This file contains the raw data of design impactful changes per revision and system.
  • extracted_features_per_revision.zip: This file contains the raw data of extracted features (both social and technical features) per revision and system.
  • generated_ml_models.zip: This file contains the complete set of generated machine learning models grouped by feature set (social only, technical only, and social + technical together) and symptom category (fine-grained smells and coarse-grained smells).
  • performance_ml_models.zip: This file contains the complete description of the performance evaluation of the machine learning models considering the different combinations of feature sets and symptom categories (fine-grained smells and coarse-grained smells).
  • features_importance.zip: This file contains the complete set of feature importance values, grouped by ranking.
  • features_statistics.zip: This file contains the complete descriptive analysis of the extracted features (social and technical) per system.
  • friedman_test_with_post_hoc_statistic.zip: This file contains the results, scripts, and visualizations of the Friedman test with post hoc analysis employed in RQ2 and RQ3.
  • wilcoxon_script.zip: This file contains the script used in RQ1.
  • features_description.zip: This file contains the complete description of extracted features (social and technical).
  • hyperparametrization.zip: This file contains the hyperparameter settings used for each ML model, grouped by feature set, system, and smell granularity.
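
Because the file listing totals 13.4 GB, it can help to look inside an archive before extracting it. The following is a minimal sketch, assuming Python's standard zipfile module; list_archive is a hypothetical helper name and is not part of the replication package.

```python
import zipfile


def list_archive(path):
    """Return (member name, uncompressed size in bytes) pairs for a zip file.

    Hypothetical helper for illustration; it reads only the archive's
    central directory and does not extract anything.
    """
    with zipfile.ZipFile(path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()]
```

Calling list_archive on a downloaded archive, for example wilcoxon_script.zip, returns its members without writing anything to disk. The internal layout of each archive is not described on this page, so the result is whatever the authors packaged.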

If you use any part of this replication package in your study, please cite it as:

Anderson Uchôa, Caio Barbosa, Daniel Coutinho, Willian Oizumi, Wesley K. G. Assunção, Silvia Regina Vergilio, Juliana Alves Pereira, Anderson Oliveira, Alessandro Garcia. Predicting Design Impactful Changes in Modern Code Review: A Large-Scale Empirical Study. Proceedings of the 18th International Conference on Mining Software Repositories (MSR), Madrid, Spain, May 2021.

Files

Files (13.4 GB)

Checksum Size
md5:23e734588d001c0d0677fec73885f5cf 8.2 MB
md5:72570eb55cf8ba96ab1acae5ffe0d4ef 357.0 kB
md5:713b2d303ffa4ad485b7cd6f6d73f46f 11.8 GB
md5:36e21dbc351731d06f4d0c85ffe6d25e 3.7 MB
md5:61f2f3a23a352cbdd915349e250ad041 14.3 kB
md5:df682a81e1a8ecd8608b80c1d63164e4 54.4 kB
md5:be31f1e80e03cc8b91b6ba34c10134b7 8.3 kB
md5:4c95811126641ef6ce603a103ca120fa 97.0 kB
md5:cc63bd0c9c55ca61cbe6aab813a890e2 1.6 GB
md5:cff06cac0c5e8ef61a2ca8158b0719c0 23.9 kB
md5:35dba0fabab887c565b1b78cd8b4b786 92.0 kB
md5:f04bcbb586e5bc117a79df53243348e1 2.0 kB
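
The md5 values listed above can be used to check that a download completed without corruption. The following is a minimal sketch, assuming Python's standard hashlib module; md5_of_file and matches_listing are hypothetical helper names, and the pairing of each checksum with its file name is not recoverable from this listing, so the expected value has to be taken from the record page for the specific file.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte archives do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:23e7...'.

    The 'md5:' prefix is optional and the comparison is case-insensitive.
    """
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A mismatch most often indicates an interrupted or truncated download, in which case fetching the file again is the usual remedy. MD5 is adequate for detecting accidental corruption but is not a safeguard against deliberate tampering.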