Replication Kit: "On the Defect-Detection Capabilities of Unit and Integration Tests"

doi:10.5281/zenodo.1257364

Published May 7, 2018 | Version 1.1.0

Dataset Open

1. University of Goettingen

Replication Kit for the Paper "On the Defect-Detection Capabilities of Unit and Integration Tests"
This additional material enables other researchers to replicate our results. We also want to facilitate further insights that might be gained from our data sets.

Structure
The structure of the replication kit is as follows:

additional_visualizations: contains additional visualizations (Venn diagrams) for each project and for each of the data sets that we used
data_analysis: contains two python scripts that we used to analyze our raw data (one for each research question)
data_collection_tools: contains all source code used for the data collection, including the versions we used of the COMFORT framework, the BugFixClassifier, the script used to filter and collect the issues with their commits, and the tools of the SmartSHARK environment
mongodb_no_authors: archived dump of the MongoDB that we created by executing our data collection tools. The "comfort" database can be restored via the mongorestore command.
project_defects.csv: raw data of our manual data collection process. It includes all collected defects, with the name of each issue, its description, and references to the commits in which the issue was fixed. Furthermore, it records which tests failed or errored after the defect was re-integrated into the analyzed release version of the project.
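
The restore step for mongodb_no_authors could look roughly like the sketch below. It rests on assumptions that the README does not confirm: that the .agz file is a gzip-compressed archive as written by `mongodump --archive --gzip` (the extension suggests this), and that a MongoDB instance is reachable at the default local URI. The helper names `build_restore_command` and `restore_dump` are hypothetical.

```python
import subprocess


def build_restore_command(archive_path, uri="mongodb://localhost:27017"):
    """Build a mongorestore call for a gzip-compressed archive dump.

    Assumption: the dump was created with `mongodump --archive --gzip`.
    Only namespaces of the "comfort" database are restored.
    """
    return [
        "mongorestore",
        "--uri=" + uri,
        "--gzip",
        "--archive=" + archive_path,
        "--nsInclude=comfort.*",
    ]


def restore_dump(archive_path, uri="mongodb://localhost:27017"):
    """Run mongorestore; raises CalledProcessError if the tool reports failure."""
    subprocess.run(build_restore_command(archive_path, uri), check=True)


# Show the command without executing it.
print(" ".join(build_restore_command("mongodb_no_authors.agz")))
```

Calling `restore_dump("mongodb_no_authors.agz")` would then perform the actual restore. Since the archive is 15.8 GB, it is worth checking available disk space first.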

Additional Visualizations
We provide three additional visualizations for each project:

Disjoint-Mutation-Data (visualizations for the DISJ data set)
Mutation-Data (visualizations for the ALL data set)
Seeded-Data (visualizations for the SEEDED data set)

For each of these data sets, there is one visualization per project showing six Venn diagrams: one for each of the five defect types and one overall diagram. The Venn diagrams show the number of defects that were detected by unit tests, by integration tests, or by both.
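
To illustrate what the three regions of such a Venn diagram count, here is a minimal sketch using plain Python sets. It is not taken from the analysis scripts; the function name `venn_regions` and the defect IDs are hypothetical and serve only as an example.

```python
def venn_regions(unit_detected, integration_detected):
    """Count defects found only by unit tests, only by integration tests, or by both."""
    unit = set(unit_detected)
    integration = set(integration_detected)
    return {
        "unit_only": len(unit - integration),
        "integration_only": len(integration - unit),
        "both": len(unit & integration),
    }


# Hypothetical defect IDs, for illustration only (not from the data sets).
example = venn_regions({"D1", "D2", "D3"}, {"D2", "D3", "D4", "D5"})
print(example)  # {'unit_only': 1, 'integration_only': 2, 'both': 2}
```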

Analysis scripts
Requirements:

Python 3.5
tabulate
scipy
seaborn
mongoengine
pycoshark
pandas
matplotlib

Together, the two Python scripts contain all code for the statistical analysis we performed.
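
Before running the scripts, the requirements above can be checked with a small sketch like the following. It assumes that each listed package is importable under exactly the name given in the list, which the README does not state; the helper `missing_modules` is hypothetical.

```python
import importlib.util
import sys

# Assumption: import names match the package names in the requirements list.
REQUIRED_MODULES = [
    "tabulate", "scipy", "seaborn", "mongoengine",
    "pycoshark", "pandas", "matplotlib",
]


def missing_modules(names):
    """Return the subset of module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if sys.version_info < (3, 5):
    print("Python 3.5 or newer is expected")

missing = missing_modules(REQUIRED_MODULES)
print("missing modules:", ", ".join(missing) if missing else "none")
```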

Data Collection Tools
We provide all data collection tools that we implemented and used for our paper. Overall, the directory contains six projects and one Python script:

BugFixClassifier: Used to classify our defects.
comfort-core: Core of the COMFORT framework. Used to classify our tests into unit and integration tests and to calculate different metrics for these tests.
comfort-jacoco-listner: Used to intercept the coverage collection process while we executed the tests of our case study projects.
filter_issues.py: Used to filter and collect issues with their commits (requires vcsSHARK and issueSHARK to have been executed beforehand).
issueSHARK: Used to collect data from the issue tracking systems (ITSs) of the projects.
pycoSHARK: Library that contains the models for the ORM that is used inside the SmartSHARK environment.
vcsSHARK: Used to collect data from the version control systems (VCSs) of the projects.

Files (15.8 GB)

Name	MD5	Size
additional_visualizations.tar.gz	af852f2ac81466b280bb79324c9a2a6d	269.2 kB
data_analysis.tar.gz	7b520a0536e099679d2278b5231ba248	9.9 kB
data_collection_tools.tar.gz	6d02d43037b318fa7193f4b15ae67149	506.7 kB
mongodb_no_authors.agz	b6f34461e298d7d4a35a08d9c8c97bd5	15.8 GB
project_defects.csv	0d3e9dff919c93ac1e761b80839ba6e1	64.4 kB
README.md	059ee9949a54f7f3ca2f269dbd966b75	3.3 kB
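
The MD5 checksums listed for each file can be used to verify downloads. The following is a minimal sketch, assuming the files were saved under their original names in a single directory; the helper names `md5_of_file` and `verify_downloads` are hypothetical.

```python
import hashlib
import os

# Checksums as listed in the file table of this record.
EXPECTED_MD5 = {
    "additional_visualizations.tar.gz": "af852f2ac81466b280bb79324c9a2a6d",
    "data_analysis.tar.gz": "7b520a0536e099679d2278b5231ba248",
    "data_collection_tools.tar.gz": "6d02d43037b318fa7193f4b15ae67149",
    "mongodb_no_authors.agz": "b6f34461e298d7d4a35a08d9c8c97bd5",
    "project_defects.csv": "0d3e9dff919c93ac1e761b80839ba6e1",
    "README.md": "059ee9949a54f7f3ca2f269dbd966b75",
}


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each downloaded file name to True/False; files not present are skipped."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = os.path.join(directory, name)
        if os.path.exists(path):
            results[name] = md5_of_file(path) == expected
    return results


for name, ok in verify_downloads(".").items():
    print(name, "OK" if ok else "MISMATCH")
```

Reading in chunks keeps memory use flat, which matters for the 15.8 GB database archive.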
