Published February 4, 2025 | Version v1
Computational notebook Open

Cleaned data and notebooks for the paper "Towards Refined Code Coverage: A New Predictive Problem in Software Testing"

  • 1. University of Córdoba
  • 2. Delft University of Technology

Description

In this replication package, we share our dataset and analysis notebooks so that readers can retrace our analysis steps and dive deeper into the results.

The package contains:
- csv/allegro-hermes_cleaned.csv: the cleaned dataset we used for our analysis
- fig/...: figures produced during our analysis, some of which are included in the paper
- hyperparams/...: the scripts and results from our hyperparameter optimization
- data_analysis.ipynb: the notebook with our initial exploration of the data, e.g., looking at the correlation of the features with each other and the target variable
- machine_learning_experiment.ipynb: the notebook for training and evaluating the four algorithms: Decision Tree (DT), k-Nearest Neighbors (kNN), Naive Bayes (NB), and Random Forest (RF). It also includes our explainability analysis, which identifies the features that influence the predictions the most.
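
As a rough illustration of the kind of exploration described for data_analysis.ipynb, the sketch below computes the Pearson correlation of each numeric column with a target column. It is not taken from the notebooks, which may use different tooling and a different correlation measure. The helper names `pearson` and `target_correlations`, the column names, and the demo data are all hypothetical, and the sketch uses only the Python standard library.

```python
import csv
import io
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


def target_correlations(csv_file, target):
    """Correlate every numeric column of an open CSV file with `target`."""
    rows = list(csv.DictReader(csv_file))
    if not rows:
        return {}
    ys = [float(row[target]) for row in rows]  # target must be numeric
    result = {}
    for column in rows[0]:
        if column == target:
            continue
        try:
            xs = [float(row[column]) for row in rows]
        except ValueError:
            continue  # skip columns that are not purely numeric
        result[column] = pearson(xs, ys)
    return result


# Synthetic stand-in data; these column names are hypothetical and are
# not the columns of the published dataset.
demo = io.StringIO(
    "feature_a,feature_b,label,covered\n"
    "1,4,x,0\n"
    "2,3,y,0\n"
    "3,2,x,1\n"
    "4,1,y,1\n"
)
correlations = target_correlations(demo, target="covered")
```

To try this on the published data, open csv/allegro-hermes_cleaned.csv with `open(path, newline="")` and pass the file object together with the name of the dataset's target column, which is not stated in this description.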

Files

understanding-covered-code-publish-anonymized.zip

Files (1.8 MB)