Published June 20, 2022 | Version 1.1
Dataset | Open Access

Supplemental Material for a Systematic Literature Review on Benchmarks for Evaluating Debugging Approaches

  • Thomas Hirsch and Birgit Hofer (Graz University of Technology)

Description

Bug benchmarks are used in the development and evaluation of debugging approaches. A quantitative performance comparison of different debugging approaches is only possible when they have been evaluated on the same dataset or benchmark. However, benchmarks are often specialized towards certain debugging approaches in terms of the data, metrics, and artifacts they contain. Such benchmarks cannot easily be used for debugging approaches outside their scope, as those approaches may rely on specific data, such as bug reports or code metrics, that is not included in the dataset. Furthermore, benchmarks vary in size with respect to the number of subject programs and the size of the individual subject programs. For these reasons, we have performed a systematic literature review in which we identified 73 benchmarks that can be used to evaluate debugging approaches.

We compare the benchmarks with respect to their size and the information they provide, such as bug reports, contained test cases, and other code metrics. Furthermore, we have investigated how well the benchmarks fulfill the FAIR guiding principles (Findable, Accessible, Interoperable, Reusable). This comparison is intended to help researchers quickly identify all benchmarks that are suitable for evaluating their specific debugging approaches. More information can be found in the publication:

Thomas Hirsch and Birgit Hofer: "A Systematic Literature Review on Benchmarks for Evaluating Debugging Approaches", Journal of Systems and Software, in press, 2022.

Files

open_science_material.zip (792.7 kB)
md5:d6396f2dd792668191e574aa3c3d5368

Additional details

Funding

FWF Austrian Science Fund
Automated Debugging in Use (P 32653)