Published February 19, 2024 | Version FTW-2024
Dataset Open

Regression-Test History Data for Flaky Test-Research, Dataset

  • 1. Ludwig-Maximilians-Universität München (LMU)
  • 2. LMU Munich

Contributors

Data collector:

Description

The dataset comprises developer test results of Maven projects with flaky tests across a range of consecutive commits from the projects' Git commit histories. The Maven projects are a subset of those investigated in an OOPSLA 2020 paper. For each project, the commit range is delimited by the flakiness-introducing commit (FIC) and the iDFlakies commit (see the OOPSLA paper for details). The commit hashes were obtained from the IDoFT dataset.

The dataset will be presented at the 1st International Flaky Tests Workshop 2024 (FTW 2024). Please refer to our extended abstract for more details about the motivation for and context of this dataset.

The following table provides a summary of the data.

| Slug (Module) | FIC Hash | Tests | Commits | Av. Commits/Test | Flaky Tests | Tests w/ Consistent Failures | Total Distinct Histories |
|---|---|---|---|---|---|---|---|
| TooTallNate/Java-WebSocket | 822d40 | 146 | 75 | 75 | 24 | 1 | 2.6x10^9 |
| apereo/java-cas-client (cas-client-core) | 5e3655 | 157 | 65 | 61.7 | 3 | 2 | 1.0x10^7 |
| eclipse-ee4j/tyrus (tests/e2e/standard-config) | ce3b8c | 185 | 16 | 16 | 12 | 0 | 261 |
| feroult/yawp (yawp-testing/yawp-testing-appengine) | abae17 | 1 | 191 | 191 | 1 | 1 | 8 |
| fluent/fluent-logger-java | 5fd463 | 19 | 131 | 105.6 | 11 | 2 | 8.0x10^32 |
| fluent/fluent-logger-java | 87e957 | 19 | 160 | 122.4 | 11 | 3 | 2.1x10^31 |
| javadelight/delight-nashorn-sandbox | d0d651 | 81 | 113 | 100.6 | 2 | 5 | 4.2x10^10 |
| javadelight/delight-nashorn-sandbox | d19eee | 81 | 93 | 83.5 | 1 | 5 | 2.6x10^9 |
| sonatype-nexus-community/nexus-repository-helm | 5517c8 | 18 | 32 | 32 | 0 | 0 | 18 |
| spotify/helios (helios-services) | 23260 | 190 | 448 | 448 | 0 | 37 | 190 |
| spotify/helios (helios-testing) | 78a864 | 43 | 474 | 474 | 0 | 7 | 43 |

 

The columns are defined as follows:

  • Slug (Module): The project's GitHub slug (i.e., the project's URL is https://github.com/{Slug}) and, if specified, the module for which tests have been executed.
  • FIC Hash: The flakiness-introducing commit hash for a known flaky test, as described in the OOPSLA 2020 paper. Because different flaky tests have different FIC hashes, there may be multiple rows for the same slug/module with different FIC hashes.
  • Tests: The number of distinct test class and method combinations over the entire considered commit range.
  • Commits: The number of commits in the considered commit range.
  • Av. Commits/Test: The average number of commits per test class and method combination in the considered commit range. The number of commits may vary for each test class, as some tests may be added or removed within the considered commit range.
  • Flaky Tests: The number of distinct test class and method combinations that have more than one test result (passed/skipped/error/failure + exception type, if any + assertion message, if any) across 30 repeated test suite executions on at least one commit in the considered commit range.
  • Tests w/ Consistent Failures: The number of distinct test class and method combinations that have the same error or failure result (error/failure + exception type, if any + assertion message, if any) across all 30 repeated test suite executions on at least one commit in the considered commit range.
  • Total Distinct Histories: The number of distinct test results (passed/skipped/error/failure + exception type, if any + assertion message, if any) for all test class and method combinations along all commits for that test in the considered commit range.
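
The definitions of "Flaky Tests", "Tests w/ Consistent Failures", and "Av. Commits/Test" above can be illustrated with a minimal Python sketch. The record layout (`Result`), the function names, and the `runs` dictionary are hypothetical and chosen for illustration only; they are not the dataset's actual file format or the authors' tooling.

```python
from collections import namedtuple

# Hypothetical record: one test result from one test suite execution.
# A result is the status plus exception type and assertion message, if any.
Result = namedtuple("Result", ["status", "exception_type", "message"])


def classify(results):
    """Classify one test on one commit from its repeated-run results."""
    if not results:
        raise ValueError("at least one result is required")
    distinct = set(results)
    if len(distinct) > 1:
        return "flaky"  # more than one distinct result across the runs
    (only,) = distinct
    if only.status in ("error", "failure"):
        return "consistent_failure"  # same error/failure in every run
    return "stable"


def summarize(runs):
    """Summarize runs given as {(test_id, commit): [Result, ...]}."""
    flaky, consistent, commits_per_test = set(), set(), {}
    for (test_id, commit), results in runs.items():
        commits_per_test.setdefault(test_id, set()).add(commit)
        label = classify(results)
        if label == "flaky":
            flaky.add(test_id)
        elif label == "consistent_failure":
            consistent.add(test_id)
    total_commits = sum(len(c) for c in commits_per_test.values())
    return {
        "tests": len(commits_per_test),
        "avg_commits_per_test": total_commits / len(commits_per_test),
        "flaky_tests": len(flaky),
        "tests_with_consistent_failures": len(consistent),
    }


# Made-up example data: two tests, 30 repeated runs per commit.
PASS = Result("passed", None, None)
FAIL = Result("failure", "java.lang.AssertionError", "expected 1 but was 2")
runs = {
    ("FooTest#a", "c1"): [PASS] * 30,
    ("FooTest#a", "c2"): [PASS] * 29 + [FAIL],
    ("FooTest#b", "c1"): [FAIL] * 30,
}
summary = summarize(runs)
```

Because both properties only need to hold on at least one commit, a test that is flaky on one commit and fails consistently on another is counted in both columns under this reading.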

Files

Files (534.2 MB)

md5:00e3f51ddd8ae8f92b62ceafe1130cce
534.2 MB

Additional details

Related works

Is described by
Publication: 10.1145/3643656.3643901 (DOI)

Funding

Deutsche Forschungsgemeinschaft
IDEFIX: Identifizierung und Behebung von wiederkehrenden Softwarefehlern 496588242
Ludwig-Maximilians-Universität München
LMU Postdoc Support Fund