Regression-Test History Data for Flaky Test-Research, Dataset
Creators
- 1. Ludwig-Maximilians-Universität München (LMU)
- 2. LMU Munich
Contributors
Data collector:
Description
The dataset comprises developer test results of Maven projects with flaky tests across a range of consecutive commits from the projects' Git commit histories. The Maven projects are a subset of those investigated in an OOPSLA 2020 paper. For each project, the commit range is delimited by the flakiness-introducing commit (FIC) and the iDFlakies commit (see the OOPSLA paper for details). The commit hashes were obtained from the IDoFT dataset.
The dataset will be presented at the 1st International Flaky Tests Workshop 2024 (FTW 2024). Please refer to our extended abstract for more details about the motivation for and context of this dataset.
The following table provides a summary of the data.
Slug (Module) | FIC Hash | Tests | Commits | Av. Commits/Test | Flaky Tests | Tests w/ Consistent Failures | Total Distinct Histories |
---|---|---|---|---|---|---|---|
TooTallNate/Java-WebSocket | 822d40 | 146 | 75 | 75 | 24 | 1 | 2.6x10^9 |
apereo/java-cas-client (cas-client-core) | 5e3655 | 157 | 65 | 61.7 | 3 | 2 | 1.0x10^7 |
eclipse-ee4j/tyrus (tests/e2e/standard-config) | ce3b8c | 185 | 16 | 16 | 12 | 0 | 261 |
feroult/yawp (yawp-testing/yawp-testing-appengine) | abae17 | 1 | 191 | 191 | 1 | 1 | 8 |
fluent/fluent-logger-java | 5fd463 | 19 | 131 | 105.6 | 11 | 2 | 8.0x10^32 |
fluent/fluent-logger-java | 87e957 | 19 | 160 | 122.4 | 11 | 3 | 2.1x10^31 |
javadelight/delight-nashorn-sandbox | d0d651 | 81 | 113 | 100.6 | 2 | 5 | 4.2x10^10 |
javadelight/delight-nashorn-sandbox | d19eee | 81 | 93 | 83.5 | 1 | 5 | 2.6x10^9 |
sonatype-nexus-community/nexus-repository-helm | 5517c8 | 18 | 32 | 32 | 0 | 0 | 18 |
spotify/helios (helios-services) | 23260 | 190 | 448 | 448 | 0 | 37 | 190 |
spotify/helios (helios-testing) | 78a864 | 43 | 474 | 474 | 0 | 7 | 43 |
The columns are defined as follows:
- Slug (Module): The project's GitHub slug (i.e., the project's URL is https://github.com/{Slug}) and, if specified, the module for which tests have been executed.
- FIC Hash: The flakiness-introducing commit hash for a known flaky test, as described in the OOPSLA 2020 paper. As different flaky tests have different FIC hashes, there may be multiple rows for the same slug/module with different FIC hashes.
- Tests: The number of distinct test class and method combinations over the entire considered commit range.
- Commits: The number of commits in the considered commit range.
- Av. Commits/Test: The average number of commits per test class and method combination in the considered commit range. The number of commits may vary for each test class, as some tests may be added or removed within the considered commit range.
- Flaky Tests: The number of distinct test class and method combinations that have more than one test result (passed/skipped/error/failure + exception type, if any + assertion message, if any) across 30 repeated test suite executions on at least one commit in the considered commit range.
- Tests w/ Consistent Failures: The number of distinct test class and method combinations that have the same error or failure result (error/failure + exception type, if any + assertion message, if any) across all 30 repeated test suite executions on at least one commit in the considered commit range.
- Total Distinct Histories: The number of distinct test results (passed/skipped/error/failure + exception type, if any + assertion message, if any) for all test class and method combinations along all commits for that test in the considered commit range.
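The column definitions above can be illustrated with a small sketch. It assumes a hypothetical record layout of `(commit, test, outcome)` tuples, one per repeated execution, where `outcome` is `(status, exception_type, message)`. This layout and the function name `summarize` are assumptions for illustration and do not reflect the dataset's actual file format. The sketch also treats "all repeated executions" as all runs present in the input rather than checking for exactly 30, and counts only commits that have at least one recorded result.

```python
from collections import defaultdict

FAILING = {"error", "failure"}  # statuses counted as failing results


def summarize(records):
    """Compute table-style summary values from per-run test results.

    records: iterable of (commit, test, outcome) tuples (hypothetical layout),
    where outcome = (status, exception_type, message).
    """
    # Group the repeated-run outcomes by (test, commit).
    runs = defaultdict(list)
    for commit, test, outcome in records:
        runs[(test, commit)].append(outcome)

    commits_per_test = defaultdict(set)
    flaky, consistent = set(), set()
    for (test, commit), outcomes in runs.items():
        commits_per_test[test].add(commit)
        if len(set(outcomes)) > 1:
            # More than one distinct result on this commit: flaky.
            flaky.add(test)
        elif outcomes[0][0] in FAILING:
            # A single, identical error/failure result across all runs.
            consistent.add(test)

    n_tests = len(commits_per_test)
    all_commits = {commit for _, commit in runs}
    avg = sum(len(c) for c in commits_per_test.values()) / n_tests if n_tests else 0.0
    return {
        "tests": n_tests,
        "commits": len(all_commits),
        "avg_commits_per_test": avg,
        "flaky_tests": flaky,
        "consistent_failures": consistent,
    }
```

Note that under these definitions a test can appear in both sets, for example when it is flaky on one commit and fails consistently on another.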
Files
(534.2 MB)
Name | Size |
---|---|
md5:00e3f51ddd8ae8f92b62ceafe1130cce | 534.2 MB |
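After downloading, the file can be checked against the MD5 checksum listed above. The following is a minimal sketch using Python's standard `hashlib` module. The function names and the file name `dataset.zip` in the usage note are hypothetical, as the actual file name is not given here.

```python
import hashlib

# Checksum listed for the dataset file in the Files section.
EXPECTED_MD5 = "00e3f51ddd8ae8f92b62ceafe1130cce"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large file is never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify("dataset.zip")` would return `True` only if the downloaded file matches the listed checksum.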
Additional details
Related works
- Is described by
- Publication: 10.1145/3643656.3643901 (DOI)