Published August 5, 2020 | Version 1.0
Dataset Open

AmadeusGitHubBugDataset

Description

This data set will be released as part of the following publication.
"Root cause prediction based on bug reports" by T. Hirsch and B. Hofer [in review]

This release consists of a data set of 54755 "bug"-labeled issue reports collected from 103 open source projects hosted on GitHub.
Of these, 10459 are considered "benchmark complete" and contain: issue message, commit messages, Java-aware diff statistics, change location down to method level, GitHub issue metadata, and commit metadata.
Please have a look at the corresponding publication for a detailed description of applied heuristics and filter criteria.
 

Files

AmadeusGithubBugDataset.zip

Files (45.3 MB)

Name: AmadeusGithubBugDataset.zip
Size: 45.3 MB
md5:76e0a70bac1745557f2f025ace113fc7
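
The md5 checksum listed for the archive can be used to confirm that a download is intact. The following is a minimal sketch, assuming the archive has been saved in the current working directory under the name AmadeusGithubBugDataset.zip; the function names md5_of_file and verify_archive are illustrative and not part of the data set.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this record for the archive.
EXPECTED_MD5 = "76e0a70bac1745557f2f025ace113fc7"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local file name; adjust to wherever the archive was saved.
    archive = Path("AmadeusGithubBugDataset.zip")
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.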

Additional details

Funding

FWF Austrian Science Fund
Automated Debugging in Use P 32653