Datasets for "Learning Realistic Mutations: Bug Creation for Neural Bug Detectors"
Description
This artifact includes the datasets used in "Learning Realistic Mutations: Bug Creation for Neural Bug Detectors".
Included are preprocessed Java datasets. Starting from CodeSearchNet, each dataset is seeded with bugs of one specific bug type. We distinguish three bug types: binary operator bugs, variable misuse (VarMisuse) bugs, and function misuse bugs. For each bug type, we employed three levels of mutator: weak, strong, and contextual.
We also include validation sets, which are used to validate the bug detection models during the experiments but do not contribute to the results reported in the study.
For each bug type, we also include the real-world benchmark as a test set.
For Python and JavaScript, we include the datasets preprocessed by the contextual mutator.
Files
javascript_bop_train_data.json
Total size: 2.1 GB
MD5 | Size |
---|---|
493f1125c582ca0c665c7a62007d48f2 | 240.9 MB |
086ff3a3cd12a1fc185439c38a07008e | 414.4 MB |
707370f1b477f042ac1fb13eeb22ac8c | 420.7 MB |
af06d5d9ff9cc96c7553c0b5ee482675 | 556.7 MB |
6ea1afa1a122a4df4bb55acd0cc8eb93 | 488.0 MB |
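Because each file is several hundred megabytes, it may be worth checking a download against the MD5 checksum listed for it before use. The following is a minimal sketch of such a check, not a tool shipped with this artifact: the helper names `md5_of_file` and `verify_download` are hypothetical, and the listing above does not show which checksum belongs to which file name, so the expected value has to be taken from the record page for the file in question.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks
    so that large dataset files never have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 matches the expected hex digest.
    The comparison ignores case and surrounding whitespace."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

A call would look like `verify_download("javascript_bop_train_data.json", expected)`, where `expected` is the checksum listed for that file. A result of `False` suggests a truncated or corrupted download.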