Published October 4, 2021 | Version v1
Dataset Open

Datasets for "Learning Realistic Mutations: Bug Creation for Neural Bug Detectors"

  • 1. Carl von Ossietzky University of Oldenburg

Description

This artifact contains the datasets used in "Learning Realistic Mutations: Bug Creation for Neural Bug Detectors".

Included are preprocessed Java datasets. Starting from CodeSearchNet, each dataset is seeded with bugs of one specific bug type. We distinguish binary operator bugs, VarMisuse bugs, and function misuses. For each bug type, we employed three levels of mutator: weak, strong, and contextual.
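To make the idea of seeding a bug concrete, the following is a minimal sketch of a binary operator mutation. It is not the mutator used in the study and does not reproduce the weak, strong, or contextual levels: it picks one comparison in a Python snippet and swaps its operator uniformly at random. The function name `seed_binary_operator_bug` and the operator pool `COMPARISON_OPS` are assumptions made for illustration, and `ast.unparse` requires Python 3.9 or newer.

```python
import ast
import random

# Hypothetical operator pool for illustration; the operator set used in the
# study is not listed on this page.
COMPARISON_OPS = [ast.Lt, ast.LtE, ast.Gt, ast.GtE, ast.Eq, ast.NotEq]


def seed_binary_operator_bug(source: str, rng: random.Random) -> str:
    """Return `source` with one comparison operator replaced by a different one.

    If the snippet contains no single-operator comparison from the pool,
    the source is returned unchanged.
    """
    tree = ast.parse(source)
    # Collect simple comparisons such as `a < b` (exactly one operator).
    targets = [
        node
        for node in ast.walk(tree)
        if isinstance(node, ast.Compare)
        and len(node.ops) == 1
        and type(node.ops[0]) in COMPARISON_OPS
    ]
    if not targets:
        return source
    node = rng.choice(targets)
    original = type(node.ops[0])
    # Choose uniformly among the other operators (an assumption, not the
    # authors' sampling strategy).
    candidates = [op for op in COMPARISON_OPS if op is not original]
    node.ops[0] = rng.choice(candidates)()
    return ast.unparse(tree)


if __name__ == "__main__":
    snippet = "def is_smaller(a, b):\n    return a < b\n"
    print(seed_binary_operator_bug(snippet, random.Random(0)))
```

The sketch works on Python source only because the standard library ships a Python parser; applying the same idea to the Java datasets would require a Java parser.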

We also include validation sets, which are used during the experiments to validate the bug detection models but are not related to the experimental results reported in the study.

For each bug type, we also include the corresponding real-world benchmark as a test set.

For Python and JavaScript, we include the datasets preprocessed by the contextual mutator.

Files

javascript_bop_train_data.json

Files (2.1 GB)

MD5 checksum                           Size
md5:493f1125c582ca0c665c7a62007d48f2   240.9 MB
md5:086ff3a3cd12a1fc185439c38a07008e   414.4 MB
md5:707370f1b477f042ac1fb13eeb22ac8c   420.7 MB
md5:af06d5d9ff9cc96c7553c0b5ee482675   556.7 MB
md5:6ea1afa1a122a4df4bb55acd0cc8eb93   488.0 MB
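Because the files are large, it can be worth checking a download against the MD5 checksum listed above. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listed_md5` are assumptions made for illustration, and since this listing does not show which checksum belongs to which file name, the pairing has to be taken from the record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as `matches_listed_md5(local_path, "md5:493f1125c582ca0c665c7a62007d48f2")` returns `True` only if the file at `local_path` (a placeholder for wherever the download was saved) has exactly that checksum. Reading in chunks keeps memory use constant even for files of several hundred megabytes.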