There is a newer version of the record available.

Published July 31, 2022 | Version v1
Dataset Open

Generating Realistic Vulnerabilities via Neural Code Editing: An Empirical Study



Using one widely used synthetic dataset and one real-world dataset, we investigate the potential and gaps of three state-of-the-art neural code editors (Graph2Edit, Hoppity, SequenceR) for DL-based generation of realistic vulnerability data. We then use two state-of-the-art vulnerability detectors (Devign, ReVeal) to evaluate the effectiveness of the generated data.

Once Docker is installed, download the Docker image "neural_editors_vulgen_docker.tar.xz".

Then, check the for detailed steps of reproducing the experiments.
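
As a rough illustration of the step after downloading, the image archive can be loaded into a local Docker daemon. The sketch below assumes the archive was produced with `docker save` (so that `docker load` applies) and that it sits in the current working directory; neither is stated in this record. The helper names `docker_load_command` and `load_image` are hypothetical and are not part of the artifact.

```python
import shutil
import subprocess
from pathlib import Path

# File name as given in this record; its location on disk is an assumption.
IMAGE_ARCHIVE = Path("neural_editors_vulgen_docker.tar.xz")


def docker_load_command(archive: Path) -> list[str]:
    """Build the `docker load` invocation for a saved image archive."""
    return ["docker", "load", "--input", str(archive)]


def load_image(archive: Path) -> None:
    """Load the archive into the local Docker daemon, then list local images."""
    if shutil.which("docker") is None:
        raise RuntimeError("Docker CLI not found on PATH; install Docker first.")
    if not archive.is_file():
        raise FileNotFoundError(f"Image archive not found: {archive}")
    # `docker load` accepts tar archives, including xz-compressed ones.
    subprocess.run(docker_load_command(archive), check=True)
    # The image name and tag are not stated in this record,
    # so list the local images to find what was loaded.
    subprocess.run(["docker", "images"], check=True)
```

Calling `load_image(IMAGE_ARCHIVE)` performs the load. The 29.1 GB file listed in this record is presumably the image archive, in which case loading may take a while and needs comparable free disk space. How to start a container from the image is not covered here, since the image name and entry point are not given in this record.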

We also provide a simple package of the artifact "". The raw data of our experiments is included in this package as well. However, using it to reproduce the experiments requires users to set up the environments and dependencies for all five tools, which is not recommended.


Files (33.5 GB)

Name Size
4.4 GB
29.1 GB
6.9 kB