There is a newer version of the record available.

Published July 31, 2022 | Version v1
Dataset Open

Generating Realistic Vulnerabilities via Neural Code Editing: An Empirical Study

Creators

Description

Using one commonly used synthetic dataset and one real-world dataset, we investigate the potential and gaps of three state-of-the-art neural code editors (Graph2Edit, Hoppity, SequenceR) for DL-based generation of realistic vulnerability data. We then use two state-of-the-art vulnerability detectors (Devign, ReVeal) to evaluate the effectiveness of the generated data.

Once Docker is installed, download the Docker image "neural_editors_vulgen_docker.tar.xz".

Then follow the detailed steps in README.md to reproduce the experiments.
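
Loading the downloaded image might look like the minimal sketch below. It assumes the archive was produced with `docker save`, so that `docker load -i` can read it directly; the record does not confirm this, and the README.md takes precedence if it says otherwise. The helper names `build_load_command` and `load_image` are hypothetical and not part of the artifact.

```python
import subprocess

# File name taken from the description above.
IMAGE_ARCHIVE = "neural_editors_vulgen_docker.tar.xz"


def build_load_command(archive_path=IMAGE_ARCHIVE):
    """Build the `docker load` invocation for an image archive.

    Assumption: the archive was created with `docker save`. `docker load`
    accepts tarballs compressed with xz, so no manual decompression is done.
    """
    return ["docker", "load", "-i", str(archive_path)]


def load_image(archive_path=IMAGE_ARCHIVE):
    """Run `docker load` and return its standard output.

    Raises subprocess.CalledProcessError if Docker reports a failure.
    """
    result = subprocess.run(
        build_load_command(archive_path),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `load_image()` from the directory that holds the archive runs the command; the returned text is whatever Docker prints about the loaded image.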

We also provide a plain package of the artifact, "neural_editors_vulgen.zip", which includes the raw data of our experiments. However, reproducing the experiments from this package requires setting up the environments and dependencies for all five tools, which is not recommended.

Files

Files (33.5 GB)

Checksum Size
md5:bd173b636e12cdb8152259d0f91a8563 4.4 GB
md5:de23ec2efc935e6b813adaa0fc565640 29.1 GB
md5:5a4f1553317a57dd8deabeecad42e55b 6.9 kB
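
Because the files are large, it may be worth checking a download against its listed MD5 checksum before using it. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `matches_listing` are hypothetical, and the listing here does not show which file name each checksum belongs to, so take the pairing from the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file.

    The file is read in fixed-size chunks so that multi-gigabyte
    downloads do not have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file with a checksum written as 'md5:<hex>' or bare hex."""
    expected = listed_checksum.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Pass the path of a downloaded file together with the checksum shown for it; a `False` result suggests an incomplete or corrupted download.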