Today's cat is tomorrow's dog: accounting for time-based changes in the labels of ML vulnerability detection approaches (Replication Package Part 1: NVD Vuldeepecker Dataset)
Description
The Replication Package of
"Today's cat is tomorrow's dog: accounting for time-based changes in the labels of ML vulnerability detection approaches"
Part 1 (NVD Vuldeepecker Dataset)
- Code.zip contains the code to replicate parts of this study:
a. 1_generate_datasets implements our methodology to generate the datasets.
b. 2_run_models runs the ML models during the evaluation.
c. 3_result_replication generates the charts presented in the paper from the ML evaluation results.
- Datasets.zip contains 2 folders:
a. original datasets: 1 from NVD Vuldeepecker and 3 extracted from BigVul.
b. NVD Vuldeepecker datasets: train, validation, and test sets for each time of observation, extracted from the NVD Vuldeepecker dataset using our methodology.
- Pretrained-models.zip contains the models that we generated during our evaluation (3 test results for each time point in the timeline [2008-2019]).
- Results.zip contains the results of our evaluation; the folder ALL contains the overall results, and the other folders contain the results by model.
UPDATED version 8
- added a GLOBAL_README.md, which describes the 3 stages and how they are connected to each other
- updated LineVul.ipynb: import AdamW from torch.optim instead of transformers
- updated README.md in Code2Vec with the Java prerequisites for running gradlew for astminer
UPDATED version 9
- updated CodeBert.ipynb: import AdamW from torch.optim instead of transformers
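The import change noted for LineVul.ipynb and CodeBert.ipynb can be sketched as follows. This is a minimal illustration, not the notebooks' actual code: the helper name `load_adamw` is hypothetical, and returning `None` when PyTorch is not installed is an assumption made only so the snippet runs in any environment.

```python
import importlib


def load_adamw():
    """Return the AdamW optimizer class from torch.optim.

    Hypothetical helper: the notebooks are described as importing AdamW
    from torch.optim instead of transformers. Returning None when PyTorch
    is missing is an assumption made for this sketch.
    """
    try:
        optim = importlib.import_module("torch.optim")
    except ImportError:
        return None
    return getattr(optim, "AdamW", None)


AdamW = load_adamw()
if AdamW is None:
    print("PyTorch is not installed; AdamW is unavailable.")
else:
    print("Using AdamW from", AdamW.__module__)
```

In a notebook where PyTorch is known to be installed, the same change reduces to the single line `from torch.optim import AdamW`.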
Documentation
- INSTALL.pdf: how to install the code
- README.pdf: readme file
- REQUIREMENTS.pdf: hardware and software requirements
- STATUS.pdf: status for artifact submission
- LICENSE.pdf: the license of this artifact
- PAPER.pdf: the camera-ready version of the paper
Please refer to the following repositories for the other datasets and pre-trained models:
- Part 2 LINUX: https://doi.org/10.5281/zenodo.10960662
- Part 3 OPENSSL: https://doi.org/10.5281/zenodo.10966117
- Part 4 POPPLER: https://doi.org/10.5281/zenodo.14713143
This work was partly funded by the EU under the H2020 Program AssureMOSS (Grant n. 952647) and the Horizon Europe Program Sec4AI4Sec (Grant n. 101120393), by the Italian Ministry of University and Research (MUR) under the P.N.R.R. – NextGenerationEU grant n. PE00000014 (SERICS subproject COVERT), and by the Dutch Research Council (NWO) under the grant NWA.1215.18.006 (Theseus) and grant KIC1.VE01.20.004 (HEWSTI).
Files
Total size: 31.7 GB
| MD5 checksum | Size |
|---|---|
| 3d5903be490133fb9d053253bd49b1af | 73.3 MB |
| 7ffb179c4eab9f4e4b4962d54ff2dd75 | 27.2 MB |
| 057d12d8811871afb466f38a12a79547 | 196.5 kB |
| e0e70cefb02ddbef90f1cd1e36d32ea4 | 79.2 kB |
| a8b4661df6898e8a14602202a2faf6d0 | 4.5 MB |
| 9d6aa73d70b390ef62551dc7ae793e92 | 31.6 GB |
| 9df90f1bbb221c710fb773eac0cd1e54 | 1.2 kB |
| a85682a67548b11306555341e6b6a8a6 | 82.5 kB |
| d9840d04aa595a30d0fff030a280e33e | 143.0 kB |
| f2e0c799b9d047fdb40a9023c47b09f4 | 117.0 kB |
| ef03b78b116a4fe8c5af61e9a0eaae77 | 74.5 kB |
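The MD5 checksums listed for the files can be used to check that a download completed without corruption, which is worth doing for the larger archives. The following is a minimal sketch using only the Python standard library; the function names `md5_of_file` and `matches_checksum` are hypothetical and are not part of the replication package.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks.

    Chunked reading keeps memory use bounded for multi-gigabyte archives.
    """
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Return True if the file's MD5 equals the expected checksum.

    Accepts the checksum with or without the "md5:" prefix shown in the
    file listing, in either upper or lower case.
    """
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

To use it, call `matches_checksum` with the local path of a downloaded file and the checksum listed for that file; a `False` result suggests the file should be downloaded again.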
Additional details
Dates
- Updated: 2025-05-17