Published February 11, 2025
| Version v2
Dataset
Restricted
Today's cat is tomorrow's dog: accounting for time-based changes in the labels of ML vulnerability detection approaches (Replication Package Part 4: Poppler Dataset)
Authors/Creators
Description
The Replication Package of
"Today's cat is tomorrow's dog: accounting for time-based changes in the labels of ML vulnerability detection approaches"
Part 4 (POPPLER Dataset)
This repository includes:
- Code that contains the scripts to replicate some parts of this study:
a. 1_generate_datasets implements our methodology to generate the datasets.
b. 2_run_models runs the ML models during the evaluation.
c. 3_result_replication generates charts presented in the paper from the ML evaluation results.
- Datasets that contain 2 folders:
a. original datasets: 1 from NVD Vuldeepecker and 3 extracted from BigVul.
b. POPPLER datasets: train, validation, and test sets for each time of observation, extracted using our methodology from the BigVul dataset for the poppler project.
- Pretrained-models that we generated during our evaluation (3 test results for each time point in the timeline [2009-2018]).
- Results of our evaluation: the folder ALL contains the overall results, and the other folders contain the results by model.
Please refer to the following repositories for the other datasets and pre-trained models:
- Part 1 NVD Vuldeepecker : https://doi.org/10.5281/zenodo.8207883
- Part 2 LINUX : https://doi.org/10.5281/zenodo.10960662
- Part 3 OPENSSL : https://doi.org/10.5281/zenodo.10966117