There is a newer version of the record available.

Published February 11, 2025 | Version v2
Dataset Restricted

Today's cat is tomorrow's dog: accounting for time-based changes in the labels of ML vulnerability detection approaches (Replication Package Part 4: Poppler Dataset)

  • 1. University of Trento
  • 2. Vrije Universiteit Amsterdam

Description

The Replication Package of

"Today's cat is tomorrow's dog: accounting for time-based changes in the labels of ML vulnerability detection approaches"

Part 4 (POPPLER Dataset)

This repository includes:
  1. Code to replicate parts of this study:
    a. 1_generate_datasets implements our methodology for generating the datasets.
    b. 2_run_models runs the ML models for the evaluation.
    c. 3_result_replication generates the charts presented in the paper from the ML evaluation results.
  2. Datasets, organized in 2 folders:
    a. original datasets: 1 from NVD VulDeePecker and 3 extracted from BigVul.
    b. POPPLER datasets: train, validation, and test sets for each time of observation, extracted with our methodology from the BigVul dataset for the poppler project.
  3. Pretrained models generated during our evaluation (3 test results for each time point in the timeline [2009-2018]).
  4. Results of our evaluation: the folder ALL contains the overall results, and the other folders contain the results by model.
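
The layout described above can be checked for completeness with a short script. The following is only a minimal sketch under stated assumptions: it assumes one time of observation per calendar year in [2009-2018] and uses the split labels "train", "validation", and "test" as plain strings. The names `expected_splits` and `missing_splits` are hypothetical helpers and are not part of the replication package, whose actual file and folder names may differ.

```python
from itertools import product

# Assumptions (not confirmed by the record): one observation per calendar
# year, and three splits per observation labelled as below.
YEARS = range(2009, 2019)  # 2009..2018 inclusive
SPLITS = ("train", "validation", "test")
RUNS_PER_TIME_POINT = 3  # "3 test results for each time point"


def expected_splits(years=YEARS, splits=SPLITS):
    """Return every (year, split) pair a complete download should cover."""
    return list(product(years, splits))


def missing_splits(found, years=YEARS, splits=SPLITS):
    """Return the expected (year, split) pairs that are absent from `found`."""
    found = set(found)
    return [pair for pair in expected_splits(years, splits) if pair not in found]


if __name__ == "__main__":
    print(len(expected_splits()), "dataset splits expected")
    print(len(YEARS) * RUNS_PER_TIME_POINT, "test results expected")
```

To use it, collect the (year, split) pairs actually present after unpacking the archive and pass them to `missing_splits`; an empty list means nothing is missing under the assumptions above.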
 
Please refer to the following repositories for the other datasets and pre-trained models:

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Additional details

Funding

European Commission
AssureMOSS - Assurance and certification in secure Multi-party Open Software and Services. 952647
European Commission
Sec4AI4Sec - Cybersecurity for AI-Augmented Systems 101120393
Dutch Research Council
Theseus NWA.1215.18.006
Dutch Research Council
HEWSTI KIC1.VE01.20.004