Reproducibility Artifacts for "Theseus: Navigating the Labyrinth of Evaluation Bias in Provenance-based Intrusion Detection"

Anonymous

doi:10.5281/zenodo.18489505

Published February 5, 2026 | Version v1

Resource type: Model | Access: Open

This repository contains the pre-computed artifacts required to reproduce the experimental results of the Theseus model presented in the paper "Theseus: Navigating the Labyrinth of Evaluation Bias in Provenance-based Intrusion Detection".

These artifacts let researchers skip the computationally intensive graph construction and model training steps and directly reproduce the Theseus evaluation metrics (Table 2 in the paper) from the exact checkpoints used for the reported results.

Contents

- Graph Construction Cache: Pre-processed PyTorch Geometric (PyG) data objects for the DARPA TC E3 datasets (Theia, Cadets, Trace, Fivedirections). These files contain the fully parsed provenance graphs with temporal isolation applied, ready for loading.
- Model Checkpoints: The specific trained model weights (.pt files) for Theseus used to generate the final results reported in the paper.
- Word2Vec Embeddings: Domain-specific semantic embeddings trained on the training splits of each dataset, required to embed the node features.
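
After extraction, a quick inventory can confirm that the three artifact groups listed above are present. The following is a minimal sketch that uses only the Python standard library. The folder names cache/ and checkpoints/ come from the Usage section; the helper name `summarize_artifacts` is hypothetical, and the internal layout of each folder (including where the Word2Vec embeddings live) is not documented here, so the sketch only counts files by extension.

```python
from collections import defaultdict
from pathlib import Path

# Folder names taken from the Usage section; adjust if the archive layout differs.
EXPECTED_DIRS = ("cache", "checkpoints")


def summarize_artifacts(project_root):
    """Return {(top_level_dir, suffix): [file_count, total_bytes]} for extracted artifacts."""
    root = Path(project_root)
    summary = defaultdict(lambda: [0, 0])
    for top in EXPECTED_DIRS:
        base = root / top
        if not base.is_dir():
            continue  # a missing folder is simply absent from the summary
        for path in base.rglob("*"):
            if path.is_file():
                entry = summary[(top, path.suffix or "<none>")]
                entry[0] += 1                    # number of files with this suffix
                entry[1] += path.stat().st_size  # cumulative size in bytes
    return dict(summary)
```

Calling `summarize_artifacts(".")` from the project root returns a dictionary; an empty result suggests the archive was extracted somewhere else.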

Usage

These artifacts are designed to be used in conjunction with the Theseus source code.

1. Download the archive.
2. Extract the archive directly into the project root directory. This will create the cache/ and checkpoints/ folders with the necessary files.
3. Run the evaluation script to verify the results reported in the paper:
```
./scripts/reproduce_results.sh
```
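
The extraction step can also be scripted. The sketch below is one possible approach using Python's standard `zipfile` module, not part of the Theseus source code. The function name `extract_artifacts` is hypothetical, and the check assumes that cache/ and checkpoints/ sit at the top level of the archive, as the extraction step describes.

```python
import zipfile
from pathlib import Path

# Folders the extraction step is expected to create in the project root.
EXPECTED_DIRS = ("cache", "checkpoints")


def extract_artifacts(archive_path, project_root):
    """Extract the archive into project_root; return expected folders still missing."""
    root = Path(project_root)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(root)
    # An empty list means both folders now exist under the project root.
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

For example, `extract_artifacts("theseus_artifacts.zip", ".")` should return an empty list when run from the project root. A non-empty list may indicate that the archive wraps its contents in an extra top-level folder, in which case the folders need to be moved before running the evaluation script.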

Datasets Covered

DARPA Transparent Computing E3 (Theia, Cadets, Fivedirections, Trace)

Files

Name	Size	MD5
theseus_artifacts.zip	20.1 GB	c709103b179220da3ae76e79193f1a86
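
Because the archive is large, it may be worth checking the download against the MD5 checksum listed for it before extracting. The following is a minimal sketch using Python's standard `hashlib` module; the function names `md5_of_file` and `archive_is_intact` are hypothetical, and the file is read in chunks so that the 20.1 GB archive does not need to fit in memory.

```python
import hashlib

# MD5 checksum as listed for theseus_artifacts.zip in this record.
EXPECTED_MD5 = "c709103b179220da3ae76e79193f1a86"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

A call such as `archive_is_intact("theseus_artifacts.zip")` returning `False` suggests a truncated or corrupted download. MD5 is used here only because it is the checksum this record provides; it detects accidental corruption, not deliberate tampering.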
