System for Prediction and Early Detection of Insider Attacks (SPEDIA) Dataset

David Álvarez; Luis Pérez; Alberto Mateo; Xavier Larriva-Novo; Manuel Álvarez-Campana; Víctor A. Villagra

doi:10.5281/zenodo.15495572

Published May 23, 2025 | Version 1.0

Resource type: Dataset | Access: Open

1. Universidad Politécnica de Madrid

The SPEDIA dataset was developed as part of an academic cybersecurity project focused on insider threat detection and analysis. It was generated through a 30-day cyber exercise in which real users with technical backgrounds performed realistic insider attacks based on the MITRE ATT&CK framework.

The dataset integrates data from three sources:

Malicious activity performed by real participants during the cyber exercise.
Non-malicious activity simulated via a role-based behavioral model.
Synthetic events derived from the CERT Insider Threat dataset.

The dataset includes over 20 fields per event, covering SSH and FTP connections, command execution, HTTP and email activity, and file modifications, among others. Malicious and non-malicious events are balanced, which makes the dataset suitable for training supervised anomaly detection models.
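The balance between malicious and non-malicious events can be checked before training. The following is a minimal sketch using only the Python standard library. The label column name is not stated in this record, so `label_column` is a hypothetical parameter that must be set to whatever the CSV header actually uses. A comma-delimited, UTF-8 file with a header row is also assumed.

```python
import csv
from collections import Counter


def label_counts(csv_path, label_column):
    """Count how many events carry each value of the label column.

    `label_column` is an assumption: this record does not name the
    column that marks an event as malicious or non-malicious.
    """
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if label_column not in (reader.fieldnames or []):
            raise KeyError(f"column {label_column!r} not found in {csv_path}")
        for row in reader:
            counts[row[label_column]] += 1
    return counts


def class_shares(counts):
    """Convert raw counts into fractions of the total number of events."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}
```

Shares close to 0.5 for each of two classes would be consistent with the balanced distribution described above.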

Applications:

Training and evaluation of insider threat detection models.
Behavioral analysis of users in controlled network environments.
Validation of incident response and risk assessment tools.

Format: CSV (cleaned version, with 23 key columns)
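A quick way to confirm that a downloaded copy matches the stated format is to read the header row and count its columns. This sketch assumes a comma-delimited, UTF-8 file whose first row holds the column names. None of these details are confirmed by this record, and the column names themselves are not listed here.

```python
import csv


def read_header(csv_path):
    """Return the first row of the CSV, taken to be the column names."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle), [])


def has_expected_width(csv_path, expected_columns=23):
    """Check that the header has the 23 columns stated for the cleaned version."""
    return len(read_header(csv_path)) == expected_columns
```

For example, `has_expected_width("logs_SPEDIA.csv")` would return `False` if the file were parsed with the wrong delimiter, which is a cheap sanity check before any further processing.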

Files

Name	Size	MD5
logs_SPEDIA.csv	43.0 MB	a78dd427e8fc2dab15aefd54f27557e6
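The MD5 digest published with the file can be used to verify a download. The sketch below hashes the file in chunks with Python's `hashlib`. The helper names and the local path are illustrative assumptions, and only the digest value comes from this record.

```python
import hashlib

# MD5 digest published for logs_SPEDIA.csv in this record.
EXPECTED_MD5 = "a78dd427e8fc2dab15aefd54f27557e6"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=EXPECTED_MD5):
    """Return True when the file's digest equals the published one."""
    return md5_of_file(path) == expected.lower()
```

A call such as `matches_published_checksum("logs_SPEDIA.csv")` should return `True` for an intact copy of version 1.0. MD5 is adequate for detecting transfer corruption, though it is not a guarantee against deliberate tampering.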

Additional details

Repository URL: https://github.com/UPM-RSTI/SPEDIA-Dataset

	All versions	This version
Views	127	65
Downloads	120	57
Data volume	8.8 GB	4.0 GB
