Published January 15, 2026 | Version v1.0.0
Dataset Open

INRIA SHIELD Framework Dataset - 5G Jamming Attack Detection

  • 1. Centre de recherche Inria Lille - Nord Europe
  • 2. University of Padua

Contributors

  • 1. University College Dublin

Description

📌 Overview

This dataset contains physical layer (PHY) cellular network traces collected from an Android smartphone (OnePlus Nord 2T 5G) under 5G jamming attack scenarios.

It serves as the official training and validation data for the SHIELD Framework.

✍️ Authors

  • Jiali Xu (Inria Centre at the University of Lille)
  • Aya Moheddine (Inria Centre at the University of Lille)
  • Valéria Loscrì (Inria Centre at the University of Lille)
  • Alessandro Brighente (Department of Mathematics, University of Padova)
  • Mauro Conti (Department of Mathematics, University of Padova)

📂 File Description

1. Raw Data (data/raw/)

  • replay.log: The unprocessed Android radio log captured via adb logcat -b radio. It contains mixed streams of signal reports, thermal sensors, and modem debug messages.

2. Processed Data (data/processed/)

  • fused_input.csv: (Recommended for Use) The synchronized, feature-engineered dataset ready for Machine Learning.
    • Frequency: 1Hz (Resampled)
    • Dimensions: 60 Columns (+1 timestamp)
    • Format: Time-series matrix suited for LSTM/RNN models.

3. Configuration (config/)

  • 1plus-nord2t.yaml: Defines the Regular Expressions (Regex) used to parse the raw log file. It maps specific log tags (e.g., AT< +ECSQ) to data features.

4. Tooling (scripts/)

  • parse_data.py: A Python script that reads 1plus-nord2t.yaml to extract raw metrics from the log into intermediate CSVs.
  • fuse_data.py: A Python script that performs time-synchronization (linear interpolation) and feature extraction (rolling window statistics).
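
The time-synchronization step can be pictured with a minimal sketch. It is not the code in fuse_data.py: the function name resample_to_1hz, the sample values, and the timestamps are hypothetical, and it only assumes that each parsed metric arrives as an irregularly timestamped series that is linearly interpolated onto a 1-second grid.

```python
import pandas as pd


def resample_to_1hz(series: pd.Series) -> pd.Series:
    """Place an irregularly sampled series on a 1 Hz grid via linear interpolation in time."""
    # Regular 1-second grid covering the observed time span.
    grid = pd.date_range(
        series.index.min().ceil("s"), series.index.max().floor("s"), freq="s"
    )
    # Add the grid timestamps to the original ones, interpolate, keep only the grid.
    combined = series.reindex(series.index.union(grid))
    return combined.interpolate(method="time").reindex(grid)


# Hypothetical readings at 0 s, 2 s and 5 s (illustrative values, not from the dataset).
raw = pd.Series(
    [-90.0, -94.0, -100.0],
    index=pd.to_datetime(
        ["2026-01-01 00:00:00", "2026-01-01 00:00:02", "2026-01-01 00:00:05"]
    ),
    name="ssRsrp",
)
fused = resample_to_1hz(raw)
print(fused)
```

Interpolating on the union of original and grid timestamps, rather than on the grid alone, keeps the spacing between real samples in the weighting.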

📊 Dataset Schema (fused_input.csv)

The fused dataset contains 60 feature columns. These are derived from 12 Raw Metrics processed through 5 Statistical Aggregations over a 10-second sliding window.

The 12 Raw Metrics

  1. Signal Strength (3): ssRsrp, ssRsrq, ssSinr (Standard 5G metrics).
  2. Extended Quality (6): ecsq_idx0, ecsq_idx1, ecsq_idx2, ecsq_idx5, ecsq_idx6, ecsq_idx8 (Specific modem quality indices from AT+ECSQ).
  3. Thermal (2): thermal_idx3, thermal_idx5 (Device internal temperature sensors).
  4. RF Transmission (1): erftx_idx9 (Uplink transmission power state).

The 5 Aggregations (Suffixes)

For each raw metric above, the following statistics are calculated:

  • _mean: Average value over the window.
  • _max: Maximum value.
  • _min: Minimum value.
  • _std: Standard deviation (Stability indicator).
  • _amplitude: Difference between Max and Min (max - min).

Total Dimensions: 12 metrics × 5 aggregations = 60 Columns.

Example Column Names:

  • ssRsrp_mean (Average Signal Power)
  • ssSinr_std (Variability of the signal-to-interference-plus-noise ratio)
  • thermal_idx3_amplitude (Temperature fluctuation)
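
As a rough illustration of how 12 metrics become 60 columns, the sketch below applies the five aggregations over a 10-sample rolling window (10 seconds at 1 Hz). It runs on random placeholder data, and several details are assumptions rather than facts about fuse_data.py: the function name rolling_features, the use of min_periods=1, pandas' default sample standard deviation, and the column order.

```python
import numpy as np
import pandas as pd

# The 12 raw metric names listed in the schema above.
RAW_METRICS = [
    "ssRsrp", "ssRsrq", "ssSinr",
    "ecsq_idx0", "ecsq_idx1", "ecsq_idx2", "ecsq_idx5", "ecsq_idx6", "ecsq_idx8",
    "thermal_idx3", "thermal_idx5",
    "erftx_idx9",
]
WINDOW = 10  # samples, i.e. 10 seconds at 1 Hz


def rolling_features(df: pd.DataFrame, window: int = WINDOW) -> pd.DataFrame:
    """Compute mean, max, min, std and amplitude per column over a rolling window."""
    out = {}
    for col in df.columns:
        # min_periods=1 is an assumption; it lets the first rows use shorter windows.
        roll = df[col].rolling(window=window, min_periods=1)
        out[f"{col}_mean"] = roll.mean()
        out[f"{col}_max"] = roll.max()
        out[f"{col}_min"] = roll.min()
        out[f"{col}_std"] = roll.std()  # NaN on the first row (single sample)
        out[f"{col}_amplitude"] = roll.max() - roll.min()
    return pd.DataFrame(out)


# Placeholder 1 Hz data: 30 seconds of random values, not real measurements.
rng = np.random.default_rng(0)
raw = pd.DataFrame(rng.normal(size=(30, len(RAW_METRICS))), columns=RAW_METRICS)
features = rolling_features(raw)
print(features.shape)
```

Changing WINDOW to 5 would correspond to the 5-second variant mentioned in the usage instructions.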

🛠️ Usage Instructions

Option A: Quick Start (ML Training)

Load the pre-processed file directly into your model.

import pandas as pd

# Use the "timestep" column as the index so that only the 60 feature columns remain.
df = pd.read_csv("data/processed/fused_input.csv", index_col="timestep")
print(df.shape)
# Output: (Rows, 60)
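
LSTM/RNN models usually expect input shaped as (samples, time steps, features) rather than a flat matrix. The sketch below shows one way to cut the 60-column matrix into overlapping sequences. The helper make_sequences and the sequence length of 20 are hypothetical choices, and a placeholder array stands in for df.to_numpy().

```python
import numpy as np


def make_sequences(features: np.ndarray, seq_len: int) -> np.ndarray:
    """Stack overlapping windows into shape (num_windows, seq_len, num_features)."""
    if len(features) < seq_len:
        raise ValueError("need at least seq_len rows")
    # sliding_window_view (NumPy >= 1.20) appends the window axis last:
    # (num_windows, num_features, seq_len).
    windows = np.lib.stride_tricks.sliding_window_view(features, seq_len, axis=0)
    # Reorder to (num_windows, seq_len, num_features) and copy out of the read-only view.
    return windows.transpose(0, 2, 1).copy()


# Placeholder matrix with 100 rows and 60 feature columns.
matrix = np.arange(100 * 60, dtype=float).reshape(100, 60)
sequences = make_sequences(matrix, seq_len=20)
print(sequences.shape)  # (81, 20, 60)
```

With the real data, make_sequences(df.to_numpy(), seq_len=20) would produce the same layout, provided any NaN rows have been handled first.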

Option B: Reproduce the Pipeline

If you wish to change the preprocessing parameters (e.g., change window size from 10s to 5s), follow these steps:

1. Create a Python virtual environment:

python -m venv venv
source venv/bin/activate

2. Install Requirements:

pip install pandas pyyaml

3. Run the Parser: Extracts the raw numbers from data/raw/replay.log using the rules in config/1plus-nord2t.yaml.

python scripts/parse_data.py

Output: Creates a parsed_data/ folder with individual CSVs.

4. Run the Fuser: Synchronizes the data to 1Hz and calculates rolling statistics.

python scripts/fuse_data.py

Output: Generates a new fused_data.csv.

📚 Citation

If you use the SHIELD Framework or this dataset in your research, please cite the following paper:

@InProceedings{10.1007/978-3-032-00624-0_12,
  author="Xu, Jiali and Moheddine, Aya and Loscr{\`i}, Val{\'e}ria and Brighente, Alessandro and Conti, Mauro",
  title="SHIELD: Scalable and Holistic Evaluation Framework for ML-Based 5G Jamming Detection",
  booktitle="Availability, Reliability and Security",
  year="2025",
  publisher="Springer Nature Switzerland",
  address="Cham",
  pages="235--256",
  isbn="978-3-032-00624-0"
}

If you wish to cite this specific dataset version, please use the citation generated by Zenodo (located in the right sidebar of this record).

🇪🇺 Acknowledgment & Funding

This work is part of the MLSysOps project, funded by the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101092912.

Files

SHIELD_Dataset_v1.zip (319.1 kB)
md5:b8ec2824d648c18d502ea6bff9ec5639
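
The MD5 checksum listed for SHIELD_Dataset_v1.zip can be used to confirm that a download is intact. This is a minimal sketch using Python's standard hashlib module; the helper name md5_of_file and the assumption that the archive sits in the current directory are illustrative.

```python
import hashlib
from pathlib import Path

# Checksum published on this record for SHIELD_Dataset_v1.zip.
EXPECTED_MD5 = "b8ec2824d648c18d502ea6bff9ec5639"


def md5_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed location of the downloaded archive; adjust as needed.
archive = Path("SHIELD_Dataset_v1.zip")
if archive.exists():
    actual = md5_of_file(archive)
    print("checksum OK" if actual == EXPECTED_MD5 else f"checksum mismatch: {actual}")
else:
    print(f"{archive} not found; download it first")
```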

Additional details

Related works

Is supplement to
Conference proceeding: 10.1007/978-3-032-00624-0_12 (DOI)

Funding

European Commission
MLSysOps - Machine Learning for Autonomic System Operation in the Heterogeneous Edge-Cloud Continuum (grant agreement No. 101092912)