TUD Anomaly Detection Model (ONNX)
Model Info
This repository contains a trained Autoencoder-based anomaly detection model developed in the context of the MLSysOps project (Machine Learning for Autonomic System Operation in the Heterogeneous Edge-Cloud Continuum), funded by the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101092912.
The model is exported in ONNX format for efficient inference on edge or cloud devices.
Purpose
This model performs unsupervised anomaly detection on node/VM telemetry metrics by learning to reconstruct normal observations.
- Input: A feature vector of telemetry metrics (float values), normalized with Min-Max scaling.
- Output: The reconstructed feature vector.
- Anomaly score: RMSE between input and reconstruction.
- Decision rule: a sample is an anomaly if RMSE > threshold (the threshold is stored in model_config.json).
Repository Structure
The repository provides the trained model and its configuration for easy deployment.
.
├── demo.py # Inference script (ONNXRuntime)
├── model/
│ ├── autoencoder.onnx # ONNX model
│ └── model_config.json # Model configuration (features, normalization, threshold)
├── requirements.txt # Python dependencies
└── README.md # Documentation
Training Data
The model was trained on telemetry data representing normal system behavior. The training dataset is not included in this Zenodo record unless explicitly provided in the uploaded files.
Important: The inference input must use the same feature ordering as the training data.
Features Used (Feature Order)
The expected feature order (last dimension of the input tensor) is:
- cpu_0_idle
- cpu_0_iowait
- cpu_0_irq
- cpu_0_nice
- cpu_0_softirq
- cpu_0_steal
- cpu_0_system
- cpu_0_user
- cpu_1_idle
- cpu_1_iowait
- cpu_1_irq
- cpu_1_nice
- cpu_1_softirq
- cpu_1_steal
- cpu_1_system
- cpu_1_user
- cpu_2_idle
- cpu_2_iowait
- cpu_2_irq
- cpu_2_nice
- cpu_2_softirq
- cpu_2_steal
- cpu_2_system
- cpu_2_user
- cpu_3_idle
- cpu_3_iowait
- cpu_3_irq
- cpu_3_nice
- cpu_3_softirq
- cpu_3_steal
- cpu_3_system
- cpu_3_user
- memory_used_bytes
- node_memory_Buffers_bytes
- node_memory_Cached_bytes
- node_memory_MemAvailable_bytes
- node_memory_MemFree_bytes
- node_memory_MemTotal_bytes
(These names must match model/model_config.json.)
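Because the list follows a regular pattern (8 counters for each of 4 CPUs, then 6 memory metrics), it can be rebuilt and checked in code. The following is a minimal sketch; the helper names `expected_feature_names` and `check_feature_order` are hypothetical and are not taken from demo.py.

```python
# Rebuild the 38 feature names in the documented order.
CPU_FIELDS = ["idle", "iowait", "irq", "nice", "softirq", "steal", "system", "user"]
MEMORY_FEATURES = [
    "memory_used_bytes",
    "node_memory_Buffers_bytes",
    "node_memory_Cached_bytes",
    "node_memory_MemAvailable_bytes",
    "node_memory_MemFree_bytes",
    "node_memory_MemTotal_bytes",
]


def expected_feature_names(num_cpus=4):
    """Return the feature names: all CPU counters first, then memory metrics."""
    cpu = [f"cpu_{i}_{field}" for i in range(num_cpus) for field in CPU_FIELDS]
    return cpu + MEMORY_FEATURES


def check_feature_order(names):
    """Raise ValueError if `names` differs from the documented order."""
    expected = expected_feature_names()
    if list(names) != expected:
        raise ValueError("feature names or their order do not match the documented list")
```

A check like this could be run against the column names of any input before inference, since a reordered input would still have the right shape and would fail silently.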
Model Architecture
This model is a fully-connected Autoencoder with ReLU activations:
- Encoder dims: feature_size -> int(0.75*feature_size) -> int(0.5*feature_size) -> int(0.25*feature_size) -> int(0.1*feature_size)
- Decoder dims: symmetric back to feature_size
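As a worked example, the layer widths for feature_size = 38 can be computed directly from the formula above. This sketch assumes "symmetric" means the decoder mirrors the encoder widths exactly; the function names are illustrative only.

```python
def encoder_dims(feature_size):
    """Layer widths of the encoder, from the input down to the bottleneck."""
    fractions = (0.75, 0.5, 0.25, 0.1)
    return [feature_size] + [int(f * feature_size) for f in fractions]


def autoencoder_dims(feature_size):
    """Encoder widths followed by the mirrored decoder widths (assumed symmetric)."""
    enc = encoder_dims(feature_size)
    # enc[-2::-1] walks back from the layer before the bottleneck to the input width.
    return enc + enc[-2::-1]


print(autoencoder_dims(38))  # [38, 28, 19, 9, 3, 9, 19, 28, 38]
```

Note that int() truncates, so 0.75 * 38 = 28.5 becomes 28 and 0.1 * 38 = 3.8 becomes 3, giving a 3-dimensional bottleneck.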
Model Specification
Inputs
- Input name: x
- Shape: [batch_size, 38]
- Type: float32
- Description: Min-Max normalized feature vector
Preprocessing
- x_norm = (x - min) / (max - min)
- If a feature has max == min (constant feature in training), normalization must avoid division by zero (recommended: set the normalized feature to 0.0).
- Optionally clamp x_norm to [0, 1] if desired (configurable via model_config.json).
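The preprocessing rules above could be implemented along these lines with NumPy. This is a sketch, not the code from demo.py: the function name, its arguments, and the `clamp` flag are assumptions, and the per-feature min/max arrays are expected to be loaded from model_config.json by the caller (the key names in that file are not documented here).

```python
import numpy as np


def min_max_normalize(x, feat_min, feat_max, clamp=False):
    """Min-Max scale x of shape [batch_size, n_features] using training min/max."""
    x = np.asarray(x, dtype=np.float32)
    feat_min = np.asarray(feat_min, dtype=np.float32)
    feat_max = np.asarray(feat_max, dtype=np.float32)

    span = feat_max - feat_min
    constant = span == 0  # features with max == min in the training data
    safe_span = np.where(constant, 1.0, span)  # avoid division by zero

    x_norm = (x - feat_min) / safe_span
    x_norm = np.where(constant, 0.0, x_norm)  # recommended value for constant features
    if clamp:
        x_norm = np.clip(x_norm, 0.0, 1.0)
    return x_norm.astype(np.float32)
```

Without clamping, values outside the training range map outside [0, 1], which tends to increase the reconstruction error for out-of-distribution inputs.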
Outputs
- Output name: reconstruction
- Shape: [batch_size, 38]
- Type: float32
- Description: Reconstructed feature vector
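Given the input and output names listed above, a call through ONNXRuntime might look like the sketch below. The helpers `open_session` and `reconstruct` are hypothetical names, not part of demo.py, and the choice of the CPU execution provider is an assumption.

```python
import numpy as np


def open_session(model_path="model/autoencoder.onnx"):
    """Create an ONNXRuntime session (imported lazily; CPU provider assumed)."""
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def reconstruct(session, x_norm):
    """Run the autoencoder on a normalized batch of shape [batch_size, 38]."""
    x_norm = np.ascontiguousarray(x_norm, dtype=np.float32)
    if x_norm.ndim != 2 or x_norm.shape[1] != 38:
        raise ValueError(f"expected shape [batch_size, 38], got {x_norm.shape}")
    # session.run returns one array per requested output name.
    (reconstruction,) = session.run(["reconstruction"], {"x": x_norm})
    return reconstruction
```

A typical call would be `reconstruct(open_session(), x_norm)`, where `x_norm` has already been Min-Max normalized.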
Post-processing (Anomaly Detection)
- rmse = sqrt(mean((x_norm - reconstruction)^2)) per sample
- anomaly = 1 if rmse > threshold else 0
- threshold is stored in model/model_config.json
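The scoring rule above translates to a few lines of NumPy. This is a minimal sketch with illustrative function names; the threshold is passed in by the caller, who is assumed to have read it from model/model_config.json.

```python
import numpy as np


def anomaly_scores(x_norm, reconstruction):
    """Per-sample RMSE between normalized input and reconstruction."""
    x_norm = np.asarray(x_norm, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    # The mean is taken over the feature axis, giving one score per sample.
    return np.sqrt(np.mean((x_norm - reconstruction) ** 2, axis=1))


def flag_anomalies(x_norm, reconstruction, threshold):
    """Return (rmse, labels), where labels are 1 if rmse > threshold else 0."""
    rmse = anomaly_scores(x_norm, reconstruction)
    return rmse, (rmse > threshold).astype(np.int64)
```

The comparison is strict, so a sample whose RMSE equals the threshold exactly is labeled normal (0), matching the rule as stated.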
Limitations
- Feature order & dimension are fixed: Inputs must have exactly 38 features in the specified order.
- Normalization is training-dependent: Min/Max parameters are derived from the training data distribution; out-of-distribution inputs may yield unreliable anomaly scores.
- Constant features: Features with max == min require special handling during normalization (avoid division by zero).
- ONNX output is reconstruction only: The anomaly score/label is computed in the inference script.
Usage Demo
1. Setup Environment
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
2. Run Inference Script
python demo.py --model model/autoencoder.onnx --config model/model_config.json --csv telemetry.csv --row 0
CSV Format Requirements
- CSV must include a header row.
- Numeric columns only (or ensure the numeric columns match the 38 features exactly).
- Column order must match the feature list and model_config.json.
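For callers who load CSV rows themselves, one way to guard against a wrong column order is to select columns by header name. The sketch below does that with the standard csv module; it is an assumption about how a caller might read the file, not a description of what demo.py does, and `read_feature_row` is a hypothetical helper.

```python
import csv


def read_feature_row(csv_file, feature_names, row_index=0):
    """Return one data row as floats, ordered by `feature_names`.

    `csv_file` is an open text file (or any iterable of lines) with a header row.
    """
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    missing = [name for name in feature_names if name not in header]
    if missing:
        raise ValueError(f"CSV is missing columns: {missing}")
    for i, row in enumerate(reader):
        if i == row_index:
            # Look up by name so the result follows the model's feature order.
            return [float(row[name]) for name in feature_names]
    raise IndexError(f"CSV has no data row {row_index}")
```

Here `row_index` plays the same role as the --row argument in the command above: 0 refers to the first data row after the header.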
Citation
If you wish to cite this model, please use the citation generated by Zenodo (located in the right sidebar of this record).
Acknowledgement & Funding
This work is part of the MLSysOps project, funded by the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101092912.
More information about the project is available at https://mlsysops.eu/
Files
- TUD_anomaly_detection_model.zip (20.2 kB), md5:c7c161d098380f65558e751fbd02b5c0
Additional details
Related works
- Is derived from: Dataset, 10.5281/zenodo.18311353 (DOI)
- Is supplement to: Software, https://github.com/mlsysops-eu/model-anomaly-detection (URL)
Software
- Repository URL: https://github.com/mlsysops-eu/model-anomaly-detection