AnoMod: A Dataset for Anomaly Detection and Root Cause Analysis in Microservice Systems
Description
AnoMod is a multimodal anomaly dataset built on two open-source microservice systems, SocialNetwork and TrainTicket. The goal of the dataset is to support research on anomaly detection and fine-grained root cause analysis (RCA) in complex microservice architectures.
We design and inject four categories of anomalies—performance-level, service-level, database-level and code-level—to emulate realistic failure scenarios observed in production systems. For each anomaly case, we collect five data modalities:
1. Logs: timestamped textual records from application containers capturing events, errors and contextual information.
2. Metrics: time-series measurements collected via Prometheus (CPU, memory, disk, network, request counts and process-level indicators).
3. Distributed traces: request-level call chains captured with Jaeger (for SocialNetwork) and SkyWalking (for TrainTicket), revealing service dependencies and latency.
4. API responses: client-side observations of HTTP status codes, latencies and response bodies, providing a black-box view of user-visible behaviour.
5. Code coverage reports: coverage information generated by gcov (C++) and JaCoCo (Java) linking runtime execution to source code lines and branches, enabling fine-grained RCA.
Data were collected using an automated three-phase pipeline: (i) EvoMaster generates workloads based on OpenAPI specifications to simulate realistic user requests; (ii) anomalies are injected using a predefined anomaly library with configurations adapted from ChaosMesh/ChaosBlade; (iii) for each anomaly case, the system is reset to a clean state, the anomaly is activated, the workload is executed and all five modalities are captured synchronously. Each anomaly run is saved in its own folder.
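The per-case procedure in phase (iii) can be sketched as a small driver. This is a minimal illustration, not the dataset's actual collection scripts: `AnomalyCase`, `RunRecord`, `run_case` and the four callables are hypothetical names, and the sketch collects each modality after the workload finishes, whereas the description above says the modalities are captured synchronously with the run.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# The five modalities named in the description (identifiers are illustrative).
MODALITIES = ("logs", "metrics", "traces", "api_responses", "coverage")


@dataclass
class AnomalyCase:
    """Hypothetical description of one entry in the anomaly library."""
    name: str
    category: str  # e.g. "performance", "service", "database" or "code"


@dataclass
class RunRecord:
    """What one anomaly run produced: the ordered steps and one artifact per modality."""
    case: AnomalyCase
    steps: List[str] = field(default_factory=list)
    captured: Dict[str, str] = field(default_factory=dict)


def run_case(
    case: AnomalyCase,
    reset: Callable[[], None],
    inject: Callable[[AnomalyCase], None],
    run_workload: Callable[[], None],
    collect: Callable[[str, AnomalyCase], str],
) -> RunRecord:
    """Reset, inject, drive the workload, then gather every modality for one case."""
    record = RunRecord(case)
    reset()
    record.steps.append("reset")
    inject(case)
    record.steps.append("inject")
    run_workload()
    record.steps.append("workload")
    for modality in MODALITIES:
        # Simplification: sequential collection instead of synchronous capture.
        record.captured[modality] = collect(modality, case)
    record.steps.append("collect")
    return record


if __name__ == "__main__":
    # Stub callables stand in for the real system, injector and workload.
    demo = run_case(
        AnomalyCase(name="example_case", category="performance"),
        reset=lambda: None,
        inject=lambda case: None,
        run_workload=lambda: None,
        collect=lambda modality, case: f"{case.name}/{modality}",
    )
    print(demo.steps)
    print(sorted(demo.captured))
```

Passing the phases in as callables keeps the ordering logic separate from any particular injector or workload generator, so the same loop could wrap either target system.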
The dataset supports cross-modal anomaly detection, fusion/ablation studies and service/code-level RCA research. For detailed directory structure and instructions to access the data and scripts, see the README files in the GitHub repository.
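For fusion and ablation studies, one common setup is to evaluate a detector on every combination of modalities, or on each leave-one-out configuration. The sketch below only enumerates those configurations; the modality identifiers and the function names `modality_subsets` and `leave_one_out` are illustrative assumptions, not names taken from the dataset or its scripts.

```python
from itertools import combinations

# Illustrative identifiers for the five modalities in the dataset description.
MODALITIES = ("logs", "metrics", "traces", "api_responses", "coverage")


def modality_subsets(modalities=MODALITIES):
    """Yield every non-empty subset of modalities, smallest subsets first."""
    for size in range(1, len(modalities) + 1):
        for subset in combinations(modalities, size):
            yield subset


def leave_one_out(modalities=MODALITIES):
    """Return one configuration per modality, each with that modality dropped."""
    return [
        tuple(m for m in modalities if m != dropped)
        for dropped in modalities
    ]


if __name__ == "__main__":
    print(len(list(modality_subsets())))  # 31 non-empty subsets of 5 modalities
    for config in leave_one_out():
        print(config)
```

With five modalities there are 2^5 - 1 = 31 non-empty subsets, so a full ablation is still tractable; leave-one-out needs only five runs and is a cheaper first pass.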
This dataset accompanies the paper ‘AnoMod: A Dataset for Anomaly Detection and Root Cause Analysis in Microservice Systems’ published at MSR’26 (Rio de Janeiro, April 13–14, 2026).
Files
| Name | Size | Checksum |
|---|---|---|
| AnoMod.zip | 201.9 MB | md5:30c5835190c3550c5efc34d6be4f9238 |
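After downloading, the archive can be checked against the MD5 checksum listed above. The following is a minimal sketch using Python's standard `hashlib`; the local path `AnoMod.zip` in the working directory and the helper names `md5_of_file` and `verify_archive` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# Checksum as published in the file listing for AnoMod.zip.
EXPECTED_MD5 = "30c5835190c3550c5efc34d6be4f9238"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("AnoMod.zip")  # assumed download location
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

Reading in chunks avoids loading the whole 201.9 MB archive into memory at once.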
Additional details
Related works
- Is described by: Conference paper, DOI 10.1145/3793302.3793324