Published April 13, 2026 | Version v1
Dataset Open

AnoMod: A Dataset for Anomaly Detection and Root Cause Analysis in Microservice Systems

Description

AnoMod is a multimodal anomaly dataset built on two open-source microservice systems, SocialNetwork and TrainTicket. The goal of the dataset is to support research on anomaly detection and fine-grained root cause analysis (RCA) in complex microservice architectures.

We design and inject four categories of anomalies (performance-level, service-level, database-level and code-level) to emulate realistic failure scenarios observed in production systems. For each anomaly case, we collect five data modalities:
1. Logs: timestamped textual records from application containers capturing events, errors and contextual information.  
2. Metrics: time-series measurements collected via Prometheus (CPU, memory, disk, network, request counts and process-level indicators).  
3. Distributed traces: request-level call chains captured with Jaeger (for SocialNetwork) and SkyWalking (for TrainTicket), revealing service dependencies and latency.  
4. API responses: client-side observations of HTTP status codes, latencies and response bodies, providing a black-box view of user-visible behaviour.  
5. Code coverage reports: coverage information generated by gcov (C++) and JaCoCo (Java) linking runtime execution to source code lines and branches, enabling fine-grained RCA.

Data were collected using an automated three-phase pipeline: (i) EvoMaster generates workloads from OpenAPI specifications to simulate realistic user requests; (ii) anomalies are injected from a predefined anomaly library whose configurations are adapted from ChaosMesh/ChaosBlade; (iii) for each anomaly case, the system is reset to a clean state, the anomaly is activated, the workload is executed and all five modalities are captured synchronously. Each anomaly run is saved in its own folder.
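
The per-case loop in phase (iii) can be sketched roughly as follows. This is a minimal illustration, not the authors' collection scripts: the `AnomalyCase` type, the `run_case` function, the four callables and the modality folder names are all hypothetical, and the sketch collects modalities after the workload finishes, whereas the real pipeline captures them synchronously while the workload runs.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable
import tempfile

# Hypothetical folder names for the five modalities (an assumption,
# not the dataset's documented layout; see the repository README).
MODALITIES = ["logs", "metrics", "traces", "api_responses", "coverage"]


@dataclass
class AnomalyCase:
    name: str      # hypothetical case identifier, used as the folder name
    category: str  # performance, service, database or code


def run_case(
    case: AnomalyCase,
    out_root,
    reset: Callable[[], None],
    inject: Callable[[AnomalyCase], None],
    run_workload: Callable[[], None],
    collect: Callable[[str, Path], None],
) -> Path:
    """Run one anomaly case and store its data in its own folder."""
    run_dir = Path(out_root) / case.name
    run_dir.mkdir(parents=True, exist_ok=True)

    reset()          # return the system to a clean state
    inject(case)     # activate the anomaly
    run_workload()   # replay the generated requests

    # Simplification: collect each modality after the workload ends.
    for modality in MODALITIES:
        target = run_dir / modality
        target.mkdir(exist_ok=True)
        collect(modality, target)
    return run_dir


if __name__ == "__main__":
    # Stub callables that only record the order in which they run.
    events = []
    with tempfile.TemporaryDirectory() as tmp:
        run_case(
            AnomalyCase(name="demo_case", category="performance"),
            tmp,
            reset=lambda: events.append("reset"),
            inject=lambda c: events.append("inject " + c.name),
            run_workload=lambda: events.append("workload"),
            collect=lambda m, d: events.append("collect " + m),
        )
    print(events)
```

Passing the four steps in as callables keeps the ordering logic separate from any particular tooling, so the same loop could wrap whatever reset, injection and collection commands a given deployment uses.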

The dataset supports cross-modal anomaly detection, fusion/ablation studies and service/code-level RCA research. For detailed directory structure and instructions to access the data and scripts, see the README files in the GitHub repository.

This dataset accompanies the paper ‘AnoMod: A Dataset for Anomaly Detection and Root Cause Analysis in Microservice Systems’ published at MSR’26 (Rio de Janeiro, April 13–14, 2026).

Files

AnoMod.zip (201.9 MB)
md5:30c5835190c3550c5efc34d6be4f9238
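
After downloading, the archive can be checked against the MD5 value listed above. The following is a minimal sketch using Python's standard `hashlib`; the local path `AnoMod.zip` and the helper names `md5_of_file` and `verify_archive` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# MD5 digest as listed for AnoMod.zip on this record.
EXPECTED_MD5 = "30c5835190c3550c5efc34d6be4f9238"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    archive = Path("AnoMod.zip")  # assumed local download location
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.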

Additional details

Related works

Is described by
Conference paper: 10.1145/3793302.3793324 (DOI)