Dataset Artifact for paper "Root Cause Analysis for Microservice System based on Causal Inference: How Far Are We?"

Pham, Luan; Ha, Huong; Zhang, Hongyu

doi:10.5281/zenodo.13305663

Published August 12, 2024 | Version 0.0.1

Dataset Open

1. RMIT University
2. Chongqing University

This repository holds the artifacts for the paper "Root Cause Analysis for Microservice System based on Causal Inference: How Far Are We?"

This artifact repository contains 9 compressed folders, as follows:

ID	File Name	Description
1	syn_circa.zip	CIRCA10 and CIRCA50 datasets for Causal Discovery
2	syn_rcd.zip	RCD10 and RCD50 datasets for Causal Discovery
3	syn_causil.zip	CausIL10 and CausIL50 datasets for Causal Discovery
4	rca_circa.zip	CIRCA10 and CIRCA50 datasets for RCA
5	rca_rcd.zip	RCD10 and RCD50 datasets for RCA
6	online-boutique.zip	Online Boutique dataset for RCA
7	sock-shop-1.zip	Sock Shop 1 dataset for RCA
8	sock-shop-2.zip	Sock Shop 2 dataset for RCA
9	train-ticket.zip	Train Ticket dataset for RCA

Each zip file contains the data generated by the corresponding data generator or collected from the corresponding microservice benchmark system (e.g., online-boutique.zip contains metrics data collected from the Online Boutique system).

Details about the generation of our datasets

1. Synthetic datasets

We use three different synthetic data generators from three previous RCA studies [15, 25, 28] to create the synthetic datasets: CIRCA, RCD, and CausIL data generators. Their mechanisms are as follows:

1. CIRCA data generator [28] generates a random causal directed acyclic graph (DAG) based on a given number of nodes and edges. From this DAG, time series data for each node are generated using a vector auto-regression (VAR) model. A fault is injected into a node by altering the noise term in the VAR model for two timestamps.

2. RCD data generator [25] uses the pyAgrum package [3] to generate a random DAG based on a given number of nodes, subsequently generating discrete time series data for each node, with values ranging from 0 to 5. A fault is introduced into a node by changing its conditional probability distribution.

3. CausIL data generator [15] generates causal graphs and time series data that simulate the behavior of microservice systems. It first constructs a DAG of services and metrics based on domain knowledge, then generates metric data for each node of the DAG using regressors trained on real metrics data. Unlike the CIRCA and RCD data generators, the CausIL data generator does not have the capability to inject faults.
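The first mechanism above (random DAG, VAR-based time series, fault injected through the noise term for two timestamps) can be illustrated with a minimal sketch. This is not the CIRCA generator's actual code. It assumes a lag-1 VAR with Gaussian noise, and the names `random_dag`, `simulate_var`, and `fault_scale`, as well as the coefficient ranges, are hypothetical choices made for illustration only.

```python
import random


def random_dag(num_nodes, num_edges, rng):
    """Return a random DAG as a set of (parent, child) edges.

    Edges only go from a lower index to a higher index, so no cycle can form.
    """
    candidates = [(i, j) for i in range(num_nodes) for j in range(i + 1, num_nodes)]
    return set(rng.sample(candidates, num_edges))


def simulate_var(num_nodes, edges, length, rng,
                 fault_node=None, fault_start=None, fault_scale=10.0):
    """Simulate a lag-1 VAR over the DAG (illustrative assumption).

    Each node depends on its own previous value and on the previous values
    of its parents, plus Gaussian noise. If a fault is requested, the noise
    term of `fault_node` is shifted by `fault_scale` for two timestamps.
    """
    parents = {j: [i for (i, k) in edges if k == j] for j in range(num_nodes)}
    # Hypothetical edge weights; sorted() keeps the draw order deterministic.
    weights = {edge: rng.uniform(0.1, 0.5) for edge in sorted(edges)}
    prev = [0.0] * num_nodes
    series = []
    for t in range(length):
        cur = []
        for j in range(num_nodes):
            noise = rng.gauss(0.0, 1.0)
            in_fault_window = (
                fault_node == j
                and fault_start is not None
                and fault_start <= t < fault_start + 2
            )
            if in_fault_window:
                noise += fault_scale  # alter the noise term for two timestamps
            value = 0.3 * prev[j] + sum(weights[(i, j)] * prev[i] for i in parents[j]) + noise
            cur.append(value)
        series.append(cur)
        prev = cur
    return series
```

Because the sketch draws the same number of random values with and without a fault, running it twice with the same seed gives a fault-free and a faulty series that are identical up to the injection time.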

To create our synthetic datasets, we first generate 10 DAGs for each synthetic data generator, with the number of nodes ranging from 10 to 50. Next, we generate fault-free datasets from these DAGs using different random seeds, resulting in 100 cases each for the CIRCA and RCD generators and 10 cases for the CausIL generator. We then create faulty datasets by introducing ten faults into each DAG and generating the corresponding faulty data, yielding 100 cases each for the CIRCA and RCD data generators. The fault-free datasets (e.g., `syn_rcd`, `syn_circa`) are used to evaluate causal discovery methods, while the faulty datasets (e.g., `rca_rcd`, `rca_circa`) are used to assess RCA methods.
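The case counts above follow from simple multiplication, which the short sketch below spells out. The number of DAGs and the totals come from the text; the per-DAG seed counts are an assumption inferred from those totals, and the names `enumerate_cases`, `dag_id`, and `run_id` are hypothetical.

```python
from itertools import product

NUM_DAGS = 10  # stated in the text: 10 DAGs per generator

# Assumption: runs per DAG, inferred from the stated totals (100, 100, 10).
FAULT_FREE_SEEDS_PER_DAG = {"circa": 10, "rcd": 10, "causil": 1}
# Ten faults per DAG is stated in the text; CausIL cannot inject faults.
FAULTS_PER_DAG = {"circa": 10, "rcd": 10}


def enumerate_cases(runs_per_dag):
    """Return {generator: [(dag_id, run_id), ...]} for every DAG and run."""
    return {
        generator: list(product(range(NUM_DAGS), range(runs)))
        for generator, runs in runs_per_dag.items()
    }


fault_free_cases = enumerate_cases(FAULT_FREE_SEEDS_PER_DAG)
faulty_cases = enumerate_cases(FAULTS_PER_DAG)
```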

2. Data collected from benchmark microservice systems

We deploy three popular benchmark microservice systems, Sock Shop [6], Online Boutique [4], and Train Ticket [8], on a four-node Kubernetes cluster hosted on AWS. Next, we use the Istio service mesh [2] with Prometheus [5] and cAdvisor [1] to monitor and collect resource-level and service-level metrics of all services, as in previous works [25, 39, 59]. To generate traffic, we use the load generators provided by these systems and customise them to explore all services with 100 to 200 concurrent users. We then introduce five common faults (CPU hog, memory leak, disk IO stress, network delay, and packet loss) into five different services within each system. Finally, we collect metrics data before and after the fault injection operation. An overview of our setup is presented in the figure below.
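Since metrics are collected both before and after fault injection, a common first step when working with such data is to split each case into a normal and an abnormal period. The sketch below shows one way to do this. It is an illustration under stated assumptions, not the layout of the files in this artifact: the `time` key, the metric name `cart_cpu`, and the function `split_at_injection` are all hypothetical.

```python
def split_at_injection(rows, inject_time, time_key="time"):
    """Split metric rows into (normal, abnormal) periods.

    `rows` is a list of dicts, one per timestamp. Rows strictly before
    `inject_time` are treated as normal; the rest as abnormal.
    """
    normal = [row for row in rows if row[time_key] < inject_time]
    abnormal = [row for row in rows if row[time_key] >= inject_time]
    return normal, abnormal


# Hypothetical example data: one metric sampled once per second.
example_rows = [{"time": t, "cart_cpu": 0.2 if t < 3 else 0.9} for t in range(6)]
normal_rows, abnormal_rows = split_at_injection(example_rows, inject_time=3)
```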

Code

The code to reproduce the experimental results in the paper is available at https://github.com/phamquiluan/RCAEval.

References

As in our paper.

Files

Files (606.2 MB)

Name	MD5	Size
online-boutique.zip	ed1c9989b57365ea78b72eb93d17bff9	31.0 MB
rca_circa.zip	68054cdd94458437712d939a5edd3be5	52.7 MB
rca_rcd.zip	a7fdff392b82aa61ab1ef0bedbd360a6	22.7 MB
sock-shop-1.zip	697e5ad6109a4c63d10d7b6d16617fdd	3.5 MB
sock-shop-2.zip	75c8c577be9671692330eb9dc220a1b3	79.1 MB
syn_causil.zip	c094efb7bbb51fe069efcf7764fb1383	73.8 MB
syn_circa.zip	dc6519ca9a774faf8cd9d491a874e079	52.5 MB
syn_rcd.zip	c54c6a5f147ff03e99a7bca4a5d1cc97	11.2 MB
train-ticket.zip	d585407d184e8da50bf9d01cd0fc928b	279.7 MB
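
Downloaded archives can be checked against the MD5 checksums listed above. The following is a minimal sketch using only the Python standard library. The checksums are copied from the file list, while the helper names `md5_of_file` and `verify_download` are hypothetical, and the sketch assumes each file keeps its original name after download.

```python
import hashlib
import os

# MD5 checksums as listed in the file table of this record.
EXPECTED_MD5 = {
    "online-boutique.zip": "ed1c9989b57365ea78b72eb93d17bff9",
    "rca_circa.zip": "68054cdd94458437712d939a5edd3be5",
    "rca_rcd.zip": "a7fdff392b82aa61ab1ef0bedbd360a6",
    "sock-shop-1.zip": "697e5ad6109a4c63d10d7b6d16617fdd",
    "sock-shop-2.zip": "75c8c577be9671692330eb9dc220a1b3",
    "syn_causil.zip": "c094efb7bbb51fe069efcf7764fb1383",
    "syn_circa.zip": "dc6519ca9a774faf8cd9d491a874e079",
    "syn_rcd.zip": "c54c6a5f147ff03e99a7bca4a5d1cc97",
    "train-ticket.zip": "d585407d184e8da50bf9d01cd0fc928b",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file at `path` matches its listed checksum.

    Raises KeyError if the file name is not one of the nine archives.
    """
    name = os.path.basename(path)
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For example, `verify_download("train-ticket.zip")` returns `True` only if the local copy is byte-identical to the published archive.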

Additional details

Repository URL: https://github.com/phamquiluan/RCAEval
Programming language: Python
Development Status: Active

