Published September 16, 2016 | Version v1
Dataset Open

A Queueing Network Model for Performance Prediction of Apache Cassandra

  • 1. Imperial College London
  • 2. Politecnico di Milano

Description

The dataset consists of several CSV files containing Cassandra and ScyllaDB performance measurements.

The experiments are organized in folders. There are three main folders containing:
 - Cassandra 4 nodes: the files related to the Cassandra experiments conducted on a cluster composed of four nodes.
 - ScyllaDB 4 nodes: the files related to the ScyllaDB experiments conducted on a cluster composed of four nodes.
 - Cassandra QUORUM variant: the simulation data for a variant of Cassandra that implements a different kind of QUORUM.

"Cassandra 4 nodes" and "ScyllaDB 4 nodes" each include subfolders, one per Consistency Level applied in the experiments. Each experiment consists of three files (data*.csv) with the data reported by the Yahoo! Cloud Serving Benchmark (YCSB) at the end of the experiment execution. Each folder also contains a sim.csv file with the data gathered from simulating the model in Java Modelling Tools (JMT).

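As a rough illustration of the folder layout described above, the following Python sketch pairs each sim.csv with the data*.csv files in the same folder. It assumes dataset.zip has already been extracted to a local directory. The function name `find_experiments` and the dictionary keys are hypothetical and are not part of the dataset.

```python
from pathlib import Path


def find_experiments(root):
    """Map every folder that holds a sim.csv to its measurement files.

    Assumption: the YCSB measurement files sit next to sim.csv and
    match the pattern data*.csv, as the description suggests.
    """
    experiments = {}
    for sim_path in sorted(Path(root).rglob("sim.csv")):
        folder = sim_path.parent
        experiments[folder] = {
            "measured": sorted(folder.glob("data*.csv")),  # YCSB output files
            "simulated": sim_path,                          # JMT simulation output
        }
    return experiments
```

Calling `find_experiments` on the extracted directory returns one entry per folder, which can then be iterated to load the measured and simulated data side by side.
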
The data*.csv files contain:
 - Number of threads or clients
 - Overall Throughput
 - Number of Read requests
 - Overall Read Response Time
 - 95th percentile Read Response Time
 - 99th percentile Read Response Time
 - 99.9th percentile Read Response Time
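
The fields listed above could be read into typed records as in the Python sketch below. It rests on several assumptions that the description does not confirm: the files are comma-separated, the seven values appear in the order listed, and any header line is non-numeric. The names `Measurement` and `parse_measurements` are hypothetical. Units are not stated in the description, so the sketch leaves the values unconverted.

```python
import csv
import io
from typing import NamedTuple


class Measurement(NamedTuple):
    clients: int            # number of threads or clients
    throughput: float       # overall throughput
    read_requests: int      # number of read requests
    read_rt_overall: float  # overall read response time
    read_rt_p95: float      # 95th percentile read response time
    read_rt_p99: float      # 99th percentile read response time
    read_rt_p999: float     # 99.9th percentile read response time


def parse_measurements(text):
    """Parse the text of one data*.csv file into Measurement records.

    Rows that are not exactly seven numeric values (for example a
    header line or a blank line) are skipped.
    """
    rows = []
    for record in csv.reader(io.StringIO(text)):
        try:
            values = [float(cell) for cell in record]
        except ValueError:
            continue  # non-numeric row, assumed to be a header
        if len(values) != 7:
            continue
        rows.append(
            Measurement(int(values[0]), values[1], int(values[2]), *values[3:])
        )
    return rows
```

Passing the contents of a file, for example `parse_measurements(path.read_text())`, yields one record per row of measurements.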
 
By contrast, the sim.csv files contain:
 - Number of threads or clients
 - Overall Throughput
 - Overall Read Response Time
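
Because sim.csv shares the client count, throughput and read response time fields with the data*.csv files, the simulated values can be compared against the measured ones. The Python sketch below computes a relative error per client count. It assumes that the three data*.csv files of an experiment can be averaged, which the description does not state. The function name `relative_errors` and the input shapes are hypothetical.

```python
from statistics import mean


def relative_errors(measured_runs, simulated):
    """Relative error of simulated values against averaged measurements.

    measured_runs: list of dicts {clients: value}, one dict per data*.csv
                   file (assumed here to be comparable runs).
    simulated:     dict {clients: value} taken from sim.csv.
    Client counts without any measurement are left out of the result.
    """
    errors = {}
    for clients, sim_value in simulated.items():
        observed = [run[clients] for run in measured_runs if clients in run]
        if not observed:
            continue
        average = mean(observed)
        errors[clients] = abs(sim_value - average) / average
    return errors
```

The same function applies to either metric: pass throughput values to compare throughput, or read response times to compare response times, as long as both inputs use the same unit.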

Notes

This dataset is part of the DICE (H2020-644869) project. The project is supported by the EPSRC Centre for Doctoral Training in High Performance Embedded and Distributed Systems (EP/L016796/1).

Files

dataset.zip

Files (10.2 kB)

md5:0c1e9a3815acf3d90cd9baa6bc11b042 (8.9 kB)
md5:b91cb4ac69122a9d5b6dd2cb69cd9311 (1.4 kB)

Additional details

Funding

DICE – Developing Data-Intensive Cloud Applications with Iterative Quality Enhancements 644869
European Commission