Published August 17, 2018 | Version 2.0.0
Dataset Open

GHTraffic: A Dataset for Reproducible Research in Service-Oriented Computing

  • 1. Massey University, New Zealand

Description

This is the latest version of the GHTraffic project. The main aim is to model a variety of transaction sequences to reflect more complex service behaviour.

It has two editions, Small (S) and Large (L), whose records were created by selecting the same repositories as the original Small and Large datasets. The newest S dataset contains records from the google/guava repository. The L dataset contains records from eight repositories (i.e., twbs/bootstrap, symfony/symfony, docker/docker, Homebrew/homebrew, rust-lang/rust, kubernetes/kubernetes, rails/rails, and angular/angular.js).

The data generation process closely follows the original GHTraffic design, with minor changes to synthetic data generation: the request and response for every HTTP method are now built using a random date after the resource was successfully posted. It also adds a further subset of unsuccessful transactions by issuing requests before the resource has been successfully created.

This results in a far more dynamic series of transactions to named resources.
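
The two changes described above can be illustrated with a minimal Python sketch. It is not taken from the GHTraffic scripts: the function names, the record fields, and the one-year window for random dates are all assumptions made for illustration.

```python
import random
from datetime import datetime, timedelta

# Hypothetical window for random dates; the real scripts may use another range.
MAX_OFFSET_SECONDS = 365 * 24 * 3600


def random_offset(rng):
    """Return a random, strictly positive time offset."""
    return timedelta(seconds=rng.randint(1, MAX_OFFSET_SECONDS))


def synthesize(method, created_at, rng, before_creation=False):
    """Build a simplified (hypothetical) transaction record.

    Requests dated after the resource was posted are marked successful;
    requests dated before its creation are marked unsuccessful.
    """
    if before_creation:
        timestamp = created_at - random_offset(rng)
    else:
        timestamp = created_at + random_offset(rng)
    return {
        "method": method,
        "timestamp": timestamp,
        "successful": not before_creation,
    }


rng = random.Random()
created = datetime(2018, 1, 1)
for method in ("GET", "PATCH", "DELETE"):
    print(synthesize(method, created, rng))
    print(synthesize(method, created, rng, before_creation=True))
```

Mixing both kinds of record for the same resource gives a sequence in which the outcome of a request depends on when it occurs relative to resource creation.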

The scripts used to construct the datasets are available from the repository.

Notes

Due to the use of random data generation, the GHTraffic scripts will produce slightly different datasets at each execution.

Files

ghtraffic-L-2.0.0.zip

Files (447.3 MB)

Size      Checksum
441.3 MB  md5:2393164286467c2243d43c8056fec921
6.1 MB    md5:cf00cf3774dc280939b992a650f4870f
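
A downloaded archive can be checked against its listed MD5 checksum. The following Python sketch uses only the standard library; the helper names are hypothetical and not part of GHTraffic.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with a value such as 'md5:<hex digest>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

Call `matches_checksum` with the path of the downloaded archive and the checksum listed for that file; a result of `False` suggests an incomplete or corrupted download.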