Published May 26, 2021 | Version 0.0.3
Dataset Open

Characterizing Distributed Machine Learning Workloads on Apache Spark

Contributors

Data manager:

  • 1. University of Neuchâtel

Description

This dataset was used for our submission to Middleware'22, titled "Characterizing Distributed ML Workloads".

It contains the raw data, a description of its format, and a detailed description of the cluster deployments used in these experiments.

The full paper is available here:

https://dl.acm.org/doi/10.1145/3590140.3629112

Files

MLlib.zip

Files (1.2 GB)

Checksum Size
md5:6c9119f8c73fcd7f6754c319426a5f13 1.2 GB
md5:23d19e1e7cdddcf9681bffcc43c23812 43.0 kB
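
The MD5 checksums listed above can be used to confirm that a download completed intact. The following is a minimal sketch, not part of the dataset itself. The helper names md5_of_file and matches_listed_md5 are hypothetical, and it assumes that the 1.2 GB entry corresponds to MLlib.zip saved in the current directory, which the listing does not state explicitly.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or as plain hex."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Assumption: the 1.2 GB entry is MLlib.zip, downloaded to the current directory.
archive = Path("MLlib.zip")
if archive.exists():
    ok = matches_listed_md5(archive, "md5:6c9119f8c73fcd7f6754c319426a5f13")
    print("checksum matches" if ok else "checksum mismatch, consider re-downloading")
```

Reading in fixed-size chunks keeps memory use flat for an archive of this size. MD5 is adequate here for detecting transfer corruption, though it should not be relied on as a security guarantee.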

Additional details

Related works

Is described by
Publication: 10.1145/3590140.3629112 (DOI)

Software

Development Status
Inactive