Published March 4, 2020 | Version v1
Other | Open Access

Towards Distribution Transparency for Supervised ML With Oblivious Training Functions

  • 1. Logical Clocks AB
  • 2. KTH Royal Institute of Technology
  • 3. KTH Royal Institute of Technology, Logical Clocks AB

Description

Building and productionizing Machine Learning (ML) models is a process of interdependent steps of iterative code updates, including exploratory model design, hyperparameter tuning, ablation experiments, and model training. Industrial-strength ML involves doing this at scale, using many compute resources, which requires rewriting the training code to account for distribution. As a result, moving from a single-host program to a cluster hinders iterative development, since multiple versions of the software must be maintained and kept consistent. In this paper, we introduce the distribution-oblivious training function as an abstraction for ML development in Python, whereby developers can reuse the same training function when running a notebook on a laptop, performing scale-out hyperparameter search, or running distributed training on clusters. Programs written in our framework look like industry-standard ML programs, as we factor out dependencies using best-practice programming idioms (such as functions to generate models and data batches). We believe that our approach takes a step towards unifying single-host and distributed ML development.
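The following is a minimal sketch of the idea described above, not the paper's actual API: the model and the data batches are produced by factored-out functions, and the training function itself contains no distribution logic, so the same function can be run directly in a notebook or handed to an experiment driver for scale-out execution. The driver call `run_distributed` at the end is a hypothetical name used only for illustration.

```python
import tensorflow as tf

def model_fn():
    # Best-practice idiom: the model is produced by a function, so the
    # framework (not the user's training code) decides where to instantiate it.
    return tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(784,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

def dataset_fn(batch_size):
    # Data batches are likewise generated by a function that the framework
    # can call per worker, e.g. to shard the data in a distributed run.
    (x, y), _ = tf.keras.datasets.mnist.load_data()
    x = x.reshape(-1, 784).astype("float32") / 255.0
    return tf.data.Dataset.from_tensor_slices((x, y)).batch(batch_size)

def train_fn(model_fn, dataset_fn, lr=0.001, batch_size=64):
    # The training function is oblivious to distribution: it can be executed
    # as-is on a laptop, or wrapped by a driver for hyperparameter search or
    # distributed training without code changes.
    model = model_fn()
    model.compile(optimizer=tf.keras.optimizers.Adam(lr),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(dataset_fn(batch_size), epochs=1)
    return history.history["accuracy"][-1]

# Single-host, iterative development:
#     train_fn(model_fn, dataset_fn, lr=0.01)
# Scale-out, reusing the same unmodified function (hypothetical driver call):
#     run_distributed(train_fn, model_fn, dataset_fn, num_workers=8)
```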

Notes

Workshop

Files

oblivious-training_mlsys20.pdf

691.4 kB | md5:3b3d4dc33e8f0f295c5c118a5009d34a

Additional details

Funding

ExtremeEarth – From Copernicus Big Data to Extreme Earth Analytics (Grant 825258), European Commission