The Best of Many Worlds: Scheduling Machine Learning Inference on CPU-GPU Integrated Architectures

Rafail Tsirbas; Giorgos Vasiliadis; Sotiris Ioannidis

doi:10.5281/zenodo.6410912

Published April 4, 2022 | Version v1

Conference paper Open

1. Foundation for Research and Technology - Hellas
2. Foundation for Research and Technology - Hellas, Hellenic Mediterranean University
3. Foundation for Research and Technology - Hellas, Technical University of Crete

A plethora of applications now use machine learning, whose operations are becoming more complex and require additional computing power. At the same time, typical commodity systems (including desktops, servers, and embedded devices) now offer several processing devices, most commonly multi-core CPUs, integrated GPUs, and discrete GPUs. In this paper, we follow a data-driven approach: we first show the performance of different processing devices when executing a diverse set of inference engines. Different processing devices perform best for different performance metrics (e.g., throughput, latency, and power consumption), and these metrics may also vary significantly among applications. Based on these findings, we propose an adaptive scheduling approach, tailored for machine learning inference operations, that enables the use of the most efficient processing device available. Our scheduler is device-agnostic and can respond quickly to dynamic fluctuations that occur in real time, such as data bursts, application overloads, and system changes. The experimental results show that it is able to match the peak throughput by correctly predicting the optimal processing device with an accuracy of 92.5%, with energy savings of up to 10%.
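The idea of choosing a device per metric can be illustrated with a minimal sketch. It is not the scheduler described in the paper: the `Profile` fields, the `pick_device` helper, the device names, and all numbers are hypothetical and chosen only for illustration. The sketch assumes that per-device measurements are already available and simply selects the device that is best for one requested metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Measured behaviour of one device for one model (hypothetical fields)."""
    throughput: float   # inferences per second, higher is better
    latency_ms: float   # per-request latency, lower is better
    power_w: float      # average power draw, lower is better


# Metrics where a larger value is preferable; all others are minimised.
HIGHER_IS_BETTER = {"throughput"}


def pick_device(profiles: dict[str, Profile], metric: str) -> str:
    """Return the name of the device whose profile is best for `metric`."""
    if not profiles:
        raise ValueError("no devices profiled")
    # Flip the sign for metrics that should be minimised so max() works for both.
    sign = 1.0 if metric in HIGHER_IS_BETTER else -1.0
    return max(profiles, key=lambda name: sign * getattr(profiles[name], metric))


# Illustrative numbers only; they are not measurements from the paper.
example = {
    "cpu": Profile(throughput=120.0, latency_ms=4.0, power_w=35.0),
    "integrated_gpu": Profile(throughput=200.0, latency_ms=6.5, power_w=18.0),
    "discrete_gpu": Profile(throughput=900.0, latency_ms=9.0, power_w=140.0),
}
```

With these made-up profiles, each metric favours a different device, which mirrors the observation in the abstract that no single device wins on every metric. An adaptive scheduler would additionally have to refresh such measurements at run time, which this sketch does not attempt.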

Files

Name	Size
IPDPSW2022_Tsirmpas_et_al_preprint.pdf md5:7d518d201181000d874a8b48fcb3872f	828.3 kB

Additional details

Is published in: Conference paper: 10.1109/IPDPSW55747.2022.00017 (DOI)

European Commission
CONCORDIA - Cyber security cOmpeteNCe fOr Research anD InnovAtion 830927
European Commission
C4IIoT - Cyber security 4.0: protecting the Industrial Internet Of Things 833828
European Commission
COLLABS - A COmprehensive cyber-intelligence framework for resilient coLLABorative manufacturing Systems 871518
European Commission
MARVEL - Multimodal Extreme Scale Data Analytics for Smart Cities Environments 957337
