Learning Workflow Scheduling on Multi-Resource Clusters

Published August 16, 2019 | Version Camera ready

Conference paper Open

Workflow scheduling is one of the key issues in the management of workflow execution. Typically, a workflow application can be modeled as a Directed Acyclic Graph (DAG). In this paper, we present GoDAG, an approach that learns to schedule workflows well on multi-resource clusters. GoDAG learns the scheduling policy directly from experience through deep reinforcement learning. To adapt deep reinforcement learning methods, we propose a novel state representation, a practical action space, and a corresponding reward definition for the workflow scheduling problem. We implement a GoDAG prototype and a simulator that simulates task execution on multi-resource clusters. In the evaluation, we compare GoDAG with three state-of-the-art heuristics. The results show that GoDAG outperforms the baseline heuristics, achieving a lower average makespan across different workflow structures.
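To make the terms in the abstract concrete, the following is a minimal sketch of a workflow modeled as a DAG and of how makespan can be measured on a multi-resource cluster. It is not GoDAG and not the paper's simulator: the task table `WORKFLOW`, the two resource types (CPU and memory), the function `simulate`, and the greedy priority rule are all assumptions made for illustration. A learned policy would take the place of the `priority` function.

```python
import heapq

# Hypothetical workflow: task -> (duration, cpu demand, memory demand, parents)
WORKFLOW = {
    "a": (2, 1, 1, []),
    "b": (3, 2, 1, ["a"]),
    "c": (1, 1, 2, ["a"]),
    "d": (2, 1, 1, ["b", "c"]),
}


def simulate(workflow, capacity, priority):
    """Greedy list scheduling on a (cpu, memory) cluster; returns the makespan.

    priority maps a task name to a sort key; ready tasks with lower keys
    are considered first. This is a stand-in for a scheduling policy.
    """
    free = list(capacity)   # currently free [cpu, memory]
    started, done = set(), set()
    running = []            # min-heap of (finish_time, task)
    now = 0
    while len(done) < len(workflow):
        # A task is ready once all of its parents have finished.
        ready = sorted(
            (t for t in workflow
             if t not in started and all(p in done for p in workflow[t][3])),
            key=priority,
        )
        for t in ready:
            duration, cpu, mem, _ = workflow[t]
            if cpu <= free[0] and mem <= free[1]:
                free[0] -= cpu
                free[1] -= mem
                started.add(t)
                heapq.heappush(running, (now + duration, t))
        if not running:
            # Nothing runs and nothing can start: a cycle or an oversized task.
            raise ValueError("workflow cannot make progress")
        # Advance time to the next completion and release its resources,
        # along with any other task finishing at the same instant.
        now = running[0][0]
        while running and running[0][0] == now:
            _, t = heapq.heappop(running)
            _, cpu, mem, _ = workflow[t]
            free[0] += cpu
            free[1] += mem
            done.add(t)
    return now


def shortest_first(task):
    # Example heuristic: prefer the ready task with the shortest duration.
    return WORKFLOW[task][0]


if __name__ == "__main__":
    print(simulate(WORKFLOW, (2, 2), shortest_first))
    print(simulate(WORKFLOW, (3, 3), shortest_first))
```

In this toy setting, tasks `b` and `c` cannot run side by side on a cluster with 2 CPUs and 2 memory units, so the makespan is longer than on a cluster with 3 of each. This shows why both the DAG structure and the multi-resource demands matter to the schedule.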

Files

Name	Size
2019.8.conference.nas.camera.pdf (md5:b9cc45d788adf288491267ffaf6bded9)	390.9 kB

VRE4EIC – A Europe-wide Interoperable Virtual Research Environment to Empower Multidisciplinary Research Communities and Accelerate Innovation and Collaboration 676247: European Commission
ENVRI PLUS – Environmental Research Infrastructures Providing Shared Solutions for Science and Society 654182: European Commission
SWITCH – Software Workbench for Interactive, Time Critical and Highly self-adaptive cloud applications 643963: European Commission
ARTICONF – smART socIal media eCOsytstem in a blockchaiN Federated environment 825134: European Commission