Published March 2, 2021 | Version v1
Conference paper Open

HPC Application Cloudification: The StreamFlow Toolkit

Description

Finding an effective way to improve accessibility to High-Performance Computing facilities, still anchored to SSH-based remote shells and queue-based job submission mechanisms, is an open problem in computer science. This work advocates a cloudification of HPC applications through a cluster-as-accelerator pattern, where computationally demanding portions of the main execution flow hosted on a Cloud Finding an effective way to improve accessibility to High-Performance Computing facilities, still anchored to SSH-based remote shells and queue-based job submission mechanisms, is an open problem in computer science. This work advocates a cloudification of HPC applications through a cluster-as-accelerator pattern, where computationally demanding portions of the main execution flow hosted on a Cloud infrastructure can be offloaded to HPC environments to speed them up. We introduce StreamFlow, a novel Workflow Management System that supports such a design pattern and makes it possible to run the steps of a standard workflow model on independent processing elements with no shared storage. We validated the proposed approach’s effectiveness on the CLAIRE COVID-19 universal pipeline, i.e. a reproducible workflow capable of automating the comparison of (possibly all) state-of-the-art pipelines for the diagnosis of COVID-19 interstitial pneumonia from CT scans images based on Deep Neural Networks (DNNs).

Files

OASIcs-PARMA-DITAM-2021-5.pdf

Files (3.9 MB)

Name Size Download all
md5:467b0ee2387df882fc2cef0483eb0b42
3.9 MB Preview Download

Additional details

Funding

DeepHealth – Deep-Learning and HPC to Boost Biomedical Applications for Health 825111
European Commission