Published November 10, 2023 | Version v2
Video/Audio · Open Access

Maximizing Data Utility for HPC Python Workflow Execution

  • 1. University of Notre Dame
  • 2. University of Chicago

Description

Large-scale HPC workflows are increasingly implemented in dynamic languages such as Python, which allow for more rapid development than traditional techniques. However, the cost of executing Python applications at scale is often dominated by the distribution of common datasets and complex software dependencies. As the application scales up, data distribution becomes a limiting factor that prevents scaling beyond a few hundred nodes. To address this problem, we present the integration of Parsl (a Python-native parallel programming library) with TaskVine (a data-intensive workflow execution engine). Instead of relying on a shared filesystem to provide data to tasks on demand, Parsl is able to express advance data needs to TaskVine, which then performs efficient data distribution at runtime. This combination provides a performance speedup of 1.48x over the typical method of on-demand paging from the shared filesystem, while also providing an average task speedup of 1.79x with 2048 tasks and 256 nodes.
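The integration described above can be sketched as a Parsl configuration that routes apps to the TaskVine executor, letting TaskVine stage declared data to workers once and reuse it across tasks rather than paging from a shared filesystem. This is a minimal sketch, not the configuration used in the demo: the manager port and label shown here are illustrative assumptions, so consult the Parsl documentation for the exact `TaskVineExecutor` options.

```python
# Sketch: running Parsl apps through the TaskVine executor.
# Assumption: port 9123 and label "taskvine" are placeholders.
import parsl
from parsl.config import Config
from parsl.executors.taskvine import TaskVineExecutor, TaskVineManagerConfig

config = Config(
    executors=[
        TaskVineExecutor(
            label="taskvine",
            # TaskVine workers connect back to the manager on this port;
            # the engine distributes common datasets and software
            # dependencies to workers ahead of task execution.
            manager_config=TaskVineManagerConfig(port=9123),
        )
    ]
)
parsl.load(config)

@parsl.python_app
def analyze(dataset_path):
    # Each task sees its declared inputs locally on the worker,
    # instead of demand-paging them from a shared filesystem.
    with open(dataset_path) as f:
        return sum(1 for _ in f)

# future = analyze("/path/to/shared/dataset")  # executed on a TaskVine worker
```

A TaskVine worker process must be started separately and pointed at the manager's port before any apps will run.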

Files (12.8 MB)

hppss_sc2023_demo_v2.mp4 (12.8 MB)
md5:b6fc084083c5a54b3d65205b7d9aad68