Published August 5, 2024 | Version v1
Presentation Open

Optimizing molecular dynamics AI model using HDF5 and DYAD

  • 1. Lawrence Livermore National Laboratory

Description

The Massively Parallel Multiscale Machine-Learned Modeling Infrastructure (MuMMI) is a framework for executing multiscale modeling simulations of large molecular systems, integrated through machine learning (ML) techniques. The deep learning (DL) training portion of the MuMMI workflow uses NPZ arrays, which lead to inefficient I/O during data loading and to suboptimal sample distribution and sample coverage. In this talk, we will discuss our initial experience integrating HDF5 and DYAD to improve the data access behavior, sample distribution, and sample coverage of the DL training. The talk will focus on three main directions: first, the challenges and experience of moving workloads to HDF5; second, features that would help adapt HDF5 for optimizing DL training; and finally, our experience integrating DYAD with HDF5 to optimize I/O for this DL workload. In conclusion, we will demonstrate that DYAD and HDF5 accelerated the MuMMI workflow on the Corona cluster at LLNL.

Files

Optimizing molecular dynamics Devarajan.pdf (2.1 MB)
md5:c7279b6e6b781576b66706a22e5201bc

Additional details

Related works

Is supplemented by
Video/Audio: https://youtu.be/t8q_jCvpP3M (URL)