Optimizing molecular dynamics AI model using HDF5 and DYAD
Description
The Massively Parallel Multiscale Machine-Learned Modeling Infrastructure (MuMMI) is a framework for executing multiscale simulations of large molecular systems that are coupled through machine learning techniques. The deep learning (DL) training portion of the MuMMI workflow reads its samples from NPZ arrays, which leads to poor I/O efficiency, uneven sample distribution, and limited sample coverage. In this talk, we will discuss our initial experience integrating HDF5 and DYAD to improve the data access behavior of training and to improve sample distribution and sample coverage for DL training. The talk will focus on three main directions: first, the challenges and experience of moving the workload to HDF5; second, HDF5 features that would help in adapting it for optimizing DL training; and finally, our experience integrating the DYAD solution with HDF5 to optimize I/O for this DL workload. We will conclude by demonstrating that we accelerated the MuMMI workflow using DYAD and HDF5 on the Corona cluster at LLNL.
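As a rough illustration of the kind of restructuring involved in moving from NPZ to HDF5 (a sketch under assumed file layout and array key names, not MuMMI's actual pipeline), the snippet below uses h5py to consolidate many small per-sample NPZ files into a single chunked HDF5 file, so a training data loader can randomly access any sample without opening thousands of files.

```python
# Minimal sketch (assumptions: a "samples/" directory of .npz files,
# each holding one sample under an assumed "features" key).
import glob
import numpy as np
import h5py

npz_paths = sorted(glob.glob("samples/*.npz"))
first = np.load(npz_paths[0])["features"]          # assumed array key

with h5py.File("samples.h5", "w") as h5:
    dset = h5.create_dataset(
        "features",
        shape=(len(npz_paths),) + first.shape,
        dtype=first.dtype,
        chunks=(1,) + first.shape,                 # one chunk per sample
    )
    for i, path in enumerate(npz_paths):
        dset[i] = np.load(path)["features"]

# A training loader can then index samples directly:
with h5py.File("samples.h5", "r") as h5:
    sample = h5["features"][0]                     # random access by index
```

Storing one chunk per sample keeps each read aligned to a single sample, which is the access pattern a shuffled DL data loader produces.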
Files
- Optimizing molecular dynamics Devarajan.pdf (2.1 MB, md5:c7279b6e6b781576b66706a22e5201bc)
Additional details
Related works
- Is supplemented by: Video/Audio, https://youtu.be/t8q_jCvpP3M
Software
- Repository URL: https://github.com/flux-framework/dyad