PanguLU: A Scalable Regular Two-Dimensional Block-Cyclic Sparse Direct Solver on Distributed Heterogeneous Systems
Description
Sparse direct solvers play a vital role in large-scale, high-performance scientific and engineering computations. Existing distributed sparse direct methods employ multifrontal or supernodal patterns to aggregate columns with nearly identical structure and to exploit dense basic linear algebra subprograms (BLAS) for computation. However, such a data layout may introduce more imbalance when the structure of the input matrix is not ideal, and using dense BLAS may waste many floating-point operations on zero fill-ins.
In this paper, we propose a new sparse direct solver called PanguLU. Unlike the multifrontal/supernodal layout, it relies on simpler regular 2D blocking and stores the blocks in sparse form to avoid any extra fill-ins. A variety of block-wise sparse BLAS methods are developed and selected according to the sparsity patterns of the blocks, for higher efficiency on local GPUs. To make PanguLU more scalable, we also adjust the mapping of blocks to processes for a more balanced overall workload, and propose a synchronisation-free communication strategy that takes the dependencies among sub-tasks into account to reduce overall latency overhead.
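To make the regular 2D blocking and block-to-process mapping concrete, here is a minimal Python sketch. It is not PanguLU's code: the function names (`partition_into_blocks`, `block_owner`, `assign_blocks`), the coordinate-list input format, and the plain textbook block-cyclic rule are all assumptions made for illustration. The adjusted, load-balanced mapping described above is not reproduced here.

```python
from collections import defaultdict


def partition_into_blocks(entries, block_size):
    """Group (row, col, value) nonzeros into regular square blocks.

    Only blocks that contain at least one nonzero are created, and each
    block keeps its entries in sparse (local_row, local_col, value) form.
    Hypothetical helper, not part of PanguLU.
    """
    blocks = defaultdict(list)
    for r, c, v in entries:
        key = (r // block_size, c // block_size)
        blocks[key].append((r % block_size, c % block_size, v))
    return dict(blocks)


def block_owner(bi, bj, grid_rows, grid_cols):
    """Rank that owns block (bi, bj) under a plain 2D block-cyclic layout."""
    return (bi % grid_rows) * grid_cols + (bj % grid_cols)


def assign_blocks(blocks, grid_rows, grid_cols):
    """Map each process rank to the sorted list of block indices it owns."""
    owners = defaultdict(list)
    for bi, bj in sorted(blocks):
        owners[block_owner(bi, bj, grid_rows, grid_cols)].append((bi, bj))
    return dict(owners)


# A tiny 6x6 sparse matrix given as (row, col, value) triples.
entries = [
    (0, 0, 4.0), (1, 1, 3.0), (2, 2, 5.0), (3, 3, 2.0),
    (4, 4, 6.0), (5, 5, 1.0), (0, 5, 1.5), (4, 1, -2.0),
]

blocks = partition_into_blocks(entries, block_size=2)
owners = assign_blocks(blocks, grid_rows=2, grid_cols=2)

for rank in sorted(owners):
    print(f"rank {rank}: blocks {owners[rank]}")
```

On this toy input, four of the five nonempty blocks land on rank 0 of the 2 x 2 process grid and ranks 1 and 2 own none, which illustrates why a purely cyclic rule can leave the workload uneven on sparse inputs.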
Files
Name | Size |
---|---|
README.md (md5:448ae78cbd4fde00b1289f9c76f1ea0a) | 3.4 kB |