I/O-Optimal Cache-Oblivious Sparse Matrix-Sparse Matrix Multiplication
Description
Data movements between different levels of the memory hierarchy (I/O-transitions, or simply I/O s) are a critical performance bottleneck in modern computing. Therefore it is a problem of high practical relevance to find algorithms that use a minimal number of I/O s. We present a cache-oblivious sparse matrix-sparse matrix multiplication algorithm that uses a worst-case number of I/O s that matches a previously established lower bound for this problem (0 (N2/B.M) read-I/Os and 0 (N2/B) write-I/Os, where N is the size of the problem instance, M is the size of the fast memory and B is the size of the cache lines). When the output does not need to be stored, also the number of write-I/Os can be reduced to 0 (N2/B.M). This improves the worst-case I/O-complexity of the previously best known algorithm for this problem (which is cache-aware) by a logarithmic multiplicative factor. Compared to other cache-oblivious algorithms our algorithm improves the worst-case number of I/Os by a multiplicative factor of Θ(M. N). We show how the algorithm can be applied to produce the first I/O-efficient solution for the sparse 2- vs 3-diameter problem on sparse directed graphs.
Files
IPDPS22sparse_matrices.pdf
Files
(704.7 kB)
Name | Size | Download all |
---|---|---|
md5:3acb4e8248439ca986813b10333a99b6
|
704.7 kB | Preview Download |