Published February 3, 2014 | Version v1
Working paper Open

Exploiting Locality in Sparse Matrix-Matrix Multiplication on the Many Integrated Core Architecture

Creators

  • 1. Bilkent University, Computer Engineering Department, 06800 Ankara, Turkey


Description

In this whitepaper, we propose outer-product-parallel and inner-product-parallel sparse matrix-matrix
multiplication (SpMM) algorithms for the Intel Xeon Phi, a Many Integrated Core (MIC) architecture, and
discuss the trade-offs between the two parallelization schemes on it. We also propose two
hypergraph-partitioning (HP) based methods for matrix partitioning and row/column reordering that improve
temporal locality in these two parallelization schemes. Both HP models aim to minimize the total number of
transfers to and from memory while balancing the computational loads of the threads. Experiments on realistic
SpMM instances show that the Intel MIC architecture has the potential to attain high performance in irregular
as well as regular applications. Attaining that performance in irregular applications, however, requires
intelligent data and computation reordering that makes better use of temporal locality.
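
To make the two formulations concrete, the following is a minimal serial sketch in Python. It is not the whitepaper's Xeon Phi implementation: the coordinate-triple input format and the function names (to_rows, to_cols, spmm_outer, spmm_inner) are assumptions chosen for illustration. It only shows which loop iterations would become independent tasks under each scheme.

```python
from collections import defaultdict


def to_rows(entries):
    """Group (i, j, value) triples by row index: {i: {j: value}}."""
    rows = defaultdict(dict)
    for i, j, v in entries:
        rows[i][j] = v
    return rows


def to_cols(entries):
    """Group (i, j, value) triples by column index: {j: {i: value}}."""
    cols = defaultdict(dict)
    for i, j, v in entries:
        cols[j][i] = v
    return cols


def spmm_outer(a_entries, b_entries):
    """Outer-product formulation: C = sum over k of A[:, k] * B[k, :].

    Each k is one candidate task. Tasks read disjoint inputs but may
    update the same C entry, so a parallel version needs a reduction
    or synchronized writes on C.
    """
    a_cols = to_cols(a_entries)
    b_rows = to_rows(b_entries)
    c = defaultdict(float)
    for k in a_cols.keys() & b_rows.keys():
        for i, a_ik in a_cols[k].items():
            for j, b_kj in b_rows[k].items():
                c[(i, j)] += a_ik * b_kj
    return dict(c)


def spmm_inner(a_entries, b_entries):
    """Inner-product formulation: C[i, j] = A[i, :] . B[:, j].

    Each (i, j) pair is one candidate task. Tasks write disjoint C
    entries, but the same rows of A and columns of B are read by many
    tasks.
    """
    a_rows = to_rows(a_entries)
    b_cols = to_cols(b_entries)
    c = {}
    for i, a_row in a_rows.items():
        for j, b_col in b_cols.items():
            shared = a_row.keys() & b_col.keys()
            if shared:
                c[(i, j)] = sum(a_row[k] * b_col[k] for k in shared)
    return c


# Small hypothetical instance: A is 2x3, B is 3x2.
A = [(0, 0, 1.0), (0, 2, 2.0), (1, 1, 3.0)]
B = [(0, 1, 4.0), (1, 0, 5.0), (2, 1, 6.0)]
C_outer = spmm_outer(A, B)
C_inner = spmm_inner(A, B)
```

In this sketch the outer-product loop touches each nonzero of A and B once but revisits C entries across different k, whereas the inner-product loop writes each C entry once but rereads inputs. Which data gets reused, and when, is the kind of temporal-locality question that partitioning and reordering are meant to address.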

Files

WP144.pdf

Size: 180.6 kB
md5:a64dcbe50dbb6f6e3634b64ac6834dee

Additional details

Funding

PRACE-1IP – PRACE - First Implementation Phase Project 261557
European Commission