Exploiting Locality in Sparse Matrix-Matrix Multiplication on the Many Integrated Core Architecture

K. Akbudak

doi:10.5281/zenodo.822711

Published February 3, 2014 | Version v1

Working paper Open

K. Akbudak¹

1. Bilkent University, Computer Engineering Department, 06800 Ankara, Turkey

Contributors

Other:

C. Aykanat¹

1. Bilkent University, Computer Engineering Department, 06800 Ankara, Turkey

In this whitepaper, we propose outer-product-parallel and inner-product-parallel sparse matrix-matrix
multiplication (SpMM) algorithms for the Xeon Phi architecture and discuss the trade-offs between the two
parallelization schemes on that architecture. We also propose two hypergraph-partitioning (HP) based methods
for matrix partitioning and row/column reordering that improve temporal locality in these two parallelization
schemes. Both HP models aim to minimize the total number of transfers to and from memory while balancing the
computational loads of the threads. Experiments on realistic SpMM instances show that the Intel MIC
architecture has the potential to attain high performance in irregular applications as well as regular ones.
Attaining that performance in irregular applications, however, requires intelligent data and computation
reordering that makes better use of temporal locality.
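
To make the two formulations named in the abstract concrete, the following is a minimal sequential sketch in Python of the textbook outer-product and inner-product forms of sparse matrix-matrix multiplication. It is not the whitepaper's implementation: the dict-of-dicts storage and the helper names `to_rows`, `transpose`, `spmm_outer`, and `spmm_inner` are assumptions chosen for illustration, and the comments only mark where independent tasks could be handed to threads.

```python
from collections import defaultdict


def to_rows(triples):
    """Build a row-major sparse matrix {i: {k: value}} from (i, k, value) triples."""
    rows = defaultdict(dict)
    for i, k, v in triples:
        rows[i][k] = v
    return dict(rows)


def transpose(rows):
    """Return the column-major view {k: {i: value}} of a row-major matrix."""
    cols = defaultdict(dict)
    for i, row in rows.items():
        for k, v in row.items():
            cols[k][i] = v
    return dict(cols)


def spmm_outer(a_rows, b_rows):
    """Outer-product form: C is the sum over k of (column k of A) x (row k of B)."""
    a_cols = transpose(a_rows)
    c = defaultdict(lambda: defaultdict(float))
    for k, a_col in a_cols.items():  # each k is one independent outer-product task
        b_row = b_rows.get(k)
        if not b_row:
            continue
        for i, a_ik in a_col.items():
            for j, b_kj in b_row.items():
                # Different k may update the same C[i][j]; concurrent tasks
                # would have to synchronize or privatize these updates.
                c[i][j] += a_ik * b_kj
    return {i: dict(row) for i, row in c.items()}


def spmm_inner(a_rows, b_rows):
    """Inner-product form: C[i][j] is the dot product of row i of A and column j of B."""
    b_cols = transpose(b_rows)
    c = {}
    for i, a_row in a_rows.items():  # each row i is one independent task
        for j, b_col in b_cols.items():
            common = a_row.keys() & b_col.keys()  # indices k nonzero in both
            if common:
                c.setdefault(i, {})[j] = sum(a_row[k] * b_col[k] for k in common)
    return c


# Small illustrative instance (hypothetical data, not from the whitepaper).
A = to_rows([(0, 0, 1.0), (0, 2, 2.0), (1, 1, 3.0)])
B = to_rows([(0, 1, 4.0), (1, 0, 5.0), (2, 1, 6.0)])
assert spmm_outer(A, B) == spmm_inner(A, B)
```

In this sketch the outer-product form reads each column of A and each row of B once but scatters its updates across C, whereas the inner-product form writes each entry of C once but revisits the columns of B for every row of A. That difference is the kind of locality trade-off a reordering method would target, though the whitepaper's own analysis may weigh it differently.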

Files

Files (180.6 kB)

Name	Size
WP144.pdf (md5:a64dcbe50dbb6f6e3634b64ac6834dee)	180.6 kB

Additional details

European Commission
PRACE-1IP – PRACE - First Implementation Phase Project 261557

	All versions	This version
Views	59	59
Downloads	41	40
Data volume	7.4 MB	7.2 MB
