Published March 30, 2014 | Version v1
Working paper Open

Analysis of SuperLU Solvers on Intel® MIC Architecture

Creators

  • 1. Istanbul Technical University, National Center for High Performance Computing of Turkey (UHeM), Istanbul 34469, Turkey; Istanbul Technical University, Department of Mathematics, Istanbul 34469, Turkey
  • 1. Istanbul Technical University, National Center for High Performance Computing of Turkey (UHeM), Istanbul 34469, Turkey; Istanbul Technical University, Informatics Institute, Istanbul 34469, Turkey

Description

Intel Xeon Phi is a coprocessor with sixty-one cores on a single chip. Its vector floating-point unit uses 512-bit SIMD
registers, so the chip benefits from algorithms that operate on large vectors. In this work, the sequential, multithreaded
and distributed versions of the SuperLU solvers are tested on the Intel Xeon Phi using the offload programming model, and
all three work well. Several offload programming alternatives exist, depending on where the pragma directives are placed.
We find that sequential SuperLU gains up to 45% in performance from offload programming, depending on the sparse matrix
type and on the size of the transferred and processed data. In contrast, the partitioning method of SuperLU_DIST and
SuperLU_MT generates very small submatrices. We therefore observe that the matrix partitioning method and several other
tradeoffs influence their performance on the Xeon Phi architecture.
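
To illustrate what placing an offload pragma around a kernel can look like, here is a minimal sketch in C. It is not taken from the paper or from SuperLU: the function csr_matvec_offload and its parameters are hypothetical, and a sparse matrix-vector product in CSR format stands in for the solver kernels. The pragma uses the Intel compiler's offload syntax and is guarded by the __INTEL_OFFLOAD macro, so with any other compiler the loop simply runs on the host.

```c
/* Hypothetical helper (not part of SuperLU): computes y = A*x for an
 * n-by-n sparse matrix A stored in CSR format with nnz nonzeros. */
void csr_matvec_offload(int n, int nnz,
                        const int *row_ptr, const int *col_idx,
                        const double *val, const double *x, double *y)
{
    /* The in/out clauses state how much data crosses the PCIe bus.
     * This transfer volume is one of the costs that decides whether
     * offloading pays off for a given matrix. */
#ifdef __INTEL_OFFLOAD
#pragma offload target(mic) in(row_ptr : length(n + 1)) \
                            in(col_idx : length(nnz))   \
                            in(val : length(nnz))       \
                            in(x : length(n))           \
                            out(y : length(n))
#endif
    {
        for (int i = 0; i < n; ++i) {
            double sum = 0.0;
            /* Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1. */
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
                sum += val[k] * x[col_idx[k]];
            y[i] = sum;
        }
    }
}
```

Moving the pragma outward, around a larger region of the solver, would transfer more data per offload but pay the transfer latency less often. This is the kind of placement alternative the description refers to.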

Files

WP135.pdf (190.9 kB)
md5:9461e3fead4e504faab6c69ca3f49a4a

Additional details

Funding

PRACE-1IP – PRACE - First Implementation Phase Project 261557
European Commission