Generating SIMD Instructions for Cerebras CS-1 using Polyhedral Compilation Techniques
Authors/Creators
- 1. Cerebras Systems
Description
The Cerebras CS-1 is a computing system built around a wafer-scale processor
with nearly 400,000 compute cores.
It is intended for training deep neural networks
and for running inference on them.
The architecture has several features specifically
designed for deep learning and related fields.
One of these is a sophisticated SIMD engine
that can mimic a rectangular loop nest of depth at most four.
In order to achieve optimal performance,
it is crucial to use SIMD instructions as much as possible.
This paper describes a high-level polyhedral compiler.
It takes a high-level algorithm description, which
can be written manually or extracted
from a TensorFlow computation graph, and
generates input to the low-level C-based compiler.
In this intermediate code, the use of SIMD instructions is made explicit.
The main focus of the paper is the generation of these
CS-1 SIMD instructions for convolution-style algorithms.
The task is complicated by the fact that the set of computation instances
that need to be performed may not, at first sight, appear to
form a rectangular loop nest.
The basis of the compilation is formed by an effective combination
of relatively well-known, but more specialized
polyhedral operations.
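
As an illustration of an instance set that does not look rectangular at first sight, the following minimal sketch (not taken from the paper) uses a hypothetical two-dimensional set: all pairs (i, j) with 0 <= i < n and i <= j < i + m. Written directly, the bounds on j depend on i, so the set is a parallelogram. Substituting k = j - i yields a nest whose bounds are constants. The function names and sizes are assumptions chosen for the example, and plain Python sets stand in for a polyhedral library.

```python
from itertools import product


def skewed_instances(n, m):
    """Enumerate (i, j) with 0 <= i < n and i <= j < i + m.

    The inner bounds depend on i, so this nest is not rectangular as written.
    """
    return {(i, j) for i in range(n) for j in range(i, i + m)}


def rectangular_instances(n, m):
    """Enumerate the same instances through a rectangular nest over (i, k).

    The substitution k = j - i gives constant bounds 0 <= i < n, 0 <= k < m.
    """
    return {(i, i + k) for i, k in product(range(n), range(m))}


if __name__ == "__main__":
    n, m = 3, 2  # hypothetical sizes, chosen only for illustration
    same = skewed_instances(n, m) == rectangular_instances(n, m)
    print("same instance set:", same)
```

Both functions visit the same instances. Only the second has the constant-bound shape that a SIMD engine mimicking a rectangular loop nest could plausibly cover, and its depth of two is within the limit of four mentioned above.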
Files
| Name | Size |
|---|---|
| dtg_simd.pdf (md5:05f2e090e33f362bb06eb7c0742c37d8) | 448.6 kB |