Published November 9, 2021 | Version v1
Conference paper · Open Access

Benchmarking Neural Networks on Heterogeneous Hardware Resources

  • 1. Aptiv Services Deutschland GmbH
  • 2. University of Hildesheim

Description

In recent years, artificial intelligence (AI) has become a key enabling technology in many domains. To
achieve their best performance, modern AI methods have high resource demands, e.g., GPU servers for
training neural networks. With the advent of further processor technologies, such as tensor processors
or re-wirable processors (FPGAs), AI methods can be executed in less time while even saving energy. In
many application domains, such as autonomous driving or unmanned aerial vehicles, real-time constraints
mandate low end-to-end latencies in AI processing.

In this paper, we present a combined micro- and macro-benchmarking approach to analyze the
performance as well as the power demands of modern processor architectures using convolutional neural
networks as workload. We discuss tradeoffs among the different processor types and indicate issues and
challenges that arise when performing such benchmarks on heterogeneous hardware resources.
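To make the micro-benchmarking idea concrete, here is a minimal sketch of a latency measurement harness. It is a hypothetical illustration, not the paper's actual tooling: the function names (`micro_benchmark`, `dummy_conv`) and parameters (warm-up count, run count) are assumptions. The pattern it shows is the standard one for such benchmarks: a few warm-up iterations to stabilize caches and clock frequencies, then repeated timed runs, reporting the median latency rather than a single measurement.

```python
# Hypothetical micro-benchmark harness sketch; not the artifact's actual code.
import statistics
import time

def micro_benchmark(fn, warmup=5, runs=30):
    """Time fn() after warm-up; return (median, stdev) of per-run latency in seconds."""
    for _ in range(warmup):
        fn()  # warm-up runs are not timed
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - start)
    return statistics.median(latencies), statistics.stdev(latencies)

# Stand-in for a single CNN inference step (e.g., one convolution kernel);
# a real benchmark would invoke the framework-specific inference call here.
def dummy_conv():
    acc = 0.0
    for i in range(10_000):
        acc += i * 0.5
    return acc

median_s, stdev_s = micro_benchmark(dummy_conv)
print(f"median latency: {median_s * 1e6:.1f} us (stdev {stdev_s * 1e6:.1f} us)")
```

In a macro-benchmark, the timed function would instead cover an end-to-end inference pipeline, and power draw would be sampled concurrently via platform-specific counters.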


We show that FPGAs allow for a 7x to 45x performance increase over high-end GPUs while
using only 10% of the power. In the consumer space, novel architectures such as the Apple M1
offer 3-5x better performance at 10-20% of the power draw of current x86 CPU or GPU hardware.

This artifact contains the replication package for the respective paper (paper and slides included), presented at the Symposium on Software Performance 2021.

Files (741.5 MB)

  • cnn-bench.zip, 740.6 MB, md5:13ec1cbabbd98a27ffdd1b90c7b53cb4
  • (unnamed), 658.4 kB, md5:44ca448b07c2a24e0c540cbaba08924f
  • (unnamed), 232.9 kB, md5:f94f94a8bb60e81cc390eb3ee02fa3ce