Published May 7, 2020 | Version final
Poster | Open Access

Deep Learning Inference on Commodity Network Interface Cards

  • 1. NEC
  • 2. Politecnico di Milano
  • 3. University of Cambridge

Description

Fully-connected layers in artificial neural networks require memory-bound operations on modern processors, which are therefore forced to stall their pipelines while waiting for memory loads. Batching computations mitigates the problem, but batching is largely inapplicable to time-sensitive serving workloads, which lowers the overall efficiency of the computing infrastructure. In this paper, we explore the opportunity to offload fully-connected layer processing to commodity Network Interface Cards (NICs). Our results show that current network cards can already process the fully-connected layers of binary neural networks, thereby increasing a machine's throughput and efficiency. Further preliminary tests show that, with a relatively small hardware design modification, a new generation of network cards could increase their fully-connected layer processing throughput by a factor of 10.
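Why binary neural networks are a good fit for such offload: in a binarized fully-connected layer, activations and weights are constrained to {-1, +1}, so a dot product reduces to bitwise XNOR followed by a population count — operations that simple fixed-function or NIC-class hardware can perform at line rate. The sketch below (a generic illustration of the standard XNOR-popcount trick, not the poster's actual NIC implementation; the helper names are ours) shows the arithmetic identity involved, namely a·b = n − 2·popcount(a XOR b) for sign vectors packed with bit 1 encoding +1:

```python
import numpy as np

def pack_signs(v):
    """Pack a {-1, +1} vector into bytes: bit = 1 encodes +1."""
    return np.packbits(np.asarray(v) > 0)

def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed sign vectors of length n via
    XNOR-popcount: a.b = n - 2 * popcount(a XOR b).
    (Zero padding bits cancel in the XOR, so the full popcount is safe.)"""
    diff = np.bitwise_xor(a_bits, b_bits)
    return n - 2 * int(np.unpackbits(diff).sum())

# Example: two length-5 sign vectors
a = np.array([1, -1, 1, 1, -1])
b = np.array([1, 1, -1, 1, -1])
print(binary_dot(pack_signs(a), pack_signs(b), 5))  # same as int(a @ b)
```

A full binarized fully-connected layer is then one such dot product per output neuron, which is why its memory traffic is a small fraction of the float32 equivalent.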

Files

2018sysml-nips-poster.pdf (862.6 kB)
md5:95c5a4a26b315b2098b827887c1655e2

Additional details

Funding

5GCITY (761508)
European Commission