Poster Open Access

Deep Learning Inference on Commodity Network Interface Cards

Giuseppe Siracusano; Davide Sanvito; Salvator Galea; Roberto Bifulco

MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="">
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <controlfield tag="005">20200511202032.0</controlfield>
  <controlfield tag="001">3813152</controlfield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="g">NeurIPS | 2018</subfield>
    <subfield code="a">Thirty-second Conference on Neural Information Processing Systems</subfield>
    <subfield code="c">Vancouver, Canada</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Politecnico di Milano</subfield>
    <subfield code="a">Davide Sanvito</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">University of Cambridge</subfield>
    <subfield code="a">Salvator Galea</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">NEC</subfield>
    <subfield code="a">Roberto Bifulco</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">862635</subfield>
    <subfield code="z">md5:95c5a4a26b315b2098b827887c1655e2</subfield>
    <subfield code="u"></subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2020-05-07</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="p">user-5gcity</subfield>
    <subfield code="o"></subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">NEC</subfield>
    <subfield code="a">Giuseppe Siracusano</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Deep Learning Inference on Commodity Network Interface Cards</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-5gcity</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">761508</subfield>
    <subfield code="a">5GCITY</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u"></subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2"></subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;Artificial neural networks&amp;rsquo; fully-connected layers require memory-bound operations on modern processors, which are therefore forced to stall their pipelines while waiting for memory loads. Computation batching improves on the issue, but it is largely inapplicable when dealing with time-sensitive serving workloads, which lowers the overall efficiency of the computing infrastructure. In this paper, we explore the opportunity to improve on the issue by offloading fully-connected layers processing to commodity Network Interface Cards. Our results show that current network cards can already process the fully-connected layers of binary neural networks, and thereby increase a machine&amp;rsquo;s throughput and efficiency. Further preliminary tests show that, with a relatively small hardware design modification, a new generation of network cards could increase their fully-connected layers processing throughput by a factor of 10.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.3813151</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.3813152</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">poster</subfield>
  </datafield>
</record>
                  All versions   This version
Views             51             51
Downloads         32             32
Data volume       27.6 MB        27.6 MB
Unique views      47             47
Unique downloads  30             30

