Report Open Access

Processing in Memory: The Tipping Point

Radojković, Petar; Carpenter, Paul; Esmaili-Dokht, Pouya; Cimadomo, Rémy; Charles, Henri-Pierre; Sebastian, Abu; Amato, Paolo


MARC21 XML Export

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">M. Radulovic, D. Zivanovic, D. Ruiz, B. R. d. Supinski, S. A. McKee, P. Radojković and E. Ayguadé, "Another Trip to the Wall: How Much Will Stacked DRAM Benefit HPC?," in Proceedings of the International Symposium on Memory Systems (MEMSYS), 2015.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">H. S. Stone, "A Logic-in-Memory Computer," IEEE Transactions on Computers, vol. 19, 1970.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">P. Siegl, R. Buchty and M. Berekovic, "Data-Centric Computing Frontiers: A Survey on Processing-In-Memory," in Proceedings of the Second International Symposium on Memory Systems (MEMSYS), 2016.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Eurolab4HPC Long-Term Vision on High-Performance Computing (2nd Edition), 2020.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">ETP4HPC's SRA 4, "Strategic Research Agenda for High-performance Computing in Europe," White Paper, 2020.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Samsung Electronics Co., Ltd., "288pin Registered DIMM based on 4Gb E-die," DDR4 SDRAM Datasheet, 2017.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">S. Li, Z. Yang, D. Reddy, A. Srivastava and B. Jacob, "DRAMsim3: A Cycle-Accurate, Thermal-Capable DRAM Simulator," IEEE Computer Architecture Letters, vol. 19, 2020.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">M. Radulovic, K. Asifuzzaman, D. Zivanovic, N. Rajovic, G. C. d. Verdière, D. Pleiter, M. Marazakis, N. Kallimanis, P. Carpenter, P. Radojković and E. Ayguadé, "Mainstream vs. Emerging HPC: Metrics, Trade-Offs and Lessons Learned," in Proceedings of 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2018.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">O. Mutlu, S. Ghose, J. Gómez-Luna and R. Ausavarungnirun, "A Modern Primer on Processing in Memory," in arXiv, 2020.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">K. Wang, K. Angstadt, C. Bo, N. Brunelle, E. Sadredini, T. Tracy, J. Wadden, M. Stan and K. Skadron, "An Overview of Micron's Automata Processor," in Proceedings of the Eleventh IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis (CODES), 2016.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">P. Dlugosch, D. Brown, P. Glendenning, M. Leventhal and H. Noyes, "An Efficient and Scalable Semiconductor Architecture for Parallel Automata Processing," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 12, 2014.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">T. Finkbeiner, G. Hush, T. Larsen, P. Lea, J. Leidel and T. Manning, "In-Memory Intelligence," IEEE Micro, vol. 37, no. 4, 2017.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">F. Devaux, "The True Processing in Memory Accelerator," IEEE Hot Chips Symposium (HCS), 2019.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">J. Jeddeloh and B. Keeth, "Hybrid Memory Cube New DRAM Architecture Increases Density and Performance," in Proceedings of Symposium on VLSI Technology (VLSIT), 2012.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">JEDEC Solid State Technology Association, "High Bandwidth Memory (HBM) DRAM," White Paper, 2013.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">J. Jeffers, J. Reinders and A. Sodani, Intel Xeon Phi Processor High Performance Programming: Knights Landing Edition (2nd ed.), 2016.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">FUJITSU LIMITED, "FUJITSU Supercomputer PRIMEHPC Specifications," White Paper, 2020.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">FUJITSU LIMITED, "FUJITSU Supercomputer PRIMEHPC FX1000," White Paper, 2020.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Y. Kwon et al., "A 20nm 6GB Function-In-Memory DRAM, Based on HBM2 with a 1.2TFLOPS Programmable Computing Unit Using Bank-Level Parallelism, for Machine Learning Applications," in Proceedings of IEEE International Solid-State Circuits Conference (ISSCC), 2021.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">J.-P. Noel, M. Pezzin, R. Gauchi, J.-F. Christmann, M. Kooli, H.-P. Charles, L. Ciampolini, M. Diallo, F. Lepin, B. Blampey, P. Vivet, S. Mitra and B. Giraud, "A 35.6 TOPS/W/mm² 3-Stage Pipelined Computational SRAM with Adjustable Form Factor for Highly Data-Centric Applications," IEEE Solid-State Circuits Letters, vol. 3, 2020.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">M. Kooli, H.-P. Charles, C. Touzet, B. Giraud and J.-P. Noel, "Smart Instruction Codes for In-Memory Computing Architectures Compatible with Standard SRAM Interfaces," in Proceedings of Design, Automation &amp; Test in Europe Conference &amp; Exhibition (DATE), 2018.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">R. Khaddam-Aljameh, P.-A. Francese, L. Benini and E. Eleftheriou, "An SRAM-Based Multibit In-Memory Matrix-Vector Multiplier with a Precision that Scales Linearly in Area, Time, and Power," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 29, 2021.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">R. Khaddam-Aljameh et al., "HERMES Core – A 14nm CMOS and PCM-based In-Memory Compute Core using an array of 300ps/LSB Linearized CCO-based ADCs and local digital processing," in Proc. Symposium on VLSI Circuits, 2021.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">A. Sebastian, M. Le Gallo, R. Khaddam-Aljameh and E. Eleftheriou, "Memory devices and applications for in-memory computing," Nature Nanotechnology, no. July, pp. 529–544, 2020.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">M. Giordano, K. Prabhu, K. Koul, R. M. Radway, A. Gural, R. Doshi, Z. F. Khan, J. W. Kustin, T. Liu, G. B. Lopes, V. Turbiner, W.-S. Khwa, Y.-D. Chih, M.-F. Chang, G. Lallement, B. Murmann, S. Mitra and P. Raina, "CHIMERA: A 0.92 TOPS, 2.2 TOPS/W Edge AI Accelerator with 2 MByte On-Chip Foundry Resistive RAM for Efficient Training and Inference," Symposium on VLSI Circuits (VLSI), 2021.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">A. Valentian, F. Rummens, E. Vianello, T. Mesquida, C. L.-M. d. Boissac, O. Bichler and C. Reita, "Fully Integrated Spiking Neural Network with Analog Neurons and RRAM Synapses," IEEE International Electron Devices Meeting (IEDM), pp. 14.3.1-14.3.4, 2019.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Microchip Technology Inc., "Enhancing System Architecture Implementation for AI Applications, Microchip Delivers its Analog Embedded SuperFlash Technology," News Release, 2019.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Micron Technology, Inc., "ECC Brings Reliability and Power Efficiency to Mobile Devices," White Paper, 2017.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">M. Patel, J. S. Kim, H. Hassan and O. Mutlu, "Understanding and Modeling On-Die Error Correction in Modern DRAM: An Experimental Study Using Real Devices," in Proceedings of 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2019.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">P. Amato, C. Laurent, M. Sforzin, S. Bellini, M. Ferrari and A. Tomasoni, "Ultra fast, two-bit ECC for Emerging Memories," in IEEE 6th International Memory Workshop (IMW), 2014.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">D. D. Sharma, "Compute Express Link," White Paper, 2019.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">B. Benton, "CCIX, GEN-Z, OpenCAPI: Overview and Comparison," White Paper, 2017.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">Y. Kim, R. Daly, J. Kim, C. Fallin, J. H. Lee, D. Lee, C. Wilkerson, K. Lai and O. Mutlu, "Flipping Bits in Memory Without Accessing Them: An Experimental Study of DRAM Disturbance Errors," in Proceedings of ACM/IEEE 41st International Symposium on Computer Architecture (ISCA), 2014.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">UPMEM, "UPMEM PIM Security Benefits - Architecture and Features Overview," White Paper, 2020.</subfield>
  </datafield>
  <datafield tag="999" ind1="C" ind2="5">
    <subfield code="x">P. Radojković et al., "Towards Resilient EU HPC Systems: A Blueprint," European HPC resilience initiative, 2020.</subfield>
  </datafield>
  <datafield tag="041" ind1=" " ind2=" ">
    <subfield code="a">eng</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">Processing in Memory</subfield>
  </datafield>
  <datafield tag="653" ind1=" " ind2=" ">
    <subfield code="a">PIM</subfield>
  </datafield>
  <controlfield tag="005">20210918014829.0</controlfield>
  <datafield tag="500" ind1=" " ind2=" ">
    <subfield code="a">This work was supported by the Spanish Government (contract PID2019-107255GB), Generalitat de Catalunya (contracts 2017-SGR-1328 and 2017-SGR-1414), and the European Union's Horizon 2020 research and innovation programme under grant agreements No 955606 (DEEP-SEA) and No 682675 (Projected Memristor European Research Council grant). Paul Carpenter holds the Ramon y Cajal fellowship under contract RYC2018-025628-I of the Ministry of Economy and Competitiveness of Spain. This work was also supported by the Collaboration Agreement between Micron Technology, Inc. and BSC. The authors wish to thank Xavier Martorell from BSC for his technical support, and Manolis Marazakis and André Brinkmann for their feedback.</subfield>
  </datafield>
  <controlfield tag="001">4767489</controlfield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Barcelona Supercomputing Center (BSC)</subfield>
    <subfield code="0">(orcid)0000-0002-9392-0521</subfield>
    <subfield code="a">Carpenter, Paul</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Barcelona Supercomputing Center</subfield>
    <subfield code="0">(orcid)0000-0001-8799-5773</subfield>
    <subfield code="a">Esmaili-Dokht, Pouya</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">UPMEM</subfield>
    <subfield code="a">Cimadomo, Rémy</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">CEA</subfield>
    <subfield code="0">(orcid)0000-0002-0119-0446</subfield>
    <subfield code="a">Charles, Henri-Pierre</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">IBM Research</subfield>
    <subfield code="0">(orcid)0000-0001-5603-5243</subfield>
    <subfield code="a">Sebastian, Abu</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Micron</subfield>
    <subfield code="0">(orcid)0000-0002-9601-1462</subfield>
    <subfield code="a">Amato, Paolo</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="s">1832999</subfield>
    <subfield code="z">md5:26dfe336ef5d55f048bdfa76d1638693</subfield>
    <subfield code="u">https://zenodo.org/record/4767489/files/ETP4HPC_WP_Processing-In-Memory_FINAL.pdf</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2021-07-29</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">openaire</subfield>
    <subfield code="p">user-etp4hpc</subfield>
    <subfield code="o">oai:zenodo.org:4767489</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Barcelona Supercomputing Center (BSC)</subfield>
    <subfield code="0">(orcid)0000-0002-9334-3330</subfield>
    <subfield code="a">Radojković, Petar</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Processing in Memory: The Tipping Point</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-etp4hpc</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">682675</subfield>
    <subfield code="a">PROJECTED MEMRISTOR: A nanoscale device for cognitive computing</subfield>
  </datafield>
  <datafield tag="536" ind1=" " ind2=" ">
    <subfield code="c">955606</subfield>
    <subfield code="a">DEEP – SOFTWARE FOR EXASCALE ARCHITECTURES</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">https://creativecommons.org/licenses/by/4.0/legalcode</subfield>
    <subfield code="a">Creative Commons Attribution 4.0 International</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">&lt;p&gt;Decades after first being explored in the 1970s, Processing in Memory (PIM) is experiencing a renaissance. By moving part of the computation to the memory devices, PIM addresses a fundamental issue in the design of modern computing systems: the mismatch between the von Neumann architecture and the requirements of important data-centric applications. A number of industrial prototypes and products are under development or already available in the marketplace, and these devices show the potential for cost-effective and energy-efficient acceleration of HPC, AI and data analytics workloads. This paper reviews the reasons for the renewed interest in PIM and surveys industrial prototypes and products, discussing their technological readiness.&lt;/p&gt;

&lt;p&gt;Wide adoption of PIM in production, however, depends on our ability to create an ecosystem to drive and coordinate innovations and co-design across the whole stack. European companies and research centres should be involved in all aspects, from technology, hardware, system software and programming environments to the updating of algorithms and applications. In this paper, we identify the main challenges that must be addressed and we provide guidelines to prioritise the research efforts and funding. We aim to help make PIM a reality in production HPC, AI and data analytics.&lt;/p&gt;</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.5281/zenodo.4767488</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.5281/zenodo.4767489</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">report</subfield>
  </datafield>
</record>