An SRAM-Based Multibit In-Memory Matrix-Vector Multiplier With a Precision That Scales Linearly in Area, Time, and Power

DOI: 10.1109/TVLSI.2020.3037871
URL: https://zenodo.org/records/5301067
OAI identifier: oai:zenodo.org:5301067
Authors: Riduan Khaddam-Aljameh (IBM Zurich Research Laboratory), Pier-Andrea Francese (IBM Zurich Research Laboratory), Luca Benini (ETH Zurich), Evangelos Eleftheriou (IBM Zurich Research Laboratory)
Publisher: Zenodo
Publication year: 2021
Date: 2021-08-28
Keywords: in-memory computing, SRAM
Community: https://zenodo.org/communities/eu
License: Creative Commons Attribution 4.0 International

Abstract: A novel interleaved switched-capacitor and SRAM-based multibit matrix-vector multiply-accumulate engine for in-memory computing is presented. It operates by first converting an SRAM-stored n-bit weight into a proportional voltage using a pipeline D/A converter built from n+1 equally sized stages. A switched-capacitor stage then multiplies these voltages by an m-bit digital input activation. Finally, the output voltages that correspond to the different multiplication results are accumulated along one column by means of charge sharing. With the proposed architecture, the required circuit area, computation time, and power consumption scale linearly with the bit resolution of both the inputs and the weights. Analytical formulas are presented for the energy consumption in both capacitors and switches. Moreover, the impact of fabrication mismatch on analog computation accuracy is examined. The full system architecture is described, and its feasibility is demonstrated via a full macro implementation study in 14 nm, detailing area and energy consumption as well as the overall latency. Finally, a specific design of a signed 128 × 2048 matrix-vector multiplication accelerator system with 6-bit weights and 6-bit inputs in 14 nm is presented, which runs at 2.43 TOP/s at an efficiency of 16.94 TOP/s/W while using the nominal supply voltage of 0.8 V. If the operands' precision is considered in the metric, then the efficiency becomes 609.7 TOP/s/W.

Funding: European Commission, grant 682675, PROJECTED MEMRISTOR: A nanoscale device for cognitive computing
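
The three steps named in the abstract (weight-to-voltage conversion, multiplication by the input activation, and accumulation by charge sharing) can be illustrated with an idealized behavioral model. The sketch below is not the authors' circuit: it assumes unsigned operands, ideal equal-sized capacitors (so each charge-sharing step is an exact average), bit-serial processing starting from the least significant bit, and no mismatch or noise. All function names are hypothetical.

```python
def bits_lsb_first(value, width):
    """Return the `width` least significant bits of `value`, LSB first."""
    return [(value >> i) & 1 for i in range(width)]


def dac_voltage(weight, n_bits, v_ref):
    """Assumed bit-serial D/A step: each stage shares charge between two
    equal capacitors, halving the running voltage after adding the bit."""
    v = 0.0
    for b in bits_lsb_first(weight, n_bits):
        v = (v + b * v_ref) / 2.0
    return v  # ideally v_ref * weight / 2**n_bits


def multiply_stage(v_weight, activation, m_bits):
    """Assumed switched-capacitor multiply: the same halving recurrence,
    with the weight voltage gated by each activation bit."""
    v = 0.0
    for b in bits_lsb_first(activation, m_bits):
        v = (v + b * v_weight) / 2.0
    return v  # ideally v_weight * activation / 2**m_bits


def column_mac(weights, activations, n_bits=6, m_bits=6, v_ref=0.8):
    """Accumulate along one column: charge sharing across equal
    capacitors yields the average of the product voltages."""
    products = [
        multiply_stage(dac_voltage(w, n_bits, v_ref), x, m_bits)
        for w, x in zip(weights, activations)
    ]
    return sum(products) / len(products)


def ideal_column_voltage(weights, activations, n_bits=6, m_bits=6, v_ref=0.8):
    """Closed-form reference: scaled digital dot product."""
    dot = sum(w * x for w, x in zip(weights, activations))
    return v_ref * dot / (len(weights) * 2 ** (n_bits + m_bits))


if __name__ == "__main__":
    w = [63, 0, 21, 42]
    x = [63, 63, 10, 0]
    print(column_mac(w, x), ideal_column_voltage(w, x))
```

In this model the number of charge-sharing steps per cell is n + m, which is one simple way to see why computation time can grow linearly rather than exponentially with operand precision; the area and power scaling claimed in the abstract depend on circuit details that the sketch does not capture.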