A 28-nm Compute SRAM With Bit-Serial Logic/Arithmetic Operations for Programmable In-Memory Vector Computing

Year: 2019 | Citations: 215 | References: 35

TLDR

The paper proposes a hybrid in-/near-memory compute SRAM (CRAM) that performs vector-based, bit-serial arithmetic on operands from 1 bit up to 32 or 64 bits wide, covering integer and floating-point addition, multiplication, and division. The design was implemented in a 28-nm CMOS IoT processor with a Cortex-M0 CPU and eight 16-kB CRAM banks (128 kB total). At 1.1 V the system runs at 475 MHz and, with all CRAMs active, delivers 30 GOPS or 1.4 GFLOPS on 32-bit operands. At 0.6 V and 114 MHz it achieves 0.56 TOPS/W for 8-bit multiplication and 5.27 TOPS/W for 8-bit addition.
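The efficiency figures can be easier to compare as energy per operation. The short sketch below assumes only the unit identity 1 TOPS/W = 10^12 operations per joule, i.e. one operation per picojoule; the helper name `picojoules_per_op` is illustrative and does not come from the paper.

```python
def picojoules_per_op(tops_per_watt):
    """Convert an efficiency in TOPS/W to energy per operation in pJ.

    1 TOPS/W is 1e12 operations per joule, so the energy per operation
    in picojoules is simply the reciprocal of the TOPS/W figure.
    """
    return 1.0 / tops_per_watt


# Efficiency figures quoted above for operation at 0.6 V and 114 MHz.
for label, efficiency in [("8-bit multiplication", 0.56), ("8-bit addition", 5.27)]:
    print(f"{label}: {picojoules_per_op(efficiency):.2f} pJ/op")
# prints:
# 8-bit multiplication: 1.79 pJ/op
# 8-bit addition: 0.19 pJ/op
```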

Abstract

This article proposes a general-purpose hybrid in-/near-memory compute SRAM (CRAM) that combines an 8T transposable bit cell with vector-based, bit-serial in-memory arithmetic to accommodate a wide range of bit-widths, from single to 32 or 64 bits, as well as a complete set of operation types, including integer and floating-point addition, multiplication, and division. This approach provides the flexibility and programmability necessary for evolving software algorithms ranging from neural networks to graph and signal processing. The proposed design was implemented in a small Internet of Things (IoT) processor in 28-nm CMOS consisting of a Cortex-M0 CPU and eight CRAM banks of 16 kB each (128 kB total). The system achieves 475-MHz operation at 1.1 V and, with all CRAMs active, produces 30 GOPS or 1.4 GFLOPS on 32-bit operands. It achieves an energy efficiency of 0.56 TOPS/W for 8-bit multiplication and 5.27 TOPS/W for 8-bit addition at 0.6 V and 114 MHz.
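To make the idea of vector-based, bit-serial arithmetic on transposed data concrete, here is a minimal software sketch of bit-serial addition. It is not the paper's circuit or instruction set: the function names, the use of Python integers as bit planes, and the wrap-around on overflow are all assumptions made for illustration. Each bit plane holds one bit position of every vector element, so a single bitwise step updates all lanes at once, and the number of steps depends on the operand width rather than on the vector length.

```python
def transpose_to_bit_planes(values, width):
    """Store a vector 'transposed': planes[i] packs bit i of every element.

    Bit `lane` of planes[i] is bit i of values[lane].
    """
    planes = []
    for i in range(width):
        plane = 0
        for lane, value in enumerate(values):
            plane |= ((value >> i) & 1) << lane
        planes.append(plane)
    return planes


def from_bit_planes(planes, lanes):
    """Inverse of transpose_to_bit_planes: rebuild one integer per lane."""
    values = []
    for lane in range(lanes):
        value = 0
        for i, plane in enumerate(planes):
            value |= ((plane >> lane) & 1) << i
        values.append(value)
    return values


def bit_serial_add(a_planes, b_planes):
    """Add two transposed vectors one bit position at a time.

    Every loop iteration is a full-adder step applied to all lanes in
    parallel; the per-lane carries are kept in a single integer.
    The result wraps modulo 2**width (the final carry is dropped).
    """
    carry = 0
    sum_planes = []
    for a, b in zip(a_planes, b_planes):
        sum_planes.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_planes


a = [3, 200, 255, 17]
b = [5, 100, 1, 240]
a_planes = transpose_to_bit_planes(a, 8)
b_planes = transpose_to_bit_planes(b, 8)
print(from_bit_planes(bit_serial_add(a_planes, b_planes), len(a)))
# prints [8, 44, 0, 1], i.e. each sum modulo 256
```

Changing the `width` argument is all that is needed to handle a different operand size in this sketch, which mirrors the flexibility in bit-width that the abstract attributes to the bit-serial approach.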
