Publication | Open Access

IMPACT: A 1-to-4b 813-TOPS/W 22-nm FD-SOI Compute-in-Memory CNN Accelerator Featuring a 4.2-POPS/W 146-TOPS/mm² CIM-SRAM With Multi-Bit Analog Batch-Normalization

Citations: 29 | References: 32 | Year: 2023

Abstract

Amid the push for ever-growing AI processing capabilities at the edge, compute-in-memory (CIM) SRAMs built on current-based dot-product (DP) operators have become excellent candidates for executing low-precision convolutional neural networks (CNNs) with tremendous energy efficiency. Yet, these architectures suffer from noticeable analog non-idealities and a lack of dynamic-range adaptivity, leading to significant information loss during ADC quantization that hinders CNN performance under conventional digital batch-normalization (DBN). To overcome these issues, we present IMPACT, a 1-to-4b mixed-signal accelerator in 22-nm FD-SOI intended for low-precision edge CNNs. It includes a novel 72-kB dual-supply CIM-SRAM macro with 6T-based DP operators, as well as a multi-bit analog batch-normalization (ABN) unit that bypasses the ADC quantization issue. IMPACT embeds the macro within a highly parallel, channel- and precision-adaptive digital datapath that handles memory transfers and provides input-reshaping capability. Finally, a co-designed CIM-aware CNN training framework accounts for the macro’s analog impairments, namely non-linearity and variability. Measurement results showcase a 4b-normalized computing efficiency of 813 TOPS/W at 64 MHz for the whole accelerator. On its own, the CIM-SRAM macro achieves a peak energy efficiency of 4.2 POPS/W and a peak area efficiency of 146 TOPS/mm², surpassing all existing low-precision CIM designs to date.
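The benefit of normalizing in the analog domain can be illustrated with a toy model. The sketch below is a minimal NumPy illustration under assumed parameters, not the paper's circuit or training framework: the uniform quantizer, the fixed ±2 full-scale range, the 4-bit resolution, and the small channel swing are all hypothetical stand-ins. It compares applying batch-normalization before a fixed-range ADC (the ABN ordering) against quantizing first and normalizing digitally (the DBN ordering), for a channel whose analog output swing is far below the ADC full scale.

```python
import numpy as np

rng = np.random.default_rng(0)

ADC_BITS = 4          # assumed ADC resolution
FULL_SCALE = 2.0      # assumed hardware-fixed ADC input range: [-2, +2]

def adc(v, bits=ADC_BITS, fs=FULL_SCALE):
    """Uniform quantizer with a hardware-fixed full scale (hypothetical model)."""
    step = 2 * fs / 2 ** bits
    code = np.clip(np.round(v / step), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return code * step

def batch_norm(v, gamma, beta, eps=1e-5):
    """Standard batch-normalization over a batch of scalar channel outputs."""
    return gamma * (v - v.mean()) / np.sqrt(v.var() + eps) + beta

# A channel whose analog dot-product swing (std ~0.05) is far below the
# ADC full scale, mimicking the lack of dynamic-range adaptivity.
raw = 0.05 * rng.normal(size=4096)
gamma, beta = 0.5, 0.0
ideal = batch_norm(raw, gamma, beta)            # unquantized reference

# DBN ordering: quantize the mismatched analog signal, normalize digitally.
dbn = batch_norm(adc(raw), gamma, beta)
# ABN ordering: normalize in the analog domain so the signal fills the ADC
# range, then quantize.
abn = adc(batch_norm(raw, gamma, beta))

print("DBN MSE vs ideal:", np.mean((dbn - ideal) ** 2))
print("ABN MSE vs ideal:", np.mean((abn - ideal) ** 2))
```

With these toy numbers, the DBN ordering collapses most of the channel onto a single ADC code, while the ABN ordering rescales the signal to span the converter's range before quantization, incurring orders-of-magnitude less error. This is the intuition behind placing a multi-bit ABN unit ahead of the ADC, as the abstract describes.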
