Concepedia

Publication | Open Access

A compute-in-memory chip based on resistive random-access memory

Citations: 739

References: 52

Year: 2022

TLDR

Edge AI demands unprecedented energy efficiency. Compute-in-memory (CIM) based on resistive RAM (RRAM) promises to meet this demand by storing model weights in dense, non-volatile devices and performing computation directly within the memory, yet delivering high energy efficiency, versatility, and accuracy at the same time remains a challenge. The authors co-optimize across algorithms, architecture, circuits, and devices to build an RRAM-based CIM chip that balances all three. The resulting chip, NeuRRAM, is reconfigurable to support diverse model architectures, is twice as energy-efficient as prior state-of-the-art RRAM-CIM chips, and reaches inference accuracy comparable to software models quantized to four-bit weights. On benchmark tasks, NeuRRAM attains 99.0% accuracy on MNIST, 85.7% on CIFAR-10, and 84.7% on Google speech command recognition, and it reduces image-reconstruction error by 70% on a Bayesian image-recovery task.

Abstract

Realizing increasingly complex artificial intelligence (AI) functionalities directly on edge devices calls for unprecedented energy efficiency of edge hardware. Compute-in-memory (CIM) based on resistive random-access memory (RRAM) [1] promises to meet such demand by storing AI model weights in dense, analogue and non-volatile RRAM devices, and by performing AI computation directly within RRAM, thus eliminating power-hungry data movement between separate compute and memory [2–5]. Although recent studies have demonstrated in-memory matrix-vector multiplication on fully integrated RRAM-CIM hardware [6–17], it remains a goal for a RRAM-CIM chip to simultaneously deliver high energy efficiency, versatility to support diverse models and software-comparable accuracy. Although efficiency, versatility and accuracy are all indispensable for broad adoption of the technology, the inter-related trade-offs among them cannot be addressed by isolated improvements on any single abstraction level of the design. Here, by co-optimizing across all hierarchies of the design from algorithms and architecture to circuits and devices, we present NeuRRAM, a RRAM-based CIM chip that simultaneously delivers versatility in reconfiguring CIM cores for diverse model architectures, energy efficiency that is two-times better than previous state-of-the-art RRAM-CIM chips across various computational bit-precisions, and inference accuracy comparable to software models quantized to four-bit weights across various AI tasks, including accuracy of 99.0 percent on MNIST [18] and 85.7 percent on CIFAR-10 [19] image classification, 84.7-percent accuracy on Google speech command recognition [20], and a 70-percent reduction in image-reconstruction error on a Bayesian image-recovery task.
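To make the idea of in-memory matrix-vector multiplication with four-bit weights concrete, the following is a minimal Python sketch, not the NeuRRAM design itself. It assumes symmetric uniform quantization to signed levels in [-7, 7] and a differential encoding in which each weight is stored as a pair of conductances; neither detail is stated in the abstract. The conductance values `G_MIN` and `G_MAX` and all function names are hypothetical, and the model is idealized (no device noise, wire resistance, or ADC quantization).

```python
# Idealized model of a resistive crossbar computing y_j = sum_i x_i * W_ij.
# All constants and names below are illustrative assumptions.

G_MIN = 1e-6    # hypothetical low-conductance state, in siemens
G_MAX = 40e-6   # hypothetical high-conductance state, in siemens
G_STEP = (G_MAX - G_MIN) / 7  # conductance change per quantization level


def quantize_4bit(weights):
    """Symmetric uniform quantization to signed integer levels in [-7, 7]."""
    max_abs = max(abs(w) for row in weights for w in row)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    levels = [[max(-7, min(7, round(w / scale))) for w in row]
              for row in weights]
    return levels, scale


def to_conductance_pairs(levels):
    """Encode each signed level as a (g_pos, g_neg) pair (assumed scheme)."""
    return [[(G_MIN + max(level, 0) * G_STEP,
              G_MIN + max(-level, 0) * G_STEP) for level in row]
            for row in levels]


def crossbar_mvm(pairs, voltages):
    """Sum currents per column: I_j = sum_i V_i * (g_pos_ij - g_neg_ij)."""
    currents = [0.0] * len(pairs[0])
    for v, row in zip(voltages, pairs):          # one input voltage per row
        for j, (g_pos, g_neg) in enumerate(row):
            currents[j] += v * (g_pos - g_neg)   # Ohm's law, then summation
    return currents


def cim_matvec(weights, inputs):
    """Quantize, map to conductances, multiply, and rescale to weight units."""
    levels, scale = quantize_4bit(weights)
    currents = crossbar_mvm(to_conductance_pairs(levels), inputs)
    return [i / G_STEP * scale for i in currents]


if __name__ == "__main__":
    W = [[0.7, -0.3],
         [0.1, 0.7]]        # rows: inputs, columns: outputs
    x = [1.0, 2.0]
    print(cim_matvec(W, x))
```

With the example weights, every entry falls exactly on a quantization level, so the result is close to the floating-point product [0.9, 1.1]. Weights that fall between levels would incur a rounding error of up to half a level each, which is the kind of accuracy loss the co-optimization described above has to keep small.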
