PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory

TLDR

Processing‑in‑memory (PIM) addresses the memory‑wall challenge, and ReRAM’s crossbar structure enables efficient matrix‑vector multiplication, making it a promising main‑memory technology for accelerating neural networks. The authors propose PRIME, a novel PIM architecture that accelerates neural‑network (NN) applications in ReRAM‑based main memory. PRIME uses a morphable design in which a portion of the ReRAM crossbar arrays can be configured either as NN accelerators or as normal memory, supported by microarchitecture and circuit designs that incur insignificant area overhead and by a software/hardware interface for NN deployment. Compared with a state‑of‑the‑art neural processing unit, PRIME improves performance by roughly 2360× and energy consumption by roughly 895× across the evaluated machine learning benchmarks.

Abstract

Processing-in-memory (PIM) is a promising solution to address the "memory wall" challenges for future computer systems. Previously proposed PIM architectures put additional computation logic in or near memory. The emerging metal-oxide resistive random access memory (ReRAM) has shown its potential to be used for main memory. Moreover, with its crossbar array structure, ReRAM can perform matrix-vector multiplication efficiently, and has been widely studied to accelerate neural network (NN) applications. In this work, we propose a novel PIM architecture, called PRIME, to accelerate NN applications in ReRAM-based main memory. In PRIME, a portion of ReRAM crossbar arrays can be configured as accelerators for NN applications or as normal memory for a larger memory space. We provide microarchitecture and circuit designs to enable the morphable functions with an insignificant area overhead. We also design a software/hardware interface for software developers to implement various NNs on PRIME. Benefiting from both the PIM architecture and the efficiency of using ReRAM for NN computation, PRIME distinguishes itself from prior work on NN acceleration, with significant performance improvement and energy saving. Our experimental results show that, compared with a state-of-the-art neural processing unit design, PRIME improves the performance by ~2360x and the energy consumption by ~895x, across the evaluated machine learning benchmarks.
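
The abstract's claim that a crossbar array performs matrix-vector multiplication can be illustrated with an idealized model: each cell stores a weight as a conductance, input voltages drive the rows, and each column's output current is the sum of voltage times conductance along that column. The sketch below is only a numerical illustration under that assumption, not PRIME's circuit design. The function name `crossbar_mvm` and the example values are hypothetical, and effects such as wire resistance, limited cell precision, and analog-to-digital conversion are ignored.

```python
def crossbar_mvm(conductances, voltages):
    """Idealized crossbar read: column current j = sum_i V[i] * G[i][j].

    conductances: list of rows, one conductance per cell (hypothetical units).
    voltages: one input voltage per row.
    """
    rows = len(conductances)
    cols = len(conductances[0])
    if len(voltages) != rows:
        raise ValueError("need exactly one input voltage per crossbar row")

    currents = [0.0] * cols
    for i in range(rows):
        for j in range(cols):
            # Ohm's law per cell; currents sum along each column wire.
            currents[j] += voltages[i] * conductances[i][j]
    return currents


# Hypothetical 3x2 crossbar: 3 input rows, 2 output columns.
G = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0]]
V = [1.0, 0.5, 2.0]

print(crossbar_mvm(G, V))  # [12.5, 16.0]
```

In this model every column accumulates its result at the same time, which is why a single analog read of the array corresponds to a full matrix-vector product.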
