
TLDR

Conventional von Neumann architectures suffer from a memory–processor bandwidth bottleneck that limits large‑data applications, and while near‑memory computing mitigates this, truly in‑memory computing with resistive‑switching memories offers a promising, low‑power, massively parallel solution whose practical deployment still faces significant challenges. This work provides a qualitative and quantitative assessment of the principal challenges in deploying high‑capacity, high‑volume resistive‑switching memories to accelerate vector‑matrix multiplication for machine‑learning inference. The authors describe monolithic integration of resistive‑switching memories with CMOS circuitry, review device‑, circuit‑, and system‑level design choices, and outline future research directions.

Abstract

The low communication bandwidth between memory and processing units in conventional von Neumann machines cannot meet the requirements of emerging applications that rely extensively on large sets of data. More recent computing paradigms, such as high parallelization and near‐memory computing, help alleviate the data communication bottleneck to some extent, but paradigm‐shifting concepts are required. In‐memory computing has emerged as a prime candidate to eliminate this bottleneck by colocating memory and processing. In this context, resistive switching (RS) memory devices are a promising choice because their intrinsic device‐level properties enable both storing and computing with a small, massively parallel footprint at low power. Theoretically, this directly translates to a major boost in energy efficiency and computational throughput, but various practical challenges remain. A qualitative and quantitative analysis is presented of several key existing challenges in implementing high‐capacity, high‐volume RS memories for accelerating the most computationally demanding operation in machine learning (ML) inference, vector‐matrix multiplication (VMM). The monolithic integration of RS memories with complementary metal–oxide–semiconductor (CMOS) integrated circuits is presented as the core underlying technology. The key existing design choices in terms of device‐level physical implementation, circuit‐level design, and system‐level considerations are reviewed, and an outlook for future directions is provided.
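The abstract does not spell out how an RS array performs VMM, so the following is only a minimal sketch of the commonly used idealized crossbar model, not the authors' implementation. It assumes each weight is stored as a difference between two device conductances, input voltages drive the rows, and each column current is the sum of voltage times conductance (Ohm's law plus Kirchhoff's current law). The function names, the conductance range (`g_min`, `g_max`), and the example values are all hypothetical, and non-idealities such as wire resistance, device variability, and converter quantization are ignored.

```python
import numpy as np

def weights_to_conductances(W, g_min=1e-6, g_max=1e-4):
    """Map signed weights onto a differential pair of conductances (siemens).

    g_min and g_max are assumed device limits, chosen only for illustration.
    """
    w_max = np.max(np.abs(W))
    if w_max == 0:
        raise ValueError("weight matrix is all zeros; nothing to scale")
    scale = (g_max - g_min) / w_max           # siemens per unit weight
    g_pos = g_min + scale * np.clip(W, 0, None)   # stores positive parts
    g_neg = g_min + scale * np.clip(-W, 0, None)  # stores negative parts
    return g_pos, g_neg, scale

def crossbar_vmm(v, g_pos, g_neg):
    """Ideal crossbar read: column current I_j = sum_i v_i * G_ij.

    The differential pair subtracts the two column currents, which cancels
    the common g_min offset and recovers the sign of each weight.
    """
    return v @ g_pos - v @ g_neg

# Hypothetical 3-input, 2-output layer.
W = np.array([[0.5, -1.0],
              [2.0, 0.25],
              [-0.75, 1.5]])
v = np.array([0.1, 0.2, -0.05])               # row voltages (volts)

g_pos, g_neg, scale = weights_to_conductances(W)
i_out = crossbar_vmm(v, g_pos, g_neg)         # column currents (amperes)
y = i_out / scale                             # rescale back to weight units
```

In this idealized model the rescaled currents `y` equal the digital product `v @ W`, and every multiply-accumulate happens in one parallel read of the array. That is the source of the throughput and energy argument, while the practical challenges the abstract refers to are exactly what the idealization leaves out.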
