Dot-product engine for neuromorphic computing

TLDR

Vector-matrix multiplication dominates computation time and energy for many workloads, particularly neural network algorithms and linear transforms such as the Discrete Fourier Transform. Using the natural current accumulation of memristor crossbars, we developed the Dot-Product Engine (DPE) as a high-density, high-power-efficiency accelerator for approximate vector-matrix multiplication. We devised a conversion algorithm that maps matrix values to memristor conductances, used closed-loop pulse tuning and access transistors for accurate resistance programming, and validated the design by simulating a state-of-the-art neural network for pattern recognition on the DPE. The simulated DPE reached 99 % pattern-recognition accuracy on MNIST with only 4-bit DACs/ADCs, and a speed-efficiency product 1,000× to 10,000× that of a custom digital ASIC.

Abstract

Vector-matrix multiplication dominates the computation time and energy for many workloads, particularly neural network algorithms and linear transforms (e.g., the Discrete Fourier Transform). Using the natural current accumulation of memristor crossbars, we developed the Dot-Product Engine (DPE) as a high-density, high-power-efficiency accelerator for approximate vector-matrix multiplication. We first developed a conversion algorithm that maps arbitrary matrix values to memristor conductances in a realistic crossbar array, accounting for device physics and circuit issues to reduce computational errors. Accurate programming of device resistance in large arrays is enabled by closed-loop pulse tuning and access transistors. To validate our approach, we simulated and benchmarked a state-of-the-art neural network for pattern recognition on the DPE. The results show no accuracy degradation compared to the software approach (99 % pattern-recognition accuracy on the MNIST data set) while requiring only 4-bit DACs/ADCs, and the DPE can achieve a speed-efficiency product 1,000× to 10,000× that of a custom digital ASIC.
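
The crossbar computation described above can be illustrated with a minimal Python sketch. It is an idealized model and not the conversion algorithm of the paper: it assumes a purely linear mapping from matrix values to conductances, ignores wire resistance and device nonlinearity, and models only the DAC (input) quantization. The function names, the conductance range `g_min`/`g_max`, and the input range `[0, v_max]` are all assumptions made for illustration.

```python
def map_to_conductance(weights, g_min=1e-6, g_max=1e-4):
    """Linearly map matrix values onto [g_min, g_max] siemens.

    Idealized assumption: a real mapping would also compensate for
    device physics and circuit effects. Requires at least two distinct
    matrix values.
    """
    flat = [w for row in weights for w in row]
    w_min, w_max = min(flat), max(flat)
    scale = (g_max - g_min) / (w_max - w_min)  # siemens per unit of matrix value
    offset = g_min - scale * w_min             # conductance that encodes value 0
    conductances = [[scale * w + offset for w in row] for row in weights]
    return conductances, scale, offset


def quantize(value, lo, hi, bits=4):
    """Round value to the nearest of 2**bits evenly spaced levels in [lo, hi]."""
    step = (hi - lo) / (2 ** bits - 1)
    clipped = min(max(value, lo), hi)
    return lo + round((clipped - lo) / step) * step


def crossbar_currents(voltages, conductances):
    """Column currents I_j = sum_i V_i * G_ij (Ohm's law plus Kirchhoff's current law)."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


def dpe_multiply(x, weights, bits=4, v_max=1.0):
    """Approximate the vector-matrix product x @ weights on the ideal crossbar."""
    g, scale, offset = map_to_conductance(weights)
    v = [quantize(xi, 0.0, v_max, bits) for xi in x]  # DAC: inputs become row voltages
    currents = crossbar_currents(v, g)
    v_sum = sum(v)
    # Undo the linear mapping: I_j = scale * (v @ W)_j + offset * sum(v)
    return [(i - offset * v_sum) / scale for i in currents]
```

For example, `dpe_multiply([1.0, 0.0], [[1.0, -2.0], [0.5, 3.0]])` returns approximately the first row of the matrix, because the first row voltage is at full scale and the second is zero. Subtracting `offset * sum(v)` is what lets a strictly positive conductance range represent negative matrix values in this sketch.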
