Publication | Closed Access
A 65nm 1Mb nonvolatile computing-in-memory ReRAM macro with sub-16ns multiply-and-accumulate for binary DNN AI edge processors
Citations: 320
References: 6
Year: 2018
Unknown Venue
Sub-16ns Multiply-and-accumulate, Convolutional Neural Network, Engineering, Machine Learning, Neural Network, Computer Architecture, High-performance Architecture, Sparse Neural Network, Embedded Machine Learning, Parallel Computing, Computer Engineering, Computer Science, Deep Learning, Neural Architecture Search, Memory Architecture, Many Artificial Intelligence, Deep Neural Networks, Hardware Acceleration, Edge Computing, In-memory Computing
Many artificial intelligence (AI) edge devices use nonvolatile memory (NVM) to store the weights of a neural network (trained off-line on an AI server), and require low-energy and fast I/O accesses. The deep neural networks (DNNs) used by AI processors [1,2] commonly require p layers of a convolutional neural network (CNN) and q layers of a fully-connected network (FCN). Current DNN processors that use a conventional (von Neumann) memory structure are limited by high access latencies, I/O energy consumption, and hardware costs. Large working data sets result in heavy accesses across the memory hierarchy; moreover, large amounts of intermediate data are generated by the large number of multiply-and-accumulate (MAC) operations in both the CNN and the FCN. Even when binary-based DNNs [3] are used, the required CNN and FCN operations result in a major memory I/O bottleneck for AI edge devices.
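To make the MAC operation in a binary DNN concrete, the following is a minimal software sketch, assuming the common formulation in which both inputs and weights take values in {-1, +1} and the dot product reduces to an XNOR followed by a population count. The function name `binary_mac` and the bit encoding are illustrative assumptions; the abstract does not specify the encoding used by the macro, and this sketch does not model the in-memory ReRAM circuit.

```python
def binary_mac(inputs, weights):
    """Dot product of two equal-length sequences with entries in {-1, +1}.

    Hypothetical helper: encodes +1 as bit 1 and -1 as bit 0, then uses
    XNOR + popcount, a standard trick for binary neural networks.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    n = len(inputs)

    # Pack each vector into an integer bit mask (bit i set when element i is +1).
    x_bits = sum(1 << i for i, v in enumerate(inputs) if v == 1)
    w_bits = sum(1 << i for i, v in enumerate(weights) if v == 1)

    # XNOR marks positions where input and weight agree (product is +1).
    mask = (1 << n) - 1
    agree = ~(x_bits ^ w_bits) & mask
    matches = bin(agree).count("1")

    # Each agreement contributes +1, each disagreement -1.
    return 2 * matches - n


# Same result as the explicit sum of products.
x = [1, -1, 1, 1]
w = [1, 1, -1, 1]
assert binary_mac(x, w) == sum(a * b for a, b in zip(x, w))
```

In a conventional architecture every such MAC requires fetching the weights from memory first, which is the I/O traffic the abstract identifies as the bottleneck.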