Publication | Closed Access
14.3 A 65nm Computing-in-Memory-Based CNN Processor with 2.9-to-35.8TOPS/W System Energy Efficiency Using Dynamic-Sparsity Performance-Scaling Architecture and Energy-Efficient Inter/Intra-Macro Data Reuse
Citations: 136 | References: 6 | Year: 2020 | Venue: Unknown
Keywords: Engineering, Machine Learning, Memory Macro, Computer Architecture, Speech Recognition, Data Science, High-Performance Architecture, Sparse Neural Network, Parallel Computing, Manycore Processor, CIM Processor, Computer Engineering, Computer Science, Computing-in-Memory-Based CNN Processor, Deep Learning, Model Compression, Hardware Acceleration, Domain-Specific Accelerator, Speech Processing, Parallel Programming, CIM Macro, In-Memory Computing
Computing-in-Memory (CIM) is a promising approach to energy-efficient neural-network (NN) processors. Previous CIM chips [1], [4] focus mainly on the memory macro itself and lack insight into overall system integration. Recently, a CIM-based system processor [5] for speech recognition demonstrated promising energy efficiency, but no prior work systematically explores sparsity optimization for a CIM processor. Directly mapping sparse NN models onto regular CIM macros is ineffective: sparse weights are usually randomly distributed, and a CIM macro cannot be power-gated even when it holds many zeros. To achieve both a high compression rate and high efficiency, the granularity of sparsity [6] must be chosen based on CIM characteristics. Moreover, system-level weight mapping onto CIM macros and data-reuse strategies remain under-explored; both are critical for CIM-macro utilization and energy efficiency.
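The point about sparsity granularity can be illustrated with a small model (not from the paper; macro dimensions, the column-level gating assumption, and all names below are hypothetical). If a CIM macro column can only be power-gated when *every* weight mapped to it is zero, randomly scattered zeros almost never align into a gateable column, whereas pruning at the granularity of whole macro-column blocks does align with the hardware:

```python
import numpy as np

# Hypothetical CIM macro modeled as a grid of MACRO_ROWS x MACRO_COLS cells;
# assume a macro column can be power-gated only if all its weights are zero.
MACRO_ROWS, MACRO_COLS = 16, 16

def gateable_fraction(weights):
    """Fraction of macro columns whose mapped weights are all zero."""
    n_rows, n_cols = weights.shape
    gated = total = 0
    for r0 in range(0, n_rows, MACRO_ROWS):
        for c0 in range(0, n_cols, MACRO_COLS):
            tile = weights[r0:r0 + MACRO_ROWS, c0:c0 + MACRO_COLS]
            gated += int(np.all(tile == 0, axis=0).sum())
            total += tile.shape[1]
    return gated / total

rng = np.random.default_rng(0)
shape = (64, 64)
density = 0.5  # 50% of weights are nonzero overall in both cases

# Unstructured pruning: zeros scattered randomly across the matrix.
unstructured = rng.random(shape) * (rng.random(shape) < density)

# Structured pruning: entire 16-row column blocks zeroed, so zeros
# line up exactly with macro columns.
mask = np.repeat(rng.random((shape[0] // MACRO_ROWS, shape[1])) < density,
                 MACRO_ROWS, axis=0)
structured = rng.random(shape) * mask

print(gateable_fraction(unstructured))  # ~0.0: random zeros rarely fill a column
print(gateable_fraction(structured))    # ~0.5: pruned blocks map to gateable columns
```

At equal overall sparsity, the structured layout lets roughly half the macro columns be gated while the unstructured one gates essentially none, which is why the granularity of pruning must match the macro geometry.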