Publication | Closed Access
14.3 A 65nm Computing-in-Memory-Based CNN Processor with 2.9-to-35.8TOPS/W System Energy Efficiency Using Dynamic-Sparsity Performance-Scaling Architecture and Energy-Efficient Inter/Intra-Macro Data Reuse
Citations: 136 | References: 6 | Year: 2020 | Venue: Unknown
Keywords: Engineering, Machine Learning, Memory Macro, Computer Architecture, Speech Recognition, Data Science, High-Performance Architecture, Sparse Neural Network, Parallel Computing, Manycore Processor, CIM Processor, Computer Engineering, Computer Science, Computing-in-Memory-Based CNN Processor, Deep Learning, Model Compression, Hardware Acceleration, Domain-Specific Accelerator, Speech Processing, Parallel Programming, CIM Macro, In-Memory Computing
Computing-in-Memory (CIM) is a promising approach to energy-efficient neural-network (NN) processors. Previous CIM chips [1], [4] focus mainly on the memory macro itself and lack insight into overall system integration. Recently, a CIM-based system processor [5] for speech recognition demonstrated promising energy efficiency, but no prior work systematically explores sparsity optimization for a CIM processor. Directly mapping sparse NN models onto regular CIM macros is ineffective: sparse weights are usually randomly distributed, and a CIM macro cannot be power-gated even when it holds many zeros. To achieve both a high compression rate and high efficiency, the granularity of sparsity [6] must be chosen based on CIM characteristics. Moreover, system-level weight mapping onto CIM macros and data-reuse strategies remain under-explored; both are critical for CIM-macro utilization and energy efficiency.
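The point about sparsity granularity can be illustrated with a small model (not from the paper; macro dimensions, the column-level gating assumption, and all names below are hypothetical). If a CIM macro column can only be power-gated when *every* weight mapped to it is zero, randomly scattered zeros almost never align into a gateable column, whereas pruning at the granularity of whole macro-column blocks does align with the hardware:

```python
import numpy as np

# Hypothetical CIM macro modeled as a grid of MACRO_ROWS x MACRO_COLS cells;
# assume a macro column can be power-gated only if all its weights are zero.
MACRO_ROWS, MACRO_COLS = 16, 16

def gateable_fraction(weights):
    """Fraction of macro columns whose mapped weights are all zero."""
    n_rows, n_cols = weights.shape
    gated = total = 0
    for r0 in range(0, n_rows, MACRO_ROWS):
        for c0 in range(0, n_cols, MACRO_COLS):
            tile = weights[r0:r0 + MACRO_ROWS, c0:c0 + MACRO_COLS]
            gated += int(np.all(tile == 0, axis=0).sum())
            total += tile.shape[1]
    return gated / total

rng = np.random.default_rng(0)
shape = (64, 64)
density = 0.5  # 50% of weights are nonzero overall in both cases

# Unstructured pruning: zeros scattered randomly across the matrix.
unstructured = rng.random(shape) * (rng.random(shape) < density)

# Structured pruning: entire 16-row column blocks zeroed, so zeros
# line up exactly with macro columns.
mask = np.repeat(rng.random((shape[0] // MACRO_ROWS, shape[1])) < density,
                 MACRO_ROWS, axis=0)
structured = rng.random(shape) * mask

print(gateable_fraction(unstructured))  # ~0.0: random zeros rarely fill a column
print(gateable_fraction(structured))    # ~0.5: pruned blocks map to gateable columns
```

At equal overall sparsity, the structured layout lets roughly half the macro columns be gated while the unstructured one gates essentially none, which is why the granularity of pruning must match the macro geometry.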