Concepedia

Publication | Closed Access

RaQu: An automatic high-utilization CNN quantization and mapping framework for general-purpose RRAM Accelerator

Citations: 36

References: 21

Year: 2020

Abstract

Convolutional neural networks (CNNs) have become the state-of-the-art technique for many classification tasks in IoT systems. However, low-power, area-constrained edge devices cannot afford the high cost of CNNs. Resistive random access memory (RRAM) is attractive for building CNN accelerators at the edge because of its scalability, low power, and in-situ dot-product computation. However, mapping an arbitrary network architecture onto a general-purpose RRAM accelerator suffers from severe resource underutilization. Neural network quantization offers an opportunity to recover the lost utilization, but selecting bit-widths for the vast number of parameters by hand is impractical. This paper proposes an AutoML-based array-aware quantization and mapping framework that generates fine-grained mixed-precision neural networks to optimize resource utilization in RRAM. In this framework, we design a two-stage learning and array-aware grouping strategy to quickly explore the huge search space. The experimental results show that the proposed framework achieves an 18.2% to 36.1% improvement in resource utilization and a 0.9% to 3.3% increase in model accuracy over prior coarse-grained quantization methods.
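To make the underutilization problem concrete, the sketch below estimates how much of the allocated crossbar area a single convolutional layer actually occupies at a given weight bit-width. It is not the RaQu framework or its mapping scheme. The function name, the 128x128 array size, the one-bit-per-cell assumption, and the mapping rule (unrolled kernel inputs on rows, output channels times bit slices on columns) are all illustrative assumptions chosen to show why bit-width choice interacts with array dimensions.

```python
import math


def crossbar_utilization(in_channels, out_channels, kernel_size, weight_bits,
                         array_rows=128, array_cols=128, bits_per_cell=1):
    """Fraction of allocated RRAM cells that hold weight bits.

    Hypothetical mapping (an assumption, not taken from the paper):
    each unrolled kernel input occupies one row, and each output
    channel occupies ceil(weight_bits / bits_per_cell) columns.
    """
    rows_needed = in_channels * kernel_size * kernel_size
    cells_per_weight = math.ceil(weight_bits / bits_per_cell)
    cols_needed = out_channels * cells_per_weight

    # Whole arrays must be allocated, so partially filled arrays waste cells.
    arrays = (math.ceil(rows_needed / array_rows)
              * math.ceil(cols_needed / array_cols))

    used_cells = rows_needed * cols_needed
    total_cells = arrays * array_rows * array_cols
    return used_cells / total_cells


# A 3x3 layer with 64 input and 64 output channels (illustrative values).
util_5bit = crossbar_utilization(64, 64, 3, weight_bits=5)
util_4bit = crossbar_utilization(64, 64, 3, weight_bits=4)
print(util_5bit, util_4bit)
```

Under these assumed parameters, the 5-bit layer spills into a third column of arrays that is only half filled, giving a utilization of 0.75, while the 4-bit layer fits two array columns exactly and reaches 0.9. Under this toy model, a bit-width search that accounts for array boundaries can improve utilization in a way that an array-agnostic search would miss.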
