Publication | Closed Access
RaQu: An automatic high-utilization CNN quantization and mapping framework for general-purpose RRAM Accelerator
Citations: 36
References: 21
Year: 2020
Venue: Unknown
Keywords: Engineering, Machine Learning, Hardware Algorithm, Computer Architecture, Data Science, Embedded Machine Learning, Internet of Things, Mapping Framework, Resource Utilization, Computer Engineering, Computer Science, Deep Learning, Neural Architecture Search, Computer Vision, Hardware Acceleration, Edge Computing, Degraded Resource Utilization, General-purpose RRAM Accelerator, Convolutional Neural Networks, Domain-specific Accelerator
Convolutional neural networks (CNNs) have become the state-of-the-art technique for many classification tasks in IoT systems. However, low-power, area-constrained edge devices cannot afford the high cost of CNNs. Resistive random access memory (RRAM) is attractive for building CNN accelerators at the edge owing to its scalability, low power consumption, and in-situ dot-product computation. However, mapping an arbitrary network architecture onto a general-purpose RRAM accelerator suffers from severe resource underutilization. Neural network quantization offers an opportunity to recover the degraded resource utilization, but selecting the bit-width for the vast number of parameters by hand is impractical. This paper proposes an AutoML-based, array-aware quantization and mapping framework that generates fine-grained mixed-precision neural networks to optimize resource utilization in RRAM. In this framework, we design a two-stage learning and array-aware grouping strategy to quickly explore the large search space. The experimental results show that the proposed framework achieves an 18.2% to 36.1% improvement in resource utilization and a 0.9% to 3.3% increase in model accuracy over prior coarse-grained quantization methods.
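To illustrate why bit-width selection can affect array utilization, the following is a minimal sketch of a simplified mapping model. It is not the RaQu framework itself. The function name `crossbar_utilization`, the 128x128 array size, the 1-bit cell, and the layout (one row per input weight position, one group of columns per output channel) are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def crossbar_utilization(c_in, c_out, kernel, weight_bits,
                         array_rows=128, array_cols=128, cell_bits=1):
    """Estimate cell utilization for one conv layer (hypothetical model).

    Assumed layout: the unrolled weight matrix has c_in * kernel * kernel
    rows, and each output channel occupies ceil(weight_bits / cell_bits)
    adjacent columns. Arrays are allocated whole, so partly filled arrays
    count in full toward the total.
    """
    rows = c_in * kernel * kernel
    cols = c_out * math.ceil(weight_bits / cell_bits)
    n_arrays = math.ceil(rows / array_rows) * math.ceil(cols / array_cols)
    used_cells = rows * cols
    total_cells = n_arrays * array_rows * array_cols
    return used_cells / total_cells, n_arrays


# Same layer at two weight precisions: fewer bits can free a whole array.
for bits in (8, 6):
    util, n_arrays = crossbar_utilization(c_in=16, c_out=20, kernel=3,
                                          weight_bits=bits)
    print(f"{bits}-bit weights: {n_arrays} arrays, utilization {util:.3f}")
```

Under these assumptions, dropping a layer from 8-bit to 6-bit weights lets its columns fit within one array width instead of spilling into a second, which raises the fraction of allocated cells that hold weights. A search over per-layer or per-group bit-widths, as the abstract describes, would evaluate a much richer version of this trade-off together with model accuracy.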