Publication | Closed Access
Mixed Precision Neural Architecture Search for Energy Efficient Deep Learning
Citations: 61 · References: 34 · Year: 2019 · Venue: Unknown
Keywords: Artificial Intelligence, High Accuracy, Convolutional Neural Network, Engineering, Machine Learning, Data Science, Model Compression, Edge Computing, Sparse Neural Network, Computer Engineering, Computer Architecture, Embedded Machine Learning, Computer Science, Deep Learning, Neural Architecture Search, Neural Network Structures, Quantization Policies
Large-scale deep neural networks (DNNs) have achieved remarkable success in various artificial intelligence applications. However, the high computational complexity and energy cost of DNNs impede their deployment on edge devices with limited energy budgets. Two major approaches have been investigated for learning compact, energy-efficient DNNs. Neural architecture search (NAS) automates the design of neural network structures to achieve both high accuracy and energy efficiency. The other, model quantization, leverages low-precision representation and arithmetic to trade accuracy for efficiency. Although NAS and quantization are both critical components of DNN design closure, little research has considered them jointly. In this paper, we propose a new methodology for end-to-end joint optimization over the neural architecture and quantization spaces. Our approach searches for the optimal combination of architecture and precision (bit-width) to directly optimize both prediction accuracy and hardware energy consumption. Our framework improves and automates the flow from neural architecture design to hardware deployment. Experimental results demonstrate that our approach achieves better energy efficiency than advanced quantization approaches and efficiency-aware NAS methods on CIFAR-100 and ImageNet. We also study different search and quantization policies and offer insights for both neural architecture and hardware design.
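To make the joint search idea concrete, the sketch below enumerates a tiny toy space of per-layer architecture choices (kernel sizes) and quantization bit-widths, scoring each pair by accuracy minus a weighted energy penalty. This is a minimal illustration of the accuracy/energy trade-off described in the abstract, not the paper's method: the candidate values, proxy functions, and the exhaustive search are all stand-in assumptions (the actual framework uses a learned search over real architectures and measured energy).

```python
import itertools

# Hypothetical per-layer candidate spaces: kernel sizes (architecture
# choices) and bit-widths (quantization choices). Values are illustrative.
KERNELS = [3, 5, 7]
BITS = [2, 4, 8]

def proxy_accuracy(kernel, bits):
    # Toy proxy: larger kernels and higher precision help accuracy.
    return 0.6 + 0.02 * kernel + 0.03 * bits

def energy_cost(kernel, bits):
    # Toy proxy: energy grows with kernel area and arithmetic precision.
    return kernel * kernel * bits

def joint_score(kernel, bits, lam=1e-3):
    # Joint objective: accuracy minus a weighted energy penalty, mirroring
    # the accuracy/energy trade-off a joint NAS+quantization search targets.
    return proxy_accuracy(kernel, bits) - lam * energy_cost(kernel, bits)

# Exhaustive search over the tiny joint space; a real framework would use
# a learned controller or differentiable relaxation instead.
best = max(itertools.product(KERNELS, BITS), key=lambda kb: joint_score(*kb))
print(best)  # the (kernel, bits) pair with the best joint score
```

Note that the winner is not simply the smallest or largest configuration: the energy penalty pushes the search toward a small kernel while the accuracy term still favors higher precision, which is exactly the kind of non-obvious combination a joint search is meant to discover.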