Publication | Closed Access
Deep Neural Network Acceleration Based on Low-Rank Approximated Channel Pruning
Citations: 43
References: 44
Year: 2020
Decay Curve, Convolutional Neural Network, Image Analysis, Machine Learning, Engineering, Hardware Acceleration, Sparse Neural Network, Computer Engineering, Idc Evaluator Prove, Neural Architecture Search, Computer Science, Video Transformer, Deep Learning, Low-rank Approximated Channel, Channel Pruning, Model Compression, Computer Vision
Accelerating and compressing deep Convolutional Neural Networks (CNNs) has become critical for deploying intelligence on resource-constrained devices. Networks produced by existing channel pruning methods can be easily deployed and accelerated without specialized hardware or software. However, weight-level redundancy is not well explored in channel pruning, which results in a relatively low compression ratio. In this work, we propose a Low-rank Approximated channel Pruning (LAP) framework to tackle this problem with two targeted steps. First, we use low-rank approximation to eliminate the redundancy within filters. This step achieves acceleration, especially in shallow layers, and also converts filters into smaller, more compact ones. Then, we apply channel pruning to the approximated network in a global way and obtain further benefits, especially in deep layers. In addition, we propose a spectral-norm-based indicator to better coordinate these two steps. Moreover, inspired by the integral idea adopted in video coding, we propose an evaluator based on the Integral of Decay Curve (IDC) to judge the efficiency of various acceleration and compression algorithms. Ablation experiments and the IDC evaluator show that LAP can significantly improve channel pruning. To further demonstrate hardware compatibility, the network produced by LAP obtains impressive speedup efficiency on an FPGA.
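The abstract does not give the formulas behind the two steps or the IDC evaluator, so the following is only a minimal NumPy sketch of the general ideas, not the authors' method. It assumes (1) low-rank approximation is done by truncated SVD on the flattened filter matrix, (2) channels are ranked within a single layer by the spectral norm of each filter slice (the paper prunes globally and uses its spectral-norm indicator to coordinate the two steps, which may work differently), and (3) the IDC is the trapezoidal area under an accuracy-versus-compression-ratio curve. All function names are hypothetical.

```python
import numpy as np


def low_rank_factorize(weight, rank):
    """Split a conv weight (out_c, in_c, kh, kw) into two smaller factors.

    Assumption: truncated SVD on the (out_c, in_c*kh*kw) matrix. The result
    is `rank` k x k filters followed by a 1x1 mixing matrix (out_c, rank).
    """
    out_c, in_c, kh, kw = weight.shape
    mat = weight.reshape(out_c, in_c * kh * kw)
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    first = (np.diag(s[:rank]) @ vt[:rank]).reshape(rank, in_c, kh, kw)
    second = u[:, :rank]
    return first, second


def reconstruct(first, second):
    """Multiply the two factors back into a full-size conv weight."""
    rank, in_c, kh, kw = first.shape
    full = second @ first.reshape(rank, -1)
    return full.reshape(second.shape[0], in_c, kh, kw)


def channel_scores(weight):
    """Score each output channel by the spectral norm of its filter slice.

    Assumption: each filter is viewed as an (in_c, kh*kw) matrix.
    """
    out_c, in_c, kh, kw = weight.shape
    return np.array(
        [np.linalg.norm(weight[i].reshape(in_c, kh * kw), 2) for i in range(out_c)]
    )


def prune_channels(weight, keep):
    """Keep the `keep` highest-scoring output channels of one layer."""
    scores = channel_scores(weight)
    kept = np.sort(np.argsort(scores)[-keep:])
    return weight[kept], kept


def integral_of_curve(ratios, accuracies):
    """Trapezoidal area under an accuracy-vs-compression-ratio curve."""
    r = np.asarray(ratios, dtype=float)
    a = np.asarray(accuracies, dtype=float)
    return float(np.sum((r[1:] - r[:-1]) * (a[1:] + a[:-1]) / 2.0))
```

Under these assumptions, a method whose accuracy decays more slowly as compression increases yields a larger integral, which is one way a single number could compare compression algorithms across many operating points rather than at one chosen ratio.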