Sparse, Quantized, Full Frame CNN for Low Power Embedded Devices

Abstract

This paper presents methods to reduce the complexity of convolutional neural networks (CNNs). These include: (1) a method to quickly and easily sparsify a given network; (2) fine-tuning of the sparse network to recover the lost accuracy; (3) quantization of the network so that it can be implemented efficiently using 8-bit fixed-point multiplications; and (4) the design of an inference engine that takes advantage of the sparsity. These techniques were applied to full-frame semantic segmentation, and the degradation due to sparsity and quantization was found to be negligible. We show by analysis that the complexity reduction achieved is significant. Results of an implementation on the Texas Instruments TDA2x SoC [17] are presented. We have modified the Caffe CNN framework to perform the sparse, quantized training described in this paper. The source code for the training is available at https://github.com/tidsp/caffe-jacinto.
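As a rough illustration of steps (1), (3) and (4), the following minimal Python sketch zeroes the smallest-magnitude weights, maps the survivors to 8-bit integers, and skips zero weights during a dot product. The abstract does not state the sparsification criterion or the quantization scheme, so magnitude-based pruning and symmetric max-abs scaling are assumptions here, and the function names sparsify, quantize_int8 and sparse_dot are hypothetical and do not come from the caffe-jacinto code. The fine-tuning of step (2) is not shown.

```python
def sparsify(weights, target_sparsity):
    """Zero the smallest-magnitude weights (assumed criterion, not from the paper)."""
    n_zero = int(len(weights) * target_sparsity)
    if n_zero == 0:
        return list(weights)
    # Indices sorted by increasing magnitude; the first n_zero are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_zero])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(values):
    """Symmetric 8-bit quantization with a max-abs scale (assumed scheme)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the signed 8-bit range.
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale


def sparse_dot(q_weights, q_inputs):
    """Integer dot product that skips multiplications by zero weights."""
    acc = 0
    for w, x in zip(q_weights, q_inputs):
        if w != 0:  # a sparsity-aware engine avoids this multiply entirely
            acc += w * x
    return acc


weights = [0.5, -0.01, 0.02, -0.8, 0.003, 0.25]
sparse_weights = sparsify(weights, target_sparsity=0.5)
q_weights, w_scale = quantize_int8(sparse_weights)
result = sparse_dot(q_weights, [1, 2, 3, 4, 5, 6])
```

In this toy case half of the multiplications are skipped, and the integer result matches what a dense dot product over the same quantized weights would give. In a real inference engine the zero positions would typically be known ahead of time, so the skipped work need not be tested for at run time.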
