Accelerating Matrix Multiplication in Deep Learning by Using Low-Rank Approximation

Abstract

Open-source deep learning frameworks such as TensorFlow, Caffe, and Torch are widely used around the world, so accelerating them is of great practical value. In these frameworks, much of the computation time is spent on convolution, and highly tuned libraries such as cuDNN play an important role in accelerating it. These libraries, however, perform the convolution computation without approximating the dense matrices involved. In this research, we propose introducing low-rank approximation, a method widely used in scientific and technical computing, into the convolution computation. We investigated its influence on the recognition accuracy of an existing model and found that the rank of the data matrices can be reduced by up to about 90% while keeping recognition accuracy within 2% of the baseline.
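As a rough illustration of the idea, the sketch below replaces a dense data matrix with a truncated factorization before multiplying it by a weight matrix. The abstract does not say which low-rank method is used, so the truncated SVD here is an assumption, as are the helper names `truncated_svd` and `low_rank_matmul`, the matrix shapes, and the synthetic data. It shows only how the approximation enters the product, not a tuned or faster implementation.

```python
import numpy as np


def truncated_svd(a, rank):
    """Return the leading `rank` singular triplets of `a` (assumed method: SVD)."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    return u[:, :rank], s[:rank], vt[:rank, :]


def low_rank_matmul(w, x, rank):
    """Approximate w @ x by factoring x ~= u @ diag(s) @ vt with `rank` terms."""
    u, s, vt = truncated_svd(x, rank)
    # (w @ u) has shape (m, rank); multiplying by s scales its columns.
    return ((w @ u) * s) @ vt


rng = np.random.default_rng(0)

# Hypothetical data matrix (e.g. unfolded input patches) with exact rank 8.
x = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 256))
# Hypothetical weight matrix (e.g. flattened filters).
w = rng.standard_normal((32, 64))

exact = w @ x


def relative_error(rank):
    approx = low_rank_matmul(w, x, rank)
    return np.linalg.norm(exact - approx) / np.linalg.norm(exact)


err_rank8 = relative_error(8)  # rank matches the data: error near round-off
err_rank4 = relative_error(4)  # rank too small: visible approximation error
print(f"rank 8: {err_rank8:.2e}, rank 4: {err_rank4:.2e}")
```

With an m x n weight matrix and an n x p data matrix, the factored product costs roughly rank * (m * n + m * p) multiply-adds instead of m * n * p, which is where a saving could come from when the rank is small. Computing a full SVD, as done here for clarity, costs more than the multiplication itself, so a practical version would need a cheaper factorization.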
