Publication | Closed Access
LANCE: Efficient Low-Precision Quantized Winograd Convolution for Neural Networks Based on Graphics Processing Units
Citations: 15
References: 23
Year: 2020
Unknown Venue
Convolutional Neural Network, Image Analysis, Machine Learning, Data Science, Engineering, Hardware Acceleration, Sparse Neural Network, Full-precision Convolution, Efficient Low-precision, Quantization Techniques, Computer Science, Neural Networks, Deep Learning, Winograd Convolution, Linear Quantization Operations, Quantization (Signal Processing), Model Compression, Computer Vision
Accelerating deep convolutional neural networks has become an active research topic and has attracted interest in both academia and industry. In this paper, we propose an efficient low-precision quantized Winograd convolution algorithm, called LANCE, which combines the advantages of fast convolution and quantization techniques. By embedding linear quantization operations into the Winograd domain, the fast convolution can be performed efficiently under low-precision computation on graphics processing units. We test neural network models with LANCE on representative image classification datasets, including SVHN, CIFAR, and ImageNet. The experimental results show that our 8-bit quantized Winograd convolution improves performance by up to 2.40× over full-precision convolution with trivial accuracy loss.
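The idea of quantizing inside the Winograd domain can be illustrated with a small CPU-side sketch. The code below is not the LANCE implementation. It assumes the standard one-dimensional F(2,3) Winograd transform matrices and a symmetric per-tensor linear quantizer with a single scale, and every function name (`quantize`, `quantized_winograd_f23`, and so on) is hypothetical. The filter and the input tile are first transformed, the transformed values are quantized to 8-bit integers, the element-wise products are taken on integers, and the result is rescaled before the output transform.

```python
# Illustrative sketch only: 1-D Winograd F(2,3) with linear quantization
# applied to the transformed (Winograd-domain) operands. The quantizer
# (symmetric, one scale per tensor) is an assumption, not the paper's scheme.

# Standard F(2,3) transform matrices.
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]  # input
G = [[1.0, 0.0, 0.0], [0.5, 0.5, 0.5], [0.5, -0.5, 0.5], [0.0, 0.0, 1.0]]  # filter
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]  # output


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def quantize(x, bits=8):
    """Symmetric linear quantization: returns integer codes and the scale."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(v) for v in x)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in x]
    return q, scale


def direct_conv(d, g):
    """Reference: two outputs of a 3-tap filter over a 4-element tile."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


def winograd_f23(d, g):
    """Full-precision Winograd F(2,3): y = A^T [(G g) * (B^T d)]."""
    u = matvec(G, g)
    v = matvec(BT, d)
    return matvec(AT, [a * b for a, b in zip(u, v)])


def quantized_winograd_f23(d, g, bits=8):
    """Same as above, but the element-wise product uses integer codes."""
    u_q, scale_u = quantize(matvec(G, g), bits)
    v_q, scale_v = quantize(matvec(BT, d), bits)
    m_int = [a * b for a, b in zip(u_q, v_q)]      # low-precision multiply
    m = [p * scale_u * scale_v for p in m_int]     # dequantize
    return matvec(AT, m)


if __name__ == "__main__":
    tile = [1.0, 2.0, 3.0, 4.0]
    taps = [0.5, -1.0, 0.25]
    print("direct:   ", direct_conv(tile, taps))
    print("winograd: ", winograd_f23(tile, taps))
    print("quantized:", quantized_winograd_f23(tile, taps))
```

In this sketch the full-precision Winograd result matches direct convolution up to floating-point rounding, while the quantized variant differs by a small error. On a GPU the integer products would be the part mapped to low-precision instructions. That mapping is outside the scope of this sketch.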