Running sparse and low-precision neural network: When algorithm meets hardware - Concepedia

Concepedia

Abstract

Deep Neural Networks (DNNs) are pervasively applied in many artificial intelligence (AI) applications. The high performance of DNNs comes at the cost of larger size and higher compute complexity. Recent studies show that DNNs have much redundancy, such as the zero-value parameters and excessive numerical precision. To reduce computing complexity, many redundancy reduction techniques have been proposed, including pruning and data quantization. In this paper, we demonstrate our co-optimization of the DNN algorithm and hardware which exploits the model redundancy to accelerate DNNs.

References

	Year	Citations

Page 1