Concepedia

Abstract

Training convolutional neural network on device has become essential where it allows applications to consider user's individual environment. Meanwhile, the weight update operation from the training process is the primary factor of high energy consumption due to its substantial memory accesses. We propose a dedicated weight update architecture with two key features: (1) a specialized local buffer for the DRAM access deduction (2) a novel dataflow and its suitable processing element array structure for weight gradient computation to optimize the energy consumed by internal memories. Our scheme achieves 14.3%-30.2% total energy reduction by drastically eliminating the memory accesses.

References

YearCitations

Page 1