Publication | Closed Access
An FPGA-based processor for training convolutional neural networks
Citations: 40
References: 15
Year: 2017
Venue: Unknown
Xilinx ZU19EG FPGA, Convolutional Neural Network, Engineering, Machine Learning, Hardware Algorithm, Computer Architecture, Parallel Computing, FPGA-based Processor, Machine Vision, Computer Engineering, Computer Science, Deep Learning, Neural Architecture Search, FPGA Design, Computer Vision, FPGA-based Processor Design, Hardware Acceleration, Convolutional Neural Networks, Domain-specific Accelerator, Parallel Programming
Convolutional neural networks (CNNs) have achieved great success in various computer vision applications. However, training a CNN model is computation-intensive and time-consuming. Hence, training is mainly performed on large clusters of high-performance processors such as server CPUs and GPUs. In this paper, we propose an FPGA-based processor design to accelerate the training process of CNNs. We first analyze the operations in all types of CNN layers during training. Based on this analysis, we propose a uniform computation engine design that efficiently carries out all kinds of operations. We then present a scalable accelerator framework that further exploits parallelism by unrolling the loops at two levels. The proposed accelerator design is demonstrated by implementing a processor on the Xilinx ZU19EG FPGA running at 200 MHz. Evaluation results on a group of CNN models show that our processor is 5.7 to 10.7 times faster than software implementations on the Intel Core i5-4440 CPU (3.10 GHz).
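The abstract does not say which loops are unrolled or how the engine is organized, so the following is only a minimal sketch of what two-level loop unrolling can look like for a convolution layer. It assumes the two unrolled loops are the output-channel and input-channel loops; the function names and the tile sizes `t_out` and `t_in` are hypothetical and not taken from the paper. In hardware, each tile would map to a group of parallel multiply-accumulate units; here a vectorized NumPy operation stands in for that parallelism.

```python
import numpy as np


def conv_forward_naive(x, w):
    """Reference stride-1 'valid' convolution (cross-correlation).

    x: (C_in, H, W) input feature maps
    w: (C_out, C_in, K, K) filters
    returns: (C_out, H-K+1, W-K+1) output feature maps
    """
    c_in, h, wd = x.shape
    c_out, _, k, _ = w.shape
    oh, ow = h - k + 1, wd - k + 1
    out = np.zeros((c_out, oh, ow))
    for o in range(c_out):
        for i in range(c_in):
            for r in range(oh):
                for c in range(ow):
                    out[o, r, c] += np.sum(x[i, r:r + k, c:c + k] * w[o, i])
    return out


def conv_forward_tiled(x, w, t_out=4, t_in=2):
    """Same computation with the channel loops tiled at two levels.

    The outer loops step over tiles of t_out output channels and t_in
    input channels (hypothetical tile sizes). All multiply-accumulates
    inside one tile are issued together, modelling unrolled hardware.
    """
    c_in, h, wd = x.shape
    c_out, _, k, _ = w.shape
    oh, ow = h - k + 1, wd - k + 1
    out = np.zeros((c_out, oh, ow))
    for o0 in range(0, c_out, t_out):          # level 1: output-channel tiles
        for i0 in range(0, c_in, t_in):        # level 2: input-channel tiles
            w_tile = w[o0:o0 + t_out, i0:i0 + t_in]   # (<=t_out, <=t_in, K, K)
            for r in range(oh):
                for c in range(ow):
                    patch = x[i0:i0 + t_in, r:r + k, c:c + k]  # (<=t_in, K, K)
                    # One "cycle" of the unrolled engine: every output
                    # channel in the tile accumulates its partial sum.
                    out[o0:o0 + t_out, r, c] += np.tensordot(
                        w_tile, patch, axes=([1, 2, 3], [0, 1, 2])
                    )
    return out
```

Both functions compute the same result; the tiled version only reorders the work so that a fixed-size block of multiply-accumulates can run in parallel. The same loop structure could in principle be reused for the backward pass by swapping which arrays play the role of input and filter, though whether the paper's engine does so is not stated in the abstract.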