Publication | Open Access
Heterogeneous Systolic Array Architecture for Compact CNNs Hardware Accelerators
Citations: 34 | References: 26 | Year: 2021
Keywords: Engineering, Hardware Acceleration, High-performance Architecture, Hardware Algorithm, Naive Systolic Array, Computer Engineering, Computer Architecture, Domain-specific Accelerator, Parallel Programming, Computer Science, Depthwise Convolutional Layers, Parallel Computing, Deep Learning, FPGA Design, Systolic Array Accelerators
Compact convolutional neural networks have become an active research topic. However, we find that systolic array accelerators are extremely inefficient when running compact models, especially when processing the depthwise convolutional layers in these networks. To make systolic arrays more efficient for compact convolutional neural networks, we propose the heterogeneous systolic array (HeSA) architecture. It introduces heterogeneous processing elements that support multiple dataflow modes, which further exploit the data reuse opportunities in depthwise convolutional layers without changing the scale or structure of the naive systolic array. By increasing the utilization rate of the processing elements in the array, the HeSA improves performance, throughput, and energy efficiency compared to the standard baseline. In addition, we design a flexible buffer structure for communication between the computing array and the external buffer. Through flexible routing, the HeSA can scale to a large array design while maintaining a high processing element utilization rate and low communication costs. Based on our evaluation with typical workloads, the HeSA improves the utilization rate of the computing resources in depthwise convolutional layers by 4.5× to 11.2× and achieves a 1.6× to 3.1× total performance speedup compared to the standard systolic array architecture. In the large-scale array design, the HeSA reduces data traffic by 40% while maintaining the same performance as the scaling-out method. By improving on-chip data reuse and reducing data traffic, the HeSA saves over 20% in energy consumption. Meanwhile, the area of the HeSA is essentially unchanged compared to the baseline, owing to its simple design.
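To give a rough sense of why depthwise layers can leave a systolic array underused, the following is a minimal back-of-the-envelope sketch, not the evaluation method of the paper. It assumes a hypothetical weight-stationary mapping in which a convolution is lowered to a matrix multiply, with the reduction dimension K laid out along the array rows and the output channels N along the columns. The function names, the 16×16 array size, and the layer shapes are all illustrative assumptions.

```python
import math


def tile_utilization(k_dim, n_dim, rows, cols):
    """Fraction of PEs holding useful weights when a K x N weight matrix
    is tiled onto a rows x cols array (simplified, assumed model)."""
    tiles = math.ceil(k_dim / rows) * math.ceil(n_dim / cols)
    return (k_dim * n_dim) / (tiles * rows * cols)


def standard_conv_utilization(c_in, c_out, kernel, rows, cols):
    # Standard convolution: K = c_in * kernel^2, N = c_out.
    return tile_utilization(c_in * kernel * kernel, c_out, rows, cols)


def depthwise_conv_utilization(kernel, rows, cols):
    # Depthwise convolution: each channel is an independent problem with
    # K = kernel^2 and N = 1 (assumes one channel is mapped at a time).
    return tile_utilization(kernel * kernel, 1, rows, cols)


if __name__ == "__main__":
    rows = cols = 16  # hypothetical array size
    print(standard_conv_utilization(64, 64, 3, rows, cols))
    print(depthwise_conv_utilization(3, rows, cols))
```

Under these assumptions, a 3×3 standard convolution with 64 input and 64 output channels fills the array completely, while a 3×3 depthwise convolution occupies only 9 of the 256 processing elements. Real accelerators may map depthwise layers differently, so the figures only illustrate the direction of the effect described above.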