Publication | Closed Access
SNAP: A 1.67–21.55TOPS/W Sparse Neural Acceleration Processor for Unstructured Sparse Deep Neural Network Inference in 16nm CMOS
Citations: 45
References: 0
Year: 2019
Venue: Unknown
Keywords: Deep Neural Networks, Engineering, Hardware Acceleration, High-performance Architecture, Sparse Neural Network, Computer Engineering, Computer Architecture, Computing Systems, Hardware Systems, Domain-specific Accelerator, Computer Science, Parallel Computing, Deep Learning, Manycore Processor, Performance Improvement, SNAP Test Chip
A Sparse Neural Acceleration Processor (SNAP) is designed to exploit unstructured sparsity in deep neural networks (DNNs). SNAP uses parallel associative search to discover input pairs, maintaining an average hardware utilization of 75%. SNAP's two-level partial-sum reduction eliminates access contention and cuts writeback traffic by 22×. Through diagonal and row configurations of its PE arrays, SNAP supports any CONV and FC layers. A 2.4mm² 16nm SNAP test chip is measured to achieve a peak effectual efficiency of 21.55TOPS/W (16b) at 0.55V and 260MHz for CONV layers with 10% weight and activation density. Operating on pruned ResNet-50, SNAP achieves 90.98fps at 0.80V and 480MHz, dissipating 348mW.
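The pair-discovery idea can be illustrated in software with a minimal sketch. It is not the chip's implementation: it assumes that nonzero weights and nonzero activations are each stored as a value list plus an index list, that a pair is valid when the two indices are equal, and that indices are unique within each list. None of this is stated in the abstract. The names `match_pairs` and `sparse_dot` are hypothetical, and a dictionary lookup stands in for the parallel comparisons that hardware would perform.

```python
def match_pairs(w_idx, a_idx):
    """Return (i, j) positions where w_idx[i] == a_idx[j].

    Assumption: indices are unique within each list. The dict plays the role
    of the associative lookup; hardware would compare entries in parallel.
    """
    pos = {c: j for j, c in enumerate(a_idx)}
    return [(i, pos[c]) for i, c in enumerate(w_idx) if c in pos]


def sparse_dot(w_val, w_idx, a_val, a_idx):
    """Multiply-accumulate over matched pairs only, skipping all zeros."""
    return sum(w_val[i] * a_val[j] for i, j in match_pairs(w_idx, a_idx))


# Illustrative data (made up): nonzero values and their indices.
w_val, w_idx = [2, -1, 4], [0, 3, 5]
a_val, a_idx = [7, 3, 1], [3, 4, 5]

pairs = match_pairs(w_idx, a_idx)
result = sparse_dot(w_val, w_idx, a_val, a_idx)
print(pairs)   # [(1, 0), (2, 2)]
print(result)  # -3
```

Only two of the three stored weights find a matching activation here, so only two multiplications are performed. Keeping the multipliers busy with such valid pairs is what the utilization figure in the abstract refers to.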