Publication | Closed Access
A Customizable Matrix Multiplication Framework for the Intel HARPv2 Xeon+FPGA Platform
Citations: 82
References: 25
Year: 2018
Unknown Venue
Hardware Security, Array Computing, Engineering, Hardware Acceleration, Hardware Algorithm, Computer Engineering, Computer Architecture, Systems Engineering, Matrix Multiplication, Domain-specific Accelerator, Parallel Programming, Computer Science, Reconfigurable Architecture, Parallel Computing, Deep Learning, FPGA Design, General Matrix
General matrix-to-matrix multiplication (GEMM) is the cornerstone of a wide range of applications in high-performance computing (HPC), scientific computing (SC) and, more recently, deep learning. In this work, we present a customizable matrix multiplication framework for the Intel HARPv2 CPU+FPGA platform that supports both traditional single-precision floating-point and reduced-precision workloads. Our framework supports GEMMs of arbitrary size and consists of two parts: (1) a simple application programming interface (API) for easy configuration and integration into existing software and (2) a highly customizable hardware template. The API provides both compile-time and runtime options for controlling key aspects of the hardware template, including dynamic precision switching, interleaving and block size control, and fused deep-learning-specific operations. The framework currently supports single-precision floating point (FP32); 16-, 8-, 4- and 2-bit integer and fixed point (INT16, INT8, INT4, INT2); and more exotic data types for deep learning workloads: INT16xTernary, INT8xTernary and BinaryxBinary.
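To make the ideas of block size control and mixed integer/ternary operands concrete, the following is a minimal software sketch in plain Python. It is not the framework's API or hardware template: the function names `blocked_gemm` and `ternarize`, the `block` parameter and the threshold rule are all hypothetical and chosen only for illustration.

```python
def blocked_gemm(a, b, block=2):
    """Compute C = A x B by iterating over block x block tiles.

    a is an m x k list of lists and b is k x n. Edge tiles are clipped
    with min(), so the dimensions need not be multiples of the block size.
    The name and signature are illustrative, not taken from the paper.
    """
    m, k, n = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, block):          # tile rows of C
        for j0 in range(0, n, block):      # tile columns of C
            for p0 in range(0, k, block):  # tiles along the shared dimension
                for i in range(i0, min(i0 + block, m)):
                    for j in range(j0, min(j0 + block, n)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + block, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


def ternarize(w, threshold):
    """Map each weight to -1, 0 or +1 using a simple threshold rule.

    The threshold rule is an assumption made for this example.
    """
    return [[(x > threshold) - (x < -threshold) for x in row] for row in w]


# Integer activations multiplied by ternary weights, in the spirit of
# an INT8xTernary workload.
activations = [[12, -3, 7], [0, 25, -8]]
weights = ternarize([[0.9, -0.1], [-0.7, 0.2], [0.6, -0.8]], threshold=0.5)
result = blocked_gemm(activations, weights, block=2)
```

Because ternary weights only take the values -1, 0 and +1, each multiply in the inner loop reduces to an add, a subtract or a skip, which is one reason such data types are attractive for hardware implementations.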