Publication | Closed Access
Pushing the Limits of Narrow Precision Inferencing at Cloud Scale with Microsoft Floating Point
Citations: 54
References: 0
Year: 2020
Engineering, Advanced Computing, Verification, Hardware Algorithm, Computer Architecture, Accuracy And Precision, Microsoft Floating Point, Custom Hardware, Hardware Architecture, Hardware Security, Narrow Precision Inferencing, Data Science, Uncertainty Quantification, Approximate Computing, Calibration, High-performance Architecture, Hardware Design, Parallel Computing, Data Management, Computer Engineering, Computer Science, Cloud Scale, Hardware Acceleration, Cloud Computing, Parallel Programming, Big Data
In this paper, we explore the limits of Microsoft Floating Point (MSFP), a new class of datatypes developed for production cloud-scale inferencing on custom hardware. Through the co-evolution of hardware design and algorithms, MSFP achieves accuracy comparable to or better than the industry-standard Bfloat16 and INT8 formats at 3x and 4x lower cost, respectively. MSFP incurs negligible impact on accuracy (