Closing the Gap: CPU and FPGA Trends in Sustainable Floating-Point BLAS Performance

Year: 2004
Citations: 149
References: 14

TLDR

Field‑programmable gate arrays have historically been limited by floating‑point support, but Moore’s law has increased their density, raising the question of how much peak floating‑point performance can be sustained. This paper investigates the performance of three core BLAS routines—vector dot product, matrix‑vector multiplication, and matrix multiplication—on different hardware platforms. The authors compare microprocessors, FPGAs, and reconfigurable computing platforms on these operations and analyze trends over the past six years to project performance for the next six years. The analysis shows that sustaining FPGA peak performance requires sufficient memory bandwidth and internal storage capacity.

Abstract

Field programmable gate arrays (FPGAs) have long been an attractive alternative to microprocessors for computing tasks, as long as floating-point arithmetic is not required. Fueled by the advance of Moore's law, FPGAs are rapidly reaching sufficient densities to enhance peak floating-point performance as well. The question, however, is how much of this peak performance can be sustained. This paper examines three of the Basic Linear Algebra Subprograms (BLAS) functions: vector dot product, matrix-vector multiply, and matrix multiply. A comparison of microprocessors, FPGAs, and reconfigurable computing platforms is performed for each operation. The analysis highlights the amount of memory bandwidth and internal storage needed to sustain peak performance with FPGAs. This analysis considers the historical context of the last six years and is extrapolated for the next six years.
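
To make the bandwidth and storage question concrete, the following is a minimal sketch of the three operations named in the abstract, together with a textbook count of floating-point operations per input word. It is not taken from the paper: the function names (`dot`, `matvec`, `matmul`, `flops_per_word`) are hypothetical, the count assumes one multiply and one add per element pair, and it assumes every input word is fetched from external memory exactly once, which for matrix multiply is only possible if enough on-chip storage is available to reuse operands.

```python
def dot(x, y):
    """Vector dot product: sum of elementwise products."""
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    """Matrix-vector multiply: one dot product per row of A."""
    return [dot(row, x) for row in A]


def matmul(A, B):
    """Matrix multiply: one dot product per (row of A, column of B) pair."""
    cols = list(zip(*B))  # columns of B
    return [[dot(row, col) for col in cols] for row in A]


def flops_per_word(n):
    """Flops per input word for size-n operands (n-vectors, n x n matrices).

    Assumption (not from the paper): 2 flops per multiply-add, and each
    input word is read from external memory exactly once.
    """
    return {
        "dot": (2 * n) / (2 * n),            # 2n flops, 2n input words
        "matvec": (2 * n * n) / (n * n + n),  # 2n^2 flops, n^2 + n input words
        "matmul": (2 * n ** 3) / (2 * n * n),  # 2n^3 flops, 2n^2 input words
    }


if __name__ == "__main__":
    for n in (4, 64, 1024):
        print(n, flops_per_word(n))
```

Under these assumptions the dot product stays at one flop per word and matrix-vector multiply approaches two, so both are limited by memory bandwidth however many floating-point units are available. Only matrix multiply offers reuse that grows with n, and exploiting it depends on internal storage.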
