Publication | Closed Access

A high-performance and energy-efficient architecture for floating-point based LU decomposition on FPGAs

Citations: 47

References: 3

Year: 2004

Abstract

Summary form only given. We first develop a novel architecture for fixed-point LU decomposition of streaming input matrices on FPGAs. Our architecture, based on a circular linear array, achieves the minimal latency and is resource-efficient. We then extend it, using a stacked-matrices approach, to a floating-point architecture that achieves the minimal effective latency. Our design objective was to develop high-throughput, energy-efficient architectures for applications that require LU decomposition. We analyze (1) the impact of high-throughput, pipelined floating-point units (with different pipeline depths and different performance) on the architecture's performance, and (2) the impact of algorithm-level design on system-wide energy dissipation. We analyze energy dissipation by capturing algorithmic and architectural details of the target FPGA device. We compare our architecture with a state-of-the-art architecture implemented on FPGAs with respect to latency, area, and energy. Our designs achieve a 10%-60% reduction in energy relative to the state-of-the-art architecture.
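
For readers unfamiliar with the kernel itself, the following is a minimal software sketch of LU decomposition, not the circular linear array architecture described in the abstract. It assumes the Doolittle form (unit diagonal in L) with no pivoting; the abstract does not say which variant or pivoting strategy the hardware uses. The names `lu_decompose` and `matmul` are hypothetical helpers introduced only for this illustration.

```python
def lu_decompose(a):
    """Factor a square matrix a into (l, u) such that l * u == a.

    Assumption: Doolittle form without pivoting, so every pivot
    must be nonzero. l is unit lower triangular, u is upper triangular.
    """
    n = len(a)
    u = [[float(x) for x in row] for row in a]  # working copy, becomes U
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    for k in range(n):
        pivot = u[k][k]
        if pivot == 0.0:
            raise ZeroDivisionError("zero pivot; this sketch does no pivoting")
        for i in range(k + 1, n):
            # Multiplier that eliminates entry (i, k); it is stored in L.
            factor = u[i][k] / pivot
            l[i][k] = factor
            u[i][k] = 0.0
            # Update the rest of row i in the trailing submatrix.
            for j in range(k + 1, n):
                u[i][j] -= factor * u[k][j]
    return l, u


def matmul(x, y):
    """Plain matrix product, used here only to check the factorization."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


if __name__ == "__main__":
    l, u = lu_decompose([[4, 3], [6, 3]])
    print(l)
    print(u)
    print(matmul(l, u))
```

Each iteration of the outer loop performs one division per row followed by a multiply-subtract per trailing element. In a floating-point hardware design these operations would run on pipelined units, which is presumably why the abstract studies the effect of pipeline depth on overall performance.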
