Concepedia

Abstract

The three dimensional Fast Fourier Transform (3D FFT) is widely applied in various scientific applications. Distributed 3D FFTs require global communication: this becomes a serious concern when strong scaling is required as in long timescale molecular dynamics simulations. In this paper, we propose a parameterized 3D FFT design that targets at a 3D-torus FPGA-based network of various sizes. Characteristics include direct FPGA-FPGA communication links, support for various internal switch designs, and use of table-based routing which saves chip area and routing cycles. We find that even assuming extremely conservative parameters, we are able to run the 16 <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">3</sup> FFT in 3.9μs, 32 <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">3</sup> FFT in 5.46μs, 64 <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">3</sup> FFT in 9.52μs, and 128 <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">3</sup> FFT in 25.72μs. These results indicate that clusters based on commodity FPGAs are likely to be appropriate when strong scaling is needed in applications limited by the 3D FFT.

References

YearCitations

Page 1