Publication | Closed Access

Towards an optimal bit-reversal permutation program

Citations: 25

References: 9

Year: 2002

Abstract

The speed of many computations is limited not by the number of arithmetic operations but by the time it takes to move and rearrange data in the increasingly complicated memory hierarchies of modern computers. Array transpose and the bit-reversal permutation (trivial operations on a RAM) present non-trivial problems when designing highly tuned scientific library functions, particularly for the Fast Fourier Transform. We prove a precise bound for RoCol, a simple pebble-type game that is relevant to implementing these permutations. We use RoCol to give lower bounds on the amount of memory traffic in a computer with four levels of memory (registers, cache, TLB, and memory), taking into account such "messy" features as block moves and set-associative caches. The insights from this analysis lead to a bit-reversal algorithm whose performance is close to the theoretical minimum. Experiments show that it performs significantly better than every program in a comprehensive study of 30 published algorithms.
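
For readers unfamiliar with the operation itself, the following is a minimal sketch of the textbook bit-reversal permutation: the element at index i moves to the index obtained by reversing the bits of i. It is not the memory-hierarchy-aware algorithm the abstract describes, and it makes no attempt to reduce cache or TLB traffic. The function names `bit_reverse` and `bit_reversal_permutation` are illustrative assumptions, not names taken from the paper.

```python
def bit_reverse(i: int, bits: int) -> int:
    """Return i with its lowest `bits` bits in reversed order."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)  # shift the lowest bit of i onto r
        i >>= 1
    return r


def bit_reversal_permutation(data: list) -> list:
    """Return a copy of data with element i moved to index bit_reverse(i).

    Naive version: one scattered write per element, which is exactly the
    access pattern that behaves poorly on deep memory hierarchies.
    """
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1  # log2(n) for a power of two
    out = [None] * n
    for i, x in enumerate(data):
        out[bit_reverse(i, bits)] = x
    return out


if __name__ == "__main__":
    print(bit_reversal_permutation(list(range(8))))
```

For an input of length 8, indices 1 (001) and 4 (100) swap, as do 3 (011) and 6 (110), while palindromic indices such as 0, 2, 5, and 7 stay in place. Applying the permutation twice restores the original order.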
