Publication | Closed Access
Towards an optimal bit-reversal permutation program
Citations: 25
References: 9
Year: 2002
Unknown Venue
Mathematical Programming, Engineering, Advanced Computing, Computer Architecture, Iterative Decoding, Memory Traffic, Computational Complexity, Array Computing, Approximate Computing, Parallel Computing, Combinatorial Optimization, Sorting Algorithm, Computer Engineering, Computer Science, Algorithmic Development, Cryptography, Array Transpose, External-memory Algorithm, Hardware Acceleration, Program Analysis, Parallel Programming, Lower Bounds
The speed of many computations is limited not by the number of arithmetic operations but by the time it takes to move and rearrange data in the increasingly complicated memory hierarchies of modern computers. Array transpose and the bit-reversal permutation, both trivial operations on a RAM, present non-trivial problems when designing highly tuned scientific library functions, particularly for the Fast Fourier Transform. We prove a precise bound for RoCol, a simple pebble-type game that is relevant to implementing these permutations. We use RoCol to give lower bounds on the amount of memory traffic in a computer with four levels of memory (registers, cache, TLB, and memory), taking into account such "messy" features as block moves and set-associative caches. The insights from this analysis lead to a bit-reversal algorithm whose performance is close to the theoretical minimum. Experiments show that it performs significantly better than every program in a comprehensive study of 30 published algorithms.
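For readers unfamiliar with the permutation the abstract refers to, the following is a minimal sketch of the textbook bit-reversal permutation on an array whose length is a power of two. It is not the cache-aware algorithm of the paper, which the abstract does not describe; the function names `bit_reverse` and `bit_reversal_permute` are hypothetical and chosen only for illustration.

```python
def bit_reverse(i: int, bits: int) -> int:
    """Return the value obtained by reversing the low `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)  # shift the lowest bit of i onto r
        i >>= 1
    return r


def bit_reversal_permute(a: list) -> list:
    """Naive bit-reversal permutation (illustrative only, not the paper's method).

    The element at index i ends up at the index whose binary
    representation is that of i reversed.
    """
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1  # log2(n)
    out = list(a)
    for i in range(n):
        j = bit_reverse(i, bits)
        if i < j:  # swap each pair exactly once
            out[i], out[j] = out[j], out[i]
    return out
```

For an 8-element input, indices 1 (001) and 4 (100) trade places, as do 3 (011) and 6 (110). The access pattern in `out[j]` strides across the whole array, which is the kind of memory traffic the abstract identifies as the real cost on hierarchical memories.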