Publication | Closed Access
Diamond Tiling: Tiling Techniques to Maximize Parallelism for Stencil Computations
Year: 2016 | Citations: 68 | References: 39
Engineering, Compiler Technology, Computer Architecture, Parallel Implementation, Computer-aided Design, Array Computing, Concurrent Start, Parallel Computing, Compilers, Computational Geometry, Massively-parallel Computing, Parallelizing Compiler, Compiler Support, Computer Engineering, Diamond Tiling, Computer Science, Computational Science, Program Analysis, Parallel Processing, Parallel Programming, Most Stencil Computations, Data-level Parallelism
Most stencil computations allow tile-wise concurrent start, i.e., there always exists a face of the iteration space and a set of tiling directions such that all tiles along that face can be started concurrently. This provides load balance and maximizes parallelism. However, existing automatic tiling frameworks often choose hyperplanes that lead to pipelined start-up and load imbalance. We address this issue with a new tiling technique, called diamond tiling, that ensures concurrent start-up as well as perfect load balance whenever possible. We first provide necessary and sufficient conditions for a set of tiling hyperplanes to allow concurrent start for programs with affine data accesses. We then provide an approach to automatically find such hyperplanes. Experimental evaluation on a 12-core Intel Westmere shows that diamond-tiled code is able to outperform a tuned domain-specific stencil code generator by 10 to 40 percent, and previous compiler techniques by a factor of 1.3x to 10.1x.
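To make the difference between concurrent and pipelined start concrete, the following is a minimal brute-force sketch, not the paper's algorithm. It assumes a 1D three-point stencil over a small (time, space) grid, picks the hyperplanes (1, 1) and (1, -1) as an illustrative diamond pair and (1, 0) and (1, 1) as an illustrative skewed pair, and counts the tiles that have no incoming dependence from any other tile. All function names and the grid and tile sizes are hypothetical choices made for this illustration.

```python
from itertools import product

# Assumed example: dependence vectors (dt, di) of a 1D 3-point stencil,
# where A[t][i] reads A[t-1][i-1], A[t-1][i] and A[t-1][i+1].
DEPS = [(1, -1), (1, 0), (1, 1)]


def dot(h, v):
    """Dot product of two 2D integer vectors."""
    return h[0] * v[0] + h[1] * v[1]


def is_valid_tiling(hyperplanes, deps=DEPS):
    """Legality check: no dependence may point backwards along a hyperplane."""
    return all(dot(h, d) >= 0 for h in hyperplanes for d in deps)


def tile_of(point, hyperplanes, size):
    """Tile coordinates of a point: floor of each hyperplane value / tile size."""
    return tuple(dot(h, point) // size for h in hyperplanes)


def initially_ready_tiles(hyperplanes, T, N, size):
    """Return the tiles that depend on no other tile (brute force over all points)."""
    tiles, blocked = set(), set()
    for t, i in product(range(T), range(N)):
        dst = tile_of((t, i), hyperplanes, size)
        tiles.add(dst)
        for dt, di in DEPS:
            src = (t - dt, i - di)
            inside = 0 <= src[0] < T and 0 <= src[1] < N
            if inside and tile_of(src, hyperplanes, size) != dst:
                blocked.add(dst)  # dst needs data produced by another tile
    return tiles - blocked


DIAMOND = ((1, 1), (1, -1))  # illustrative diamond hyperplanes
SKEWED = ((1, 0), (1, 1))    # illustrative time-skewed hyperplanes

if __name__ == "__main__":
    for name, hs in (("diamond", DIAMOND), ("skewed", SKEWED)):
        ready = initially_ready_tiles(hs, T=8, N=32, size=4)
        print(name, "valid:", is_valid_tiling(hs), "ready tiles:", len(ready))
```

On this small grid both hyperplane pairs pass the legality check, but the diamond pair leaves one ready tile per block along the t = 0 face, whereas the skewed pair leaves a single ready tile, so the remaining tiles can only start in pipelined fashion.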