Publication | Closed Access
Automated transformation for performance-critical kernels
Citations: 17
References: 23
Year: 2007
Unknown Venue
Engineering, Compiler Technology, Computer Architecture, Software Analysis, Native Kernels, Compute Kernel, Data Science, Tuned Kernels, Parallel Computing, Compilers, Compiler Support, Computer Engineering, Performance-critical Kernels, Computer Science, Program Optimization, Performance Analysis Tool, Optimizing Compiler, Auto-tuning, Computational Science, Program Analysis, Parallel Programming, Key Computational Kernels, System Software
The performance of many scientific applications depends on a small number of key computational kernels that require a level of efficiency rarely delivered by existing native compilers. We present a new approach to high-performance kernel optimization in which a general-purpose transformation engine automates the production of highly efficient library routines. The generated routines are then empirically tested until an implementation with a satisfactory performance level is found. Our framework takes an annotated kernel specification as input and automatically produces optimized implementations based on tuning parameters controlled by a search driver. The transformation engine includes an extensive suite of optimizations, which can be easily extended using a custom transformation language. We have applied our framework to generate code for key linear algebra kernels and have achieved performance comparable to that of ATLAS's highly tuned kernels. In several cases, our kernels were faster than ATLAS's native kernels; we have made these kernels available to ATLAS, and we show that this yields speedups for the ATLAS library.
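The empirical search loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's framework: the kernel (a tiled matrix multiply in pure Python), the single tuning parameter `tile`, and the names `make_blocked_matmul`, `time_kernel`, and `search` are all hypothetical and chosen only for illustration. The sketch assumes the search driver simply times each candidate and stops once a satisfactory time is reached.

```python
import time


def make_blocked_matmul(tile):
    """Return a matmul variant specialized for one tile size (the tuning parameter)."""
    def matmul(a, b):
        n, k, m = len(a), len(b), len(b[0])
        c = [[0.0] * m for _ in range(n)]
        # Loop tiling: walk the iteration space in tile x tile x tile blocks.
        for ii in range(0, n, tile):
            for kk in range(0, k, tile):
                for jj in range(0, m, tile):
                    for i in range(ii, min(ii + tile, n)):
                        row_a, row_c = a[i], c[i]
                        for p in range(kk, min(kk + tile, k)):
                            a_ip, row_b = row_a[p], b[p]
                            for j in range(jj, min(jj + tile, m)):
                                row_c[j] += a_ip * row_b[j]
        return c
    return matmul


def time_kernel(kernel, a, b, repeats=3):
    """Return the best wall-clock time over a few runs of one candidate."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        kernel(a, b)
        best = min(best, time.perf_counter() - start)
    return best


def search(tiles, a, b, good_enough=0.0):
    """Hypothetical search driver: time each tile size, keep the fastest,
    and stop early once the best time is at or below `good_enough` seconds."""
    best_tile, best_time = None, float("inf")
    for tile in tiles:
        elapsed = time_kernel(make_blocked_matmul(tile), a, b)
        if elapsed < best_time:
            best_tile, best_time = tile, elapsed
        if best_time <= good_enough:
            break
    return best_tile, best_time
```

A call such as `search([4, 8, 16, 32], a, b)` tries every tile size and returns the fastest one together with its measured time. Which tile size wins depends on the machine and the input sizes, which is the reason the search is empirical rather than model-driven.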