Publication | Closed Access
Empirically tuning LAPACK’s blocking factor for increased performance
Citations: 19
References: 12
Year: 2008
Keywords: Engineering, Computer Architecture, Computational Complexity, Software Analysis, Performance Issue, Several LAPACK Routines, High-performance Architecture, Systems Engineering, Performance Tuning, Parallel Computing, Increased Performance, Computer Engineering, Computer Science, Performance Analysis Tool, Optimizing Compiler, LAPACK's Computational Engine, Program Analysis, Parallel Performance Evaluation, Blocking Factor, Parallel Programming, Performance Portability, System Software
LAPACK (Linear Algebra PACKage) is a statically cache-blocked library, where the blocking factor (NB) is determined by the service routine ILAENV. Users are encouraged to tune NB to maximize performance on their platform and BLAS (the BLAS are LAPACK's computational engine), but in practice very few users do so, both because it is hard and because its importance is not widely understood. In this paper we (1) discuss our empirical tuning framework for discovering good NB settings, (2) quantify the performance boost that tuning NB can achieve on several LAPACK routines across multiple architectures and BLAS implementations, (3) compare the best performance of LAPACK's statically blocked routines against state-of-the-art recursively blocked routines and vendor-optimized LAPACK implementations, to see how much performance loss is mandated by LAPACK's present static blocking strategy, and finally (4) use these results to determine how best to block nonsquare matrices once good square blocking factors are discovered.
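The idea of empirically tuning a blocking factor can be illustrated with a minimal sketch. This is not the paper's framework and does not call LAPACK or ILAENV: it assumes a toy tiled matrix multiply as a stand-in for a blocked routine, and the names `blocked_matmul` and `tune_nb` are hypothetical. It only shows the shape of the search, which is to time the same kernel for several candidate NB values and keep the fastest.

```python
import random
import time


def blocked_matmul(a, b, n, nb):
    """Multiply two n x n matrices (lists of lists) using nb x nb tiles.

    Toy stand-in for a statically blocked routine; nb plays the role of NB.
    """
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, nb):
        for kk in range(0, n, nb):
            for jj in range(0, n, nb):
                # Update one tile of C from one tile of A and one tile of B.
                for i in range(ii, min(ii + nb, n)):
                    row_c, row_a = c[i], a[i]
                    for k in range(kk, min(kk + nb, n)):
                        a_ik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + nb, n)):
                            row_c[j] += a_ik * row_b[j]
    return c


def tune_nb(n, candidates, repeats=3, seed=0):
    """Time blocked_matmul for each candidate nb; return (best_nb, timings).

    Hypothetical helper: keeps the minimum wall-clock time over `repeats`
    runs per candidate to reduce timing noise.
    """
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    b = [[rng.random() for _ in range(n)] for _ in range(n)]
    timings = {}
    for nb in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            blocked_matmul(a, b, n, nb)
            best = min(best, time.perf_counter() - start)
        timings[nb] = best
    best_nb = min(timings, key=timings.get)
    return best_nb, timings


if __name__ == "__main__":
    best_nb, timings = tune_nb(n=48, candidates=[4, 8, 16, 48])
    for nb, seconds in sorted(timings.items()):
        print(f"nb={nb:3d}  {seconds:.4f} s")
    print("fastest nb in this run:", best_nb)
```

In pure Python the interpreter overhead dominates, so the timings here say little about cache behavior; the sketch is meant only to show the search loop. A real tuning run would presumably time the actual LAPACK routine with each NB value on the target platform and BLAS.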