Publication | Closed Access
Exploiting the capabilities of modern GPUs for dense matrix computations
Citations: 48 | References: 6 | Year: 2009
Keywords: Numerical Analysis, Engineering, GPU Benchmarking, Computer Architecture, GPU Computing, Linear Systems, Array Computing, Compute Kernel, Parallel Computing, Computational Geometry, Graphics Processor, Unified Architecture, Computer Engineering, Computer Science, GPU Cluster, Dense Matrix Computations, Computational Science, GPU Architecture, Parallel Programming
Abstract: We present several algorithms to compute the solution of a linear system of equations on a graphics processor (GPU), as well as general techniques to improve their performance, such as padding and hybrid GPU-CPU computation. We compare single and double precision performance of a modern GPU with unified architecture, and show how iterative refinement with mixed precision can be used to regain full accuracy in the solution of linear systems, exploiting the potential of the processor for single precision arithmetic. Experimental results on a GTX280 using CUBLAS 2.0, the implementation of BLAS for NVIDIA® GPUs with unified architecture, illustrate the performance of the different algorithms and techniques proposed. Copyright © 2009 John Wiley & Sons, Ltd.
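The mixed-precision iterative refinement mentioned in the abstract can be illustrated with a small CPU-only sketch. This is not the paper's GPU implementation. It assumes an LU factorization with partial pivoting, emulates single precision by rounding every intermediate result through the standard `struct` module, and computes residuals and solution updates in Python's native double precision. All function names (`f32`, `lu_factor_f32`, `lu_solve_f32`, `solve_mixed_precision`) and the tolerance are hypothetical choices made for illustration.

```python
import struct


def f32(x):
    """Round a double to the nearest IEEE single-precision value (emulation)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(a):
    """LU with partial pivoting; every arithmetic result is rounded to single."""
    n = len(a)
    lu = [[f32(v) for v in row] for row in a]  # work on a rounded copy
    perm = list(range(n))  # perm[i] = original row now stored at position i
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = f32(lu[i][k] / lu[k][k])
            for j in range(k + 1, n):
                lu[i][j] = f32(lu[i][j] - f32(lu[i][k] * lu[k][j]))
    return lu, perm


def lu_solve_f32(lu, perm, b):
    """Solve with the single-precision factors (forward, then back substitution)."""
    n = len(lu)
    y = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):  # unit lower-triangular solve
        for j in range(i):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
    for i in reversed(range(n)):  # upper-triangular solve
        for j in range(i + 1, n):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
        y[i] = f32(y[i] / lu[i][i])
    return y


def solve_mixed_precision(a, b, tol=1e-12, max_iter=20):
    """Factor once in (emulated) single precision, then refine in double."""
    n = len(a)
    lu, perm = lu_factor_f32(a)
    x = lu_solve_f32(lu, perm, b)
    b_scale = max(abs(v) for v in b) or 1.0
    for _ in range(max_iter):
        # The residual is computed in double precision with the original matrix.
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) <= tol * b_scale:
            break
        d = lu_solve_f32(lu, perm, r)  # cheap correction using the factors
        x = [xi + di for xi, di in zip(x, d)]  # update accumulated in double
    return x
```

The design point is that the costly factorization runs once in the faster precision, while each refinement step needs only a matrix-vector product and two triangular solves. For a well-conditioned matrix, a few steps are typically enough to bring the solution close to double-precision accuracy.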