Scalable parallelization of FLAME code via the workqueuing model
Year: 2008
Citations: 22
References: 11
Keywords: Cluster Computing, Massively-parallel Computing, Engineering, Parallel Software, OpenMP Parallelization, Parallel Processing, Higher Level, Computer Engineering, Computer Architecture, Parallel Implementation, Parallel Programming, Computer Science, FLAME Code, Parallel Computing, Linear Algebra Algorithms, Data-level Parallelism
We discuss the OpenMP parallelization of linear algebra algorithms that are coded using the Formal Linear Algebra Methods Environment (FLAME) API. This API expresses algorithms at a higher level of abstraction, avoids the use of loop and array indices, and represents these algorithms as they are formally derived and presented. We report on two implementations of the workqueuing model, neither of which requires the use of explicit indices to specify parallelism. The first implementation uses the experimental taskq pragma, which may influence the adoption of a similar construct into OpenMP 3.0. The second workqueuing implementation is domain-specific to FLAME but allows us to illustrate the benefits of sorting tasks according to their computational cost prior to parallel execution. In addition, we discuss how scalable parallelization of dense linear algebra algorithms via OpenMP will require a two-dimensional partitioning of operands, much as a 2D data distribution is needed on distributed-memory architectures. We illustrate the issues and solutions by discussing the parallelization of the symmetric rank-k update and report impressive performance on an SGI system with 14 Itanium2 processors.
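The ideas of a work queue, cost-sorted tasks, and a two-dimensional partitioning can be illustrated with a minimal sketch. It is written in Python rather than OpenMP and does not use the FLAME API or the taskq pragma; the names `make_tasks`, `run_task`, and `syrk_lower` are hypothetical and chosen only for this illustration. It assumes the lower-triangular form of the symmetric rank-k update, C := C + A Aᵀ, splits C into square tiles, enqueues one task per tile, and submits the most expensive tasks first. Python threads will not show a real speedup here; the sketch only demonstrates the scheduling order.

```python
from concurrent.futures import ThreadPoolExecutor


def make_tasks(n, nb):
    """List one task per nb x nb tile in the lower triangle of an n x n matrix.

    Each task is (cost, (row_start, row_end), (col_start, col_end)),
    where cost is the number of entries of C the task updates.
    """
    tasks = []
    for i0 in range(0, n, nb):
        i1 = min(i0 + nb, n)
        for j0 in range(0, i0 + 1, nb):
            j1 = min(j0 + nb, n)
            if i0 == j0:
                # Diagonal tile: only its lower triangle is updated.
                cost = (i1 - i0) * (i1 - i0 + 1) // 2
            else:
                cost = (i1 - i0) * (j1 - j0)
            tasks.append((cost, (i0, i1), (j0, j1)))
    return tasks


def run_task(C, A, rows, cols):
    """Update one tile: C[i][j] += dot(A[i], A[j]) for entries with j <= i."""
    k = len(A[0])
    for i in range(rows[0], rows[1]):
        for j in range(cols[0], min(cols[1], i + 1)):
            C[i][j] += sum(A[i][p] * A[j][p] for p in range(k))


def syrk_lower(C, A, nb=2, workers=4):
    """Lower-triangular C := C + A A^T using a cost-sorted queue of tile tasks."""
    tasks = make_tasks(len(C), nb)
    # Submit the most expensive tasks first so cheap ones fill in at the end.
    tasks.sort(key=lambda t: t[0], reverse=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_task, C, A, rows, cols)
                   for _, rows, cols in tasks]
        for f in futures:
            f.result()  # re-raise any exception from a worker
    return C
```

Each task writes a disjoint set of entries of C, so no locking is needed. Diagonal tiles cost roughly half as much as off-diagonal ones, which is the kind of imbalance that sorting by cost is meant to smooth out.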