Publication | Closed Access
Comparison of scalable parallel matrix multiplication libraries
Citations: 19
References: 6
Year: 2002
Unknown Venue
Cluster Computing, Massively-parallel Computing, Array Computing, Engineering, Parallel Processing, Intel Delta, Computer Architecture, Computer Engineering, Parallel Implementation, Library Routines, Parallel Programming, Computer Science, General Library Routines, Parallel Computing, Parallel Algorithms
This paper compares two general library routines for performing parallel distributed matrix multiplication. The PUMMA algorithm utilizes a block scattered data layout, whereas BiMMeR utilizes a virtual 2-D torus wrap. The algorithmic differences resulting from these different layouts are discussed, as well as the general issues associated with different data layouts for library routines. Results on the Intel Delta for the two matrix multiplication algorithms are presented.
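To make the layout terminology concrete, the following is a minimal sketch of how a block scattered (block-cyclic) layout is commonly defined: the matrix is cut into fixed-size blocks, and block (I, J) is assigned to processor (I mod P, J mod Q) on a P x Q grid. The function name, parameter names, and grid sizes here are illustrative assumptions and are not taken from PUMMA or BiMMeR. With a block size of 1 x 1, the same formula gives an element-wise wrapped mapping, which is one common reading of a torus-wrap layout; whether that matches BiMMeR's virtual 2-D torus wrap exactly is an assumption.

```python
def block_scattered_owner(i, j, block_rows, block_cols, grid_rows, grid_cols):
    """Return the (row, col) grid coordinates of the processor owning element (i, j).

    Hypothetical helper for illustration only: it assumes the usual
    block-cyclic definition, not the exact convention of any library.
    """
    block_i = i // block_rows  # index of the block row containing row i
    block_j = j // block_cols  # index of the block column containing column j
    # Blocks are dealt out cyclically over the processor grid.
    return (block_i % grid_rows, block_j % grid_cols)


def owner_map(n_rows, n_cols, block_rows, block_cols, grid_rows, grid_cols):
    """Build the full element-to-processor map for an n_rows x n_cols matrix."""
    return [
        [
            block_scattered_owner(i, j, block_rows, block_cols, grid_rows, grid_cols)
            for j in range(n_cols)
        ]
        for i in range(n_rows)
    ]


if __name__ == "__main__":
    # 4 x 4 matrix, 2 x 2 blocks, 2 x 2 processor grid (all sizes assumed).
    for row in owner_map(4, 4, 2, 2, 2, 2):
        print(row)
```

Passing `block_rows=1, block_cols=1` to the same function yields the wrapped element-wise mapping, so the two layouts can be compared by changing only the block size.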