Publication | Closed Access
Petascale computing with accelerators
Citations: 24
References: 14
Year: 2009
Unknown Venue
Cluster Computing, Engineering, Computer Architecture, System-level Design, Embedded Systems, Supercomputer Architecture, Processor Architecture, Hardware Systems, High-performance Architecture, Computing Systems, Parallel Computing, Compilers, Computer Engineering, Linpack Benchmark, Computer Science, Exascale Computing, Domain-specific Accelerator, Hybrid Systems, Parallel Programming, Petascale Hybrid System
A trend is developing in high-performance computing in which commodity processors are coupled to various types of computational accelerators. Such systems are commonly called hybrid systems. In this paper, we describe our experience developing an implementation of the Linpack benchmark for a petascale hybrid system, the Roadrunner cluster built by IBM for Los Alamos National Laboratory (LANL). This system combines traditional x86-64 host processors with IBM PowerXCell™ 8i accelerator processors. The implementation of Linpack we developed was the first to achieve a performance result in excess of 1.0 PFLOPS, and made Roadrunner the #1 system on the Top500 list in June 2008. We describe the design and implementation of hybrid Linpack, including the special optimizations we developed for this hybrid architecture. We then present actual results for single-node and multi-node executions. From this work, we conclude that it is possible to achieve high performance for certain applications on hybrid architectures when careful attention is given to efficient use of memory bandwidth, scheduling of data movement between the host and accelerator memories, and proper distribution of work between the host and accelerator processors.
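For readers unfamiliar with what Linpack times, the following is a minimal sketch of a right-looking blocked LU factorization with partial pivoting, followed by the triangular solves. It is not the authors' implementation. The function names (`trailing_update`, `blocked_lu`, `solve`) and the block size `nb` are hypothetical and chosen only for illustration. The sketch isolates the trailing-matrix update, a matrix multiply that dominates the arithmetic and that a hybrid design could plausibly hand to an accelerator, from the panel factorization that would more naturally stay on the host. Here everything runs on the host in plain Python.

```python
# Illustrative sketch only; names and block size are assumptions,
# not taken from the paper.

def trailing_update(a, rows, cols, k0, k1):
    """Compute a[i][j] -= sum_k a[i][k] * a[k][j] for k in [k0, k1).

    This matrix-multiply step carries most of the floating-point work,
    so it is the natural candidate for offload in a hybrid system.
    """
    for i in rows:
        for j in cols:
            s = 0.0
            for k in range(k0, k1):
                s += a[i][k] * a[k][j]
            a[i][j] -= s


def blocked_lu(a, nb):
    """Factor a in place so that P*A = L*U; return the row permutation."""
    n = len(a)
    perm = list(range(n))
    for k0 in range(0, n, nb):
        k1 = min(k0 + nb, n)
        # Panel factorization with partial pivoting (columns k0..k1-1).
        for k in range(k0, k1):
            p = max(range(k, n), key=lambda i: abs(a[i][k]))
            if a[p][k] == 0.0:
                raise ValueError("matrix is singular")
            if p != k:
                a[k], a[p] = a[p], a[k]
                perm[k], perm[p] = perm[p], perm[k]
            for i in range(k + 1, n):
                a[i][k] /= a[k][k]
                for j in range(k + 1, k1):
                    a[i][j] -= a[i][k] * a[k][j]
        # Triangular solve for the block row of U to the right of the panel.
        for k in range(k0, k1):
            for i in range(k + 1, k1):
                for j in range(k1, n):
                    a[i][j] -= a[i][k] * a[k][j]
        # Trailing-matrix update: A22 -= L21 * U12.
        trailing_update(a, range(k1, n), range(k1, n), k0, k1)
    return perm


def solve(a, b, nb=2):
    """Solve A x = b using the blocked factorization above."""
    lu = [[float(v) for v in row] for row in a]
    perm = blocked_lu(lu, nb)
    n = len(lu)
    x = [float(b[perm[i]]) for i in range(n)]
    for i in range(n):                      # forward substitution (unit L)
        for k in range(i):
            x[i] -= lu[i][k] * x[k]
    for i in range(n - 1, -1, -1):          # back substitution (U)
        for k in range(i + 1, n):
            x[i] -= lu[i][k] * x[k]
        x[i] /= lu[i][i]
    return x


if __name__ == "__main__":
    A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
    b = [7, 19, 49]
    print(solve(A, b))
```

In this structure, the three concerns named in the abstract map onto concrete choices: the block size governs how much data each update reuses per byte moved, the panel can be factored while the previous update's data is in flight, and the split between panel work and update work decides what each processor type executes.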