Publication | Closed Access
PARRAY
19
Citations
19
References
2012
Year
Unknown Venue
Cluster ComputingProgramming InterfaceEngineeringParallel SoftwareArray ComputingProgram AnalysisCompiler TechnologyParallelizing CompilerFermi GpusComputer EngineeringComputer ArchitectureSoftware EngineeringParallel ProgrammingComputer ScienceParallel Programming ModelParallel ComputingParallelizing Arrays
This paper introduces a programming interface called PARRAY (or Parallelizing ARRAYs) that supports system-level succinct programming for heterogeneous parallel systems like GPU clusters. The current practice of software development requires combining several low-level libraries like Pthread, OpenMP, CUDA and MPI. Achieving productivity and portability is hard with different numbers and models of GPUs. PARRAY extends mainstream C programming with novel array types of distinct features: 1) the dimensions of an array type are nested in a tree, conceptually reflecting the memory hierarchy; 2) the definition of an array type may contain references to other array types, allowing sophisticated array types to be created for parallelization; 3) threads also form arrays that allow programming in a Single-Program-Multiple-Codeblock (SPMC) style to unify various sophisticated communication patterns. This leads to shorter, more portable and maintainable parallel codes, while the programmer still has control over performance-related features necessary for deep manual optimization. Although the source-to-source code generator only faithfully generates low-level library calls according to the type information, higher-level programming and automatic performance optimization are still possible through building libraries of sub-programs on top of PARRAY. The case study on cluster FFT illustrates a simple 30-line code that 2x outperforms Intel Cluster MKL on the Tianhe-1A system with 7168 Fermi GPUs and 14336 CPUs.
| Year | Citations | |
|---|---|---|
Page 1
Page 1