Publication | Closed Access
Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs
Citations: 82
References: 20
Year: 2010
Venue: Unknown
Keywords: Engineering, Computer Architecture, Software Engineering, Multithreading (Computer Architecture), Software Analysis, Hardware Security, CUDA Programming Model, Multicore Platforms, Parallel Computing, Compilers, Manycore Processor, Parallelizing Compiler, Computer Engineering, Computer Science, Program Analysis, Parallel Performance Evaluation, Many-core Architecture, Parallel Programming, Fine-grained SPMD-threaded Programs, System Software
In this paper we describe techniques for compiling fine-grained SPMD-threaded programs, expressed in programming models such as OpenCL or CUDA, to multicore execution platforms. Programs developed for manycore processors typically express finer thread-level parallelism than is appropriate for multicore platforms. We describe options for implementing fine-grained threading in software, and find that reasonable restrictions on the synchronization model enable significant optimizations and performance improvements over a baseline approach. We evaluate these techniques in a production-level compiler and runtime for the CUDA programming model targeting modern CPUs. Applications tested with our tool often showed performance parity with the compiled C version of the application for single-thread performance. With modest coarse-grained multithreading typical of today's CPU architectures, an average of 3.4x speedup on 4 processors was observed across the test applications.
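One common way to implement fine-grained threading in software is to serialize the logical threads of a block into loops run by a single CPU thread, and to split those loops at each barrier. The sketch below illustrates that general idea only; it is not taken from the paper. It assumes, purely for illustration, that barriers appear only at the top level of the kernel body (the abstract does not say which synchronization restrictions the authors adopt). The names `reverse_block` and `launch` are hypothetical, and the kernel (reversing each block's elements through a shared buffer) is a made-up example.

```python
from concurrent.futures import ThreadPoolExecutor


def reverse_block(block_idx, block_dim, src, dst):
    """Run one logical thread block on a single CPU thread (hypothetical kernel)."""
    base = block_idx * block_dim
    shared = [0] * block_dim  # stands in for per-block shared memory

    # Code before the barrier: one loop iteration per logical thread.
    for tid in range(block_dim):
        shared[tid] = src[base + tid]

    # The barrier becomes the boundary between two thread loops (loop fission):
    # every logical thread has finished the first region before any starts the next.

    # Code after the barrier.
    for tid in range(block_dim):
        dst[base + tid] = shared[block_dim - 1 - tid]


def launch(grid_dim, block_dim, src, workers=4):
    """Distribute whole blocks over a small pool of coarse-grained worker threads."""
    dst = [0] * len(src)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each block writes a disjoint slice of dst, so no locking is needed here.
        list(pool.map(lambda b: reverse_block(b, block_dim, src, dst),
                      range(grid_dim)))
    return dst
```

Calling `launch(2, 4, [0, 1, 2, 3, 4, 5, 6, 7])` reverses each group of four elements independently. Python threads are used here only to show the structure (fine-grained threads become loop iterations, blocks become units of coarse-grained work); they say nothing about the performance figures reported in the abstract.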