Publication | Open Access
Compiler Techniques to Reduce the Synchronization Overhead of GPU Redundant Multithreading
Citations: 29
References: 18
Year: 2017
Unknown Venue
Engineering, GPU Benchmarking, Computer Architecture, Multithreading (Computer Architecture), GPU Redundant Multithreading, Redundant Multi-threading, GPU Computing, Hardware Security, Compute Kernel, Redundant Threads, Compiler Techniques, Compilers, Parallel Computing, Computer Engineering, Computer Science, GPU Cluster, GPU Architecture, Program Analysis, Parallel Programming, Synchronization Overhead
Redundant Multi-Threading (RMT) provides a potentially low-cost mechanism to increase GPU reliability by replicating computation at the thread level. Prior work has shown that RMT's high performance overhead stems not only from executing redundant threads, but also from synchronization between the original and redundant threads. This inter-thread synchronization overhead can be especially significant when the synchronization is implemented using global memory. This work presents novel compiler techniques that use fingerprinting and cross-lane operations to reduce synchronization overhead for RMT on GPUs. Fingerprinting combines multiple synchronization events into one event by hashing, and cross-lane operations enable thread-level synchronization via register-level communication. This work shows that, compared with state-of-the-art GPU RMT solutions on real hardware, fingerprinting yields a 73.5% reduction in GPU RMT overhead, while cross-lane operations reduce the overhead by 43%.
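The two ideas in the abstract can be illustrated with a small CPU-side simulation. This is only a sketch under stated assumptions, not the paper's compiler transformation: the FNV-1a hash, the pairing of an original thread with its redundant copy in adjacent lanes, and all function names (`compare_per_event`, `compare_fingerprints`, `shuffle_xor`, `cross_lane_check`) are hypothetical choices made for illustration. The abstract does not say which hash or which cross-lane instruction is used.

```python
# Illustrative simulation only; the hash choice and lane layout are assumptions.
FNV_OFFSET = 0x811C9DC5  # 32-bit FNV-1a offset basis
FNV_PRIME = 0x01000193   # 32-bit FNV-1a prime


def fnv1a_update(fingerprint, value):
    """Fold one 32-bit value into a running 32-bit FNV-1a fingerprint."""
    for shift in (0, 8, 16, 24):
        fingerprint ^= (value >> shift) & 0xFF
        fingerprint = (fingerprint * FNV_PRIME) & 0xFFFFFFFF
    return fingerprint


def compare_per_event(original, redundant):
    """Baseline: one synchronization (comparison) per output value."""
    sync_events = 0
    ok = True
    for a, b in zip(original, redundant):
        sync_events += 1
        ok = ok and (a == b)
    return ok, sync_events


def compare_fingerprints(original, redundant):
    """Fingerprinting: each thread hashes its own outputs locally,
    then the two threads synchronize once to compare the hashes."""
    fp_original = fp_redundant = FNV_OFFSET
    for a, b in zip(original, redundant):
        fp_original = fnv1a_update(fp_original, a)
        fp_redundant = fnv1a_update(fp_redundant, b)
    return fp_original == fp_redundant, 1


def shuffle_xor(lanes, mask):
    """Simulated cross-lane read: lane i receives the register of lane i ^ mask."""
    return [lanes[i ^ mask] for i in range(len(lanes))]


def cross_lane_check(lanes):
    """Assume lanes 2k and 2k+1 hold an original/redundant pair; each lane
    compares its register with its partner's without touching memory."""
    partner = shuffle_xor(lanes, 1)
    return [mine == theirs for mine, theirs in zip(lanes, partner)]
```

With four output values, the baseline performs four comparisons while the fingerprinted version performs one. A single corrupted value such as `[1, 2, 7, 4]` against `[1, 2, 3, 4]` is still detected. In general a hash can alias, so fingerprinting trades a small chance of a missed mismatch for fewer synchronization events.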