Publication | Closed Access
A distributed multi-GPU system for fast graph processing
Citations: 73
References: 28
Year: 2017
Cluster Computing, Fast Graph Processing, Runtime Configurations, Engineering, GPU Benchmarking, Computer Architecture, Network Analysis, GPU Computing, Parallel Computing, Massively-parallel Computing, Computer Engineering, Distributed Systems, Computer Science, GPU Cluster, Present Lux, Lux Applications, GPU Architecture, Graph Theory, Edge Computing, Cloud Computing, Parallel Programming
We present Lux, a distributed multi-GPU system that achieves fast graph processing by exploiting the aggregate memory bandwidth of multiple GPUs and taking advantage of locality in the memory hierarchy of multi-GPU clusters. Lux provides two execution models that optimize algorithmic efficiency and enable important GPU optimizations, respectively. Lux also uses a novel dynamic load balancing strategy that is cheap and achieves good load balance across GPUs. In addition, we present a performance model that quantitatively predicts the execution times and automatically selects the runtime configurations for Lux applications. Experiments show that Lux achieves up to 20X speedup over state-of-the-art shared memory systems and up to two orders of magnitude speedup over distributed systems.
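The abstract does not say how Lux balances load, so the following is only a rough illustration of the problem a multi-GPU graph system has to solve: dividing vertices among GPUs so that each one processes a similar number of edges. It is a generic, static, edge-balanced split over contiguous vertex ranges, not Lux's dynamic strategy. The function name `edge_balanced_partition` and its parameters are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def edge_balanced_partition(degrees, num_parts):
    """Split vertices 0..n-1 into contiguous ranges with similar edge counts.

    Generic illustration only; this is NOT the algorithm from the paper.
    degrees[v] is the number of edges owned by vertex v.
    Returns a list of (start, end) half-open vertex ranges, one per part.
    """
    prefix = list(accumulate(degrees))  # prefix[i] = edges of vertices 0..i
    total = prefix[-1] if prefix else 0
    bounds = [0]
    for p in range(1, num_parts):
        target = total * p / num_parts
        # Cut right after the first vertex whose running edge count
        # reaches this part's share of the total.
        cut = bisect_left(prefix, target) + 1
        # Keep boundaries monotonic and inside the vertex range.
        cut = min(max(cut, bounds[-1]), len(degrees))
        bounds.append(cut)
    bounds.append(len(degrees))
    return [(bounds[i], bounds[i + 1]) for i in range(num_parts)]


if __name__ == "__main__":
    degrees = [4, 1, 1, 1, 1, 4, 2, 2]
    for start, end in edge_balanced_partition(degrees, 4):
        print(start, end, sum(degrees[start:end]))
```

A dynamic scheme would re-run a split like this between iterations using measured per-GPU work rather than static degrees. Whether Lux does so in this way is not stated in the abstract.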