PowerGraph: distributed graph-parallel computation on natural graphs

TLDR

Large‑scale graph‑structured computation underpins tasks such as targeted advertising and natural language processing, and has spurred graph‑parallel abstractions like Pregel and GraphLab, but natural power‑law graphs challenge these abstractions’ assumptions, limiting performance and scalability. This work characterizes the challenges of computing on natural power‑law graphs within existing graph‑parallel frameworks and proposes the PowerGraph abstraction to exploit graph program structure and overcome these challenges. PowerGraph introduces a distributed graph placement and representation scheme that leverages power‑law structure, and the authors analyze and experimentally compare it to two popular graph‑parallel systems. Empirical evaluations on large‑scale real‑world problems demonstrate that PowerGraph achieves order‑of‑magnitude performance gains over existing systems.

Abstract

Large-scale graph-structured computation is central to tasks ranging from targeted advertising to natural language processing and has led to the development of several graph-parallel abstractions, including Pregel and GraphLab. However, the natural graphs commonly found in the real world have highly skewed power-law degree distributions, which challenge the assumptions made by these abstractions, limiting performance and scalability. In this paper, we characterize the challenges of computation on natural graphs in the context of existing graph-parallel abstractions. We then introduce the PowerGraph abstraction, which exploits the internal structure of graph programs to address these challenges. Leveraging the PowerGraph abstraction, we introduce a new approach to distributed graph placement and representation that exploits the structure of power-law graphs. We provide a detailed analysis and experimental evaluation comparing PowerGraph to two popular graph-parallel systems. Finally, we describe three different implementation strategies for PowerGraph and discuss their relative merits, with empirical evaluations on large-scale real-world problems demonstrating order-of-magnitude gains.
