Publication | Closed Access
GraphChi: large-scale graph computation on just a PC
901
Citations
41
References
2012
Year
Cluster ComputingEngineeringDistributed AlgorithmsGraphchi PerformsGraph DatabaseGraph ComputationGraph ProcessingWeb GraphData ScienceParallel ComputingLarge-scale Graph ComputationMassively-parallel ComputingGraph AlgorithmsKnowledge DiscoveryComputer EngineeringDistributed SystemsComputer ScienceData-intensive ComputingGraph AlgorithmGraph TheoryEdge ComputingBusinessParallel ProgrammingGraph AnalysisBig Data
Current graph computation systems require distributed clusters to handle very large real‑world problems such as social network or web graph analysis. This work introduces GraphChi, a disk‑based system that enables efficient computation on graphs with billions of edges using only a single PC. GraphChi partitions large graphs into small parts and employs a novel parallel sliding‑window method, allowing advanced data mining, graph mining, and machine learning algorithms to run on a single consumer‑level computer. Experiments and theory show that GraphChi performs well on SSDs and HDDs, processes over 100 k graph updates per second, and solves the same problems as distributed systems with only a fraction of the resources, making large‑scale graph computation accessible to anyone with a modern PC.
Current systems for graph computation require a distributed computing cluster to handle very large real-world problems, such as analysis on social networks or the web graph. While distributed computational resources have become more accessible, developing distributed graph algorithms still remains challenging, especially to non-experts.In this work, we present GraphChi, a disk-based system for computing efficiently on graphs with billions of edges. By using a well-known method to break large graphs into small parts, and a novel parallel sliding windows method, GraphChi is able to execute several advanced data mining, graph mining, and machine learning algorithms on very large graphs, using just a single consumer-level computer. We further extend GraphChi to support graphs that evolve over time, and demonstrate that, on a single computer, GraphChi can process over one hundred thousand graph updates per second, while simultaneously performing computation. We show, through experiments and theoretical analysis, that GraphChi performs well on both SSDs and rotational hard drives.By repeating experiments reported for existing distributed systems, we show that, with only fraction of the resources, GraphChi can solve the same problems in very reasonable time. Our work makes large-scale graph computation available to anyone with a modern PC.
| Year | Citations | |
|---|---|---|
Page 1
Page 1