Publication | Closed Access
Clause-iteration with MapReduce to scalably query datagraphs in the SHARD graph-store
Citations: 68
References: 10
Year: 2011
Unknown Venue
Cluster Computing, Engineering, Graph Database, Graph Data Processing, Map-reduce, Semantic Web, Data Science, Graph Query Language, Management, Data Integration, Query Datagraphs, Parallel Computing, Data Management, Knowledge Discovery, Shard Graph-store, Computer Science, Distributed Query Processing, Query Optimization, Data Processing, Graph Theory, Cloud Computing, Parallel Programming, Massive Data Processing, Graph Data, Big Data
Graph data processing is an emerging application area for cloud computing because there are few other information infrastructures that cost-effectively permit scalable graph data processing. We present a scalable cloud-based approach to process queries on graph data utilizing the MapReduce model. We call this approach the Clause-Iteration approach. We present algorithms that, when used in conjunction with a MapReduce framework, respond to SPARQL queries over RDF data. Our innovation in the Clause-Iteration approach comes from (1) the iterative construction of query responses by incrementally growing the number of query clauses considered in a response, and (2) our use of flagged keys to join the results of these incremental responses. The Clause-Iteration algorithms form the basis of our scalable SHARD graph-store, built on the Hadoop implementation of MapReduce. SHARD performs favorably when compared to existing "industrial" graph-stores on a standard benchmark graph with 800 million edges. We discuss design considerations and alternatives associated with constructing scalable graph processing technologies.
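The two ideas named in the abstract, growing a response one query clause at a time and joining intermediate results on flagged keys, can be illustrated with a small in-memory sketch. This is not SHARD's Hadoop code: the function names (`match_clause`, `join_step`, `clause_iteration`), the `"OLD"`/`"NEW"` flags, and the sample triples are all assumptions made for illustration, and the map and reduce phases are simulated with plain Python loops rather than a MapReduce framework.

```python
from collections import defaultdict
from itertools import product


def match_clause(clause, triple):
    """Return variable bindings if the triple matches the clause, else None.

    Terms starting with "?" are variables; all other terms are constants.
    """
    binding = {}
    for term, value in zip(clause, triple):
        if term.startswith("?"):
            if binding.get(term, value) != value:
                return None  # same variable would need two different values
            binding[term] = value
        elif term != value:
            return None
    return binding


def join_step(partial, clause, triples):
    """One iteration: extend the partial bindings with one more clause."""
    clause_vars = {t for t in clause if t.startswith("?")}
    bound_vars = set(partial[0]) if partial else set()
    shared = sorted(clause_vars & bound_vars)  # variables that form the join key

    # Simulated map phase: group values by join key, flagged by their origin.
    groups = defaultdict(lambda: {"OLD": [], "NEW": []})
    for b in partial:
        groups[tuple(b[v] for v in shared)]["OLD"].append(b)
    for triple in triples:
        m = match_clause(clause, triple)
        if m is not None:
            groups[tuple(m[v] for v in shared)]["NEW"].append(m)

    # Simulated reduce phase: within each key, pair old bindings with new ones.
    joined = []
    for group in groups.values():
        for old, new in product(group["OLD"], group["NEW"]):
            joined.append({**old, **new})
    return joined


def clause_iteration(clauses, triples):
    """Answer a conjunctive query by adding one clause per iteration."""
    first, *rest = clauses
    partial = [m for t in triples if (m := match_clause(first, t)) is not None]
    for clause in rest:
        partial = join_step(partial, clause, triples)
    return partial


# Hypothetical sample data and a two-clause query.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("bob", "livesIn", "paris"),
    ("carol", "livesIn", "rome"),
]
clauses = [("?x", "knows", "?y"), ("?y", "livesIn", "?city")]
results = clause_iteration(clauses, triples)
```

In this sketch the flag only records whether a value came from the earlier partial response or from the newly added clause, so the reduce step pairs values across the two sides and never within one side. Each additional clause costs one more simulated pass, which mirrors the incremental construction the abstract describes.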