Publication | Closed Access
GraphFrames
Citations: 84
References: 14
Year: 2016
Unknown Venue
Engineering, Graph Database, Semantic Web, Graph Processing, Data Science, Present Graphframes, Graph Query Language, Management, Data Integration, Parallel Computing, Data Management, Query Languages, Graph Algorithms, Knowledge Discovery, Computer Science, Query Optimization, Graph Databases, Relational Queries, Graph Theory, Parallel Programming, Semantic Graph, Graph Data
Graph data is common across many domains, yet analyzing it has typically required specialized engines, which burdens users and prevents optimization of end-to-end workflows. GraphFrames is an integrated system that lets users combine graph algorithms, pattern matching, and relational queries while optimizing across them. It can materialize multiple views of the graph, executes both iterative algorithms and pattern matching through joins, exposes a declarative data-frame API, and applies a graph-aware join optimization algorithm that can select from the available views across the whole computation.
Graph data is prevalent in many domains, but it has usually required specialized engines to analyze. This design is onerous for users and precludes optimization across complete workflows. We present GraphFrames, an integrated system that lets users combine graph algorithms, pattern matching and relational queries, and optimizes work across them. GraphFrames generalize the ideas in previous graph-on-RDBMS systems, such as GraphX and Vertexica, by letting the system materialize multiple views of the graph (not just the specific triplet views in these systems) and executing both iterative algorithms and pattern matching using joins. To make applications easy to write, GraphFrames provide a concise, declarative API based on the "data frame" concept in R that can be used for both interactive queries and standalone programs. Under this API, GraphFrames use a graph-aware join optimization algorithm across the whole computation that can select from the available views.
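The idea of executing both pattern matching and iterative algorithms using joins can be illustrated with a minimal sketch in plain Python. This is not the GraphFrames API: the toy tables and the helper names `hash_join`, `two_hop_paths`, and `connected_components` are hypothetical, chosen only to show how a two-edge pattern reduces to a self-join of the edge table and how one round of an iterative algorithm reduces to a join followed by an aggregation.

```python
from collections import defaultdict

# Hypothetical toy graph stored as two relational tables (illustration only).
vertices = [{"id": "a"}, {"id": "b"}, {"id": "c"}, {"id": "d"}]
edges = [
    {"src": "a", "dst": "b"},
    {"src": "b", "dst": "c"},
    {"src": "c", "dst": "a"},
    {"src": "d", "dst": "d"},
]


def hash_join(left, right, left_key, right_key):
    """Equi-join two lists of rows; returns (left_row, right_row) pairs."""
    index = defaultdict(list)
    for row in right:
        index[row[right_key]].append(row)
    return [(l, r) for l in left for r in index.get(l[left_key], [])]


def two_hop_paths(edges):
    """Match the pattern x -> y -> z by self-joining edges on e1.dst = e2.src."""
    return [
        (e1["src"], e1["dst"], e2["dst"])
        for e1, e2 in hash_join(edges, edges, "dst", "src")
    ]


def connected_components(vertices, edges):
    """Min-label propagation; each round is a join plus a per-vertex minimum."""
    labels = [{"id": v["id"], "label": v["id"]} for v in vertices]
    # Treat edges as undirected by adding the reverse of each edge.
    both = edges + [{"src": e["dst"], "dst": e["src"]} for e in edges]
    while True:
        # Join: attach the source vertex's current label to each edge.
        joined = hash_join(both, labels, "src", "id")
        # Aggregate: keep the smallest label arriving at each destination.
        best = {row["id"]: row["label"] for row in labels}
        for edge, src_row in joined:
            best[edge["dst"]] = min(best[edge["dst"]], src_row["label"])
        new_labels = [{"id": vid, "label": lab} for vid, lab in best.items()]
        if new_labels == labels:  # fixed point reached
            return best
        labels = new_labels


if __name__ == "__main__":
    print(two_hop_paths(edges))
    print(connected_components(vertices, edges))
```

Because both computations are phrased as joins over the same tables, a single optimizer could in principle plan them together with ordinary relational queries, which is the kind of cross-workflow optimization the abstract describes.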