Publication | Closed Access
PigSPARQL: a SPARQL query processing baseline for big data
Citations: 39
References: 5
Year: 2013
Cluster Computing, Engineering, Map-reduce, Data Science, Database Support, Graph Query Language, Management, Sparql Query Processing, Data Integration, Parallel Computing, Query Language, Big Data, Data Management, Hadoop Mapreduce, Knowledge Discovery, Computer Science, Big Data Search, Distributed Query Processing, Query Optimization, Cloud Computing, Parallel Programming, Massive Data Processing, Data Modeling
In this paper we discuss PigSPARQL, a competitive yet easy-to-use SPARQL query processing system on MapReduce that allows ad hoc SPARQL query processing on large RDF graphs out of the box. Instead of mapping SPARQL directly to MapReduce, PigSPARQL uses Pig Latin, the query language of Pig (a data analysis platform on top of Hadoop MapReduce), as an intermediate layer between SPARQL and MapReduce. This additional level of abstraction makes our approach independent of the actual Hadoop version and thus ensures compatibility with future changes to the Hadoop framework, as these will be covered by the underlying Pig layer. We revisit PigSPARQL and demonstrate the performance improvement obtained by simply switching the underlying version of Pig from 0.5.0 to 0.11.0, without any changes to PigSPARQL itself. Because of this sustainability, PigSPARQL is an attractive long-term baseline for comparing MapReduce-based SPARQL implementations. Its suitability as a baseline is further supported by its competitiveness with existing systems, e.g. HadoopRDF.
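To make the idea of an intermediate layer concrete, the following minimal Python sketch turns a SPARQL basic graph pattern into a Pig Latin script. It is an illustration under stated assumptions, not the translation PigSPARQL actually performs: the function `translate_bgp`, the relation aliases (`f0`, `t0`, `j1`, `r1`), the input path `triples.txt`, and the space-separated three-column input layout are all hypothetical. It covers only conjunctions of triple patterns that share at least one variable, and it ignores the case where one variable occurs twice in the same pattern.

```python
def is_var(term):
    """SPARQL variables are written with a leading question mark."""
    return term.startswith("?")


def translate_bgp(patterns, path="triples.txt"):
    """Translate a list of (s, p, o) triple patterns into Pig Latin text.

    Assumption: the input is a space-separated file with one triple per
    line. Real RDF serializations need a proper loader.
    """
    lines = [
        f"triples = LOAD '{path}' USING PigStorage(' ') "
        "AS (s:chararray, p:chararray, o:chararray);"
    ]
    current, current_vars = None, []
    for i, pattern in enumerate(patterns):
        fields = list(zip(("s", "p", "o"), pattern))
        # Constants in the pattern become selection conditions.
        conds = [f"{col} == '{term}'" for col, term in fields if not is_var(term)]
        source = "triples"
        if conds:
            lines.append(f"f{i} = FILTER triples BY {' AND '.join(conds)};")
            source = f"f{i}"
        # Variables become named columns of the intermediate relation.
        proj = [(col, term[1:]) for col, term in fields if is_var(term)]
        if not proj:
            raise ValueError("patterns without variables are not handled")
        cols = ", ".join(f"{col} AS {var}" for col, var in proj)
        lines.append(f"t{i} = FOREACH {source} GENERATE {cols};")
        pattern_vars = [var for _, var in proj]
        if current is None:
            current, current_vars = f"t{i}", pattern_vars
            continue
        # Join with the result so far on all shared variables.
        shared = [v for v in current_vars if v in pattern_vars]
        if not shared:
            raise ValueError("cross products are not handled in this sketch")
        keys = ", ".join(shared)
        if len(shared) > 1:
            keys = f"({keys})"
        lines.append(f"j{i} = JOIN {current} BY {keys}, t{i} BY {keys};")
        # Drop the duplicated join columns and restore plain column names.
        merged = current_vars + [v for v in pattern_vars if v not in current_vars]
        out = ", ".join(
            f"{current}::{v} AS {v}" if v in current_vars else f"t{i}::{v} AS {v}"
            for v in merged
        )
        lines.append(f"r{i} = FOREACH j{i} GENERATE {out};")
        current, current_vars = f"r{i}", merged
    lines.append(f"DUMP {current};")
    return "\n".join(lines)


if __name__ == "__main__":
    # ?x foaf:knows ?y . ?y foaf:name ?name
    print(translate_bgp([("?x", "foaf:knows", "?y"), ("?y", "foaf:name", "?name")]))
```

The generated script contains only Pig Latin operators (`LOAD`, `FILTER`, `FOREACH ... GENERATE`, `JOIN`, `DUMP`), so nothing in it refers to a specific Hadoop version. Turning these operators into MapReduce jobs is left to Pig, which is the property the abstract relies on when the Pig version is upgraded.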