Publication | Closed Access
Scalable semantic web data management using vertical partitioning
Citations: 655
References: 22
Year: 2007
Unknown Venue
Efficient management of RDF data is crucial for realizing the Semantic Web, yet performance and scalability challenges are increasingly pressing as the technology is applied to real-world applications. This study investigates why existing RDF data management solutions scale poorly and explores the fundamental limitations that hinder their scalability. The authors review current performance‑enhancing approaches, critique the recently proposed property‑table solution, and propose vertical partitioning of RDF data, then empirically compare its performance against prior methods on a library catalog of more than 50 million triples. Vertical partitioning delivers performance comparable to property tables while being simpler to design, and when implemented on a column‑oriented DBMS it yields a further order‑of‑magnitude speedup, reducing query times from minutes to several seconds.
Efficient management of RDF data is an important factor in realizing the Semantic Web vision. Performance and scalability issues are becoming increasingly pressing as Semantic Web technology is applied to real-world applications. In this paper, we examine the reasons why current data management solutions for RDF data scale poorly, and explore the fundamental scalability limitations of these approaches. We review the state of the art for improving performance for RDF databases and consider a recent suggestion, property tables. We then discuss practically and empirically why this solution has undesirable features. As an improvement, we propose an alternative solution: vertically partitioning the RDF data. We compare the performance of vertical partitioning with prior art on queries generated by a Web-based RDF browser over a large-scale (more than 50 million triples) catalog of library data. Our results show that a vertically partitioned schema achieves similar performance to the property table technique while being much simpler to design. Further, if a column-oriented DBMS (a database architected specifically for the vertically partitioned case) is used instead of a row-oriented DBMS, another order of magnitude performance improvement is observed, with query times dropping from minutes to several seconds.
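As a rough illustration of the idea, the sketch below splits a list of RDF triples into one two-column (subject, object) table per property, which is the general shape of a vertically partitioned schema. It is a minimal in-memory sketch, not the authors' implementation: the sample triples, the function names `vertically_partition` and `subjects_with`, and the choice to sort each table by subject are all assumptions made here for illustration.

```python
from collections import defaultdict

# Hypothetical sample triples (subject, property, object).
# These are illustrative only and are not taken from the paper's data set.
TRIPLES = [
    ("book1", "title", "Dune"),
    ("book1", "author", "Herbert"),
    ("book2", "title", "Emma"),
    ("book2", "author", "Austen"),
    ("book2", "language", "English"),
]


def vertically_partition(triples):
    """Group triples into one (subject, object) table per property."""
    tables = defaultdict(list)
    for subject, prop, obj in triples:
        tables[prop].append((subject, obj))
    # Assumption for this sketch: keep each table ordered by subject so that
    # two property tables could later be joined with a simple merge.
    for rows in tables.values():
        rows.sort()
    return dict(tables)


def subjects_with(tables, prop, value):
    """Return subjects whose `prop` equals `value`, scanning only that property's table."""
    return [subject for subject, obj in tables.get(prop, []) if obj == value]


tables = vertically_partition(TRIPLES)
matches = subjects_with(tables, "author", "Austen")
```

A lookup on one property touches only that property's table, so subjects that lack a property (here, `book1` has no `language`) simply have no row in it rather than a NULL in a wide table.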