A sampling approach for XML query selectivity estimation

Abstract

As the Extensible Markup Language (XML) rapidly establishes itself as the de facto standard for presenting, storing, and exchanging data on the Internet, large volume of XML data and their supporting facilities start to surface. A fast and accurate selectivity estimation mechanism is of practical importance because selectivity estimation plays a fundamental role in XML query optimization. Recently proposed techniques are all based on some forms of structure synopses that could be time-consuming to build and not effective for summarizing complex structure relationships. In this research, we propose an innovative sampling method that can capture the tree structures and intricate relationships among nodes in a simple and effective way. The derived sample tree is stored as a synopsis for selectivity estimation. Extensive experimental results show that, in comparison with the state-of-the-art structure synopses, specifically the TreeSketch and Xseed synopses, our sample tree synopsis applies to a broader range of query types, requires several orders of magnitude less construction time, and generates estimates with considerably better precision for complex datasets.

References

Page 1

	Year	Citations

Page 1