Concepedia

Publication | Closed Access

Twig2Stack: bottom-up processing of generalized-tree-pattern queries over XML documents

188

Citations

16

References

2006

Year

Abstract

Tree pattern matching is one of the most fundamental tasks for XML query processing. Holistic twig query processing techniques [4, 16] have been developed to minimize the intermediate results, namely, those root-to-leaf path matches that are not in the final twig results. However, useless path matches cannot be completely avoided, especially when there is a parent-child relationship in the twig query. Furthermore, existing approaches do not consider the fact that in practice, in order to process XPath or XQuery statements, a more powerful form of twig queries, namely, Generalized-Tree-Pattern (GTP) [8] queries, is required. Most existing works on processing GTP queries generally calls for costly post-processing for eliminating redundant data and/or grouping of the matching results.In this paper, we first propose a novel hierarchical stack encoding scheme to compactly represent the twig results. We introduce Twig2Stack, a bottom-up algorithm for processing twig queries based on this encoding scheme. Then we show how to efficiently enumerate the query results from the encodings for a given GTP query. To our knowledge, this is the first GTP matching solution that avoids any post path-join, sort, duplicate elimination and grouping operations. Extensive performance studies on various data sets and queries show that the proposed Twig2Stack algorithm not only has better twig query processing performance than state-of-the-art algorithms, but is also capable of efficiently processing the more complex GTP queries.

References

YearCitations

Page 1