HKUST Library Institutional Repository Banner

HKUST Institutional Repository >
Computer Science and Engineering >
CSE Doctoral Theses >

Please use this identifier to cite or link to this item: http://hdl.handle.net/1783.1/2045
Title: Efficient structural query processing in XML databases
Authors: Jiang, Haifeng
Issue Date: 2004
Abstract: As XML gains unqualified success through being adopted as a universal data exchange format, particularly in the World Wide Web, more and more information is being stored, exchanged, and presented in it. The ability to intelligently query XML data sources, therefore, becomes increasingly important. This thesis studies the query processing of a core subset of XML query languages: twig queries with and without OR-predicates. An XML twig query, represented as a labeled query tree, is essentially a complex selection on both the structure and the content of an XML document. Matching a twig query means finding all the instances of the query tree embedded in the data tree that represents the queried XML document. While querying by content can leverage traditional database technologies, evaluating the structural relationships (specifically parent-child and ancestor-descendant relationships) specified in twig queries has imposed a great challenge to existing database technologies. We present in this thesis a series of holistic twig join algorithms by which query trees are matched not edge by edge but as a whole so that irrelevant intermediate results can be greatly reduced. Specifically, we first present a merge-based algorithm and an index-based algorithm for processing twig queries without OR-predicates. The algorithms access underlying XML data through a generic data access interface so that they are independent of specific physical storage methods. We then extend these two algorithms to handle general twig queries that can contain OR-predicates. To provide an efficient data access interface for our algorithms, we propose a novel external-memory index structure, namely XR-tree, and detail the implementation of the data access interface for XR-tree indexed XML data. We conduct extensive experiments with our algorithms in comparison to existing state-of-art ones. Our algorithms clearly outperform the existing ones, therefore we recommend that they should be adopted by performance-critical XML query systems.
Description: Thesis (Ph.D.)--Hong Kong University of Science and Technology, 2004
xiv, 125 leaves : ill. ; 30 cm
HKUST Call Number: Thesis COMP 2004 Jiang
URI: http://hdl.handle.net/1783.1/2045
Appears in Collections:CSE Doctoral Theses

Files in This Item:

File Description SizeFormat
th_redirect.html0KbHTMLView/Open

All items in this Repository are protected by copyright, with all rights reserved.