Efficient top-k search across heterogeneous XML data sources

Jianxin Li, C. Liu, J.X. Yu, R. Zhou

    Research output: Chapter in Book/Conference paper › Conference paper › peer-review

    7 Citations (Scopus)


    An important issue arising from XML query relaxation is how to efficiently find the top-k best answers across a large number of XML data sources while minimizing the search cost, i.e., finding the k matches with the highest computed scores by traversing only part of the documents. This paper addresses the issue by proposing a bound-threshold based scheduling strategy, which answers a top-k XML query as early as possible by dynamically scheduling the query over the XML documents. With this strategy, the total number of documents that need to be visited can be greatly reduced by skipping those documents that will not produce the desired results. Furthermore, most of the candidates in each visited document can be pruned based on the intermediate results. Most importantly, partial results can be output immediately during query execution, rather than only after all results have been determined. Our experimental results show that our query scheduling and processing strategies are both practical and efficient. © 2008 Springer-Verlag Berlin Heidelberg.
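
The abstract describes the strategy only in outline, so the following is a minimal sketch of the general idea rather than the authors' algorithm. It assumes each document comes with a precomputed upper bound on the score of any match it contains, and that a hypothetical `load_scores` callback returns the match scores of one document. The names `top_k_search`, `bounds`, and `load_scores` are illustrative and do not come from the paper.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def top_k_search(
    bounds: Dict[str, float],
    load_scores: Callable[[str], Iterable[float]],
    k: int,
) -> Iterator[Tuple[float, str]]:
    """Yield up to k (score, doc_id) pairs, best first, as soon as each is final.

    Assumption: bounds[doc_id] is an upper bound on every score in that document.
    """
    # Schedule documents so that the most promising one is visited first.
    order = sorted(bounds, key=bounds.get, reverse=True)
    pending: List[Tuple[float, str]] = []  # candidates not yet output, best first
    emitted = 0

    for i, doc_id in enumerate(order):
        remaining = k - emitted
        # Threshold: the weakest candidate still needed, once enough are held.
        threshold = pending[-1][0] if len(pending) == remaining else float("-inf")

        # Prune candidates inside the document that cannot beat the threshold.
        for score in load_scores(doc_id):
            if score > threshold:
                pending.append((score, doc_id))
        pending.sort(reverse=True)
        del pending[remaining:]

        # Highest score any unvisited document could still contribute.
        next_bound = bounds[order[i + 1]] if i + 1 < len(order) else float("-inf")

        # A candidate at or above that bound is final and can be output now.
        while pending and pending[0][0] >= next_bound:
            yield pending.pop(0)
            emitted += 1

        if emitted == k:
            return  # all remaining documents are skipped without being read
```

Because the function is a generator, a caller receives each answer as soon as no unvisited document could outrank it. Documents whose bound falls below the answers already found are never loaded. How tight the bounds are decides how many documents can be skipped, and the sketch leaves open how such bounds would be computed for relaxed XML queries.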
    Original language: English
    Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Number of pages: 16
    Volume: 4947 LNCS
    Publication status: Published - 2008
    Event: 13th International Conference on Database Systems for Advanced Applications, DASFAA 2008 - New Delhi
    Duration: 19 Mar 2008 to 21 Mar 2008


    Conference: 13th International Conference on Database Systems for Advanced Applications, DASFAA 2008

