Top-k keyword search over probabilistic XML data

Jianxin Li, C. Liu, R. Zhou, W. Wang

Research output: Chapter in Book/Conference paperConference paper

56 Citations (Scopus)
12 Downloads (Pure)

Abstract

Despite the proliferation of work on XML keyword query, it remains open to support keyword query over probabilistic XML data. Compared with traditional keyword search, it is far more expensive to answer a keyword query over probabilistic XML data due to the consideration of possible world semantics. In this paper, we firstly define the new problem of studying top-k keyword search over probabilistic XML data, which is to retrieve k SLCA results with the k highest probabilities of existence. And then we propose two efficient algorithms. The first algorithm PrStack can find k SLCA results with the k highest probabilities by scanning the relevant keyword nodes only once. To further improve the efficiency, we propose a second algorithm EagerTopK based on a set of pruning properties which can quickly prune unsatisfied SLCA candidates. Finally, we implement the two algorithms and compare their performance with analysis of extensive experimental results. © 2011 IEEE.
Original languageEnglish
Title of host publication2011 IEEE 27th International Conference on Data Engineering (ICDE)
EditorsSerge Abiteboul, Christoph Koch , Kian-Lee Tan , Alfons Kemper , Wolfgang Nejdl
Place of PublicationGermany
PublisherIEEE, Institute of Electrical and Electronics Engineers
Pages673-684
Number of pages12
ISBN (Electronic)9781424489602
ISBN (Print)9781424489589
DOIs
Publication statusPublished - 2011
Externally publishedYes
Event2011 IEEE 27th International Conference on Data Engineering (ICDE) - Hannover, Germany
Duration: 11 Apr 201116 Apr 2011

Conference

Conference2011 IEEE 27th International Conference on Data Engineering (ICDE)
CountryGermany
CityHannover
Period11/04/1116/04/11

Fingerprint Dive into the research topics of 'Top-k keyword search over probabilistic XML data'. Together they form a unique fingerprint.

Cite this