›› 2013,Vol. 28 ›› Issue (2): 366-381.doi: 10.1007/s11390-013-1337-1

所属专题: Data Management and Data Mining

• Special Section on Selected Paper from NPC 2011 • 上一篇    下一篇

基于稳定匹配的高效SLCA 求解算法

Jun-Feng Zhou1,3 (周军锋), Member, CCF, Guo-Xiang Lan1 (蓝国翔), Zi-Yang Chen1 (陈子阳), Member, CCF and Xian Tang2 (汤显)   

  • 收稿日期:2011-11-11 修回日期:2012-05-30 出版日期:2013-03-05 发布日期:2013-03-05

Fast Smallest Lowest Common Ancestor Computation Based on Stable Match

Jun-Feng Zhou1,3 (周军锋), Member, CCF, Guo-Xiang Lan1 (蓝国翔), Zi-Yang Chen1 (陈子阳), Member, CCF and Xian Tang2 (汤显)   

  1. 1 School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China;
    2 School of Economics and Management, Yanshan University, Qinhuangdao 066004, China;
    3 Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Yanshan UniversityQinhuangdao 066004, China
  • Received:2011-11-11 Revised:2012-05-30 Online:2013-03-05 Published:2013-03-05
  • Supported by:

    This research was partially supported by the National Natural Science Foundation of China under Grant Nos. 61073060, 61040023, 61272124, and the Science and Technology Research and Development Program of Hebei Province of China under Grant No. 11213578.

本文研究基于SLCA 语义的高效XML 关键字查询处理问题。对于给定的包含m个关键字的查询Q, 我们提出基于稳定匹配的SLCA 求解算法, 其中每个稳定匹配M包含m 个结点, 这些结点分别属于Q 的m 个倒排表。假设M 的最低公共祖先是v, 我们证明了在Q 的最低公共祖先结点中, 不存在某个最低公共祖先可以出现在M 的第一个结点之后并且是v 的后代结点。基于此我们的算法在求解稳定匹配的过程中可以有效避免对很多无用结点的处理。我们提出了两种基于稳定匹配的SLCA 求解算法——BSLCA 和HSLCA。BSLCA 根据从短到长的次序依次处理Q 的所有倒排表, 而HSLCA可以一次性处理Q 的所有倒排表来避免BSLCA 存在的冗余计算问题。最后, 我们基于不同评价指标, 通过实验验证了本文所提出算法的性能优势。

Abstract: In this paper, we focus on efficient processing of XML keyword queries based on smallest lowest common ancestor (SLCA) semantics. For a given query Q with m keywords, we propose to use stable matches as the basis for SLCA computation, where each stable match M consists of m nodes that belong to the m distinct keyword inverted lists of Q. M satisfies that no other lowest common ancestor (LCA) node of Q can be found to be located after the first node of M and be a descendant of the LCA of M, based on which the operation of locating a stable match can skip more useless nodes. We propose two stable match based algorithms for SLCA computation, i.e., BSLCA and HSLCA. BSLCA processes two keyword inverted lists each time from the shortest to the longest, while HSLCA processes all keyword inverted lists in a holistic way to avoid the problem of redundant computation invoked by BSLCA. Our extensive experimental results verify the performance advantages of our methods according to various evaluation metrics.

[1] Cohen S, Mamou J, Kanza Y, Sagiv Y. XSEarch: A semanticsearch engine for XML. In Proc. the 29th International Con-ference on Very Large Data Bases (VLDB2003), September2003, pp.45-56.

[2] Guo L, Shao F, Botev C, Shanmugasundavam J. XRANK:Ranked keyword search over XML documents. In Proc. the2003 ACM SIGMOD International Conference on Manage-ment of Data (SIGMOD2003), June 2003, pp.16-27.

[3] Zhou R, Liu C, Li J. Fast ELCA computation for keywordqueries on XML data. In Proc. the 13th InternationalConference on Extending Database Technology (EDBT2010),March 2010, pp.549-560.

[4] Xu Y, Papakonstantinou Y. Efficient keyword search forsmallest LCAs in XML databases. In Proc. the ACMSIGMOD International Conference on Management of Data(SIGMOD2005), June 2005, pp.527-538.

[5] Li Y, Yu C, Jagadish H V. Schema-free xQuery. In Proc.the 30th International Conference on Very Large Data Bases(VLDB2004), Aug. 31-Sept. 3, 2004, pp.72-83.

[6] Chen L J, Papakonstantinou P. Supporting top-K keywordsearch in XML databases. In Proc. the 26th InternationalConference on Data Engineering (ICDE 2010), March 2010,pp.689-700.

[7] Sun C, Chan C Y, Goenka A K. Multiway SLCA-based key-word search in XML data. In Proc. the 16th InternationalConference on World Wide Web (WWW2007), May 2007,pp.1043-1052.

[8] Liu Z, Chen Y. Identifying meaningful return informationfor XML keyword search. In Proc. the 2007 ACM SIG-MOD International Conference on Management of Data(SIGMOD2007), June 2007, pp.329-340.

[9] Bao Z, Ling T W, Chen B, Lu J. Effective XML keywordsearch with relevance oriented ranking. In Proc. the 25thInternational Conference on Data Engineering (ICDE 2009),March 2009, pp.517-528.

[10] Li G, Feng J, Wang J, Zhou L. Effective keyword searchfor valuable LCAs over XML documents. In Proc. the 6thACM Conference on Information and Knowledge Manage-ment (CIKM2007), November 2007, pp.31-40.

[11] Li G, Ji S, Li C, Feng J. Efficient type-ahead search on rela-tional data: A TASTIER approach. In Proc. the 2009 ACMSIGMOD International Conference on Management of Data(SIGMOD2009), June 29-July 2, 2009, pp.695-706.

[12] Tatarinov I, Viglas S, Beyer K S, Shanmugasundaram J,Shekita E J, Zhang C. Storing and querying ordered XMLusing a relational database system. In Proc. the 2002 ACMSIGMOD International Conference on Management of Data(SIGMOD2002), June 2002, pp.204-215.

[13] Xu Y, Papakonstantinou Y. Efficient LCA based keywordsearch in XML data. In Proc. the 11th International Confer-ence on Extending Database Technology (EDBT2008), March2008, pp.535-546.

[14] Chen Y, Wang W, Liu Z, Lin X. Keyword search on struc-tured and semi-structured data. In Proc. the 2009 ACMSIGMOD International Conference on Management of Data(SIGMOD2009), June 29-July 2, 2009, pp.1005-1010.

[15] Chen Y, Wang W, Liu Z. Keyword-based search and explo-ration on databases. In Proc. the 27th International Confer-ence on Data Engineering (ICDE 2011), April 2011, pp.1380-1383.
No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] 刘明业; 洪恩宇;. Some Covering Problems and Their Solutions in Automatic Logic Synthesis Systems[J]. , 1986, 1(2): 83 -92 .
[2] 陈世华;. On the Structure of (Weak) Inverses of an (Weakly) Invertible Finite Automaton[J]. , 1986, 1(3): 92 -100 .
[3] 高庆狮; 张祥; 杨树范; 陈树清;. Vector Computer 757[J]. , 1986, 1(3): 1 -14 .
[4] 陈肇雄; 高庆狮;. A Substitution Based Model for the Implementation of PROLOG——The Design and Implementation of LPROLOG[J]. , 1986, 1(4): 17 -26 .
[5] 黄河燕;. A Parallel Implementation Model of HPARLOG[J]. , 1986, 1(4): 27 -38 .
[6] 闵应骅; 韩智德;. A Built-in Test Pattern Generator[J]. , 1986, 1(4): 62 -74 .
[7] 唐同诰; 招兆铿;. Stack Method in Program Semantics[J]. , 1987, 2(1): 51 -63 .
[8] 闵应骅;. Easy Test Generation PLAs[J]. , 1987, 2(1): 72 -80 .
[9] 张钹; 张铃;. Statistical Heuristic Search[J]. , 1987, 2(1): 1 -11 .
[10] 朱鸿;. Some Mathematical Properties of the Functional Programming Language FP[J]. , 1987, 2(3): 202 -216 .
版权所有 © 《计算机科学技术学报》编辑部
本系统由北京玛格泰克科技发展有限公司设计开发 技术支持:support@magtech.com.cn
总访问量: