›› 2012, Vol. 27 ›› Issue (1): 195-212.doi: 10.1007/s11390-012-1217-0

• Database and Data Management • Previous Articles     Next Articles

Related Axis: The Extension to XPath Towards Effective XML Search

Jun-Feng Zhou1,2 (周军锋), Member, CCF, Tok Wang Ling3 (林卓旺), Senior Member, ACM, IEEE, Zhi-Feng Bao3 (鲍芝峰), and Xiao-Feng Meng2 (孟小峰), Senior Member, CCF, Member, ACM, IEEE   

  1. 1. School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China;
    2. School of Information, Renmin University of China, Beijing 100872, China;
    3. School of Computing, National University of Singapore, 117417, Singapore
  • Received:2010-09-06 Revised:2011-07-22 Online:2012-01-05 Published:2012-01-05
  • Supported by:

    This research was partially supported by the National Science and Technology Major Project of China under Grant No. 2010ZX01042-002-003, the National Natural Science Foundation of China under Grant Nos. 61073060, 61040023, 61070055, 91024032, the Fundamental Research Funds for the Central Universities of China, and the Research Funds of Renmin University of China under Grant No. 10XNI018.

We investigate the limitations of existing XML search methods and propose a new semantics, related relation-ship, to effectively capture meaningful relationships of data elements from XML data in the absence of structural constraints. Then we make an extension to XPath by introducing a new axis, related axis, to specify the related relationship between query nodes so as to enhance the flexibility of XPath. We propose to reduce the cost of computing the related relationship by a new schema summary that summarizes the related relationship from the original schema without any loss. Based on this schema summary, we introduce two indices to improve the performance of query processing. Our algorithm shows that the evaluation of most queries can be equivalently transformed into just a few selection and value join operations, thus avoids the costly structural join operations. The experimental results show that our method is effective and efficient in terms of comparing the effectiveness of the related relationship with existing keyword search semantics and comparing the efficiency of our evaluation methods with existing query engines.

[1] Christophides V, Cluet S, Sim?eon S. On wrapping query lan-guages and efficient XML integration. In Proc. the 2000 ACMSIGMOD International Conference on Management of Data(SIGMOD2000), Dallas, USA, May 14-19, 2000, pp.141-152.

[2] Bruno N, Koudas N, Srivastava D. Holistic twig joins: Op-timal XML pattern matching. In Proc. the 2002 ACMSIGMOD International Conference on Management of Data(SIGMOD2002), Madison, USA, June 3-6, 2002, pp.310-321.

[3] Jiang H, Wang W, Lu H, Yu J X. Holistic twig joins onindexed XML documents. In Proc. the 29th InternationalConference on Very Large Data Bases (VLDB2003), Berlin,Germany, Sept. 12-13, 2003, pp.273-284.

[4] Lu J, Ling T W, Chan C Y, Chen T. From region encoding toextended dewey: On efficient processing of XML twig patternmatching. In Proc. the 31st International Conference onVery Large Data Bases (VLDB2005), Trondheim, Norway,Aug. 30-Sept. 2, 2005, pp.193-204.

[5] Chen T, Lu J, Ling T W. On boosting holism in XMLtwig pattern matching using structural indexing techniques.In Proc. the ACM SIGMOD International Conference onManagement of Data (SIGMOD2005), Baltimore, USA, June13-16, 2005, pp.455-466.

[6] Chen S, Li H, Tatemura J et al. Twig2Stack: Bottom-upprocessing of generalized-tree-pattern queries over XML doc-uments. In Proc. the 32nd International Conference on VeryLarge Data Bases (VLDB2006), Seoul, Korea, Sept. 12-15,2006, pp.283-294.

[7] Xu Y, Papakonstantinou Y. Efficient keyword search forsmallest LCAs in XML databases. In Proc. the ACMSIGMOD International Conference on Management of Data(SIGMOD2005), Baltimore, USA, June 13-16, 2005, pp.527-538.

[8] Guo L, Shao F, Botev C, Shanmugasunda J. XRANK: Rankedkeyword search over XML documents. In Proc. the 2003ACM SIGMOD International Conference on Managementof Data (SIGMOD2003), San Diego, USA, June 9-12, 2003,pp.16-27.

[9] Cohen S, Mamou J, Kanza Y, Sagiv Y. XSEarch: A seman-tic search engine for XML. In Proc. the 29th InternationalConference on Very Large Data Bases (VLDB2003), Berlin,Germany, Sept. 12-13, 2003, pp.45-56.

[10] Hristidis V, Papakonstantinou Y, Balmin A. Keyword proxi-mity search on XML graphs. In Proc. the 19th InternationalConference on Data Engineering (ICDE2003), Bangalor,India, March 5-8, 2003, pp.367-378.

[11] Liu Z, Chen Y. Reasoning and identifying relevant matchesfor XML keyword search. In Proc. the VLDB Endowment,Aug. 2008, 1(1): 921-932.

[12] Cohen S, Kanza Y, Kimelfeld B, Sagir Y. Interconnection se-mantics for keyword search in XML. In Proc. the 14th ACMCIKM International Conference on Information and Know-ledge Management (CIKM2005), Bremen, Germany, Oct. 31-Nov. 5, 2005, pp.389-396.

[13] Liu Z, Chen Y. Identifying meaningful return information forXML keyword search. In Proc. International Conference onManagement of Data (SIGMOD2007), Beijing, China, June12-14, 2007, pp.329-340.

[14] Li G, Feng J, Wang J, Zhou L. Effective keyword searchfor valuable LCAs over XML documents. In Proc. the6th ACM Conf. Information and Knowledge Management(CIKM2007), Lisbon, Portugal, Nov. 6-9, 2007, pp.31-40.

[15] Li Y, Yu C, Jagadish H V. Schema-free XQuery. InProc. the 30th International Conf. Very Large Data Bases(VLDB2004), Toronto, Canada, Aug. 29-Sept. 3, 2004, pp.72-83.

[16] Yu C, Jagadish H V. Querying complex structured databases.In Proc. the 33rd International Conf. Very Large Data Bases(VLDB2007), Vienna, Austria, Sept. 23-28, 2007, pp.1010-1021.

[17] Chen L J, Papakonstantinou Y. Supporting top-K keywordsearch in XML databases. In Proc. the 26th InternationalConference on Data Engineering (ICDE2010), Long Beach,USA, March 1-6, 2010, pp.689-700.

[18] Zhou R, Liu C, Li J. Fast ELCA computation for keywordqueries on XML data. In Proc. the 13th InternationalConference on Extending Database Technology (EDBT2010),Lausanne, Switzerland, March 22-26, 2010, pp.549-560.

[19] Kong L, Gilleron R, Lemay A. Retrieving meaningful relaxedtightest fragments for XML keyword search. In Proc. the12th International Conference on Extending Database Tech-nology (EDBT2009), Saint-Petersburg, Russia, Mar. 23-26,2009, pp.815-826.

[20] Bao Z, Ling T W, Chen B, Lu J. Effective XML keywordsearch with relevance oriented ranking. In Proc. the 25thInternational Conference on Data Engineering (ICDE2009),Shanghai, China, March 29-April 2, 2009, pp.517-528.

[21] Li J, Liu C, Zhou R, Wang W. Suggestion of promising re-sult types for XML keyword search. In Proc. the 13thInternational Conference on Extending Database Techno-logy (EDBT2010), Lausanne, Switzerland, Mar. 22-26, 2010,pp.561-572.

[22] Li G, Li C, Feng J, Zhou L. SAIL: Structure-aware indexingfor effective and progressive top-k keyword search over XMLdocuments. Inf. Sci., 2009, 179(21): 3745-3762.

[23] Feng J, Li G, Wang J, Zhou L. Finding and ranking com-pact connected trees for effective keyword proximity search inXML documents. Inf. Syst., 2010, 35(2): 186-203.

[24] Trotman A, Sigurbj?ornsson B. NEXI, now and next. In Proc.the 3rd International Workshop of the Initiative for the Eval-uation of XML Retrieval (INEX2004), Dagstuhl Castle, Ger-many, Dec. 6-8, 2004, pp.41-53.

[25] Fuhr N, Grofijohann K. XIRQL: A query language for infor-mation retrieval in XML documents. In Proc. the 24th An-nual International ACM SIGIR Conference on Research andDevelopment in Information Retrieval (SIGIR2001), NewOrleans, USA, Sept. 9-21, 2001, pp.172-180.

[26] Theobald M, Schenkel R, Weikum G. An efficient and versa-tile query engine for Topx search. In Proc. the 31st Interna-tional Conference on Very Large Data Bases (VLDB2005),Trondheim, Norway, Aug. 30-Sept. 2, 2005, pp.625-636.

[27] Yu C, Jagadish H V. Schema summarization. In Proc. the32nd International Conference on Very Large Data Bases(VLDB2006), Seoul, Korea, Sept. 12-15, 2006, pp.319-330.

[28] Hristidis V, Papakonstantinou Y. DISCOVER: Keywordsearch in relational databases. In Proc. the 28th Interna-tional Conference on Very Large Data Bases (VLDB2002),Hong Kong, China, Aug. 20-23, 2002, pp.670-681.

[29] Pal S, Cseri I, Seeliger O, Schaller G, Giakoumakis L, ZolotovV. Indexing XML data stored in a relational database. InProc. the 30th International Conference on Very Large DataBases (VLDB2004), Toronto, Canada, Aug. 29-Sept. 3, 2004,pp.1134-1145.

[30] Bex G J, Neven F, Vansummeren S. Inferring XML schemadefinitions from XML data. In Proc. the 33rd InternationalConference on Very Large Data Bases (VLDB2007), Vienna,Austria, Sept. 23-28, 2007, pp.998-1009.

[31] Bex G J, Neven F, Schwentick T, Tuyls K. Inference of con-cise DTDs from XML data. In Proc. the 32nd InternationalConference on Very Large Data Bases (VLDB2006), Seoul,Korea, Sept. 12-15, 2006, pp.115-126.

[32] Bernstein P A, Melnik S, Mork P. Interactive schema transla-tion with instance-level mappings. In Proc. the 31st Interna-tional Conference on Very Large Data Bases (VLDB2005),Trondheim, Norway, Aug. 30-Sept. 2, 2005, pp.1283-1286.

[33] Vries A P, Vercoustre A M, Thom J A, Craswell N, LalmasM. Overview of the INEX 2007 entity ranking track. In Proc.the 6th International Workshop of the Initiative for the Eval-uation of XML Retrieval (INEX2007), Germany: Springer,2007, pp.245-251.

[34] Golenberg K, Kimelfeld B, Sagir Y. Keyword proximitysearch in complex data graphs. In Proc. the ACM SIG-MOD International Conference on Management of Data(SIGMOD2008), Vancouver, Canada, June 10-12, 2008,pp.927-940.

[35] Reich G, Widmayer P. Beyond Steiner's problem: A VLSIoriented generalization. In Proc. the 15th InternationalWorkshop on Graph-Theoretic Concepts in Computer Sci-ence (WG1989), Castle Rolduc, Netherlands, June 14-16,1989, pp.196-210.

[36] Yu J X, Qin L, Chang L. Keyword search in relationaldatabases: A survey. IEEE Data Eng. Bull., 2010, 33(1):67-78.

[37] Chen Y, Wang W, Liu Z, Lin X. Keyword search on struc-tured and semi-structured data. In Proc. the InternationalConference on Management of Data (SIGMOD2009), Provi-dence, USA, June 29-July 2, 2009, pp.1005-1010.

[38] Amer-Yahiq S, Kondas N, Marian A, Srivastava D, TomanD. Structure and content scoring for XML. In Proc. the31st International Conference on Very Large Data Bases(VLDB2005), Trondheim, Norway, Aug. 30-Sept. 2, 2005,pp.361-372.

[39] Zhang S, Dyreson C. Symmetrically exploiting XML. InProc. the 15th International Conference on World Wide Web(WWW2006), Edinburgh, UK, May 22-26, 2006, pp.103-111.

[40] Botan I, Fischer P M, Florescu D, Kossmann D, Kraska T,Tamosevicius R. Extending XQuery with window functions.In Proc. the 33rd International Conference on Very LargeData Bases (VLDB2007), Vienna, Austria, Sept. 23-28, 2007,pp.75-86.
No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
[1] Wu Xindong;. Inductive Learning[J]. , 1993, 8(2): 22 -36 .
[2] Tang Weiqing; Wen Sili; Liu Shenquan;. An Object-Oriented Model ofUser Interface Generation Tool[J]. , 1994, 9(3): 275 -284 .
[3] Schubert Foo; Siu Cheung Hui;. System Architectural Design for DeliveringVideo Mail over the World-Wide-Web[J]. , 1997, 12(4): 372 -385 .
[4] SUN Yongqiang; LIN Kai; LU Chaojun;. Partial Completion of Equational Theories[J]. , 2000, 15(6): 552 -559 .
[5] Heng Li, Jin-Song Liu, Zhao Xu et al.. Test Data Sets and Evaluation of Gene Prediction Programs on the Rice Genome[J]. , 2005, 20(4): 446 -453 .
[6] Hua Li, Shui-Cheng Yan, and Li-Zhong Peng[1]. Robust Non-Frontal Face Alignment with Edge Based Texture[J]. , 2005, 20(6): 849 -854 .
[7] Hong Mei, Dong-Gang Cao, and Fu-Qing Yang. Development of Software Engineering: A Research Perspective[J]. , 2006, 21(5): 682 -696 .
[8] Wen Zheng, Jun-Hai Yong, and Jean-Claude Paul. Visual Simulation of Multiple Unmixable Fluids[J]. , 2007, 22(1): 156 -160 .
[9] Dong-Nian Cheng, Yu-Xiang Hu, and Cai-Xia Liu. Parallel Algorithm Core: A Novel IPSec Algorithm Engine for Both Exploiting Parallelism and Improving Scalability[J]. , 2008, 23(5 ): 792 -805 .
[10] Guo-Dong Zhou (周国栋), Senior Member, CCF, Member, ACM, IEEE and Qiao-Ming Zhu (朱巧明), Senior Member, CCF. Kernel-Based Semantic Relation Detection and Classification via Enriched Parse Tree Structure[J]. , 2011, 26(1): 45 -56 .

ISSN 1000-9000(Print)

         1860-4749(Online)
CN 11-2296/TP

Home
Editorial Board
Author Guidelines
Subscription
Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
Tel.:86-10-62610746
E-mail: jcst@ict.ac.cn
 
  Copyright ©2015 JCST, All Rights Reserved