Loading [MathJax]/jax/output/SVG/jax.js
We use cookies to improve your experience with our site.

Indexed in:

SCIE, EI, Scopus, INSPEC, DBLP, CSCD, etc.

Submission System
(Author / Reviewer / Editor)
Jian-Hua Feng, Qian Qian, Jian-Yong Wang, Li-Zhu Zhou. Efficient Mining of Frequent Closed XML Query Pattern[J]. Journal of Computer Science and Technology, 2007, 22(5): 725-735.
Citation: Jian-Hua Feng, Qian Qian, Jian-Yong Wang, Li-Zhu Zhou. Efficient Mining of Frequent Closed XML Query Pattern[J]. Journal of Computer Science and Technology, 2007, 22(5): 725-735.

Efficient Mining of Frequent Closed XML Query Pattern

More Information
  • Received Date: October 11, 2006
  • Revised Date: April 02, 2007
  • Published Date: September 14, 2007
  • Previous research works have presented convincing arguments that afrequent pattern mining algorithm should not mine all frequent but onlythe closed ones because the latter leads to not only more compact yetcomplete result set but also better efficiency. Upon discovery offrequent closed XML query patterns, indexing and caching can beeffectively adopted for query performance enhancement. Most of the previousalgorithms for finding frequent patterns basically introduced astraightforward generate-and-test strategy. In this paper, wepresent SOLARIA, an efficient algorithm for mining frequentclosed XML query patterns without candidate maintenance and costlytree-containment checking. Efficient algorithm of sequence mining isinvolved in discovering frequent tree-structured patterns, which aimsat replacing expensive containment testing with cheap parent-childchecking in sequences. SOLARIA deeply prunes unrelated searchspace for frequent pattern enumeration by parent-child relationshipconstraint. By a thorough experimental study on various real-life data,we demonstrate the efficiency and scalability of SOLARIA overthe previous known alternative. SOLARIA is also linearlyscalable in terms of XML queries' size.
  • [1]
    Chen Q, Lim A -\it et al}. D(k)-index: An adaptive structural summary for graph-structured data. In -\it Proc. the ACM SIGMOD Int. Conf. Management of Data}, San Diego, CA, USA, Jun. 912, 2003, pp.134144.
    [2]
    Kaushik R, Shenoy P -\it et al}. Exploiting local similarity for efficient indexing of paths in graph structured data. In -\it Proc. the 18th Int. Conf. Data Engineering}, San Jose, CA, USA, Feb. 26Mar. 1, 2002, pp.129140.
    [3]
    Milo T, Suciu D. Index structures for path expressions. In -\it Proc. the 7th Int. Conf. Database Theory}, Jerusalem, Israel, Jan. 1012, 1999, pp.277295.
    [4]
    Yang L H, Lee M L -\it et al}. Efficient mining of XML query patterns for caching. In -\it Proc. the 29th Int. Conf. Very Large Data Bases}, Berlin, Germany, Sept. 912, 2003, pp.6980.
    [5]
    Yan X, Han J -\it et al}. Mining closed sequential patterns in large databases. In -\it Proc. the 3rd SIAM Int. Conf. Data Mining}, San Francisco, CA, USA, May 13, 2003, Electronic Edition.
    [6]
    Dehaspe L, Toivonen H -\it et al}. Finding frequent substructures in chemical compounds. In -\it Proc. 4th Int. Conf. Knowledge Discovery and Data Mining}, New York, USA, Aug. 2731, 1998, pp.3036.
    [7]
    Bettini C, Wang X -\it et al}. Mining temporal relationals with multiple granularities in time sequences. -\it IEEE Data Engineering Bulletin}, 1998, 21(1): 3238.
    [8]
    Pei J, Han J -\it et al}. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. In -\it Proc. the 18th Int. Conf. Data Engineering}, Heidelberg, Germany, April 26, 2001, pp.215224.
    [9]
    Feng J, Qian Q -\it et al}. Exploit sequencing to accelerate hot XML query pattern mining. In -\it Proc. the 2006 ACM Symp. Applied Computing}, Dijon, France, Apr. 2327, 2006, pp.517524.
    [10]
    Qian Q, Feng J -\it et al}. Exploit sequencing to accelerate XML twig query answering. In -\it Proc. the 11th Int. Conf. Database Systems for Advanced Applications}, Singapore, Apr. 1215, 2006, pp.279294.
    [11]
    Wang J, Han J. BIDE: Efficient mining of frequent closed sequences. In -\it Proc. the 20th Int. Conf. Data Engineering}, Boston, MA, USA, Mar. 30Apr. 2, 2004, pp.7990.
    [12]
    Kuramochi M, Karypis G. Frequent subgraph discovery. In -\it Proc. the 1st IEEE Int. Conf. Data Mining}, San Jose, CA, USA, Nov. 29Dec. 2, 2001, pp.313320.
    [13]
    Agrawal R, Srikant R. Fast algorithms for mining association rules. In -\it Proc. the 20th Int. Conf. Very Large Data Bases}, Santiago de Chile, Chile, Sept. 1215, 1994, pp.487499.
    [14]
    Zaki M. Efficiently mining frequent trees in a forest. In -\it Proc. the 8th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining}, Edmonton, Alberta, Canada, Jul. 2326, 2002, pp.7180.
    [15]
    Asai T, Abe K -\it et al}. Efficient substructure discovery from large semi-structured data. In -\it Proc. the 2nd SIAM Int. Conf. Data Mining}, Arlington, VA, USA, Apr. 1113, 2002, Electronic Edition.
    [16]
    Termier A, Rousset M C -\it et al}. TreeFinder: A first step towards XML data mining. In -\it Proc. the 2nd IEEE Int. Conf. Data Mining}, Maebashi, Japan, Dec.~912, 2002, pp.450457.
    [17]
    Han J, Pei J -\it et al}. FreeSpan: Frequent pattern-projected sequential pattern mining. In -\it Proc. the 6th ACM SIGKDD Int. Conf. Knowledge Discovery and Data Mining}, Boston, MA, USA, Aug. 2023, 2000, pp.355359.
    [18]
    Masseglia F, Cathala F -\it et al}. The PSP approach for mining sequential patterns. In -\it Proc. the 2nd European Symp. Principles of Data Mining and Knowledge Discovery}, Nantes, France, Sept. 2326, 1998, pp.176184.
    [19]
    Srikant R, Agrawal R. Mining sequential patterns: Generalizations and performance improvements. In -\it Proc. the 5th Int. Conf. Extending Database Technology}, Avignon, France, Mar. 2529, 1996, pp.317.
    [20]
    Ozden B, Ramaswamy S -\it et al}. Cyclic association rules. In -\it Proc. the 14th Int. Conf. Data Engineering}, Orlando, Florida, USA, Feb. 2327, 1998, pp.412421.
    [21]
    Han J, Dong G -\it et al}. Efficient mining of partial periodic patterns in time series database. In -\it Proc. the 18th Int. Conf. Data Engineering}, Sydney, Australia, Mar.~2326, 1999, pp.106115.
    [22]
    Yang J, Yu P S -\it et al}. Mining long sequential patterns in a noisy environment. In -\it Proc. 2003 ACM SIGMOD Int. Conf. Management of Data}, Madison, WI, USA, Jun. 36, 2002, pp.406417.
    [23]
    Chi Y, Xia Y -\it et al}. Mining closed and maximal frequent subtrees from databases of labeled rooted trees. -\it IEEE Trans. Knowledge and Data Engineering}, 2005, 17(2): 190202.
    [24]
    Berglund A, Boag S -\it et al}. XML path language (XPath) 2.0, W3C Candidate Recommendation, June, 2006, http://www.w3.org/TR/xpath20/.
    [25]
    Boag S, Chamberlin D -\it et al}. XQuery 1.0: An XML query language. W3C Candidate Recommendation, June, 2006, http://www.w3.org/TR/xquery.
    [26]
    Raw P R, Moon B. PRIX: Indexing and querying XML using Prufer sequences. In -\it Proc. the 20th Int. Conf. Data Engineering}, Boston, MA, USA, Mar.~30Apr.~2, 2004, pp.288300.
    [27]
    Picciotto S. How to encode a tree
    [Dissertation]. University of California, San Diego, USA, 1999.
    [28]
    Yang L, Lee M L -\it et al}. Mining frequent query patterns from XML queries. In -\it Proc. the 8th Int. Conf. Database Systems for Advanced Applications}, Kyoto, Japan, Mar. 2628, 2003, pp.355362.
  • Related Articles

    [1]Jie Tang, Pollawat Thanarungroj, Chen Liu, Shao-Shan Liu, Zhi-Min Gu, Jean-Luc Gaudiot. Pinned OS/Services: A Case Study of XML Parsing on Intel SCC[J]. Journal of Computer Science and Technology, 2013, 28(1): 3-13. DOI: 10.1007/s11390-013-1308-6
    [2]Jun-Feng Zhou, Tok Wang Ling, Zhi-Feng Bao, Xiao-Feng Meng. Related Axis: The Extension to XPath Towards Effective XML Search[J]. Journal of Computer Science and Technology, 2012, 27(1): 195-212. DOI: 10.1007/s11390-012-1217-0
    [3]Guo-Liang Li, Jian-Hua Feng. An Effective Semantic Cache for Exploiting XPath Query/View Answerability[J]. Journal of Computer Science and Technology, 2010, 25(2): 347-361.
    [4]Jian-Hua Feng, Guo-Liang Li, Na Ta. A Semantic Cache Framework for Secure XML Queries[J]. Journal of Computer Science and Technology, 2008, 23(6): 988-997.
    [5]Byron Choi, Gao Cong, Wenfei Fan, Stratis D. Viglas. Updating Recursive XML Views of Relations[J]. Journal of Computer Science and Technology, 2008, 23(4): 516-537.
    [6]Jian-Hua Feng, Yu-Guo Liao, Yong Zhang. HCH for checking containment of XPath fragment[J]. Journal of Computer Science and Technology, 2007, 22(5): 736-748.
    [7]Guo-Ren Wang, Xiao-Lin Zhang. Declarative XML Update Language Based on a Higher Data Model[J]. Journal of Computer Science and Technology, 2005, 20(3): 373-377.
    [8]Dun-Ren Che. Accomplishing Deterministic XML Query Optimization[J]. Journal of Computer Science and Technology, 2005, 20(3): 357-366.
    [9]LU Zhengding, LI Chunlin, LI Layuan. Coordinating Mobile Agents by the XML-Based Tuple Space[J]. Journal of Computer Science and Technology, 2002, 17(6).
    [10]Sun Yudong, Xie Zhiliang. Macro-Dataflow Computational Model and Its Simulation[J]. Journal of Computer Science and Technology, 1990, 5(3): 289-295.

Catalog

    Article views (19) PDF downloads (6368) Cited by()
    Related

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return