Special Issue: Data Management and Data Mining


Analyzing Sequential Patterns in Retail Databases

Unil Yun   

  1. Electronics and Telecommunications Research Institute, Telematics & USN Research Division LBS/Telematics Convergence Research Team, 161 Gajeong-dong, Yuseong-gu, Daejeon, 305-700, Korea
  • Received: 2006-09-03  Revised: 2006-11-17  Online: 2007-03-10  Published: 2007-03-10

Mining correlated sequential patterns in large sequence databases is an essential data mining task: a huge number of sequential patterns are usually mined, yet it is hard to identify those whose items are actually correlated. Different real applications also call for different kinds of analysis. Previous approaches mine the sequential patterns first, and the result still contains patterns with weak affinity even when the minimum support is high. This paper proposes a new framework for mining weighted support affinity patterns, in which an objective measure, sequential ws-confidence, is developed to detect correlated sequential patterns with weighted support affinity. To prune weak affinity patterns efficiently, the ws-confidence measure is proved to satisfy the anti-monotone and cross weighted support properties, which can be applied to eliminate sequential patterns with dissimilar weighted support levels. Based on this framework, a weighted support affinity pattern mining algorithm, WSMiner, is presented. The performance study shows that WSMiner is efficient and scalable for mining weighted support affinity patterns.





ISSN 1000-9000 (Print), 1860-4749 (Online)
CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences