2014, Vol. 29, Issue (1): 90-104. doi: 10.1007/s11390-013-1414-5

Special Topic: Data Management and Data Mining

• Special Section on Selected Papers from NPC 2011 •


On Unsupervised Training of Multi-Class Regularized Least-Squares Classifiers

Tapio Pahikkala1, Member, ACM, IEEE, Antti Airola1, Fabian Gieseke2, and Oliver Kramer3   

1 Department of Information Technology, University of Turku, Turku 20520, Finland;
    2 Department of Computer Science, University of Copenhagen, Copenhagen K 1017, Denmark;
    3 Computer Science Department, Carl von Ossietzky University of Oldenburg, Oldenburg 26111, Germany
  • Received: 2013-09-01  Revised: 2013-10-28  Online: 2014-01-05  Published: 2014-01-05
  • Supported by:

    Tapio Pahikkala is supported by the Academy of Finland under Grant No. 134020 and Fabian Gieseke by the German Academic Exchange Service (DAAD).


Abstract: In this work we present the first efficient algorithm for unsupervised training of multi-class regularized least-squares classifiers. The approach is closely related to the unsupervised extension of the support vector machine classifier known as maximum margin clustering, which has recently received considerable attention, though mostly in the binary classification case. We present a combinatorial search scheme that combines steepest descent strategies with powerful meta-heuristics for avoiding bad local optima. The regularized least-squares based formulation of the problem allows us to use matrix algebraic optimization, enabling constant-time checks of the intermediate candidate solutions during the search. Our experimental evaluation indicates the potential of the novel method and demonstrates its superior clustering performance over a variety of competing methods on real-world datasets. Both time complexity analysis and experimental comparisons show that the method scales well to practical-sized problems.
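The core idea of the abstract — searching over label assignments by steepest descent on a regularized least-squares objective — can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it re-solves the RLS problem from scratch at every candidate step instead of using the constant-time matrix-algebraic updates described in the paper, and it omits the meta-heuristics for escaping local optima. All function names and the crude non-empty-class constraint (standing in for the balance constraints used in maximum margin clustering) are ours.

```python
import numpy as np

def rls_objective(X, Y, lam):
    """Value of the regularized least-squares objective for a +/-1 label matrix Y.

    Solves W = argmin ||XW - Y||^2 + lam * ||W||^2 in closed form and
    returns the attained objective value.
    """
    d = X.shape[1]
    W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)
    R = X @ W - Y
    return float(np.sum(R * R) + lam * np.sum(W * W))

def unsupervised_rls_cluster(X, n_classes, lam=1.0, max_sweeps=20, seed=0):
    """Greedy steepest-descent search over hard label assignments.

    Starts from a random labeling and repeatedly moves single points to the
    class that most reduces the RLS objective, keeping every class non-empty.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    labels = rng.integers(0, n_classes, size=n)
    labels[:n_classes] = np.arange(n_classes)  # ensure every class is occupied

    def onehot(lab):
        # One-vs-all +/-1 target coding, as is standard for multi-class RLS.
        Y = -np.ones((n, n_classes))
        Y[np.arange(n), lab] = 1.0
        return Y

    best = rls_objective(X, onehot(labels), lam)
    for _ in range(max_sweeps):
        improved = False
        for i in range(n):
            old = labels[i]
            if np.sum(labels == old) == 1:
                continue  # moving i would empty its class
            for c in range(n_classes):
                if c == old:
                    continue
                labels[i] = c
                val = rls_objective(X, onehot(labels), lam)
                if val < best - 1e-12:
                    best, old, improved = val, c, True
                else:
                    labels[i] = old
            labels[i] = old  # keep the best class found for point i
        if not improved:
            break  # local optimum of the single-move neighborhood
    return labels, best
```

Each sweep here costs one full RLS solve per candidate move; the paper's contribution is precisely to reduce each such check to constant time via matrix algebra, which is what makes the search practical at scale.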

Copyright © Editorial Office of the Journal of Computer Science and Technology