2011, Vol. 26, Issue (1): 81-98. DOI: 10.1007/s11390-011-1113-z

Special Issue: Artificial Intelligence and Pattern Recognition

• Special Section on Natural Language Processing •

Transfer Learning via Multi-View Principal Component Analysis

Yang-Sheng Ji (吉阳生), Jia-Jun Chen (陈家骏), Member, CCF, Gang Niu (牛罡), Lin Shang (商琳), Member, CCF, and Xin-Yu Dai (戴新宇), Member, CCF   

  1. Department of Computer Science and Technology, Nanjing University, Nanjing 210093, China;
    National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China
  • Received: 2009-12-31  Revised: 2010-11-12  Online: 2011-01-01  Published: 2011-01-01
  • About author:Yang-Sheng Ji received his Bachelor's degree in computer science from Nanjing University, China, in 2006, and he is now pursuing a Ph.D. degree in the Department of Computer Science and Technology, Nanjing University. He was an intern in the Machine Learning Group at Microsoft Research Asia in 2008. From November 2009 to November 2010, he visited the University of Illinois at Urbana-Champaign, where he worked with Prof. Cheng-Xiang Zhai. His research interests include sentiment analysis, review summarization, and transfer learning.
    Jia-Jun Chen is a professor in the Department of Computer Science and Technology at Nanjing University, China. He received his Ph.D. degree in computer science from Nanjing University in 1998. Currently, Prof. Chen is the director of the Natural Language Processing Lab at Nanjing University. He is a member of the China Computer Federation (CCF) and a member of the Machine Translation Specialization Committee of the Chinese Information Processing Society of China. His research interests include machine translation, text categorization, and information extraction. His research has been supported by the National 863 High-Tech Program of China, the National Natural Science Foundation of China (NSFC), the Social Science Foundation of China, and the Natural Science Foundation of Jiangsu Province of China.
    Gang Niu received his Master's degree from the Department of Computer Science and Technology at Nanjing University, China. He is currently a Ph.D. candidate in the Department of Computer Science at Tokyo Institute of Technology, Japan. His research interests include support vector machines, spectral clustering, and manifold learning algorithms.
    Lin Shang is an associate professor in the Department of Computer Science and Technology at Nanjing University. She received her B.S. and M.S. degrees from Lanzhou University, China, in 1995 and 1998 respectively, and her Ph.D. degree in computer science from Nanjing University in 2004. She visited the University of Regina, Canada, in May 2007. Her research interests mainly include artificial intelligence, machine learning, data mining, and rough sets, areas in which she has published over 50 papers in the past five years. She has served as a program committee member of RSKT'07, RSKT'08, and CRSSC 2006, 2007, 2008, and 2009, and as organization committee chair/co-chair of Agent 2008 and CRSSC 2010, China. She is a member of CCF.
    Xin-Yu Dai received his Ph.D. degree in computer science from Nanjing University, China, in 2005. He is currently an associate professor in the Department of Computer Science and Technology at Nanjing University. His research interests focus on machine translation and information retrieval. He is a member of CCF.
  • Supported by:

    This work is supported by the National Natural Science Foundation of China under Grant No. 61003112, the National Basic Research 973 Program of China under Grant No. 2010CB327903, the National High-Tech Research and Development 863 Program of China under Grant No. 2006AA010109, and the Natural Science Foundation of Jiangsu Province under Grant No. 2009233.

Transfer learning aims at leveraging the knowledge in labeled source domains to predict the unlabeled data in a target domain, where the data distributions differ across domains. Among the various methods for transfer learning, one class of algorithms first discovers the correspondence between bridge features and all the other domain-specific features, and then conducts transfer learning via this single-view correspondence. However, the single-view correspondence may prevent these algorithms from further improvement, because incorrect correlations can be discovered between features. To tackle this problem, we propose a new transfer learning method from a multi-view correspondence perspective, called the Multi-View Principal Component Analysis (MVPCA) approach. MVPCA discovers the correspondence between bridge features, which are representative across all domains, and the specific features of each domain respectively, and conducts transfer learning by dimensionality reduction in a multi-view way, which better depicts the knowledge transfer. Experiments show that MVPCA can significantly reduce the cross-domain prediction error of a baseline non-transfer method. Moreover, by incorporating multi-view correspondence information into a single-view transfer learning method, MVPCA can further improve the performance of one state-of-the-art single-view method.
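To make the setting concrete, the following is a minimal sketch (not the authors' actual MVPCA algorithm) of PCA-based dimensionality reduction applied per view, where each view pairs the shared bridge features with one domain's specific features; the function name `pca_transform`, the toy feature sizes, and the choice k=2 are illustrative assumptions.

```python
import numpy as np

def pca_transform(X, k):
    # Center the data and project onto the top-k principal directions.
    Xc = X - X.mean(axis=0)
    # SVD of the centered matrix; rows of Vt are the principal directions.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T, Vt[:k]

rng = np.random.default_rng(0)
# Toy setup: 5 bridge features shared by both domains,
# plus 3 domain-specific features in each domain.
src_bridge = rng.normal(size=(40, 5))
src_spec = rng.normal(size=(40, 3))
tgt_bridge = rng.normal(size=(30, 5))
tgt_spec = rng.normal(size=(30, 3))

# View 1: bridge + source-specific features; view 2: bridge + target-specific.
# Reducing each view separately yields low-dimensional representations in
# which the bridge features anchor the correspondence between domains.
src_view = np.hstack([src_bridge, src_spec])
tgt_view = np.hstack([tgt_bridge, tgt_spec])
Z_src, _ = pca_transform(src_view, k=2)
Z_tgt, _ = pca_transform(tgt_view, k=2)
print(Z_src.shape, Z_tgt.shape)  # (40, 2) (30, 2)
```

A classifier trained on `Z_src` could then be applied to `Z_tgt`; the paper's contribution is in how the multi-view reduction is coupled across domains, which this generic per-view PCA does not capture.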

[1] Blitzer J, McDonald R, Pereira F. Domain adaptation with structural correspondence learning. In Proc. the Conference on Empirical Methods in Natural Language Processing, Sydney, Australia, Jul. 22-23, 2006, pp.120-128.
[2] Blitzer J, Dredze M, Pereira F. Biographies, Bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. In Proc. the 45th Annual Meeting of the Association of Computational Linguistics, Prague, Czech Republic, Jun. 25-27, 2007, pp.432-439.
[3] Dhillon I S, Mallela S, Modha D S. Information-theoretic co-clustering. In Proc. the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), Washington DC, USA, Aug. 24-27, 2003, pp.89-98.
[4] Pan S J, Tsang I W, Kwok J T, Yang Q. Domain adaptation via transfer component analysis. In Proc. the 21st International Joint Conference on Artificial Intelligence, Pasadena, USA, Jul. 11-17, 2009, pp.1187-1192.
[5] Daumé H. Frustratingly easy domain adaptation. In Proc. the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, Jun. 25-27, 2007, pp.256-263.
[6] Daumé H, Marcu D. Domain adaptation for statistical classifiers. Journal of Artificial Intelligence Research, 2006, 26: 101-126.
[7] Dai W Y, Xue G R, Yang Q, Yu Y. Co-clustering based classification for out-of-domain documents. In Proc. the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Jose, USA, Aug. 12-15, 2007, pp.210-219.
[8] Pan S J, Kwok J T, Yang Q, Pan J J. Adaptive localization in a dynamic WiFi environment through multi-view learning. In Proc. the 22nd AAAI Conference on Artificial Intelligence, Vancouver, Canada, Jul. 22-26, 2007, pp.1108-1113.
[9] Dai W Y, Yang Q, Xue G R, Yu Y. Boosting for transfer learning. In Proc. the 24th International Conference on Machine Learning, Corvallis, USA, Jun. 20-24, 2007, pp.193-200.
[10] Huang J, Smola A, Gretton A, Borgwardt K M, Schölkopf B. Correcting sample selection bias by unlabeled data. In Proc. the 19th Annual Conference on Neural Information Processing Systems, Vancouver, Canada, Dec. 4-6, 2006, pp.601-608.
[11] Zadrozny B. Learning and evaluating classifiers under sample selection bias. In Proc. the 21st International Conference on Machine Learning, Banff, Canada, Jul. 4-8, 2004, pp.903-910.
[12] Jiang J, Zhai C X. Instance weighting for domain adaptation in NLP. In Proc. the 45th Annual Meeting of the Association of Computational Linguistics, Prague, Czech Republic, Jun. 25-27, 2007, pp.264-271.
[13] Raina R, Battle A, Lee H, Packer B, Ng A Y. Self-taught learning: Transfer learning from unlabeled data. In Proc. the 24th International Conference on Machine Learning, Corvallis, USA, Jun. 20-24, 2007, pp.759-766.
[14] Lee H, Battle A, Raina R, Ng A Y. Efficient sparse coding algorithms. In Proc. the 19th Annual Conference on Neural Information Processing Systems, Vancouver, Canada, Dec. 4-6, 2006, pp.801-808.
[15] Tan S, Cheng X, Wang Y, Xu H. Adapting naive Bayes to domain adaptation for sentiment analysis. In Proc. the 31st European Conference on IR Research, Toulouse, France, Apr. 7-9, 2009, pp.337-349.
[16] Dai W Y, Xue G R, Yang Q, Yu Y. Transferring naive Bayes classifiers for text classification. In Proc. the 22nd AAAI Conference on Artificial Intelligence (AAAI 2007), Vancouver, Canada, Jul. 22-26, 2007, pp.540-545.
[17] Sandler T, Blitzer J, Talukdar P, Pereira F. Regularized learning with networks of features. In Proc. the 22nd Annual Conference on Neural Information Processing Systems, Vancouver, Canada, Dec. 8-11, 2008, pp.1401-1408.
[18] Vinokourov A, Shawe-Taylor J, Cristianini N. Inferring a semantic representation of text via cross-language correlation analysis. In Proc. the 15th Advances in Neural Information Processing Systems, Vancouver, Canada, Dec. 9-12, 2002, pp.1473-1480.
[19] Li Y, Shawe-Taylor J. Using KCCA for Japanese-English cross-language information retrieval and classification. Journal of Intelligent Information Systems, 2006, 27(2): 117-133.
[20] Hardoon D R, Szedmak S, Shawe-Taylor J. Canonical correlation analysis: An overview with application to learning methods. Neural Computation, 2004, 16(12): 2639-2664.
[21] Hotelling H. Relations between two sets of variates. Biometrika, 1936, 28(3/4): 321-377.
[22] Diethe T, Hardoon D R, Shawe-Taylor J. Multiview Fisher discriminant analysis. In Proc. the Learning from Multiple Sources Workshop (NIPS 2008), Whistler, Canada, Dec. 13, 2008.
[23] Zhou Z H, Zhan D C, Yang Q. Semi-supervised learning with very few labeled training examples. In Proc. the 22nd AAAI Conference on Artificial Intelligence (AAAI 2007), Vancouver, Canada, Jul. 22-26, 2007, pp.675-680.
[24] Zhang D Q, Zhou Z H, Chen S C. Semi-supervised dimensionality reduction. In Proc. the 7th SIAM International Conference on Data Mining (SDM 2007), Minneapolis, USA, Apr. 26-28, 2007, pp.629-634.
[25] Long C, Huang M L, Zhu X Y, Li M. A new approach for multi-document update summarization. Journal of Computer Science and Technology, 2010, 25(4): 739-749.


ISSN 1000-9000 (Print) / 1860-4749 (Online)
CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
Tel.:86-10-62610746
E-mail: jcst@ict.ac.cn