Wei Wu, Hang Li, Yun-Hua Hu, Rong Jin. A Kernel Approach to Multi-Task Learning with Task-Specific Kernels[J]. Journal of Computer Science and Technology, 2012, 27(6): 1289-1301. DOI: 10.1007/s11390-012-1305-1

A Kernel Approach to Multi-Task Learning with Task-Specific Kernels

  • Received Date: September 14, 2011
  • Revised Date: April 26, 2012
  • Published Date: November 04, 2012
  • Several kernel-based methods for multi-task learning have been proposed that use relations among tasks as a regularizer to improve overall learning accuracy. These methods assume that all tasks share the same kernel, which can limit their applicability because in practice different tasks may need different kernels. The main challenge in introducing multiple kernels into multiple tasks is that models from different reproducing kernel Hilbert spaces (RKHSs) are not comparable, which makes it difficult to exploit relations among tasks. This paper addresses the challenge by formalizing the problem in the square integrable space (SIS). Specifically, it proposes a kernel-based method that uses a regularization term defined in SIS to represent task relations. We prove a new representer theorem for the proposed approach in SIS. We further derive a practical method for solving the learning problem and conduct a consistency analysis of the method. We discuss the relationship between our method and an existing method. We also give a support vector machine (SVM)-based implementation of our method for multi-label classification. Experiments on an artificial example and two real-world datasets show that the proposed method performs better than the existing method.
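
The idea of relating tasks that use different kernels can be illustrated with a small sketch. It is not the paper's algorithm: it assumes two regression tasks, a squared loss, Gaussian kernels with different widths, and a coupling term that estimates the squared distance between the two learned functions by averaging over a shared sample `Z`. This works because functions from different RKHSs can still be compared as square integrable functions. The names `rbf`, `fit_two_tasks`, `gammas`, `lam`, and `mu` are hypothetical and chosen for illustration only.

```python
import numpy as np

def rbf(A, B, gamma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def fit_two_tasks(X1, y1, X2, y2, Z, gammas=(0.5, 5.0), lam=1e-2, mu=1.0):
    """Fit f_t(x) = sum_i a_t[i] * k_t(X_t[i], x) for t = 1, 2.

    Objective (an illustrative assumption, not the paper's formulation):
      sum_t [ ||K_t a_t - y_t||^2 / n_t + lam * a_t' K_t a_t ]
      + mu * mean over z in Z of (f_1(z) - f_2(z))^2
    The last term is a sample-based estimate of the squared distance
    between f_1 and f_2 as square integrable functions.
    """
    g1, g2 = gammas
    K1, K2 = rbf(X1, X1, g1), rbf(X2, X2, g2)   # task-specific kernels
    G1, G2 = rbf(Z, X1, g1), rbf(Z, X2, g2)     # G_t @ a_t evaluates f_t on Z
    n1, n2, m = len(X1), len(X2), len(Z)
    # Setting the gradient to zero gives a block linear system in (a_1, a_2).
    A = np.block([
        [K1 @ K1 / n1 + lam * K1 + mu / m * G1.T @ G1, -mu / m * G1.T @ G2],
        [-mu / m * G2.T @ G1, K2 @ K2 / n2 + lam * K2 + mu / m * G2.T @ G2],
    ])
    b = np.concatenate([K1 @ y1 / n1, K2 @ y2 / n2])
    # A small diagonal jitter keeps the solve numerically stable.
    a = np.linalg.solve(A + 1e-8 * np.eye(n1 + n2), b)
    a1, a2 = a[:n1], a[n1:]
    gap = float(np.mean((G1 @ a1 - G2 @ a2) ** 2))  # estimated distance term
    return a1, a2, gap
```

Calling `fit_two_tasks` once with `mu=0.0` and once with a larger `mu` on the same data shows the effect of the coupling: the returned `gap` should shrink as `mu` grows, because the two functions are pulled toward each other even though they live in different RKHSs.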
