2014, Vol. 29, Issue 1: 105-115. DOI: 10.1007/s11390-013-1415-4

Special Issue: Artificial Intelligence and Pattern Recognition; Data Management and Data Mining


Minimizing the Discrepancy Between Source and Target Domains by Learning Adapting Components

Fatemeh Dorri and Ali Ghodsi   

  1. Department of Computer Science, University of British Columbia, Vancouver, Canada
  2. Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Canada
  • Received: 2013-09-01  Revised: 2013-11-01  Online: 2014-01-05  Published: 2014-01-05

Predicting the response variables of a target dataset is a central problem in machine learning. Predictive models are expected to perform well across a broad range of target domains; however, this may not be feasible when there is a mismatch between the source and target domain distributions. Domain adaptation algorithms aim to resolve this issue so that a model trained on one domain can be deployed across different target domains. We propose a method based on kernel distribution embedding and the Hilbert-Schmidt independence criterion (HSIC) to address this problem. The proposed method embeds both source and target data into a new feature space with two properties: 1) the distributions of the source and target datasets are as close as possible in the new feature space, and 2) the important structural information of the data is preserved. The embedded data can lie in a lower-dimensional space while retaining these properties, so the method can also be regarded as a dimensionality reduction method. Our proposed method has a closed-form solution, and the experimental results show that it works well in practice.
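The abstract balances two empirical quantities: the discrepancy between the source and target distributions, measured through kernel mean embeddings (the maximum mean discrepancy, MMD), and the dependence between data and its structural information, measured by HSIC. The sketch below is a minimal NumPy illustration of how both quantities are estimated from samples with RBF kernels; it is not the authors' actual algorithm, and the kernel bandwidth `sigma` is a hypothetical fixed choice:

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    # Pairwise squared distances, then a Gaussian (RBF) kernel.
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-d2 / (2 * sigma**2))

def mmd2(Xs, Xt, sigma=1.0):
    # Squared maximum mean discrepancy between the kernel mean
    # embeddings of the source sample Xs and target sample Xt.
    Kss = rbf_kernel(Xs, Xs, sigma)
    Ktt = rbf_kernel(Xt, Xt, sigma)
    Kst = rbf_kernel(Xs, Xt, sigma)
    return Kss.mean() + Ktt.mean() - 2 * Kst.mean()

def hsic(X, Y, sigma=1.0):
    # Biased empirical HSIC estimate: (1/n^2) tr(K H L H),
    # where H centers the two kernel matrices.
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    L = rbf_kernel(Y, Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / n**2
```

A domain adaptation method of the kind described would seek an embedding that drives the first quantity toward zero while keeping the second (between the embedded data and labels or structure) large; the closed-form solution mentioned in the abstract refers to optimizing that trade-off analytically.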

[1] Ben-David S, Blitzer J, Crammer K, Kulesza A, Pereira F, Vaughan J W. A theory of learning from different domains. Machine Learning, 2010, 79(1/2): 151-175.

[2] Huang J, Smola A J, Gretton A, Borgwardt K M, Schölkopf B. Correcting sample selection bias by unlabeled data. In Proc. the 20th Annual Conference on Neural Information Processing Systems, Dec. 2006, pp.601-608.

[3] Liu Q, Mackey A, Roos D, Pereira F. Evigan: A hidden variable model for integrating gene evidence for Eukaryotic gene prediction. Bioinformatics, 2008, 24(5): 597-605.

[4] Jiang J. A literature survey on domain adaptation of statistical classifiers. 2008. http://sifaka.cs.uiuc.edu/jiang4/domain_adaptation/survey/da_survey.pdf, Nov. 2013.

[5] Shimodaira H. Improving predictive inference under covariate shift by weighting the log-likelihood function. Journal of Statistical Planning and Inference, 2000, 90(2): 227-244.

[6] Japkowicz N, Stephen S. The class imbalance problem: A systematic study. Intelligent Data Analysis, 2002, 6(5): 429-449.

[7] Chapelle O, Schölkopf B, Zien A (eds.). Semi-Supervised Learning. MIT Press, 2006.

[8] Dai W, Xue G, Yang Q, Yu Y. Transferring naive Bayes classifiers for text classification. In Proc. the 22nd AAAI Conference on Artificial Intelligence, Jul. 2007, pp.540-545.

[9] Xing D, Dai W, Xue G, Yu Y. Bridged refinement for transfer learning. In Proc. the 11th European Conference on Principles and Practice of Knowledge Discovery in Databases, Sept. 2007, pp.324-335.

[10] Wang B, Tang J, Fan W, Chen S, Tan C, Yang Z. Query-dependent cross-domain ranking in heterogeneous network. Knowledge and Information Systems, 2013, 34(1): 109-145.

[11] Shao H, Tong B, Suzuki E. Extended MDL principle for feature-based inductive transfer learning. Knowledge and Information Systems, 2013, 35(2): 365-389.

[12] Ben-David S, Schuller R. Exploiting task relatedness for multiple task learning. In Proc. the 16th Annual Conference on Learning Theory, Aug. 2003, pp.567-580.

[13] Xue Y, Liao X, Carin L, Krishnapuram B. Multi-task learning for classification with Dirichlet process priors. Journal of Machine Learning Research, 2007, 8: 35-63.

[14] Micchelli C A, Pontil M. Kernels for multi-task learning. In Proc. the Annual Conf. Neural Information Processing Systems, Dec. 2005, pp.921-928.

[15] Heckman J J. Sample selection bias as a specification error. Econometrica, 1979, 47(1): 153-161.

[16] Zadrozny B. Learning and evaluating classifiers under sample selection bias. In Proc. the 21st Annual International Conference on Machine Learning, Jul. 2004, Article No. 114.

[17] Morvant E, Habrard A, Ayache S. Parsimonious unsupervised and semi-supervised domain adaptation with good similarity functions. Knowledge and Information Systems, 2012, 33(2): 309-349.

[18] Chan Y S, Ng H T. Estimating class priors in domain adaptation for word sense disambiguation. In Proc. the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Jul. 2006, pp.89-96.

[19] Daumé III H, Kumar A, Saha A. Co-regularization based semi-supervised domain adaptation. In Proc. the 24th Annual Conference on Neural Information Processing Systems, Dec. 2010, pp.478-486.

[20] Xue G, Dai W, Yang Q, Yu Y. Topic-bridged PLSA for cross-domain text classification. In Proc. the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Jul. 2008, pp.627-634.

[21] Blitzer J, McDonald R, Pereira F. Domain adaptation with structural correspondence learning. In Proc. the 2006 Conference on Empirical Methods in Natural Language Processing, Jul. 2006, pp.120-128.

[22] Jegelka S, Gretton A, Schölkopf B, Sriperumbudur B K, Luxburg U V. Generalized clustering via kernel embeddings. In Proc. the 32nd Annual German Conference on Advances in Artificial Intelligence, Sept. 2009, pp.144-152.

[23] Gretton A, Bousquet O, Smola A J, Schölkopf B. Measuring statistical dependence with Hilbert-Schmidt norms. In Proc. the 16th International Conference on Algorithmic Learning Theory, Oct. 2005, pp.63-77.

[24] Pan S J, Kwok J T, Yang Q. Transfer learning via dimensionality reduction. In Proc. the 23rd AAAI Conference on Artificial Intelligence, Jul. 2008, pp.677-682.

[25] Chen M, Weinberger K Q, Blitzer J C. Co-training for domain adaptation. In Proc. the 25th Annual Conference on Neural Information Processing Systems, Dec. 2011, pp.2456-2464.

[26] Pan S J, Tsang I W, Kwok J T, Yang Q. Domain adaptation via transfer component analysis. IEEE Transactions on Neural Networks, 2011, 22(2): 199-210.

[27] Gretton A, Smola A, Huang J, Schmittfull M, Borgwardt K, Schölkopf B. Covariate shift by kernel mean matching. In Dataset Shift in Machine Learning, Quiñonero-Candela J, Sugiyama M, Schwaighofer A, Lawrence N (eds.), Cambridge, MA: MIT Press, 2008, pp.131-160.

[28] Dorri F, Ghodsi A. Adapting component analysis. In Proc. the 12th IEEE International Conference on Data Mining, Dec. 2012, pp.846-851.
ISSN 1000-9000(Print)

CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: jcst@ict.ac.cn
  Copyright ©2015 JCST, All Rights Reserved