Journal of Computer Science and Technology
Journal of Computer Science and Technology 2010, Vol. 25, Issue 4: 681-698    DOI: 10.1007/s11390-010-1053-z
Special Section on Advances in Machine Learning and Applications
Combining Committee-Based Semi-Supervised Learning and Active Learning
Mohamed Farouk Abdel Hady and Friedhelm Schwenker
Institute of Neural Information Processing, University of Ulm, Ulm, Germany

Abstract 

Many data mining applications have a large amount of data, but labeling the data is usually difficult, expensive, or time-consuming, as it requires human experts for annotation. Semi-supervised learning addresses this problem by using unlabeled data together with labeled data in the training process. Co-Training is a popular semi-supervised learning algorithm that assumes that each example is represented by multiple sets of features (views) and that these views are sufficient for learning and independent given the class. However, these assumptions are strong and are not satisfied in many real-world domains. In this paper, a single-view variant of Co-Training, called Co-Training by Committee (CoBC), is proposed, in which an ensemble of diverse classifiers is used instead of redundant and independent views. We introduce a new labeling confidence measure for unlabeled examples based on estimating the local accuracy of the committee members within each example's neighborhood. We then introduce two new learning algorithms, QBC-then-CoBC and QBC-with-CoBC, which combine the merits of committee-based semi-supervised learning and active learning. The random subspace method is applied to both C4.5 decision trees and 1-nearest-neighbor classifiers to construct the diverse ensembles used for semi-supervised learning and active learning. Experiments show that these two combinations can outperform other, non-committee-based ones.
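To make the committee-based self-labeling idea concrete, the following is a minimal Python sketch, not the authors' implementation, assuming NumPy and scikit-learn are available. It uses scikit-learn's DecisionTreeClassifier (CART rather than C4.5) as the base learner, builds the committee with the random subspace method, and scores unlabeled examples by the committee's averaged class probability rather than the paper's local-accuracy confidence measure; the function and parameter names (cobc_sketch, subspace, per_iter, ...) are illustrative only.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def cobc_sketch(X_l, y_l, X_u, n_estimators=10, subspace=0.5,
                n_iterations=20, per_iter=5, seed=0):
    """Self-label unlabeled data with a random-subspace committee of trees."""
    rng = np.random.RandomState(seed)
    n_features = X_l.shape[1]
    k = max(1, int(subspace * n_features))   # subspace size per member
    classes = np.unique(y_l)

    def train_committee(X, y):
        # Random subspace method: every member sees all examples
        # but only a random subset of the features.
        committee = []
        for _ in range(n_estimators):
            feats = rng.choice(n_features, size=k, replace=False)
            tree = DecisionTreeClassifier(random_state=rng.randint(1 << 30))
            tree.fit(X[:, feats], y)
            committee.append((feats, tree))
        return committee

    def committee_proba(committee, X):
        # Average the members' class probabilities, aligned to `classes`.
        proba = np.zeros((len(X), len(classes)))
        for feats, tree in committee:
            p = tree.predict_proba(X[:, feats])
            for j, c in enumerate(tree.classes_):
                proba[:, np.searchsorted(classes, c)] += p[:, j]
        return proba / len(committee)

    for _ in range(n_iterations):
        if len(X_u) == 0:
            break
        committee = train_committee(X_l, y_l)
        proba = committee_proba(committee, X_u)
        conf = proba.max(axis=1)              # committee confidence
        pred = classes[proba.argmax(axis=1)]  # committee prediction

        # Move the most confidently self-labeled examples into the labeled set.
        top = np.argsort(-conf)[:per_iter]
        X_l = np.vstack([X_l, X_u[top]])
        y_l = np.concatenate([y_l, pred[top]])
        X_u = np.delete(X_u, top, axis=0)

    return train_committee(X_l, y_l)

In the QBC-then-CoBC and QBC-with-CoBC combinations described above, an analogous committee would additionally drive active learning by querying a human expert for the labels of the unlabeled examples on which the members disagree most, rather than self-labeling them.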

Keywords: data mining; classification; active learning; co-training; semi-supervised learning; ensemble learning; random subspace method; decision tree; nearest neighbor classifier
Fund:

This work was partially supported by the Transregional Collaborative Research Centre SFB/TRR 62 Companion-Technology for Cognitive Technical Systems funded by the German Research Foundation (DFG). The first author was supported by a scholarship of the German Academic Exchange Service (DAAD).

Cite this article:   
Mohamed Farouk Abdel Hady and Friedhelm Schwenker. Combining Committee-Based Semi-Supervised Learning and Active Learning[J]. Journal of Computer Science and Technology, 2010, 25(4): 681-698.
URL:  
http://jcst.ict.ac.cn:8080/jcst/EN/10.1007/s11390-010-1053-z
Copyright 2010 by Journal of Computer Science and Technology