2013, Vol. 28, Issue (5): 818-826. doi: 10.1007/s11390-013-1380-y

Special Issue: Artificial Intelligence and Pattern Recognition; Computer Graphics and Multimedia

• Special Section of CVM2013 •

Learning Structure Models with Context Information for Visual Tracking

Li-Wei Liu (刘力为), Student Member, IEEE, and Hai-Zhou Ai (艾海舟), Senior Member, IEEE   

  1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  • Received: 2013-05-05 Revised: 2013-08-13 Online: 2013-09-05 Published: 2013-09-05
  • Supported by:

    This work is supported in part by the National Natural Science Foundation of China under Grant No. 61075026 and the National Basic Research 973 Program of China under Grant No. 2011CB302203.

Tracking objects that undergo abrupt appearance changes and heavy occlusions is a challenging problem that conventional tracking methods can barely handle. To address it, we propose an online structure learning algorithm with three layers: an object is represented by a mixture of online structure models (OSMs), which are learnt from block-based online random forest classifiers (BORFs). BORFs can handle occlusion because they model local appearances of the target. To further improve tracking accuracy and reliability, the algorithm uses mixture relational models (MRMs) as multi-mode context information to integrate BORFs into OSMs. Furthermore, the mixture construction of OSMs effectively avoids over-fitting and describes targets more flexibly. By fusing BORFs with MRMs, OSMs capture the discriminative parts of the target, which underpins the reliability and robustness of our tracker. In addition, OSMs incorporate block occlusion reasoning to update the BORFs and MRMs, which deals effectively with appearance changes and drifting. Experiments on challenging videos show that the proposed tracker performs better than several state-of-the-art algorithms.
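
The block-level idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes each block already has a classifier confidence in [0, 1], and it replaces the BORF and MRM machinery with a plain threshold and an average. The function names, the threshold value, and the fusion rule are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def split_into_blocks(width: int, height: int, rows: int, cols: int) -> List[Box]:
    """Tile a width x height target window into rows x cols local blocks."""
    bw, bh = width // cols, height // rows
    return [(c * bw, r * bh, bw, bh) for r in range(rows) for c in range(cols)]


def fuse_block_scores(
    scores: List[float], occlusion_threshold: float = 0.3
) -> Tuple[float, List[bool]]:
    """Average the scores of visible blocks and flag the rest as occluded.

    Assumption: a block whose confidence falls below the threshold is
    treated as occluded. The paper's occlusion reasoning is not modeled.
    """
    visible = [s >= occlusion_threshold for s in scores]
    kept = [s for s, v in zip(scores, visible) if v]
    target_score = sum(kept) / len(kept) if kept else 0.0
    return target_score, visible


def blocks_to_update(visible: List[bool]) -> List[int]:
    """Indices of blocks whose models may be updated (occluded ones are skipped)."""
    return [i for i, v in enumerate(visible) if v]


# Hypothetical per-block confidences for a 2 x 2 block layout.
blocks = split_into_blocks(40, 40, rows=2, cols=2)
score, visible = fuse_block_scores([0.9, 0.8, 0.1, 0.7])
update_ids = blocks_to_update(visible)
```

Skipping updates for blocks flagged as occluded is one simple way to keep an occluder from being learnt as part of the target, which is the drifting problem the abstract mentions.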

ISSN 1000-9000(Print)

CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: jcst@ict.ac.cn
  Copyright ©2015 JCST, All Rights Reserved