2010, Vol. 25, Issue (4): 750-760. doi: 10.1007/s11390-010-1058-7

Special Issue: Artificial Intelligence and Pattern Recognition; Data Management and Data Mining

• Special Section on Advances in Machine Learning and Applications •

Multiple Hypergraph Clustering of Web Images by Mining Word2Image Correlations

Fei Wu(吴 飞), Senior Member, CCF, Ya-Hong Han(韩亚洪), and Yue-Ting Zhuang(庄越挺), Member, IEEE   

  1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
  • Received: 2009-05-15 Revised: 2010-02-11 Online: 2010-07-09 Published: 2010-07-09
  • About author:
    Fei Wu is a senior member of CCF. He received the B.S. degree from Lanzhou University, China, the M.S. degree from Macao University, China, and the Ph.D. degree from Zhejiang University, Hangzhou, China. His research interests include multimedia retrieval and statistical learning.
    Ya-Hong Han received the B.S. degree in 2000 from Zhengzhou University, China, and the M.S. degree in 2003 from Hohai University, Nanjing, China. He is currently pursuing the Ph.D. degree at the College of Computer Science, Zhejiang University. His research interests are machine learning and multimedia retrieval.
    Yue-Ting Zhuang is a Member of IEEE. He received the B.S., M.S., and Ph.D. degrees from Zhejiang University, Hangzhou, China, in 1986, 1989, and 1998, respectively. Currently, he is a professor and Ph.D. supervisor at the College of Computer Science, Zhejiang University. His research interests include multimedia databases, artificial intelligence, and video-based animation.
  • Supported by:

    Supported by the National Natural Science Foundation of China under Grant Nos. 90920303, 60833006; the National Basic Research 973 Program of China under Grant No. 2010CB327905; the Program for Changjiang Scholars and Innovative Research Team in University of China under Grant Nos. IRT0652, PCSIRT.

In this paper, we consider the problem of clustering Web images by mining correlations between images and their corresponding words. Since Web images always come with associated text, the textual tags of Web images are used as a source to enhance the description of the images. However, each word contributes differently to the interpretation of image semantics. Therefore, to evaluate the importance of each word associated with a Web image, we propose a novel visibility model that computes the extent to which a word can be perceived visually in images, and then infer the correlation of a word to an image by integrating visibility with tf-idf. Furthermore, Latent Dirichlet Allocation (LDA) is used to discover the topic information inherent in the surrounding text, so that topic correlations between images can be defined for image clustering. To integrate visibility and latent topic information into an image clustering framework, we first represent textually correlated and latent-topic correlated images by two hypergraph views, and then apply the proposed Spectral Multiple Hypergraph Clustering (SMHC) algorithm to cluster the images into categories. SMHC can be regarded as a new unsupervised learning process that uses two hypergraphs to classify Web images. Experimental results show that the SMHC algorithm achieves better clustering performance and that the proposed SMHC-based image clustering framework is effective.
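
The abstract does not spell out the SMHC update rules, so the following is only a minimal sketch of the general idea of spectral clustering over two hypergraph views. It assumes the normalized hypergraph Laplacian of Zhou, Huang, and Schölkopf, a plain convex combination of the two views with a hypothetical weight `alpha`, and a two-way split by the sign of the second eigenvector. The function names, the toy hyperedges, and the combination rule are illustrative assumptions, not the authors' algorithm.

```python
import numpy as np


def incidence(n_vertices, hyperedges):
    """Build a |V| x |E| 0/1 incidence matrix from lists of vertex indices."""
    H = np.zeros((n_vertices, len(hyperedges)))
    for e, members in enumerate(hyperedges):
        H[list(members), e] = 1.0
    return H


def hypergraph_laplacian(H, w=None):
    """Normalized Laplacian I - Dv^-1/2 H W De^-1 H^T Dv^-1/2.

    Assumes every vertex lies in at least one hyperedge and that no
    hyperedge is empty, so all degrees are strictly positive.
    """
    n_vertices, n_edges = H.shape
    w = np.ones(n_edges) if w is None else np.asarray(w, dtype=float)
    d_v = H @ w            # vertex degrees (weighted count of incident edges)
    d_e = H.sum(axis=0)    # hyperedge degrees (number of member vertices)
    # S @ S.T equals Dv^-1/2 H W De^-1 H^T Dv^-1/2
    S = H * np.sqrt(w / d_e)[None, :] / np.sqrt(d_v)[:, None]
    return np.eye(n_vertices) - S @ S.T


def two_view_bipartition(H_text, H_topic, alpha=0.5):
    """Split vertices in two using a weighted sum of both view Laplacians.

    alpha is a hypothetical mixing weight (assumption, not from the paper).
    """
    L = alpha * hypergraph_laplacian(H_text) \
        + (1.0 - alpha) * hypergraph_laplacian(H_topic)
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues returned in ascending order
    return (eigvecs[:, 1] > 0).astype(int)


# Toy data (made up): 6 images; each hyperedge groups the images that share
# a word (text view) or a latent topic (topic view).
H_text = incidence(6, [(0, 1, 2), (3, 4, 5), (2, 3)])
H_topic = incidence(6, [(0, 1), (1, 2), (3, 4), (4, 5), (2, 3)])
labels = two_view_bipartition(H_text, H_topic)
print(labels)
```

In this toy example, images 0 to 2 and images 3 to 5 land in different groups. Which group receives label 1 depends on the arbitrary sign of the eigenvector. For more than two clusters, one would typically run k-means on the first k eigenvectors instead of thresholding a single one.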



ISSN 1000-9000(Print)

CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
E-mail: jcst@ict.ac.cn
  Copyright ©2015 JCST, All Rights Reserved