
Hierarchical Approximate Matching for Retrieval of Chinese Historical Calligraphy Character

Xia-Fen Zhang1,2, Yue-Ting Zhuang1, Jiang-Qin Wu1, and Fei Wu1

  1. Institute of Artificial Intelligence, Zhejiang University, Hangzhou 310027, China
  2. College of Information Engineering, Shanghai Maritime University, Shanghai 200135, China
  • Received: 2005-08-10 Revised: 2007-02-26 Online: 2007-07-10 Published: 2007-07-10

As historical Chinese calligraphy works are digitized, retrieving them becomes a new challenge. Currently, no OCR technique can convert calligraphy character images into text, and existing handwritten character recognition approaches do not work for them either. This paper proposes a novel approach to efficiently retrieving Chinese calligraphy characters on the basis of similarity: each calligraphy character image is represented by a collection of discriminative features, and high retrieval speed with reasonable effectiveness is achieved. First, calligraphy characters that cannot be similar to the query are filtered out step by step by comparing character complexity, stroke density, and stroke protrusion. Then, the remaining calligraphy characters are retrieved and ranked according to the matching cost produced by approximate shape matching. To speed up retrieval, we employ a high-dimensional data structure, the PK-tree. Finally, the efficiency of the algorithm is demonstrated by a preliminary experiment with 3012 calligraphy character images.
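
The filter-then-rank idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: the names `complexity`, `stroke_density`, `matching_cost`, and `retrieve`, the feature definitions (ink fraction, scan-line crossings), the tolerance values, and the pixel-wise mismatch used as a stand-in for approximate shape matching. The stroke protrusion filter and the PK-tree index are omitted.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def complexity(img: Image) -> float:
    """Fraction of ink pixels (assumed proxy for character complexity)."""
    total = sum(len(row) for row in img)
    return sum(map(sum, img)) / total


def stroke_density(img: Image) -> int:
    """Background-to-ink transitions on the middle row and middle column
    (assumed proxy for stroke density)."""
    mid_row = img[len(img) // 2]
    mid_col = [row[len(row) // 2] for row in img]

    def crossings(line: List[int]) -> int:
        return sum(1 for a, b in zip([0] + line, line) if a == 0 and b == 1)

    return crossings(mid_row) + crossings(mid_col)


def matching_cost(a: Image, b: Image) -> int:
    """Pixel-wise mismatch count; a simple stand-in for shape matching."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def retrieve(query: Image,
             candidates: List[Tuple[str, Image]],
             complexity_tol: float = 0.2,
             density_tol: int = 1) -> List[Tuple[str, int]]:
    """Filter candidates coarse-to-fine, then rank survivors by cost."""
    qc, qd = complexity(query), stroke_density(query)
    # Stage 1: discard candidates whose complexity differs too much.
    survivors = [(n, im) for n, im in candidates
                 if abs(complexity(im) - qc) <= complexity_tol]
    # Stage 2: discard candidates whose stroke density differs too much.
    survivors = [(n, im) for n, im in survivors
                 if abs(stroke_density(im) - qd) <= density_tol]
    # Stage 3: run the more expensive matching only on the survivors.
    ranked = sorted((matching_cost(query, im), n) for n, im in survivors)
    return [(n, cost) for cost, n in ranked]


# Toy 5x5 images, hypothetical data for illustration only.
CROSS = [[0, 0, 1, 0, 0], [0, 0, 1, 0, 0], [1, 1, 1, 1, 1],
         [0, 0, 1, 0, 0], [0, 0, 1, 0, 0]]
VARIANT = [[0, 0, 0, 0, 0], [0, 0, 1, 0, 0], [1, 1, 1, 1, 1],
           [0, 0, 1, 0, 0], [0, 0, 1, 0, 0]]
SMALL = [[0, 0, 0, 0, 0], [0, 0, 1, 0, 0], [1, 1, 1, 1, 1],
         [0, 0, 1, 0, 0], [0, 0, 0, 0, 0]]
BLOCK = [[1] * 5 for _ in range(5)]
STRIPES = [[1, 0, 1, 0, 1], [1, 0, 1, 0, 1], [1, 0, 1, 0, 1],
           [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]]

CANDIDATES = [("variant", VARIANT), ("small", SMALL),
              ("block", BLOCK), ("stripes", STRIPES)]

print(retrieve(CROSS, CANDIDATES))
```

In this toy run, `block` is rejected by the complexity filter and `stripes` by the stroke density filter, so the comparatively costly matching step only runs on the two remaining candidates. That is the point of the hierarchy: cheap features prune most of the database before the expensive comparison.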

Key words: process calculus; transitional semantics; topology; limit behaviour;


ISSN 1000-9000(Print)

CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences