2015, Vol. 30, Issue (2): 353-363. doi: 10.1007/s11390-015-1528-z

Topics: Artificial Intelligence and Pattern Recognition; Computer Graphics and Multimedia

Special Section on Selected Paper from NPC 2011

Robust Video Text Detection with Morphological Filtering Enhanced MSER

Yun-Zhi Zhuge1(诸葛云志), Hu-Chuan Lu2*(卢湖川), Member, CCF, ACM, IEEE

  1. College of Electronic, Communication and Physics, Shandong University of Science and Technology, Qingdao 266510, China;
  2. School of Information and Communication Engineering, Dalian University of Technology, Dalian 116023, China
  • Received: 2014-12-23 Revised: 2015-01-20 Online: 2015-03-05 Published: 2015-03-05
  • About author: Yun-Zhi Zhuge is an undergraduate student at the College of Electronic, Communication and Physics, Shandong University of Science and Technology.

Abstract: Video text detection is challenging because video backgrounds are generally complex and subtitles often suffer from color bleeding, fuzzy boundaries, and low contrast caused by lossy compression and low resolution. In this paper, we propose a robust framework to address these problems. First, we use a gradient amplitude map (GAM) to enhance the edges of the input image, which mitigates color bleeding and fuzzy boundaries. Second, we develop a two-direction morphological filter that suppresses background noise and enhances the contrast between text and background. Third, we apply a maximally stable extremal region (MSER) detector to find text regions with two extreme colors; the mean intensity of these regions serves as the label set for graph cuts, and the Euclidean distance over the three channels of the HSI color space serves as the graph cuts smoothness term, yielding optimal segmentations. Finally, we group the segmented regions into text lines using the geometric characteristics of text, and then use corner detection, multi-frame verification, and heuristic rules to eliminate non-text regions. We test our scheme on a set of challenging videos, and the results show that our text detection framework is more robust than previous methods.
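The first two stages described in the abstract (edge enhancement with a gradient amplitude map, followed by two-direction morphological filtering) can be illustrated with a minimal pure-Python sketch. The abstract does not specify the operators, so everything here is an assumption made for illustration: the gradient uses central differences, the morphological filter is a white top-hat with flat horizontal and vertical line structuring elements, and the two directions are merged by a per-pixel maximum. The function names (`gradient_amplitude_map`, `top_hat`, `two_direction_filter`) and the `length` parameter are hypothetical and are not taken from the paper.

```python
from math import hypot


def gradient_amplitude_map(img):
    """Per-pixel gradient magnitude from central differences (borders clamped)."""
    h, w = len(img), len(img[0])
    gam = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            gam[y][x] = hypot(gx, gy)
    return gam


def _line_filter(img, length, horizontal, reduce_fn):
    """Slide a flat line of `length` pixels over the image.

    reduce_fn=min gives an erosion, reduce_fn=max gives a dilation.
    Windows are truncated at the image borders.
    """
    h, w = len(img), len(img[0])
    r = length // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if horizontal:
                window = [img[y][i] for i in range(max(x - r, 0), min(x + r, w - 1) + 1)]
            else:
                window = [img[j][x] for j in range(max(y - r, 0), min(y + r, h - 1) + 1)]
            out[y][x] = reduce_fn(window)
    return out


def top_hat(img, length, horizontal):
    """White top-hat: the image minus its opening (erosion, then dilation)."""
    eroded = _line_filter(img, length, horizontal, min)
    opened = _line_filter(eroded, length, horizontal, max)
    return [[p - o for p, o in zip(row, orow)] for row, orow in zip(img, opened)]


def two_direction_filter(img, length=5):
    """Merge horizontal and vertical top-hats with a per-pixel maximum.

    Assumption: a structure thinner than `length` in either direction is
    kept, while regions wide in both directions are suppressed.
    """
    th_h = top_hat(img, length, True)
    th_v = top_hat(img, length, False)
    return [[max(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(th_h, th_v)]
```

Under these assumptions, a one-pixel-wide bright stroke on a flat background survives the filter with its contrast intact, while a bright block wider than `length` in both directions is flattened to zero. A real implementation would more likely rely on an image processing library and on structuring element sizes tuned to the expected stroke width.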

Copyright © Editorial Office of Journal of Computer Science and Technology