Journal of Computer Science and Technology ›› 2020, Vol. 35 ›› Issue (3): 564-575. doi: 10.1007/s11390-020-0246-3

Topics: Artificial Intelligence and Pattern Recognition; Computer Graphics and Multimedia

• Special Section of CVM 2020 •

Automatic Video Segmentation Based on Information Centroid and Optimized SaliencyCut

Hui-Si Wu1,*, Meng-Shu Liu1, Lu-Lu Yin1, Ping Li2, Zhen-Kun Wen1,*, Hon-Cheng Wong3        

  1 College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China;
  2 Department of Computing, The Hong Kong Polytechnic University, Hong Kong 999077, China;
  3 Faculty of Information Technology, Macau University of Science and Technology, Macau 999078, China
  • Received: 2020-01-03 Revised: 2020-03-22 Online: 2020-05-28 Published: 2020-05-28
  • Contact: Hui-Si Wu, Zhen-Kun Wen E-mail: hswu@szu.edu.cn; wenzk@szu.edu.cn
  • About author: Hui-Si Wu received both his B.E. and M.E. degrees in computer science from Xi'an Jiaotong University (XJTU), Xi'an, in 2004 and 2007, respectively. He obtained his Ph.D. degree in computer science from The Chinese University of Hong Kong (CUHK), Hong Kong, in 2011. He is currently an associate professor in the College of Computer Science and Software Engineering, Shenzhen University, Shenzhen. His research interests include computer graphics, image processing, and medical imaging.
  • Supported by:
    This work was supported in part by the Major Project of the New Generation of Artificial Intelligence of National Key Research and Development Project, Ministry of Science and Technology of China under Grant No. 2018AAA0102900, the National Natural Science Foundation of China under Grant Nos. 61572328 and 61973221, the Natural Science Foundation of Guangdong Province of China under Grant Nos. 2018A030313381 and 2019A1515011165, and The Hong Kong Polytechnic University under Grant Nos. P0030419 and P0030929.

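Video segmentation is commonly used in computer vision applications such as action recognition, scene parsing, and video editing. Although video segmentation has improved considerably in recent years, limitations remain and stable performance is still hard to achieve. In this paper, we perform information centroid detection according to the level balance principle in physics, and on that basis propose an automatic video segmentation method built on the information centroid and an improved SaliencyCut.
Unlike existing methods, we propose the information centroid to provide image information of another dimension and thereby improve the accuracy of video segmentation. Specifically, our information centroid is obtained by aggregating all the image information into a single centroid point, following the information-level balance principle in the image. It can therefore effectively raise the saliency of the target object and suppress the background region. We also combine the color and coordinate information of the image to compute the local and global information centroids. Saliency maps for all frames of the video are then computed from the detected information centroids. Based on the spatio-temporal consistency between neighboring frames, we further propose information centroid smoothing to enhance saliency detection, which can correct poor saliency maps that may result from sharp changes in color or motion in complex videos. Finally, we obtain accurate segmentation results from the information-centroid-based saliency maps and the improved SaliencyCut. Our method is evaluated on the DAVIS dataset, which contains a variety of challenging videos. We also compare it against state-of-the-art methods. Satisfactory visual results and comparative statistics both demonstrate the advantages of our automatic detection and its reliable video segmentation.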

Abstract: We propose an automatic video segmentation method based on an optimized SaliencyCut equipped with information centroid (IC) detection, following the level balance principle in physics. Unlike existing methods, the IC provides image information of another dimension to enhance video segmentation accuracy. Specifically, our IC is implemented based on the information-level balance principle in the image, and serves as the information pivot obtained by aggregating all the image information to a point. To effectively enhance the saliency value of the target object and suppress the background area, we also combine the color and the coordinate information of the image in calculating the local IC and the global IC in the image. Saliency maps for all frames in the video are then calculated based on the detected IC. By applying IC smoothing to enhance the optimized saliency detection, we can further correct unsatisfactory saliency maps in complex videos, where sharp variations of colors or motions may exist. Finally, we obtain the segmentation results based on IC-based saliency maps and the optimized SaliencyCut. Our method is evaluated on the DAVIS dataset, which consists of different kinds of challenging videos. Comparisons with state-of-the-art methods are also conducted. Convincing visual results and statistical comparisons demonstrate its advantages and robustness for automatic video segmentation.

Key words: automatic video segmentation, information centroid, saliency detection, optimized SaliencyCut
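
The abstract describes the IC only in words, so the following is a minimal illustrative sketch of the general idea rather than the authors' formulation. It assumes that the "information" of a pixel can be approximated by its color distance from the mean image color, that the IC is the weighted balance point of that map, that saliency is the contrast map attenuated by a Gaussian falloff around the IC, and that IC smoothing can be approximated by an exponential moving average over frames. All function names and the parameters `sigma_frac` and `alpha` are hypothetical.

```python
import numpy as np


def information_centroid(weights):
    """Balance point (row, col) of a non-negative 2-D weight map."""
    weights = np.asarray(weights, dtype=float)
    total = weights.sum()
    if total == 0:
        # No information anywhere: fall back to the geometric center.
        return ((weights.shape[0] - 1) / 2.0, (weights.shape[1] - 1) / 2.0)
    rows, cols = np.indices(weights.shape)
    return (float((rows * weights).sum() / total),
            float((cols * weights).sum() / total))


def color_contrast(image):
    """Assumed per-pixel 'information': distance from the mean color."""
    image = np.asarray(image, dtype=float)
    mean_color = image.reshape(-1, image.shape[-1]).mean(axis=0)
    return np.linalg.norm(image - mean_color, axis=-1)


def centroid_saliency(image, sigma_frac=0.25):
    """Contrast map attenuated by a Gaussian falloff around the centroid."""
    info = color_contrast(image)
    cy, cx = information_centroid(info)
    rows, cols = np.indices(info.shape)
    sigma = sigma_frac * max(info.shape)
    falloff = np.exp(-((rows - cy) ** 2 + (cols - cx) ** 2) / (2.0 * sigma ** 2))
    saliency = info * falloff
    peak = saliency.max()
    # Normalize to [0, 1] unless the map is entirely zero.
    return saliency / peak if peak > 0 else saliency


def smooth_centroids(centroids, alpha=0.5):
    """Exponential moving average of per-frame centroids (assumed smoother)."""
    smoothed = []
    for c in centroids:
        c = np.asarray(c, dtype=float)
        smoothed.append(c if not smoothed else alpha * c + (1 - alpha) * smoothed[-1])
    return np.array(smoothed)
```

In a full pipeline, a thresholded saliency map of this kind would typically seed a GrabCut-style refinement, which is the role SaliencyCut plays in the method; that step is omitted here.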
