2011, Vol. 26, Issue (1): 195-202. doi: 10.1007/s11390-011-1122-y

Special Topic: Computer Graphics and Multimedia



Saliency-Based Fidelity Adaptation Preprocessing for Video Coding

Shao-Ping Lu (卢少平), Student Member, CCF, ACM, and Song-Hai Zhang (张松海), Member, CCF, ACM, IEEE   

  1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
    Beijing Engineering Research Center for Intelligent Processing of Visual Media and Content Security, Beijing 100084, China
  • Received:2009-05-04 Revised:2010-10-31 Online:2011-01-01 Published:2011-01-01
  • About author: Shao-Ping Lu is a Ph.D. candidate in the Department of Computer Science and Technology at Tsinghua University. His research interests include image and video processing. He is a student member of the China Computer Federation and ACM.
    Song-Hai Zhang obtained his Ph.D. degree in 2007 from Tsinghua University. He is currently a lecturer in computer science at Tsinghua University, China. His research interests include image and video processing and geometric computing. He is a member of the China Computer Federation, ACM, and IEEE.
  • Supported by:

    This work was partially supported by the National High-Tech Research and Development 863 Program of China under Grant No. 2009AA01Z330 and by the National Natural Science Foundation of China under Grant Nos. 61033012 and 60970100.

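1. Innovations of this paper
Allocating bit rate according to visual attributes is one of the core goals of video coding. Working as a video preprocessing step, this paper uses spatio-temporally coherent visual saliency as the guiding attribute and implements an adaptive content-fidelity adjustment method for video coding that preserves salient regions and edge strength. The innovations lie mainly in two points. First, starting from the properties of human visual perception, we obtain the saliency distribution of the video content by building a visual saliency analysis model that preserves spatio-temporal coherence. Second, we preprocess the fidelity of the video content with an extended bilateral filter that adaptively preserves important content such as salient regions and edges. Following the principle that salient regions receive more bits while other regions receive relatively fewer, the preprocessing performs adaptively scaled smoothing through parameter adjustment. It preserves the content of salient regions while selectively suppressing high-frequency information in non-salient regions, which improves the gain from bit allocation. In addition, our algorithm accounts for factors such as the decay and abrupt change of salient content across frames along the time axis, so it can effectively maintain the coherence of the video content.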
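The handling of saliency decay and abrupt change across frames can be illustrated with a minimal sketch. The summary does not give the exact update rule, so the rule below (keep the larger of the current saliency and a decayed copy of the previous result, and reset the history at a scene cut) is an assumption, as are the function name `temporally_smooth_saliency` and the `decay` parameter.

```python
def temporally_smooth_saliency(saliency_frames, scene_cuts, decay=0.8):
    """Return per-frame saliency maps that fade gradually instead of flickering.

    saliency_frames: list of 2D lists with values in [0, 1], one map per frame.
    scene_cuts: set of frame indices where a new shot starts.
    decay: hypothetical factor (not from the paper) controlling how slowly
           saliency inherited from earlier frames fades.
    """
    smoothed = []
    previous = None
    for index, frame in enumerate(saliency_frames):
        if previous is None or index in scene_cuts:
            # First frame or scene cut: discard the history and start afresh.
            current = [row[:] for row in frame]
        else:
            # Assumed rule: a pixel stays salient while its decayed past
            # saliency is still larger than its current saliency.
            current = [
                [max(value, decay * past) for value, past in zip(row, past_row)]
                for row, past_row in zip(frame, previous)
            ]
        smoothed.append(current)
        previous = current
    return smoothed
```

Resetting at scene cuts keeps saliency from one shot from leaking into an unrelated shot, while the decay term avoids a sudden drop in fidelity when a salient object is missed for a frame or two.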
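2. Implementation method
To analyze the saliency of the video content, we first build a combined top-down and bottom-up saliency analysis model. The top-down model mainly analyzes the most visually sensitive elements in the image, such as human faces, while the bottom-up model takes into account color intensity, edge orientation, flicker, and motion. We also synthesize a spatio-temporal coherence map from computations such as the decay of salient regions over the video and scene-cut detection. In the extended bilateral filter, the algorithm adaptively adjusts the spatial scale parameter and the intensity parameter to map the saliency analysis result onto the fidelity adjustment of the video content, so that the adjustment matches visual characteristics and the bit rate is allocated reasonably during encoding.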
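A minimal sketch of a saliency-driven bilateral filter may help make the mapping from saliency to fidelity concrete. It is not the authors' implementation: the linear mapping from saliency to the two scales, the parameter names (`sigma_s_max`, `sigma_r_max`, `sigma_min`, `radius`), and the use of a single grayscale channel are all assumptions made for illustration.

```python
import math


def saliency_adaptive_bilateral(image, saliency, sigma_s_max=3.0,
                                sigma_r_max=0.5, sigma_min=1e-3, radius=2):
    """Bilateral filter whose spatial and intensity scales grow as saliency falls.

    image, saliency: 2D lists of floats in [0, 1] with identical shape.
    All parameters are hypothetical; the paper's actual mapping is not given here.
    """
    height, width = len(image), len(image[0])
    output = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Assumed linear mapping: saliency 1 gives near-zero scales
            # (pixel kept as is), saliency 0 gives the strongest smoothing.
            strength = 1.0 - saliency[y][x]
            sigma_s = sigma_min + strength * sigma_s_max
            sigma_r = sigma_min + strength * sigma_r_max
            center = image[y][x]
            weighted_sum = 0.0
            weight_total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        neighbor = image[ny][nx]
                        # Spatial term: nearby pixels count more.
                        spatial = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                        # Intensity term: similar pixels count more (keeps edges).
                        tonal = math.exp(-((neighbor - center) ** 2) / (2 * sigma_r ** 2))
                        weight = spatial * tonal
                        weighted_sum += weight * neighbor
                        weight_total += weight
            # The center pixel always contributes weight 1, so no division by zero.
            output[y][x] = weighted_sum / weight_total
    return output
```

With saliency equal to 1 everywhere the filter returns the input unchanged, and with saliency equal to 0 it smooths fine texture, which is the behavior that lets the encoder spend fewer bits on non-salient regions.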
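3. Conclusions and open problems
Unlike other methods, such as those that operate on image macroblocks or segment the image content, we adjust the bit rate adaptively through content-adaptive video preprocessing. Our experimental results show that the bit rate can be reduced while the content of salient regions is preserved. We also compared our method with related algorithms from the literature, and the results show that it gives better visual quality at the same bit-rate level. Based on the research and results of this paper, directions for further study and improvement include:
(1) Finding a more suitable color space for fidelity processing. The video preprocessing in this paper works in the CIE-Lab space, and converting from YUV to this space incurs some loss of video quality.
(2) Developing a more robust spatio-temporal visual saliency analysis model.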
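4. Practical value and application prospects
This paper proposes a spatio-temporally coherent visual saliency analysis and applies it to fidelity-adaptive preprocessing of video content, yielding an adaptive bit-rate adjustment mechanism that matches visual perception. It can be used in applications such as network transmission with dynamic adjustment of video quality.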

Abstract:

In this paper, we present a video coding scheme which applies the technique of visual saliency computation to adjust image fidelity before compression. To extract visually salient features, we construct a spatio-temporal saliency map by analyzing the video using a combined bottom-up and top-down visual saliency model. We then use an extended bilateral filter, in which the local intensity and spatial scales are adjusted according to visual saliency, to adaptively alter the image fidelity. Our implementation is based on the H.264 video encoder JM12.0. Besides evaluating our scheme with the H.264 reference software, we also compare it to a more traditional foreground-background segmentation-based method and a foveation-based approach which employs Gaussian blurring. Our results show that the proposed algorithm can improve the compression ratio significantly while effectively preserving perceptual visual quality.

