2018, Vol. 33, Issue (3): 487-501. doi: 10.1007/s11390-018-1833-4

Special Topic: Artificial Intelligence and Pattern Recognition; Computer Graphics and Multimedia


Multi-exposure Motion Estimation based on Deep Convolutional Networks

Zhi-Feng Xie1,2, Yu-Chen Guo1, Shu-Han Zhang1, Wen-Jun Zhang1, Li-Zhuang Ma2,3, Member, CCF   

  1 Department of Film and Television Engineering, Shanghai University, Shanghai 200072, China;
    2 Shanghai Engineering Research Center of Motion Picture Special Effects, Shanghai 200072, China;
    3 Department of Software Science and Technology, East China Normal University, Shanghai 200062, China
  • Received: 2017-12-27    Revised: 2018-03-06    Online: 2018-05-05    Published: 2018-05-05
  • About author:Zhi-Feng Xie received his Ph.D. degree in computer application technology from Shanghai Jiao Tong University, Shanghai, in 2013. He was a research assistant at the Department of Computer Science, City University of Hong Kong, Hong Kong, in 2011. He is now an assistant professor with Shanghai University, Shanghai. His research interests include image/video editing, computer graphics, and digital media technology.
  • Supported by:

    This work was supported by the National Natural Science Foundation of China under Grant Nos. 61303093, 61472245, and 61402278, the Innovation Program of the Science and Technology Commission of Shanghai Municipality of China under Grant No. 16511101300, and the Gaofeng Film Discipline Grant of Shanghai Municipal Education Commission of China.


Abstract: In motion estimation, illumination change has always been a troublesome obstacle that often causes severe degradation of optical flow computation. The essential reason is that most estimation methods fail to formalize a unified definition, in either the color or the gradient domain, for diverse environmental changes. In this paper, we propose a new solution based on deep convolutional networks to address this key issue. Our idea is to train deep convolutional networks to represent the complex motion features under illumination change, and then to predict the final optical flow fields. To this end, we construct a training dataset of multi-exposure image pairs by performing a series of non-linear adjustments on the traditional optical flow datasets. Our end-to-end network model consists of three main components: a low-level feature network, a fusion feature network, and a motion estimation network. The first two components form the contracting part of our model, which extracts and represents the multi-exposure motion features; the third component is the expanding part, which learns and predicts the high-quality optical flow. Compared with many state-of-the-art methods, our motion estimation based on deep convolutional networks can eliminate the obstacle of illumination change and yield optical flow results with competitive accuracy and time efficiency. Moreover, the good performance of our model is also demonstrated in some multi-exposure video applications, such as HDR (High Dynamic Range) composition and flicker removal.
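The training-data construction described above can be illustrated with a minimal sketch: starting from one image, a non-linear exposure adjustment (here a gain followed by a gamma curve, which is one common choice, not necessarily the paper's exact transform) produces an under-/over-exposed pair that simulates multi-exposure input. All function names and parameter values below are illustrative assumptions, not taken from the paper.

```python
# Sketch of synthesizing a multi-exposure image pair via a non-linear
# exposure adjustment. Images are nested lists of floats in [0, 1];
# the gain/gamma model is an assumed stand-in for the paper's
# "series of non-linear adjustments".

def adjust_exposure(image, gain=2.0, gamma=0.8):
    """Apply a non-linear exposure change: scale each pixel by `gain`,
    apply a gamma curve, and clip back into the valid [0, 1] range."""
    return [[min(1.0, max(0.0, (gain * p) ** gamma)) for p in row]
            for row in image]

def make_multi_exposure_pair(image):
    """Derive an under-exposed and an over-exposed version of one image,
    giving a pair with identical content but different illumination."""
    under = adjust_exposure(image, gain=0.5, gamma=1.2)
    over = adjust_exposure(image, gain=2.0, gamma=0.8)
    return under, over

frame = [[0.1, 0.4], [0.6, 0.9]]
under, over = make_multi_exposure_pair(frame)
```

In a full pipeline, each synthesized pair would be combined with the ground-truth flow of the original sequence, so the network learns motion that is invariant to the exposure difference.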
