2017, Vol. 32, Issue (3): 430-442. doi: 10.1007/s11390-017-1734-y

Topics: Artificial Intelligence and Pattern Recognition; Computer Graphics and Multimedia

• Special Section on Selected Paper from NPC 2011 •

Static Scene Illumination Estimation from Video with Applications

Bin Liu 1, Kun Xu 1,* (Member, CCF, ACM, IEEE), Ralph R. Martin 2 (Member, ACM, IEEE)

  1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China;
  2. School of Computer Science and Informatics, Cardiff University, Cardiff CF24 3AA, U.K.
  • Received: 2016-12-18 Revised: 2017-04-06 Online: 2017-05-05 Published: 2017-05-05
  • Contact: Kun Xu, E-mail: xukun@tsinghua.edu.cn
  • About author: Bin Liu is a Ph.D. student in the Department of Computer Science and Technology, Tsinghua University, Beijing. He received his Bachelor's degree in computer science from the same university in 2013. His research interests include image and video editing.
  • Supported by:

    This work was supported by the National Natural Science Foundation of China (NSFC) and the Israel Science Foundation (ISF), Joint NSFC-ISF Research Program under Grant No. 61561146393, the National Natural Science Foundation of China under Grant No. 61521002, a research grant from the Beijing Higher Institution Engineering Research Center, and the Tsinghua-Tencent Joint Laboratory for Internet Innovation Technology.

Abstract: We present a system that automatically recovers scene geometry and illumination from a video, providing a basis for various applications. Previous image-based illumination estimation methods require either user interaction or external information in the form of a database. We adopt structure-from-motion and multi-view stereo for initial scene reconstruction, and then estimate an environment map represented by spherical harmonics, which performed better than other bases in our experiments. We also demonstrate several video editing applications that exploit the recovered geometry and illumination, including object insertion (e.g., for augmented reality), shadow detection, and video relighting.
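
To make the spherical-harmonics step more concrete, the following is a minimal sketch of how low-order lighting coefficients could be fitted by linear least squares once surface normals are available from a reconstruction. It is not the authors' implementation: it assumes a Lambertian surface with known constant albedo, ignores shadowing and interreflection, and fits the 9 coefficients of the shading field rather than a full environment map. The function names `sh_basis` and `estimate_sh_lighting` are hypothetical.

```python
import numpy as np


def sh_basis(normals):
    """Evaluate the 9 real spherical harmonics (bands 0-2) at unit vectors.

    normals: (N, 3) array of unit surface normals.
    Returns an (N, 9) matrix with one basis function per column.
    """
    x, y, z = normals[:, 0], normals[:, 1], normals[:, 2]
    return np.stack([
        0.282095 * np.ones_like(x),        # band 0
        0.488603 * y,                      # band 1
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,                  # band 2
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ], axis=1)


def estimate_sh_lighting(normals, intensities, albedo=1.0):
    """Least-squares fit of 9 SH shading coefficients.

    Assumption (not from the paper): observed intensity at each point is
    albedo times a linear combination of the SH basis evaluated at the normal.
    """
    basis = albedo * sh_basis(normals)
    coeffs, *_ = np.linalg.lstsq(basis, intensities, rcond=None)
    return coeffs
```

In use, `normals` would come from the reconstructed mesh and `intensities` from the corresponding pixel values in one color channel; a real system would also need to handle visibility, unknown albedo, and noise, which this sketch leaves out.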
