2017, Vol. 32, Issue (1): 110-121. doi: 10.1007/s11390-017-1681-7
Topic: Computer Graphics and Multimedia
• Special Section on Selected Paper from NPC 2011 •
Ming-Ming Cheng1, Member, CCF, ACM, IEEE, Qi-Bin Hou1, Member, CCF, ACM, IEEE, Song-Hai Zhang2, Member, CCF, ACM, IEEE, and Paul L. Rosin1,3, Member, ACM, IEEE
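In recent years, computer graphics and computer vision have advanced together, producing a range of new algorithms and applications for analyzing and editing visual media. Three main factors drive this trend: i) Internet big data has created demand for processing an ever-growing volume of resources; ii) powerful processing tools, such as deep neural networks, provide effective ways to learn how to handle heterogeneous visual data; iii) new data capture devices, such as the Kinect, bridge algorithms for 2D image understanding and 3D model analysis. Because these drivers have emerged only recently, we believe the convergence of the computer graphics and computer vision research communities has only just begun. This paper surveys how computer vision and computer graphics techniques advance each other, covering analysis, editing, synthesis, and interaction techniques. We also discuss open problems in existing techniques and suggest possible directions for further research.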