Journal of Computer Science and Technology, 2021, Vol. 36, Issue (3): 572-587. doi: 10.1007/s11390-021-1299-7

Special topic: Computer Graphics and Multimedia


CNLPA-MVS: Coarse-Hypotheses Guided Non-Local PAtchMatch Multi-View Stereo

Qi-Tong Zhang, Member, CCF, Shan Luo, Member, CCF, Lei Wang, Member, CCF, and Jie-Qing Feng*, Distinguished Member, CCF        

  1. State Key Laboratory of CAD&CG, Zhejiang University, Hangzhou 310058, China
  • Received: 2021-01-19 Revised: 2021-04-14 Online: 2021-05-05 Published: 2021-05-31
  • Contact: Jie-Qing Feng E-mail: jqfeng@cad.zju.edu.cn
  • About author:Qi-Tong Zhang received her B.S. degree in digital media technology from Shandong University, Jinan, in 2017. She is now a Ph.D. candidate in the State Key Laboratory of CAD&CG, Zhejiang University, Hangzhou. Her fields of interest are multi-view stereo and 3D reconstruction.
  • Supported by:
    This work was jointly supported by the National Natural Science Foundation of China under Grant Nos. 61732015, 61932018, and 61472349, and the National Key Research and Development Program of China under Grant No. 2017YFB0202203.

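Background
In multi-view stereo (MVS) reconstruction, obtaining high-quality, complete 3D models remains a challenge because complete and accurate depth is hard to estimate in low-textured regions. The main difficulty in such regions is matching ambiguity: pixel colors within a local window are too similar to discriminate. Methods based on the Markov Random Field (MRF) model use non-local information to reduce this ambiguity and thereby improve reconstruction completeness, but they have high computational complexity in the continuous depth space. In recent years, PatchMatch-based MVS methods have gradually become the mainstream approach, owing to their high reconstruction accuracy and high computational efficiency in continuous space. However, because the original PatchMatch computation contains only a data term, it still cannot adequately account for global information in large low-textured regions and struggles to resolve the matching ambiguity there.
Objective
This paper aims to design a PatchMatch-based MVS method that fully exploits non-local information to improve reconstruction completeness in low-textured regions.
Methods
By combining the strengths of two families of methods (MRF-based non-local methods and PatchMatch-based MVS) and letting each compensate for the other's weaknesses, we propose a novel reconstruction method, Coarse-Hypotheses Guided Non-Local PAtchMatch Multi-View Stereo (CNLPA-MVS). First, we combine dynamic programming with sequential propagation along scanlines, using non-local information in a parallel computation so that each pixel obtains optimal depth and normal values. Second, we introduce a winner-takes-all strategy at low resolution to obtain initial depth and normal values and use them as candidate optimal values, which reduces the stripe artifacts caused by dynamic programming and improves completeness. Finally, based on the assumption that pixels of similar color have similar depth, we integrate a local consistency strategy into CNLPA-MVS to further improve completeness.
Results
Experiments on public datasets show that, even in heavily occluded indoor and outdoor environments, the models reconstructed by CNLPA-MVS remain highly complete in low-textured regions. Quantitative and qualitative comparisons show that CNLPA-MVS achieves state-of-the-art performance in completeness and overall quality.
Conclusions
This paper proposes a non-local PatchMatch MVS method that can be processed in parallel and effectively alleviates matching ambiguity in low-textured regions. To further improve completeness in these regions, it also incorporates coarse-hypothesis guidance and a local consistency strategy. Experiments show that the method reconstructs highly complete 3D models and is broadly applicable to low-textured regions under heavy occlusion. In addition, it improves the optimization efficiency of MRF-based global methods and offers a new line of attack. In future work, we will combine diffusion-like propagation with global optimization to further improve computational efficiency and reconstruction quality.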
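To make the first step of the method more concrete, the following is a minimal sketch of dynamic programming along a single scanline, where each pixel carries a few candidate depth hypotheses and a smoothness term links neighbouring pixels. It is an illustration under stated assumptions, not the authors' energy or implementation: the function name `scanline_dp`, the truncated linear penalty, and the parameters `lam` and `tau` are all hypothetical, and normals, view selection, and parallel execution are omitted.

```python
def scanline_dp(candidates, data_cost, lam=1.0, tau=2.0):
    """Pick one depth candidate per pixel along a scanline by dynamic programming.

    candidates[i][k] : k-th depth hypothesis at pixel i (hypothetical input)
    data_cost[i][k]  : matching cost of that hypothesis (lower is better)
    lam, tau         : weight and truncation of the smoothness penalty (assumed)
    """
    n = len(candidates)
    # best[i][k]: minimal accumulated cost of a path ending at candidate k of pixel i
    best = [list(data_cost[0])]
    back = [[-1] * len(candidates[0])]
    for i in range(1, n):
        row_cost, row_back = [], []
        for k, d in enumerate(candidates[i]):
            # Truncated linear penalty on the depth jump from the previous pixel.
            options = [
                best[i - 1][j] + lam * min(abs(d - d_prev), tau)
                for j, d_prev in enumerate(candidates[i - 1])
            ]
            j_min = min(range(len(options)), key=options.__getitem__)
            row_cost.append(data_cost[i][k] + options[j_min])
            row_back.append(j_min)
        best.append(row_cost)
        back.append(row_back)
    # Backtrack from the cheapest final state.
    k = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = [k]
    for i in range(n - 1, 0, -1):
        k = back[i][k]
        path.append(k)
    path.reverse()
    return [candidates[i][k] for i, k in enumerate(path)]
```

With `lam=0` the result reduces to an independent per-pixel choice; a positive `lam` lets confident neighbours resolve a pixel whose data cost alone is ambiguous, which is the role non-local information plays in low-textured regions.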

Abstract: In multi-view stereo, unreliable matching in low-textured regions has a negative impact on the completeness of reconstructed models. Since the photometric consistency of low-textured regions is not discriminative under a local window, non-local information provided by the Markov Random Field (MRF) model can alleviate the matching ambiguity, but MRF optimization is limited in continuous space by its high computational complexity. Owing to their sampling and propagation strategy, PatchMatch multi-view stereo methods are well suited to optimizing the continuous labeling problem. In this paper, we propose a novel method that addresses the matching ambiguity, namely Coarse-Hypotheses Guided Non-Local PAtchMatch Multi-View Stereo (CNLPA-MVS), which takes advantage of both MRF-based non-local methods and PatchMatch multi-view stereo and lets each compensate for the other's defects. First, we combine dynamic programming (DP) and sequential propagation along scanlines in parallel to perform CNLPA-MVS, thereby obtaining the optimal depth and normal hypotheses. Second, we introduce coarse inference within a universal window provided by winner-takes-all to eliminate the stripe artifacts caused by DP and improve completeness. Third, we add to CNLPA-MVS a local consistency strategy, based on the assumption that pixels of similar color share similar values, to further improve completeness. CNLPA-MVS was validated on public benchmarks and achieved state-of-the-art performance with high completeness.

Key words: 3D reconstruction, multi-view stereo, PatchMatch, dynamic programming
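The coarse inference step mentioned in the abstract can be pictured as a winner-takes-all (WTA) selection on a low-resolution cost volume, whose result is then carried to full resolution as an extra candidate hypothesis. The sketch below assumes a precomputed cost volume and uses nearest-neighbour upsampling purely for simplicity; the names `coarse_wta` and `upsample_nearest`, the data layout, and the upsampling choice are illustrative assumptions and are not taken from the paper.

```python
def coarse_wta(cost_volume, depth_samples):
    """Winner-takes-all on a low-resolution cost volume.

    cost_volume[y][x][k] : matching cost of depth_samples[k] at coarse pixel (y, x)
    Returns a coarse depth map holding the lowest-cost depth sample per pixel.
    """
    return [
        [depth_samples[min(range(len(costs)), key=costs.__getitem__)]
         for costs in row]
        for row in cost_volume
    ]


def upsample_nearest(coarse, scale):
    """Nearest-neighbour upsampling by an integer factor (assumed, for simplicity)."""
    return [
        [row[x // scale] for x in range(len(row) * scale)]
        for row in coarse
        for _ in range(scale)
    ]


# Usage: a 1x2 coarse image with three sampled depths per pixel.
depth_samples = [1.0, 2.0, 3.0]
cost_volume = [[[0.9, 0.1, 0.5], [0.2, 0.8, 0.7]]]
coarse_depth = coarse_wta(cost_volume, depth_samples)
candidate_map = upsample_nearest(coarse_depth, 2)
```

In a full pipeline, each entry of `candidate_map` would be offered to the fine-level optimisation as one more hypothesis alongside the propagated ones, rather than being taken as the final depth.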
