SG-NeRF: Sparse-Input Generalized Neural Radiance Fields for Novel View Synthesis


    Abstract: Traditional neural radiance fields require dense input images and per-scene optimization to render novel views, which limits their practical applications. We propose SG-NeRF (Sparse-Input Generalized Neural Radiance Fields), a generalizable method that infers a scene from its input images and renders high-quality novel views without per-scene optimization. First, we construct an improved multi-view stereo structure based on convolutional attention and a multi-level fusion mechanism to obtain geometric and appearance features of the scene from the sparse input images; these features are then aggregated by multi-head attention and used as the input of the neural radiance field. By decoding scene features rather than mapping positions and directions, our method supports cross-scene training and inference, giving neural radiance fields the ability to generalize to novel view synthesis on unseen scenes. We tested generalization ability on the DTU dataset, where our PSNR (peak signal-to-noise ratio) improved by 3.14 over the baseline method under the same input conditions. In addition, when dense input views are available, a short period of further refinement training improves the average PSNR by 1.04, yielding higher-quality renderings.
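The abstract describes aggregating per-view geometric and appearance features with multi-head attention before they are decoded by the radiance field. The paper's exact formulation is not given here; the following is a minimal, dependency-free sketch of such an aggregation step, assuming the mean view feature serves as the attention query and that `Wq`, `Wk`, `Wv` are learned projection matrices (hypothetical names, not from the paper).

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def matvec(W, v):
    """Multiply a D x D matrix (list of rows) by a length-D vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def aggregate_views(features, Wq, Wk, Wv, num_heads):
    """Aggregate N per-view feature vectors (each length D) into one
    length-D vector via multi-head attention. The query is the mean
    of the view features (an assumption made for this sketch)."""
    n_views = len(features)
    D = len(features[0])
    dh = D // num_heads  # per-head dimension
    query = [sum(f[i] for f in features) / n_views for i in range(D)]
    q = matvec(Wq, query)
    ks = [matvec(Wk, f) for f in features]
    vs = [matvec(Wv, f) for f in features]
    out = []
    for h in range(num_heads):
        s, e = h * dh, (h + 1) * dh
        qh = q[s:e]
        # Scaled dot-product attention over the N views, per head.
        scores = [sum(a * b for a, b in zip(qh, k[s:e])) / math.sqrt(dh)
                  for k in ks]
        w = softmax(scores)
        for i in range(s, e):
            out.append(sum(w[n] * vs[n][i] for n in range(n_views)))
    return out
```

Because the attention weights form a convex combination over views, the output stays within the span of the projected view features; using the mean feature as the query is only one simple choice, and a per-ray or learned query would fit the same interface.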

     
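The reported gains are in PSNR, which the abstract expands as peak signal-to-noise ratio. For reference, a minimal stdlib-only implementation for images given as flat lists of intensities (a simplification; real evaluations operate on full H x W x 3 arrays):

```python
import math

def psnr(img_a, img_b, max_val=1.0):
    """Peak signal-to-noise ratio, in dB, between two images given as
    flat lists of pixel intensities in [0, max_val]."""
    assert len(img_a) == len(img_b)
    mse = sum((a - b) ** 2 for a, b in zip(img_a, img_b)) / len(img_a)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(max_val ** 2 / mse)
```

A gain of 3.14 dB, as reported against the baseline, corresponds to roughly halving the mean squared error of the rendered views.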
