Journal of Computer Science and Technology, 2019, Vol. 34, Issue (3): 581-593. doi: 10.1007/s11390-019-1928-6

Special Issue: Artificial Intelligence and Pattern Recognition; Computer Graphics and Multimedia


Geometry-Aware ICP for Scene Reconstruction from RGB-D Camera

Bo Ren, Member, CCF, ACM, Jia-Cheng Wu, Ya-Lei Lv, Ming-Ming Cheng, Member, CCF, ACM, Shao-Ping Lu, Member, CCF, ACM   

  1. College of Computer Science, Nankai University, Tianjin 300350, China
  • Received: 2018-12-29 Revised: 2019-03-15 Online: 2019-05-05 Published: 2019-05-06
  • About author: Bo Ren received his Ph.D. degree in computer science from Tsinghua University, Beijing, in 2015. He is currently a lecturer in the College of Computer Science, Nankai University, Tianjin. His research interests include physically-based simulation and rendering, scene geometry reconstruction and analysis. His recent research focuses on multi-fluid and multi-phase simulations in computer graphics.
  • Supported by:
    This work was supported by Tianjin Natural Science Foundation of China under Grant Nos. 18JCYBJC41300 and 18ZXZNGX00110, and the National Natural Science Foundation of China under Grant No. 61620106008.

The Iterative Closest Point (ICP) scheme has been widely used for the registration of surfaces and point clouds. However, when working on depth image sequences that contain large geometric planes with few or no details, existing ICP algorithms are prone to tangential drifting and erroneous rotational estimations due to input device errors. In this paper, we propose a novel ICP algorithm that aims to overcome these drawbacks and provides significantly more stable registration estimation for simultaneous localization and mapping (SLAM) tasks on RGB-D camera inputs. In our approach, the tangential drifting and the rotational estimation error are reduced by 1) updating the conventional Euclidean distance term with local geometry information, and 2) introducing a new camera stabilization term that prevents improper camera movement in the calculation. Our approach is simple, fast, and effective, and can be readily integrated with previous ICP algorithms. We test our new method with the TUM RGB-D SLAM dataset on state-of-the-art real-time 3D dense reconstruction platforms, i.e., ElasticFusion and Kintinuous. Experiments show that our new strategy outperforms all previous ones on various RGB-D data sequences under different combinations of registration systems and solutions.

Key words: ICP (iterative closest point); RGB-D; tangential drifting; rotational estimation; covariance matrix
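
The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation, which the abstract does not give. As assumptions, the Euclidean residual is weighted by the inverse of a local neighbourhood covariance (a Generalized-ICP-style stand-in for "local geometry information"), and the "camera stabilization term" is modelled as a simple quadratic penalty `lam` on the pose update. The function names, the neighbourhood size `k`, and the constants `lam` and `eps` are all hypothetical.

```python
import numpy as np

def skew(v):
    """Cross-product matrix: skew(v) @ u == np.cross(v, u)."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def local_covariances(points, k=8):
    """Covariance of each point's k nearest neighbours (brute-force search)."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    idx = np.argsort(d2, axis=1)[:, :k]
    covs = np.empty((len(points), 3, 3))
    for i, nb in enumerate(idx):
        q = points[nb] - points[nb].mean(axis=0)
        covs[i] = q.T @ q / k
    return covs

def weighted_icp_step(src, dst, dst_cov, lam=1e-3, eps=1e-3):
    """One linearised pose update for known correspondences src[i] <-> dst[i].

    Minimises  sum_i r_i^T W_i r_i + lam * |xi|^2  over xi = (omega, t), where
    r_i = src_i + omega x src_i + t - dst_i  and  W_i = inv(dst_cov_i + eps*I).
    The weighting and the damping term are assumptions of this sketch.
    """
    H = lam * np.eye(6)   # damping keeps the update small when data is weak
    g = np.zeros(6)
    for p, q, C in zip(src, dst, dst_cov):
        W = np.linalg.inv(C + eps * np.eye(3))
        J = np.hstack([-skew(p), np.eye(3)])  # d r / d (omega, t)
        r = p - q
        H += J.T @ W @ J
        g += J.T @ W @ r
    return np.linalg.solve(H, -g)  # xi = (omega_x, omega_y, omega_z, t_x, t_y, t_z)
```

With a small `lam` the step reduces to an ordinary weighted Gauss-Newton update. Increasing `lam` shrinks the estimated camera motion, which is one simple way to discourage large pose changes when the geometry constrains the solution poorly. A full ICP loop would recompute correspondences and repeat this step until convergence.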


ISSN 1000-9000(Print)

CN 11-2296/TP

Journal of Computer Science and Technology
Institute of Computing Technology, Chinese Academy of Sciences
P.O. Box 2704, Beijing 100190 P.R. China
  Copyright ©2015 JCST, All Rights Reserved