Tracking an RGB-D Camera on Mobile Devices Using an Improved Frame-to-Frame Pose Estimation Method

Jaepung An∗    Jaehyun Lee†    Jiman Jeong†    Insung Ihm∗
∗Department of Computer Science and Engineering, Sogang University, Korea
†TmaxOS, Korea
{ajp5050, [email protected]    {jaehyun lee, jiman [email protected]

Abstract

The simple frame-to-frame tracking used for dense visual odometry is computationally efficient, but it is regarded as rather numerically unstable, easily entailing a rapid accumulation of pose estimation errors. In this paper, we show that a cost-efficient extension of frame-to-frame tracking can significantly improve the accuracy of estimated camera poses. In particular, we propose a multi-level pose error correction scheme in which the camera poses are re-estimated only when necessary, against a few adaptively selected reference frames. Unlike the recent successful camera tracking methods, which mostly rely on extra computing time and/or memory space to perform global pose optimization and/or maintain accumulated models, the extended frame-to-frame tracking requires keeping only a few

…tion between two time frames. For effective pose estimation, several different forms of error models for formulating a cost function were proposed independently in 2011. Newcombe et al. [11] used only the geometric information from input depth images to build an effective iterative closest point (ICP) model, while Steinbrücker et al. [14] and Audras et al. [1] minimized a cost function based on photometric error. In contrast, Tykkälä et al. [17] used both geometric and photometric information from the RGB-D image to build a bi-objective cost function. Since then, several variants of these optimization models have been developed to improve the accuracy of pose estimation. Except for the KinectFusion method [11], the initial direct dense methods were applied within the framework of frame-to-frame tracking, which estimates the camera poses by repeatedly registering the current frame against the last frame.
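For concreteness, the two families of error models mentioned above can be sketched in the following generic forms; these are illustrative only, and the exact formulations, warping functions, and weightings used in [11], [14], [1], and [17] differ in their details. Writing $\xi$ for the 6-DoF rigid motion between the previous and current frames, $T(\xi)$ for the induced transformation, $\mathbf{p}_i$, $\mathbf{q}_i$, and $\mathbf{n}_i$ for corresponding 3D points and a surface normal obtained from the depth images, $I_{\mathrm{prev}}$ and $I_{\mathrm{cur}}$ for the intensity images, and $\pi$ for the perspective projection:

\begin{align*}
  E_{\mathrm{geom}}(\xi)  &= \sum_i \bigl( (T(\xi)\,\mathbf{p}_i - \mathbf{q}_i) \cdot \mathbf{n}_i \bigr)^2
      && \text{(point-to-plane ICP term, cf. [11])} \\
  E_{\mathrm{photo}}(\xi) &= \sum_i \bigl( I_{\mathrm{cur}}(\pi(T(\xi)\,\mathbf{p}_i)) - I_{\mathrm{prev}}(\mathbf{x}_i) \bigr)^2
      && \text{(photometric term, cf. [14, 1])} \\
  E_{\mathrm{bi}}(\xi)    &= E_{\mathrm{photo}}(\xi) + \lambda\, E_{\mathrm{geom}}(\xi)
      && \text{(bi-objective combination, cf. [17])}
\end{align*}

Here $\mathbf{x}_i$ denotes the pixel at which $\mathbf{p}_i$ is observed in the previous frame, and $\lambda$ is a weighting factor. In frame-to-frame tracking, the global pose of frame $k$ is obtained by chaining the per-frame estimates, $T_{0,k} = T_{0,k-1}\,T_{k-1,k}$, so the error of any single registration propagates to all subsequent poses, which is the source of the drift that the proposed correction scheme targets.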