Robust low-overlap 3-D point cloud registration for rejection

John Stechschulte,1 Nisar Ahmed,2 and Christoffer Heckman1

Abstract— When registering 3-D point clouds it is expected y1 y2 y3 y4 that some points in one cloud do not have corresponding y5 y6 y7 y8 points in the other cloud. These non-correspondences are likely z1 z2 z3 z4 to occur near one another, as surface regions visible from y9 y10 y11 y12 one sensor pose are obscured or out of frame for another. z5 z6 z7 z8 In this work, a hidden Markov random field model is used to capture this prior within the framework of the iterative y13 y14 y15 y16 z9 z10 z11 z12 closest point algorithm. The EM algorithm is used to estimate the distribution parameters and learn the hidden component z13 z14 z15 z16 memberships. Experiments are presented demonstrating that this method outperforms several other outlier rejection methods when the point clouds have low or moderate overlap. Fig. 1. Graphical model for nearest fixed point distance, shown for a 4×4 grid of pixels in the free depth map. At pixel i, the Zi ∈ {±1} value is I.INTRODUCTION the unobserved inlier/outlier state, and Yi ∈ R≥0 is the observed distance to the closest fixed point. Depth sensing is an increasingly ubiquitous technique for , 3-D modeling and mapping applications. Struc- Which error measure is minimized in step 3 has been con- tured light sensors, , and stereo matching produce point sidered extensively [4], and is explored further in Section II. clouds, or 3-D points on surfaces. Algorithmic registration There are closed-form solutions [5] that minimize the point- of point clouds is needed to use the resulting data, often to-point distance, including using the singular value decom- without the aid of other measurements. Iterative closest point position [6] and the dual number quaternion method [7]. (ICP) [1] is commonly used for this purpose, although this The present work minimizes the sum of squared point-to- time-tested approach has its drawbacks, including failure point distances, for simplicity, although it is fundamentally when registering clouds with low overlap [2], [3]. However, independent of choice of error metric. many applications of 3-D registration would be more efficient with low overlap clouds, such as combining partial scans A crucial step in ICP is the rejection of (step 2), into a complete 3-D model. To address this gap, we present generally resulting from non-overlapping volumes of space a probabilistic model using a hidden Markov random field between two measurements. The original ICP formulation [1] (HMRF), shown in Fig. 1, for inferring via the EM algorithm does not discard any points and simply incurs error for which points lie in the overlap. A successful alignment of each outlier. A proliferation of strategies have been proposed clouds with low overlap is shown in Fig. 2. for discarding outliers [3], [8], [9], [10], [11], [12]. These ICP attempts to recover the optimal transformation to align methods threshold the residual distance from free points to two point clouds. The algorithm recovers the transformation the nearest point in the fixed cloud, but do not consider the that moves the “free” point cloud onto the “fixed” point cloud spatial relations amongst points in the free cloud, and what by iteratively: information might be gained by considering whether a point’s neighbors are inliers or outliers. In the present work we pro- 1) finding the closest point in the fixed cloud to each point pose an alternative method where the distribution of residual in the free cloud; distances is modeled as a mixture of two distributions: a 2) discarding some of these matches as outliers; and Gaussian for inliers and a logistic distribution for outliers. 3) computing and applying the rigid transform that op- Using a technique from image segmentation, a point’s inlier timally aligns the remaining points (the inliers), min- or outlier state is modeled as being influenced by the state imizing some measure of the nearest point distances of its neighbors through a hidden Markov random field— found in step 1; an expectation we refer to as the “neighbor prior.” The EM until convergence. The present work focuses on step 2 of the algorithm, with the use of a mean field approximation, allows above sequence. for inference of the hidden state. 1 Department of Computer Science, University of Col- By employing the neighbor prior, the HMRF model can orado, Boulder, [email protected], infer which points are outliers in high- and low-overlap [email protected] 2 Department of Aerospace Engineering, University of Colorado, Boul- cloud pairs. Although exact inference for an MRF model is der, [email protected] intractable in applications of reasonable size, the mean field JS was supported by the U.S. Department of Defense (DoD) through the approximation provides sufficient accuracy at a reasonable National Defense Science & Engineering Graduate Fellowship (NDSEG) program while conducting this work. CH was supported by DARPA award computational expense. Further justification and development no. N65236–16–1–1000. of the neighbor prior is discussed at length in Sections II presented, and so could be implemented within HMRF ICP, possibly resulting in further performance improvements. Improvements to ICP have been achieved through better rejection of outlier correspondences—a thread of research that the current work fits into. Several methods work by thresholding the residual distances, and differ simply in how the threshold is determined: fixed distance [8], residual per- centile [3], standard deviations from the mean residual [9], or median absolute deviations from the median residual [10]. These methods inherently assume a large overlap fraction, so are brittle to overlap variations. Zhang [11] presents an adaptive threshold tuned with a distance parameter that does not have direct physical significance. Fractional ICP dynamically adapts the fraction of point correspondences to be used [18], although this method explicitly assumes a large fraction of overlap, with a penalty for smaller overlap fractions. Finally, there are a few outlier-rejection methods that cannot be summarized as simple thresholds. Enqvist, Fig. 2. A successful registration with only 36% overlap. The blue points are the fixed cloud, and the green and red points together are the free cloud, et al. [12] leverage the fact that distances between corre- which has been registered. The green points are inliers and the red are sponding points within a cloud will be invariant under rigid outliers, as inferred by the HMRF method. motion and find the largest set of consistent correspondences to identify inliers. In [19], points are classified as inliers or and III. Sections IV and V present experiments and results one of three classes of outliers: occluded, unpaired (outside on two challenging benchmark dense 3D datasets, showing the frame), or outliers (sensor noise). The current work is this method outperforms two state-of-the-art methods, par- more closely related to the simple threshold methods, but ticularly on point clouds with low overlap. To the authors’ uses a probabilistic model that better captures the expected knowledge, this is the first example of an HMRF model being residual distribution. employed for 3-D registration, and these promising results In contrast to the current method, EM-ICP [20], [21] show that the model is a viable and attractive alternative to employs EM to recover point correspondences. The corre- other more complex probabilistic scan matching techniques. spondence is the hidden variable and the E-step computes In particular, the HMRF’s relative simplicity and ease of assignment probabilities, and the implied rigid transforma- deployment make it scalable and adaptable to different tion is recovered in the M-step. This approach assumes that environments where little or no prior knowledge or training every point in the free cloud corresponds to some point in data are available. the fixed cloud, although it may be straightforward to include an “unassigned” value to the hidden variable states. II.RELATED WORK Since ICP requires an informed initialization, significant work has been devoted to achieving global registration or Several attempts have been made to improve registra- finding an approximate alignment from any initial state, tion performance in the low-overlap regime. The hybrid which is then refined with ICP. In Super 4PCS [22], [23], genetic/hill-climbing algorithm of Silva, et al. [13] shows an initial alignment is found by matching sets of 4 coplanar success with overlaps down to 55% but at the great compu- points, using ratios invariant to rigid transformation; good tational expense of a stochastic global search. Good low- performance with low overlap is claimed, and is used as overlap performance is claimed in [14], which defines a a basis for comparison in this work. Fast global registration “direction angle” on points and then aligns clouds in rotation between two or more point clouds is achieved in [24] by find- using a histogram of these, and in translation using correla- ing correspondences only once based on point features, then tions of 2-D projections; although not described as such, a using robust estimators to mute the impact of spurious cor- Manhattan world assumption is made. Rotational alignment respondences. Branch-and-bound algorithms can guarantee is recovered using extended Gaussian images in [2], and global optimality, and several variations have been applied to refined with ICP, showing success with overlap as low the registration problem to search over transformations [25] as 45%. HMRF ICP is simpler than these methods, and or over point or feature correspondences [26]. With the generally performs well at overlaps that are still lower. exception of Super 4PCS, low-overlap performance is not Robust statistics or improved error metrics can make ICP addressed in these works, and many require final refinement, more robust to outliers. The point-to-plane [15] metric, which thereby offering an opportunity to apply HMRF ICP. measures the distance projected onto the surface normal of Finally, Ramos, et al. [27] and Sun, et al. [28] use the fixed point, is frequently used and more robust in the case conditional random fields to discriminatively match 2-D lidar of limited overlap [16]. Sparse norms are used within ICP scans in an ICP setting with impressive and reliable results. in [17]. These additions are mostly orthogonal to the work Although extending the method to 3-D registration is theoret- ically straightforward, the computational cost would increase Function ICP(Initial transform Tinit, clouds B and C, significantly. These methods use an AdaBoost classifier to field parameter β, thresholds): combine several derived features, which requires training the KDtree = BuildKdTree(C) classifier with ground truth alignments. The classifier may T = Tinit not generalize well to new observations that are not drawn B = T × B from the same distribution as the training data. In contrast, I, y = KDtree.NearestNeighbors(B) the generative neighbor prior leveraged in this work is a Initialize z˜ to 1 except for highest 10% of y, which direct consequence of sensor geometry, and thus does not are initialized to -1. require training data. do Few existing methods specifically address the problem of do aligning clouds with low overlap—more often, high overlap θ = M-step(y, z˜) is assumed. By using an appropriate probabilistic model that z˜ = E-step(y, z˜, θ, β) while some z˜ value changed sign captures the neighbor prior, this assumption need not be T = localize(B, C, z˜) made, and the resulting method works equally well with high step T = T × T and low overlap. step B = Tstep × B III.PROBABILISTIC NEIGHBOR PRIOR MODEL I, y = KDtree.NearestNeighbors(B) while T sufficiently large Let B ∈ 4 be a point in the free cloud in homogeneous step i R return T coordinates. In the first step of ICP, we find the closest point Function E-step(y, z˜, θ, β): 4 to Bi in the fixed cloud, which we will denote Cj ∈ R . Calculate update, z˜ ← E[Z|y, z˜, θ] The distance between them, yi = kBi −Cik, is the observed return z˜ residual. Y will denote all Yi random variables, with specific Function M-step(y, z˜): Calculate MLE of θ from complete data log instantiations represented as y and yi, respectively. Many sources of depth data generate data with a 2-D likelihood, assuming E[Z] = z˜ lattice structure—the pixel grid—in which each pixel has return θ four nearest neighbors. We exploit these neighbor relations Algorithm 1: The full HMRF ICP algorithm. Pyramiding to model the distribution of the closest point distances Y . has been omitted for clarity, as well as iteration limits on Neighbor relations can also be defined on unstructured point loops. clouds, as discussed below, although a grid topology is more intuitive. Note that calculation of W , and therefore exact calculation Given observed distances y, we wish to decide which of PG(z), requires a sum over all possible configurations, so points are inliers. Our prior beliefs are: (i) inliers will is exponential in the number of nodes. For a nontrivial field generally lie closer to their respective closest point than this is intractable. We avoid this issue with the mean field outliers; and (ii) neighbors of inliers are likely inliers, and approximation [29], which assumes a fixed configuration z˜ = neighbors of outliers are likely outliers—the neighbor prior. E[PG(Z|β)] and approximates the Gibbs distribution with To capture these priors, we model the distribution of Y as independent components conditioned on z˜, a mixture of two distributions, one for inliers and one for Y P (Z|β) ≈ P (Z |β, z˜). (1) outliers, where a point’s mixture membership is dependent G mf i i on its nearest neighbors. That is, we capture the second prior As the components are independent, it is no longer necessary using a hidden Markov random field on the inlier/outlier state to exhaust over z configurations. The components also of a point. depend only on local information: A. Probabilistic model Pmf(zi|β, z˜) = The graphical model for data with a grid topology is shown P exp (β i0∼i wi,i0 ziz˜i0 ) in Fig. 1. The distribution of Y conditionally depends on the P P . exp (β w 0 (+1)˜z 0 ) + exp (β w 0 (−1)˜z 0 ) hidden field Z, which is Gibbs distributed with a parameter i0∼i i,i i i0∼i i,i i (2) β that controls the strength of the neighbor influence. The Gibbs distribution is calculated based on the energy of a We assume that inliers are normally distributed and out- −1 liers are logistically distributed (as discussed further below). given configuration, PG(Z) = W exp(−H(Z)), where W is a normalization term called the “partition function”, The approximate complete data likelihood can be written, P W = z exp(−H(z)). Y P fmf(Y , Z|β, θ, z˜) = Pmf(zi|β, z˜) We use the energy function H(z) = −β 0 w 0 z z 0 i ∼i i,i i i i where w 0 is the edge weight, and β ≥ 0 is a parameter i,i (N(y |µ , σ ))(1+zi)/2(L(y |µ , s ))(1−zi)/2, controlling the interaction strength. If all neighbor relations i +1 +1 i −1 −1 (3) are considered equally, such as in a grid topology with no with parameters β, µ−1, s−1, µ+1, σ+1. We will use θ to privileged edges, the weights are all 1. More generally, they represent all Gaussian and logistic parameters, that is, θ = can capture the strength of a particular neighbor interaction. {µ−1, s−1, µ+1, σ+1}. B. Applying the EM algorithm For the outliers,   X 1 − zi The maximum likelihood model parameters and hidden n = (9) −1 2 state are estimated using the EM algorithm, similarly to i the image segmentation model presented in [30], [31], [32]. 1 X 1 − zi µ = y (10) In the E-step, the current estimate of the normal and lo- −1 n 2 i −1 i gistic parameters, along with the current mean field and √ s 3 1 X 1 − zi the observed residuals, are used to find the expected value s = y2 − µ2 . (11) −1 π n 2 i −1 of the hidden field. Then, in the M-step, the estimates for −1 i the normal and logistic parameters are updated using the C. Fitting the outlier distribution maximum likelihood estimates from the observed data and the expected value of the hidden field. The full HMRF ICP It was observed that some outlier distributions had multiple algorithm is shown in Algorithm 1. modes or heavy tails, as distant regions were observed. To better model this distribution, outlier residuals for 100 pairs The complete data log likelihood can be bounded be- of frames from each of the three datasets described in Sec- low by taking the expectation with respect to the hidden tion IV were analyzed. A Kolmogorov-Smirnov statistic was Z values, due to Jensen’s inequality. Because of linearity used to compare the empirical CDF with the normal distribu- of expectation, and since z appears linearly in the com- i tion and six heavy-tail distributions: the gamma, logistic, log- plete data log likelihood, this amounts to replacing z with i logistic, log-normal, Rayleigh, and Student’s t distributions. [z |y , z˜, θ, β] = 2P (z = 1|y , z˜, θ, β) − 1 throughout E i i i i No distribution outperforms all others in all cases, but the Equation 3. The EM algorithm iteratively calculates these logistic distribution shows good average performance and is probabilities in the E-step, and then chooses parameters to simple to estimate within the EM framework. Nevertheless, maximize the resulting expected likelihood in the M-step. the worst cloud pairs still fail to align—in particular, those 1) The E-step: The mean field approximation allows for with multimodal outlier residuals. A would a closed-form E-step, where we calculate, likely be more effective in these cases, but would also be significantly more computationally expensive. P (z = +1|y , z˜, θ, β) ∝ i i D. Accelerating EM convergence ! 2 ! X 1 (yi − µ+1) The EM algorithm is accelerated via a pyramid method: exp β wi,i0 z˜i0 q exp 2 0 2 2σ+1 we downsample the Z and Y fields, allow EM to converge i ∼i 2πσ+1 Z (4) for this smaller model, then upsample the converged to initialize the larger model. For grid topologies, each pyramid P (zi = −1|yi, z˜, θ, β) ∝ level has a quarter the number of pixel sites as the level   ! exp − yi−µ−1 below it (half in each dimension). For unstructured clouds, X s exp −β wi,i0 z˜i0 2 discussed below, each pyramid level has half the number   y −µ  i0∼i i −1 s 1 + exp − s of points as the level below it. Although pyramiding is (5) a common technique in , such as in the pyramidal Lucas Kanade optical flow algorithm [33], this is the first application to initializing a Markov random field. The necessary normalization factor is the same for the two calculations, so is simply their sum. E. Unstructured clouds 2) The M-step: Now, the expected hidden values can be The pixel grid provides an intuitive structure on which used to update the mean field and problem parameters in to define the Markov random field, but the method can the M-step using maximum likelihood estimates. Updating be extended to any point cloud by defining neighbor rela- the mean field is simply adopting the expected zi values tions independent of underlying topology. We measure the calculated in the E-step. The MLEs of the normal and logistic “neighborliness” of two points by their distance, and apply 2 distribution parameters are calculated in the normal way, but a Gaussian kernel, exp −kBi − Bjk/2σ , to generate an the data are weighted by the probabilities that they came appropriate edge weight. The σ parameter is set to half the from the given distribution. For the inliers, average nearest-neighbor distance. Calculating weights for every pair of points would require   2 X 1 + zi N work, but only the nearest neighbors to a given point n = (6) +1 2 will have non-negligible values, by design. Thus, neighbor i weights are only calculated for the k-nearest neighbors of 1 X 1 + zi µ = y (7) each point, with 6 ≤ k ≤ 10 giving satisfactory experimental +1 n 2 i +1 i results. Applying the pyramid acceleration to this neighbor- s hood structure, however, is not straightforward. The obvious 1 X 1 + zi σ = y2 − µ2 . (8) +1 n 2 i +1 downsampling and upsampling methods available with a grid +1 i cannot be used for an arbitrary graph. Instead, we sample half InfiniTAM[37] during the reconstruction of the scene. The camera calibrations are known and used to appropriately unproject the depth maps and generate point clouds. In the shark sequence, 100 pairs of frames were selected at random, but stratified to include a variety of overlap ratios. Similarly, 90 frames were selected from the TUM datasets: one each from each dataset and overlap decile, except 0–10%. The overlap is estimated using the ground truth poses and the relative distance to each free point’s nearest neighbor in its own cloud versus in the fixed cloud. The threshold for this comparison was chosen to err on the side of overestimating the overlap ratio. The frame pairs were aligned using ground truth poses, and translated so that the coordinate origin was at the centroid of the fixed frame. Then, the free cloud alignment Fig. 3. Successful registration of two lidar spins, taken from the KITTI was perturbed by rotating 18 degrees about a random axis odometry dataset [34]. On the left, the initial registration from the RTK GPS through the origin. The same initialization was used for all poses. On the right, the registration refined by HMRF ICP. No information methods for a given frame pair (including the global meth- about the data topology was used. ods, Go-ICP and Super 4PCS). HMRF ICP was configured with 4 pyramid levels. Before the first ICP transformation the points at random to build a new layer on the pyramid. was calculated, EM was limited to 150 iterations at each The new layer needs neighbor weights, as well, which we pyramid level, although this limit was rarely met; after calculate by squaring the neighbor matrix for the lower level, applying the first transformation, EM was limited to 5 steps with diagonal elements set to 0. This captures the two-hop at each pyramid level. All experiments were executed in weight between points in the lower neighbor matrix. Finally, MATLAB on a workstation with 8 Intel R Xeon R E5620 once EM converges in a pyramid layer, the resulting Z field CPUs at 2.40GHz, and with 48 GB RAM. The Go-ICP is upsampled using the lower-level neighborhood matrix to and Super 4PCS implementations are available from the initialize the points that were not sampled in the higher-level authors [35], [23]. Both are written in C++ and are single- neighborhood. threaded. Parameters for these methods were chosen based Figure 3 shows the successful alignment of two lidar on the provided demos, and their execution time was limited spins, taken from the KITTI odometry dataset [34]. The to 120 seconds. The Super 4PCS algorithm requires a prior ground truth poses in this dataset are from RTK GPS and estimate of the cloud overlap, which was provided from the IMU measurements, which have high accuracy over long overlap estimate described above. All code can be found at sequences, but are not sufficiently accurate for the purpose https://github.com/JStech/ICP. of our experiments, described below. V. RESULTS IV. EXPERIMENTS The results were noisy and no method was consistently The HMRF method is compared with Go-ICP [35] and best. Complete results are available in the supplementary ma- Super 4PCS [23], as well as ICP with five other outlier terials. HMRF ICP performed best among the ICP variants, rejection methods: (1) no outlier rejection, (2) keeping 90% so we compare it to Go-ICP and Super 4PCS here. of the points with smallest residuals, (3) keeping points Fig. 4 shows summary plots of the rotation error, trans- whose residuals were less than 2.5 standard deviations above lation error, and elapsed time for each method. At high the mean, (4) keeping points whose residuals were less overlaps, Go-ICP often recovered the transformation most than 5.2 median absolute deviations above the median (the accurately. However, its performance deteriorated quickly as so-called “X84” criterion), and (5) the dynamic threshold the overlap decreased: only 7 Go-ICP tests with overlaps from Zhang [11]. All of the ICP implementations (including below 70% ran to completion within the time limit and HMRF ICP) use the point-to-point sum of squares distance returned transformations. Super 4PCS performed well at metric. moderate overlaps, and often recovered the most accurate The method is applied to several datasets: the shark transformation in rotation and translation for overlaps around sequence is a tabletop scene, taken by an Asus Xtion Pro; 50%. At very low overlaps (below 20%), no method per- the remaining experiments were run on ten publicly available formed reliably. Super 4PCS was very fast at moderate- RGB-D SLAM datasets [36] from the TUM Computer Vision to-high overlaps, even running single-threaded. However, it Group (specifically, those datasets in the “Handheld SLAM” had a higher variance than HMRF ICP, in both accuracy category). Ground truth poses are known for both sequences. and elapsed time. Note that the elapsed time of HMRF The sensor in the desk and room sequences is tracked ICP cannot be compared directly with that of Go-ICP and using an external motion tracking system. The poses in the Super 4PCS, as HMRF ICP was implemented entirely in shark sequence are those estimated in the tracking stage of MATLAB (and was thus able to take advantage of built-in Translation error Rotation error Elapsed time 1.5 1.5 40 1 1 20 0.5 0.5 Shark scene Error (meters) Error (radians)

0 0 Elapsed time (sec) 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 1.5 1.5 40 1 1 20 0.5 0.5 TUM scenes Error (meters) Error (radians)

0 0 Elapsed time (sec) 0 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 Overlap fraction Overlap fraction Overlap fraction

Go-ICP Maximum Q3 Super 4PCS Median Q1 HMRF Minimum

Fig. 4. Error and elapsed time plots for shark scene (top) and all TUM scenes (bottom). For each method, the data are aggregated by overlap decile, and the minimum, first quartile, median, third quartile, and maximum are shown. A method is only plotted if it ran successfully for at least half of the cases in the decile: hence the truncated Go-ICP and Super 4PCS plots. multithreading optimizations), whereas Go-ICP and Super The pixels that are unobserved (that is, the sensor returns no 4PCS were single-threaded in C++ (although then had the depth measurement) occupy 31% of the image. In the initial advantage of being compiled). However, the performance z˜ field, 90% of observed pixels are considered inliers, and trends can still be reliably understood: all methods perform the remaining 10% are considered outliers. After 375 initial more slowly as overlap fraction drops, although HMRF ICP’s EM iterations, the z˜ field has converged, and now has only performance does not deteriorate as quickly as the other 54% inliers and 46% outliers. By eliminating these outliers methods. before the first transformation is calculated, divergence from the nearby optimum is avoided. An example alignment is shown at https://youtu.be/w4eVOgd7Zes. Despite the use of the logistic distribution to model outliers, there were still cases where alignment failed because of multimodal outlier residuals. To address this failure mode, a still better representation of the residual distributions for Fig. 5. Example Y , z˜(1), and z˜(375) (at convergence). This frame has 36% outliers is necessary. For instance, modeling the outliers as overlap with the fixed frame. The left image shows the observed distance a Gaussian mixture of several components could allow the to the nearest fixed point in the initial configuration; yellow is farther, blue field to fit the distant outliers with one Gaussian, and the is nearer, and the white areas are unobserved. In the right two images, blue pixels are outliers and red pixels are inliers, green pixels are unobserved. nearer outliers as another. Using robust norms [17], [38] and the point-to-plane metric [15] would also make HMRF ICP more robust to imperfect estimations of which points lie in VI.DISCUSSIONAND CONCLUSIONS the overlap of the two clouds. The good performance of HMRF ICP at low overlaps can The HMRF model for the overlap of point clouds being be understood by considering the example frame with 36% aligned via ICP has demonstrated advantages at low overlap overlap shown in Fig. 5. The three images all represent values without sacrificing performance at high overlap. The HMRF before the first transformation is applied to the free cloud: model captures the neighbor prior and describes observed the first is the residual distance to the nearest fixed point, the inlier/outlier behavior well, and so can adapt to the particular second is the initial setting of the z˜ field, and the third is the clouds being aligned. This would prove useful in the con- converged z˜ field before the first iteration of ICP. The HMRF struction of models from 3-D scanner data, as fewer scans model flexibly adapts to the small proportion of inlier points, would be required, or aligning depth readings at a low frame in particular it eliminates outliers across the top of the image. rate, allowing greater differences between frames. REFERENCES [23] N. Mellado, D. Aiger, and N. J. Mitra, “Super 4PCS fast global pointcloud registration via smart indexing,” in Computer Graphics [1] P. J. Besl, N. D. McKay, et al., “A method for registration of 3- Forum, vol. 33, pp. 205–215, Wiley Online Library, 2014. D shapes,” IEEE Transactions on Pattern Analysis and Machine [24] Q.-Y. Zhou, J. Park, and V. Koltun, “Fast global registration,” in Intelligence, vol. 14, no. 2, pp. 239–256, 1992. European Conference on Computer Vision, pp. 766–782, Springer, [2] A. Makadia, A. Patterson, and K. Daniilidis, “Fully automatic registra- 2016. tion of 3D point clouds,” in Computer Vision and , [25] C. Papazov and D. Burschka, “Stochastic global optimization for 2006 IEEE Computer Society Conference on, vol. 1, pp. 1297–1304, robust point set registration,” Computer Vision and Image Understand- IEEE, 2006. ing, vol. 115, no. 12, pp. 1598–1609, 2011. [3] D. Chetverikov, D. Svirko, D. Stepanov, and P. Krsek, “The trimmed [26] N. Gelfand, N. J. Mitra, L. J. Guibas, and H. Pottmann, “Robust global iterative closest point algorithm,” in Pattern Recognition, 2002. Pro- registration,” in Symposium on , vol. 2, p. 5, 2005. ceedings. 16th International Conference on, vol. 3, pp. 545–548, IEEE, [27] F. T. Ramos, D. Fox, and H. F. Durrant-Whyte, “CRF-matching: Con- 2002. ditional random fields for feature-based scan matching,” in Robotics: [4] S. Rusinkiewicz and M. Levoy, “Efficient variants of the ICP algo- Science and Systems, 2007. rithm,” in 3-D Digital Imaging and Modeling, 2001. Proceedings. [28] Z. Sun, J. Van de Ven, F. Ramos, X. Mao, and H. Durrant-Whyte, Third International Conference on, pp. 145–152, IEEE, 2001. “Inferring laser-scan matching uncertainty with conditional random [5] D. W. Eggert, A. Lorusso, and R. B. Fisher, “Estimating 3-D rigid body fields,” Robotics and Autonomous Systems, vol. 60, no. 1, pp. 83–94, transformations: a comparison of four major algorithms,” Machine 2012. Vision and Applications, vol. 9, no. 5-6, pp. 272–290, 1997. [29] D. Chandler, Introduction to modern statistical mechanics. Oxford [6] K. S. Arun, T. S. Huang, and S. D. Blostein, “Least-squares fitting University Press, 1987. of two 3-D point sets,” IEEE Transactions on Pattern Analysis and [30] Z. Kato and T.-C. Pong, “A Markov random field image segmentation Machine Intelligence, no. 5, pp. 698–700, 1987. model for color textured images,” Image and Vision Computing, [7] M. W. Walker, L. Shao, and R. A. Volz, “Estimating 3-D location vol. 24, no. 10, pp. 1103–1114, 2006. parameters using dual number quaternions,” CVGIP: Image Under- [31] J. Besag, “On the statistical analysis of dirty pictures,” Journal of standing, vol. 54, no. 3, pp. 358–367, 1991. the Royal Statistical Society. Series B (Methodological), pp. 259–302, [8] G. Turk and M. Levoy, “Zippered polygon meshes from range images,” 1986. in Proceedings of the 21st Annual Conference on Computer Graphics [32] G. Celeux, F. Forbes, and N. Peyrard, “EM procedures using mean and Interactive Techniques, pp. 311–318, ACM, 1994. field-like approximations for Markov model-based image segmenta- [9] T. Masuda, K. Sakaue, and N. Yokoya, “Registration and integration tion,” Pattern Recognition, vol. 36, no. 1, pp. 131–144, 2003. of multiple range images for 3-D model construction,” in Pattern [33] J.-Y. Bouguet, “Pyramidal implementation of the affine Lucas Kanade Recognition, 1996., Proceedings of the 13th International Conference feature tracker,” Intel Corporation, vol. 5, no. 1-10, p. 4, 2001. on, vol. 1, pp. 879–883, IEEE, 1996. [34] A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous [10] A. Fusiello, U. Castellani, L. Ronchetti, and V. Murino, “Model driving? the kitti vision benchmark suite,” in Conference on Computer acquisition by registration of multiple acoustic range views,” Computer Vision and Pattern Recognition (CVPR), 2012. VisionECCV 2002, pp. 558–559, 2002. [35] J. Yang, H. Li, D. Campbell, and Y. Jia, “Go-ICP: a globally optimal [11] Z. Zhang, “Iterative point matching for registration of free-form curves solution to 3D ICP point-set registration,” IEEE Transactions on and surfaces,” International Journal of Computer Vision, vol. 13, no. 2, Pattern Analysis and Machine Intelligence, vol. 38, no. 11, pp. 2241– pp. 119–152, 1994. 2254, 2016. [12] O. Enqvist, K. Josephson, and F. Kahl, “Optimal correspondences [36] J. Sturm, N. Engelhard, F. Endres, W. Burgard, and D. Cremers, “A from pairwise constraints,” in Computer Vision, 2009 IEEE 12th benchmark for the evaluation of RGB-D SLAM systems,” in Proc. International Conference on, pp. 1295–1302, IEEE, 2009. of the International Conference on Intelligent Robot Systems (IROS), [13] L. Silva, O. R. P. Bellon, and K. L. Boyer, “Precision range image Oct. 2012. registration using a robust surface interpenetration measure and en- [37]O.K ahler,¨ V. A. Prisacariu, C. Y. Ren, X. Sun, P. Torr, and D. Murray, hanced genetic algorithms,” IEEE Transactions on Pattern Analysis “Very high frame rate volumetric integration of depth images on and Machine Intelligence, vol. 27, no. 5, pp. 762–776, 2005. mobile devices,” IEEE Transactions on Visualization and Computer [14] Y. Ma, Y. Guo, J. Zhao, M. Lu, J. Zhang, and J. Wan, “Fast and Graphics, vol. 21, no. 11, pp. 1241–1250, 2015. accurate registration of structured point clouds with small overlaps,” [38] P. Mavridis, A. Andreadis, and G. Papaioannou, “Efficient sparse ICP,” in The IEEE Conference on Computer Vision and Pattern Recognition Computer Aided Geometric Design, vol. 35, pp. 16–26, 2015. (CVPR) Workshops, June 2016. [15] Y. Chen and G. Medioni, “Object modeling by registration of multiple range images,” in Robotics and Automation, 1991. Proceedings., 1991 IEEE International Conference on, pp. 2724–2729, IEEE, 1991. [16] J. Salvi, C. Matabosch, D. Fofi, and J. Forest, “A review of recent range methods with accuracy evaluation,” Image and Vision Computing, vol. 25, no. 5, pp. 578–596, 2007. [17] S. Bouaziz, A. Tagliasacchi, and M. Pauly, “Sparse iterative closest point,” in Computer Graphics Forum, vol. 32, pp. 113–123, Wiley Online Library, 2013. [18] J. M. Phillips, R. Liu, and C. Tomasi, “Outlier robust ICP for minimizing fractional RMSD,” in 3-D Digital Imaging and Modeling, 2007. 3DIM’07. Sixth International Conference on, pp. 427–434, IEEE, 2007. [19] T. Masuda and N. Yokoya, “A robust method for registration and segmentation of multiple range images,” Computer Vision and Image Understanding, vol. 61, no. 3, pp. 295–307, 1995. [20] S. Granger and X. Pennec, “Multi-scale EM-ICP: A fast and robust ap- proach for surface registration,” in European Conference on Computer Vision, pp. 418–432, Springer, 2002. [21] J. Hermans, D. Smeets, D. Vandermeulen, and P. Suetens, “Robust point set registration using EM-ICP with information-theoretically optimal outlier handling,” in Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pp. 2465–2472, IEEE, 2011. [22] D. Aiger, N. J. Mitra, and D. Cohen-Or, “4-points congruent sets for robust pairwise surface registration,” ACM Transactions on Graphics (TOG), vol. 27, no. 3, p. 85, 2008.