Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17)

Non-Rigid Point Set Registration with Robust Transformation Estimation under Manifold Regularization

Jiayi Ma,¹ Ji Zhao,¹ Junjun Jiang,² Huabing Zhou³
¹Electronic Information School, Wuhan University, Wuhan 430072, China
²School of Computer Science, China University of Geosciences, Wuhan 430074, China
³Hubei Provincial Key Laboratory of Intelligent Robot, Wuhan Institute of Technology, Wuhan 430073, China
{jyma2010, zhaoji84, zhouhuabing}@gmail.com, [email protected]

Abstract

In this paper, we propose a robust transformation estimation method based on manifold regularization for non-rigid point set registration. The method iteratively recovers the point correspondence and estimates the spatial transformation between two point sets. The correspondence is established based on local feature descriptors, which typically results in a number of outliers. To achieve an accurate estimate of the transformation from such putative point correspondences, we formulate the registration problem by a mixture model with a set of latent variables introduced to identify outliers, and a prior involving manifold regularization is imposed on the transformation to capture the underlying intrinsic geometry of the input data. The non-rigid transformation is specified in a reproducing kernel Hilbert space, and a sparse approximation is adopted to achieve a fast implementation. Extensive experiments on both 2D and 3D data demonstrate that our method yields superior results compared to other state-of-the-art methods, especially in case of badly degraded data.

Introduction

Point set registration is a fundamental problem frequently encountered in computer vision, pattern recognition, medical imaging and remote sensing (Brown 1992). Many tasks in these fields, including shape recognition, panoramic stitching, feature-based image registration and content-based image retrieval, can be solved by algorithms operating on the point sets (e.g., salient point features) extracted from the input data (Ma et al. 2015a; 2015c; Bai et al. 2017; Zhou et al. 2016). The goal of point set registration is then to determine the right correspondence and/or to recover the spatial transformation between the two point sets (Jian and Vemuri 2011). In this paper, we focus on non-rigid registration, where the transformation is characterized by a nonlinear or non-parameterized model (Zhao et al. 2011; Ma et al. 2013; Wang et al. 2015).

The registration problem is typically solved by using an iterative framework, where a set of putative correspondences is established and used to refine the estimate of the transformation, and vice versa (Besl and McKay 1992; Chui and Rangarajan 2003). In this process, the most challenging and critical task is to develop an efficient strategy for robust transformation estimation from putative correspondences. First, the putative correspondences are usually established based on only local feature descriptors, where the unavoidable noise, repeated structures and occlusions often lead to a high number of false correspondences. Therefore, a robust procedure for outlier removal is required. Second, to establish reliable correspondences, the putative set usually discards a large part of the original point sets whose feature descriptors are not similar enough. However, the point sets are typically extracted from the contour or surface of a specific object, and hence can provide intrinsic structure information of the input data which is beneficial to the transformation estimation. Therefore, it is desirable to incorporate the whole point sets into the objective function during transformation estimation. Third, for large scale point cloud data, the number of points can reach tens of thousands. This poses a significant burden on typical point registration methods, particularly in the non-rigid case. Therefore, it is of particular advantage to develop a more efficient technique.

To address these issues, we formulate the registration problem by a mixture model with a set of latent variables introduced to identify outliers. We also assume a prior involving manifold regularization to impose a non-parametric smoothness constraint on the spatial transformation (Ma et al. 2014; Belkin, Niyogi, and Sindhwani 2006). The manifold regularization, defined on the whole input point sets, controls the complexity of the transformation and is able to capture the underlying intrinsic geometry of the input data. This leads to a maximum a posteriori (MAP) estimation problem, which can be solved by using the Expectation-Maximization (EM) algorithm (Dempster, Laird, and Rubin 1977) to estimate the variance of the prior, while simultaneously estimating the outliers, with the variance given a large initial value. Moreover, a sparse approximation based on a similar idea as the subset of regressors method (Poggio and Girosi 1990) is introduced to improve the computational efficiency.

Our contribution in this paper includes the following three aspects. Firstly, we introduce manifold regularization to the point set registration problem, which can capture the intrinsic geometry of the input point sets and hence helps to estimate the transformation. Secondly, we propose a new formulation for robust transformation estimation based on manifold regularization, which can estimate the transformation from point correspondences contaminated by outliers. Thirdly, we provide a fast implementation for our method by using sparse approximation, which enables our method to handle large scale datasets such as 3D point clouds.

Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

Related work

The iterated closest point (ICP) algorithm (Besl and McKay 1992) is one of the representative approaches using the iterative framework to solve the registration problem. In (Chui and Rangarajan 2003), Chui and Rangarajan developed a general framework for non-rigid registration called TPS-RPM. Different from ICP, which uses the nearest point strategy in learning the correspondence, TPS-RPM introduces soft assignments and solves the problem in a continuous optimization framework involving deterministic annealing. Zheng and Doermann (2006) proposed a method called RPM-LNS which can preserve local neighborhood structures during matching, where the shape context (SC) feature (Belongie, Malik, and Puzicha 2002) is used to initialize the correspondence. Ma et al. (2015b) introduced a non-rigid registration strategy based on Gaussian fields, which was later improved in (Wang et al. 2016) by using the inner distance shape context (Ling and Jacobs 2007) to construct initial correspondences. More recently, point registration has typically been solved by probabilistic methods (Jian and Vemuri 2011; Myronenko and Song 2010; Horaud et al. 2011; Ma, Zhao, and Yuille 2016). Specifically, to cope with highly articulated deformation, a global-local topology preservation (GLTP) method (Ge, Fan, and Ding 2014; Ge and Fan 2015) was proposed based on coherent point drift (CPD) (Myronenko and Song 2010). These methods formulate registration as the estimation of a mixture of densities using GMMs, and the problem is solved using the framework of maximum likelihood and the EM algorithm.

Method

Suppose we are given a model point set {x_i}_{i=1}^M and a target point set {y_j}_{j=1}^N, where x_i and y_j are D-dimensional column vectors denoting the point positions (typically D = 2 or 3), and M and N are the numbers of points in the two sets, respectively. To solve the registration problem, we consider an iterative strategy which first constructs a putative set of correspondences by using the local geometric structures of points, and then estimates the spatial transformation based on the putative set together with some global geometric constraints. In the following, we start by introducing the correspondence estimation based on local structure information, and then lay out the manifold regularization which can capture the underlying spatial geometry of a point set. We subsequently propose a formulation for robust transformation estimation from putative correspondences based on manifold regularization, followed by a fast implementation using sparse approximation. Finally, we present the implementation details of the proposed approach.

Correspondence estimation

For two point sets representing similar shapes or objects, their corresponding points will in general have similar local geometric structures (e.g., neighborhood structures), which can be incorporated into a feature descriptor. Therefore, the correspondences can be established by finding, for each point in one point set (e.g., the model), the point in the other point set (e.g., the target) that has the most similar feature descriptor. Fortunately, there are several well-designed feature descriptors that can fulfill this task, both in the 2D and the 3D case (Belongie, Malik, and Puzicha 2002; Rusu, Blodow, and Beetz 2009).

For the 2D case, the shape context (Belongie, Malik, and Puzicha 2002) has been a widely used feature descriptor. Consider two points x_i and y_j; their SCs, which capture the distributions of their neighborhood points, are histograms {p_i(k)}_{k=1}^K and {q_j(k)}_{k=1}^K, respectively. The χ² distance is used to measure their difference C(x_i, y_j):

$$C(x_i, y_j) = \frac{1}{2}\sum_{k=1}^{K}\frac{[p_i(k) - q_j(k)]^2}{p_i(k) + q_j(k)}. \qquad (1)$$

After we have obtained the distances of all point pairs, i.e., {C(x_i, y_j), i = 1, …, M, j = 1, …, N}, the Hungarian method (Papadimitriou and Steiglitz 1982) is applied to seek the correspondences between {x_i}_{i=1}^M and {y_j}_{j=1}^N.

For the 3D case, we consider the fast point feature histograms (FPFH) (Rusu, Blodow, and Beetz 2009) as the feature descriptor. It is a histogram that collects the pairwise pan, tilt and yaw angles between every point and its k-nearest neighbors, followed by a reweighting of the resultant histogram of a point with the neighboring histograms. The computation of the histogram is quite efficient, with linear complexity with respect to the number of surface normals. The matching of FPFH descriptors is performed by a sample consensus initial alignment method.

After using some local feature descriptor to establish correspondences, we obtain a putative set S = {(x_i, y_i)}_{i=1}^L, where L ≤ min{M, N} is the number of correspondences. Without loss of generality, we assume that {x_i}_{i=1}^L and {y_i}_{i=1}^L in the putative set correspond to the first L points in the original model point set {x_i}_{i=1}^M and the first L points in the original target point set {y_j}_{j=1}^N, respectively.

Solve transformation with manifold regularization

Given a putative correspondence set S = {(x_i, y_i)}_{i=1}^L established from two point sets {x_i}_{i=1}^M and {y_j}_{j=1}^N, our purpose is to estimate the underlying spatial transformation T between them, i.e., y_i = T(x_i) for any correspondence (x_i, y_i) in S. This problem is in general ill-posed, as T is non-rigid and has an infinite number of solutions. To obtain a meaningful solution, a regularization technique can be used, which typically operates in a Reproducing Kernel Hilbert Space (RKHS) (Aronszajn 1950) associated with a particular kernel. Specifically, the Tikhonov regularization (Tikhonov and Arsenin 1977) in an RKHS H minimizes a regularized risk functional:

$$\mathcal{T}^* = \arg\min_{\mathcal{T}\in\mathcal{H}} \sum_{i=1}^{L}\|y_i - \mathcal{T}(x_i)\|^2 + \lambda\|\mathcal{T}\|_{\mathcal{H}}^2, \qquad (2)$$

where the first term enforces closeness to the data, the second term controls the complexity of the transformation T, and λ is a regularization parameter controlling the trade-off between these two terms.
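To make Eq. (2) concrete: with a scalar Gaussian kernel, the representer theorem reduces the Tikhonov problem to a single linear solve in the kernel coefficients. A minimal sketch; the function names and the affine test warp below are illustrative, not from the paper:

```python
import numpy as np

def gaussian_gram(Xa, Xb, beta):
    # kappa(x, x') = exp(-beta * ||x - x'||^2), a scalar Gaussian kernel
    d2 = ((Xa[:, None, :] - Xb[None, :, :]) ** 2).sum(-1)
    return np.exp(-beta * d2)

def fit_tikhonov(X, Y, beta, lam):
    """Minimize sum_i ||y_i - T(x_i)||^2 + lam * ||T||_H^2 over an RKHS.
    By the representer theorem T(x) = sum_i kappa(x, x_i) c_i, and the
    coefficient matrix C solves (Gamma + lam * I) C = Y."""
    G = gaussian_gram(X, X, beta)
    C = np.linalg.solve(G + lam * np.eye(len(X)), Y)
    return lambda Xq: gaussian_gram(Xq, X, beta) @ C
```

With a tiny λ the fitted map nearly interpolates the correspondences, while a large λ shrinks it toward the zero map; the robust formulation developed below replaces this plain least-squares fit.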

Here ‖·‖_H denotes the norm of H (we will discuss the detailed forms of T and ‖·‖_H later).

Recall that in our problem, due to the existence of noise, outliers, occlusions, etc., the number of matched points is typically less than the size of the whole point set, i.e., L ≤ M. That is to say, only L points x_1, …, x_L are given labels y_1, …, y_L drawn from the spatial transformation T. However, for point set registration, the points we wish to match are usually extracted from a shape contour or an object surface, which possesses some sort of "intrinsic geometry". For instance, the point positions for each type of shape or object are not arbitrary and often form a specific distribution. Therefore, the M − L unlabeled points can provide additional information about the characteristics of the point set. To make full use of such additional information, we consider manifold regularization (Belkin, Niyogi, and Sindhwani 2006; Minh and Sindhwani 2011; Zhao et al. 2015). It introduces an additional regularizer ‖T‖²_I to penalize T along a low-dimensional manifold, which is defined on the whole input point set {x_i}_{i=1}^M. Thus the regularized risk functional becomes:

$$\mathcal{T}^* = \arg\min_{\mathcal{T}\in\mathcal{H}} \sum_{i=1}^{L}\|y_i - \mathcal{T}(x_i)\|^2 + \lambda_1\|\mathcal{T}\|_{\mathcal{H}}^2 + \lambda_2\|\mathcal{T}\|_I^2, \qquad (3)$$

where the parameter λ₁ controls the complexity of the transformation in the input space, and λ₂ regularizes with respect to the intrinsic geometry. The λ₁ term is necessary since the manifold is a strict subset of the input space; among the many T ∈ H which give the same value on the manifold, we prefer a solution which is smooth in the input space.

To define the manifold regularization term, we use the graph Laplacian, which is a discrete analogue of the manifold Laplacian (Belkin, Niyogi, and Sindhwani 2006). It models a manifold using the weighted neighborhood graph of the data, based on the assumption that the input points are drawn i.i.d. from the manifold. Consider the weighted neighborhood graph G given by taking the graph on the vertex set V = {x_1, …, x_M} (the matched and unmatched points) with edges (x_i, x_j) if and only if ‖x_i − x_j‖² ≤ ε, and assigning to edge (x_i, x_j) the weight

$$W_{ij} = e^{-\frac{1}{\epsilon}\|x_i - x_j\|^2}. \qquad (4)$$

The graph Laplacian of G is the matrix A given by

$$A_{ij} = D_{ij} - W_{ij}, \qquad (5)$$

where $D = \mathrm{diag}\big(\{\textstyle\sum_{j=1}^{M} W_{ij}\}_{i=1}^{M}\big)$, i.e., the diagonal matrix whose i-th entry is the sum of the weights of the edges leaving x_i. Let t = (T(x_1), …, T(x_M))^T; then the manifold regularization term can be defined as:

$$\|\mathcal{T}\|_I^2 = \sum_{i=1}^{M}\sum_{j=1}^{M} W_{ij}\|t_i - t_j\|^2 = \mathrm{tr}(\mathbf{t}^\top \mathbf{A}\mathbf{t}), \qquad (6)$$

where tr(·) denotes the trace. Therefore, the regularized risk functional (3) becomes:

$$\mathcal{T}^* = \arg\min_{\mathcal{T}\in\mathcal{H}} \sum_{i=1}^{L}\|y_i - \mathcal{T}(x_i)\|^2 + \lambda_1\|\mathcal{T}\|_{\mathcal{H}}^2 + \lambda_2\,\mathrm{tr}(\mathbf{t}^\top \mathbf{A}\mathbf{t}). \qquad (7)$$

We will discuss the solution of this manifold regularized risk functional in the next section.

Robust transformation estimation

The transformation could be solved by minimizing the regularized risk functional in Eq. (3). However, the putative set S = {(x_i, y_i)}_{i=1}^L typically involves some unknown false correspondences, as only local neighborhood structures are considered. Therefore, it is important that the transformation estimation be robust to outliers. In this section, we propose a method for robust transformation estimation from point correspondences by using manifold regularization.

We make the assumption that, for the inliers, the noise on the point position is Gaussian on each component (dimension) with zero mean and uniform standard deviation σ; for the outliers, the position of the target point y_i lies randomly in a bounded region of $\mathbb{R}^D$, and the distribution is assumed to be uniform, 1/a, with a denoting the volume of this region (Ma et al. 2014). We then associate the i-th correspondence with a latent variable z_i ∈ {0, 1}, where z_i = 0 indicates a uniform distribution and z_i = 1 points to a Gaussian distribution. Let X = (x_1, …, x_L)^T and Y = (y_1, …, y_L)^T ∈ $\mathbb{R}^{L\times D}$ be the two sets of points in the putative set. Thus, the likelihood is a mixture model given by

$$p(\mathbf{Y}|\mathbf{X}, \theta) = \prod_{i=1}^{L} \sum_{z_i} p(y_i, z_i | x_i, \theta) = \prod_{i=1}^{L} \left( \frac{\gamma}{(2\pi\sigma^2)^{D/2}}\, e^{-\frac{\|y_i - \mathcal{T}(x_i)\|^2}{2\sigma^2}} + \frac{1-\gamma}{a} \right), \qquad (8)$$

where θ = {T, σ², γ} includes the unknown parameters, and γ is a mixing coefficient specifying the marginal distribution over the latent variables, i.e., ∀z_i, p(z_i = 1) = γ. We assume the non-rigid transformation T to lie within an RKHS, and it should also reflect the intrinsic structure of the point set. Thus a slow-and-smooth prior can be applied to T: $p(\mathcal{T}) \propto e^{-\frac{1}{2}(\lambda_1\|\mathcal{T}\|_{\mathcal{H}}^2 + \lambda_2\|\mathcal{T}\|_I^2)}$. Using Bayes' rule, we estimate a MAP solution of θ:

$$\theta^* = \arg\max_{\theta} p(\theta|\mathbf{X}, \mathbf{Y}) = \arg\max_{\theta} p(\mathbf{Y}|\mathbf{X}, \theta)\,p(\mathcal{T}). \qquad (9)$$

To optimize this objective function, we consider the EM algorithm, which is a general technique for learning and inference in the context of latent variables. We follow standard notations (Bishop 2006) and omit some terms that are independent of θ. Considering the negative log posterior function, the complete-data log posterior is:

$$Q(\theta, \theta^{\mathrm{old}}) = -\frac{1}{2\sigma^2}\sum_{i=1}^{L} p_i\|y_i - \mathcal{T}(x_i)\|^2 - \frac{DL_p}{2}\ln\sigma^2 + L_p\ln\gamma + (L - L_p)\ln(1-\gamma) - \frac{\lambda_1}{2}\|\mathcal{T}\|_{\mathcal{H}}^2 - \frac{\lambda_2}{2}\|\mathcal{T}\|_I^2, \qquad (10)$$

where $p_i = P(z_i = 1 \mid x_i, y_i, \theta^{\mathrm{old}})$ and $L_p = \sum_{i=1}^{L} p_i$. The EM algorithm alternates between two steps: an expectation step (E-step) and a maximization step (M-step).
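The graph construction of Eqs. (4)-(6) can be sketched as follows; reusing the neighborhood threshold ε to scale the exponent in Eq. (4) is an assumption of this sketch:

```python
import numpy as np

def graph_laplacian(X, eps):
    """Weighted neighborhood graph of Eqs. (4)-(5): edges where
    ||x_i - x_j||^2 <= eps, weight W_ij = exp(-||x_i - x_j||^2 / eps)
    (the eps scaling in the exponent is an assumption), and
    Laplacian A = D - W with D the diagonal degree matrix."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.where(d2 <= eps, np.exp(-d2 / eps), 0.0)
    np.fill_diagonal(W, 0.0)            # no self-loops
    return np.diag(W.sum(axis=1)) - W

def manifold_penalty(A, t):
    # ||T||_I^2 = tr(t^T A t), Eq. (6), with t the transformed point stack
    return float(np.trace(t.T @ A @ t))
```

A constant map incurs zero penalty, so the regularizer only charges variation of T across neighboring points on the graph.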

E-step: We use the current parameter values θ^old to find the posterior distribution of the latent variables. Denote by P = diag(p_1, …, p_L) the diagonal matrix of posterior probabilities, which can be computed by applying Bayes' rule:

$$p_i = \frac{\gamma\, e^{-\frac{\|y_i - \mathcal{T}(x_i)\|^2}{2\sigma^2}}}{\gamma\, e^{-\frac{\|y_i - \mathcal{T}(x_i)\|^2}{2\sigma^2}} + (1-\gamma)\frac{(2\pi\sigma^2)^{D/2}}{a}}. \qquad (11)$$

The posterior probability p_i is a soft decision, which indicates to what degree the correspondence (x_i, y_i) agrees with the current estimated transformation T.

M-step: We determine the revised parameter estimate θ^new as θ^new = arg max_θ Q(θ, θ^old). Let T(X) = (T(x_1), …, T(x_L))^T. Considering the diagonal matrix P and taking the derivatives of Q(θ) with respect to σ² and γ and setting them to zero, we obtain

$$\sigma^2 = \frac{\mathrm{tr}\big((\mathbf{Y} - \mathcal{T}(\mathbf{X}))^\top \mathbf{P}\,(\mathbf{Y} - \mathcal{T}(\mathbf{X}))\big)}{DL_p}, \qquad (12)$$

$$\gamma = \mathrm{tr}(\mathbf{P})/L. \qquad (13)$$

Next we consider the terms of Q(θ) that are related to T. We obtain a manifold regularized risk functional (Micchelli and Pontil 2005):

$$E(\mathcal{T}) = \frac{1}{2\sigma^2}\sum_{i=1}^{L} p_i\|y_i - \mathcal{T}(x_i)\|^2 + \frac{\lambda_1}{2}\|\mathcal{T}\|_{\mathcal{H}}^2 + \frac{\lambda_2}{2}\|\mathcal{T}\|_I^2. \qquad (14)$$

We model the transformation T by requiring it to lie within an RKHS H defined by a matrix-valued kernel Γ: $\mathbb{R}^D \times \mathbb{R}^D \to \mathbb{R}^{D\times D}$. In this paper, we consider a diagonal decomposable kernel Γ(x, x′) = κ(x, x′)·I, where κ(x, x′) = e^{−β‖x−x′‖²} is a scalar Gaussian kernel, with β determining the width of the range of interaction between samples. Therefore, we have the following representer theorem (Belkin, Niyogi, and Sindhwani 2006).

Theorem 1. The optimal solution of the manifold regularized risk functional (14) is given by

$$\mathcal{T}^*(x) = \sum_{i=1}^{M} \Gamma(x, x_i)\,c_i, \qquad (15)$$

with the coefficients {c_i}_{i=1}^M determined by the linear system

$$(\mathbf{J}^\top \mathbf{P}\mathbf{J}\boldsymbol{\Gamma} + \lambda_1\sigma^2\mathbf{I} + \lambda_2\sigma^2\mathbf{A}\boldsymbol{\Gamma})\,\mathbf{C} = \mathbf{J}^\top \mathbf{P}\mathbf{Y}, \qquad (16)$$

where Γ ∈ $\mathbb{R}^{M\times M}$ is the so-called Gram matrix with (i, j)-th element κ(x_i, x_j), J = (I_{L×L}, 0_{L×(M−L)}) with I an identity matrix and 0 a matrix of all zeros, and C = (c_1, …, c_M)^T ∈ $\mathbb{R}^{M\times D}$ is a coefficient matrix.

Convergence analysis. The objective function (9) is not convex, and hence it is unlikely that any optimization technique can find its global minimum. However, for many practical applications a stable local minimum is often enough. To this end, we use a large value to initialize the variance σ², so that the objective function becomes convex in a large region. In this situation, many unstable shallow local minima are filtered out and a good minimum can be achieved. As the EM iteration proceeds, the value of σ² gradually decreases and the objective function tends to approach the true curve smoothly. This makes it likely that a better minimum can be reached by using the old minimum as the initial value, and the iteration finally converges to a stable local minimum. A similar concept has been introduced in deterministic annealing (Chui and Rangarajan 2003), where the solution of an easy problem is used to recursively give initial conditions to increasingly harder problems.

Fast implementation

The most time consuming step of our proposed algorithm is to solve the transformation T using the linear system (16), which requires O(M³) time complexity and may pose a serious problem for large values of M. Even when it is implementable, a suboptimal but faster method may be a better choice. In this section, we provide a fast implementation based on a similar kind of idea as the subset of regressors method (Poggio and Girosi 1990).

Rather than searching for the optimal solution in the space $\mathcal{H}_M = \{\sum_{i=1}^{M}\Gamma(\cdot, x_i)c_i\}$, we use a sparse approximation and search for a suboptimal solution in a space with many fewer basis functions, defined as $\mathcal{H}_K = \{\sum_{i=1}^{K}\Gamma(\cdot, \tilde{x}_i)c_i\}$, and then minimize the manifold regularized risk functional over all the sample data. Here K ≪ M, and we choose the control point set $\{\tilde{x}_i\}_{i=1}^{K}$ as a random subset of $\{x_i\}_{i=1}^{M}$, following (Rifkin, Yeo, and Poggio 2003), who found that simply selecting an arbitrary subset of the training inputs performs no worse than more sophisticated and time-consuming methods. Therefore, we search for a solution of the form

$$\mathcal{T}(x) = \sum_{i=1}^{K} \Gamma(x, \tilde{x}_i)\,c_i, \qquad (17)$$

with the coefficients {c_i}_{i=1}^K determined by the linear system

$$(\mathbf{U}^\top \mathbf{P}\mathbf{U} + \lambda_1\sigma^2\boldsymbol{\Gamma}_s + \lambda_2\sigma^2\mathbf{V}^\top \mathbf{A}\mathbf{V})\,\mathbf{C} = \mathbf{U}^\top \mathbf{P}\mathbf{Y}, \qquad (18)$$

where the Gram matrix Γ_s ∈ $\mathbb{R}^{K\times K}$ has (i, j)-th element κ(x̃_i, x̃_j), and U ∈ $\mathbb{R}^{L\times K}$ and V ∈ $\mathbb{R}^{M\times K}$ have (i, j)-th element κ(x_i, x̃_j). Note that the matrix U is composed of the first L rows of the matrix V. The derivation of Eq. (18) is similar to that of Theorem 1. Compared with the original method, the difference of the fast version is that it solves the linear system in Eq. (18) rather than Eq. (16).

Algorithm summary & computational complexity

The two steps of estimating the correspondence and the transformation are iterated to obtain a reliable result. In this paper, we use a fixed number of iterations, typically 10 but more in case of badly degraded data, for example, a large degree of deformation, a high level of noise or a large percentage of outliers in the point sets. As our Robust Point Matching algorithm is based on Manifold Regularization, we name it MR-RPM. We summarize the MR-RPM method in Algorithm 1.

For the linear system (16), the matrix $\mathbf{J}^\top\mathbf{P}\mathbf{J}\boldsymbol{\Gamma} + \lambda_1\sigma^2\mathbf{I} + \lambda_2\sigma^2\mathbf{A}\boldsymbol{\Gamma}$ is of size M × M, and hence it requires O(M³) time complexity to solve for the transformation T. However, for the linear system (18), the matrix $\mathbf{U}^\top\mathbf{P}\mathbf{U} + \lambda_1\sigma^2\boldsymbol{\Gamma}_s + \lambda_2\sigma^2\mathbf{V}^\top\mathbf{A}\mathbf{V}$ is of size K × K, and hence the time complexity for solving the linear system reduces to O(K³). Nevertheless, the time complexity of computing this matrix is O(KM²), due to the M × M multiplication operation involving the graph Laplacian matrix A.
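The scalar updates in the EM loop, Eqs. (11)-(13), are straightforward to implement; a sketch with illustrative function names and toy values:

```python
import numpy as np

def e_step(Y, TX, sigma2, gamma, a):
    """Posterior inlier probabilities p_i of Eq. (11); TX holds the
    current transformed model points T(x_i)."""
    D = Y.shape[1]
    r2 = ((Y - TX) ** 2).sum(axis=1)
    gauss = gamma * np.exp(-r2 / (2.0 * sigma2))
    unif = (1.0 - gamma) * (2.0 * np.pi * sigma2) ** (D / 2.0) / a
    return gauss / (gauss + unif)

def m_step_scalars(Y, TX, p):
    """sigma^2 and gamma updates of Eqs. (12)-(13)."""
    D = Y.shape[1]
    Lp = p.sum()
    sigma2 = (p * ((Y - TX) ** 2).sum(axis=1)).sum() / (D * Lp)
    gamma = Lp / len(Y)
    return sigma2, gamma
```

Correspondences whose residuals are small relative to σ receive p_i close to 1, while gross outliers are absorbed by the uniform component and receive p_i close to 0, which down-weights them in Eqs. (12), (16) and (18).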

Algorithm 1: The MR-RPM Algorithm

Input: Model point set {x_i}_{i=1}^M, target point set {y_j}_{j=1}^N, parameters ε, β, λ₁, λ₂
Output: Aligned model point set {x̂_i}_{i=1}^M
1 Compute feature descriptors for the target set {y_j}_{j=1}^N;
2 Set a to the volume of the output space;
3 repeat
4   Compute feature descriptors for the model set {x_i}_{i=1}^M;
5   Construct S = {(x_i, y_i)}_{i=1}^L using the descriptors;
6   Compute the graph Laplacian A by Eqs. (4) and (5);
7   Compute the Gram matrix Γ using the definition of Γ;
8   Initialize γ, P = I, T(x_i) = x_i, and σ² by Eq. (12);
9   repeat
10    E-step:
11      Update P = diag(p_1, …, p_L) by Eq. (11);
12    M-step:
13      Update σ² and γ by Eqs. (12) and (13);
14      Update C by solving the linear system (18);
15  until Q converges;
16  Update the model point set {x_i}_{i=1}^M ← {T(x_i)}_{i=1}^M;
17 until the maximum iteration number is reached;
18 The aligned model point set {x̂_i}_{i=1}^M is given by {T(x_i)}_{i=1}^M in the last iteration.

Figure 1: Qualitative results of our MR-RPM algorithm on the fish (a-c) and Chinese character (d-f) shapes, shown in groups of two rows. For each group, the upper figures show the model ('+') and target ('◦') point sets, while the lower figures show our registration results; the degradation level increases from left to right. From top to bottom, the degradations involve deformation, noise and occlusion.
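Line 14 of Algorithm 1 updates the coefficients through the reduced linear system (18). A self-contained sketch; the function names, the choice of control points, and the toy configuration in the usage check are illustrative, not from the paper:

```python
import numpy as np

def gauss_kernel(Xa, Xb, beta):
    d2 = ((Xa[:, None, :] - Xb[None, :, :]) ** 2).sum(-1)
    return np.exp(-beta * d2)

def update_C(X_all, Y, Xc, A, p, sigma2, lam1, lam2, beta):
    """Solve (U^T P U + lam1*s2*Gamma_s + lam2*s2*V^T A V) C = U^T P Y,
    i.e. Eq. (18).  X_all: all M model points, whose first L entries form
    the putative set matched to Y; Xc: the K control points (a random
    subset of X_all in the full algorithm); A: M x M graph Laplacian;
    p: posterior probabilities, the diagonal of P."""
    L = len(Y)
    V = gauss_kernel(X_all, Xc, beta)     # M x K
    U = V[:L]                             # first L rows of V
    Gs = gauss_kernel(Xc, Xc, beta)       # K x K Gram matrix
    lhs = U.T @ (p[:, None] * U) + lam1 * sigma2 * Gs \
        + lam2 * sigma2 * (V.T @ A @ V)
    C = np.linalg.solve(lhs, U.T @ (p[:, None] * Y))
    return C, V @ C                       # coefficients, T(x) at all M points
```

With K ≪ M the K × K solve here replaces the O(M³) system of Eq. (16); each EM sweep refreshes p, σ², γ and then C.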

As K is a constant which does not depend on M, and K ≪ M, the total time complexity of solving for the transformation T in our fast implementation can be written as O(M²). The space complexity of our method also scales like O(M²), due to the memory required to store the graph Laplacian matrix A.

Implementation details

Before solving the registration problem, we first normalize the input point sets with a linear scaling, so that they are expressed in the same coordinate system; more specifically, the points in both sets have zero mean and unit variance. Besides, we solve for a displacement function v defined by T(x) = x + v(x) rather than directly solving for the transformation T, which can be achieved by directly setting the output as y − x in our formulation. The use of a displacement field guarantees more robustness (Myronenko and Song 2010; Ma et al. 2014).

Parameter settings. There are four main parameters in our MR-RPM: ε, β, λ₁ and λ₂. Parameter ε is a threshold used to construct the neighborhood graph. Parameter β determines the width of the range of interaction between samples. The remaining two parameters control the trade-off between closeness to the data and smoothness of the solution, where λ₁ and λ₂ regularize with respect to the whole input space and the intrinsic geometry, respectively. In general, we found that our method is robust to parameter changes. We set ε = 0.05, β = 0.1, λ₁ = 3, λ₂ = 0.05 throughout this paper. The inlier percentage parameter γ needs an initial assumption, as shown in Line 8 of Algorithm 1; here we fix it to 0.9. Moreover, to use the fast implementation, we set the solution base number K to 15 in the 2D case and 50 in the 3D case; the uniform distribution parameter a is set to the volume of the bounding box of the data.

Experimental results

In order to evaluate the performance of our MR-RPM, we conduct experiments on both 2D shape contours and 3D point clouds. The experiments were performed on a laptop with a 3.0 GHz Intel Core CPU, 8 GB memory and Matlab code.

Results on 2D shape contour

We use the same synthesized data as in (Chui and Rangarajan 2003) and (Zheng and Doermann 2006), which consists of two shape patterns (i.e., a fish and a Chinese character) with different kinds of degradations, including deformation, noise, outliers, rotation and occlusion. For each kind of degradation, there are several degradation levels, and each level contains 100 samples. Note that the outlier setting is somewhat similar to occlusion, as in both cases one point set contains some points not contained in the other point set. But for the test data, occlusion is more practical, as the non-common points come from the shape contour, while in the outlier case they are randomly spread over the shape patterns. Besides, rotation can be well addressed by using rotation invariant feature descriptors. Therefore, we only test our method on three kinds of degradations: deformation, noise and occlusion.

We first provide some qualitative illustrations of our MR-RPM on the two shape patterns, as shown in Fig. 1.

Figure 2: Comparison of MR-RPM with SC, TPS-RPM, RPM-LNS, GMMREG, CPD, GLTP and VFC on the fish (top) and Chinese character (bottom); the panels plot the average registration error against the degree of deformation, the noise level and the occlusion ratio. The error bars indicate the registration error means and standard deviations over 100 trials.

For each group of results, our goal is to align the model points ('+') onto the target points ('◦'), which are both presented in the upper figures, and the registration results are given in the lower figures. From the results, we see that our MR-RPM can well address all the different degradations, and the registration accuracy decreases gradually and gracefully as the degradation level increases. Interestingly, even at extreme degradation levels, especially for deformation and occlusion, our method can still generate satisfactory results. The average run time of our MR-RPM on this dataset, with about 100 points for each shape pattern, is about 0.5 s.

To provide a quantitative comparison to the state of the art, we report the results of seven methods, namely SC (Belongie, Malik, and Puzicha 2002), TPS-RPM (Chui and Rangarajan 2003), RPM-LNS (Zheng and Doermann 2006), GMMREG (Jian and Vemuri 2011), CPD (Myronenko and Song 2010), GLTP (Ge, Fan, and Ding 2014), and VFC (Ma et al. 2014), as shown in Fig. 2. For each pair of shapes, the registration error is characterized by the average Euclidean distance between the warped model set and its corresponding target set. Then the mean and standard deviation of the registration error over all 100 samples for each degradation level and degradation type are computed for performance comparison. From the results, we see that SC, GMMREG and GLTP are not robust to noise, while TPS-RPM degrades badly in case of occlusion. The registration accuracies of RPM-LNS and CPD are satisfactory and decrease gracefully as the degradation level increases. In contrast, VFC and our MR-RPM achieve the best results in most cases except for large noise levels, and our MR-RPM almost consistently outperforms VFC across degradation types and degradation levels on the whole dataset. Note that a major difference between our MR-RPM and other iterative methods such as VFC is that we introduce an additional manifold regularization term; our consistently better results demonstrate that manifold regularization does play an important role in improving the transformation estimation.

Results on 3D point cloud

We next test our MR-RPM for registration of 3D point cloud data, where a wolf pattern with about 5,000 points in different poses is used for evaluation. The results are presented in Fig. 3, where the left two figures test non-rigid deformation and the right two figures test occlusion. We see that our method can produce almost perfect alignments for both point cloud pairs, even when both the model and target sets suffer from degradations, as shown in the occlusion test. The average runtime of our MR-RPM on this dataset is about 47 s.

Figure 3: Qualitative results of our method on 3D wolf point clouds involving non-rigid deformation (left two) and occlusion (right two). For each group, the left figure shows the model ('·') and target ('◦') point sets, and the right shows our result.

We also conduct a quantitative comparison with two representative state-of-the-art methods, CPD and VFC. The average registration errors on the two point cloud pairs shown in Fig. 3 are CPD (0.82, 0.72), VFC (1.15, 1.01), and MR-RPM (0.78, 0.53), respectively. Clearly, our method has the best performance, which means that our MR-RPM is effective for both 2D and 3D point set registration.

Conclusion

In this paper, we presented a new method called MR-RPM for non-rigid registration of both 2D shapes and 3D point clouds. A key characteristic of our approach is the use of manifold regularization to capture the underlying intrinsic geometry of the input data, leading to a better estimate of the transformation. We also provide a fast implementation to reduce the algorithm complexity from cubic to quadratic, so that large scale data (especially 3D point clouds) can be addressed. The qualitative and quantitative results on both 2D and 3D publicly available datasets demonstrate that our MR-RPM outperforms the state-of-the-art methods in most cases, especially when there are significant non-rigid deformations and/or occlusions in the data.

Acknowledgments

The authors gratefully acknowledge the financial support from the National Natural Science Foundation of China under Grant nos. 61503288, 61501413 and 41501505.

References

Aronszajn, N. 1950. Theory of reproducing kernels. Trans. Amer. Math. Soc. 68(3):337–404.
Bai, S.; Bai, X.; Tian, Q.; and Latecki, L. J. 2017. Regularized diffusion process for visual retrieval. In AAAI.
Belkin, M.; Niyogi, P.; and Sindhwani, V. 2006. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res. 7:2399–2434.
Belongie, S.; Malik, J.; and Puzicha, J. 2002. Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. 24(4):509–522.
Besl, P. J., and McKay, N. D. 1992. A method for registration of 3-D shapes. IEEE Trans. Pattern Anal. Mach. Intell. 14(2):239–256.
Bishop, C. M. 2006. Pattern Recognition and Machine Learning. New York, NY, USA: Springer-Verlag.
Brown, L. G. 1992. A survey of image registration techniques. ACM Comput. Surv. 24(4):325–376.
Chui, H., and Rangarajan, A. 2003. A new point matching algorithm for non-rigid registration. Comput. Vis. Image Understand. 89:114–141.
Dempster, A.; Laird, N.; and Rubin, D. B. 1977. Maximum likelihood from incomplete data via the EM algorithm. J. R. Statist. Soc. Series B 39(1):1–38.
Ge, S., and Fan, G. 2015. Articulated non-rigid point set registration for human pose estimation from 3D sensors. Sensors 15(7):15218–15245.
Ge, S.; Fan, G.; and Ding, M. 2014. Non-rigid point set registration with global-local topology preservation. In CVPRW, 245–251.
Horaud, R.; Forbes, F.; Yguel, M.; Dewaele, G.; and Zhang, J. 2011. Rigid and articulated point registration with expectation conditional maximization. IEEE Trans. Pattern Anal. Mach. Intell. 33(3):587–602.
Jian, B., and Vemuri, B. C. 2011. Robust point set registration using Gaussian mixture models. IEEE Trans. Pattern Anal. Mach. Intell. 33(8):1633–1645.
Ling, H., and Jacobs, D. W. 2007. Shape classification using the inner-distance. IEEE Trans. Pattern Anal. Mach. Intell. 29(2):286–299.
Ma, J.; Zhao, J.; Tian, J.; Tu, Z.; and Yuille, A. 2013. Robust estimation of nonrigid transformation for point set registration. In CVPR, 2147–2154.
Ma, J.; Zhao, J.; Tian, J.; Yuille, A. L.; and Tu, Z. 2014. Robust point matching via vector field consensus. IEEE Trans. Image Process. 23(4):1706–1721.
Ma, J.; Qiu, W.; Zhao, J.; Ma, Y.; Yuille, A. L.; and Tu, Z. 2015a. Robust L2E estimation of transformation for non-rigid registration. IEEE Trans. Signal Process. 63(5):1115–1129.
Ma, J.; Zhao, J.; Ma, Y.; and Tian, J. 2015b. Non-rigid visible and infrared face registration via regularized Gaussian fields criterion. Pattern Recognit. 48(3):772–784.
Ma, J.; Zhou, H.; Zhao, J.; Gao, Y.; Jiang, J.; and Tian, J. 2015c. Robust feature matching for remote sensing image registration via locally linear transforming. IEEE Trans. Geosci. Remote Sens. 53(12):6469–6481.
Ma, J.; Zhao, J.; and Yuille, A. L. 2016. Non-rigid point set registration by preserving global and local structures. IEEE Trans. Image Process. 25(1):53–64.
Micchelli, C. A., and Pontil, M. 2005. On learning vector-valued functions. Neural Comput. 17(1):177–204.
Minh, H. Q., and Sindhwani, V. 2011. Vector-valued manifold regularization. In ICML, 57–64.
Myronenko, A., and Song, X. 2010. Point set registration: Coherent point drift. IEEE Trans. Pattern Anal. Mach. Intell. 32(12):2262–2275.
Papadimitriou, C. H., and Steiglitz, K. 1982. Combinatorial Optimization: Algorithms and Complexity. Courier Corporation.
Poggio, T., and Girosi, F. 1990. Networks for approximation and learning. Proc. IEEE 78(9):1481–1497.
Rifkin, R.; Yeo, G.; and Poggio, T. 2003. Regularized least-squares classification. In Advances in Learning Theory: Methods, Model and Applications. Cambridge, MA, USA: MIT Press.
Rusu, R. B.; Blodow, N.; and Beetz, M. 2009. Fast point feature histograms (FPFH) for 3D registration. In ICRA, 3212–3217.
Tikhonov, A. N., and Arsenin, V. Y. 1977. Solutions of Ill-posed Problems. Washington, DC, USA: Winston.
Wang, G.; Wang, Z.; Chen, Y.; and Zhao, W. 2015. A robust non-rigid point set registration method based on asymmetric Gaussian representation. Comput. Vis. Image Understand. 141:67–80.
Wang, G.; Wang, Z.; Chen, Y.; Zhou, Q.; and Zhao, W. 2016. Context-aware Gaussian fields for non-rigid point set registration. In CVPR, 5811–5819.
Zhao, J.; Ma, J.; Tian, J.; Ma, J.; and Zhang, D. 2011. A robust method for vector field learning with application to mismatch removing. In CVPR, 2977–2984.
Zhao, M.; Chow, T. W.; Wu, Z.; Zhang, Z.; and Li, B. 2015. Learning from normalized local and global discriminative information for semi-supervised regression and dimensionality reduction. Inf. Sci. 324:286–309.
Zheng, Y., and Doermann, D. 2006. Robust point matching for nonrigid shapes by preserving local neighborhood structures. IEEE Trans. Pattern Anal. Mach. Intell. 28(4):643–649.
Zhou, Y.; Bai, X.; Liu, W.; and Latecki, L. J. 2016. Similarity fusion for visual tracking. Int. J. Comput. Vis. 1–27.
using gaussian mixture models. IEEE Trans. Pattern Anal. Mach. Intell. 33(8):1633–1645.
