Estimating 3-D Rigid Body Transformations: a Comparison of Four Major Algorithms
Total Page:16
File Type:pdf, Size:1020Kb
Machine Vision and Applications (1997) 9: 272–290 Machine Vision and Applications c Springer-Verlag 1997 Estimating 3-D rigid body transformations: a comparison of four major algorithms D.W. Eggert1, A. Lorusso2, R.B. Fisher3 1 Department of Computer Science, University of New Haven, West Haven, CT 06516, USA, e-mail: [email protected] 2 Department of Artificial Intelligence, University of Edinburgh, Edinburgh EH1 2QL, Scotland, UK, e-mail: [email protected] 3 Department of Artificial Intelligence, University of Edinburgh, Edinburgh EH1 2QL, Scotland, UK, e-mail: [email protected] Abstract. A common need in machine vision is to com- A recent survey by Sabata and Aggarwal (1991) lists a pute the 3-D rigid body transformation that aligns two sets large number of methods that have been developed to com- of points for which correspondence is known. A compara- pute the 3-D rigid body transformation between two sets tive analysis is presented here of four popular and efficient of corresponded features. These techniques are categorized algorithms, each of which computes the translational and ro- based on feature type (surfaces, lines or points) and general tational components of the transform in closed form, as the solution method (iterative vs closed form). Point features solution to a least squares formulation of the problem. They are the most commonly used in practice. Closed form so- differ in terms of the transformation representation used and lutions are generally superior to iterative methods, in terms the mathematical derivation of the solution, using respec- of efficiency and robustness, because the latter suffer from tively singular value decomposition or eigensystem compu- the problems of not guaranteeing convergence, becoming tation based on the standard [R; T] representation, and the trapped in local minima of the error function and requir- eigensystem analysis of matrices derived from unit and dual ing a good starting estimate. For these reasons, this pa- quaternion forms of the transform. This comparison presents per compares only a subset of the approaches mentioned both qualitative and quantitative results of several experi- in the indicated survey: closed form solutions involving cor- ments designed to determine (1) the accuracy and robust- responded point sets. The problem of determining point cor- ness of each algorithm in the presence of different levels of respondence, itself an active research area, is not addressed noise, (2) the stability with respect to degenerate data sets, here. and (3) relative computation time of each approach under Four popular and efficient solutions are compared in this different conditions. The results indicate that under “ideal” paper. These differ according to the representations used for data conditions (no noise) certain distinctions in accuracy the transformation components, and in the ways they mini- and stability can be seen. But for “typical, real-world” noise mize a criterion function. The first solution was developed levels, there is no difference in the robustness of the final by Arun et al. (1987) and is based on computing the sin- solutions (contrary to certain previously published results). gular value decomposition (SVD) of a matrix derived from Efficiency, in terms of execution time, is found to be highly the standard [R; T] representation. A similar approach, but dependent on the computer system setup. based on exploiting the orthonormal properties of the ro- tation matrix, computes the eigensystem of a different de- Key words: Motion analysis – 3-D rigid transformations – rived matrix, as presented by Horn et al. (1988). The third Pose estimation algorithm, also developed by Horn (1987), involves com- puting the eigensystem of a matrix related to representing the rotational component as a unit quaternion. Yet another eigensystem is analyzed when the translation and rotation components are represented using dual quaternions, as pre- 1 Introduction sented in the fourth technique by Walker et al. (1991). Determining the relationship between two coordinate sys- The comparison conducted here consists of three parts. tems through the use of sets of corresponded feature mea- First, the absolute accuracy of each algorithm is determined surements is known as the absolute orientation problem.It using ground truth under noiseless conditions, and then ro- has numerous applications in the areas of photogrammetry, bustness is measured as the coordinates of corresponding robotics (constructing world models), object motion analysis, points are corrupted with increasing amounts of noise. Sec- relating a camera coordinate system to others (the hand-eye ond, the stability of the algorithms is quantified by finding transform), as well as estimating the position and orientation the breakdown limits as the original 3-D point sets degen- of a recognized object (pose estimation). erate into such forms as a plane, a line and a single point. Lastly, the relative efficiency of the algorithms, in terms of Correspondence to: D.W. Eggert 273 actual execution time, is reported for the above situations. 3.1 The singular value decomposition of a matrix Conclusions based on these results should make the choice of an appropriate algorithm for a given application easier This first method was developed by Arun et al. (1987), and and more reliable. was originally designed to explicitly minimize Eq. 2. 2 Previous comparisons 3.1.1 Transformation representation The review by Sabata and Aggarwal (1991), while quite For this approach, the rotation is represented using a standard broad in the number of techniques described, unfortunately 3 3 orthonormal matrix. Translation is a 3-D vector, as in only presents a qualitative summary and comparison of these Eq.× 1. techniques (also, the dual quaternion method was devel- oped after this survey). Quantitative results which would help answer the question, “Which is the best?”, were not 3.1.2 Calculating rotation given. However, in the defining papers of these approaches there appear certain quantitative results and qualitative as- As a consequence of the least-squares solution to Eq. 2, the sessments of accuracy and speed, including the following. point sets di and mi should have the same centroid. Arun et al. observed that “the computer time require- Using this{ constraint} { a new} equation can be generated. By ments for the SVD and (unit) quaternion algorithms are com- defining: parable” (1987, p. 700). In a paper not directly related to any of the methods, Zhang implemented both of the quaternion 1 N d¯ = di dc = di d¯ algorithms and found that “they yield exactly the same mo- N i − tion estimate” (1994, p. 124). Also, he found these two tech- i=1 X (3) niques to be more efficient than an iterative technique based N on the extended Kalman filter that he developed. Finally, 1 m¯ = mi mc = mi m¯ Walker et al. stated that “the two algorithms produce the N i − i=1 same rotation errors ... for the translation errors, the DQ al- X gorithm exhibits better performance than the SVD algorithm Eq. 2 can be rewritten and reduced to: ... ” (1991, p. 364). The thorough and unbiased comparison N 2 2 presented here will clarify, extend (and even refute) some Σ = dc Rˆ mc k i − i k of these previous findings. i=1 X (4) N 3 Descriptions of the algorithms T T T = (d dc + m mc 2 d Rˆ mc ) ci i ci i − ci i i=1 Each of the four algorithms computes the solution to a sim- X ilar problem, which can be described as follows. Assume This equation is minimized when the last term is maximized, that there exist two corresponded point sets mi and di , which is equivalent to maximizing Trace(RHˆ ), where H is i =1:: N, such that they are related by: { } { } a correlation matrix defined by: di = R mi + T + Vi (1) N T H = mc d (5) where R is a standard 3 3 rotation matrix, T is a 3-D i ci × i=1 translation vector and Vi a noise vector. Solving for the op- X timal transformation [Rˆ , Tˆ ] that maps the set mi onto di If the singular value decomposition of H is given by H = typically requires minimizing a least squares{ error} criterion{ } UΛV T, then the optimal rotation matrix, Rˆ , that maximizes given by: the desired trace is N Rˆ = VUT (6) 2 2 Σ = di Rˆ mi Tˆ (2) k − − k Note: minimizing Eq. 4 is also known as the orthogonal Pro- i=1 X crustes problem, the SVD-based solution of which has been It is true that if outliers exist in the data set (through known for some time (Schonemann 1966). [It has also been incorrect correspondences), this least squares solution is not shown by Goryn and Hein (1995) that the above solution optimal and other techniques should be used (Meer et al. holds in the case where both the model and data points have 1991). However, for the remainder of this paper we assume been corrupted by noise.] only correct correspondences exist. In the following sections, the basic steps involved in minimizing Eq. 2 are listed for each method in a common 3.1.3 Calculating translation framework. This includes exactly how the rotation and trans- lation components of the transformation are represented and The optimal translation aligns the centroid of the set di computed, as well as any special cases that must be consid- { } with the rotated centroid of the set mi as mentioned ear- ered for a complete solution. (For complete derivations of lier. That is { } each technique, the reader is invited to consult the appropri- ate paper.) Tˆ = d¯ Rˆ m¯ (7) − 274 3.1.4 Special cases 3.2.3 Calculating translation ˆ When the determinant of R is +1, all is well. However, when The optimal translation Tˆ is computed using Eq.