
A Projective Geometry for Computer Vision

Projective geometry is all geometry. Arthur Cayley (1821–1895)

We are familiar with the concepts and measurements of Euclidean geometry, which is a good approximation to the properties of a general physical space. However, when we consider the imaging process of a camera, Euclidean geometry becomes insufficient, since parallelism, lengths, and angles are no longer preserved in images. In this appendix, we briefly survey some basic concepts and properties of projective geometry which are extensively used in computer vision. For further information, readers may refer to [1, 3, 7]. Euclidean geometry is actually a subset of projective geometry, which is more general and least restrictive in the hierarchy of fundamental geometries. Just like Euclidean geometry, projective geometry exists in any dimension: the projective line P^1 corresponds to the 1D Euclidean line R^1; the projective plane P^2 is analogous to the 2D Euclidean plane; and the three-dimensional projective space P^3 is related to 3D Euclidean space.

A.1 2D Projective Geometry

A.1.1 Points and Lines

In the Euclidean space R^2, a point can be denoted as x̄ = [x, y]^T, and a line passing through the point can be represented as

l1x + l2y + l3 = 0 (A.1)

If we multiply the same nonzero scalar w on both sides of (A.1), we have

l1xw + l2yw + l3w = 0 (A.2)


Clearly, (A.1) and (A.2) represent the same line. Let x = [xw, yw, w]^T and l = [l_1, l_2, l_3]^T; then the line (A.2) can be represented as

x^T l = l^T x = 0    (A.3)

where the line is represented by the vector l, and any point on the line is denoted by x. We call the 3-vector x the homogeneous coordinates of a point in P^2, which represents the same point as the inhomogeneous coordinates x̄ = [xw/w, yw/w]^T = [x, y]^T. Similarly, we call l the homogeneous representation of the line, since for any nonzero scalar k, l and kl represent the same line. From (A.3), we find that there is actually no difference between the representation of a line and the representation of a point. This is known as the duality principle. Given two lines l = [l_1, l_2, l_3]^T and l' = [l'_1, l'_2, l'_3]^T, their intersection defines a point that can be computed from

x = l × l' = [l]_× l'    (A.4)

where '×' denotes the cross product of two vectors, and

[l]_× = [  0   -l_3   l_2
          l_3    0   -l_1
         -l_2   l_1    0  ]

denotes the antisymmetric matrix of the vector l. Similarly, the line passing through two points x and x' can be computed from

l = x × x' = [x]_× x' = -[x']_× x    (A.5)

Any point with homogeneous coordinates x = [x, y, 0]^T corresponds to a point at infinity, or ideal point, whereas its corresponding inhomogeneous point x̄ = [x/0, y/0]^T makes no sense. In the projective plane, all ideal points can be written as [x, y, 0]^T. The set of these points lies on a single line l∞, which is called the line at infinity. From (A.3), it is easy to obtain the coordinates of the line at infinity, l∞ = [0, 0, 1]^T.
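As a quick numerical check of (A.3)-(A.5), the following Python/NumPy sketch (the point values and the helper name cross_mat are ours, not from the text) builds the line through two points, verifies that both points lie on it, and intersects two parallel lines to obtain an ideal point on l∞.

    import numpy as np

    def cross_mat(v):
        """Antisymmetric (skew-symmetric) matrix [v]_x of a 3-vector."""
        return np.array([[0, -v[2], v[1]],
                         [v[2], 0, -v[0]],
                         [-v[1], v[0], 0]])

    # Two points in homogeneous coordinates (w = 1)
    x1 = np.array([1.0, 2.0, 1.0])
    x2 = np.array([3.0, 5.0, 1.0])

    # Line through the two points, l = x1 x x2   (A.5)
    l = np.cross(x1, x2)
    assert abs(x1 @ l) < 1e-12 and abs(x2 @ l) < 1e-12   # both satisfy (A.3)

    # Intersection of the parallel lines x - y = 0 and x - y + 1 = 0
    l1 = np.array([1.0, -1.0, 0.0])
    l2 = np.array([1.0, -1.0, 1.0])
    p = np.cross(l1, l2)                   # (A.4), equals [l1]_x l2
    print(p)                               # third coordinate is 0: an ideal point on l_inf
    assert np.allclose(p, cross_mat(l1) @ l2)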

A.1.2 Conics and Dual Conics

In the Euclidean plane, the equation of a conic in inhomogeneous coordinates is written as

ax² + bxy + cy² + dx + ey + f = 0    (A.6)

If we adopt homogeneous coordinates and denote any point on the conic by x = [x_1, x_2, x_3]^T, then the conic (A.6) can be written as the following quadratic homogeneous expression:

a x_1² + b x_1 x_2 + c x_2² + d x_1 x_3 + e x_2 x_3 + f x_3² = 0    (A.7)

Fig. A.1 A point conic (a) and its dual line conic (b): x is a point on the conic x^T C x = 0; l is a line tangent to C at point x, which satisfies l^T C* l = 0.

In matrix form, (A.7) can be written as

x^T C x = 0,   with   C = [ a    b/2  d/2
                            b/2   c   e/2
                            d/2  e/2   f  ]    (A.8)

where C is the conic coefficient matrix, which is symmetric. A conic has five degrees of freedom, since multiplying C by any nonzero scalar does not affect the above equation. Therefore, five points of P^2 in general position (no three of them collinear) uniquely determine a conic. Generally, a conic matrix C is of full rank. In degenerate cases, the conic may degenerate to two lines when rank(C) = 2, or to one repeated line when rank(C) = 1. The conic defined in (A.8) is defined by points in P^2 and is usually termed a point conic. According to the duality principle, we can obtain the dual line conic as

l^T C* l = 0    (A.9)

where the notation C* stands for the adjoint matrix of C. The dual conic is also called the conic envelope, as shown in Fig. A.1, and is formed by the lines tangent to C. For the conics (A.8) and (A.9), we have the following results.

Result A.1 The line l tangent to the non-degenerate conic C at a point x is given by l = Cx. In duality, the tangent point x of the non-degenerate line conic C* at a line l is given by x = C*l.

Result A.2 For a non-degenerate conic C and its dual C*, we have C* = C^{-1} and (C*)* = C. A line conic may degenerate to two points when rank(C*) = 2, or to one repeated point when rank(C*) = 1, and (C*)* ≠ C in degenerate cases.

Any point x and a conic C define a line l = Cx, as shown in Fig. A.2. Then x and l form a pole-polar relationship. The point x is called the pole of the line l with respect to the conic C, and the line l is called the polar of the point x with respect to C. It is easy to verify the following results.

Fig. A.2 The pole-polar relationship: the line l = Cx is the polar of the point x with respect to the conic C, and the point x = C^{-1}l is the pole of l with respect to C.

Result A.3 The polar line l = Cx intersects the conic C at two points x_1 and x_2; the two lines l_1 = x × x_1 and l_2 = x × x_2 are tangent to the conic C. If the point x is on the conic, then the polar is the tangent line to C at x.

Result A.4 Any two points x and y satisfying x^T C y = 0 are called conjugate points with respect to C. The set of all points conjugate to x forms the polar line l. If x' is on the polar of x, then x is also on the polar of x', since x^T C x' = x'^T C x = 0. In duality, two lines l and l' are conjugate with respect to C if l^T C* l' = 0.

Result A.5 There is a pair of conjugate ideal points

i = [1, i, 0]^T,   j = [1, -i, 0]^T

on the line at infinity l∞. We call i and j the canonical forms of the circular points. Essentially, the circular points are the intersection of any circle with the line at infinity. Thus three additional points can uniquely determine a circle, which is equivalent to the fact that five general points can uniquely determine a general conic. The dual of the circular points forms a degenerate line conic given by C*∞ = i j^T + j i^T.
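Since a conic has five degrees of freedom, its coefficients can be recovered from five points in general position by stacking one instance of (A.7) per point and taking the null space of the resulting 5 × 6 design matrix. A minimal NumPy sketch (the sample points on a unit circle are our own choice, not from the text):

    import numpy as np

    # Five points on the unit circle x^2 + y^2 = 1, in general position
    t = np.array([0.1, 0.9, 2.0, 3.5, 5.0])
    pts = np.stack([np.cos(t), np.sin(t), np.ones_like(t)], axis=1)

    # Each row encodes a*x1^2 + b*x1*x2 + c*x2^2 + d*x1*x3 + e*x2*x3 + f*x3^2 = 0   (A.7)
    A = np.array([[x * x, x * y, y * y, x * w, y * w, w * w] for x, y, w in pts])

    # The conic coefficients span the 1D null space of A: take the last right singular vector
    _, _, Vt = np.linalg.svd(A)
    a, b, c, d, e, f = Vt[-1]

    C = np.array([[a, b / 2, d / 2],
                  [b / 2, c, e / 2],
                  [d / 2, e / 2, f]])      # symmetric conic matrix of (A.8)

    # Every one of the five points satisfies x^T C x = 0 (up to numerical error)
    print(np.max(np.abs([p @ C @ p for p in pts])))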

A.1.3 2D Projective Transformation

A two-dimensional projective transformation is an invertible linear mapping H : P^2 → P^2 represented by a 3 × 3 matrix. The transformation is also known as a projectivity, or a homography. The mapping of a point x = [x_1, x_2, x_3]^T can be written as

[x'_1]   [h_11  h_12  h_13] [x_1]
[x'_2] = [h_21  h_22  h_23] [x_2]    (A.10)
[x'_3]   [h_31  h_32  h_33] [x_3]

or, more briefly, as x' = Hx. This is a homogeneous transformation which is defined up to scale; thus H has only 8 degrees of freedom. Four pairs of corresponding points, no three of them collinear, uniquely determine the transformation. The transformation (A.10) is defined by points.

Result A.6 Under a point transformation x' = Hx, a line l is transformed to l' via

l' = H^{-T} l    (A.11)

A conic C is transformed to C' via

C' = H^{-T} C H^{-1}    (A.12)

and a dual conic C* is transformed to C*' via

C*' = H C* H^T    (A.13)

All projective transformations form a group, which is called the projective group. There are some specializations, or subgroups, of the transformation, such as the affine group, the Euclidean group, and the oriented Euclidean group. Different transformations have different geometric invariants and properties. For example, lengths and angles are invariant under a Euclidean transformation; parallelism and the line at infinity are invariant under an affine transformation; a general projective transformation preserves concurrency, collinearity, and the cross ratio.
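Because a homography has 8 degrees of freedom, four point correspondences with no three points collinear determine it up to scale. The sketch below is a minimal direct-linear-transform (DLT) style solver under the assumption of noise-free correspondences; the function name, H_true, and the point values are made up for illustration.

    import numpy as np

    def homography_from_4pts(src, dst):
        """Solve x' ~ H x up to scale from four correspondences (DLT).

        src, dst: (4, 3) arrays of homogeneous 2D points.
        Each correspondence contributes two rows of the 8x9 system A h = 0."""
        rows = []
        for (x, y, w), (xp, yp, wp) in zip(src, dst):
            rows.append([0, 0, 0, -wp * x, -wp * y, -wp * w, yp * x, yp * y, yp * w])
            rows.append([wp * x, wp * y, wp * w, 0, 0, 0, -xp * x, -xp * y, -xp * w])
        A = np.array(rows, dtype=float)
        _, _, Vt = np.linalg.svd(A)          # null vector = last right singular vector
        return Vt[-1].reshape(3, 3)

    src = np.array([[0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1]], dtype=float)
    H_true = np.array([[1.2, 0.1, 0.3], [0.0, 0.9, -0.2], [0.05, 0.01, 1.0]])
    dst = (H_true @ src.T).T

    H = homography_from_4pts(src, dst)
    H /= H[2, 2]                              # fix the free scale
    print(np.allclose(H, H_true / H_true[2, 2]))   # True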

A.2 3D Projective Geometry

A.2.1 Points, Lines, and Planes

In the 3D projective space P^3, the homogeneous coordinates of a point are represented by a 4-vector X = [x_1, x_2, x_3, x_4]^T, which is defined up to a scale, since X and sX (s ≠ 0) represent the same point. The corresponding inhomogeneous coordinates are X̄ = [x, y, z]^T = [x_1/x_4, x_2/x_4, x_3/x_4]^T. When x_4 = 0, X represents a point at infinity. A plane in the 3D space P^3 can be formulated as

π^T X = π_1 x_1 + π_2 x_2 + π_3 x_3 + π_4 x_4 = 0    (A.14)

where X = [x_1, x_2, x_3, x_4]^T is the homogeneous representation of a point on the plane. The 4-vector π = [π_1, π_2, π_3, π_4]^T is called the homogeneous coordinates of the plane. When π = [0, 0, 0, 1]^T, the solution set of (A.14) is the set of all points at infinity. In this case, the plane is called the plane at infinity, denoted π∞. For any finite plane π ≠ π∞, if we use inhomogeneous point coordinates, the plane equation in Euclidean geometry can be written as

n^T X̄ + d = 0    (A.15)

where n = [π_1, π_2, π_3]^T is called the plane normal, d = π_4, and d/‖n‖ is the distance of the plane from the origin.

Result A.7 Two distinct planes intersect in a unique line. Two planes are parallel if and only if their intersection is a line at infinity. A line is parallel to a plane if and only if their intersection is a point at infinity.

Result A.8 Three non-collinear points X_1, X_2, and X_3 uniquely define a plane, which can be obtained from the 1-dimensional right null space of the 3 × 4 matrix A = [X_1, X_2, X_3]^T, since Aπ = 0.

Result A.9 As a dual to Result A.8, three non-collinear planes π_1, π_2, and π_3 uniquely define a point, which can be obtained from the 1-dimensional right null space of the 3 × 4 matrix A* = [π_1, π_2, π_3]^T, since A*X = 0.
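Result A.8 translates directly into a small SVD computation. The following sketch (with arbitrarily chosen points) recovers the plane through three points as the right null vector of A = [X_1, X_2, X_3]^T.

    import numpy as np

    # Three non-collinear points in homogeneous coordinates (x4 = 1)
    X1 = np.array([1.0, 0.0, 0.0, 1.0])
    X2 = np.array([0.0, 2.0, 0.0, 1.0])
    X3 = np.array([0.0, 0.0, 3.0, 1.0])

    A = np.stack([X1, X2, X3])              # the 3 x 4 matrix of Result A.8

    # The plane pi spans the 1D right null space of A (A pi = 0)
    _, _, Vt = np.linalg.svd(A)
    pi = Vt[-1]

    print(np.allclose(A @ pi, 0))           # True: all three points lie on the plane
    # Inhomogeneous form (A.15): normal n = pi[:3], offset d = pi[3]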

A line in P^3 is defined by the join of two points or by the intersection of two planes, and it has 4 degrees of freedom in 3-space. Suppose X_1 and X_2 are two non-coincident space points; then the line joining these points can be defined by the span of the two points via the 2 × 4 matrix

A = [ X_1^T
      X_2^T ]    (A.16)

It is evident that the pencil of points X = {X(α, β) = αX_1 + βX_2} is the line joining the two points, which is called the line generated by the span of A^T. The 2-dimensional right null space of A is a pencil of planes with the line as axis. Similarly, suppose we have two planes π_1 and π_2. A dual representation of a line can be generated from the span of the 2 × 4 matrix

A* = [ π_1^T
       π_2^T ]    (A.17)

Clearly, the span of A*^T is a pencil of planes π = {π(α, β) = απ_1 + βπ_2} with the line as axis. The 2-dimensional right null space of A* is the pencil of points on the line. There are other popular representations of a line by virtue of Plücker matrices and Plücker coordinates.

A.2.2 Projective Transformation and Quadrics

Similar to the 2D projective transformation (A.10), a transformation in P^3 can be described by a 4 × 4 matrix H as

X' = HX    (A.18)

The transformation H is a homogeneous matrix defined up to a scale; thus it has only 15 degrees of freedom. The hierarchy of 3D transformations includes the subgroups of Euclidean, affine, and projective transformations. Each has its special form and invariant properties. A quadric is a surface in P^3 defined by the following equation:

X^T Q X = 0    (A.19)

where Q is a 4 × 4 symmetric matrix. The quadric has properties similar to those of the conic in P^2.

Result A.10 A quadric has 9 degrees of freedom since it is defined up to a scale. Thus 9 points in general position can define a quadric. If the matrix Q is singular, the quadric is degenerate which may be defined by fewer points.

Result A.11 A quadric defines a polarity between a point and a plane. The plane π = QX is the polar plane of X with respect to Q. A plane intersects a quadric in a conic.

Result A.12 Under the transformation X' = HX, a quadric transforms as

Q' = H^{-T} Q H^{-1}    (A.20)

and a dual quadric Q* transforms as

Q*' = H Q* H^T    (A.21)

B Matrix Decomposition

Mathematics is the door and key to the sciences.

Roger Bacon (1214–1294)

In linear algebra, a matrix decomposition is a factorization of a matrix into some canonical form. There are many different classes of matrix decompositions. In this appendix, we introduce some common decomposition methods used in this book, such as singular value decomposition, RQ decomposition, and Cholesky decomposition. Please refer to [2, 9] for more details.

B.1 Singular Value Decomposition

Singular value decomposition (SVD) is one of the most useful decompositions in numerical computations such as optimization and least-squares estimation. In this book, it is widely used as a basic mathematical tool for the factorization algorithms. Suppose A is an m × n real matrix. The singular value decomposition of A is of the form

A = UΣV^T    (B.1)

where U is an m × m orthogonal matrix whose columns are the eigenvectors of AA^T, V is an n × n orthogonal matrix whose columns are the eigenvectors of A^T A, and Σ is an m × n matrix with nonnegative real numbers on the diagonal. The decomposition is conventionally carried out in such a way that the diagonal entries σ_i are arranged in descending order. The diagonal entries of Σ are known as the singular values of A.


B.1.1 Properties of SVD Decomposition

SVD decomposition reveals many intrinsic properties of the matrix A and is numerically stable to compute. Suppose the singular values of A are

σ_1 ≥ σ_2 ≥ ··· ≥ σ_r > σ_{r+1} = ··· = 0

Then from (B.1) we infer the following statements about the SVD.

1. A = ∑_{i=1}^r σ_i u_i v_i^T = U_r Σ_r V_r^T, where U_r = [u_1, ..., u_r], V_r = [v_1, ..., v_r], Σ_r = diag(σ_1, ..., σ_r);
2. rank(A) = rank(Σ) = r;
3. range(A) = span{u_1, ..., u_r}: the column space of A is spanned by the first r columns of U;
4. null(A) = span{v_{r+1}, ..., v_n}: the null space of A is spanned by the last n − r columns of V;
5. range(A^T) = span{v_1, ..., v_r}: the row space of A is spanned by the first r columns of V;
6. null(A^T) = span{u_{r+1}, ..., u_m}: the null space of A^T is spanned by the last m − r columns of U;
7. ‖A‖_F = (σ_1² + ··· + σ_r²)^{1/2};
8. ‖A‖_2 = σ_1.

From the SVD decomposition (B.1), we have

AA^T = UΣV^T VΣ^T U^T = UΣΣ^T U^T    (B.2)

Therefore, σ_i², i = 1, ..., m, are the eigenvalues of AA^T, and the columns of U are the corresponding eigenvectors. Similarly, from

A^T A = VΣ^T U^T UΣV^T = VΣ^T ΣV^T    (B.3)

we know that σ_i², i = 1, ..., n, are the eigenvalues of A^T A, and the columns of V are the corresponding eigenvectors.

The result of the SVD can be used to measure the dependency between the columns of a matrix. The measure is termed the condition number, defined by

cond(A) = σ_max / σ_min    (B.4)

where σ_max and σ_min denote the largest and smallest singular values of A. Note that cond(A) ≥ 1. If the condition number is close to 1, then the columns of A are independent. A large condition number means that the columns of A are nearly dependent. If A is singular, σ_min = 0 and cond(A) = ∞. The condition number plays an important role in the numerical solution of linear systems, since it measures the noise sensitivity of the system.
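The condition number (B.4) is easily computed from the singular values; a short NumPy check with made-up matrices:

    import numpy as np

    A = np.array([[1.0, 1.0],
                  [1.0, 1.0001],
                  [2.0, 2.0]])

    s = np.linalg.svd(A, compute_uv=False)  # singular values, descending order
    print(s[0] / s[-1])                     # cond(A) as in (B.4): large, columns nearly dependent

    B = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.0, 0.0]])
    sb = np.linalg.svd(B, compute_uv=False)
    print(sb[0] / sb[-1])                   # 1.0: orthonormal columns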

B.1.2 Low-Rank Matrix Approximation

The fact that rank(A) = rank(Σ) tells us that we can determine the rank of the matrix A by counting the nonzero entries in Σ. In some practical applications, such as the structure and motion factorization algorithm, we need to approximate a matrix by another matrix of a specific low rank. For example, an m × n matrix is supposed to have rank r, but due to noise in the data its rank is greater than r. This shows up in Σ: when we look at Σ we may find that σ_{r+1}, σ_{r+2}, ... are much smaller than σ_r and very close to zero. In this case, we can obtain the best rank-r approximation of the matrix by modifying Σ. If rank(A) > r and we want to approximate A with a rank-r matrix Ã, the approximation is based on minimizing the Frobenius norm of the difference between A and Ã,

J = min_Ã ‖A − Ã‖_F    (B.5)

subject to the constraint that rank(Ã) = r. It turns out that the solution can be simply given by singular value decomposition. Suppose the SVD factorization of A is

A = UΣV^T    (B.6)

Let us construct a matrix Σ̃ which is the same as Σ except that it contains only the r largest singular values, with the rest replaced by zero. Then the rank-r approximation is given by

Ã = UΣ̃V^T    (B.7)

This is known as the Eckart-Young theorem in linear algebra. Here is a short proof of the theorem.

Proof Since the Frobenius norm is unitarily invariant, we have the equivalent cost of (B.5) as

J = min_Ã ‖U^T A V − U^T Ã V‖_F = min_Ã ‖Σ − Σ̃‖_F = min_{σ̃_i} ( ∑_{i=1}^n (σ_i − σ̃_i)² )^{1/2}    (B.8)

where the σ̃_i are the singular values of Ã and σ̃_{r+1} = ··· = σ̃_n = 0. Thus, the cost function (B.8) is converted to

J = min_{σ̃_i} ( ∑_{i=1}^r (σ_i − σ̃_i)² + ∑_{i=r+1}^n σ_i² )^{1/2} = ( ∑_{i=r+1}^n σ_i² )^{1/2}    (B.9)

Therefore, Ã is the best rank-r approximation of A in the Frobenius norm sense when σ̃_i = σ_i for i = 1, ..., r and σ̃_{r+1} = ··· = σ̃_n = 0. □
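The rank-r approximation (B.7) takes only a few lines of NumPy. The sketch below builds a noisy rank-2 matrix (dimensions and noise level are arbitrary) and recovers its best rank-2 approximation in the Frobenius sense.

    import numpy as np

    rng = np.random.default_rng(0)
    M = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 8))   # exact rank 2
    A = M + 1e-3 * rng.standard_normal(M.shape)                      # noisy, full rank

    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    print(s)                                  # two dominant singular values, the rest ~1e-3

    r = 2
    s_trunc = np.where(np.arange(len(s)) < r, s, 0.0)   # keep the r largest singular values
    A_r = (U * s_trunc) @ Vt                            # best rank-r approximation (B.7)

    print(np.linalg.matrix_rank(A_r))                   # 2
    print(np.linalg.norm(A - A_r))                      # Frobenius error of the order of the noise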

B.2 QR and RQ Decompositions

An orthogonal matrix Q is a matrix whose transpose equals its inverse, i.e.

Q^T Q = QQ^T = I    (B.10)

where I is the identity matrix. Taking determinants on (B.10) leads to det(Q) = ±1. If det(Q) = 1, Q is a rotation matrix; otherwise, it is called a reflection matrix.

QR decomposition factorizes a matrix into an orthogonal matrix and an upper-triangular matrix. Any real square matrix A can be decomposed as

A = QR    (B.11)

where Q is an orthogonal matrix and R is an upper-triangular matrix, also called a right-triangular matrix. QR decomposition is often used to solve the linear least squares problem. Analogously, we can define QL, RQ, and LQ decompositions, with L being a left-triangular matrix. The decomposition is usually performed by means of the Gram-Schmidt process, Householder transformations, or Givens rotations.

In computer vision, we are particularly interested in the RQ decomposition of a 3 × 3 real matrix. Using this decomposition, we can factorize the matrix into a camera parameter matrix and a rotation matrix. The process can be easily realized by Givens rotations. The 3-dimensional Givens rotations are given by

Gx = [ 1  0   0      Gy = [ c  0  s      Gz = [ c  -s  0
       0  c  -s            0  1  0             s   c  0
       0  s   c ],        -s  0  c ],          0   0  1 ]    (B.12)

where c = cos θ and s = sin θ, with θ a rotation angle. By multiplying the matrix A on the right by a Givens rotation (B.12), we can zero an element below the diagonal, eventually forming the upper-triangular matrix R. The concatenation of all the Givens rotations forms the orthogonal matrix Q. For example, the RQ decomposition of a matrix A = [a_ij]_{3×3} can be performed as follows (see also the sketch below).

1. Zero the subdiagonal element a32 by multiplying by Gx;
2. Zero the subdiagonal element a31 by multiplying by Gy. This does not change a32;
3. Zero the subdiagonal element a21 by multiplying by Gz. This does not change a31 and a32;
4. Formulate the decomposition A = RQ, where the upper-triangular matrix is given by R = A Gx Gy Gz, and the rotation matrix is given by Q = Gz^T Gy^T Gx^T.
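The four steps above can be written out directly. The following sketch (the function name rq3 is ours) zeroes a32, a31, and a21 in turn with the Givens rotations of (B.12) and returns an upper-triangular R and a rotation Q with A = RQ.

    import numpy as np

    def rq3(A):
        """RQ decomposition of a 3x3 matrix via Givens rotations: A = R @ Q."""
        A = A.astype(float).copy()

        def gx(c, s): return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])
        def gy(c, s): return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
        def gz(c, s): return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

        d = np.hypot(A[2, 1], A[2, 2])          # zero a32: c*a32 + s*a33 = 0
        Gx = gx(A[2, 2] / d, -A[2, 1] / d)
        A = A @ Gx
        d = np.hypot(A[2, 0], A[2, 2])          # zero a31: c*a31 - s*a33 = 0
        Gy = gy(A[2, 2] / d, A[2, 0] / d)
        A = A @ Gy
        d = np.hypot(A[1, 0], A[1, 1])          # zero a21: c*a21 + s*a22 = 0
        Gz = gz(A[1, 1] / d, -A[1, 0] / d)
        R = A @ Gz

        Q = Gz.T @ Gy.T @ Gx.T                  # concatenation of the Givens rotations
        return R, Q

    A = np.array([[2.0, 1.0, 3.0],
                  [0.5, 4.0, 1.0],
                  [1.0, 2.0, 5.0]])
    R, Q = rq3(A)
    print(np.allclose(R @ Q, A), np.allclose(Q @ Q.T, np.eye(3)))
    print(np.allclose(np.tril(R, -1), 0))        # R is upper-triangular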

B.3 Symmetric and Skew-Symmetric Matrix

A symmetric matrix A is a square matrix that is equal to its transpose, i.e. A = A^T. A skew-symmetric (also called antisymmetric) matrix A is a square matrix whose transpose is its negative, i.e. A = −A^T. Some properties of symmetric and skew-symmetric matrices are listed below.

1. Every diagonal matrix is symmetric, since all off-diagonal entries are zero. Similarly, each diagonal element of a skew-symmetric matrix must be zero.
2. Any square matrix can be expressed as the sum of a symmetric and a skew-symmetric part,

A = (1/2)(A + A^T) + (1/2)(A − A^T)    (B.13)

where A + A^T is symmetric and A − A^T is skew-symmetric.
3. Let A be an n × n skew-symmetric matrix. The determinant of A satisfies

det(A) = det(A^T) = det(−A) = (−1)^n det(A)    (B.14)

Thus, det(A) = 0 when n is odd.
4. Every real symmetric matrix A can be diagonalized. If A is a real n × n symmetric matrix, its eigen decomposition has the simple form

A = UDU^T = U diag(σ_1², ..., σ_n²) U^T    (B.15)

where U is an n × n orthogonal matrix whose columns are eigenvectors of A, and D is an n × n diagonal matrix with the eigenvalues of A as its diagonal elements. This is equivalent to the matrix equation

AU = UD    (B.16)

B.3.1 Cross Product

A 3-vector a = [a_1, a_2, a_3]^T defines a corresponding 3 × 3 skew-symmetric matrix as follows:

[a]_× = [  0   -a_3   a_2
          a_3    0   -a_1
         -a_2   a_1    0  ]    (B.17)

The matrix [a]_× is singular, and a is one of its null vectors. Thus a 3 × 3 skew-symmetric matrix is defined up to a scale by its null vector. Conversely, any 3 × 3 skew-symmetric matrix can be written as [a]_×.

The cross product of two 3-vectors a = [a_1, a_2, a_3]^T and b = [b_1, b_2, b_3]^T is given by

a × b = [a_2 b_3 − a_3 b_2, a_3 b_1 − a_1 b_3, a_1 b_2 − a_2 b_1]^T    (B.18)

It is easy to verify that the cross product is related to the skew-symmetric matrix by

a × b = [a]_× b = (a^T [b]_×)^T    (B.19)

Suppose A is any 3 × 3 matrix, and a and b are two 3-vectors. Then we have

(Aa) × (Ab) = A* (a × b)    (B.20)

where A* is the adjoint of A. Specifically, when A is invertible, equation (B.20) can be written as

(Aa) × (Ab) = det(A) A^{-T} (a × b)    (B.21)
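A short NumPy check of (B.17)-(B.21) with arbitrarily chosen vectors and an invertible matrix:

    import numpy as np

    def skew(a):
        """[a]_x as defined in (B.17)."""
        return np.array([[0, -a[2], a[1]],
                         [a[2], 0, -a[0]],
                         [-a[1], a[0], 0]])

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([-2.0, 0.5, 4.0])
    A = np.array([[2.0, 0.0, 1.0],
                  [1.0, 3.0, 0.0],
                  [0.0, 1.0, 2.0]])

    print(np.allclose(np.cross(a, b), skew(a) @ b))   # (B.19): a x b = [a]_x b
    print(np.allclose(skew(a) @ a, 0))                # a is a null vector of [a]_x

    # (B.21): (Aa) x (Ab) = det(A) A^{-T} (a x b)
    lhs = np.cross(A @ a, A @ b)
    rhs = np.linalg.det(A) * np.linalg.inv(A).T @ np.cross(a, b)
    print(np.allclose(lhs, rhs))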

B.3.2 Cholesky Decomposition

A symmetric matrix A is positive definite if, for any nonzero vector x, the product x^T A x is positive. A positive definite symmetric matrix can be uniquely decomposed as

A = KK^T

where K is an upper-triangular real matrix with positive diagonal entries. Such a decomposition is called the Cholesky decomposition, which is of great importance in camera calibration.

Proof Following the eigen decomposition (B.15), we have

A = UDU^T = (UD^{1/2})(UD^{1/2})^T = VV^T    (B.22)

Performing an RQ decomposition on V, we have V = KQ, where K is an upper-triangular real matrix and Q is an orthogonal matrix. Then the decomposition (B.22) can be written as

A = (KQ)(KQ)^T = KK^T    (B.23)

In (B.23), the diagonal entries of K may not be positive. Suppose the sign of the ith diagonal entry of K is sign(k_ii). Then we can make the diagonal entries positive by multiplying K by the diagonal transformation matrix H = diag(sign(k_11), ..., sign(k_nn)). This does not change the form of (B.23).

The decomposition is unique. Otherwise, suppose there are two such decompositions

A = K_1 K_1^T = K_2 K_2^T

from which we have

K_2^{-1} K_1 = K_2^T K_1^{-T} = (K_2^{-1} K_1)^{-T}

where K_2^{-1} K_1 is a diagonal matrix, being upper-triangular on the left-hand side and lower-triangular on the right-hand side. Thus K_2^{-1} K_1 = I, and K_2 = K_1. □
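NumPy's cholesky routine returns a lower-triangular factor L with A = LL^T. The upper-triangular factor K used in the text can be obtained from it with a reversal permutation, as in the following sketch (the helper name upper_cholesky and the reversal trick are ours, not the book's construction):

    import numpy as np

    def upper_cholesky(A):
        """Factor a symmetric positive definite A as A = K K^T with K upper-triangular."""
        P = np.eye(len(A))[::-1]             # reversal permutation, P = P^T = P^-1
        L = np.linalg.cholesky(P @ A @ P)    # P A P = L L^T, L lower-triangular
        K = P @ L @ P                        # reversing rows and columns gives an upper-triangular K
        return K                             # diag(K) > 0 since diag(L) > 0

    A = np.array([[4.0, 2.0, 1.0],
                  [2.0, 5.0, 3.0],
                  [1.0, 3.0, 6.0]])
    K = upper_cholesky(A)
    print(np.allclose(K @ K.T, A))           # True
    print(np.allclose(np.tril(K, -1), 0))    # K is upper-triangular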

B.3.3 Extended Cholesky Decomposition

Cholesky decomposition can only deal with a positive definite symmetric matrix. In some applications, such as recovering the upgrading matrix in Chap. 9, the matrix A is a positive semidefinite symmetric matrix. In this case, we employ the following extended Cholesky decomposition. Suppose A is an n × n positive semidefinite symmetric matrix of rank k (k < n). Then A can be factorized as the product of an n × k vertical extended upper-triangular matrix Π_k and its transpose, A = Π_k Π_k^T.

Proof Since A is an n × n positive semidefinite symmetric matrix of rank k, it can be decomposed by SVD as

A = UΣU^T = U diag(σ_1, ..., σ_k, 0, ..., 0) U^T    (B.24)

where U is an n × n orthogonal matrix and Σ is a diagonal matrix with the singular values σ_i of A on its diagonal. Consequently, we get

H_k = U^(1:k) diag(√σ_1, ..., √σ_k) = [ H_ku
                                        H_kl ]    (B.25)

such that A = H_k H_k^T, where U^(1:k) denotes the first k columns of U, H_ku denotes the upper (n − k) × k submatrix of H_k, and H_kl denotes the lower k × k submatrix of H_k. By applying an RQ decomposition to H_kl, we have H_kl = Π_kl O_k, where Π_kl is an upper-triangular matrix and O_k is an orthogonal matrix.

Let us denote Π_ku = H_ku O_k^T and construct the n × k vertical extended upper-triangular matrix

Π_k = [ Π_ku
        Π_kl ]    (B.26)

Then we have

H_k = Π_k O_k,   A = H_k H_k^T = (Π_k O_k)(Π_k O_k)^T = Π_k Π_k^T    (B.27)

It is easy to verify that the matrix A has nk − k(k − 1)/2 degrees of freedom, which is exactly the number of unknowns in Π_k. □

C Numerical Computation Method

What we know is not much. What we do not know is immense. Pierre-Simon Laplace (1749–1827)

In this appendix, we introduce two widely used classes of numerical computation methods: least squares for linear systems, and iterative estimation methods for nonlinear systems. Please refer to [2, 8, 9] for a detailed study.

C.1 Linear Least Squares

The method of least squares is a standard approach to the approximate solution of an overdetermined system of equations. "Least squares" means that the overall solution minimizes the sum of the squares of the residuals. Assuming a normal distribution of the errors, the least squares solution produces a maximum likelihood estimate of the parameters. Depending on whether or not the residuals are linear in all unknowns, least squares problems fall into two categories: linear least squares and nonlinear least squares.

Let us consider a linear system of the form

Ax = b    (C.1)

where A ∈ R^{m×n} is the data matrix, x ∈ R^n is the parameter vector, and b ∈ R^m is the observation vector. The condition for uniqueness of the solution is closely related to the rank of the matrix A. The rank of A is the maximum number of linearly independent column vectors of A, so the rank can never exceed the number of columns. On the other hand, since rank(A) = rank(A^T), the rank of A also equals the maximum number of linearly independent rows of A. Therefore, rank(A) ≤ min(m, n). We say that A is of full rank if rank(A) = min(m, n).


In the system (C.1), when m = n and A is nonsingular, there will be a unique solution. When m > n, the system typically has no exact solution unless b happens to lie in the span of the columns of A. The goal of least squares fitting is to find x ∈ R^n that minimizes

J_1 = min_x ‖Ax − b‖_2    (C.2)

C.1.1 Full Rank System

We consider the case when m > n and A is of full rank. Three methods are presented below.

Normal equation. Suppose x is the solution to the least squares problem (C.2). Then Ax is the closest point to b, and Ax − b must be a vector orthogonal to the column space of A. Thus we have A^T(Ax − b) = 0, which leads to

(A^T A)x = A^T b    (C.3)

We call (C.3) the normal equations. Since rank(A) = n, A^T A is an n × n positive definite symmetric matrix. Thus, (C.3) has a unique solution, which is given by

x = (A^T A)^{-1} A^T b    (C.4)

The normal equations can also be solved by Cholesky decomposition as follows.

1. Let C = A^T A and d = A^T b;
2. Perform the Cholesky decomposition C = KK^T;
3. Solve Ky = d by forward substitution;
4. Solve K^T x = y by backward substitution.

The computation cost of the algorithm is about mn² + n³/3.

QR decomposition. Suppose the QR decomposition of A is QR, then we have

AT A = (QR)T (QR) = RT R

and the system (C.3) is converted to

(RT R)x = RT QT b (C.5)

Since R is invertible, (C.5) can be simplified to

Rx = QT b = d (C.6)

Thus, x can be easily solved by backward substitution from (C.6). The computation cost of the algorithm is about 2mn².

SVD decomposition. Since rank(A) = n, the SVD of A has the following form:

A = UΣV^T = U [ Σ_n
                 0  ] V^T    (C.7)

We define the pseudo-inverse of A as

A^+ = VΣ^+ U^T = V [ Σ_n^{-1}  0 ] U^T    (C.8)

Therefore, the least squares solution is given by

x = A^+ b = V [ Σ_n^{-1}  0 ] U^T b = ∑_{i=1}^n (u_i^T b / σ_i) v_i    (C.9)
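The three approaches can be compared directly on a small over-determined system; in the sketch below (random data), all of them agree with NumPy's built-in least squares solver.

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((10, 3))          # full column rank with probability 1
    b = rng.standard_normal(10)

    # 1. Normal equations (C.3)-(C.4)
    x_ne = np.linalg.solve(A.T @ A, A.T @ b)

    # 2. QR decomposition (C.6): R x = Q^T b
    Q, R = np.linalg.qr(A)                    # reduced QR, R is 3x3 upper-triangular
    x_qr = np.linalg.solve(R, Q.T @ b)

    # 3. SVD / pseudo-inverse (C.9)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    x_svd = Vt.T @ ((U.T @ b) / s)

    x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
    print(np.allclose(x_ne, x_ref), np.allclose(x_qr, x_ref), np.allclose(x_svd, x_ref))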

C.1.2 Deficient Rank System

In some cases, the column vectors of A are not independent of each other. Suppose rank(A) = r < n; then the SVD of A takes the form

A = U [ Σ_r  0
        0    0 ] V^T    (C.10)

Then one solution can be obtained from

x_0 = A^+ b = V [ Σ_r^{-1}  0
                  0         0 ] U^T b = ∑_{i=1}^r (u_i^T b / σ_i) v_i    (C.11)

The general solution is given by

x = x_0 + μ_{r+1} v_{r+1} + ··· + μ_n v_n = ∑_{i=1}^r (u_i^T b / σ_i) v_i + ∑_{j=r+1}^n μ_j v_j    (C.12)

which is an (n − r)-parameter family parameterized by the μ_j.

C.1.2.1 Homogeneous System

We consider a homogeneous system of equations

Ax = 0    (C.13)

where A is an m × n matrix. Suppose the system is over-determined, i.e. m > n. We seek a nonzero solution to the system. Obviously, if x is a solution, then kx is also a solution for any scalar k. Thus we impose the constraint ‖x‖ = 1. The least squares problem can be written as

J_2 = min_x ‖Ax‖_2,   s.t. ‖x‖ = 1    (C.14)

Case 1. If rank(A) = r < n − 1, the solution is not unique and can be written as

x = ∑_{i=1}^{n−r} α_i v_{r+i},   s.t. ∑_{i=1}^{n−r} α_i² = 1

where v_{r+1}, ..., v_n are n − r linearly independent eigenvectors of A^T A.

Proof The problem (C.14) is equivalent to

J_2 = min_x ‖A^T Ax‖_2,   s.t. ‖x‖ = 1    (C.15)

Since rank(A^T A) = rank(A) = r, the SVD of A^T A can be written as follows:

A^T A = V diag(λ_1, ..., λ_r, 0, ..., 0) V^T    (C.16)

It is easy to verify from (C.16) that v_{r+1}, ..., v_n are n − r linearly independent, mutually orthogonal solutions of (C.15). □

Case 2. If rank(A) = n − 1, the system (C.14) has a unique solution, which is the eigenvector corresponding to the zero eigenvalue of A^T A. For noisy data, A may be of full rank; then the least squares solution is given by the eigenvector that corresponds to the smallest eigenvalue of A^T A. In practice, we simply perform the SVD decomposition A = UΣV^T; the last column of V is exactly the solution.
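In practice, the constrained problem (C.14) is solved with a single SVD call. The sketch below fits a 2D line, written as a homogeneous 3-vector, to noisy synthetic points by solving Ax = 0; the data and noise level are made up.

    import numpy as np

    rng = np.random.default_rng(2)
    t = np.linspace(0, 1, 50)
    pts = np.stack([t,
                    2 * t + 1 + 0.01 * rng.standard_normal(t.size),
                    np.ones_like(t)], axis=1)

    # Each homogeneous point gives one equation l1*x + l2*y + l3 = 0
    A = pts                                   # 50 x 3, full rank because of noise
    _, _, Vt = np.linalg.svd(A)
    l = Vt[-1]                                # last right singular vector: argmin ||A l||, ||l|| = 1

    l = l / np.linalg.norm(l[:2])             # scale so (l1, l2) is a unit normal
    print(l)                                  # close to [2, -1, 1]/sqrt(5), up to sign (line y = 2x + 1)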

C.2 Nonlinear Estimation Methods

In the following, we briefly introduce the bundle adjustment method and two commonly used iterative algorithms for nonlinear parameter estimation.

C.2.1 Bundle Adjustment

Bundle adjustment is usually applied as the last refinement step of a feature-based reconstruction algorithm. The objective is to produce a jointly optimal estimate of the 3D structure and the projection parameters. The problem of structure from motion is defined as follows: given a set of tracking data x_ij, i = 1, ..., m, j = 1, ..., n, which are the projections of n 3D points over m views, we want to recover the camera projection parameters and the coordinates of the space points X_j following the imaging process x_ij = P_i X_j.

Suppose we already have a solution to the problem. Due to image noise and model error, however, the projection equations are not satisfied exactly. We want to refine the estimated projection matrices P̂_i and 3D points X̂_j so that their projections x̂_ij = P̂_i X̂_j are closer to the measurements x_ij. This can be realized by minimizing the image residuals (i.e., the geometric distance between the detected image point x_ij and the reprojected point x̂_ij) as follows [3, 10]:

min_{P̂_i, X̂_j} ∑_{i,j} d(x_ij, P̂_i X̂_j)²    (C.17)

If the image error is zero-mean Gaussian, then bundle adjustment produces a maximum likelihood estimate. The cost function (C.17) involves a large number of nonlinear equations and parameters. Thus, the minimization is achieved using nonlinear least-squares algorithms, such as Newton iteration and Levenberg-Marquardt (LM) iteration. The modified sparse LM algorithm has proven to be one of the most successful algorithms due to its efficiency and its ability to converge quickly from a wide range of initial guesses.

More generally, an unconstrained nonlinear estimation problem can be denoted as

min_x f(x)    (C.18)

where x ∈ R^n is a parameter vector and f(x) ∈ R^m is a nonlinear convex function. We want to find the parameter vector x̂ such that (C.18) is minimized. The problem of nonlinear optimization is usually solved via an iterative algorithm as follows.

1. The algorithm starts from an initial estimate x_0;
2. Compute the next search direction and increment Δ_i to construct the next point x_{i+1} = x_i + Δ_i;
3. Perform a termination test for the minimization.

The main difference between algorithms lies in the construction of the shift vector Δ_i, which is a key factor for convergence. We hope the algorithm will rapidly converge to the required least squares solution. Unfortunately, in some cases the iteration may converge to a local minimum, or may not converge at all. Please refer to [4, 8] for more details on nonlinear optimization.

C.2.2 Newton Iteration

Newton iteration is a basic algorithm for solving nonlinear least squares problems. Suppose that at the ith iteration the function is approximated by f(x) = f(x_i) + ε_i,

and the function can be locally linearized at x_i as

f(x_i + Δ_i) = f(x_i) + JΔ_i

where J is the Jacobian matrix. We seek the next point x_{i+1} = x_i + Δ_i such that f(x_{i+1}) is closer to f(x). From

f(x) − f(x_{i+1}) = f(x) − f(x_i) − JΔ_i = ε_i − JΔ_i

we know that this is equivalent to the minimization of

min_{Δ_i} ‖ε_i − JΔ_i‖

which is exactly the linear least squares problem (C.2). It can be solved by the normal equations

J^T J Δ_i = J^T ε_i    (C.19)

Thus the increment is given by Δ_i = (J^T J)^{-1} J^T ε_i = J^+ ε_i, and the parameter vector can be updated according to

x_{i+1} = x_i + J^+ ε_i    (C.20)

where the Jacobian matrix

J = ∂f(x)/∂x |_{x_i}

is evaluated at x_i in each iteration.

C.2.3 Levenberg-Marquardt Algorithm

The Levenberg-Marquardt (LM) algorithm is the most widely used optimization algorithm for bundle adjustment. The algorithm is a variation of Newton iteration. In the LM algorithm, the normal equations (C.19) are replaced by the augmented normal equations

(J^T J + λI) Δ_i = J^T ε_i    (C.21)

where I is the identity matrix. The damping factor λ is initially set to some small value, typically 0.001. The update rule is as follows: if the error goes down following an update Δ_i, the increment is accepted and λ is divided by a factor (usually 10) before the next iteration, to reduce the influence of gradient descent. On the other hand, if Δ_i leads to an increased error, then λ is multiplied by the same factor and the update Δ_i is solved again from (C.21). The process is repeated until an acceptable update is found.
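A compact LM loop for a small curve-fitting problem is sketched below; the exponential model, the synthetic data, and the damping schedule (factor of 10, initial λ = 0.001 as suggested above) are illustrative choices, not the book's implementation.

    import numpy as np

    rng = np.random.default_rng(3)
    t = np.linspace(0, 2, 40)
    a_true, b_true = 2.0, -1.3
    y = a_true * np.exp(b_true * t) + 0.01 * rng.standard_normal(t.size)

    def residual(p):                      # eps_i = measurements minus current model
        a, b = p
        return y - a * np.exp(b * t)

    def jacobian(p):                      # J = d(model)/d(p) at the current estimate
        a, b = p
        return np.stack([np.exp(b * t), a * t * np.exp(b * t)], axis=1)

    p = np.array([1.0, 0.0])              # initial guess
    lam = 1e-3                            # damping factor
    for _ in range(100):
        r, J = residual(p), jacobian(p)
        # Augmented normal equations (C.21): (J^T J + lam I) delta = J^T eps
        delta = np.linalg.solve(J.T @ J + lam * np.eye(2), J.T @ r)
        if np.sum(residual(p + delta) ** 2) < np.sum(r ** 2):
            p, lam = p + delta, lam / 10  # accept the step, reduce damping
        else:
            lam *= 10                     # reject the step, increase damping
    print(p)                              # close to (2.0, -1.3)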

The above algorithm has a disadvantage: if the damping factor λ is large, the matrix J^T J in (C.21) is hardly used at all, and the iteration takes only small steps along the directions where the gradient is small. To address this problem, Marquardt [6] modified equation (C.21) to

(J^T J + λ diag(J^T J)) Δ_i = J^T ε_i    (C.22)

where the identity matrix is replaced by the diagonal of J^T J, which allows larger movement along the directions of small gradient.

Each step of the LM algorithm involves the solution of the normal equations (C.21) or (C.22), which has a complexity of n³. Thus, it is computationally intensive for problems with a large number of parameters. In bundle adjustment (C.17), the parameter space consists of two different sets: one set is related to the camera parameters, and the other consists of the space points. This leads to a sparse structure in the Jacobian matrix. Therefore, a sparse Levenberg-Marquardt algorithm was proposed to improve efficiency. Please refer to [3, 5] for implementation details.

References

1. Faugeras, O.: Three-Dimensional Computer Vision. MIT Press, Cambridge (1993)
2. Golub, G.H., Van Loan, C.F.: Matrix Computations, 3rd edn. Johns Hopkins University Press, Baltimore (1996). ISBN 978-0-8018-5414-9
3. Hartley, R.I., Zisserman, A.: Multiple View Geometry in Computer Vision, 2nd edn. Cambridge University Press, Cambridge (2004). ISBN 0521540518
4. Kelley, C.T.: Iterative Methods for Optimization. SIAM Frontiers in Applied Mathematics, vol. 18. SIAM, Philadelphia (1999)
5. Lourakis, M.I.A., Argyros, A.A.: SBA: A software package for generic sparse bundle adjustment. ACM Trans. Math. Softw. 36(1), 1–30 (2009)
6. Marquardt, D.: An algorithm for least-squares estimation of nonlinear parameters. SIAM J. Appl. Math. 11, 431–441 (1963)
7. Mundy, J.L., Zisserman, A.: Geometric Invariance in Computer Vision. MIT Press, Cambridge (1992)
8. Nocedal, J., Wright, S.J.: Numerical Optimization. Springer, New York (1999)
9. Press, W.H., Flannery, B.P., Teukolsky, S.A., Vetterling, W.T.: Numerical Recipes in C. Cambridge University Press, Cambridge (1988)
10. Triggs, B., McLauchlan, P., Hartley, R., Fitzgibbon, A.: Bundle adjustment – a modern synthesis. In: Proc. of the International Workshop on Vision Algorithms, pp. 298–372. Springer, Berlin (1999)

Glossary

Affine factorization  A structure and motion factorization algorithm based on the affine camera model. Depending on the type of object structure, it is classified as rigid factorization, nonrigid factorization, articulated factorization, etc.

Bundle adjustment  Any refinement approach for visual reconstructions that aims to produce jointly optimal structure and camera estimates.

Disparity  The position difference of a feature point in two images, denoted by a shift of an image point from one image to the other.

Duality principle  Propositions in projective geometry appear in pairs. Given any proposition of the pair, a dual result can be immediately inferred by interchanging the parts played by the words "point" and "line".

Extended Cholesky decomposition  An extension of Cholesky decomposition to efficiently factorize a positive semidefinite matrix into a product of a vertical extended upper-triangular matrix and its transpose.

Feature matching  The process of establishing correspondences between the features of two or more images. Typical image features include corners, lines, edges, conics, etc.

Hadamard product  Element-by-element multiplication of two matrices with the same dimensions.

Histogram  A graphical display of tabulated frequencies. It is the graphical version of a table which shows what proportion of cases fall into each of several specified categories.

Incomplete tracking matrix  A tracking matrix with missing point entries due to tracking failures.

Lateral rotation  The orientation of a camera is denoted by roll-pitch-yaw angles, where the roll angle is referred to as axial rotation, and the pitch and yaw angles are referred to as lateral rotation.

Nonrigid factorization  Structure and motion factorization algorithm for nonrigid objects. It may assume different camera models, such as the affine, perspective projection, or quasi-perspective projection model.


Normalized tracking matrix  Depending on the context, refers either to a tracking matrix in which every point is normalized by the camera parameters, or to a tracking matrix that is normalized point-wise and image-wise so as to increase its numerical stability and accuracy in factorization.

Perspective factorization  Perspective projection-based structure and motion factorization algorithm for either rigid or nonrigid objects.

Pose estimation  Estimating the position and orientation of an object relative to the camera, which are usually expressed by a rotation matrix and a translation vector.

Projective factorization  Refers to structure and motion factorization in a projective space, which may be stratified to Euclidean space by an upgrading matrix.

Quasi-perspective factorization  Structure and motion factorization of rigid or nonrigid objects based on quasi-perspective projection, which assumes that the camera is far away from the object and undergoes small lateral rotations. The projective depths are implicitly embedded in the motion and shape matrices in quasi-perspective factorization.

Rigid factorization  Structure and motion factorization of rigid objects. Its formulation depends on the employed projection model, such as the affine camera model, the perspective projection model, or the quasi-perspective projection model.

Stratified reconstruction  A 3D reconstruction approach which begins with a projective or affine reconstruction, then refines the solution progressively to a Euclidean solution by applying constraints about the scene and the camera.

Structure and motion factorization  The process of factorizing a tracking matrix into 3D structure and motion parameters associated with each frame. It is usually carried out by SVD decomposition or by a power factorization algorithm.

Tracking matrix  A matrix composed of the coordinates of tracked points across an image sequence. It is also termed a measurement matrix.

Visual metrology  The science of acquiring measurements through computer vision technology.

Weighted tracking matrix  A tracking matrix in which the coordinates of every tracked point are scaled by a projective depth. In the presence of a set of consistent depth scales, the factorization produces a perspective solution.
