Glimpses of geometry and graphics
Przemysław Koprowski
Contents

Introduction

Part 1. Geometry of space

Chapter 1. Homogeneous coordinates and affine transforms
 1. Projective space
 2. Homogeneous coordinates
 3. Vectors and normals
 4. Isometries
 5. Transformation decomposition
 Implementation notes
 Exercises
 Bibliographic notes

Chapter 2. Projective space and quaternions
 1. Projective maps
 2. Projections and perspective
 3. Duality and the Plücker-Grassmann coordinates
 4. Quaternions and transform interpolation
 Exercises
 Bibliographic notes

Part 2. Geometry of curves

Chapter 3. Cubic curves
 1. Introduction
 2. Bézier splines
 3. Interpolation with Bézier curves
 4. Catmull-Rom splines
 5. B-splines
 6. Other cubic representations
 7. Change of basis
 8. Curve splitting
 Implementation notes
 Exercises
 Bibliographic notes

Chapter 4. Rational parametric curves
 1. Why rational splines?
 2. Bézier curves
 3. Uniform B-splines
 4. NURBS
 5. Tangent and curvature
 6. On-curve alignment
 7. Manifolds and geometric continuity
 8. Arc-length parametrization
 Exercises

Chapter 5. Spherical curves
 1. Transformation interpolation revisited

Chapter 6. Algebraic curves
 1. Affine algebraic curves
 2. Resultants
 3. Projective algebraic curves
 4. Singular points on algebraic curves
 5. Parametrization of algebraic curves
 6. Implicitization of parametric curves
 7. Intersection points
 8. Hessian and the inflection points
 9. Cubic curves
 Implementation notes
 Exercises
 Bibliographic notes

Part 3. Geometry of solids and surfaces

Chapter 7. Polygons and polyhedra
 1. Triangles
 2. Convex polytopes
 3. Convex polygons

Chapter 8. Parametric surfaces

Bibliography

Introduction
“They believe the world is run by geometry. All lines and angles and numbers. That sort of thing [ ... ] can lead to some very unsound ideas.” — T. Pratchett
Part 1
Geometry of space
CHAPTER 1
Homogeneous coordinates and affine transforms
“Space is essentially one” — I. Kant
“We cannot assert with Kant that the propositions of Euclidean geometry possess any universal truth” — A. D’Abro

This chapter introduces the notion of homogeneous coordinates, which is one of the fundamental structures in the domain of geometric modeling. To this end we shall define a projective space, which is an extension of the affine space known from basic linear algebra. We won’t, however, present an in-depth exposition of projective geometry in this chapter and, after introducing a homogeneous coordinate system, we will restrict ourselves to the affine case, returning to the projective space in Chapter 2.

In computer graphics, one constantly deals with basic geometric transformations of R^3 such as rotation, scaling and translation. The former two transformations are linear maps, but the last one is not linear. It is well known that every linear transform can be represented by means of matrix multiplication. Fix the canonical basis ε_1 = (1, 0, 0)^T, ε_2 = (0, 1, 0)^T and ε_3 = (0, 0, 1)^T of R^3. Let τ ∈ End R^3 be a linear endomorphism and a_ij be the i-th coordinate of τ(ε_j); then we have

\tau \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \cdot \begin{pmatrix} x \\ y \\ z \end{pmatrix}.

Our aim is to develop an equally convenient and uniform¹ representation for all affine endomorphisms. In fact we will eventually be able to represent an even wider class of functions known as projective maps.
1. Projective space
In this section we introduce the notion of a projective space. Let n ∈ N be a fixed dimension (in practice we usually need n = 2 or n = 3). Define a relation ∼ on the set R^{n+1} ∖ {(0, …, 0)^T} by the condition

u ∼ v ⟺ ∃ c ∈ R∖{0} : u = cv.
¹If all the transforms are represented in the same, standardized fashion, they can be easily implemented in hardware (e.g. on a GPU).
In other words, two non-zero vectors are related if they are linearly dependent (i.e. they belong to the same line through the origin). It is clear that ∼ is an equivalence relation.
Definition 1.1. The set P^n R of equivalence classes of (n+1)-dimensional vectors with respect to the relation ∼ is called the n-dimensional projective space. The class of a vector (x_0, …, x_n)^T is denoted by ⟦x_0 : … : x_n⟧.

Of course, in the above definition one can substitute any arbitrary field (e.g. the rationals, the complex numbers, …) in place of the reals and define respectively the rational projective space P^n Q, the complex projective space P^n C and so on. We won’t need them, however, till Chapter 6.

Consider now the subset U_i of the projective space P^n R consisting of all the (projective) points having a non-zero i-th coordinate:

U_i := { ⟦x_0 : … : x_n⟧ ∈ P^n R | x_i ≠ 0 }.

Defining P^n R we have excluded the null vector of R^{n+1}; consequently every projective point has at least one non-zero coordinate. It follows that P^n R = ⋃_{i=0}^{n} U_i. Observe that there is a bijection U_i ↔ R^n defined by the formula

⟦x_0 : … : x_i : … : x_n⟧ ↦ (x_0/x_i, …, x_{i−1}/x_i, x_{i+1}/x_i, …, x_n/x_i)^T.

It allows us to identify U_i (for every 0 ≤ i ≤ n) with the affine n-dimensional space R^n. The map U_i → R^n is called a dehomogenization of U_i and its inverse R^n → U_i is the homogenization of R^n. It is most convenient to use them for either the first or the last coordinate and the choice between these two cases is just a matter of taste. We will stick to the latter convention, where the maps have the forms:

homogenization : (x_1, …, x_n)^T ↦ ⟦x_1 : … : x_n : 1⟧,
dehomogenization : ⟦x_0 : … : x_{n−1} : x_n⟧ ↦ (x_0/x_n, …, x_{n−1}/x_n)^T.

Let us characterize the completion of the affine space R^n in the projective space P^n R.

Proposition 1.2. There is a canonical bijection between P^n R ∖ U_n and the (n−1)-dimensional projective space P^{n−1} R.

Proof. By the very definition U_n = { ⟦x_0 : … : x_n⟧ | x_n ≠ 0 }, hence its completion P^n R ∖ U_n consists of all the projective points of the form ⟦x_0 : … : x_{n−1} : 0⟧. Of course some x_i ≠ 0 due to the way we have constructed the projective space. Dropping the last coordinate, which is constantly zero anyway, we arrive at P^{n−1} R. □

The set P^n R ∖ U_n is called the set of points at infinity. Observe that P^0 R consists of a single point by definition. Then, the projective line P^1 R decomposes into a sum of the affine line R^1 and the unique point at infinity. Next, the projective plane P^2 R consists of the affine plane R^2 and the line at infinity. Further, the projective 3-space P^3 R is just the affine 3-space R^3 together with the plane at infinity, and so on.

There are simple geometric models for these low-dimensional projective spaces. The coordinates of a projective point ⟦x_0 : … : x_n⟧ ∈ P^n R are unique only up to multiplication by a non-zero factor. Thus, normalizing them, we may assume that x_0^2 + ⋯ + x_n^2 = 1. Consequently, every projective point is represented by a point of the n-dimensional unit sphere S^n ⊂ R^{n+1}. Unfortunately, there are two such points. These antipodal points P and −P of the sphere map to the same point in the projective space.
[Figure 1.1. Circle as a model of the projective line.]

Consequently, one constructs the projective space by gluing together antipodal points of the unit sphere. In particular, we may easily check that the real projective line P^1 R can be identified with a unit circle. Imagine a rubber loop representing a circle. Twist it into an “eight-shape” and fold it so that the two halves overlap, joining the points that were originally opposing each other (see Fig. 1.1). We get a circle again!
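The homogenization and dehomogenization maps are straightforward to turn into code. Below is a minimal sketch in Python with NumPy (the function names are ours, not part of any standard library), following the last-coordinate convention adopted above.

    import numpy as np

    def homogenize(p):
        """Affine point (x_1, ..., x_n) -> homogeneous coordinates [x_1 : ... : x_n : 1]."""
        return np.append(np.asarray(p, dtype=float), 1.0)

    def dehomogenize(q):
        """Projective point [x_0 : ... : x_n] with x_n != 0 -> its affine representative."""
        q = np.asarray(q, dtype=float)
        if np.isclose(q[-1], 0.0):
            raise ValueError("point at infinity has no affine representative")
        return q[:-1] / q[-1]

    # Projective coordinates are unique only up to a non-zero factor:
    # [1 : 2 : 2] and [0.5 : 1 : 1] denote the same projective point.
    assert np.allclose(dehomogenize([1.0, 2.0, 2.0]), dehomogenize([0.5, 1.0, 1.0]))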
2. Homogeneous coordinates

We will return to the projective space in the next chapter, but for the time being, let us concentrate on the affine space R^n, which we identify with the subset U_n ⊂ P^n R. Every point P = (x_1, …, x_n)^T ∈ R^n can be assigned its projective coordinates ⟦x_1 : … : x_n : 1⟧. We call them homogeneous coordinates of P and write [x_1, …, x_n, 1]^T. In what follows, we always use square brackets when working with homogeneous coordinates and parentheses when dealing with linear/affine coordinates. The distinction between ⟦x_0 : … : x_n⟧ and [x_0, …, x_n]^T is purely artificial for points, but will become important once we admit vectors into our discussion.

A word of caution: the homogeneous coordinates of a point (x, y)^T ∈ R^2 in the plane are [xw, yw, w]^T for any non-zero w ∈ R. Hence all of [x, y, 1]^T, [2x, 2y, 2]^T, [−10.7x, −10.7y, −10.7]^T, … refer to the same point. Similarly, the homogeneous coordinates of a point (x, y, z)^T ∈ R^3 are [x, y, z, 1]^T as well as [x/3, y/3, z/3, 1/3]^T, … and so on. The homogeneous coordinates are not unique! Usually, we try to work with “normalized” coordinates [x, y, z, 1]^T, but the additional freedom in the choice of the weight w is sometimes important (see Chapter 4).

As noted above, points have homogeneous coordinates [xw, yw, zw, w]^T with weight w ≠ 0. What about vectors? In the affine world, when translating a point P by a vector v, we simply add their coordinates component-wise. This is a nice property, which we want to preserve in the homogeneous case. Take two points P, Q and let v be a vector satisfying Q = P + v. Let P = [x, y, z, 1]^T and Q = [x′, y′, z′, 1]^T (i.e. we fix the weights of P and Q to be w = w′ = 1). Write the equation Q = P + v coordinate-wise:

Q = \begin{bmatrix} x' \\ y' \\ z' \\ 1 \end{bmatrix} = \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} + \begin{bmatrix} x'-x \\ y'-y \\ z'-z \\ 0 \end{bmatrix} = P + v.
Hence, vectors have coordinates [x, y, z, 0]^T. Vectors in homogeneous coordinates have weight zero! More generally, we define point-vector addition by the formula:

\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} + \begin{bmatrix} a \\ b \\ c \\ 0 \end{bmatrix} := \begin{bmatrix} x + aw \\ y + bw \\ z + cw \\ w \end{bmatrix}.

Conversely, the difference of two points is a vector:

\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} - \begin{bmatrix} x' \\ y' \\ z' \\ w' \end{bmatrix} := \begin{bmatrix} x/w - x'/w' \\ y/w - y'/w' \\ z/w - z'/w' \\ 0 \end{bmatrix}.

Remark. Note the difference between ⟦x : y : z : 0⟧ and [x, y, z, 0]^T. The former is a point at infinity in the projective space P^3 R and is the same as, let’s say, ⟦2x : 2y : 2z : 0⟧. The latter is a vector in R^3 and differs from [2x, 2y, 2z, 0]^T, which is again a vector but twice longer.

The notion of homogeneous coordinates allows us to write all the affine endomorphisms in matrix form. Begin with a translation.

Observation 2.1. A translation in R^3 by a vector t = [t_x, t_y, t_z, 0]^T can be expressed in homogeneous coordinates in the following way:

\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \mapsto \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}.
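The weight-zero convention is easy to exercise in code. The sketch below (Python with NumPy; the helper name is ours) builds the translation matrix of Observation 2.1 and demonstrates that it moves points but leaves weight-zero vectors intact.

    import numpy as np

    def translation(tx, ty, tz):
        """Homogeneous 4x4 matrix of the translation by (tx, ty, tz)."""
        M = np.eye(4)
        M[:3, 3] = (tx, ty, tz)
        return M

    T = translation(1.0, 2.0, 3.0)
    point  = np.array([5.0, 0.0, 0.0, 1.0])   # weight 1: a point
    vector = np.array([5.0, 0.0, 0.0, 0.0])   # weight 0: a vector

    print(T @ point)    # [6. 2. 3. 1.] -- the point is shifted
    print(T @ vector)   # [5. 0. 0. 0.] -- the vector is unchanged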
Observation 2.2. Let τ ∈ End R^3 be a linear endomorphism and let a_ij be the i-th coordinate (with respect to the canonical basis) of τ(ε_j); then in homogeneous coordinates we write τ as

\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \mapsto \begin{bmatrix} a_{11} & a_{12} & a_{13} & 0 \\ a_{21} & a_{22} & a_{23} & 0 \\ a_{31} & a_{32} & a_{33} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}.

Proposition 2.3. Let τ, σ be two geometric transformations with corresponding homogeneous matrices A and B, respectively. Then the composition τ ∘ σ is realized by multiplication by the product A · B.

We leave the proof to the reader. Recall that every affine endomorphism can be written as a composition of a translation with some linear map. Both have matrix representations by the previous two observations, and it follows from the proposition that their composition is represented by a matrix multiplication as well. Hence we have proved:

Corollary 2.4. Every affine map can be expressed as a matrix multiplication in homogeneous coordinates.

In fact the class of all the functions defined by a matrix multiplication in homogeneous coordinates is strictly wider than the class of affine endomorphisms. These are the so-called projective maps. We will return to them later. For now, we shall only characterize those matrices that give rise to affine maps. Observe that due to the nature of homogeneous coordinates, we may multiply a matrix by a non-zero scalar without altering the transformation itself. Hence, we may always assume that the bottom-right entry of the matrix is either zero or one. Now, every affine map is a composition of a linear map with a translation. Examining what happens in the last row when multiplying the matrices from Observations 2.1 and 2.2, we arrive at the following conclusion:
Observation 2.5. Let τ : R^3 → R^3 be a geometric transformation expressible in matrix notation as τ(P) = MP for some matrix M = [m_ij] ∈ M_{4,4}(R) with m_44 = 1. Then τ is an affine endomorphism if and only if M has the form

M = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \\ 0 & 0 & 0 & 1 \end{bmatrix}.

Below we present matrices of a few basic linear maps.

Observation 2.6. The matrix (in homogeneous coordinates) of a non-uniform scaling (dilatation) by factors s = (s_x, s_y, s_z) in directions respectively OX, OY and OZ has the form

M_S = \begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.

Consider now a rotation by an angle α around a given (directed) axis. By convention a positive angle refers to a rotation in the counter-clockwise direction when looking along the axis of the rotation coherently with its orientation (i.e. “looking from minus infinity”). Rotations around the principal axes OX, OY, OZ have simple matrices. In what follows, by lin→(v) we denote the directed line lin(v) spanned by v and with orientation inherited from v.

Proposition 2.7. The matrix of a rotation by an angle α around a principal axis lin→(ε_i) has the form

M_{O_{\varepsilon_1,\alpha}} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\alpha & -\sin\alpha & 0 \\ 0 & \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \quad
M_{O_{\varepsilon_2,\alpha}} = \begin{bmatrix} \cos\alpha & 0 & \sin\alpha & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\alpha & 0 & \cos\alpha & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix},

M_{O_{\varepsilon_3,\alpha}} = \begin{bmatrix} \cos\alpha & -\sin\alpha & 0 & 0 \\ \sin\alpha & \cos\alpha & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
Proof. First consider a rotation in the plane R^2. The basis vectors ε_1 = (1, 0)^T and ε_2 = (0, 1)^T are mapped to some vectors v_1 = O_α(ε_1) and v_2 = O_α(ε_2). Elementary trigonometry (see Figure 1.2) shows that

v_1 = \begin{pmatrix} \cos\alpha \\ \sin\alpha \end{pmatrix} \quad\text{and}\quad v_2 = \begin{pmatrix} -\sin\alpha \\ \cos\alpha \end{pmatrix}.
[Figure 1.2. Rotation in the plane R^2.]
Hence, the matrix of the rotation in linear coordinates equals:

\begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix}.

Switching to three dimensions and further to homogeneous coordinates, we arrive at the last matrix in the assertion. Consider now a rotation around ε_1. Permuting the basis so that ε_1 ↦ ε_3 ↦ ε_2 ↦ ε_1, rotating around ε_3 and permuting back, we reduce the case to the one already considered. In terms of matrix multiplication we express this permutation as:

M_{O_{\varepsilon_1,\alpha}} = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} \cos\alpha & -\sin\alpha & 0 & 0 \\ \sin\alpha & \cos\alpha & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.
One derives the last matrix M_{O_{ε_2,α}} in the same way, this time taking the permutation ε_1 ↦ ε_2 ↦ ε_3 ↦ ε_1. □

A rotation around an arbitrary axis is a bit more involved.

Proposition 2.8. Let v = [x, y, z, 0]^T be a vector of unit length (i.e. x^2 + y^2 + z^2 = 1). The matrix of a rotation around lin→(v) by an angle α has the form

M_{O_{v,\alpha}} = \begin{bmatrix}
c + x^2(1-c) & xy(1-c) - zs & xz(1-c) + ys & 0 \\
xy(1-c) + zs & c + y^2(1-c) & yz(1-c) - xs & 0 \\
xz(1-c) - ys & yz(1-c) + xs & c + z^2(1-c) & 0 \\
0 & 0 & 0 & 1
\end{bmatrix},

where c = cos α and s = sin α.
Proof. If v lies on one of the principal axes OX, OY or OZ, then v = ±ε_i (for some 1 ≤ i ≤ 3) and the previous proposition yields the assertion. Thus, without loss of generality, we may assume that v does not lie on any of the principal axes. We express the rotation O_{v,α} as a composition τ^{−1} ∘ R ∘ τ, where R is a rotation around ε_1 and τ is an isometry sending v to ε_1. To this end consider an orthonormal basis {v, u, w} defined as follows:

û := ε_1 × v = [0, −z, y, 0]^T, \quad u := û/‖û‖ = \Bigl[0, \tfrac{-z}{\sqrt{y^2+z^2}}, \tfrac{y}{\sqrt{y^2+z^2}}, 0\Bigr]^T

and

w := v × u = \Bigl[\sqrt{y^2+z^2}, \tfrac{-xy}{\sqrt{y^2+z^2}}, \tfrac{-xz}{\sqrt{y^2+z^2}}, 0\Bigr]^T.
Now, let τ map the basis {v, u, w} to the canonical basis {ε_1, ε_2, ε_3}. It follows that the matrix M_τ^{−1} of the inverse of τ has the columns v, u, w and [0, 0, 0, 1]^T:
M_\tau^{-1} = \begin{bmatrix}
x & 0 & \sqrt{y^2+z^2} & 0 \\
y & \frac{-z}{\sqrt{y^2+z^2}} & \frac{-xy}{\sqrt{y^2+z^2}} & 0 \\
z & \frac{y}{\sqrt{y^2+z^2}} & \frac{-xz}{\sqrt{y^2+z^2}} & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}.
Both bases (v, u, w) and (ε_1, ε_2, ε_3) are orthonormal, therefore τ is an isometry (in fact it is even a rotation), hence M_τ^{−1} = M_τ^T (we postpone the proof of this last property till Section 4). It follows that the matrix M_{O_{v,α}} of a rotation around v can be expressed as the product M_{O_{v,α}} = M_τ^T · M_{O_{ε_1,α}} · M_τ. Thus
M_{O_{v,\alpha}} =
\begin{bmatrix}
x & 0 & \sqrt{y^2+z^2} & 0 \\
y & \frac{-z}{\sqrt{y^2+z^2}} & \frac{-xy}{\sqrt{y^2+z^2}} & 0 \\
z & \frac{y}{\sqrt{y^2+z^2}} & \frac{-xz}{\sqrt{y^2+z^2}} & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & \cos\alpha & -\sin\alpha & 0 \\
0 & \sin\alpha & \cos\alpha & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix}
x & y & z & 0 \\
0 & \frac{-z}{\sqrt{y^2+z^2}} & \frac{y}{\sqrt{y^2+z^2}} & 0 \\
\sqrt{y^2+z^2} & \frac{-xy}{\sqrt{y^2+z^2}} & \frac{-xz}{\sqrt{y^2+z^2}} & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}.
Computing the product and making use of the assumption x^2 + y^2 + z^2 = 1, one gets the desired formula. □
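The formula of Proposition 2.8 translates directly into code. A minimal sketch (Python with NumPy; the function name is ours):

    import numpy as np

    def rotation_about_axis(v, alpha):
        """Homogeneous 4x4 matrix of the rotation by alpha around the axis v."""
        v = np.asarray(v, dtype=float)
        x, y, z = v / np.linalg.norm(v)      # normalize, so Proposition 2.8 applies
        c, s = np.cos(alpha), np.sin(alpha)
        M = np.eye(4)
        M[:3, :3] = [
            [c + x*x*(1 - c),   x*y*(1 - c) - z*s, x*z*(1 - c) + y*s],
            [x*y*(1 - c) + z*s, c + y*y*(1 - c),   y*z*(1 - c) - x*s],
            [x*z*(1 - c) - y*s, y*z*(1 - c) + x*s, c + z*z*(1 - c)],
        ]
        return M

    # Sanity check: around the OZ axis it reduces to the matrix of Proposition 2.7.
    assert np.allclose(rotation_about_axis([0.0, 0.0, 1.0], 0.3),
                       np.array([[np.cos(0.3), -np.sin(0.3), 0, 0],
                                 [np.sin(0.3),  np.cos(0.3), 0, 0],
                                 [0, 0, 1, 0],
                                 [0, 0, 0, 1]]))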
Another proof of the above proposition is presented in Section 4 of Chapter 2. Now, take a rotation O_{v,α} around lin→(v) by an angle α. Clearly, the inverse of O_{v,α} is the rotation by −α around lin→(v) or equivalently the rotation by α around lin→(−v). Hence,

M_{O_{v,α}}^{−1} = M_{O_{v,−α}} = M_{O_{−v,α}} = M_{O_{v,α}}^T.

Example. Above we have discussed only rotations around axes passing through the origin. A rotation around a general axis can be expressed as the translation of the axis to the origin, followed by the rotation around the shifted axis, followed finally by the inverse translation. Take for example the line L passing through the points P = [0, 1, 0, 1]^T and Q = [1, 0, 1, 1]^T, which clearly misses the origin. We wish to find the matrix of the rotation by an angle π/4 in the CCW direction when looking along L from P towards Q. Denote by v̂ the difference P − Q = [−1, 1, −1, 0]^T. Normalizing we have v = v̂/‖v̂‖ = [−√3/3, √3/3, −√3/3, 0]^T, hence

L = [0, 1, 0, 1]^T + lin→([−√3/3, √3/3, −√3/3, 0]^T).
The translation by u = [0, −1, 0, 0]^T shifts P to the origin, and so the matrix of the sought rotation around L has the form:
M_{O_{L,\pi/4}} =
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\cdot
\begin{bmatrix}
\frac{\sqrt{2}+1}{3} & \frac{\sqrt{6}+\sqrt{2}-2}{6} & \frac{\sqrt{6}-\sqrt{2}+2}{6} & 0 \\
-\frac{\sqrt{6}-\sqrt{2}+2}{6} & \frac{\sqrt{2}+1}{3} & \frac{\sqrt{6}+\sqrt{2}-2}{6} & 0 \\
-\frac{\sqrt{6}+\sqrt{2}-2}{6} & -\frac{\sqrt{6}-\sqrt{2}+2}{6} & \frac{\sqrt{2}+1}{3} & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\cdot
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
=
\begin{bmatrix}
\frac{\sqrt{2}+1}{3} & \frac{\sqrt{6}+\sqrt{2}-2}{6} & \frac{\sqrt{6}-\sqrt{2}+2}{6} & -\frac{\sqrt{6}+\sqrt{2}-2}{6} \\
-\frac{\sqrt{6}-\sqrt{2}+2}{6} & \frac{\sqrt{2}+1}{3} & \frac{\sqrt{6}+\sqrt{2}-2}{6} & 1-\frac{\sqrt{2}+1}{3} \\
-\frac{\sqrt{6}+\sqrt{2}-2}{6} & -\frac{\sqrt{6}-\sqrt{2}+2}{6} & \frac{\sqrt{2}+1}{3} & \frac{\sqrt{6}-\sqrt{2}+2}{6} \\
0 & 0 & 0 & 1
\end{bmatrix}.
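This conjugation pattern (translate the axis to the origin, rotate, translate back) is how one typically implements a rotation about an arbitrary line. A sketch reusing the translation and rotation_about_axis helpers defined earlier (names ours):

    def rotation_about_line(p, q, alpha):
        """Rotation by alpha around the line through the 3-D points p and q,
        oriented as in the example above (axis direction p - q)."""
        axis = p - q
        to_origin = translation(-p[0], -p[1], -p[2])   # shift p to the origin
        back      = translation(p[0], p[1], p[2])      # and undo the shift
        return back @ rotation_about_axis(axis, alpha) @ to_origin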
Another important geometric transformation is a reflection with respect to a given plane. Let Π be a plane in R^3 containing the origin and given by the equation ax + by + cz = 0. Write this equation using homogeneous coordinates:
(2.9) \quad [x, y, z, 1] \cdot [a, b, c, 0]^T = 0.
The vector v = [a, b, c, 0]^T is perpendicular to Π. Without loss of generality, we may assume that it has unit length (i.e. a^2 + b^2 + c^2 = 1). Take a point P = [x, y, z, 1]^T ∈ R^3 and let P′ denote its orthogonal projection onto Π. Then P′ = P + tv for some t ∈ R. The point P′ lies on Π, hence its coordinates satisfy equation (2.9):

[x + ta, y + tb, z + tc, 1] \cdot [a, b, c, 0]^T = 0.
Solve it for t to get:
t = t \cdot \bigl([a, b, c, 0] \cdot [a, b, c, 0]^T\bigr) = -[x, y, z, 1] \cdot [a, b, c, 0]^T.
Thus t is the dot product of −−→PΘ and v, namely t = −(−−→ΘP • v), where Θ denotes the origin. The reflection through Π maps the point P to P + 2tv. Consequently we write the reflection as:

\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \mapsto \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} + 2t \begin{bmatrix} a \\ b \\ c \\ 0 \end{bmatrix} = \begin{bmatrix} (1-2a^2)x - 2aby - 2acz \\ -2abx + (1-2b^2)y - 2bcz \\ -2acx - 2bcy + (1-2c^2)z \\ 1 \end{bmatrix}.

Corollary 2.10. Let [a, b, c, 0]^T be the unit vector orthogonal to a plane Π ⊂ R^3 passing through the origin. The matrix of the reflection through Π is

M_{\sigma_\Pi} = \begin{bmatrix} 1-2a^2 & -2ab & -2ac & 0 \\ -2ab & 1-2b^2 & -2bc & 0 \\ -2ac & -2bc & 1-2c^2 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.

Corollary 2.11. The determinant of a reflection equals −1.
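Corollary 2.10 is again one line of linear algebra in code. A sketch (Python with NumPy; function name ours; we normalize the normal ourselves, since the corollary assumes unit length):

    import numpy as np

    def reflection(normal):
        """Homogeneous 4x4 matrix of the reflection through the plane
        through the origin with the given 3-D normal vector."""
        n = np.asarray(normal, dtype=float)
        n = n / np.linalg.norm(n)             # Corollary 2.10 assumes a unit normal
        M = np.eye(4)
        M[:3, :3] -= 2.0 * np.outer(n, n)     # I - 2 n n^T
        return M

    M = reflection([0.0, 0.0, 1.0])           # reflection through the OXY plane
    assert np.isclose(np.linalg.det(M), -1)   # Corollary 2.11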
3. Vectors and normals

This section will be reworked

In the previous section we learned how to map points, expressed in homogeneous coordinates, by affine endomorphisms. Now, let’s see what happens if we map vectors rather than points.

Observation 3.1. Let P, Q ∈ R^3 be two points and v := Q − P be a vector. Further let M ∈ M_{4,4}(R) be a matrix. Then Mv = MQ − MP.

The conclusion from the above observation is that vectors are transformed by exactly the same formulas as points. Let us note one special case.

Corollary 3.2. If M is a matrix of a translation and v = [x, y, z, 0]^T is a vector, then M · v = v.

The above corollary can be derived either directly from the preceding observation or by the following simple geometric reasoning: if v = −−→PQ for some points P, Q ∈ R^3 and T is a translation by a vector u, then Tv = (Q + u) − (P + u) = v.

Geometric objects, except the simplest ones, usually consist of infinitely (or even uncountably) many points. But to manipulate them effectively, using e.g. computers that have only limited memory, we need to represent them by some finite sets of parameters (we understand the term quite broadly here). Typically, we would use finite sets of points and/or vectors. For example, a polyhedron can be represented by a finite set of its faces, where each face (a polygon) is in turn represented by finitely many vertices (more on polyhedra representations may be found in Chapter 7). Denote by 𝒫(X) the class of all subsets of X and by Fin(X) the class of all finite subsets. A finite representation of a geometric object can thus be treated as a function 𝓕 : Fin(R^3) → 𝒫(R^3).

Definition 3.3. The representation 𝓕 : Fin(R^3) → 𝒫(R^3) is affinely invariant, if for every finite set {P_1, …, P_n} ⊂ R^3 and every affine endomorphism T one has

T(𝓕({P_1, …, P_n})) = 𝓕({TP_1, …, TP_n}).

The representation 𝓕 is projectively invariant, if the above holds for every projective map T.

In a nutshell, to transform an object represented in an affinely (resp. projectively) invariant way, all we need to do is to transform the points that define it. As a simple example, take a triangle represented by its three vertices. In order to find the image of this triangle under an affine map, it suffices to transform the vertices and reconstruct the triangle from their images. Most of the time we will be interested only in those representations of geometric objects that are at least affinely invariant. There is however one exception.
[Figure 1.3. Transformations of a normal and the vector representing it may differ.]
A normal² is an orthogonal pair of unit vectors (u, v); this means that u ⊥ v and ‖u‖ = ‖v‖ = 1. A normal N = (u, v) may be represented by the cross product u × v. It is worth emphasizing that, while normals are represented by vectors, they are not vectors themselves. They obey a different set of rules; in particular we will see that the representation of a normal by a vector is not affinely invariant.

Normals are used in 3D computer graphics for object shading. Interpolating normals across the edge between two faces of a polyhedron can create an illusion of a smooth shape. Local transformations of normals (so-called bump mapping and normal mapping) are used to simulate tiny details or a surface texture.

Let N = (u, v) be a normal represented by the cross product n = u × v. Consider a matrix M ∈ M_{4,4}(R). In general, after applying a transformation, the vector Mn is no longer orthogonal to lin(Mu, Mv), as can be observed for example in Figure 1.3. This means that normals need to be treated differently than points and vectors. Recall that every affine transformation is a composition of a linear endomorphism with a translation. As we know (see Corollary 3.2) the translation does not alter vectors. Hence, clearly, it should not change the normal. But what about the linear part? Suppose that M is a matrix of a linear endomorphism. By its construction n is orthogonal to both u and v, hence n • u = n • v = 0. Denote u′ := Mu, v′ := Mv. The vector n′ representing the image of the normal N must be orthogonal to both u′ and v′, so that n′ • u′ = 0 = n′ • v′. We would like to express it as n′ = M′n for some matrix M′, and so

0 = n′ • u′ = (M′ · n)^T · (M · u) = n^T · (M′^T · M) · u,
0 = n′ • v′ = (M′ · n)^T · (M · v) = n^T · (M′^T · M) · v.

If M is invertible, then u′, v′ are linearly independent, since u, v are orthogonal. Hence, the direction orthogonal to lin(u′, v′) is uniquely determined. It follows from the above equations that (M^{−1})^T · n is orthogonal to both u′ and v′. Thus we can take

M′ = k · (M^{−1})^T,

for some scalar k ∈ R selected in such a way that ‖n′‖ = 1.
²In some books, normals are called “normal vectors”. This may suggest that they are vectors of a special kind. But they are not. Hence we use the term “normal”. In the language of Clifford algebras, normals are bivectors.
Corollary 3.4. Let M ∈ M_{4,4}(R) be a matrix of a linear automorphism (in homogeneous coordinates). If a normal N is represented by a vector n, then (up to a scalar multiplication) the image N′ of N under this automorphism is represented by n′ = (M^{−1})^T · n.

Given a linear automorphism φ ∈ Aut R^3 of the space R^3 with the associated matrix M, we define its dual automorphism by the formula v ↦ (M^{−1})^T · v. Hence the previous corollary can be rephrased as:
Corollary 3.5. Up to a scalar multiplication, normals are transformed by the automorphism dual to the one used to transform vectors.
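In code, Corollary 3.4 amounts to one extra inverse-transpose. A sketch (Python with NumPy; helper name ours):

    import numpy as np

    def transform_normal(M, n):
        """Map the 3-vector n representing a normal by the automorphism with
        homogeneous matrix M, using the dual map (M^-1)^T of Corollary 3.4."""
        dual = np.linalg.inv(M).T
        image = dual[:3, :3] @ n
        return image / np.linalg.norm(image)   # re-normalize (the scalar k)

    # Under a non-uniform scaling the normal is *not* mapped like a vector:
    S = np.diag([2.0, 1.0, 1.0, 1.0])
    n = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
    print(transform_normal(S, n))   # ~[0.447, 0.894, 0], not ~[0.894, 0.447, 0]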
4. Isometries

Isometry preserves length

In this section we discuss a special class of linear maps that preserve lengths and angles. Such functions are called isometries. As we will see, one of the special properties of isometries is that they treat vectors and normals uniformly. We begin with the following basic lemma.
Lemma 4.1. The composition of two reflections in R^2 with respect to two non-parallel lines is the rotation around the intersection point of these two lines by the double angle between them.
Proof. Without loss of generality, we may assume that the two lines intersect at the origin. Let u, v be two unit vectors spanning these lines and denote by σ_u (respectively σ_v) the reflection across lin(u) (resp. lin(v)). The composition σ_v ∘ σ_u is a linear map. Hence, in order to prove the lemma, it suffices to show that this composition acts as a rotation on an arbitrary basis of R^2. The natural choice is the basis consisting of u and v. Rotate u and v by π/2 and denote the resulting vectors u′ and v′, respectively. Decompose the vectors u and u′ (see Figure 1.4) into
u = u_∥ + u_⊥, \quad u′ = u′_∥ + u′_⊥,

where u_∥, u′_∥ are parallel to v and u_⊥, u′_⊥ are orthogonal to v (hence parallel to v′). Similarly, write v, v′ as

v = v_∥ + v_⊥, \quad v′ = v′_∥ + v′_⊥,

with v_∥, v′_∥ parallel to u and v_⊥, v′_⊥ orthogonal to u (and parallel to u′). Elementary trigonometry tells us how to express the parallel and orthogonal parts in terms of the sine and cosine of α:

u_∥ = v cos α, \quad u_⊥ = −v′ sin α, \quad u′_∥ = v sin α, \quad u′_⊥ = v′ cos α,
v_∥ = u cos α, \quad v_⊥ = u′ sin α, \quad v′_∥ = −u sin α, \quad v′_⊥ = u′ cos α.

Now the reflection reverses the orthogonal part, leaving the parallel one intact, hence

σ_u(v) = v_∥ − v_⊥, \quad σ_u(v′) = v′_∥ − v′_⊥, \quad σ_v(u) = u_∥ − u_⊥, \quad σ_v(u′) = u′_∥ − u′_⊥.
[Figure 1.4. Orthogonal and parallel components of u and v.]
Compute the image of the first basis vector under the composition σ_v ∘ σ_u:

(σ_v ∘ σ_u)(v) = σ_v(v_∥ − v_⊥)
 = σ_v(u cos α − u′ sin α) = (u_∥ − u_⊥) cos α − (u′_∥ − u′_⊥) sin α
 = (v cos²α + v′ sin α cos α) − (v sin²α − v′ sin α cos α)
 = v cos(2α) + v′ sin(2α) = O_{2α}(v).

Similarly for the second vector:

(σ_v ∘ σ_u)(u) = σ_v(u) = u_∥ − u_⊥
 = v cos α + v′ sin α = (v_∥ + v_⊥) cos α + (v′_∥ + v′_⊥) sin α
 = u cos²α + u′ sin α cos α − u sin²α + u′ sin α cos α
 = u cos(2α) + u′ sin(2α) = O_{2α}(u).
Now, since σ_v ∘ σ_u is linear, it follows that for every vector w = xu + yv one has (σ_v ∘ σ_u)(w) = xO_{2α}(u) + yO_{2α}(v) = O_{2α}(w), as desired. □

Switching to three dimensions, let Π_1, Π_2 ⊂ R^3 be two distinct and non-parallel planes. Translating back and forth, without loss of generality we may assume that the intersection Π_1 ∩ Π_2 contains the origin and is spanned by some vector v (i.e. lin(v) = Π_1 ∩ Π_2). Let α be the (directed) angle between Π_1 and Π_2 and σ_1, σ_2 be the reflections in R^3 through Π_1 and Π_2, respectively. Take Π^⊥ to be the plane orthogonal to v and passing through the origin. An arbitrary vector w ∈ R^3 can be expressed as a sum w = w_∥ + w_⊥ with w_∥ ∈ lin(v) and w_⊥ ∈ Π^⊥. Neither σ_1 nor σ_2 alters the parallel part w_∥ of w, since it belongs to both Π_1 and Π_2. On the other hand, w_⊥ lies in Π^⊥ and the restrictions σ_1|_{Π^⊥}, σ_2|_{Π^⊥} are just reflections through two lines in a plane. The previous lemma asserts that (σ_1 ∘ σ_2)|_{Π^⊥} is a rotation by 2α. It follows that σ_1 ∘ σ_2 is a rotation by 2α around v.

Corollary 4.2. The composition of two reflections through non-parallel planes gives the rotation around their intersection by the double angle between them.

The determinant of a reflection equals −1 by Corollary 2.11. A rotation is a composition of two reflections, therefore:

Corollary 4.3. The determinant of a rotation equals 1.
Recall that a linear automorphism τ ∈ Aut(R^n) is called a linear isometry if it preserves the dot product in the sense that for every two vectors u, v ∈ R^n one has u • v = τ(u) • τ(v). In particular a linear isometry preserves norms (lengths) of vectors as well as measures of angles, since both these quantities can be expressed in terms of the dot product. An (affine) isometry is a composition of a linear isometry with a translation. An affine isometry preserves distances between points.

Theorem 4.4 (Cartan, Dieudonné). Every linear isometry (different from the identity) of R^n is a composition of ≤ n hyperplane reflections.

The proof of this theorem exceeds the scope of this book. An interested reader is referred to [16, Theorem I.7.1]. It follows from the theorem that there are only three types of linear isometries in R^3. These are: reflections, rotations (i.e. compositions of two reflections) and compositions of a rotation with a reflection. If we consider a composition of two rotations, then clearly it is an isometry. Its determinant equals 1, hence it must be a composition of an even number of reflections. It follows from the Cartan-Dieudonné theorem that it is a composition of just two reflections, hence a rotation again. Moreover, the inverse of a rotation is also a rotation. Therefore we have proved:

Theorem 4.5 (Euler). Rotations form a group.

Our next goal is to show that every rotation can be expressed as a composition of just three rotations around vectors from an arbitrary orthogonal basis of R^3. To this end, fix an orthogonal basis {u, v, w}. Without loss of generality, we may assume that all these vectors have unit lengths. Let R be an arbitrary rotation by an angle α around some vector ν ∈ R^3. Denote the images of the basis vectors by u′ := R(u), v′ := R(v) and w′ := R(w). Assume that u and w′ are not linearly dependent (if they are, just take another pair). Thus, there is a rotation O_u around u that takes w′ into the plane lin(u, w). Next, rotating around v, we map O_u(w′) to w. Denote this rotation by O_v. Now, u′ and v′ are perpendicular to w′, hence their images lie in lin(u, v). Thus one can find a rotation O_w around w sending them to u and v, respectively. Consequently, we have O_w ∘ O_v ∘ O_u = R^{−1} and so

R = O_u^{−1} ∘ O_v^{−1} ∘ O_w^{−1},

as required. Therefore, with a fixed orthogonal basis, every rotation can be expressed as an ordered triple of angles. These are known as the Euler angles.

Take a linear isometry τ : R^3 → R^3. The canonical basis {ε_1, ε_2, ε_3} is orthonormal, which means that

ε_i • ε_j = 1 if i = j, and 0 if i ≠ j.

An isometry preserves dot products, hence

τ(ε_i) • τ(ε_j) = 1 if i = j, and 0 if i ≠ j.

Thus, {τ(ε_1), τ(ε_2), τ(ε_3)} is an orthonormal basis as well. The matrix M of τ (in linear coordinates) has the form M = [τ(ε_1) | τ(ε_2) | τ(ε_3)].
Compute the product M^T · M. The entry in the i-th row and j-th column equals τ(ε_i)^T · τ(ε_j) = τ(ε_i) • τ(ε_j). It follows that M^T · M is the unit matrix. The matrices satisfying the condition M^T M = I (or equivalently M M^T = I) are called orthogonal. Observe that a geometric transform having an orthogonal matrix treats vectors and normals uniformly, since if M^T M = I, then (M^T)^{−1} = M. In other words: every isometry equals its own dual. Orthogonality is preserved once we switch to homogeneous coordinates, as

\begin{bmatrix} M & 0 \\ 0 & 1 \end{bmatrix}^T \cdot \begin{bmatrix} M & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} M^T & 0 \\ 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} M & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} M^T M & 0 \\ 0 & 1 \end{bmatrix} = I.

Observation 4.6. The matrix of a linear isometry is orthogonal.

Observation 4.7. The determinant of an isometry equals ±1.

The last observation follows either immediately from the Cartan-Dieudonné theorem or equivalently from Observation 4.6, since the determinant of an orthogonal matrix satisfies:

1 = det I = det(M^T M) = (det M)^2.

Hence det M = ±1. We may summarize the above results in a convenient criterion testing whether a given matrix represents a rotation.
Theorem 4.8. A matrix M ∈ M_{3,3}(R) represents a rotation if and only if M^T M = I and det M = 1.

Usually, after verifying that a given matrix represents a rotation, we wish to find its axis and angle. The axis is fixed by the rotation, hence it is spanned by an eigenvector corresponding to the eigenvalue 1.
Observation 4.9. If M ∈ M_{3,3}(R) represents a rotation, then it has an eigenvalue 1 and the corresponding eigenvector generates the axis of the rotation.

Although it is possible to find the axis using the previous observation, there exists a more direct and far simpler approach. Examining the matrix presented in Proposition 2.8, we arrive at the following conclusion:

Observation 4.10. If M is a matrix (in homogeneous coordinates) of a rotation by an angle α around a unit vector v = [x, y, z, 0]^T, then the trace of M equals Tr M = 2 + 2 cos α and

M - M^T = \begin{bmatrix} 0 & -2z\sin\alpha & 2y\sin\alpha & 0 \\ 2z\sin\alpha & 0 & -2x\sin\alpha & 0 \\ -2y\sin\alpha & 2x\sin\alpha & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}.

Thus, the angle of the rotation may be retrieved from the trace. Once we know the angle, we recover the coordinates of the vector spanning the axis of the rotation from the difference M − M^T.
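Observation 4.10 gives a simple recipe for recovering the axis and angle, which the example below also follows. A sketch (Python with NumPy; function name ours; a production implementation would need extra care near α ≈ 0 and α ≈ π, where sin α vanishes):

    import numpy as np

    def axis_and_angle(M):
        """Recover (axis, angle) of a 3x3 rotation matrix via Observation 4.10."""
        alpha = np.arccos((np.trace(M) - 1.0) / 2.0)   # in 3x3 form Tr M = 1 + 2 cos(alpha)
        D = M - M.T                                    # the antisymmetric part
        s = 2.0 * np.sin(alpha)
        axis = np.array([D[2, 1], D[0, 2], D[1, 0]]) / s
        return axis, alpha

Fed the matrix of the example below, it returns the axis ≈ (1/2, −√2/2, 1/2) and the angle ≈ π/4.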
Example. Consider a matrix M written (in linear coordinates) as

M = \begin{pmatrix} 0.78 & -0.46 & -0.43 \\ 0.25 & 0.85 & -0.46 \\ 0.57 & 0.25 & 0.78 \end{pmatrix}.

A direct computation shows that M^T · M ≈ I and det M ≈ 1. Hence, M represents a rotation. The trace Tr M of M equals 2.41 and so the angle of rotation is

α = arccos((Tr M − 1)/2) ≈ 0.79 ≈ π/4.

To find the axis, compute the difference between M and its transpose. We have

M - M^T ≈ \begin{pmatrix} 0.00 & -0.71 & -1.00 \\ 0.71 & 0.00 & -0.71 \\ 1.00 & 0.71 & 0.00 \end{pmatrix}.

It follows that

x ≈ (0.5 · 0.71)/sin α ≈ 1/2, \quad y ≈ −(0.5 · 1.0)/sin α ≈ −√2/2, \quad z ≈ (0.5 · 0.71)/sin α ≈ 1/2.

All in all, M is the matrix of the rotation by π/4 around lin((1/2, −√2/2, 1/2)^T).

5. Transformation decomposition

In the previous sections we encountered the matrices of some commonly used affine maps and we saw how the composition of geometric transformations translates to the multiplication of the associated matrices. In this section we partially revert this operation. Our goal is to decompose a given endomorphism (matrix) into a product of elementary transformations (matrices). The main application is the interpolation of transformations as a function of time, used in computer animation. We discuss this subject in Section 4 of the next chapter and Section 1 of Chapter 5. As we will see, it is the rotations that require special treatment during interpolation. Thus, our goal is to split off the rotation from a given transform.

In general, given a composite endomorphism T one can express it as a product T_n ∘ ⋯ ∘ T_1 of elementary transformations in more than one way. In fact in infinitely many ways! Therefore, it is impossible to decompose a product T = T_n ∘ ⋯ ∘ T_1 into its factors T_1, …, T_n knowing only the matrix of T. Nevertheless, it is still possible to find some decomposition of T which is convenient for interpolation and somehow coherent with human vision (see the discussion in [22]). Suppose that we are given the matrix of some affine endomorphism expressed in homogeneous coordinates as

M = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \\ 0 & 0 & 0 & 1 \end{bmatrix}.

We can write it as a product

M = \begin{bmatrix} 1 & 0 & 0 & m_{14} \\ 0 & 1 & 0 & m_{24} \\ 0 & 0 & 1 & m_{34} \\ 0 & 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} m_{11} & m_{12} & m_{13} & 0 \\ m_{21} & m_{22} & m_{23} & 0 \\ m_{31} & m_{32} & m_{33} & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.

Here, the first term is the matrix of a translation, while the second one represents a linear map. Now, after isolating the linear part, we concentrate on splitting off the rotation. We do it by a polar decomposition. To this end, we first recall the notion of the singular value decomposition (SVD in short), which is well known in the realm of numerical analysis.
Theorem 5.1. If M ∈ GL_n(R) is an invertible matrix, then there are orthogonal matrices U, V and a diagonal matrix D such that M = U · D · V^T.

We omit the proof, referring the reader to [6]. Take now an invertible matrix M ∈ GL_3(R). Using SVD, we write it as a product M = U · D · V^T. Denote Q(M) := U · V^T and S := V · D · V^T. Now, V is orthogonal, hence V^T = V^{−1}. Further, D is diagonal, so it is a matrix of a non-uniform scaling (with respect to the canonical basis). Therefore, the composition S = V · D · V^T = V · D · V^{−1} represents a stretching, i.e. a non-uniform scaling with respect to some orthonormal basis (namely the basis consisting of the columns of V). On the other hand, Q(M) is a product of orthogonal matrices, hence it is orthogonal itself. Consequently, we can write M in the form:

M = U · D · V^T = (U · V^T) · (V · D · V^T) = Q(M) · S.

This is called a polar decomposition of M. Unfortunately, all we know is that Q(M) is orthogonal. It does not have to represent a rotation, since it may have a negative determinant. If this is the case we need to alter S (we are lucky that we are working in an odd-dimensional space R^3).
Corollary 5.2. Let M ∈ M_{3,3}(R) be a matrix of a linear automorphism of the space R^3. Then M admits a decomposition

M = R · (±S),

where R = ±Q(M) represents a rotation and S a stretching.

Robust numerical algorithms for SVD can be found in the literature on numerical analysis. Any of them can be used to find the above decomposition. There is however a direct recursive algorithm for the polar decomposition of an invertible matrix. In order to present it, we need to introduce a norm on the space of matrices, giving a meaning to the notion of the limit of a sequence of matrices.
Definition 5.3. The Frobenius norm of a matrix A = [a_ij] ∈ M_{n,n}(R) is defined by the formula

\|A\| := \sqrt{\sum_{1 \le i,j \le n} a_{ij}^2}.
Take now an invertible matrix M ∈ M_{3,3}(R) and define a sequence (Q_i | i ∈ N_0) of matrices in a recursive manner, starting from Q_0 := M and setting
Q_{i+1} := \frac{1}{2}\bigl(Q_i + (Q_i^{-1})^T\bigr).

Readers familiar with numerical methods will easily recognize that this is a matrix analog of an algorithm, due to Heron of Alexandria, for finding a square root.
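The iteration is short enough to implement directly. A sketch (Python with NumPy; names and stopping criterion are ours):

    import numpy as np

    def polar_rotation(M, tol=1e-12, max_iter=100):
        """Approximate Q(M) for an invertible matrix M via the Heron-like
        iteration Q_{i+1} = (Q_i + (Q_i^{-1})^T) / 2."""
        Q = np.asarray(M, dtype=float)
        for _ in range(max_iter):
            Q_next = 0.5 * (Q + np.linalg.inv(Q).T)
            if np.linalg.norm(Q_next - Q) < tol:   # Frobenius norm, Definition 5.3
                return Q_next
            Q = Q_next
        return Q

    M = np.array([[2.34, -0.48, -0.36],
                  [0.75,  1.69, -0.94],
                  [1.72,  0.10,  0.87]])
    Q = polar_rotation(M)
    S = M @ Q.T                    # the stretching factor, S = M * Q^{-1} = M * Q^T
    print(np.round(Q, 2))          # the rotation part, cf. the example below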
Theorem 5.4. The sequence (Q_i | i ∈ N_0) is well defined (providing that M is invertible) and converges to Q(M) = lim_{i→∞} Q_i.

Example. Consider the following matrix

M := \begin{pmatrix} 2.34 & -0.48 & -0.36 \\ 0.75 & 1.69 & -0.94 \\ 1.72 & 0.10 & 0.87 \end{pmatrix}.
Compute the limit of the sequence (Q_i | i ∈ N_0) defined above to get

Q(M) = \lim_{i\to\infty} Q_i = \begin{pmatrix} 0.78 & -0.46 & -0.43 \\ 0.25 & 0.85 & -0.46 \\ 0.57 & 0.25 & 0.78 \end{pmatrix}.

As we remember from the previous section, Q(M) represents a rotation. Find S writing

S = M · Q^{-1} = M · Q^T = \begin{pmatrix} 2.25 & 0.21 & 0.93 \\ 0.21 & 2.06 & 0.12 \\ 0.93 & 0.12 & 1.69 \end{pmatrix}.

Implementation notes
Missing
Exercises

(1) Prove Corollary 4.2 using the matrix forms of a rotation and a reflection.
(2) Give a direct proof of Corollary 4.3.
(3) Check that the set of orthogonal 3 × 3 matrices forms a group under multiplication. Show that this group is isomorphic to the group of linear isometries (with composition).
(4) Does Observation 4.6 remain true for affine isometries?
(5) Let M ∈ GL_3(R) be the matrix of a rotation. By Observation 4.9 it has an eigenvalue 1. Show that it has no other real eigenvalues. Show that it has exactly two complex eigenvalues. Give a geometric interpretation of these eigenvalues.

Bibliographic notes

Projective spaces sometimes appear in more advanced courses of linear algebra and geometry. For example [15] gives a smooth introduction to projective geometry. Computer graphics has used homogeneous coordinates from its very beginning. Matrix representation of affine transforms is discussed in classical sources like [8]. A more rigorous (from the mathematical viewpoint) treatment can be found in the unpublished book [7]. There is also an alternative approach to introducing homogeneous coordinates to computer graphics, which uses Grassmann spaces. We refer the reader to [10] for details. On the other hand, a reader interested in a more in-depth discussion of isometries is wholeheartedly encouraged to look into the literature on bilinear algebra like [16, 26]. In particular, [16] contains a detailed proof of the Cartan-Dieudonné theorem. Matrix decomposition is a classical subject of numerical analysis. The singular value decomposition is discussed for example in [6], where the reader can find a complete proof of Theorem 5.1. The idea to use the polar decomposition in this context and Theorem 5.4 come from [22].
CHAPTER 2
Projective space and quaternions
“Positing infinity, the rest is easy.” — R. Zelazny
“Infinity itself looks flat and uninteresting.” — D. Adams

In this chapter we discuss a few more advanced subjects. The first three sections provide the reader with a glimpse of projective geometry. In Section 1 we show how to construct a projective transformation that sends a set of n+2 points in a general position¹ to another n+2 points in a general position. We use this construction in the following section to define a perspective projection. Next, in Section 3 we show how to uniformly assign coordinates to points, lines and planes in the three-dimensional space and how to use these coordinates to check the relative position of two objects (e.g. line-line or line-triangle intersections).

In the previous chapter we saw that all the affine transformations may be represented by matrix multiplication. However, for rotations there is a more convenient representation, which is better suited for interpolation (animation). This is the subject of the last section.
1. Projective maps

Recall that at the beginning of the first chapter we defined a projective space and the projective/homogeneous coordinates. In this section we show how to define a map on a projective space preserving its structure and how to design a projective map with desired properties. First, however, we need to introduce projective subspaces. As we know, the projective coordinates of a point in P^n R are not unique. Two (n+1)-tuples correspond to the same point when they differ by a non-zero multiplicative factor. Hence, the only linear equations that make sense in a projective space are of the form:
a_0 x_0 + a_1 x_1 + ⋯ + a_n x_n = 0,

since they are insensitive to multiplication by non-zero scalars. They are called homogeneous (linear) equations. A projective subspace is just the set of zeros of a system of homogeneous linear equations. Strictly speaking:
¹A “general position” is a euphemism that geometers use to mean “the points satisfy whatever assumptions we need, but we don’t want to spend time on them here”.
Definition 1.1. Let M = [m_ij] be a matrix with n+1 columns and d linearly independent rows. The set

V = { ⟦x_0 : … : x_n⟧ ∈ P^n R | M · [x_0, …, x_n]^T = Θ }

is an (n−d)-dimensional projective subspace of P^n R defined by M.

By convention a 1-dimensional projective subspace is called a (projective) line and a 2-dimensional subspace is a (projective) plane. Thus a line in the projective plane P^2 R is the set of projective points ⟦x : y : w⟧ satisfying an equation ax + by + cw = 0 for some a, b, c not all equal to zero. The coefficients a, b, c defining the line are unique only up to multiplication by a non-zero scalar, exactly like the coordinates of a projective point. We will investigate this analogy in Section 3. Likewise an equation ax + by + cz + dw = 0 defines a plane in the projective space P^3 R, while a line in P^3 R is given by a system of two simultaneous linear equations:

a_1 x + b_1 y + c_1 z + d_1 w = 0,
a_2 x + b_2 y + c_2 z + d_2 w = 0.

Every affine subspace of R^n extends uniquely to a projective subspace of P^n R by adjoining the point(s) at infinity corresponding to all the directions of the lines contained in this subspace. Take for example an affine line l ⊂ R^2 defined by the equation ax + by + c = 0. Homogenizing² the equation we get ax + by + cw = 0. This defines a projective line L. One checks that L ∖ l consists of a single point at infinity, namely ⟦−b : a : 0⟧. Observe that this point does not depend on c, hence it is the same for every translation of l.

The projective subspaces have one striking property. Take two affine lines in the plane R^2. If they are not parallel, then they intersect already in the affine plane, hence also in the projective plane. But if they are parallel, they do not intersect in R^2. Nevertheless, when extending to the projective lines, we adjoin to both of them the same point at infinity, since being parallel they have the same direction. Consequently, they do intersect in the projective plane P^2 R.

Observation 1.2. Every two lines in the projective plane intersect.
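Observation 1.2 has a pleasantly computational face. It is a standard fact (an instance of the duality discussed in Section 3, not derived here) that the intersection point of the lines with coefficient triples (a_1, b_1, c_1) and (a_2, b_2, c_2) has homogeneous coordinates given by the cross product of the two triples. A sketch (Python with NumPy; function name ours):

    import numpy as np

    def line_intersection(l1, l2):
        """Homogeneous coordinates of the intersection point of two
        projective lines in P^2(R), given by their coefficient triples."""
        p = np.cross(l1, l2)
        if not p.any():
            raise ValueError("the two triples describe the same line")
        return p

    # Two parallel affine lines, x + y = 0 and x + y - 1 = 0, ...
    print(line_intersection([1.0, 1.0, 0.0], [1.0, 1.0, -1.0]))
    # ... meet at the point at infinity [-1 : 1 : 0].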
After the above apparent digression we can concentrate on the main subject of this section: projective maps. Recall that in Chapter 1 we defined an equivalence relation ∼ on R^{n+1} saying that two vectors are related when they are linearly dependent.

Lemma 1.3. Let M ∈ M_{n+1,n+1}(R) be a matrix and u, v ∈ R^{n+1} two vectors. If u ∼ v, then Mu ∼ Mv.

Proof. If u and v are related, then there is a non-zero scalar c ∈ R satisfying u = cv, hence Mu = c(Mv). □

In a fancier language, the above lemma says that every linear endomorphism of R^{n+1} factors through ∼.

Definition 1.4. Let M ∈ GL_{n+1}(R) be an invertible matrix. The map P^n R → P^n R defined as P ↦ M · P is called the projective automorphism of P^n R associated with M.
²In this case, by “homogenization” we mean the multiplication of the constant term by the new variable w. For a general notion of homogenization, refer to Chapter 6.
Thus, the projective automorphisms are just the linear automorphisms (in one more dimension) factored through the relation ∼. The reason we begin with automorphisms is that a non-invertible (linear) endomorphism has a non-trivial kernel. Therefore, if M ∈ M_{n+1,n+1}(R) is a non-invertible matrix, then there are vectors mapped to the origin when multiplied by M. But the origin was explicitly excluded in the definition of a projective space. There are no projective points with all-zero coordinates! This brings us back to the projective subspaces introduced above—the points where the multiplication by M fails to give a proper result form a subspace which we need to omit when defining a general projective map.
Definition 1.5. Let M ∈ M_{n+1,n+1}(R) be a non-zero matrix and V be the associated subspace. The map P^n R ∖ V → P^n R defined by the formula P ↦ M · P is called the projective endomorphism of P^n R associated with M.

It follows that the projective automorphisms are distinguished among projective endomorphisms not only by the fact that they are invertible but also that they are the only endomorphisms defined on the entire projective space P^n R.

Example. Consider the projective automorphism τ : P^2 R → P^2 R associated with the matrix

\begin{pmatrix} 4 & 0 & 0 \\ 0 & 8 & 0 \\ 0 & 1 & 4 \end{pmatrix}.

Take the following four points:

A = ⟦−2 : 0 : 1⟧, \quad B = ⟦−2 : 4 : 1⟧, \quad C = ⟦2 : 0 : 1⟧, \quad D = ⟦2 : 4 : 1⟧.

The line passing through A and B is parallel to the line passing through C and D. Compute their images under τ:
A′ := τ(A) = ⟦−2 : 0 : 1⟧ = A, \quad B′ := τ(B) = ⟦−1 : 4 : 1⟧,
C′ := τ(C) = ⟦2 : 0 : 1⟧ = C, \quad D′ := τ(D) = ⟦1 : 4 : 1⟧.

Now, the lines passing through A′ and B′, respectively C′ and D′, are no longer parallel (see Figure 2.1). Thus τ is definitely not affine (this can also be observed by just looking at the last row of the matrix). In fact the two lines intersect at the point E = ⟦0 : 8 : 1⟧, while the original lines, being parallel, intersect at the point at infinity ⟦0 : 1 : 0⟧. Thus τ maps a point at infinity to the point E at a finite distance and vice versa—the point ⟦0 : −4 : 1⟧ is sent to infinity.

An important lesson from the above example is that, although projective morphisms preserve co-linearity, they do not need to preserve parallelism. Now, we should learn how to design a projective map bending a space to our wish. To this end we need the following definition.

Definition 1.6. Points P_1, P_2, …, P_{n+2} ∈ P^n R are in a general position, if no proper projective subspace contains n+1 of them.

Take two sets of projective points P_1, …, P_{n+2} ∈ P^n R and Q_1, …, Q_{n+2} ∈ P^n R, both in general position. Our goal is to construct a projective isomorphism τ : P^n R → P^n R sending P_i to Q_i for every 1 ≤ i ≤ n+2. Fix vectors u_1, …, u_{n+2}, v_1, …, v_{n+2} ∈ R^{n+1} in the (n+1)-dimensional space in such a way that the equivalence class of u_i (with respect to the relation ∼ defined in Chapter 1) is P_i and the class of v_i equals Q_i.
[Figure 2.1. A projective map may fail to preserve parallelism.]
Points P_1, …, P_{n+2} are in general position, hence the vectors u_1, …, u_{n+1} are linearly independent. Thus, they form a basis of the space R^{n+1}. Consequently the matrix

A = [u_1 | … | u_{n+1}]

is invertible (as usual, vectors are written as columns). Its inverse A^{−1} sends u_1, …, u_{n+1} onto the canonical basis {ε_1, …, ε_{n+1}} ⊂ R^{n+1}. Similarly, the vectors v_1, …, v_{n+1} form another basis; writing them side-by-side we construct another matrix

B = [v_1 | … | v_{n+1}],

which again is invertible. The product B · A^{−1} maps the basis {u_1, …, u_{n+1}} to {v_1, …, v_{n+1}}. Therefore, in the projective space P^n R the multiplication by B · A^{−1} transforms P_1, …, P_{n+1} to Q_1, …, Q_{n+1}. But this still leaves us with the unmatched pair P_{n+2} and Q_{n+2}! Fortunately, we have one more degree of freedom introduced by the relation ∼. Write the vector u_{n+2} as a linear combination of its peers u_1, …, u_{n+1}:

u_{n+2} = \sum_{i=1}^{n+1} \alpha_i u_i.
Do the same with v_{n+2}:

v_{n+2} = \sum_{i=1}^{n+1} \beta_i v_i.
Now, observe that the assumption that the points P_1, …, P_{n+2} are in a general position implies that all α_i’s are non-zero. Indeed, if some α_i = 0, then u_{n+2} belongs to the hyperplane lin(u_1, …, u_{i−1}, u_{i+1}, …, u_{n+1}) spanned by all the vectors except u_i. Consequently P_1, …, P_{i−1}, P_{i+1}, …, P_{n+1}, P_{n+2} lie in a common proper subspace—contrary to the assumption. Define γ_i = β_i/α_i for every 1 ≤ i ≤ n+1 and let Γ denote the diagonal matrix with entries γ_1, …, γ_{n+1}. Take M := B · Γ · A^{−1}; then for every 1 ≤ i ≤ n+1 we have

M u_i = B · Γ · (A^{−1} u_i) = B · (Γ ε_i) = γ_i (B ε_i) = γ_i v_i ∼ v_i

and for the last vector:

M u_{n+2} = B Γ A^{-1} \Bigl( \sum_{i=1}^{n+1} \alpha_i u_i \Bigr) = \sum_{i=1}^{n+1} \alpha_i \gamma_i v_i = \sum_{i=1}^{n+1} \beta_i v_i = v_{n+2}.
It follows that the multiplication by M is an automorphism of P^n R sending every P_i to the corresponding point Q_i. Therefore we have proved the following important theorem:
Theorem 1.7. If {P_1, …, P_{n+2}} and {Q_1, …, Q_{n+2}} are two sets of points in the projective space P^n R, both in general position, then there is an automorphism τ of P^n R such that τ(P_i) = Q_i for every 1 ≤ i ≤ n+2.

The class of projective maps is quite broad. If we consider affine maps instead, which are not so generic, we are free to choose only n+1 points in general position (in the affine world, this term means that they don’t all belong to a common proper subspace, or in other words that they span the whole space). Moreover, if we consider linear maps only, we need to drop yet one more degree of freedom.
Proposition 1.8. Let {P_1, …, P_{n+1}} and {Q_1, …, Q_{n+1}} be two sets of points in the affine space R^n and assume that P_1, …, P_{n+1} do not belong to any common proper affine subspace. Then there is an affine map τ : R^n → R^n satisfying τ(P_i) = Q_i for every 1 ≤ i ≤ n+1.

If, in addition, Q_1, …, Q_{n+1} do not belong to any common proper subspace of R^n either, then τ is an affine automorphism.
Proposition 1.9. Let {u_1, …, u_n} and {v_1, …, v_n} be two sets of vectors in the linear space R^n and assume that u_1, …, u_n are linearly independent. Then there is a linear map τ : R^n → R^n satisfying τ(u_i) = v_i for every 1 ≤ i ≤ n.

If, in addition, v_1, …, v_n are linearly independent too, then τ is a linear automorphism.
The latter of these two propositions is a standard result taught in every elementary course of linear algebra. We recalled it only to emphasize the connection with Theorem 1.7. The proof of the theorem can be easily adapted to show either of these two propositions. We leave the details to the reader.
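The proof of Theorem 1.7 is constructive, and the construction is worth seeing in code. A sketch (Python with NumPy; function name ours), which is also the workhorse of the perspective-correction example below:

    import numpy as np

    def projective_map(ps, qs):
        """Matrix of the projective automorphism of P^n(R) sending the n+2
        points ps[i] to qs[i] (each given by its n+1 homogeneous coordinates)."""
        A = np.column_stack(ps[:-1])            # u_1, ..., u_{n+1} as columns
        B = np.column_stack(qs[:-1])            # v_1, ..., v_{n+1} as columns
        alpha = np.linalg.solve(A, ps[-1])      # u_{n+2} = sum alpha_i u_i
        beta  = np.linalg.solve(B, qs[-1])      # v_{n+2} = sum beta_i  v_i
        Gamma = np.diag(beta / alpha)           # gamma_i = beta_i / alpha_i
        return B @ Gamma @ np.linalg.inv(A)

Feeding it the four point pairs of the example below reproduces (up to a scalar factor) the matrix M computed there.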
Remark. General projective maps of the plane are often used in photo-editing software for a perspective correction. The interface is usually designed in such a way that a user may freely move the four corners of an image distorting the picture accordingly. This is just a straightforward implementation of Theorem 1.7.
Example. Suppose we are given a photo taken with a camera tilted upward (see Figure 2.10). It exhibits an undesirable perspective distortion. Assume that the picture coordinates were normalized so that it occupies the unit square [0, 1] × [0, 1]. Pick two line segments in the photo that are vertical in reality and denote their endpoints by P_1, …, P_4. Say
P_1 = \begin{pmatrix} 0.092 \\ 0.049 \end{pmatrix}, \quad P_2 = \begin{pmatrix} 0.162 \\ 0.54 \end{pmatrix}, \quad P_3 = \begin{pmatrix} 0.722 \\ 0.54 \end{pmatrix}, \quad P_4 = \begin{pmatrix} 0.771 \\ 0.049 \end{pmatrix}.
In order to correct the distortion, we are going to map the above points to
Q_1 = \begin{pmatrix} 0.092 \\ 0.029 \end{pmatrix}, \quad Q_2 = \begin{pmatrix} 0.092 \\ 0.45 \end{pmatrix}, \quad Q_3 = \begin{pmatrix} 0.771 \\ 0.45 \end{pmatrix}, \quad Q_4 = \begin{pmatrix} 0.771 \\ 0.029 \end{pmatrix}.
To this end, we build the three matrices A, B and Γ as explained above. We have
A = \begin{pmatrix} 0.092 & 0.162 & 0.722 \\ 0.049 & 0.54 & 0.54 \\ 1.0 & 1.0 & 1.0 \end{pmatrix}, \quad
B = \begin{pmatrix} 0.092 & 0.092 & 0.771 \\ 0.029 & 0.45 & 0.45 \\ 1.0 & 1.0 & 1.0 \end{pmatrix},

\Gamma = \begin{pmatrix} 1.0 & 0.0 & 0.0 \\ 0.0 & 0.825 & 0.0 \\ 0.0 & 0.0 & 0.825 \end{pmatrix}.
Thus, the matrix of the sought projective map is
M = B \cdot \Gamma \cdot A^{-1} = \begin{pmatrix} 1.0 & -0.175 & 0.009 \\ 0.0 & 0.697 & -0.005 \\ 0.0 & -0.357 & 1.017 \end{pmatrix} \sim \begin{pmatrix} 0.983 & -0.172 & 0.008 \\ 0.0 & 0.685 & -0.005 \\ 0.0 & -0.351 & 1.0 \end{pmatrix}.

The bottom of Figure 2.10 illustrates the effect of applying this transformation to the original picture.
2. Projections and perspective

In computer graphics, the ultimate result of one’s work is (most of the time) a flat picture—even though it may have been designed using three-dimensional tools. This is, of course, caused by the presentation technology—an image printed on paper or displayed on a computer screen is as flat as the medium used. Nevertheless, it still should make an impression of having depth. Over centuries, artists, mostly painters, developed a number of tricks used to simulate the depth of a scene. These include: making distant objects smaller with one-point, two-point or three-point perspective projection, blurring the background to simulate an out-of-focus image, and painting distant objects dimmed and hazed. In this section we concentrate on the first of these techniques (i.e. perspective projection) as it is a strictly geometric subject.

The simplest way to project a three-dimensional scene onto a flat viewport is just to drop one coordinate. Assume that we look at a scene in R^3 along the OZ axis. The orthogonal projection R^3 → R^2 sends (x, y, z)^T ↦ (x, y)^T. Observe that in this case only the direction of the projection matters; the distance from the observer is unimportant. Later we will see that the orthogonal projection is a limit case of perspective projections, when the focal length approaches infinity. In practice, the size of the viewport is limited by the medium used for the presentation (e.g. a computer screen). Without loss of generality we may assume the viewport to be a square [−1, 1] × [−1, 1] ⊂ R^2. Thus in order to correctly display a scene not contained inside [−1, 1] × [−1, 1] × R or viewed from a direction other than OZ, we must first transform it appropriately.
Example. Consider a scene contained in the rectangular box [−10, 50] × [0, 20] × [10, 30], with the observer standing at P = (120, 60, 0)ᵀ and looking at the point Q = (20, 10, 50)ᵀ with the "up" direction preserved (in other words, the camera is yawed and pitched but not rolled). The idea is to translate the scene so that P is shifted to the origin, then align the viewing direction with OZ and finally scale the scene uniformly to fit the projection into the given viewport. The first part is trivial. The matrix of the translation in homogeneous coordinates is

  T = [ 1 0 0 −120 ]
      [ 0 1 0  −60 ]
      [ 0 0 1    0 ]
      [ 0 0 0    1 ].

For the second step, form an orthogonal basis {u, v, w}, with w given by the viewing direction, and then map it to the canonical basis {ε1, ε2, ε3}. The viewing direction is the vector PQ = Q − P = (−100, −50, 50)ᵀ, so after normalization:

  w = (Q − P)/‖Q − P‖ ≈ (−0.82, −0.41, 0.41)ᵀ.

Next, find the vector u. The camera was not rolled, so u is orthogonal to w and to ε2 at the same time. Forming the cross product we have:

  u = (w × ε2)/‖w × ε2‖ ≈ (−0.45, 0, −0.89)ᵀ.

The last vector must span the orthogonal complement of both u and w, hence

  v = u × w ≈ (−0.36, 0.91, 0.18)ᵀ.

We do not need to normalize it, as the cross product of two orthogonal unit vectors always has unit length. The matrix (in linear coordinates) of the automorphism sending the canonical basis {ε1, ε2, ε3} to {u, v, w} equals

  A = (u, v, w) ≈ [ −0.45  −0.36  −0.82 ]
                  [  0      0.91  −0.41 ]
                  [ −0.89   0.18   0.41 ].

Thus, its inverse A⁻¹ defines a map sending {u, v, w} to {ε1, ε2, ε3}. Now, the basis {u, v, w} is orthonormal, hence A⁻¹ = Aᵀ, as observed in Section 4 of Chapter 1. Switching to homogeneous coordinates and combining the result with the translation matrix computed earlier, we get the desired transformation:

  M = [ uᵀ 0 ]
      [ vᵀ 0 ] · T ≈ [ −0.45   0     −0.89   54 ]
      [ wᵀ 0 ]       [ −0.37   0.91   0.18  −11 ]
      [ 0 0 0 1 ]    [ −0.82  −0.41   0.41  122 ]
                     [  0      0      0       1 ].

Multiply the vertices of the bounding box of the scene by the above matrix to see that the transformed scene ranges in the X direction from about 4.5 to about 49, in the Y direction from about −27 to about 16, and in the Z direction from 78 to 143. Consequently, the scene must be uniformly scaled by a factor of roughly 1/145 ≈ 0.007 to keep its projection inside the viewport [−1, 1] × [−1, 1].

The orthogonal projection is not very persuasive when it comes to representing depth. It is used mostly for technical drawings, since it preserves information about length ratios of parallel line segments.
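The camera-alignment part of this example is easy to wrap into a small routine. The following sketch (look_at is our own name; graphics libraries offer analogous, but not identical, helpers) assumes a non-rolled camera whose viewing direction is not parallel to the up vector:

  import numpy as np

  def look_at(eye, target):
      # Translate 'eye' to the origin and rotate the viewing direction
      # onto OZ, assuming the camera is yawed and pitched but not rolled.
      eye, target = np.asarray(eye, float), np.asarray(target, float)
      w = target - eye
      w /= np.linalg.norm(w)              # viewing direction
      u = np.cross(w, [0.0, 1.0, 0.0])    # orthogonal to w and to 'up'
      u /= np.linalg.norm(u)
      v = np.cross(u, w)                  # already a unit vector
      M = np.eye(4)
      M[:3, :3] = np.stack([u, v, w])     # rows u, v, w: the matrix A transposed
      M[:3, 3] = M[:3, :3] @ -eye         # fold in the translation
      return M

  print(np.round(look_at([120, 60, 0], [20, 10, 50]), 2))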
Figure 2.2. Oblique projection of a unit box.

A slight modification of the orthogonal projection, known as the oblique projection, is defined by the following formula (in linear coordinates):

  (x, y, z)ᵀ ↦ ( x + d·z·cos α, y + d·z·sin α )ᵀ,

for some parameters d ∈ (0, 1) and α ∈ (0, 2π). The typical values are d = √2/2 and α = π/4. The oblique projection leaves the OX and OY axes intact and maps the OZ axis to the line inclined to OX at the angle α, scaling it by the factor d at the same time. Figure 2.2 shows the result of this map. The oblique projection can be obtained by a non-uniform scaling composed with a shear transform, followed by an orthogonal projection. The following matrix can be applied to transform the scene in such a way that the subsequent orthogonal projection results in the oblique one:

  [ 1 0 d·cos α 0 ]
  [ 0 1 d·sin α 0 ]
  [ 0 0 d       0 ]
  [ 0 0 0       1 ].

Quite like the orthogonal projection, the oblique one is mostly used for technical illustrations.
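In code the oblique projection amounts to a single constant matrix; a small sketch (our own helper, with the typical parameter values as defaults):

  import numpy as np

  def oblique_matrix(d=np.sqrt(2) / 2, alpha=np.pi / 4):
      # After multiplying the scene by this matrix, the plain orthogonal
      # projection (dropping z) produces the oblique projection.
      return np.array([[1, 0, d * np.cos(alpha), 0],
                       [0, 1, d * np.sin(alpha), 0],
                       [0, 0, d,                 0],
                       [0, 0, 0,                 1]])

  # The tip of the OZ unit vector lands on the line inclined at alpha to OX:
  p = oblique_matrix() @ np.array([0.0, 0.0, 1.0, 1.0])
  print(p[:2])   # [0.5 0.5] = (d cos(alpha), d sin(alpha))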
The oblique projection is better at conveying depth than the orthogonal one, yet it still does not live up to our expectations based on everyday experience. Since at least the Renaissance, painters have been using geometric perspective to simulate depth. The simplest variant is the 1pt-perspective projection. The idea of a perspective projection is to simulate depth by adjusting the relative sizes of objects depending on their distance from the observer. Projections of objects farther away should be relatively smaller than projections of objects that are closer to the observer. To this end, we emit rays (i.e. half-lines) from the observer's position toward all objects in the scene. A point in the scene is then mapped to the intersection point of the corresponding ray with the viewport, as depicted in Figure 2.3.

Figure 2.3. One-point perspective projection: objects positioned farther away are visually smaller.

Figure 2.4 shows a 1pt-perspective projection of the cube [−1, 1]³. Observe how the lines parallel to the viewing direction converge towards a single vanishing point.

In order to derive an analytic formula for the perspective projection, we first restrict ourselves to just two dimensions. The field of view (FOV) is an isosceles trapezoid defined by two rays symmetrically³ emitted from the origin (see Figure 2.5), bounded by the near and far clipping lines {z = z0} and {z = z1}. Let y0 denote the ordinate of its near-top corner. The viewport is a line segment of length 2 perpendicular to the viewing direction (hence parallel to OY). Using the similarity of triangles we compute the distance zp of the viewport from the origin:

  zp/1 = z0/y0,  hence  zp = z0/y0.

Now, take a point P = (y, z) lying inside the trapezoid. The perspective projection P′ = (y′, zp) of P is obtained by intersecting the viewport with the line through P and the origin. Use the similarity of triangles again to compute y′: we have y′/zp = y/z, thus y′ = (y·z0)/(z·y0). It follows that the perspective projection is defined by the formula

  (y, z)ᵀ ↦ (y·z0)/(z·y0).

Switching to three-dimensional space, the viewing frustum becomes a truncated pyramid and the perspective projection takes the form

  (x, y, z)ᵀ ↦ ( (x·z0)/(z·x0), (y·z0)/(z·y0) )ᵀ.

Earlier we observed that the oblique projection can be expressed as a composition of an appropriate space transformation followed by the canonical orthogonal projection (i.e. dropping the last coordinate). The same holds for the perspective projection. The perspective makes lines parallel to the viewing direction converge at some distance e. Using the projective-geometry language introduced in the previous section, we say that the point at infinity ⟦0 : 0 : 1 : 0⟧, which is the common point of all the lines parallel to OZ, is mapped to some point ⟦0 : 0 : e : 1⟧ at a finite distance e, yet still lying on the OZ axis. On the other hand, the origin ⟦0 : 0 : 0 : 1⟧ and the points at infinity corresponding to the two remaining principal axes (i.e. ⟦1 : 0 : 0 : 0⟧ and ⟦0 : 1 : 0 : 0⟧) should stay intact. This way, we have fixed our map at four points. But, in order to apply the methods of the previous section, we need five points in general position.
³ This assumption is not really necessary, see Exercise 2.
Figure 2.4. A cube in 1pt-perspective.
Figure 2.5. One-point perspective projection in two dimensions.

Take the last point to be the corner ⟦−1 : −1 : −1 : 1⟧ of the viewing frustum and assume that it stays fixed under the perspective map. Hence, the map we are looking for must satisfy the following constraints:

  ⟦1 : 0 : 0 : 0⟧ ↦ ⟦1 : 0 : 0 : 0⟧,
  ⟦0 : 1 : 0 : 0⟧ ↦ ⟦0 : 1 : 0 : 0⟧,
  ⟦0 : 0 : 1 : 0⟧ ↦ ⟦0 : 0 : e : 1⟧,
  ⟦0 : 0 : 0 : 1⟧ ↦ ⟦0 : 0 : 0 : 1⟧,
  ⟦−1 : −1 : −1 : 1⟧ ↦ ⟦−1 : −1 : −1 : 1⟧.

Theorem 1.7 asserts that there is a projective automorphism fulfilling the above requirements. The matrix of this automorphism is computed as the product B·Γ·A⁻¹, where A (and hence A⁻¹) is the 4 × 4 identity matrix, while

  B = [ 1 0 0 0 ]
      [ 0 1 0 0 ]
      [ 0 0 e 0 ]
      [ 0 0 1 1 ],   Γ = diag( β1/α1, β2/α2, β3/α3, β4/α4 ).
The coefficients αi and βi are obtained from the equations:

  (−1, −1, −1, 1)ᵀ = α1·(1, 0, 0, 0)ᵀ + α2·(0, 1, 0, 0)ᵀ + α3·(0, 0, 1, 0)ᵀ + α4·(0, 0, 0, 1)ᵀ,
  (−1, −1, −1, 1)ᵀ = β1·(1, 0, 0, 0)ᵀ + β2·(0, 1, 0, 0)ᵀ + β3·(0, 0, e, 1)ᵀ + β4·(0, 0, 0, 1)ᵀ.
It follows that α1 = α2 = α3 = −1 = β1 = β2, while α4 = 1, β3 = −1/e and β4 = 1 + 1/e. Multiplying the matrices we get the desired result.

Proposition 2.1. The 1pt-perspective transformation is defined by the formula (in homogeneous coordinates):

  (x, y, z, 1)ᵀ ↦ M·(x, y, z, 1)ᵀ,  where  M = [ 1 0 0    0      ]
                                               [ 0 1 0    0      ]
                                               [ 0 0 1    0      ]
                                               [ 0 0 1/e  1+1/e  ].

It maps the truncated pyramid with the front face having vertices [±1, ±1, −1, 1]ᵀ and the back face having vertices [±(e+1)/(e−1), ±(e+1)/(e−1), (e+1)/(e−1), 1]ᵀ onto the cube [−1, 1]³ ⊂ R³.

In the situation described in the above proposition, the observer sits at the apex of the pyramid, which is the point [0, 0, −(1+e), 1]ᵀ. Thus, the longer the focal length e, the farther away the observer. This comes as no surprise, since a long focal length corresponds to a narrow angle of view, hence the scene must be watched from a considerable distance in order to fit into the viewport. Compute the limit of the above matrix as the focal length e approaches infinity:

  lim_{e→∞} [ 1 0 0    0     ]   [ 1 0 0 0 ]
            [ 0 1 0    0     ] = [ 0 1 0 0 ]
            [ 0 0 1    0     ]   [ 0 0 1 0 ]
            [ 0 0 1/e  1+1/e ]   [ 0 0 0 1 ].

It is the identity matrix. Therefore, only the orthogonal projection (which followed the perspective transformation) is left. This is why we said earlier that the orthogonal projection is a limit case of a perspective projection. Figure 2.6 shows how the perspective changes with the focal length.
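A quick numerical check of Proposition 2.1 (the helper names are ours):

  import numpy as np

  def perspective_1pt(e):
      # Matrix of Proposition 2.1, in homogeneous coordinates.
      return np.array([[1, 0, 0,       0],
                       [0, 1, 0,       0],
                       [0, 0, 1,       0],
                       [0, 0, 1.0 / e, 1 + 1.0 / e]])

  def project(M, p):
      # Apply M to an affine point and perform the homogeneous division.
      q = M @ np.append(p, 1.0)
      return q[:3] / q[3]

  e = 2.0
  b = (e + 1) / (e - 1)                              # back-face coordinate
  print(project(perspective_1pt(e), [-1, -1, -1]))   # front corner -> (-1, -1, -1)
  print(project(perspective_1pt(e), [b, b, b]))      # back corner  -> ( 1,  1,  1)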
Figure 2.6. Three perspective projections of the same cube but with increasing focal lengths.

In real-life applications (e.g. in computer graphics), we want the observer to stay at the origin and the viewing frustum to be defined, as in the earlier discussion, by the two clipping planes: near {z = z0} and far {z = z1}. This can be achieved by a translation followed by a non-uniform scaling; we leave the details to the reader, showing only the resulting matrix.

Observation 2.2. The map

  (x, y, z, 1)ᵀ ↦ [ z0/x0  0      0                 0               ] (x, y, z, 1)ᵀ
                  [ 0      z0/y0  0                 0               ]
                  [ 0      0      (z0+z1)/(z0−z1)  −2·z0·z1/(z0−z1) ]
                  [ 0      0     −1                 0               ]
transforms the truncated pyramid with the near clipping plane {z = z0}, the far clipping plane {z = z1} and the front-face corner [x0, y0, z0, 1]ᵀ onto the cube [−1, 1]³ ⊂ R³.

Of course, if the position of the observer and the line of view do not coincide with the origin and the OZ axis, the scene must first be aligned by an appropriate isometry, as shown in the example earlier in this section.

Beside the 1pt-perspective, painters and illustrators often use 2pt- and 3pt-perspective projections, where two (respectively three) points at infinity, corresponding to a given system of orthogonal directions, are brought to finite distances. Figure 2.7 shows how these two maps work. The matrix forms of these transformations are once again derived using the technique developed in Section 1. Here we present the procedure for the 2pt-perspective; the other one is similar. Without loss of generality we may assume that the fixed orthogonal directions mentioned above are just the principal axes (if not, combine the map with an isometry). Suppose that the lines parallel to OY should remain parallel, while the ones parallel to OX and OZ should converge at the (finite) distances e and g, respectively. Again, let the origin and the point ⟦−1 : −1 : −1 : 1⟧ stay fixed. Thus the map is uniquely determined by the set of constraints:

  ⟦1 : 0 : 0 : 0⟧ ↦ ⟦e : 0 : 0 : 1⟧,
  ⟦0 : 1 : 0 : 0⟧ ↦ ⟦0 : 1 : 0 : 0⟧,
  ⟦0 : 0 : 1 : 0⟧ ↦ ⟦0 : 0 : g : 1⟧,
  ⟦0 : 0 : 0 : 1⟧ ↦ ⟦0 : 0 : 0 : 1⟧,
  ⟦−1 : −1 : −1 : 1⟧ ↦ ⟦−1 : −1 : −1 : 1⟧.

Constructing the matrices A, B, Γ and multiplying them together (as for the 1pt-perspective) we obtain:

Observation 2.3. The 2pt-perspective transformation is given by the formula

  (x, y, z, 1)ᵀ ↦ [ 1    0  1    0           ] (x, y, z, 1)ᵀ,
                  [ 0    1  0    0           ]
                  [ 0    0  1    0           ]
                  [ 1/e  0  1/g  1+1/e+1/g   ]

with the first three rows those of the identity matrix and the last row (1/e, 0, 1/g, 1 + 1/e + 1/g),
Figure 2.7. Comparison of the 2pt-perspective (left) and the 3pt-perspective (right).

where the lines parallel to OX converge at [e, 0, 0, 1]ᵀ, the lines parallel to OZ at [0, 0, g, 1]ᵀ, and the lines parallel to OY remain parallel. Likewise, for the 3pt-perspective we have:

Observation 2.4. The 3pt-perspective transformation is given by the formula

  (x, y, z, 1)ᵀ ↦ [ 1    0    0    0               ] (x, y, z, 1)ᵀ,
                  [ 0    1    0    0               ]
                  [ 0    0    1    0               ]
                  [ 1/e  1/f  1/g  1+1/e+1/f+1/g   ]

where the lines parallel to OX converge at [e, 0, 0, 1]ᵀ, those parallel to OY at [0, f, 0, 1]ᵀ and those parallel to OZ at [0, 0, g, 1]ᵀ.

All three ways of representing perspective, namely the 1pt-, 2pt- and 3pt-perspective projections, are used by artists, with the choice of method dictated by the actual situation. An inside view of a big hall could be depicted in the 1pt-perspective, with the edges of the floor and ceiling converging at some distant point; a building seen from the outside with one corner facing the viewer would likely be drawn either in the 2pt-perspective (if the horizon is roughly at half of the building's height) or in the 3pt-perspective (when the observer is near the basement of the structure). All three projections correspond to our natural way of perceiving the surrounding reality. Thus it should not come as a big surprise that they are actually equivalent, in the sense that any one of them is conjugate to any other via some rotation and (uniform) scaling. In other words, by just yawing/pitching/rolling the camera and possibly adjusting the scale of the scene, we can change the projection from 1pt-perspective to 2pt-perspective to 3pt-perspective and vice versa.

Indeed, it is an easy observation that two parallel lines stay parallel after performing a 1pt-perspective transformation if and only if they are contained in a plane orthogonal to the viewing direction. Therefore, a projective map which is a 1pt-perspective with respect to the principal system of coordinates becomes a 2pt-perspective transformation when regarded with respect to a coordinate system determined by an orthonormal basis {α, ε2, γ} of U3 ≅ R³, where α and γ are contained in the OXZ plane but are not parallel to ε1, ε3. This is so because all the points at infinity corresponding to the directions of lines contained in OXZ are brought to finite distances, except the single point corresponding to the direction of the OX axis. More rigorously, we can formalize the above discussion as follows:
Proposition 2.5. If π2 : P³R → P³R is a 2pt-perspective transformation, then there is a 1pt-perspective transformation π1 : P³R → P³R together with a rotation ρ and a uniform scaling σ such that
  π2 = (ρσ)⁻¹ · π1 · (ρσ).
Proof. Our first step is to find the focal length h in the direction OZ for the 1pt-perspective transformation π1. Let e, g be the convergence distances of π2 in directions OX and OZ, respectively. Thus we have a matrix equation
(2.6)  B = (RS)⁻¹·A·(RS),

where A is the matrix of the sought 1pt-perspective (c.f. Proposition 2.1), B is the matrix of the 2pt-perspective given in Observation 2.3, R is the matrix of a rotation (see Proposition 2.8) and, finally, S is the matrix of a uniform scaling (c.f. Observation 2.6). Comparing the bottom-right entries we get 1 + 1/e + 1/g = 1 + 1/h. It follows that the focal distance h is given by the formula
  h = e·g/(e + g).
Consider the three points at infinity P = ⟦e : 0 : g : 0⟧, Q = ⟦0 : 1 : 0 : 0⟧ and R = ⟦−g : 0 : e : 0⟧ together with the origin O = ⟦0 : 0 : 0 : 1⟧. Clearly the lines through O and P, O and Q, and O and R are pairwise orthogonal. Compute the images of P, Q, R under the 1pt-perspective π1. First, it sends P to

  P′ = ⟦e : 0 : g : (e+g)/e⟧ = ⟦e²/(e+g) : 0 : eg/(e+g) : 1⟧.

Further, the point Q remains fixed under π1, while R is mapped to

  R′ = ⟦−g²/(e+g) : 0 : eg/(e+g) : 1⟧.

Observe that P′ belongs to the line through O and P; likewise, R′ lies on the line through O and R. In the linear space R³ ≅ U3 we fix an orthogonal basis consisting of the following three vectors:

  u = ( e/(e+g), 0, g/(e+g) )ᵀ,  v = ( 0, √(e²+g²)/(e+g), 0 )ᵀ,  w = ( −g/(e+g), 0, e/(e+g) )ᵀ.

The lines parallel to u converge under π1 to e·u and the lines parallel to w converge to g·w, while the ones parallel to v stay parallel. Therefore π1, in the coordinate system (u, v, w), behaves like the 2pt-perspective. Consequently, by switching from the canonical coordinate system given by ε1, ε2 and ε3 to the new one, transforming the space by π1 and switching back, we obtain exactly the same result as using the 2pt-perspective. The vectors u, v, w are not normalized, but they have the common length √(e²+g²)/(e+g). Since they are pairwise orthogonal, it follows that the map sending ε1 ↦ u, ε2 ↦ v and ε3 ↦ w is a composition of a uniform scaling σ by the factor √(e²+g²)/(e+g) followed by a rotation ρ around ε2 by the angle arctan(g/e). Thus
  π2 = (ρσ)⁻¹ · π1 · (ρσ).
In matrix notation (see Eq. (2.6)) we write it as B = (RS)⁻¹·A·(RS), where:

  A = [ 1  0  0           0            ]
      [ 0  1  0           0            ]
      [ 0  0  1           0            ]
      [ 0  0  (e+g)/(eg)  1+(e+g)/(eg) ],

  B = [ 1    0  0    0          ]
      [ 0    1  0    0          ]
      [ 0    0  1    0          ]
      [ 1/e  0  1/g  1+1/e+1/g  ],
  R = [ e/√(e²+g²)  0  −g/√(e²+g²)  0 ]
      [ 0           1   0           0 ]
      [ g/√(e²+g²)  0   e/√(e²+g²)  0 ]
      [ 0           0   0           1 ],

  S = [ √(e²+g²)/(e+g)  0               0               0 ]
      [ 0               √(e²+g²)/(e+g)  0               0 ]
      [ 0               0               √(e²+g²)/(e+g)  0 ]
      [ 0               0               0               1 ].

A fully analogous result can be proved for the 3pt-perspective. We let the reader fill in the missing details.

Proposition 2.7. If π3 : P³R → P³R is a 3pt-perspective transformation, then there is a 1pt-perspective transformation π1 : P³R → P³R together with a rotation ρ and a uniform scaling σ such that

  π3 = (ρσ)⁻¹ · π1 · (ρσ).
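Proposition 2.5 is easy to confirm numerically. The sketch below builds all four matrices for sample convergence distances e, g and verifies Eq. (2.6); the variable names follow the text, and nothing here is library-specific:

  import numpy as np

  e, g = 2.0, 3.0
  h = e * g / (e + g)                  # focal length of the 1pt-perspective

  A = np.eye(4); A[3, 2:] = [1 / h, 1 + 1 / h]                   # Prop. 2.1
  B = np.eye(4); B[3, :] = [1 / e, 0, 1 / g, 1 + 1 / e + 1 / g]  # Obs. 2.3

  n = np.hypot(e, g)                   # sqrt(e^2 + g^2)
  R = np.eye(4)                        # rotation around eps_2 by arctan(g/e)
  R[0, 0] = R[2, 2] = e / n
  R[0, 2], R[2, 0] = -g / n, g / n
  S = np.diag([n / (e + g)] * 3 + [1.0])   # uniform scaling

  RS = R @ S
  print(np.allclose(B, np.linalg.inv(RS) @ A @ RS))   # True: Eq. (2.6) holds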
3. Duality and the Plücker-Grassmann coordinates

So far we have been using the notion of homogeneous/projective coordinates of points. In this section we show how to assign coordinates to any projective subspace. Not only points, but also lines and planes in P³R have consistently defined coordinates, hence may be treated in a unified manner. Moreover, as we will see, points and lines in the projective plane P²R (respectively points and planes in the projective space P³R) behave in a dual fashion. Every operation performed on one of them has its dual acting on the other.

In order to motivate our further studies and build some basic intuition, consider first the equation

(3.1)  ax + by + cw = 0.

If the parameters a, b, c (not all equal to zero) are considered fixed, then the equation describes all the points ⟦x : y : w⟧ lying on a line L in the projective plane P²R. The numbers a, b, c ∈ R are the coefficients ("coordinates") of this line L. But we can reverse the situation, assuming that it is the point P = ⟦x : y : w⟧ ∈ P²R which is fixed. Then Eq. (3.1) describes all the projective lines passing through P. Observe that we may scale the coefficients ("coordinates") a, b, c of a line by a non-zero factor without altering the line itself, exactly as we scale the projective coordinates of a point. We shall soon see that a, b, c are actually the Plücker-Grassmann coordinates of a line and, like the projective coordinates of a point, they are unique only up to multiplication by a non-zero scalar.

Take now three points P1 = ⟦x1 : y1 : w1⟧, P2 = ⟦x2 : y2 : w2⟧ and P3 = ⟦x3 : y3 : w3⟧ in the projective plane P²R. They are collinear (i.e. they lie on a common line) if and only if there are a, b, c ∈ R, not all equal to zero, satisfying

  ax1 + by1 + cw1 = 0
  ax2 + by2 + cw2 = 0
  ax3 + by3 + cw3 = 0.
This is equivalent to saying that the determinant of the matrix built from the coordinates of P1, P2 and P3 vanishes:
  det [ x1 y1 w1 ]
      [ x2 y2 w2 ] = 0.
      [ x3 y3 w3 ]
On the other hand, if we take three lines L1, L2, L3 in the projective plane P²R, where the Plücker-Grassmann coordinates of Li are ai, bi and ci, then these lines pass through a single point if and only if there are x, y, w ∈ R, not all equal to zero, such that

  a1x + b1y + c1w = 0
  a2x + b2y + c2w = 0
  a3x + b3y + c3w = 0,

if and only if
  det [ a1 b1 c1 ]
      [ a2 b2 c2 ] = 0.
      [ a3 b3 c3 ]

As we see, in order to check whether three points lie on a common line, one needs to perform exactly the same operations as when checking whether three lines pass through a common point. This is so because points and lines in the projective plane are dual to each other. Every statement true about one kind of objects has its dual, constructed by exchanging the words "point(s)" and "line(s)" and the phrases "lie(s) on" and "pass(es) through". The mother of all the examples is the following pair of sentences:

  Every two distinct projective points lie on a common line.
  Every two distinct projective lines pass through a common point.
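Both tests take the same handful of lines of code; a minimal sketch (collinear is our own name, and the tolerance is an arbitrary choice for floating-point input):

  import numpy as np

  def collinear(p1, p2, p3, eps=1e-12):
      # Three points of the projective plane (homogeneous triples) lie on
      # a common line iff the determinant of their coordinates vanishes.
      return abs(np.linalg.det(np.array([p1, p2, p3], float))) < eps

  # By duality, the very same test decides whether three lines are concurrent:
  concurrent = collinear

  print(collinear([0, 0, 1], [1, 1, 1], [2, 2, 1]))      # True: points on y = x
  print(concurrent([1, -1, 0], [1, 0, -1], [0, 1, -1]))  # True: the lines x = y,
                                                         # x = 1, y = 1 meet at (1, 1)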
Now, it is time to define the Plücker-Grassmann coordinates in their full generality. Take a subspace V of the projective space PⁿR of dimension d < n, given by a matrix equation

(3.2)  M · (x0, …, xn)ᵀ = (0, …, 0)ᵀ,

where the rank of M is rank M = n − d (see Section 1). Consider Eq. (3.2) on the linear space Rⁿ⁺¹ and let V̂ denote the subspace of its solutions:

  V̂ = { (x0, …, xn)ᵀ ∈ Rⁿ⁺¹ : M · (x0, …, xn)ᵀ = 0 }.

Clearly, the projective subspace V ⊂ PⁿR is the set of equivalence classes of vectors from V̂ with respect to the relation ∼ defined in Chapter 1. We say that V̂ is the lift of V. Choose a basis {P̂0, …, P̂d} of V̂ and let P0, …, Pd ∈ PⁿR be the corresponding points in the projective space. They span V in the following sense:
Observation 3.3. V is the smallest projective subspace containing P0, …, Pd.
Suppose that Pj has the projective coordinates ⟦xj,0 : … : xj,n⟧. Take a (d+1)-tuple î = (i0, …, id), where 0 ≤ ik ≤ n. Let A_î ∈ Md+1,d+1(R) be the matrix with rows (xj,i0, …, xj,id):

  A_î = [ x0,i0 ⋯ x0,id ]
        [   ⋮       ⋮   ]
        [ xd,i0 ⋯ xd,id ]

and let V_î := det A_î. In other words, to build A_î we write all the coordinates of P0, …, Pd as rows of a (d+1) × (n+1) matrix, denote it A, and then select the columns corresponding to the entries of î. The rank of A equals d + 1, since P̂0, …, P̂d were linearly independent, being a basis of V̂. Therefore, for at least one (d+1)-tuple î the corresponding matrix A_î is invertible, hence its determinant V_î is non-zero.

Definition 3.4. The (n+1 choose d+1)-tuple ⟦… : V_î : …⟧, where î ranges over all the combinations of d + 1 elements from the set {0, …, n}, is called the Plücker-Grassmann coordinates of the subspace V.

Recall that the projective coordinates of a point are unique only up to multiplication by a non-zero scalar. Such a multiplication scales all the determinants in the above definition by a constant factor. It follows that the Plücker-Grassmann coordinates of a subspace are also defined only up to multiplication by a common factor.

Before we present any examples, we still need to show that the definition of the Plücker-Grassmann coordinates does not depend on the choice of the basis {P̂0, …, P̂d}. If {P̂′0, …, P̂′d} is another basis of V̂, then every P̂′i can be written (uniquely) as a linear combination

  P̂′i = bi0·P̂0 + ⋯ + bid·P̂d.

In matrix form:

  [ (P̂′0)ᵀ ]       [ P̂0ᵀ ]
  [   ⋮    ] = B · [  ⋮  ]
  [ (P̂′d)ᵀ ]       [ P̂dᵀ ],

where B = (bij) is a (d+1) × (d+1) matrix. Clearly, B is invertible, since both sets {P̂0, …, P̂d} and {P̂′0, …, P̂′d} are bases of V̂. Define P′0, …, P′d and A′_î, V′_î analogously to P0, …, Pd and A_î, V_î. We have A′_î = B·A_î, consequently

  ⟦… : V′_î : …⟧ = det B · ⟦… : V_î : …⟧ = ⟦… : V_î : …⟧.

This shows that the definition is correct: the Plücker-Grassmann coordinates depend solely on V, not on the choice of a set of points spanning V.

Observe that, straight from the definition, the full space PⁿR has just one coordinate, since (n+1 choose n+1) = 1. This coordinate is the determinant of a matrix built from a basis of Rⁿ⁺¹, hence it is not zero. The Plücker-Grassmann coordinates are defined only up to a multiplicative factor, hence we can take the unique coordinate of PⁿR to be 1. For completeness, we assign the Plücker-Grassmann coordinates also to the empty set. Again, it must be just one coordinate, as (n+1 choose 0) = 1. We simply define the unique coordinate of the empty set to be 0.

For a point P, the order in which we write its coordinates is in fact arbitrary. It is only a matter of tradition and common convention that we first write x, then y and then z. For the Plücker-Grassmann coordinates there is no such widespread convention. Thus, to avoid confusion, we shall always explicitly write the order of the î's. What's more, since every î is a finite sequence of indices, a permutation of these indices influences the resulting Plücker-Grassmann coordinates. To explain this correspondence, we need to recall the notion of the sign of a permutation. Let σ be a permutation of the set {0, …, n}. It is well known (see e.g.
[14, Chapter I, §4]) that σ can be written as a product of transpositions (a transposition is a permutation that exchanges exactly two elements of the domain, leaving the rest intact). The decomposition of a permutation into a product of transpositions is not unique; nevertheless, the parity of the number of transpositions is constant for the given permutation σ. We define sgn σ to be (−1)ᵏ, where k is the number of transpositions in some decomposition of σ. The permutation σ is called even if sgn σ = 1, otherwise it is odd.

After this short digression we can explain how a permutation of the indices in î influences the Plücker-Grassmann coordinates. As we know, every transposition of columns of a matrix changes the sign of its determinant. Thus we have:

Observation 3.5. If σ ∈ S(d+1) is a permutation, then V_î = sgn σ · V_σ(î).

Let us see how to compute the Plücker-Grassmann coordinates of some basic subspaces: points, lines and planes. Start with a point P = ⟦x0 : … : xn⟧ ∈ PⁿR, which forms a 0-dimensional subspace. It has (n+1 choose 1) = n + 1 Plücker-Grassmann coordinates:
  ⟦P0 : … : Pn⟧ = ⟦det(x0) : … : det(xn)⟧ = ⟦x0 : … : xn⟧.

Observation 3.6. The Plücker-Grassmann coordinates of a point equal its projective coordinates.

Thus the notion of the Plücker-Grassmann coordinates generalizes the notion of the projective coordinates. Now, take a line L through two points P, Q in the projective plane P²R. A line in the projective plane has three Plücker-Grassmann coordinates, as (2+1 choose 1+1) = 3. These are:
  ⟦L12 : L20 : L01⟧ = ⟦ det(yP wP; yQ wQ) : det(wP xP; wQ xQ) : det(xP yP; xQ yQ) ⟧
                    = ⟦ yP·wQ − yQ·wP : wP·xQ − wQ·xP : xP·yQ − xQ·yP ⟧.

Observe that these are just the coordinates of the cross product p × q, where p, q ∈ R³ are lifts of P and Q. Abusing the notation, we write shortly ⟦L12 : L20 : L01⟧ = ⟦p × q⟧. At least one of the determinants is non-zero. Suppose that the equation defining L is L : ax + by + cw = 0. It is satisfied by both P and Q, hence we have a system of two simultaneous linear equations:

  axP + byP + cwP = 0
  axQ + byQ + cwQ = 0.
Suppose, for example, that it is L01 which is not zero. Solve the system using Cramer's rule to get

  a = −c·L21/L01 = c·L12/L01  and  b = −c·L02/L01 = c·L20/L01.
Observation 3.7. The Plücker-Grassmann coordinates of a line L defined by the equation ax + by + cw = 0 are ⟦L12 : L20 : L01⟧ = ⟦a : b : c⟧.

Switch now to subspaces of the projective space P³R. Beside the points, covered by Observation 3.6, there are lines and planes that we should consider. Planes in the projective space behave quite like lines in the projective plane. Namely, we have:
Observation 3.8. The Plücker-Grassmann coordinates of a plane V ⊂ P³R defined by the equation ax + by + cz + dw = 0 are

  ⟦V132 : V230 : V310 : V012⟧ = ⟦a : b : c : d⟧.

The reader may easily supply the details. Now, consider a line L ⊂ P³R through two points P = ⟦xP : yP : zP : wP⟧ and Q = ⟦xQ : yQ : zQ : wQ⟧. According to the definition, the Plücker-Grassmann coordinates of L are ⟦L12 : L20 : L01 : L03 : L13 : L23⟧, where

  L12 = det(yP zP; yQ zQ),  L20 = det(zP xP; zQ xQ),  L01 = det(xP yP; xQ yQ),
  L03 = det(xP wP; xQ wQ),  L13 = det(yP wP; yQ wQ),  L23 = det(zP wP; zQ wQ).
Let p, q ∈ R³ be the two vectors with coordinates p = (xP, yP, zP) and q = (xQ, yQ, zQ), respectively. The first three Plücker-Grassmann coordinates of L equal p × q, while the next three equal p·wQ − q·wP. We may rewrite the above formula in a short, informal but easy-to-remember form:
  ⟦L12 : L20 : L01 : L03 : L13 : L23⟧ = ⟦p × q : p·wQ − q·wP⟧.

If we assume that neither P nor Q lies at infinity, we may always take wP = wQ = 1, and so the coordinates of L are ⟦p × q : p − q⟧. Summarizing, we have the following result:

Observation 3.9. The Plücker-Grassmann coordinates of a line L through two affine points P, Q ∈ R³ are

  ⟦L12 : L20 : L01 : L03 : L13 : L23⟧ = ⟦x : y : z : x′ : y′ : z′⟧,

where (x, y, z)ᵀ = p × q is the cross product of the position vectors p, q of P and Q, and (x′, y′, z′)ᵀ = p − q is the vector from Q to P.

Example. Take two points P = (1, 0, 1)ᵀ and Q = (0, 2, 0)ᵀ. The cross product is then (1, 0, 1)ᵀ × (0, 2, 0)ᵀ = (−2, 0, 2)ᵀ and their difference is (1, 0, 1)ᵀ − (0, 2, 0)ᵀ = (1, −2, 1)ᵀ. Therefore the Plücker-Grassmann coordinates of the line L containing these points are
  ⟦L12 : L20 : L01 : L03 : L13 : L23⟧ = ⟦−2 : 0 : 2 : 1 : −2 : 1⟧.
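Observation 3.9 translates directly into code. A short sketch reproducing the example (pluecker_line is our own name):

  import numpy as np

  def pluecker_line(P, Q):
      # Plücker-Grassmann coordinates (L12, L20, L01, L03, L13, L23)
      # of the line through two affine points, as in Observation 3.9.
      p, q = np.asarray(P, float), np.asarray(Q, float)
      return np.concatenate([np.cross(p, q), p - q])

  print(pluecker_line([1, 0, 1], [0, 2, 0]))   # [-2.  0.  2.  1. -2.  1.]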
Beside being determined by two points lying on it, a line L in the projective space P³R can also be given as the intersection of two planes. One can derive the Plücker-Grassmann coordinates of L from the coordinates of the two planes defining it.

Proposition 3.10. Let V, W ⊂ P³R be two projective planes with the Plücker-Grassmann coordinates ⟦V132 : V230 : V310 : V012⟧ and ⟦W132 : W230 : W310 : W012⟧, respectively. Further, let L := V ∩ W be the line where the two planes intersect.
The Plücker-Grassmann coordinates of L are ⟦L12 : L20 : L01 : L03 : L13 : L23⟧ = ⟦x′ : y′ : z′ : x : y : z⟧, where

  (x′, y′, z′)ᵀ = W012·(V132, V230, V310)ᵀ − V012·(W132, W230, W310)ᵀ

and

  (x, y, z)ᵀ = (V132, V230, V310)ᵀ × (W132, W230, W310)ᵀ.
Write v := (V132, V230, V310)ᵀ and w := (W132, W230, W310)ᵀ. If neither V nor W is the plane at infinity, then the vectors v and w are orthogonal to V and W, respectively. The Plücker-Grassmann coordinates of L can be written in a short form (dual to the formula for the coordinates of a line through two points):
  ⟦L12 : L20 : L01 : L03 : L13 : L23⟧ = ⟦v·W012 − w·V012 : v × w⟧.

If neither of the two planes passes through the origin, we may take W012 = V012 = 1, simplifying the formula even further to ⟦v − w : v × w⟧.

Proof. In the projective space P³R we have four principal planes, namely {x = 0}, {y = 0}, {z = 0} and the plane at infinity {w = 0}. Since we are dealing here with only two planes V and W, possibly changing the order of coordinates, without loss of generality we may assume that neither V nor W is the plane at infinity. Take the vectors v and w as in the above comment, so that they are orthogonal to V and W, respectively. It follows that their cross product u := v × w, being perpendicular to both of them, determines the direction of L. Fix a point P = ⟦xP : yP : zP : 1⟧ on L and let Q be the translation of P along L: Q := P + u. Let p = (x, y, z)ᵀ and q be the position vectors of P and Q. In order to apply Observation 3.9, we need to compute the cross product p × q = p × u. We have

(3.11)  p × u = ( V132·(y·W230 + z·W310) − W132·(y·V230 + z·V310) )
                ( V230·(x·W132 + z·W310) − W230·(x·V132 + z·V310) )
                ( V310·(x·W132 + y·W230) − W310·(x·V132 + y·V230) ).

The coordinates xP, yP, zP of P satisfy the equations of V and W simultaneously. Therefore we can write

  p • v + V012 = 0
  p • w + W012 = 0.

This lets us substitute for the expressions in parentheses in Eq. (3.11), to get

  p × u = ( −V132·W012 + W132·V012 )
          ( −V230·W012 + W230·V012 ) = V012·w − W012·v.
          ( −V310·W012 + W310·V012 )

Observation 3.9 asserts that the Plücker-Grassmann coordinates ⟦L12 : L20 : L01 : L03 : L13 : L23⟧ of L are given by the formula ⟦p × q : p − q⟧, which equals ⟦V012·w − W012·v : −v × w⟧; multiplying by −1, which does not change projective coordinates, we obtain ⟦v·W012 − w·V012 : v × w⟧, as desired.
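Dually, Proposition 3.10 gives a one-line recipe for the meet of two planes. A sketch under the coordinate convention of Observation 3.8 (plane_meet is our own name):

  import numpy as np

  def plane_meet(V, W):
      # Plücker-Grassmann coordinates of the line where the planes
      # (V132, V230, V310, V012) and (W132, W230, W310, W012) intersect.
      v, w = np.asarray(V[:3], float), np.asarray(W[:3], float)
      return np.concatenate([v * W[3] - w * V[3], np.cross(v, w)])

  # The planes z = 0 and y = 0 meet along the OX axis:
  print(plane_meet([0, 0, 1, 0], [0, 1, 0, 0]))
  # [ 0.  0.  0. -1.  0.  0.], the same as pluecker_line((0, 0, 0), (1, 0, 0))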
Observe that the co-dimension 1 subspaces of PⁿR (e.g. planes in P³R and lines in P²R) have (n+1 choose n) = n + 1 Plücker-Grassmann coordinates, exactly like points. Consider the set of all the co-dimension 1 subspaces of PⁿR. It has a natural structure of a projective space. This space is called the dual space to PⁿR and is denoted PₙR. As the co-dimension one subspaces of PⁿR become points in PₙR, the co-dimension 2 subspaces of PⁿR correspond to lines in PₙR, the co-dimension 3 subspaces of PⁿR turn into planes in PₙR, and so on, till the points of PⁿR become the co-dimension 1 subspaces of the dual space. We omit the details of this correspondence in an arbitrary dimension, referring the reader to [15, Part 3, §7]. Instead, we concentrate on the low-dimensional cases.

The duality in the projective plane was discussed at the beginning of this section. We restate the main points using the terminology introduced in the meantime. The projective plane P²R consists of points, and the co-dimension one subspaces are lines. A line with the Plücker-Grassmann coordinates ⟦a : b : c⟧ consists of all the points ⟦x : y : w⟧ ∈ P²R satisfying the equation ax + by + cw = 0. On the other hand, the dual space P₂R consists of the lines of P²R.⁴ So a point in P₂R is a line in P²R. Now, P₂R is clearly a plane, hence a line in the dual space is the set of points L∗ = ⟦a : b : c⟧ of P₂R satisfying the equation xa + yb + wc = 0. Such a line P∗ has the Plücker-Grassmann coordinates ⟦P∗12 : P∗20 : P∗01⟧ = ⟦x : y : w⟧ in P₂R and clearly corresponds to the point P = ⟦x : y : w⟧ in the original plane P²R. A point P = ⟦x : y : w⟧ ∈ P²R lies on a line L = ⟦a : b : c⟧ ⊂ P²R if and only if the line P∗ ⊂ P₂R passes through the point L∗ = ⟦a : b : c⟧ ∈ P₂R.

A similar correspondence occurs in three dimensions. A plane V in P³R with the Plücker-Grassmann coordinates ⟦V132 : V230 : V310 : V012⟧ = ⟦a : b : c : d⟧ becomes a point V∗ = ⟦a : b : c : d⟧ in the dual space P₃R. A plane P∗ in P₃R with the Plücker-Grassmann coordinates ⟦P∗132 : P∗230 : P∗310 : P∗012⟧ = ⟦x : y : z : w⟧ clearly corresponds to a point P = ⟦x : y : z : w⟧ in P³R. This still leaves us with lines, which have dimension one and co-dimension two. It should not come as a big surprise that lines in P³R correspond to lines in the dual space P₃R. Let us take a closer look at their Plücker-Grassmann coordinates in both spaces. Fix a line L in P³R spanned by the points P = ⟦xP : yP : zP : wP⟧ and Q = ⟦xQ : yQ : zQ : wQ⟧. As before, take the two vectors p = (xP, yP, zP)ᵀ and q = (xQ, yQ, zQ)ᵀ. The Plücker-Grassmann coordinates of L are ⟦p × q : p·wQ − q·wP⟧. The points P, Q correspond to two planes P∗ and Q∗ in the dual space P₃R. Hence the line L through P, Q turns into the line L∗ defined as the intersection L∗ = P∗ ∩ Q∗. Proposition 3.10 asserts that the Plücker-Grassmann coordinates of L∗ are ⟦p·wQ − q·wP : p × q⟧. Thus we have proved:
Corollary 3.12. The dual L∗ to a projective line L ⊂ P³R is again a line, with the Plücker-Grassmann coordinates

  ⟦L∗12 : L∗20 : L∗01 : L∗03 : L∗13 : L∗23⟧ = ⟦L03 : L13 : L23 : L12 : L20 : L01⟧.

In a nutshell: to write the coordinates of the dual line we swap the two triples making up the coordinates of the original line. The following theorem generalizes the above corollary to an arbitrary dimension.
Theorem 3.13. The dual V∗ ⊂ PₙR of a projective subspace V ⊂ PⁿR of dimension d is a projective subspace of dimension n − d − 1, with the Plücker-Grassmann coordinates ⟦… : V∗_ĵ : …⟧ satisfying the condition

  V∗_ĵ = ε · V_î,
⁴ For clarity, we use a distinct (subscript) notation to distinguish objects in a dual space from objects in a "normal" projective space.
provided that the support of the concatenation (î, ĵ) = (i0, …, id, j0, …, j_{n−d−1}) is the whole set {0, …, n}, and ε is the sign of the permutation

  ( 0   1   ⋯  d   d+1  ⋯  n         )
  ( i0  i1  ⋯  id  j0   ⋯  j_{n−d−1} ).

We omit the proof here, referring an interested reader to [12, Chapter VII, §3, Theorem I].

Take a projective automorphism T : PⁿR → PⁿR. It induces a function T∗ : PₙR → PₙR on the dual space, defined by the condition

  T∗(V∗) = (T(V))∗.

In other words, if V∗ is a point of PₙR, then V is a hyperplane of PⁿR. Suppose that V is spanned by some points P0, …, P_{n−1} and denote Q0 := T(P0), …, Q_{n−1} := T(P_{n−1}). Since T is an automorphism, the smallest projective subspace containing Q0, …, Q_{n−1} is again a hyperplane. Denote it W; then T∗(V∗) is just the point W∗ dual to W. It is easy to observe that T∗ is also a projective automorphism. We say that it is the dual projective automorphism of T. The following result provides an analytic formula relating T∗ to T.

Proposition 3.14. If M is a matrix associated to a projective automorphism T : PⁿR → PⁿR, then (Mᵀ)⁻¹ is a matrix associated to its dual T∗ : PₙR → PₙR.

Proof. Take a point V∗ of PₙR with coordinates ⟦y0∗ : … : yn∗⟧. Let V ⊂ PⁿR be the corresponding hyperplane. It follows from the discussion preceding Theorem 3.13 that the Plücker-Grassmann coordinates of V are
  ⟦y_{1,2,…,n} : y_{0,2,…,n} : … : y_{0,1,…,n−1}⟧ = ⟦y0∗ : y1∗ : … : yn∗⟧.

Suppose that V is spanned by the points P0, …, P_{n−1} with coordinates

  Pi = ⟦xi,0 : … : xi,n⟧.

It follows from a reasoning similar to the one that led us to Observations 3.7 and 3.8 that
  y_{1,2,…,n}·xi,0 + ⋯ + y_{0,1,…,n−1}·xi,n = 0

for every i ∈ {0, …, n−1}. In matrix form we can write this condition as

  ⟦y_{1,2,…,n} : … : y_{0,1,…,n−1}⟧ · (xi,0, …, xi,n)ᵀ = 0

(or, even shorter, as V · Pi = 0). Introducing M into this equation, we write

  ⟦y_{1,2,…,n} : … : y_{0,1,…,n−1}⟧ · M⁻¹ · M · (xi,0, …, xi,n)ᵀ = 0.

Rearranging the terms we have

  ( (Mᵀ)⁻¹ · (y_{1,2,…,n}, …, y_{0,1,…,n−1})ᵀ )ᵀ · M · (xi,0, …, xi,n)ᵀ = 0.
Now, M·Pi is the image T(Pi) of Pi, and it follows from the above formula that the Plücker-Grassmann coordinates of T(V) are ⟦z_{1,2,…,n} : … : z_{0,1,…,n−1}⟧, where

  (z_{1,2,…,n}, …, z_{0,1,…,n−1})ᵀ = (Mᵀ)⁻¹ · (y_{1,2,…,n}, …, y_{0,1,…,n−1})ᵀ.
Consequently, the coordinates of V∗ are ⟦z0∗ : … : zn∗⟧ with zj∗ = z_{0,…,j−1,j+1,…,n}. This proves the thesis.

Example. As in the example on page 31, suppose we are given a photo that was shot with a tilted camera. Our aim is again to correct the perspective distortion. This time, however, instead of four points we operate on two pairs of lines: A, B and C, D. The first pair represents directions in the picture that should be horizontal, the second pair directions that should be vertical. Say
  A = ⟦A12 : A20 : A01⟧ = ⟦−0.491 : 0.07 : 0.042⟧
  B = ⟦B12 : B20 : B01⟧ = ⟦−0.491 : −0.049 : 0.381⟧
  C = ⟦C12 : C20 : C01⟧ = ⟦0.0 : 0.679 : −0.033⟧
  D = ⟦D12 : D20 : D01⟧ = ⟦0.0 : 0.56 : −0.302⟧.
  A∗ = ⟦−0.491 : 0.07 : 0.042⟧     B∗ = ⟦−0.491 : −0.049 : 0.381⟧
  C∗ = ⟦0.0 : 0.679 : −0.033⟧      D∗ = ⟦0.0 : 0.56 : −0.302⟧.

We need to construct a map T∗ : P₂R → P₂R specified by the following conditions:

  T∗(A∗) = ⟦−0.421 : 0.0 : 0.039⟧     T∗(B∗) = ⟦−0.421 : 0.0 : 0.325⟧
  T∗(C∗) = ⟦0.0 : 0.679 : −0.02⟧      T∗(D∗) = ⟦0.0 : 0.679 : −0.306⟧.

Using the methods described in Section 1, we build the matrix N of T∗:

  N = [  1.040  0.0    0.0   ]
      [  0.258  1.496  0.525 ]
      [ −0.007  0.008  1.024 ].

Proposition 3.14 asserts that

  M = (Nᵀ)⁻¹ = [ 0.962  −0.169   0.008 ]   [ 0.983  −0.172   0.008 ]
               [ 0.0     0.67   −0.005 ] ∼ [ 0.0     0.685  −0.005 ]
               [ 0.0    −0.343   0.979 ]   [ 0.0    −0.351   1.0   ]

defines the desired projective transformation of the picture.

If V, W are two disjoint projective subspaces, then the smallest projective subspace U containing both of them is called the join of V and W. We are now in a position to present the main theorem of this section, which not only allows us to compute the Plücker-Grassmann coordinates of the join of V and W knowing only their Plücker-Grassmann coordinates, but also to test whether the two subspaces intersect.
Theorem 3.15. Let V, W be two projective subspaces of PⁿR with the Plücker-Grassmann coordinates ⟦… : V_î : …⟧ and ⟦… : W_ĵ : …⟧, respectively. Take
m := dim V, r := dim W and d := m + r + 1. For any (d+1)-tuple k̂ = (k0, …, kd) let

  U_k̂ := Σ_{î∪ĵ=k̂} s_{î,ĵ,k̂} · V_î · W_ĵ,

where s_{î,ĵ,k̂} is the sign of the permutation

  s_{î,ĵ,k̂} := sgn ( k0  k1  ⋯  km  k_{m+1}  ⋯  kd )
                    ( i0  i1  ⋯  im  j0       ⋯  jr ).

If V and W are disjoint, then ⟦… : U_k̂ : …⟧ are the Plücker-Grassmann coordinates of the join U of V and W. If V and W intersect, then all the U_k̂ are zero.
Proof. Suppose that V is spanned by P0, …, Pm ∈ V and W is spanned by Q0, …, Qr ∈ W. Consequently, U is the smallest subspace of PⁿR containing P0, …, Pm, Q0, …, Qr. Denote the coordinates of Pi by ⟦pi0 : … : pin⟧ and the coordinates of Qj by ⟦qj0 : … : qjn⟧. Construct the matrix

  A = [ p00 ⋯ p0n ]
      [  ⋮      ⋮  ]
      [ pm0 ⋯ pmn ]
      [ q00 ⋯ q0n ]
      [  ⋮      ⋮  ]
      [ qr0 ⋯ qrn ].

As in the definition of the Plücker-Grassmann coordinates, for a (d+1)-tuple k̂ =
(k0, …, kd), let A_k̂ be the submatrix of A obtained by selecting only the columns with indexes k0, …, kd:
  A_k̂ = [ p0k0 ⋯ p0kd ]
        [  ⋮       ⋮   ]
        [ pmk0 ⋯ pmkd ]
        [ q0k0 ⋯ q0kd ]
        [  ⋮       ⋮   ]
        [ qrk0 ⋯ qrkd ].
Compute the determinant of A_k̂, expanding it into the determinants of (m+1) × (m+1) minors taken from the first m + 1 rows and (r+1) × (r+1) minors taken from the remaining r + 1 rows. It follows from the basic properties of the determinant that

  det A_k̂ = Σ_{î∪ĵ=k̂} s_{î,ĵ,k̂} · V_î · W_ĵ = U_k̂.

The rows of A are linearly independent if and only if V ∩ W = ∅, and in that case at least one U_k̂ is non-zero. Otherwise, U_k̂ = 0 for every k̂.
Example. Take the point P = ⟦1 : 1 : 1 : 1⟧ and the line L with the Plücker-Grassmann coordinates ⟦L12 : L20 : L01 : L03 : L13 : L23⟧ = ⟦−1 : 1 : 1 : 0 : −1 : 1⟧.
Use the above theorem to compute

  U132 = sgn(1 3 2; 1 2 3)·L12·P3 + sgn(1 3 2; 1 3 2)·L13·P2 + sgn(1 3 2; 2 3 1)·L23·P1
       = −L12·P3 + L13·P2 − L23·P1 = −1,
  U230 = sgn(2 3 0; 2 0 3)·L20·P3 + sgn(2 3 0; 0 3 2)·L03·P2 + sgn(2 3 0; 2 3 0)·L23·P0
       = −L20·P3 − L03·P2 + L23·P0 = 0,
  U310 = sgn(3 1 0; 0 1 3)·L01·P3 + sgn(3 1 0; 0 3 1)·L03·P1 + sgn(3 1 0; 1 3 0)·L13·P0
       = −L01·P3 + L03·P1 − L13·P0 = 0,
  U012 = sgn(0 1 2; 0 1 2)·L01·P2 + sgn(0 1 2; 2 0 1)·L20·P1 + sgn(0 1 2; 1 2 0)·L12·P0
       = L01·P2 + L20·P1 + L12·P0 = 1,

where sgn(k0 k1 k2; i0 i1 j0) denotes the sign of the permutation with the top row (k0, k1, k2) and the bottom row (i0, i1, j0).
Not all the U_k̂ are zero, hence P does not lie on L, and the Plücker-Grassmann coordinates of the plane U ⊂ P³R containing both P and L are ⟦U132 : U230 : U310 : U012⟧ = ⟦−1 : 0 : 0 : 1⟧.
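For the point-line join in P³R the sums of Theorem 3.15 can be written out once and for all. The sketch below hard-codes the four signed sums expanded in the example above (join_point_line is our own name):

  def join_point_line(L, P):
      # Plane spanned by a line L = (L12, L20, L01, L03, L13, L23) and
      # a point P = (P0, P1, P2, P3) of P3R; four zeros signal that
      # the point lies on the line.
      L12, L20, L01, L03, L13, L23 = L
      P0, P1, P2, P3 = P
      return (-L12 * P3 + L13 * P2 - L23 * P1,   # U132
              -L20 * P3 - L03 * P2 + L23 * P0,   # U230
              -L01 * P3 + L03 * P1 - L13 * P0,   # U310
              L01 * P2 + L20 * P1 + L12 * P0)    # U012

  print(join_point_line((-1, 1, 1, 0, -1, 1), (1, 1, 1, 1)))   # (-1, 0, 0, 1)
  print(join_point_line((-2, 0, 2, 1, -2, 1), (1, 0, 1, 1)))   # all zeros: P lies on L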
An immediate application of the above theorem is the following criterion for the intersection of two lines in the projective space P³R.

Corollary 3.16. Two distinct lines L, M ⊂ P³R with the Plücker-Grassmann coordinates ⟦L12 : L20 : L01 : L03 : L13 : L23⟧ and ⟦M12 : M20 : M01 : M03 : M13 : M23⟧ intersect if and only if

  L12·M03 + L20·M13 + L01·M23 + L03·M12 + L13·M20 + L23·M01 = 0.

Proof. The join of two non-intersecting lines in the 3-dimensional space is the whole space P³R. As we know, P³R has just one Plücker-Grassmann coordinate, U0123. Theorem 3.15 asserts that it is non-zero if the two lines are disjoint, and then they span the entire space P³R. Otherwise U0123 is null.

Recall that the Plücker-Grassmann coordinates of a line through two points P, Q ∈ R³ are shortly written as ⟦p × q : p − q⟧, where p, q are the position vectors of P and Q. Thus the corollary asserts that the line L through P, Q and the line M through R, S intersect if and only if

  (p × q) • (r − s) + (p − q) • (r × s) = 0.

Example. Take the four points:

  P = ⟦0 : 2 : 0 : 1⟧,  Q = ⟦2 : 0 : 0 : 1⟧,  R = ⟦1 : 1 : 0 : 1⟧,  S = ⟦2 : 2 : −2 : 1⟧.

Let L be the line through P, Q and M the line through R, S. The Plücker-Grassmann coordinates of L are ⟦p × q : p − q⟧ = ⟦0 : 0 : −4 : −2 : 2 : 0⟧ and the coordinates of M are ⟦r × s : r − s⟧ = ⟦−2 : 2 : 0 : −1 : −1 : 2⟧. We have

  (p × q) • (r − s) + (p − q) • (r × s) = −8 + 8 = 0.

It follows from the corollary that these two lines intersect.

There is more to the above corollary. Let us restrict ourselves to the affine space R³ for a while. An affine line L ⊂ R³ can be assigned an orientation. The two possible orientations may be encoded into the line's coordinates, provided that the Plücker-Grassmann coordinates of an oriented line are taken to be unique only up to multiplication by a strictly positive factor. The coordinates of the oriented line through P, Q (in that order) are then defined (using the previous notational convention) as ⟦p × q : p − q⟧. Reversing the orientation of the line changes the signs of the coordinates, as p × q = −(q × p) and p − q = −(q − p). This construction fails in the projective space, but is correct in the affine world (for further details, as well as for the proof of the next proposition, we refer the reader to [23]). An oriented line L that does not intersect another oriented line M can go around it either in the clockwise or in the counterclockwise direction (see Figure 2.8). It turns out that the formula from Corollary 3.16 can be used to distinguish these two cases.

Figure 2.8. Oriented line L going around another oriented line M in the clockwise (left) and counterclockwise direction (right).
Proposition 3.17. Let L, M ⊂ R³ be two distinct oriented lines with the Plücker-Grassmann coordinates ⟦L12 : L20 : L01 : L03 : L13 : L23⟧ and ⟦M12 : M20 : M01 : M03 : M13 : M23⟧, respectively. Define

  ∆(L, M) := L12·M03 + L20·M13 + L01·M23 + L03·M12 + L13·M20 + L23·M01.

Then:
(1) L goes around M in the clockwise direction if ∆(L, M) > 0;
(2) L intersects M if ∆(L, M) = 0;
(3) L goes around M in the counterclockwise direction if ∆(L, M) < 0.

In 3D computer rendering it is often important to know whether a ray (resp. an oriented line) intersects a given triangle. Proposition 3.17 gives rise to an effective line-triangle intersection test. Let C be a convex polygon contained in R³ and L a given (oriented) line. Orient all the edges of C in a consistent way. It is clear that L intersects the interior of C if and only if it goes either clockwise around all the edges of C or counterclockwise around all of them. Thus we have proved:
Corollary 3.18. If C is a convex polygon with the edges E1, …, En oriented coherently, then a line L intersects the interior of C if and only if the sign of ∆(L, Ei) is the same for every 1 ≤ i ≤ n.

Example. Consider the triangle T with vertices A = (2, 0, 2)ᵀ, B = (0, 4, 3)ᵀ, C = (−2, −1, 2)ᵀ and take the line L through P = (0, 0, 0)ᵀ and Q = (0, 1, 1)ᵀ. The Plücker-Grassmann coordinates of the edges of T are
  ⟦AB12 : AB20 : AB01 : AB03 : AB13 : AB23⟧ = ⟦−8 : −6 : 8 : 2 : −4 : −1⟧
  ⟦BC12 : BC20 : BC01 : BC03 : BC13 : BC23⟧ = ⟦11 : −6 : 8 : 2 : 5 : 1⟧
  ⟦CA12 : CA20 : CA01 : CA03 : CA13 : CA23⟧ = ⟦−2 : 8 : 2 : −4 : −1 : 0⟧

and those of L are
  ⟦L12 : L20 : L01 : L03 : L13 : L23⟧ = ⟦0 : 0 : 0 : 0 : −1 : −1⟧.

We compute:

  ∆(L, AB) = −2,  ∆(L, BC) = −2,  ∆(L, CA) = −10.

It follows from the corollary that L intersects T.
In practice, once we detect an intersection, we often need to find the coordinates of the intersection point. If we deal with a triangle, rather than an arbitrary convex polygon, the previous corollary can be strengthened to provide this extra bit of information as follows:
Observation 3.19. Let L be an oriented affine line in R³ and T a triangle not parallel to L, with vertices A, B, C. Then the barycentric coordinates (with respect to A, B, C) of the point P where L intersects the plane containing T are:

  ∆(L, BC) / ( ∆(L, AB) + ∆(L, BC) + ∆(L, CA) ),
  ∆(L, CA) / ( ∆(L, AB) + ∆(L, BC) + ∆(L, CA) ),
  ∆(L, AB) / ( ∆(L, AB) + ∆(L, BC) + ∆(L, CA) ).

In particular, P lies in the convex hull of A, B, C (i.e. the line intersects the triangle) when sgn ∆(L, AB) = sgn ∆(L, BC) = sgn ∆(L, CA).

The proof is left as an exercise for the reader. In the previous example, the intersection point of L and T is

  (−2/−14)·(2, 0, 2)ᵀ + (−10/−14)·(0, 4, 3)ᵀ + (−2/−14)·(−2, −1, 2)ᵀ = (0, 19/7, 19/7)ᵀ,

where −14 = ∆(L, AB) + ∆(L, BC) + ∆(L, CA).
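Corollary 3.18 and Observation 3.19 combine into a compact ray-triangle test. A sketch reusing the pluecker_line helper from before and reproducing the numbers of the example:

  import numpy as np

  def pluecker_line(P, Q):
      p, q = np.asarray(P, float), np.asarray(Q, float)
      return np.concatenate([np.cross(p, q), p - q])

  def delta(L, M):
      # The permuted dot product of Corollary 3.16 / Proposition 3.17.
      return float(L[:3] @ M[3:] + L[3:] @ M[:3])

  A, B, C = [2, 0, 2], [0, 4, 3], [-2, -1, 2]
  L = pluecker_line([0, 0, 0], [0, 1, 1])

  d_ab = delta(L, pluecker_line(A, B))
  d_bc = delta(L, pluecker_line(B, C))
  d_ca = delta(L, pluecker_line(C, A))
  print(d_ab, d_bc, d_ca)      # -2.0 -2.0 -10.0: equal signs, so L hits T

  s = d_ab + d_bc + d_ca       # barycentric weights of Observation 3.19
  P = (d_bc * np.asarray(A) + d_ca * np.asarray(B) + d_ab * np.asarray(C)) / s
  print(P)                     # [0. 2.71428571 2.71428571] = (0, 19/7, 19/7)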
Remark. In some cases, Theorem 3.15 together with Theorem 3.13 leads to another method of computing the Plücker-Grassmann coordinates of the intersection of two subspaces. Indeed, suppose that U = V ∩ W. Switching to the dual space, we find that U∗ is the join of V∗ and W∗, where the Plücker-Grassmann coordinates of the dual V∗ (respectively W∗) of V (resp. W) are computed by means of Theorem 3.13. Now, Theorem 3.15 lets us compute the coordinates of U∗, and Theorem 3.13 converts them back to the Plücker-Grassmann coordinates of U.

Example. Let the points A, B, C, P, Q ∈ R³ be the same as in the previous example. We want to compute the intersection point of the line L through P and Q with the plane V containing A, B, C, using the above remark. The Plücker-Grassmann coordinates of L are ⟦0 : 0 : 0 : 0 : −1 : −1⟧, thus its dual L∗ has the coordinates
  ⟦L∗12 : L∗20 : L∗01 : L∗03 : L∗13 : L∗23⟧ = ⟦0 : −1 : −1 : 0 : 0 : 0⟧.

On the other hand, the Plücker-Grassmann coordinates of V are
  ⟦V132 : V230 : V310 : V012⟧ = ⟦−1 : 4 : −18 : 38⟧,

and so are the coordinates ⟦V0∗ : V1∗ : V2∗ : V3∗⟧ of the point V∗ ∈ P₃R dual to the plane V. Compute the coordinates of the join P∗ of L∗ and V∗ using Theorem 3.15:

  P∗132 = sgn(1 3 2; 1 2 3)·L∗12·V3∗ + sgn(1 3 2; 1 3 2)·L∗13·V2∗ + sgn(1 3 2; 2 3 1)·L∗23·V1∗ = 0,
  P∗230 = sgn(2 3 0; 2 0 3)·L∗20·V3∗ + sgn(2 3 0; 0 3 2)·L∗03·V2∗ + sgn(2 3 0; 2 3 0)·L∗23·V0∗ = 38,
  P∗310 = sgn(3 1 0; 0 1 3)·L∗01·V3∗ + sgn(3 1 0; 0 3 1)·L∗03·V1∗ + sgn(3 1 0; 1 3 0)·L∗13·V0∗ = 38,
  P∗012 = sgn(0 1 2; 0 1 2)·L∗01·V2∗ + sgn(0 1 2; 2 0 1)·L∗20·V1∗ + sgn(0 1 2; 1 2 0)·L∗12·V0∗ = 14.
The intersection point P is dual to the plane P∗, hence

  ⟦P0 : P1 : P2 : P3⟧ = ⟦P∗132 : P∗230 : P∗310 : P∗012⟧ = ⟦0 : 38 : 38 : 14⟧.

Thus P = (0, 19/7, 19/7)ᵀ.

The exercises at the end of this chapter present some more interesting applications of the Plücker-Grassmann coordinates.
4. Quaternions and transform interpolation

Transform interpolation is a basic but widely used method of animating computer-generated objects. Suppose that we are given a finite sequence of pairs (t0, T0), …, (tn, Tn), where the first entry is time and the second is a transformation; assume that the ti's form a strictly increasing sequence. We are looking for a map Φ : R → M4,4(R) such that Φ(ti) = Ti for every 0 ≤ i ≤ n. In other words, we want to smoothly pass from the transformation T0 at the moment t0 to T1 at time t1, and so on, till Tn at time tn. In this section we deal with the simplest case, when we interpolate between two transforms only. The generalization to an arbitrary number of key-frames will be discussed in Chapter 5.

Let A0, A1 be two elements of some linear or affine space over R.⁵ We define the linear interpolation operator by the formula:

  lerp(A0, A1; t) := (1 − t)·A0 + t·A1,  for t ∈ [0, 1].

It is clear that lerp(A0, A1; 0) = A0 and lerp(A0, A1; 1) = A1.
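In code, lerp is a one-liner that works for vectors and matrices alike; the sketch below (helper names are ours) also verifies the observation about translations stated next:

  import numpy as np

  def lerp(A0, A1, t):
      # Weighted average (1 - t) A0 + t A1, for vectors as well as matrices.
      return (1.0 - t) * np.asarray(A0) + t * np.asarray(A1)

  def translation(v):
      M = np.eye(4)
      M[:3, 3] = v
      return M

  # Blending two translation matrices yields the translation by the
  # blended vector (cf. Observation 4.1 below):
  M = lerp(translation([1, 2, 3]), translation([5, 0, -1]), 0.25)
  print(np.allclose(M, translation(lerp([1, 2, 3], [5, 0, -1], 0.25))))   # True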
Observation 4.1. Let T0, T1 be two translations by the vectors v0, v1 and let MT0, MT1 be the corresponding matrices in homogeneous coordinates. Then the matrix

  M(t) := lerp(MT0, MT1; t),  for t ∈ [0, 1],

represents a translation by the intermediate vector v(t) = lerp(v0, v1; t).
Observation 4.2. Let S0, S1 be the dilatations by si = (si,x, si,y, si,z) (for i ∈ {0, 1}) and let MS0, MS1 be the corresponding matrices in homogeneous coordinates. Then the matrix

  M(t) := lerp(MS0, MS1; t),  for t ∈ [0, 1],

corresponds to an intermediate scaling by s(t) = lerp(s0, s1; t).

As simple as the above two observations are, they say that both a translation and a dilatation are handled by a direct linear interpolation of the corresponding matrices (see Figure 2.9). Let us now look at a rotation. This case is no longer so simple. It is obvious that a weighted average of two rotation matrices does not have to represent a rotation at all, not to mention an intermediate rotation. If the axes of the two interpolated rotations coincide, the task is still quite easy. All one needs to do is linearly interpolate the angle of the rotation, α(t) := lerp(α0, α1; t), where α0 is the angle of the first rotation and α1 denotes the angle of the second one. A problem arises when the two axes differ. A naive approach is to interpolate the Euler angles of the rotations. This can lead to quite unexpected behavior. As we will see, the space of rotations in R³ is homeomorphic to the projective space P³R. On the other hand, the Euler angles are triples (αX, αY, αZ), hence the space of all Euler angles forms a cube. These two spaces are not homeomorphic.
⁵ The only axiom we actually need is the existence of weighted averages of every two elements.
Figure 2.9. Interpolation of a translation (left) and a dilatation (right).

An artifact known as the gimbal lock occurs every time an interpolation passes through a point of discontinuity of the correspondence between Euler angles and rotations.
We explain it using a simplified example, where the space of rotations is substituted by an ordinary unit sphere S² ⊂ R³. Consider the function Φ : [−π, π] × [−π/2, π/2] → S² that maps a longitude u and a latitude v to the point on the sphere

  Φ(u, v) = ( sin(u)·cos(v), cos(u)·cos(v), sin(v) )ᵀ.

An observer on the sphere has two degrees of freedom: he can move either in the east-west direction (i.e. changing u) or in the north-south direction (i.e. changing v), unless... he stays at one of the two poles. At the north pole every direction leads south (analogously, at the south pole every direction leads north). A gimbal lock has occurred; we have lost one degree of freedom.

The correct way to interpolate rotations in 3D is to represent them using quaternions. First, however, we analyze the situation in two dimensions. The intuition we develop will subsequently be applied in three dimensions. Take a point P = (x, y)ᵀ ∈ R². A rotation around the origin can be written as the map

  P ↦ [ cos α  −sin α ] · ( x ) = ( x·cos α − y·sin α )
      [ sin α   cos α ]   ( y )   ( x·sin α + y·cos α ).