
In this handout, we discuss orthogonal maps and their significance from a geometric standpoint.

Preliminary results on the transpose

The definition of an orthogonal matrix involves the transpose, so we prove some facts about it first.

Proposition 1. (a) If A is an ℓ × m matrix and B is an m × n matrix, then (AB)ᵀ = BᵀAᵀ.

(b) If A is an invertible n × n matrix, then Aᵀ is invertible and (Aᵀ)⁻¹ = (A⁻¹)ᵀ.

Proof. (a) We compute the (i, j)-entries of both sides for 1 ≤ i ≤ n and 1 ≤ j ≤ ℓ:

$$[(AB)^{\mathsf{T}}]_{ij} = [AB]_{ji} = \sum_{k=1}^{m} A_{jk}B_{ki} = \sum_{k=1}^{m} [A^{\mathsf{T}}]_{kj}[B^{\mathsf{T}}]_{ik} = \sum_{k=1}^{m} [B^{\mathsf{T}}]_{ik}[A^{\mathsf{T}}]_{kj} = [B^{\mathsf{T}}A^{\mathsf{T}}]_{ij}.$$

(b) It suffices to show that Aᵀ(A⁻¹)ᵀ = I, since a one-sided inverse of a square matrix is automatically two-sided. By (a), Aᵀ(A⁻¹)ᵀ = (A⁻¹A)ᵀ = Iᵀ = I.
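
As a quick numerical sanity check (separate from the proof), both identities can be verified on random matrices with NumPy; the sizes below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))    # an ell x m matrix (ell = 3, m = 4)
B = rng.standard_normal((4, 5))    # an m x n matrix (n = 5)

# (a): (AB)^T = B^T A^T
assert np.allclose((A @ B).T, B.T @ A.T)

# (b): (A^T)^{-1} = (A^{-1})^T for an invertible square matrix
C = rng.standard_normal((4, 4))    # a random square matrix is invertible almost surely
assert np.allclose(np.linalg.inv(C.T), np.linalg.inv(C).T)
```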

Orthogonal matrices

Definition (Orthogonal matrix). An n × n matrix A is orthogonal if A⁻¹ = Aᵀ.

We will first show that “being orthogonal” is preserved by various matrix operations.

Proposition 2. (a) If A is orthogonal, then so is A⁻¹ = Aᵀ. (b) If A and B are orthogonal n × n matrices, then so is AB.

Proof. (a) We have (A⁻¹)ᵀ = (Aᵀ)ᵀ = A = (A⁻¹)⁻¹, so A⁻¹ is orthogonal. (b) We have (AB)⁻¹ = B⁻¹A⁻¹ = BᵀAᵀ = (AB)ᵀ, so AB is orthogonal.

The collection O(n) of n × n orthogonal matrices is the orthogonal group in dimension n. The above definition is often not how we identify orthogonal matrices, as it requires us to compute an n × n inverse. Instead, let A be an orthogonal matrix and suppose its columns are v₁, …, vₙ. Then we can compute

 >  − v1 −  | |  >  .  v ··· v A A =  .   1 n = (vi · vj)1≤i,j≤n, > | | − vn − so comparing with the I, we obtain

Proposition 3. An n × n matrix A is orthogonal if and only if its columns form an orthonormal basis of ℝⁿ.

By considering Aᵀ, we also have that A is orthogonal if and only if its rows (or rather, their transposes) form an orthonormal basis of ℝⁿ. The importance of orthogonal matrices from a geometric perspective is that they preserve dot products, and hence lengths and angles.
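
Proposition 3 gives a practical test for orthogonality. A minimal NumPy sketch (the helper name is ours):

```python
import numpy as np

def is_orthogonal(A, tol=1e-12):
    """Test A^T A = I, i.e. that the columns of A are orthonormal (Proposition 3)."""
    n, m = A.shape
    return n == m and np.allclose(A.T @ A, np.eye(n), atol=tol)

# The columns (1, 1)/sqrt(2) and (1, -1)/sqrt(2) are an orthonormal basis of R^2.
A = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
print(is_orthogonal(A))                      # True
print(np.allclose(np.linalg.inv(A), A.T))    # True: the defining property A^{-1} = A^T
```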

Theorem 1. Let A ∈ O(n) and x, y ∈ ℝⁿ. Then (Ax) · (Ay) = x · y. Conversely, if A is an n × n matrix preserving dot products, then A ∈ O(n).

Proof. For the forward direction, (Ax) · (Ay) = (Ax)ᵀ(Ay) = xᵀAᵀAy = xᵀy = x · y. For the reverse, if A is an n × n matrix preserving dot products, then (Aeᵢ) · (Aeⱼ) = eᵢ · eⱼ for the standard basis vectors eᵢ, so the columns Aeᵢ of A form an orthonormal basis of ℝⁿ, hence A ∈ O(n) by Proposition 3.

Corollary 1. 1. If A ∈ O(n) and x ∈ ℝⁿ, then ‖Ax‖ = ‖x‖.

2. If A ∈ O(n) and x, y ∈ ℝⁿ, then Ax ⊥ Ay if and only if x ⊥ y.

Example: The two-dimensional orthogonal group O(2)

Before continuing with general results, we describe the matrices in O(2). Let

$$A = \begin{pmatrix} a & c \\ b & d \end{pmatrix}$$

be an orthogonal 2 × 2 matrix. By Proposition 3, the equations a, b, c, d must satisfy are

a² + b² = c² + d² = 1 and ac + bd = 0.

From the last equation, (c, d) = (tb, −ta) for some t ∈ ℝ (since (a, b) ≠ (0, 0), the vectors orthogonal to it form the line spanned by (b, −a)), so then 1 = c² + d² = t²(b² + a²) = t².

This means that t = ±1, which gives, for a² + b² = 1,

$$A = \begin{pmatrix} a & b \\ b & -a \end{pmatrix} \quad \text{or} \quad A = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}.$$

The solutions to a² + b² = 1 can be parametrised by a single real parameter θ via a = cos θ and b = sin θ, so in conclusion,

$$O(2) = \left\{ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \;\middle|\; \theta \in \mathbb{R} \right\} \cup \left\{ \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix} \;\middle|\; \theta \in \mathbb{R} \right\}.$$

These are matrices that we have seen before: the first set consists of counterclockwise rotations about the origin by angle θ, whereas the second set consists of reflections about lines through the origin making angle θ/2 with the positive x-axis. Algebraically, another way to separate these two sets is that the first set consists of matrices with determinant 1 and the second set consists of matrices with determinant −1.
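
Both families are easy to generate and test numerically. A short NumPy sketch (the function names are ours):

```python
import numpy as np

def rotation(theta):
    """Counterclockwise rotation by theta about the origin; determinant +1."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def reflection(theta):
    """Reflection about the line at angle theta/2 to the x-axis; determinant -1."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s], [s, -c]])

theta = 0.7
for A in (rotation(theta), reflection(theta)):
    assert np.allclose(A.T @ A, np.eye(2))        # both families are orthogonal
print(round(np.linalg.det(rotation(theta))))      # 1
print(round(np.linalg.det(reflection(theta))))    # -1
```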

The special orthogonal group

The fact that the only determinants were ±1 in the two-dimensional case is not a coincidence.

Proposition 4. If A ∈ O(n), then det A = ±1.

Proof. Since det Aᵀ = det A, we have

1 = det I = det(AAᵀ) = (det A)(det Aᵀ) = (det A)².

Definition (Special orthogonal group). The subset SO(n) = {A ∈ O(n) | det A = 1} is the special orthogonal group in dimension n.

Example: The three-dimensional special orthogonal group SO(3)

In three dimensions, the elements of the whole orthogonal group O(3) do not admit as simple a description as in two dimensions, but it turns out that there is a simple description of SO(3). Let A ∈ SO(3), so that det A = 1. We analyse the complex eigenvalues of A. To do this, given a matrix M with complex number entries, define M† to be its conjugate transpose, i.e. M† is obtained from M by taking the transpose and taking the complex conjugate of every entry. If v ∈ ℂⁿ is a vector, then v†v is a non-negative real number, so we can define the magnitude of v by ‖v‖ = √(v†v). The magnitude satisfies ‖cv‖ = |c|‖v‖ for all v ∈ ℂⁿ and c ∈ ℂ, and if v has all real entries, then the magnitude of v as a complex vector is the same as the magnitude of v as a real vector.

Proposition 5. Let v ∈ ℂⁿ and A ∈ O(n, ℝ). (Here we use the notation O(n, ℝ) to emphasise that A has real entries.) Then ‖Av‖ = ‖v‖.

Proof. It suffices to compare square magnitudes. We have

‖Av‖² = (Av)†(Av) = v†A†Av.

Since A is real, A† = Aᵀ, so the middle product A†A = AᵀA simplifies to I and we are left with v†v = ‖v‖².

Corollary 2. Let λ ∈ ℂ be a complex eigenvalue of A ∈ O(n, ℝ). Then |λ| = 1.

Proof. Let v be an eigenvector of A with eigenvalue λ. Then

‖v‖ = ‖Av‖ = ‖λv‖ = |λ|‖v‖, so |λ| = 1 since ‖v‖ ≠ 0.

Returning to the specific case that A ∈ SO(3, ℝ), the eigenvalues satisfy det(A − λI) = 0, which upon expanding the left-hand side is a cubic with real coefficients. This shows that A has at least one real eigenvalue, which by Corollary 2 must be ±1. Together with the fact that the product of the roots of the eigenvalue equation, counted with multiplicity, is det A = 1, we can show that A must have 1 as an eigenvalue.

• If A has only real roots, then each of the three roots is ±1 and their product is 1. They cannot all be −1, as the product would be −1, so at least one of the roots is 1.

• If A has non-real complex roots, then since the coefficients are real, the two non-real roots must be a conjugate pair λ, λ̄. Their product is λλ̄ = |λ|² = 1 by Corollary 2, so the last root must be 1.

Let u be an eigenvector of A with eigenvalue 1, and if necessary, rescale so that ‖u‖ = 1. Since u solves the linear system (A − I)u = 0, which has real coefficients, we can take u to have real coordinates, so although we passed to complex numbers above, we can now return to the setting of ℝ³. Extend the singleton list (u) to an orthonormal basis B = (u, u₂, u₃) of ℝ³. If S is the matrix whose columns are the vectors of B, then the matrix of A with respect to B is A′ = S⁻¹AS. Since the columns of S form an orthonormal basis, S is itself an orthogonal matrix, so by Proposition 2(a) and (b), A′ is orthogonal. Moreover, since Au = u,

$$A' = \begin{pmatrix} 1 & * & * \\ 0 & * & * \\ 0 & * & * \end{pmatrix}.$$

The first column must be orthogonal to the second and third columns, so

$$A' = \begin{pmatrix} 1 & 0 & 0 \\ 0 & * & * \\ 0 & * & * \end{pmatrix}.$$

The second and third columns are an orthonormal list, so this together with det A′ = det A = 1 shows that the bottom-right 2 × 2 block is a two-dimensional rotation matrix. Hence A is a matrix of rotation about the u-axis, and the conclusion is that SO(3) is the collection of all rotations about lines through the origin.
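
The argument above is constructive: the rotation axis of A ∈ SO(3) is a real eigenvector with eigenvalue 1. A NumPy sketch of axis extraction (the helper name is ours), assuming the input really lies in SO(3):

```python
import numpy as np

def rotation_axis(A):
    """Given A in SO(3), return a unit vector u with Au = u (the rotation axis)."""
    eigvals, eigvecs = np.linalg.eig(A)
    i = np.argmin(np.abs(eigvals - 1.0))   # eigenvalue 1 exists by the argument above
    u = np.real(eigvecs[:, i])             # (A - I)u = 0 has a real solution
    return u / np.linalg.norm(u)

# Example: rotation by 90 degrees about the z-axis.
A = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
print(rotation_axis(A))        # +-(0, 0, 1)
assert np.allclose(A @ rotation_axis(A), rotation_axis(A))
```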

Reflections about linear hyperplanes

A linear map is orthogonal if it preserves dot products, or equivalently, if its matrix (with respect to the standard basis) is an orthogonal matrix. A useful class of orthogonal maps is that of reflections. In 2 dimensions, the most important reflections are those about lines through the origin (dimension-1 subspaces), whereas in 3 dimensions, the most important reflections are those about planes through the origin (dimension-2 subspaces). To generalise, we define

Definition (Linear hyperplane). A (linear) hyperplane in ℝⁿ is a subspace of dimension n − 1. Equivalently, a linear hyperplane is a subspace of ℝⁿ defined by a single linear equation.

Proposition 6. Let a₁, …, aₙ be real numbers, not all zero. Then

V = {x ∈ ℝⁿ | a₁x₁ + ⋯ + aₙxₙ = 0}

is a linear hyperplane in ℝⁿ. Conversely, every linear hyperplane is of this form.

Proof. Given such an equation, let a = (a₁ ⋯ aₙ)ᵀ. Then V is the orthogonal complement of span(a), hence has dimension n − 1. Conversely, if V has dimension n − 1, then V⊥ has dimension 1, hence is spanned by a single vector a. Then V is defined by the property that a · x = 0, which expands to a single linear equation a₁x₁ + ⋯ + aₙxₙ = 0.

The vector a = (a₁ ⋯ aₙ)ᵀ is unique up to scaling: suppose V is defined both by a · x = 0 and by b · x = 0. Then a, b ∈ V⊥, a space of dimension 1, so they must be multiples of each other. (Neither of them can be 0, as otherwise V would be of dimension n instead of dimension n − 1.) In particular, there are exactly two unit vectors n such that V is defined by n · x = 0, and one is the negative of the other. We call either of these a unit normal vector to V.

Definition (Reflection). Let V be a linear hyperplane in ℝⁿ and let n be a unit normal vector to V. Then reflection about V is the map ℝⁿ → ℝⁿ given by

ref_V(x) = x − 2(x · n)n.

Note that if we replace n with the other unit normal vector −n, we get the same result, as

x − 2(x · (−n))(−n) = x + 2(−(x · n))n = x − 2(x · n)n.

Hence the definition of reflection depends only on the linear hyperplane V and not on the choice of unit normal vector n to V.

Proposition 7. Reflection about a linear hyperplane V is an orthogonal linear map.

Proof. For linearity, let x, y ∈ ℝⁿ and a, b ∈ ℝ. Then

ref_V(ax + by) = (ax + by) − 2((ax + by) · n)n = a(x − 2(x · n)n) + b(y − 2(y · n)n) = a ref_V(x) + b ref_V(y).

For preservation of dot products, let x, y ∈ ℝⁿ. Then

ref_V(x) · ref_V(y) = (x − 2(x · n)n) · (y − 2(y · n)n) = x · y − 2(x · n)(n · y) − 2(y · n)(x · n) + 4(x · n)(y · n)(n · n).

All terms but the first cancel because n · n = 1, so ref_V(x) · ref_V(y) = x · y.

Since reflection is an orthogonal map, compositions of reflections are also orthogonal maps. We will see shortly that every orthogonal map is a composition of reflections.
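
The formula ref_V(x) = x − 2(x · n)n can equivalently be written as the matrix I − 2nnᵀ applied to x. The following NumPy sketch (helper name ours) builds this matrix and checks Proposition 7 numerically, along with the fact that reflecting twice gives the identity:

```python
import numpy as np

def reflection_matrix(n_vec):
    """Matrix I - 2 n n^T of ref_V(x) = x - 2(x . n)n, for unit normal n of V."""
    n_vec = np.asarray(n_vec, dtype=float)
    n_vec = n_vec / np.linalg.norm(n_vec)   # normalise; either of the two normals works
    return np.eye(n_vec.size) - 2.0 * np.outer(n_vec, n_vec)

R = reflection_matrix([1.0, 2.0, 2.0])
assert np.allclose(R.T @ R, np.eye(3))   # ref_V is orthogonal (Proposition 7)
assert np.allclose(R @ R, np.eye(3))     # reflecting twice gives the identity
```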

Isometries

We now turn to general geometric definitions.

Definition (Isometry). An isometry of ℝⁿ is a function f : ℝⁿ → ℝⁿ (not necessarily a linear map!) that preserves distances between points, i.e. for any x, y ∈ ℝⁿ, ‖f(x) − f(y)‖ = ‖x − y‖.

Example (Translations). Fix a vector a ∈ ℝⁿ and let τa : ℝⁿ → ℝⁿ be the translation map τa(x) = x + a. Then τa is an isometry, as for any x, y ∈ ℝⁿ,

‖τa(x) − τa(y)‖ = ‖(x + a) − (y + a)‖ = ‖x − y‖.

The translation map τa is bijective, with inverse τa⁻¹ = τ−a.

Example (Orthogonal maps). Let A ∈ O(n) and let f : ℝⁿ → ℝⁿ be the linear map f(x) = Ax. Then f is an isometry, as for any x, y ∈ ℝⁿ, ‖f(x) − f(y)‖ = ‖Ax − Ay‖ = ‖A(x − y)‖ = ‖x − y‖.

Our goal will be to show that these are essentially the only examples of isometries.

Reduction to isometries fixing the origin

The first step is to simplify the problem by showing that it suffices to consider just the isometries which fix the origin.

Proposition 8. The composition of two isometries of ℝⁿ is an isometry of ℝⁿ.

Proof. Let f, g : ℝⁿ → ℝⁿ be isometries. Then for x, y ∈ ℝⁿ, ‖(f ◦ g)(x) − (f ◦ g)(y)‖ = ‖f(g(x)) − f(g(y))‖ = ‖g(x) − g(y)‖ = ‖x − y‖, so f ◦ g is an isometry of ℝⁿ.

Proposition 9. If f : ℝⁿ → ℝⁿ is an isometry, then there is an isometry g : ℝⁿ → ℝⁿ and a vector b ∈ ℝⁿ such that g(0) = 0 and f = τb ◦ g, i.e. f(x) = g(x) + b for all x ∈ ℝⁿ.

Proof. Let b = f(0). By Proposition 8, g = τ−b ◦ f is an isometry, and g(0) = f(0) − b = 0. Hence τb ◦ g = τb ◦ τ−b ◦ f = f.

Isometries fixing the origin are compositions of reflections

To classify isometries fixing the origin, we first establish a lemma that allows us to determine what an isometry must be based on what it does to a finite set of points.

Lemma 1. Let f : ℝⁿ → ℝⁿ be an isometry such that f(0) = 0. Then f preserves dot products.

Proof. For x, y ∈ ℝⁿ, noting that ‖f(x)‖ = ‖f(x) − f(0)‖ = ‖x‖, and likewise for y,

$$f(x) \cdot f(y) = \frac{\|f(x)\|^2 + \|f(y)\|^2 - \|f(x) - f(y)\|^2}{2} = \frac{\|x\|^2 + \|y\|^2 - \|x - y\|^2}{2} = x \cdot y.$$
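
The middle expression is the polarization-type identity 2(x · y) = ‖x‖² + ‖y‖² − ‖x − y‖², which recovers dot products from norms alone; a quick numerical check:

```python
import numpy as np

rng = np.random.default_rng(1)
x, y = rng.standard_normal(4), rng.standard_normal(4)

# 2(x . y) = |x|^2 + |y|^2 - |x - y|^2: dot products are determined by norms
lhs = x @ y
rhs = (np.linalg.norm(x)**2 + np.linalg.norm(y)**2 - np.linalg.norm(x - y)**2) / 2
assert np.isclose(lhs, rhs)
```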

Lemma 2. Let f, g : ℝⁿ → ℝⁿ be isometries such that f(0) = g(0) = 0 and f(eᵢ) = g(eᵢ) for all standard basis vectors eᵢ. Then f = g.

Proof. By Lemma 1, f and g preserve dot products, so B = (f(e₁), …, f(eₙ)) is an orthonormal basis of ℝⁿ. Given any x ∈ ℝⁿ and standard basis vector eᵢ, the i-th B-coordinate of g(x) is

g(x) · f(eᵢ) = g(x) · g(eᵢ) = x · eᵢ = f(x) · f(eᵢ), the i-th B-coordinate of f(x). Since f(x) and g(x) have equal B-coordinates, they are equal.

By Lemma 2, in order to show that a given isometry (which fixes the origin) must have a certain form, it suffices to find an isometry of that form which has the same action on the basis vectors, as then the two isometries must be equal. We will do this with reflections, so we need one last lemma that shows there exist suitable reflections for our purposes.

Lemma 3. Let a, b ∈ ℝⁿ be distinct points such that ‖a‖ = ‖b‖. There is a unique linear hyperplane V such that ref_V(a) = b; this hyperplane consists precisely of all points x ∈ ℝⁿ such that ‖x − a‖ = ‖x − b‖.

Proof. First we show that ‖x − a‖ = ‖x − b‖ defines a hyperplane:

‖x − a‖ = ‖x − b‖ ⟺ ‖x − a‖² = ‖x − b‖² ⟺ ‖x‖² + ‖a‖² − 2(x · a) = ‖x‖² + ‖b‖² − 2(x · b) ⟺ (a − b) · x = 0,

where the last step uses ‖a‖ = ‖b‖.

Hence we get the linear hyperplane V defined by the unit normal vector n = (a − b)/‖a − b‖. Then,

$$\begin{aligned}
\operatorname{ref}_V(a) = a - 2(a \cdot n)n &= a - \frac{2(a \cdot (a - b))}{\|a - b\|^2}(a - b) \\
&= a - \frac{2\|a\|^2 - 2(a \cdot b)}{\|a - b\|^2}(a - b) \\
&= a - \frac{\|a\|^2 + \|b\|^2 - 2(a \cdot b)}{\|a - b\|^2}(a - b) \\
&= a - (a - b) = b,
\end{aligned}$$

where the third equality uses ‖a‖² = ‖b‖², so that the numerator equals ‖a − b‖². This shows that V has the desired reflection property. Finally, suppose W is a linear hyperplane satisfying ref_W(a) = b with unit normal vector m. Then

b = a − 2(a · n)n = a − 2(a · m)m, so (a · n)n = (a · m)m. Since ref_W(a) = b ≠ a, we know that a ∉ W, so a · m ≠ 0. Hence m must be parallel to n, in which case W = V.
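
Lemma 3 is constructive as well: the unit normal n = (a − b)/‖a − b‖ determines the reflection sending a to b. A small NumPy sketch (the helper name is ours):

```python
import numpy as np

def bisecting_normal(a, b):
    """Unit normal of the hyperplane V with ref_V(a) = b (requires |a| = |b|, a != b)."""
    m = a - b
    return m / np.linalg.norm(m)

a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 0.6, 0.8])      # same length as a
n = bisecting_normal(a, b)
assert np.allclose(a - 2.0 * (a @ n) * n, b)   # ref_V(a) = b, as in Lemma 3
```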

Theorem 2. Every isometry of ℝⁿ fixing the origin is a composition of at most n reflections about linear hyperplanes.

Proof. Let f : ℝⁿ → ℝⁿ be an isometry fixing the origin and let xᵢ = f(eᵢ). By Lemma 2, it suffices to find a sequence of at most n reflections whose composition sends eᵢ ↦ xᵢ for each i. First, if x₁ ≠ e₁, we find a reflection sending e₁ to x₁; the existence of such a reflection is guaranteed by Lemma 3. If x₁ = e₁, then we do not need any reflection to send e₁ to x₁. (We can think of this case as starting with the identity map instead of a reflection.) Now, for a given k < n, suppose we have found a sequence of at most k reflections whose composition sends eᵢ ↦ xᵢ for each i ≤ k. Suppose e′ₖ₊₁ is the image of eₖ₊₁ after these reflections. If xₖ₊₁ = e′ₖ₊₁, then in fact eᵢ ↦ xᵢ for each i ≤ k + 1 after this sequence of at most k reflections, and we can proceed. If xₖ₊₁ ≠ e′ₖ₊₁, then by Lemma 3, there is a unique hyperplane V such that ref_V(e′ₖ₊₁) = xₖ₊₁. Moreover, for each i ≤ k, we have ‖xₖ₊₁ − xᵢ‖ = ‖eₖ₊₁ − eᵢ‖ = ‖e′ₖ₊₁ − xᵢ‖, so xᵢ ∈ V (by Lemma 3) and ref_V(xᵢ) = xᵢ. Hence after composing by ref_V, we have a sequence of at most k + 1 reflections whose composition sends eᵢ ↦ xᵢ for each i ≤ k + 1. Proceeding inductively, we obtain Theorem 2.

Corollary 3. Every isometry of ℝⁿ fixing the origin is an orthogonal linear map.

Corollary 4. Every orthogonal linear map is a composition of reflections.

For an example of this last corollary, every rotation in ℝ² is a composition of two reflections.
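
The proof of Theorem 2 is really an algorithm: at step i, if the current image of eᵢ differs from xᵢ, compose with the bisecting reflection from Lemma 3. The NumPy sketch below (all names ours) follows this procedure for an orthogonal input matrix and, run on a rotation of ℝ², produces exactly two reflections:

```python
import numpy as np

def reflections_for(A):
    """Decompose an orthogonal matrix A into unit normals n_1, ..., n_k (k <= n)
    whose reflections compose to A, following the inductive proof of Theorem 2."""
    n = A.shape[0]
    G = np.eye(n)                  # composition of the reflections found so far
    normals = []
    for i in range(n):
        x_i = A[:, i]              # target: where e_i must end up
        e_img = G[:, i]            # current image of e_i
        if not np.allclose(e_img, x_i):
            m = e_img - x_i
            m = m / np.linalg.norm(m)              # normal from Lemma 3
            G = G - 2.0 * np.outer(m, m @ G)       # left-compose with I - 2 m m^T
            normals.append(m)
    assert np.allclose(G, A)       # the composition now agrees with A
    return normals

theta = 1.1
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # a rotation of R^2
print(len(reflections_for(R)))    # 2: a rotation is a composition of two reflections
```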

Conclusion

Combining Proposition 9 and Theorem 2, we obtain

Theorem 3. Every isometry of ℝⁿ is of the form f(x) = Ax + b, where A ∈ O(n) and b ∈ ℝⁿ.

In the particular case of n = 2, we saw that O(2) consists of rotations and reflections, so by Theorem 3, every distance-preserving function on the plane is some combination of rotations, reflections, and translations.
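
Theorem 3 also says how to recover A and b from a black-box isometry f: take b = f(0), and the i-th column of A is f(eᵢ) − b. A final NumPy sketch (the helper name and the sample isometry are ours):

```python
import numpy as np

def linear_part_and_translation(f, n):
    """Recover (A, b) with f(x) = Ax + b from an isometry f of R^n (Theorem 3)."""
    b = f(np.zeros(n))
    A = np.column_stack([f(e) - b for e in np.eye(n)])
    return A, b

# A sample isometry of the plane: rotate by 30 degrees, then translate.
theta, t = np.pi / 6, np.array([2.0, -1.0])
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
f = lambda x: R @ x + t

A, b = linear_part_and_translation(f, 2)
assert np.allclose(A, R) and np.allclose(b, t)
assert np.allclose(A.T @ A, np.eye(2))   # the linear part is orthogonal
```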
