
Orthogonal Matrices and Transformations

First, recall the transpose of a matrix.

The Transpose of a Matrix. Let $A$ be an $m \times n$ matrix. The transpose of $A$, denoted by $A^T$, is the $n \times m$ matrix such that the $j$th column of $A$ is the $j$th row of $A^T$. (Equivalently, the $i$th row of $A$ is the $i$th column of $A^T$.) In other words, we find $A^T$ by taking the columns of $A$ (in order) and making them the rows of $A^T$ (in order).

Properties of the Transpose

1) $(A^T)^T = A$

2) $(AB)^T = B^T A^T$
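These two properties can be spot-checked numerically. Below is a minimal NumPy sketch; the matrices $A$ and $B$ are arbitrary examples chosen only for illustration, not taken from the problems that follow:

```python
import numpy as np

# Arbitrary example matrices, chosen only to illustrate the two properties.
A = np.array([[1., 2., 3.],
              [4., 5., 6.]])   # 2 x 3
B = np.array([[1., 0.],
              [0., 1.],
              [2., -1.]])      # 3 x 2

# Property 1: (A^T)^T = A
print(np.array_equal(A.T.T, A))              # True

# Property 2: (AB)^T = B^T A^T
print(np.array_equal((A @ B).T, B.T @ A.T))  # True
```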

1. Let $U = \begin{pmatrix} 1/\sqrt{18} & 1/\sqrt{2} & -2/3 \\ 4/\sqrt{18} & 0 & 1/3 \\ 1/\sqrt{18} & -1/\sqrt{2} & -2/3 \end{pmatrix}$, $\vec{x} = \begin{pmatrix} \sqrt{18} \\ -\sqrt{18} \\ 0 \end{pmatrix}$, and $\vec{y} = \begin{pmatrix} 0 \\ \sqrt{2} \\ \sqrt{2} \end{pmatrix}$.

(a) Verify that the column vectors of $U$ are orthonormal, and thus form an orthonormal basis for $\mathbb{R}^3$. (No Gram-Schmidt necessary!)

Solution. We can confirm that the column vectors of $U$ are all orthogonal to each other by taking the dot product of each pair of vectors and showing that it equals 0. Furthermore, we can confirm that each column vector is a unit vector by showing that the length of each vector is 1. Thus, the columns of $U$ are orthonormal. Since the column vectors are orthogonal, they are automatically linearly independent. Therefore, $U$ consists of three linearly independent vectors in $\mathbb{R}^3$, and thus must also form a basis for $\mathbb{R}^3$.

(b) Find the matrix product $U^T U$.

Solution.
\[
U^T U = \begin{pmatrix} 1/\sqrt{18} & 4/\sqrt{18} & 1/\sqrt{18} \\ 1/\sqrt{2} & 0 & -1/\sqrt{2} \\ -2/3 & 1/3 & -2/3 \end{pmatrix} \begin{pmatrix} 1/\sqrt{18} & 1/\sqrt{2} & -2/3 \\ 4/\sqrt{18} & 0 & 1/3 \\ 1/\sqrt{18} & -1/\sqrt{2} & -2/3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]

Rather than painfully multiplying these two matrices together using the Row-Column Rule, we can notice a pattern. Whenever we multiply a row of $U^T$ with its corresponding column in $U$, we are simply taking the dot product of one of the original columns in $U$ with itself. This dot product should be 1, since the dot product of a vector with itself is equal to the square of its length, and the length of every column of $U$ is 1. However, when we multiply a row of $U^T$ with any other column of $U$, we are taking the dot product between two different column vectors in $U$. This dot product should be 0, since the columns of $U$ are orthogonal. Thus, we get a matrix of 1's and 0's in the exact same pattern as the identity matrix.

This illustrates a very important fact: If an $m \times n$ matrix $U$ has orthonormal columns, then $U^T U = I_n$.
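As a quick sanity check on this fact (and on the computation in part (b)), here is a short NumPy sketch that rebuilds $U$ and confirms $U^T U = I$ up to floating-point rounding:

```python
import numpy as np

# The matrix U from problem 1.
U = np.array([[1/np.sqrt(18),  1/np.sqrt(2), -2/3],
              [4/np.sqrt(18),  0,             1/3],
              [1/np.sqrt(18), -1/np.sqrt(2), -2/3]])

# U has orthonormal columns, so U^T U should be the 3 x 3 identity.
print(np.allclose(U.T @ U, np.eye(3)))   # True
```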

(c) Find the vectors $U\vec{x}$ and $U\vec{y}$.

Solution.

 √ √   √    1/√18 1/ 2 −2/3 √18 −2 U~x = 4/√18 0√ 1/3  − 18 =  4  1/ 18 −1/ 2 −2/3 0 4

\[
U\vec{y} = \begin{pmatrix} 1/\sqrt{18} & 1/\sqrt{2} & -2/3 \\ 4/\sqrt{18} & 0 & 1/3 \\ 1/\sqrt{18} & -1/\sqrt{2} & -2/3 \end{pmatrix} \begin{pmatrix} 0 \\ \sqrt{2} \\ \sqrt{2} \end{pmatrix} = \begin{pmatrix} 1 - \frac{2\sqrt{2}}{3} \\ \frac{\sqrt{2}}{3} \\ -1 - \frac{2\sqrt{2}}{3} \end{pmatrix}
\]

(d) Compute $|\vec{x}|$, $|U\vec{x}|$, $|\vec{y}|$, $|U\vec{y}|$. What do you notice?

Solution.
\[
|\vec{x}| = \sqrt{(\sqrt{18})^2 + (-\sqrt{18})^2 + 0^2} = \sqrt{18 + 18 + 0} = \sqrt{36} = 6
\]
\[
|U\vec{x}| = \sqrt{(-2)^2 + (4)^2 + (4)^2} = \sqrt{4 + 16 + 16} = \sqrt{36} = 6
\]
\[
|\vec{y}| = \sqrt{(0)^2 + (\sqrt{2})^2 + (\sqrt{2})^2} = \sqrt{0 + 2 + 2} = \sqrt{4} = 2
\]
\[
|U\vec{y}| = \sqrt{\left(1 - \tfrac{2\sqrt{2}}{3}\right)^2 + \left(\tfrac{\sqrt{2}}{3}\right)^2 + \left(-1 - \tfrac{2\sqrt{2}}{3}\right)^2} = \sqrt{\left(1 - \tfrac{4\sqrt{2}}{3} + \tfrac{8}{9}\right) + \tfrac{2}{9} + \left(1 + \tfrac{4\sqrt{2}}{3} + \tfrac{8}{9}\right)} = \sqrt{4} = 2
\]
Look at that! The vectors $\vec{x}$ and $U\vec{x}$ have the same length, as do the vectors $\vec{y}$ and $U\vec{y}$. In other words, the matrix transformation $U$ preserved the lengths of the vectors $\vec{x}$ and $\vec{y}$.
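If you would rather not trust the hand arithmetic, the same lengths can be confirmed numerically, reusing the $U$, $\vec{x}$, and $\vec{y}$ defined in this problem:

```python
import numpy as np

U = np.array([[1/np.sqrt(18),  1/np.sqrt(2), -2/3],
              [4/np.sqrt(18),  0,             1/3],
              [1/np.sqrt(18), -1/np.sqrt(2), -2/3]])
x = np.array([np.sqrt(18), -np.sqrt(18), 0])
y = np.array([0, np.sqrt(2), np.sqrt(2)])

# Lengths before and after applying U agree (up to rounding).
print(np.linalg.norm(x), np.linalg.norm(U @ x))   # 6 and 6
print(np.linalg.norm(y), np.linalg.norm(U @ y))   # 2 and 2
```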

(e) Find the angle between the vectors $\vec{x}$ and $\vec{y}$. Then find the angle between the vectors $U\vec{x}$ and $U\vec{y}$. What do you notice?

Solution. Let $\alpha$ be the angle between $\vec{x}$ and $\vec{y}$. Then:
\[
\cos(\alpha) = \frac{\vec{x} \cdot \vec{y}}{|\vec{x}| \, |\vec{y}|} = \frac{-\sqrt{18}\sqrt{2}}{(6)(2)} = \frac{-6}{12} = -\frac{1}{2}
\]
Thus,
\[
\alpha = \arccos\left(-\frac{1}{2}\right) = \frac{2\pi}{3} = 120^\circ
\]

Similarly, let $\beta$ be the angle between $U\vec{x}$ and $U\vec{y}$. Then:
\[
\cos(\beta) = \frac{(U\vec{x}) \cdot (U\vec{y})}{|U\vec{x}| \, |U\vec{y}|} = \frac{(-2)\left(1 - \frac{2\sqrt{2}}{3}\right) + (4)\left(\frac{\sqrt{2}}{3}\right) + (4)\left(-1 - \frac{2\sqrt{2}}{3}\right)}{(6)(2)} = \frac{-6}{12} = -\frac{1}{2}
\]
Thus,
\[
\beta = \arccos\left(-\frac{1}{2}\right) = \frac{2\pi}{3} = 120^\circ
\]

Look at that! The angles $\alpha$ and $\beta$ are the same! In other words, the matrix transformation $U$ preserved the angle between the vectors $\vec{x}$ and $\vec{y}$.
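The angle computation can be checked the same way. The `angle` helper below is a hypothetical convenience function written just for this check, not a standard library routine:

```python
import numpy as np

U = np.array([[1/np.sqrt(18),  1/np.sqrt(2), -2/3],
              [4/np.sqrt(18),  0,             1/3],
              [1/np.sqrt(18), -1/np.sqrt(2), -2/3]])
x = np.array([np.sqrt(18), -np.sqrt(18), 0])
y = np.array([0, np.sqrt(2), np.sqrt(2)])

def angle(u, v):
    """Angle between vectors u and v, in degrees."""
    cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(cos_theta))

print(angle(x, y))          # 120.0 (up to rounding)
print(angle(U @ x, U @ y))  # 120.0 (up to rounding)
```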

Orthogonal Matrices/Transformations

An orthogonal matrix is an $n \times n$ (square) matrix $U$ such that:

\[
U^T U = I
\]

Equivalently, an orthogonal matrix is an n × n (square) matrix U such that:

\[
U^{-1} = U^T
\]

We say that a linear transformation is orthogonal if it can be represented by an orthogonal matrix.
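In code, the defining condition translates directly into a test. Here is a minimal sketch of one reasonable check; the function name and tolerance are my own choices:

```python
import numpy as np

def is_orthogonal(M, tol=1e-10):
    """Return True if M is square and satisfies M^T M = I (within tol)."""
    rows, cols = M.shape
    return rows == cols and np.allclose(M.T @ M, np.eye(rows), atol=tol)

# Example: rotation by 90 degrees is orthogonal.
print(is_orthogonal(np.array([[0., -1.],
                              [1.,  0.]])))   # True
```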

Properties of Orthogonal Matrices

Suppose $U$ is an $n \times n$ orthogonal matrix. Then for all $\vec{x}$ and $\vec{y}$ in $\mathbb{R}^n$:

1) $|U\vec{x}| = |\vec{x}|$. ($U$ preserves the length of vectors.)

2) $(U\vec{x}) \cdot (U\vec{y}) = \vec{x} \cdot \vec{y}$. ($U$ preserves the angle between two vectors, by preserving the dot product.)

3) The columns of $U$ form an orthonormal basis for $\mathbb{R}^n$. (In fact, this works both ways. If the columns of a square matrix are orthonormal, then the matrix itself must be orthogonal. Can you show why?)

Warning: The columns of an orthogonal matrix must be orthonormal! If the columns of a matrix are only orthogonal (and not unit vectors), then the matrix would NOT be orthogonal! For example, $A = \begin{pmatrix} 2 & 3 \\ 3 & -2 \end{pmatrix}$ has orthogonal columns, but $A$ is not an orthogonal matrix.
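A quick numerical look makes the distinction concrete: the columns of this $A$ are orthogonal, but $A^T A$ comes out as $13I$ rather than $I$:

```python
import numpy as np

A = np.array([[2., 3.],
              [3., -2.]])

print(A[:, 0] @ A[:, 1])   # 0.0 -- the columns are orthogonal...
print(A.T @ A)             # ...but A^T A = 13 I, not I, so A is not orthogonal
```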

 √ √  1/√10 3/ √10 0√ 2. Find the inverse of the matrix A = 3/√20 −1/√20 −1/√ 2. Check your answer via matrix 3/ 20 −1/ 20 1/ 2 multiplication. Solution. Lucky for us, A is an orthogonal matrix, since the columns of A are an orthonormal set. (You should √ √ √ 1/ 10 3/ 20 3/ 20  −1 T √ √ √ verify this for yourself!) So, A = A = 3/ 10 −1/ √20 −1/√ 20 0 −1/ 2 1/ 2

3. Which of the following types of transformations can never be an orthogonal transformation? Which of them must be an orthogonal transformation?

• The Identity Transformation

• Dilations

• Reflections

• Rotations

• Projections

• Shears

Solution. The identity transformation must be an orthogonal transformation. This is because the identity transformation is given by the identity matrix, which certainly has orthonormal columns.

A dilation is never an orthogonal transformation. Recall that a dilation matrix is given by multiplying the identity matrix by a positive constant $a$. If $a \neq 1$, then the columns of the dilation matrix would no longer be unit vectors, and thus the columns would not be orthonormal. (If $a = 1$, then technically we might still consider the matrix a dilation transformation where we scale the vectors by a factor of 1. But this would just be the identity transformation.)

A reflection must be an orthogonal transformation. Recall that the columns in the matrix of a transformation are given by the images of the standard basis vectors under that transformation. (The first column of $A$ is $A\vec{e}_1$, the second column of $A$ is $A\vec{e}_2$, etc.) If $A$ is a reflection, then geometrically $A$ would not change the lengths of any standard basis vectors, only reflect them. So, $A\vec{e}_k$ must have length 1 for all $k$, and thus the columns of $A$ are unit vectors. Furthermore, a reflection would not change the angles between the basis vectors, which are all orthogonal to each other. So the images of the basis vectors, and hence the columns of $A$, must be orthogonal to each other too. Therefore, $A$ has orthonormal columns, making it an orthogonal matrix.

A rotation must also be an orthogonal transformation. The same argument in the previous paragraph for reflection transformations applies here.

A projection is never an orthogonal transformation. This can be seen in many ways. For example, if a standard basis vector is projected onto any line other than the one that contains it, its length will change to some number other than 1. Thus, at least one of the column vectors of a projection matrix would not be a unit vector.

A shear is never an orthogonal transformation, for a similar reason: a nontrivial shear slants at least one standard basis vector, stretching its image to a length greater than 1, so the corresponding column of the shear matrix is not a unit vector. (The only exception would be a trivial shear, where none of the standard basis vectors are slanted. But again, this would essentially be the identity matrix.)
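To make the projection and shear cases concrete, here is a small check on two standard $2 \times 2$ examples; these particular matrices are my own illustrative choices, not taken from the problem:

```python
import numpy as np

shear = np.array([[1., 1.],    # sends e2 to (1, 1), which has length sqrt(2)
                  [0., 1.]])
proj  = np.array([[1., 0.],    # projection onto the x-axis; sends e2 to 0
                  [0., 0.]])

# Neither satisfies M^T M = I, so neither is orthogonal.
for M in (shear, proj):
    print(np.allclose(M.T @ M, np.eye(2)))   # False, False
```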

4. Determine if the following statements are true or false. If they are true, give a justification. If they are false, give a counterexample.

(a) If a matrix is orthogonal, then it must be invertible.

Solution. True

In fact, if $A$ is orthogonal, then $A^{-1} = A^T$.

(b) If a matrix is invertible, then it must be orthogonal.

Solution. False

Counterexample: The matrix $A = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$ is invertible, but certainly not orthogonal.

(c) If $A$ is a $3 \times 2$ matrix and $A^T A = I$, then $A$ must have orthonormal columns.

Solution. True

The justification for this is essentially the observation from 1(b), but in reverse. In 1(b), we argued that since $U$ had orthonormal columns, $U^T U$ should equal $I$. Here, the argument is that if $A^T A = I$, the columns of $A$ must have been orthonormal. However, it would be wrong to say that $A$ is an orthogonal matrix! This is because an orthogonal matrix must be a square matrix!

(d) If $A$ is an orthogonal matrix, then $A^T$ is also an orthogonal matrix.

Solution. True

To show that a matrix $U$ is orthogonal, all we need to do is show that $U^T U = I$. So in this case, if we want to prove that $A^T$ is orthogonal, we need to show that $(A^T)^T(A^T) = I$. Since we're assuming that $A$ is orthogonal, we know that $A^T A = I$, or $A^T = A^{-1}$. Thus:

\[
(A^T)^T(A^T) = (A)(A^T) = (A)(A^{-1}) = I
\]

Therefore, by definition, $A^T$ is orthogonal.

(e) If $A$ is an orthogonal matrix, then $A^{-1}$ is also an orthogonal matrix.

Solution. True

Suppose $A$ is an orthogonal matrix, so $A^T = A^{-1}$. Then:

\[
(A^{-1})^T(A^{-1}) = (A^T)^T(A^{-1}) = (A)(A^{-1}) = I
\]

So by definition, $A^{-1}$ is orthogonal. (We could have also shown that $A^{-1}$ is orthogonal using our result from part (d). Since we're assuming $A$ is orthogonal, $A^T = A^{-1}$. But part (d) told us that $A^T$ is orthogonal. So since $A^{-1} = A^T$, $A^{-1}$ must automatically be orthogonal.)

(f) If $A$ is an orthogonal matrix, then $A^2$ is also an orthogonal matrix.

Solution. True

Suppose $A$ is an orthogonal matrix, so $A^T = A^{-1}$. Then:

\[
(A^2)^T(A^2) = (AA)^T(AA) = (A^T A^T)(AA) = (A^{-1}A^{-1})(AA) = A^{-1}A^{-1}AA = A^{-1}IA = A^{-1}A = I
\]

So by definition, $A^2$ is orthogonal.

(g) If A and B are orthogonal n × n matrices, then AB is orthogonal.

Solution. True

Suppose $A$ and $B$ are orthogonal matrices, so $A^T = A^{-1}$ and $B^T = B^{-1}$. Then:

\[
(AB)^T(AB) = (B^T A^T)(AB) = (B^{-1}A^{-1})(AB) = B^{-1}A^{-1}AB = B^{-1}IB = B^{-1}B = I
\]

So by definition, AB is orthogonal.
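Parts (d) through (g) can all be spot-checked numerically with concrete orthogonal matrices; below, a rotation and a reflection (my own example choices) are used to test $A^T$, $A^{-1}$, $A^2$, and $AB$ in one pass:

```python
import numpy as np

t = np.radians(30)
A = np.array([[np.cos(t), -np.sin(t)],    # rotation by 30 degrees
              [np.sin(t),  np.cos(t)]])
B = np.array([[1., 0.],                   # reflection across the x-axis
              [0., -1.]])

# Each of A^T, A^{-1}, A^2, and AB should satisfy M^T M = I.
for M in (A.T, np.linalg.inv(A), A @ A, A @ B):
    print(np.allclose(M.T @ M, np.eye(2)))   # True for all four
```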

(h) If A and B are orthogonal n × n matrices, then A + B is orthogonal.

Solution. False

Counterexample: Let $A = B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. Then $A$ and $B$ are both orthogonal, but $A + B = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$, which is not orthogonal.

(i) If the column vectors of a square matrix $A$ are orthonormal, then the row vectors of $A$ must also be orthonormal. (Hint: Use part (d).)

Solution. True

Suppose that the column vectors of $A$ are orthonormal. Since $A$ is square, $A$ is an orthogonal matrix. Using our result from part (d), this implies that $A^T$ is orthogonal. So, $A^T$ has orthonormal columns. But the columns of $A^T$ are just the rows of $A$. Therefore, since the columns of $A^T$ are orthonormal, the rows of $A$ must also be orthonormal.

This result might seem very surprising and unintuitive at first. After all, if a matrix has a bunch of random orthonormal columns (like the ones in problems 1 and 2), why should we believe that the rows would just happen to be orthonormal too? But if you check the row vectors in those problems for orthonormality, you'll see that it works!
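Here is one way to run that check for the matrix $U$ from problem 1: since the rows of $U$ are the columns of $U^T$, confirming $U U^T = I$ confirms that the rows are orthonormal:

```python
import numpy as np

U = np.array([[1/np.sqrt(18),  1/np.sqrt(2), -2/3],
              [4/np.sqrt(18),  0,             1/3],
              [1/np.sqrt(18), -1/np.sqrt(2), -2/3]])

# U U^T = I exactly when the rows of U are orthonormal.
print(np.allclose(U @ U.T, np.eye(3)))   # True
```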
