Math 480: Diagonalization and the Singular Value Decomposition
These notes cover diagonalization and the Singular Value Decomposition.

1. Diagonalization.

Recall that a diagonal matrix is a square matrix with all off-diagonal entries equal to zero. Here are a few examples of diagonal matrices:
$$\begin{pmatrix} -6 & 0 \\ 0 & 2 \end{pmatrix}, \qquad \begin{pmatrix} 4 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} 4 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.$$

Definition 1.1. We say that an $n \times n$ matrix $A$ is diagonalizable if there exists an invertible matrix $S$ such that $S^{-1}AS$ is diagonal.

Note that if $D = S^{-1}AS$ is diagonal, then we can equally well write $A = SDS^{-1}$. So diagonalizable matrices are those that admit a factorization $A = SDS^{-1}$ with $D$ diagonal.

Example: If $D$ is a diagonal $n \times n$ matrix and $S$ is an invertible $n \times n$ matrix, then $A = SDS^{-1}$ is diagonalizable, since
$$S^{-1}AS = S^{-1}(SDS^{-1})S = D.$$
For instance, the matrix $S = \begin{pmatrix} -1 & 2 \\ 2 & 4 \end{pmatrix}$ is invertible, so
$$S \begin{pmatrix} -6 & 0 \\ 0 & 2 \end{pmatrix} S^{-1} = \begin{pmatrix} -2 & 2 \\ 8 & -2 \end{pmatrix}$$
is diagonalizable.

Fact 1.2. If $A$ is a diagonalizable $n \times n$ matrix, with $S^{-1}AS = D$, then the columns of $S$ are eigenvectors of $A$, and the diagonal entries of $D$ are eigenvalues of $A$. In particular, if $A$ is diagonalizable then there must exist a basis for $\mathbb{R}^n$ consisting of eigenvectors of $A$.

This follows from a simple computation: since $S^{-1}AS = D$, multiplying both sides by $S$ yields $AS = SD$. Write $S = [\vec{v}_1 \cdots \vec{v}_n]$ and set
$$D = \begin{pmatrix} \lambda_1 & 0 & 0 & \cdots & 0 \\ 0 & \lambda_2 & 0 & \cdots & 0 \\ 0 & 0 & \lambda_3 & & \vdots \\ \vdots & & & \ddots & 0 \\ 0 & 0 & \cdots & 0 & \lambda_n \end{pmatrix}.$$
Since multiplying $S$ by a diagonal matrix (on the right) just scales the columns,
$$SD = [\lambda_1 \vec{v}_1 \cdots \lambda_n \vec{v}_n].$$
On the other hand,
$$AS = A[\vec{v}_1 \cdots \vec{v}_n] = [A\vec{v}_1 \cdots A\vec{v}_n].$$
So the equation $AS = SD$ tells us that $A\vec{v}_i = \lambda_i \vec{v}_i$ (for each $i$), which precisely says that $\vec{v}_i$ is an eigenvector with eigenvalue $\lambda_i$.

The previous discussion also works in reverse, and yields the following conclusion.

Fact 1.3. If $A$ is an $n \times n$ matrix and there exists a basis $\vec{v}_1, \ldots, \vec{v}_n$ for $\mathbb{R}^n$ such that $\vec{v}_i$ is an eigenvector of $A$ with eigenvalue $\lambda_i$, then $A$ is diagonalizable. More specifically, if $S = [\vec{v}_1 \cdots \vec{v}_n]$, then $S^{-1}AS = D$, where $D$ is the $n \times n$ diagonal matrix with diagonal entries $\lambda_1, \ldots, \lambda_n$.

Example. I claim that the matrix
$$A = \begin{pmatrix} 4 & 2 & 2 \\ 2 & 4 & 2 \\ 2 & 2 & 4 \end{pmatrix}$$
has eigenvalues 2 and 8. To find the corresponding eigenvectors, you can analyze $N(A - 2I)$ and $N(A - 8I)$. By considering the parametric form for the homogeneous systems $(A - 2I)\vec{x} = \vec{0}$ and $(A - 8I)\vec{x} = \vec{0}$, you'll find that the vectors
$$\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$$
form a basis for the eigenspace associated to the eigenvalue 2, and
$$\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$$
forms a basis for the eigenspace associated with the eigenvalue 8. We can then conclude that $S^{-1}AS = D$, where
$$S = \begin{pmatrix} -1 & -1 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix} \quad \text{and} \quad D = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 8 \end{pmatrix}.$$
Note that order is important here: since we put eigenvectors corresponding to 2 into the first two columns of $S$, we have to put the eigenvalue 2 into the first two diagonal entries of $D$. We could, however, have switched the order of the eigenvectors corresponding to 2 without changing $D$, giving a second way of diagonalizing $A$. A third way of diagonalizing $A$ would be to set
$$T = \begin{pmatrix} 1 & -1 & -1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix} \quad \text{and} \quad E = \begin{pmatrix} 8 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix},$$
and again we have $T^{-1}AT = E$.

Exercise 1: Check these formulas without computing $S^{-1}$ and $T^{-1}$. (Multiply both sides of the equations $S^{-1}AS = D$ and $T^{-1}AT = E$ by $S$ or $T$ and check instead that $AS = SD$ and $AT = TE$.)
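As a quick numerical sanity check of the example above, the strategy of Exercise 1 is easy to carry out on a computer. The following is a minimal sketch (not part of the original notes), assuming Python with NumPy is available:

    import numpy as np

    # The matrix from the example above, with eigenvalues 2 and 8.
    A = np.array([[4.0, 2.0, 2.0],
                  [2.0, 4.0, 2.0],
                  [2.0, 2.0, 4.0]])

    # Columns of S are eigenvectors: two for eigenvalue 2, one for eigenvalue 8.
    S = np.array([[-1.0, -1.0, 1.0],
                  [ 1.0,  0.0, 1.0],
                  [ 0.0,  1.0, 1.0]])
    D = np.diag([2.0, 2.0, 8.0])

    # Exercise 1's strategy: check AS = SD without inverting S...
    print(np.allclose(A @ S, S @ D))                 # True
    # ...and then confirm S^{-1} A S = D directly.
    print(np.allclose(np.linalg.inv(S) @ A @ S, D))  # True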
Example: The matrix $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ is not diagonalizable. If you compute the characteristic polynomial $\det(A - \lambda I)$, you'll see that it is simply $(1 - \lambda)^2$, so the only eigenvalue is $\lambda = 1$. The corresponding eigenspace is $N(A - 1 \cdot I) = N(A - I)$. This space is 1-dimensional (why?), so there cannot be a basis for $\mathbb{R}^2$ consisting of eigenvectors of $A$. So Fact 1.2 tells us that we can't diagonalize $A$.

Exercise 2: Determine whether or not the following matrices are diagonalizable. For the ones that are diagonalizable, write them in the form $SDS^{-1}$ with $D$ diagonal.
$$\begin{pmatrix} -3 & 1 \\ -1 & -1 \end{pmatrix}, \qquad \begin{pmatrix} -4 & 6 \\ -8 & 10 \end{pmatrix}, \qquad \begin{pmatrix} 5 & -1 & -4 \\ -2 & 4 & -2 \\ -3 & -3 & 6 \end{pmatrix}.$$

2. Diagonalization of Symmetric Matrices

In general, it's hard to tell if a matrix is diagonalizable, because it's hard to find eigenvalues exactly: they're roots of a complicated polynomial. However, in some cases one can tell very quickly that a matrix is diagonalizable.

Theorem 2.1 (The Spectral Theorem). If $A$ is an $n \times n$ symmetric matrix, then $A$ is diagonalizable. In other words, there is a basis for $\mathbb{R}^n$ consisting of eigenvectors of $A$.

This is hard to prove, and we'll simply take it for granted. However, some additional information is much easier to establish.

Fact 2.2. If $A$ is an $n \times n$ symmetric matrix, and $\vec{v}$ and $\vec{w}$ are eigenvectors of $A$ with different eigenvalues, then $\vec{v}$ and $\vec{w}$ are perpendicular.

This is relatively easy to check using our understanding of orthogonality. Say $A\vec{v} = \lambda_1 \vec{v}$ and $A\vec{w} = \lambda_2 \vec{w}$ with $\lambda_1 \neq \lambda_2$. We need to check that $\langle \vec{v}, \vec{w} \rangle = 0$. Since $\langle \vec{v}, \vec{w} \rangle = \vec{v}^T \vec{w}$,
$$\lambda_1 \vec{v}^T \vec{w} = (\lambda_1 \vec{v})^T \vec{w} = (A\vec{v})^T \vec{w} = \vec{v}^T A^T \vec{w} = \vec{v}^T A \vec{w} = \vec{v}^T \lambda_2 \vec{w} = \lambda_2 \vec{v}^T \vec{w}.$$
So $\lambda_1 \vec{v}^T \vec{w} = \lambda_2 \vec{v}^T \vec{w}$, and since $\lambda_1 \neq \lambda_2$, we conclude that $\vec{v}^T \vec{w} = 0$. In this computation, we used the fact that $A$ is symmetric (where?) and the fact that $\vec{w}$ is an eigenvector (where?).

We'll mostly be interested in symmetric matrices of the form $A^T A$, where $A$ is any $m \times n$ matrix. Remember that all such matrices are symmetric, because
$$(A^T A)^T = A^T (A^T)^T = A^T A.$$

Fact 2.3. For any $m \times n$ matrix $A$, the eigenvalues of the symmetric matrix $A^T A$ are all non-negative (real) numbers.

This is again easy to check. If $\lambda$ is an eigenvalue of $A^T A$, then we can always find a (non-zero) eigenvector $\vec{v}$ associated with $\lambda$, and dividing $\vec{v}$ by $\|\vec{v}\|$ yields a length-one eigenvector. So let's just assume that $\|\vec{v}\| = 1$ and $A^T A \vec{v} = \lambda \vec{v}$. Then we have
$$\|A\vec{v}\|^2 = \langle A\vec{v}, A\vec{v} \rangle = (A\vec{v})^T A\vec{v} = \vec{v}^T (A^T A \vec{v}) = \vec{v}^T (\lambda \vec{v}) = \lambda \langle \vec{v}, \vec{v} \rangle = \lambda.$$
So $\lambda = \|A\vec{v}\|^2$, which is a non-negative real number. In this computation, we used the fact that $\vec{v}$ is an eigenvector of $A^T A$ (where?) and the fact that $\|\vec{v}\| = 1$ (where?).

Exercise 3: Write each of the following symmetric matrices in the form $SDS^{-1}$ with $D$ diagonal. In the second case, the eigenvalues are $-1$ and $11$.
$$\begin{pmatrix} 5/2 & 1/2 & 0 \\ 1/2 & 5/2 & 0 \\ 0 & 0 & 5 \end{pmatrix}, \qquad \begin{pmatrix} 3 & 4 & 4 \\ 4 & 3 & 4 \\ 4 & 4 & 3 \end{pmatrix}$$

3. The Singular Value Decomposition

Lots of matrices that arise in practice are not diagonalizable, and are often not even square. However, there is something sort of similar to diagonalization that works for any $m \times n$ matrix.

We will call a square matrix orthogonal if its columns are orthonormal.

Exercise 4: Explain the following statement: if $A$ is an orthogonal $n \times n$ matrix, then $A$ is invertible and $A^T = A^{-1}$. (This came up when we discussed the QR factorization.)
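Before turning to the SVD, here is a small numerical illustration of Facts 2.2 and 2.3 and of the statement in Exercise 4. Again this is a sketch that is not part of the original notes, assuming Python with NumPy; numpy.linalg.eigh is NumPy's eigensolver for symmetric matrices.

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 3))    # an arbitrary 5 x 3 matrix

    M = A.T @ A                        # A^T A is 3 x 3
    print(np.allclose(M, M.T))         # True: (A^T A)^T = A^T A

    # eigh returns real eigenvalues in increasing order and
    # orthonormal eigenvectors as the columns of Q.
    eigvals, Q = np.linalg.eigh(M)
    print(np.all(eigvals >= -1e-12))         # True up to roundoff: Fact 2.3
    print(np.allclose(Q.T @ Q, np.eye(3)))   # True: Q is orthogonal, so Q^T = Q^{-1}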
Definition 3.1. A Singular Value Decomposition of an $m \times n$ matrix $A$ is an expression
$$A = U \Sigma V^T$$
where $U$ is an $m \times m$ matrix with orthonormal columns, $V$ is an $n \times n$ matrix with orthonormal columns, and $\Sigma = (\sigma_{i,j})$ is an $m \times n$ matrix with $\sigma_{i,j} = 0$ for $i \neq j$ and
$$\sigma_{1,1} \geq \sigma_{2,2} \geq \sigma_{3,3} \geq \cdots \geq 0.$$

Example: Here is an example of a SVD:
$$\begin{pmatrix} 6 & 30 & -21 \\ 17 & 10 & -22 \end{pmatrix} = \begin{pmatrix} 4/5 & -3/5 \\ 3/5 & 4/5 \end{pmatrix} \begin{pmatrix} 45 & 0 & 0 \\ 0 & 15 & 0 \end{pmatrix} \begin{pmatrix} 1/3 & 2/3 & -2/3 \\ 2/3 & -2/3 & -1/3 \\ 2/3 & 1/3 & 2/3 \end{pmatrix}.$$

Exercise 5: Check that the above decomposition is a Singular Value Decomposition. (You need to check that the left-hand matrix in the decomposition has orthonormal columns, that the rows of the right-hand matrix are orthonormal, and that the middle matrix is "diagonal" with decreasing, positive entries on the diagonal. Of course no work is required to check this third condition.)

Here are the key facts about the SVD:

Theorem 3.2. Every $m \times n$ matrix $A$ admits (many) Singular Value Decompositions.

Fact 3.3. If $A = U \Sigma V^T$ is a Singular Value Decomposition of an $m \times n$ matrix $A$, then
• The numbers $\sigma_{i,i}$ are the square roots of the eigenvalues of $A^T A$, repeated according to their multiplicities as roots of the characteristic polynomial of $A^T A$.
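As a concrete check of Fact 3.3 on the example above, one can compare the singular values returned by numpy.linalg.svd (which come out in decreasing order) with the eigenvalues of $A^T A$. This is a NumPy sketch, not part of the original notes:

    import numpy as np

    A = np.array([[ 6.0, 30.0, -21.0],
                  [17.0, 10.0, -22.0]])

    # svd returns U (2x2), the singular values, and V^T (3x3).
    U, sigma, Vt = np.linalg.svd(A)
    print(sigma)                         # approximately [45. 15.]

    # Rebuild the m x n "diagonal" middle matrix and reconstruct A.
    Sigma = np.zeros((2, 3))
    Sigma[:2, :2] = np.diag(sigma)
    print(np.allclose(U @ Sigma @ Vt, A))   # True

    # Fact 3.3: singular values are square roots of eigenvalues of A^T A.
    eigvals = np.linalg.eigh(A.T @ A)[0]    # increasing order
    print(np.sqrt(np.abs(eigvals))[::-1])   # approximately [45. 15. 0.]

(The $U$ and $V$ computed by NumPy may differ from the matrices in the example by signs or by the ordering of columns sharing a singular value, which is consistent with Theorem 3.2's "many" decompositions.)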