Symmetric matrices
Brian Krummel November 2, 2019
Definition 1. A matrix A is symmetric if A^T = A. Notice that a symmetric matrix is necessarily a square matrix.

Example 1. The matrix

        [ 0  1  2 ]
    A = [ 1  4  3 ]
        [ 2  3  8 ]

is symmetric.

The main reason we are interested in symmetric matrices is that such matrices are always diagonalizable, and in a special form.

Theorem 1. Let A be an n × n symmetric matrix. Suppose u and v are eigenvectors of A with corresponding eigenvalues λ and µ with λ ≠ µ:
    Au = λu,    Av = µv.    (1)
Then u and v are orthogonal.
Reason. First let's notice that for any two vectors x and y in R^n,

    (Ax) • y = x • (A^T y).    (2)
This follows from the computation
    (Ax) • y = (Ax)^T y = x^T A^T y = x^T (A^T y) = x • (A^T y).
Since A is symmetric, i.e. A^T = A, (2) gives us that for any two vectors x and y in R^n,

    (Ax) • y = x • (Ay).    (3)
Now taking x = u and y = v and using (1),
    λ u • v = (Au) • v = u • (Av) = µ u • v,

that is, (λ − µ) u • v = 0. Since λ ≠ µ, we must have u • v = 0, so that u and v are orthogonal.
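A quick numerical sanity check of identity (3) and of Theorem 1, using a small symmetric matrix chosen for illustration (it happens to be the matrix of Example 2 below, whose eigenvectors for the distinct eigenvalues 0 and −5 can be verified by hand):

```python
# Sanity check of (3) and Theorem 1 for one particular symmetric matrix.

def mat_vec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

A = [[1, 0, 2],
     [0, -5, 0],
     [2, 0, 4]]

# u and v are eigenvectors of A with eigenvalues 0 and -5 respectively.
u, v = [-2, 0, 1], [0, 1, 0]
assert mat_vec(A, u) == [0, 0, 0]      # Au = 0 * u
assert mat_vec(A, v) == [0, -5, 0]     # Av = -5 * v

# Identity (3): (Ax) . y = x . (Ay) for symmetric A, any x and y.
x, y = [1, 2, 3], [4, 5, 6]
assert dot(mat_vec(A, x), y) == dot(x, mat_vec(A, y))

# Theorem 1: eigenvectors for distinct eigenvalues are orthogonal.
print(dot(u, v))  # 0
```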
One consequence of Theorem 1 is that the eigenspaces of A are mutually orthogonal. Thus for each eigenspace we can find an orthogonal basis, and these bases together form an orthogonal set of eigenvectors of A.
Definition 2. An n × n matrix A is orthogonally diagonalizable if there is a diagonal matrix D and an orthogonal matrix P, i.e. P^-1 = P^T, such that
A = PDP T .
Notice that an orthogonally diagonalizable matrix A can be characterized by the fact that R^n has an orthonormal basis of eigenvectors of A, namely the columns of P.
Theorem 2. An n × n matrix A is orthogonally diagonalizable if and only if A is symmetric.
Theorem 2 is pretty remarkable: from Section 5 we expect some matrices to be non-diagonalizable, and it generally takes work to show that a given matrix is diagonalizable. Theorem 2 asserts that symmetric matrices are always diagonalizable, and orthogonally so.

Check. Let's verify the easier direction, that orthogonally diagonalizable matrices are symmetric. If A is orthogonally diagonalizable, that is A = PDP^T where D is a diagonal matrix and P is an orthogonal matrix, then A^T = (PDP^T)^T = (P^T)^T D^T P^T. Clearly (P^T)^T = P. Since D is diagonal, D^T = D. Hence
    A^T = (P^T)^T D^T P^T = P D P^T = A.
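The computation above can be checked numerically. Here is a minimal sketch: we build an orthogonal matrix P (a 2 × 2 rotation, with an arbitrarily chosen angle) and a diagonal D, and confirm that PDP^T equals its own transpose up to rounding error.

```python
import math

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(M):
    return [list(col) for col in zip(*M)]

t = 0.7  # arbitrary rotation angle
P = [[math.cos(t), -math.sin(t)],
     [math.sin(t),  math.cos(t)]]   # orthogonal: P^-1 = P^T
D = [[2, 0],
     [0, 3]]                        # diagonal

A = mat_mul(mat_mul(P, D), transpose(P))
At = transpose(A)

# A = PDP^T should be symmetric (up to floating-point rounding).
assert all(abs(A[i][j] - At[i][j]) < 1e-12
           for i in range(2) for j in range(2))
print("PDP^T is symmetric")
```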
To prove the converse – if A is symmetric then A is orthogonally diagonalizable – is harder. The basic idea is to make an inductive argument, looking at 2 × 2 matrices, then 3 × 3, 4 × 4, etc. Let A be an n × n symmetric matrix. We can associate with the symmetric matrix A the continuous function

    f(x) = x • Ax = Σ_{i,j=1}^n a_ij x_i x_j

regarded as a function of unit vectors x on the unit sphere in R^n. It's a basic fact that continuous functions on the unit sphere {x : ||x|| = 1} always attain their maximum value (this is like how continuous functions on a closed interval attain their maximum value). Using Lagrange multipliers, we can show that f attains its maximum value at an eigenvector x of A, and that the maximum value f(x) = λ, the corresponding eigenvalue. After rotating, we may assume x = e_n and focus on how multiplication by A and its associated function f behave in the x1 x2 ··· x_{n-1}-plane, thereby reducing to the case of an (n − 1) × (n − 1) symmetric matrix, which we show is diagonalizable.
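The variational idea can be illustrated numerically. As an example (using the symmetric matrix from Example 2 below, whose largest eigenvalue is 5 with eigenvector (1, 0, 2)/√5), f attains the value 5 at that eigenvector, and no random unit vector ever exceeds it:

```python
import math
import random

# f(x) = x . Ax restricted to the unit sphere, for one symmetric matrix.
A = [[1, 0, 2],
     [0, -5, 0],
     [2, 0, 4]]

def f(x):
    Ax = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    return sum(xi * yi for xi, yi in zip(x, Ax))

# f attains the value 5 at the eigenvector of the largest eigenvalue...
s = math.sqrt(5)
assert abs(f([1 / s, 0, 2 / s]) - 5) < 1e-12

# ...and no random unit vector exceeds that value.
random.seed(0)
for _ in range(10000):
    v = [random.gauss(0, 1) for _ in range(3)]
    n = math.sqrt(sum(c * c for c in v))
    x = [c / n for c in v]  # random point on the unit sphere
    assert f(x) <= 5 + 1e-9
print("max of f on the unit sphere is the largest eigenvalue")
```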
At this point we have shown that symmetric matrices have real eigenvalues and are orthogonally diagonalizable. Collecting these observations into one theorem:
Theorem 3 (Spectral Theorem for Symmetric Matrices). An n × n symmetric matrix A has the following properties:
(a) A has n real eigenvalues, counting multiplicity.
(b) The dimension of the eigenspace corresponding to each eigenvalue λ equals the multiplicity of λ as a root of the characteristic equation.
(c) The eigenspaces are mutually orthogonal in the sense that eigenvectors corresponding to different eigenvalues are orthogonal.
(d) A is orthogonally diagonalizable.
Example 2. Let’s find the eigenvectors and eigenvalues of the matrix
        [ 1  0  2 ]
    A = [ 0 −5  0 ].
        [ 2  0  4 ]
Find eigenvalues. We know the eigenvalues λ are roots of the characteristic polynomial
    det(A − λI) = | 1−λ    0     2  |
                  |  0   −5−λ    0  |
                  |  2     0   4−λ  |

                = (1 − λ)(−5 − λ)(4 − λ) − 2^2 (−5 − λ)
                = −λ^3 + 21λ − 20 + 20 + 4λ
                = −λ^3 + 25λ
                = −λ(λ^2 − 25)
                = −λ(λ − 5)(λ + 5).
Therefore the eigenvalues are λ = 0, 5, −5. In particular, the eigenvalues are all real numbers, and since there are three distinct eigenvalues, the matrix A is diagonalizable.

Find eigenvectors corresponding to λ = 0.
        [ 1  0  2 ]     [ 1  0  2 ]
    A = [ 0 −5  0 ]  →  [ 0  1  0 ].
        [ 2  0  4 ]     [ 0  0  0 ]
Thus x3 is a free variable, x1 = −2x3, and x2 = 0. Letting x3 = 1 we obtain the eigenvector

    [ −2 ]
    [  0 ].
    [  1 ]
Find eigenvectors corresponding to λ = −5.
             [ 6  0  2 ]     [ 1  0  0 ]
    A + 5I = [ 0  0  0 ]  →  [ 0  0  1 ].
             [ 2  0  9 ]     [ 0  0  0 ]
Thus x2 is a free variable and x1 = x3 = 0. Letting x2 = 1 we obtain the eigenvector

    [ 0 ]
    [ 1 ].
    [ 0 ]
Find eigenvectors corresponding to λ = 5.
             [ −4   0   2 ]     [ 2  0 −1 ]
    A − 5I = [  0 −10   0 ]  →  [ 0  1  0 ].
             [  2   0  −1 ]     [ 0  0  0 ]
Thus x3 is a free variable, x1 = (1/2)x3, and x2 = 0. Letting x3 = 2 we obtain the eigenvector

    [ 1 ]
    [ 0 ].
    [ 2 ]
Notice that the set of eigenvectors we found,

    [ −2 ]   [ 0 ]   [ 1 ]
    [  0 ],  [ 1 ],  [ 0 ],
    [  1 ]   [ 0 ]   [ 2 ]

is an orthogonal basis for R^3.

Orthogonally diagonalize A. A = PDP^T, where D is the diagonal matrix whose diagonal entries are the eigenvalues 0, −5, 5 and P is the orthogonal matrix whose columns are the corresponding eigenvectors normalized to be unit vectors:

        [ 0   0  0 ]         [ −2/√5  0  1/√5 ]
    D = [ 0  −5  0 ]     P = [   0    1    0  ].
        [ 0   0  5 ]         [  1/√5  0  2/√5 ]
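We can check the diagonalization of Example 2 numerically. The sketch below confirms that P is orthogonal (P^T P is the identity) and that PDP^T reproduces A, up to rounding error:

```python
import math

def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(M):
    return [list(col) for col in zip(*M)]

s = math.sqrt(5)
D = [[0, 0, 0],
     [0, -5, 0],
     [0, 0, 5]]
P = [[-2 / s, 0, 1 / s],
     [0, 1, 0],
     [1 / s, 0, 2 / s]]

# P is orthogonal: P^T P should be the identity (up to rounding).
PtP = mat_mul(transpose(P), P)
assert all(abs(PtP[i][j] - (1 if i == j else 0)) < 1e-12
           for i in range(3) for j in range(3))

# PDP^T should reproduce the original matrix A.
A = mat_mul(mat_mul(P, D), transpose(P))
expected = [[1, 0, 2], [0, -5, 0], [2, 0, 4]]
assert all(abs(A[i][j] - expected[i][j]) < 1e-12
           for i in range(3) for j in range(3))
print("A = PDP^T verified")
```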
One remarkable thing to follow from the Spectral Theorem is the spectral decomposition of an n × n symmetric matrix A. Suppose for simplicity that A is a 3 × 3 matrix and that R^3 has an orthonormal basis of eigenvectors u1, u2, u3 of A with corresponding eigenvalues λ1, λ2, λ3. Then A = PDP^T where

                             [ λ1  0   0  ]
    P = [ u1  u2  u3 ]   D = [ 0   λ2  0  ].
                             [ 0   0   λ3 ]
Thus
    A = PDP^T

                         [ λ1  0   0  ] [ u1^T ]
      = [ u1  u2  u3 ]   [ 0   λ2  0  ] [ u2^T ]
                         [ 0   0   λ3 ] [ u3^T ]

                         [ λ1 u1^T ]
      = [ u1  u2  u3 ]   [ λ2 u2^T ]
                         [ λ3 u3^T ]

      = λ1 u1 u1^T + λ2 u2 u2^T + λ3 u3 u3^T.
The spectral decomposition of A is
    A = λ1 u1 u1^T + λ2 u2 u2^T + λ3 u3 u3^T.
Here each ui ui^T is an (n × 1) · (1 × n) = n × n matrix, and in fact ui ui^T is the standard matrix for the orthogonal projection onto the line spanned by the eigenvector ui. Of course, some eigenvalues may repeat. For instance, if λ1 = λ2 and λ1 ≠ λ3, then
    A = λ1 (u1 u1^T + u2 u2^T) + λ3 u3 u3^T,
where u1 u1^T + u2 u2^T is the orthogonal projection onto the eigenspace corresponding to λ1 and u3 u3^T is the orthogonal projection onto the eigenspace corresponding to λ3. For a general n × n symmetric matrix A with an orthonormal basis of eigenvectors u1, u2, ..., un and corresponding eigenvalues λ1, λ2, ..., λn, the spectral decomposition is
    A = λ1 u1 u1^T + λ2 u2 u2^T + ··· + λn un un^T.
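The outer-product formula can be checked directly. Using the orthonormal eigenvectors found for the matrix of Example 2 (eigenvalues 0, −5, 5), the weighted sum of outer products should reproduce A:

```python
import math

def outer(u, v):
    """Outer product u v^T as a list of rows."""
    return [[a * b for b in v] for a in u]

def add_scaled(M, c, N):
    """Return M + c * N entrywise."""
    return [[m + c * n for m, n in zip(rm, rn)] for rm, rn in zip(M, N)]

s = math.sqrt(5)
u1 = [-2 / s, 0, 1 / s]   # eigenvalue 0
u2 = [0, 1, 0]            # eigenvalue -5
u3 = [1 / s, 0, 2 / s]    # eigenvalue 5

# S = 0 * u1 u1^T - 5 * u2 u2^T + 5 * u3 u3^T
S = [[0.0] * 3 for _ in range(3)]
for lam, u in [(0, u1), (-5, u2), (5, u3)]:
    S = add_scaled(S, lam, outer(u, u))

A = [[1, 0, 2], [0, -5, 0], [2, 0, 4]]
assert all(abs(S[i][j] - A[i][j]) < 1e-12
           for i in range(3) for j in range(3))
print("spectral decomposition reproduces A")
```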
Example 2 continued. Let’s find the spectral decomposition of
        [ 1  0  2 ]
    A = [ 0 −5  0 ].
        [ 2  0  4 ]
Then

    A = 0 · u1 u1^T − 5 u2 u2^T + 5 u3 u3^T,

where

              1 [ −2 ]                 1 [  4  0 −2 ]
    u1 u1^T = - [  0 ] [ −2  0  1 ]  = - [  0  0  0 ]
              5 [  1 ]                 5 [ −2  0  1 ]

is the orthogonal projection map onto the eigenspace corresponding to λ = 0 (and admittedly this term does not matter since the eigenvalue is zero),
              [ 0 ]               [ 0  0  0 ]
    u2 u2^T = [ 1 ] [ 0  1  0 ] = [ 0  1  0 ]
              [ 0 ]               [ 0  0  0 ]
is the orthogonal projection map onto the eigenspace corresponding to λ = −5, and

              1 [ 1 ]               1 [ 1  0  2 ]
    u3 u3^T = - [ 0 ] [ 1  0  2 ] = - [ 0  0  0 ]
              5 [ 2 ]               5 [ 2  0  4 ]

is the orthogonal projection map onto the eigenspace corresponding to λ = 5.

Example 3. Let's find the eigenvectors and eigenvalues of the matrix

        [  1 −2  4 ]
    A = [ −2  4  2 ].
        [  4  2  1 ]

Note that the eigenvalues of A are 5 with multiplicity two and −4 with multiplicity one.

Find eigenvectors corresponding to λ = 5.

             [ −4 −2  4 ]     [ 2  1 −2 ]
    A − 5I = [ −2 −1  2 ]  →  [ 0  0  0 ].
             [  4  2 −4 ]     [ 0  0  0 ]
Thus x2, x3 are free variables and x1 = (−1/2)x2 + x3. Thus

        x2 [ −1 ]      [ 1 ]
    x = -- [  2 ] + x3 [ 0 ].
        2  [  0 ]      [ 1 ]

Hence

    [ −1 ]   [ 1 ]
    [  2 ],  [ 0 ]
    [  0 ]   [ 1 ]

is a basis for the eigenspace corresponding to λ = 5. Applying Gram–Schmidt, we subtract from the second vector its projection onto the line spanned by the first:

    [ 1 ]     −1  [ −1 ]     1 [ 4 ]
    [ 0 ]  −  --  [  2 ]  =  - [ 2 ],
    [ 1 ]      5  [  0 ]     5 [ 5 ]

so (dropping the factor of 1/5)

    [ −1 ]   [ 4 ]
    [  2 ],  [ 2 ]
    [  0 ]   [ 5 ]

is an orthogonal basis for the eigenspace corresponding to λ = 5.

Find eigenvectors corresponding to λ = −4.

             [  5 −2  4 ]  R1−R3 → R1  [  1 −4 −1 ]  R2+2R1 → R2  [ 1 −4 −1 ]
    A + 4I = [ −2  8  2 ]  ─────────→  [ −2  8  2 ]  ─────────→   [ 0  0  0 ]
             [  4  2  5 ]              [  4  2  5 ]  R3−4R1 → R3  [ 0 18  9 ]

        [ 1 −4 −1 ]  R1+2R2 → R1  [ 1  0  1 ]
     →  [ 0  2  1 ]  ─────────→   [ 0  2  1 ].
        [ 0  0  0 ]               [ 0  0  0 ]
Thus x3 is a free variable, x1 = −x3, and x2 = (−1/2)x3. Letting x3 = 2 we obtain the eigenvector

    [ −2 ]
    [ −1 ].
    [  2 ]

Notice that the set of eigenvectors we found,

    [ −1 ]   [ 4 ]   [ −2 ]
    [  2 ],  [ 2 ],  [ −1 ],
    [  0 ]   [ 5 ]   [  2 ]

is an orthogonal basis for R^3.

Orthogonally diagonalize A. Using the eigenvalues and eigenvectors we found above, A = PDP^T where

        [ 5  0  0 ]         [ −1/√5  4/(3√5)  −2/3 ]
    D = [ 0  5  0 ]     P = [  2/√5  2/(3√5)  −1/3 ].
        [ 0  0 −4 ]         [   0    5/(3√5)   2/3 ]
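Since the eigenvectors of Example 3 have integer entries, the eigendata can be checked exactly, with no rounding:

```python
# Exact integer check of the eigendata found in Example 3.
A = [[1, -2, 4],
     [-2, 4, 2],
     [4, 2, 1]]

def mat_vec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

eig = [(5, [-1, 2, 0]), (5, [4, 2, 5]), (-4, [-2, -1, 2])]
for lam, v in eig:
    assert mat_vec(A, v) == [lam * c for c in v]   # A v = lambda v

# The three eigenvectors are pairwise orthogonal.
vecs = [v for _, v in eig]
assert dot(vecs[0], vecs[1]) == 0
assert dot(vecs[0], vecs[2]) == 0
assert dot(vecs[1], vecs[2]) == 0
print("Example 3 eigendata verified")
```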