Symmetric matrices

Brian Krummel November 2, 2019

Definition 1. A matrix A is symmetric if A^T = A. Notice that a symmetric matrix is necessarily a square matrix.

Example 1. The matrix

        [ 0  1  2 ]
    A = [ 1  4  3 ]
        [ 2  3  8 ]

is symmetric.

The main reason we are interested in symmetric matrices is that such matrices are always diagonalizable, and in a special form.

Theorem 1. Let A be an n × n symmetric matrix. Suppose u and v are eigenvectors of A with corresponding eigenvalues λ and µ with λ ≠ µ:

Au = λu,   Av = µv.   (1)

Then u and v are orthogonal.

Reason. First let's notice that for any two vectors x and y in R^n,

(Ax) · y = x · (A^T y).   (2)

This follows from the computation

(Ax) · y = (Ax)^T y = x^T A^T y = x^T (A^T y) = x · (A^T y).

Since A is symmetric, i.e. A^T = A, (2) gives us that for any two vectors x and y in R^n,

(Ax) · y = x · (Ay).   (3)

Now taking x = u and y = v and using (1),

λ u · v = (Au) · v = u · (Av) = µ u · v,

that is, (λ − µ) u · v = 0. Since λ ≠ µ, we must have u · v = 0, so that u and v are orthogonal.

One consequence of Theorem 1 is that the eigenspaces of A are mutually orthogonal. Thus for each eigenspace we can find an orthogonal basis for the eigenspace, and together these bases form an orthogonal set of eigenvectors of A.
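Theorem 1 is easy to check numerically. Here is a minimal sketch using numpy and the matrix from Example 1 (eigh is numpy's eigensolver for symmetric matrices):

```python
import numpy as np

# The symmetric matrix from Example 1.
A = np.array([[0., 1., 2.],
              [1., 4., 3.],
              [2., 3., 8.]])

# eigh returns real eigenvalues and an orthonormal set of eigenvectors
# (as columns) for a symmetric matrix.
eigenvalues, eigenvectors = np.linalg.eigh(A)

u, v = eigenvectors[:, 0], eigenvectors[:, 1]
print(eigenvalues)   # three distinct real eigenvalues
print(u @ v)         # approximately 0, as Theorem 1 predicts
```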

Definition 2. An n × n matrix A is orthogonally diagonalizable if there is a diagonal matrix D and an orthogonal matrix P, i.e. P^{-1} = P^T, such that

A = PDP^T.

Notice that an orthogonally diagonalizable matrix A can be characterized by the fact that R^n has an orthonormal basis of eigenvectors of A, namely the columns of P.

Theorem 2. An n × n matrix A is orthogonally diagonalizable if and only if A is symmetric.

Theorem 2 is pretty remarkable, as from Section 5 we expect some matrices to be non-diagonalizable, and thus it generally takes work to show that a matrix is diagonalizable. Theorem 2 asserts that in the special case of symmetric matrices, diagonalizability is automatic. Let's check the easier direction, that if A is orthogonally diagonalizable then A is symmetric.

Check. Let's verify that orthogonally diagonalizable matrices are symmetric. If A is orthogonally diagonalizable, that is, A = PDP^T where D is a diagonal matrix and P is an orthogonal matrix, then A^T = (PDP^T)^T = (P^T)^T D^T P^T. Clearly (P^T)^T = P. Since D is diagonal, D^T = D. Hence

A^T = (P^T)^T D^T P^T = PDP^T = A.
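A quick numerical sanity check of this direction (a sketch: the orthogonal matrix P below is produced by a QR factorization of a random matrix, one standard way to generate an orthogonal matrix):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build an orthogonal P from a QR factorization of a random matrix.
P, _ = np.linalg.qr(rng.standard_normal((3, 3)))
D = np.diag([2., -1., 7.])   # any diagonal matrix

A = P @ D @ P.T              # orthogonally diagonalizable by construction
print(np.allclose(A, A.T))   # True: A is symmetric
```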

To prove the converse – if A is symmetric then A is orthogonally diagonalizable – is harder. The basic idea is to make a recursive argument looking at 2 × 2 matrices, then 3 × 3, 4 × 4, etc. Let A be an n × n symmetric matrix. We can associate with the symmetric matrix A the continuous function

f(x) = x · Ax = ∑_{i,j=1}^{n} a_{ij} x_i x_j

as a function of unit vectors x on the unit sphere in R^n. It's a basic fact that continuous functions on the unit sphere {x : ‖x‖ = 1} always attain their maximum value (this is like how continuous functions on a closed interval attain their maximum value). Using Lagrange multipliers, we can show that f attains its maximum value at an eigenvector x and that the maximum value f(x) = λ, the corresponding eigenvalue. After rotating, we may assume x = e_n and focus on how multiplication by A and its associated function f behave in the x_1 x_2 ··· x_{n−1}-plane, thereby reducing to the case of an (n − 1) × (n − 1) symmetric matrix, which we show is diagonalizable.
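The variational characterization of the largest eigenvalue sketched above can be illustrated numerically. Below is a rough Monte Carlo sketch: sampling random unit vectors x only approximates the true maximum of f from below, while the maximum itself is the largest eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[0., 1., 2.],
              [1., 4., 3.],
              [2., 3., 8.]])

# Sample random unit vectors x and evaluate f(x) = x . Ax for each one.
X = rng.standard_normal((100_000, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)
f = np.einsum('ij,jk,ik->i', X, A, X)

print(f.max())                      # close to (but at most) the largest eigenvalue
print(np.linalg.eigvalsh(A).max())  # the largest eigenvalue itself
```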

At this point we have shown that symmetric matrices have real eigenvalues and are orthogonally diagonalizable. Collecting these observations into one theorem:

Theorem 3 (Spectral Theorem for Symmetric Matrices). An n × n symmetric matrix A has the following properties:

(a) A has n real eigenvalues, counting multiplicity.

(b) The dimension of the eigenspace corresponding to each eigenvalue λ equals the multiplicity of λ as a root of the characteristic equation.

(c) The eigenspaces are mutually orthogonal in the sense that eigenvectors corresponding to different eigenvalues are orthogonal.

(d) A is orthogonally diagonalizable.
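In numerical practice these properties are packaged into symmetric eigensolvers such as numpy's eigh. A minimal sketch, again using the matrix from Example 1:

```python
import numpy as np

A = np.array([[0., 1., 2.],
              [1., 4., 3.],
              [2., 3., 8.]])

# (a) real eigenvalues; (d) an orthogonal P with A = P D P^T.
w, P = np.linalg.eigh(A)
D = np.diag(w)

print(np.allclose(P.T @ P, np.eye(3)))   # True: columns of P are orthonormal
print(np.allclose(P @ D @ P.T, A))       # True: A = P D P^T
```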

Example 2. Let’s find the eigenvectors and eigenvalues of the matrix

        [ 1  0  2 ]
    A = [ 0 −5  0 ]
        [ 2  0  4 ]

Find eigenvalues. We know the eigenvalues λ are roots of the characteristic polynomial

                  | 1 − λ     0      2    |
    det(A − λI) = |   0    −5 − λ    0    |
                  |   2       0    4 − λ  |

                = (1 − λ)(−5 − λ)(4 − λ) − 2²(−5 − λ)
                = −λ³ + 21λ − 20 + 20 + 4λ
                = −λ³ + 25λ
                = −λ(λ² − 25)
                = −λ(λ − 5)(λ + 5).

Therefore the eigenvalues are λ = 0, 5, −5. In particular, the eigenvalues are all real numbers. Since there are three distinct eigenvalues, the matrix A is diagonalizable.
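As a sketch of how to double-check this by machine, one can feed the coefficients of −λ³ + 25λ to np.roots and compare with a symmetric eigensolver:

```python
import numpy as np

# Coefficients of -λ^3 + 0λ^2 + 25λ + 0, in descending order.
print(np.roots([-1., 0., 25., 0.]))   # the roots 0, 5, -5 in some order

A = np.array([[1., 0., 2.],
              [0., -5., 0.],
              [2., 0., 4.]])
print(np.linalg.eigvalsh(A))          # eigenvalues -5, 0, 5 (sorted)
```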

 1 0 2   1 0 2  A =  0 −5 0  →  0 1 0  . 2 0 4 0 0 0

Thus x3 is a free variable, x1 = −2x3, x2 = 0. Letting x3 = 1 we obtain the eigenvector

 −2   0  . 1

Find eigenvectors corresponding to λ = −5.

 6 0 2   1 0 1  A + 5I =  0 0 0  →  0 0 0  . 2 0 9 0 0 0

Thus x2 is a free variable and x1 = x3 = 0. Letting x2 = 1 we obtain the eigenvector

 0   1  . 0

Find eigenvectors corresponding to λ = 5.

 −4 0 2   2 0 −1  A − 5I =  0 −10 0  →  0 1 0  . 2 0 −1 0 0 0

Thus x3 is a free variable, x1 = (1/2) x3, and x2 = 0. Letting x3 = 2 we obtain the eigenvector

 1   0  . 2

Notice that the set of eigenvectors we found,

      [ −2 ]   [ 0 ]   [ 1 ]
    { [  0 ],  [ 1 ],  [ 0 ] },
      [  1 ]   [ 0 ]   [ 2 ]

is an orthogonal basis for R^3.

Orthogonally diagonalize A. A = PDP^T where D is the diagonal matrix whose diagonal entries are the eigenvalues 0, −5, 5 and P is the orthogonal matrix whose columns are the corresponding eigenvectors normalized to be unit vectors:

        [ 0   0  0 ]         [ −2/√5  0  1/√5 ]
    D = [ 0  −5  0 ]     P = [   0    1    0  ].
        [ 0   0  5 ]         [  1/√5  0  2/√5 ]
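A quick numerical check of this factorization (a sketch, with the columns of P entered exactly as computed above):

```python
import numpy as np

A = np.array([[1., 0., 2.],
              [0., -5., 0.],
              [2., 0., 4.]])

s = np.sqrt(5.)
P = np.array([[-2/s, 0., 1/s],
              [0.,   1., 0.],
              [1/s,  0., 2/s]])
D = np.diag([0., -5., 5.])

print(np.allclose(P.T, np.linalg.inv(P)))   # True: P is orthogonal
print(np.allclose(P @ D @ P.T, A))          # True: A = P D P^T
```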

One remarkable thing to follow from the Spectral Theorem is the spectral decomposition of an n × n symmetric matrix A. Suppose for simplicity that A is a 3 × 3 matrix and that A has an orthonormal basis of eigenvectors u1, u2, u3 with corresponding eigenvalues λ1, λ2, λ3. Then A = PDP^T where

                             [ λ1   0   0 ]
    P = [ u1  u2  u3 ]   D = [ 0   λ2   0 ].
                             [ 0    0  λ3 ]

Thus

    A = PDP^T

                        [ λ1   0   0 ] [ u1^T ]
      = [ u1  u2  u3 ]  [ 0   λ2   0 ] [ u2^T ]
                        [ 0    0  λ3 ] [ u3^T ]

                        [ λ1 u1^T ]
      = [ u1  u2  u3 ]  [ λ2 u2^T ]
                        [ λ3 u3^T ]

      = λ1 u1 u1^T + λ2 u2 u2^T + λ3 u3 u3^T.

The spectral decomposition of A is

A = λ1 u1 u1^T + λ2 u2 u2^T + λ3 u3 u3^T.

Here ui ui^T are (n × 1) · (1 × n) = n × n matrices, and in fact ui ui^T is the standard matrix for the orthogonal projection onto the line spanned by the eigenvector ui. Of course, some eigenvalues may repeat. For instance, if λ1 = λ2 and λ1 ≠ λ3, then

A = λ1 (u1 u1^T + u2 u2^T) + λ3 u3 u3^T

where u1 u1^T + u2 u2^T is the orthogonal projection onto the eigenspace corresponding to λ1 and u3 u3^T is the orthogonal projection onto the eigenspace corresponding to λ3. For a general n × n symmetric matrix A with an orthonormal basis of eigenvectors u1, u2, ..., un and corresponding eigenvalues λ1, λ2, ..., λn, the spectral decomposition is

A = λ1 u1 u1^T + λ2 u2 u2^T + ··· + λn un un^T.
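The spectral decomposition is easy to compute numerically: eigh supplies an orthonormal basis of eigenvectors, and np.outer builds each rank-one matrix ui ui^T. A minimal sketch:

```python
import numpy as np

A = np.array([[1., 0., 2.],
              [0., -5., 0.],
              [2., 0., 4.]])

w, U = np.linalg.eigh(A)   # columns of U are orthonormal eigenvectors

# Rebuild A as a sum of scaled rank-one projections λ_i u_i u_i^T.
A_rebuilt = sum(w[i] * np.outer(U[:, i], U[:, i]) for i in range(3))
print(np.allclose(A_rebuilt, A))   # True
```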

Example 2 continued. Let’s find the spectral decomposition of

 1 0 2  A =  0 −5 0  . 2 0 4

Then

    A = 0 u1 u1^T − 5 u2 u2^T + 5 u3 u3^T

where

                    [ −2 ]                      [  4  0 −2 ]
    u1 u1^T = (1/5) [  0 ] [ −2  0  1 ] = (1/5) [  0  0  0 ]
                    [  1 ]                      [ −2  0  1 ]

is the orthogonal projection map onto the eigenspace corresponding to λ = 0 (and admittedly this term does not matter since the eigenvalue is zero),

 0   0 0 0  1 u uT = 1  0 1 0  = 0 1 0 1 1   5   0 0 0 0

is the orthogonal projection map onto the eigenspace corresponding to λ = −5, and

                    [ 1 ]                      [ 1  0  2 ]
    u3 u3^T = (1/5) [ 0 ] [ 1  0  2 ] = (1/5)  [ 0  0  0 ]
                    [ 2 ]                      [ 2  0  4 ]

is the orthogonal projection map onto the eigenspace corresponding to λ = 5.

Example 3. Let's find the eigenvectors and eigenvalues of the matrix

        [  1 −2  4 ]
    A = [ −2  4  2 ]
        [  4  2  1 ]

Note that the eigenvalues of A are 5 with multiplicity two and −4 with multiplicity one.

Find eigenvectors corresponding to λ = 5.

             [ −4 −2  4 ]     [ 2  1 −2 ]
    A − 5I = [ −2 −1  2 ]  →  [ 0  0  0 ].
             [  4  2 −4 ]     [ 0  0  0 ]

Thus x2, x3 are free variables and x1 = (−1/2) x2 + x3. Thus

                 [ −1 ]      [ 1 ]
    x = (x2/2) · [  2 ] + x3 [ 0 ].
                 [  0 ]      [ 1 ]

Hence

      [ −1 ]   [ 1 ]
    { [  2 ],  [ 0 ] }
      [  0 ]   [ 1 ]

is a basis for the eigenspace corresponding to λ = 5. Applying Gram–Schmidt, we subtract from the second vector its projection onto the line spanned by the first:

    [ 1 ]          [ −1 ]         [ 4 ]
    [ 0 ] − (−1/5) [  2 ] = (1/5) [ 2 ]
    [ 1 ]          [  0 ]         [ 5 ]

so (dropping the factor of 1/5)

      [ −1 ]   [ 4 ]
    { [  2 ],  [ 2 ] }
      [  0 ]   [ 5 ]

is an orthogonal basis for the eigenspace corresponding to λ = 5.

Find eigenvectors corresponding to λ = −4. Replacing R1 by R1 − R3, then R2 by R2 + 2R1 and R3 by R3 − 4R1, then scaling R3 and swapping it into the second row, and finally replacing R1 by R1 + 2R2:

             [  5 −2  4 ]     [  1 −4 −1 ]     [ 1 −4 −1 ]     [ 1 −4 −1 ]     [ 1  0  1 ]
    A + 4I = [ −2  8  2 ]  →  [ −2  8  2 ]  →  [ 0  0  0 ]  →  [ 0  2  1 ]  →  [ 0  2  1 ].
             [  4  2  5 ]     [  4  2  5 ]     [ 0 18  9 ]     [ 0  0  0 ]     [ 0  0  0 ]

Thus x3 is a free variable and x1 = −x3 and x2 = (−1/2) x3. Letting x3 = 2 we obtain the eigenvector

    [ −2 ]
    [ −1 ]
    [  2 ]

Notice that the set of eigenvectors we found,

      [ −1 ]   [ 4 ]   [ −2 ]
    { [  2 ],  [ 2 ],  [ −1 ] },
      [  0 ]   [ 5 ]   [  2 ]

is an orthogonal basis for R^3.

Orthogonally diagonalize A. Using the eigenvalues and eigenvectors we found above, A = PDP^T where

        [ 5  0   0 ]         [ −1/√5  4/(3√5)  −2/3 ]
    D = [ 0  5   0 ]     P = [  2/√5  2/(3√5)  −1/3 ].
        [ 0  0  −4 ]         [   0    5/(3√5)   2/3 ]
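As before, a numerical sanity check of this factorization (a sketch; recall that ‖(4, 2, 5)‖ = 3√5 and ‖(−2, −1, 2)‖ = 3, which gives the normalizations below):

```python
import numpy as np

A = np.array([[1., -2., 4.],
              [-2., 4., 2.],
              [4., 2., 1.]])

s = np.sqrt(5.)
P = np.array([[-1/s, 4/(3*s), -2/3],
              [2/s,  2/(3*s), -1/3],
              [0.,   5/(3*s),  2/3]])
D = np.diag([5., 5., -4.])

print(np.allclose(P.T @ P, np.eye(3)))   # True: P is orthogonal
print(np.allclose(P @ D @ P.T, A))       # True: A = P D P^T
```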
