Lecture 25: 7.2 Orthogonal Diagonalization
Wei-Ta Chu
2011/12/21 Spectral Decomposition
¢ If A is a symmetric matrix that is orthogonally diagonalized by λ λ λ P=[ u1 u2 … un], and if 1, 2, …, n are the eigenvalues of A corresponding to the unit eigenvectors u1, u2, …, un, then we know that D=PTAP , where D is a diagonal matrix with the eigenvalues in the diagonal positions.
2 Spectral Decomposition
¢ Multiplying out, we obtain the formula
which is called a spectral decomposition of A. ¢ Each term of the spectral decomposition of A has the form where u is a unit eigenvector of A in column form, and λ is an eigenvalue of A corresponding to u. ¢ It can be proved that uu T is the standard matrix for the orthogonal projection of Rn on the subspace spanned by the vector u.
3 Spectral Decomposition
¢ The spectral decomposition of A tells that the image of a vector x under multiplication by a symmetric matrix A can be obtained by projecting x orthogonally on the lines determined by the eigenvectors of A, then scaling those projections by the eigenvalues, and then adding the scaled projections.
4 Example: Eigenface
Face database
u1 u2 u3 … Mean face Eigenvectors
5 Example of Face Reconstruction
x Ax
= -2181 +627 +389 + …
Reconstruction procedure
6 Example
λ λ ¢ The matrix has eigenvalues 1=-3 and 2=2 with corresponding eigenvectors x1=(1,-2) and x2=(2,1) ¢ Normalizing these basis vectors yields ¢ A spectral docomposition of A is
The standard matrices for the orthogonal projections onto the eigenspaces 7 corresponding to λ1=-3 and λ2=2 Example
¢ The image of the vector x=(1,1)
¢ These provide two different ways of viewing the image of the vector (1,1) under multiplication by A
8 Lecture 25: 7.3 Quadratic Forms
Wei-Ta Chu
2011/12/21 Quadratic Form ( 二次式)
¢ Up to now, we have been interested in linear equations
¢ It’s a function of n variables, called a linear form. ¢ We will be concerned with quadratic forms , which are functions of the form
¢ For example: cross-product terms
10 Quadratic Form
¢ Written in matrix form
¢ They are both of the form xTAx, where x is the column vector of variables, and A is symmetric matrix whose diagonal entries are the coefficients of the squared terms and whose entries off the main diagonal are half the coefficients of the cross-product terms. 11 Example
12 Symmetric Matrix
¢ Symmetric matrices are useful, but not essential, for representing quadratic forms. ¢ For example, the quadratic form 2 x2+6 xy -7y2 can be written as
where the coefficient 6 of the cross-product term has been split as 5+1 rather than 3+3, as in the symmetric representation.
13 Symmetric Matrix
¢ However, symmetric matrices are usually more convenient to work with, so it will always be understood that A is symmetric when we write a quadratic form as xTAx, even if not stated explicitly. ¢ When convenient, we can use Formula (7) of Section 4.1 to express a quadratic form xTAx in terms of the Euclidean inner product as
14 Problems
¢ Problem 1: if xTAx is a quadratic form on R2 or R3, what kind of curve or surface is represented by the equation xTAx=k? ¢ Problem 2: if xTAx is a quadratic form on Rn, what conditions must A satisfy for xTAx to have positive values for x ≠ 0? ¢ Problem 3: if xTAx is a quadratic form on Rn, what are its maximum and minimum values if x is constrained to satisfy || x|| = 1?
15 Change of Variable
¢ Simplify the quadratic form xTAx by making a substitution
x=Py. That expresses the variable x1, x2, …, xn in terms of new variables y1, y2, …, yn. ¢ If P is invertible, we call this change of variable . If P is orthogonal, then we call this orthogonal change of variable . ¢ We obtain: xTAx=( Py)TA(Py)= yTPTAP y=yT(PTAP )y ¢ Since the matrix B=PTAP is symmetric, the effect of the change of variable is to produce a new quadratic form yTBy.
16 Change of Variable
¢ If we choose P to orthogonally diagonalize A, then the new quadratic form will be yTDy, where D is a diagonal matrix with the eigenvalues of A on the main diagonal.
17 Theorem 7.3.1 The Principal Axes Theorem ¢ If A is a symmetric n by n matrix, then there is an orthogonal change of variable that transforms the quadratic form xTAx into a quadratic form yTDy with no cross product terms. Specifically, if P orthogonally diagonalize A, then making the change of variable x=Py in the quadratic form xTAx yields the quadratic form T T λ 2 λ 2 λ 2 x Ax=y Dy= 1y1 + 2y2 +…+ nyn λ λ λ in which 1, 2,…, n are the eigenvalues of A corresponding to the eigenvectors that form the successive columns of P.
18 Example
¢ Find an orthogonal change of variable of the quadratic form 2 2 Q=x1 -x3 -4x1x2+4 x2x3.
¢ The characteristic equation of the matrix A is
¢ The eigenvalues are 0, -3, 3. The orthonormal bases for the three eigenspace are
19 Example
¢ A substitution x=Py that eliminates the cross product terms is
¢ This produces the new quadratic form
20 Positive Definite ( 正定)
¢ Definition: A quadratic form xTAx is called positive definite if xTAx > 0 for all x ≠ 0, negative definite if xTAx < 0 for x ≠ 0 , indefinite if xTAx has both positive and negative values
¢ It’s called positive semidefinite if xTAx ≧ 0 for all x ≠ 0, and negative semidefinite if xTAx ≦ 0 for all x ≠ 0
21 Theorem 7.3.2
¢ If A is a symmetric matrix A, then T ß x Ax is positive definite if and only if all eigenvalues of A are positive. T ß x Ax is negative definite if and only if all eigenvalues of A are negative. T ß x Ax is indefinite if and only if A has at least one positive eigenvalue and at least one negative eigenvalue.
22 Example
¢ We showed that the symmetric matrix has eigenvalues and . Since these are positive, the matrix A is positive definite, and for all x ≠ 0,
23 Identifying Positive Definite Matrices
¢ Positive definite matrices are the most important symmetric matrices in applications. ¢ A method to determine whether a symmetric matrix is positive definite without finding the eigenvalues. ¢ The kth principal submatrix :
First principal submatrix Third principal submatrix Second principal submatrix Fourth principal submatrix
24 Theorem 7.3.4
¢ A symmetric matrix A is positive definite if and only if the determinant of every principal submatrix is positive. ¢ Example:
¢ We are guaranteed that all eigenvalues of A are positive and xTAx > 0 for x ≠ 0.
25 Exercises
¢ Sec. 7.1: 2, 6, 17, 22(True-False) ¢ Sec. 7.2: 5, 16(c), 18(a), 22(True-False) ¢ Sec. 7.3: 6, 12, 21, 25(b), 36(True-False)
26 Lecture 25: 8.1 General Linear Transformations
Wei-Ta Chu
2011/12/21 Definitions and Terminology
n m ¢ A matrix transformation TA: R →R is a mapping of the form TA(x) = Ax, in which A is an m by n matrix. ¢ Matrix transformations are precisely the linear transformations from Rn to Rm. The transformations with linearity properties T(u+v) = T(u) + T(v) and T(ku) = kT (u)
28 Definitions and Terminology
¢ If T: V → W is a function from a vector space V to a vector space W, then T is called a linear transformation from V to W if the following two properties hold for all vectors u and v in V and all scalars k:
ß (1) T(ku) = kT (u) [Homogeneity property]
ß (2) T(u+v) = T(u) + T(v) [Additivity property] ¢ In the special case where V = W, the linear transformation is called a linear operator on the vector space V.
29 Definitions and Terminology
¢ These properties can be used in combination
T(k1v1 + k2v2) = k1T(v1) + k2T(v2) ¢ More generally,
T(k1v1 + k2v2+ …+ krvr) = k1T(v1) + k2T(v2) + … + krT(vr) ¢ Theorem 8.1.1: If T: V → W is a linear transformation, then
ß (a) T(0) = 0
ß (b) T(u-v) = T(u) – T(v) for all u and v in V
30 Example
n m ¢ A matrix transformation TA: R → R is also a liner transformation in this more general sense with V = Rn and W = Rm ¢ The zero transformation: The mapping T: V → W such that T(v) = 0 for every v in V is a linear transformation called the zero transformation . T is linear: T(u+v) = 0, T(u) = 0, T(v) = 0, and T(ku) = 0 Therefore, T(u+v) = T(u) + T(v), and T(ku) = kT (u)
31 Example
¢ The identity operator: The mapping I: V → V defined by I(v) = v is called the identity operator on V. ¢ The mapping T: V → V given by T(x) = kx is a linear operator on V, for if c is any scalar and if u and v are any vectors in V, then T(cu) = k(cu) = c(ku) = cT (u) T(u+v) = k(u+v) = ku + kv = T(u) + T(v) ¢ If 0 < k < 1, then T is called the contraction of V with factor k, and if k > 1, it is called the dilation of V with factor k.
32 Example
n ¢ Let p = p(x) = c0 + c1x + … + cnx be a polynomial in Pn, and define the transformation T: Pn → Pn+1 by 2 n+1 T(p) = T(p(x)) = xp (x) = c0x + c1x + … + cnx ¢ This transformation is linear because for any scalar k and
any polynomial p1 and p2 in Pn we have T(kp) = T(kp (x)) = x(kp (x)) = k(xp (x)) = kT (p)
T(p1 + p2) = T(p1(x) + p2(x)) = x(p1(x) + p2(x)) = xp 1(x) + xp 2(x) = T(p1) + T(p2)
33 Example
¢ Let V be an inner product space, let v0 be any fixed vector in V, and let T: V → R be the transformation
T(x) = 〈x,v0〉 that maps a vector x into its inner product with v0. ¢ This transformation is linear, for if k is any scalar, and if u and v are any vectors in V, then
T(ku) = 〈ku,v0〉=k〈u,v0〉=kT (u)
T(u+v) = 〈u+v,v0〉=〈u,v0〉+〈v,v0〉=T(u) + T(v)
34 Example
¢ Let Mnn be the vector space of n by n matrices. The T mapping T1(A) = A is linear. T T T1(kA ) = ( kA ) = kA = kT 1(A) T T T T1(A+B) = ( A+B) = A + B = T1(A) + T1(B)
¢ The mapping T2(A) = det( A) is not linear. n n T2(kA ) = det( kA ) = k det( A) = k T2(A)
35 Example
¢ A linear transformation maps 0 to 0. This property is useful for identifying transformations that are not linear. 2 ¢ If x0 is a fixed nonzero vector in R , then the transformation T(x) = x + x0 has the geometric effect of translating each point x in a direction parallel to x0 through a distance of || x0||.
¢ This cannot be a linear transformation since T(0) = x0 so T does not map 0 to 0.
36 Example
¢ Let V be a subspace of , let x1, x2, …, xn be distinct real numbers, and let T: V → Rn be the
transformation T(f) = ( f(x1), f(x2), …, f(xn)) that associates with f the n-tuple of function values at x1, x2, …, xn.
¢ We call this the evaluation transformation on V at x1, x2, …, xn.
T(kf ) = ( kf (x1), kf (x2), …, kf (xn)) = k(f(x1), f(x2), …, f(xn)) = kT (f)
T(f+g) = (( f+g )( x1), ( f+g )( x2), …, ( f+g )( xn)) = ( f(x1) + g(x1), f(x2) + g(x2), …, f(xn) + g(xn)) = T(f) + T(g)
37