Eigenvalues and Eigenvectors

◮ Suppose $A$ is an $n \times n$ symmetric matrix with real entries.
◮ The function from $\mathbb{R}^n$ to $\mathbb{R}$ defined by
$$x \mapsto x^T A x$$
is called a quadratic form.
◮ We can maximize $x^T A x$ subject to $x^T x = \|x\|^2 = 1$ by Lagrange multipliers:
$$x^T A x - \lambda (x^T x - 1)$$
◮ Take derivatives and get
$$x^T x = 1 \quad \text{and} \quad 2Ax - 2\lambda x = 0$$

Richard Lockhart STAT 350: General Theory

◮ We say that $v$ is an eigenvector of $A$ with eigenvalue $\lambda$ if $v \neq 0$ and
$$Av = \lambda v$$
◮ For such a $v$ and $\lambda$ with $v^T v = 1$ we find
$$v^T A v = \lambda v^T v = \lambda.$$
◮ So the quadratic form is maximized over vectors of length one by the eigenvector with the largest eigenvalue.
◮ Call that eigenvector $v_1$, eigenvalue $\lambda_1$.
◮ Maximize $x^T A x$ subject to $x^T x = 1$ and $v_1^T x = 0$.
◮ Get new eigenvector and eigenvalue.

Summary of Linear Algebra Results

Theorem: Suppose $A$ is a real symmetric $n \times n$ matrix.
1. There are $n$ orthonormal eigenvectors $v_1, \ldots, v_n$ with corresponding eigenvalues $\lambda_1 \geq \cdots \geq \lambda_n$.
2. If $P$ is the $n \times n$ matrix whose columns are $v_1, \ldots, v_n$ and $\Lambda$ is the diagonal matrix with $\lambda_1, \ldots, \lambda_n$ on the diagonal then
$$AP = P\Lambda \quad \text{or} \quad P \Lambda P^T = A \quad \text{and} \quad P^T A P = \Lambda \quad \text{and} \quad P^T P = I.$$
3. If $A$ is non-negative definite (that is, $A$ is a variance covariance matrix) then each $\lambda_i \geq 0$.
4. $A$ is singular if and only if at least one eigenvalue is 0.
5. The determinant of $A$ is $\prod_i \lambda_i$.

The trace of a matrix

Definition: If $A$ is square then the trace of $A$ is the sum of its diagonal elements:
$$\operatorname{tr}(A) = \sum_i A_{ii}$$

Theorem: If $A$ and $B$ are any two matrices such that $AB$ is square then
$$\operatorname{tr}(AB) = \operatorname{tr}(BA)$$
If $A_1, \ldots, A_r$ are matrices such that $\prod_{j=1}^r A_j$ is square then
$$\operatorname{tr}(A_1 \cdots A_r) = \operatorname{tr}(A_2 \cdots A_r A_1) = \cdots = \operatorname{tr}(A_s \cdots A_r A_1 \cdots A_{s-1})$$
If $A$ is symmetric then
$$\operatorname{tr}(A) = \sum_i \lambda_i$$

Idempotent Matrices

Definition: A symmetric matrix $A$ is idempotent if $A^2 = AA = A$.

Theorem: A symmetric matrix $A$ is idempotent if and only if all its eigenvalues are either 0 or 1.
The number of eigenvalues equal to 1 is then $\operatorname{tr}(A)$.

Proof: If $A$ is idempotent, $\lambda$ is an eigenvalue and $v$ a corresponding eigenvector then
$$\lambda v = Av = AAv = \lambda Av = \lambda^2 v$$
Since $v \neq 0$ we find $\lambda - \lambda^2 = \lambda(1 - \lambda) = 0$, so either $\lambda = 0$ or $\lambda = 1$.

Conversely
◮ Write $A = P \Lambda P^T$ so
$$A^2 = P \Lambda P^T P \Lambda P^T = P \Lambda^2 P^T$$
◮ Have used the fact that $P$ is orthogonal.
◮ Each entry in the diagonal of $\Lambda$ is either 0 or 1.
◮ So $\Lambda^2 = \Lambda$.
◮ So $A^2 = A$.

Finally
$$\operatorname{tr}(A) = \operatorname{tr}(P \Lambda P^T) = \operatorname{tr}(\Lambda P^T P) = \operatorname{tr}(\Lambda)$$
Since all the diagonal entries in $\Lambda$ are 0 or 1, $\operatorname{tr}(\Lambda)$ counts the eigenvalues equal to 1, which completes the proof.

Independence

Definition: If $U_1, U_2, \ldots, U_k$ are random variables then we call $U_1, \ldots, U_k$ independent if
$$P(U_1 \in A_1, \ldots, U_k \in A_k) = P(U_1 \in A_1) \times \cdots \times P(U_k \in A_k)$$
for any sets $A_1, \ldots, A_k$.

We usually either:
◮ Assume independence because there is no physical way for the value of any of the random variables to influence any of the others.
OR
◮ We prove independence.

Joint Densities

◮ How do we prove independence?
◮ We use the notion of a joint density.
◮ $U_1, \ldots, U_k$ have joint density function $f = f(u_1, \ldots, u_k)$ if
$$P((U_1, \ldots, U_k) \in A) = \int \cdots \int_A f(u_1, \ldots, u_k) \, du_1 \cdots du_k$$
◮ Independence of $U_1, \ldots, U_k$ is equivalent to
$$f(u_1, \ldots, u_k) = f_1(u_1) \times \cdots \times f_k(u_k)$$
for some densities $f_1, \ldots, f_k$.
◮ In this case $f_i$ is the density of $U_i$.
◮ ASIDE: notice that for an independent sample the joint density is the likelihood function!

Application to Normals: Standard Case

If
$$Z = \begin{pmatrix} Z_1 \\ \vdots \\ Z_n \end{pmatrix} \sim MVN(0, I_{n \times n})$$
then the joint density of $Z$, denoted $f_Z(z_1, \ldots, z_n)$, is
$$f_Z(z_1, \ldots, z_n) = \phi(z_1) \times \cdots \times \phi(z_n)$$
where
$$\phi(z_i) = \frac{1}{\sqrt{2\pi}} e^{-z_i^2/2}$$
So
$$f_Z = (2\pi)^{-n/2} \exp\left\{ -\frac{1}{2} \sum_{i=1}^n z_i^2 \right\} = (2\pi)^{-n/2} \exp\left\{ -\frac{1}{2} z^T z \right\}$$
where
$$z = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix}$$

Application to Normals: General Case

If $X = AZ + \mu$ and $A$ is invertible then for any set $B \subset \mathbb{R}^n$ we have
$$P(X \in B) = P(AZ + \mu \in B) = P(Z \in A^{-1}(B - \mu)) = \int \cdots \int_{A^{-1}(B - \mu)} (2\pi)^{-n/2} \exp\left\{ -\frac{1}{2} z^T z \right\} dz_1 \cdots dz_n$$
Make the change of variables $x = Az + \mu$ in this integral to get
$$P(X \in B) = \int \cdots \int_B (2\pi)^{-n/2} \exp\left\{ -\frac{1}{2} \left[ A^{-1}(x - \mu) \right]^T \left[ A^{-1}(x - \mu) \right] \right\} J(x) \, dx_1 \cdots dx_n$$
Here $J(x)$ denotes the Jacobian of the transformation
$$J(x) = J(x_1, \ldots, x_n) = \left| \det\left( \frac{\partial z_i}{\partial x_j} \right) \right| = \left| \det A^{-1} \right|$$
Algebraic manipulation of the integral then gives
$$P(X \in B) = \int \cdots \int_B (2\pi)^{-n/2} \exp\left\{ -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right\} \left| \det A^{-1} \right| dx_1 \cdots dx_n$$
where
$$\Sigma = AA^T, \qquad \Sigma^{-1} = \left( A^{-1} \right)^T A^{-1}, \qquad \det \Sigma^{-1} = \left( \det A^{-1} \right)^2 = \frac{1}{\det \Sigma}$$

Multivariate Normal Density

◮ Conclusion: the $MVN(\mu, \Sigma)$ density is
$$(2\pi)^{-n/2} (\det \Sigma)^{-1/2} \exp\left\{ -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right\}$$
◮ What if $A$ is not invertible? Ans: there is no density.
◮ How do we apply this density?
◮ Suppose
$$X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$$
◮ Now suppose $\Sigma_{12} = 0$.

Assuming $\Sigma_{12} = 0$

1. $\Sigma_{21} = 0$, because $\Sigma$ is symmetric so $\Sigma_{21} = \Sigma_{12}^T$.
2. In homework you checked that
$$\Sigma^{-1} = \begin{pmatrix} \Sigma_{11}^{-1} & 0 \\ 0 & \Sigma_{22}^{-1} \end{pmatrix}$$
3. Writing
$$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \quad \text{and} \quad \mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}$$
we find
$$(x - \mu)^T \Sigma^{-1} (x - \mu) = (x_1 - \mu_1)^T \Sigma_{11}^{-1} (x_1 - \mu_1) + (x_2 - \mu_2)^T \Sigma_{22}^{-1} (x_2 - \mu_2)$$
4. So, if $n_1 = \dim(X_1)$ and $n_2 = \dim(X_2)$, and using $\det \Sigma = \det \Sigma_{11} \det \Sigma_{22}$, we see that
$$f_X(x_1, x_2) = (2\pi)^{-n_1/2} (\det \Sigma_{11})^{-1/2} \exp\left\{ -\frac{1}{2} (x_1 - \mu_1)^T \Sigma_{11}^{-1} (x_1 - \mu_1) \right\} \times (2\pi)^{-n_2/2} (\det \Sigma_{22})^{-1/2} \exp\left\{ -\frac{1}{2} (x_2 - \mu_2)^T \Sigma_{22}^{-1} (x_2 - \mu_2) \right\}$$
5. So $X_1$ and $X_2$ are independent.

Summary

◮ If $\operatorname{Cov}(X_1, X_2) = E[(X_1 - \mu_1)(X_2 - \mu_2)^T] = 0$ then $X_1$ is independent of $X_2$.
◮ Warning: This only works provided
$$X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim MVN(\mu, \Sigma)$$
◮ Fact: However, it works even if $\Sigma$ is singular, but you can't prove it as easily using densities.
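The factorization of the MVN density when $\Sigma_{12} = 0$ can be checked numerically. The sketch below does this for $n = 2$ only, using the Python standard library. The function names `normal_density` and `mvn2_density`, and all the numbers, are illustrative assumptions rather than anything from the course notes.

```python
import math


def normal_density(x, mu, var):
    """Univariate N(mu, var) density."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)


def mvn2_density(x, mu, sigma):
    """MVN(mu, sigma) density for n = 2, with sigma given as a 2x2 nested list.

    Evaluates (2 pi)^(-n/2) (det Sigma)^(-1/2) exp(-(x-mu)^T Sigma^{-1} (x-mu) / 2).
    """
    (a, b), (c, d) = sigma
    det = a * d - b * c
    if det <= 0:
        raise ValueError("no density: sigma must have a positive determinant")
    d1, d2 = x[0] - mu[0], x[1] - mu[1]
    # Quadratic form using the 2x2 inverse (1/det) * [[d, -b], [-c, a]].
    quad = (d * d1 * d1 - (b + c) * d1 * d2 + a * d2 * d2) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))


mu = (1.0, -2.0)
x = (0.3, -1.1)
sigma_indep = [[2.0, 0.0], [0.0, 0.5]]  # Sigma_12 = 0
sigma_dep = [[2.0, 0.6], [0.6, 0.5]]    # Sigma_12 != 0

joint = mvn2_density(x, mu, sigma_indep)
product = normal_density(x[0], mu[0], 2.0) * normal_density(x[1], mu[1], 0.5)
joint_dep = mvn2_density(x, mu, sigma_dep)

print("joint density, Sigma_12 = 0 :", joint)
print("product of marginals        :", product)
print("joint density, Sigma_12 != 0:", joint_dep)
```

With $\Sigma_{12} = 0$ the joint density agrees with the product of the two marginal densities up to rounding. With a nonzero off-diagonal entry it does not, even though the marginal variances are unchanged.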
Application: independence in linear models

With $H = X (X^T X)^{-1} X^T$ the hat matrix,
$$\hat\mu = X \hat\beta = X (X^T X)^{-1} X^T Y = X\beta + H\epsilon$$
$$\hat\epsilon = Y - X\hat\beta = \epsilon - H\epsilon = (I - H)\epsilon$$
So
$$\begin{pmatrix} \hat\mu \\ \hat\epsilon \end{pmatrix} = \underbrace{\sigma \begin{pmatrix} H \\ I - H \end{pmatrix}}_{A} \frac{\epsilon}{\sigma} + \underbrace{\begin{pmatrix} \mu \\ 0 \end{pmatrix}}_{b}$$
Hence
$$\begin{pmatrix} \hat\mu \\ \hat\epsilon \end{pmatrix} \sim MVN\left( \begin{pmatrix} \mu \\ 0 \end{pmatrix}; AA^T \right)$$
Now
$$A = \sigma \begin{pmatrix} H \\ I - H \end{pmatrix}$$
so, using the fact that $H$ is symmetric and idempotent,
$$AA^T = \sigma^2 \begin{pmatrix} H \\ I - H \end{pmatrix} \begin{pmatrix} H^T & (I - H)^T \end{pmatrix} = \sigma^2 \begin{pmatrix} HH & H(I - H) \\ (I - H)H & (I - H)(I - H) \end{pmatrix} = \sigma^2 \begin{pmatrix} H & H - H \\ H - H & I - H - H + HH \end{pmatrix} = \sigma^2 \begin{pmatrix} H & 0 \\ 0 & I - H \end{pmatrix}$$
The 0s prove that $\hat\epsilon$ and $\hat\mu$ are independent. It follows that $\hat\mu^T \hat\mu$, the regression sum of squares (not adjusted), is independent of $\hat\epsilon^T \hat\epsilon$, the Error sum of squares.

Joint Densities: some manipulations

◮ Suppose $Z_1$ and $Z_2$ are independent standard normals.
◮ Their joint density is
$$f(z_1, z_2) = \frac{1}{2\pi} \exp\left( -(z_1^2 + z_2^2)/2 \right).$$
◮ Show meaning of joint density by computing density of a $\chi^2_2$ random variable.
◮ Let $U = Z_1^2 + Z_2^2$.
◮ By definition $U$ has a $\chi^2$ distribution with 2 degrees of freedom.

Computing $\chi^2_2$ density

◮ Cumulative distribution function of $U$ is
$$F(u) = P(U \leq u).$$
◮ For $u \leq 0$ this is 0 so take $u \geq 0$.
◮ Event $U \leq u$ is same as event that point $(Z_1, Z_2)$ is in the disc centered at the origin and having radius $u^{1/2}$.
◮ That is, if $A$ is the disc of this radius then
$$F(u) = P((Z_1, Z_2) \in A).$$
◮ By definition of density this is a double integral
$$\iint_A f(z_1, z_2) \, dz_1 \, dz_2.$$
◮ You do this integral in polar co-ordinates.

Integral in Polar co-ordinates

◮ Let $z_1 = r \cos\theta$ and $z_2 = r \sin\theta$.
◮ We see that
$$f(r \cos\theta, r \sin\theta) = \frac{1}{2\pi} \exp(-r^2/2).$$
◮ The Jacobian of the transformation is $r$ so that $dz_1 \, dz_2$ becomes $r \, dr \, d\theta$.
◮ Finally the region of integration is simply $0 \leq \theta \leq 2\pi$ and $0 \leq r \leq u^{1/2}$ so that
$$P(U \leq u) = \int_0^{u^{1/2}} \int_0^{2\pi} \frac{1}{2\pi} \exp(-r^2/2) \, r \, d\theta \, dr = \int_0^{u^{1/2}} r \exp(-r^2/2) \, dr = \left. -\exp(-r^2/2) \right|_0^{u^{1/2}} = 1 - \exp(-u/2).$$
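The closed form $P(U \leq u) = 1 - \exp(-u/2)$ can be compared with a simulation. The following is a minimal sketch using only the Python standard library. The function names, the seed, and the number of draws are arbitrary choices made for illustration and are not part of the notes.

```python
import math
import random


def chi2_2_cdf(u):
    """CDF of U = Z1^2 + Z2^2 obtained from the polar-coordinate integral."""
    return 0.0 if u <= 0 else 1.0 - math.exp(-u / 2.0)


def empirical_cdf(u, n_draws=100_000, seed=0):
    """Fraction of simulated (Z1, Z2) pairs that land in the disc of radius sqrt(u)."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(n_draws):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        if z1 * z1 + z2 * z2 <= u:
            hits += 1
    return hits / n_draws


for u in (0.5, 1.0, 2.0, 5.0):
    print(u, chi2_2_cdf(u), empirical_cdf(u))
```

The two columns should agree to within Monte Carlo error, which is on the order of $1/\sqrt{n}$ for $n$ draws.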
