Further linear algebra. Chapter VI. Inner product spaces.
Andrei Yafaev
1 Geometry of Inner Product Spaces
Definition 1.1 Let $V$ be a vector space over $\mathbb{R}$ and let $\langle -,-\rangle$ be a symmetric bilinear form on $V$. We shall call the form positive definite if for all non-zero vectors $v \in V$ we have $\langle v,v\rangle > 0$.

Notice that a symmetric bilinear form is positive definite if and only if its canonical form (over $\mathbb{R}$) is $I_n$. Clearly $x_1^2 + \cdots + x_n^2$ is positive definite on $\mathbb{R}^n$. Conversely, suppose $\mathcal{B}$ is a basis such that the matrix of the form with respect to $\mathcal{B}$ is the canonical form. For any basis vector $b_i$, the diagonal entry satisfies $\langle b_i, b_i\rangle > 0$ and hence $\langle b_i, b_i\rangle = 1$.

Definition 1.2 Let $V$ be a vector space over $\mathbb{C}$. A Hermitian form on $V$ is a function $\langle -,-\rangle : V \times V \to \mathbb{C}$ such that:

• For all $u,v,w \in V$ and all $\lambda \in \mathbb{C}$,
$$\langle u + \lambda v, w\rangle = \langle u,w\rangle + \lambda\langle v,w\rangle;$$

• For all $u,v \in V$,
$$\langle u,v\rangle = \overline{\langle v,u\rangle}.$$

Example 1.1 The simplest example is the following: take $V = \mathbb{C}$, then $\langle z,w\rangle = z\bar{w}$ is a Hermitian form.
If $A$ is a Hermitian matrix (i.e. $\bar{A}^t = A$) then the following is a Hermitian form on $\mathbb{C}^n$:
$$\langle v,w\rangle = v^t A \bar{w}.$$
In fact every Hermitian form on $\mathbb{C}^n$ is one of these. To see why, suppose we are given a Hermitian form $\langle -,-\rangle$. Choose a basis $\mathcal{B} = (b_1,\ldots,b_n)$. Let $v = \sum_i \lambda_i b_i$ and $w = \sum_j \mu_j b_j$. We calculate
$$\langle v,w\rangle = \sum_{i,j} \lambda_i \bar{\mu}_j \langle b_i, b_j\rangle = \lambda^t A \bar{\mu},$$
where $A$ is the matrix with $(i,j)$ entry $\langle b_i, b_j\rangle$; the second axiom shows that $A$ is Hermitian. Taking $\mathcal{B}$ to be the standard basis gives $\langle v,w\rangle = v^t A \bar{w}$.
Example 1.2 If $V = \mathbb{R}^n$, then $\langle -,-\rangle$ defined by
$$\langle (x_1,\ldots,x_n), (y_1,\ldots,y_n)\rangle = \sum_i x_i y_i$$
is called the standard inner product. If $V = \mathbb{C}^n$, then $\langle -,-\rangle$ defined by
$$\langle (z_1,\ldots,z_n), (w_1,\ldots,w_n)\rangle = \sum_i z_i \bar{w}_i$$
is called the standard (Hermitian) inner product.

Note that a Hermitian form is conjugate-linear in the second variable, i.e.
$$\langle u, v + \lambda w\rangle = \langle u,v\rangle + \bar{\lambda}\langle u,w\rangle.$$
Note also that by the second axiom
$$\langle u,u\rangle \in \mathbb{R}.$$

Definition 1.3 A Hermitian form is positive definite if for all non-zero vectors $v$ we have $\langle v,v\rangle > 0$. In other words, $\langle v,v\rangle \ge 0$ for all $v$ and $\langle v,v\rangle = 0$ if and only if $v = 0$.
Clearly, the form $z\bar{w}$ is positive definite.
Definition 1.4 By an inner product space we shall mean one of the following: either a finite dimensional vector space $V$ over $\mathbb{R}$ with a positive definite symmetric bilinear form, or a finite dimensional vector space $V$ over $\mathbb{C}$ with a positive definite Hermitian form.
We shall often write K to mean the field R or C, depending on which is relevant.
Example 1.3 Consider the vector space $V$ of all continuous functions $[0,1] \to \mathbb{C}$. Then we can define
$$\langle f,g\rangle = \int_0^1 f(x)\overline{g(x)}\,dx.$$
This defines an inner product on $V$ (easy exercise).

Another example: let $V = M_n(\mathbb{R})$, the vector space of $n \times n$ matrices with real entries. Then
$$\langle A,B\rangle = \mathrm{tr}(AB^t)$$
is an inner product on $V$. Similarly, if $V = M_n(\mathbb{C})$ then $\langle A,B\rangle = \mathrm{tr}(A\bar{B}^t)$ is an inner product.
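These examples are easy to experiment with. As an illustration, here is a minimal Python sketch (assuming numpy; the helper name trace_inner is ours) checking the axioms numerically for the trace form on $M_n(\mathbb{C})$:

    import numpy as np

    def trace_inner(A, B):
        # <A, B> = tr(A * conjugate-transpose(B))
        return np.trace(A @ B.conj().T)

    rng = np.random.default_rng(0)
    n = 3
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    lam = 2.0 - 1.5j

    # linear in the first variable
    assert np.isclose(trace_inner(A + lam * B, B),
                      trace_inner(A, B) + lam * trace_inner(B, B))
    # Hermitian symmetry: <A, B> = conjugate of <B, A>
    assert np.isclose(trace_inner(A, B), np.conj(trace_inner(B, A)))
    # positive definite: <A, A> > 0 for A != 0
    assert trace_inner(A, A).real > 0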
Definition 1.5 Let $V$ be an inner product space. We define the norm of a vector $v \in V$ by
$$\|v\| = \sqrt{\langle v,v\rangle}.$$

Lemma 1.4 For $\lambda \in K$ we have $\lambda\bar{\lambda} = |\lambda|^2$, and for $v \in V$ we have $\|\lambda v\| = |\lambda|\,\|v\|$.

The proof is obvious.
Theorem 1.5 (Cauchy–Schwarz inequality) If $V$ is an inner product space then
$$\forall u,v \in V \quad |\langle u,v\rangle| \le \|u\| \cdot \|v\|.$$

Proof. If $v = 0$ then the result holds, so suppose $v \ne 0$. We have, for all $\lambda \in K$,
$$\langle u - \lambda v, u - \lambda v\rangle \ge 0.$$
Expanding this out we have:
$$\|u\|^2 - \lambda\langle v,u\rangle - \bar{\lambda}\langle u,v\rangle + |\lambda|^2\|v\|^2 \ge 0.$$
Setting $\lambda = \frac{\langle u,v\rangle}{\|v\|^2}$ we have:
$$\|u\|^2 - \frac{\langle u,v\rangle}{\|v\|^2}\langle v,u\rangle - \frac{\langle v,u\rangle}{\|v\|^2}\langle u,v\rangle + \frac{|\langle u,v\rangle|^2}{\|v\|^4}\|v\|^2 \ge 0.$$
Multiplying by $\|v\|^2$ and using $\langle u,v\rangle\langle v,u\rangle = |\langle u,v\rangle|^2$ we get
$$\|u\|^2\|v\|^2 - 2|\langle u,v\rangle|^2 + |\langle u,v\rangle|^2 \ge 0.$$
Hence
$$\|u\|^2\|v\|^2 \ge |\langle u,v\rangle|^2.$$
Taking the square root of both sides we get the result.
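As a quick numerical illustration (a minimal sketch, assuming numpy), with the standard Hermitian inner product on $\mathbb{C}^5$:

    import numpy as np

    rng = np.random.default_rng(1)
    u = rng.normal(size=5) + 1j * rng.normal(size=5)
    v = rng.normal(size=5) + 1j * rng.normal(size=5)
    inner = np.vdot(v, u)   # <u, v> = sum_i u_i * conj(v_i)
    # |<u, v>| <= ||u|| * ||v||
    assert abs(inner) <= np.linalg.norm(u) * np.linalg.norm(v)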
Theorem 1.6 (Triangle inequality) If $V$ is an inner product space with norm $\|\cdot\|$ then
$$\forall u,v \in V \quad \|u+v\| \le \|u\| + \|v\|.$$

Proof. We have
$$\|u+v\|^2 = \langle u+v, u+v\rangle = \|u\|^2 + 2\Re\langle u,v\rangle + \|v\|^2.$$
Notice that $|\Re\langle u,v\rangle| \le |\langle u,v\rangle|$, hence
$$\|u+v\|^2 \le \|u\|^2 + 2|\langle u,v\rangle| + \|v\|^2.$$
So the Cauchy–Schwarz inequality implies that
$$\|u+v\|^2 \le \|u\|^2 + 2\|u\|\,\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2.$$
Hence $\|u+v\| \le \|u\| + \|v\|$.
Definition 1.6 Two vectors $v,w$ in an inner product space are called orthogonal if $\langle v,w\rangle = 0$.
Theorem 1.7 (Pythagoras’ Theorem) Let $(V, \langle -,-\rangle)$ be an inner product space. If $v,w \in V$ are orthogonal, then
$$\|v\|^2 + \|w\|^2 = \|v+w\|^2.$$

Proof. We have
$$\|v+w\|^2 = \langle v+w, v+w\rangle = \|v\|^2 + 2\Re\langle v,w\rangle + \|w\|^2,$$
so
$$\|v\|^2 + \|w\|^2 = \|v+w\|^2$$
if $\langle v,w\rangle = 0$.

2 Gram–Schmidt Orthogonalisation
Definition 2.1 Let $V$ be an inner product space. We shall call a basis $\mathcal{B}$ of $V$ an orthonormal basis if $\langle b_i, b_j\rangle = \delta_{i,j}$.

Proposition 2.1 If $\mathcal{B}$ is an orthonormal basis then for $v,w \in V$ we have:
$$\langle v,w\rangle = [v]_{\mathcal{B}}^t \, \overline{[w]_{\mathcal{B}}}.$$

Proof. If the basis $\mathcal{B} = (b_1,\ldots,b_n)$ is orthonormal, then the matrix of $\langle -,-\rangle$ in this basis is the identity $I_n$. The proposition follows.
Theorem 2.2 (Gram–Schmidt Orthogonalisation) Let $\mathcal{B}$ be any basis. Then the basis $\mathcal{C}$ defined by
$$c_1 = b_1,$$
$$c_2 = b_2 - \frac{\langle b_2, c_1\rangle}{\langle c_1, c_1\rangle} c_1,$$
$$c_3 = b_3 - \frac{\langle b_3, c_1\rangle}{\langle c_1, c_1\rangle} c_1 - \frac{\langle b_3, c_2\rangle}{\langle c_2, c_2\rangle} c_2,$$
$$\vdots$$
$$c_n = b_n - \sum_{r=1}^{n-1} \frac{\langle b_n, c_r\rangle}{\langle c_r, c_r\rangle} c_r$$
is orthogonal. Furthermore the basis $\mathcal{D}$ defined by
$$d_r = \frac{1}{\|c_r\|} c_r$$
is orthonormal.
Proof. Clearly each $b_i$ is a linear combination of $\mathcal{C}$, so $\mathcal{C}$ spans $V$. As the cardinality of $\mathcal{C}$ is $\dim V$, $\mathcal{C}$ is a basis. It follows also that $\mathcal{D}$ is a basis. We'll prove by induction that $\{c_1,\ldots,c_r\}$ is orthogonal. Clearly any one vector is orthogonal. Suppose $\{c_1,\ldots,c_{r-1}\}$ are orthogonal. Then for $s < r$ we have
$$\langle c_r, c_s\rangle = \langle b_r, c_s\rangle - \sum_{t=1}^{r-1} \frac{\langle b_r, c_t\rangle}{\langle c_t, c_t\rangle}\langle c_t, c_s\rangle.$$
By the inductive hypothesis $\langle c_t, c_s\rangle = 0$ unless $t = s$, so
$$\langle c_r, c_s\rangle = \langle b_r, c_s\rangle - \frac{\langle b_r, c_s\rangle}{\langle c_s, c_s\rangle}\langle c_s, c_s\rangle = \langle b_r, c_s\rangle - \langle b_r, c_s\rangle = 0.$$
This shows that $c_1,\ldots,c_r$ are orthogonal. Hence $\mathcal{C}$ is an orthogonal basis. It follows easily that $\mathcal{D}$ is orthonormal.
This theorem shows in particular that an orthonormal basis always exists: take any basis and turn it into an orthonormal one by applying the Gram–Schmidt process to it.
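The process is entirely mechanical. As an illustration, here is a minimal Python sketch (assuming numpy; the helper name gram_schmidt is ours) of the formulas of Theorem 2.2, using the standard Hermitian inner product on $\mathbb{C}^n$:

    import numpy as np

    def gram_schmidt(basis):
        # Orthogonalise: c_r = b_r - sum_{t<r} <b_r, c_t>/<c_t, c_t> c_t,
        # where <u, v> = sum_i u_i * conj(v_i), i.e. np.vdot(v, u).
        cs = []
        for b in basis:
            c = b.astype(complex)
            for prev in cs:
                c = c - (np.vdot(prev, b) / np.vdot(prev, prev)) * prev
            cs.append(c)
        # Normalise: d_r = c_r / ||c_r||.
        return [c / np.linalg.norm(c) for c in cs]

    # Example: orthonormalise a basis of R^2 and check <d_i, d_j> = delta_ij.
    ds = gram_schmidt([np.array([1.0, 1.0]), np.array([1.0, 0.0])])
    for i in range(2):
        for j in range(2):
            assert np.isclose(np.vdot(ds[i], ds[j]), float(i == j))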
Proposition 2.3 If $V$ is an inner product space with an orthonormal basis $\mathcal{B} = \{b_1,\ldots,b_n\}$, then any $v \in V$ can be written as $v = \sum_{i=1}^n \langle v, b_i\rangle b_i$.

Proof. We have $v = \sum_{i=1}^n \lambda_i b_i$ and $\langle v, b_j\rangle = \sum_{i=1}^n \lambda_i \langle b_i, b_j\rangle = \lambda_j$.

Definition 2.2 Let $S$ be a subspace of an inner product space $V$. The orthogonal complement of $S$ is defined to be
$$S^\perp = \{v \in V : \forall w \in S \ \langle v,w\rangle = 0\}.$$
Theorem 2.4 If $(V, \langle -,-\rangle)$ is an inner product space and $W$ is a subspace of $V$ then
$$V = W \oplus W^\perp,$$
and hence any $v \in V$ can be written as
$$v = w + w^\perp,$$
for unique $w \in W$ and $w^\perp \in W^\perp$.
Proof. We show first that $V = W + W^\perp$.

Let $\mathcal{E} = \{e_1,\ldots,e_n\}$ be an orthonormal basis for $V$ such that $\{e_1,\ldots,e_r\}$ is a basis for $W$. This can be constructed by Gram–Schmidt orthogonalisation: choose a basis $\{b_1,\ldots,b_r\}$ for $W$ and complete it to a basis $\{b_1,\ldots,b_n\}$ of $V$, then apply the Gram–Schmidt process. Notice that in the Gram–Schmidt process the vector $c_k$ lies in the space generated by $b_1,\ldots,b_k$. It follows that the process gives an orthonormal basis $e_1,\ldots,e_n$ such that $e_1,\ldots,e_r$ is an orthonormal basis of $W$.

If $v \in V$ then
$$v = \sum_{i=1}^r \lambda_i e_i + \sum_{i=r+1}^n \lambda_i e_i.$$
Now
$$\sum_{i=1}^r \lambda_i e_i \in W.$$
If $w \in W$ then there exist $\mu_i \in K$ such that
$$w = \sum_{i=1}^r \mu_i e_i.$$
So
$$\left\langle w, \sum_{j=r+1}^n \lambda_j e_j \right\rangle = \sum_{i=1}^r \sum_{j=r+1}^n \mu_i \bar{\lambda}_j \langle e_i, e_j\rangle = 0.$$
Hence
$$\sum_{i=r+1}^n \lambda_i e_i \in W^\perp.$$
Therefore $V = W + W^\perp$.

Next suppose $v \in W \cap W^\perp$. Then $\langle v,v\rangle = 0$ and so $v = 0$. Hence $V = W \oplus W^\perp$, and so any vector $v \in V$ can be expressed uniquely as $v = w + w^\perp$, where $w \in W$ and $w^\perp \in W^\perp$.
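The first half of this proof is effectively an algorithm: expanding $v$ in $\mathcal{E}$ and keeping the first $r$ terms is the orthogonal projection of $v$ onto $W$. A minimal Python sketch (assuming numpy; the helper name project_onto is ours):

    import numpy as np

    def project_onto(v, onb_of_W):
        # w = sum_i <v, e_i> e_i, with <u, v> = sum_i u_i * conj(v_i) = np.vdot(v, u)
        return sum(np.vdot(e, v) * e for e in onb_of_W)

    # W = Span{(1, 1, 0)/sqrt(2)} inside R^3
    e1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
    v = np.array([3.0, 1.0, 2.0])
    w = project_onto(v, [e1])
    w_perp = v - w
    # v = w + w_perp with w in W and w_perp in the orthogonal complement of W
    assert np.allclose(v, w + w_perp)
    assert np.isclose(np.vdot(e1, w_perp), 0.0)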
3 Adjoints.

Definition 3.1 An adjoint of a linear map $T : V \to V$ is a linear map $T^*$ such that
$$\langle T(u), v\rangle = \langle u, T^*(v)\rangle \quad \text{for all } u,v \in V.$$

Theorem 3.1 (existence and uniqueness) Every linear map $T : V \to V$ has a unique adjoint. If $T$ is represented by $A$ (with respect to an orthonormal basis) then $T^*$ is represented by $\bar{A}^t$.
Proof. (Existence) Let $T^*$ be the linear map represented by $\bar{A}^t$. We'll prove that it is an adjoint of $T$:
$$\langle Tv, w\rangle = (A[v])^t\,\overline{[w]} = [v]^t A^t\,\overline{[w]} = [v]^t\,\overline{\bar{A}^t[w]} = \langle v, T^*w\rangle.$$
Notice that here we have used that the basis is orthonormal: we said that the matrix of $\langle -,-\rangle$ is then the identity.

(Uniqueness) Let $T^*$, $T'$ be two adjoints. Then we have
$$\langle u, (T^* - T')v\rangle = 0$$
for all $u,v \in V$. In particular, let $u = (T^* - T')v$; then $\|(T^* - T')v\| = 0$, hence $T^*(v) = T'(v)$ for all $v \in V$. Therefore $T^* = T'$.

Example 3.2 Consider $V = \mathbb{C}^2$ with the standard orthonormal basis and let $T$ be represented by
$$A = \begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix}.$$
Then $T^* = T$ (such a linear map is called self-adjoint). Notice that $T$ being self-adjoint is equivalent to the matrix representing it being Hermitian. On the other hand, for
$$A = \begin{pmatrix} 2i & 1+i \\ -1+i & i \end{pmatrix}$$
we get $T^* = -T$.

We also see that $T^{**} = T$ (using that $T^*$ is represented by $\bar{A}^t$).
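In coordinates the adjoint is simply the conjugate transpose, which is easy to test numerically. A minimal Python sketch (assuming numpy), using the first matrix of Example 3.2:

    import numpy as np

    A = np.array([[1, 1j], [-1j, 1]])   # Hermitian matrix from Example 3.2
    A_star = A.conj().T                 # matrix of the adjoint T*

    rng = np.random.default_rng(0)
    u = rng.normal(size=2) + 1j * rng.normal(size=2)
    v = rng.normal(size=2) + 1j * rng.normal(size=2)

    inner = lambda x, y: np.vdot(y, x)  # <x, y> = sum_i x_i * conj(y_i)
    # <Tu, v> = <u, T*v>
    assert np.isclose(inner(A @ u, v), inner(u, A_star @ v))
    # here A* = A, so T is self-adjoint
    assert np.allclose(A_star, A)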
4 Isometries.
Theorem 4.1 Let $T : V \to V$ be a linear map of an inner product space $V$. Then the following are equivalent.

(i) $TT^* = \mathrm{Id}$ (i.e. $T^* = T^{-1}$).

(ii) $\forall u,v \in V \ \langle Tu, Tv\rangle = \langle u,v\rangle$ (i.e. $T$ preserves the inner product).

(iii) $\forall v \in V \ \|Tv\| = \|v\|$ (i.e. $T$ preserves the norm).

Definition 4.1 If $T$ satisfies any of the above (and so all of them) then $T$ is called an isometry.
Proof. (i) $\Rightarrow$ (ii): Let $u,v \in V$. Then
$$\langle Tu, Tv\rangle = \langle u, T^*Tv\rangle = \langle u,v\rangle,$$
since $T^* = T^{-1}$.

(ii) $\Rightarrow$ (iii): If $v \in V$ then
$$\|Tv\|^2 = \langle Tv, Tv\rangle,$$
so by (ii)
$$\|Tv\|^2 = \langle v,v\rangle = \|v\|^2.$$
Hence $\|Tv\| = \|v\|$, so (iii) holds.

(iii) $\Rightarrow$ (ii): We just show that the form can be recovered from the norm. We have
$$2\Re\langle u,v\rangle = \|u+v\|^2 - \|u\|^2 - \|v\|^2, \qquad \Im\langle v,w\rangle = \Re\langle v, iw\rangle.$$
For the second equality, notice that $\langle v, iw\rangle = \bar{i}\langle v,w\rangle = -i\langle v,w\rangle$, whose real part is $\Im\langle v,w\rangle$. Since $T$ preserves the norm, the first identity gives $\Re\langle T(u), T(v)\rangle = \Re\langle u,v\rangle$ for all $u,v \in V$. Then
$$\Im\langle T(u), T(v)\rangle = \Re\langle T(u), iT(v)\rangle = \Re\langle T(u), T(iv)\rangle = \Re\langle u, iv\rangle = \Im\langle u,v\rangle.$$
Hence $\langle T(u), T(v)\rangle = \langle u,v\rangle$. (Over $\mathbb{R}$ the first identity alone suffices.)

(ii) $\Rightarrow$ (i):
$$\langle T^*Tu, v\rangle = \langle Tu, Tv\rangle = \langle u,v\rangle.$$
Therefore $\langle (T^*T - I)u, v\rangle = 0$ for all $v$. In particular, take $v = (T^*T - I)u$; then
$$\langle (T^*T - I)u, (T^*T - I)u\rangle = 0,$$
so $(T^*T - I)u = 0$ for all $u$. Therefore $T^*T = I$, and hence also $TT^* = \mathrm{Id}$ (in finite dimensions a left inverse is a two-sided inverse).

Notice that in an orthonormal basis, an isometry is represented by a matrix such that $A^t = A^{-1}$ (in the real case; in the complex case $\bar{A}^t = A^{-1}$).

We let $O_n(\mathbb{R})$ be the set of $n \times n$ real matrices satisfying $AA^t = I_n$ (in other words $A^t = A^{-1}$). If $A \in O_n(\mathbb{R})$ then $\det A = \pm 1$: indeed, $A^t = A^{-1}$, so
$$\det A = \det A^t = \det(A^{-1}) = (\det A)^{-1}.$$
Therefore $(\det A)^2 = 1$ and $\det A = \pm 1$.

Theorem 4.2 The following are equivalent.
(i) $A \in O_n(\mathbb{R})$.

(ii) The columns of $A$ form an orthonormal basis for $\mathbb{R}^n$ (for the standard inner product on $\mathbb{R}^n$).

(iii) The rows of $A$ form an orthonormal basis for $\mathbb{R}^n$.

Proof. We prove (i) $\iff$ (ii) (the proof of (i) $\iff$ (iii) is identical). Consider $A^tA$. If $A = [C_1,\ldots,C_n]$, so that the $j$th column of $A$ is $C_j$, then the $(i,j)$th entry of $A^tA$ is $C_i^t C_j$. So
$$A^tA = I_n \iff C_i^t C_j = \delta_{i,j} \iff \langle C_i, C_j\rangle = \delta_{i,j} \iff \{C_1,\ldots,C_n\} \text{ is an orthonormal basis for } \mathbb{R}^n.$$
For example, take the matrix
$$\begin{pmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}.$$
This matrix is in $O_2(\mathbb{R})$. In fact it is the matrix of rotation by the angle $\pi/4$.
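Membership in $O_n(\mathbb{R})$ is a one-line check in practice. A minimal Python sketch (assuming numpy):

    import numpy as np

    A = np.array([[1.0, -1.0], [1.0, 1.0]]) / np.sqrt(2)   # rotation by pi/4
    # columns orthonormal, i.e. A^t A = I, so A is in O_2(R)
    assert np.allclose(A.T @ A, np.eye(2))
    # and then the inverse comes for free: A^{-1} = A^t
    assert np.allclose(np.linalg.inv(A), A.T)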
Theorem 4.3 Let $V$ be a Euclidean space with orthonormal basis $\mathcal{E} = \{e_1,\ldots,e_n\}$. If $\mathcal{F} = \{f_1,\ldots,f_n\}$ is a basis for $V$ and $P$ is the transition matrix from $\mathcal{E}$ to $\mathcal{F}$, then
$$P \in O_n(\mathbb{R}) \iff \mathcal{F} \text{ is an orthonormal basis for } V.$$

Proof. The $j$th column of $P$ is $[f_j]_{\mathcal{E}}$, so
$$f_j = \sum_{k=1}^n p_{k,j} e_k.$$
Hence
$$\langle f_i, f_j\rangle = \left\langle \sum_{k=1}^n p_{k,i} e_k, \sum_{l=1}^n p_{l,j} e_l \right\rangle = \sum_{k=1}^n \sum_{l=1}^n p_{k,i} p_{l,j}\langle e_k, e_l\rangle = \sum_{k=1}^n p_{k,i} p_{k,j} = (P^tP)_{i,j}.$$
So $\mathcal{F}$ is an orthonormal basis for $V$ $\iff \langle f_i, f_j\rangle = \delta_{i,j} \iff P^tP = I_n \iff P \in O_n(\mathbb{R})$.
Notice that it is NOT true that matrices in $O_n(\mathbb{R})$ are diagonalisable. Indeed, take
$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix},$$
where $\theta$ is not a multiple of $\pi$. The characteristic polynomial is $x^2 - 2\cos(\theta)x + 1$; as $\cos(\theta)^2 - 1 < 0$, there are no real eigenvalues and the matrix is not diagonalisable over $\mathbb{R}$.

Notice also that for a given matrix $A$ it is easy to check whether its columns are orthonormal. If they are, then $A$ is in $O_n(\mathbb{R})$ and the inverse is easy to calculate: $A^{-1} = A^t$.
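One can see the failure of real diagonalisability directly: the eigenvalues of a rotation are $e^{\pm i\theta}$, non-real when $\theta$ is not a multiple of $\pi$. A minimal Python sketch (assuming numpy):

    import numpy as np

    theta = np.pi / 3
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    assert np.allclose(R.T @ R, np.eye(2))   # R is orthogonal
    print(np.linalg.eigvals(R))              # [0.5+0.866j  0.5-0.866j], not real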
5 Orthogonal Diagonalisation.
Definition 5.1 Let $V$ be an inner product space. A linear map $T : V \to V$ is self-adjoint if
$$T^* = T.$$
Notice that in an orthonormal basis, $T$ is represented by a matrix $A$ such that $\bar{A}^t = A$. In particular if $V$ is real, then $A$ is symmetric.
Theorem 5.1 If $A \in M_n(\mathbb{C})$ is Hermitian then all the eigenvalues of $A$ are real.

Proof. Recall that Hermitian means that $\bar{A}^t = A$, and that this implies $\langle Au, v\rangle = \langle u, Av\rangle$ for all $u,v$ (with respect to the standard Hermitian inner product). Let $\lambda$ be an eigenvalue of $A$ and let $v \ne 0$ be a corresponding eigenvector. Then
$$Av = \lambda v.$$
It follows that
$$\lambda\langle v,v\rangle = \langle Av, v\rangle = \langle v, Av\rangle = \bar{\lambda}\langle v,v\rangle.$$
As $v \ne 0$, we can divide by $\langle v,v\rangle > 0$ and conclude that $\lambda = \bar{\lambda}$, i.e. $\lambda \in \mathbb{R}$.

In particular a real symmetric matrix always has a real eigenvalue: take a complex eigenvalue (one always exists, since the characteristic polynomial has a root in $\mathbb{C}$), then by the above theorem it is real.
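Numerical libraries exploit this fact: a Hermitian (or real symmetric) matrix can be handed to a specialised eigensolver that returns guaranteed-real eigenvalues. A minimal Python sketch (assuming numpy), using the matrix of Example 3.2:

    import numpy as np

    A = np.array([[1, 1j], [-1j, 1]])   # Hermitian
    # eigvalsh assumes a Hermitian input and returns real eigenvalues
    print(np.linalg.eigvalsh(A))        # [0. 2.] -- real, as Theorem 5.1 predicts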
Theorem 5.2 (Spectral theorem) Let $T : V \to V$ be a self-adjoint linear map of an inner product space $V$. Then $V$ has an orthonormal basis of eigenvectors.

Proof. We use induction on $\dim(V) = n$. The result is true for $n = 1$, so suppose it holds in dimension $n-1$ and let $\dim(V) = n$.

Since $T$ is self-adjoint, if $\mathcal{E}$ is an orthonormal basis for $V$ and $A$ is the matrix representing $T$ in $\mathcal{E}$, then
$$A = \bar{A}^t.$$
So $A$ is Hermitian. Hence by Theorem 5.1, $A$ has a real eigenvalue $\lambda$. So there is a vector $e_1 \in V \setminus \{0\}$ such that $Te_1 = \lambda e_1$. Normalising (dividing by $\|e_1\|$) we can assume that $\|e_1\| = 1$.

Let $W = \mathrm{Span}\{e_1\}$; then by Theorem 2.4 we have $V = W \oplus W^\perp$. Now
$$n = \dim(V) = \dim(W) + \dim(W^\perp) = 1 + \dim(W^\perp),$$
so $\dim(W^\perp) = n - 1$.

We claim that $T : W^\perp \to W^\perp$, i.e. $T(W^\perp) \subseteq W^\perp$. Let $w = \mu e_1 \in W$ with $\mu \in K$, and let $v \in W^\perp$. Then
$$\langle w, Tv\rangle = \langle T^*w, v\rangle = \langle Tw, v\rangle = \langle T(\mu e_1), v\rangle = \langle \mu\lambda e_1, v\rangle = 0,$$
since $\mu\lambda e_1 \in W$. Hence $T : W^\perp \to W^\perp$, and the restriction of $T$ to $W^\perp$ is again self-adjoint.

By induction there exists an orthonormal basis of eigenvectors $\{e_2,\ldots,e_n\}$ for $W^\perp$. But $V = W \oplus W^\perp$, so $\mathcal{E} = \{e_1,\ldots,e_n\}$ is a basis for $V$ with $\langle e_1, e_i\rangle = 0$ for $2 \le i \le n$ and $\|e_1\| = 1$. Hence $\mathcal{E}$ is an orthonormal basis of eigenvectors for $V$.
Theorem 5.3 Let $T : V \to V$ be a self-adjoint linear map of an inner product space $V$. If $\lambda,\mu$ are distinct eigenvalues of $T$ then
$$\forall u \in V_\lambda \ \forall v \in V_\mu \quad \langle u,v\rangle = 0.$$

Proof. Recall that the eigenvalues of a self-adjoint map are real. If $u \in V_\lambda$ and $v \in V_\mu$ then
$$\lambda\langle u,v\rangle = \langle \lambda u, v\rangle = \langle Tu, v\rangle = \langle u, T^*v\rangle = \langle u, Tv\rangle = \langle u, \mu v\rangle = \bar{\mu}\langle u,v\rangle = \mu\langle u,v\rangle.$$
So $(\lambda - \mu)\langle u,v\rangle = 0$, with $\lambda \ne \mu$. Hence $\langle u,v\rangle = 0$.
Example 5.4 Let
$$A = \begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix}.$$
This matrix is self-adjoint. One calculates the characteristic polynomial and finds $t(t-2)$ (in particular the minimal polynomial is the same, hence you know that the matrix is diagonalisable for reasons other than being self-adjoint).

For eigenvalue $0$, one finds eigenvector $\begin{pmatrix} -i \\ 1 \end{pmatrix}$; for eigenvalue $2$, one finds $\begin{pmatrix} i \\ 1 \end{pmatrix}$. Then we normalise the vectors:
$$v_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} -i \\ 1 \end{pmatrix}, \qquad v_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} i \\ 1 \end{pmatrix}.$$
We let
$$P = \frac{1}{\sqrt{2}}\begin{pmatrix} -i & i \\ 1 & 1 \end{pmatrix}$$
and
$$P^{-1}AP = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}.$$

In general the procedure for orthogonal diagonalisation is as follows. Let $A$ be an $n \times n$ self-adjoint matrix.

• Find the eigenvalues $\lambda_i$ and eigenspaces $V(\lambda_i)$. Because $A$ is diagonalisable, you will have
$$V = V(\lambda_1) \oplus \cdots \oplus V(\lambda_r).$$

• Choose a basis for $V$ as a union of bases of the $V(\lambda_i)$.

• Apply Gram–Schmidt to it to get an orthonormal basis. (By Theorem 5.3, vectors from distinct eigenspaces are already orthogonal, so this amounts to running Gram–Schmidt within each eigenspace.)

For example:
$$A = \begin{pmatrix} 1 & 2 & -2 \\ 2 & 4 & -4 \\ -2 & -4 & 4 \end{pmatrix}.$$
This matrix is symmetric, hence self-adjoint. One calculates the characteristic polynomial and finds $\lambda^2(\lambda - 9)$, up to sign.

For $V(9)$, one finds
$$v_1 = \begin{pmatrix} 1 \\ 2 \\ -2 \end{pmatrix}.$$
To make this orthonormal, divide by the norm, i.e. replace $v_1$ by $\frac{1}{3}v_1$.

For $V(0)$, one finds $V(0) = \mathrm{Span}(v_2, v_3)$ with
$$v_2 = \begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix} \qquad\text{and}\qquad v_3 = \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}.$$
By the Gram–Schmidt process we replace $v_2$ by
$$\frac{1}{\sqrt{5}}\begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix}$$
and $v_3$ by
$$\frac{1}{3\sqrt{5}}\begin{pmatrix} 2 \\ 4 \\ 5 \end{pmatrix}.$$
Let
$$P = \begin{pmatrix} 1/3 & -2/\sqrt{5} & 2/(3\sqrt{5}) \\ 2/3 & 1/\sqrt{5} & 4/(3\sqrt{5}) \\ -2/3 & 0 & 5/(3\sqrt{5}) \end{pmatrix}.$$
We have
$$P^{-1}AP = \begin{pmatrix} 9 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
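This example is easy to verify by machine. A minimal Python sketch (assuming numpy); eigh diagonalises a symmetric matrix with an orthogonal eigenvector matrix, which is exactly the content of the procedure above:

    import numpy as np

    A = np.array([[1.0, 2.0, -2.0],
                  [2.0, 4.0, -4.0],
                  [-2.0, -4.0, 4.0]])
    vals, Q = np.linalg.eigh(A)              # eigenvalues ascending: [0, 0, 9]
    assert np.allclose(Q.T @ Q, np.eye(3))   # Q is orthogonal: Q^{-1} = Q^t
    assert np.allclose(Q.T @ A @ Q, np.diag(vals))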