Inner Product Spaces
Branko Ćurgus
March 4, 2015 at 09:05
1 Inner Product Spaces
We will first introduce several "dot-product-like" objects. We start with the most general.

Definition 1.1. Let $V$ be a vector space over a scalar field $\mathbb{F}$. A function $[\,\cdot\,,\cdot\,] : V \times V \to \mathbb{F}$ is a sesquilinear form on $V$ if the following two conditions are satisfied.

(a) (linearity in the first variable) $\forall\, \alpha, \beta \in \mathbb{F}$ $\forall\, u, v, w \in V$
$$[\alpha u + \beta v, w] = \alpha [u, w] + \beta [v, w].$$

(b) (anti-linearity in the second variable) $\forall\, \alpha, \beta \in \mathbb{F}$ $\forall\, u, v, w \in V$
$$[u, \alpha v + \beta w] = \overline{\alpha}\, [u, v] + \overline{\beta}\, [u, w].$$

Example 1.2. Let $M \in \mathbb{C}^{n \times n}$ be arbitrary. Then
$$[x, y] = (Mx) \cdot y, \qquad x, y \in \mathbb{C}^n,$$
is a sesquilinear form on the complex vector space $\mathbb{C}^n$. Here $\cdot$ denotes the usual dot product in $\mathbb{C}^n$.

An abstract form of the Pythagorean Theorem holds for sesquilinear forms.

Theorem 1.3 (Pythagorean Theorem). Let $[\,\cdot\,,\cdot\,]$ be a sesquilinear form on a vector space $V$ over a scalar field $\mathbb{F}$. If $v_1, \ldots, v_n \in V$ are such that $[v_j, v_k] = 0$ whenever $j \neq k$, $j, k \in \{1, \ldots, n\}$, then
$$\Biggl[\,\sum_{j=1}^{n} v_j,\; \sum_{k=1}^{n} v_k\Biggr] = \sum_{j=1}^{n} [v_j, v_j].$$
Proof. Assume that $[v_j, v_k] = 0$ whenever $j \neq k$, $j, k \in \{1, \ldots, n\}$, and apply the additivity of the sesquilinear form in both variables to get:
$$\Biggl[\,\sum_{j=1}^{n} v_j,\; \sum_{k=1}^{n} v_k\Biggr] = \sum_{j=1}^{n} \sum_{k=1}^{n} [v_j, v_k] = \sum_{j=1}^{n} [v_j, v_j].$$

Theorem 1.4 (Polarization identity). Let $V$ be a vector space over a scalar field $\mathbb{F}$ and let $[\,\cdot\,,\cdot\,] : V \times V \to \mathbb{F}$ be a sesquilinear form on $V$. If $i \in \mathbb{F}$, then
$$[u, v] = \frac{1}{4} \sum_{k=0}^{3} i^k \bigl[u + i^k v,\, u + i^k v\bigr] \tag{1}$$
for all $u, v \in V$.

Corollary 1.5. Let $V$ be a vector space over a scalar field $\mathbb{F}$ and let $[\,\cdot\,,\cdot\,] : V \times V \to \mathbb{F}$ be a sesquilinear form on $V$. If $i \in \mathbb{F}$ and $[v, v] = 0$ for all $v \in V$, then $[u, v] = 0$ for all $u, v \in V$.

Definition 1.6. Let $V$ be a vector space over a scalar field $\mathbb{F}$. A sesquilinear form $[\,\cdot\,,\cdot\,] : V \times V \to \mathbb{F}$ is hermitian if

(c) (hermiticity) $\forall\, u, v \in V$ $[u, v] = \overline{[v, u]}$.

A hermitian sesquilinear form is also called an inner product.
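Identity (1) can be sanity-checked numerically for the sesquilinear form of Example 1.2. The following is a minimal sketch in Python/NumPy; the matrix $M$ and the vectors are randomly generated illustrations, not data from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
# an arbitrary matrix M, as in Example 1.2
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

def form(x, y):
    # [x, y] = (Mx) . y, with the usual dot product a . b = sum_j a_j conj(b_j)
    return np.vdot(y, M @ x)  # np.vdot conjugates its FIRST argument

u = rng.normal(size=n) + 1j * rng.normal(size=n)
v = rng.normal(size=n) + 1j * rng.normal(size=n)

# right-hand side of (1): (1/4) sum_{k=0}^{3} i^k [u + i^k v, u + i^k v]
rhs = sum(1j**k * form(u + 1j**k * v, u + 1j**k * v) for k in range(4)) / 4
polarization_ok = bool(np.isclose(form(u, v), rhs))
print(polarization_ok)
```

Note that `np.vdot` conjugates its first argument, so the second argument of the form must be passed first to match the convention above.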
Corollary 1.7. Let $V$ be a vector space over a scalar field $\mathbb{F}$ such that $i \in \mathbb{F}$. Let $[\,\cdot\,,\cdot\,] : V \times V \to \mathbb{F}$ be a sesquilinear form on $V$. Then $[\,\cdot\,,\cdot\,]$ is hermitian if and only if $[v, v] \in \mathbb{R}$ for all $v \in V$.

Proof. The "only if" direction follows from the definition of a hermitian sesquilinear form. To prove the "if" direction, assume that $[v, v] \in \mathbb{R}$ for all $v \in V$. Let $u, v \in V$ be arbitrary. By assumption $[u + i^k v, u + i^k v] \in \mathbb{R}$ for all $k \in \{0, 1, 2, 3\}$. Therefore
$$\begin{aligned}
\overline{[u, v]} &= \frac{1}{4} \sum_{k=0}^{3} (-i)^k \bigl[u + i^k v,\, u + i^k v\bigr] \\
&= \frac{1}{4} \sum_{k=0}^{3} (-i)^k\, i^k (-i)^k \bigl[(-i)^k u + v,\, (-i)^k u + v\bigr] \\
&= \frac{1}{4} \sum_{k=0}^{3} (-i)^k \bigl[v + (-i)^k u,\, v + (-i)^k u\bigr].
\end{aligned}$$
Notice that the values of $(-i)^k$ at $k = 0, 1, 2, 3$, in this particular order, are $1, -i, -1, i$. These are exactly the values of $i^k$ in the order $k = 0, 3, 2, 1$. Therefore, rearranging the order of the terms in the last four-term sum, we have
$$\frac{1}{4} \sum_{k=0}^{3} (-i)^k \bigl[v + (-i)^k u,\, v + (-i)^k u\bigr] = \frac{1}{4} \sum_{k=0}^{3} i^k \bigl[v + i^k u,\, v + i^k u\bigr].$$
Together with Theorem 1.4, the last two displayed equalities yield $\overline{[u, v]} = [v, u]$.
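Corollary 1.7 can likewise be illustrated numerically: if the matrix $M$ of Example 1.2 is hermitian, the values $[v, v]$ are real and the form satisfies $[u, v] = \overline{[v, u]}$. A sketch with made-up random data:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
M = A + A.conj().T  # hermitian: M equals its own conjugate transpose

def form(x, y):
    # [x, y] = (Mx) . y as in Example 1.2
    return np.vdot(y, M @ x)

u = rng.normal(size=n) + 1j * rng.normal(size=n)
v = rng.normal(size=n) + 1j * rng.normal(size=n)

diag_is_real = bool(abs(form(v, v).imag) < 1e-10)           # [v, v] is real
hermitian = bool(np.isclose(form(u, v), np.conj(form(v, u))))  # [u, v] = conj([v, u])
print(diag_is_real, hermitian)
```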
Let $[\,\cdot\,,\cdot\,]$ be an inner product on $V$. The hermiticity of $[\,\cdot\,,\cdot\,]$ implies that $[v, v] = \overline{[v, v]}$ for all $v \in V$. Thus $[v, v] \in \mathbb{R}$ for all $v \in V$. The natural trichotomy that arises is the motivation for the following definition.

Definition 1.8. An inner product $[\,\cdot\,,\cdot\,]$ on $V$ is called nonnegative if $[v, v] \geq 0$ for all $v \in V$; it is called nonpositive if $[v, v] \leq 0$ for all $v \in V$; and it is called indefinite if there exist $u \in V$ and $v \in V$ such that $[u, u] < 0$ and $[v, v] > 0$.
2 Nonnegative inner products
The following implication, which you might have learned in high school, will be useful below.

Theorem 2.1 (High School Theorem). Let $a, b, c$ be real numbers. Assume $a \geq 0$. Then the following implication holds:
$$\bigl(\forall\, x \in \mathbb{Q} \quad a x^2 + b x + c \geq 0\bigr) \;\Rightarrow\; b^2 - 4ac \leq 0. \tag{2}$$

Theorem 2.2 (Cauchy-Bunyakovsky-Schwarz Inequality). Let $V$ be a vector space over $\mathbb{F}$ and let $\langle\,\cdot\,,\cdot\,\rangle$ be a nonnegative inner product on $V$. Then
$$\forall\, u, v \in V \qquad |\langle u, v \rangle|^2 \leq \langle u, u \rangle \langle v, v \rangle. \tag{3}$$
Equality occurs in (3) if and only if there exist $\alpha, \beta \in \mathbb{F}$, not both $0$, such that $\langle \alpha u + \beta v, \alpha u + \beta v \rangle = 0$.
Proof. Let $u, v \in V$ be arbitrary. Since $\langle\,\cdot\,,\cdot\,\rangle$ is nonnegative we have
$$\forall\, t \in \mathbb{Q} \qquad \bigl\langle u + t \langle u, v \rangle v,\; u + t \langle u, v \rangle v \bigr\rangle \geq 0. \tag{4}$$
Since $\langle\,\cdot\,,\cdot\,\rangle$ is a hermitian sesquilinear form on $V$, (4) is equivalent to
$$\forall\, t \in \mathbb{Q} \qquad \langle u, u \rangle + 2t\, |\langle u, v \rangle|^2 + t^2\, |\langle u, v \rangle|^2 \langle v, v \rangle \geq 0. \tag{5}$$
As $|\langle u, v \rangle|^2 \langle v, v \rangle \geq 0$, the High School Theorem applies and (5) implies
$$4\, |\langle u, v \rangle|^4 - 4\, |\langle u, v \rangle|^2 \langle u, u \rangle \langle v, v \rangle \leq 0. \tag{6}$$
Again, since $\langle u, u \rangle \geq 0$ and $\langle v, v \rangle \geq 0$, (6) implies
$$|\langle u, v \rangle|^2 \leq \langle u, u \rangle \langle v, v \rangle.$$
Since $u, v \in V$ were arbitrary, (3) is proved.

Next we prove the claim concerning equality in (3). We first prove the "if" part. Assume that $u, v \in V$ and $\alpha, \beta \in \mathbb{F}$ are such that $|\alpha|^2 + |\beta|^2 > 0$ and
$$\langle \alpha u + \beta v, \alpha u + \beta v \rangle = 0.$$
We need to prove that $|\langle u, v \rangle|^2 = \langle u, u \rangle \langle v, v \rangle$. Since $|\alpha|^2 + |\beta|^2 > 0$, we have two cases: $\alpha \neq 0$ or $\beta \neq 0$. We consider the case $\alpha \neq 0$; the case $\beta \neq 0$ is similar. Set $w = \alpha u + \beta v$. Then $\langle w, w \rangle = 0$ and $u = \gamma v + \delta w$, where $\gamma = -\beta/\alpha$ and $\delta = 1/\alpha$. Notice that the Cauchy-Bunyakovsky-Schwarz inequality and $\langle w, w \rangle = 0$ imply that $\langle w, x \rangle = 0$ for all $x \in V$. Now we calculate
$$|\langle u, v \rangle| = |\langle \gamma v + \delta w, v \rangle| = |\gamma \langle v, v \rangle + \delta \langle w, v \rangle| = |\gamma \langle v, v \rangle| = |\gamma|\, \langle v, v \rangle$$
and
$$\langle u, u \rangle = \langle \gamma v + \delta w, \gamma v + \delta w \rangle = \langle \gamma v, \gamma v \rangle = |\gamma|^2 \langle v, v \rangle.$$
Thus,
$$|\langle u, v \rangle|^2 = |\gamma|^2 \langle v, v \rangle^2 = \langle u, u \rangle \langle v, v \rangle.$$
This completes the proof of the "if" part.

To prove the "only if" part, assume $|\langle u, v \rangle|^2 = \langle u, u \rangle \langle v, v \rangle$. If $\langle v, v \rangle = 0$, then with $\alpha = 0$ and $\beta = 1$ we have
$$\langle \alpha u + \beta v, \alpha u + \beta v \rangle = \langle v, v \rangle = 0.$$
If $\langle v, v \rangle \neq 0$, then with $\alpha = \langle v, v \rangle$ and $\beta = -\langle u, v \rangle$ we have $|\alpha|^2 + |\beta|^2 > 0$ and
$$\langle \alpha u + \beta v, \alpha u + \beta v \rangle = \langle v, v \rangle \bigl( \langle v, v \rangle \langle u, u \rangle - |\langle u, v \rangle|^2 - |\langle u, v \rangle|^2 + |\langle u, v \rangle|^2 \bigr) = 0.$$
This completes the proof of the characterization of equality in the Cauchy-Bunyakovsky-Schwarz Inequality.
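The inequality (3) and the equality case can be checked on a genuinely degenerate nonnegative inner product $\langle x, y \rangle = (Ax) \cdot y$ with $A$ positive semidefinite but singular. This is a sketch with made-up random data:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
B = rng.normal(size=(2, n)) + 1j * rng.normal(size=(2, n))
A = B.conj().T @ B  # positive semidefinite of rank <= 2: nonnegative, degenerate

def ip(x, y):
    # <x, y> = (Ax) . y
    return np.vdot(y, A @ x)

u = rng.normal(size=n) + 1j * rng.normal(size=n)
v = rng.normal(size=n) + 1j * rng.normal(size=n)

# the inequality (3)
cbs_holds = bool(abs(ip(u, v))**2 <= ip(u, u).real * ip(v, v).real + 1e-9)

# equality when v is a scalar multiple of u (then alpha*u + beta*v = 0 is possible)
v2 = (2 - 3j) * u
equality = bool(np.isclose(abs(ip(u, v2))**2, ip(u, u).real * ip(v2, v2).real))
print(cbs_holds, equality)
```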
Corollary 2.3. Let $V$ be a vector space over $\mathbb{F}$ and let $\langle\,\cdot\,,\cdot\,\rangle$ be a nonnegative inner product on $V$. Then the following two implications are equivalent.

(i) If $v \in V$ and $\langle u, v \rangle = 0$ for all $u \in V$, then $v = 0$.

(ii) If $v \in V$ and $\langle v, v \rangle = 0$, then $v = 0$.

Proof. Assume that implication (i) holds and let $v \in V$ be such that $\langle v, v \rangle = 0$. Let $u \in V$ be arbitrary. By the CBS inequality,
$$|\langle u, v \rangle|^2 \leq \langle u, u \rangle \langle v, v \rangle = 0.$$
Thus $\langle u, v \rangle = 0$ for all $u \in V$. By (i) we conclude $v = 0$. This proves (ii).

The converse is trivial. However, here is a proof. Assume that implication (ii) holds. To prove (i), let $v \in V$ and assume $\langle u, v \rangle = 0$ for all $u \in V$. Setting $u = v$ we get $\langle v, v \rangle = 0$. Now (ii) yields $v = 0$.

Definition 2.4. Let $V$ be a vector space over a scalar field $\mathbb{F}$. An inner product $[\,\cdot\,,\cdot\,]$ on $V$ is nondegenerate if the following implication holds:

(d) (nondegeneracy) $u \in V$ and $[u, v] = 0$ for all $v \in V$ implies $u = 0$.

We conclude this section with a characterization of the best approximation property.
Theorem 2.5 (Best Approximation-Orthogonality Theorem). Let $(V, \langle\,\cdot\,,\cdot\,\rangle)$ be an inner product space with a nonnegative inner product. Let $U$ be a subspace of $V$. Let $v \in V$ and $u_0 \in U$. Then
$$\forall\, u \in U \qquad \langle v - u_0, v - u_0 \rangle \leq \langle v - u, v - u \rangle \tag{7}$$
if and only if
$$\forall\, u \in U \qquad \langle v - u_0, u \rangle = 0. \tag{8}$$
Proof. First we prove the "only if" part. Assume (7). Let $u \in U$ be arbitrary. Set $\alpha = \langle v - u_0, u \rangle$. Clearly $\alpha \in \mathbb{F}$. Let $t \in \mathbb{Q} \subseteq \mathbb{F}$ be arbitrary. Since $u_0 - t \alpha u \in U$, (7) implies
$$\forall\, t \in \mathbb{Q} \qquad \langle v - u_0, v - u_0 \rangle \leq \langle v - u_0 + t \alpha u,\; v - u_0 + t \alpha u \rangle. \tag{9}$$
Now recall that $\alpha = \langle v - u_0, u \rangle$ and expand the right-hand side of (9):
$$\begin{aligned}
\langle v - u_0 + t \alpha u,\; v - u_0 + t \alpha u \rangle
&= \langle v - u_0, v - u_0 \rangle + \langle v - u_0, t \alpha u \rangle + \langle t \alpha u, v - u_0 \rangle + \langle t \alpha u, t \alpha u \rangle \\
&= \langle v - u_0, v - u_0 \rangle + t \overline{\alpha}\, \langle v - u_0, u \rangle + t \alpha\, \langle u, v - u_0 \rangle + t^2 |\alpha|^2 \langle u, u \rangle \\
&= \langle v - u_0, v - u_0 \rangle + 2t\, |\alpha|^2 + t^2 |\alpha|^2 \langle u, u \rangle.
\end{aligned}$$
Thus (9) is equivalent to
$$\forall\, t \in \mathbb{Q} \qquad 0 \leq 2t\, |\alpha|^2 + t^2 |\alpha|^2 \langle u, u \rangle. \tag{10}$$
By the High School Theorem, (10) implies
$$4 |\alpha|^4 - 4\, |\alpha|^2 \langle u, u \rangle \cdot 0 = 4 |\alpha|^4 \leq 0.$$
Consequently $\alpha = \langle v - u_0, u \rangle = 0$. Since $u \in U$ was arbitrary, (8) is proved.

For the "if" part assume that (8) is true. Let $u \in U$ be arbitrary. Notice that $u_0 - u \in U$ and calculate
$$\begin{aligned}
\langle v - u, v - u \rangle &= \langle v - u_0 + u_0 - u,\; v - u_0 + u_0 - u \rangle \\
&= \langle v - u_0, v - u_0 \rangle + \langle u_0 - u, u_0 - u \rangle && \text{by (8) and the Pythagorean Theorem} \\
&\geq \langle v - u_0, v - u_0 \rangle && \text{since } \langle u_0 - u, u_0 - u \rangle \geq 0.
\end{aligned}$$
This proves (7).
3 Positive definite inner products
It follows from Corollary 2.3 that a nonnegative inner product $\langle\,\cdot\,,\cdot\,\rangle$ on $V$ is nondegenerate if and only if $\langle v, v \rangle = 0$ implies $v = 0$. A nonnegative nondegenerate inner product is also called a positive definite inner product. Since this is the most often encountered inner product, we give its definition as it is commonly given in textbooks.
Definition 3.1. Let $V$ be a vector space over a scalar field $\mathbb{F}$. A function $\langle\,\cdot\,,\cdot\,\rangle : V \times V \to \mathbb{F}$ is called a positive definite inner product on $V$ if the following conditions are satisfied:

(a) $\forall\, u, v, w \in V$ $\forall\, \alpha, \beta \in \mathbb{F}$ $\langle \alpha u + \beta v, w \rangle = \alpha \langle u, w \rangle + \beta \langle v, w \rangle$,

(b) $\forall\, u, v \in V$ $\langle u, v \rangle = \overline{\langle v, u \rangle}$,

(c) $\forall\, v \in V$ $\langle v, v \rangle \geq 0$,

(d) if $v \in V$ and $\langle v, v \rangle = 0$, then $v = 0$.

A positive definite inner product gives rise to a norm.
Theorem 3.2. Let $(V, \langle\,\cdot\,,\cdot\,\rangle)$ be a vector space over $\mathbb{F}$ with a positive definite inner product $\langle\,\cdot\,,\cdot\,\rangle$. The function $\|\cdot\| : V \to \mathbb{R}$ defined by
$$\|v\| = \sqrt{\langle v, v \rangle}, \qquad v \in V,$$
is a norm on $V$. That is, for all $u, v \in V$ and all $\alpha \in \mathbb{F}$ we have $\|v\| \geq 0$, $\|\alpha v\| = |\alpha| \|v\|$, $\|u + v\| \leq \|u\| + \|v\|$, and $\|v\| = 0$ implies $v = 0_V$.

Definition 3.3. Let $(V, \langle\,\cdot\,,\cdot\,\rangle)$ be a vector space over $\mathbb{F}$ with a positive definite inner product $\langle\,\cdot\,,\cdot\,\rangle$. A set of vectors $A \subset V$ is said to form an orthogonal system in $V$ if for all $u, v \in A$ we have $\langle u, v \rangle = 0$ whenever $u \neq v$, and for all $v \in A$ we have $\langle v, v \rangle > 0$. An orthogonal system $A$ is called an orthonormal system if for all $v \in A$ we have $\langle v, v \rangle = 1$.

Proposition 3.4. Let $(V, \langle\,\cdot\,,\cdot\,\rangle)$ be a vector space over $\mathbb{F}$ with a positive definite inner product $\langle\,\cdot\,,\cdot\,\rangle$. Let $u_1, \ldots, u_n$ be an orthogonal system in $V$. If $v = \sum_{j=1}^{n} \alpha_j u_j$, then $\alpha_j = \langle v, u_j \rangle / \langle u_j, u_j \rangle$. In particular, an orthogonal system is linearly independent.

Theorem 3.5 (The Gram-Schmidt orthogonalization). Let $(V, \langle\,\cdot\,,\cdot\,\rangle)$ be a vector space over $\mathbb{F}$ with a positive definite inner product $\langle\,\cdot\,,\cdot\,\rangle$. Let $n \in \mathbb{N}$ and let $v_1, \ldots, v_n$ be linearly independent vectors in $V$. Let the vectors $u_1, \ldots, u_n$ be defined recursively by
$$u_1 = v_1, \qquad u_{k+1} = v_{k+1} - \sum_{j=1}^{k} \frac{\langle v_{k+1}, u_j \rangle}{\langle u_j, u_j \rangle}\, u_j, \qquad k \in \{1, \ldots, n-1\}.$$
Then the vectors $u_1, \ldots, u_n$ form an orthogonal system which has the same span as the given vectors $v_1, \ldots, v_n$.
Proof. We will prove by Mathematical Induction the following statement: for all $k \in \{1, \ldots, n\}$ we have:

(a) $\langle u_k, u_k \rangle > 0$ and $\langle u_j, u_k \rangle = 0$ whenever $j \in \{1, \ldots, k-1\}$;

(b) the vectors $u_1, \ldots, u_k$ are linearly independent;

(c) $\operatorname{span}\{u_1, \ldots, u_k\} = \operatorname{span}\{v_1, \ldots, v_k\}$.

For $k = 1$ statements (a), (b) and (c) are clearly true. Let $m \in \{1, \ldots, n-1\}$ and assume that statements (a), (b) and (c) are true for all $k \in \{1, \ldots, m\}$. Next we will prove that statements (a), (b) and (c) are true for $k = m+1$. Recall the definition of $u_{m+1}$:
$$u_{m+1} = v_{m+1} - \sum_{j=1}^{m} \frac{\langle v_{m+1}, u_j \rangle}{\langle u_j, u_j \rangle}\, u_j.$$
By the Inductive Hypothesis we have $\operatorname{span}\{u_1, \ldots, u_m\} = \operatorname{span}\{v_1, \ldots, v_m\}$. Since $v_1, \ldots, v_{m+1}$ are linearly independent, $v_{m+1} \notin \operatorname{span}\{u_1, \ldots, u_m\}$. Therefore $u_{m+1} \neq 0_V$, that is, $\langle u_{m+1}, u_{m+1} \rangle > 0$. Let $k \in \{1, \ldots, m\}$ be arbitrary. Then by the Inductive Hypothesis we have that $\langle u_j, u_k \rangle = 0$ whenever $j \in \{1, \ldots, m\}$ and $j \neq k$. Therefore,
$$\langle u_{m+1}, u_k \rangle = \langle v_{m+1}, u_k \rangle - \sum_{j=1}^{m} \frac{\langle v_{m+1}, u_j \rangle}{\langle u_j, u_j \rangle} \langle u_j, u_k \rangle = \langle v_{m+1}, u_k \rangle - \langle v_{m+1}, u_k \rangle = 0.$$
This proves claim (a). To prove claim (b), notice that by the Inductive Hypothesis $u_1, \ldots, u_m$ are linearly independent and $u_{m+1} \notin \operatorname{span}\{u_1, \ldots, u_m\}$, since $v_{m+1} \notin \operatorname{span}\{u_1, \ldots, u_m\}$. To prove claim (c), notice that the definition of $u_{m+1}$ implies $u_{m+1} \in \operatorname{span}\{v_1, \ldots, v_{m+1}\}$. Since by the Inductive Hypothesis $\operatorname{span}\{u_1, \ldots, u_m\} = \operatorname{span}\{v_1, \ldots, v_m\}$, we have $\operatorname{span}\{u_1, \ldots, u_{m+1}\} \subseteq \operatorname{span}\{v_1, \ldots, v_{m+1}\}$. The converse inclusion follows from the fact that $v_{m+1} \in \operatorname{span}\{u_1, \ldots, u_{m+1}\}$.

It is clear that the claim of the theorem follows from the statement that has been proved.
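The recursion of Theorem 3.5 translates directly into code. Here is a sketch for $\mathbb{C}^n$ with the standard inner product $\langle x, y \rangle = \sum_j x_j \overline{y_j}$; the input vectors are made up.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthogonalize linearly independent vectors by the recursion of Theorem 3.5."""
    us = []
    for v in vectors:
        # u_{k+1} = v_{k+1} - sum_j (<v_{k+1}, u_j> / <u_j, u_j>) u_j
        u = v - sum((np.vdot(uj, v) / np.vdot(uj, uj)) * uj for uj in us)
        us.append(u)
    return us

rng = np.random.default_rng(4)
vs = [rng.normal(size=4) + 1j * rng.normal(size=4) for _ in range(3)]
us = gram_schmidt(vs)

orthogonal = all(abs(np.vdot(us[i], us[j])) < 1e-10
                 for i in range(3) for j in range(3) if i != j)
same_span = bool(np.linalg.matrix_rank(np.column_stack(vs + us)) == 3)
print(orthogonal, same_span)
```

Since `np.vdot` conjugates its first argument, `np.vdot(uj, v)` computes $\langle v, u_j \rangle$ in the convention used throughout these notes (linearity in the first slot).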
The following two statements are immediate consequences of the Gram-Schmidt orthogonalization process.

Corollary 3.6. If $V$ is a finite dimensional vector space with a positive definite inner product $\langle\,\cdot\,,\cdot\,\rangle$, then $V$ has an orthonormal basis.

Corollary 3.7. If $V$ is a complex vector space with a positive definite inner product and $T \in \mathcal{L}(V)$, then there exists an orthonormal basis $\mathcal{B}$ of $V$ such that $\mathcal{M}_{\mathcal{B}}^{\mathcal{B}}(T)$ is upper-triangular.

Definition 3.8. Let $(V, \langle\,\cdot\,,\cdot\,\rangle)$ be a positive definite inner product space and $A \subset V$. We define
$$A^{\perp} = \{\, v \in V : \langle v, a \rangle = 0 \;\; \forall\, a \in A \,\}.$$

The following is a simple proposition.
Proposition 3.9. Let $(V, \langle\,\cdot\,,\cdot\,\rangle)$ be a positive definite inner product space and $A \subset V$. Then $A^{\perp}$ is a subspace of $V$.

Theorem 3.10. Let $(V, \langle\,\cdot\,,\cdot\,\rangle)$ be a positive definite inner product space and let $U$ be a finite dimensional subspace of $V$. Then $V = U \oplus U^{\perp}$.

Proof. We first prove that $V = U + U^{\perp}$. Note that since $U$ is a subspace of $V$, $U$ inherits the positive definite inner product from $V$. Thus $U$ is a finite dimensional positive definite inner product space, and hence there exists an orthonormal basis $\mathcal{B} = \{u_1, u_2, \ldots, u_k\}$ of $U$.

Let $v \in V$ be arbitrary. Then
$$v = \sum_{j=1}^{k} \langle v, u_j \rangle u_j + \Biggl( v - \sum_{j=1}^{k} \langle v, u_j \rangle u_j \Biggr),$$
where the first summand is in $U$. We will prove that the second summand is in $U^{\perp}$. Set $w = \sum_{j=1}^{k} \langle v, u_j \rangle u_j \in U$. We claim that $v - w \in U^{\perp}$. To prove this claim, let $u \in U$ be arbitrary. Since $\mathcal{B}$ is an orthonormal basis of $U$, by Proposition 3.4 we have
$$u = \sum_{j=1}^{k} \langle u, u_j \rangle u_j.$$
Therefore
$$\langle v - w, u \rangle = \langle v, u \rangle - \sum_{j=1}^{k} \langle v, u_j \rangle \langle u_j, u \rangle = \langle v, u \rangle - \Bigl\langle v, \sum_{j=1}^{k} \langle u, u_j \rangle u_j \Bigr\rangle = \langle v, u \rangle - \langle v, u \rangle = 0.$$
Thus $\langle v - w, u \rangle = 0$ for all $u \in U$, that is, $v - w \in U^{\perp}$. This proves that $V = U + U^{\perp}$.

To prove that the sum is direct, let $v \in U$ and $v \in U^{\perp}$. Then $\langle v, v \rangle = 0$. Since $\langle\,\cdot\,,\cdot\,\rangle$ is positive definite, this implies $v = 0_V$. The theorem is proved.
Corollary 3.11. Let $(V, \langle\,\cdot\,,\cdot\,\rangle)$ be a positive definite inner product space and let $U$ be a finite dimensional subspace of $V$. Then $\bigl(U^{\perp}\bigr)^{\perp} = U$.

Exercise 3.12. Let $(V, \langle\,\cdot\,,\cdot\,\rangle)$ be a positive definite inner product space and let $U$ be a subspace of $V$. Prove that $\Bigl(\bigl(U^{\perp}\bigr)^{\perp}\Bigr)^{\perp} = U^{\perp}$.
Recall that an arbitrary direct sum $V = U \oplus W$ gives rise to a projection operator $P_{U \parallel W}$, the projection of $V$ onto $U$ parallel to $W$. If $V = U \oplus U^{\perp}$, then the resulting projection of $V$ onto $U$ parallel to $U^{\perp}$ is called the orthogonal projection of $V$ onto $U$; it is denoted simply by $P_U$. By definition, for every $v \in V$,
$$u = P_U v \quad \Leftrightarrow \quad u \in U \;\text{ and }\; v - u \in U^{\perp}.$$
As for any projection, we have $P_U \in \mathcal{L}(V)$, $\operatorname{ran} P_U = U$, $\operatorname{nul} P_U = U^{\perp}$, and $(P_U)^2 = P_U$.

Theorems 3.10 and 2.5 yield the following solution of the best approximation problem for finite dimensional subspaces of a vector space with a positive definite inner product.

Corollary 3.13. Let $(V, \langle\,\cdot\,,\cdot\,\rangle)$ be a vector space with a positive definite inner product and let $U$ be a finite dimensional subspace of $V$. For arbitrary $v \in V$, the vector $P_U v \in U$ is the unique best approximation of $v$ in $U$. That is,
$$\|v - P_U v\| \leq \|v - u\| \quad \text{for all } u \in U.$$
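Corollary 3.13 and the decomposition in the proof of Theorem 3.10 can be illustrated in $\mathbb{C}^4$: compute $P_U v = \sum_j \langle v, u_j \rangle u_j$ from an orthonormal basis of $U$, check that the residual lies in $U^{\perp}$, and compare distances against other elements of $U$. A sketch with made-up data:

```python
import numpy as np

rng = np.random.default_rng(5)
# orthonormal basis (columns of Q) of a 2-dimensional subspace U of C^4
Q, _ = np.linalg.qr(rng.normal(size=(4, 2)) + 1j * rng.normal(size=(4, 2)))
v = rng.normal(size=4) + 1j * rng.normal(size=4)

# P_U v = sum_j <v, u_j> u_j
Pv = sum(np.vdot(Q[:, j], v) * Q[:, j] for j in range(2))

# v - P_U v lies in U^perp ...
residual_orthogonal = all(abs(np.vdot(Q[:, j], v - Pv)) < 1e-10 for j in range(2))

# ... and P_U v is at least as close to v as sampled competitors from U
competitors = (Q @ (rng.normal(size=2) + 1j * rng.normal(size=2)) for _ in range(5))
best = all(np.linalg.norm(v - Pv) <= np.linalg.norm(v - u) + 1e-12
           for u in competitors)
print(residual_orthogonal, best)
```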
4 The definition of an adjoint operator
Let $V$ be a vector space over $\mathbb{F}$. The space $\mathcal{L}(V, \mathbb{F})$ is called the dual space of $V$; it is denoted by $V^*$.
Theorem 4.1. Let $V$ be a finite dimensional vector space over $\mathbb{F}$ and let $\langle\,\cdot\,,\cdot\,\rangle$ be a positive definite inner product on $V$. Define the mapping $\Phi : V \to V^*$ as follows: for $w \in V$ we set
$$\bigl(\Phi(w)\bigr)(v) = \langle v, w \rangle \quad \text{for all } v \in V.$$
Then $\Phi$ is an anti-linear bijection.

Proof. Clearly, for each $w \in V$, $\Phi(w) \in V^*$. The mapping $\Phi$ is anti-linear, since for $\alpha, \beta \in \mathbb{F}$ and $u, w \in V$, for all $v \in V$ we have
$$\bigl(\Phi(\alpha u + \beta w)\bigr)(v) = \langle v, \alpha u + \beta w \rangle = \overline{\alpha}\, \langle v, u \rangle + \overline{\beta}\, \langle v, w \rangle = \overline{\alpha}\, \bigl(\Phi(u)\bigr)(v) + \overline{\beta}\, \bigl(\Phi(w)\bigr)(v) = \bigl(\overline{\alpha}\, \Phi(u) + \overline{\beta}\, \Phi(w)\bigr)(v).$$
Thus $\Phi(\alpha u + \beta w) = \overline{\alpha}\, \Phi(u) + \overline{\beta}\, \Phi(w)$. This proves anti-linearity.

To prove injectivity of $\Phi$, let $u, w \in V$ be such that $\Phi(u) = \Phi(w)$. Then $\bigl(\Phi(u)\bigr)(v) = \bigl(\Phi(w)\bigr)(v)$ for all $v \in V$. By the definition of $\Phi$ this means $\langle v, u \rangle = \langle v, w \rangle$ for all $v \in V$. Consequently, $\langle v, u - w \rangle = 0$ for all $v \in V$. In particular, with $v = u - w$ we have $\langle u - w, u - w \rangle = 0$. Since $\langle\,\cdot\,,\cdot\,\rangle$ is a positive definite inner product, it follows that $u - w = 0_V$, that is, $u = w$.

To prove that $\Phi$ is a surjection, we use the assumption that $V$ is finite dimensional. Then there exists an orthonormal basis $u_1, \ldots, u_n$ of $V$. Let $\varphi \in V^*$ be arbitrary. Set
$$w = \sum_{j=1}^{n} \overline{\varphi(u_j)}\, u_j.$$
The proof that $\Phi(w) = \varphi$ follows. Let $v \in V$ be arbitrary. Then
$$\bigl(\Phi(w)\bigr)(v) = \langle v, w \rangle = \Bigl\langle v, \sum_{j=1}^{n} \overline{\varphi(u_j)}\, u_j \Bigr\rangle = \sum_{j=1}^{n} \varphi(u_j)\, \langle v, u_j \rangle = \varphi\Biggl( \sum_{j=1}^{n} \langle v, u_j \rangle u_j \Biggr) = \varphi(v).$$
The theorem is proved.
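The surjectivity construction in the proof, $w = \sum_j \overline{\varphi(u_j)}\, u_j$, can be tested in $\mathbb{C}^n$ with the standard basis. The functional below is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
c = rng.normal(size=n) + 1j * rng.normal(size=n)
phi = lambda x: c @ x          # an arbitrary linear functional on C^n
E = np.eye(n)                  # standard orthonormal basis u_1, ..., u_n

# representing vector from the proof: w = sum_j conj(phi(u_j)) u_j
w = sum(np.conj(phi(E[:, j])) * E[:, j] for j in range(n))

v = rng.normal(size=n) + 1j * rng.normal(size=n)
# (Phi(w))(v) = <v, w> should reproduce phi(v)
riesz_ok = bool(np.isclose(phi(v), np.vdot(w, v)))
print(riesz_ok)
```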
The mapping $\Phi$ from the previous theorem is convenient for defining the adjoint of a linear operator. In the next definition we will deal with two positive definite inner product spaces. To emphasize the different inner products and the different mappings $\Phi$ we will use subscripts.

Let $(V, \langle\,\cdot\,,\cdot\,\rangle_V)$ and $(W, \langle\,\cdot\,,\cdot\,\rangle_W)$ be two finite dimensional vector spaces over the same scalar field $\mathbb{F}$ and with positive definite inner products. Let $T \in \mathcal{L}(V, W)$. We define the adjoint $T^* : W \to V$ of $T$ by
$$T^* w = \Phi_V^{-1}\bigl( \Phi_W(w) \circ T \bigr), \qquad w \in W. \tag{11}$$
Since $\Phi_W$ and $\Phi_V^{-1}$ are anti-linear, $T^*$ is linear. For arbitrary $\alpha_1, \alpha_2 \in \mathbb{F}$ and $w_1, w_2 \in W$ we have
$$\begin{aligned}
T^*(\alpha_1 w_1 + \alpha_2 w_2) &= \Phi_V^{-1}\bigl( \Phi_W(\alpha_1 w_1 + \alpha_2 w_2) \circ T \bigr) \\
&= \Phi_V^{-1}\Bigl( \bigl( \overline{\alpha_1}\, \Phi_W(w_1) + \overline{\alpha_2}\, \Phi_W(w_2) \bigr) \circ T \Bigr) \\
&= \Phi_V^{-1}\bigl( \overline{\alpha_1}\, \Phi_W(w_1) \circ T + \overline{\alpha_2}\, \Phi_W(w_2) \circ T \bigr) \\
&= \alpha_1\, \Phi_V^{-1}\bigl( \Phi_W(w_1) \circ T \bigr) + \alpha_2\, \Phi_V^{-1}\bigl( \Phi_W(w_2) \circ T \bigr) \\
&= \alpha_1 T^* w_1 + \alpha_2 T^* w_2.
\end{aligned}$$
Thus $T^* \in \mathcal{L}(W, V)$.

Next we will deduce the most important property of $T^*$. Recall from the definition of $\Phi_V$ that, for a fixed $w \in W$,
$$T^* w = \Phi_V^{-1}\bigl( \Phi_W(w) \circ T \bigr)$$
is equivalent to
$$\bigl( \Phi_W(w) \circ T \bigr)(v) = \langle v, T^* w \rangle_V \quad \text{for all } v \in V,$$
which, in turn, is equivalent to
$$\bigl( \Phi_W(w) \bigr)(T v) = \langle v, T^* w \rangle_V \quad \text{for all } v \in V.$$
From the definition of $\Phi_W$ the last statement is equivalent to
$$\langle T v, w \rangle_W = \langle v, T^* w \rangle_V \quad \text{for all } v \in V.$$
The reasoning above proves the following proposition.
Proposition 4.2. Let $(V, \langle\,\cdot\,,\cdot\,\rangle_V)$ and $(W, \langle\,\cdot\,,\cdot\,\rangle_W)$ be two finite dimensional vector spaces over the same scalar field $\mathbb{F}$ and with positive definite inner products. Let $T \in \mathcal{L}(V, W)$ and $S \in \mathcal{L}(W, V)$. Then $S = T^*$ if and only if
$$\langle T v, w \rangle_W = \langle v, S w \rangle_V \quad \text{for all } v \in V, \; w \in W. \tag{12}$$

5 Properties of the adjoint operator
Theorem 5.1. Let $(V, \langle\,\cdot\,,\cdot\,\rangle_V)$ and $(W, \langle\,\cdot\,,\cdot\,\rangle_W)$ be two finite dimensional vector spaces over the same scalar field $\mathbb{F}$ and with positive definite inner products. The following statements hold.

(a) The adjoint mapping
$$* : \mathcal{L}(V, W) \to \mathcal{L}(W, V)$$
is an anti-linear involution. The inverse of this mapping is the adjoint mapping from $\mathcal{L}(W, V)$ to $\mathcal{L}(V, W)$.

(b) For $T \in \mathcal{L}(V, W)$ we have

  (i) $\operatorname{nul}(T^*) = (\operatorname{ran} T)^{\perp}$;
  (ii) $\operatorname{ran}(T^*) = (\operatorname{nul} T)^{\perp}$;
  (iii) $\operatorname{nul}(T) = (\operatorname{ran} T^*)^{\perp}$;
  (iv) $\operatorname{ran}(T) = (\operatorname{nul} T^*)^{\perp}$.

(c) $T \in \mathcal{L}(V, W)$ is invertible if and only if $T^*$ is invertible; in this case $\bigl(T^{-1}\bigr)^* = \bigl(T^*\bigr)^{-1}$.

(d) Let $\mathcal{B}$ and $\mathcal{C}$ be orthonormal bases of $(V, \langle\,\cdot\,,\cdot\,\rangle_V)$ and $(W, \langle\,\cdot\,,\cdot\,\rangle_W)$, respectively, and let $T \in \mathcal{L}(V, W)$. Then $\mathcal{M}_{\mathcal{C}}^{\mathcal{B}}(T^*)$ is the conjugate transpose of the matrix $\mathcal{M}_{\mathcal{B}}^{\mathcal{C}}(T)$.
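Part (d) together with Proposition 4.2 says that, in orthonormal coordinates, the adjoint is the conjugate transpose. A quick numerical check with a made-up operator $T : \mathbb{C}^4 \to \mathbb{C}^3$:

```python
import numpy as np

rng = np.random.default_rng(7)
T = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))  # T : C^4 -> C^3
S = T.conj().T                                              # candidate for T*

v = rng.normal(size=4) + 1j * rng.normal(size=4)
w = rng.normal(size=3) + 1j * rng.normal(size=3)

# Proposition 4.2: S = T* iff <Tv, w>_W = <v, Sw>_V
adjoint_ok = bool(np.isclose(np.vdot(w, T @ v), np.vdot(S @ w, v)))
print(adjoint_ok)
```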
Theorem 5.2. Let $(U, \langle\,\cdot\,,\cdot\,\rangle_U)$, $(V, \langle\,\cdot\,,\cdot\,\rangle_V)$ and $(W, \langle\,\cdot\,,\cdot\,\rangle_W)$ be three finite dimensional vector spaces over the same scalar field $\mathbb{F}$ and with positive definite inner products. Let $S \in \mathcal{L}(U, V)$ and $T \in \mathcal{L}(V, W)$. Then $(TS)^* = S^* T^*$.
Lemma 5.3. Let $V$ be a vector space over $\mathbb{F}$ and let $\langle\,\cdot\,,\cdot\,\rangle$ be a positive definite inner product on $V$. Let $U$ be a subspace of $V$ and let $T \in \mathcal{L}(V)$. The subspace $U$ is invariant under $T$ if and only if the subspace $U^{\perp}$ is invariant under $T^*$.

Proof. By the definition of the adjoint we have
$$\langle T u, v \rangle = \langle u, T^* v \rangle \tag{13}$$
for all $u, v \in V$. Assume $T U \subseteq U$. From (13) we get
$$0 = \langle T u, v \rangle = \langle u, T^* v \rangle \qquad \forall\, u \in U \;\; \forall\, v \in U^{\perp}.$$
Therefore $T^* v \in U^{\perp}$ for all $v \in U^{\perp}$. This proves the "only if" part. The proof of the "if" part is similar.
6 Self-adjoint and normal operators
Definition 6.1. Let $V$ be a vector space over $\mathbb{F}$ and let $\langle\,\cdot\,,\cdot\,\rangle$ be a positive definite inner product on $V$. An operator $T \in \mathcal{L}(V)$ is said to be self-adjoint if $T = T^*$. An operator $T \in \mathcal{L}(V)$ is said to be normal if $T T^* = T^* T$.

Proposition 6.2. Let $V$ be a vector space over $\mathbb{F}$ and let $\langle\,\cdot\,,\cdot\,\rangle$ be a positive definite inner product on $V$. All eigenvalues of a self-adjoint $T \in \mathcal{L}(V)$ are real.

In the rest of this section we will consider only scalar fields $\mathbb{F}$ which contain the imaginary unit $i$.
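Proposition 6.2 can be observed numerically: symmetrizing a random complex matrix produces a self-adjoint operator, and its computed spectrum is real up to rounding. A sketch with made-up data:

```python
import numpy as np

rng = np.random.default_rng(8)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = A + A.conj().T            # self-adjoint: H = H*

eigenvalues = np.linalg.eigvals(H)
spectrum_is_real = bool(np.allclose(eigenvalues.imag, 0))
print(spectrum_is_real)
```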
Proposition 6.3. Let $V$ be a vector space over $\mathbb{F}$ and let $\langle\,\cdot\,,\cdot\,\rangle$ be a positive definite inner product on $V$. Let $T \in \mathcal{L}(V)$. Then $T = 0$ if and only if $\langle T v, v \rangle = 0$ for all $v \in V$.

Proposition 6.4. Let $V$ be a vector space over $\mathbb{F}$ and let $\langle\,\cdot\,,\cdot\,\rangle$ be a positive definite inner product on $V$. An operator $T \in \mathcal{L}(V)$ is self-adjoint if and only if $\langle T v, v \rangle \in \mathbb{R}$ for all $v \in V$.

Theorem 6.5. Let $V$ be a vector space over $\mathbb{F}$ and let $\langle\,\cdot\,,\cdot\,\rangle$ be a positive definite inner product on $V$. An operator $T \in \mathcal{L}(V)$ is normal if and only if $\|T v\| = \|T^* v\|$ for all $v \in V$.

Corollary 6.6. Let $V$ be a vector space over $\mathbb{F}$, let $\langle\,\cdot\,,\cdot\,\rangle$ be a positive definite inner product on $V$, and let $T \in \mathcal{L}(V)$ be normal. Then $\lambda \in \mathbb{C}$ is an eigenvalue of $T$ if and only if $\overline{\lambda}$ is an eigenvalue of $T^*$, and $\operatorname{nul}\bigl(T^* - \overline{\lambda} I\bigr) = \operatorname{nul}(T - \lambda I)$.

7 The Spectral Theorem
In the rest of the notes we will consider only the scalar field C.
Theorem 7.1 (Theorem 7.9). Let $V$ be a finite dimensional vector space over $\mathbb{C}$ and let $\langle\,\cdot\,,\cdot\,\rangle$ be a positive definite inner product on $V$. Let $T \in \mathcal{L}(V)$. Then $V$ has an orthonormal basis which consists of eigenvectors of $T$ if and only if $T$ is normal. In other words, $T$ is normal if and only if there exists an orthonormal basis $\mathcal{B}$ of $V$ such that $\mathcal{M}_{\mathcal{B}}^{\mathcal{B}}(T)$ is a diagonal matrix.

Proof. Let $n = \dim(V)$. Assume that $T$ is normal. By Corollary 3.7 there exists an orthonormal basis $\mathcal{B} = \{u_1, \ldots, u_n\}$ of $V$ such that $\mathcal{M}_{\mathcal{B}}^{\mathcal{B}}(T)$ is upper-triangular. That is,
$$\mathcal{M}_{\mathcal{B}}^{\mathcal{B}}(T) = \begin{bmatrix}
\langle T u_1, u_1 \rangle & \langle T u_2, u_1 \rangle & \cdots & \langle T u_n, u_1 \rangle \\
0 & \langle T u_2, u_2 \rangle & \cdots & \langle T u_n, u_2 \rangle \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \langle T u_n, u_n \rangle
\end{bmatrix}, \tag{14}$$
or, equivalently,
$$T u_k = \sum_{j=1}^{k} \langle T u_k, u_j \rangle u_j \quad \text{for all } k \in \{1, \ldots, n\}. \tag{15}$$
By Theorem 5.1(d) we have
$$\mathcal{M}_{\mathcal{B}}^{\mathcal{B}}(T^*) = \begin{bmatrix}
\overline{\langle T u_1, u_1 \rangle} & 0 & \cdots & 0 \\
\overline{\langle T u_2, u_1 \rangle} & \overline{\langle T u_2, u_2 \rangle} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
\overline{\langle T u_n, u_1 \rangle} & \overline{\langle T u_n, u_2 \rangle} & \cdots & \overline{\langle T u_n, u_n \rangle}
\end{bmatrix}.$$
Consequently,
$$T^* u_k = \sum_{j=k}^{n} \overline{\langle T u_j, u_k \rangle}\, u_j \quad \text{for all } k \in \{1, \ldots, n\}. \tag{16}$$
Since $T$ is normal, Theorem 6.5 implies
$$\|T u_k\|^2 = \|T^* u_k\|^2 \quad \text{for all } k \in \{1, \ldots, n\}.$$
Together with (15), (16) and the Pythagorean Theorem, the last identities become
$$\sum_{j=1}^{k} \bigl|\langle T u_k, u_j \rangle\bigr|^2 = \sum_{j=k}^{n} \bigl|\langle T u_j, u_k \rangle\bigr|^2 \quad \text{for all } k \in \{1, \ldots, n\}. \tag{17}$$
The equality in (17) corresponding to $k = 1$ reads
$$\bigl|\langle T u_1, u_1 \rangle\bigr|^2 = \bigl|\langle T u_1, u_1 \rangle\bigr|^2 + \sum_{j=2}^{n} \bigl|\langle T u_j, u_1 \rangle\bigr|^2,$$
which implies
$$\langle T u_j, u_1 \rangle = 0 \quad \text{for all } j \in \{2, \ldots, n\}. \tag{18}$$
In other words, we have proved that the off-diagonal entries in the first row of the upper triangular matrix $\mathcal{M}_{\mathcal{B}}^{\mathcal{B}}(T)$ in (14) are all zero.

Substituting the value $\langle T u_2, u_1 \rangle = 0$ (from (18)) in the equality in (17) corresponding to $k = 2$, we get
$$\bigl|\langle T u_2, u_2 \rangle\bigr|^2 = \bigl|\langle T u_2, u_2 \rangle\bigr|^2 + \sum_{j=3}^{n} \bigl|\langle T u_j, u_2 \rangle\bigr|^2,$$
which implies
$$\langle T u_j, u_2 \rangle = 0 \quad \text{for all } j \in \{3, \ldots, n\}. \tag{19}$$
In other words, we have proved that the off-diagonal entries in the second row of the upper triangular matrix $\mathcal{M}_{\mathcal{B}}^{\mathcal{B}}(T)$ in (14) are all zero.

Repeating this reasoning $n - 2$ more times proves that all the off-diagonal entries of the upper triangular matrix $\mathcal{M}_{\mathcal{B}}^{\mathcal{B}}(T)$ in (14) are zero. That is, $\mathcal{M}_{\mathcal{B}}^{\mathcal{B}}(T)$ is a diagonal matrix.

To prove the converse, assume that there exists an orthonormal basis $\mathcal{B} = \{u_1, \ldots, u_n\}$ of $V$ which consists of eigenvectors of $T$. That is, for some $\lambda_j \in \mathbb{C}$,
$$T u_j = \lambda_j u_j \quad \text{for all } j \in \{1, \ldots, n\}.$$
Then, for arbitrary $v \in V$ we have
$$T v = T \Biggl( \sum_{j=1}^{n} \langle v, u_j \rangle u_j \Biggr) = \sum_{j=1}^{n} \langle v, u_j \rangle T u_j = \sum_{j=1}^{n} \lambda_j \langle v, u_j \rangle u_j. \tag{20}$$
Therefore, for arbitrary $k \in \{1, \ldots, n\}$ we have
$$\langle T v, u_k \rangle = \lambda_k \langle v, u_k \rangle. \tag{21}$$
Now we calculate
$$T^* T v = \sum_{j=1}^{n} \langle T^* T v, u_j \rangle u_j = \sum_{j=1}^{n} \langle T v, T u_j \rangle u_j = \sum_{j=1}^{n} \overline{\lambda_j}\, \langle T v, u_j \rangle u_j = \sum_{j=1}^{n} \lambda_j \overline{\lambda_j}\, \langle v, u_j \rangle u_j.$$
Similarly,
$$T T^* v = T \Biggl( \sum_{j=1}^{n} \langle T^* v, u_j \rangle u_j \Biggr) = \sum_{j=1}^{n} \langle v, T u_j \rangle T u_j = \sum_{j=1}^{n} \langle v, \lambda_j u_j \rangle \lambda_j u_j = \sum_{j=1}^{n} \lambda_j \overline{\lambda_j}\, \langle v, u_j \rangle u_j.$$
Thus we proved $T^* T v = T T^* v$ for all $v \in V$; that is, $T$ is normal.
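The "only if" direction can also be seen numerically: build a normal (but not self-adjoint) matrix $N = Q D Q^H$ with $Q$ unitary, and verify that its computed eigenvectors form an orthonormal basis that diagonalizes $N$. This is a sketch; the eigenvalues below are chosen arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(9)
# a normal operator on C^4: unitary Q, arbitrary complex diagonal d
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
d = np.array([1 + 1j, 2 - 1j, -3 + 0.5j, 0.5 + 2j])  # distinct eigenvalues
N = Q @ np.diag(d) @ Q.conj().T

is_normal = bool(np.allclose(N @ N.conj().T, N.conj().T @ N))

evals, V = np.linalg.eig(N)
# eigenvectors of a normal operator with distinct eigenvalues are orthogonal
orthonormal_eigenbasis = bool(np.allclose(V.conj().T @ V, np.eye(4), atol=1e-8))
diagonalizes = bool(np.allclose(V.conj().T @ N @ V, np.diag(evals), atol=1e-8))
print(is_normal, orthonormal_eigenbasis, diagonalizes)
```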
A different proof of the "only if" part of the spectral theorem for normal operators follows. In this proof we use $\delta_{ij}$ to denote the Kronecker delta; that is, $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ otherwise.
Proof. Set $n = \dim V$. We first prove the "only if" part. Assume that $T$ is normal. Set
$$K = \Bigl\{\, k \in \{1, \ldots, n\} :\; \exists\, w_1, \ldots, w_k \in V \;\; \exists\, \lambda_1, \ldots, \lambda_k \in \mathbb{C} \text{ such that } \langle w_i, w_j \rangle = \delta_{ij} \text{ and } T w_j = \lambda_j w_j \text{ for all } i, j \in \{1, \ldots, k\} \,\Bigr\}.$$
Clearly $1 \in K$. Since $K$ is finite, $m = \max K$ exists. Clearly, $m \leq n$.

Next we will prove that if $k \in K$ and $k < n$, then $k + 1 \in K$. Set $W = \operatorname{span}\{w_1, \ldots, w_k\}$. Since $w_1, \ldots, w_k$ are eigenvectors of $T$ we have $T W \subseteq W$. By Lemma 5.3, $T^* W^{\perp} \subseteq W^{\perp}$. Thus, $T^*{\restriction}_{W^{\perp}} \in \mathcal{L}\bigl(W^{\perp}\bigr)$. Since $\dim W^{\perp} = n - k$ …