Part IB — Theorems with proof

Based on lectures by S. J. Wadsley Notes taken by Dexter Chua

Michaelmas 2015

These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures. They are nowhere near accurate representations of what was actually lectured, and in particular, all errors are almost surely mine.

Definition of a (over R or C), subspaces, the space spanned by a subset. , bases, dimension. Direct sums and complementary subspaces. [3] Linear maps, isomorphisms. Relation between and nullity. The space of linear maps from U to V , representation by matrices. Change of . Row rank and column rank. [4] Determinant and trace of a square matrix. Determinant of a product of two matrices and of the inverse matrix. Determinant of an endomorphism. The adjugate matrix. [3] Eigenvalues and eigenvectors. Diagonal and triangular forms. Characteristic and minimal polynomials. Cayley-Hamilton Theorem over C. Algebraic and geometric multiplicity of eigenvalues. Statement and illustration of Jordan normal form. [4] Dual of a finite-dimensional vector space, dual bases and maps. Matrix representation, rank and determinant of dual map. [2] Bilinear forms. Matrix representation, change of basis. Symmetric forms and their link with quadratic forms. Diagonalisation of quadratic forms. Law of inertia, classification by rank and signature. Complex Hermitian forms. [4] Inner product spaces, orthonormal sets, orthogonal projection, V = W ⊕ W ⊥. Gram- Schmidt orthogonalisation. Adjoints. Diagonalisation of Hermitian matrices. Orthogo- nality of eigenvectors and properties of eigenvalues. [4]

1 Contents IB Linear Algebra (Theorems with proof)

Contents

0 Introduction 3

1 Vector spaces 4 1.1 Definitions and examples ...... 4 1.2 Linear independence, bases and the Steinitz exchange lemma . .4 1.3 Direct sums ...... 8

2 Linear maps 9 2.1 Definitions and examples ...... 9 2.2 Linear maps and matrices ...... 10 2.3 The first isomorphism theorem and the rank-nullity theorem . . . 12 2.4 Change of basis ...... 14 2.5 Elementary matrix operations ...... 15

3 Duality 16 3.1 Dual space ...... 16 3.2 Dual maps ...... 17

4 Bilinear forms I 21

5 Determinants of matrices 22

6 Endomorphisms 28 6.1 Invariants ...... 28 6.2 The minimal polynomial ...... 30 6.2.1 Aside on polynomials ...... 30 6.2.2 Minimal polynomial ...... 30 6.3 The Cayley-Hamilton theorem ...... 33 6.4 Multiplicities of eigenvalues and Jordan normal form ...... 36

7 Bilinear forms II 39 7.1 Symmetric bilinear forms and quadratic forms ...... 39 7.2 Hermitian form ...... 43

8 Inner product spaces 45 8.1 Definitions and basic properties ...... 45 8.2 Gram-Schmidt orthogonalization ...... 46 8.3 Adjoints, orthogonal and unitary maps ...... 48 8.4 Spectral theory ...... 50

2 0 Introduction IB Linear Algebra (Theorems with proof)

0 Introduction

3 1 Vector spaces IB Linear Algebra (Theorems with proof)

1 Vector spaces 1.1 Definitions and examples Proposition. In any vector space V , 0v = 0 for all v ∈ V , and (−1)v = −v, where −v is the additive inverse of v. Proposition. Let U, W be subspaces of V . Then U + W and U ∩ W are subspaces.

Proof. Let ui + wi ∈ U + W , λ, µ ∈ F. Then

λ(u1 + w1) + µ(u2 + w2) = (λu1 + µu2) + (λw1 + µw2) ∈ U + W.

Similarly, if vi ∈ U ∩ W , then λv1 + µv2 ∈ U and λv1 + µv2 ∈ W . So λv1 + µv2 ∈ U ∩ W . Both U ∩ W and U + W contain 0, and are non-empty. So done.

1.2 Linear independence, bases and the Steinitz exchange lemma

Lemma. S ⊆ V is linearly dependent if and only if there are distinct s0, ··· , sn ∈ S and λ1, ··· , λn ∈ F such that n X λisi = s0. i=1

Proof. If S is linearly dependent, then there are some λ1, ··· , λn ∈ F not all P zero and s1, ··· , sn ∈ S such that λisi = 0. Wlog, let λ1 6= 0. Then

n X λi s = − s . 1 λ i i=2 1 Pn Conversely, if s0 = i=1 λisi, then

n X (−1)s0 + λisi = 0. i=1 So S is linearly dependent.

Proposition. If S = {e1, ··· , en} is a subset of V over F, then it is a basis if and only if every v ∈ V can be written uniquely as a finite linear combination of elements in S, i.e. as n X v = λiei. i=1 Proof. We can view this as a combination of two statements: it can be spanned in at least one way, and it can be spanned in at most one way. We will see that the first part corresponds to S spanning V , and the second part corresponds to S being linearly independent. In fact, S spanning V is defined exactly to mean that every item v ∈ V can be written as a finite linear combination in at least one way.

4 1 Vector spaces IB Linear Algebra (Theorems with proof)

Now suppose that S is linearly independent, and we have

n n X X v = λiei = µiei. i=1 i=1 Then we have n X 0 = v − v = (λi − µi)ei. i=1

Linear independence implies that λi − µi = 0 for all i. Hence λi = µi. So v can be expressed in a unique way. On the other hand, if S is not linearly independent, then we have

n X 0 = λiei i=1 where λi 6= 0 for some i. But we also know that

n X 0 = 0 · ei. i=1 So there are two ways to write 0 as a linear combination. So done.

Theorem (Steinitz exchange lemma). Let V be a vector space over F, and S = {e1, ··· , en} a finite linearly independent subset of V , and T a spanning subset of V . Then there is some T 0 ⊆ T of order n such that (T \ T 0) ∪ S still spans V . In particular, |T | ≥ n.

Corollary. Let {e1, ··· , en} be a linearly independent subset of V , and sup- pose {f1, ··· , fm} spans V . Then there is a re-ordering of the {fi} such that {e1, ··· , en, fn+1, ··· , fm} spans V . 0 Proof. Suppose that we have already found Tr ⊆ T of order 0 ≤ r < n such that

0 Tr = (T \ Tr) ∪ {e1, ··· , er} spans V . 0 (Note that the case r = 0 is trivial, since we can take Tr = ∅, and the case r = n is the theorem which we want to achieve.) Suppose we have these. Since Tr spans V , we can write

k X er+1 = λiti, λi ∈ F, ti ∈ Tr. i=1

We know that the ei are linearly independent, so not all ti’s are ei’s. So there is 0 some j such that tj ∈ (T \ Tr). We can write this as

1 X λi tj = er+1 + − ti. λj λj i6=j

0 0 We let Tr+1 = Tr ∪ {tj} of order r + 1, and

0 Tr+1 = (T \ Tr+1) ∪ {e1, ··· , er+1} = (Tr \{tj}} ∪ {er+1}

5 1 Vector spaces IB Linear Algebra (Theorems with proof)

Since tj is in the span of Tr ∪ {er+1}, we have tj ∈ hTr+1i. So

V ⊇ hTr+1i ⊇ hTri = V.

So hTr+1i = V . Hence we can inductively find Tn.

Corollary. Suppose V is a vector space over F with a basis of order n. Then (i) Every basis of V has order n.

(ii) Any linearly independent set of order n is a basis. (iii) Every spanning set of order n is a basis. (iv) Every finite spanning set contains a basis.

(v) Every linearly independent subset of V can be extended to basis.

Proof. Let S = {e1, ··· , en} be the basis for V . (i) Suppose T is another basis. Since S is independent and T is spanning, |T | ≥ |S|. The other direction is less trivial, since T might be infinite, and Steinitz does not immediately apply. Instead, we argue as follows: since T is linearly independent, every finite subset of T is independent. Also, S is spanning. So every finite subset of T has order at most |S|. So |T | ≤ |S|. So |T | = |S|. (ii) Suppose now that T is a linearly independent subset of order n, but hT i= 6 V . Then there is some v ∈ V \ hT i. We now show that T ∪ {v} is independent. Indeed, if

m X λ0v + λiti = 0 i=1

with λi ∈ F, t1, ··· , tm ∈ T distinct, then

m X λ0v = (−λi)ti. i=1

Then λ0v ∈ hT i. So λ0 = 0. As T is linearly independent, we have λ0 = ··· = λm = 0. So T ∪ {v} is a linearly independent subset of size > n. This is a contradiction since S is a spanning set of size n. (iii) Let T be a spanning set of order n. If T were linearly dependent, then there is some t0, ··· , tm ∈ T distinct and λ1, ··· , λm ∈ F such that X t0 = λiti.

So t0 ∈ hT \{t0}i, i.e. hT \{t0}i = V . So T \{t0} is a spanning set of order n − 1, which is a contradiction.

6 1 Vector spaces IB Linear Algebra (Theorems with proof)

(iv) Suppose T is any finite spanning set. Let T 0 ⊆ T be a spanning set of least possible size. This exists because T is finite. If |T 0| has size n, then done by (iii). Otherwise by the Steinitz exchange lemma, it has size |T 0| > n. So T 0 must be linearly dependent because S is spanning. So there is some P t0, ··· , tm ∈ T distinct and λ1, ··· , λm ∈ F such that t0 = λiti. Then 0 T \{t0} is a smaller spanning set. Contradiction. (v) Suppose T is a linearly independent set. Since S spans, there is some S0 ⊆ S of order |T | such that (S \ S0) ∪ T spans V by the Steinitz exchange lemma. So by (ii), (S \ S0) ∪ T is a basis of V containing T .

Lemma. If V is a finite dimensional vector space over F, U ⊆ V is a proper subspace, then U is finite dimensional and dim U < dim V . Proof. Every linearly independent subset of V has size at most dim V . So let S ⊆ U be a linearly independent subset of largest size. We want to show that S spans U and |S| < dim V . If v ∈ V \ hSi, then S ∪ {v} is linearly independent. So v 6∈ U by maximality of S. This means that hSi = U. Since U 6= V , there is some v ∈ V \ U = V \ hSi. So S ∪ {v} is a linearly independent subset of order |S| + 1. So |S| + 1 ≤ dim V . In particular, dim U = |S| < dim V .

Proposition. If U, W are subspaces of a finite dimensional vector space V , then

dim(U + W ) = dim U + dim W − dim(U ∩ W ).

Proof. Let R = {v1, ··· , vr} be a basis for U ∩W . This is a linearly independent subset of U. So we can extend it to be a basis of U by

S = {v1, ··· , vr, ur+1, ··· , us}.

Similarly, for W , we can obtain a basis

T = {v1, ··· , vr, wr+1, ··· , wt}.

We want to show that dim(U + W ) = s + t − r. It is sufficient to prove that S ∪ T is a basis for U + W . We first show spanning. Suppose u + w ∈ U + W , u ∈ U, w ∈ W . Then u ∈ hSi and w ∈ hT i. So u + w ∈ hS ∪ T i. So U + W = hS ∪ T i. To show linear independence, suppose we have a linear relation

r s t X X X λivi + µjuj + νkwk = 0. i=1 j=r+1 k=r+1

So X X X λivi + µjuj = − νkwk. Since the left hand side is something in U, and the right hand side is something in W , they both lie in U ∩ W . Since S is a basis of U, there is only one way of writing the left hand vector as a sum of vi and uj. However, since R is a basis of U ∩ W , we can write the

7 1 Vector spaces IB Linear Algebra (Theorems with proof)

left hand vector just as a sum of vi’s. So we must have µj = 0 for all j. Then we have X X λivi + νkwk = 0.

Finally, since T is linearly independent, λi = νk = 0 for all i, k. So S ∪ T is linearly independent.

Proposition. If V is a finite dimensional vector space over F and U ∪ V is a subspace, then dim V = dim U + dim V/U.

Proof. Let {u1, ··· , um} be a basis for U and extend this to a basis {u1, ··· , um, vm+1, ··· , vn} for V . We want to show that {vm+1 + U, ··· , vn + U} is a basis for V/U. It is easy to see that this spans V/U. If v + U ∈ V/U, then we can write X X v = λiui + µivi.

Then X X X v + U = µi(vi + U) + λi(ui + U) = µi(vi + U). So done. To show that they are linearly independent, suppose that X λi(vi + U) = 0 + U = U.

Then this requires X λivi ∈ U.

Then we can write this as a linear combination of the ui’s. So X X λivi = µjuj

for some µj. Since {u1, ··· , um, vn+1, ··· , vn} is a basis for V , we must have λi = µj = 0 for all i, j. So {vi + U} is linearly independent.

1.3 Direct sums

8 2 Linear maps IB Linear Algebra (Theorems with proof)

2 Linear maps 2.1 Definitions and examples

Lemma. If U and V are vector spaces over F and α : U → V , then α is an isomorphism iff α is a bijective linear map. Proof. If α is an isomorphism, then it is clearly bijective since it has an inverse function. Suppose α is a linear bijection. Then as a function, it has an inverse β : V → U. We want to show that this is linear. Let v1, v2 ∈ V , λ, µ ∈ F. We have

αβ(λv1 + µv2) = λv1 + µv2 = λαβ(v1) + µαβ(v2) = α(λβ(v1) + µβ(v2)). Since α is injective, we have

β(λv1 + µv2) = λβ(v1) + µβ(v2). So β is linear.

Proposition. Let α : U → V be an F-linear map. Then (i) If α is injective and S ⊆ U is linearly independent, then α(S) is linearly independent in V . (ii) If α is surjective and S ⊆ U spans U, then α(S) spans V . (iii) If α is an isomorphism and S ⊆ U is a basis, then α(S) is a basis for V . Proof. (i) We prove the contrapositive. Suppose that α is injective and α(S) is linearly dependent. So there are s0, ··· , sn ∈ S distinct and λ1, ··· , λn ∈ F not all zero such that n n ! X X α(s0) = λiα(si) = α λisi . i=1 i=1 Since α is injective, we must have n X s0 = λisi. i=1

This is a non-trivial relation of the si in U. So S is linearly dependent. (ii) Suppose α is surjective and S spans U. Pick v ∈ V . Then there is some u ∈ U such that α(u) = v. Since S spans U, there is some s1, ··· , sn ∈ S and λ1, ··· , λn ∈ F such that n X u = λisi. I=1 Then n X v = α(u) = λiα(si). i=1 So α(S) spans V .

9 2 Linear maps IB Linear Algebra (Theorems with proof)

(iii) Follows immediately from (i) and (ii).

Corollary. If U and V are finite-dimensional vector spaces over F and α : U → V is an isomorphism, then dim U = dim V . Proof. Let S be a basis for U. Then α(S) is a basis for V . Since α is injective, |S| = |α(S)|. So done.

Proposition. Suppose V is a F-vector space of dimension n < ∞. Then writing n e1, ··· , en for the standard basis of F , there is a bijection n Φ: {isomorphisms F → V } → {(ordered) basis(v1, ··· , vn) for V }, defined by α 7→ (α(e1), ··· , α(en)). Proof. We first make sure this is indeed a function — if α is an isomorphism, then from our previous proposition, we know that it sends a basis to a basis. So (α(e1), ··· , α(en)) is indeed a basis for V . We now have to prove surjectivity and injectivity. Suppose α, β : Fn → V are isomorphism such that Φ(α) = Φ(β). In other words, α(ei) = β(ei) for all i. We want to show that α = β. We have     x1 n ! x1  .  X X X  .  α  .  = α xiei = xiα(ei) = xiβ(ei) = β  .  . i=1 xn xn Hence α = β. Next, suppose that (v1, ··· , vn) is an ordered basis for V . Then define   x1  .  X α  .  = xivi. xn It is easy to check that this is well-defined and linear. We also know that α is P P injective since (v1, ··· , vn) is linearly independent. So if xivi = yivi, then xi = yi. Also, α is surjective since (v1, ··· , vn) spans V . So α is an isomorphism, and by construction Φ(α) = (v1, ··· , vn).

2.2 Linear maps and matrices

Proposition. Suppose U, V are vector spaces over F and S = {e1, ··· , en} is a basis for U. Then every function f : S → V extends uniquely to a linear map U → V . Proof. For uniqueness, first suppose α, β : U → V are linear and extend f : S → V . We have sort-of proved this already just now. Pn If u ∈ U, we can write u = i=1 uiei with ui ∈ F since S spans. Then X  X X α(u) = α uiei = uiα(ei) = uif(ei).

Similarly, X β(u) = uif(ei).

10 2 Linear maps IB Linear Algebra (Theorems with proof)

So α(u) = β(u) for every u. So α = β. P For existence, if u ∈ U, we can write u = uiei in a unique way. So defining X α(u) = uif(ei)

is unambiguous. To show linearity, let λ, µ ∈ F, u, v ∈ U. Then X  α(λu + µv) = α (λui + µvi)ei X = (λui + µvi)f(ei) X  X  = λ uif(ei) + µ vif(ei) = λα(u) + µα(v).

Moreover, α does extend f.

Corollary. If U and V are finite-dimensional vector spaces over F with bases (e1, ··· , em) and (f1, ··· , fn) respectively, then there is a bijection

Matn,m(F) → L(U, V ), P sending A to the unique linear map α such that α(ei) = ajifj.

Proof. If α is a linear map U → V , then for each 1 ≤ i ≤ m, we can write α(ei) uniquely as n X α(ei) = ajifj j=1

for some aji ∈ F. This gives a matrix A = (aij). The previous proposition tells us that every matrix A arises in this way, and α is determined by A.

Proposition. Suppose U, V, W are finite-dimensional vector spaces over F with bases R = (u1, ··· , ur), S = (v1, .., vs) and T = (w1, ··· , wt) respectively. If α : U → V and β : V → W are linear maps represented by A and B respectively (with respect to R, S and T ), then βα is linear and represented by BA with respect to R and T .

Proof. Verifying βα is linear is straightforward. Next we write βα(ui) as a linear combination of w1, ··· , wt: ! X βα(ui) = β Akivk k X = Akiβ(vk) k X X = Aki Bjkwj k j ! X X = BjkAki wj j k X = (BA)jiwj j

11 2 Linear maps IB Linear Algebra (Theorems with proof)

2.3 The first isomorphism theorem and the rank-nullity theorem Theorem (First isomorphism theorem). Let α : U → V be a linear map. Then ker α and im α are subspaces of U and V respectively. Moreover, α induces an isomorphism

α¯ : U/ ker α → im α (u + ker α) 7→ α(u)

Proof. We know that 0 ∈ ker α and 0 ∈ im α. Suppose u1, u2 ∈ ker α and λ1, λ2 ∈ F. Then

α(λ1u1 + λ2u2) = λ1α(u1) + λ2α(u2) = 0.

So λ1u1 + λ2u2 ∈ ker α. So ker α is a subspace. Similarly, if α(u1), α(u2) ∈ im α, then λα(u1)+λ2α(u2) = α(λ1u1 +λ2u2) ∈ im α. So im α is a subspace. Now by the first isomorphism theorem of groups, α¯ is a well-defined isomor- phism of groups. So it remains to show that α¯ is a linear map. Indeed, we have α¯(λ(u + ker α)) = α(λu) = λα(u) = λ(¯α(u + ker α)). Soα ¯ is a linear map. Corollary (Rank-nullity theorem). If α : U → V is a linear map and U is finite-dimensional, then r(α) + n(α) = dim U.

Proof. By the first isomorphism theorem, we know that U/ ker α =∼ im α. So we have dim im α = dim(U/ ker α) = dim U − dim ker α. So the result follows. Proposition. If α : U → V is a linear map between finite-dimensional vector spaces over F, then there are bases (e1, ··· , em) for U and (f1, ··· , fn) for V such that α is represented by the matrix

I 0 r , 0 0 where r = r(α) and Ir is the r × r identity matrix. In particular, r(α) + n(α) = dim U.

Proof. Let ek+1, ··· , em be a basis for the kernel of α. Then we can extend this to a basis of the (e1, ··· , em). Let fi = α(ei) for 1 ≤ i ≤ k. We now show that (f1, ··· , fk) is a basis for im α (and thus k = r). We first show that it spans. Suppose v ∈ im α. Then we have m ! X v = α λiei i=1

12 2 Linear maps IB Linear Algebra (Theorems with proof)

for some λi ∈ F. By linearity, we can write this as

m k X X v = λiα(ei) = λifi + 0. i=1 i=1

So v ∈ hf1, ··· , fki. To show linear dependence, suppose that

k X µifi = 0. i=1 So we have k ! X α µiei = 0. i=1 Pk So i=1 µiei ∈ ker α. Since (ek+1, ··· , em) is a basis for ker α, we can write

k m X X µiei = µiei i=1 i=k+1

for some µi (i = k + 1, ··· , m). Since (e1, ··· , em) is a basis, we must have µi = 0 for all i. So they are linearly independent. Now we extend (f1, ··· , fr) to a basis for V , and ( fi 1 ≤ i ≤ k α(ei) = . 0 k + 1 ≤ i ≤ m

Corollary. Suppose α : U → V is a linear map between vector spaces over F both of dimension n < ∞. Then the following are equivalent

(i) α is injective; (ii) α is surjective; (iii) α is an isomorphism. Proof. It is clear that, (iii) implies (i) and (ii), and (i) and (ii) together implies (iii). So it suffices to show that (i) and (ii) are equivalent. Note that α is injective iff n(α) = 0, and α is surjective iff r(α) = dim V = n. By the rank-nullity theorem, n(α) + r(α) = n. So the result follows immediately.

Lemma. Let A ∈ Mn,n(F) = Mn(F) be a square matrix. The following are equivalent

(i) There exists B ∈ Mn(F) such that BA = In.

(ii) There exists C ∈ Mn(F) such that AC = In. If these hold, then B = C. We call A invertible or non-singular, and write A−1 = B = C.

13 2 Linear maps IB Linear Algebra (Theorems with proof)

Proof. Let α, β, γ, ι : Fn → Fn be the linear maps represented by matrices A, B, C, In respectively with respect to the standard basis. We note that (i) is equivalent to saying that there exists β such that βα = ι. This is true iff α is injective, which is true iff α is an isomorphism, which is true iff α has an inverse α−1. Similarly, (ii) is equivalent to saying that there exists γ such that αγ = ι. This is true iff α is injective, which is true iff α is isomorphism, which is true iff α has an inverse α−1. So these are the same things, and we have β = α−1 = γ.

2.4 Change of basis

Theorem. Suppose that (e1, ··· , em) and (u1, ··· , um) are basis for a finite- dimensional vector space U over F, and (f1, ··· , fn) and (v1, ··· , vn) are basis of a finite-dimensional vector space V over F. Let α : U → V be a linear map represented by a matrix A with respect to (ei) and (fi) and by B with respect to (ui) and (vi). Then

B = Q−1AP, where P and Q are given by

m n X X ui = Pkiek, vi = Qkifk. k=1 k=1 Proof. On the one hand, we have

n X X X X α(ui) = Bjivj = BjiQ`jf` = [QB]`if`. j=1 j ` `

On the other hand, we can write

m ! m X X X X α(ui) = α Pkiek = Pki A`kf` = [AP ]`if`. k=1 k=1 ` `

Since the f` are linearly independent, we conclude that

QB = AP.

Since Q is invertible, we get B = Q−1AP .

Corollary. If A ∈ Matn,m(F), then there exists invertible matrices P ∈ GLm(F),Q ∈ GLn(F) so that I 0 Q−1AP = r 0 0

for some 0 ≤ r ≤ min(m, n).

T Theorem. If A ∈ Matn,m(F), then r(A) = r(A ), i.e. the row rank is equivalent to the column rank.

14 2 Linear maps IB Linear Algebra (Theorems with proof)

Proof. We know that there are some invertible P,Q such that

I 0 Q−1AP = r , 0 0 where r = r(A). We can transpose this whole equation to obtain

I 0 (Q−1AP )T = P T AT (QT )−1 = r 0 0

So r(AT ) = r.

2.5 Elementary matrix operations

Proposition. If A ∈ Matn,m(F), then there exists invertible matrices P ∈ GLm(F),Q ∈ GLn(F) so that I 0 Q−1AP = r 0 0

for some 0 ≤ r ≤ min(m, n).

m m n n Proof. We claim that there are elementary matrices E1 , ··· ,Ea and F1 , ··· ,Fb (these E are not necessarily the shears, but any elementary matrix) such that

I 0 Em ··· EmAF n ··· F n = r 1 a 1 b 0 0

m n This suffices since the Ei ∈ GLM (F) and Fj ∈ GLn(F). Moreover, to prove the claim, it suffices to find a sequence of elementary row and column operations reducing A to this form. If A = 0, then done. If not, there is some i, j such that Aij 6= 0. By swapping row 1 and row i; and then column 1 and column j, we can assume A11 6= 0. By 1 rescaling row 1 by , we can further assume A11 = 1. A11 Now we can add −A1j times column 1 to column j for each j 6= 1, and then add −Ai1 times row 1 to row i 6= 1. Then we now have

1 0 ··· 0 0  A =   .  .B  0

Now B is smaller than A. So by induction on the size of A, we can reduce B to a matrix of the required form, so done.

15 3 Duality IB Linear Algebra (Theorems with proof)

3 Duality 3.1 Dual space

Lemma. If V is a finite-dimensional vector space over f with basis (e1, ··· , en), ∗ then there is a basis (ε1, ··· , εn) for V (called the dual basis to (e1, ··· , en)) such that εi(ej) = δij. Proof. Since linear maps are characterized by their values on a basis, there exists ∗ unique choices for ε1, ··· , εn ∈ V . Now we show that (ε1, ··· , εn) is a basis. Suppose θ ∈ V ∗. We show that we can write it uniquely as a combination of Pn Pn ε1, ··· , εn. We have θ = i=1 λiεi if and only if θ(ej) = i=1 λiεi(ej) (for all j) if and only if λj = θ(ej). So we have uniqueness and existence. Corollary. If V is finite dimensional, then dim V = dim V ∗.

Proposition. Let V be a finite-dimensional vector space over F with bases (e1, ··· , en) and (f1, ··· , fn), and that P is the change of basis matrix so that

n X fi = Pkiek. k=1

Let (ε1, ··· , εn) and (η1, ··· , ηn) be the corresponding dual bases so that

εi(ej) = δij = ηi(fj).

−1 T Then the change of basis matrix from (ε1, ··· , εn) to (η1, ··· , ηn) is (P ) , i.e.

n X T εi = P`i η`. `=1

Proof. For convenience, write Q = P −1 so that

n X ej = Qkjfk. k=1 So we can compute

n ! n ! n ! X X X Pi`η` (ej) = Pi`η` Qkjfk `=1 `=1 k=1 X = Pi`δ`kQkj k,` X = Pi`Q`j k,`

= [PQ]ij

= δij.

Pn T So εi = `=1 P`i η`.

16 3 Duality IB Linear Algebra (Theorems with proof)

Proposition. Let V be a vector space over F and U a subspace. Then dim U + dim U 0 = dim V.

Proof. Let (e1, ··· , ek) be a basis for U and extend to (e1, ··· , en) a basis for ∗ V . Consider the dual basis for V , say (ε1, ··· , εn). Then we will show that

0 U = hεk+1, ··· , εni.

0 So dim U = n − k as required. This is easy to prove — if j > k, then εj(ei) = 0 0 0 for all i ≤ k. So εk+1, ··· , εn ∈ U . On the other hand, suppose θ ∈ U . Then we can write n X θ = λjεj. j=1

But then 0 = θ(ei) = λi for i ≤ k. So done.

∗ ∗ Proof. Consider the restriction map V → U , given by θ 7→ θ|U . This is obviously linear. Since every linear map U → F can be extended to V → F, this is a surjection. Moreover, the kernel is U 0. So by rank-nullity theorem,

dim V ∗ = dim U 0 + dim U ∗.

Since dim V ∗ = dim V and dim U ∗ = dim U, we’re done.

Proof. We can show that U 0 ' (V/U)∗, and then deduce the result. Details are left as an exercise.

3.2 Dual maps Proposition. Let α ∈ L(V,W ) be a linear map. Then α∗ ∈ L(W ∗,V ∗) is a linear map.

∗ Proof. Let λ, µ ∈ F and θ1, θ2 ∈ W . We want to show

∗ ∗ ∗ α (λθ1 + µθ2) = λα (θ1) + µα (θ2).

To show this, we show that for every v ∈ V , the left and right give the same result. We have

∗ α (λθ1 + µθ2)(v) = (λθ1 + µθ2)(αv)

= λθ1(α(v)) + µθ2(α(v)) ∗ ∗ = (λα (θ1) + µα (θ2))(v).

So α∗ ∈ L(W ∗,V ∗).

Proposition. Let V,W be finite-dimensional vector spaces over F and α : V → W be a linear map. Let (e1, ··· , en) be a basis for V and (f1, ··· , fm) be a basis for W ;(ε1, ··· , εn) and (η1, ··· , ηm) the corresponding dual bases. Suppose α is represented by A with respect to (ei) and (fi) for V and W . Then α∗ is represented by AT with respect to the corresponding dual bases.

17 3 Duality IB Linear Algebra (Theorems with proof)

Proof. We are given that m X α(ei) = Akifk. k=1 ∗ We must compute α (ηi). To do so, we evaluate it at ej. We have

m ! m ∗ X X α (ηi)(ej) = ηi(α(ej)) = ηi Akjfk = Akjδik = Aij. k=1 k=1 We can also write this as

n ∗ X α (ηi)(ej) = Aikεk(ej). k=1 Since this is true for all j, we have

n n ∗ X X T α (ηi) Aikεk = Akiεk. k=1 k=1 So done.

Lemma. Let α ∈ L(V,W ) with V,W finite dimensional vector spaces over F. Then

(i) ker α∗ = (im α)0. (ii) r(α) = r(α∗) (which is another proof that row rank is equal to column rank). (iii) im α∗ = (ker α)0.

Proof. (i) If θ ∈ W ∗, then

θ ∈ ker α∗ ⇔ α∗(θ) = 0 ⇔ (∀v ∈ V ) θα(v) = 0 ⇔ (∀w ∈ im α) θ(w) = 0 ⇔ θ ∈ (im α)0.

(ii) As im α ≤ W , we’ve seen that

dim im α + dim(im α)0 = dim W.

Using (i), we see n(α∗) = dim(im α)0. So r(α) + n(α∗) = dim W = dim W ∗. By the rank-nullity theorem, we have r(α) = r(α∗).

18 3 Duality IB Linear Algebra (Theorems with proof)

(iii) The proof in (i) doesn’t quite work here. We can only show that one includes the other. To draw the conclusion, we will show that the two spaces have the dimensions, and hence must be equal. Let θ ∈ im α∗. Then θ = φα for some φ ∈ W ∗. If v ∈ ker α, then

θ(v) = φ(α(v)) = φ(0) = 0.

So im α∗ ⊆ (ker α)0. But we know dim(ker α)0 + dim ker α = dim V, So we have

dim(ker α)0 = dim V − n(α) = r(α) = r(α∗) = dim im α∗.

Hence we must have im α∗ = (ker α)0.

Lemma. Let V be a vector space over F. Then there is a linear map ev : V → (V ∗)∗ given by ev(v)(θ) = θ(v). We call this the evaluation map.

Proof. We first show that ev(v) ∈ V ∗∗ for all v ∈ V , i.e. ev(v) is linear for any ∗ v. For any λ, µ ∈ F, θ1, θ2 ∈ V , then for v ∈ V , we have

ev(v)(λθ1 + µθ2) = (λθ1 + µθ2)(v)

= λθ1(v) + µθ2(v)

= λ ev(v)(θ1) + µ ev(v)(θ2).

So done. Now we show that ev itself is linear. Let λ, µ ∈ F, v1, v2 ∈ V . We want to show ev(λv1 + µv2) = λ ev(v1) + µ ev(v2). To show these are equal, pick θ ∈ V ∗. Then

ev(λv1 + µv2)(θ) = θ(λv1 + µv2)

= λθ(v1) + µθ(v2)

= λ ev(v1)(θ) + µ ev(v2)(θ)

= (λ ev(v1) + µ ev(v2))(θ).

So done. Lemma. If V is finite-dimensional, then ev : V → V ∗∗ is an isomorphism.

Proof. We first show it is injective. Suppose ev(v) = 0 for some v ∈ V . Then θ(v) = ev(v)(θ) = 0 for all θ ∈ V ∗. So dimhvi0 = dim V ∗ = dim V . So dimhvi = 0. So v = 0. So ev is injective. Since V and V ∗∗ have the same dimension, this is also surjective. So done.

Lemma. Let V,W be finite-dimensional vector spaces over F after identifying (V and V ∗∗) and (W and W ∗∗) by the evaluation map. Then we have

19 3 Duality IB Linear Algebra (Theorems with proof)

(i) If U ≤ V , then U 00 = U. (ii) If α ∈ L(V,W ), then α∗∗ = α.

Proof. (i) Let u ∈ U. Then u(θ) = θ(u) = 0 for all θ ∈ U 0. So u annihilates everything in U 0. So u ∈ U 00. So U ⊆ U 00. We also know that

dim U = dim V − dim U 0 = dim V − (dim V − dim U 00) = dim U 00.

So we must have U = U 00. (ii) The proof of this is basically — the transpose of the transpose is the original matrix. The only work we have to do is to show that the dual of the dual basis is the original basis.

Let (e1, ··· , en) be a basis for V and (f1, ··· , fm) be a basis for W , and let (ε1, ··· , εn) and (η1, ··· , ηn) be the corresponding dual basis. We know that ei(εj) = δij = εj(ei), fi(ηj) = δij = ηj(fi).

So (e1, ··· , en) is dual to (ε1, ··· , εn), and similarly for f and η. If α is represented by A, then α∗ is represented by AT . So α∗∗ is represented by (AT )T = A. So done.

Proposition. Let V be a finite-dimensional vector space F and U1, U2 are subspaces of V . Then we have

0 0 0 (i)( U1 + U2) = U1 ∩ U2 0 0 0 (ii)( U1 ∩ U2) = U1 + U2 Proof. (i) Suppose θ ∈ V ∗. Then

0 θ ∈ (U1 + U2) ⇔ θ(u1 + u2) = 0 for all ui ∈ Ui

⇔ θ(u) = 0 for all u ∈ U1 ∪ U2 0 0 ⇔ θ ∈ U1 ∩ U2 .

(ii) We have

0 0 0 0 0 0 0 0 00 0 0 (U1 ∩ U2) = ((U1 ) ∩ (U2 ) ) = (U1 + U2 ) = U1 + U2 .

So done.

20 4 Bilinear forms I IB Linear Algebra (Theorems with proof)

4 Bilinear forms I

Proposition. Suppose (e1, ··· , en) and (v1, ··· , vn) are basis for V such that X vi = Pkiek for all i = 1, ··· , n;

and (f1, ··· , fm) and (w1, ··· , wm) are bases for W such that X wi = Q`jf` for all j = 1, ··· , m.

Let ψ : V × W → F be a bilinear form represented by A with respect to (e1, ··· , en) and (f1, ··· , fm), and by B with respect to the bases (v1, ··· , vn) and (w1, ··· , wm). Then B = P T AQ. Proof. We have

Bij = φ(vi, wj) X X  = φ Pkiek, Q`jf` X = PkiQ`jφ(ek, f`)

X T = PikAk`Q`j k,` T = (P AQ)ij. ∗ Lemma. Let (ε1, ··· , εn) be a basis for V dual to (e1, ··· , en) of V ;(η1, ··· , ηn) ∗ be a basis for W dual to (f1, ··· , fn) of W . If A represents ψ with respect to (e1, ··· , en) and (f1, ··· , fm), then A also T represents ψR with respect to (f1, ··· , fm) and (ε1, ··· , εn); and A represents ψL with respect to (e1, ··· , en) and (η1, ··· , ηm). Proof. We just have to compute X  ψL(ei)(fj) = Aij = Ai`η` (fj). So we get X T ψL(ei) = A`iη`. T So A represents ψL. We also have ψR(fj)(ei) = Aij. So X ψR(fj) = Akjεk.

Lemma. Let V and W be finite-dimensional vector spaces over F with bases (e1, ··· , en) and (f1, ··· , fm) be their basis respectively. Let ψ : V × W → F be a bilinear form represented by A with respect to these bases. Then φ is non-degenerate if and only if A is (square and) invertible. In particular, V and W have the same dimension. T Proof. Since ψR and ψL are represented by A and A (in some order), they both have trivial kernel if and only if n(A) = n(AT ) = 0. So we need r(A) = dim V and r(AT ) = dim W . So we need dim V = dim W and A have full rank, i.e. the corresponding linear map is bijective. So done.

21 5 Determinants of matrices IB Linear Algebra (Theorems with proof)

5 Determinants of matrices

Lemma. det A = det AT . Proof.

n T X Y det A = ε(σ) Aσ(i)i

σ∈Sn i=1 n X Y = ε(σ) Ajσ−1(j)

σ∈Sn j=1 n X −1 Y = ε(τ ) Ajτ(j)

τ∈Sn j=1

Since ε(τ) = ε(τ −1), we get

n X Y = ε(τ) Ajτ(j)

τ∈Sn j=1 = det A.

Lemma. If A is an upper triangular matrix, i.e.   a11 a12 ··· a1n  0 a22 ··· a2n  A =    . . .. .   . . . .  0 0 ··· ann

Then n Y det A = aii. i=1 Proof. We have n X Y det A = ε(σ) Aiσ(i)

σ∈Sn i=1

But Aiσ(i) = 0 whenever i > σ(i). So

n Y Aiσ(i) = 0 i=1

if there is some i ∈ {1, ··· , n} such that i > σ(i). However, the only permutation in which i ≤ σ(i) for all i is the identity. So the only thing that contributes in the sum is σ = id. So

n Y det A = Aii. i=1 Lemma. det A is a volume form.

22 5 Determinants of matrices IB Linear Algebra (Theorems with proof)

Proof. To see that det is multilinear, it is sufficient to show that each

n Y Aiσ(i) i=1

is multilinear for all σ ∈ Sn, since linear combinations of multilinear forms are multilinear. But each such product is contains precisely one entry from each column, and so is multilinear. To show it is alternating, suppose now there are some k, ` distinct such that A(k) = A(`). We let τ be the transposition (k `). By Lagrange’s theorem, we can write Sn = An q τAn, where An = ker ε and q is the disjoint union. We also know that

n n X Y X Y Aiσ(i) = Ai,τσ(i),

σ∈An i=1 σ∈An i=1

since if σ(i) is not k or l, then τ does nothing; if σ(i) is k or l, then τ just swaps them around, but A(k) = A(l). So we get

n n X Y X Y Aiσ(i) = Aiσ0(i), 0 σ∈An i=1 σ ∈τAn i=1 But we know that det A = LHS − RHS = 0. So done.

Lemma. Let d be a volume form on Fn. Then swapping two entries changes the sign, i.e.

d(v1, ··· , vi, ··· , vj, ··· , vn) = −d(v1, ··· , vj, ··· , vi, ··· , vn).

Proof. By linearity, we have

0 = d(v1, ··· , vi + vj, ··· , vi + vj, ··· , vn)

= d(v1, ··· , vi, ··· , vi, ··· , vn)

+ d(v1, ··· , vi, ··· , vj, ··· , vn)

+ d(v1, ··· , vj, ··· , vi, ··· , vn)

+ d(v1, ··· , vj, ··· , vj, ··· , vn)

= d(v1, ··· , vi, ··· , vj, ··· , vn)

+ d(v1, ··· , vj, ··· , vi, ··· , vn).

So done.

Corollary. If σ ∈ Sn, then

d(vσ(1), ··· , vσ(n)) = ε(σ)d(v1, ··· , vn)

n for any vi ∈ F .

23 5 Determinants of matrices IB Linear Algebra (Theorems with proof)

Theorem. Let d be any volume form on Fn, and let A = (A(1) ··· A(n)) ∈ Matn(F). Then (1) (n) d(A , ··· ,A ) = (det A)d(e1, ··· , en), where {e1, ··· , en} is the standard basis. Proof. We can compute

n ! (1) (n) X (2) (n) d(A , ··· ,A ) = d Ai1ei,A , ··· ,A i=1 n X (2) (n) = Ai1d(ei,A , ··· ,A ) i=1 n X (3) (n) = Ai1Aj2d(ei, ej,A , ··· ,A ) i,j=1 n X Y = d(ei1 , ··· , ein ) Aij j.

i1,··· ,in j=1

We know that lots of these are zero, since if ik = ij for some k, j, then the term is zero. So we are just summing over distinct tuples, i.e. when there is some σ such that ij = σ(j). So we get

n (1) (n) X Y d(A , ··· ,A ) = d(eσ(1), ··· , eσ(n)) Aσ(j)j.

σ∈Sn j=1 However, by our corollary up there, this is just

n (1) (n) X Y d(A , ··· ,A ) = ε(σ)d(e1, ··· , en) Aσ(j)j = (det A)d(e1, ··· , en).

σ∈Sn j=1 So done.

Theorem. Let A, B ∈ Matn(F). Then det(AB) = det(A) det(B). Proof. Let d be a non-zero volume form on Fn (e.g. the “determinant”). Then we can compute

d(ABe1, ··· , ABen) = (det AB)d(e1, ··· , en), but we also have

d(ABe1, ··· , ABen) = (det A)d(Be1, ··· ,Ben) = (det A)(det B)d(e1, ··· , en). Since d is non-zero, we must have det AB = det A det B.

Corollary. If A ∈ Matn(F) is invertible, then det A 6= 0. In fact, when A is invertible, then det(A−1) = (det A)−1. Proof. We have 1 = det I = det(AA−1) = det A det A−1. So done.

24 5 Determinants of matrices IB Linear Algebra (Theorems with proof)

Theorem. Let A ∈ Matn(F). Then the following are equivalent: (i) A is invertible. (ii) det A 6= 0. (iii) r(A) = n. Proof. We have proved that (i) ⇒ (ii) above, and the rank-nullity theorem implies (iii) ⇒ (i). We will prove (ii) ⇒ (iii). In fact we will show the contrapositive. Suppose r(A) < n. By rank-nullity theorem, n(A) > 0. So there is some   λ1  .  x =  .  such that Ax = 0. Suppose λk 6= 0. We define B as follows: λn   1 λ1  .. .   . .     1 λk−1    B =  λk     λk+1 1     . ..   . .  λn 1

So AB has the kth column identically zero. So det(AB) = 0. So it is sufficient to prove that det(B) 6= 0. But det B = λk 6= 0. So done.

Lemma. Let A ∈ Matn(F). Then (i) We can expand det A along the jth column by

n X i+j det A = (−1) Aij det Aˆij. i=1

(ii) We can expand det A along the ith row by

n X i+j det A = (−1) Aij det Aˆij. j=1

Proof. Since det A = det AT , (i) and (ii) are equivalent. So it suffices to prove just one of them. We have

det A = d(A(1), ··· ,A(n)), where d is the volume form induced by the determinant. Then we can write as

n ! (1) X (n) det A = d A , ··· , Aijei, ··· ,A i=1 n X (1) (n) = Aijd(A , ··· , ei, ··· ,A ) i=1

25 5 Determinants of matrices IB Linear Algebra (Theorems with proof)

The volume form on the right is the determinant of a matrix with the jth column replaced with ei. We can move our columns around so that our matrix becomes

 Aˆ 0 B = ij stuff 1

We get that det B = det Aˆij, since the only permutations that give a non-zero sum are those that send n to n. In the row and column swapping, we have made n − j column transpositions and n − i row transpositions. So we have

n X n−j n−i det A = Aij(−1) (−1) det B i=1 n X i+j = Aij(−1) det Aˆij. i=1

Theorem. If A ∈ Matn(F), then A(adj A) = (det A)In = (adj A)A. In particu- lar, if det A 6= 0, then 1 A−1 = adj A. det A Proof. We compute

n n X X i+j [(adj A)A]jk = (adj A)jiAik = (−1) det AˆijAik. (∗) i=1 i=1

So if j = k, then [(adj A)A]jk = det A by the lemma. Otherwise, if j 6= k, consider the matrix B obtained from A by replacing the jth column by the kth column. Then the right hand side of (∗) is just det B by the lemma. But we know that if two columns are the same, the determinant is zero. So the right hand side of (∗) is zero. So

[(adj A)A]jk = det Aδjk

The calculation for [A adj A] = (det A)In can be done in a similar manner, or by T T T T T considering (A adj A) = (adj A) A = (adj(A ))A = (det A)In. Lemma. Let A, B be square matrices. Then for any C, we have

AC det = (det A)(det B). 0 B

Proof. Suppose A ∈ Matk(F), and B ∈ Mat`(F), so C ∈ Matk,`(F). Let

AC X = . 0 B

Then by definition, we have

k+` X Y det X = ε(σ) Xiσ(i).

σ∈Sk+` i=1

26 5 Determinants of matrices IB Linear Algebra (Theorems with proof)

If j ≤ k and i > k, then Xij = 0. We only want to sum over permutations σ such that σ(i) > k if i > k. So we are permuting the last j things among themselves, and hence the first k things among themselves. So we can decompose this into σ = σ1σ2, where σ1 is a permutation of {1, ··· , k} and fixes the remaining things, while σ2 fixes {1, ··· , k}, and permutes the remaining. Then

k ` X Y Y det X = ε(σ1σ2) Xiσ1(i) Xk+j σ2(k+j) σ=σ1σ2 i=1 j=1

k !  `  X Y X Y = ε(σ1) Aiσ1(i)  ε(σ2) Bjσ2(j)

σ1∈Sk i=1 σ2∈S` j=1 = (det A)(det B)

Corollary.   A1 stuff n  A2  Y det   = det A  ..  i  .  i=1 0 An

27 6 Endomorphisms IB Linear Algebra (Theorems with proof)

6 Endomorphisms 6.1 Invariants

Lemma. Suppose (e1, ··· , en) and (f1, ··· , fn) are bases for V and α ∈ End(V ). If A represents α with respect to (e1, ··· , en) and B represents α with respect to (f1, ··· , fn), then B = P −1AP, where P is given by n X fi = Pjiej. j=1 Proof. This is merely a special case of an earlier more general result for arbitrary maps and spaces. Lemma.

(i) If A ∈ Matm,n(F) and B ∈ Matn,m(F), then tr AB = tr BA.

(ii) If A, B ∈ Matn(F) are similar, then tr A = tr B.

(iii) If A, B ∈ Matn(F) are similar, then det A = det B. Proof. (i) We have

m m n n m X X X X X tr AB = (AB)ii = AijBji = BjiAij = tr BA. i=1 i=1 j=1 j=1 i=1

(ii) Suppose B = P −1AP . Then we have

tr B = tr(P −1(AP )) = tr((AP )P −1) = tr A.

(iii) We have

det(P −1AP ) = det P −1 det A det P = (det P )−1 det A det P = det A.

Lemma. If A and B are similar, then they have the same characteristic polyno- mial. Proof. det(tI − P −1AP ) = det(P −1(tI − A)P ) = det(tI − A).

Lemma. Let α ∈ End(V ) and λ1, ··· , λk distinct eigenvalues of α. Then

k M E(λ1) + ··· + E(λk) = E(λi) i=1 is a direct sum.

28 6 Endomorphisms IB Linear Algebra (Theorems with proof)

Proof. Suppose k k X X xi = yi, i=1 i=1 with xi, yi ∈ E(λi). We want to show that they are equal. We are going to find some clever map that tells us what xi and yi are. Consider βj ∈ End(V ) defined by Y βj = (α − λrι). r6=j Then

k ! k X X Y βj xi = (α − λrι)(xi) i=1 i=1 r6=j k X Y = (λi − λr)(xi). i=1 r6=j Each summand is zero, unless i 6= j. So this is equal to

k ! X Y βj xi = (λj − λr)(xj). i=1 r6=j Similarly, we obtain

k ! X Y βj yi = (λj − λr)(yj). i=1 r6=j P P Since we know that xi = yi, we must have Y Y (λj − λr)xj = (λj − λr)yj. r6=j r6=j Q Since we know that (λr − λj) 6= 0, we must have xi = yi for all i. r6=j P So each expression for xi is unique.

Theorem. Let α ∈ End(V ) and λ1, ··· , λk be distinct eigenvalues of α. Write Ei for E(λi). Then the following are equivalent: (i) α is diagonalizable. (ii) V has a basis of eigenvectors for α. Lk (iii) V = i=1 Ei. Pk (iv) dim V = i=1 dim Ei. Proof.

– (i) ⇔ (ii): Suppose (e1, ··· , en) is a basis for V . Then

α(ei) = Ajiej,

where A represents α. Then A is diagonal iff each ei is an eigenvector. So done

29 6 Endomorphisms IB Linear Algebra (Theorems with proof)

P – (ii) ⇔ (iii): It is clear that (ii) is true iff Ei = V , but we know that this must be a direct sum. So done.

– (iii) ⇔ (iv): This follows from example sheet 1 Q10, which says that Lk V = i=1 Ei iff the bases for Ei are disjoint and their union is a basis of V .

6.2 The minimal polynomial 6.2.1 Aside on polynomials

Lemma (Polynomial division). If f, g ∈ F[t] (and g 6= 0), then there exists q, r ∈ F[t] with deg r < deg g such that f = qg + r.

Lemma. If λ ∈ F is a root of f, i.e. f(λ) = 0, then there is some g such that f(t) = (t − λ)g(t).

Proof. By polynomial division, let

f(t) = (t − λ)g(t) + r(t)

for some g(t), r(t) ∈ F[t] with deg r < deg(t − λ) = 1. So r has to be constant, i.e. r(t) = a0 for some a0 ∈ F. Now evaluate this at λ. So

0 = f(λ) = (λ − λ)g(λ) + r(λ) = a0.

So a0 = 0. So r = 0. So done.

Lemma. A non-zero polynomial f ∈ F[t] has at most deg f roots, counted with multiplicity.

Corollary. Let f, g ∈ F[t] have degree < n. If there are λ1, ··· , λn distinct such that f(λi) = g(λi) for all i, then f = g. Proof. Given the lemma, consider f − g. This has degree less than n, and (f − g)(λi) = 0 for i = 1, ··· , n. Since it has at least n ≥ deg(f − g) roots, we must have f − g = 0. So f = g.

Corollary. If F is infinite, then f and g are equal if and only if they agree on all points. Theorem (The fundamental theorem of algebra). Every non-constant polyno- mial over C has a root in C.

6.2.2 Minimal polynomial Theorem (Diagonalizability theorem). Suppose α ∈ End(V ). Then α is diago- nalizable if and only if there exists non-zero p(t) ∈ F[t] such that p(α) = 0, and p(t) can be factored as a product of distinct linear factors.

30 6 Endomorphisms IB Linear Algebra (Theorems with proof)

Proof. Suppose α is diagonalizable. Let λ1, ··· , λk be the distinct eigenvalues of α. We have k M V = E(λi). i=1 So each v ∈ V can be written (uniquely) as

k X v = vi with α(vi) = λivi. i=1 Now let k Y p(t) = (t − λi). i=1 Then for any v, we get

k k X X p(α)(v) = p(α)(vi) = p(λi)vi = 0. i=1 i=1

So p(α) = 0. By construction, p has distinct linear factors. Conversely, suppose we have our polynomial

k Y p(t) = (t − λi), i=1 with λ1, ··· , λk ∈ F distinct, and p(α) = 0 (we can wlog assume p is monic, i.e. the leading coefficient is 1). We will show that

k X V = Eα(λi). i=1

In other words, we want to show that for all v ∈ V , there is some vi ∈ Eα(λi) P for i = 1, ··· , k such that v = vi. To find these vi out, we let

Y t − λi qj(t) = . λj − λi i6=j

This is a polynomial of degree k − 1, and qj(λi) = δij. Now consider k X q(t) = qi(t). i=1

We still have deg q ≤ k − 1, but q(λi) = 1 for any i. Since q and 1 agree on k points, we must have q = 1. Let πj : V → V be given by πj = qj(α). Then the above says that

k X πj = ι. j=1

31 6 Endomorphisms IB Linear Algebra (Theorems with proof)

P Hence given v ∈ V , we know that v = πjv. We now check that πjv ∈ Eα(λj). This is true since

k 1 Y 1 (α − λ ι)π v = (α − λ )(v) = p(α)(v) = 0. j j Q (λ − λ ) ι Q (λ − λ ) i6=j j i i=1 i6=j j i So αvj = λjvj. So done.

Lemma. Let α ∈ End(V ), and p ∈ F[t]. Then p(α) = 0 if and only if Mα(t) is a factor of p(t). In particular, Mα is unique.

Proof. For all such p, we can write p(t) = q(t)Mα(t) + r(t) for some r of degree less than deg Mα. Then

p(α) = q(α)Mα(α) + r(α).

So if r(α) = 0 iff p(α) = 0. But deg r < deg Mα. By the minimality of Mα, we must have r(α) = 0 iff r = 0. So p(α) = 0 iff Mα(t) | p(t). So if M1 and M2 are both minimal polynomials for α, then M1 | M2 and M2 | M1. So M2 is just a scalar multiple of M1. But since M1 and M2 are monic, they must be equal. Theorem (Diagonalizability theorem 2.0). Let α ∈ End(V ). Then α is diago- nalizable if and only if Mα(t) is a product of its distinct linear factors. Proof. (⇐) This follows directly from the previous diagonalizability theorem. (⇒) Suppose α is diagonalizable. Then there is some p ∈ F[t] non-zero such that p(α) = 0 and p is a product of distinct linear factors. Since Mα divides p, Mα also has distinct linear factors. Theorem. Let α, β ∈ End(V ) be both diagonalizable. Then α and β are simultaneously diagonalizable (i.e. there exists a basis with respect to which both are diagonal) if and only if αβ = βα.

Proof. (⇒) If there exists a basis (e1, ··· , en) for V such that α and β are repre- sented by A and B respectively, with both diagonal, then by direct computation, AB = BA. But AB represents αβ and BA represents βα. So αβ = βα. (⇐) Suppose αβ = βα. The idea is to consider each eigenspace of α individu- ally, and then diagonalize β in each of the eigenspaces. Since α is diagonalizable, we can write k M V = Eα(λi), i=1 where λi are the eigenvalues of V . We write Ei for Eα(λi). We want to show that β sends Ei to itself, i.e. β(Ei) ⊆ Ei. Let v ∈ Ei. Then we want β(v) to be in Ei. This is true since

α(β(v)) = β(α(v)) = β(λiv) = λiβ(v).

So β(v) is an eigenvector of α with eigenvalue λi.

32 6 Endomorphisms IB Linear Algebra (Theorems with proof)

Now we can view β|Ei ∈ End(Ei). Note that

Mβ(β|Ei ) = Mβ(β)|Ei = 0.

Since Mβ(t) is a product of its distinct linear factors, it follows that β|Ei is

diagonalizable. So we can choose a basis Bi of eigenvectors for β|Ei . We can do this for all i. Sk Then since V is a direct sum of the Ei’s, we know that B = i=1 Bi is a basis for V consisting of eigenvectors for both α and β. So done.

6.3 The Cayley-Hamilton theorem Theorem (Cayley-Hamilton theorem). Let V be a finite-dimensional vector space and α ∈ End(V ). Then χα(α) = 0, i.e. Mα(t) | χα(t). In particular, deg Mα ≤ n.

Lemma. An endomorphism α is triangulable if and only if χα(t) can be written as a product of linear factors, not necessarily distinct. In particular, if F = C (or any algebraically closed field), then every endomorphism is triangulable. Proof. Suppose that α is triangulable and represented by   λ1 ∗ · · · ∗  0 λ2 · · · ∗    .  . . .. .   . . . .  0 0 ··· λn

Then   t − λ1 ∗ · · · ∗ n  0 t − λ2 · · · ∗  Y χ (t) = det   = (t − λ ). α  . . .. .  i  . . . .  i=1 0 0 ··· t − λn So it is a product of linear factors. We are going to prove the converse by induction on the dimension of our space. The base case dim V = 1 is trivial, since every 1 × 1 matrix is already upper triangular. Suppose α ∈ End(V ) and the result holds for all spaces of dimensions < dim V , and χα is a product of linear factors. In particular, χα(t) has a root, say λ ∈ F. Now let U = E(λ) 6= 0, and let W be a complementary subspace to U in V , i.e. V = U ⊕ W . Let u1, ··· , ur be a basis for U and wr+1, ··· , wn be a basis for W so that u1, ··· , ur, wr+1, ··· , wn is a basis for V , and α is represented by   λIr stuff 0 B

r We know χα(t) = (t − λ) χB(t). So χB(t) is also a product of linear factors. We let β : W → W be the map defined by B with respect to wr+1, ··· , wn. (Note that in general, β is not α|W in general, since α does not necessarily map W to W . However, we can say that (α−β)(w) ∈ U for all w ∈ W . This can

33 6 Endomorphisms IB Linear Algebra (Theorems with proof)

be much more elegantly expressed in terms of quotient spaces, but unfortunately that is not officially part of the course) Since dim W < dim V , there is a basis vr+1, ··· , vn for W such that β is represented by C, which is upper triangular. For j = 1, ··· , n − r, we have

n−r X α(vj+r) = u + Ckjvk+r k=1 for some u ∈ U. So α is represented by   λIr stuff 0 C with respect to (u1, ··· , ur, vr+1, ··· , vn), which is upper triangular. Theorem (Cayley-Hamilton theorem). Let V be a finite-dimensional vector space and α ∈ End(V ). Then χα(α) = 0, i.e. Mα(t) | χα(t). In particular, deg Mα ≤ n. Proof. In this proof, we will work over C. By the lemma, we can choose a basis {e1, ··· , en} is represented by an upper triangular matrix.   λ1 ∗ · · · ∗  0 λ2 · · · ∗  A =   .  . . .. .   . . . .  0 0 ··· λn We must prove that n Y χα(α) = χA(α) = (α − λiι) = 0. i=1

Write Vj = he1, ··· , eji. So we have the inclusions

V0 = 0 ⊆ V1 ⊆ · · · ⊆ Vn−1 ⊆ Vn = V.

We also know that dim Vj = j. This increasing sequence is known as a flag. Now note that since A is upper-triangular, we get

i X α(ei) = Akiek ∈ Vi. k=1

So α(Vj) ⊆ Vj for all j = 0, ··· , n. Moreover, we have

j−1 X (α − λjι)(ej) = Akjek ⊆ Vj−1 k=1 for all j = 1, ··· , n. So every time we apply one of these things, we get to a smaller space. Hence by induction on n − j, we have n Y (α − λiι)(Vn) ⊆ Vj−1. i=j

34 6 Endomorphisms IB Linear Algebra (Theorems with proof)

In particular, when j = 1, we get n Y (α − λiι)(V ) ⊆ V0 = 0. i=1

So χα(α) = 0 as required. Note that if our field F is not C but just a subfield of C, say R, we can just pretend it is a complex matrix, do the same proof. Proof. We’ll now prove the theorem again, which is somewhat a formalization of the “nonsense proof” where we just substitute t = α into det(α − tι). Let α be represented by A, and B = tI − A. Then

B adj B = det BIn = χα(t)In. But we know that adj B is a matrix with entries in F[t] of degree at most n − 1. So we can write n−1 n−2 adj B = Bn−1t + Bn−2t + ··· + B0, with Bi ∈ Matn(F). We can also write n n−1 χα(t) = t + an−1t + ··· + a0. Then we get the result n−1 n−2 n n−1 (tIn − A)(Bn−1t + Bn−2t + ··· + B0) = (t + an−1t + ··· + a0)In. We would like to just throw in t = A, and get the desired result, but in all these derivations, t is assumed to be a real number, and, tIn − A is the matrix   t − a11 a12 ··· a1n  a21 t − a22 ··· a2n     . . .. .   . . . .  an1 an2 ··· t − ann It doesn’t make sense to put our A in there. However, what we can do is to note that since this is true for all values of t, the coefficients on both sides must be equal. Equating coefficients in tk, we have

−AB0 = a0In

B0 − AB1 = a1In . .

Bn−2 − ABn−1 = an−1In

ABn−1 − 0 = In We now multiply each row a suitable power of A to obtain

−AB0 = a0In 2 AB0 − A B1 = a1A . . n−1 n n−1 A Bn−2 − A Bn−1 = an−1A n n A Bn−1 − 0 = A .

Summing this up then gives χα(A) = 0.

35 6 Endomorphisms IB Linear Algebra (Theorems with proof)

Lemma. Let α ∈ End(V ), λ ∈ F. Then the following are equivalent: (i) λ is an eigenvalue of α.

(ii) λ is a root of χα(t).

(iii) λ is a root of Mα(t). Proof. – (i) ⇔ (ii): λ is an eigenvalue of α if and only if (α − λι)(v) = 0 has a non-trivial root, iff det(α − λι) = 0.

– (iii) ⇒ (ii): This follows from Cayley-Hamilton theorem since Mα | χα. – (i) ⇒ (iii): Let λ be an eigenvalue, and v be a corresponding eigenvector. Then by definition of Mα, we have

Mα(α)(v) = 0(v) = 0.

We also know that Mα(α)(v) = Mα(λ)v.

Since v is non-zero, we must have Mα(λ) = 0. – (iii) ⇒ (i): This is not necessary since it follows from the above, but we could as well do it explicitly. Suppose λ is a root of Mα(t). Then Mα(t) = (t − λ)g(t) for some g ∈ F[t]. But deg g < deg Mα. Hence by minimality of Mα, we must have g(α) 6= 0. So there is some v ∈ V such that g(α)(v) 6= 0. Then

(α − λι)g(α)(v) = Mα(α)v = 0.

So we must have α(g(α)(v)) = λg(α)(v). So g(α)(v) ∈ Eα(λ) \{0}. So (i) holds.

6.4 Multiplicities of eigenvalues and Jordan normal form Lemma. If λ is an eigenvalue of α, then

(i)1 ≤ gλ ≤ aλ

(ii)1 ≤ cλ ≤ aλ. Proof. (i) The first inequality is easy. If λ is an eigenvalue, then E(λ) 6= 0. So gλ = dim E(λ) ≥ 1. To prove the other inequality, if v1, ··· , vg is a basis for E(λ), then we can extend it to a basis for V , and then α is represented by   λIg ∗ 0 B g So χα(t) = (t − λ) χB(t). So aλ > g = gλ.

(ii) This is straightforward since Mα(λ) = 0 implies 1 ≤ cλ, and since Mα(t) | χα(t), we know that cλ ≤ αλ.

36 6 Endomorphisms IB Linear Algebra (Theorems with proof)

Lemma. Suppose F = C and α ∈ End(V ). Then the following are equivalent: (i) α is diagonalizable.

(ii) gλ = aλ for all eigenvalues of α.

(iii) cλ = 1 for all λ. Proof. P – (i) ⇔ (ii): α is diagonalizable iff dim V = dim Eα(λi). But this is equivalent to X X dim V = gλi ≤ aλi = deg χα = dim V. P P So we must have gλi = aλi . Since each gλi is at most aλi , they must be individually equal.

– (i) ⇔ (iii): α is diagonalizable if and only if Mα(t) is a product of distinct linear factors if and only if cλ = 1 for all eigenvalues λ.

Theorem (Jordan normal form theorem). Every matrix A ∈ Matn(C) is similar to a matrix in Jordan normal form. Moreover, this Jordan normal form matrix is unique up to permutation of the blocks. Theorem. Let α ∈ End(V ), and A in Jordan normal form representing α. Then the number of Jordan blocks Jn(λ) in A with n ≥ r is

n((α − λι)r) − n((α − λι)r−1).

Proof. We work blockwise for   Jn1 (λ1)

 Jn2 (λ2)  A =   .  ..   . 

Jnk (λk) We have previously computed ( r r r ≤ m n((Jm(λ) − λIm) ) = . m r > m

Hence we know ( r r−1 1 r ≤ m n((Jm(λ) − λIm) ) − n((Jm(λ) − λIm) ) = 0 otherwise.

It is also easy to see that for µ 6= λ,

r r n((Jm(µ) − λIm) ) = n(Jm(µ − λ) ) = 0

Adding up for each block, for r ≥ 1, we have

r r−1 n((α−λι) )−n((α−λι) ) = number of Jordan blocks Jn(λ) with n ≥ r.

37 6 Endomorphisms IB Linear Algebra (Theorems with proof)

Theorem (Generalized eigenspace decomposition). Let V be a finite-dimensional vector space C such that α ∈ End(V ). Suppose that

k Y ci Mα(t) = (t − λi) , i=1 with λ1, ··· , λk ∈ C distinct. Then

V = V1 ⊕ · · · ⊕ Vk,

ci where Vi = ker((α − λiι) ) is the generalized eigenspace. Proof. Let Y ci pj(t) = (t − λi) . i6=j

Then p1, ··· , pk have no common factors, i.e. they are coprime. Thus by Euclid’s algorithm, there exists q1, ··· , qk ∈ C[t] such that X piqi = 1.

We now define the endomorphism

πj = qj(α)pj(α)

for j = 1, ··· , k. P cj Then πj = ι. Since Mα(α) = 0 and Mα(t) = (t − λjι) pj(t), we get

cj (α − λjι) πj = 0.

So im πj ⊆ Vj. Now suppose v ∈ V . Then

k X X v = ι(v) = πj(v) ∈ Vj. j=1

So X V = Vj.

To show this is a direct sum, note that πiπj = 0, since the product contains Mα(α) as a factor. So

X  2 πi = ιπi = πj πi = πi . P So π is a projection, and πj|Vj = ιVj . So if v = vi, then applying πi to both sides gives vi = πi(v). Hence there is a unique way of writing v as a sum of L things in Vi. So V = Vj as claimed.

38 7 Bilinear forms II IB Linear Algebra (Theorems with proof)

7 Bilinear forms II 7.1 Symmetric bilinear forms and quadratic forms

Lemma. Let V be a finite-dimensional vector space over F, and φ : V × V → F is a symmetric bilinear form. Let (e1, ··· , en) be a basis for V , and let M be the matrix representing φ with respect to this basis, i.e. Mij = φ(ei, ej). Then φ is symmetric if and only if M is symmetric.

Proof. If φ is symmetric, then

Mij = φ(ei, ej) = φ(ej, ei) = Mji.

So M T = M. So M is symmetric. If M is symmetric, then X X  φ(x, y) = φ xiei, yjej X = xiMijyj i,j n X = yjMjixi i,j = φ(y, x).

Lemma. Let V is a finite-dimensional vector space, and φ : V × V → F a bilinear form. Let (e1, ··· , en) and (f1, ··· , fn) be bases of V such that

n X fi = Pkiek. k=1

If A represents φ with respect to (ei) and B represents φ with respect to (fi), then B = P T AP. Proof. Special case of general formula proven before.

Proposition (Polarization identity). Suppose that char F 6= 2, i.e. 1 + 1 6= 0 on F (e.g. if F is R or C). If q : V → F is a quadratic form, then there exists a unique symmetric bilinear form φ : V × V → F such that q(v) = φ(v, v).

Proof. Let ψ : V × V → F be a bilinear form such that ψ(v, v) = q(v). We define φ : V × V → F by 1 φ(v, w) = (ψ(v, w) + ψ(w, v)) 2

for all v, w ∈ F. This is clearly a bilinear form, and it is also clearly symmetric and satisfies the condition we wants. So we have proved the existence part.

39 7 Bilinear forms II IB Linear Algebra (Theorems with proof)

To prove uniqueness, we want to find out the values of φ(v, w) in terms of what q tells us. Suppose φ is such a symmetric bilinear form. We compute

q(v + w) = φ(v + w, v + w) = φ(v, v) + φ(v, w) + φ(w, v) + φ(w, w) = q(v) + 2φ(v, w) + q(w).

So we have 1 φ(v, w) = (q(v + w) − q(v) − q(w)). 2 So it is determined by q, and hence unique.

Theorem. Let V be a finite-dimensional vector space over F, and φ : V ×V → F a symmetric bilinear form. Then there exists a basis (e1, ··· , en) for V such that φ is represented by a diagonal matrix with respect to this basis. Proof. We induct over n = dim V . The cases n = 0 and n = 1 are trivial, since all matrices are diagonal. Suppose we have proven the result for all spaces of dimension less than n. First consider the case where φ(v, v) = 0 for all v ∈ V . We want to show that we must have φ = 0. This follows from the polarization identity, since this φ induces the zero quadratic form, and we know that there is a unique bilinear form that induces the zero quadratic form. Since we know that the zero bilinear form also induces the zero quadratic form, we must have φ = 0. Then φ will be represented by the zero matrix with respect to any basis, which is trivially diagonal. If not, pick e1 ∈ V such that φ(e1, e1) 6= 0. Let

U = ker φ(e1, · ) = {u ∈ V : φ(e1, u) = 0}.

Since φ(e1, · ) ∈ V* \ {0}, we know that dim U = n − 1 by the rank-nullity theorem. Our objective is to find other basis elements e2, ··· , en such that φ(e1, ej) = 0 for all j > 1. For this to happen, we need to find them inside U.

Now consider φ|U×U : U × U → F, a symmetric bilinear form. By the induction hypothesis, we can find a basis e2, ··· , en for U such that φ|U×U is represented by a diagonal matrix with respect to this basis. Now by construction, φ(ei, ej) = 0 for all 1 ≤ i ≠ j ≤ n and (e1, ··· , en) is a basis for V. So we're done.

Theorem. Let φ be a symmetric bilinear form on a finite-dimensional complex vector space V. Then there exists a basis (v1, ··· , vn) for V such that φ is represented by

[ Ir 0 ]
[ 0  0 ]

with respect to this basis, where r = r(φ).

Proof. We've already shown that there exists a basis (e1, ··· , en) such that φ(ei, ej) = λi δij for some λi. By reordering the ei, we can assume that λ1, ··· , λr ≠ 0 and λ_{r+1}, ··· , λn = 0.


For each 1 ≤ i ≤ r, there exists some µi such that µi^2 = λi. For r + 1 ≤ i ≤ n, we let µi = 1 (or anything non-zero). We define

vi = ei / µi.

Then

φ(vi, vj) = (1/(µi µj)) φ(ei, ej) = 0 if i ≠ j or i = j > r, and 1 if i = j ≤ r.

So done.

Corollary. Every symmetric A ∈ Matn(C) is congruent to a unique matrix of the form

[ Ir 0 ]
[ 0  0 ].

Theorem. Let φ be a symmetric bilinear form on a finite-dimensional vector space V over R. Then there exists a basis (v1, ··· , vn) for V such that φ is represented by

[ Ip   0    0 ]
[ 0   −Iq   0 ]
[ 0    0    0 ],

with p + q = r(φ), p, q ≥ 0. Equivalently, the corresponding quadratic form is given by

q(∑_{i=1}^n ai vi) = ∑_{i=1}^p ai^2 − ∑_{j=p+1}^{p+q} aj^2.

Proof. We've already shown that there exists a basis (e1, ··· , en) such that φ(ei, ej) = λi δij for some λ1, ··· , λn ∈ R. By reordering, we may assume

λi > 0 for 1 ≤ i ≤ p,
λi < 0 for p + 1 ≤ i ≤ r,
λi = 0 for i > r.

We let µi be defined by

µi = √λi for 1 ≤ i ≤ p,
µi = √(−λi) for p + 1 ≤ i ≤ r,
µi = 1 for i > r.

Defining

vi = ei / µi,

we find that φ is indeed represented by

[ Ip   0    0 ]
[ 0   −Iq   0 ]
[ 0    0    0 ].


Theorem (Sylvester's law of inertia). Let φ be a symmetric bilinear form on a finite-dimensional real vector space V. Then there exist unique non-negative integers p, q such that φ is represented by

[ Ip   0    0 ]
[ 0   −Iq   0 ]
[ 0    0    0 ]

with respect to some basis.

Proof. We have already proved the existence part, and we just have to prove uniqueness. To do so, we characterize p and q in a basis-independent way. We already know that p + q = r(φ) does not depend on the basis. So it suffices to show p is unique.

To see that p is unique, we show that p is the largest dimension of a subspace P ⊆ V such that φ|P×P is positive definite.

First we show we can find such a P. Suppose φ is represented by

[ Ip   0    0 ]
[ 0   −Iq   0 ]
[ 0    0    0 ]

with respect to (e1, ··· , en). Then φ restricted to ⟨e1, ··· , ep⟩ is represented by Ip with respect to e1, ··· , ep. So φ restricted to this subspace is positive definite.

Now suppose P is any subspace of V such that φ|P×P is positive definite. To show P has dimension at most p, we find a subspace complementary to P with dimension n − p. Let Q = ⟨e_{p+1}, ··· , en⟩. Then φ restricted to Q × Q is represented by

[ −Iq  0 ]
[  0   0 ].

Now if v ∈ P ∩ Q \{0}, then φ(v, v) > 0 since v ∈ P \{0} and φ(v, v) ≤ 0 since v ∈ Q, which is a contradiction. So P ∩ Q = 0. We have

dim V ≥ dim(P + Q) = dim P + dim Q = dim P + (n − p).

Rearranging gives dim P ≤ p. A similar argument shows that q is the maximal dimension of a subspace Q ⊆ V such that φ|Q×Q is negative definite.

Corollary. Every real symmetric matrix is congruent to precisely one matrix of the form

[ Ip   0    0 ]
[ 0   −Iq   0 ]
[ 0    0    0 ].
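As an aside (not from the notes), the invariants p and q of a real symmetric matrix can be read off numerically: an orthogonal diagonalisation is in particular a congruence, so p and q are the numbers of positive and negative eigenvalues. The matrix below is a hypothetical example.

    import numpy as np

    A = np.array([[0., 1., 0.],
                  [1., 0., 0.],
                  [0., 0., 0.]])          # represents the quadratic form 2*x1*x2 on R^3

    evals = np.linalg.eigvalsh(A)         # eigenvalues of a real symmetric matrix
    p = int(np.sum(evals > 1e-12))        # dimension of a maximal positive definite subspace
    q = int(np.sum(evals < -1e-12))       # dimension of a maximal negative definite subspace
    print(p, q, p + q)                    # 1 1 2: signature (1, 1), rank 2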


7.2 Hermitian form

Lemma. Let φ : V × V → C be a sesquilinear form on a finite-dimensional vector space over C, and (e1, ··· , en) a basis for V . Then φ is Hermitian if and only if the matrix A representing φ is Hermitian (i.e. A = A†). Proof. If φ is Hermitian, then

Aij = φ(ei, ej) = \overline{φ(ej, ei)} = \overline{Aji} = (A†)ij, so A = A†.

If A is Hermitian, then

φ(∑ λi ei, ∑ µj ej) = λ† A µ = \overline{µ† A† λ} = \overline{µ† A λ} = \overline{φ(∑ µj ej, ∑ λi ei)},

using A = A† in the middle step.

So done.

Proposition (Change of basis). Let φ be a Hermitian form on a finite-dimensional vector space V; (e1, ··· , en) and (v1, ··· , vn) are bases for V such that

vi = ∑_{k=1}^n Pki ek;

and A, B represent φ with respect to (e1, ··· , en) and (v1, ··· , vn) respectively. Then B = P† A P.

Proof. We have

Bij = φ(vi, vj)
    = φ(∑_k Pki ek, ∑_ℓ Pℓj eℓ)
    = ∑_{k,ℓ=1}^n \overline{Pki} Pℓj Akℓ
    = (P† A P)ij.

Lemma (Polarization identity (again)). A Hermitian form φ on V is determined by the function ψ : v ↦ φ(v, v).

Proof. We have the following:

ψ(x + y) = φ(x, x) + φ(x, y) + φ(y, x) + φ(y, y),
−ψ(x − y) = −φ(x, x) + φ(x, y) + φ(y, x) − φ(y, y),
iψ(x − iy) = iφ(x, x) + φ(x, y) − φ(y, x) + iφ(y, y),
−iψ(x + iy) = −iφ(x, x) + φ(x, y) − φ(y, x) − iφ(y, y).

So

φ(x, y) = (1/4)(ψ(x + y) − ψ(x − y) + iψ(x − iy) − iψ(x + iy)).
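As a numerical illustration (not from the notes), the identity can be checked for the Hermitian form φ(x, y) = x† A y, conjugate-linear in the first argument; the random Hermitian matrix A below is a hypothetical example.

    import numpy as np

    rng = np.random.default_rng(1)
    M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
    A = (M + M.conj().T) / 2                       # a random Hermitian matrix

    phi = lambda x, y: np.conj(x) @ A @ y          # Hermitian form, conjugate-linear in x
    psi = lambda x: phi(x, x)

    x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    lhs = (psi(x + y) - psi(x - y) + 1j * psi(x - 1j * y) - 1j * psi(x + 1j * y)) / 4
    assert np.isclose(lhs, phi(x, y))              # polarization recovers phi from psi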


Theorem (Hermitian form of Sylvester's law of inertia). Let V be a finite-dimensional complex vector space and φ a Hermitian form on V. Then there exist unique non-negative integers p and q such that φ is represented by

[ Ip   0    0 ]
[ 0   −Iq   0 ]
[ 0    0    0 ]

with respect to some basis.

Proof. Same as for symmetric forms over R.


8 Inner product spaces

8.1 Definitions and basic properties

Theorem (Cauchy-Schwarz inequality). Let V be an inner product space and v, w ∈ V. Then |(v, w)| ≤ ‖v‖ ‖w‖.

Proof. If w = 0, then this is trivial. Otherwise, since the norm is positive definite, for any λ, we get

0 ≤ (v − λw, v − λw) = (v, v) − λ̄(w, v) − λ(v, w) + |λ|^2 (w, w).

We now pick a clever value of λ. We let

λ = (w, v) / (w, w).

Then we get

0 ≤ (v, v) − |(w, v)|^2/(w, w) − |(w, v)|^2/(w, w) + |(w, v)|^2/(w, w) = (v, v) − |(w, v)|^2/(w, w).

So we get |(w, v)|^2 ≤ (v, v)(w, w). So done.

Corollary (Triangle inequality). Let V be an inner product space and v, w ∈ V. Then ‖v + w‖ ≤ ‖v‖ + ‖w‖.

Proof. We compute

‖v + w‖^2 = (v + w, v + w) = (v, v) + (v, w) + (w, v) + (w, w) ≤ ‖v‖^2 + 2‖v‖‖w‖ + ‖w‖^2 = (‖v‖ + ‖w‖)^2,

using (v, w) + (w, v) = 2 Re(v, w) ≤ 2|(v, w)| ≤ 2‖v‖‖w‖ by Cauchy-Schwarz.

So done. Lemma (Parseval’s identity). Let V be a finite-dimensional inner product space with an orthonormal basis v1, ··· , vn, and v, w ∈ V . Then

(v, w) = ∑_{i=1}^n \overline{(vi, v)} (vi, w).

In particular,

‖v‖^2 = ∑_{i=1}^n |(vi, v)|^2.


Proof.

(v, w) = (∑_{i=1}^n (vi, v) vi, ∑_{j=1}^n (vj, w) vj)
        = ∑_{i,j=1}^n \overline{(vi, v)} (vj, w) (vi, vj)
        = ∑_{i,j=1}^n \overline{(vi, v)} (vj, w) δij
        = ∑_{i=1}^n \overline{(vi, v)} (vi, w).
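As a small numerical check (not from the notes), Parseval's identity can be verified for the standard (real) inner product on R^3, where the conjugates are invisible; the orthonormal basis is taken from a QR factorization of a hypothetical random matrix.

    import numpy as np

    rng = np.random.default_rng(2)
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # columns of Q: an orthonormal basis of R^3
    v = np.array([1., -2., 0.5])
    w = np.array([0.3, 4., 1.])

    coeffs = [(Q[:, i] @ v, Q[:, i] @ w) for i in range(3)]    # ((v_i, v), (v_i, w)) for each i
    assert np.isclose(v @ w, sum(a * b for a, b in coeffs))    # Parseval's identity
    assert np.isclose(v @ v, sum(a * a for a, _ in coeffs))    # norm-squared version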

8.2 Gram-Schmidt orthogonalization

Theorem (Gram-Schmidt process). Let V be an inner product space and e1, e2, ··· a linearly independent set. Then we can construct an orthonormal set v1, v2, ··· with the property that

⟨v1, ··· , vk⟩ = ⟨e1, ··· , ek⟩

for every k.

Proof. We construct it iteratively, and prove this by induction on k. The base case k = 0 is contentless. Suppose we have already found v1, ··· , vk satisfying the properties. We define

u_{k+1} = e_{k+1} − ∑_{i=1}^k (vi, e_{k+1}) vi.

We want to prove that u_{k+1} is orthogonal to vj for all j ≤ k. We have

(vj, u_{k+1}) = (vj, e_{k+1}) − ∑_{i=1}^k (vi, e_{k+1}) δij = (vj, e_{k+1}) − (vj, e_{k+1}) = 0.

So it is orthogonal. We want to argue that u_{k+1} is non-zero. Note that

⟨v1, ··· , vk, u_{k+1}⟩ = ⟨v1, ··· , vk, e_{k+1}⟩

since we can recover e_{k+1} from v1, ··· , vk and u_{k+1} by construction. We also know

⟨v1, ··· , vk, e_{k+1}⟩ = ⟨e1, ··· , ek, e_{k+1}⟩

by assumption. We know ⟨e1, ··· , ek, e_{k+1}⟩ has dimension k + 1 since the ei are linearly independent. So we must have u_{k+1} non-zero, or else v1, ··· , vk would be a spanning set of size k for a space of dimension k + 1, which is clearly nonsense. Therefore, we can define

v_{k+1} = u_{k+1} / ‖u_{k+1}‖.

Then v1, ··· , v_{k+1} is orthonormal and ⟨v1, ··· , v_{k+1}⟩ = ⟨e1, ··· , e_{k+1}⟩ as required.
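The construction in the proof translates directly into an algorithm. The sketch below (not from the notes) applies it to the standard inner product on R^n; the input vectors are assumed linearly independent.

    import numpy as np

    def gram_schmidt(vectors):
        """Orthonormalise a list of linearly independent vectors in R^n."""
        basis = []
        for e in vectors:
            u = e - sum(np.dot(v, e) * v for v in basis)   # u_{k+1} = e_{k+1} - sum_i (v_i, e_{k+1}) v_i
            basis.append(u / np.linalg.norm(u))            # v_{k+1} = u_{k+1} / ||u_{k+1}||
        return basis

    vs = gram_schmidt([np.array([1., 1., 0.]),
                       np.array([1., 0., 1.]),
                       np.array([0., 1., 1.])])
    G = np.array([[a @ b for b in vs] for a in vs])        # Gram matrix of the output
    assert np.allclose(G, np.eye(3))                       # the output is orthonormal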


Corollary. If V is a finite-dimensional inner product space, then any orthonor- mal set can be extended to an orthonormal basis.

Proof. Let v1, ··· , vk be an orthonormal set. Since this is linearly independent, we can extend it to a basis (v1, ··· , vk, x_{k+1}, ··· , xn). We now apply the Gram-Schmidt process to this basis to get an orthonormal basis of V, say (u1, ··· , un). Moreover, we can check that the process does not modify our v1, ··· , vk, i.e. ui = vi for 1 ≤ i ≤ k. So done.

Proposition. Let V be a finite-dimensional inner product space, and W ≤ V. Then V = W ⊥ W^⊥.

Proof. There are three things to prove: (i) V = W + W^⊥, (ii) W ∩ W^⊥ = 0, and (iii) W and W^⊥ are orthogonal to each other. We know (iii) implies (ii), and (iii) is obvious by the definition of W^⊥. So it remains to prove (i), i.e. V = W + W^⊥.

Let w1, ··· , wk be an orthonormal basis for W, and pick v ∈ V. Now let

w = ∑_{i=1}^k (wi, v) wi.

Clearly, we have w ∈ W. So we need to show v − w ∈ W^⊥. For each j, we can compute

(wj, v − w) = (wj, v) − ∑_{i=1}^k (wi, v)(wj, wi)
            = (wj, v) − ∑_{i=1}^k (wi, v) δij
            = 0.

Hence for any λj, we have

(∑ λj wj, v − w) = 0.

So we have v − w ∈ W^⊥. So done.

Proposition. Let V be a finite-dimensional inner product space and W ≤ V. Let (e1, ··· , ek) be an orthonormal basis of W. Let π be the orthogonal projection of V onto W, i.e. π : V → W is the function that satisfies ker π = W^⊥, π|W = id. Then

(i) π is given by the formula

π(v) = ∑_{i=1}^k (ei, v) ei.

(ii) For all v ∈ V, w ∈ W , we have

‖v − π(v)‖ ≤ ‖v − w‖,

with equality if and only if π(v) = w. This says π(v) is the point on W that is closest to v.


[Diagram: v ∈ V and its orthogonal projection π(v) onto W; w is an arbitrary point of W, and v − π(v) lies in W^⊥.]

Proof.

(i) Let v ∈ V, and define

w = ∑_{i=1}^k (ei, v) ei.

We want to show this is π(v). We need to show v − w ∈ W^⊥. We can compute

(ej, v − w) = (ej, v) − ∑_{i=1}^k (ei, v)(ej, ei) = 0.

So v − w is orthogonal to every basis vector of W, i.e. v − w ∈ W^⊥. So

π(v) = π(w) + π(v − w) = w

as required. (ii) This is just Pythagoras’ theorem. Note that if x and y are orthogonal, then

‖x + y‖^2 = (x + y, x + y) = (x, x) + (x, y) + (y, x) + (y, y) = ‖x‖^2 + ‖y‖^2.

We apply this to our projection. For any w ∈ W , we have

‖v − w‖^2 = ‖v − π(v)‖^2 + ‖π(v) − w‖^2 ≥ ‖v − π(v)‖^2,

since v − π(v) ∈ W^⊥ and π(v) − w ∈ W are orthogonal, with equality if and only if ‖π(v) − w‖ = 0, i.e. π(v) = w.
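As an illustration (not from the notes), both parts of the proposition can be checked numerically for a hypothetical plane W in R^4 whose orthonormal basis is given by the columns of a matrix E.

    import numpy as np

    rng = np.random.default_rng(3)
    E, _ = np.linalg.qr(rng.standard_normal((4, 2)))   # columns: an orthonormal basis of a plane W in R^4

    v = np.array([1., 2., 3., 4.])
    pv = E @ (E.T @ v)                                 # pi(v) = sum_i (e_i, v) e_i

    assert np.allclose(E.T @ (v - pv), 0)              # v - pi(v) is orthogonal to W
    for _ in range(5):
        w = E @ rng.standard_normal(2)                 # an arbitrary element of W
        assert np.linalg.norm(v - pv) <= np.linalg.norm(v - w) + 1e-12   # pi(v) is closest in W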

8.3 Adjoints, orthogonal and unitary maps

Lemma. Let V and W be finite-dimensional inner product spaces and α : V → W a linear map. Then there exists a unique linear map α* : W → V such that

(αv, w) = (v, α*w)    (∗)

for all v ∈ V , w ∈ W .

Proof. There are two parts. We have to prove existence and uniqueness. We’ll first prove it concretely using matrices, and then provide a conceptual reason of what this means.


Let (v1, ··· , vn) and (w1, ··· , wm) be orthonormal bases for V and W. Suppose α is represented by A. To show uniqueness, suppose α* : W → V satisfies (αv, w) = (v, α*w) for all v ∈ V, w ∈ W. Then for all i, j, by definition, we know

(vi, α*(wj)) = (α(vi), wj) = (∑_k Aki wk, wj) = ∑_k \overline{Aki} (wk, wj) = \overline{Aji}.

So we get

α*(wj) = ∑_i (vi, α*(wj)) vi = ∑_i \overline{Aji} vi.

Hence α* must be represented by A†. So α* is unique.

To show existence, all we have to do is to show that A† indeed works. Now let α* be represented by A†. We can compute the two sides of (∗) for arbitrary v = ∑ λi vi and w = ∑ µj wj. We have

(α(∑ λi vi), ∑ µj wj) = ∑_{i,j} λ̄i µj (α(vi), wj) = ∑_{i,j} λ̄i µj (∑_k Aki wk, wj) = ∑_{i,j} λ̄i \overline{Aji} µj.

We can compute the other side and get

(∑ λi vi, α*(∑ µj wj)) = ∑_{i,j} λ̄i µj (vi, ∑_k (A†)kj vk) = ∑_{i,j} λ̄i \overline{Aji} µj.

So done.

Lemma. Let V be a finite-dimensional inner product space and α ∈ End(V). Then α is orthogonal if and only if α^{−1} = α*.

Proof. (⇐) Suppose α^{−1} = α*. Then

(αv, αv) = (v, α*αv) = (v, α^{−1}αv) = (v, v).

(⇒) If α is orthogonal and (v1, ··· , vn) is an orthonormal basis for V , then for 1 ≤ i, j ≤ n, we have

δij = (vi, vj) = (αvi, αvj) = (vi, α*αvj).

So we know

α*α(vj) = ∑_{i=1}^n (vi, α*αvj) vi = vj.

So by linearity of α*α, we know α*α = idV. So α* = α^{−1}.
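A quick numerical check (not from the notes): with respect to orthonormal bases the adjoint is represented by the conjugate transpose, so (αv, w) = (v, α*w) becomes an identity between matrix products. The matrix and vectors below are hypothetical.

    import numpy as np

    rng = np.random.default_rng(4)
    A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))   # alpha : C^4 -> C^3
    v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

    inner = lambda x, y: np.conj(x) @ y            # inner product, conjugate-linear in the first slot
    assert np.isclose(inner(A @ v, w), inner(v, A.conj().T @ w))   # (alpha v, w) = (v, alpha* w)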


Corollary. α ∈ End(V) is orthogonal if and only if α is represented by an orthogonal matrix, i.e. a matrix A such that A^T A = A A^T = I, with respect to any orthonormal basis.

Proof. Let (e1, ··· , en) be an orthonormal basis for V, and suppose α is represented by A. Then α* is represented by A^T. So α* = α^{−1} if and only if A A^T = A^T A = I.

Proposition. Let V be a finite-dimensional real inner product space and (e1, ··· , en) an orthonormal basis of V. Then there is a bijection

O(V) → {orthonormal bases for V}

α ↦ (α(e1), ··· , α(en)).

Proof. Same as the case for general vector spaces and general bases.

Lemma. Let V be a finite-dimensional complex inner product space and α ∈ End(V). Then α is unitary if and only if α is invertible and α* = α^{−1}.

Corollary. α ∈ End(V) is unitary if and only if α is represented by a unitary matrix A with respect to any orthonormal basis, i.e. A^{−1} = A†.

Proposition. Let V be a finite-dimensional complex inner product space. Then there is a bijection

U(V) → {orthonormal bases of V}

α ↦ (α(e1), ··· , α(en)).

8.4 Spectral theory

Lemma. Let V be a finite-dimensional inner product space, and α ∈ End(V) self-adjoint. Then

(i) α has a real eigenvalue, and all eigenvalues of α are real.

(ii) Eigenvectors of α with distinct eigenvalues are orthogonal. Proof. We are going to do real and complex cases separately. (i) Suppose first V is a complex inner product space. Then by the fundamental theorem of algebra, α has an eigenvalue, say λ. We pick v ∈ V \{0} such that αv = λv. Then

λ̄(v, v) = (λv, v) = (αv, v) = (v, αv) = (v, λv) = λ(v, v).

Since v ≠ 0, we know (v, v) ≠ 0. So λ = λ̄.

For the real case, we pretend we are in the complex case. Let e1, ··· , en be an orthonormal basis for V. Then α is represented by a symmetric matrix A (with respect to this basis). Since real symmetric matrices are Hermitian viewed as complex matrices, this gives a self-adjoint endomorphism of C^n. By the complex case, A has real eigenvalues only. But the eigenvalues of A are the eigenvalues of α, and M_A(t) = M_α(t). So done.


Alternatively, we can prove this without reducing to the complex case. We know every irreducible factor of Mα(t) in R[t] must have degree 1 or 2, since the roots are either real or come in complex conjugate pairs. Suppose f(t) were an irreducible factor of degree 2. Then

(Mα / f)(α) ≠ 0,

since Mα / f has degree less than the minimal polynomial. So there is some u ∈ V such that v := (Mα / f)(α)(u) ≠ 0. Then

f(α)(v) = Mα(α)(u) = 0.

Let U = ⟨v, α(v)⟩. Then this is an α-invariant subspace of V since f has degree 2 and f(α)(v) = 0.

Now α|U ∈ End(U) is self-adjoint. So if (e1, e2) is an orthonormal basis of U, then α|U is represented by a real symmetric matrix, say

[ a b ]
[ b c ].

But then χ_{α|U}(t) = (t − a)(t − c) − b^2, whose discriminant is (a − c)^2 + 4b^2 ≥ 0, so χ_{α|U} has real roots. This is a contradiction: M_{α|U} = f divides χ_{α|U}, but f is an irreducible quadratic and so has no real roots.

(ii) Now suppose αv = λv, αw = µw and λ ≠ µ. We need to show (v, w) = 0. We know (αv, w) = (v, αw) by definition. This then gives

λ(v, w) = µ(v, w), using that λ is real by (i).

Since λ ≠ µ, we must have (v, w) = 0.

Theorem. Let V be a finite-dimensional inner product space, and α ∈ End(V) self-adjoint. Then V has an orthonormal basis of eigenvectors of α.

Proof. By the previous lemma, α has a real eigenvalue, say λ. Then we can find an eigenvector v ∈ V \ {0} such that αv = λv. Let U = ⟨v⟩^⊥. Then we can write

V = ⟨v⟩ ⊥ U.

We now want to prove α sends U into U. Suppose u ∈ U. Then

(v, α(u)) = (αv, u) = λ(v, u) = 0.

So α(u) ∈ ⟨v⟩^⊥ = U. So α|U ∈ End(U) and is self-adjoint. By induction on dim V, U has an orthonormal basis (v2, ··· , vn) of eigenvectors of α. Now let

v1 = v / ‖v‖.

Then (v1, v2, ··· , vn) is an orthonormal basis of eigenvectors for α.
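As a numerical aside (not from the notes), this is exactly what numpy.linalg.eigh computes for a real symmetric (or complex Hermitian) matrix: an orthonormal basis of eigenvectors. The matrix below is a hypothetical example.

    import numpy as np

    rng = np.random.default_rng(5)
    X = rng.standard_normal((4, 4))
    A = (X + X.T) / 2                                 # a random symmetric matrix

    evals, P = np.linalg.eigh(A)                      # columns of P: orthonormal eigenvectors of A
    assert np.allclose(P.T @ P, np.eye(4))            # P is orthogonal
    assert np.allclose(P.T @ A @ P, np.diag(evals))   # P^T A P is diagonal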


Corollary. Let V be a finite-dimensional inner product space and α self-adjoint. Then V is the orthogonal (internal) direct sum of its α-eigenspaces.

Corollary. Let A ∈ Matn(R) be symmetric. Then there exists an orthogonal matrix P such that P^T A P = P^{−1} A P is diagonal.

Proof. Let ( · , · ) be the standard inner product on R^n. Then A is self-adjoint as an endomorphism of R^n. So R^n has an orthonormal basis of eigenvectors for A, say (v1, ··· , vn). Taking P = (v1 v2 ··· vn) gives the result.

Corollary. Let V be a finite-dimensional real inner product space and ψ : V × V → R a symmetric bilinear form. Then there exists an orthonormal basis (v1, ··· , vn) for V with respect to which ψ is represented by a diagonal matrix.

Proof. Let (u1, ··· , un) be any orthonormal basis for V. Then ψ is represented by a symmetric matrix A. Then there exists an orthogonal matrix P such that P^T A P is diagonal. Now let vi = ∑_k Pki uk. Then (v1, ··· , vn) is an orthonormal basis since

(vi, vj) = (∑_k Pki uk, ∑_ℓ Pℓj uℓ)
         = ∑_{k,ℓ} Pki (uk, uℓ) Pℓj
         = [P^T P]ij
         = δij.

Also, ψ is represented by P^T A P with respect to (v1, ··· , vn).

Corollary. Let V be a finite-dimensional real vector space and φ, ψ symmetric bilinear forms on V such that φ is positive definite. Then we can find a basis (v1, ··· , vn) for V such that both φ and ψ are represented by diagonal matrices with respect to this basis.

Proof. We use φ to define an inner product. Choose an orthonormal basis (v1, ··· , vn) for V (equipped with φ) with respect to which ψ is diagonal. Then φ is represented by I with respect to this basis, since φ(vi, vj) = δij. So done.

Corollary. Let A, B ∈ Matn(R) be symmetric with A positive definite (i.e. v^T A v > 0 for all v ∈ R^n \ {0}). Then there exists an invertible matrix Q such that Q^T A Q and Q^T B Q are both diagonal.
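One standard way to produce such a Q numerically (an illustrative sketch, not the construction in the notes) is to take a Cholesky factorization A = L L^T of the positive definite matrix and then orthogonally diagonalise L^{−1} B L^{−T}. The matrices below are hypothetical.

    import numpy as np

    rng = np.random.default_rng(6)
    X = rng.standard_normal((3, 3))
    A = X @ X.T + 3 * np.eye(3)                    # symmetric positive definite
    Y = rng.standard_normal((3, 3))
    B = (Y + Y.T) / 2                              # symmetric

    L = np.linalg.cholesky(A)                      # A = L L^T
    Li = np.linalg.inv(L)
    d, U = np.linalg.eigh(Li @ B @ Li.T)           # orthogonal diagonalisation of L^{-1} B L^{-T}
    Q = Li.T @ U                                   # Q = L^{-T} U is invertible

    assert np.allclose(Q.T @ A @ Q, np.eye(3))     # Q^T A Q = I (diagonal)
    assert np.allclose(Q.T @ B @ Q, np.diag(d))    # Q^T B Q is diagonal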

Proposition.

(i) If A ∈ Matn(C) is Hermitian, then there exists a unitary matrix U ∈ Matn(C) such that U^{−1} A U = U† A U is diagonal.

(ii) If ψ is a Hermitian form on a finite-dimensional complex inner product space V, then there is an orthonormal basis for V diagonalizing ψ.

(iii) If φ, ψ are Hermitian forms on a finite-dimensional complex vector space and φ is positive definite, then there exists a basis for which φ and ψ are diagonalized.


(iv) Let A, B ∈ Matn(C) be Hermitian, with A positive definite (i.e. v† A v > 0 for all v ∈ C^n \ {0}). Then there exists some invertible Q such that Q† A Q and Q† B Q are diagonal.

Theorem. Let V be a finite-dimensional complex inner product space and α ∈ U(V) unitary. Then V has an orthonormal basis of eigenvectors of α.

Proof. By the fundamental theorem of algebra, there exists v ∈ V \ {0} and λ ∈ C such that αv = λv. Now consider W = ⟨v⟩^⊥. Then

V = W ⊥ ⟨v⟩.

We want to show α restricts to a (unitary) endomorphism of W . Let w ∈ W . We need to show α(w) is orthogonal to v. We have

(αw, v) = (w, α^{−1}v) = (w, λ^{−1}v) = 0.

So α(w) ∈ W and α|W ∈ End(W). Also, α|W is unitary since α is. So by induction on dim V, W has an orthonormal basis of eigenvectors of α. Adding v/‖v‖ to this basis, we get an orthonormal basis of V itself consisting of eigenvectors of α.
