
Abstract Vector Spaces, Linear Transformations, and Their Coordinate Representations

Contents

1 Vector Spaces
  1.1 Definitions
    1.1.1 Basics
    1.1.2 Linear Combinations, Spans, Bases, and Dimensions
    1.1.3 Sums and Products of Vector Spaces and Subspaces
  1.2 Basic Vector Space Theory

2 Linear Transformations
  2.1 Definitions
    2.1.1 Linear Transformations, or Vector Space Homomorphisms
  2.2 Basic Properties of Linear Transformations
  2.3 Monomorphisms and Isomorphisms
  2.4 Rank-Nullity Theorem

3 Representations of Linear Transformations
  3.1 Definitions
  3.2 Matrix Representations of Linear Transformations

4 Change of Coordinates
  4.1 Definitions
  4.2 Change of Coordinate Maps and Matrices

1 Vector Spaces

1.1 Definitions

1.1.1 Basics

A vector space (linear space) V over a field F is a set V on which two operations, vector addition, + : V × V → V, and left F-action or scalar multiplication, · : F × V → V, satisfy: for all x, y, z ∈ V and a, b ∈ F (with 1 the multiplicative identity of F)

VS1  x + y = y + x
VS2  (x + y) + z = x + (y + z)
VS3  ∃0 ∈ V such that x + 0 = x
VS4  ∀x ∈ V, ∃y ∈ V such that x + y = 0
     (VS1–VS4 say that (V, +) is an abelian group)
VS5  1x = x
VS6  (ab)x = a(bx)
VS7  a(x + y) = ax + ay
VS8  (a + b)x = ax + bx

If V is a vector space over F, then a subset W ⊆ V is called a subspace of V if W is a vector space over the same field F with addition and scalar multiplication given by the restrictions +|W×W and ·|F×W.

1.1.2 Linear Combinations, Spans, Bases, and Dimensions

If ∅ ≠ S ⊆ V, v1, . . . , vn ∈ S and a1, . . . , an ∈ F, then a linear combination of v1, . . . , vn is the finite sum

a1v1 + ··· + anvn (1.1)

which is a vector in V. The ai ∈ F are called the coefficients of the linear combination. If a1 = ··· = an = 0, then the linear combination is said to be trivial. In particular, considering the special case of 0 ∈ V, the zero vector, we note that 0 may always be represented as a linear combination of any vectors u1, . . . , un ∈ V,

0u1 + ··· + 0un = 0

where the coefficients are 0 ∈ F and the sum on the right is the zero vector 0 ∈ V.

This representation is called the trivial representation of 0 by u1, . . . , un. If, on the other hand, there are vectors u1, . . . , un ∈ V and scalars a1, . . . , an ∈ F such that

a1u1 + ··· + anun = 0, where at least one ai ≠ 0, then that linear combination is called a nontrivial representation of 0. Using linear combinations we can generate subspaces, as follows. If S ≠ ∅ is a subset of V, then the span of S is given by

span(S) := {v ∈ V | v is a linear combination of vectors in S} (1.2)

The span of ∅ is by definition

span(∅) := {0} (1.3)

If span(S) = V, then S is said to generate or span V, and to be a generating set or a spanning set for V. A nonempty subset S of a vector space V is said to be linearly independent if, taking any finite number of distinct vectors u1, . . . , un ∈ S, we have for all a1, . . . , an ∈ F that

a1u1 + a2u2 + ··· + anun = 0 =⇒ a1 = ··· = an = 0

That is S is linearly independent if the only representation of 0 ∈ V by vectors in S is the trivial one. In this case, the vectors u1, . . . , un themselves are also said to be linearly independent. Otherwise, if there is at least one nontrivial representation of 0 by vectors in S, then S is said to be linearly dependent.
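For example, in R² the set {(1, 0), (0, 1), (1, 1)} is linearly dependent, since 1(1, 0) + 1(0, 1) + (−1)(1, 1) = (0, 0) is a nontrivial representation of 0, while the subset {(1, 0), (0, 1)} is linearly independent.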

Given ∅ ≠ S ⊆ V, a nonzero vector v ∈ S is said to be an essentially unique linear combination of the vectors in S if, up to order of terms, there is one and only one way to express v as a linear combination of vectors in S. That is, if there are nonzero scalars a1, . . . , an, b1, . . . , bm ∈ F \ {0}, distinct u1, . . . , un ∈ S and distinct v1, . . . , vm ∈ S such that

v = a1u1 + ··· + anun = b1v1 + ··· + bmvm

then m = n and, after re-indexing the bi if necessary, ai = bi and ui = vi for all i = 1, . . . , n.

A subset β ⊆ V is called a (Hamel) basis if it is linearly independent and span(β) = V. We also say that the vectors of β form a basis for V. Equivalently, as explained in Theorem 1.13 below, β is a basis if every nonzero vector v ∈ V is an essentially unique linear combination of vectors in β. In the context of inner product spaces of infinite dimension, there is a difference between a vector space basis (the Hamel basis of V) and an orthonormal basis for V (the Hilbert basis for V): though both always exist, they need not be equal unless dim(V) < ∞.

The dimension of a vector space V is the cardinality of any basis for V, and is denoted dim(V). V is called finite-dimensional if it is the zero vector space {0} or if it has a basis of finite cardinality. Otherwise, if its bases have infinite cardinality, it is called infinite-dimensional. In the former case, dim(V) = |β| = n < ∞ for some n ∈ N, and V is said to be n-dimensional, while in the latter case, dim(V) = |β| = κ, where κ is a cardinal number, and V is said to be κ-dimensional. If V is finite-dimensional, say of dimension n, then an ordered basis for V is a finite sequence or n-tuple (v1, . . . , vn) of linearly independent vectors v1, . . . , vn ∈ V such that {v1, . . . , vn} is a basis for V. If V is infinite-dimensional but with a countable basis, then an ordered basis is a sequence

(vn)n∈N such that the set {vn | n ∈ N} is a basis for V. The term standard ordered basis is applied only to the ordered bases for F^n, (F^ω)_0, (F^B)_0 and F[x] and its subspaces F_n[x]. For F^n, the standard ordered basis is given by

(e1, e2, . . . , en), where e1 = (1, 0, . . . , 0), e2 = (0, 1, 0, . . . , 0), . . . , en = (0, . . . , 0, 1). Since F is any field, 1 represents the multiplicative identity and 0 the additive identity. For F = R, the standard ordered basis is just as above with the usual 0 and 1. If F = C, then 1 = (1, 0) and 0 = (0, 0), so the ith standard basis vector for C^n looks like this:

ei = ((0, 0), . . . , (1, 0), . . . , (0, 0)), with (1, 0) in the ith position.

For (F^ω)_0 the standard ordered basis is given by

(en)n∈N, where en(k) = δnk

and generally for (F^B)_0 the standard ordered basis is given by

(eb)b∈B, where eb(x) = 1 if x = b and eb(x) = 0 if x ≠ b.

When the term standard ordered basis is applied to the vector space F_n[x], the subspace of F[x] of polynomials of degree less than or equal to n, it refers to the basis

(1, x, x2, . . . , xn)

For F [x], the standard ordered basis is given by

(1, x, x2,... )

1.1.3 Sums and Products of Vector Spaces and Subspaces

If S1, . . . , Sk ∈ P(V) are subsets or subspaces of a vector space V, then their (internal) sum is given by

S1 + ··· + Sk = ∑_{i=1}^{k} Si := {v1 + ··· + vk | vi ∈ Si for 1 ≤ i ≤ k} (1.4)

More generally, if {Si | i ∈ I} ⊆ P(V) is any collection of subsets or subspaces of V, then their sum is defined to be

∑_{i∈I} Si := { v1 + ··· + vn | vj ∈ ⋃_{i∈I} Si and n ∈ N } (1.5)

If S is a subspace of V and v ∈ V, then the sum {v} + S = {v + s | s ∈ S} is called a coset of S and is usually denoted

v + S (1.6)

We are of course interested in the (internal) sum of subspaces of a vector space. In this case, we may have the special condition

V = ∑_{i=1}^{k} Si and Sj ∩ ∑_{i≠j} Si = {0}, ∀j = 1, . . . , k (1.7)

When this happens, V is said to be the (internal) direct sum of S1, . . . , Sk, and this is symbolically denoted

V = S1 ⊕ S2 ⊕ ··· ⊕ Sk = ⊕_{i=1}^{k} Si = {s1 + ··· + sk | si ∈ Si} (1.8)

In the infinite-dimensional case we proceed similarly. If F = {Si | i ∈ I} is a family of subspaces of V such that

V = ∑_{i∈I} Si and Sj ∩ ∑_{i≠j} Si = {0}, ∀j ∈ I (1.9)

then V is the direct sum of the Si ∈ F, denoted

V = ⊕F = ⊕_{i∈I} Si = { s_{i_1} + ··· + s_{i_n} | s_{i_j} ∈ S_{i_j}, the i_j ∈ I distinct, n ∈ N } (1.10)
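For example, R³ = S ⊕ T for the subspaces S = {(x, y, 0) | x, y ∈ R} and T = {(0, 0, z) | z ∈ R}: every vector (x, y, z) is the sum (x, y, 0) + (0, 0, z), so R³ = S + T, and S ∩ T = {0}.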

We can also define the (external) sum of distinct vector spaces which do not lie inside a larger vector space: if V1, . . . , Vn are vector spaces over the same field F, then their external direct sum is the cartesian product V1 × ··· × Vn, with addition and scalar multiplication defined componentwise. And we denote the sum, confusingly, by the same notation:

V1 ⊕ V2 ⊕ · · · ⊕ Vn := {(v1, v2, . . . , vn) | vi ∈ Vi, i = 1, . . . , n} (1.11)

In the infinite-dimensional case, we have two types of external direct sum, one where there is no restriction on the sequences, the other where we only allow sequences with finite support. To distinguish these two cases we use the terms direct product and external direct sum, respectively: let F = {Vi | i ∈ I} be a family of vector spaces over the same field F. Then their direct product is given by

∏_{i∈I} Vi := { v = (vi)_{i∈I} | vi ∈ Vi } (1.12)
           = { f : I → ⋃_{i∈I} Vi | f(i) ∈ Vi } (1.13)

The infinite-dimensional external direct sum is then defined:

⊕_{i∈I} Vi := { v = (vi)_{i∈I} | vi ∈ Vi and v has finite support } (1.14)
           = { f : I → ⋃_{i∈I} Vi | f(i) ∈ Vi and f has finite support } (1.15)

When all the Vi are equal, there is some special notation:

V^n := V ⊕ ··· ⊕ V (finite-dimensional) (1.16)
V^I := ∏_{i∈I} V (infinite-dimensional) (1.17)
(V^I)_0 := ⊕_{i∈I} V (1.18)

Example 1.1 Common examples of vector spaces are the sequence spaces F^n, F^ω, F^B, (F^ω)_0 and (F^B)_0 in a field F over that field, i.e. the various types of cartesian products of F equipped with addition and scalar multiplication operations defined componentwise (here ω = N, B is any set, and (F^ω)_0 and (F^B)_0 denote all functions f : ω → F, respectively f : B → F, with finite support). Some typical ones are

Q^n  R^n  C^n
Q^ω  R^ω  C^ω
Q^B  R^B  C^B
(Q^ω)_0  (R^ω)_0  (C^ω)_0
(Q^B)_0  (R^B)_0  (C^B)_0

each over Q or R or C, respectively. □

1.2 Basic Vector Space Theory

Theorem 1.2 If V is a vector space over a field F, then for all x, y, z ∈ V and a, b ∈ F the following hold:
(1) x + y = z + y =⇒ x = z (Cancellation Law)
(2) 0 ∈ V is unique.
(3) x + y = x + z = 0 =⇒ y = z (Uniqueness of the Additive Inverse)
(4) 0x = 0
(5) (−a)x = −(ax) = a(−x)

(6) The finite external direct sum V1 ⊕ · · · ⊕ Vn is a vector space over F if each Vi is a vector space over F and if we define + and · componentwise by

(v1, . . . , vn) + (u1, . . . , un) := (v1 + u1, . . . , vn + un)

c(v1, . . . , vn) := (cv1, . . . , cvn)

More generally, if F = {Vi | i ∈ I} is a family of vector spaces over the same field F, then the direct product ∏_{i∈I} Vi and external direct sum ⊕_{i∈I} Vi = (V^I)_0 are vector spaces over F, with + and · given pointwise by

v + w = (f : I → ⋃_{i∈I} Vi) + (g : I → ⋃_{i∈I} Vi) := f + g
cv = c(f : I → ⋃_{i∈I} Vi) := cf

Proof: (1) x = x + 0 = x + (y + (−y)) = (x + y) + (−y) = (z + y) + (−y) = z + (y + (−y)) = z + 0 = z.
(2) If 0₁ and 0₂ are both additive identities, then 0₁ = 0₁ + 0₂ = 0₂ + 0₁ = 0₂.
(3) (a) x + y1 = x + y2 = 0 =⇒ y1 = y2 by the Cancellation Law. (b) Alternatively, y1 = y1 + 0 = y1 + (x + y2) = (y1 + x) + y2 = 0 + y2 = y2.
(4) 0x + 0x = (0 + 0)x = 0x = 0x + 0_V =⇒ 0x = 0_V by the Cancellation Law.
(5) ax + (−(ax)) = 0, and by (3) −(ax) is unique. But note ax + (−a)x = (a + (−a))x = 0x = 0, hence (−a)x = −(ax); also a(−x) = a((−1)x) = (a(−1))x = (−a)x, which we already know equals −(ax).
(6) All eight vector space properties are verified componentwise, since that is how addition and scalar multiplication are defined; the components are vectors in a vector space, so all eight properties are satisfied. □

Theorem 1.3 Let V be a vector space over F. A nonempty subset S of V is a subspace iff for all a, b ∈ F and x, y ∈ S we have one of the following equivalent conditions:

(1) ax + by ∈ S
(2) x + y ∈ S and ax ∈ S
(3) x + ay ∈ S

Proof: If S satisfies any of the above conditions, then it is a subspace because vector space properties VS1–2 and VS5–8 follow from the fact that S ⊆ V, while VS3 follows because we can choose a = b = 0 and 0 ∈ V is unique (Theorem 1.2), and VS4 follows from VS3, in the first case by choosing b = −a and x = y, in the second case by choosing y = (−1)x, and in the third case by choosing a = −1 and y = x. Conversely, if S is a subspace, then (1)–(3) follow by definition. □
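For example, W = {(x, 2x) | x ∈ R} is a subspace of R²: if a, b ∈ R and (x, 2x), (y, 2y) ∈ W, then a(x, 2x) + b(y, 2y) = (ax + by, 2(ax + by)) ∈ W, so condition (1) holds. By contrast, the line {(x, 2x + 1) | x ∈ R} is not a subspace, since it does not contain 0.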

Theorem 1.4 Let V be a vector space. If C = {Si | i ∈ K} is a collection of subspaces of V, then

∑_{i∈K} Si,  ⊕_{i∈K} Si  and  ∩_{i∈K} Si (1.19)

are subspaces.

Proof: (1) If C = {Si | i ∈ K} is a collection of subspaces, then none of the Si are empty by the previous theorem, so if a, b ∈ F and v, w ∈ ∑_{i∈K} Si, then v = s_{i_1} + ··· + s_{i_n} and w = s_{j_1} + ··· + s_{j_m}, where s_{i_k} ∈ S_{i_k} and s_{j_k} ∈ S_{j_k} for all i_k, j_k ∈ K. Consequently

av + bw = a(s_{i_1} + ··· + s_{i_n}) + b(s_{j_1} + ··· + s_{j_m}) = as_{i_1} + ··· + as_{i_n} + bs_{j_1} + ··· + bs_{j_m} ∈ ∑_{i∈K} Si

because as_{i_k} ∈ S_{i_k} and bs_{j_k} ∈ S_{j_k} for all i_k, j_k ∈ K, the Si being subspaces. Hence by the previous theorem ∑_{i∈K} Si is a subspace of V. The direct sum case is a subcase of this general case.
(2) If a, b ∈ F and u, v ∈ ∩_{i∈K} Si, then u, v ∈ Si for all i ∈ K, whence au + bv ∈ Si for all i ∈ K, that is au + bv ∈ ∩_{i∈K} Si, and ∩_{i∈K} Si is a subspace by the previous theorem. □

Theorem 1.5 If S and T are subspaces of a vector space V, then S ∪ T is a subspace iff S ⊆ T or T ⊆ S. Proof: If S ∪ T is a subspace, then for all a, b ∈ F and x, y ∈ S ∪ T we have ax + by ∈ S ∪ T. Suppose neither S ⊆ T nor T ⊆ S holds, and take x ∈ S\T and y ∈ T\S. Then x + y is neither in S nor in T (for if it were in S, say, then y = (x + y) + (−x) ∈ S, a contradiction), contradicting the fact that S ∪ T is a subspace. Conversely, suppose, say, S ⊆ T, and choose any a, b ∈ F and x, y ∈ S ∪ T. If x, y ∈ S, then ax + by ∈ S ⊆ S ∪ T, while if x, y ∈ T, then ax + by ∈ T ⊆ S ∪ T. If x ∈ S and y ∈ T, then x ∈ T as well and ax + by ∈ T ⊆ S ∪ T. Thus in all cases ax + by ∈ S ∪ T, and S ∪ T is a subspace. □

Theorem 1.6 If V is a vector space, then for all S ⊆ V , span(S) is a subspace of V . Moreover, if T is a subspace of V and S ⊆ T , then span(S) ⊆ T .

Proof: If S = ∅, span(∅) = {0}, which is a subspace of V; and if T is a subspace, then ∅ ⊆ T and {0} ⊆ T. If S ≠ ∅, then any x, y ∈ span(S) have the form x = a1v1 + ··· + anvn and y = b1w1 + ··· + bmwm for some a1, . . . , an, b1, . . . , bm ∈ F and v1, . . . , vn, w1, . . . , wm ∈ S. Then, if a, b ∈ F we have

ax + by = a(a1v1 + ··· + anvn) + b(b1w1 + ··· + bmwm)

= (aa1)v1 + ··· + (aan)vn + (bb1)w1 + ··· + (bbm)wm ∈ span(S)

so span(S) is a subspace of V. Now, suppose T is a subspace of V and S ⊆ T. If w ∈ span(S), then w = a1w1 + ··· + anwn for some w1, . . . , wn ∈ S and a1, . . . , an ∈ F. Since S ⊆ T, we have w1, . . . , wn ∈ T, whence w = a1w1 + ··· + anwn ∈ T. □

Corollary 1.7 If V is a vector space, then S ⊆ V is a subspace of V iff span(S) = S. Proof: If S is a subspace of V , then by the theorem S ⊆ S =⇒ span(S) ⊆ S. But if s ∈ S, then s = 1s ∈ span(S), so S ⊆ span(S), and therefore S = span(S). Conversely, if S = span(S), then S is a subspace of V by the theorem. 

Corollary 1.8 If V is a vector space and S, T ⊆ V, then the following hold:
(1) S ⊆ T ⊆ V =⇒ span(S) ⊆ span(T)
(2) S ⊆ T ⊆ V and span(S) = V =⇒ span(T) = V
(3) span(S ∪ T) = span(S) + span(T)
(4) span(S ∩ T) ⊆ span(S) ∩ span(T)
Proof: (1) and (2) are immediate, so we only need to prove (3) and (4):

(3) If v ∈ span(S ∪ T ), then ∃v1, . . . , vm ∈ S, u1, . . . , un ∈ T and a1, . . . , am, b1, . . . , bn ∈ F such that v = v1a1 + ··· + vmam + b1u1 + ··· + bnun

Note, however, that v1a1 + ··· + vmam ∈ span(S) and b1u1 + ··· + bnun ∈ span(T), so that v ∈ span(S) + span(T). Thus span(S ∪ T) ⊆ span(S) + span(T). Conversely, if v = s + t ∈ span(S) + span(T), then s ∈ span(S) and t ∈ span(T), so that by (1), since S ⊆ S ∪ T and T ⊆ S ∪ T, we must have span(S) ⊆ span(S ∪ T) and span(T) ⊆ span(S ∪ T). Consequently, s, t ∈ span(S ∪ T), and since span(S ∪ T) is a subspace, v = s + t ∈ span(S ∪ T). That is, span(S) + span(T) ⊆ span(S ∪ T). Thus, span(S) + span(T) = span(S ∪ T).
(4) By Theorem 1.6, span(S ∩ T), span(S) and span(T) are subspaces, and by Theorem 1.4 span(S) ∩ span(T) is a subspace. Now, consider x ∈ span(S ∩ T). There exist vectors v1, . . . , vn ∈ S ∩ T and scalars a1, . . . , an ∈ F such that x = a1v1 + ··· + anvn. But since v1, . . . , vn belong to both S and T, x ∈ span(S) and x ∈ span(T), so that x ∈ span(S) ∩ span(T). It is not in general true, however, that span(S) ∩ span(T) ⊆ span(S ∩ T). For example, if S = {e1, e2} ⊆ R² and T = {e1, (1, 1)}, then span(S) ∩ span(T) = R² ∩ R² = R², but span(S ∩ T) = span({e1}) = {(x, 0) | x ∈ R}, and R² ⊄ span({e1}). □

Theorem 1.9 If V is a vector space and S ⊆ T ⊆ V , then (1) S linearly dependent =⇒ T linearly dependent (2) T linearly independent =⇒ S linearly independent

Proof: (1) If S is linearly dependent, there are scalars a1, . . . , am ∈ F , not all zero, and vectors v1, . . . , vm ∈ S such that a1v1 + ··· + amvm = 0. If S ⊆ T , then v1, . . . , vm are also in T , and have the same property, namely a1v1 + ··· + amvm = 0. Thus, T is linearly dependent. (2) If T is linearly independent, then if v1, . . . , vn ∈ S, a1, . . . , an ∈ F and a1v1 + ··· + anvn = 0, since S ⊆ T implies v1, . . . , vn ∈ T , we must have a1 = a2 = ··· = an = 0, and S is linearly independent. 

Theorem 1.10 If V is a vector space, then S ⊆ V is linearly dependent iff S = {0} or there exist distinct vectors v, u1, . . . , un ∈ S such that v is a linear combination of u1, . . . , un. Proof: If S is linearly dependent, then S consists of at least one element by definition. Suppose S has only one element, v. Then, either v = 0 or v 6= 0. If v 6= 0, consider a ∈ F . Then, av = 0 =⇒ a = 0, which means S is linearly independent, contrary to assumption. Therefore, v = 0, and so S = {0}. If S contains more than one element and v ∈ S, then since S is linearly dependent, there exist distinct u1, . . . , un ∈ S and a1, . . . , an ∈ F , not all zero such that

a1u1 + ··· + anun = 0

If v = 0, we’re done. If v 6= 0, then, since {u1, . . . , un} is linearly dependent, so is {u1, . . . , un, v}, by Theorem 1.9, so that there are b1, . . . , bn+1 ∈ F , not all zero, such that

b1u1 + ··· + bnun + bn+1v = 0

Choose b_{n+1} ≠ 0, for if we let b_{n+1} = 0 we have gained nothing. Then

v = (−b1/b_{n+1})u1 + ··· + (−bn/b_{n+1})un

and v is a linear combination of distinct vectors in S. Conversely, if S = {0}, clearly S is linearly dependent, for we can choose a, b ∈ F both nonzero and have a0 + b0 = 0. Otherwise, if there are distinct u1, . . . , un, v ∈ S and a1, . . . , an ∈ F such that v = a1u1 + ··· + anun, then obviously 0 = −1v + a1u1 + ··· + anun, and since −1 ≠ 0, S is linearly dependent. □

Corollary 1.11 If S = {u1, . . . , un} ⊆ V , then S is linearly dependent iff u1 = 0 or for some 1 ≤ k < n we have uk+1 ∈ span(u1, . . . , uk). 

Theorem 1.12 If V is a vector space and u, v ∈ V are distinct, then {u, v} is linearly dependent iff u or v is a multiple of the other. Proof: If {u, v} is linearly dependent, there are scalars a, b ∈ F not both zero such that au + bv = 0. If a = 0, then b ≠ 0, and so au + bv = bv = 0 =⇒ v = 0 =⇒ u ≠ 0, and so v = 0u. If neither a nor b is equal to zero, then au + bv = 0 =⇒ u = −a⁻¹bv. Conversely, suppose, say, u = av. If a = 0, then u = 0 and v can be any vector other than u, since by hypothesis u ≠ v, whence au + 0v = 0 for any a ∈ F, including nonzero a, so that {u, v} is linearly dependent. Otherwise, if a ≠ 0, we have u = av =⇒ 1u + (−a)v = 0, and both 1 ≠ 0 and −a ≠ 0, so that {u, v} is linearly dependent. □

Theorem 1.13 (Properties of a Basis) If V is a vector space, then the following are equivalent, and any S ⊆ V satisfying any of them is a basis:

(1) S is linearly independent and V = span(S).
(2) (Essentially Unique Representation) Every nonzero vector v ∈ V is an essentially unique linear combination of vectors in S.
(3) V = span(S) and no vector in S is a linear combination of the other vectors in S.
(4) S is a minimal spanning set, that is, V = span(S) but no proper subset of S spans V.
(5) S is a maximal linearly independent set, that is, S is linearly independent but any proper superset is not linearly independent.
Proof: (1) =⇒ (2): If S is linearly independent and v ∈ V is a nonzero vector such that, for distinct si and ti in S,

v = a1s1 + ··· + ansn = b1t1 + ··· + bmtm

then, grouping any s_{i_j} and t_{i_j} that are equal, we have

0 = (a_{i_1} − b_{i_1})s_{i_1} + ··· + (a_{i_k} − b_{i_k})s_{i_k} + a_{i_{k+1}}s_{i_{k+1}} + ··· + a_{i_n}s_{i_n} − b_{i_{k+1}}t_{i_{k+1}} − ··· − b_{i_m}t_{i_m}

which by the linear independence of S implies a_{i_{k+1}} = ··· = a_{i_n} = b_{i_{k+1}} = ··· = b_{i_m} = 0 and a_{i_j} = b_{i_j} for j = 1, . . . , k. Since all the coefficients are nonzero, n = m = k, whence v is essentially uniquely represented as a linear combination of vectors in S.

(1) =⇒ (3): The contrapositive of Theorem 1.10.

(3) =⇒ (1): If (3) holds and a1s1 + ··· + ansn = 0, where the si are distinct and a1 ≠ 0, then n > 1 and s1 = −(1/a1)(a2s2 + ··· + ansn), which violates (3). Thus (3) implies (1).
(1) ⇐⇒ (4): If S is a linearly independent spanning set and T is a proper subset of S that also spans V, then any vector in S\T would have to be a linear combination of the vectors in T, violating (3) for S ((3) is equivalent to (1), as we just showed). Thus S is a minimal spanning set. Conversely, if S is a minimal spanning set, then it must be linearly independent, for otherwise there would be some s ∈ S that is a linear combination of other vectors in S, which would mean S\{s} also spans V, contradicting minimality. Thus (4) implies (1).
(1) ⇐⇒ (5): If (1) holds, so S is linearly independent and spans V, but S is not maximal, then ∃v ∈ V\S such that S ∪ {v} is also linearly independent. But then v ∉ span(S) = V, a contradiction. Therefore S is maximal and (1) implies (5). Conversely, if S is a maximal linearly independent set, then S must span V, for otherwise there is a v ∈ V\S that is not a linear combination of vectors in S, implying S ∪ {v} is a linearly independent proper superset of S, violating maximality. Therefore span(S) = V, and (5) implies (1). □

Corollary 1.14 If V is a vector space, then S = {v1, . . . , vn} ⊆ V is a basis for V iff

V = span(v1) ⊕ span(v2) ⊕ · · · ⊕ span(vn) (1.20)



The previous theorem did not say that any such basis exists for a given vector space. It merely gave the defining characteristics of a basis, should we ever come across one. The next theorem assures us that all vector spaces have a basis – a consequence of Zorn's lemma.

Theorem 1.15 (Existence of a Basis) If V is a nonzero vector space and I ⊆ S ⊆ V with I linearly independent and V = span(S), then ∃B basis for V for which I ⊆ B ⊆ S (1.21) More specifically, (1) Any nonzero vector space has a basis. (2) Any linearly independent set in V is contained in a basis. (3) Any spanning set in V contains a basis. Proof: Let A ⊆ P (V ) be the set of all linearly independent sets L such that I ⊆ L ⊆ S. Then A is non-empty because I ⊆ I ⊆ S, so I ∈ A. Now, if

C = {Ik | k ∈ K} ⊆ A is a chain in A, that is, a totally ordered (under set inclusion) subset of A, then the union

U = ⋃C = ⋃_{k∈K} Ik

is linearly independent and satisfies I ⊆ U ⊆ S, that is, U ∈ A. Thus every chain in A has an upper bound in A, so by Zorn's lemma A has a maximal element B, which is linearly independent. But such a B is a basis for V = span(S), for if any s ∈ S is not a linear combination of elements in B, then B ∪ {s} is linearly independent and B ⊊ B ∪ {s} ⊆ S, contradicting the maximality of B. Therefore S ⊆ span(B), and so V = span(S) ⊆ span(B) ⊆ V, or V = span(B). This shows that (1) there is a basis B for V, (2) any linearly independent set I satisfies I ⊆ B for this B, and (3) any spanning set S satisfies B ⊆ S for this B. □

Corollary 1.16 (All Subspaces Have Complements) If S is a subspace of a vector space V, then there exists a subspace T of V such that

V = S ⊕ T (1.22)

Proof: If S = V, then let T = {0}. Otherwise, let B_S be a basis for S and extend it, using Theorem 1.15, to a basis B for V with B_S ⊆ B, and let T = span(B \ B_S). Then by Corollary 1.8

V = span(B) = span(B_S ∪ (B \ B_S)) = span(B_S) + span(B \ B_S) = S + T

But S ∩ T = {0} because of the essentially unique representation of vectors in V as linear combinations of vectors in B. Hence, V = S ⊕ T, and T is a complement of S. □
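For example, in R² the subspace S = span({e1}) has T = span({e2}) as a complement, but T′ = span({(1, 1)}) is also a complement of S, so complements are in general not unique.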

Theorem 1.17 (Replacement Theorem) If V is a vector space such that V = span(S) for some S ⊆ V with |S| = n, and if B ⊆ V is linearly independent with |B| = m, then (1) m ≤ n (2) ∃C ⊆ S with |C| = n − m such that V = span(B ∪ C)

Proof: The proof is by induction on m, beginning with m = 0. For this m, B = ∅, so taking C = S, we have B ∪ C = ∅ ∪ S = S, which generates V. Now suppose the theorem holds for some m ≥ 0. For m + 1, let B = {v1, . . . , v_{m+1}} ⊆ V be linearly independent. By Theorem 1.9, B′ = {v1, . . . , vm} is linearly independent, and so by the induction hypothesis m ≤ n and ∃C′ = {u1, . . . , u_{n−m}} ⊆ S such that B′ ∪ C′ = {v1, . . . , vm} ∪ C′ generates V. This means ∃a1, . . . , am, b1, . . . , b_{n−m} ∈ F such that

v_{m+1} = a1v1 + ··· + amvm + b1u1 + ··· + b_{n−m}u_{n−m} (1.23)

Note that if n = m, then C′ = ∅, so that v_{m+1} ∈ span(B′), contradicting the assumption that B is linearly independent. Therefore n > m, or n ≥ m + 1. Moreover, some bi, say b1, is nonzero, for otherwise we would have v_{m+1} = a1v1 + ··· + amvm, again making B linearly dependent, in contradiction to assumption. Solving (1.23) for u1, we get

u1 = (−b1⁻¹a1)v1 + ··· + (−b1⁻¹am)vm + b1⁻¹v_{m+1} + (−b1⁻¹b2)u2 + ··· + (−b1⁻¹b_{n−m})u_{n−m}

Let C = {u2, . . . , u_{n−m}}. Then u1 ∈ span(B ∪ C), and since v1, . . . , vm, u2, . . . , u_{n−m} ∈ B ∪ C, they are also in span(B ∪ C), whence

B′ ∪ C′ ⊆ span(B ∪ C)

Now, since span(B′ ∪ C′) = V by the induction hypothesis, we have by Corollary 1.8 that span(B ∪ C) = V. So B is linearly independent with |B| = m + 1 vectors, m + 1 ≤ n, span(B ∪ C) = V, and C ⊆ S with |C| = (n − m) − 1 = n − (m + 1) vectors, so the theorem holds for m + 1. Therefore the theorem is true for all m ∈ N by induction. □
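For example, in R³ the set S = {e1, e2, e3} spans V = R³ and B = {(1, 1, 0)} is linearly independent; taking C = {e2, e3} ⊆ S, with |C| = 3 − 1 = 2, gives span(B ∪ C) = R³, since e1 = (1, 1, 0) − e2.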

Corollary 1.18 If V is a vector space with a finite spanning set, then every basis for V contains the same number of vectors. Proof: Suppose S is a finite spanning set for V , and let β and γ be two bases for V with |γ| > |β| = k, so that some subset T ⊆ γ contains exactly k + 1 vectors. Since T is linearly independent and β generates V , the replacement theorem implies that k + 1 ≤ k, a contradiction. Therefore γ is finite and |γ| ≤ k. Reversing the roles of β and γ shows that k ≤ |γ|, so that |γ| = |β| = k. 

Corollary 1.19 If V is a finite-dimensional vector space with dim(V) = n, then the following hold:
(1) Any finite spanning set for V contains at least n vectors, and a spanning set for V that contains exactly n vectors is a basis for V.
(2) Any linearly independent subset of V that contains exactly n vectors is a basis for V.
(3) Every linearly independent subset of V can be extended to a basis for V.
Proof: Let B be a basis for V.
(1) Let S ⊆ V be a finite spanning set for V. By Theorem 1.15, ∃B′ ⊆ S that is a basis for V, and by Corollary 1.18 |B′| = n. Therefore |S| ≥ n, and if |S| = n then S = B′, so that S is a basis for V.
(2) Let I ⊆ V be linearly independent with |I| = n. By the Replacement Theorem ∃T ⊆ B with |T| = n − n = 0 vectors such that V = span(I ∪ T) = span(I ∪ ∅) = span(I). Since I is also linearly independent, it is a basis for V.
(3) If I ⊆ V is linearly independent with |I| = m vectors, then the Replacement Theorem asserts that ∃H ⊆ B containing exactly n − m vectors such that V = span(I ∪ H). Now, |I ∪ H| ≤ n, so that by (1) |I ∪ H| = n, and so I ∪ H is a basis for V extending I. □

Theorem 1.20 Any two bases for a vector space V have the same cardinality. Proof: If V is finite-dimensional, then the previous corollary applies, so we need only consider bases that are infinite sets. Let B = {bi | i ∈ I} be a basis for V and let C be any other basis for V. Then each c ∈ C can be written as a finite linear combination of vectors in B in which all the coefficients are nonzero: if Uc ⊆ I is the finite set of indices involved, we have

c = ∑_{i∈Uc} ai bi

for unique nonzero ai ∈ F. But because C is a basis, we must have I = ⋃_{c∈C} Uc, for if ⋃_{c∈C} Uc ⊊ I, then all c ∈ C could be expressed using a proper subset B′ of B, so that V = span(C) ⊆ span(B′), which contradicts the minimality of B as a spanning set. Now, for all c ∈ C we have |Uc| < ℵ0, which implies that

|B| = |I| = |⋃_{c∈C} Uc| ≤ ℵ0 · |C| = |C|

Reversing the roles of B and C gives |C| ≤ |B|, so by the Schröder–Bernstein theorem we have |B| = |C|. □

Corollary 1.21 If V is a vector space and S is a subspace of V : (1) dim(S) ≤ dim(V ) (2) dim(S) = dim(V ) < ∞ =⇒ S = V (3) V is infinite-dimensional iff it contains an infinite linearly independent subset. 

Theorem 1.22 Let V be a vector space.

(1) If B is a basis for V and B = B1 ∪ B2, where B1 ∩ B2 = ∅, then V = span(B1) ⊕ span(B2). (2) If V = S⊕T and B1 is a basis for S and B2 is a basis for T , then B1 ∩B2 = ∅ and B = B1 ∪B2 is a basis for V .

Proof: (1) Suppose B1 ∩ B2 = ∅ and B = B1 ∪ B2 is a basis for V, so that 0 ∉ B1 ∪ B2. If a nonzero vector v lay in span(B1) ∩ span(B2), then v would have two different representations as a linear combination of vectors of the basis B, one using only vectors of B1 and one using only vectors of B2 (and these are different representations, since B1 ∩ B2 = ∅), contradicting Theorem 1.13. Hence span(B1) ∩ span(B2) = {0}. Moreover, since B1 ∪ B2 is a basis for V and also a spanning set for span(B1) + span(B2), we must have V = span(B1) + span(B2), and hence V = span(B1) ⊕ span(B2).
(2) If V = S ⊕ T, then S ∩ T = {0}, and since 0 ∉ B1 ∪ B2, we have B1 ∩ B2 = ∅. Also, since every v ∈ V = S ⊕ T has the form a1u1 + ··· + amum + b1v1 + ··· + bnvn for u1, . . . , um ∈ B1 and v1, . . . , vn ∈ B2, we have v ∈ span(B1 ∪ B2). Finally, B1 ∪ B2 is linearly independent: any linear relation among its elements splits as s + t = 0 with s ∈ S and t ∈ T, forcing s = t = 0 and hence all coefficients to vanish, since B1 and B2 are linearly independent. Therefore B = B1 ∪ B2 is a basis for V by Theorem 1.13. □

Theorem 1.23 If S and T are subspaces of a vector space V , then

dim(S) + dim(T ) = dim(S + T ) + dim(S ∩ T ) (1.24)

As a consequence, if V = S + T, then V = S ⊕ T iff dim(V) = dim(S) + dim(T).

Proof: If B = {bi | i ∈ I} is a basis for S ∩ T , then we can extend this to a basis A ∪ B for S and to another basis B ∪ C for T , where A = {aj | j ∈ J}, C = {ck | k ∈ K} and A ∩ B = ∅ and B ∩ C = ∅. Then A ∪ B ∪ C is a basis for S + T : clearly span(A ∪ B ∪ C) = S + T , so we need only verify that A ∪ B ∪ C is linearly independent. To that end, suppose not, suppose ∃c1, . . . , cn ∈ F \{0} and v1, . . . , vn ∈ A ∪ B ∪ C such that

c1v1 + c2v2 + ··· + cnvn = 0

Then some vi must lie in A and some in C, since by our construction A ∪ B and B ∪ C are each linearly independent. Isolating the vectors in A, say v1, . . . , vk, on one side of the equality gives the vector

x = c1v1 + ··· + ckvk = −(c_{k+1}v_{k+1} + ··· + cnvn)

which is nonzero because the ci are nonzero and A is linearly independent, and which satisfies x ∈ span(A) ∩ span(B ∪ C) = span(A) ∩ T ⊆ S ∩ T = span(B), since span(A) ⊆ S. But span(A) ∩ span(B) = {0} by Theorem 1.22, since A ∪ B is a basis for S with A ∩ B = ∅, so x = 0, a contradiction. Thus A ∪ B ∪ C is linearly independent and hence a basis for S + T. Moreover,

dim(S) + dim(T ) = |A ∪ B| + |B ∪ C| = |A| + |B| + |B| + |C| = |A ∪ B ∪ C| + |B| = dim(S + T ) + dim(S ∩ T )

Of course if S ∩ T = {0}, then dim(S ∩ T ) = 0, and

dim(S) + dim(T) = dim(S ⊕ T) = dim(V)

while if V = S + T and dim(V) = dim(S) + dim(T), then dim(S ∩ T) = 0, so S ∩ T = {0}, and so V = S ⊕ T. □
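For example, if S and T are two distinct planes through the origin in R³, then S + T = R³ and S ∩ T is a line, and indeed dim(S) + dim(T) = 2 + 2 = 3 + 1 = dim(S + T) + dim(S ∩ T).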

Corollary 1.24 If S and T are subspaces of a vector space V, then (1) dim(S + T) ≤ dim(S) + dim(T) (2) dim(S + T) ≥ max{dim(S), dim(T)}, if S and T are finite-dimensional. □

Theorem 1.25 (Direct Sums) If F = {Si | i ∈ I} is a family of subspaces of a vector space V such that V = ∑_{i∈I} Si, then the following statements are equivalent:

(1) V = ⊕_{i∈I} Si.

(2) 0 ∈ V has a unique expression as a sum of vectors each from a different Si, namely as a sum of zeros: for any distinct j1, . . . , jn ∈ I, we have

v_{j_1} + v_{j_2} + ··· + v_{j_n} = 0 and v_{j_i} ∈ S_{j_i} for each i =⇒ v_{j_1} = ··· = v_{j_n} = 0

(3) Each v ∈ V has a unique expression as a sum of nonzero vectors v_{j_i} ∈ S_{j_i} \ {0} from distinct subspaces,

v = v_{j_1} + v_{j_2} + ··· + v_{j_n}

(4) If γi is a basis for Si, then γ = ⋃_{i∈I} γi is a basis for V. If V is finite-dimensional, then this may be restated in terms of ordered bases γi and γ.

Proof: (1) =⇒ (2): Suppose (1) is true, that is V = ⊕_{i∈I} Si, and choose distinct j1, . . . , jk ∈ I and vectors v_{j_i} ∈ S_{j_i} such that v_{j_1} + ··· + v_{j_k} = 0. Then for any i,

−v_{j_i} = ∑_{l≠i} v_{j_l} ∈ ∑_{j≠j_i} Sj

However, −v_{j_i} ∈ S_{j_i}, whence

−v_{j_i} ∈ S_{j_i} ∩ ∑_{j≠j_i} Sj = {0}

so that v_{j_i} = 0. This is true for all i = 1, . . . , k, so v_{j_1} = ··· = v_{j_k} = 0, proving (2).

(2) =⇒ (3): Suppose (2) is true and let v ∈ V = ∑_{i∈I} Si be given by

v = u1 + ··· + un

= w1 + ··· + wm

for some ui ∈ S_{k_i} and wj ∈ S_{l_j}, where the k_i ∈ I are distinct and the l_j ∈ I are distinct. Then, relabeling so that the terms coming from common subspaces are listed first, we have

0 = v − v = (u1 − w1) + ··· + (uk − wk) + u_{k+1} + ··· + un − w_{k+1} − ··· − wm

where ui and wi lie in the same subspace for i ≤ k and the remaining terms come from distinct subspaces, so by (2)

u1 = w1, . . . , uk = wk and u_{k+1} = ··· = un = w_{k+1} = ··· = wm = 0

proving uniqueness.

(3) =⇒ (4): Suppose each vector v ∈ V = ∑_{i∈I} Si can be uniquely written as v = v1 + ··· + vk with vi ∈ S_{j_i} \ {0} for distinct j_i. For each i ∈ I let γi be a basis for Si; since V = ∑_{i∈I} Si, we have V = span(γ) = span(⋃_{i∈I} γi), so we only need to show that γ is linearly independent. To that end, take distinct indices j1, . . . , jm ∈ I and vectors

v_{r1}, v_{r2}, . . . , v_{rn_r} ∈ γ_{j_r}, r = 1, . . . , m,

let a_{rj} ∈ F for r = 1, . . . , m and j = 1, . . . , n_r, and let w_r = ∑_{j=1}^{n_r} a_{rj}v_{rj} ∈ S_{j_r}. Then, suppose

w1 + ··· + wm = ∑_{r,j} a_{rj}v_{rj} = 0

Since 0 = 0 + ··· + 0 with 0 ∈ S_{j_r} for every r, by the assumed uniqueness of expression we must have w_r = 0 for all r, and since each γ_{j_r} is linearly independent, we must have a_{rj} = 0 for all r, j. Thus γ is linearly independent, hence a basis for V.

(4) =⇒ (1): If (4) is true, then there exist bases γi for the Si such that γ = ⋃_{i∈I} γi is a basis for V. By Corollary 1.8 and Theorem 1.22,

V = span(γ) = span(⋃_{i∈I} γi) = ∑_{i∈I} span(γi) = ∑_{i∈I} Si

Choose any j ∈ I and suppose that for a nonzero vector v ∈ V we have v ∈ Sj ∩ ∑_{i≠j} Si. Then

v ∈ Sj = span(γj) and v ∈ ∑_{i≠j} Si = span(⋃_{i≠j} γi)

which means v is a nontrivial linear combination of elements of γj and also of elements of ⋃_{i≠j} γi, so that v can be expressed as a linear combination of ⋃_{i∈I} γi in more than one way, contradicting the uniqueness of representation of vectors in V in terms of a basis for V (Theorem 1.13). Consequently, any such v must be 0. That is,

Sj ∩ ∑_{i≠j} Si = {0}

and the sum is direct, i.e. V = ⊕_{i∈I} Si, proving (1). □

2 Linear Transformations

2.1 Definitions

2.1.1 Linear Transformations, or Vector Space Homomorphisms

If V and W are vector spaces over the same field F, a function T : V → W is called a linear transformation or vector space homomorphism if it preserves the vector space structure, that is, if for all vectors x, y ∈ V and all scalars a, b ∈ F we have

T(x + y) = T(x) + T(y) and T(ax) = aT(x), or equivalently T(ax + by) = aT(x) + bT(y) (2.1)
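For example, T : R² → R² given by T(x1, x2) = (2x1 + x2, x1 − 3x2) is linear, since T(a(x1, x2) + b(y1, y2)) = (2(ax1 + by1) + (ax2 + by2), (ax1 + by1) − 3(ax2 + by2)) = aT(x1, x2) + bT(y1, y2), while U(x1, x2) = (x1 + 1, x2) is not, since U(0, 0) = (1, 0) ≠ (0, 0) even though any linear transformation must send 0 to 0.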

The set of all linear transformations from V to W is denoted

L(V,W ) or Hom(V,W ) (2.2)

Linear transformations, like group and ring homomorphisms, come in different types:

1. linear operator (vector space endomorphism) on V – if V is a vector space over F , a linear operator, or vector space endomorphism, is a linear transformation T ∈ L(V,V ). The set of all linear operators, or vector space endomorphisms, is denoted

L(V ) or End(V ) (2.3)

2. monomorphism (embedding) – a 1-1 or injective linear transformation.
3. epimorphism – an onto or surjective linear transformation.
4. isomorphism (invertible linear transformation) from V to W – a bijective (and therefore invertible) linear transformation T ∈ L(V,W), i.e. a monomorphism which is also an epimorphism. If such an isomorphism exists, V is said to be isomorphic to W, and this relationship is denoted

V ≅ W (2.4)

The set of all linear isomorphisms from V to W is denoted

GL(V,W ) (2.5)

by analogy with the general linear group. 5. automorphism – a bijective linear operator. The set of all automorphisms of V is denoted Aut(V ) or GL(V ) (2.6)

Some common and important linear transformations are:

1. The identity transformation, I_V : V → V, given by I_V(x) = x for all x ∈ V. Its linearity is obvious: I_V(ax + by) = ax + by = aI_V(x) + bI_V(y).

2. The zero transformation, T0 : V → W, given by T0(x) = 0 for all x ∈ V . Sometimes this is denoted by 0 if the context is clear, like the zero polynomial 0 ∈ F [x]. This is also clearly a linear transformation. 3. An idempotent operator T ∈ L(V ) satisfies T 2 = T .

A crucial concept, which will allow us to classify and so distinguish operators, is that of similarity of operators. Two linear operators T, U ∈ L(V) are said to be similar, denoted

T ∼ U (2.7) if there exists an automorphism φ ∈ Aut(V ) such that

T = φ ◦ U ◦ φ−1 (2.8)

Similarity is an equivalence relation on L(V):
1. T ∼ T, because T = I ◦ T ◦ I⁻¹.
2. T ∼ U implies T = φ ◦ U ◦ φ⁻¹ for some automorphism φ ∈ Aut(V), so that U = φ⁻¹ ◦ T ◦ φ, or U = ψ ◦ T ◦ ψ⁻¹, where ψ = φ⁻¹ is an automorphism. Thus U ∼ T.
3. S ∼ T and T ∼ U implies S = φ ◦ T ◦ φ⁻¹ and T = ψ ◦ U ◦ ψ⁻¹, where φ and ψ are automorphisms, so S = φ ◦ T ◦ φ⁻¹ = φ ◦ (ψ ◦ U ◦ ψ⁻¹) ◦ φ⁻¹ = (φ ◦ ψ) ◦ U ◦ (φ ◦ ψ)⁻¹, and since φ ◦ ψ is an automorphism of V, we have S ∼ U.
This partitions L(V) into equivalence classes. As we will see, each equivalence class corresponds to a single matrix representation, depending only on the choice of basis for V, which may be different for each operator.
The kernel or null space of a linear transformation T ∈ L(V,W) is the pre-image of 0 ∈ W under T:

ker(T) = T⁻¹(0) (2.9)

i.e. ker(T) = {x ∈ V | T(x) = 0}. As we will show below, ker(T) is a subspace of V. The range or image of a linear transformation T ∈ L(V,W) is the range of T as a set function: it is variously denoted by R(T), T(V) or im(T),

R(T ) = im(T ) = T (V ) := {T (x) | x ∈ V } (2.10)

As we show below, R(T ) is a subspace of W . The rank of a linear transformation T ∈ L(V,W ) is the dimension of its range: rank(T ) := dimR(T ) (2.11) The nullity of T ∈ L(V,W ) is the dimension of its kernel:

null(T ) := dimker(T ) (2.12)

Given an operator T ∈ L(V ), a subspace S of V is said to be T -invariant if v ∈ S =⇒ T (v) ∈ S, that is if T (S) ⊆ S. The restriction of T to a T -invariant subspace S is denoted TS, and of course TS ∈ L(S) is a linear operator on S. In the case that V = R ⊕ S decomposes as a direct sum of T -invariant subspaces R and S, we say the pair (R,S) reduces T , or is a reducing pair for T , and we say that T is the direct sum of operators TR and TS, denoting this by

T = TR ⊕ TS (2.13) The expression T = τ ⊕ σ ∈ L(V ) means there are subspaces R and S of V for which (R,S) reduces T and τ = TR and σ = TS.

If V is a direct sum of subspaces V = R ⊕ S, then the function πR,S : V → V given by

πR,S(v) = πR,S(r + s) = r (2.14) where r ∈ R and s ∈ S, is called the projection onto R along S.

Example 2.1 The projection on the y-axis along the x-axis is the map π_{Ry,Rx} : R² → R² given by π_{Ry,Rx}(x, y) = (0, y). □

Example 2.2 The projection on the y-axis along the line L = {(s, s) | s ∈ R} is the map π_{Ry,L} : R² → R² given by π_{Ry,L}(x, y) = (0, y − x), since (x, y) = (x, x) + (0, y − x) with (x, x) ∈ L and (0, y − x) on the y-axis. □

Any projection πR,S is linear, i.e. πR,S ∈ L(V ):

πR,S(au + bv) = πR,S(a(r1 + s1) + b(r2 + s2))

= πR,S((ar1 + br2) + (as1 + bs2))

= ar1 + br2

= aπR,S(u) + bπR,S(v)

And clearly R = im(πR,S) and S = ker(πR,S) (2.15)

Two projections ρ, σ ∈ L(V ) are said to be orthogonal, denoted

ρ ⊥ σ (2.16) if ρ ◦ σ = σ ◦ ρ = 0 ≡ T0 (2.17) This is of course by analogy with orthogonal vectors v ⊥ w, for which we have hv, wi = hw, vi = 0.

We use projections to define a resolution of the identity, a sum of the form π1 + ··· + πk = I ∈ L(V), where the πi are pairwise orthogonal projections, that is πi ⊥ πj if i ≠ j. This occurs when V = S1 ⊕ S2 ⊕ ··· ⊕ Sk, πi = π_{Si, ⊕_{j≠i} Sj}, and πi ⊥ πj for i ≠ j. Resolutions of the identity will be important in establishing the spectral resolution of a linear operator.

2.2 Basic Properties of Linear Transformations

Theorem 2.3 If V and W are vector spaces over the same field F and T ∈ L(V,W ), then ker(T ) is a subspace of V and im(T ) is a subspace of W . Moreover, if T ∈ L(V ), then {0}, ker(T ), im(T ) and V are T -invariant. Proof: (1) If x, y ∈ ker(T ) and a, b ∈ F , we have T (ax + by) = aT (x) + bT (y) = a0 + b0 = 0 so ax + by ∈ ker(T ), and the statement follows by Theorem 1.3. If x, y ∈ im(T ) and a, b ∈ F , then ∃u, v ∈ V such that T (u) = x and T (v) = y, whence ax+by = aT (u) +bT (v) = T (au+bv) ∈ im(T ). (2) Clearly {0} and V are T -invariant, the first because T (0) = 0, the second because the codomain of T is V . If x ∈ im(T ), then of course x ∈ V , so that T (x) ∈ im(T ), while if x ∈ ker(T ), then T (x) = 0 ∈ ker(T ). 

Theorem 2.4 If V and W are vector spaces and B = {vi | i ∈ I} is a basis for V , then for any T ∈ L(V,W ) we have im(T ) = span(T (B)) (2.18)

Proof: T(vi) ∈ im(T) for each i ∈ I by definition, and because im(T) is a subspace, we have span(T(B)) ⊆ im(T) (span(T(B)) is a subspace by Theorem 1.6). For the reverse inclusion, choose any w ∈ im(T). Then ∃v ∈ V such that w = T(v), so that by Theorem 1.13 there are unique a1, . . . , an ∈ F and v1, . . . , vn ∈ B such that v = a1v1 + ··· + anvn. Consequently,

w = T (v) = T (a1v1 + ··· + anvn) = a1T (v1) + ··· + anT (vn) ∈ span(T (B)) 

Theorem 2.5 (A Linear Transformation Is Defined by Its Action on a Basis) Let V and W be vector spaces over the same field F . If B = {vi | i ∈ I} is a basis for V , then we can define a unique linear transformation T ∈ L(V,W ) by arbitrarily specifying the values wi = T (vi) ∈ W for each i ∈ I and extending T by linearity, i.e. specifying that for all a1, . . . , an ∈ F our T satisfy

T (a1v1 + ··· + anvn) = a1T (v1) + ··· + anT (vn) (2.19)

= a1w1 + ··· + anwn

Proof: By Theorem 1.13, ∀v ∈ V, ∃!a1, . . . , an ∈ F and v1, . . . , vn ∈ B such that v = a1v1 + ··· + anvn. Defining T ∈ W^V by (2.19) we have the following results:

(1) T ∈ L(V,W ): For any u, v ∈ V , there exist unique a1, . . . , an, b1, . . . , bm ∈ F and u1, . . . , un, v1, . . . , vm ∈ B such that

u = a1u1 + ··· + anun v = b1v1 + ··· + bmvm

If w1, . . . , wn, z1, . . . , zm ∈ W are the vectors for which we have specified

T (u) = a1T (u1) + ··· + anT (un) T (v) = b1T (v1) + ··· + bmT (vm)

= a1w1 + ··· + anwn = b1z1 + ··· + bmzm then, for all a, b ∈ F we have  T (au + bv) = T a(a1u1 + ··· + anun) + b(b1v1 + ··· + bmvm)

= aa1T (u1) + ··· + aanT (un) + bb1T (v1) + ··· + bbmT (vm)

= aT(a1u1 + ··· + anun) + bT(b1v1 + ··· + bmvm) = aT(u) + bT(v)

(2) T is unique: suppose ∃U ∈ L(V,W) such that U(vi) = wi = T(vi) for all i ∈ I. Then for any v ∈ V we have v = ∑_{i=1}^{n} aivi for unique ai ∈ F, so that

U(v) = ∑_{i=1}^{n} aiU(vi) = ∑_{i=1}^{n} aiwi = T(v)

and U = T. □

Theorem 2.6 If V, W, Z are vector spaces over the same field F and T ∈ L(V,W ) and U ∈ L(W, Z), then U ◦ T ∈ L(V,Z). Proof: If v, w ∈ V and a, b ∈ F , then

(U ◦ T)(av + bw) = U(T(av + bw)) = U(aT(v) + bT(w)) = aU(T(v)) + bU(T(w)) = a(U ◦ T)(v) + b(U ◦ T)(w)

so U ◦ T ∈ L(V,Z). □

Theorem 2.7 If V and W are vector spaces over F , then L(V,W ) is a vector space over F under ordinary addition and scalar multiplication of functions. Proof: First, note that if T,U ∈ L(V,W ) and a, b, c, d ∈ F , then ∀v, w ∈ V

(aT + bU)(cv + dw) = aT(cv + dw) + bU(cv + dw) = acT(v) + adT(w) + bcU(v) + bdU(w) = c(aT(v) + bU(v)) + d(aT(w) + bU(w)) = c(aT + bU)(v) + d(aT + bU)(w)

so (aT + bU) ∈ L(V,W), and L(V,W) is closed under addition and scalar multiplication. Moreover, L(V,W) satisfies VS1–8 because W is a vector space, and addition and scalar multiplication in L(V,W) are defined pointwise from the values which functions S, T, U ∈ L(V,W) take in W. □

2.3 Monomorphisms and Isomorphisms

Theorem 2.8 If V, W, Z are vector spaces over the same field F and T ∈ GL(V,W) and U ∈ GL(W,Z) are isomorphisms, then
(1) T⁻¹ ∈ L(W,V)
(2) (T⁻¹)⁻¹ = T, and hence T⁻¹ ∈ GL(W,V).
(3) (U ◦ T)⁻¹ = T⁻¹ ◦ U⁻¹ ∈ GL(Z,V).
Proof: (1) If w, z ∈ W and a, b ∈ F, since T is bijective ∃!u, v ∈ V such that T(u) = w and T(v) = z. Hence T⁻¹(w) = u and T⁻¹(z) = v, so that

T⁻¹(aw + bz) = T⁻¹(aT(u) + bT(v)) = T⁻¹(T(au + bv)) = au + bv = aT⁻¹(w) + bT⁻¹(z)

(2) T⁻¹ ◦ T = I_V and T ◦ T⁻¹ = I_W show that T is the inverse of T⁻¹, that is, T = (T⁻¹)⁻¹.

(3) First, any composition of bijections is a bijection, so U ◦ T is bijective. Moreover,

(U ◦ T) ◦ (T⁻¹ ◦ U⁻¹) = U ◦ (T ◦ T⁻¹) ◦ U⁻¹ = U ◦ I_W ◦ U⁻¹ = U ◦ U⁻¹ = I_Z

and

(T⁻¹ ◦ U⁻¹) ◦ (U ◦ T) = T⁻¹ ◦ (U⁻¹ ◦ U) ◦ T = T⁻¹ ◦ I_W ◦ T = T⁻¹ ◦ T = I_V

so by the uniqueness of inverses we must have (U ◦ T)⁻¹ = T⁻¹ ◦ U⁻¹ ∈ L(Z,V), and by (1) it is an isomorphism. Alternatively, we can get (3) from (2) via Theorem 2.5: U ◦ T : V → Z, so (U ◦ T)⁻¹ : Z → V and of course T⁻¹ ◦ U⁻¹ : Z → V. Let α, β and γ be bases for V, W and Z, respectively, such that T(α) = β and U(β) = U(T(α)) = (U ◦ T)(α) = γ (which is possible by Theorem 2.12), so that (U ◦ T)⁻¹(γ) = α. But we also have U⁻¹(γ) = β and T⁻¹(β) = α, so that (T⁻¹ ◦ U⁻¹)(γ) = α. By Theorem 2.5 (uniqueness), (U ◦ T)⁻¹ = T⁻¹ ◦ U⁻¹, and by (1) it is an isomorphism. □

Theorem 2.9 If V and W are vector spaces over the same field F and T ∈ L(V,W ), then T is injective iff ker(T ) = {0}. Proof: If T is 1-1 and x ∈ ker(T ), then T (x) = 0 = T (0), so that x = 0, whence ker(T ) = {0}. Conversely, if ker(T ) = {0} and T (x) = T (y), then, 0 = T (x) − T (y) = T (x − y) =⇒ x − y = 0 or x = y, and so T is 1-1. 

21 Theorem 2.10 Let V and W be vector spaces over the same field and S ⊆ V . Then for any T ∈ L(V,W ), (1) T is injective iff it carries linearly independent sets into linearly independent sets. (2) T is injective implies S ⊆ V is linearly independent iff T (S) ⊆ W is linearly independent.

Proof: (1) Suppose T is 1-1 and S ⊆ V is linearly independent. If v1, . . . , vn ∈ S are distinct and a1, . . . , an ∈ F satisfy

0_W = a1T(v1) + ··· + anT(vn) = T(a1v1 + ··· + anvn)

then, since T is injective and T(0_V) = 0_W, we have a1v1 + ··· + anvn = 0_V, whence a1 = ··· = an = 0 by the linear independence of S. Thus T(S) is linearly independent, and T carries linearly independent sets into linearly independent sets. Conversely, suppose T carries linearly independent sets into linearly independent sets, let B be a basis for V, and suppose T(u) = T(v) for some u, v ∈ V. Write u = a1u1 + ··· + anun and v = b1v1 + ··· + bmvm for unique ai, bj ∈ F and u1, . . . , un, v1, . . . , vm ∈ B, and group the ui and vj that are equal, say u_{i_1} = v_{j_1}, . . . , u_{i_k} = v_{j_k}. Then

0 = T(u) − T(v) = (a_{i_1} − b_{j_1})T(u_{i_1}) + ··· + (a_{i_k} − b_{j_k})T(u_{i_k}) + a_{i_{k+1}}T(u_{i_{k+1}}) + ··· + a_{i_n}T(u_{i_n}) − b_{j_{k+1}}T(v_{j_{k+1}}) − ··· − b_{j_m}T(v_{j_m})

Since B is linearly independent, so is its image under T, and therefore a_{i_{k+1}} = ··· = a_{i_n} = b_{j_{k+1}} = ··· = b_{j_m} = 0, which means m = n = k, and a_{i_1} = b_{j_1}, . . . , a_{i_k} = b_{j_k}. Consequently

u = a_{i_1}u_{i_1} + ··· + a_{i_k}u_{i_k} = v

whence T is 1-1.
(2) If T is 1-1 and S ⊆ V is linearly independent, then by (1) T(S) is linearly independent. Conversely, if T is 1-1 and T(S) is linearly independent, then for K = {v1, . . . , vk} ⊆ S ⊆ V we have that

T(0) = 0 = a1T(v1) + ··· + akT(vk) = T(a1v1 + ··· + akvk)

implies simultaneously that a1v1 + ··· + akvk = 0 and a1 = ··· = ak = 0, so K is linearly independent. But this is true for all such K ⊆ S, so S is linearly independent. □

Theorem 2.11 If V and W are vector spaces over the same field F and T ∈ GL(V,W ), then for any S ⊆ V we have (1) V = span(S) iff W = span(T (S)). (2) S is linearly independent in V iff T (S) is linearly independent in W . (3) S is a basis for V iff T (S) is a basis for W .

Proof: (1) Since T ∈ GL(V,W), we have V = span(S) ⇐⇒ W = im(T) = T(span(S)) = span(T(S)).
(2) S linearly independent means that for any s1, . . . , sn ∈ S we have ∑_{i=1}^{n} aisi = 0 ⇐⇒ a1 = ··· = an = 0, which implies

∑_{i=1}^{n} aiT(si) = T(∑_{i=1}^{n} aisi) = 0 = T(0) =⇒ ∑_{i=1}^{n} aisi = 0 (since T ∈ GL(V,W)) =⇒ a1 = ··· = an = 0 (since S is linearly independent)

so T(S) is linearly independent, since this is true for all si ∈ S. Conversely, if T(S) is linearly independent, we have for any s1, . . . , sn ∈ S that

0 = ∑_{i=1}^{n} aisi =⇒ 0 = T(0) = T(∑_{i=1}^{n} aisi) = ∑_{i=1}^{n} aiT(si) =⇒ a1 = ··· = an = 0 (since T(S) is linearly independent)

so S is linearly independent.
(3) By (1) and (2), S is a basis for V ⇐⇒ T(S) is linearly independent in W and W = span(T(S)) ⇐⇒ T(S) is a basis for W. We could also say that since T⁻¹ is an isomorphism by Theorem 2.8, if T(S) is a basis for W then S = T⁻¹(T(S)) is a basis for V. □

Theorem 2.12 (Isomorphisms Preserve Bases) If V and W are vector spaces over the same field F and B is a basis for V , then T ∈ GL(V,W ) iff T (B) is a basis for W . Proof: If T ∈ GL(V,W ) is an isomorphism, then T is bijective, so by Theorem 2.11 T (B) is a basis for W . Conversely, if T (B) is a basis for W , then ∀u ∈ V , ∃!a1, . . . , an ∈ F and v1, . . . , vn ∈ B such that u = a1v1 + ··· + anvn. Therefore,

0 = T (u) = T (a1v1 + ··· + anvn) = a1T (v1) + ··· + anT (vn) =⇒ a1 = ··· = an = 0 so ker(T ) = {0}, whence by Theorem 2.9 T is injective. But since span(T (B)) = W , we have for all w ∈ W that ∃!a1, . . . an ∈ F such that w = a1T (v1) + ··· + anT (vn) = T (a1v1 + ··· + anvn), so ∃u = a1v1 + ··· + anvn ∈ V such that w = T (u), and W ⊆ im(T ). We obviously have im(T ) ⊆ W , and so T is surjective. Therefore it is bijective and an isomorphism. 

Theorem 2.13 (Isomorphisms Preserve Dimension) If V and W are vector spaces over the same field F , then V =∼ W ⇐⇒ dim(V ) = dim(W ) (2.20) Proof: V =∼ W =⇒ ∃T ∈ GL(V,W ), so B basis for V =⇒ T (B) basis for W , and dim(V ) = |B| = |T (B)| = dim(W ). Conversely, if dim(V ) = |B| = |C| = dim(W ), where C is a basis for W , then ∃T : B → C bijective. Extending T to V by linearity defines a unique T ∈ L(V,W ) by Theorem 2.5 and T is an isomorphism because it is surjective, im(T ) = W , and injective, ker(T ) = {0}, so ∼ V = W . 

Corollary 2.14 If V and W are vector spaces over the same field and T ∈ L(V,W ), then dim(V ) < dim(W ) =⇒ T cannot be onto and dim(V ) > dim(W ) =⇒ T cannot be 1-1. 

Corollary 2.15 If V is an n-dimensional vector space over F, where n ∈ N, then

V ≅ F^n (2.21)

If κ is any cardinal number, B is a set of cardinality κ and V is a κ-dimensional vector space over F, then

V ≅ (F^B)_0 (2.22)

Proof: This follows from the previous theorem, since dim(F^n) = n and dim((F^B)_0) = κ. □

Corollary 2.16 If V and W are vector spaces over the same field F and T ∈ GL(V,W), then for any subspace S of V we have that T(S) is a subspace of W and dim(S) = dim(T(S)). Proof: If S is a subspace of V, then ∀x, y ∈ S and a, b ∈ F we have ax + by ∈ S, and T(x), T(y) ∈ T(S) with aT(x) + bT(y) = T(ax + by) ∈ T(S), so T(S) is a subspace of W, while dim(S) = dim(T(S)) follows from the theorem. □

2.4 Rank-Nullity Theorem

Lemma 2.17 If V and W are vector spaces over the same field F and T ∈ L(V,W), then any complement of the kernel of T is isomorphic to the range of T, that is

V = ker(T) ⊕ ker(T)^c =⇒ ker(T)^c ≅ im(T) (2.23)

where ker(T)^c is any complement of ker(T). Proof: By Theorem 1.23, dim(V) = dim ker(T) + dim ker(T)^c. Let

T^c = T|_{ker(T)^c} : ker(T)^c → im(T)

and note that T^c is injective by Theorem 2.9 because

ker(T^c) = ker(T) ∩ ker(T)^c = {0}

Moreover T^c is surjective because T^c(ker(T)^c) = im(T): we obviously have T^c(ker(T)^c) ⊆ im(T), while for the reverse inclusion suppose w ∈ im(T), say w = T(v). Then by Theorem 1.25 ∃!s ∈ ker(T) and t ∈ ker(T)^c such that v = s + t, so that

w = T(v) = T(s + t) = T(s) + T(t) = 0 + T(t) = T^c(t) ∈ T^c(ker(T)^c)

so im(T) = T^c(ker(T)^c) and T^c is surjective. Hence T^c is an isomorphism, whence

ker(T)^c ≅ im(T) □

Theorem 2.18 (Rank-Nullity Theorem) If V and W are vector spaces over the same field F and T ∈ L(V,W), then

dim(V) = rank(T) + null(T) (2.24)

Proof: By the lemma we have ker(T)^c ≅ im(T), which by Theorem 2.13 implies dim ker(T)^c = dim im(T) = rank(T), while V = ker(T) ⊕ ker(T)^c (Corollary 1.16) together with Theorem 1.23 implies

dim(V) = dim ker(T) + dim ker(T)^c = dim ker(T) + dim im(T) = null(T) + rank(T)

which completes the proof. □
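For example, let T ∈ L(R_3[x], R_2[x]) be differentiation, T(f) = f′. Then ker(T) consists of the constant polynomials, so null(T) = 1, while im(T) = R_2[x], so rank(T) = 3, and indeed rank(T) + null(T) = 3 + 1 = 4 = dim(R_3[x]).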

Corollary 2.19 Let V and W be vector spaces over the same field F and T ∈ L(V,W). If dim(V) = dim(W) < ∞, then the following are equivalent:
(1) T is injective.
(2) T is surjective.
(3) rank(T) = dim(V)
Proof: By the Rank-Nullity Theorem, rank(T) + null(T) = dim(V), and by Theorem 2.9 we have

T is 1-1 ⇐⇒ ker(T) = {0} (Theorem 2.9) ⇐⇒ null(T) = 0 ⇐⇒ rank(T) = dim(V) = dim(W) (Rank-Nullity Theorem and the assumption) ⇐⇒ im(T) = W (Corollary 1.21) ⇐⇒ T is onto

which completes the proof. □

Theorem 2.20 For any linear operator T ∈ L(V) we have that

ker(T ) ⊆ ker(T 2) ⊆ · · · (2.25)

Moreover, if there is an m ∈ N for which rank(T^m) = rank(T^{m+1}), then for all n ∈ N we have

rank(T^m) = rank(T^{m+n}) and ker(T^m) = ker(T^{m+n}) (2.26)

Proof: (1) Let n ∈ N and v ∈ ker(T n), so that T n(v) = 0. Then, 0 = T (0) = T (T n(v)) = T n+1(v), so v ∈ ker(T n+1), which proves (2.25).

(2) If there is an m ∈ N such that rank(T^m) = rank(T^{m+1}), then by the Rank-Nullity Theorem null(T^m) = null(T^{m+1}), which by (1) shows that ker(T^m) = ker(T^{m+1}) and hence T^m(V) = T^{m+1}(V). But then T|_{T^m(V)} must be bijective: it maps T^m(V) onto T^{m+1}(V) = T^m(V), and if v ∈ T^m(V) with T(v) = 0, say v = T^m(u), then u ∈ ker(T^{m+1}) = ker(T^m), so v = T^m(u) = 0. Consequently T^n|_{T^m(V)} is bijective for all n ∈ N, so T^{m+n}(V) = T^n(T^m(V)) = T^m(V), and hence rank(T^{m+n}) = rank(T^m); by the Rank-Nullity Theorem and (1), ker(T^{m+n}) = ker(T^m), proving (2.26). □

3 Representations of Linear Transformations

3.1 Definitions

If β = (v1, . . . , vn) is an ordered basis for a finite-dimensional vector space V over F, then by Theorem 1.13 we know that for any vector x ∈ V there are unique scalars a1, . . . , an ∈ F such that x = a1v1 + ··· + anvn. The coordinate vector of x ∈ V with respect to (relative to) β is defined to be the (column) vector in F^n consisting of the scalars ai:

[x]_β := (a1, a2, . . . , an)^T (3.1)

and the coordinate map, also called the standard representation of V with respect to β,

φ_β : V → F^n (3.2)

is given by

φ_β(x) = [x]_β (3.3)

Let V and W be finite-dimensional vector spaces over the same field F with ordered bases β = (v1, . . . , vn) and γ = (w1, . . . , wm), respectively. If there exist (and there do exist) unique scalars a_{ij} ∈ F such that

T(vj) = ∑_{i=1}^{m} a_{ij}wi for j = 1, . . . , n (3.4)

then the matrix representation of a linear transformation T ∈ L(V,W) in the ordered bases β and γ is the m × n matrix A defined by A_{ij} = a_{ij},

A = [T]^γ_β := [ [T(v1)]_γ  [T(v2)]_γ  ···  [T(vn)]_γ ]

that is, the jth column of A is the coordinate vector of T(vj) relative to γ, so that

          [ a_11  a_12  ···  a_1n ]
[T]^γ_β = [ a_21  a_22  ···  a_2n ]
          [  ⋮     ⋮    ⋱     ⋮   ]
          [ a_m1  a_m2  ···  a_mn ]

Here are some examples:

1. If T ∈ L(R², R³) is given by T(x1, x2) = (x1 + 3x2, 0, 2x1 − 4x2), and β = (e1, e2) and γ = (e1, e2, e3) are the standard ordered bases for R² and R³, respectively, then the matrix representation of T is found as follows:

T(e1) = T(1, 0) = (1, 0, 2) = 1e1 + 0e2 + 2e3
T(e2) = T(0, 1) = (3, 0, −4) = 3e1 + 0e2 − 4e3

            [ 1   3 ]
=⇒ [T]^γ_β = [ 0   0 ]
            [ 2  −4 ]

2. If T ∈ L(R_3[x], R_2[x]) is given by T(f(x)) = f′(x), and β = (1, x, x², x³) and γ = (1, x, x²) are the standard ordered bases for R_3[x] and R_2[x], respectively, the matrix representation of T is found as follows:

T(1) = 0·1 + 0x + 0x²
T(x) = 1·1 + 0x + 0x²
T(x²) = 0·1 + 2x + 0x²
T(x³) = 0·1 + 0x + 3x²

            [ 0  1  0  0 ]
=⇒ [T]^γ_β = [ 0  0  2  0 ]
            [ 0  0  0  3 ]
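As a quick numerical sanity check of the first example (an added illustration, not part of the original notes), multiplying [T]^γ_β by the standard coordinate vector of x = (x1, x2) reproduces the coordinates of T(x); a minimal NumPy sketch:

```python
import numpy as np

# Matrix of T(x1, x2) = (x1 + 3*x2, 0, 2*x1 - 4*x2) relative to the standard bases
A = np.array([[1.0, 3.0],
              [0.0, 0.0],
              [2.0, -4.0]])

def T(x):
    # Apply T directly to a vector x = (x1, x2) in R^2
    x1, x2 = x
    return np.array([x1 + 3 * x2, 0.0, 2 * x1 - 4 * x2])

x = np.array([5.0, -2.0])            # [x]_beta in the standard ordered basis
assert np.allclose(A @ x, T(x))      # [T(x)]_gamma = [T]^gamma_beta [x]_beta
print(A @ x)                         # [-1.  0. 18.]
```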

3.2 Matrix Representations of Linear Transformations

Theorem 3.1 If A ∈ Mm,n(F ), B,C ∈ Mn,p(F ), D,E ∈ Mq,m(F ), and a ∈ F , then (1) A(B + C) = AB + AC and (D + E)A = DA + EA (2) a(AB) = (aA)B = A(aB)

(3) I_mA = A = AI_n

Proof: For each case we show the result by showing that it holds for each ij-th entry, as a result of the properties of the field F:

[A(B + C)]_{ij} = ∑_{k=1}^{n} A_{ik}(B + C)_{kj} = ∑_{k=1}^{n} A_{ik}(B_{kj} + C_{kj}) = ∑_{k=1}^{n} (A_{ik}B_{kj} + A_{ik}C_{kj})
  = ∑_{k=1}^{n} A_{ik}B_{kj} + ∑_{k=1}^{n} A_{ik}C_{kj} = (AB)_{ij} + (AC)_{ij} = [AB + AC]_{ij}

[(D + E)A]_{ij} = ∑_{k=1}^{m} (D + E)_{ik}A_{kj} = ∑_{k=1}^{m} (D_{ik} + E_{ik})A_{kj} = ∑_{k=1}^{m} (D_{ik}A_{kj} + E_{ik}A_{kj})
  = ∑_{k=1}^{m} D_{ik}A_{kj} + ∑_{k=1}^{m} E_{ik}A_{kj} = (DA)_{ij} + (EA)_{ij} = [DA + EA]_{ij}

[a(AB)]_{ij} = a ∑_{k=1}^{n} A_{ik}B_{kj} = ∑_{k=1}^{n} (aA_{ik})B_{kj} = [(aA)B]_{ij} = ∑_{k=1}^{n} A_{ik}(aB_{kj}) = [A(aB)]_{ij}

(I_mA)_{ij} = ∑_{k=1}^{m} (I_m)_{ik}A_{kj} = ∑_{k=1}^{m} δ_{ik}A_{kj} = A_{ij} = ∑_{k=1}^{n} A_{ik}δ_{kj} = ∑_{k=1}^{n} A_{ik}(I_n)_{kj} = (AI_n)_{ij} □

Theorem 3.2 If A ∈ Mm,n(F ) and B ∈ Mn,p(F ), and bj is the jth column of B, then

AB = A[b1 b2 ··· bp] (3.5)

= [Ab1 Ab2 ··· Abp] (3.6)

That is, if uj is the jth column of AB, then uj = Abj. As a result, if A ∈ M_{m,n}(F) and aj is the jth column of A, then

A = AI_n = I_mA (3.7)

Proof: First, we have

uj = ((AB)_{1j}, . . . , (AB)_{mj})^T = (∑_{k=1}^{n} A_{1k}B_{kj}, . . . , ∑_{k=1}^{n} A_{mk}B_{kj})^T = A(B_{1j}, . . . , B_{nj})^T = Abj

As a result,

A = [a1 a2 ··· an] = [Ae1 Ae2 ··· Aen] = AI_n

and

I_mA = I_m[a1 a2 ··· an] = [I_ma1 I_ma2 ··· I_man] = [a1 a2 ··· an] = A □

Theorem 3.3 If V is a finite-dimensional vector space over a field F with ordered basis β = (b1, . . . , bn), the coordinate map φ_β : V → F^n is an isomorphism,

φ_β ∈ GL(V, F^n) (3.8)

Proof: First, φ_β ∈ L(V, F^n), since ∀x, y ∈ V and a, b ∈ F we have, by the ordinary rules of matrix addition and scalar multiplication (or alternatively by the rules of addition and scalar multiplication of n-tuples),

φ_β(ax + by) = [ax + by]_β = a[x]_β + b[y]_β = aφ_β(x) + bφ_β(y)

Second, because φβ(b1) = e1, . . . , φβ(bn) = en, it takes a basis into a basis. Therefore it is an isomorphism by Theorem 2.12. 
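For example, if β = ((1, 1), (1, −1)) is an ordered basis for R² and x = (3, 1), then x = 2(1, 1) + 1(1, −1), so φ_β(x) = [x]_β = (2, 1)^T, whereas relative to the standard ordered basis the coordinate vector of x is (3, 1)^T.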

Theorem 3.4 Let V and W be finite-dimensional vector spaces having ordered bases β and γ, respectively, and let T ∈ L(V,W ). Then for all v ∈ V we have

\[
[T(v)]_\gamma = [T]_\beta^\gamma [v]_\beta \tag{3.9}
\]
In other words, if A = [T]_β^γ is the matrix representation of T, if T_A ∈ L(F^n, F^m) is the matrix multiplication map, and if φ_β ∈ GL(V, F^n) and φ_γ ∈ GL(W, F^m) are the respective coordinate maps, then
\[
\phi_\gamma \circ T = T_A \circ \phi_\beta \tag{3.10}
\]
and the following diagram commutes:
\[
\begin{array}{ccc}
V & \xrightarrow{\;T\;} & W \\[2pt]
{\scriptstyle \phi_\beta}\big\downarrow & & \big\downarrow{\scriptstyle \phi_\gamma} \\[2pt]
F^n & \xrightarrow{\;T_A\;} & F^m
\end{array}
\]

Proof: If β = (v_1, ..., v_n) is an ordered basis for V and γ = (w_1, ..., w_m) is an ordered basis for W, then let
\[
[T]_\beta^\gamma = \big[\, [T(v_1)]_\gamma \;\; \cdots \;\; [T(v_n)]_\gamma \,\big]
= \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}
\]
Now, ∀u ∈ V, ∃! b_1, ..., b_n ∈ F such that u = b_1v_1 + ··· + b_nv_n. Therefore,
\[
T(u) = T(b_1v_1 + \cdots + b_nv_n) = b_1T(v_1) + \cdots + b_nT(v_n)
\]
so that
\[
[T(u)]_\gamma = \phi_\gamma\big(T(u)\big) = \phi_\gamma\big(b_1T(v_1) + \cdots + b_nT(v_n)\big)
= b_1\phi_\gamma\big(T(v_1)\big) + \cdots + b_n\phi_\gamma\big(T(v_n)\big)
= b_1[T(v_1)]_\gamma + \cdots + b_n[T(v_n)]_\gamma
= [T]_\beta^\gamma [u]_\beta
\]
which completes the proof. ∎
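A numerical sanity check of (3.9), using the derivative map from the earlier example (our own Python/NumPy sketch): applying T and then taking γ-coordinates agrees with multiplying the β-coordinates by [T]_β^γ.

```python
import numpy as np

# [T]_beta^gamma for differentiation R_3[x] -> R_2[x] in the monomial bases.
A = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3]], dtype=float)

# f(x) = 5 - x + 4x^2 + 2x^3, written in beta-coordinates.
f_beta = np.array([5.0, -1.0, 4.0, 2.0])

# f'(x) = -1 + 8x + 6x^2, written in gamma-coordinates.
df_gamma = np.array([-1.0, 8.0, 6.0])

print(np.allclose(A @ f_beta, df_gamma))  # True: [T(f)]_gamma = [T]_beta^gamma [f]_beta
```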

Theorem 3.5 If V, W, Z are finite-dimensional vector spaces with ordered bases α, β and γ, respectively, and T ∈ L(V, W) and U ∈ L(W, Z), then
\[
[U \circ T]_\alpha^\gamma = [U]_\beta^\gamma [T]_\alpha^\beta \tag{3.11}
\]
Proof: By the previous two theorems we have, writing α = (v_1, ..., v_n),
\[
[U \circ T]_\alpha^\gamma
= \big[\, [(U\circ T)(v_1)]_\gamma \;\; \cdots \;\; [(U\circ T)(v_n)]_\gamma \,\big]
= \big[\, [U(T(v_1))]_\gamma \;\; \cdots \;\; [U(T(v_n))]_\gamma \,\big]
= \big[\, [U]_\beta^\gamma [T(v_1)]_\beta \;\; \cdots \;\; [U]_\beta^\gamma [T(v_n)]_\beta \,\big]
= [U]_\beta^\gamma \big[\, [T(v_1)]_\beta \;\; \cdots \;\; [T(v_n)]_\beta \,\big]
= [U]_\beta^\gamma [T]_\alpha^\beta
\]
which completes the proof. ∎
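The correspondence in (3.11) between composition and matrix multiplication is easy to test numerically (a Python/NumPy sketch of ours, reusing the derivative matrices): differentiating twice corresponds to the product of the two derivative matrices.

```python
import numpy as np

# [T]_alpha^beta : differentiation R_3[x] -> R_2[x] (monomial bases).
T_mat = np.array([[0, 1, 0, 0],
                  [0, 0, 2, 0],
                  [0, 0, 0, 3]], dtype=float)

# [U]_beta^gamma : differentiation R_2[x] -> R_1[x] (monomial bases).
U_mat = np.array([[0, 1, 0],
                  [0, 0, 2]], dtype=float)

# [U o T]_alpha^gamma should be the second-derivative map R_3[x] -> R_1[x]:
# d^2/dx^2 (x^2) = 2 and d^2/dx^2 (x^3) = 6x.
UT_expected = np.array([[0, 0, 2, 0],
                        [0, 0, 0, 6]], dtype=float)

print(np.allclose(U_mat @ T_mat, UT_expected))  # True
```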

Corollary 3.6 If V is an n-dimensional vector space over F with an ordered basis β and I ∈ L(V ) is the identity operator, then [I]β = In ∈ Mn(F ).

Proof: Since I(b_j) = b_j for each basis vector b_j of β, the jth column of [I]_β is [I(b_j)]_β = [b_j]_β = e_j, so [I]_β = [e_1 ··· e_n] = I_n. (Equivalently, by Theorem 3.5, for any invertible T ∈ L(V) we have [T]_β = [I ◦ T]_β = [I]_β[T]_β, and cancelling the invertible matrix [T]_β again gives [I]_β = I_n.) ∎

Theorem 3.7 If V and W are finite-dimensional vector spaces of dimensions n and m, respectively, with ordered bases β and γ, respectively, and T ∈ L(V,W ), then

\[
\operatorname{rank}(T) = \operatorname{rank}\big([T]_\beta^\gamma\big)
\quad\text{and}\quad
\operatorname{null}(T) = \operatorname{null}\big([T]_\beta^\gamma\big) \tag{3.12}
\]
Proof: This follows from Theorem 3.4, since φ_γ and φ_β are isomorphisms, and so carry bases into bases, and φ_γ ◦ T = T_A ◦ φ_β, so that
\[
\operatorname{rank}(T) = \dim\big(\operatorname{im}(T)\big) = \dim\big(\phi_\gamma(\operatorname{im}(T))\big) = \operatorname{rank}(\phi_\gamma \circ T)
= \operatorname{rank}(T_A \circ \phi_\beta) = \dim\big(\operatorname{im}(T_A \circ \phi_\beta)\big) = \dim\big(\operatorname{im}(T_A)\big)
= \operatorname{rank}(T_A) = \operatorname{rank}(A) = \operatorname{rank}\big([T]_\beta^\gamma\big)
\]
while the second equality follows from the rank-nullity theorem along with the fact that the ranks are equal. ∎
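A numerical illustration of (3.12) (our own Python/NumPy sketch): the derivative map on R_3[x] has rank 3, NumPy reports the same rank for its matrix representation, and the nullity is 4 − 3 = 1 (the constant polynomials).

```python
import numpy as np

# [T]_beta^gamma for d/dx : R_3[x] -> R_2[x] in the monomial bases.
A = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3]], dtype=float)

rank = np.linalg.matrix_rank(A)
print(rank)               # 3 = rank(T)
print(A.shape[1] - rank)  # 1 = null(T), spanned by the constant polynomials
```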

Theorem 3.8 If V and W are finite-dimensional vector spaces over F with ordered bases β = (b1, . . . , bn) and γ = (c1, . . . , cm), then the function

\[
\Phi : L(V, W) \to M_{m,n}(F) \tag{3.13}
\]
\[
\Phi(T) = [T]_\beta^\gamma \tag{3.14}
\]
is an isomorphism,
\[
\Phi \in GL\big(L(V,W),\, M_{m,n}(F)\big) \tag{3.15}
\]
and consequently
\[
L(V,W) \cong M_{m,n}(F) \tag{3.16}
\]
and
\[
\dim\big(L(V,W)\big) = \dim\big(M_{m,n}(F)\big) = mn \tag{3.17}
\]

Proof: First, Φ ∈ L(L(V,W), M_{m,n}(F)): given any T, U ∈ L(V,W), there exist scalars r_ij, s_ij ∈ F, for i = 1, ..., m and j = 1, ..., n, such that
\[
\begin{aligned}
T(b_1) &= r_{11}c_1 + \cdots + r_{m1}c_m &\qquad U(b_1) &= s_{11}c_1 + \cdots + s_{m1}c_m \\
&\;\;\vdots & &\;\;\vdots \\
T(b_n) &= r_{1n}c_1 + \cdots + r_{mn}c_m &\qquad U(b_n) &= s_{1n}c_1 + \cdots + s_{mn}c_m
\end{aligned}
\]
Hence, for all k, l ∈ F and j = 1, ..., n we have
\[
(kT + lU)(b_j) = kT(b_j) + lU(b_j) = k\sum_{i=1}^m r_{ij}c_i + l\sum_{i=1}^m s_{ij}c_i = \sum_{i=1}^m (kr_{ij} + ls_{ij})c_i
\]
As a consequence of this and the rules of matrix addition and scalar multiplication we have
\[
\Phi(kT + lU) = [kT + lU]_\beta^\gamma
= \begin{pmatrix} kr_{11}+ls_{11} & \cdots & kr_{1n}+ls_{1n} \\ \vdots & \ddots & \vdots \\ kr_{m1}+ls_{m1} & \cdots & kr_{mn}+ls_{mn} \end{pmatrix}
= k\begin{pmatrix} r_{11} & \cdots & r_{1n} \\ \vdots & \ddots & \vdots \\ r_{m1} & \cdots & r_{mn} \end{pmatrix}
+ l\begin{pmatrix} s_{11} & \cdots & s_{1n} \\ \vdots & \ddots & \vdots \\ s_{m1} & \cdots & s_{mn} \end{pmatrix}
= k[T]_\beta^\gamma + l[U]_\beta^\gamma = k\Phi(T) + l\Phi(U)
\]

Moreover, Φ is bijective by Theorem 2.5: ∀A ∈ M_{m,n}(F), ∃! T ∈ L(V,W) such that Φ(T) = A, because there exists a unique T ∈ L(V,W) such that
\[
T(b_j) = A_{1j}c_1 + \cdots + A_{mj}c_m \quad\text{for } j = 1, \dots, n
\]
This makes Φ onto. It is also 1-1 because ker(Φ) = {T_0}: for the zero matrix O ∈ M_{m,n}(F), the only T ∈ L(V,W) with Φ(T) = O is the zero transformation T_0, since
\[
T(b_j) = 0c_1 + \cdots + 0c_m = 0 \quad\text{for } j = 1, \dots, n
\]
defines a unique transformation, namely T_0. ∎

Theorem 3.9 If V and W are finite-dimensional vector spaces over the same field F with ordered bases β = (b_1, ..., b_n) and γ = (c_1, ..., c_n), respectively, then T ∈ L(V,W) is an isomorphism iff [T]_β^γ is invertible, in which case
\[
[T^{-1}]_\gamma^\beta = \big([T]_\beta^\gamma\big)^{-1} \tag{3.18}
\]

Proof: If T is an isomorphism and dim(V) = dim(W) = n, then T^{-1} ∈ L(W, V), and T ◦ T^{-1} = I_W and T^{-1} ◦ T = I_V, so that by Theorem 3.5, Corollary 3.6 and Theorem 3.1 we have
\[
[T]_\beta^\gamma\,[T^{-1}]_\gamma^\beta = [T \circ T^{-1}]_\gamma = [I_W]_\gamma = I_n = [I_V]_\beta = [T^{-1} \circ T]_\beta = [T^{-1}]_\gamma^\beta\,[T]_\beta^\gamma
\]
so that [T]_β^γ is invertible with inverse [T^{-1}]_γ^β, and by the uniqueness of the multiplicative inverse in M_n(F), which follows from the uniqueness of T_A^{-1} ∈ L(F^n), we have
\[
[T^{-1}]_\gamma^\beta = \big([T]_\beta^\gamma\big)^{-1}
\]

Conversely, if A = [T]_β^γ is invertible, there is an n × n matrix B such that AB = BA = I_n, so by Theorem 2.5 there is a unique U ∈ L(W, V) such that
\[
U(c_j) = \sum_{i=1}^{n} B_{ij} b_i \quad\text{for } j = 1, \dots, n
\]
Hence B = [U]_γ^β. To show that U = T^{-1}, note that
\[
[U \circ T]_\beta = [U]_\gamma^\beta\,[T]_\beta^\gamma = BA = I_n = [I_V]_\beta
\quad\text{and}\quad
[T \circ U]_\gamma = [T]_\beta^\gamma\,[U]_\gamma^\beta = AB = I_n = [I_W]_\gamma
\]
But since the representation map Φ of Theorem 3.8 (here applied to L(V) and to L(W)) is an isomorphism, and therefore 1-1, we must have that
\[
U \circ T = I_V \quad\text{and}\quad T \circ U = I_W
\]
By the uniqueness of the inverse, however, U = T^{-1}, and T is an isomorphism. ∎
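Numerically (our own Python/NumPy sketch), the invertibility of the matrix representation and the representation of the inverse map can be checked directly; here for T(x_1, x_2) = (x_1 + x_2, 2x_2) on R^2 in the standard basis.

```python
import numpy as np

# [T]_beta for T(x1, x2) = (x1 + x2, 2*x2) in the standard basis of R^2.
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])

A_inv = np.linalg.inv(A)   # the representation of T^{-1} in the same bases
print(A_inv)               # [[ 1.  -0.5]
                           #  [ 0.   0.5]]
print(np.allclose(A @ A_inv, np.eye(2)))  # True: [T][T^{-1}] = I_2
```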

4 Change of Coordinates

4.1 Definitions

We can also now define the change of coordinates, or change of basis, operator. If V is a finite-dimensional vector space, say dim(V) = n, and β = (b_1, ..., b_n) and γ = (c_1, ..., c_n) are two ordered bases for V, then the coordinate maps φ_β, φ_γ ∈ L(V, F^n), which are isomorphisms by Theorem 2.15, may be used to define a change of coordinates operator φ_{β,γ} ∈ L(F^n) changing β coordinates into γ coordinates, that is, having the property

\[
\phi_{\beta,\gamma}\big([v]_\beta\big) = [v]_\gamma \tag{4.1}
\]

We define the operator by
\[
\phi_{\beta,\gamma} := \phi_\gamma \circ \phi_\beta^{-1} \tag{4.2}
\]
The relationship between these three functions is illustrated in the following commutative diagram:

\[
\begin{array}{ccc}
F^n & \xrightarrow{\;\phi_\gamma \circ \phi_\beta^{-1}\;} & F^n \\[2pt]
& {\scriptstyle \phi_\beta}\nwarrow \quad \nearrow{\scriptstyle \phi_\gamma} & \\[2pt]
& V &
\end{array}
\]
The change of coordinates operator has a matrix representation (with ρ the standard ordered basis of F^n, as in Theorem 4.1 below)
\[
M_{\beta,\gamma} = [\phi_{\beta,\gamma}]_\rho = [\phi_\gamma]_\beta^\rho
= \big[\, [b_1]_\gamma \;\; [b_2]_\gamma \;\; \cdots \;\; [b_n]_\gamma \,\big] \tag{4.3}
\]
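In F^n the change of coordinate matrix can be computed directly from (4.3): the jth column is [b_j]_γ, found by solving a linear system against the γ basis. A Python/NumPy sketch of ours, with made-up helper names:

```python
import numpy as np

def change_of_coordinates(beta, gamma):
    """Return M_{beta,gamma}, whose j-th column is [b_j]_gamma.

    beta, gamma: lists of basis vectors of R^n.
    """
    B = np.column_stack(beta)   # columns are the b_j
    C = np.column_stack(gamma)  # columns are the c_j
    # Solve C @ M = B column by column, so that M[:, j] = [b_j]_gamma.
    return np.linalg.solve(C, B)

beta  = [np.array([1.0, 1.0]), np.array([1.0, -1.0])]
gamma = [np.array([2.0, 0.0]), np.array([0.0, 1.0])]

M = change_of_coordinates(beta, gamma)
print(M)   # [[ 0.5  0.5]
           #  [ 1.  -1. ]]
```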

4.2 Change of Coordinate Maps and Matrices

Theorem 4.1 (Change of Coordinate Matrix) Let V be a finite-dimensional vector space over a field F and β = (b1, . . . , bn) and γ = (c1, . . . , cn) be two ordered bases for V . Since φβ and φγ are isomorphisms, the following diagram commutes,

\[
\begin{array}{ccc}
F^n & \xrightarrow{\;\phi_\gamma \circ \phi_\beta^{-1}\;} & F^n \\[2pt]
& {\scriptstyle \phi_\beta}\nwarrow \quad \nearrow{\scriptstyle \phi_\gamma} & \\[2pt]
& V &
\end{array}
\]

and the change of basis operator φ_{β,γ} := φ_γ ◦ φ_β^{-1} ∈ L(F^n), changing β coordinates into γ coordinates, is an automorphism of F^n. Its matrix representation,
\[
M_{\beta,\gamma} = [\phi_{\beta,\gamma}]_\rho \in M_n(F) \tag{4.4}
\]
where ρ = (e_1, ..., e_n) is the standard ordered basis for F^n, is called the change of coordinate matrix, and it satisfies the following conditions:

1. M_{β,γ} = [φ_{β,γ}]_ρ = [φ_γ]_β^ρ = [ [b_1]_γ  [b_2]_γ  ···  [b_n]_γ ]

2. [v]_γ = M_{β,γ}[v]_β for all v ∈ V

3. M_{β,γ} is invertible and M_{β,γ}^{-1} = M_{γ,β} = [φ_β]_γ^ρ

Proof: The first point is shown as follows:
\[
\begin{aligned}
M_{\beta,\gamma} = [\phi_{\beta,\gamma}]_\rho
&= \big[\,\phi_{\beta,\gamma}(e_1) \;\; \phi_{\beta,\gamma}(e_2) \;\; \cdots \;\; \phi_{\beta,\gamma}(e_n)\,\big] \\
&= \big[\,(\phi_\gamma\circ\phi_\beta^{-1})(e_1) \;\; (\phi_\gamma\circ\phi_\beta^{-1})(e_2) \;\; \cdots \;\; (\phi_\gamma\circ\phi_\beta^{-1})(e_n)\,\big] \\
&= \big[\,(\phi_\gamma\circ\phi_\beta^{-1})([b_1]_\beta) \;\; (\phi_\gamma\circ\phi_\beta^{-1})([b_2]_\beta) \;\; \cdots \;\; (\phi_\gamma\circ\phi_\beta^{-1})([b_n]_\beta)\,\big] \\
&= \big[\,\phi_\gamma(b_1) \;\; \phi_\gamma(b_2) \;\; \cdots \;\; \phi_\gamma(b_n)\,\big] \\
&= \big[\,[b_1]_\gamma \;\; [b_2]_\gamma \;\; \cdots \;\; [b_n]_\gamma\,\big] \\
&= [\phi_\gamma]_\beta^\rho
\end{aligned}
\]
or, by Theorem 3.5 and Theorem 3.9, we have that
\[
M_{\beta,\gamma} = [\phi_{\beta,\gamma}]_\rho = [\phi_\gamma\circ\phi_\beta^{-1}]_\rho
= [\phi_\gamma]_\beta^\rho\,[\phi_\beta^{-1}]_\rho^\beta
= [\phi_\gamma]_\beta^\rho\,\big([\phi_\beta]_\beta^\rho\big)^{-1}
= [\phi_\gamma]_\beta^\rho\, I_n^{-1}
= [\phi_\gamma]_\beta^\rho\, I_n
= [\phi_\gamma]_\beta^\rho
\]
The second point follows from Theorem 3.4, since

\[
\phi_\gamma(v) = (\phi_\gamma \circ I)(v) = \big(\phi_\gamma \circ (\phi_\beta^{-1} \circ \phi_\beta)\big)(v) = \big((\phi_\gamma \circ \phi_\beta^{-1}) \circ \phi_\beta\big)(v)
\]
implies that
\[
[v]_\gamma = [\phi_\gamma(v)]_\rho = \big[\big((\phi_\gamma\circ\phi_\beta^{-1})\circ\phi_\beta\big)(v)\big]_\rho
= [\phi_\gamma\circ\phi_\beta^{-1}]_\rho\,[\phi_\beta(v)]_\rho
= [\phi_{\beta,\gamma}]_\rho\,[v]_\beta
= M_{\beta,\gamma}[v]_\beta
\]

And the last point follows from the fact that φ_β and φ_γ are isomorphisms, so that φ_{β,γ} is an isomorphism, and hence φ_{β,γ}^{-1} ∈ L(F^n) is an isomorphism, and because the diagram above commutes we must have
\[
\phi_{\beta,\gamma}^{-1} = (\phi_\gamma \circ \phi_\beta^{-1})^{-1} = \phi_\beta \circ \phi_\gamma^{-1} = \phi_{\gamma,\beta}
\]
so that, by point 1,
\[
M_{\beta,\gamma}^{-1} = [\phi_{\beta,\gamma}^{-1}]_\rho = [\phi_{\gamma,\beta}]_\rho = M_{\gamma,\beta}
\]
or alternatively by Theorem 3.9
\[
M_{\beta,\gamma}^{-1} = \big([\phi_{\beta,\gamma}]_\rho\big)^{-1} = [\phi_{\beta,\gamma}^{-1}]_\rho = [\phi_{\gamma,\beta}]_\rho = [\phi_\beta]_\gamma^\rho = M_{\gamma,\beta} \qquad ∎
\]

Corollary 4.2 (Change of Basis) Let V and W be finite-dimensional vector spaces over the same field F and let (β, γ) and (β′, γ′) be pairs of ordered bases for V and W, respectively. If T ∈ L(V,W), then

\[
[T]_{\beta'}^{\gamma'} = M_{\gamma,\gamma'}\,[T]_\beta^\gamma\,M_{\beta',\beta} \tag{4.5}
\]
\[
\hphantom{[T]_{\beta'}^{\gamma'}} = M_{\gamma,\gamma'}\,[T]_\beta^\gamma\,M_{\beta,\beta'}^{-1} \tag{4.6}
\]
where M_{γ,γ′} and M_{β′,β} are change of coordinate matrices. That is, the following diagram commutes

(Commutative diagram: T : V → W along the bottom; φ_β, φ_{β′} map V into two copies of F^n and φ_γ, φ_{γ′} map W into two copies of F^m; the horizontal maps between the copies of F^n and F^m are the matrix multiplication maps T_A and T_{A′}, where A = [T]_β^γ and A′ = [T]_{β′}^{γ′}, and the two copies of F^n and of F^m are linked by the change of coordinate operators φ_{β,β′} and φ_{γ′,γ}.)

Proof: Write β = (b_1, ..., b_n), β′ = (b′_1, ..., b′_n), γ = (c_1, ..., c_m), and γ′ = (c′_1, ..., c′_m). For each j = 1, ..., n, Theorem 4.1 and Theorem 3.4 give
\[
[T(b'_j)]_{\gamma'} = M_{\gamma,\gamma'}\,[T(b'_j)]_\gamma
= M_{\gamma,\gamma'}\,[T]_\beta^\gamma\,[b'_j]_\beta
= M_{\gamma,\gamma'}\,[T]_\beta^\gamma\,M_{\beta',\beta}\,[b'_j]_{\beta'}
= M_{\gamma,\gamma'}\,[T]_\beta^\gamma\,M_{\beta',\beta}\,e_j
\]
so the jth column of [T]_{β′}^{γ′} is the jth column of M_{γ,γ′}[T]_β^γ M_{β′,β}. Hence
\[
[T]_{\beta'}^{\gamma'} = M_{\gamma,\gamma'}\,[T]_\beta^\gamma\,M_{\beta',\beta} = M_{\gamma,\gamma'}\,[T]_\beta^\gamma\,M_{\beta,\beta'}^{-1}
\]
which completes the proof. ∎

Corollary 4.3 (Change of Basis for a Linear Operator) If V is a finite-dimensional vector space with ordered bases β and γ, and T ∈ L(V ), then

\[
[T]_\gamma = M_{\beta,\gamma}\,[T]_\beta\,M_{\beta,\gamma}^{-1} \tag{4.7}
\]
where M_{β,γ} is the change of coordinates matrix. ∎
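A numerical check of Corollary 4.3 (our own Python/NumPy sketch): conjugating the matrix of an operator by the change of coordinate matrix yields its matrix in the new basis. Here T(x_1, x_2) = (2x_1, x_1 + x_2) on R^2, with β the standard basis and γ = ((1, 1), (1, −1)).

```python
import numpy as np

# [T]_beta for T(x1, x2) = (2*x1, x1 + x2), beta the standard basis of R^2.
T_std = np.array([[2.0, 0.0],
                  [1.0, 1.0]])

C = np.array([[1.0, 1.0],
              [1.0, -1.0]])       # columns are the gamma basis vectors

M = np.linalg.inv(C)              # M_{beta,gamma}: standard coords -> gamma coords
T_gamma = M @ T_std @ np.linalg.inv(M)   # Corollary 4.3

# Independent check: the j-th column of [T]_gamma is [T(c_j)]_gamma.
cols = [np.linalg.solve(C, T_std @ C[:, j]) for j in range(2)]
print(np.allclose(T_gamma, np.column_stack(cols)))  # True
```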

Corollary 4.4 If we are given any two of the following:

(1) an invertible matrix A ∈ M_n(F)
(2) an ordered basis β for F^n
(3) an ordered basis γ for F^n

the third is uniquely determined by the equation A = M_{β,γ}, where M_{β,γ} is the change of coordinates matrix of the previous theorem.

Proof: Suppose A = M_{β,γ} = [φ_γ]_β^ρ = [ [b_1]_γ  [b_2]_γ  ···  [b_n]_γ ]. If A and γ are known, then [b_i]_γ is the ith column of A, so b_i = A_{1i}c_1 + ··· + A_{ni}c_n, and β is uniquely determined. If β and γ are given, then by the previous theorem M_{β,γ} is given by M_{β,γ} = [ [b_1]_γ  [b_2]_γ  ···  [b_n]_γ ]. Lastly, if A and β are given, then γ is determined by the first case applied to A^{-1} = M_{γ,β}. ∎
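For instance (our own Python/NumPy sketch), given A and γ one can reconstruct β column by column, exactly as in the first case of the proof.

```python
import numpy as np

# A given invertible matrix, interpreted as A = M_{beta,gamma}.
A = np.array([[0.5, 0.5],
              [1.0, -1.0]])
gamma = [np.array([2.0, 0.0]), np.array([0.0, 1.0])]
C = np.column_stack(gamma)

# b_i = A_{1i} c_1 + ... + A_{ni} c_n, i.e. the beta vectors are C times the columns of A.
B = C @ A
print(B.T)   # [[ 1.  1.]
             #  [ 1. -1.]]  -> beta = ((1, 1), (1, -1))
```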
