
Notes on Linear Transformations

November 17, 2014

Recall that a linear transformation is a map $T\colon V \to W$ between vector spaces $V$ and $W$ such that

(i) $T(c\vec v) = cT(\vec v)$ for all $\vec v$ in $V$ and all scalars $c$. (Geometrically, $T$ takes lines to lines.)

(ii) $T(\vec v_1 + \vec v_2) = T(\vec v_1) + T(\vec v_2)$ for all $\vec v_1, \vec v_2$ in $V$. (Geometrically, $T$ takes parallelograms to parallelograms.)

In the following, we will always assume that $T\colon V \to W$ is a linear transformation. Here is a summary of what will be discussed below.

(1) $T\colon V \to W$ is completely determined by how it acts on a basis $\mathcal{B} = \{\vec v_1, \dots, \vec v_n\}$ of $V$.

(2) Every $T\colon \mathbb{R}^n \to \mathbb{R}^m$ is multiplication by a matrix $A$.

(3) $\mathcal{B}$-coordinates turn $V$ into $\mathbb{R}^n$.

(4) To interpret $T\colon V \to W$ as multiplication by a matrix, we can choose bases $\mathcal{B} = \{\vec v_1, \dots, \vec v_n\}$ of $V$ and $\mathcal{B}' = \{\vec w_1, \dots, \vec w_m\}$ of $W$, which turn $V$ into $\mathbb{R}^n$ and $W$ into $\mathbb{R}^m$ so that multiplying by a matrix makes sense.

(5) An in-depth example: the derivative and integral are linear transformations.

(1) $T\colon V \to W$ is determined by how it acts on $\mathcal{B}$.

Let $\mathcal{B} = \{\vec v_1, \dots, \vec v_n\}$ be a basis of $V$. If $\vec v$ is any vector in $V$, then there is a unique way to write $\vec v$ as a linear combination of the vectors in the basis: $\vec v = c_1 \vec v_1 + \cdots + c_n \vec v_n$. The linear combination exists since the basis vectors span $V$, and it is unique since the basis vectors are independent. Then, by linearity of $T$, we compute

$$T(\vec v) = T(c_1 \vec v_1 + \cdots + c_n \vec v_n) = c_1 T(\vec v_1) + \cdots + c_n T(\vec v_n),$$

which shows that every $T(\vec v)$ is fully determined by the vectors $T(\vec v_1), \dots, T(\vec v_n)$. In particular, this shows that any two linear transformations that act the same way on all the basis vectors must in fact be the same linear transformation.

(2) Every $T\colon \mathbb{R}^n \to \mathbb{R}^m$ is multiplication by a matrix $A$.

When our vector spaces are $\mathbb{R}^n$ and $\mathbb{R}^m$, there is a matrix $A$ so that $T(\vec v) = A\vec v$ for all $\vec v$ in $\mathbb{R}^n$. This means that $T$ is just multiplication by the matrix $A$. To find this $A$, compute how $T$ acts on the standard basis vectors of $\mathbb{R}^n$ and use the resulting vectors as the columns of $A$. For instance, when $n = 3$, compute the vectors

$$T\left(\begin{bmatrix}1\\0\\0\end{bmatrix}\right), \quad T\left(\begin{bmatrix}0\\1\\0\end{bmatrix}\right), \quad T\left(\begin{bmatrix}0\\0\\1\end{bmatrix}\right),$$

which are in $\mathbb{R}^m$, and use these three vectors as the columns of $A$. Why does this give the right matrix? By (1), it is enough to check that $T$ and $A$ act the same way on all the standard basis vectors. But since

$$A\begin{bmatrix}1\\0\\0\end{bmatrix} = \text{first column of } A = T\left(\begin{bmatrix}1\\0\\0\end{bmatrix}\right)$$

and similar equalities hold for the second and third columns of $A$, our definition of $A$ was exactly the right one to ensure that $A$ agrees with $T$ on the standard basis vectors.
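This column-by-column recipe is easy to check numerically. The following is a minimal sketch; the map $T(x, y, z) = (x + y,\ 2z)$ is a made-up example, not one from the notes:

```python
import numpy as np

# A hypothetical linear map T: R^3 -> R^2, T(x, y, z) = (x + y, 2z).
def T(v):
    x, y, z = v
    return np.array([x + y, 2.0 * z])

# The columns of A are T applied to the standard basis vectors,
# which are the rows of the 3x3 identity matrix.
A = np.column_stack([T(e) for e in np.eye(3)])

# Multiplication by A agrees with T on an arbitrary vector.
v = np.array([1.0, 2.0, 3.0])
assert np.allclose(A @ v, T(v))
```

By (1), agreeing on the three standard basis vectors already forces agreement on every vector, which is why the single spot-check above is convincing.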

(3) $\mathcal{B}$-coordinates turn $V$ into $\mathbb{R}^n$.

In order to interpret $T\colon V \to W$ as multiplication by a matrix, we first need to make sure that $V$ looks like $\mathbb{R}^n$. This is necessary because matrices multiply column vectors, and the vector space $V$ could consist of vectors that bear no resemblance to column vectors (for instance, $V$ could be a vector space of polynomials). Similarly, we will need to make $W$ look like $\mathbb{R}^m$, because multiplying our matrix by column vectors "in $V$" will yield column vectors that are supposed to be in $W$.

Luckily for us, a basis is exactly the right thing to make a vector space look like $\mathbb{R}^n$. Let $\mathcal{B} = \{\vec v_1, \dots, \vec v_n\}$ be a basis for $V$. As noted above, any vector $\vec v$ in $V$ can be uniquely expressed as a linear combination $\vec v = c_1 \vec v_1 + \cdots + c_n \vec v_n$. The scalars $c_1, \dots, c_n$ are called the $\mathcal{B}$-coordinates of $\vec v$, and we put them into a column vector

$$[\vec v]_{\mathcal{B}} = \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix}.$$

The notation $[\vec v]_{\mathcal{B}}$ means "the $\mathcal{B}$-coordinates of $\vec v$", which is a vector in $\mathbb{R}^n$. By taking $\mathcal{B}$-coordinates of all the vectors in $V$, we effectively turn $V$ into $\mathbb{R}^n$. (More precisely, taking $\mathcal{B}$-coordinates is a linear map $[\,\cdot\,]_{\mathcal{B}}\colon V \to \mathbb{R}^n$ that induces a one-to-one correspondence between the vectors in $V$ and the vectors in $\mathbb{R}^n$. Such linear maps are called isomorphisms.)
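Finding $\mathcal{B}$-coordinates amounts to solving a linear system. Here is a small sketch under assumed data: the basis $\{1,\ 1+x,\ 1+x+x^2\}$ for polynomials of degree $\leq 2$ is a hypothetical choice for illustration, with each polynomial recorded by its coefficient vector (constant, $x$, $x^2$):

```python
import numpy as np

# Columns are the coefficient vectors of the hypothetical basis
# {1, 1 + x, 1 + x + x^2}.
basis = np.array([[1.0, 1.0, 1.0],
                  [0.0, 1.0, 1.0],
                  [0.0, 0.0, 1.0]])

p = np.array([2.0, 0.0, -3.0])      # the polynomial 2 - 3x^2

# B-coordinates: the unique c with c1*b1 + c2*b2 + c3*b3 = p.
coords = np.linalg.solve(basis, p)  # -> [2., 3., -3.]

assert np.allclose(basis @ coords, p)  # the coordinates rebuild p
```

Indeed $2 \cdot 1 + 3(1+x) - 3(1+x+x^2) = 2 - 3x^2$, so existence and uniqueness of the coordinates come down to the basis matrix being invertible.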

(4) Using $\mathcal{B}$, $\mathcal{B}'$ to interpret $T\colon V \to W$ as multiplication by a matrix.

Choose a basis $\mathcal{B} = \{\vec v_1, \dots, \vec v_n\}$ of $V$ and a basis $\mathcal{B}' = \{\vec w_1, \dots, \vec w_m\}$ of $W$. By taking coordinates, we can view any $\vec v$ in $V$ as a column vector $[\vec v]_{\mathcal{B}}$ in $\mathbb{R}^n$. Similarly, any $\vec w$ in $W$ has an associated column vector $[\vec w]_{\mathcal{B}'}$ in $\mathbb{R}^m$. Now our method in (2) will reveal that $T$ is multiplication by a matrix $A$. Following (2), we need to determine how $T$ acts on the standard basis vectors of $\mathbb{R}^n$. But the standard basis vectors are simply the $\mathcal{B}$-coordinates $[\vec v_1]_{\mathcal{B}}, \dots, [\vec v_n]_{\mathcal{B}}$ of our basis $\mathcal{B}$! Thus we're interested in computing $T(\vec v_1), \dots, T(\vec v_n)$. However, if we want to use the vectors $T(\vec v_1), \dots, T(\vec v_n)$ as the columns of $A$, we first need to turn them into column vectors; this is achieved by taking $\mathcal{B}'$-coordinates. Thus the columns of $A$ are

$$[T(\vec v_1)]_{\mathcal{B}'}, \dots, [T(\vec v_n)]_{\mathcal{B}'}.$$

This definition of $A$ exactly ensures that

$$A[\vec v_1]_{\mathcal{B}} = [T(\vec v_1)]_{\mathcal{B}'}, \quad \dots, \quad A[\vec v_n]_{\mathcal{B}} = [T(\vec v_n)]_{\mathcal{B}'},$$

which is a confusing way of saying that $A$ agrees with $T$ when we turn $V$ and $W$ into $\mathbb{R}^n$ and $\mathbb{R}^m$. One thing to emphasize is that this matrix $A$ depends heavily on the chosen bases $\mathcal{B}$ and $\mathcal{B}'$. Different bases lead to different matrices.

(5) In-depth example: derivative and integral of polynomials.

Let $V$ be the vector space of polynomials in $x$ of degree $\leq 3$, and let $W$ be the vector space of polynomials in $x$ of degree $\leq 2$. Let $T\colon V \to W$ be the derivative $T(\vec v) = \frac{d\vec v}{dx}$. The fact that the derivative is linear is one of the basic properties you learned in Calculus I! For instance, linearity says that you can compute

$$\frac{d}{dx}(1 + 2x - x^3) = \frac{d}{dx}(1) + \frac{d}{dx}(2x) - \frac{d}{dx}(x^3) = 0 + 2 - 3x^2 = 2 - 3x^2$$

term-by-term in a manner that has hopefully become instinctive for you. Let's try to view $T$ as multiplication by a matrix. First, we pick bases for $V$ and $W$. The nicest choices are $\mathcal{B} = \{1, x, x^2, x^3\}$ for $V$ and $\mathcal{B}' = \{1, x, x^2\}$ for $W$. With these nice bases, computing the coordinates of a vector $\vec v$ simply amounts to building a column vector out of the coefficients of $\vec v$. For instance,

$$[1 + 2x - x^3]_{\mathcal{B}} = \begin{bmatrix}1\\2\\0\\-1\end{bmatrix} \quad\text{and}\quad [2 - 3x^2]_{\mathcal{B}'} = \begin{bmatrix}2\\0\\-3\end{bmatrix}.$$

The columns of our matrix $A$ are

$$[T(1)]_{\mathcal{B}'} = [0]_{\mathcal{B}'} = \begin{bmatrix}0\\0\\0\end{bmatrix},\quad [T(x)]_{\mathcal{B}'} = [1]_{\mathcal{B}'} = \begin{bmatrix}1\\0\\0\end{bmatrix},\quad [T(x^2)]_{\mathcal{B}'} = [2x]_{\mathcal{B}'} = \begin{bmatrix}0\\2\\0\end{bmatrix},\quad [T(x^3)]_{\mathcal{B}'} = [3x^2]_{\mathcal{B}'} = \begin{bmatrix}0\\0\\3\end{bmatrix},$$

so

$$A = \begin{bmatrix}0&1&0&0\\0&0&2&0\\0&0&0&3\end{bmatrix}.$$

To see how $A$ acts on a polynomial like $1 + 2x - x^3$, we first compute the $\mathcal{B}$-coordinates $[1 + 2x - x^3]_{\mathcal{B}}$ as above and then take the product

$$A[1 + 2x - x^3]_{\mathcal{B}} = \begin{bmatrix}0&1&0&0\\0&0&2&0\\0&0&0&3\end{bmatrix}\begin{bmatrix}1\\2\\0\\-1\end{bmatrix} = \begin{bmatrix}2\\0\\-3\end{bmatrix} = [2 - 3x^2]_{\mathcal{B}'},$$

which agrees with our earlier calculation of how the derivative acts! Note that the first column of $A$ is the only free column, so the nullspace of $A$ is

$$N(A) = \operatorname{span}\left(\begin{bmatrix}1\\0\\0\\0\end{bmatrix}\right) = \operatorname{span}([1]_{\mathcal{B}}).$$

The column vectors in this nullspace correspond to the constant polynomials, which are indeed the kernel of $T$: the derivative of any constant is $0$! Also, since $A$ has rank $3$, the column space of $A$ has dimension $3$, so the range of $T$ must be all of $W$. This means that every polynomial of degree $\leq 2$ is the derivative of a polynomial of degree $\leq 3$.

Let's apply similar reasoning for the indefinite integral (the antiderivative). Using the same $V$ and $W$, the indefinite integral gives a linear transformation $S\colon W \to V$ defined by $S(\vec w) = \int \vec w \, dx$. Since the indefinite integral is only defined up to a constant $C$, we must make a choice, namely $C = 0$. Again, the linearity of the integral is one of the basic properties you saw in Calculus I (here it is important that we chose $C = 0$!). For instance, linearity allows us to compute

$$S(2 - 3x^2) = \int (2 - 3x^2)\, dx = \int 2\, dx - \int 3x^2\, dx = 2x - x^3.$$
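Before building the matrix for $S$, it is worth checking the derivative matrix $A$ found above numerically. A minimal sketch, with polynomials stored as coefficient vectors in the basis $\{1, x, x^2, x^3\}$:

```python
import numpy as np

# The derivative matrix A from the notes, in the bases
# {1, x, x^2, x^3} for V and {1, x, x^2} for W.
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0, 3.0]])

p = np.array([1.0, 2.0, 0.0, -1.0])       # B-coordinates of 1 + 2x - x^3
dp = A @ p                                 # -> [2., 0., -3.]

assert np.allclose(dp, [2.0, 0.0, -3.0])   # B'-coordinates of 2 - 3x^2
assert np.linalg.matrix_rank(A) == 3       # rank 3, as claimed above
```

The rank check confirms the claim that the range of $T$ is all of $W$.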

Let's find the matrix $B$ for $S$ in the bases $\mathcal{B}'$, $\mathcal{B}$. The columns of $B$ are

$$[S(1)]_{\mathcal{B}} = [x]_{\mathcal{B}} = \begin{bmatrix}0\\1\\0\\0\end{bmatrix},\quad [S(x)]_{\mathcal{B}} = \left[\tfrac{1}{2}x^2\right]_{\mathcal{B}} = \begin{bmatrix}0\\0\\\tfrac{1}{2}\\0\end{bmatrix},\quad [S(x^2)]_{\mathcal{B}} = \left[\tfrac{1}{3}x^3\right]_{\mathcal{B}} = \begin{bmatrix}0\\0\\0\\\tfrac{1}{3}\end{bmatrix},$$

so that

$$B = \begin{bmatrix}0&0&0\\1&0&0\\0&\tfrac{1}{2}&0\\0&0&\tfrac{1}{3}\end{bmatrix}.$$

Let's check how $B$ acts on $2 - 3x^2$:

$$B[2 - 3x^2]_{\mathcal{B}'} = \begin{bmatrix}0&0&0\\1&0&0\\0&\tfrac{1}{2}&0\\0&0&\tfrac{1}{3}\end{bmatrix}\begin{bmatrix}2\\0\\-3\end{bmatrix} = \begin{bmatrix}0\\2\\0\\-1\end{bmatrix} = [2x - x^3]_{\mathcal{B}},$$

which agrees with our above calculation. Note that $B$ has rank $3$, so $N(B) = \{\vec 0\}$, so the kernel of $S$ is $\{0\}$: the only polynomial whose indefinite integral is $0$ is the $0$ polynomial. Moreover, $C(B)$ consists of all column vectors in $\mathbb{R}^4$ with $0$ as their first coordinate, so the range of $S$ is all polynomials with no constant term: we made the choice to use the constant term $0$ for our indefinite integral back when we defined $S$.

One more thing to do is to think about what happens when we act by both $T$ and $S$ in either order. Since $S$ is the antiderivative,

$$T(S(\vec w)) = \frac{d}{dx} \int \vec w \, dx = \vec w$$

leaves $\vec w$ unchanged (the composition $T \circ S$ is the identity transformation on $W$). To see how $T$ and $S$ act in the other order, let $\vec v = a + bx + cx^2 + dx^3$ be a general element of $V$. Then

$$S(T(\vec v)) = \int \frac{d\vec v}{dx}\, dx = \int (b + 2cx + 3dx^2)\, dx = bx + cx^2 + dx^3,$$

which leaves $\vec v$ the same except for killing the constant term (the composition $S \circ T$ is a projection onto $\operatorname{span}(x, x^2, x^3)$). These properties can be easily seen using the matrices $A$ and $B$. The composition $T \circ S$ corresponds to the matrix product $AB$, which yields

$$AB = \begin{bmatrix}0&1&0&0\\0&0&2&0\\0&0&0&3\end{bmatrix}\begin{bmatrix}0&0&0\\1&0&0\\0&\tfrac{1}{2}&0\\0&0&\tfrac{1}{3}\end{bmatrix} = \begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix},$$

the identity transformation. Likewise,

$$BA = \begin{bmatrix}0&0&0\\1&0&0\\0&\tfrac{1}{2}&0\\0&0&\tfrac{1}{3}\end{bmatrix}\begin{bmatrix}0&1&0&0\\0&0&2&0\\0&0&0&3\end{bmatrix} = \begin{bmatrix}0&0&0&0\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{bmatrix}$$

is the projection matrix we described above. Although $A$ and $B$ are not inverses (since $A$ and $B$ are not square, there's no chance of them being invertible), $B$ is the pseudoinverse $A^+$ of $A$: the two compositions calculated above give projections onto the column space and row space of $A$. (See \S 7.3 of Strang for more about pseudoinverses.)
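All three matrix facts just discussed (that $AB$ is the identity, that $BA$ is a projection killing the constant term, and that $B$ is the pseudoinverse $A^+$) can be verified numerically. A short sketch using NumPy:

```python
import numpy as np

# The derivative matrix A and the integration matrix B from the notes
# (constant of integration C = 0).
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0, 3.0]])
B = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.5, 0.0],
              [0.0, 0.0, 1.0 / 3.0]])

# T∘S is the identity on W: differentiating an antiderivative gives w back.
assert np.allclose(A @ B, np.eye(3))

# S∘T projects onto span(x, x^2, x^3): the constant term is killed.
assert np.allclose(B @ A, np.diag([0.0, 1.0, 1.0, 1.0]))

# B is the Moore-Penrose pseudoinverse of A.
assert np.allclose(np.linalg.pinv(A), B)
```

The last assertion checks the pseudoinverse claim directly against NumPy's `np.linalg.pinv`, which computes $A^+$ via the singular value decomposition.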
