
3 Linear function

3.1 Introduction

In calculus, a vector in the plane $\mathbb{R}^2$ with components 2 and −3 is usually written using notation such as $\vec{v} = \langle 2, -3 \rangle$. For our purposes it turns out to be more convenient to express such a vector as a 2 × 1 matrix:

\[ \mathbf{x} = \begin{bmatrix} 2 \\ -3 \end{bmatrix}. \]

More generally, a vector in $\mathbb{R}^n$ is written as an n × 1 matrix. When writing vectors in text we usually use the matrix transpose notation to avoid unseemly vertical spacing. For instance, we might write $\mathbf{x} = [6, -1, 3, 2]^T$ when we want to say

\[ \mathbf{x} = \begin{bmatrix} 6 \\ -1 \\ 3 \\ 2 \end{bmatrix}. \]

The addition and scalar multiplication defined for matrices (Section 2.1) give an addition and a scalar multiplication for vectors, which coincide with the calculus definitions.

The idea of a function plays a central role in calculus and the same is true for linear algebra. For most of the functions in calculus the inputs and outputs are both real numbers, but in linear algebra, the functions we study have inputs and outputs that are vectors. For instance, here is a function L from the set $\mathbb{R}^2$ to the set $\mathbb{R}^3$:

\[ L\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_1 + 4x_2 \\ 3x_1 - x_2 \\ x_2 \end{bmatrix}. \]

The notation works just like it did in calculus. For example, if the input vector is $[2, 1]^T$, then the output vector is

\[ L\left(\begin{bmatrix} 2 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} (2) + 4(1) \\ 3(2) - (1) \\ (1) \end{bmatrix} = \begin{bmatrix} 6 \\ 5 \\ 1 \end{bmatrix}. \]

This function satisfies a couple of properties that make it “linear,” meaning that it is compatible with the addition and scalar multiplication of vectors (the precise definition is given below). Linear functions are the main functions in linear algebra. We study them in this section.


3.2 Definition and examples

Linear function.

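A function $L : \mathbb{R}^n \to \mathbb{R}^m$ is linear if

(a) $L(\mathbf{x} + \mathbf{y}) = L(\mathbf{x}) + L(\mathbf{y})$,

(b) $L(\alpha\mathbf{x}) = \alpha L(\mathbf{x})$,

for all $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, $\alpha \in \mathbb{R}$.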

The notation L : Rn → Rm is used to indicate that the input vectors come from the set Rn (= domain of L) and the output vectors are in the set Rm (= codomain of L).
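The two defining properties can be spot-checked numerically. The following is a minimal sketch in plain Python, using the function L from Section 3.1; the helper names (`add`, `scale`, `passes_linearity_checks`) and the sample inputs are illustrative assumptions, not part of the text. Passing such a check on finitely many samples does not prove linearity (that requires the algebraic argument), but a single failure would disprove it.

```python
# Spot-check properties (a) and (b) of a linear function on sample inputs.
# Vectors are plain Python lists; all helper names are illustrative.

def L(x):
    """The function L : R^2 -> R^3 from Section 3.1."""
    x1, x2 = x
    return [x1 + 4 * x2, 3 * x1 - x2, x2]

def add(u, v):
    """Componentwise sum of two vectors of equal length."""
    return [a + b for a, b in zip(u, v)]

def scale(alpha, u):
    """Scalar multiple alpha * u."""
    return [alpha * a for a in u]

def passes_linearity_checks(f, samples, scalars):
    """True if f satisfies (a) and (b) on every tested input."""
    additive = all(f(add(x, y)) == add(f(x), f(y))
                   for x in samples for y in samples)
    homogeneous = all(f(scale(a, x)) == scale(a, f(x))
                      for a in scalars for x in samples)
    return additive and homogeneous

samples = [[2, 1], [0, 0], [-3, 5], [7, -4]]  # integer inputs keep arithmetic exact
scalars = [0, 1, -2, 3]

print(L([2, 1]))                                     # [6, 5, 1]
print(passes_linearity_checks(L, samples, scalars))  # True
```

Integer test data is used on purpose so that the equality comparisons are exact rather than subject to floating-point rounding.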

3.2.1 Example Show that the function L : R2 → R3 given by

x1 +4x2 L(x)= 3x1 − x2 x2   is linear.

Solution First, the input vector $\mathbf{x}$ is an element of $\mathbb{R}^2$ (according to the notation $L : \mathbb{R}^2 \to \mathbb{R}^3$), so it is of the form $\mathbf{x} = [x_1, x_2]^T$. This is the meaning of $x_1$ and $x_2$ in the formula. We need to verify that L satisfies the two properties in the definition of linear function. For any $\mathbf{x}, \mathbf{y} \in \mathbb{R}^2$, we have

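\[
\begin{aligned}
L(\mathbf{x} + \mathbf{y}) &= L\left(\begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \end{bmatrix}\right) \\
&= \begin{bmatrix} (x_1 + y_1) + 4(x_2 + y_2) \\ 3(x_1 + y_1) - (x_2 + y_2) \\ (x_2 + y_2) \end{bmatrix} \\
&= \begin{bmatrix} (x_1 + 4x_2) + (y_1 + 4y_2) \\ (3x_1 - x_2) + (3y_1 - y_2) \\ (x_2) + (y_2) \end{bmatrix} \\
&= \begin{bmatrix} x_1 + 4x_2 \\ 3x_1 - x_2 \\ x_2 \end{bmatrix} + \begin{bmatrix} y_1 + 4y_2 \\ 3y_1 - y_2 \\ y_2 \end{bmatrix} \\
&= L(\mathbf{x}) + L(\mathbf{y}),
\end{aligned}
\]

so property (a) holds. (In the second equality, $x_1 + y_1$ plays the role of $x_1$ in the formula and $x_2 + y_2$ plays the role of $x_2$.) Next, for any $\mathbf{x} \in \mathbb{R}^2$ and $\alpha \in \mathbb{R}$, we have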

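\[
\begin{aligned}
L(\alpha\mathbf{x}) &= L\left(\begin{bmatrix} \alpha x_1 \\ \alpha x_2 \end{bmatrix}\right) = \begin{bmatrix} (\alpha x_1) + 4(\alpha x_2) \\ 3(\alpha x_1) - (\alpha x_2) \\ (\alpha x_2) \end{bmatrix} \\
&= \begin{bmatrix} \alpha(x_1 + 4x_2) \\ \alpha(3x_1 - x_2) \\ \alpha(x_2) \end{bmatrix} = \alpha \begin{bmatrix} x_1 + 4x_2 \\ 3x_1 - x_2 \\ x_2 \end{bmatrix} \\
&= \alpha L(\mathbf{x}),
\end{aligned}
\]

so property (b) holds. Therefore, L is linear.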

3.2.2 Example Show that the function L : R1 → R2 given by

\[ L(\mathbf{x}) = \begin{bmatrix} 2x_1 \\ -x_1 \end{bmatrix} \]

is linear.

Solution For any x, y ∈ R1, we have

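\[
\begin{aligned}
L(\mathbf{x} + \mathbf{y}) &= L([x_1 + y_1]) = \begin{bmatrix} 2(x_1 + y_1) \\ -(x_1 + y_1) \end{bmatrix} \\
&= \begin{bmatrix} (2x_1) + (2y_1) \\ (-x_1) + (-y_1) \end{bmatrix} = \begin{bmatrix} 2x_1 \\ -x_1 \end{bmatrix} + \begin{bmatrix} 2y_1 \\ -y_1 \end{bmatrix} \\
&= L(\mathbf{x}) + L(\mathbf{y}),
\end{aligned}
\]

so property (a) holds. Next, for any $\mathbf{x} \in \mathbb{R}^1$ and $\alpha \in \mathbb{R}$, we have

\[
\begin{aligned}
L(\alpha\mathbf{x}) &= L([\alpha x_1]) = \begin{bmatrix} 2(\alpha x_1) \\ -(\alpha x_1) \end{bmatrix} \\
&= \begin{bmatrix} \alpha(2x_1) \\ \alpha(-x_1) \end{bmatrix} = \alpha \begin{bmatrix} 2x_1 \\ -x_1 \end{bmatrix} \\
&= \alpha L(\mathbf{x}),
\end{aligned}
\]

so property (b) holds. Therefore, L is linear.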

If a is any number, then the function $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = ax$ has as its graph a straight line (through the origin with slope a). In fact, this function is linear in the sense of the above definition (regarding $\mathbb{R}$ as the same thing as $\mathbb{R}^1$). The next theorem generalizes this statement with the number a being replaced by a matrix A.

Theorem. Let A be an m × n matrix. The function L : Rn → Rm defined by L(x)= Ax is linear.

The function L in the theorem is called the linear function corresponding to the matrix A.

Proof. It should be checked that L makes sense as a function from $\mathbb{R}^n$ to $\mathbb{R}^m$. If $\mathbf{x}$ is an input vector, then it is an element of $\mathbb{R}^n$, and is therefore an n × 1 matrix. Since A is m × n, the product $A\mathbf{x}$ is defined and equals an m × 1 matrix, which is an element of $\mathbb{R}^m$, as desired.

We now check that L satisfies the two properties of a linear function. For any $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, we have

\[ L(\mathbf{x} + \mathbf{y}) = A(\mathbf{x} + \mathbf{y}) = A\mathbf{x} + A\mathbf{y} = L(\mathbf{x}) + L(\mathbf{y}), \]

where the second equality is due to the distributive property of matrix multiplication (property (d) in Section 2.3). This verifies property (a). Next, for any $\mathbf{x} \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$, we have

\[ L(\alpha\mathbf{x}) = A(\alpha\mathbf{x}) = \alpha(A\mathbf{x}) = \alpha L(\mathbf{x}), \]

where the second equality is due to a property of matrix and scalar multiplication (property (i) in Section 2.3). This verifies property (b) and finishes the proof that L is linear.

This gives us another way to check whether a given function is linear:

3.2.3 Example Use the last theorem to show that the function $L : \mathbb{R}^2 \to \mathbb{R}^3$ given by

\[ L(\mathbf{x}) = \begin{bmatrix} x_1 + 4x_2 \\ 3x_1 - x_2 \\ x_2 \end{bmatrix} \]

is linear.

Solution We have

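\[ L(\mathbf{x}) = \begin{bmatrix} x_1 + 4x_2 \\ 3x_1 - x_2 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 & 4 \\ 3 & -1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = A\mathbf{x}, \]

where

\[ A = \begin{bmatrix} 1 & 4 \\ 3 & -1 \\ 0 & 1 \end{bmatrix}. \]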

Therefore, L is linear by the preceding result.
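The identity $L(\mathbf{x}) = A\mathbf{x}$ from this example can be checked on a grid of inputs. This is a rough sketch in plain Python; `mat_vec` is a hypothetical helper written here for illustration, with matrices stored as lists of rows.

```python
# Compare the componentwise formula for L with the product A x.

def mat_vec(A, x):
    """Matrix-vector product; A is a list of rows, x a list of components."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def L(x):
    """L : R^2 -> R^3 from Example 3.2.3."""
    x1, x2 = x
    return [x1 + 4 * x2, 3 * x1 - x2, x2]

A = [[1, 4],
     [3, -1],
     [0, 1]]

# Every integer point with -3 <= x1, x2 <= 3.
grid = [[x1, x2] for x1 in range(-3, 4) for x2 in range(-3, 4)]
agree = all(mat_vec(A, x) == L(x) for x in grid)
print(agree)  # True
```

Agreement on a grid is only evidence; the equality for all inputs follows from the algebra in the solution.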

The zero vector in $\mathbb{R}^n$ is the vector $\mathbf{0} = [0, 0, \ldots, 0]^T$.

Theorem. Let L : Rn → Rm be a function. If L is linear, then

L(0)= 0.

Proof. Assume that L is linear. We have L(0)+ L(0)= L(0 + 0)= L(0), where the first equality is due to property (a) of a linear function. Subtracting L(0) from both sides of this equation gives L(0)= 0, as desired.

Put another way, the theorem says that if L does not send 0 to 0, then it cannot be linear.

3.2.4 Example Is the function F : R1 → R2, given by

\[ F(\mathbf{x}) = \begin{bmatrix} 2x_1 + 1 \\ -x_1 \end{bmatrix}, \]

linear? Explain.

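Solution Note that

\[ F(\mathbf{0}) = \begin{bmatrix} 2(0) + 1 \\ -(0) \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \neq \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \mathbf{0} \]

(the string says that $F(\mathbf{0}) \neq \mathbf{0}$), so F is not linear according to the preceding theorem.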
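The zero-vector test used in this solution is easy to automate. Below is a small sketch in plain Python for the function F of Example 3.2.4; the helper name `sends_zero_to_zero` is an assumption made for illustration. A result of `False` rules out linearity, while `True` is inconclusive.

```python
# Zero-vector test: a linear function must send 0 to 0.

def F(x):
    """F : R^1 -> R^2 from Example 3.2.4."""
    (x1,) = x
    return [2 * x1 + 1, -x1]

def sends_zero_to_zero(f, n):
    """Evaluate f at the zero vector of R^n and test whether the output is zero."""
    output = f([0] * n)
    return all(component == 0 for component in output)

print(F([0]))                    # [1, 0]
print(sends_zero_to_zero(F, 1))  # False, so F is not linear
```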

3.2.5 Example Is the function F : R2 → R2, given by

\[ F(\mathbf{x}) = \begin{bmatrix} x_1 x_2 \\ x_1 \end{bmatrix}, \]

linear? Explain.

Solution If we can show that the function does not send $\mathbf{0}$ to $\mathbf{0}$, then we can quickly conclude that it is not linear (as in the preceding example). However,

\[ F(\mathbf{0}) = \begin{bmatrix} (0)(0) \\ (0) \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \mathbf{0}, \]

so all we know is that F has a chance of being linear. We see if we can verify property (a) of a linear function. Let x, y ∈ R2. We have

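\[
\begin{aligned}
F(\mathbf{x} + \mathbf{y}) &= F\left(\begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \end{bmatrix}\right) = \begin{bmatrix} (x_1 + y_1)(x_2 + y_2) \\ (x_1 + y_1) \end{bmatrix} \\
&= \begin{bmatrix} x_1 x_2 + x_1 y_2 + y_1 x_2 + y_1 y_2 \\ x_1 + y_1 \end{bmatrix}.
\end{aligned}
\]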

We are trying to show that this equals

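\[
\begin{aligned}
F(\mathbf{x}) + F(\mathbf{y}) &= \begin{bmatrix} x_1 x_2 \\ x_1 \end{bmatrix} + \begin{bmatrix} y_1 y_2 \\ y_1 \end{bmatrix} \\
&= \begin{bmatrix} x_1 x_2 + y_1 y_2 \\ x_1 + y_1 \end{bmatrix}.
\end{aligned}
\]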

Since the first components do not match up, we suspect that F is not linear. We cannot write $F(\mathbf{x} + \mathbf{y}) \neq F(\mathbf{x}) + F(\mathbf{y})$, though, since there are choices for $\mathbf{x}$ and $\mathbf{y}$ that actually give equality (for instance, $\mathbf{x} = \mathbf{0}$ and $\mathbf{y} = \mathbf{0}$). However, in order to show that F fails property (a) it is enough to give a single counterexample. Using inspection, we see that if $x_1, x_2, y_1, y_2$ are all equal to 1, for instance, then the first components are not equal, so this should give our counterexample.

Everything we have done up to this point can be considered scratch work. It was done just to come up with an idea for a counterexample. To solve the problem, all we really need to write is this: If $\mathbf{x} = [1, 1]^T$ and $\mathbf{y} = [1, 1]^T$, then

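\[ F(\mathbf{x} + \mathbf{y}) = F\left(\begin{bmatrix} 2 \\ 2 \end{bmatrix}\right) = \begin{bmatrix} 4 \\ 2 \end{bmatrix} \neq \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \end{bmatrix} = F(\mathbf{x}) + F(\mathbf{y}), \]

so F is not linear.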
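The counterexample can be confirmed by direct evaluation. The following plain-Python sketch evaluates both sides of property (a) for the function F of Example 3.2.5; the helper `add` is an illustrative assumption.

```python
# Evaluate both sides of property (a) at x = y = [1, 1].

def F(x):
    """F : R^2 -> R^2 from Example 3.2.5."""
    x1, x2 = x
    return [x1 * x2, x1]

def add(u, v):
    """Componentwise sum of two vectors of equal length."""
    return [a + b for a, b in zip(u, v)]

x = [1, 1]
y = [1, 1]

lhs = F(add(x, y))      # F([2, 2])
rhs = add(F(x), F(y))   # [1, 1] + [1, 1]

print(F([0, 0]))   # [0, 0]  (the zero test is inconclusive here)
print(lhs)         # [4, 2]
print(rhs)         # [2, 2]
print(lhs == rhs)  # False, so property (a) fails
```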

3.3 Image, Preimage, and Kernel

Definition of image.

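Let $L : \mathbb{R}^n \to \mathbb{R}^m$ be a function.

• Let $\mathbf{x}$ be a vector in $\mathbb{R}^n$. The image of $\mathbf{x}$ under L is $L(\mathbf{x})$.

• The image of L (denoted im L) is the set of all images $L(\mathbf{x})$ as $\mathbf{x}$ ranges through $\mathbb{R}^n$. In symbols,

\[ \operatorname{im} L = \{ L(\mathbf{x}) \mid \mathbf{x} \in \mathbb{R}^n \}. \]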

In other words, given an input vector x, its image is the corresponding output vector. And the image of L is the set of all actual output vectors.

3.3.1 Example Let L : R3 → R2 be given by

\[ L(\mathbf{x}) = \begin{bmatrix} x_1 - 3x_2 + 2x_3 \\ -2x_1 + 6x_2 - x_3 \end{bmatrix}. \]

(a) Find the image of [4, 1, −7]T under L. (b) Is [−5, 7]T in im L? Explain.

Solution (a) The image of [4, 1, −7]T under L is

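\[ L\left(\begin{bmatrix} 4 \\ 1 \\ -7 \end{bmatrix}\right) = \begin{bmatrix} (4) - 3(1) + 2(-7) \\ -2(4) + 6(1) - (-7) \end{bmatrix} = \begin{bmatrix} -13 \\ 5 \end{bmatrix}. \]

(b) The question amounts to asking if there is a vector $\mathbf{x}$ in $\mathbb{R}^3$ such that $L(\mathbf{x}) = [-5, 7]^T$, that is,

\[ \begin{bmatrix} x_1 - 3x_2 + 2x_3 \\ -2x_1 + 6x_2 - x_3 \end{bmatrix} = \begin{bmatrix} -5 \\ 7 \end{bmatrix}. \]

This equality of vectors holds if and only if the vectors’ components are the same, so this leads to a system of equations with corresponding augmented matrix

\[ \left[\begin{array}{rrr|r} 1 & -3 & 2 & -5 \\ -2 & 6 & -1 & 7 \end{array}\right], \]

which has row echelon form

\[ \left[\begin{array}{rrr|r} 1 & -3 & 2 & -5 \\ 0 & 0 & 3 & -3 \end{array}\right]. \]

There is no pivot in the augmented column, so a solution $\mathbf{x} = [x_1, x_2, x_3]^T$ exists. Therefore, $[-5, 7]^T$ is in im L.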
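To make the existence claim in part (b) concrete, one can produce an explicit solution by back substitution from the row echelon form and then evaluate L at it. This is a minimal sketch in plain Python; setting the free variable $x_2$ to 0 is an arbitrary choice made here for illustration.

```python
# Back substitution from the row echelon form
#   [ 1  -3  2 | -5 ]
#   [ 0   0  3 | -3 ]
# followed by a direct check that the solution maps to [-5, 7].

def L(x):
    """L : R^3 -> R^2 from Example 3.3.1."""
    x1, x2, x3 = x
    return [x1 - 3 * x2 + 2 * x3, -2 * x1 + 6 * x2 - x3]

x2 = 0                       # free variable (assumed value; any number works)
x3 = -3 // 3                 # second row: 3*x3 = -3
x1 = -5 + 3 * x2 - 2 * x3    # first row: x1 - 3*x2 + 2*x3 = -5

x = [x1, x2, x3]
print(x)              # [-3, 0, -1]
print(L(x))           # [-5, 7]
print(L([4, 1, -7]))  # [-13, 5], the image found in part (a)
```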

Definition of preimage.

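Let $L : \mathbb{R}^n \to \mathbb{R}^m$ be a function. Let $\mathbf{y}$ be a vector in $\mathbb{R}^m$. The preimage of $\mathbf{y}$ under L (denoted $L^{-1}(\mathbf{y})$) is the set of all $\mathbf{x}$ in $\mathbb{R}^n$ that have image under L equal to $\mathbf{y}$. In symbols:

\[ L^{-1}(\mathbf{y}) = \{ \mathbf{x} \in \mathbb{R}^n \mid L(\mathbf{x}) = \mathbf{y} \}. \]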

Definition of kernel.

Let $L : \mathbb{R}^n \to \mathbb{R}^m$ be a function. The kernel of L (denoted ker L) is the preimage of $\mathbf{0}$ under L. In other words, ker L is the set of all vectors in $\mathbb{R}^n$ that have image under L equal to $\mathbf{0}$. In symbols:

\[ \ker L = L^{-1}(\mathbf{0}) = \{ \mathbf{x} \in \mathbb{R}^n \mid L(\mathbf{x}) = \mathbf{0} \}. \]

3.3.2 Example Let L : R3 → R2 be given by

\[ L(\mathbf{x}) = \begin{bmatrix} x_1 - 3x_2 + 2x_3 \\ -2x_1 + 6x_2 - x_3 \end{bmatrix}. \]

(a) Determine whether the vector $[2, 0, -3]^T$ is in the preimage of $[-4, 8]^T$ under L.

(b) Find $L^{-1}([-5, 7]^T)$.

(c) Find ker L.

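Solution (a) Asking whether the vector $[2, 0, -3]^T$ is in the preimage of $[-4, 8]^T$ under L is asking whether $L([2, 0, -3]^T) = [-4, 8]^T$. Since

\[ L\left(\begin{bmatrix} 2 \\ 0 \\ -3 \end{bmatrix}\right) = \begin{bmatrix} 2 - 3(0) + 2(-3) \\ -2(2) + 6(0) - (-3) \end{bmatrix} = \begin{bmatrix} -4 \\ -1 \end{bmatrix} \neq \begin{bmatrix} -4 \\ 8 \end{bmatrix}, \]

$[2, 0, -3]^T$ is not in the preimage of $[-4, 8]^T$.

(b) We seek the set of all vectors $\mathbf{x}$ in $\mathbb{R}^3$ for which $L(\mathbf{x}) = [-5, 7]^T$, that is,

\[ \begin{bmatrix} x_1 - 3x_2 + 2x_3 \\ -2x_1 + 6x_2 - x_3 \end{bmatrix} = \begin{bmatrix} -5 \\ 7 \end{bmatrix}. \]

This equality of vectors holds if and only if the vectors’ components are the same, so this leads to a system of equations with corresponding augmented matrix

\[ \left[\begin{array}{rrr|r} 1 & -3 & 2 & -5 \\ -2 & 6 & -1 & 7 \end{array}\right], \]

which has reduced row echelon form (RREF)

\[ \left[\begin{array}{rrr|r} 1 & -3 & 0 & -3 \\ 0 & 0 & 1 & -1 \end{array}\right]. \]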

The preimage of $[-5, 7]^T$ is the solution set of the corresponding system, which is $\{[3t - 3, t, -1]^T \mid t \in \mathbb{R}\}$.

(c) The kernel of L is the preimage of the zero vector, so the solution is just like the solution to (b) except with $[0, 0]^T$ in place of $[-5, 7]^T$. The augmented column in the augmented matrix now consists of 0’s and, since row operations never change a column of all 0’s, we can immediately write down the reduced row echelon form of the system:

\[ \left[\begin{array}{rrr|r} 1 & -3 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right]. \]

Therefore, $\ker L = \{[3t, t, 0]^T \mid t \in \mathbb{R}\}$.
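The parametric descriptions of the preimage and the kernel can be sanity-checked by substituting several values of the parameter t. The sketch below is plain Python; the function names `preimage_point` and `kernel_point` and the sampled range of t are assumptions chosen for illustration. It confirms only that the listed vectors lie in the respective sets, not that the sets contain nothing else (that part comes from the row reduction).

```python
# Substitute sample parameter values into the solution sets of Example 3.3.2.

def L(x):
    """L : R^3 -> R^2 from Example 3.3.2."""
    x1, x2, x3 = x
    return [x1 - 3 * x2 + 2 * x3, -2 * x1 + 6 * x2 - x3]

def preimage_point(t):
    """A point of the claimed preimage of [-5, 7]: [3t - 3, t, -1]."""
    return [3 * t - 3, t, -1]

def kernel_point(t):
    """A point of the claimed kernel: [3t, t, 0]."""
    return [3 * t, t, 0]

ts = range(-5, 6)
print(all(L(preimage_point(t)) == [-5, 7] for t in ts))  # True
print(all(L(kernel_point(t)) == [0, 0] for t in ts))     # True
print(L([2, 0, -3]))                                     # [-4, -1], as in part (a)
```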

3.4 Linear operator

A special name is given to a linear function L : Rn → Rm in the case m = n, that is, when the domain and the codomain of L are the same:

Linear operator.

A linear operator on Rn is a linear function from Rn to Rn.

Let L be a linear operator on R2 (the plane). Since L is a linear function from the plane to itself, we can think of it as simply moving vectors in the plane: an input vector gets moved to the corresponding output vector. (A similar statement can be made for a linear operator on Rn for any n.)

3.4.1 Example Let $L : \mathbb{R}^2 \to \mathbb{R}^2$ be “projection onto the $x_1$-axis.”

(a) Find the image of $[2, 3]^T$ under L geometrically.

(b) Find the kernel of L geometrically.

(c) Find a general formula for $L(\mathbf{x})$.

(d) Use the general formula found in part (c) to redo parts (a) and (b) analytically.

Solution (a) The image of $[2, 3]^T$ under L is $L([2, 3]^T)$, which is $[2, 0]^T$.

(b) The kernel of L is the set of all vectors $\mathbf{x}$ such that $L(\mathbf{x}) = \mathbf{0}$. This set is the $x_2$-axis, so $\ker L = \{[0, t]^T \mid t \in \mathbb{R}\}$.

(c) A general formula for $L(\mathbf{x})$ is $L(\mathbf{x}) = [x_1, 0]^T$ (keep the first component the same, but change the second component to 0).

(d) Redoing part (a) using the formula, we have $L([2, 3]^T) = [2, 0]^T$. For part (b), we seek the set of all $\mathbf{x}$ for which $L(\mathbf{x}) = \mathbf{0}$, that is, $[x_1, 0]^T = [0, 0]^T$. This last equation forces $x_1 = 0$ but places no restriction on $x_2$, so $\ker L = \{[0, x_2]^T \mid x_2 \in \mathbb{R}\}$ (which is the same as the set obtained in (b) since $x_2$ acts as a dummy variable, meaning that renaming it has no effect).

3.4.2 Example Let $L : \mathbb{R}^2 \to \mathbb{R}^2$ be “90° counterclockwise rotation.”

(a) Find the image of $[3, 1]^T$ under L geometrically.

(b) Find the preimage of $[2, -3]^T$ under L geometrically.

(c) Find a general formula for $L(\mathbf{x})$.

(d) Use the general formula found in part (c) to redo parts (a) and (b) analytically.

Solution (a) The image of $[3, 1]^T$ under L is $L([3, 1]^T)$, which is $[-1, 3]^T$.

(b) The preimage of $[2, -3]^T$ under L is the set of all those vectors that L moves to $[2, -3]^T$. There is only one such vector, namely $[-3, -2]^T$, so $L^{-1}([2, -3]^T) = \{[-3, -2]^T\}$.

(c) The general formula for L is $L(\mathbf{x}) = [-x_2, x_1]^T$ (switch components and then negate the first). (Part (a) shows that this formula works for $\mathbf{x}$ in the first quadrant and one can check that it works in the other three quadrants as well.)

(d) Redoing part (a) using the formula, we have $L([3, 1]^T) = [-1, 3]^T$. For part (b), we seek the set of all $\mathbf{x}$ for which $L(\mathbf{x}) = [2, -3]^T$, that is, $[-x_2, x_1]^T = [2, -3]^T$. This equation forces $x_1 = -3$ and $x_2 = -2$, so $L^{-1}([2, -3]^T) = \{[-3, -2]^T\}$.
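The general formulas found in the last two examples translate directly into code. The following plain-Python sketch (with illustrative function names `project_x1` and `rotate90`) reproduces the values computed in parts (a), (b) and (d) of Examples 3.4.1 and 3.4.2.

```python
# The two operators on R^2 from Examples 3.4.1 and 3.4.2.

def project_x1(x):
    """Projection onto the x1-axis: keep x1, replace x2 by 0."""
    x1, x2 = x
    return [x1, 0]

def rotate90(x):
    """90-degree counterclockwise rotation: switch components, negate the first."""
    x1, x2 = x
    return [-x2, x1]

print(project_x1([2, 3]))   # [2, 0]
print(project_x1([0, 7]))   # [0, 0]  (a vector on the x2-axis lies in the kernel)
print(rotate90([3, 1]))     # [-1, 3]
print(rotate90([-3, -2]))   # [2, -3] (so [-3, -2] is in the preimage of [2, -3])
```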

The functions given in the last two examples are linear (as can be checked by using the general formulas). The following functions from R2 to itself are all linear:

• projection onto line through origin,

• rotation about origin,

• reflection across line through origin,

• dilation (= multiplication by scalar > 1),

• contraction (= multiplication by scalar between 0 and 1).

However, translation by a vector t (which sends x to x + t) is not linear if t is nonzero (since, for instance, it does not send 0 to 0).

3.5 Matrix of a linear function

We have seen that if A is an m × n matrix, then we get a linear function L : Rn → Rm by defining L(x)= Ax. Here we turn things around and show that if we start with a linear function L : Rn → Rm, then we can use it to build a matrix A so that the above equation holds. The construction requires the following notation:

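• In $\mathbb{R}^2$, $\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $\mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$;

• In $\mathbb{R}^3$, $\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, $\mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, $\mathbf{e}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$;

and so forth. These are the standard unit vectors.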

Matrix of a linear function.

Let L : Rn → Rm be a linear function. There is a unique m × n matrix A such that L(x)= Ax for all x ∈ Rn. Moreover,

\[ A = \begin{bmatrix} L(\mathbf{e}_1) & L(\mathbf{e}_2) & \cdots & L(\mathbf{e}_n) \end{bmatrix}. \]

The matrix A is called the matrix of L.

This is a special case of a theorem that will be presented later, so we postpone the proof till then.

3.5.1 Example Let L : R2 → R3 be the linear function given by

x1 +4x2 L(x)= 3x1 − x2 . x2  

(a) Find the matrix of L. (b) Use part (a) to find L([5, −2]T ). (c) Find L([5, −2]T ) directly from the formula for L and verify that it agrees with the answer to part (b).

Solution (a) We have

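\[ L(\mathbf{e}_1) = L([1, 0]^T) = \begin{bmatrix} (1) + 4(0) \\ 3(1) - (0) \\ (0) \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 0 \end{bmatrix}, \]

and similarly, $L(\mathbf{e}_2) = [4, -1, 1]^T$, so the matrix A of L is

\[ A = \begin{bmatrix} L(\mathbf{e}_1) & L(\mathbf{e}_2) \end{bmatrix} = \begin{bmatrix} 1 & 4 \\ 3 & -1 \\ 0 & 1 \end{bmatrix}. \]

(b) Using the formula $L(\mathbf{x}) = A\mathbf{x}$, we have

\[ L([5, -2]^T) = \begin{bmatrix} 1 & 4 \\ 3 & -1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 5 \\ -2 \end{bmatrix} = \begin{bmatrix} -3 \\ 17 \\ -2 \end{bmatrix}. \]

(c) The formula for L gives

\[ L([5, -2]^T) = \begin{bmatrix} (5) + 4(-2) \\ 3(5) - (-2) \\ (-2) \end{bmatrix} = \begin{bmatrix} -3 \\ 17 \\ -2 \end{bmatrix}, \]

in agreement with part (b).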
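The construction of the matrix of L from the images of the standard unit vectors can be sketched in plain Python as follows. The helpers `unit_vector`, `matrix_of`, and `mat_vec` are hypothetical names introduced for illustration, matrices are stored as lists of rows, and the function L is assumed to be linear (the code does not check this).

```python
# Build the matrix of a linear function column by column from L(e_1), ..., L(e_n).

def L(x):
    """L : R^2 -> R^3 from Example 3.5.1."""
    x1, x2 = x
    return [x1 + 4 * x2, 3 * x1 - x2, x2]

def unit_vector(n, j):
    """Standard unit vector in R^n with a 1 in position j (0-based)."""
    return [1 if i == j else 0 for i in range(n)]

def matrix_of(f, n):
    """Matrix whose j-th column is f(e_j), returned as a list of rows."""
    columns = [f(unit_vector(n, j)) for j in range(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def mat_vec(A, x):
    """Matrix-vector product; A is a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

A = matrix_of(L, 2)
print(A)                    # [[1, 4], [3, -1], [0, 1]]
print(mat_vec(A, [5, -2]))  # [-3, 17, -2]
print(L([5, -2]))           # [-3, 17, -2]
```

Storing the images as columns and then transposing into rows mirrors the formula for A in the boxed statement above.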

3.5.2 Example Let $L : \mathbb{R}^2 \to \mathbb{R}^2$ be “reflection across the $x_2$-axis.”

(a) Find the matrix of L. (b) Use part (a) to find L([1, 3]T ). (c) Find L([1, 3]T ) geometrically and verify that it agrees with the answer to part (b).

Solution (a) The matrix A of L is

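\[ A = \begin{bmatrix} L(\mathbf{e}_1) & L(\mathbf{e}_2) \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}. \]

(b) Using the formula $L(\mathbf{x}) = A\mathbf{x}$, we have

\[ L([1, 3]^T) = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} -1 \\ 3 \end{bmatrix}. \]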

(c) Since reflection across the x2-axis negates the first component of a vector and keeps the second component the same, we have L([1, 3]T ) = [−1, 3]T , in agreement with part (b).

3.6 Composition

The reader is likely familiar with the concept of a composition of real-valued functions. For instance, if $f(x) = 2x + 3$ and $g(x) = x^2$, then the composition of f and g is given by

\[ (g \circ f)(x) = g(f(x)) = (f(x))^2 = (2x + 3)^2. \]

The composition can be described as “doing f first and then g”. In more detail, the composition takes an input x, uses f to produce the output f(x), and then uses g with input f(x) to produce the final output g(f(x)).

We can compose linear functions as well: If L : Rn → Rm and M : Rm → Rl are linear functions, then the composition of L and M is the function M ◦L : Rn → Rl given by (M ◦ L)(x)= M(L(x)). In order for the composition to make sense, the domain of M must be the same as the codomain of L (both equal to Rm above), for otherwise the output produced by L could not be used as an input for M.

3.6.1 Example Let L : R2 → R3 be the linear function given by

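\[ L(\mathbf{x}) = \begin{bmatrix} x_1 + 2x_2 \\ -x_1 \\ 3x_1 - 4x_2 \end{bmatrix} \]

and let $M : \mathbb{R}^3 \to \mathbb{R}^2$ be the linear function given by

\[ M(\mathbf{x}) = \begin{bmatrix} 5x_1 + x_2 - 7x_3 \\ -x_1 + x_3 \end{bmatrix}. \]

Find a formula for the composition $M \circ L : \mathbb{R}^2 \to \mathbb{R}^2$.

Solution We have

\[
\begin{aligned}
(M \circ L)(\mathbf{x}) &= M(L(\mathbf{x})) = M\left(\begin{bmatrix} x_1 + 2x_2 \\ -x_1 \\ 3x_1 - 4x_2 \end{bmatrix}\right) \\
&= \begin{bmatrix} 5(x_1 + 2x_2) + (-x_1) - 7(3x_1 - 4x_2) \\ -(x_1 + 2x_2) + (3x_1 - 4x_2) \end{bmatrix} \\
&= \begin{bmatrix} -17x_1 + 38x_2 \\ 2x_1 - 6x_2 \end{bmatrix}.
\end{aligned}
\]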

Theorem. Let $L : \mathbb{R}^n \to \mathbb{R}^m$ and $M : \mathbb{R}^m \to \mathbb{R}^l$ be linear functions, let A be the matrix of L, and let B be the matrix of M. Then

(a) $M \circ L$ is linear,

(b) the matrix of $M \circ L$ is BA.

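Proof. (a) For any $\mathbf{x}, \mathbf{y} \in \mathbb{R}^n$, we have

\[
\begin{aligned}
(M \circ L)(\mathbf{x} + \mathbf{y}) &= M(L(\mathbf{x} + \mathbf{y})) \\
&= M(L(\mathbf{x}) + L(\mathbf{y})) && \text{($L$ is linear)} \\
&= M(L(\mathbf{x})) + M(L(\mathbf{y})) && \text{($M$ is linear)} \\
&= (M \circ L)(\mathbf{x}) + (M \circ L)(\mathbf{y}),
\end{aligned}
\]

so $M \circ L$ satisfies the first property of a linear function. Verification of the second property is left to the exercises (see Exercise 3–10).

(b) For any $\mathbf{x} \in \mathbb{R}^n$, we have

\[ (M \circ L)(\mathbf{x}) = M(L(\mathbf{x})) = M(A\mathbf{x}) = B(A\mathbf{x}) = (BA)\mathbf{x}, \]

so the matrix of $M \circ L$ is BA (by the uniqueness assertion in Section 3.5).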

3.6.2 Example Let L : R2 → R3 and M : R3 → R2 be as in Example 3.6.1.

(a) Find the matrix A of L and the matrix B of M and use these matrices to find the matrix C of the composition M ◦ L. (b) Use the formula for M ◦ L found in Example 3.6.1 to find the matrix of M ◦ L directly and compare with the answer to part (a).

Solution (a) We have

1 2 A = L(e1) L(e2) = −1 0    3 −4   and 5 1 −7 B = M(e1) M(e2) M(e3) = . −10 1    Therefore, according to the theorem, the matrix C of the composition is

1 2 5 1 −7 −17 38 C = BA = −1 0 = . −10 1   2 −6   3 −4     (b) Using the formula for M ◦ L found in Example 3.6.1, we have

−17 38 C = (M ◦ L)(e1) (M ◦ L)(e2) = ,  2 −6   in agreement with part (a).
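The relationship between composition and matrix multiplication in this example can be checked with a few lines of plain Python. The helpers `mat_mul` and `mat_vec` are illustrative assumptions (matrices are lists of rows), and the sample inputs are arbitrary.

```python
# Check that the matrix of M o L equals the product BA for Examples 3.6.1 and 3.6.2.

def L(x):
    """L : R^2 -> R^3 from Example 3.6.1."""
    x1, x2 = x
    return [x1 + 2 * x2, -x1, 3 * x1 - 4 * x2]

def M(x):
    """M : R^3 -> R^2 from Example 3.6.1."""
    x1, x2, x3 = x
    return [5 * x1 + x2 - 7 * x3, -x1 + x3]

def mat_mul(B, A):
    """Matrix product BA; both matrices are lists of rows."""
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]

def mat_vec(C, x):
    """Matrix-vector product; C is a list of rows."""
    return [sum(c * xj for c, xj in zip(row, x)) for row in C]

A = [[1, 2], [-1, 0], [3, -4]]   # matrix of L
B = [[5, 1, -7], [-1, 0, 1]]     # matrix of M

C = mat_mul(B, A)
print(C)  # [[-17, 38], [2, -6]]

# Doing L first and then M agrees with multiplying by C on sample inputs.
samples = [[1, 0], [0, 1], [2, -3], [-4, 5]]
print(all(M(L(x)) == mat_vec(C, x) for x in samples))  # True
```

Note the order of the factors: the matrix of the function applied first (A) sits on the right in the product BA.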

3.6.3 Example Let $L : \mathbb{R}^2 \to \mathbb{R}^2$ be “90° counterclockwise rotation” and let $M : \mathbb{R}^2 \to \mathbb{R}^2$ be “projection onto the $x_1$-axis.”

(a) Using geometry, find the matrix A of L and the matrix B of M and then use these matrices to find the matrix C of the composition $M \circ L$.

(b) Using geometry, find the matrix of $M \circ L$ directly and compare with the answer to part (a).

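Solution (a) Using geometry to see where L and M send the vectors $\mathbf{e}_1$ and $\mathbf{e}_2$, we get

\[ A = \begin{bmatrix} L(\mathbf{e}_1) & L(\mathbf{e}_2) \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \]

and

\[ B = \begin{bmatrix} M(\mathbf{e}_1) & M(\mathbf{e}_2) \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}. \]

Therefore, according to the theorem, the matrix C of the composition is

\[ C = BA = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 0 & 0 \end{bmatrix}. \]

(b) The vector $\mathbf{e}_1 = [1, 0]^T$ is sent by L to $[0, 1]^T$, which is sent by M to $[0, 0]^T$, so $(M \circ L)(\mathbf{e}_1) = [0, 0]^T$. Similarly, $(M \circ L)(\mathbf{e}_2) = [-1, 0]^T$. Therefore,

\[ C = \begin{bmatrix} (M \circ L)(\mathbf{e}_1) & (M \circ L)(\mathbf{e}_2) \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 0 & 0 \end{bmatrix}, \]

in agreement with part (a).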

3 – Exercises

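3–1 Show directly from the definition that the function $L : \mathbb{R}^3 \to \mathbb{R}^2$ given by

\[ L(\mathbf{x}) = \begin{bmatrix} 2x_1 - x_2 + 4x_3 \\ x_1 - 6x_3 \end{bmatrix} \]

is linear.

3–2 Show directly from the definition that the function $L : \mathbb{R}^2 \to \mathbb{R}^1$ given by

\[ L(\mathbf{x}) = \begin{bmatrix} 5x_1 - 8x_2 \end{bmatrix} \]

is linear.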

3–3 Let L : R3 → R2 be the linear function corresponding to the matrix

\[ A = \begin{bmatrix} -1 & 7 & 2 \\ 1 & -6 & 1 \end{bmatrix} \]

(see Section 3.2 for what this means).

(a) Find $L([4, 1, -3]^T)$.

(b) Determine whether $[6, 2, -2]^T$ is in $L^{-1}([4, -8]^T)$.

(c) Find $L^{-1}([4, -8]^T)$ and use it to verify your answer to part (b).

3–4 In each case, determine whether the function F is linear:

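(a) $F : \mathbb{R}^2 \to \mathbb{R}^1$ given by $F(\mathbf{x}) = \begin{bmatrix} x_1^2 - x_2^2 \end{bmatrix}$,

(b) $F : \mathbb{R}^2 \to \mathbb{R}^2$ given by $F(\mathbf{x}) = \begin{bmatrix} x_1 \\ \cos x_2 \end{bmatrix}$.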

3–5 Let L : R2 → R3 be given by

x1 +3x2 L(x)= −2x1 − 6x2 3x1 +9x2  

(a) Find the image of [−2, 1]T under L. (b) Is [−2, 4, 6]T in im L? Explain.

3–6 Find the kernel of the linear function $L : \mathbb{R}^3 \to \mathbb{R}^3$ given by $L(\mathbf{x}) = [0, 2x_2 - x_3, -6x_2 + 3x_3]^T$.

3–7 Let $L : \mathbb{R}^2 \to \mathbb{R}^2$ be “reflection across the 45° line $x_2 = x_1$.” (In calculus, this line is written y = x.)

(a) Find the image of $[2, 1]^T$ under L geometrically.

(b) Find the preimage of $[1, -3]^T$ under L geometrically.

(c) Find a general formula for $L(\mathbf{x})$.

(d) Use the general formula found in part (c) to redo parts (a) and (b) analytically.

3–8 Let L : R3 → R2 be the linear function given by

\[ L(\mathbf{x}) = \begin{bmatrix} 2x_1 - x_2 + 5x_3 \\ 7x_2 + 4x_3 \end{bmatrix}. \]

(a) Find the matrix of L. (b) Use part (a) to find L([3, 2, −1]T ). (c) Find L([3, 2, −1]T ) directly from the formula for L and verify that it agrees with the answer to part (b).

3–9 Let L : R2 → R2 be “90◦ clockwise rotation.”

(a) Find the matrix of L. (b) Use part (a) to find L([2, 1]T ). (c) Find L([2, 1]T ) geometrically and verify that it agrees with the answer to part (b).

3–10 Let L : Rn → Rm and M : Rm → Rl be linear functions. Verify that

\[ (M \circ L)(\alpha\mathbf{x}) = \alpha (M \circ L)(\mathbf{x}) \]

for all $\mathbf{x} \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$. (This is the second part of the verification that $M \circ L$ is linear. See the theorem of Section 3.6 and its proof.)

3–11 Let L : R3 → R2 be the linear function given by

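\[ L(\mathbf{x}) = \begin{bmatrix} -x_1 + 4x_2 + 2x_3 \\ 3x_1 - x_3 \end{bmatrix} \]

and let $M : \mathbb{R}^2 \to \mathbb{R}^3$ be the linear function given by

\[ M(\mathbf{x}) = \begin{bmatrix} 7x_1 + x_2 \\ x_1 + 6x_2 \\ 5x_2 \end{bmatrix}. \]

Find a formula for the composition $M \circ L : \mathbb{R}^3 \to \mathbb{R}^3$.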

3–12 Let $L : \mathbb{R}^2 \to \mathbb{R}^2$ be “reflection across the line $x_2 = -x_1$” and let $M : \mathbb{R}^2 \to \mathbb{R}^2$ be “projection onto the $x_2$-axis.”

(a) Using geometry, find the matrix A of L and the matrix B of M and then use these matrices to find the matrix C of the composition M ◦ L. (b) Using geometry, find the matrix of M ◦ L directly and compare with the answer to part (a).