
A taste of theory

SM242 Fall 2019

1 Vectors

A vector is an array of numbers such as

v = [ 7  −2  1  7 ].

In this example, we would write v ∈ Z^4, because the entries of v are all integers, and the dimension of v (i.e., the number of entries) is 4. More generally, if S is a set and n is a non-negative integer, then S^n is the set of all length-n vectors with entries from S. We will write the ith entry of vector v as v_i, always starting from index i = 1 and going up to the vector dimension n. So in the above example, v_4 = 7. You have probably seen vectors in calculus class, physics, or elsewhere. Vectors can be used to represent many things: a physical force with direction and magnitude, a single point in n-dimensional space, or the list of partial derivatives of a multivariable function. In this class, vectors may mean any of these things, or none of these things. It will be simplest to think of a vector just as an array of numbers from some underlying set.

2 Vector operations

We can do arithmetic with vectors, although it sometimes works differently than arithmetic with individual numbers. Consider, for example, a store which sells Snickers bars, M&Ms, and Twix for $1.00, $.75, and $1.50 each, respectively. We can represent this information in a vector a ∈ Q^3:

a = [ 1  .75  1.5 ]

2.1 Dot product

Now let's say I want to buy 3 Snickers and 2 Twix. I can represent my purchase amounts as another vector b ∈ Z^3:

b = [ 3  0  2 ]

To compute how much it will cost in total to complete my purchase, I multiply the corresponding entries from a and b and add them up. This is called a dot product or inner product:

a · b = 1 × 3 + .75 × 0 + 1.5 × 2 = 6

Because a · b = 6, the transaction in this example costs $6.00.

In general, the dot product of two vectors u, v ∈ S^n is the single scalar value u · v = Σ_{i=1}^n u_i v_i.
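The dot product definition translates directly into a short loop. Here is a minimal sketch in plain Python (no libraries; the function name `dot` is just our own choice), using the prices and purchase amounts from this example:

```python
def dot(u, v):
    """Dot product of two equal-length vectors: the sum of u[i] * v[i]."""
    assert len(u) == len(v), "dot product requires matching dimensions"
    return sum(ui * vi for ui, vi in zip(u, v))

a = [1.00, 0.75, 1.50]   # prices: Snickers, M&Ms, Twix
b = [3, 0, 2]            # purchase amounts
print(dot(a, b))         # → 6.0, the total cost of the purchase
```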

2.2 Vector addition

Now let's say my friend Cheryl asks me to buy her a pack of M&M's and a Twix. Cheryl's preferences are represented by another vector c ∈ Z^3:

c = [ 0  1  1 ]

How much will both of our purchases cost? To answer this we will perform the addition of two vectors, which can be computed by adding up the corresponding entries into a new vector:

        [ 3 ]   [ 0 ]   [ 3 ]
b + c = [ 0 ] + [ 1 ] = [ 1 ]
        [ 2 ]   [ 1 ]   [ 3 ]

In general, the addition of two vectors u, v ∈ S^n is another vector w ∈ S^n defined by w_i = u_i + v_i for each 1 ≤ i ≤ n. Notice that I wrote these vectors standing up rather than laying down. They are the same vectors either way, but when standing up we call them column vectors and when laying down we call them row vectors. (Some other authors prefer to write b^T to indicate a row vector, which will make sense to you when we learn what the transpose of a matrix is shortly.)
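Entrywise addition is equally short in plain Python (again a sketch with our own helper name, not a library call):

```python
def vec_add(u, v):
    """Entrywise sum of two equal-length vectors."""
    assert len(u) == len(v), "vector addition requires matching dimensions"
    return [ui + vi for ui, vi in zip(u, v)]

b = [3, 0, 2]  # my purchase amounts
c = [0, 1, 1]  # Cheryl's purchase amounts
print(vec_add(b, c))  # → [3, 1, 3]
```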

2.3 Vector times scalar

Finally, imagine that I repeat my order 52 times, every week of the year. How many of each candy bar do I buy in total? This is called a scalar product of a vector times a number:

               [ 3 ]   [ 156 ]
    52b = 52 · [ 0 ] = [   0 ]
               [ 2 ]   [ 104 ]

In general, for a scalar x ∈ S and a vector v ∈ S^n, the scalar product xv is another vector w ∈ S^n defined by w_i = x v_i for each 1 ≤ i ≤ n.
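And the scalar product, sketched the same way:

```python
def scalar_mul(x, v):
    """Multiply every entry of vector v by the scalar x."""
    return [x * vi for vi in v]

b = [3, 0, 2]
print(scalar_mul(52, b))  # → [156, 0, 104], a year's worth of candy
```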

How do these three operations of dot product, addition, and scalar multiplication interact? Pretty much how you would expect. Let u, v, w ∈ S^n be any n-dimensional vectors, and let x, y ∈ S be any scalar values. Then vector arithmetic follows these rules:

• Dot products are commutative: u · v = v · u

• Vector addition is commutative and associative:

(u + v) + w = u + (v + w) = (w + v) + u

• Scalar multiplication is distributive:

(x + y)(u + v) = x(u + v) + y(u + v) = (x + y)u + (x + y)v = xu + yu + xv + yv

• Dot products are distributive too:

u · (v + w) = u · v + u · w

• Finally, scalar and dot products are associative with each other:

x(u · v) = (xu) · v

The important thing when dealing with vector arithmetic is to make sure the types and dimensions always match. The definitions don't make sense for the dot product or addition of two vectors with different dimensions. Similarly, we can't compute the scalar product of two vectors, or a dot product of a number times a vector. When you are doing math with vectors, make sure you always know which things are vectors and which are scalars, and what the dimensions of any vectors are.

3 Matrices

A matrix is a rectangular array

        [ a_1,1  a_1,2  ···  a_1,n ]
    A = [ a_2,1  a_2,2  ···  a_2,n ]
        [  ...    ...   ...   ...  ]
        [ a_m,1  a_m,2  ···  a_m,n ]

If each entry a_i,j of A is contained in the same set S, then we say A ∈ S^{m×n}. That is, the set S^{m×n} is the set of matrices with entries from S organized into m rows and n columns. Importantly, notice that a matrix is just an array of vectors! If you understand what we just learned about vectors, then most facts about matrices will start to make sense.

For any row index 1 ≤ i ≤ m, the ith row vector of matrix A is

    [ a_i,1  ···  a_i,n ].

And for any column index 1 ≤ j ≤ n, the jth column vector of A is

    [ a_1,j ]
    [  ...  ]
    [ a_m,j ]

3.1 Matrix addition and scalar multiplication

From what we know about vector arithmetic, two kinds of matrix arithmetic should be no surprise:

• Matrix addition: To add two matrices A, B ∈ S^{m×n}, create a new matrix with the same dimensions, formed by adding each entry of A to the corresponding entry in B. For example,

    [ 5  6  5 ]   [ 4  1  8 ]   [ 9  7  13 ]
    [ 2  0  6 ] + [ 3  2  5 ] = [ 5  2  11 ]

• Scalar multiplication: To multiply a matrix A times a scalar value x ∈ S, just multiply each entry of A times x. For example,

       [ 3  2  5 ]   [ −6   −4  −10 ]
    −2 [ 0  8  8 ] = [  0  −16  −16 ]
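Both operations are entrywise, so they are easy to sketch in plain Python, with lists of lists standing in for matrices (the helper names here are our own, not library functions):

```python
def mat_add(A, B):
    """Entrywise sum of two matrices with the same dimensions."""
    assert len(A) == len(B) and len(A[0]) == len(B[0]), "dimensions must match"
    return [[a + b for a, b in zip(rowA, rowB)] for rowA, rowB in zip(A, B)]

def mat_scale(x, A):
    """Multiply every entry of matrix A by the scalar x."""
    return [[x * a for a in row] for row in A]

print(mat_add([[5, 6, 5], [2, 0, 6]], [[4, 1, 8], [3, 2, 5]]))
# → [[9, 7, 13], [5, 2, 11]]
print(mat_scale(-2, [[3, 2, 5], [0, 8, 8]]))
# → [[-6, -4, -10], [0, -16, -16]]
```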

4 Matrix products

Returning to our example from the first section, we said that one store sells Snickers, M&M's and Twix for $1.00, $.75, and $1.50 each, respectively. Imagine a second store opens up and sells all three candy bars for $1.15. This gives us a 2 × 3 matrix A ∈ Q^{2×3}:

    A = [ 1     .75   1.5  ]
        [ 1.15  1.15  1.15 ]

4.1 Matrix-vector multiplication

If I now want to calculate how much my purchase of 3 Snickers and 2 Twix will be at each of these two stores, I need to calculate two dot products, each store's vector of prices (a row in the matrix) times my vector of purchase amounts. This is a matrix-vector product:

3  1 .75 1.5   3 × 1 + 0 × .75 + 2 × 1.5   6  0 = = 1.15 1.15 1.15   3 × 1.15 + 0 × 1.15 + 2 × 1.15 5.75 2

What this tells us is that my purchase will cost $6.00 at the first store, but only $5.75 at the second store. Notice that the column dimension of the matrix must match up with the dimension of the vector — in this case, 3 kinds of candy. And the row dimension of the matrix will be the same as the dimension of the resulting product vector — in this case, 2 different stores. More generally, the matrix-vector product of A ∈ S^{m×n} and u ∈ S^n is a vector v ∈ S^m, written Au = v, where the ith entry of v, for every 1 ≤ i ≤ m, is defined by v_i = Σ_{j=1}^n a_i,j u_j.
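In code, a matrix-vector product is one dot product per row. A plain-Python sketch, run on the two-store example (the tiny totals may carry floating-point rounding, so the comment is approximate):

```python
def mat_vec(A, u):
    """Matrix-vector product: entry i is the dot product of row i of A with u."""
    assert all(len(row) == len(u) for row in A), "column dimension must match vector"
    return [sum(a, 0) if False else sum(ai * ui for ai, ui in zip(row, u)) for row in A] if False else \
           [sum(ai * ui for ai, ui in zip(row, u)) for row in A]

A = [[1.00, 0.75, 1.50],   # first store's prices
     [1.15, 1.15, 1.15]]   # second store's prices
print(mat_vec(A, [3, 0, 2]))  # totals at each store: about [6.0, 5.75]
```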

4.2 Matrix-matrix multiplication

Let's extend this further. Remember again from the original example that my friend Cheryl wants to purchase 1 pack of M&M's and 1 Twix, corresponding to the vector

    [ 0 ]
    [ 1 ]
    [ 1 ]

Now we ask the question: how much will each of our purchases cost, at each of the stores in town? For this we will need four dot products, each store's price vector times each of these two purchase vectors. This is a matrix-matrix product:

3 0  1 .75 1.5   6 2.25 0 2 = 1.15 1.15 1.15   5.75 2.30 1 1 Notice that the output entry in row i and column j is computed as the dot product of the ith row of the first matrix, times the jth column in the second matrix. For example, the price $2.25 that Cheryl will spend at the first store equals the first row of

    [ 1     .75   1.5  ]
    [ 1.15  1.15  1.15 ]

times the second column of

    [ 3  0 ]
    [ 0  1 ]
    [ 2  1 ]

computed as

                      [ 0 ]
    [ 1  .75  1.5 ] · [ 1 ] = 2.25
                      [ 1 ]

To compute a matrix product, you have to do a different dot product for every entry in the output matrix. But all that work gives you a lot of information! In this case, we learn the total price of each purchase at each of the two different stores. If price matters, I should shop at the second store to get my best price of $5.75, but Cheryl should shop at the first store to get her sugar fix for just $2.25.

More generally, the product of two matrices A ∈ S^{m×ℓ} and B ∈ S^{ℓ×n} is another matrix C ∈ S^{m×n} whose entries are defined by

    c_i,j = Σ_{k=1}^ℓ a_i,k b_k,j,

for every 1 ≤ i ≤ m and 1 ≤ j ≤ n.
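The triple-index formula translates directly into nested loops or comprehensions. A plain-Python sketch (our own helper, not a library routine), run on the two-store example above:

```python
def mat_mul(A, B):
    """Product of an m-by-l matrix A and an l-by-n matrix B.
    Entry (i, j) is the dot product of row i of A with column j of B."""
    m, l, n = len(A), len(B), len(B[0])
    assert all(len(row) == l for row in A), "inner dimensions must match"
    return [[sum(A[i][k] * B[k][j] for k in range(l)) for j in range(n)]
            for i in range(m)]

A = [[1.00, 0.75, 1.50],
     [1.15, 1.15, 1.15]]
B = [[3, 0], [0, 1], [2, 1]]   # my purchases and Cheryl's, as columns
print(mat_mul(A, B))           # about [[6, 2.25], [5.75, 2.3]]
```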

4.3 Dimensions for matrix-matrix products

Notice that matrix-vector multiplication can be viewed as a special case of matrix-matrix multiplication, if you think of an n-dimensional vector as an n × 1 matrix. Also notice that, just like with dot products and matrix-vector products, the dimensions must match for matrix multiplication to make sense. In our example, we have to be talking about the same kinds of candy everywhere, so the number of prices for each store must match the number of candy amounts for each person (which in this case is 3). Specifically, to multiply two matrices A times B, the number of columns in A must equal the number of rows in B. Or to put it more simply, the "inner dimensions" must match up, while the "outer dimensions" determine the dimensions of the resulting matrix. For example, if we multiply a 2 × 4 matrix times a 4 × 5 matrix, this is valid because the inner dimensions match, 4 = 4, and the resulting matrix will be 2 × 5.

5 Special matrices and multiplication properties

Think about what makes 0 special as a number: when I add it to any other number, it doesn’t change what that number is. In other words, x + 0 = x for all x ∈ R. Similarly, 1 is special for the same reason with multiplication: x · 1 = x for all x ∈ R.

5.1 Zero matrix

What matrices play roles similar to zero and one for matrix addition and multiplication? For addition, the answer is pretty easy: the zero matrix is a matrix where all the entries are zero. For any dimensions m, n, we write 0_m,n for the matrix with m rows and n columns full of zeros. Then for any m × n matrix A, we can see that A + 0_m,n = A.

5.2 Identity matrix

For multiplication, finding a matrix that plays a role similar to the number 1 is less obvious. If you try playing around with the idea, you will discover that the only matrix which you can multiply times any other matrix and always get back that other matrix as a result is the identity matrix. Written I_n, the identity matrix is always a square n × n matrix that looks like this:

6 1 0 ··· 0 0 1 ··· 0   In = . . .. . . . . . 0 0 ··· 1

Another way to define the identity matrix is to say that the i, j entry of I_n is defined as

    (I_n)_i,j = { 1,  i = j
                { 0,  i ≠ j

Try out a few examples to convince yourself that, for any m × n matrix A, the matrix-matrix product AI_n equals A.
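A quick computational check of this identity fact, in plain Python (the `mat_mul` helper here is just an illustration written for this sketch, not a standard library function):

```python
def identity(n):
    """The n-by-n identity matrix: (I_n)[i][j] = 1 if i == j, else 0."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    """Standard matrix product (rows of A dotted with columns of B)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[6, 9, 8], [1, 8, 7]]           # a 2x3 example matrix
assert mat_mul(A, identity(3)) == A  # A * I_n = A
assert mat_mul(identity(2), A) == A  # I_m * A = A, too
print("identity checks pass")
```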

5.3 Matrix transpose

Another important kind of matrix is the transpose matrix, which is formed by taking any matrix and changing all the rows to columns and columns to rows. For any m × n matrix A, we write A^T for the transpose of A. The entry in row i and column j of the original matrix A becomes the entry in row j and column i of the transpose A^T. For example,

    [ 6  9  8 ]^T   [ 6  1 ]
    [ 1  8  7 ]   = [ 9  8 ]
                    [ 8  7 ]

What happens when you multiply the transposes of two matrices, compared to multiplying them first and then taking the transpose of that product? If you try it, you'll discover that you also have to reverse the order of the product to make it work. Or to say all this in one line:

    (AB)^T = B^T A^T
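The transpose, and the rule about reversing the order of a product, can both be checked with a short sketch (plain Python; `transpose` and `mat_mul` are our own illustrative helpers):

```python
def transpose(A):
    """Swap rows and columns: entry (i, j) of A becomes entry (j, i)."""
    return [[A[i][j] for i in range(len(A))] for j in range(len(A[0]))]

def mat_mul(A, B):
    """Standard matrix product (rows of A dotted with columns of B)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[6, 9, 8], [1, 8, 7]]
print(transpose(A))  # → [[6, 1], [9, 8], [8, 7]]

# The rule (AB)^T = B^T A^T, checked on one small example:
B = [[1, 2], [3, 4], [5, 6]]
assert transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A))
```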

5.4 Matrix rules

In summary, we have learned the following properties of matrices:

1. Matrix addition is associative and commutative:

(A + B) + C = A + (B + C) = (C + B) + A

2. Matrix multiplication is associative:

(AB)C = A(BC)

3. Matrix multiplication is not commutative: There exist A, B such that AB ≠ BA.

4. Matrix multiplication and addition are distributive:

A(B + C) = AB + AC and (A + B)C = AC + BC.

5. For any m × n matrix A and any other dimension ℓ,

    A + 0_m,n = A    and    A 0_n,ℓ = 0_m,ℓ    and    0_ℓ,m A = 0_ℓ,n.

6. For any m × n matrix A, I_m A = A I_n = A.

7. (AB)^T = B^T A^T.

Actually each of these "rules" is a theorem that you could prove based on the definitions and the properties of real numbers that you already know!

6 Linear systems and matrices

After eating so much candy with my friend Cheryl, I decided to go on a diet. Every day for breakfast, I want to eat some combination of three foods, with the following nutritional properties:

• Every 100 grams of plain Greek yogurt has 53 calories, 3g of carbohydrate, and 10g of protein.

• Every 100 grams of granola provides 536 calories, 54g of carbohydrate, and 11g of protein.

• Every 100 grams of blueberries has 57 calories, 15g of carbs, and 1g of protein.

According to the internet, an optimal breakfast will have in total 500 calories, 63g of carbs, and 38g of protein. So the question is, how much of each food should I eat to get the perfect breakfast?

6.1 Linear system of equations

We can set this up as a system of equations with 3 equations and 3 unknown variables: x, y, z will represent (respectively) how many grams of yogurt, granola, and blueberries I should mix together. Putting this all together, I want to find values for x, y, z to solve these three equations:

    .53x + 5.36y + .57z = 500
    .03x +  .54y + .15z = 63
     .1x +  .11y + .01z = 38

These equations are called linear because the variables are not raised to any powers, and the variables are multiplied just by constant coefficients, not times each other. It is a system of equations because we want to find x, y, z values which simultaneously make all three of the equations true.

6.2 Linear system in matrix form

As you might have noticed already, we can rewrite this system of 3 equations into just one equation by using a matrix-vector product, as follows:

.53 5.36 .57 x 500 .03 .54 .15 y =  63  .1 .11 .01 z 38 In other words, we know a matrix of values A and a right-hand vector b, and we need to find a solution vector x such that Ax = b. This problem is called solving a linear system and it has played a big role not only in the development of interesting , but also in the history of computing. In the case of my ideal breakfast, the solution is (roughly) to have x = 315 grams of yogurt, y = 39 grams of granola, and z = 216 grams of blueberries. Yum! But how do we come up with these values?

7 Generic linear system solving

Let’s look at how to solve linear systems like we saw in the previous section. To keep the arithmetic simpler, we’ll change the numbers in our example system:

    2x − 6y + 10z = 32
    −x +  y −   z = −4
    2x − 2y +  8z = 38

7.1 System solving via elimination and substitution

Here's how you might solve this system by hand:

1. Solve the first equation for x, by dividing everything by 2:

    x = 3y − 5z + 16

2. Plug this value into the second and third equations, which eliminates x from those equations:

    −(3y − 5z + 16) + y − z = −4
    2(3y − 5z + 16) − 2y + 8z = 38

and simplify:

    −2y + 4z = 12
     4y − 2z = 6

3. Now solve the second equation for y:

y = 2z − 6

4. Plug this into the first and third equations to remove y from both of those. In the first equation we get

x = 3(2z − 6) − 5z + 16 = z − 2

And for the third equation, we get

    4(2z − 6) − 2z = 6
    6z = 30

5. Now we divide the third equation by 6 to solve for z:

z = 5

6. Finally, plug in this value for z into the first and second equations to eliminate z from both of those:

    x = z − 2 = 3
    y = 2z − 6 = 4

7. At this point, we know that x = 3, y = 4, and z = 5. Let’s try these in the original equations to make sure it all worked out.

    2x − 6y + 10z = 6 − 24 + 50 = 32
    −x +  y −   z = −3 + 4 − 5 = −4
    2x − 2y +  8z = 6 − 8 + 40 = 38

We got it!

7.2 Gaussian elimination (without swapping rows)

So, how do we do this consistently, for any linear system like this one? This process is called Gaussian elimination and we can follow the same steps with the matrix formulation. Every time we "solve", in the matrix formulation it means multiplying a row times a scalar to get a 1 somewhere. And every time we "plug in", it means adding a multiple of one row to another row, to cancel some matrix entry and get a 0. Here are exactly the same steps in terms of matrix operations. The only real difference is that we will not move things around from one side of the equation to the other, but all of the arithmetic is identical.

0. Write down the matrix linear system problem

    [  2  −6  10 ] [ x ]   [ 32 ]
    [ −1   1  −1 ] [ y ] = [ −4 ]
    [  2  −2   8 ] [ z ]   [ 38 ]

1. Solve the first equation for x — we divide the first row by 2 to get a 1 in the first row and column. Important: to maintain the equality, we have to also divide the first row of the right-hand side by 2.

    [  1  −3   5 ] [ x ]   [ 16 ]
    [ −1   1  −1 ] [ y ] = [ −4 ]
    [  2  −2   8 ] [ z ]   [ 38 ]

2. Eliminate x from the second and third equations. This means adding the first row to the second row, and subtracting 2 times the first row from the third row. Again, we have to do the same thing on the right-hand side to maintain the equality.

    [ 1  −3   5 ] [ x ]   [ 16 ]
    [ 0  −2   4 ] [ y ] = [ 12 ]
    [ 0   4  −2 ] [ z ]   [  6 ]

3. Solve the second equation for y, meaning we divide the second row by −2 to make the middle entry of the matrix a 1.

    [ 1  −3   5 ] [ x ]   [ 16 ]
    [ 0   1  −2 ] [ y ] = [ −6 ]
    [ 0   4  −2 ] [ z ]   [  6 ]

4. Eliminate y from the first and third equations, by adding 3 times the second row to the first row, and subtracting 4 times the second row from the third row. In terms of the matrix, this means making two more zeros in the second column.

    [ 1  0  −1 ] [ x ]   [ −2 ]
    [ 0  1  −2 ] [ y ] = [ −6 ]
    [ 0  0   6 ] [ z ]   [ 30 ]

5. Solve the last equation for z by dividing the third row (of both sides!) by 6. We will now have all 1's along the main diagonal of the matrix:

    [ 1  0  −1 ] [ x ]   [ −2 ]
    [ 0  1  −2 ] [ y ] = [ −6 ]
    [ 0  0   1 ] [ z ]   [  5 ]

6. Finally, eliminate z from the first two equations, which means zeroing out the entries above the 1 in the last column of the matrix. In this case, this means adding the third row to the first row, and adding two times the third row to the second row (on both sides!).

    [ 1  0  0 ] [ x ]   [ 3 ]
    [ 0  1  0 ] [ y ] = [ 4 ]
    [ 0  0  1 ] [ z ]   [ 5 ]

7. The system is now solved! Notice that all our work was really transforming the left-hand matrix into the identity matrix I_3, which from the rules of identity matrix multiplication means that we now have simply

    [ x ]   [ 3 ]
    [ y ] = [ 4 ]
    [ z ]   [ 5 ]

We can plug this vector into the original matrix-vector product to check that

    [  2  −6  10 ] [ 3 ]   [ 32 ]
    [ −1   1  −1 ] [ 4 ] = [ −4 ]
    [  2  −2   8 ] [ 5 ]   [ 38 ]
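The whole procedure of this section can be sketched as a short program. This version assumes, as the section title says, that no row swaps are ever needed (every pivot is nonzero when we reach it), and it uses Python's exact `Fraction` arithmetic so no rounding creeps in:

```python
from fractions import Fraction

def gauss_solve(A, b):
    """Solve Ax = b by Gaussian elimination without row swaps,
    assuming every pivot is nonzero when we reach it."""
    n = len(A)
    # Build the augmented rows [A | b] in exact rational arithmetic.
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = M[col][col]
        assert pivot != 0, "zero pivot: this simple version needs row swaps"
        M[col] = [x / pivot for x in M[col]]          # "solve" for this variable
        for row in range(n):                          # "plug in" everywhere else
            if row != col and M[row][col] != 0:
                f = M[row][col]
                M[row] = [x - f * y for x, y in zip(M[row], M[col])]
    return [M[i][n] for i in range(n)]                # the right-hand column

A = [[2, -6, 10], [-1, 1, -1], [2, -2, 8]]
b = [32, -4, 38]
print(gauss_solve(A, b))  # → [Fraction(3, 1), Fraction(4, 1), Fraction(5, 1)]
```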

8 Matrix inverse

We already talked about matrix addition and multiplication, and the zero and identity matrices which act kind of like the numbers 0 and 1. What about division — can this operation ever make sense for matrices? The answer is sometimes. When we divide two numbers, say 30 divided by 5, we are really multiplying by the inverse of the second number: 30 × (1/5). And that inverse is whatever you would multiply in order to get 1. In terms of matrices, remember that the role of the number 1 is played by the identity matrix I_n. So in order to find the inverse of matrix A, we are looking for some other matrix B such that AB = I_n. Look back at the left-hand side matrix from the previous problem

    A = [  2  −6  10 ]
        [ −1   1  −1 ]
        [  2  −2   8 ]

We want to find a matrix B such that AB = I_3:

    [  2  −6  10 ] [ b_1,1  b_1,2  b_1,3 ]   [ 1  0  0 ]
    [ −1   1  −1 ] [ b_2,1  b_2,2  b_2,3 ] = [ 0  1  0 ]
    [  2  −2   8 ] [ b_3,1  b_3,2  b_3,3 ]   [ 0  0  1 ]

Let's start by considering the first column of B and the first column of I_3 in this equation:

    [  2  −6  10 ] [ b_1,1 ]   [ 1 ]
    [ −1   1  −1 ] [ b_2,1 ] = [ 0 ]
    [  2  −2   8 ] [ b_3,1 ]   [ 0 ]

This is solving a linear system again, just like before! And in fact, we can re-run exactly the same steps we used before to solve the linear system, even though the right-hand side is different:

1. Divide first row by 2.

2. Add first row to second row. Subtract 2 times first row from third row.

3. Divide second row by −2.

4. Add 3 times second row to first row. Subtract 4 times second row from third row.

5. Divide third row by 6.

6. Add the third row to the first row. Add two times the third row to the second row.

After following these same steps, we find the first column of the inverse:

    [ b_1,1 ]   [ −1/4 ]
    [ b_2,1 ] = [ −1/4 ]
    [ b_3,1 ]   [   0  ]

We can repeat exactly the same steps for the second and third columns of B. Actually we could do all these steps at once, just following the same 6 steps above starting with the complete identity matrix I_3, to find the inverse

    B = A^(−1) = [ −1/4  −7/6  1/6 ]
                 [ −1/4   1/6  1/3 ]
                 [   0    1/3  1/6 ]

Check for yourself that AB = I_3. Next try computing BA — what did you think it would be?

There are many interesting properties of the inverse matrix, but one use of this is to solve the same linear system of equations many times with different right-hand sides. That is because once we know the inverse of A, we can solve any linear system Av = c by multiplying both sides by A^(−1):

    v = A^(−1)c

That is, once we know the inverse of a matrix, solving a linear system just means multiplying the inverse times that right-hand side vector. The downside is, the inverse of a matrix doesn't always exist. The next section discusses how we can still solve linear systems in those cases.
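A quick NumPy check of the inverse computed by hand above (`np.linalg.inv` is NumPy's built-in inverse routine; this sketch assumes NumPy is installed):

```python
import numpy as np

A = np.array([[2., -6., 10.],
              [-1., 1., -1.],
              [2., -2., 8.]])
B = np.linalg.inv(A)                 # the inverse A^(-1), when it exists

# B matches the matrix computed by hand above:
expected = np.array([[-1/4, -7/6, 1/6],
                     [-1/4,  1/6, 1/3],
                     [   0,  1/3, 1/6]])
assert np.allclose(B, expected)
assert np.allclose(A @ B, np.eye(3))   # AB = I_3
assert np.allclose(B @ A, np.eye(3))   # and BA = I_3 as well

# Solving Av = c for any right-hand side is now just one multiplication:
c = np.array([32., -4., 38.])
print(B @ c)   # approximately [3, 4, 5]
```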

9 Augmented matrices and RREF

9.1 Matrix equation as an augmented matrix

The method of Section 7 was a way to manipulate a matrix-vector equation Av = b by gradually turning A into the identity matrix. A convenient way to deal with all of this in just one matrix is to form the augmented matrix consisting of the left-hand matrix A and the right-hand solution b pasted next to A. So in the previous example, instead of transforming the equation

    [  2  −6  10 ] [ x ]   [ 32 ]
    [ −1   1  −1 ] [ y ] = [ −4 ]
    [  2  −2   8 ] [ z ]   [ 38 ]

into the equation

    [ 1  0  0 ] [ x ]   [ 3 ]
    [ 0  1  0 ] [ y ] = [ 4 ]
    [ 0  0  1 ] [ z ]   [ 5 ]

instead we just transform the augmented matrix

 2 −6 10 32  −1 1 −1 −4 2 −2 8 38 to the (solved) augmented matrix

1 0 0 3 0 1 0 4 0 0 1 5

9.2 Reduced row echelon form

Still, the method doesn't always work quite as easily as in the previous example. Sometimes it's not possible to get the left-hand matrix to exactly become the identity matrix I, such as when the original system we are solving is not a square system (with the same number of equations as unknowns).

To solve any linear system, we use Gaussian elimination to transform the augmented matrix into reduced row echelon form (RREF). In any matrix, the first nonzero entry in a row is called a pivot, and if a row has no pivot (because the row is all zeros), it's called a zero row. A matrix is in Reduced Row Echelon Form (RREF) if and only if:

1. Every pivot equals 1.

2. Every other value in the same column as a pivot equals zero. (So there is at most one pivot per column.)

3. The pivot in any row is to the left of the pivot in any lower row.

4. All zero rows (if any) are at the bottom of the matrix.

You can confirm that any identity matrix I_n is in RREF, but so are many other matrices, for example

    [ 0  1  3  0  2  4  0 ]
    [ 0  0  0  1  5  8  0 ]
    [ 0  0  0  0  0  0  1 ]
    [ 0  0  0  0  0  0  0 ]

Notice that this example has 3 pivots, all equal to 1, in columns 2, 4, and 7.

9.3 Elementary row operations and RREF procedure

To transform any matrix to RREF, we only need to use the following three elementary row operations:

1. Multiply or divide a row by a nonzero constant

2. Add or subtract a multiple of one row to another row

3. Swap two rows

We already used the first two operations in the previous example. The third operation (swapping two rows) is needed to find a nonzero pivot in order to get the last two properties of RREF.

In general, to convert any matrix into RREF, you repeat the following procedure for each column (moving left to right):

1. Put a pivot in place. Find the first nonzero entry in this column below any existing pivot rows from earlier steps. If this column has no such nonzero entry, skip it and move to the next column. Otherwise, you may have to swap two rows so that pivot is in the row just below the previous pivot.

2. Divide the row by the pivot, so that the pivot becomes equal to 1. This is similar to the "solve for x/y/z" steps in Section 7.

3. Eliminate all other nonzeros in the column by subtracting a multiple of the pivot row from the other rows.

You can check that, after repeating this process for each column, the matrix will satisfy all 4 requirements for RREF. Let’s look at some examples!
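The full procedure, row swaps included, can be sketched as a short program. This is one possible way to organize the loop, not the only one; it uses Python's exact `Fraction` arithmetic and works on any matrix, augmented or not:

```python
from fractions import Fraction

def rref(M):
    """Reduce matrix M to reduced row echelon form using the three
    elementary row operations. Returns a new matrix of Fractions."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols):
        # 1. Find a nonzero entry at or below pivot_row; swap it up if needed.
        nonzero = [r for r in range(pivot_row, rows) if M[r][col] != 0]
        if not nonzero:
            continue                  # no pivot in this column; skip it
        M[pivot_row], M[nonzero[0]] = M[nonzero[0]], M[pivot_row]
        # 2. Scale the pivot row so the pivot equals 1.
        pivot = M[pivot_row][col]
        M[pivot_row] = [x / pivot for x in M[pivot_row]]
        # 3. Eliminate every other nonzero entry in this column.
        for r in range(rows):
            if r != pivot_row and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return M

# The underdetermined example from the next subsection, as an augmented matrix:
aug = [[0, 3, 12, 6, 15],
       [2, 6, 20, 14, 46],
       [2, 4, 12, 14, 48]]
print(rref(aug))
```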

9.4 Example 1: Underdetermined system

Find a solution to the following linear system:

         3x + 12y +  6z = 15
    2w + 6x + 20y + 14z = 46
    2w + 4x + 12y + 14z = 48

This system has 3 equations and 4 unknowns, so we get a 3 × 5 augmented matrix:

0 3 12 6 15 2 6 20 14 46 2 4 12 14 48

• First column

The top-left entry is zero, which means we need to swap the first and second rows to get a nonzero pivot in place:

2 6 20 14 46 0 3 12 6 15 2 4 12 14 48

Next, divide the first row by 2 so that pivot becomes 1:

1 3 10 7 23 0 3 12 6 15 2 4 12 14 48

And then subtract 2 times the first row from the third row to eliminate the rest of the column:

    [ 1   3  10  7 | 23 ]
    [ 0   3  12  6 | 15 ]
    [ 0  −2  −8  0 |  2 ]

• Second column

The next spot where we want a pivot is row 2, column 2. Fortunately, this is already nonzero (it equals 3). So we divide that second row by 3 to get the pivot to equal 1.

1 3 10 7 23 0 1 4 2 5  0 −2 −8 0 2

Now eliminate the other entries in the column, by first subtracting 3 times the second row from the first row, then adding 2 times the second row to the third row:

    [ 1  0  −2  1 |  8 ]
    [ 0  1   4  2 |  5 ]
    [ 0  0   0  4 | 12 ]

• Third column

At this point, we want a pivot in row 3, column 3. But this entry is zero, and there's nothing below it to swap with. So we have no pivot in this column; we skip it and move on.

• Fourth column

There is a pivot in row 3, column 4, and the pivot equals 4. Start by dividing the third row by 4:

    [ 1  0  −2  1 | 8 ]
    [ 0  1   4  2 | 5 ]
    [ 0  0   0  1 | 3 ]

Now eliminate the two nonzero entries above that pivot.

1 0 −2 0 5  0 1 4 0 −1 0 0 0 1 3

• Since we reached the last row, the matrix is now in RREF.

In this case, the RREF has pivots in columns 1, 2, and 4. The corresponding variables w, x, z are called leading variables. The variable y from non-pivot column 3 is called a free variable. To see what this means, let's write out the system equations corresponding to our augmented RREF matrix.

    w − 2y = 5
    x + 4y = −1
         z = 3

The reason y is called a free variable is that we are "free" to choose any value of y in these equations. For example, we could choose y = 10 in these equations to give the solution w = 25, x = −41, y = 10, and z = 3. Go ahead and plug those values into the original system of equations to make sure it worked. We could also choose any other value of y, like (the simpler option!) y = 0, which gives another solution with w = 5, x = −1, y = 0, and z = 3. In this case, the linear system is called under-determined because it has infinitely many solutions. Remember, this corresponds to having any column in the left part of the RREF matrix which contains no pivot.
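A small sketch to confirm that every choice of the free variable really does solve the original system. From the RREF equations, the whole solution family is w = 5 + 2y, x = −1 − 4y, z = 3, with y free:

```python
def satisfies(w, x, y, z):
    """Check a candidate (w, x, y, z) against the original three equations."""
    return (3*x + 12*y + 6*z == 15
            and 2*w + 6*x + 20*y + 14*z == 46
            and 2*w + 4*x + 12*y + 14*z == 48)

# From the RREF: w = 5 + 2y, x = -1 - 4y, z = 3, for any choice of y.
for y in [-10, 0, 7, 100]:
    assert satisfies(w=5 + 2*y, x=-1 - 4*y, y=y, z=3)
print("every choice of the free variable y gave a valid solution")
```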

9.5 Example 2: Overdetermined system

Here is a linear system with 3 equations and only 2 unknowns:

    5x + 20y = 5
    3x + 12y = 3
    2x +  5y = −4

Now we have a 3 × 3 augmented system:

5 20 5 3 12 3 2 5 −4 Let’s work on converting this to RREF.

• First column

We already have the nonzero pivot 5 where we need one, so we start by dividing that first row by 5:

1 4 1 3 12 3 2 5 −4

Next, eliminate the first entry in rows 2 and 3 by subtracting multiples of the first row:

1 4 1 0 0 0 0 −3 −6

• Second column

We want the next pivot in row 2, column 2. But that entry is currently zero. So we must first swap the second and third rows to get a pivot in place.

18 1 4 1 0 −3 −6 0 0 0

Now divide the second row by −3 to make the pivot equal 1.

1 4 1 0 1 2 0 0 0

And subtract 4 times the second row from the first row to eliminate all non-pivot entries in this column.

1 0 −7 0 1 2 0 0 0

• The matrix is now in RREF.

Converting the augmented matrix back to a system of equations, we have

    x = −7
    y = 2
    0 = 0

Obviously the last equation is redundant, and we have a unique solution with x = −7 and y = 2. (Check that this solution really works with the original equations!) In this case, because there were more equations than unknowns, the system is said to be overdetermined. If you work back to the original equations, the second equation actually isn’t necessary to solve the system — that corresponds to when we found the second row to be zero in the RREF procedure. What it means is that any x and y which would solve the first equation

5x + 20y = 5 would also solve the second equation

3x + 12y = 3.

In this case, we say these equations are linearly dependent; in other words, the second one doesn’t give us any new information about x and y.

9.6 Example 3: Inconsistent system

Let's try that previous example again, except changing the right-hand side of the second equation:

    5x + 20y = 5
    3x + 12y = 10
    2x +  5y = −4

So here is the new 3 × 3 augmented system:

5 20 5 3 12 10 2 5 −4 Converting this to RREF works the same as before in the first column: we divide the first row by 5 then subtract multiples from the other two rows, to get

1 4 1 0 0 7 0 −3 −6 This matrix is not in RREF, but we can actually stop here and conclude that there are no solutions to the original system. Do you see why? Try converting this augmented system back to equations:

    x + 4y = 1
         0 = 7
       −3y = −6

And now you see the problem: in the second row of the matrix we have an impossible situation that 0 = 7. So this linear system is said to be inconsistent, meaning that there are no values of x and y which would make all three equations true. In terms of the augmented matrix, an inconsistent system will have a row of all zeros except for the last column (which is the right-hand side of the equation) — that is, the pivot is in the last column.

9.7 Types of solutions to linear systems

In summary, every linear system of equations either has:

• Zero solutions, if any row in the RREF has a pivot in the last column;

• Exactly one solution, when every column in the RREF (except the last column) contains a pivot; or

• Infinitely many solutions, when some column in the RREF other than the last column does not contain any pivot.
