
2 Rank and Algebra

2.1 Rank

In our introduction to systems of linear equations we mentioned that a system can have no solutions, a unique solution, or infinitely many solutions. In this section we're going to introduce an invariant of matrices. (An invariant of a mathematical object is a property that doesn't change when we apply certain operations to the object; in this case, the rank of a matrix is invariant under elementary row operations.) When this invariant is computed for the matrix of coefficients of a system of linear equations, it can give us information about the number of solutions to this system.

Rank. The rank of a matrix A is the number of pivots in rref(A), the reduced row-echelon form of A. We denote the rank of A by rank(A). (We'll soon give a different, much better definition of the rank of a matrix; this is a probationary definition from which we'll soon move on.)

Example 8. Consider the matrix

 4 3 2 −1   5 4 3 −1  A =   .  −2 −2 −1 2  11 6 4 1

This matrix has reduced row-echelon form
\[
\operatorname{rref}(A) = \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & -3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix},
\]
so rank(A) = 3, since rref(A) has 3 pivots.
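If you'd like to verify this computation by machine, here is a minimal sketch using SymPy (our tooling choice; the notes themselves don't assume any software):

```python
from sympy import Matrix

# The matrix from Example 8.
A = Matrix([
    [ 4,  3,  2, -1],
    [ 5,  4,  3, -1],
    [-2, -2, -1,  2],
    [11,  6,  4,  1],
])

R, pivot_cols = A.rref()  # reduced row-echelon form and its pivot columns
print(R)                  # matches rref(A) above
print(len(pivot_cols))    # 3 pivots, so rank(A) = 3
print(A.rank())           # SymPy's built-in rank agrees: 3
```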

Note. Here are some of the connections between the rank of a matrix and the number of solutions to a system of linear equations. Suppose we have a system of n linear equations in m variables, and that the n × m matrix A is the coefficient matrix of this system. Then

1. We have rank(A) ≤ n and rank(A) ≤ m, because there cannot be more pivots than there are rows, nor more pivots than there are columns.

2. If the system of equations is inconsistent, then rank(A) < n. This is because in row-reducing an inconsistent system we eventually produce a row of zeros augmented by a nonzero constant. This row of zeros can't have a pivot, so the number of pivots is at most n − 1.

3. If the system has exactly one solution, then rank(A) = m. If rank(A) < m, then the system would have a free variable, meaning that if there is a solution, then there are infinitely many solutions.

4. If the system has infinitely many solutions, then rank(A) < m, because a system with infinitely many solutions must have a free variable.

We can combine these observations with some remarks made in the previous section to conclude that if m = n (i.e., we have the same number of equations as variables), then the system has a unique solution if and only if rref(A) = In, the n × n identity matrix.
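As a quick illustration of that last observation, here is a sketch using SymPy and a hypothetical 2 × 2 system of our own choosing:

```python
from sympy import Matrix, eye, linsolve, symbols

# A hypothetical square system: x + 2y = 5, 3x + 4y = 6.
A = Matrix([[1, 2], [3, 4]])
b = Matrix([5, 6])

R, _ = A.rref()
print(R == eye(2))             # True: rref(A) = I_2 ...

x, y = symbols("x y")
print(linsolve((A, b), x, y))  # ... so the solution {(-4, 9/2)} is unique
```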

2.2 Matrix Algebra

So far we've only seen matrices used to represent systems of linear equations. They have another (perhaps more important) use as representatives of linear transformations. For example, matrices can be used to model the evolution of the electorate between election cycles. Suppose that each election cycle, 90% of voters who were members of the Republican party remain in the Republican party, while the other 10% join the Democratic party. At the same time, suppose 5% of Democrats become Republicans, while the other 95% remain in the Democratic party. If each party has 500,000 voters in one election cycle, the following matrix equation computes the number of voters each party will have in the next cycle:
\[
\underbrace{\begin{bmatrix} 0.95 & 0.10 \\ 0.05 & 0.90 \end{bmatrix}}_{A} \begin{bmatrix} 500{,}000 \\ 500{,}000 \end{bmatrix} = \begin{bmatrix} 525{,}000 \\ 475{,}000 \end{bmatrix}. \tag{6}
\]
Once we learn how to interpret (6) we'll see that in the next election there should be 525,000 Democratic voters and 475,000 Republican voters. In this equation the matrix A represents the transformation from one election cycle to the next, and this is the idea we'd like to focus on soon: matrices as transformations.
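For readers who want to experiment, here is a minimal NumPy sketch of equation (6); the `@` operator computes exactly the matrix-vector product we define later in this section:

```python
import numpy as np

# The transition matrix A from (6); the first component tracks Democrats,
# the second Republicans.
A = np.array([[0.95, 0.10],
              [0.05, 0.90]])
x = np.array([500_000, 500_000])  # voters in the current cycle

print(A @ x)  # [525000. 475000.], as in (6)
```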

Before we can really consider matrices as representing transformations, we'll need to first make sense of expressions such as (6). To this end, we'll make clear some vocabulary surrounding matrices and then discuss the arithmetic of matrices.

Size of a matrix. The size of a matrix is given by the number of rows and columns it has. A matrix with m rows and n columns is said to be "m-by-n", written m × n. We occasionally call a matrix with only one row a row matrix and call a matrix with just one column a column matrix; we will call either of these types of matrices vectors. Matrices which have the same number of rows and columns are called square matrices.

Example 9. Below we have a 2 × 3 matrix, a row matrix, a column matrix, and a square matrix, respectively:
\[
\begin{bmatrix} 2 & 7 & -3 \\ 1 & 0 & 9 \end{bmatrix}, \qquad
\begin{bmatrix} 0 & 9 & 1 \end{bmatrix}, \qquad
\begin{bmatrix} 6 \\ 2 \\ 8 \end{bmatrix}, \qquad
\begin{bmatrix} 0 & -1 & 1 \\ -2 & 2 & -4 \\ 3 & 4 & -8 \end{bmatrix}.
\]

We use lowercase letters with double subscripts to identify the entries of a matrix. For example, if the square matrix above is A, then a23 = −4.

Now we define addition of matrices. This operation is easy, but it is important to remember that addition of matrices is only defined when the matrices involved have the same size. That is, we can't add a 2 × 3 matrix to a 5 × 4 matrix, but we can add two 2 × 3 matrices together, and we do so entry-wise. For instance,
\[
\begin{bmatrix} 2 & 7 & -3 \\ 1 & 0 & 9 \end{bmatrix} + \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} = \begin{bmatrix} 3 & 9 & 0 \\ 5 & 5 & 15 \end{bmatrix}.
\]
We can similarly define subtraction entry-wise, subject to the same restriction on size:
\[
\begin{bmatrix} 2 & 7 & -3 \\ 1 & 0 & 9 \end{bmatrix} - \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} = \begin{bmatrix} 1 & 5 & -6 \\ -3 & -5 & 3 \end{bmatrix}.
\]
Because addition of numbers is commutative and we defined matrix addition entry-wise, matrix addition is commutative. That is, if A and B are matrices of the same size, then A + B = B + A. We point this out because it will not be true for matrix multiplication.
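Entry-wise addition and subtraction are exactly what array libraries implement, so the computations above are easy to reproduce; a small NumPy check:

```python
import numpy as np

A = np.array([[2, 7, -3], [1, 0, 9]])
B = np.array([[1, 2, 3], [4, 5, 6]])

print(A + B)                         # the entry-wise sum computed above
print(A - B)                         # the entry-wise difference
print(np.array_equal(A + B, B + A))  # True: matrix addition commutes
```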

After addition and subtraction, it makes sense to define multiplication. We're going to defer most of this discussion a little longer, but we will at least define the product Ax, where A is a matrix and x is a vector. As with addition and subtraction, multiplication has a size condition:

Size condition for the product Ax. If A is an n × m matrix and x is a vector with k components, then the product Ax only exists if m = k. That is, the number of columns of A must equal the number of components of x. If this condition is met, then Ax will be a vector with n components. To prepare ourselves for the definition of Ax, we first define a product between vectors with the same number of components, called the dot product.

Dot product. Suppose we have vectors

\[
\mathbf{v} = \langle v_1, v_2, \ldots, v_n \rangle \quad \text{and} \quad \mathbf{w} = \langle w_1, w_2, \ldots, w_n \rangle.
\]

Then the dot product of v and w is a scalar given by

\[
\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + \cdots + v_n w_n.
\]

Notice that w · v = v · w.

Example 10. We have
\[
\begin{bmatrix} 1 & 1 & -1 & 1 \end{bmatrix} \cdot \begin{bmatrix} 2 \\ 3 \\ 6 \\ 1 \end{bmatrix} = 1 \cdot 2 + 1 \cdot 3 + (-1) \cdot 6 + 1 \cdot 1 = 0.
\]

Though it's not necessary, we often write the first vector in a dot product as a row vector and the second as a column vector.
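A quick NumPy check of Example 10 (np.dot implements exactly the sum of products defined above):

```python
import numpy as np

v = np.array([1, 1, -1, 1])
w = np.array([2, 3, 6, 1])

print(np.dot(v, w))  # 0, as in Example 10
print(np.dot(w, v))  # also 0: the dot product is symmetric
```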

We now define the product Ax in terms of dot products.

The product Ax. Let A be an n × m matrix and let x be a vector with m components. If the rows of A are the vectors w1, . . . , wn, each with m components, then
\[
A\mathbf{x} = \begin{bmatrix} \mathbf{w}_1 \cdot \mathbf{x} \\ \vdots \\ \mathbf{w}_n \cdot \mathbf{x} \end{bmatrix}.
\]
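This definition translates directly into code. Below is a sketch of a hypothetical helper matvec that forms Ax one dot product at a time; in practice NumPy's built-in @ operator does the same job:

```python
import numpy as np

def matvec(A, x):
    """Compute Ax as a stack of dot products, one per row of A."""
    return np.array([np.dot(row, x) for row in A])

A = np.array([[1, 0, 8], [4, -3, 2]])
x = np.array([0, 6, 3])

print(matvec(A, x))                         # [ 24 -12]
print(np.array_equal(matvec(A, x), A @ x))  # True: agrees with NumPy
```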

Example 11. Compute each of the following products, or write that the product does not exist:
\[
\text{(a)}\ \begin{bmatrix} 1 & 0 & 8 \\ 4 & -3 & 2 \end{bmatrix} \begin{bmatrix} 0 \\ 6 \\ 3 \end{bmatrix} \qquad
\text{(b)}\ \begin{bmatrix} 1 & 0 & 8 \\ 4 & -3 & 2 \end{bmatrix} \begin{bmatrix} 24 \\ -12 \end{bmatrix} \qquad
\text{(c)}\ \begin{bmatrix} 0.95 & 0.10 \\ 0.05 & 0.90 \end{bmatrix} \begin{bmatrix} 525{,}000 \\ 475{,}000 \end{bmatrix}.
\]

(Solution)

 0   1 0 8   1 · 0 + 0 · 6 + 8 · 3   24  (a) 6 = = . 4 −3 2   4 · 0 + (−3) · 6 + 2 · 3 −12 3

(b) Since the matrix in the product is 2 × 3 and the vector has two components (rather than three), this product is not defined.

(c) This represents another election cycle in the two-party example at the beginning of this section:
\[
\begin{bmatrix} 0.95 & 0.10 \\ 0.05 & 0.90 \end{bmatrix} \begin{bmatrix} 525{,}000 \\ 475{,}000 \end{bmatrix} = \begin{bmatrix} 0.95 \cdot 525{,}000 + 0.10 \cdot 475{,}000 \\ 0.05 \cdot 525{,}000 + 0.90 \cdot 475{,}000 \end{bmatrix} = \begin{bmatrix} 546{,}250 \\ 453{,}750 \end{bmatrix}.
\]

So after another cycle, there will be 546,250 Democratic voters and 453,750 Republican voters.

One helpful property of matrix multiplication is that it plays nicely with vector and matrix addition, as well as scalar multiplication:

\[
A(\mathbf{x} + \mathbf{y}) = A\mathbf{x} + A\mathbf{y}, \qquad (A + B)\mathbf{x} = A\mathbf{x} + B\mathbf{x}, \qquad A(k\mathbf{x}) = k(A\mathbf{x}),
\]

for all n × m matrices A and B, vectors x and y with m components, and scalars k.
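These identities are easy to spot-check numerically; here is a sketch using randomly chosen integer matrices (the sizes and seed are arbitrary choices of ours):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(3, 4))
B = rng.integers(-5, 5, size=(3, 4))
x = rng.integers(-5, 5, size=4)
y = rng.integers(-5, 5, size=4)
k = 7

print(np.array_equal(A @ (x + y), A @ x + A @ y))  # True
print(np.array_equal((A + B) @ x, A @ x + B @ x))  # True
print(np.array_equal(A @ (k * x), k * (A @ x)))    # True
```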

We'll conclude with an example of arguably the most important matrix product in this course. Consider the product

 1 2 3   x   x + 2y + 3z   2 −1 6   y  =  2x − y + 6z  . 3 0 −4 z 3x − 4z

The entries in the resulting vector look a lot like the left side of a system of linear equations. Indeed, requiring, say,
\[
\begin{bmatrix} 1 & 2 & 3 \\ 2 & -1 & 6 \\ 3 & 0 & -4 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 6 \\ 7 \\ -2 \end{bmatrix},
\]
is the same as insisting that (x, y, z) be a solution to the system of linear equations given by

\[
\begin{aligned}
x + 2y + 3z &= 6 \\
2x - y + 6z &= 7 \\
3x + 0y - 4z &= -2.
\end{aligned}
\]

This is an important observation. It means that we may take an arbitrary system

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\ \,\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\]
and write it succinctly as a matrix product
\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
=
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}.
\]

That is, we want to solve the equation Ax = b, where A, x, and b are what you'd expect them to be. If this were an equation of numbers and A were nonzero, we'd know how to solve for x: divide by A. We'll apply similar ideas to this equation, though division rules for matrices are slightly less straightforward than they are for numbers.
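Anticipating that discussion, here is a sketch of how a numerical library "divides by A": NumPy's np.linalg.solve solves Ax = b directly for the 3 × 3 system above:

```python
import numpy as np

A = np.array([[1,  2,  3],
              [2, -1,  6],
              [3,  0, -4]])
b = np.array([6, 7, -2])

x = np.linalg.solve(A, b)     # the matrix analogue of dividing by A
print(x)                      # approximately [0.769, 1.0, 1.077], i.e. (10/13, 1, 14/13)
print(np.allclose(A @ x, b))  # True: x solves the system
```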
