MA 0540 fall 2013, Row operations on matrices

December 2, 2013

This is all about m by n matrices (of real or complex numbers). If A is such a matrix then A corresponds to a linear map F^n → F^m, which we will denote by TA. The map is defined by matrix multiplication: TA(X) = AX, where the vector X ∈ F^n is thought of as an n by 1 matrix and the m by 1 matrix AX is thought of as a vector in F^m.

At times I will view the matrix as a list of m vectors R1,...,Rm, all in F^n (the rows). At other times I will view it as a list of n vectors C1,...,Cn, all in F^m (the columns).

1 The rank of a matrix

1.1 The column space of a matrix

The subspace of F^m spanned by the columns of A is called the column space of A, and the dimension of this subspace is called the column rank of A.

The column space is the same as the range of TA, because TA takes the vector (x1, . . . , xn) to x1C1 + ... + xnCn.

Thus the column rank of the matrix A is the dimension of the range of the linear map TA.

Of course, the column rank is no bigger than m, and it is equal to m if and only if TA is surjective.
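As a small illustration of the identity TA(x1, . . . , xn) = x1C1 + ... + xnCn, here is a sketch in Python. The function names apply_matrix and combine_columns are made up for this example, and matrices are stored as lists of rows.

```python
def apply_matrix(A, x):
    """Compute AX row by row: entry i is the sum over j of a[i][j] * x[j]."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def combine_columns(A, x):
    """Compute x1*C1 + ... + xn*Cn, where Cj is the j-th column of A."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for j in range(n):
        for i in range(m):
            result[i] += x[j] * A[i][j]
    return result

A = [[1, 2, 0],
     [0, 1, 3]]          # a 2 by 3 matrix, so TA maps F^3 to F^2
x = [2, -1, 1]

# Both descriptions of TA(x) give the same vector in F^2.
assert apply_matrix(A, x) == combine_columns(A, x) == [0, 2]
```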

1.2 The row space of a matrix

The subspace of F^n spanned by the rows of A is called the row space of A, and its dimension is called the row rank of A.

1.3 Equality of row rank and column rank

In fact we know that the dimension of the column space of A is always the same as the dimension of the row space. We proved this before by using the nullspace of the linear map F^m → F^n that corresponds to the transpose matrix. We will see a different proof below.

2 Row operations

How can we determine the rank of a matrix A? How can we find a basis for the nullspace of TA? Row operations give answers to these and other questions.

2.1 The operations

For this discussion, think of an m by n matrix as a list of its rows Ri.

There are three types of things we can do to a matrix that are called row operations.

1. Interchange Ri with Rj for some i and j ≠ i.

2. For some i, replace Ri by cRi for some scalar c ≠ 0. Leave the other rows unchanged.

3. For some i and j ≠ i, replace Ri by Ri + cRj for some scalar c. Leave the rows other than Ri unchanged.

We say that two m by n matrices A and B are row-equivalent if it is possible to change A into B by a sequence of row operations.

It is clear that row operations can be reversed: if some row operation turns A into B then some row operation turns B into A. For example, if you get B by replacing the ith row Ri by Ri' = Ri + cRj then you can recover A from B by replacing the new ith row Ri' by Ri' + (−c)Rj = Ri.

Here is one way to look at this: Any row operation applies a certain linear operator to every column of the matrix: if the columns of the matrix A are C1,...,Cn then the columns of the new matrix are PC1,...,PCn, where P is a certain m by m matrix. (The new matrix is PA.) For any row operation of any of the three types, there is some m by m matrix P such that this is true. The matrix P is always invertible.
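To make the matrices P concrete, here is a sketch in Python that builds one for each of the three types and checks that left multiplication performs the row operation. The helper names (identity, matmul, swap_matrix, scale_matrix, add_matrix) are invented for this illustration, and rows are numbered from 0 rather than from 1.

```python
def identity(m):
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]

def matmul(P, A):
    """Product PA of an m-by-k matrix P and a k-by-n matrix A."""
    return [[sum(P[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(P))]

def swap_matrix(m, i, j):
    """P for type 1: interchange Ri and Rj."""
    P = identity(m)
    P[i], P[j] = P[j], P[i]
    return P

def scale_matrix(m, i, c):
    """P for type 2: replace Ri by c*Ri, where c is not 0."""
    P = identity(m)
    P[i][i] = c
    return P

def add_matrix(m, i, j, c):
    """P for type 3: replace Ri by Ri + c*Rj, where j is not i."""
    P = identity(m)
    P[i][j] = c
    return P

A = [[1, 2], [3, 4], [5, 6]]
B = matmul(add_matrix(3, 0, 2, 2), A)            # R0 -> R0 + 2*R2
assert B == [[11, 14], [3, 4], [5, 6]]
assert matmul(add_matrix(3, 0, 2, -2), B) == A   # reversed by using -c
assert matmul(swap_matrix(3, 0, 1), A) == [[3, 4], [1, 2], [5, 6]]
assert matmul(scale_matrix(3, 1, 2), A) == [[1, 2], [6, 8], [5, 6]]
```

In each case P is what you get by performing the same row operation on the identity matrix, which is one way to see that P is invertible: the reverse operation gives its inverse.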

2.2 The effect of row operations on the row space and column space of A

If A and B are row-equivalent, then they have the same row space. In fact, when a matrix is altered by a row operation then every new row belongs to the span of the old rows, so the new row space is contained in the old row space; and since row operations can be reversed it is also true that the old row space is contained in the new row space. Therefore the new row space and the old row space are the same.

It follows, of course, that the row rank of a matrix does not change when we perform row operations on it.

Row operations do change the column space of a matrix, but nevertheless they do not change its dimension. To see this, suppose that a row operation corresponds to the matrix P as above. The column space of PA has the same dimension as the column space of A because the one subspace of F^m is obtained from the other by applying an invertible operator TP to it.

So, just like the row rank, the column rank of A is unaltered by row operations.

3 Echelon matrices

We call an m by n matrix A an echelon matrix if it has the following form:

For some number r (an integer satisfying 0 ≤ r ≤ m) there are numbers c(1), . . . , c(r) satisfying 1 ≤ c(1) < . . . < c(r) ≤ n such that:

When i > r then ai,j = 0 for all j.

For every i from 1 to r we have ai,c(i) = 1.

For every i from 1 to r and every j < c(i) we have ai,j = 0.

For every i from 1 to r and every k < i we have ak,c(i) = 0.

In other words,

Ri = 0 if i > r; each row after the first r rows is entirely zero.

If i ≤ r then in the row Ri the first nonzero entry is a 1, and it occurs in column number c(i), where c(i) is an increasing function of i.

In the column Cc(i) everything above the 1 in row i is a zero.

Note that inside this matrix there is an r by r submatrix that is an identity matrix; it is in the rows numbered 1, 2, . . . , r and the columns numbered c(1), c(2), . . . , c(r).
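The conditions in this definition can be checked mechanically. Here is a sketch in Python; the function name is_echelon is made up for this illustration, and columns are numbered from 0, so the returned list is c(1) − 1, . . . , c(r) − 1.

```python
def is_echelon(A):
    """Return the list of pivot columns (0-based) if A is an echelon
    matrix in the sense defined above, and None otherwise."""
    pivots = []
    seen_zero_row = False
    for i, row in enumerate(A):
        nonzero = [j for j, a in enumerate(row) if a != 0]
        if not nonzero:
            seen_zero_row = True          # all later rows must be zero too
            continue
        if seen_zero_row:
            return None                   # nonzero row after a zero row
        c = nonzero[0]                    # column of the first nonzero entry
        if row[c] != 1:
            return None                   # the first nonzero entry must be 1
        if pivots and c <= pivots[-1]:
            return None                   # c(i) must be increasing in i
        if any(A[k][c] != 0 for k in range(i)):
            return None                   # everything above the 1 must be 0
        pivots.append(c)
    return pivots

# Here r = 2, with the leading 1s in the first and third columns.
assert is_echelon([[1, 2, 0, 3],
                   [0, 0, 1, -1],
                   [0, 0, 0, 0]]) == [0, 2]
# Not echelon: there is a nonzero entry above the leading 1 of row 2.
assert is_echelon([[1, 2, 1],
                   [0, 0, 1]]) is None
```

Since the zero matrix is an echelon matrix with r = 0, the function returns an empty list for it; test the result with "is not None" rather than by truthiness.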

3.1 Rank of an echelon matrix

We now show directly that for an echelon matrix both the row rank and the column rank are equal to the number called r above, the number of rows that are not entirely made of zeroes.

To see that the row rank is r, it suffices to show that the first r rows are linearly independent. For this, suppose we have a linear relation x1R1 + ... + xrRr = 0, i.e. a linear relation x1a1,j + x2a2,j + ... + xrar,j = 0 valid for every j from 1 to n. For any i from 1 to r we may see that xi = 0 by taking j to be c(i): this yields the equation x1a1,c(i) + x2a2,c(i) + ... + xrar,c(i) = 0, which tells us that xi = 0, since the only one of a1,c(i), . . . , ar,c(i) that is not zero is ai,c(i) = 1.

To see that the column rank of an echelon matrix is also r we can observe that the column space is the r-dimensional subspace of F^m consisting of all vectors (x1, . . . , xm) such that for every i > r the number xi is zero. It is contained in that subspace because each column Cj is in that subspace (i.e. the numbers ai,j are all zero for i > r), and it is all of that subspace because the columns Cc(1),...,Cc(r) are the obvious basis of that subspace.

3.2 Every matrix is row-equivalent to some echelon matrix

Here is a procedure (“row reduction of a matrix”) for doing row operations on a matrix to put it in echelon form.

Look at the first column.

If it is entirely zero, then ignore it and go to work on the remaining m by n − 1 matrix. If some row operations turn that submatrix into an echelon matrix, say E, then these same operations will also turn A into an echelon matrix (E with one extra column of zeroes on the left).

If the first column is not entirely zero, then arrange for the upper left entry a1,1 to be not 0, by interchanging the first row with some other row if necessary.

Now that a1,1 ≠ 0, multiply the first row by 1/a1,1. So in the new matrix a1,1 = 1.

Now for each i ≠ 1 do a row operation to make ai,1 into 0. (Replace Ri by Ri − ai,1R1.)

At this point the first column has 1 at the top and all zeroes below.

If m = 1 then we are done: this 1 by n matrix is in echelon form. Also if n = 1 then we are done: this m by 1 matrix is in echelon form. So assume m > 1 and n > 1.

Temporarily ignore the first column and the first row and look at the remaining m − 1 by n − 1 matrix. Do row operations on it until it is in echelon form. These same operations performed on the original matrix will not change the first row, and they will not affect the first column either (because ai,1 = 0 for i > 1).

At this point our m by n matrix is almost in echelon form. The only problem is in the first row: we need a1,c(i) to be zero for each i > 1. We can arrange this by subtracting a multiple of Ri from R1, for each i from 2 to r.
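Here is a sketch of row reduction in Python. It is a variant of the procedure above, not a transcription of it: instead of recursing on the submatrix and fixing the first row at the end, it works through the columns from left to right and clears the entries above and below each leading 1 as soon as that 1 is created. The result is the same kind of echelon matrix. The function name rref is an assumption of this example, exact arithmetic is done with Python's Fraction type, and columns are numbered from 0.

```python
from fractions import Fraction

def rref(A):
    """Return (E, pivots): an echelon matrix E row-equivalent to A and
    the list of columns (0-based) that contain the leading 1s."""
    E = [[Fraction(a) for a in row] for row in A]
    m = len(E)
    n = len(E[0]) if m else 0
    pivots = []
    r = 0                                  # number of leading 1s so far
    for col in range(n):
        # Look for a nonzero entry in this column, in row r or below.
        p = next((i for i in range(r, m) if E[i][col] != 0), None)
        if p is None:
            continue                       # nothing to do in this column
        E[r], E[p] = E[p], E[r]            # type 1: interchange rows
        lead = E[r][col]
        E[r] = [a / lead for a in E[r]]    # type 2: make the leading entry 1
        for i in range(m):
            if i != r and E[i][col] != 0:  # type 3: clear the rest of the column
                c = E[i][col]
                E[i] = [a - c * b for a, b in zip(E[i], E[r])]
        pivots.append(col)
        r += 1
    return E, pivots

E, pivots = rref([[0, 2, 4],
                  [1, 1, 1],
                  [2, 4, 6]])
assert E == [[1, 0, -1], [0, 1, 2], [0, 0, 0]]
assert pivots == [0, 1]                    # so the rank is 2
```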

3.3 Row rank equals column rank

We have already seen that the row and column rank of a matrix do not change when it is subjected to row operations. So here is a new proof that column rank equals row rank: for any matrix A there is a row-equivalent echelon matrix E, and we can say rowrank(A) = rowrank(E) = columnrank(E) = columnrank(A).
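The equality can be spot-checked on an example by counting pivots for a matrix and for its transpose. This is only a sanity check, not a proof. The sketch below is in Python; the names rank and transpose are made up for this illustration, and the elimination only clears entries below each pivot, which is enough for counting.

```python
from fractions import Fraction

def rank(A):
    """Count the pivots found by row reduction (exact arithmetic)."""
    M = [[Fraction(a) for a in row] for row in A]
    r = 0
    for col in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            c = M[i][col] / M[r][col]
            M[i] = [a - c * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def transpose(A):
    return [list(col) for col in zip(*A)]

A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [0, 1, 1, 0]]
# The rows of the transpose are the columns of A, so rank(transpose(A))
# computes the column rank of A as a row rank.
assert rank(A) == rank(transpose(A)) == 2
```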

4 The case of square matrices

The biggest possible rank of an m by m matrix is m, and the rank m matrices are the invertible matrices.

The only m by m echelon matrix of rank m is the identity matrix. Therefore every invertible matrix is row-equivalent to the identity. One consequence of this is that the inverse of A can be found by row operations, as follows:

4.1 Inverting a matrix by row operations

Write down A next to I as an m by 2m matrix (A, I). Perform row operations on this in such a way as to turn the left half (A) into I. Each row operation is left multiplication by some invertible matrix P. So for some sequence of invertible matrices Pk we are getting

(A, I) ↦ (P1A, P1) ↦ (P2P1A, P2P1) ↦ ... ↦ (Pr...P2P1A, Pr...P2P1).

At the end, if A has turned into the identity then Pr...P2P1 is the inverse of A; we end up with (I, A^(−1)).

In other words, if A is in fact invertible then you can find its inverse by performing row operations on A to turn it into I, and as you go along simply performing the same operations on I.

If A is not invertible then you will discover that A is not invertible when the echelon matrix obtained from A turns out to have rank less than m.
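Here is a sketch of this method in Python, with exact arithmetic from the Fraction type. The function name invert is an assumption of this example, and returning None for a non-invertible matrix is just one possible convention.

```python
from fractions import Fraction

def invert(A):
    """Row-reduce the m by 2m matrix (A, I). Return the right half when
    the left half becomes I, or None if A is not invertible."""
    m = len(A)
    M = [[Fraction(a) for a in row] + [Fraction(int(i == j)) for j in range(m)]
         for i, row in enumerate(A)]
    for col in range(m):
        # Look for a nonzero entry in this column, in row col or below.
        p = next((i for i in range(col, m) if M[i][col] != 0), None)
        if p is None:
            return None                      # the rank is less than m
        M[col], M[p] = M[p], M[col]
        lead = M[col][col]
        M[col] = [a / lead for a in M[col]]
        for i in range(m):
            if i != col and M[i][col] != 0:
                c = M[i][col]
                M[i] = [a - c * b for a, b in zip(M[i], M[col])]
    return [row[m:] for row in M]            # the right half is the inverse

assert invert([[2, 1], [5, 3]]) == [[3, -1], [-5, 2]]
assert invert([[1, 2], [2, 4]]) is None      # rank 1, so not invertible
```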

4.2 Another way to think about row-equivalence

We have seen that if two m by n matrices A and B are row-equivalent then there exists an invertible m by m matrix P such that B = PA.

In fact, the converse is true: If B = PA where P is invertible, then P can be obtained from the identity matrix by performing row operations, and therefore PA can be obtained from A by performing the same operations. Thus B and A have to be row-equivalent in this case.

5 Row operations and systems of equations

We have seen that the echelon matrix which is row-equivalent to A tells us what the rank of A is, and even gives us a basis for the row space. What else does it do for us?

5.1 Solving systems of homogeneous linear equations

Given a list of m linear equations in n variables:

ai,1x1 + ai,2x2 + ... + ai,nxn = 0, 1 ≤ i ≤ m,

we can encode it as a matrix A. The system of equations can be read as a single vector equation AX = 0, and the problem of finding all solutions (x1, . . . , xn) becomes the problem of describing the nullspace of the linear map TA. Of course the dimension of the solution space is

dim(F^n) − dim(range TA) = n − r,

where r is the rank of A.

When E is an echelon matrix row-equivalent to A, then E = PA for some invertible P, and it follows that the nullspace of E is the same as that of A. (EX = 0 if and only if PAX = 0 if and only if AX = 0.) Thus the system of equations given by the rows of E has precisely the same solutions as the original system.

If A is an echelon matrix, then it gives you particularly useful equations. In fact, it gives you r equations numbered 1 through r, of which the equation numbered i (when we leave out all the terms that are zero) says: xc(i) = −Σj ai,j xj, with j running through all values from c(i) + 1 up to n excluding c(i + 1), . . . , c(r). The equations express the variables xc(1), . . . xc(r) as linear combinations of all the other variables xj, which can therefore be called independent variables. We can also read off a basis for the nullspace. There is one basis vector for each j that is not among the c(i)s. For each such j, choose xj to be 1 and choose all the other independent variables to be 0. Then xc(i) is −ai,j if c(i) < j and 0 if c(i) ≥ j.
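The recipe for reading off a basis of the nullspace can be sketched in Python as follows. The function name nullspace_basis is invented for this example. It takes an echelon matrix E together with the list of columns containing the leading 1s, numbered from 0, so that pivots[i] plays the role of c(i + 1).

```python
def nullspace_basis(E, pivots):
    """E is assumed to be an echelon matrix and pivots the list of its
    leading-1 columns (0-based). Returns one basis vector for each
    independent variable."""
    n = len(E[0])
    free = [j for j in range(n) if j not in pivots]
    basis = []
    for j in free:
        v = [0] * n
        v[j] = 1                       # this independent variable is 1
        for i, c in enumerate(pivots):
            v[c] = -E[i][j]            # is 0 automatically when j < c
        basis.append(v)
    return basis

E = [[1, 2, 0, 3],
     [0, 0, 1, -1],
     [0, 0, 0, 0]]
basis = nullspace_basis(E, [0, 2])
assert basis == [[-2, 1, 0, 0], [-3, 0, 1, 1]]

# Each basis vector really is a solution of EX = 0.
for v in basis:
    assert all(sum(a * x for a, x in zip(row, v)) == 0 for row in E)
```

Here n = 4 and r = 2, so the two basis vectors agree with the dimension count n − r.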

5.2 Inhomogeneous equations

A system of equations

ai,1x1 + ai,2x2 + ... + ai,nxn = bi , 1 ≤ i ≤ m,

corresponds to a single vector equation AX = b, where A is m by n and b is m by 1. In other words, it means specifying a vector b belonging to F^m and asking about the set of all vectors v ∈ F^n such that TA(v) = b.

If b = 0 (the homogeneous case), there is at least one solution, namely v = 0, and the space of solutions is a vector subspace of F^n (the nullspace of TA) whose dimension is n − r, where r is the rank of A.

In general, there might not be any solutions. If there is at least one, say vp, then the equation TA(v) = b becomes TA(v − vp) = 0; the vectors v such that TA(v) = b are the vectors v such that v − vp belongs to the nullspace of TA, i.e. the vectors of the form vp + x where x belongs to the nullspace.

How does row reduction help to work out an example like this?

Make an m by n + 1 matrix by putting b as one additional column: (A, b). Do row operations to transform A into an echelon matrix E = PA, and at the same time perform the same operations on the additional column b to get a new column c, so that the matrix (A, b) becomes (E, c) = (PA, Pb). If the new matrix (E, c) is interpreted as a system of inhomogeneous equations, the solutions of these are the same as the solutions of the original system. (PAX = Pb if and only if AX = b.) On the other hand, we can now tell at a glance whether there are any solutions: Say that the rank is r, so that E has all zeroes after the first r rows. If the extra column c has anything non-zero after the first r rows, i.e. if ci ≠ 0 for some i > r, then there can be no solution. On the other hand, if ci = 0 for all i > r, then you have a list of r equations with which you can solve for the xc(i) as in the homogeneous case. In particular by setting all the independent variables equal to zero (i.e. making xj = 0 if j is not any c(i)), we obtain a particular solution with xc(i) = ci for 1 ≤ i ≤ r, from which the general solution can be obtained by adding solutions of the homogeneous system of equations.
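Here is a sketch of this in Python. The function name particular_solution is an assumption of this example, and returning None when there is no solution is just one possible convention. Leading 1s are only looked for in the first n columns, so that the last column is carried along as the column c.

```python
from fractions import Fraction

def particular_solution(A, b):
    """Row-reduce (A, b) to (E, c). Return one solution of AX = b with
    all independent variables set to 0, or None if there is no solution."""
    m, n = len(A), len(A[0])
    M = [[Fraction(a) for a in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    pivots, r = [], 0
    for col in range(n):                       # only the columns of A
        p = next((i for i in range(r, m) if M[i][col] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        lead = M[r][col]
        M[r] = [a / lead for a in M[r]]
        for i in range(m):
            if i != r and M[i][col] != 0:
                f = M[i][col]
                M[i] = [a - f * e for a, e in zip(M[i], M[r])]
        pivots.append(col)
        r += 1
    if any(M[i][n] != 0 for i in range(r, m)):
        return None                            # nonzero entry of c below row r
    x = [Fraction(0)] * n
    for i, col in enumerate(pivots):
        x[col] = M[i][n]                       # dependent variable = entry of c
    return x

A = [[1, 2, 1],
     [2, 4, 3],
     [3, 6, 4]]
assert particular_solution(A, [3, 7, 10]) == [2, 0, 1]
assert particular_solution(A, [3, 7, 11]) is None   # inconsistent system
```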
