
Orthogonal Subspaces

Learning Goals: encounter the concept of orthogonal subspaces in the context of the four fundamental subspaces of a matrix. Learn about orthogonal complements.

Early on we briefly encountered the idea of angles between vectors and the dot product. The dot product of x and y is xTy. One of the most important things we learned is that x and y are orthogonal if their dot product is zero. This meant that they were at right angles, or that one or both were zero. If we look at a matrix-vector product, Ax, the result is that each row of A is multiplied by x. We are taking the dot product of x with each row of A. Let’s look at the matrix

        ⎡ 2   1   0  −3   1⎤
    A = ⎢ 4   2   1  −3   2⎥ .
        ⎢−2  −1   0   2   0⎥
        ⎣ 8   4   2  −7   5⎦

Notice that if we solve Ax = 0, looking for the nullspace, any solution is orthogonal to every row. Then, since dot products distribute over addition and commute with scalar multiplication, x is orthogonal to every combination of the rows. In fact, every vector in the nullspace is orthogonal to every vector in the row space. This deserves a definition:
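As a quick numerical sanity check (a sketch using NumPy, not part of the original notes; the solution vector below is one we will read off later from the reduced row echelon form), we can verify that a vector in the nullspace of A has zero dot product with every row, and with any combination of the rows:

import numpy as np

A = np.array([[ 2,  1, 0, -3, 1],
              [ 4,  2, 1, -3, 2],
              [-2, -1, 0,  2, 0],
              [ 8,  4, 2, -7, 5]], dtype=float)

x = np.array([1, 0, -3, 1, 1], dtype=float)  # a solution of Ax = 0

print(A @ x)                         # each entry is (row of A) . x, so all zeros
c = np.array([1.0, -2.0, 0.5, 3.0])  # arbitrary coefficients
combo = c @ A                        # a combination of the rows
print(combo @ x)                     # still zero: x is orthogonal to the row space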

Definition: let V and W be two subspaces of some vector space. If every vector in V is orthogonal to every vector in W, then V and W are called orthogonal subspaces.

This definition is symmetric in V and W, of course. As examples, the xy-plane and the z-axis are orthogonal. The x-axis and the z-axis are orthogonal. But two perpendicular planes in R3 (such as a floor and a wall) are not orthogonal. For if v is a nonzero vector that lies in the intersection line, it would have to be orthogonal to itself, which it isn’t.

Corollary: if V and W are orthogonal, then their intersection contains only the zero vector.

So the row space and the nullspace are orthogonal. For our matrix, the reduced row echelon form is

        ⎡1  1/2  0  0  −1⎤
    R = ⎢0   0   1  0   3⎥ .
        ⎢0   0   0  1  −1⎥
        ⎣0   0   0  0   0⎦

We can read off a basis for the nullspace as (−1/2, 1, 0, 0, 0) and (1, 0, −3, 1, 1). Both of these, and therefore any combination of them, are orthogonal to the entire row space. What’s more, everything that is orthogonal to the row space is a null vector.
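If you want to reproduce these computations, here is a sketch using SymPy (assumed to be available; not part of the original notes) that finds the reduced row echelon form and the special solutions, and confirms that each special solution is orthogonal to every row:

from sympy import Matrix

A = Matrix([[ 2,  1, 0, -3, 1],
            [ 4,  2, 1, -3, 2],
            [-2, -1, 0,  2, 0],
            [ 8,  4, 2, -7, 5]])

R, pivots = A.rref()
print(R)                    # the reduced row echelon form shown above
print(pivots)               # pivot columns (0, 2, 3), i.e. columns 1, 3, 4

for n in A.nullspace():     # special solutions such as (-1/2, 1, 0, 0, 0)
    print(n.T, (A * n).T)   # A*n = 0, so n is orthogonal to every row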

Definition: if S is any set in a vector space, then S⊥ is the set of all vectors that are orthogonal to everything in S. It is called the orthogonal complement of S.

Orthogonal complements are never empty, since they always contain the zero vector. In fact, they are always subspaces, even if S isn’t. For if v and w have zero dot products with everything in S, so does any linear combination of v and w. Note that the orthogonal complement construction is not symmetric. It is possible for S⊥ to equal T but T⊥ to not equal S, even if S is originally a subspace. Fortunately that problem doesn’t exist in finite dimensions!

Theorem: in Rn, if V is a subspace with orthogonal complement W, then the orthogonal complement of W is V.

Proof: Let v1, v2, …, vr be a basis for V. Make these the rows of a matrix A. This matrix will have rank r, so there will be n − r free variables. Thus the nullspace, which by definition contains all the vectors that are orthogonal to the row space, must have dimension n − r. So the r basis vectors for V plus the n − r basis vectors for V⊥ must form a basis for Rn (there are n vectors in total, and they must be independent: otherwise some nontrivial combination of the rows would equal a nontrivial combination of the null vectors, producing a nonzero vector orthogonal to itself, which is impossible). So we could reverse the roles and make the n − r null basis vectors the rows of a matrix, and then V would appear as its nullspace.
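The proof suggests a computation: put a basis for V into the rows of a matrix, take the nullspace to get W = V⊥, then take the nullspace again to recover V. A sketch using SciPy’s null_space routine (the subspace V below is a made-up example, not one from the notes):

import numpy as np
from scipy.linalg import null_space

# A made-up subspace V of R^4, spanned by the two rows below
V = np.array([[1., 0.,  2., 1.],
              [0., 1., -1., 3.]])

W = null_space(V).T         # rows of W form a basis for V-perp (dimension 4 - 2 = 2)
V_again = null_space(W).T   # rows of V_again span (V-perp)-perp

print(np.allclose(W @ V.T, 0))                         # V and W are orthogonal
print(np.linalg.matrix_rank(np.vstack([V, V_again])))  # rank stays 2: same subspace as V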

Corollary: If two subspaces are orthogonal and have total dimension equal to that of the parent space, they are orthogonal complements of each other.

Corollary: if V and W are orthogonal complements, and x is any vector, then there are unique v and w with v in V, w in W, and v + w = x.
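The splitting in this corollary can be computed by projecting x onto V: the projection is the piece v, and the leftover w = x − v is automatically orthogonal to V. A sketch with a made-up basis (NumPy’s least-squares routine does the projection; none of these numbers are from the notes):

import numpy as np

# V is spanned by the columns of B (a made-up basis of a 2-dimensional subspace of R^4)
B = np.array([[1.,  0.],
              [0.,  1.],
              [2., -1.],
              [1.,  3.]])

x = np.array([4., 1., -2., 5.])

c, *_ = np.linalg.lstsq(B, x, rcond=None)  # least-squares coefficients
v = B @ c                                  # the piece in V (projection of x onto V)
w = x - v                                  # the piece in W = V-perp

print(np.allclose(B.T @ w, 0))             # w is orthogonal to every basis vector of V
print(np.allclose(v + w, x))               # and the two pieces add back up to x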

We can, of course, turn all this around and apply it to the column and left nullspaces. The column space and the left nullspace are orthogonal complements, because they are orthogonal (by definition) and their dimensions add up. (Or, we could just apply all the arguments to AT). In fact, we have the

Theorem: (Fundamental Theorem of Linear Algebra, part II). The column space and the left nullspace are orthogonal complements. The row space and the nullspace are orthogonal complements. To be cute, R(A)⊥ = N(AT) and R(AT)⊥ = N(A).

In our matrix above, the first, third, and fourth columns are a basis for the column space. From elimination, we find that a basis for the left nullspace is (−1, −2, −1, 1). Clearly the dimensions add up (3 + 1 = 4). Everything that is orthogonal to the column space is a multiple of (−1, −2, −1, 1), and everything orthogonal to this vector is in the column space. In fact, note that this vector also gives the condition on the entries of a right-hand side b (−b1 − 2b2 − b3 + b4 = 0) for it to even be in the column space. In general, a vector can only be in the column space if it is orthogonal to the left nullspace.
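In other words, testing whether a right-hand side b lies in the column space takes a single dot product with the left null vector. A numerical sketch (the test vectors b below are made up for illustration):

import numpy as np

A = np.array([[ 2,  1, 0, -3, 1],
              [ 4,  2, 1, -3, 2],
              [-2, -1, 0,  2, 0],
              [ 8,  4, 2, -7, 5]], dtype=float)

y = np.array([-1., -2., -1., 1.])            # basis for the left nullspace: yT A = 0
print(np.allclose(y @ A, 0))

b_in  = A @ np.array([1., 0., 2., -1., 3.])  # certainly in the column space
b_out = np.array([1., 0., 0., 0.])           # violates -b1 - 2b2 - b3 + b4 = 0

print(np.isclose(y @ b_in, 0))               # True: b_in is in the column space
print(np.isclose(y @ b_out, 0))              # False: b_out is not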

The action of a matrix

The true action of a matrix is to take a vector in Rn and send it to a vector in Rm according to the formula x → Ax. Since x is in Rn, it can be written uniquely as a piece in the row space plus a piece in the nullspace. The nullspace piece gets sent to zero! The row space piece must get sent into the column space. Not only that, there must be a one-to-one and onto correspondence between row space vectors and column space vectors. Why? Two different row space vectors get sent to two different column space vectors: for if Ax = Ay, then x − y is in the nullspace. But x − y is also in the row space, and to be in both a space and its complement we must have x − y = 0. So the mapping is one-to-one. It is onto, because every z in the column space comes from some x in Rn. That x splits into xr + xn, and as we have seen the nullspace part gets sent to zero, so xr gets sent to z. So the mapping from the row space to the column space is onto.

We have seen that if a matrix has full row rank, then every right-hand side has a solution, so in particular there is a right inverse. But this right inverse is not unique, for any null vector can be added to any of its columns. There is a similar non-unique left inverse when a matrix has full column rank. Later on, we will pick out a particular matrix A+ for any matrix A, called the pseudoinverse, that works on either side. It undoes this row space-column space correspondence and sends the left nullspace to zero, just as the original matrix sets up the correspondence while sending the (right) nullspace to zero.
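A numerical sketch of this last point (using NumPy’s pinv; the test vector x is made up): A+A projects any x onto the row space, A kills the nullspace piece, and A+ sends the left nullspace to zero while undoing the row space to column space map on the piece that survives.

import numpy as np

A = np.array([[ 2,  1, 0, -3, 1],
              [ 4,  2, 1, -3, 2],
              [-2, -1, 0,  2, 0],
              [ 8,  4, 2, -7, 5]], dtype=float)

A_plus = np.linalg.pinv(A)            # the pseudoinverse A+

x = np.array([3., -1., 2., 0., 4.])
xr = A_plus @ A @ x                   # A+ A projects x onto the row space
xn = x - xr                           # the nullspace piece

print(np.allclose(A @ xn, 0))              # A sends the nullspace piece to zero
print(np.allclose(A @ x, A @ xr))          # only the row space piece matters
print(np.allclose(A_plus @ (A @ xr), xr))  # A+ undoes the row space -> column space map
print(np.allclose(A_plus @ np.array([-1., -2., -1., 1.]), 0))  # A+ sends the left nullspace to zero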

Optional: infinite dimensional counterexample

Let’s look in L2, the space of all square-summable sequences of real numbers. That is, the set of sequences of real numbers (a1, a2, a3, …) for which the sum a1² + a2² + a3² + ⋯ converges. If (a1, …) and (b1, …) are such sequences, it is not too hard to show that their “dot product” a1b1 + a2b2 + ⋯ converges and satisfies the normal rules for dot products. Let V be the set of sequences with only a finite number of non-zero terms, i.e. they are zero from some point onward. This is a subspace. Its orthogonal complement contains only the zero sequence. For if (a1, a2, …) is a sequence orthogonal to all sequences that are zero from some point onward, it is orthogonal to (1, 0, 0, …) and to (0, 1, 0, …) and so forth. So it must have a1 = 0, and a2 = 0, and so on. But the orthogonal complement of just this zero sequence is all sequences in L2, not just the eventually-zero ones!

Reading: 4.1 Problems: 4.1: 3, 4, 6, 7, 9, 11, 17, 18, 21, 22, 23, 28, 29, 30, 32, 33