
MA 0540 fall 2013, The dual of a vector space

November 15, 2013

If V is a vector space over F then a linear map L : V → F is called a linear functional on V. The set of all linear functionals on V is denoted by V∗ and called the dual space of V.

V∗ is a vector space. This is a special case of something we have seen before: that in general L(V, W), the set of all linear maps V → W, is a vector space. But let’s recall how it goes in this case.

The sum of two elements of V∗ is defined by (L + M)v = Lv + Mv.

The product of c ∈ F and L ∈ V∗ is defined by (cL)v = c(Lv).
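To make these definitions concrete, here is a minimal Python sketch (an added illustration, not part of the original notes): functionals on F^3 are modeled as plain functions, and the sum and scalar product are formed pointwise, exactly as defined above.

    # Minimal sketch: linear functionals on R^3 modeled as Python functions.
    L = lambda x: 2*x[0] - x[2]          # L(x1, x2, x3) = 2x1 - x3
    M = lambda x: x[1]                   # M(x1, x2, x3) = x2

    # Pointwise operations, as in the notes: (L + M)v = Lv + Mv, (cL)v = c(Lv).
    add = lambda L, M: (lambda x: L(x) + M(x))
    scale = lambda c, L: (lambda x: c * L(x))

    v = (1.0, 2.0, 3.0)
    print(add(L, M)(v))     # (2*1 - 3) + 2 = 1.0
    print(scale(5, L)(v))   # 5*(2*1 - 3) = -5.0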

1 The dual space

Let’s show that if V is finite-dimensional then V∗ has the same dimension as V. Suppose that (v1, . . . , vn) is a basis for V. We define a basis for V∗ as follows: For 1 ≤ j ≤ n define vj∗ ∈ V∗ to be the linear map V → F such that vj∗(vj) = 1 and vj∗(vk) = 0 if k ≠ j. In other words, define vj∗ by

vj∗(x1v1 + ... + xnvn) = xj.
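As a concrete example (added here for illustration): take V = F^2 with basis v1 = (1, 0), v2 = (1, 1). Writing (a, b) = x1v1 + x2v2 forces x2 = b and x1 = a − b, so the dual basis is given by v1∗(a, b) = a − b and v2∗(a, b) = b. One checks directly that v1∗(v1) = 1, v1∗(v2) = 0, v2∗(v1) = 0 and v2∗(v2) = 1.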

To show that (v1∗, . . . , vn∗) is a basis for V∗, let’s first show that it spans V∗. Given any L ∈ V∗, we show that

L = L(v1)v1∗ + ... + L(vn)vn∗.

This is an equation between two linear maps V → F. To verify it, we can use the fact that two linear maps from V to a vector space must be equal if they agree at every element of a basis for V. Thus we just have to show that for each k from 1 to n

L(vk) = (L(v1)v1∗ + ... + L(vn)vn∗)(vk).

Evaluating the right-hand side, we get the sum of n terms L(vj)vj∗(vk), of which all but one are zero. (If j ≠ k, then L(vj)vj∗(vk) = L(vj)0 = 0.) The remaining term is L(vk)vk∗(vk) = L(vk).

Now we show that the vectors vj∗ are linearly independent. Suppose that a1, . . . , an are scalars such that a1v1∗ + ... + anvn∗ = 0. Plugging the vector vk into this linear functional, we get that

(a1v1∗ + ... + anvn∗)(vk) = 0, that is, ak = 0. Since this is true for all k, the vectors are in fact linearly independent.
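Continuing the small example above (an added illustration): the functional L(a, b) = a + b on F^2 has L(v1) = 1 and L(v2) = 2, and indeed L = 1·v1∗ + 2·v2∗, since 1·(a − b) + 2·b = a + b.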

2 The dual of a linear map

Suppose now that T : V → W is linear. This determines a map W∗ → V∗ which we will call T∗, or the dual map of T, defined by T∗L = L ◦ T. Note that this T∗L is a linear map, being the composition of linear maps

V → W → F.

It is a linear map from V to F, so it is an element of V∗.

Next we verify that this map T∗ : W∗ → V∗ is linear. This just means that it is additive and homogeneous. That is, we have to verify two things. The first is (L + M) ◦ T = L ◦ T + M ◦ T. Well,

(L + M)(T v) = L(T v) + M(T v) = (L ◦ T + M ◦ T)(v).

The second is (cL) ◦ T = c(L ◦ T). Well,

(cL)(T v) = c(L(T v)) = c(L ◦ T )(v).

Note that when T : V → W is composed with a linear map S : U → V then we get

(T ◦ S)∗ = S∗ ◦ T∗, since L ◦ (T ◦ S) = (L ◦ T) ◦ S.
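Here is a minimal Python sketch of these two facts (an added illustration; the particular maps S, T and the functional L are arbitrary choices): the dual map is literally "precompose with T", and the contravariant composition rule can be checked pointwise.

    # Minimal sketch: the dual map as precomposition (illustrative example maps).
    S = lambda u: (u[0] + u[1], u[1])    # S : R^2 -> R^2
    T = lambda v: (2*v[0], v[0] - v[1])  # T : R^2 -> R^2
    L = lambda w: w[0] + 3*w[1]          # a functional L : R^2 -> R

    dual = lambda T: (lambda L: (lambda v: L(T(v))))  # T*L = L o T
    compose = lambda T, S: (lambda u: T(S(u)))

    u = (1.0, 2.0)
    lhs = dual(compose(T, S))(L)(u)   # (T o S)* applied to L, evaluated at u
    rhs = dual(S)(dual(T)(L))(u)      # (S* o T*) applied to L, evaluated at u
    print(lhs, rhs)                   # both 9.0: each evaluates L(T(S(u)))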

3 The matrix of the dual of a linear map

Now recall how a matrix corresponds to a linear map. If T : V → W is linear and V and W have bases (v1, . . . , vn) and (w1, . . . , wm) respectively, then T determines an m by n matrix M(T ;(v1,...), (w1,...)). If we call the matrix A and let aij be the element in row i and column j then the relationship between T and the matrix is given by

T vj = Σi aij wi,

where i is summed from 1 to m, or equivalently

T (Σj xjvj) = Σi (Σj aij xj) wi, where j is summed from 1 to n.
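In coordinates this is just matrix-vector multiplication, as a quick NumPy check shows (the 2 by 3 matrix below is an arbitrary example, added for illustration):

    import numpy as np

    A = np.array([[1., 2., 0.],
                  [0., 1., 3.]])      # matrix of some T : F^3 -> F^2
    x = np.array([1., 1., 2.])        # coordinates of a vector in F^3

    # The i-th coordinate of T(sum_j x_j v_j) is sum_j a_ij x_j:
    coords = [sum(A[i, j] * x[j] for j in range(3)) for i in range(2)]
    print(coords, A @ x)              # both give [3.0, 7.0]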

The linear map T : V → W determines a linear map T∗ : W∗ → V∗, as we have seen. The bases of V and W determine bases of V∗ and W∗, as we have also seen. This leads to an n by m matrix M(T∗; (v1∗, . . .), (w1∗, . . .)). Call it B. Thus

T∗wj∗ = Σi bij vi∗.

(This is the second to last equation above, rewritten for T∗ instead of T.)

I claim that the n by m matrix B is the transpose of the m by n matrix A. That is, I claim that bji = aij.

To work this out, let’s first rewrite that last equation, changing the roles of i and j (so that again i goes from 1 to m and j from 1 to n).

T∗wi∗ = Σj bji vj∗.

Now note that this last equation gives

(T∗wi∗)(vj) = bji, while on the other hand we have

(T∗wi∗)(vj) = wi∗(T vj) = wi∗(a1jw1 + ... + amjwm) = aij.

Conclusion: M(T∗; (v1∗, . . .), (w1∗, . . .)) is always the transpose of M(T ; (v1, . . .), (w1, . . .)).
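A quick numerical confirmation (an added sketch, using standard bases on coordinate spaces, where the dual basis again just reads off coordinates; the example matrix is arbitrary):

    import numpy as np

    A = np.array([[1., 2., 0.],
                  [0., 1., 3.]])      # matrix of T : F^3 -> F^2, so m = 2, n = 3
    m, n = A.shape

    # The entry b_ij of B = M(T*) is the coefficient of v_i* in T* w_j*,
    # which is (T* w_j*)(v_i) = w_j*(T v_i) = A[j, i].
    B = np.array([[A[j, i] for j in range(m)] for i in range(n)])
    print(np.array_equal(B, A.T))     # True: B is the transpose of A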

4 The rank of a linear map and of its dual

The rank of a linear map T : V → W between finite-dimensional vector spaces is defined as the dimension of the range of T. Of course rank T ≤ dim W. Also, since rank T = dim V − dim null T, we know that rank T ≤ dim V.

If dim W ≤ dim V then the largest possible rank is dim W , and this maximal rank occurs precisely when T is surjective.

If dim W ≥ dim V then the largest possible rank is dim V , and this maximal rank occurs precisely when T is injective.

If dim W = dim V then the maximal rank occurs precisely when T is invertible.

Theorem: rank T∗ = rank T.

Note that in particular this means that T is surjective if and only if T∗ is injective, and vice versa.

Proof of Theorem: We will prove it by looking at the nullspace of T∗.

As before, let m be the dimension of W and let n be the dimension of V. Let r be the rank of T. What we have to show is that the dimension of null T∗ is m − r.

Choose a basis for the range of T. Extend this to a basis for all of W, putting the basis of the range at the end. So we have vectors (w1, . . . , wm) forming a basis for W, such that the last r vectors in the list, (wm−r+1, . . . , wm), form a basis for range T. Consider the dual basis (w1∗, . . . , wm∗). I claim that the first part, (w1∗, . . . , wm−r∗), is a basis for the nullspace of T∗.

Well, an element L of W∗ belongs to the nullspace of T∗ if and only if T∗L = 0, if and only if for every v ∈ V we have (T∗L)(v) = 0, if and only if for every v we have L(T v) = 0. So L ∈ null T∗ if and only if Lw = 0 whenever w is in the range of T.

What does that mean in terms of our basis for W? It means that Lwi = 0 for every i > m − r. And what does that mean in terms of the dual basis for W∗? It means that if we write our element of W∗ as

L = y1w1∗ + ... + ymwm∗

then L is in the nullspace if and only if the scalar yi is 0 for every i > m − r. In other words the elements of the nullspace are precisely the linear combinations of (w1∗, . . . , wm−r∗) (the first part of the dual basis). Thus we are looking at a basis for the nullspace, and the dimension of the nullspace is m − r as asserted.

5 Row rank and column rank of a matrix

The column rank of an m by n matrix A is defined as the dimension of the subspace of F^m spanned by the columns of the matrix A. In other words, it is the rank of the linear map T : F^n → F^m determined by the matrix in the usual way (multiplying A by an n by 1 matrix to get an m by 1 matrix).

More generally, if the matrix A corresponds to T : V → W by choosing a basis for V and a basis for W then the column rank of A is equal to the rank of T .

The row rank of A is defined as the dimension of the subspace of F^n spanned by the rows of the matrix A. This is the same as the column rank of the transpose of A. So, using the Theorem above, we can say that if A corresponds to T then the row rank of A equals the rank of T∗, equals the rank of T, equals the column rank of A.

Conclusion: Row rank equals column rank.

This is a not so obvious statement about matrices, which we proved by using dual vector spaces.
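A quick numerical sanity check of this conclusion (an added sketch using a random integer matrix and NumPy’s matrix_rank):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.integers(-3, 4, size=(4, 6)).astype(float)
    # matrix_rank(A) is the column rank of A; matrix_rank(A.T) is the
    # column rank of the transpose, i.e. the row rank of A.
    print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T))  # True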
