A Linear Algebra I: Vector Spaces
1 Vector spaces and subspaces
1.1
Let $F$ be a field (in this book, it will always be either the field of reals $\mathbb{R}$ or the field of complex numbers $\mathbb{C}$). A vector space
$$V = (V, +, o, \alpha\cdot(-)\ (\alpha\in F))$$
over $F$ is a set $V$ with a binary operation $+$, a constant $o$ and a collection of unary operations (i.e. maps) $\alpha\cdot(-) : V\to V$ labelled by the elements of $F$, satisfying

(V1) $(x+y)+z = x+(y+z)$,
(V2) $x+y = y+x$,
(V3) $0\cdot x = o$,
(V4) $\alpha\cdot(\beta\cdot x) = (\alpha\beta)\cdot x$,
(V5) $1\cdot x = x$,
(V6) $(\alpha+\beta)\cdot x = \alpha\cdot x + \beta\cdot x$, and
(V7) $\alpha\cdot(x+y) = \alpha\cdot x + \alpha\cdot y$.

Here, we write $\alpha\cdot x$, and we will also write $\alpha x$, for the result $\alpha(x)$ of the unary operation $\alpha$ at $x$. Often, one uses the expression “multiplication of $x$ by $\alpha$”; but it is useful to keep in mind that what we really have is a collection of unary operations (see also 5.1 below). The elements of a vector space are often referred to as vectors. In contrast, the elements of the field $F$ are then often referred to as scalars.

In view of this, it is useful to reflect for a moment on the true meaning of the axioms (equalities) above. For instance, (V4), often referred to as the “associative law”, in fact states that the composition of the functions $V\to V$ labelled by $\beta$ and $\alpha$ is labelled by the product $\alpha\beta$ in $F$; the “distributive law” (V6) states that the (pointwise) sum of the mappings labelled by $\alpha$ and $\beta$ is labelled by the sum $\alpha+\beta$ in $F$; and (V7) states that each of the maps $\alpha\cdot(-)$ preserves the sum $+$. See Example 3 in 1.2.
I. Kriz and A. Pultr, Introduction to Mathematical Analysis, DOI 10.1007/978-3-0348-0636-7, © Springer Basel 2013
1.2 Examples
Vector spaces are ubiquitous. We present just a few examples; the reader will certainly be able to think of many more.

1. The $n$-dimensional row vector space $F^n$. The elements of $F^n$ are the $n$-tuples $(x_1,\dots,x_n)$ with $x_i\in F$, the addition is given by
$$(x_1,\dots,x_n) + (y_1,\dots,y_n) = (x_1+y_1,\dots,x_n+y_n),$$
$o = (0,\dots,0)$, and the $\alpha$'s operate by the rule
$$\alpha((x_1,\dots,x_n)) = (\alpha x_1,\dots,\alpha x_n).$$
Note that $F^1$ can be viewed as the field $F$ itself. However, although the operations $\alpha\cdot(-)$ come from the binary multiplication in $F$, their role in a vector space is different. See 5.1 below.

2. Spaces of real functions. The set $\mathcal{F}(M)$ of all real functions on a set $M$, with pointwise addition and multiplication by real numbers, is obviously a vector space over $\mathbb{R}$. Similarly, we have the vector space $C(J)$ of all the continuous functions on an interval $J$, or e.g. the space $C^1(J)$ of all continuously differentiable functions on an open interval $J$, or the space $C^\infty(J)$ of all smooth functions on $J$, i.e. functions which have all higher derivatives. There are also analogous $\mathbb{C}$-vector spaces of complex functions.

3. Let $V$ be the set of positive reals. Define $x\oplus y = xy$, $o = 1$, and for arbitrary $\alpha\in\mathbb{R}$, $\alpha\cdot x = x^\alpha$. Then $(V, \oplus, o, \alpha\cdot(-)\ (\alpha\in\mathbb{R}))$ is a vector space (see Exercise (1)).
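The axioms of Example 3 can be spot-checked numerically. The following Python sketch (the helper names `add` and `scale` are ours, not the book's notation) verifies (V1)–(V7) on random positive vectors and real scalars:

```python
import math
import random

# Example 3 above: the positive reals form a vector space over R with
# "addition" x (+) y = x*y, zero element o = 1, and scalar action a.x = x**a.
def add(x, y):          # x (+) y
    return x * y

def scale(a, x):        # a . x
    return x ** a

o = 1.0

random.seed(0)
for _ in range(100):
    x, y, z = (random.uniform(0.1, 10.0) for _ in range(3))
    a, b = random.uniform(-3, 3), random.uniform(-3, 3)
    assert math.isclose(add(add(x, y), z), add(x, add(y, z)))                # (V1)
    assert math.isclose(add(x, y), add(y, x))                                # (V2)
    assert math.isclose(scale(0.0, x), o)                                    # (V3)
    assert math.isclose(scale(a, scale(b, x)), scale(a * b, x))              # (V4)
    assert math.isclose(scale(1.0, x), x)                                    # (V5)
    assert math.isclose(scale(a + b, x), add(scale(a, x), scale(b, x)))      # (V6)
    assert math.isclose(scale(a, add(x, y)), add(scale(a, x), scale(a, y)))  # (V7)
print("all axioms hold")
```

The check illustrates the point made after (V7): the “scalar multiplications” $\alpha\cdot(-)$ here are genuinely unary operations ($x\mapsto x^\alpha$), unrelated to any binary multiplication on $V$.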
1.3 An important convention
We have distinguished above the elements of the vector space and the elements of the field by using roman and greek letters. This is a good convention for a definition, but in the row vector spaces $F^n$, which will play a particular role below, it is somewhat clumsy. Instead, we will use for an arithmetic vector a bold-faced variant of the letter denoting the coordinates. Thus,
$$\mathbf{x} = (x_1,\dots,x_n),\quad \mathbf{a} = (a_1,\dots,a_n),\ \text{etc.}$$
Similarly we will write
$$\mathbf{f} = (f_1,\dots,f_n)$$
for the $n$-tuple of functions $f_j : X\to\mathbb{R}$ resp. $\mathbb{C}$ (after all, they can be viewed as mappings $\mathbf{f} : X\to F^n$), and similarly in other such cases.
These conventions make reading about vectors much easier, and we will maintain them as long as possible (for example in our discussion of multivariable differential calculus in Chapter 3). The fact is, however, that in certain more advanced settings the conventions become cumbersome or even ambiguous (for example in the context of tensor calculus in Chapter 15), and because of this, in the later chapters of this book we eventually abandon them, as one usually does in more advanced topics of analysis. We do, however, use the symbol $o$ universally for the zero element of a general vector space, so that in $F^n$ we have $o = (0,0,\dots,0)$.
1.4
We have the following trivial

Observation. In any vector space $V$, for all $x\in V$, we have $x+o = x$ and there exists precisely one $y$ such that $x+y = o$, namely $y = (-1)x$.

(Indeed, $x+o = 1\cdot x + 0\cdot x = (1+0)x = x$ and $x + (-1)x = 1x + (-1)x = (1+(-1))x = 0\cdot x = o$, and if $x+y = o$ and $x+z = o$ then $y = y + (x+z) = (y+x)+z = z$.)
1.5 (Vector) subspaces
A subspace of a vector space $V$ is a subset $W\subseteq V$ that is itself a vector space with the operations inherited from $V$. Since the equations required in $V$ hold for special as well as general elements, we have a trivial

Observation. A subset $W\subseteq V$ of a vector space is a subspace if and only if (a) $o\in W$, (b) if $x,y\in W$ then $x+y\in W$, and (c) for all $\alpha\in F$ and $x\in W$, $\alpha x\in W$.
1.5.1 Also the following statement is immediate.
Proposition. The intersection of an arbitrary set of subspaces of a vector space V is a subspace of V .
1.6 Generating sets
By 1.5.1, we see that for each subset $M$ of $V$ there exists the smallest subspace $W\subseteq V$ containing $M$, namely
$$L(M) = \bigcap\{W \mid W \text{ a subspace of } V \text{ and } M\subseteq W\}.$$
For $M$ finite, we use the notation
$$L(u_1,\dots,u_n)\quad\text{instead of}\quad L(\{u_1,\dots,u_n\}).$$
Obviously $L(\emptyset) = \{o\}$. We say that $M$ generates $L(M)$; in particular, if $L(M) = V$ we say that $M$ is a generating set (of $V$). One often speaks of a set of generators, but we have to keep in mind that this does not imply that each of its elements generates $V$, which would be a much stronger statement. If there exists a finite generating set we say that $V$ is finitely generated, or finite-dimensional.
1.7 The sum of subspaces
Let $W_1, W_2$ be subspaces. Unlike the intersection $W_1\cap W_2$, the union $W_1\cup W_2$ is generally (and typically) not a subspace. But we have the smallest subspace containing both $W_1$ and $W_2$, namely $L(W_1\cup W_2)$. It will be denoted by
$$W_1 + W_2$$
and called the sum of $W_1$ and $W_2$. (One often uses the symbol ‘$\oplus$’ instead of ‘$+$’ when one also has $W_1\cap W_2 = \{o\}$.)
2 Linear combinations, linear independence
2.1
A linear combination of a system $x_1,\dots,x_n$ of elements of a vector space $V$ over $F$ is a formula
$$\alpha_1 x_1 + \cdots + \alpha_n x_n \quad\Big(\text{briefly, } \sum_{j=1}^n \alpha_j x_j\Big). \tag{*}$$
The “system” in question is to be understood as a sequence, although the order in which it is presented will play no role. However, a possible repetition of an individual element is essential.

Note that we spoke of (*) as a “formula”. That is, we had in mind the full information involved (more pedantically, we could speak of the linear combination as of the sequence together with the mapping $\{1,\dots,n\}\to F$ sending $j$ to $\alpha_j$). The vector obtained as the result of the indicated operations should be referred to as the result of the linear combination (*). We will follow this convention consistently to begin with; later, we will speak of a linear combination $\sum_{j=1}^n \alpha_j x_j$ more loosely, trusting that the reader will be able to tell from the context whether we mean the explicit formula or its result.
2.2
A linear combination (*) is said to be non-trivial if at least one of the $\alpha_j$ is non-zero. A system $x_1,\dots,x_n$ is linearly dependent if there exists a non-trivial linear combination (*) with result $o$. Otherwise, we speak of a linearly independent system.
2.2.1 Proposition.
1. If $x_1,\dots,x_n$ is linearly dependent resp. independent then for any permutation $\pi$ of $\{1,\dots,n\}$ the system $x_{\pi(1)},\dots,x_{\pi(n)}$ is linearly dependent resp. independent.
2. A subsystem of a linearly independent system is linearly independent.
3. Let $\beta_2,\dots,\beta_n$ be arbitrary. Then $x_1,\dots,x_n$ is linearly independent if and only if the system $x_1 + \sum_{j=2}^n \beta_j x_j,\ x_2,\dots,x_n$ is.
4. A system $x_1,\dots,x_n$ is linearly dependent if and only if some of its members is a (result of a) linear combination of the others. In particular, any system containing $o$ is linearly dependent. Similarly, if there exist $j\neq k$ such that $x_j = x_k$ then $x_1,\dots,x_n$ is linearly dependent.
Proof. 1 is trivial.

2. A non-trivial linear combination demonstrating the dependence of the smaller system demonstrates the dependence of the bigger one if we put $\alpha_j = 0$ for the remaining summands.

3. It suffices to prove one implication; the other follows by symmetry since the first system can be obtained from the second by using the coefficients $-\beta_j$. Thus, let
$$\alpha_1\Big(x_1 + \sum_{j=2}^n \beta_j x_j\Big) + \alpha_2 x_2 + \cdots + \alpha_n x_n = o$$
with some $\alpha_k\neq 0$. Then we have
$$\alpha_1 x_1 + (\alpha_2 + \alpha_1\beta_2)x_2 + \cdots + (\alpha_n + \alpha_1\beta_n)x_n = o$$
and it is a non-trivial linear combination of the $x_1,\dots,x_n$: indeed, either $\alpha_1\neq 0$ or $(\alpha_k + \alpha_1\beta_k) = \alpha_k\neq 0$.

4. If $\alpha_1 x_1 + \cdots + \alpha_n x_n = o$ (briefly, $\sum_{j=1}^n \alpha_j x_j = o$) with $\alpha_k\neq 0$ then $x_k = \sum_{j\neq k}\Big({-\dfrac{\alpha_j}{\alpha_k}}\Big)x_j$. On the other hand, if $x_k = \sum_{j\neq k}\alpha_j x_j$ we have the non-trivial linear combination $x_k + \sum_{j\neq k}(-\alpha_j)x_j = o$. $\blacksquare$
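For concrete systems of row vectors in $F^n$, linear (in)dependence can be decided by Gaussian elimination: the system $x_1,\dots,x_n$ is independent exactly when the matrix with rows $x_j$ has rank $n$, i.e. no non-trivial combination yields $o$. A sketch in Python (the helper names `rank` and `is_independent` are ours), using exact rational arithmetic to avoid floating-point pitfalls:

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix given as a list of rows, by Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def is_independent(vectors):
    # x_1, ..., x_n are independent exactly when the matrix with rows x_j
    # has rank n (cf. 2.2.1.4: dependence means some row is a combination
    # of the others, which drops the rank).
    return rank(vectors) == len(vectors)

# (1,0,0), (0,1,0), (1,1,0) are dependent: the third is the sum of the
# first two; dropping it leaves an independent system.
print(is_independent([[1, 0, 0], [0, 1, 0], [1, 1, 0]]))  # False
print(is_independent([[1, 0, 0], [0, 1, 0]]))             # True
```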
2.3 Conventions
We speak of a linearly independent finite set $X\subseteq V$ if $X$ is independent when ordered as a sequence without repetition. A general subset $X\subseteq V$ is said to be independent if each of its finite subsets is independent.
2.4 Theorem. Let $M$ be an arbitrary subset of a vector space $V$. Then $L(M)$ is the set of all the (results of) linear combinations of finite subsystems of $M$.

Proof. The set of all such results of linear combinations is obviously a subspace of $V$. On the other hand, a subspace $W$ containing $M$ has to contain all the (results of) linear combinations of elements of $M$. $\blacksquare$

2.5 Proposition. $L(u_1,\dots,u_n)\subseteq L(v_1,\dots,v_k)$ if and only if each of the $u_j$'s is a linear combination of $v_1,\dots,v_k$.

Proof. If it is, the inclusion follows from 2.4 since $L(u_1,\dots,u_n)$ is the smallest subspace containing all the $u_j$; if we have the inclusion then the $u_j$'s are the desired linear combinations, again by 2.4. $\blacksquare$
2.6 Theorem. (Steinitz' Theorem, or The Exchange Theorem) Let $v_1,\dots,v_k$ be a linearly independent system in a vector space $V$ and let $\{u_1,\dots,u_n\}$ be a generating set. Then
(1) $k\le n$, and
(2) there exists a bijection $\pi : \{1,\dots,n\}\to\{1,\dots,n\}$ (i.e. a permutation of the set $\{1,\dots,n\}$) such that
$$\{v_1,\dots,v_k,\ u_{\pi(k+1)},\dots,u_{\pi(n)}\}$$
is a generating set.
Proof. By induction on $k$.

If $k = 1$ we have $v_1 = \sum_{j=1}^n \alpha_j u_j$, and since $v_1\neq o$ by 2.2, there exists at least one $u_{j_0}$ with $\alpha_{j_0}\neq 0$. Now
$$u_{j_0} = \frac{1}{\alpha_{j_0}}\,v_1 - \sum_{j\neq j_0}\frac{\alpha_j}{\alpha_{j_0}}\,u_j$$
and we have, by 2.5,
$$L(v_1, u_1,\dots,u_{j_0-1}, u_{j_0+1},\dots,u_n) = L(u_1,\dots,u_n) = V.$$
Rearrange the $u_j$ by exchanging $u_1$ with $u_{j_0}$.

Now let the statement hold for $k$ and let us have a linearly independent system $v_1,\dots,v_k,v_{k+1}$. Then $v_1,\dots,v_k$ is linearly independent and we have, after a rearrangement of the $u_j$,
$$L(v_1,\dots,v_k, u_{k+1},\dots,u_n) = V.$$
Since $v_{k+1}\in V$ we have
$$v_{k+1} = \sum_{j=1}^k \alpha_j v_j + \sum_{j=k+1}^n \alpha_j u_j.$$
We cannot have all the $\alpha_j$ with $j>k$ equal to zero: since $v_1,\dots,v_k,v_{k+1}$ are independent, this would contradict 2.2.1.4. Thus, $\alpha_{j_0}\neq 0$ for some $j_0 > k$ and hence, first,
$$n\ge k+1,$$
and, second, after rearranging the $u_j$'s to exchange the $u_{j_0}$ with $u_{k+1}$ we obtain
$$\frac{1}{\alpha_{k+1}}\,v_{k+1} = \sum_{j=1}^k \frac{\alpha_j}{\alpha_{k+1}}\,v_j + u_{k+1} + \sum_{j=k+2}^n \frac{\alpha_j}{\alpha_{k+1}}\,u_j,$$
and hence
$$u_{k+1} = -\sum_{j=1}^k \frac{\alpha_j}{\alpha_{k+1}}\,v_j + \frac{1}{\alpha_{k+1}}\,v_{k+1} - \sum_{j=k+2}^n \frac{\alpha_j}{\alpha_{k+1}}\,u_j,$$
and $L(v_1,\dots,v_k,v_{k+1}, u_{k+2},\dots,u_n) = L(u_1,\dots,u_n) = V$ by 2.5 again. $\blacksquare$
3 Basis and dimension
3.1
We have observed a somewhat complementary behaviour of generating sets and independent systems: the former remain generating if more elements are added, the latter remain independent if some elements are deleted. This suggests the importance of minimal generating sets and maximal independent ones. We will see that they are, basically, the same. The resulting concept is of fundamental importance in linear algebra.

A basis of a vector space $V$ is a subset that is both generating and linearly independent.
3.1.1 Observation. In a vector space $V$,
(1) if $u_1,\dots,u_n$ is a generating set then each $x$ can be written as $x = \sum_{j=1}^n \alpha_j u_j$,
(2) if $u_1,\dots,u_n$ is linearly independent then each $x$ can be written in at most one way as $x = \sum_{j=1}^n \alpha_j u_j$,
(3) if $u_1,\dots,u_n$ is a basis then each $x$ can be written in precisely one way as $x = \sum_{j=1}^n \alpha_j u_j$.

((1) is in 2.4; as for (2), if $\sum_{j=1}^n \alpha_j u_j = \sum_{j=1}^n \beta_j u_j$ then $\sum_{j=1}^n (\alpha_j - \beta_j)u_j = o$ and $\alpha_j - \beta_j = 0$; (3) is a combination of (1) and (2).)
3.2 Theorem.
1. Every (finite) generating system $u_1,\dots,u_n$ contains a basis.
2. Every linearly independent system $v_1,\dots,v_k$ of a finitely generated vector space can be extended to a basis.
3. All bases of a finitely generated vector space have the same number of elements.

Proof. 1. If $u_1,\dots,u_n$ are linearly independent we already have a basis. Otherwise there is, by 2.2.1.4, an element $u_j$, say $u_n$ (which we can achieve by rearrangement), that is a linear combination of the others. Then by 2.5, $L(u_1,\dots,u_{n-1}) = L(u_1,\dots,u_n) = V$ and we can repeat the procedure with the generating system $u_1,\dots,u_{n-1}$. After repeating the procedure sufficiently many times we finish with a generating system $u_1,\dots,u_k$ that is linearly independent. (Note that this last system can be empty, if the preceding system consisted of $u_1 = o$ only; the empty system is formally independent, and constitutes a basis of the trivial vector space $\{o\}$.)

2. From 1 we already know that $V$ has a basis $u_1,\dots,u_n$, and from 2.6 we infer that after rearrangement we have a generating system
$$v_1,\dots,v_k,\ u_{k+1},\dots,u_n \tag{*}$$
and this, by 1 again, has to contain a basis. But this basis cannot be a proper subset of (*), by 2.6, since there exists an independent system $u_1,\dots,u_n$.

3. If $u_1,\dots,u_n$ and $v_1,\dots,v_k$ are bases then by 2.6, $k\le n$ and $n\le k$. $\blacksquare$
3.3
The common cardinality of all bases of a finitely generated vector space $V$ is called the dimension of $V$ and denoted by
$$\dim V.$$
From 2.6 and 3.2 we immediately obtain

Corollary. Let $\dim V = n$. Then
1. every generating system $u_1,\dots,u_n$ is a basis, and
2. every linearly independent system $u_1,\dots,u_n$ is a basis.
3.4 Theorem. A subspace $W$ of a finitely generated vector space $V$ is finitely generated, and we have $\dim W\le\dim V$. If $\dim W = \dim V$ then $W = V$.

Proof. We just have to show that $W$ is finitely generated; the other statements are consequences of the already proved facts (since a basis of $W$ is a linearly independent system in $V$). Suppose $W$ is not finitely generated. Then, first, it contains a non-zero element $u_1$. Suppose we have already found a linearly independent system $u_1,\dots,u_n$. Since $W\neq L(u_1,\dots,u_n)$ there exists a $u_{n+1}\in W\smallsetminus L(u_1,\dots,u_n)$. Then, by 2.2.1.4, $u_1,\dots,u_n,u_{n+1}$ is linearly independent, and we can construct inductively an arbitrarily large independent system, contradicting 2.6. $\blacksquare$
3.5 Remark
We have learned that every finitely generated vector space has a basis. In fact, one can easily prove, using Zorn's lemma, that every vector space has one. Indeed, let
$$\{I_j \mid j\in J\}$$
be a chain of independent subsets of $V$. Then $I = \bigcup\{I_j \mid j\in J\}$ is an independent set again, since any finite subset $M = \{x_1,\dots,x_n\}\subseteq I$ is independent: if $x_k\in I_{j_k}$ then $M\subseteq I_r$, the largest of the $I_{j_k}$, $k = 1,\dots,n$. Thus there exists a maximal independent set $B$, and this $B$ is a basis: if there were $x\notin L(B)$ we would have $\{x\}\cup B$ independent, by 2.2.1.4, contradicting the maximality.

Recall the sum of subspaces from 1.7. We have
3.6 Theorem. Let $W_1, W_2$ be finitely generated subspaces of a vector space $V$. Then
$$\dim W_1 + \dim W_2 = \dim(W_1\cap W_2) + \dim(W_1 + W_2).$$

Proof. Consider a basis $u_1,\dots,u_k$ of $W_1\cap W_2$. By 3.2, there exist bases
$$u_1,\dots,u_k,\ v_{k+1},\dots,v_r \ \text{ of } W_1,\quad\text{and}\quad u_1,\dots,u_k,\ w_{k+1},\dots,w_s \ \text{ of } W_2.$$
Then the system
$$u_1,\dots,u_k,\ v_{k+1},\dots,v_r,\ w_{k+1},\dots,w_s$$
obviously generates $W_1 + W_2$, and hence our statement will follow if we prove that it is linearly independent (and hence a basis), since then $\dim(W_1+W_2) = r+s-k$. To this end, let
$$\sum_{j=1}^k \alpha_j u_j + \sum_{j=k+1}^r \beta_j v_j + \sum_{j=k+1}^s \gamma_j w_j = o.$$
Then we have
$$\sum_{j=k+1}^r \beta_j v_j = -\sum_{j=1}^k \alpha_j u_j - \sum_{j=k+1}^s \gamma_j w_j \in W_1\cap W_2$$
and since it also can be written as $\sum_{j=1}^k \delta_j u_j$, all the $\beta_j$ are zero, by 3.1.1. Consequently,
$$\sum_{j=1}^k \alpha_j u_j + \sum_{j=k+1}^s \gamma_j w_j = o$$
and since $u_1,\dots,u_k, w_{k+1},\dots,w_s$ is a basis, all the $\alpha_j$ and $\gamma_j$ are zero as well. $\blacksquare$
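The dimension formula can be checked on a concrete example where the intersection is known by inspection; below, $W_1$ and $W_2$ are coordinate planes in $\mathbb{R}^3$. The `rank` helper (ours) computes the dimension of a span by Gaussian elimination over the rationals:

```python
from fractions import Fraction

def rank(rows):
    # Gaussian elimination over the rationals; rank = number of pivots.
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            if M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# W1 = xy-plane, W2 = xz-plane in R^3; by inspection W1 ∩ W2 is the x-axis.
W1 = [[1, 0, 0], [0, 1, 0]]
W2 = [[1, 0, 0], [0, 0, 1]]
dim_W1, dim_W2 = rank(W1), rank(W2)
dim_sum = rank(W1 + W2)      # W1 + W2 = L(W1 ∪ W2): stack the generators
dim_int = 1                  # the x-axis, known by inspection here
assert dim_W1 + dim_W2 == dim_int + dim_sum
print(dim_W1, dim_W2, dim_int, dim_sum)   # 2 2 1 3
```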
4 Inner products and orthogonality
4.1
In this section, it is important that we work with vector spaces over $\mathbb{R}$ or $\mathbb{C}$. Since all the formulas in the real context will be special cases of the respective complex ones, the proofs will be done in $\mathbb{C}$.

Recall the complex conjugate $\bar z = z_1 - iz_2$ of $z = z_1 + iz_2$, the formulas $\overline{z+z'} = \bar z + \bar z'$ and $\overline{z\cdot z'} = \bar z\cdot\bar z'$, the absolute value $|z| = \sqrt{z\bar z}$, and realize that for a real $z$ this absolute value is the standard one.
4.2
An inner product in a vector space $V$ over $\mathbb{C}$ resp. $\mathbb{R}$ is a mapping
$$((x,y)\mapsto x\cdot y) : V\times V\to\mathbb{C} \text{ resp. } \mathbb{R}$$
such that
(1) $u\cdot u\ge 0$ (in particular, it is always real), and $u\cdot u = 0$ only if $u = o$,
(2) $u\cdot v = \overline{v\cdot u}$ ($u\cdot v = v\cdot u$ in the real case),
(3) $(\alpha u)\cdot v = \alpha(u\cdot v)$, and
(4) $u\cdot(v+w) = u\cdot v + u\cdot w$.

We usually write simply $uv$ for $u\cdot v$, and $u^2$ for $uu$. Note that
$$u(\alpha v) = \overline{(\alpha v)u} = \overline{\alpha(vu)} = \bar\alpha\,\overline{vu} = \bar\alpha(uv)$$
and, using the complex conjugate twice similarly,
$$(v+w)u = vu + wu.$$

Remark: The notation for an inner product sometimes varies. The most common alternate notation to $x\cdot y$ is $\langle x,y\rangle$ (although one must beware of possible confusion with our notation for closed intervals). The notation is particularly convenient when we want to express the dependence of the product on some other data, such as a matrix (see Section 7.7 below).

Further, we introduce the norm
$$||u|| = \sqrt{uu}.$$
4.3 An important example
In the row vector space we will use, without further mention, the inner product
$$\mathbf{x}\cdot\mathbf{y} = \sum_{j=1}^n x_j\bar y_j\quad\Big(\text{in the real case } \mathbf{x}\cdot\mathbf{y} = \sum_{j=1}^n x_j y_j\Big)$$
(see Exercise (2)). This specific example of an inner product is sometimes referred to as the dot product.

4.4 Theorem. (The Cauchy-Schwarz inequality) We have $|xy|\le\sqrt{xx}\,\sqrt{yy}$.
Proof. For any $\gamma\in\mathbb{C}$ we have
$$0\le (x+\gamma y)(x+\gamma y) = xx + (\gamma y)x + x(\gamma y) + (\gamma y)(\gamma y) = xx + \gamma(yx) + \bar\gamma(xy) + \gamma\bar\gamma(yy). \tag{*}$$
If $y = o$ then the inequality in the statement holds trivially. Otherwise set
$$\gamma = -\frac{xy}{yy}$$
to obtain from (*)
$$0\le xx - \frac{xy}{yy}(yx) - \frac{\overline{xy}}{yy}(xy) + \frac{(xy)(yx)}{(yy)(yy)}(yy) = xx - \frac{(xy)(yx)}{yy},$$
and hence $(xy)(\overline{xy}) = (xy)(yx)\le (xx)(yy)$. Take square roots. $\blacksquare$
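The inequality is easy to verify experimentally for the dot product of 4.3 on $\mathbb{C}^n$; a Python sketch (the helper name `inner` is ours):

```python
import math
import random

def inner(x, y):
    # Dot product on C^n from 4.3: x·y = sum of x_j * conj(y_j).
    return sum(a * b.conjugate() for a, b in zip(x, y))

random.seed(1)
for _ in range(1000):
    n = random.randint(1, 5)
    x = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
    y = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
    lhs = abs(inner(x, y))
    rhs = math.sqrt(inner(x, x).real) * math.sqrt(inner(y, y).real)
    assert lhs <= rhs + 1e-12   # Cauchy-Schwarz: |xy| <= sqrt(xx)·sqrt(yy)
print("Cauchy-Schwarz verified on 1000 random pairs")
```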
4.5
Vectors $u, v$ are said to be orthogonal if $uv = 0$. Note that the only vector orthogonal to itself is $o$.

A system $u_1,\dots,u_n$ is said to be orthogonal if $u_j u_k = 0$ whenever $j\neq k$. It is orthonormal if, moreover, $||u_j|| = 1$ for all $j$.

4.5.1 Proposition. An orthogonal system consisting of non-zero elements (in particular, an orthonormal system) is linearly independent.

Proof. Multiply $o = \sum \alpha_j u_j$ by $u_k$ from the right. We obtain $0 = \sum(\alpha_j u_j)u_k = \sum \alpha_j(u_j u_k) = \alpha_k(u_k u_k)$. Since $u_k u_k\neq 0$, $\alpha_k = 0$. $\blacksquare$
4.5.2 Theorem. (The Gram-Schmidt orthogonalization process) For every basis $u_1,\dots,u_n$ of a vector space $V$ with inner product there exists an orthonormal basis $v_1,\dots,v_n$ such that for each $k = 1,2,\dots,n$,
$$L(v_1,\dots,v_k) = L(u_1,\dots,u_k).$$
If $u_1,\dots,u_r$ is orthonormal, we can have $v_j = u_j$ for $j\le r$.

Proof. Start with $v_1 = \dfrac{u_1}{||u_1||}$. If we already have an orthonormal system $v_1,\dots,v_k$ such that $L(v_1,\dots,v_r) = L(u_1,\dots,u_r)$ for all $r\le k$, set
$$w = u_{k+1} - \sum_{j=1}^k (u_{k+1}v_j)v_j.$$
For all $v_r$, $r\le k$, we have
$$wv_r = u_{k+1}v_r - \sum_{j=1}^k (u_{k+1}v_j)(v_j v_r) = u_{k+1}v_r - u_{k+1}v_r = 0.$$
We have $w\neq o$, since otherwise $u_{k+1} = \sum_{j=1}^k (u_{k+1}v_j)v_j \in L(v_1,\dots,v_k) = L(u_1,\dots,u_k)$, contradicting the linear independence of $u_1,\dots,u_k,u_{k+1}$. Thus we can set
$$v_{k+1} = \frac{w}{||w||}$$
and obtain an orthonormal system $v_1,\dots,v_k,v_{k+1}$ with
$$L(v_1,\dots,v_k,v_{k+1}) = L(u_1,\dots,u_k,u_{k+1})$$
by 2.5. Finally, observe that if $u_1,\dots,u_r$ was already orthonormal, the procedure yields $v_j = u_j$ for $j\le r$. $\blacksquare$
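The proof is itself an algorithm. A direct Python transcription for $\mathbb{R}^n$ with the real dot product (the function name `gram_schmidt` is ours):

```python
import math

def gram_schmidt(basis):
    """Orthonormalize a basis of R^n by the process of 4.5.2.

    Returns v_1, ..., v_n with L(v_1..v_k) = L(u_1..u_k) for every k.
    """
    vs = []
    for u in basis:
        # w = u_{k+1} - sum of (u_{k+1}·v_j) v_j
        w = list(u)
        for v in vs:
            c = sum(a * b for a, b in zip(u, v))   # real dot product
            w = [wi - c * vi for wi, vi in zip(w, v)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        # norm > 0 since the input is a basis (w = o would contradict independence)
        vs.append([wi / norm for wi in w])
    return vs

V = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
for i, vi in enumerate(V):
    for j, vj in enumerate(V):
        dot = sum(a * b for a, b in zip(vi, vj))
        assert math.isclose(dot, 1.0 if i == j else 0.0, abs_tol=1e-12)
print("orthonormal:", [[round(x, 3) for x in v] for v in V])
```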
4.6
The orthogonal complement of a subspace $W$ of a vector space $V$ with inner product is the set
$$W^\perp = \{u\in V \mid uv = 0 \text{ for all } v\in W\}.$$
From the properties in 4.2 we immediately obtain

4.6.1 Observations.
1. $W^\perp$ is a subspace of $V$, we have $W^\perp\cap W = \{o\}$, and the implication
$$W_1\subseteq W_2\ \Rightarrow\ W_2^\perp\subseteq W_1^\perp.$$
2. $L(v_1,\dots,v_n)^\perp = \{u \mid uv_j = 0 \text{ for all } j = 1,\dots,n\}$.
4.6.2 Theorem. Let $V$ be a finite-dimensional vector space with inner product. Then we have, for subspaces $W, W_j\subseteq V$,
(1) $W\oplus W^\perp = V$,
(2) $\dim W^\perp = \dim V - \dim W$,
(3) $(W^\perp)^\perp = W$, and
(4) $(W_1\cap W_2)^\perp = W_1^\perp + W_2^\perp$ and $(W_1+W_2)^\perp = W_1^\perp\cap W_2^\perp$.

Proof. (1) and (2): Let $u_1,\dots,u_k$ be an orthonormal basis of $W$. By 2.6 and 4.5.2 we can extend it to an orthonormal basis $u_1,\dots,u_k,u_{k+1},\dots,u_n$ of $V$. If $x = \sum_{j=1}^n \alpha_j u_j$ is in $W^\perp$ we have $0 = xu_r = \sum_{j=1}^n \alpha_j(u_j u_r) = \alpha_r$ for $r\le k$, and $x\in L(u_{k+1},\dots,u_n)$. On the other hand, if $x\in L(u_{k+1},\dots,u_n)$ then $x\in W^\perp$ by 4.6.1.2. Thus, $W^\perp = L(u_{k+1},\dots,u_n)$, and (1) and (2) follow.

(3) Obviously $W\subseteq (W^\perp)^\perp$. By (2), $\dim W = \dim(W^\perp)^\perp$ and hence $W = (W^\perp)^\perp$ by 3.4.

(4) Obviously $W_i^\perp\subseteq (W_1\cap W_2)^\perp$ and hence $W_1^\perp + W_2^\perp\subseteq (W_1\cap W_2)^\perp$, and similarly $W_1^\perp\cap W_2^\perp\supseteq (W_1+W_2)^\perp$. Now, using (3) and 4.6.1.1, we obtain
$$(W_1\cap W_2)^\perp = \big((W_1^\perp)^\perp\cap(W_2^\perp)^\perp\big)^\perp \subseteq \big((W_1^\perp + W_2^\perp)^\perp\big)^\perp = W_1^\perp + W_2^\perp,\quad\text{and}$$
$$(W_1+W_2)^\perp = \big((W_1^\perp)^\perp + (W_2^\perp)^\perp\big)^\perp \supseteq \big((W_1^\perp\cap W_2^\perp)^\perp\big)^\perp = W_1^\perp\cap W_2^\perp. \qquad\blacksquare$$
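The proof of (1)–(2) is constructive: extend an orthonormal basis of $W$ to one of the whole space by Gram-Schmidt and keep the vectors beyond $\dim W$. A Python sketch for $\mathbb{R}^n$ (the function name `complement_basis` is ours; it assumes the given spanning vectors of $W$ are independent):

```python
import math

def complement_basis(w_basis, n):
    """Orthonormal basis of W-perp in R^n, following the proof of 4.6.2 (1)-(2):
    run Gram-Schmidt on the basis of W followed by the standard basis of R^n,
    skipping vectors that are already in the span, and keep the tail."""
    vs = []
    candidates = list(w_basis) + \
        [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u in candidates:
        w = list(u)
        for v in vs:
            c = sum(a * b for a, b in zip(u, v))
            w = [wi - c * vi for wi, vi in zip(w, v)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm > 1e-9:            # skip vectors already in L(vs)
            vs.append([x / norm for x in w])
    k = len(w_basis)
    return vs[k:]                  # dim W-perp = n - dim W, by (2)

# W = L((1,1,0)) in R^3: W-perp is the plane x + y = 0 (dimension 2).
perp = complement_basis([[1.0, 1.0, 0.0]], 3)
assert len(perp) == 2
for v in perp:
    assert math.isclose(v[0] + v[1], 0.0, abs_tol=1e-9)   # orthogonal to (1,1,0)
```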
4.7 Hermitian and Symmetric Bilinear Forms
For a vector space $V$ over $\mathbb{C}$, a mapping $B : V\times V\to\mathbb{C}$ satisfying all the axioms of 4.2 except axiom (1) is called a Hermitian form. (Note that by axiom (2) of 4.2, $B(v,v)$ is always a real number.) If we replace $\mathbb{C}$ by $\mathbb{R}$ in this definition, we speak of a symmetric bilinear form (over $\mathbb{R}$). For Hermitian and symmetric bilinear forms, one usually does not use the notation $\cdot$, but a letter, writing for example $B(u,v)$, $u,v\in V$.

A Hermitian (resp. real symmetric bilinear) form $B$ is then called positive definite (resp. negative definite) if $B$ (resp. $-B$) is an inner product. $B$ is called indefinite if it is neither positive nor negative definite. A Hermitian resp. real symmetric bilinear form $B$ is called degenerate if there exists a non-zero vector $v\in V$ such that for every $w\in V$, $B(v,w) = 0$. Otherwise, $B$ is called non-degenerate. Clearly, every degenerate Hermitian or real symmetric bilinear form is indefinite.

Real symmetric bilinear forms, and whether they are non-degenerate and positive or negative definite, are important in multivariable differential calculus (see Section 8 of Chapter 3). Hermitian forms behave analogously in many ways. It is therefore natural to ask: given a Hermitian or real symmetric bilinear form, can we decide if it is positive or negative definite? Doing this algorithmically requires solving systems of linear equations, which we will review in Appendix B, so we will postpone the solution of this problem to B.2.6 below.
5 Linear mappings
5.1
Let $V, W$ be vector spaces. A mapping $f : V\to W$ is said to be linear if
$$\text{for all } x,y\in V,\ f(x+y) = f(x)+f(y),\quad\text{and}\quad\text{for all } \alpha\in F \text{ and } x\in V,\ f(\alpha x) = \alpha f(x).$$
Note that the “multiplication by elements of $F$” really acts as individual unary operations (recall 1.1). In particular, a linear mapping $f : F\to F$, with $F$ viewed as $F^1$ (recall 1.2.1), satisfies $f(ax) = af(x)$, not $f(ax) = f(a)f(x)$.

A linear mapping $f : V\to W$ is an isomorphism if there is a linear mapping $g : W\to V$ such that $fg = \mathrm{id}$ and $gf = \mathrm{id}$; $V$ and $W$ are then said to be isomorphic.
We have an immediate
5.1.1 Observation. A composition of linear mappings is a linear mapping.
5.2 Examples
1. The projections $p_k = ((x_1,\dots,x_n)\mapsto x_k) : F^n\to F^1$ are linear mappings.
2. The mapping $((x_1,x_2,x_3)\mapsto(x_2,\ x_1-x_3)) : F^3\to F^2$ is linear.
3. Recall 1.2.2. For a fixed $x\in X$, the evaluation mapping $(f\mapsto f(x)) : \mathcal{F}(X)\to\mathbb{R}^1$ is linear.
4. Let $J$ be an open interval. Recall 1.2.2 again. Taking the derivative at a point $a\in J$ is a linear mapping from $C^1(J)$ to $\mathbb{R}^1$.
See the Exercises for more examples.
5.3 Theorem. Let $f : V\to W$ be a linear mapping such that $f[V] = W$, let $g : V\to Z$ be a linear mapping, and let $h : W\to Z$ be a mapping such that $hf = g$. Then $h$ is linear.

Proof. For each $w\in W$ choose an element $\sigma(w)\in V$ such that $f(\sigma(w)) = w$. We have $h(x+y) = h(f(\sigma(x)) + f(\sigma(y))) = hf(\sigma(x)+\sigma(y)) = g(\sigma(x)+\sigma(y)) = g\sigma(x) + g\sigma(y) = hf\sigma(x) + hf\sigma(y) = h(x)+h(y)$, and similarly $h(\alpha x) = h(\alpha f(\sigma(x))) = hf(\alpha\sigma(x)) = g(\alpha\sigma(x)) = \alpha g(\sigma(x)) = \alpha hf\sigma(x) = \alpha h(x)$. $\blacksquare$

Note. This is a general fact about homomorphisms between algebraic structures.
5.3.1 Corollary. Every linear mapping $f : V\to W$ that is one-one and onto is an isomorphism.

(Indeed, there is a $g : W\to V$ such that $gf = \mathrm{id}$ and $fg = \mathrm{id}$. Since $f$ is onto and $\mathrm{id}$ is linear, $g$ is linear.)

5.3.2 Corollary. If $\dim V = n$ then $V$ is isomorphic to $F^n$.

(Choose a basis $u_1,\dots,u_n$ and define a mapping $f : F^n\to V$ by setting $f((x_1,\dots,x_n)) = \sum x_i u_i$. This $f$ is obviously linear, and by 3.1.1 it is one-one and onto.)
5.4 Proposition. Let $f : V\to W$ be a linear mapping. If $f$ is one-one then it sends every linearly independent system to a linearly independent one; if $f$ is onto then it sends every generating set to a generating one. Consequently, isomorphisms preserve generating sets, linearly independent systems, and bases.

Proof. Let $f$ be one-one and let $\sum \alpha_j f(x_j) = o$. Then $f(\sum \alpha_j x_j) = f(o)$ and $\sum \alpha_j x_j = o$, so that if $x_1,\dots,x_n$ were linearly independent, all the $\alpha_j$ are zero.

Let $f$ be onto and let $M$ generate $V$. For a $y\in W$ choose an $x\in V$ such that $f(x) = y$ and write $x$ as $\sum \alpha_i u_i$ with $u_i\in M$. Then $y = f(x) = f(\sum \alpha_i u_i) = \sum \alpha_i f(u_i)$ with $f(u_i)\in f[M]$. $\blacksquare$
5.5 Theorem. Let $u_1,\dots,u_n$ be a basis of a vector space $V$, let $W$ be a vector space and let $\varphi : \{u_1,\dots,u_n\}\to W$ be an arbitrary mapping. Then there exists precisely one linear mapping $f : V\to W$ such that $f(u_i) = \varphi(u_i)$ for each $i$.

Proof. Since every element of $V$ can be written as $x = \sum \alpha_j u_j$, there is at most one such $f$: we must have $f(x) = \sum \alpha_j\varphi(u_j)$. On the other hand, if $x = \sum \alpha_j u_j$ and $y = \sum \beta_j u_j$ then $x+y = \sum(\alpha_j+\beta_j)u_j$ and this is, by 3.1.1, the only such representation. Similarly for $\alpha x = \sum \alpha\alpha_j u_j$. Thus, setting
$$f(x) = \sum \alpha_j\varphi(u_j)\quad\text{where}\quad x = \sum \alpha_j u_j$$
yields a linear mapping $f : V\to W$ such that $f(u_i) = \varphi(u_i)$. $\blacksquare$
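The theorem says that a linear map is freely determined by the images of a basis. For $V = \mathbb{R}^n$ with the standard basis this construction reads as follows (a sketch; the function name and the rotation example are ours):

```python
def linear_map_from_basis_images(images):
    """Given the images phi(e_1), ..., phi(e_n) of the standard basis of R^n,
    return the unique linear map f of Theorem 5.5 (for V = R^n)."""
    def f(x):
        # x = sum of x_j e_j, hence f(x) = sum of x_j phi(e_j)
        m = len(images[0])
        out = [0.0] * m
        for coeff, img in zip(x, images):
            out = [o + coeff * i for o, i in zip(out, img)]
        return out
    return f

# Example: send e1 -> (0, 1), e2 -> (-1, 0); this is rotation by 90 degrees.
rot = linear_map_from_basis_images([[0.0, 1.0], [-1.0, 0.0]])
print(rot([1.0, 0.0]))   # [0.0, 1.0]
print(rot([2.0, 3.0]))   # 2·(0,1) + 3·(-1,0) = [-3.0, 2.0]
```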
5.6 The Free Vector Space on a Set S
In view of Theorem 5.5, it is an interesting question whether for any set $S$ we can find a vector space with a basis $B$ and a bijection $\varphi : S\to B$. This is called the free $F$-vector space on the set $S$, and denoted by $FS$ (it is customary to treat $\varphi$ as the identity, which is usually OK, since it is specified). Of course, for $S$ finite, we may simply take $F^n$ where $n$ is the cardinality of $S$. However, for $S$ infinite, the Cartesian product $F^S$ turns out not to be the right construction. Rather, we set
$$FS = \big\{a : S\to F \mid \text{there exists a finite subset } T\subseteq S \text{ such that } a(s) = 0 \text{ for } s\in S\smallsetminus T\big\}.$$
The operations of addition and multiplication by a scalar are performed pointwise. In fact, this is a vector subspace of $F^S$, which is the space of all maps $S\to F$. The basis $B$ in question is the set of all maps $a_s : S\to F$ where $a_s(s) = 1$ and $a_s(t) = 0$ for $t\neq s$. It is easily verified that this is a basis. One usually treats the map $S\to FS$, $s\mapsto a_s$, as an inclusion, so $a_s$ becomes identified with $s$.
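A finitely supported function $S\to\mathbb{R}$ is naturally stored as a dictionary mapping elements of $S$ to their non-zero coefficients; this gives a minimal sketch of $FS$ over $\mathbb{R}$ (the helper names are ours):

```python
# A minimal sketch of the free R-vector space on a set S: a finitely supported
# function S -> R is a dict from elements to non-zero coefficients.

def add(a, b):
    # pointwise sum; drop zero entries so the support stays finite and canonical
    out = dict(a)
    for s, c in b.items():
        out[s] = out.get(s, 0) + c
        if out[s] == 0:
            del out[s]
    return out

def scale(alpha, a):
    return {} if alpha == 0 else {s: alpha * c for s, c in a.items()}

def basis_vector(s):
    # a_s from the text: a_s(s) = 1, a_s(t) = 0 otherwise
    return {s: 1}

# Elements of S can be anything, e.g. strings; S itself may be infinite.
x = add(scale(2, basis_vector("apple")), scale(-1, basis_vector("pear")))
print(x)   # {'apple': 2, 'pear': -1}
assert add(x, scale(-2, basis_vector("apple"))) == {"pear": -1}
```

The zero-dropping in `add` is what keeps the representation canonical: the zero vector is the empty dict, and two elements of $FS$ are equal exactly when their dicts are.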
5.7 Affine subsets
Let $W$ be a subspace of a vector space $V$ and let $x_0\in V$. A subset of the form
$$x_0 + W = \{x_0 + w \mid w\in W\}$$
is called an affine subset of $V$ (or affine set in $V$).
5.7.1 Proposition. Let $L$ be an affine set in $V$. Then the subspace $W$ in the representation
$$L = x_0 + W$$
is uniquely determined, while for $x_0$ one can take an arbitrary element of $L$. The space $W$ is sometimes referred to as the associated vector subspace, and the dimension of $W$ is referred to as the dimension of $L$.

Proof. We have
$$w\in W \text{ if and only if } w = x - y \text{ with } x,y\in L$$
($x_0 + u - (x_0+v) = u-v\in W$, and on the other hand, if $w\in W$ then $w = (x_0+w) - x_0$). Now let $x_1 = x_0 + w_0$, $w_0\in W$, be arbitrary. Then for any $w\in W$ we have $x_1 + w = x_0 + (w_0+w)\in L$, and $x_0 + w = x_1 + (w - w_0)$. $\blacksquare$
5.8 Theorem. Let $f : V\to Z$ be a linear mapping. Then
(1) $W = f^{-1}[\{o\}]$ is a subspace of $V$, and
(2) the $f^{-1}[\{z\}]$ are precisely the affine sets in $V$ of the form $v+W$ with $f(v) = z$.

Proof. (1): If $f(x) = f(y) = o$ then $f(\alpha x + \beta y) = o$.

(2) Let $f(v_0) = z$. Then for each $w\in W$ we have $f(v_0+w) = f(v_0) + f(w) = z + o = z$, and on the other hand, if $f(v) = z$ then $f(v - v_0) = z - z = o$, hence $v - v_0\in W$, and $v = v_0 + (v - v_0)$. $\blacksquare$
5.9 Affine maps
By an affine map between affine subsets $L\subseteq V$, $M\subseteq W$ of vector spaces $V$, $W$ we shall mean simply a map
$$f : L\to M$$
which is of the form
$$f(x) = y_0 + g(x - x_0)$$
where $x_0\in L$, $y_0\in M$, and $g$ is a linear map between the associated vector subspaces.

It is possible to say a lot more about affine subsets and affine maps. Alternatively, many calculus texts do not mention them at all and refer to affine subsets as “linear subsets”, and to affine maps, imprecisely, as “linear maps [in the broader sense]”. We decided to make the compromise of keeping the terminology precise without dwelling on details which would not be useful to us.
6 Congruences and quotients
6.1
A congruence on a vector space $V$ is an equivalence relation $E\subseteq V\times V$ (we will write $xEy$ for $(x,y)\in E$) such that
$$xEy\ \Rightarrow\ (\alpha x)E(\alpha y)\quad\text{for all } \alpha\in F,\quad\text{and}$$
$$x_iEy_i,\ i = 1,2\ \Rightarrow\ (x_1+x_2)E(y_1+y_2).$$
For the equivalence (congruence) classes $[x], [y]$ set
$$[x] + [y] = [x+y]\quad\text{and}\quad \alpha[x] = [\alpha x]$$
(this is correct: if $x'\in[x]$ and $y'\in[y]$ then $x'Ex$ and $y'Ey$ and hence $(x'+y')E(x+y)$ and $x'+y'\in[x+y]$; similarly for $[\alpha x]$). It is easy to check that the set of equivalence classes with these operations constitutes a vector space, denoted by
$$V/E,$$
and that
$$p_E = (x\mapsto[x]) : V\to V/E$$
is a linear mapping onto.
6.2 Theorem. The formulas
$$E\mapsto W_E = \{x \mid xEo\}\quad\text{and}\quad W\mapsto E_W = \{(x,y) \mid x - y\in W\}$$
constitute a one-one correspondence between the congruences on $V$ and the subspaces of $V$. The congruence classes of $E$ are precisely the affine sets
$$x + W_E.$$

Proof. Obviously $W_E = \{x \mid xEo\}$ is a subspace. If $W$ is a subspace then $E_W$ is a congruence: trivially $xE_Wx$; if $xE_Wy$ then $x-y\in W$, hence $y-x = -(x-y)\in W$ and $yE_Wx$; and if $xE_Wy$ and $yE_Wz$ then $x-z = (x-y)+(y-z)\in W$ and $xE_Wz$. If $x_iE_Wy_i$ then $(x_1-y_1)+(x_2-y_2)\in W$, that is, $(x_1+x_2) - (y_1+y_2)\in W$; and finally, if $xE_Wy$ we have $x-y\in W$ and hence $\alpha x - \alpha y\in W$, that is, $(\alpha x)E_W(\alpha y)$.

Now $x\in W_{E_W}$ if and only if $xE_Wo$ if and only if $x = x - o\in W$, and $xE_{W_E}y$ if and only if $x-y\in W_E$ if and only if $(x-y)Eo$ if and only if $xEy$.

Finally, if $y\in[x]$ then $yEx$, hence $(y-x)Eo$, that is, $y-x\in W_E$, and $y = x + (y-x)\in x+W_E$. If $y\in(x+W_E)$ then $y = x+w$ with $w\in W_E$ and $y-x = w\in W_E$. $\blacksquare$
6.2.1 If $W$ is a subspace of $V$ we will use, in view of 6.2, the symbol
$$V/W\quad\text{instead of}\quad V/E_W.$$
We call the vector space $V/W$ the quotient space (or factor) of $V$ by the subspace $W$.
6.3
Let $f : V\to Z$ be a linear mapping. The subspace $f^{-1}[\{o\}]$ of $V$ is called the kernel of $f$ and denoted by
$$\mathrm{Ker}f.$$

Theorem. (The homomorphism theorem for vector spaces) For every linear mapping $f : V\to Z$ and every subspace $W\subseteq\mathrm{Ker}f$ there is a homomorphism
$$h : V/W\to Z$$
defined by $h(x+W) = f(x)$. If $f$ is onto, so is $h$. If $W = \mathrm{Ker}f$, $h$ is one-to-one.

Proof. Using the projection $V/W\to V/\mathrm{Ker}f$, $x+W\mapsto x+\mathrm{Ker}f$, it suffices to consider the case $W = \mathrm{Ker}f$. If $x + \mathrm{Ker}f = y + \mathrm{Ker}f$ then $x - y\in\mathrm{Ker}f$, hence $f(x) - f(y) = o$ and $f(x) = f(y)$. Thus, the mapping $h$ is correctly defined. Since we have $hp = f$ for the linear mapping $p = (x\mapsto[x]) : V\to V/\mathrm{Ker}f$, $h$ is a linear mapping, by 5.3. Now $h$ is obviously onto if $f$ is. If $x + \mathrm{Ker}f\neq y + \mathrm{Ker}f$ then $x - y\notin\mathrm{Ker}f$ and $f(x) - f(y) = f(x-y)\neq o$, so that $h$ is one-one. $\blacksquare$
7 Matrices and linear mappings

7.1 Matrices

In this section we will deal with vector spaces over the field of complex or real numbers.

A matrix of type $m\times n$ is an array
$$A = \begin{pmatrix} a_{11} & \dots & a_{1n}\\ \dots & \dots & \dots\\ a_{m1} & \dots & a_{mn}\end{pmatrix}$$
where the entries $a_{jk}$ are numbers, real or complex, according to the context. If $m$ and $n$ are obvious we often write simply
$$A = (a_{jk})_{j,k}\quad\text{or}\quad (a_{jk})_{jk}.$$
Sometimes the $jk$-th entry of a matrix $A$ is denoted by $A_{jk}$. The row vectors
$$(a_{j1},\dots,a_{jn}),\quad j = 1,\dots,m,$$
are called the rows of the matrix $A$, and the
$$(a_{1k},\dots,a_{mk}),\quad k = 1,\dots,n,$$
are called the columns of $A$. Hence, a matrix of type $m\times n$ is sometimes referred to as a matrix with $m$ rows and $n$ columns. Matrices of type $m\times m$ are called square matrices.
7.2 Basic operations with matrices
Transposition. Let A D .ajk/jk be an m n matrix. The n m matrix
T 0 0 A D .ajk/jk where ajk D akj
is called the transposed matrix of A. There is a variant of this construction over the field C:IfA is a matrix over C, we denote by A the complex conjugate of AT , i.e. the matrix obtained from AT by replacing every entry by its complex conjugate. This is sometimes called the adjoint matrix of A. A (necessarily square) matrix A which satisfies AT D A (resp. A D A) is called symmetric (resp. Hermitian). Multiplication. Let A D .ajk/jk be an m n matrix and let B D .bjk/jk be an n p matrix. The product of A and B is the matrix
$$AB = (c_{jk})_{jk} \quad\text{where } c_{jk} = \sum_{r=1}^{n} a_{jr} b_{rk}.$$
The unit matrices are the matrices of type $n \times n$ defined by
$$I = I_n = (\delta_j^k)_{jk} \quad\text{where } \delta_j^k = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \neq k. \end{cases}$$
We obviously have
$$(AB)^T = B^T A^T, \qquad (AB)^* = B^* A^*, \qquad\text{and}\qquad AI = A, \quad IA = A$$
whenever defined.
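These operations are easy to state concretely. A minimal sketch in plain Python, using lists of rows (the helper names `transpose`, `matmul`, and `identity` are our own, not from the text):

```python
# Sketch of the matrix operations from 7.2 on a list-of-rows representation.

def transpose(A):
    # (A^T)_{jk} = A_{kj}: columns of A become rows
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    # (AB)_{jk} = sum_r A_{jr} B_{rk}; requires len(A[0]) == len(B)
    return [[sum(A[j][r] * B[r][k] for r in range(len(B)))
             for k in range(len(B[0]))] for j in range(len(A))]

def identity(n):
    # the unit matrix I_n with entries delta_j^k
    return [[1 if j == k else 0 for k in range(n)] for j in range(n)]

A = [[1, 2, 3], [4, 5, 6]]        # 2 x 3
B = [[1, 0], [0, 1], [1, 1]]      # 3 x 2

# (AB)^T = B^T A^T, and AI = A = IA whenever defined
assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
assert matmul(A, identity(3)) == A
assert matmul(identity(2), A) == A
```

Note the reversal of order in $(AB)^T = B^T A^T$: transposition turns an $m \times n$ matrix into an $n \times m$ one, so the factors must swap for the product to be defined.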
The motivation for the definition of the product will be apparent in 7.6 below, where we will also learn more about its properties.
7.3 Row and column vectors as matrices
A vector $x = (x_1, \dots, x_n) \in F^n$ will be viewed as a matrix of type $1 \times n$. Also, we will consider the column vectors, matrices of type $n \times 1$,
$$x^T = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
Clearly, all column vectors of a given dimension $n$ also form a vector space over $F$, known as the $n$-dimensional column vector space and denoted by $F_n$. We will see that although it is more convenient to write rows than columns, the space of columns is preferable in the sense that for columns, composition of linear maps corresponds to multiplication of matrices without reversing the order (see Theorem 7.6 below). Because of this, nearly all courses in linear algebra now use the space of column vectors, not row vectors, as the default model of an $n$-dimensional vector space. We will follow this convention in this text as well. In particular, we will extend the convention of 1.3 to column vectors.
7.4 The standard bases of $F^n$, $F_n$

In the row vector space $F^n$, we will consider the basis
$$e_1, \dots, e_n \quad\text{where } (e_j)_k = \begin{cases} 1 & \text{if } j = k, \\ 0 & \text{if } j \neq k, \end{cases}$$
and in $F_n$, we will consider the basis
$$e^1, \dots, e^n \quad\text{where } e^i = (e_i)^T$$
(this notation conforms with 1.3; of course $(e_j)_k = \delta_j^k$ from 7.2). The $e_j$'s from $F^m$ and $F^n$ with $m \neq n$ differ (and similarly for the $e^j$'s), but this rarely causes confusion. In the rare cases where it can, we will display the dimension $n$ as ${}_n e_j$, ${}_n e^j$. Obviously we have
$$x = \sum_{j=1}^{n} x_j e_j. \tag{7.4.1}$$
7.5 The linear maps $f_A$, $f^A$

Let $A$ be a matrix of type $m \times n$. Define a mapping
$$f_A : F^m \to F^n \quad\text{by setting}\quad f_A(x) = xA,$$
and a mapping
$$f^A : F_n \to F_m \quad\text{by setting}\quad f^A(x) = Ax.$$
7.5.1 Theorem. The mappings $f_A$, $f^A$ are linear, and the formulas
$$A \mapsto f_A \quad\text{resp.}\quad A \mapsto f^A$$
yield a bijective correspondence between matrices of type $m \times n$ and the set of all linear mappings $F^m \to F^n$ resp. $F_n \to F_m$.
Proof. We will prove the statement about row spaces. The statement for column spaces is analogous (see Exercise (10)). The linearity of $f_A$ is an immediate consequence of the definition of the product of matrices. We have
$$(e_j A)_{1k} = \sum_{r=1}^{m} (e_j)_r \, a_{rk} = a_{jk} \tag{*}$$
and hence if $A \neq B$, there exist $r, s$ such that $a_{rs} \neq b_{rs}$, so that $f_A(e_r) \neq f_B(e_r)$; thus $f_A \neq f_B$. Now let $f : F^m \to F^n$ be an arbitrary linear mapping. Consider the $a_{jk}$ uniquely determined by the formula
$$f({}_m e_j) = \sum_{k=1}^{n} a_{jk} \, ({}_n e_k)$$
and define $A$ as the array $(a_{jk})_{jk}$. We have, by (*),
$$f(x) = f\Bigl(\sum_j x_j \, {}_m e_j\Bigr) = \sum_j x_j f({}_m e_j) = \sum_j x_j \sum_k a_{jk} \, {}_n e_k = \sum_k \Bigl(\sum_j x_j a_{jk}\Bigr) {}_n e_k,$$
and hence $f(x)_{1k} = (xA)_{1k}$, and finally $f(x) = xA$. $\square$
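The two directions of the correspondence can be sketched in a few lines of Python (the helper names `row_times_matrix` and `matrix_of` are illustrative, not from the text):

```python
# Sketch of the bijection A <-> f_A from Theorem 7.5.1, row-vector version.

def row_times_matrix(x, A):
    # f_A(x) = xA for a row vector x of length m and an m x n matrix A
    return [sum(x[j] * A[j][k] for j in range(len(A)))
            for k in range(len(A[0]))]

def matrix_of(f, m):
    # recover A from f: by formula (*), the j-th row of A is f(e_j)
    def e(j):
        return [1 if i == j else 0 for i in range(m)]
    return [f(e(j)) for j in range(m)]

A = [[1, 2], [3, 4], [5, 6]]                    # 3 x 2
f = lambda x: row_times_matrix(x, A)            # f_A : F^3 -> F^2

# the round trip A -> f_A -> (matrix of f_A) gives back A
assert matrix_of(f, 3) == A
```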
7.6 Theorem. In the representation of linear mappings from 7.5 we have
$$f_I = \mathrm{id}, \qquad f_{AB} = f_B \circ f_A,$$
and
$$f^I = \mathrm{id}, \qquad f^{AB} = f^A \circ f^B.$$
Proof. We will only prove the statement for row vectors. The statement for column vectors is analogous (see Exercise (11)). The first formula is obvious. Now let $A$, $B$ be matrices of types $m \times n$ resp. $n \times p$. If two linear maps agree on a basis they obviously coincide. We have
$$f_B(f_A({}_m e_j)) = f_B\Bigl(\sum_k a_{jk} \, {}_n e_k\Bigr) = \sum_k a_{jk} \, f_B({}_n e_k) = \sum_k a_{jk} \sum_r b_{kr} \, {}_p e_r = \sum_r \Bigl(\sum_k a_{jk} b_{kr}\Bigr) {}_p e_r = f_{AB}({}_m e_j). \qquad \square$$
7.6.1 From the associativity of composition of mappings and from the uniqueness of the matrix in the representation of linear mappings as $f_A$ we immediately obtain

Corollary. Multiplication of matrices is associative, that is, $A(BC) = (AB)C$ whenever defined.
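Both facts are easy to check numerically. A small sketch (the sample matrices and the helper names `xA`, `matmul` are our own, chosen for illustration):

```python
# Checking f_{AB} = f_B o f_A (Theorem 7.6) and associativity (7.6.1).

def xA(x, A):
    # the row-vector action f_A(x) = xA from 7.5
    return [sum(x[j] * A[j][k] for j in range(len(A)))
            for k in range(len(A[0]))]

def matmul(A, B):
    return [[sum(A[j][r] * B[r][k] for r in range(len(B)))
             for k in range(len(B[0]))] for j in range(len(A))]

A = [[1, 2, 0], [0, 1, 1]]        # 2 x 3
B = [[1, 0], [2, 1], [0, 3]]      # 3 x 2
x = [5, -1]                       # row vector in F^2

# f_{AB}(x) = f_B(f_A(x)): apply A first, then B
assert xA(x, matmul(A, B)) == xA(xA(x, A), B)

# associativity A(BC) = (AB)C, as in the Corollary
C = [[1, 1], [0, 2]]
assert matmul(A, matmul(B, C)) == matmul(matmul(A, B), C)
```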
7.6.2 Different bases, base change

At this point we must mention the fact that the association between matrices and linear maps works for arbitrary finite-dimensional vector spaces $V$, $W$. Let $B = (v_1, \dots, v_n)$ resp. $C = (w_1, \dots, w_m)$ be sequences of distinct vectors in $V$ resp. $W$ which, when considered as sets, form bases of $V$ and $W$ (we speak of ordered bases). Then for an $m \times n$ matrix $A$ over $F$, we have an associated linear map
$${}_{B,C}f^A : V \to W$$
given by
$${}_{B,C}f^A(v_j) = \sum_{i=1}^{m} a_{ij} w_i.$$
Clearly (for example, by considering the isomorphisms between $V$, $F_n$ and $W$, $F_m$ mapping $B$ and $C$ to the standard bases), this again defines a bijective correspondence between $m \times n$ matrices over $F$ and linear maps from $V$ to $W$. We will say that the linear map ${}_{B,C}f^A$ is associated to the matrix $A$ with respect to the bases $B$ and $C$, and, vice versa, that $A$ is the matrix associated with the linear map (or simply the matrix of the linear map) $f = {}_{B,C}f^A$ with respect to the bases $B$, $C$. An analogue of Theorem 7.6 of course holds, i.e.
$${}_{B,D}f^{A_1 A_2} = {}_{C,D}f^{A_1} \circ {}_{B,C}f^{A_2} \tag{*}$$
for an $m \times n$ matrix $A_1$ and an $n \times p$ matrix $A_2$, and ordered bases $B$, $C$, $D$ of $p$- resp. $n$- resp. $m$-dimensional spaces $U$, $V$, $W$. For two ordered bases $B$, $B'$ of the same finite-dimensional vector space $V$, the matrix of $\mathrm{Id} : V \to V$ with respect to the basis $B$ in the domain and $B'$ in the codomain is sometimes referred to as the base change matrix from the basis $B$ to the basis $B'$. By (*), base change matrices can be used to relate matrices of linear maps with respect to different bases, both in the domain and in the codomain.
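A small numerical sketch of a base change in $\mathbb{R}^2$ (the bases below are made up for illustration, and `apply` is our own helper for the column action $x \mapsto Ax$):

```python
# Base change in R^2, following the convention of 7.6.2.

def apply(A, x):
    # f^A(x) = Ax for an m x n matrix A and a column x (as a list)
    return [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]

# B = (e1, e2), the standard basis; B' = ((1, 1), (1, -1)).
# The matrix of Id with B in the domain and B' in the codomain has as
# its j-th column the B'-coordinates of e_j, since
#   e1 = 1/2 (1,1) + 1/2 (1,-1)   and   e2 = 1/2 (1,1) - 1/2 (1,-1).
T = [[0.5, 0.5], [0.5, -0.5]]

x = [3.0, 1.0]          # B-coordinates (here: ordinary coordinates)
c = apply(T, x)         # B'-coordinates of the same vector

# reconstruct: c_1 * (1,1) + c_2 * (1,-1) recovers x
assert [c[0] + c[1], c[0] - c[1]] == x
```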
7.7 Hermitian matrices and Hermitian forms
Given a Hermitian (resp. symmetric) matrix $A$ of type $n \times n$ over $\mathbb{C}$ (resp. over $\mathbb{R}$), we have a Hermitian (resp. symmetric bilinear) form $B$ on $\mathbb{C}_n$ (resp. $\mathbb{R}_n$) given by
$$B(x, y) = y^* A x.$$
In the case when $B$ is positive definite, this becomes an inner product, also denoted by
$$\langle x, y \rangle_B.$$
(In the real case, of course, $y^* = y^T$.) Conversely, the axioms immediately imply that every Hermitian (resp. symmetric bilinear) form on $\mathbb{C}_n$ (resp. $\mathbb{R}_n$) arises in this way. We will say that the form $B$ is associated with the matrix $A$, and vice versa. Sometimes we simplify the terminology and call a Hermitian (resp. real symmetric) matrix positive definite, resp. negative definite, resp. indefinite, if the corresponding property holds for its associated Hermitian (resp. symmetric bilinear) form.
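The defining formula is directly computable. A sketch with a sample $2 \times 2$ Hermitian matrix (the matrix, the vectors, and the helper name `form` are our own choices for illustration):

```python
# The Hermitian form B(x, y) = y* A x from 7.7, on columns in C_2.

def form(A, x, y):
    # compute Ax, then pair it with the conjugated entries of y
    Ax = [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(A))]
    return sum(y[i].conjugate() * Ax[i] for i in range(len(y)))

A = [[2, 1 - 1j], [1 + 1j, 3]]        # Hermitian: A* = A
x, y = [1 + 0j, 2j], [1j, 1 + 1j]

# Hermitian symmetry: B(y, x) is the complex conjugate of B(x, y)
assert form(A, y, x) == form(A, x, y).conjugate()
# consequently B(x, x) is always real
assert form(A, x, x).imag == 0
```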
8 Exercises

(1) Prove the statement made in Example 3 of 1.2.
(2) Prove that the dot-product from 4.2 satisfies the definition of an inner product, and more generally that the form $B$ defined in Subsection 7.7 is a Hermitian (resp. symmetric bilinear) form.
(3) Prove that every Hermitian (resp. symmetric bilinear) form on $\mathbb{C}_n$ (resp. $\mathbb{R}_n$) is associated with a Hermitian (resp. symmetric) matrix.
(4) Take the vector space $V$ from Example 3 of 1.2. Prove that $(x \mapsto \ln x)$ is an isomorphism $V \to \mathbb{R}^1$.
(5) Prove that if $\langle\cdot,\cdot\rangle_1$, $\langle\cdot,\cdot\rangle_2$ are inner products on a (real or complex) vector space $V$ and $\alpha, \beta > 0$, then $\alpha\langle\cdot,\cdot\rangle_1 + \beta\langle\cdot,\cdot\rangle_2$ is an inner product.
(6) Prove that the linear maps $F \to F$ are precisely the mappings $(x \mapsto ax)$ where $a \in F$ is fixed.
(7) Prove that if $\langle a, b \rangle$ is a closed interval then $(\varphi \mapsto \int_a^b \varphi(x)\,dx)$ is a linear mapping $C(\langle a, b \rangle) \to \mathbb{R}^1$.
(8) Prove that the set of all $a_s$, $s \in S$, in 5.6 forms a basis of the free vector space $FS$ on a set $S$.
(9) Prove that an affine map $f : L \to M$ between affine subsets of vector spaces $V$, $W$ can be made to satisfy the definition 5.9 with any choice of the element $x_0 \in L$. Is the analogous statement true for $y_0 \in M$?
(10) Prove the statement of Theorem 7.5.1 for column vectors.
(11) Prove the statement of Theorem 7.6 for column vectors.
(12) Prove that the set of all matrices of type $m \times n$ with entries in $F$ is a vector space over $F$, where addition is addition of matrices and multiplication by a scalar $\lambda \in F$ is the operation which multiplies each entry by $\lambda$. Is this vector space finite-dimensional? What is its dimension?

B Linear Algebra II: More about Matrices
1 Transforming a matrix. Rank
1.1 Elementary row and column operations
Recall Section A.7. Let $A$ be a matrix of type $m \times n$. The vector subspace $\mathrm{Row}(A)$ of $F^n$ generated by the rows of $A$ is called the row space of $A$, and the vector subspace $\mathrm{Col}(A)$ of $F_m$ generated by the columns is called the column space of $A$. An elementary row (resp. column) operation on $A$ is any of the following three transformations of the matrix.
(E1) A permutation of the rows (resp. columns).
(E2) Multiplication of one of the rows (resp. columns) by a non-zero number.
(E3) Adding to a row (resp. column) a linear combination of the other rows (resp. columns).

1.1.1 Observation. An elementary row (resp. column) operation does not change the row (resp. column) space.
1.2
The column space is, of course, changed by a row operation (and the row space is changed by a column operation). We have, however, the following
Proposition. An elementary row (resp. column) operation preserves the dimension of the column (resp. row) space.
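Before the proof, a quick numerical illustration (the matrices are made up, and `rank` is our own helper, computing the dimension by Gaussian elimination over the rationals):

```python
# An elementary row operation can change the column space while
# preserving its dimension (the content of the Proposition).
from fractions import Fraction

def rank(M):
    # Gaussian elimination over Q; the number of pivots is the rank
    M = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A  = [[1, 2], [2, 4]]   # columns span the line through (1, 2): dimension 1
A2 = [[1, 2], [0, 0]]   # row 2 minus 2 * row 1; columns now span (1, 0)

# different column spaces, same dimension
assert rank(A) == rank(A2) == 1
```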
Proof. Let $p$ be a permutation of the set $\{1, 2, \dots, n\}$. Define $\bar{p} : F^n \to F^n$ by setting