Vector spaces

Brian Krummel January 25, 2020

In the previous lecture we introduced matrices and column vectors. Today we will discuss the basic algebraic properties of matrices and column vectors by introducing the general concept of a vector space. Matrices and column vectors will be examples of vector spaces, as will many other things, including real-valued functions. The main reason we care about linear algebra is that the theory of vector spaces can be applied in a wide variety of situations, including solving systems of linear equations and differential equations. Finally, we will discuss linear combinations of vectors and introduce two important concepts, span and linear independence.

Definition 1. A vector space is a set V together with two operations:

addition, which assigns to each X, Y ∈ V a unique sum X + Y, and

scalar multiplication, which assigns to each X ∈ V and scalar (or real number) r a unique scalar multiple rX,

such that for each X, Y, Z ∈ V and scalars r, s:

(i) (Closed under addition) X + Y is a well-defined element of V.

(ii) (Commutative property of addition) X + Y = Y + X

(iii) (Associative property of addition) X + (Y + Z) = (X + Y) + Z

(iv) (Additive identity) There exists an element 0 in V such that X + 0 = X for every X ∈ V.

(v) (Additive inverse) For each X ∈ V there exists an element −X ∈ V such that X + (−X) = 0.

(vi) (Closed under scalar multiplication) rX is a well-defined element of V.

(vii) (Associative property of scalar multiplication) r(sX) = (rs)X

(viii) (Distributive property) r(X + Y) = (rX) + (rY)

(ix) (Distributive property) (r + s)X = (rX) + (sX)

(x) (Scalar multiplicative identity) 1 X = X

Remark 1. “X ∈ V” means that X is an element of the set V.
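
As a loose programming analogy (my own illustration, not part of the lecture), the two operations in Definition 1 can be thought of as an abstract interface; here is a minimal sketch in Python with made-up class and method names:

```python
# A rough sketch: the two vector space operations as an abstract interface.
# The names Vector, add, and scale are illustrative only; any concrete class
# implementing these methods in a way that satisfies properties (i)-(x)
# models a vector space over the real numbers.
from abc import ABC, abstractmethod

class Vector(ABC):
    @abstractmethod
    def add(self, other: "Vector") -> "Vector":
        """Return the sum X + Y (properties (i)-(v))."""

    @abstractmethod
    def scale(self, r: float) -> "Vector":
        """Return the scalar multiple r X (properties (vi)-(x))."""
```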

A good way to “memorize” all 10 properties is to think of them in groups. Properties (i)–(v) are properties of vector addition, and are a lot like the properties of addition for real numbers. Properties (vi), (vii), and (x) are properties of scalar multiplication. Properties (viii) and (ix) are distributive properties for vector addition and scalar multiplication.

Example 1. The n × 1 column vectors Rn and the m × n matrices M(m, n) are vector spaces. In particular, the m × n matrices M(m, n) satisfy properties (i)–(x) by the definition of addition and scalar multiplication and the algebraic properties of the real numbers. For instance, if X = [xij], Y = [yij], and Z = [zij] are m × n matrices, then property (iii) holds true since

X + (Y + Z) = [xij] + ([yij] + [zij]) = [xij] + [yij + zij] = [xij + (yij + zij)]

= [(xij + yij) + zij] = [xij + yij] + [zij] = ([xij] + [yij]) + [zij] = (X + Y ) + Z.

0 as in property (iv) is the m × n zero matrix, for which each entry is zero. For example, the 2 × 3 zero matrix is

[ 0 0 0 ]
[ 0 0 0 ]
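
As a quick sanity check (not part of the lecture notes, and of course not a proof), one can spot-check properties such as (iii) and (iv) for M(2, 3) numerically with NumPy:

```python
# Spot-check associativity of addition and the additive identity for a few
# random 2x3 matrices.  This only tests particular matrices; the actual
# justification is the entrywise computation above.
import numpy as np

rng = np.random.default_rng(0)
X, Y, Z = rng.standard_normal((3, 2, 3))        # three random 2x3 matrices

print(np.allclose(X + (Y + Z), (X + Y) + Z))    # property (iii): True
print(np.array_equal(X + np.zeros((2, 3)), X))  # property (iv): True
```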

Example 2. F(R) denotes the space of all real-valued functions f(x) which are defined for each x ∈ R. Given f, g ∈ F(R) and a scalar (or constant real number) c, we define addition and scalar multiplication by adding and multiplying the values of the functions at each point:

(f + g)(x) = f(x) + g(x) and (cf)(x) = c f(x) for each x ∈ R.

We define the zero function 0 to be the function which takes the value zero, 0(x) = 0, at each x ∈ R. Using these definitions of addition and scalar multiplication and the algebraic properties of the real numbers, one can check that the vector space properties are satisfied by V = F(R).
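
To make the operations on F(R) concrete, here is a rough sketch in Python (the helper names add_fn, scale_fn, and zero_fn are my own, not notation from the lecture):

```python
# Pointwise addition and scalar multiplication of real-valued functions,
# mirroring (f + g)(x) = f(x) + g(x) and (c f)(x) = c f(x).
import math

def add_fn(f, g):
    return lambda x: f(x) + g(x)

def scale_fn(c, f):
    return lambda x: c * f(x)

zero_fn = lambda x: 0.0                  # the zero function 0(x) = 0

h = add_fn(math.sin, lambda x: x ** 2)   # h(x) = sin(x) + x^2
print(h(1.0) == math.sin(1.0) + 1.0)     # True
print(scale_fn(3.0, math.cos)(0.0))      # 3 * cos(0) = 3.0
```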

Vector spaces are an abstract mathematical structure. The spaces of column vectors Rn, matrices M(m, n), and functions F(R) are all examples of vector spaces. When stating definitions and theorems, we will often say “let V be a vector space”. Then everything we say about vector spaces will apply to column vectors Rn, matrices M(m, n), and functions F(R). One reason we care about vector spaces is that when we prove algebraic facts using properties (i)–(x), they will apply not only to column vectors and matrices but also to real-valued functions and many other mathematical objects. This allows us to build up mathematical theory that applies in a wide variety of contexts. To illustrate this, let’s prove the following theorem:

Theorem 1. For each m × n matrix A, 0A = 0.

Proof 1. Let A = [aij] be an m × n matrix. Then

0A = 0 [aij] = [0 · aij] = [0] = 0

(where [0] means the m × n matrix with all zero entries, i.e. the zero matrix 0).

Proof 2. We have

0A + 0A = (0 + 0) A   (by vector space property (ix))
        = 0A.

Let −0A denote the additive inverse of 0A as in vector space property (v). By adding −0A to both sides we can cancel 0A:

(−0A) + (0A + 0A) = (−0A) + 0A
((−0A) + 0A) + 0A = (−0A) + 0A   (by vector space property (iii))
           0 + 0A = 0            (by vector space property (v))
               0A = 0            (by vector space property (iv)).
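
As a quick numerical illustration of Theorem 1 (a spot check on one matrix, not a substitute for either proof):

```python
# Multiplying a matrix by the scalar 0 gives the zero matrix.
import numpy as np

A = np.array([[1.0, -2.0, 3.0],
              [4.0,  0.5, -6.0]])
print(0 * A)                                    # the 2x3 zero matrix
print(np.array_equal(0 * A, np.zeros_like(A)))  # True
```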

Proof 1 is very simple but is specific to matrices. Proof 2 uses the vector space properties (i)–(x) and in fact shows that 0A = 0 in any vector space, not just M(m, n). For instance, Proof 2 also shows that 0 f = 0 for each real-valued function f ∈ F(R). Since Proof 2 proves 0A = 0 in such generality, Proof 2 is preferable. Of course, 0A = 0 is a very simple algebraic property. Nonetheless, this illustrates a very important principle: anything we prove about matrices using only the vector space properties (i)–(x) will be true in any context in which these properties hold true. This principle will allow us to apply linear algebra in a variety of contexts, including solving systems of linear equations and solving differential equations. Naturally, when mathematicians notice that certain properties hold true in a wide variety of contexts, they give the object a name and develop a general theory of it. In this case, we defined the concept of a vector space.

One important thing we can do with vectors is take linear combinations of vectors.

Definition 2. Let V be a vector space. Let S = {X1,X2,...,Xk} be a finite set of vectors in V and let c1, c2, . . . , ck be scalars. The vector

Y = c1X1 + c2X2 + ··· + ckXk

is the linear combination of X1, X2, . . . , Xk with weights c1, c2, . . . , ck.

Example 3. We have

[ 6 3 ]       [ 1 0 ]       [ 1 1 ]     [ 0 1 ]
[ 0 1 ]  =  2 [ 0 2 ]  +  4 [ 0 0 ]  −  [ 0 3 ]

where the matrix on the left-hand side is a linear combination of the three matrices on the right-hand side with weights 2, 4, and −1.
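
Here is a quick numerical check of Example 3 with NumPy (not part of the original notes):

```python
# Form the linear combination with weights 2, 4, -1 and confirm that it
# equals the matrix on the left-hand side.
import numpy as np

A1 = np.array([[1, 0], [0, 2]])
A2 = np.array([[1, 1], [0, 0]])
A3 = np.array([[0, 1], [0, 3]])

Y = 2 * A1 + 4 * A2 - A3
print(Y)                                              # [[6 3], [0 1]]
print(np.array_equal(Y, np.array([[6, 3], [0, 1]])))  # True
```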

Example 4. Suppose that X, Y ∈ R2 are a pair of nonzero, non-parallel vectors. We can visualize each linear combination of X, Y as lying on a grid:

[Figure: the grid generated by X and Y in the x1x2-plane, with the point 3X + 2Y marked.]

We can visualize the set of all linear combinations of X,Y as a grid:

[Figure: the grid of all linear combinations of X and Y in the x1x2-plane, with grid lines through the integer multiples of X and Y.]

The lines of the grid represent linear combinations of X, Y where one of the weights is an integer. We can visualize X and Y as directions to travel in the plane R2. We can travel to each point in the plane by traveling in the X direction and then traveling in the Y direction, in exactly one way. If I regard the plane as the x1x2-plane in R3, then any point lying off the plane (i.e. lying off the page) would not be a linear combination of X, Y. I could also add a third vector Z = X + Y to the grid, in which case there is more than one way to travel in the X, Y, Z directions to reach a point in the plane; for instance, the vector Z is both X + Y and Z, and the vector 2X + Y is both 2X + Y and X + Z.
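
For readers who want to reproduce the grid picture, here is a rough plotting sketch (the specific vectors X and Y are made up for illustration, and matplotlib is assumed to be available):

```python
# Plot the points c*X + d*Y for integer weights c, d, together with the
# grid lines obtained by fixing one integer weight and varying the other.
import numpy as np
import matplotlib.pyplot as plt

X = np.array([2.0, 1.0])
Y = np.array([1.0, 2.0])

for c in range(-3, 4):
    for d in range(-3, 4):
        P = c * X + d * Y
        plt.plot(P[0], P[1], "ko", markersize=3)

t = np.linspace(-3.0, 3.0, 2)
for k in range(-3, 4):
    row = np.outer(t, X) + k * Y     # weight on Y fixed at the integer k
    col = k * X + np.outer(t, Y)     # weight on X fixed at the integer k
    plt.plot(row[:, 0], row[:, 1], "b-", linewidth=0.5)
    plt.plot(col[:, 0], col[:, 1], "b-", linewidth=0.5)

plt.axis("equal")
plt.xlabel("x1")
plt.ylabel("x2")
plt.show()
```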

Main questions about linear combinations. Let V be a vector space. Given X1,X2,...,Xk ∈ V and Y ∈ V, consider the equation

c1X1 + c2X2 + ··· + ckXk = Y. (?)

(i) Existence: Can we express Y as a linear combination as in (?) for some scalars c1, c2, . . . , ck?

(ii) Uniqueness: Is there a unique choice of scalars c1, c2, . . . , ck for which (?) holds true?

The first question leads to the concept of span, and the second to the concept of linear independence. First let’s look at the span.
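
For column vectors, both questions can be tested numerically. The sketch below (my own illustration, with made-up vectors in R3) checks existence by solving the least-squares problem for (?) and uniqueness by checking the rank of the matrix whose columns are X1, . . . , Xk:

```python
# Existence: does some choice of weights satisfy (?) exactly?
# Uniqueness: are the weights unique?  (Equivalent to full column rank.)
import numpy as np

X1 = np.array([1.0, 0.0, 1.0])
X2 = np.array([0.0, 1.0, 1.0])
Y  = np.array([2.0, 3.0, 5.0])

A = np.column_stack([X1, X2])            # columns are X1, X2
c, _, rank, _ = np.linalg.lstsq(A, Y, rcond=None)

exists = np.allclose(A @ c, Y)           # question (i)
unique = rank == A.shape[1]              # question (ii)
print(c, exists, unique)                 # [2. 3.] True True
```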

Definition 3. Let V be a vector space and let S = {X1,X2,...,Xk} be a finite set of vectors in V. The span of S is the set of all linear combinations

Y = c1X1 + c2X2 + ··· + ckXk

of X1, X2, . . . , Xk, where c1, c2, . . . , ck is any possible choice of scalars. We denote the span of S by Span S.

Remark 2. Notice that definitions often take the following format: a new term is a known type of object satisfying certain properties. In this case, the span of S is a set, namely the set which consists of all linear combinations of X1, X2, . . . , Xk.

Example 3 continued. We have that

[ 6 3 ]          [ 1 0 ]   [ 1 1 ]   [ 0 1 ]
[ 0 1 ]  ∈  Span { [ 0 2 ] , [ 0 0 ] , [ 0 3 ] }

since we showed above that

[ 6 3 ]       [ 1 0 ]       [ 1 1 ]     [ 0 1 ]
[ 0 1 ]  =  2 [ 0 2 ]  +  4 [ 0 0 ]  −  [ 0 3 ]

One might correctly infer that the overall span is the set of all upper triangular matrices with (2, 1)-entry zero:

       [ 1 0 ]   [ 1 1 ]   [ 0 1 ]       [ a b ]
Span { [ 0 2 ] , [ 0 0 ] , [ 0 3 ] }  =  { [ 0 c ] : a, b, c ∈ R }

How to carefully verify this is a good question for another time.
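
One rough way to check span membership for matrices computationally (my own illustration, resting on the observation that flattening each matrix into a column vector preserves linear combinations) is:

```python
# Flatten each 2x2 matrix into a length-4 vector and solve for the weights;
# an exact solution means the matrix on the left is in the span.
import numpy as np

B1 = np.array([[1.0, 0.0], [0.0, 2.0]])
B2 = np.array([[1.0, 1.0], [0.0, 0.0]])
B3 = np.array([[0.0, 1.0], [0.0, 3.0]])
Y  = np.array([[6.0, 3.0], [0.0, 1.0]])

A = np.column_stack([B.flatten() for B in (B1, B2, B3)])
c, _, _, _ = np.linalg.lstsq(A, Y.flatten(), rcond=None)

print(np.round(c, 6))                    # [ 2.  4. -1.]
print(np.allclose(A @ c, Y.flatten()))   # True, so Y is in the span
```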

Remark 3. For any vector space V and any vectors X1,X2,...,Xk ∈ V, the zero vector 0 is always in Span{X1,X2,...,Xk} since

0 = 0 X1 + 0 X2 + ··· + 0 Xk.

Example 5. The span of a single nonzero vector X in R2 is the line passing through 0 and X:

[Figure: the line through 0 and X in the x1x2-plane.]

Example 6. What is the span of two nonzero vectors X, Y in R2?

Example 7. What is the span of two vectors X, Y in R3 which are nonzero and are not parallel?
