1.3.3 Basis sets and Gram-Schmidt orthogonalization

Before we address the question of existence and uniqueness, we must establish one more tool for working with vectors – basis sets.

v1  v  Let v ∈RN, with v =  2  (1.3.3-1) :    v3 

We can obviously define the set of N unit vectors

1 0 0       0 1 0 e[1] = 0 e[2] = 0 … e[N] = 0 (1.3.3-2)       :  :  :  0 0 1

so that we can write v as

v = v1 e[1] + v2 e[2] + … + vN e[N] (1.3.3-3)

As any v ∈ RN can be written in this manner, the set of vectors {e[1], e[2], …, e[N]} is said to form a basis for RN.

The same function can be performed by any set of mutually orthogonal vectors, i.e. a set of vectors {U[1], U[2], …, U[N]} such that

U[j] • U[k] = 0 if j ≠ k (1.3.3-4)

This means that each U[j] is orthogonal to all of the other vectors. We can then write any v ∈ RN as

v = v1′ U[1] + v2′ U[2] + … + vN′ U[N] (1.3.3-5)

where we use a prime to denote that, in general, vj′ ≠ vj when comparing the expansions (1.3.3-3) and (1.3.3-5). (Figure: three mutually orthogonal vectors U[1], U[2], U[3], each pair meeting at 90°.)

Orthogonal basis sets are very easy to use since the coefficients of a vector v∈RN in the expansion are easily determined.

We take the dot product of (1.3.3-5) with any basis vector U[k], k∈[1,N],

v • U[k] = v1′ (U[1] • U[k]) + … + vk′ (U[k] • U[k]) + … + vN′ (U[N] • U[k]) (1.3.3-6)

Because

U[j] • U[k] = (U[k] • U[k]) δjk = ||U[k]||² δjk (1.3.3-7)

with

δjk = { 1, j = k; 0, j ≠ k } (1.3.3-8)

then (1.3.3-6) becomes

v • U[k] = vk′ ||U[k]||²  ⇒  vk′ = (v • U[k]) / ||U[k]||² (1.3.3-9)

In the special case that all basis vectors are normalized, i.e. ||U[k]|| = 1 for all k ∈ [1, N], we have an orthonormal set, and the coefficients of v ∈ RN are simply the dot products with each basis set vector.
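A minimal numerical sketch of this coefficient-extraction rule, using NumPy and the standard orthonormal basis of R3 (the test vector here is an arbitrary choice):

```python
import numpy as np

# Arbitrary test vector in R^3 (hypothetical values).
v = np.array([4.0, -2.0, 7.0])

# Standard orthonormal basis: the rows of the identity are e[1], e[2], e[3].
E = np.eye(3)

# For an orthonormal set, (1.3.3-9) reduces to plain dot products.
coeffs = np.array([v @ E[k] for k in range(3)])
print(coeffs)  # [ 4. -2.  7.] -- the coefficients recover v itself
```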

Example 1.3.3-1

Consider the orthogonal basis for R3

1 1  0 [1]   [2]   [3]   U = 1 U = −1 U = 0 (1.3.3-10) 0 0  1

v1  for any v∈R3, v = v  what are the coefficients of the expansion  2    v3 

v = v1′ U[1] + v2′ U[2] + v3′ U[3] (1.3.3-11)

First, we check the basis set for orthogonality:

U[1] • U[2] = [1 1 0] • [1, −1, 0]^T = (1)(1) + (1)(−1) + (0)(0) = 0

U[1] • U[3] = [1 1 0] • [0, 0, 1]^T = (1)(0) + (1)(0) + (0)(1) = 0 (1.3.3-12)

U[2] • U[3] = [1 −1 0] • [0, 0, 1]^T = (1)(0) + (−1)(0) + (0)(1) = 0

We also have 1 [1] 2   U = [1 1 0] 1 = 2 0 1  [2] 2   U = [1 -1 0] −1 = 2 0  0 [3] 2   U = [0 0 1] 0 = 1 1 (1.3.3-13)

v1  So the coefficients of v = v  are  2    v3  1 v • U[1] 1 1 v' = = [v v v ] 1 = (v + v ) 1 2 1 2 3   1 2 U[1] 2 2 0 1  v • U[2] 1 1 v' = = [v v v ] −1 = (v - v ) 2 2 1 2 3   1 2 U[2] 2 2 0  0 v • U[3] 1 v' = = [v v v ] 0 = v 3 2 1 2 3   3 U[3] 1 1 Although orthogonal basis sets are very convenient to use, a set of N vectors B = {b[1], b[2], …, b[N]} need not be mutually orthogonal to be used as a basis – they need merely be linearly independent.

Let us consider a set of M≤ N vectors b[1], b[2], …, b[M] ∈RN. This set of M vectors is said to be linearly independent if

c1 b[1] + c2 b[2] + … + cM b[M] = 0 implies c1 = c2 = … = cM = 0 (1.3.3-16)

This means that no b[j], j ∈ [1, M], can be written as a linear combination of the other M − 1 vectors.

For example, the set of 3 vectors for R3

2 1 1  [1]   [2]   [3]   b = 0 b = 1 b = −1 (1.3.3-17) 0 0 0 

is not linearly independent because we can write b[3] as a linear combination of b[1] and b[2],

2 1 1  [1] - [2]       [3] b b = 0 - 1 = −1 = b (1.3.3-18) 0 0 0 

Here, a vector v ∈RN is said to be a linear combination of the vectors b[1], …, b[M] ∈RN if it can be written as

v = v1′ b[1] + v2′ b[2] + … + vM′ b[M] (1.3.3-19)

We see that the 3 vectors of (1.3.3-17) do not form a basis for R3, since we cannot express any vector v ∈ R3 with v3 ≠ 0 as a linear combination of {b[1], b[2], b[3]}:

' ' ' 2 1 1  2v1 − v 2 + v3    v = v ' 0 + v ' 1 + v ' −1 = v' − v' (1.3.3-20) 1   2   3    2 3  0 0 0  0         

We see however that if we instead had the set of 3 linearly independent vectors

2 1 0 [1]   [2]   [3]   b = 0 b = 1 b = 0 (1.3.3-21) 0 0 2 then we could write any v∈R3 as

' ' v  2 1 0 2v1 + v 2  1   v = v  = v ' 0 + v ' 1 + v ' 0 = v' (1.3.3-22)  2  1   2   3    2  v  0 0 2  '   3        2v3 

(1.3.3-22) defines a set of 3 simultaneous linear equations

2v1′ + v2′ = v1
v2′ = v2
2v3′ = v3 (1.3.3-23)

or, in matrix form,

| 2  1  0 | | v1′ |   | v1 |
| 0  1  0 | | v2′ | = | v2 |
| 0  0  2 | | v3′ |   | v3 |

that we must solve for v1′, v2′, v3′,

v1′ = (v1 − v2)/2,   v2′ = v2,   v3′ = v3/2 (1.3.3-24)
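The same small system can be solved numerically; a NumPy sketch with a hypothetical right-hand side v = [5, 3, 8]^T:

```python
import numpy as np

# Coefficient matrix of (1.3.3-23); columns are the basis vectors of (1.3.3-21).
B = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])

v = np.array([5.0, 3.0, 8.0])  # hypothetical right-hand side

vp = np.linalg.solve(B, v)     # expansion coefficients v1', v2', v3'
print(vp)                      # [(v1 - v2)/2, v2, v3/2] = [1. 3. 4.]
```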

We therefore make the following statement:

Any set B of N linearly independent vectors b[1], b[2], …, b[N] ∈RN can be used as a basis for RN.

We can pick any subset of M vectors from the linearly independent basis B, and define the span of this subset {b[1], b[2], …, b[M]} ⊂ B as the space of all possible vectors v ∈ RN that can be written as

v = c1 b[1] + c2 b[2] + … + cM b[M] (1.3.3-25)

2 0 [1]   [3]   For the basis set (1.3.3-21), we choose b = 0 and b = 0 .(1.3.3-26) 0 2 Then, span { b[1] , b[3] } is the set of all vectors v ∈R3 that can be written as

v1  2 0 2c1  v = v  = c b[1] + c b[3] = c 0 + c 0 = 0  (1.3.3-27)  2  1 3 1   3         v3  0 2 2c3 

Therefore, for this case it is easy to see that v ∈ span { b[1] , b[3] }, if and only if (“iff”) v2 = 0.
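Membership in a span can also be tested numerically via least squares; a NumPy sketch (the helper name in_span is an illustrative choice, not a library routine):

```python
import numpy as np

# Columns are b[1] and b[3] of (1.3.3-26), so the column span is span{b[1], b[3]}.
S = np.array([[2.0, 0.0],
              [0.0, 0.0],
              [0.0, 2.0]])

def in_span(S, v, tol=1e-12):
    """v lies in the column span of S iff the least-squares residual vanishes."""
    c, *_ = np.linalg.lstsq(S, v, rcond=None)
    return np.linalg.norm(S @ c - v) < tol

print(in_span(S, np.array([4.0, 0.0, 6.0])))  # True:  v2 = 0
print(in_span(S, np.array([4.0, 1.0, 6.0])))  # False: v2 != 0
```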

Note that if v∈span{ b[1] , b[3] } and w∈ span{ b[1] , b[3] }, then automatically v + w ∈span { b[1] , b[3] }.

We see then that span{ b[1] , b[3] } itself satisfies all the properties of a vector space identified in section 1.3.1.

Since span{ b[1] , b[3] } ⊂ R3 (i.e. it is a subset of R3), we call span{ b[1] , b[3] } a subspace of R3.

This concept of basis sets also lets us formally identify the meaning of dimension – this will be useful in the establishment of criteria for existence/uniqueness of solutions.

Let us consider a vector space V that satisfies all the properties of a vector space identified in section 1.3.1. We say that the dimension of V is N if every set of N+1 vectors v[1], v[2], …, v[N+1] ∈ V is linearly dependent and if there exists some set of N linearly independent vectors b[1], …, b[N] ∈ V that forms a basis for V. We say then that dim(V) = N. (1.3.3-28)

While general (non-orthogonal) linearly independent basis sets are completely valid, they are more difficult to use than orthogonal basis sets, because one must solve a set of N linear algebraic equations to find the coefficients of the expansion

v = v1′ b[1] + v2′ b[2] + … + vN′ b[N] (1.3.3-29)

| b1[1]  b1[2]  …  b1[N] | | v1′ |   | v1 |
| b2[1]  b2[2]  …  b2[N] | | v2′ |   | v2 |
|  :      :         :    | |  :  | = |  : |     O(N³) effort to solve for all vj′ (1.3.3-30)
| bN[1]  bN[2]  …  bN[N] | | vN′ |   | vN |

This requires more effort than for an orthogonal basis {U[1], …, U[N]}, as

vj′ = (v • U[j]) / (U[j] • U[j])     O(N²) effort to find all vj′ (1.3.3-9, repeated)

This provides an impetus to perform Gram-Schmidt orthogonalization. We start with a linearly independent basis set {b[1], b[2], …, b[N]} for RN. From this set, we construct an orthogonal basis set {U[1], U[2], …, U[N]} through the following procedure:

1. First, set U[1] = b[1] (1.3.3-31)

2. Next, we construct U[2] such that U[2] • U[1] = 0 . Since U[1] = b[1], and b[2] and b[1] are linearly independent, we can form an orthogonal vector U[2] from b[2] by the following procedure:

(Figure: b[2] resolved into a component cU[1] along b[1] = U[1] and an orthogonal remainder U[2]; we "subtract" the parallel part from b[2].)

We write U[2] = b[2] + cU[1] (1.3.3-32)

Then, taking the dot product with U[1],

U[2] • U[1] = 0 = b[2] • U[1] + cU[1] • U[1] (1.3.3-33)

Therefore

c = − (b[2] • U[1]) / ||U[1]||² (1.3.3-34)

And our 2nd vector in the orthogonal basis is

U[2] = b[2] − [ (b[2] • U[1]) / ||U[1]||² ] U[1] (1.3.3-35)

3. We now form U[3] in a similar manner. Since U[2] is a linear combination of b[1] and b[2], we can add a component in the b[3] direction to form U[3],

U[3] = b[3] + c2 U[2] + c1 U[1] (1.3.3-36)

First, we want

U[3] • U[1] = 0 = b[3] • U[1] + c2 (U[2] • U[1]) + c1 (U[1] • U[1]) (1.3.3-37)

Since U[2] • U[1] = 0,

c1 = − (b[3] • U[1]) / ||U[1]||² (1.3.3-38)

A similar condition that U[3] • U[2] = 0 yields

c2 = − (b[3] • U[2]) / ||U[2]||² (1.3.3-39)

so that the 3rd member of the orthogonal basis set is

U[3] = b[3] − [ (b[3] • U[2]) / ||U[2]||² ] U[2] − [ (b[3] • U[1]) / ||U[1]||² ] U[1] (1.3.3-40)

4. Continue for U[j], j = 4, 5, …, N where

  j-1 b[j] • U[k]  U[j] = b[j] -  U k (1.3.3-41) ∑ 2 k=1  [k]   U   

5. Normalize the vectors if desired (we can also do this during construction of the orthogonal basis set),

U[j] ← U[j] / ||U[j]|| (1.3.3-42)

As an example, let us use this method to generate an orthogonal basis for R3 such that the 1st member of the basis set is

1 [1]   U = 1 (1.3.3-43) 0

First, we write a linearly independent basis that is not, in general, orthogonal. For example, we could choose

1 1 0 [1]   [2]   [3]   b = 1 b = 0 b = 0 (1.3.3-44) 0 0 1

We now perform Gram-Schmidt orthogonalization,

1 [1] [1]   1. U = b = 1 (1.3.3-45) 0 2. We next set   b[2] • U[1]  U[2] = b[2] -  U[1] (1.3.3-35, repeated)  [1] 2   U   

1 2 [1]   U = [1 1 0] 1 = 2 (1.3.3-46) 0

1 [2] [1]   b • U = [1 0 0] 1 =1 (1.3.3-47) 0 so

1  2  1 1   1 1 U[2] = 0 - 1 = −  (1.3.3-48)   2    2 0 0   0    1 [2] [1]   Note U • U = [1/2 -1/2 0] 1 = ½ - ½ = 0 (1.3.3-49) 0

3. We now calculate

U[3] = b[3] − [ (b[3] • U[2]) / ||U[2]||² ] U[2] − [ (b[3] • U[1]) / ||U[1]||² ] U[1] (1.3.3-40, repeated)

||U[2]||² = [1/2 −1/2 0] • [1/2, −1/2, 0]^T = 1/4 + 1/4 = 1/2 (1.3.3-50)

b[3] • U[2] = [0 0 1] • [1/2, −1/2, 0]^T = 0 (1.3.3-51)

b[3] • U[1] = [0 0 1] • [1, 1, 0]^T = 0 (1.3.3-52)

0 [3] [3]   We therefore have merely U = b = 0 (1.3.3-53) 1

Our orthogonal basis set is therefore

1  2  1   0 1 U[1] = 1 U[2] = −  U[3] = 0 (1.3.3-54)    2   0   1 0   