
201-NYC-05-E (Enriched I) Lecture Notes

Tristan Martin

Fall 2017

Enriched 201-NYC-05 (Enriched Linear Algebra I), instructed by Matthew Egan, Yariv Barsheshat, Christopher Turner, and Dominic Lemelin. These notes probably contain many typos.

Contents

1 Introduction to Linear Systems and How to Solve Them (Egan)
  1.1 Linear equations and linear systems
  1.2 Introduction to augmented matrices

2 Applications of Manipulations that Leave the Solution Set Unchanged (Egan)

3 More on Row Reduction (Egan)
  3.1 REF and RREF
  3.2 Row vectors, column vectors and linear dependence
  3.3 Matrix rank and linear systems

4 Applications of Row Reduction and Introduction to Homogeneous Systems (Egan)
  4.1 Some applications
    4.1.1 Network flow
    4.1.2 Balancing chemical reactions
  4.2 Homogeneous systems and solutions to linear systems

5 Introduction to Geometric Vectors (Lemelin)
  5.1 Vectors: the geometric case

6 Notation (Turner)

7 Dot Product, Projections, Orthogonals and the Cross Product (Barsheshat)
  7.1 Dot product
  7.2 Projection of a vector onto another vector
  7.3 Orthogonal projection
  7.4 Cross product

8 Lines and Planes (Barsheshat)
  8.1 Lines in R^2
  8.2 Lines in R^3
  8.3 Parallel lines
  8.4 Planes in R^3
    8.4.1 Parallel planes

9 Span and Linear Combination, Dependence and Independence (Barsheshat)

10 Span and Bases (Barsheshat)
  10.1 More on span and linear combination, dependence and independence
  10.2 Basis for R^m

11 Subspaces and Span (Barsheshat)

12 Basis and Dimension for a Subspace (Barsheshat)
  12.1 Basis for a subspace
  12.2 Dimension of a subspace

13 Exercises on Span (Barsheshat)

14 Subspaces of Matrices (Barsheshat)

15 Matrix Operations (Barsheshat)
  15.1 Matrix addition and multiplication
  15.2

16 Column and Row Representations of Matrix Multiplication and the Properties of the Operation (Barsheshat)
  16.1 Partitioning matrices
  16.2 Column representation of matrix multiplication
  16.3 Row representation of matrix multiplication
  16.4 Column-row representation
  16.5 Algebraic properties of matrix addition and scalar multiplication
  16.6 Matrix exponentiation
  16.7 of matrix
  16.8 Properties of matrix multiplication

17 Linear Systems With Matrices (Barsheshat)

18 Matrix Inverses (Barsheshat)
  18.1 Using row-reduction to find inverses

19 Gauss-Jordan Algorithm for Finding Inverses (Barsheshat)

20 More on Matrix Inverses

21 Determinants (Barsheshat)
  21.1 Determinants for 3 × 3 matrices

22 More on Determinants (Barsheshat)
  22.1 Generalized cofactor expansions

23 Even More On Determinants (Barsheshat)
  23.1 Triangular matrices
  23.2 Determinants of elementary matrices
  23.3 Cramer's Rule

24 Complex Numbers (Barsheshat)
  24.1 Operations on complex numbers
  24.2 Polar form of complex numbers

25 More on Complex Numbers (Barsheshat)
  25.1 De Moivre's formula
  25.2 Basic polynomial of complex numbers

26 Integrative Activity: Electrical Circuits (Barsheshat)

27 Enriched Material: Eigenvalues and Eigenvectors (Barsheshat)
  27.1 Finding eigenvalues of A

§1 Introduction to Linear Systems and How to Solve Them (Egan)

§1.1 Linear equations and linear systems

The main objects of study in this course are linear equations, which have the form
$$a_1x_1 + a_2x_2 + a_3x_3 + \cdots + a_nx_n = b, \tag{1}$$
where $a_1, a_2, a_3, \cdots, a_n, b \in \mathbb{R}$ and $x_1, \cdots, x_n$ are unknowns. A solution of a linear equation is a set of values $\{s_1, \cdots, s_n\}$ that makes the equation true when we replace $\{x_1, \cdots, x_n\}$ with $\{s_1, \cdots, s_n\}$.

Definition 1.1. A linear system is a collection of m linear equations in n unknowns.

It thereby has the form
$$\begin{cases}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n = b_1\\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n = b_2\\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \cdots + a_{3n}x_n = b_3\\
\quad\vdots\\
a_{m1}x_1 + a_{m2}x_2 + a_{m3}x_3 + \cdots + a_{mn}x_n = b_m
\end{cases} \tag{2}$$

Definition 1.2. A solution to a linear system is a set of values {s1, ··· , sn} that is a solution for all equations in the system.

Example 1.3. Consider the 2 × 2 system
$$\begin{cases} x_1 - 2x_2 = -1\\ -x_1 + 3x_2 = 3 \end{cases} \tag{3}$$
It has the unique solution (3, 2), and the system is therefore consistent.

Example 1.4. Consider the 2 × 2 system
$$\begin{cases} x_1 - 2x_2 = -1\\ -x_1 + 2x_2 = 3 \end{cases} \tag{4}$$
It has no solution (i.e., the system is inconsistent).

Example 1.5. Consider the 2 × 2 system
$$\begin{cases} x_1 - 2x_2 = -1\\ -x_1 + 2x_2 = 1 \end{cases} \tag{5}$$
It has infinitely many solutions and is hence a consistent system.


Theorem 1.6. Every linear system of equations has either 0, 1, or infinitely many solutions.

Furthermore, to describe the complete solution set of Example 1.5, we use parameters to write out a general solution. If we define x2 = t ∈ R, we can write x1 as

$$x_1 = 2x_2 - 1 \tag{6}$$
$$\phantom{x_1} = 2t - 1. \tag{7}$$

These are called the parametric equations of a line. The general solution is written as

$$\begin{cases} x_1 = 2t - 1\\ x_2 = t \end{cases} \tag{8}$$

Choose a value for t to obtain a particular solution. For instance t = 0 gives solution (−1, 0) and t = 1 gives (1, 1).

§1.2 Introduction to augmented matrices

Now, consider larger linear systems like the following 3 × 3 system:
$$\begin{cases} x_1 - 2x_2 + x_3 = 0\\ 2x_2 - 8x_3 = 8\\ -4x_1 + 5x_2 + 9x_3 = -9 \end{cases} \tag{9}$$

To solve such a system, we first strip away all but the coefficients and the constants to obtain the augmented matrix of the system:

 1 −2 1 0   0 2 −8 8  . (10) −4 5 9 −9

Remark 1.7. What manipulations of the system will leave the solution set unchanged? We can:

• interchange any two rows (equations)

• multiply a row (equation) by a non-zero constant

• replace a row (equation) by itself plus a multiple of another row (equation)
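These three manipulations can be made concrete in code. Below is a minimal sketch (the helper names and 0-indexed rows are my own, not from the notes; exact arithmetic via Fraction) applied to the augmented matrix (10):

```python
from fractions import Fraction

def swap(M, i, j):
    """Interchange rows i and j (0-indexed)."""
    M = [row[:] for row in M]
    M[i], M[j] = M[j], M[i]
    return M

def scale(M, i, k):
    """Multiply row i by a non-zero constant k."""
    assert k != 0, "scaling constant must be non-zero"
    M = [row[:] for row in M]
    M[i] = [k * x for x in M[i]]
    return M

def add_multiple(M, i, j, k):
    """Replace row i by itself plus k times row j."""
    M = [row[:] for row in M]
    M[i] = [a + k * b for a, b in zip(M[i], M[j])]
    return M

A = [[Fraction(v) for v in row] for row in
     [[1, -2, 1, 0], [0, 2, -8, 8], [-4, 5, 9, -9]]]

# The operation R3 + 4*R1 (rows R3, R1 become indices 2, 0):
step = add_multiple(A, 2, 0, 4)
print([int(x) for x in step[2]])  # [0, -3, 13, -9]
```

Note that each helper returns a fresh matrix, mirroring the idea that the operations are applied one after the other, each producing a new row-equivalent system.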

§2 Applications of Manipulations that Leave the Solution Set Unchanged (Egan)

Continuing our example from lecture 1, we will use these properties as a means to solve the system. We will soon learn there is an algorithm (i.e., Gauss-Jordan elimination) that allows us to apply these transformations to any augmented matrix to obtain a solution set to the original system of equations. For now, we will gloss over the definition of row echelon form (REF) and reduced row echelon form (RREF), but from the upcoming examples, it will be pretty clear.

Now, back to our augmented matrix from lecture 1. For now, just focus on understanding how the discussed manipulations are applied (noting that Ri indicates the ith row); the desired end result will be studied later.

 1 −2 1 0  1 −2 1 0  R3+4R1  0 2 −8 8  −−−−−→ 0 2 −8 8  (11) −4 5 9 −9 0 −3 13 −9     1 −2 1 0 1 1 −2 1 0 2 R2 0 2 −8 8  −−−→ 0 1 −4 4  (12) 0 −3 13 −9 0 −3 13 −9 1 −2 1 0  1 −2 1 0 R3−3R2 0 1 −4 4  −−−−−→ 0 1 −4 4 (13) 0 −3 13 −9 0 0 1 3 1 −2 1 0 1 −2 0 −3 R2+4R3 0 1 −4 4 −−−−−→ 0 1 0 16  (14) 0 0 1 3 R1−R3 0 0 1 3 1 −2 0 −3 1 0 0 29 R1+2R2 0 1 0 16  −−−−−→ 0 1 0 16 (15) 0 0 1 3 0 0 1 3

This matrix is in RREF. Note that these manipulations cannot be done simultaneously, but only one after the other. Hence the notation $\xrightarrow[R_1 - R_3]{R_2 + 4R_3}$ indicates that the operation R2 + 4R3 was conducted first, followed by R1 − R3. From this matrix, the system's solution is obvious, whereas with our original augmented matrix it was not. We therefore find:
$$\begin{cases} 1\cdot x_1 + 0\cdot x_2 + 0\cdot x_3 = x_1 = 29\\ 0\cdot x_1 + 1\cdot x_2 + 0\cdot x_3 = x_2 = 16\\ 0\cdot x_1 + 0\cdot x_2 + 1\cdot x_3 = x_3 = 3 \end{cases} \tag{16}$$

Note that the 1s are called pivot entries or leading coefficients. Technically, a pivot entry does not need to be 1, but depending on the textbook you use, some authors are quite pernickety about solely using 1s as pivot entries. Hence, to accommodate as many textbooks as possible, I will only use 1s as pivot entries in these notes, but be aware that it is not required.

We can write our solution in another form using column vectors (objects we will study later on in the course):
$$\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix} = \begin{bmatrix} 29\\ 16\\ 3 \end{bmatrix}. \tag{17}$$
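The whole reduction above can be automated. Here is a sketch of a Gauss-Jordan routine (my own helper, not an algorithm stated in the notes so far; exact arithmetic via Fraction), checked against system (9):

```python
from fractions import Fraction

def rref(M):
    """Return the reduced row echelon form of M using Gauss-Jordan elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols):
        # find a row at or below pivot_row with a non-zero entry in this column
        pr = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if pr is None:
            continue
        M[pivot_row], M[pr] = M[pr], M[pivot_row]
        piv = M[pivot_row][col]
        M[pivot_row] = [x / piv for x in M[pivot_row]]   # make the pivot a 1
        for r in range(rows):                            # clear the rest of the column
            if r != pivot_row and M[r][col] != 0:
                k = M[r][col]
                M[r] = [a - k * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return M

A = [[1, -2, 1, 0], [0, 2, -8, 8], [-4, 5, 9, -9]]
R = rref(A)
print([[int(x) for x in row] for row in R])
# [[1, 0, 0, 29], [0, 1, 0, 16], [0, 0, 1, 3]]
```

The output matches the solution (29, 16, 3) found by hand above.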


Note that we can verify our answer by plugging our solution into the original system. This is quite trivial, however, so I will leave it for you as an exercise.

Exercise 2.1. Find the augmented matrix of the following system and try to row reduce it:
$$\begin{cases} x_2 - 4x_3 = 8\\ 2x_1 - 3x_2 + 2x_3 = 1\\ 5x_1 - 8x_2 + 7x_3 = 1 \end{cases} \tag{18}$$
We therefore have the following augmented matrix, which can be reduced in the following way:

0 1 −4 8 2 −3 2 1 R1↔R2 2 −3 2 1 −−−−−→ 0 1 −4 8 (19) 5 −8 7 1 5 −8 7 1     2 −3 2 1 5 2 −3 2 1 R3− 2 R1 0 1 −4 8 −−−−−→ 0 1 −4 8  (20) 1 3 5 −8 7 1 0 − 2 2 − 2     2 −3 2 1 1 2 −3 2 1 R3+ 2 R2 0 1 −4 8  −−−−−→ 0 1 −4 8  (21) 1 3 5 0 − 2 2 − 2 0 0 0 − 2

We stop here. Notice the last row is equivalent to 0 · x1 + 0 · x2 + 0 · x3 = 5/2. No x1, x2, x3 satisfy this equation. The system is therefore inconsistent.

Exercise 2.2. Find the augmented matrix of the following system and try to row reduce it:
$$\begin{cases} 3x_2 - 6x_3 + 6x_4 + 4x_5 = -5\\ 3x_1 - 7x_2 + 8x_3 - 5x_4 + 8x_5 = 9\\ 3x_1 - 9x_2 + 12x_3 - 9x_4 + 6x_5 = 15 \end{cases} \tag{22}$$
We therefore have the following augmented matrix, which can be reduced in the following way:

0 3 6 6 4 −5 3 −9 12 −9 6 15  R1↔R3 3 −7 8 −5 8 9  −−−−−→ 3 −7 8 −5 8 9  (23) 3 −9 12 −9 6 15 0 3 −6 6 4 −5 3 −9 12 −9 6 15  3 −9 12 −9 6 15  R2−R1 3 −7 8 −5 8 9  −−−−→ 0 2 −4 4 2 −6 (24) 0 3 −6 6 4 −5 0 3 −6 6 4 −5     3 −9 12 −9 6 15 1 1 −3 4 −3 2 5 3 R1 0 2 −4 4 2 −6 −−−→ 0 1 −2 2 1 −3 (25) 1 R 0 3 −6 6 4 −5 2 2 0 3 −6 6 4 −5 1 −3 4 −3 2 5  1 −3 4 −3 2 5  R3−2R2 0 1 −2 2 1 −3 −−−−−→ 0 1 −2 2 1 −3 (26) 0 3 −6 6 4 −5 0 0 0 0 1 4

Note that the last pivot entry is further to the right; that's OK. This matrix is still in REF. If we keep going, we can row reduce it until it's in RREF.

1 −3 4 −3 2 5  1 −3 4 −3 0 3  R1−2R3 0 1 −2 2 1 −3 −−−−−→ 0 1 −2 2 1 −7 (27) 0 0 0 0 1 4 R2−R3 0 0 0 0 1 4 1 −3 4 −3 0 3  1 0 −2 3 0 −24 R1+3R2 0 1 −2 2 1 −7 −−−−−→ 0 1 −2 2 0 −7  (28) 0 0 0 0 1 4 0 0 0 0 1 4

This RREF augmented matrix represents the linear system
$$\begin{cases} x_1 - 2x_3 + 3x_4 = -24\\ x_2 - 2x_3 + 2x_4 = -7\\ x_5 = 4 \end{cases} \tag{29}$$

We now assign parameters to the free variables: let x3 = s and x4 = t. We can thereby express the general solution of the system:
$$\begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4\\ x_5 \end{bmatrix} = \begin{bmatrix} -24 + 2s - 3t\\ -7 + 2s - 2t\\ s\\ t\\ 4 \end{bmatrix} \tag{30}$$
$$= \begin{bmatrix} -24\\ -7\\ 0\\ 0\\ 4 \end{bmatrix} + s\begin{bmatrix} 2\\ 2\\ 1\\ 0\\ 0 \end{bmatrix} + t\begin{bmatrix} -3\\ -2\\ 0\\ 1\\ 0 \end{bmatrix} \tag{31}$$

§3 More on Row Reduction (Egan)

§3.1 REF and RREF

Our goal in applying row reduction is to obtain the RREF of the augmented matrix.

Definition 3.1. A matrix A is in row echelon form (REF) if
1. zero rows are at the bottom;
2. every non-zero row begins with a 1;
3. leading 1s descend (echelon) from top left to bottom right;
4. entries below leading 1s are 0.

Add one more condition, and we get reduced row echelon form.

Definition 3.2. A matrix A is in reduced row echelon form (RREF) if
1. zero rows are at the bottom;
2. every non-zero row begins with a 1;
3. leading 1s descend (echelon) from top left to bottom right;
4. entries below leading 1s are 0;
5. entries above leading 1s are 0.

Exercise 3.3. Which of the following matrices are in RREF and REF?
$$A = \begin{bmatrix} 1 & 0\\ 0 & 0\\ 0 & 1 \end{bmatrix} \quad B = \begin{bmatrix} 1 & 0 & 2 & 5\\ 0 & 0 & 1 & 3\\ 0 & 0 & 0 & 7 \end{bmatrix} \quad C = \begin{bmatrix} 0 & 1 & 7\\ 0 & 0 & 6 \end{bmatrix} \tag{32}$$
$$D = \begin{bmatrix} 1 & 5 & 7 & 3\\ 0 & 1 & 6 & 4 \end{bmatrix} \quad E = \begin{bmatrix} 1 & 0 & 7\\ 0 & 1 & 5\\ 0 & 0 & 0 \end{bmatrix} \tag{33}$$
D is in REF and E is in RREF.
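The conditions of Definitions 3.1 and 3.2 can be turned into a mechanical check. A sketch (helper names are my own, not from the notes) classifying the matrices of Exercise 3.3:

```python
def leading_index(row):
    """Column index of the first non-zero entry, or None for a zero row."""
    return next((j for j, x in enumerate(row) if x != 0), None)

def is_ref(M):
    leads = [leading_index(row) for row in M]
    nz = [L for L in leads if L is not None]
    zero_rows_at_bottom = leads == nz + [None] * (len(M) - len(nz))
    leading_ones = all(M[i][L] == 1 for i, L in enumerate(nz))
    # strictly descending leads imply entries below each leading 1 are 0
    echelon = all(a < b for a, b in zip(nz, nz[1:]))
    return zero_rows_at_bottom and leading_ones and echelon

def is_rref(M):
    leads = [leading_index(row) for row in M]
    # extra condition: every other entry in a pivot column is 0
    cleared = all(M[r][L] == 0
                  for i, L in enumerate(leads) if L is not None
                  for r in range(len(M)) if r != i)
    return is_ref(M) and cleared

A = [[1, 0], [0, 0], [0, 1]]
D = [[1, 5, 7, 3], [0, 1, 6, 4]]
E = [[1, 0, 7], [0, 1, 5], [0, 0, 0]]
print(is_ref(D), is_rref(D))  # True False
print(is_ref(E), is_rref(E))  # True True
print(is_ref(A))              # False: a zero row sits above a non-zero row
```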

Theorem 3.4 Every m × n matrix A has a unique RREF R that is row equivalent to A: A ∼ R.

Given a system with augmented matrix A ∼ R, how do we write a general solution?

Example 3.5

$$A \sim R = \begin{bmatrix} 1 & 0 & 3 & 7\\ 0 & 1 & 2 & 1 \end{bmatrix} \tag{34}$$
Letting x3 = s, we thereby have
$$\begin{bmatrix} x_1\\ x_2\\ x_3 \end{bmatrix} = \begin{bmatrix} -3s + 7\\ -2s + 1\\ s \end{bmatrix} = s\begin{bmatrix} -3\\ -2\\ 1 \end{bmatrix} + \begin{bmatrix} 7\\ 1\\ 0 \end{bmatrix} \tag{35}$$


§3.2 Row vectors, column vectors and linear dependence

Given the following augmented matrix A, what can we conclude about the relationship between the equations of the linear system?

1 2 −1 2  1 2 −1 2  1 2 −1 2  R2−2R1 R3−3R2 A = 2 5 2 −1 −−−−−→ 0 1 4 −5  −−−−−→ 0 1 4 −5 7 17 5 −1 R3−7R1 0 3 12 −15 0 0 0 0

We see that row 3 in REF is a row of 0s. It must therefore be a linear combination of the other rows. Let's recap what happened. Looking at the operations conducted on the third row: R3 was replaced by R3 − 7R1, and that row was then reduced to zero by subtracting 3(R2 − 2R1). Hence R3 − 7R1 = 3(R2 − 2R1), and so R3 = R1 + 3R2.

Now consider columns of RREF(A):

1 2 −1 2  1 0 −9 12  R1−2R2 0 1 4 −5 −−−−−→ 0 1 4 −5 (36) 0 0 0 0 0 0 0 0

In RREF, we have (noting that Ci indicates the ith column) C3 = −9C1 + 4C2 and C4 = 12C1 − 5C2; these relations are invariant under row operations (they also hold for the columns of the original A).

Definition 3.6. A set of vectors $(\vec{u}_1, \vec{u}_2, \cdots, \vec{u}_n)$ is linearly dependent (LD) if one of the vectors $\vec{u}_k$ is a linear combination of the others.

In the example above, the row vectors $\{\vec{r}_1, \vec{r}_2, \vec{r}_3\}$ are LD, since $\vec{r}_3 = \vec{r}_1 + 3\vec{r}_2$. The column vectors are also LD since $\vec{c}_3 = -9\vec{c}_1 + 4\vec{c}_2$. Note that when one vector is a linear combination of the others, any vector appearing in that relation with a non-zero coefficient can likewise be written as a linear combination of the rest.

Example 3.7
We can express $\vec{r}_1$ as a linear combination of $\vec{r}_2$ and $\vec{r}_3$, but we can also express $\vec{r}_2$ in terms of $\vec{r}_1$ and $\vec{r}_3$:
$$\vec{r}_1 = -3\vec{r}_2 + \vec{r}_3 \tag{37}$$
$$\vec{r}_2 = -\tfrac{1}{3}\vec{r}_1 + \tfrac{1}{3}\vec{r}_3 \tag{38}$$

§3.3 Matrix rank and linear systems

Definition 3.8. The rank of a matrix A, written rank(A), is defined as the number of leading ones in the REF/RREF of A. Note that rank(A) ≤ m and rank(A) ≤ n.

For instance,
$$A = \begin{bmatrix} 1 & 2 & -1 & 2\\ 2 & 5 & 2 & -1\\ 7 & 17 & 5 & -1 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -1 & 2\\ 0 & 1 & 4 & -5\\ 0 & 0 & 0 & 0 \end{bmatrix} \tag{39}$$
Hence, rank(A) = 2. Note that

1 2 −1 C = 2 5 2  (40) 7 17 5

is called the coefficient matrix. If rank(A) = rank(C), then the system is consistent, since there are no leading 1s in the last column, whereas if rank(A) = rank(C) + 1, the system is inconsistent. In the above example,

1 2 −1 1 2 −1 C = 2 5 2  ∼ 0 1 2  (41) 7 17 5 0 0 0

Hence, for this system, rank(A) = rank(C) = 2, and the system is consistent. However, how do we differentiate between consistent systems with 1 solution and ones with infinitely many? If the system is consistent, then

• rank(A) = n of C¹ indicates that the system has a unique solution

• rank(A) < n of C indicates that the system has infinitely many solutions

In summary, consider three systems with the following ranks, where b is a column vector whose entries are the constants of the linear equations of the system:

System    rank(C)   rank(C|b)   n   Number of solutions
First        2          2       2   1 (consistent)
Second       1          2       2   0 (inconsistent)
Third        1          1       2   ∞ (consistent)
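The table above is a complete case analysis, so it can be written directly as a decision rule. A sketch (the function name and labels are my own, not from the notes):

```python
def classify(rank_C, rank_Cb, n):
    """Classify a linear system from rank(C), rank(C|b), and the number n of variables."""
    if rank_Cb == rank_C + 1:
        # a leading 1 falls in the constants column: 0 = non-zero
        return "inconsistent"
    if rank_C == n:
        return "unique solution"
    return "infinitely many solutions"

# The three systems of the table:
print(classify(2, 2, 2))  # unique solution
print(classify(1, 2, 2))  # inconsistent
print(classify(1, 1, 2))  # infinitely many solutions
```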

Furthermore, a system with more variables than equations (i.e., n > m) is called an underdetermined system.

Theorem 3.9 An underdetermined system has either no solutions or infinitely many solutions.

Proof. Since rank(C) ≤ m and n > m, we have rank(C) < n, and thus, as shown in the table above, the system can only have 0 solutions or infinitely many solutions.

I’m also putting a more involved proof to this theorem here as it seems appropriate and will give students a sneak-peek at material which will be seen later in the course. Simply look over it, and come back once the rank-nullity theorem has been covered.

As seen above, rank(A) ≤ m. Furthermore, n > m for an underdetermined system. By the rank-nullity theorem, dim Null(A) = n − rank(A) ≥ n − m > 0. Thus, if the system has a solution $\vec{X}_\mu$, then any vector $\vec{X}_\mu + \vec{X}_0$ is a solution whenever $\vec{X}_0 \in \mathrm{Null}(A)$, and there are infinitely many such vectors $\vec{X}_0$. The system can therefore only have no solutions or infinitely many.

1Note that n of C corresponds to the number of variables of the linear system.


Example 3.10. For what values of k ∈ R does the augmented matrix A represent (1) a consistent linear system and (2) an inconsistent system?

1 2 −2 4 1 2 −2 4  A = 3 −1 1 2 ∼ 0 −7 7 −10  (42) 2 −3 3 k 0 0 0 k + 2

1. For the system to be consistent, rank(A) = rank(C). Hence, k = −2, yielding infinitely many solutions. The general solution would require one free variable.

2. For the system to be inconsistent, rank(A) = rank(C) + 1. Hence k ≠ −2.

Exercise 3.11. For what values of k ∈ R does the augmented matrix A represent (1) a linear system with infinitely many solutions, (2) a linear system with 1 solution and (3) a linear system with no solutions?

1 2 −3 4  1 2 −3 4  A = 3 −1 5 2  ∼ 0 −7 14 −10  (43) 4 1 k2 − 4 k + 2 0 0 k2 − 16 k − 4

1. The only way for the system to have infinitely many solutions is for k to create a row of zeros. Hence, k² − 16 = 0 and k − 4 = 0, so k = 4 yields a system with infinitely many solutions.

2. For the system to have a single solution, k² − 16 ≠ 0. Hence, k ≠ ±4.

3. For the system to be inconsistent, k² − 16 = 0 and k − 4 ≠ 0. Hence, k = −4.

Notice that we could have done this another way: we could have easily determined that k = −4 yields no solutions and k = 4 yields infinitely many solutions. By Theorem 1.6 (since a system can only have 0, 1, or ∞ solutions, and k² − 16 is defined for all k ∈ R), we could therefore have said that k ≠ ±4 gives a matrix representing a linear system with a unique solution.

§4 Applications of Row Reduction and Introduction to Homogeneous Systems (Egan)

§4.1 Some applications The following are some applications of what we have learned so far.

§4.1.1 Network flow

Definition 4.1. A network is a collection of nodes with edges between them.

For instance, consider the following traffic flow diagram, where f1, ··· , f4 indicate flow:

n4

f f 1 4

n1 n3

f 2 f 3

n2

Exercise 4.2. If node n1 has an input of 100 cars per unit time and n4, an input of 200 cars per unit time, find fi (where i = 1, 2, 3) in terms of f4. This system gives the following augmented matrix

1 −1 0 0 −100 1 0 0 −1 200 0 1 −1 0 200  0 1 0 −1 300 A =   ∼   (44) 0 0 1 −1 100  0 0 1 −1 100 1 0 0 1 −200 0 0 0 0 0

We therefore have (for f4 ≥ 0)

f1 = f4 + 200 (45)

f2 = f4 + 300 (46)

f3 = f4 + 100 (47)

Note that the highest flow is f2.

§4.1.2 Balancing chemical reactions We can use the method previously shown to balance chemical equations. For instance, what coefficients x, y, z, w will balance the following reaction?

xC8H8 + yO2 → zCO2 + wH2O (48)

Assuming the conservation of matter, we have
$$\begin{cases} 8x = z & \text{(carbon)}\\ 8x = 2w & \text{(hydrogen)}\\ 2y = 2z + w & \text{(oxygen)} \end{cases} \tag{49}$$

13 Tristan Martin (Fall 2017) 201-NYC-05-E (Enriched Linear Algebra I) Lecture Notes

The augmented matrix is thus
$$A = \left[\begin{array}{cccc|c} 8 & 0 & -1 & 0 & 0\\ 8 & 0 & 0 & -2 & 0\\ 0 & 2 & -2 & -1 & 0 \end{array}\right] \xrightarrow{\frac{1}{8}R_1} \left[\begin{array}{cccc|c} 1 & 0 & -\frac{1}{8} & 0 & 0\\ 8 & 0 & 0 & -2 & 0\\ 0 & 2 & -2 & -1 & 0 \end{array}\right] \tag{50}$$
$$\xrightarrow{R_2 - 8R_1} \left[\begin{array}{cccc|c} 1 & 0 & -\frac{1}{8} & 0 & 0\\ 0 & 0 & 1 & -2 & 0\\ 0 & 2 & -2 & -1 & 0 \end{array}\right] \tag{51}$$
$$\xrightarrow[\frac{1}{2}R_2]{R_2 \leftrightarrow R_3} \left[\begin{array}{cccc|c} 1 & 0 & -\frac{1}{8} & 0 & 0\\ 0 & 1 & -1 & -\frac{1}{2} & 0\\ 0 & 0 & 1 & -2 & 0 \end{array}\right] \tag{52}$$
$$\xrightarrow[R_1 + \frac{1}{8}R_3]{R_2 + R_3} \left[\begin{array}{cccc|c} 1 & 0 & 0 & -\frac{1}{4} & 0\\ 0 & 1 & 0 & -\frac{5}{2} & 0\\ 0 & 0 & 1 & -2 & 0 \end{array}\right] \tag{53}$$
Hence,
$$\begin{bmatrix} x\\ y\\ z \end{bmatrix} = w\begin{bmatrix} \frac{1}{4}\\ \frac{5}{2}\\ 2 \end{bmatrix} \tag{55}$$
We need the smallest integer solution. Thus, using w = 4, we have x = 1, y = 10, z = 8:

C8H8 + 10O2 → 8CO2 + 4H2O (56)
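The conservation equations (49) double as a quick check on the balanced reaction. A small verification sketch (my own, not from the notes):

```python
# Check that x C8H8 + y O2 -> z CO2 + w H2O balances element by element
# with the coefficients found above.
x, y, z, w = 1, 10, 8, 4

carbon   = (8 * x == z)          # C: 8x = z
hydrogen = (8 * x == 2 * w)      # H: 8x = 2w
oxygen   = (2 * y == 2 * z + w)  # O: 2y = 2z + w
print(carbon, hydrogen, oxygen)  # True True True
```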

§4.2 Homogeneous systems and solutions to linear systems

Definition 4.3. A linear system is homogeneous if all of its constants are 0; in other words, $\vec{b} = \vec{0}$, where $\vec{0}$ is the zero vector.

Remark 4.4. Homogeneous systems are always consistent; they always have the following trivial solution:
$$\vec{X} = \begin{bmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{bmatrix} = \begin{bmatrix} 0\\ 0\\ \vdots\\ 0 \end{bmatrix} = \vec{0} \tag{57}$$
Hence, homogeneous systems either have only $\vec{X} = \vec{0}$ as a solution or have infinitely many solutions.

Every linear system with augmented matrix $(C|\vec{b})$ has an associated homogeneous system with augmented matrix $(C|\vec{0})$.

Problem 4.5. Consider the following system:
$$\begin{cases} 2x_1 + 4x_2 + x_3 - 2x_4 = 1\\ -2x_1 - 4x_2 + x_3 + 5x_4 = 3\\ 4x_1 + 8x_2 + 4x_3 - x_4 = 6\\ 2x_1 + 4x_2 + 3x_3 + x_4 = 5 \end{cases} \tag{58}$$
At this point, you should be able to solve the system. You can try this yourself. The solution is
$$\begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{bmatrix} = s\begin{bmatrix} -2\\ 1\\ 0\\ 0 \end{bmatrix} + t\begin{bmatrix} \frac{7}{4}\\ 0\\ -\frac{3}{2}\\ 1 \end{bmatrix} + \begin{bmatrix} -\frac{1}{2}\\ 0\\ 2\\ 0 \end{bmatrix} \tag{59}$$

where x2 = s and x4 = t. What do you suppose the solution to the associated homogeneous system would look like? Try it out: solve the following system:

$$\begin{cases} 2x_1 + 4x_2 + x_3 - 2x_4 = 0\\ -2x_1 - 4x_2 + x_3 + 5x_4 = 0\\ 4x_1 + 8x_2 + 4x_3 - x_4 = 0\\ 2x_1 + 4x_2 + 3x_3 + x_4 = 0 \end{cases} \tag{60}$$

Notice that RREF(A) is not the same, but RREF(C) is; this is expected. The solution to the homogeneous system is thus
$$\begin{bmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{bmatrix} = s\begin{bmatrix} -2\\ 1\\ 0\\ 0 \end{bmatrix} + t\begin{bmatrix} \frac{7}{4}\\ 0\\ -\frac{3}{2}\\ 1 \end{bmatrix} \tag{61}$$
where x2 = s and x4 = t. Notice how the solution set passes through the origin; this is a consequence of Remark 4.4. Let's compare the solution sets: s = t = 0 gives $\vec{X} \neq \vec{0}$ in the non-homogeneous system, whereas it yields the trivial solution in the homogeneous system. However, the parameter portion of the two solutions is identical.

Theorem 4.6. Suppose a linear system with augmented matrix A has a particular solution $\vec{x}_\mu$. Then any solution $\vec{x}$ of the system has the form
$$\vec{x} = \vec{x}_\mu + \vec{x}_0, \tag{62}$$
where $\vec{x}_0$ is a solution of the associated homogeneous system.
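Theorem 4.6 is easy to see on a small example. Below is an illustration (the one-equation system is my own, not from the notes): starting from one particular solution of x1 + x2 = 2, shifting by any null-space vector stays inside the solution set.

```python
def residual(x):
    """Left side minus right side of the system x1 + x2 = 2."""
    return x[0] + x[1] - 2

particular = (2, 0)                      # one solution of the system
homogeneous = lambda t: (-t, t)          # solutions of x1 + x2 = 0

for t in (0, 1, -3, 2.5):
    x = tuple(p + h for p, h in zip(particular, homogeneous(t)))
    assert residual(x) == 0              # still a solution after shifting
print("every particular-plus-homogeneous vector solves the system")
```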

§5 Introduction to Geometric Vectors (Lemelin)

Notice: At this point, the linear progression of the course material kind of goes out the window. Professor Egan was injured and was subsequently replaced by multiple professors. We now switch gears and begin material on vectors.

§5.1 Vectors: the geometric case

Later in this course, and especially in Linear Algebra II and other courses, vectors will be seen to be many, many different things. For now, we content ourselves with knowing that vectors are elements of a vector space, a concept which will be studied later on. For now, we consider the special case where the space is R² or R³, and we look at the particular case where vectors are geometric objects.

Geometric vectors have magnitude and direction, whereas a scalar ∈ R (and later ∈ C) only has magnitude. All vectors have two types of operations:

1. Scalar multiplication: stretches or shrinks the vector while keeping the same direction or exactly the opposite direction. Two vectors are parallel when they are scalar multiples of each other: $\vec{u} \parallel \vec{v}$ if $\vec{u} = k\vec{v}$, where k ∈ R.
Note that $\vec{0}$ has magnitude 0, but by convention its direction is perpendicular to any other vector, including being perpendicular to itself.

2. Vector addition (subtraction is only a particular case of addition)

Vector addition has the following properties: −→ −→ −→ −→ −→ • To add vectors w and v ( w + v ), we move v so that its starting point is at the endpoint of −→w . −→ −→ −→ −→ • Vector addition is commutative w + v = v + w . As an exercise, show that this gives a parallelogram.

Given a point P in R² or in R³, the position vector $\vec{P}$ of the point P is the vector starting at the origin and extending to P. This vector is in standard position. We want a useful formula to describe the vector between two points P and Q: $\vec{PQ} = \vec{Q} - \vec{P}$.

§6 Notation (Turner)

This lecture was pretty short and focused on vector nomenclature.

In R², a vector is a list of two components:
$$\vec{v} = \langle v_1, v_2\rangle = [v_1, v_2] = (v_1, v_2) \tag{63}$$
$$= \begin{bmatrix} v_1\\ v_2 \end{bmatrix} = \begin{pmatrix} v_1\\ v_2 \end{pmatrix} = [v_1, v_2]^T, \tag{64}$$
where T indicates the transpose of the row vector. These are all appropriate ways of expressing a vector in R². Furthermore, a vector may also be labeled $\mathbf{v}$, $\vec{v}$, $\underline{v}$, or simply v.

Definition 6.1. For a matrix A, the transpose maps the ith-row, jth-column element of A to the jth-row, ith-column element of $A^T$: $[A^T]_{ij} = A_{ji}$.

The magnitude of a vector $\vec{\omega} \in \mathbb{R}^n$ is $\|\vec{\omega}\| = \sqrt{\omega_1^2 + \cdots + \omega_n^2}$. Furthermore, the addition of two vectors $\vec{v}$ and $\vec{w}$ is expressed as
$$\vec{v} + \vec{w} = \langle v_1 + w_1, \cdots, v_n + w_n\rangle. \tag{65}$$
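These component-wise formulas translate directly into code. A small sketch (helper names are my own, not from the notes):

```python
import math

def magnitude(w):
    """||w|| = sqrt(w1^2 + ... + wn^2)"""
    return math.sqrt(sum(wi ** 2 for wi in w))

def add(v, w):
    """Component-wise vector addition."""
    return tuple(vi + wi for vi, wi in zip(v, w))

def scale(k, v):
    """Scalar multiplication k * v."""
    return tuple(k * vi for vi in v)

v, w = (3, 4), (1, -2)
print(magnitude(v))             # 5.0
print(add(v, w))                # (4, 2)
# Proposition 6.2 below: ||k v|| = |k| ||v||
print(magnitude(scale(-2, v)))  # 10.0
```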

Proposition 6.2. If k ∈ R and $\vec{v} \in \mathbb{R}^n$, then the length of $k\vec{v}$ is |k| times the length of $\vec{v}$:
$$\|k\vec{v}\| = \sqrt{(kv_1)^2 + (kv_2)^2 + \cdots + (kv_n)^2} = \sqrt{k^2(v_1^2 + v_2^2 + \cdots + v_n^2)} = |k|\sqrt{v_1^2 + v_2^2 + \cdots + v_n^2} = |k|\,\|\vec{v}\|.$$

Finally, here’s a nifty theorem that will come in handy when doing vector geometry:

Theorem 6.3 Given three points, they form a triangle if and only if the points are not colinear.

§7 Dot Product, Projections, Orthogonals and the Cross Product (Barsheshat)

§7.1 Dot product

Definition 7.1. $\vec{u} \cdot \vec{v} = \sum_{i=1}^{n} u_iv_i$. This operation is called the dot product (or scalar product) of vectors $\vec{u}$ and $\vec{v}$. The dot product has the following properties:

• commutativity: $\vec{u} \cdot \vec{v} = \vec{v} \cdot \vec{u}$

• distributivity over vector addition: $\vec{u} \cdot (\vec{w} + \vec{v}) = \vec{u} \cdot \vec{w} + \vec{u} \cdot \vec{v}$

• bilinearity: $\vec{u} \cdot (k\vec{w} + \vec{v}) = k(\vec{u} \cdot \vec{w}) + (\vec{u} \cdot \vec{v})$

• scalar multiplication: $(k_1\vec{u}) \cdot (k_2\vec{v}) = k_1k_2(\vec{u} \cdot \vec{v})$

• non-associativity: the operation $\vec{u} \cdot (\vec{v} \cdot \vec{w})$ makes no sense, since what is in the parentheses is a scalar and you cannot compute the dot product of a scalar and a vector

• orthogonality: two non-zero vectors $\vec{u}$ and $\vec{v}$ are orthogonal if and only if their scalar product is 0 (we will shortly see this concretely)

• $\vec{u} \cdot \vec{u} = \|\vec{u}\|^2$

Theorem 7.2. We can interpret the dot product geometrically (at least in a familiar way in R² and R³): $\vec{u} \cdot \vec{v} = \|\vec{u}\|\,\|\vec{v}\|\cos\theta$, where θ is the angle between the two vectors.

Proof. Given two vectors $\vec{u}$ and $\vec{v}$ separated by angle θ, they form a triangle with third side $\vec{u} - \vec{v}$. Applying the law of cosines,
$$\|\vec{u} - \vec{v}\|^2 = \|\vec{u}\|^2 + \|\vec{v}\|^2 - 2\|\vec{u}\|\|\vec{v}\|\cos\theta \tag{72}$$
$$(\vec{u} - \vec{v})\cdot(\vec{u} - \vec{v}) = \|\vec{u}\|^2 + \|\vec{v}\|^2 - 2\|\vec{u}\|\|\vec{v}\|\cos\theta \tag{73}$$
$$\vec{u}\cdot\vec{u} - 2\,\vec{u}\cdot\vec{v} + \vec{v}\cdot\vec{v} = \|\vec{u}\|^2 + \|\vec{v}\|^2 - 2\|\vec{u}\|\|\vec{v}\|\cos\theta \tag{74}$$
$$\|\vec{u}\|^2 - 2\,\vec{u}\cdot\vec{v} + \|\vec{v}\|^2 = \|\vec{u}\|^2 + \|\vec{v}\|^2 - 2\|\vec{u}\|\|\vec{v}\|\cos\theta \tag{75}$$
$$\vec{u}\cdot\vec{v} = \|\vec{u}\|\|\vec{v}\|\cos\theta \tag{76}$$

Now, we can calculate the angle θ between two vectors −→u and −→v by

$$\theta = \arccos\left(\frac{\vec{u}\cdot\vec{v}}{\|\vec{u}\|\|\vec{v}\|}\right) \tag{77}$$

Note that the arccos function maps [−1, 1] → [0, π].
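Formula (77) can be checked on a couple of familiar angles. A sketch (helper names are my own, not from the notes):

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def angle(u, v):
    """Angle between u and v via equation (77)."""
    return math.acos(dot(u, v) / (norm(u) * norm(v)))

print(angle((1, 0), (0, 1)))  # pi/2: orthogonal vectors have dot product 0
print(angle((1, 0), (1, 1)))  # pi/4
```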

Theorem 7.3. For two vectors $\vec{u}$ and $\vec{v}$ in $\mathbb{R}^n$, we have $\|\vec{u} \pm \vec{v}\|^2 = \|\vec{u}\|^2 + \|\vec{v}\|^2 \pm 2(\vec{u}\cdot\vec{v})$.

18 Tristan Martin (Fall 2017) 201-NYC-05-E (Enriched Linear Algebra I) Lecture Notes

Proof.
$$\|\vec{u} \pm \vec{v}\|^2 = \sum_{i=1}^{n}(u_i \pm v_i)^2 = \sum_{i=1}^{n}u_i^2 \pm 2\sum_{i=1}^{n}u_iv_i + \sum_{i=1}^{n}v_i^2 = \|\vec{u}\|^2 \pm 2(\vec{u}\cdot\vec{v}) + \|\vec{v}\|^2$$

Theorem 7.4 −→ m −→ −→ If v ∈ R is orthogonal to every other vector, then v = 0

Proof. If $\vec{v}$ is orthogonal to every vector, then it is in particular orthogonal to itself: $\vec{v}\cdot\vec{v} = \|\vec{v}\|^2 = 0$. The only vector of magnitude 0 is $\vec{0}$, so $\vec{v} = \vec{0}$.

Theorem 7.5 −→ −→ 2 3 n For two vectors u and v in R and R (and technically in R ), we have the Cauchy-Schwarz inequality |−→u · −→v | ≤ ||−→u ||||−→v ||

Proof. For R² and R³, we use our geometric intuition and begin with
$$|\cos\theta| \le 1 \tag{82}$$
Multiplying both sides by $\|\vec{u}\|\|\vec{v}\|$, we have
$$\|\vec{u}\|\|\vec{v}\|\,|\cos\theta| \le \|\vec{u}\|\|\vec{v}\| \tag{83}$$
Hence,
$$|\vec{u}\cdot\vec{v}| \le \|\vec{u}\|\|\vec{v}\| \tag{84}$$

For $\mathbb{R}^n$, note that both sides of the inequality are non-negative, so $|\vec{u}\cdot\vec{v}| \le \|\vec{u}\|\|\vec{v}\|$ is equivalent to its square,
$$\left(\sum_{i=1}^{n}u_iv_i\right)^2 \le \left(\sum_{i=1}^{n}u_i^2\right)\left(\sum_{i=1}^{n}v_i^2\right), \tag{85}$$
which we now prove.

19 Tristan Martin (Fall 2017) 201-NYC-05-E (Enriched Linear Algebra I) Lecture Notes

Now, we consider the function:

$$f(x) = \sum_{i=1}^{n}(u_ix + v_i)^2. \tag{91}$$

We have $(u_ix + v_i)^2 \ge 0 \Rightarrow f(x) \ge 0$ for all $x \in \mathbb{R}$. Expanding f(x),
$$f(x) = \sum_{i=1}^{n}(u_i^2x^2 + 2u_iv_ix + v_i^2) = \left(\sum_{i=1}^{n}u_i^2\right)x^2 + 2\left(\sum_{i=1}^{n}u_iv_i\right)x + \sum_{i=1}^{n}v_i^2 \tag{92–93}$$
Since $f(x) \ge 0$ for every x, this quadratic has at most one real root, so its discriminant satisfies $\Delta \le 0$:
$$4\left(\sum_{i=1}^{n}u_iv_i\right)^2 - 4\left(\sum_{i=1}^{n}u_i^2\right)\left(\sum_{i=1}^{n}v_i^2\right) \le 0 \;\Longrightarrow\; \left(\sum_{i=1}^{n}u_iv_i\right)^2 \le \left(\sum_{i=1}^{n}u_i^2\right)\left(\sum_{i=1}^{n}v_i^2\right) \tag{94–95}$$

§7.2 Projection of a vector onto another vector

Imagine two vectors separated by an angle, identical to the situation seen above. If we were to shine a light from straight above, perpendicular to $\vec{v}$ for instance, the shadow cast by $\vec{u}$ onto $\vec{v}$ is called the projection of $\vec{u}$ onto $\vec{v}$, written $\mathrm{proj}_{\vec{v}}\vec{u}$, which has magnitude
$$\|\mathrm{proj}_{\vec{v}}\vec{u}\| = \|\vec{u}\|\,|\cos\theta| = \|\vec{u}\|\,\frac{|\vec{u}\cdot\vec{v}|}{\|\vec{u}\|\|\vec{v}\|} = \frac{|\vec{u}\cdot\vec{v}|}{\|\vec{v}\|} \tag{96–98}$$
Furthermore, as $\mathrm{proj}_{\vec{v}}\vec{u}$ is in the same direction as (or exactly opposite to) $\vec{v}$, we write:
$$\mathrm{proj}_{\vec{v}}\vec{u} = \left(\frac{\vec{u}\cdot\vec{v}}{\|\vec{v}\|}\right)\frac{\vec{v}}{\|\vec{v}\|} = \frac{(\vec{u}\cdot\vec{v})}{\|\vec{v}\|^2}\,\vec{v} \tag{99–100}$$
If $0 < \theta < \pi/2$, the scalar coefficient $(\vec{u}\cdot\vec{v})/\|\vec{v}\|^2$ is positive, and if $\pi/2 < \theta < \pi$, it is negative. If $\theta = \pi/2$, then $\mathrm{proj}_{\vec{v}}\vec{u} = \vec{0}$ because the dot product vanishes.

§7.3 Orthogonal projection

Definition 7.6. The orthogonal projection of $\vec{u}$ on $\vec{v}$, written $\mathrm{orth}_{\vec{v}}\vec{u}$, is $\mathrm{orth}_{\vec{v}}\vec{u} = \vec{u} - \mathrm{proj}_{\vec{v}}\vec{u}$, or
$$\mathrm{orth}_{\vec{v}}\vec{u} = \vec{u} - \frac{(\vec{u}\cdot\vec{v})}{\|\vec{v}\|^2}\,\vec{v} \tag{101}$$

20 Tristan Martin (Fall 2017) 201-NYC-05-E (Enriched Linear Algebra I) Lecture Notes

Note that $\|\mathrm{orth}_{\vec{v}}\vec{u}\|^2 + \|\mathrm{proj}_{\vec{v}}\vec{u}\|^2 = \|\vec{u}\|^2$.

Exercise 7.7. Show that $\mathrm{orth}_{\vec{v}}\vec{u}$ and $\mathrm{proj}_{\vec{v}}\vec{u}$ are perpendicular. We calculate the dot product:
$$(\mathrm{proj}_{\vec{v}}\vec{u})\cdot(\mathrm{orth}_{\vec{v}}\vec{u}) = \frac{(\vec{u}\cdot\vec{v})}{\|\vec{v}\|^2}\,\vec{v}\cdot\left(\vec{u} - \frac{(\vec{u}\cdot\vec{v})}{\|\vec{v}\|^2}\,\vec{v}\right) = \frac{(\vec{u}\cdot\vec{v})}{\|\vec{v}\|^2}(\vec{u}\cdot\vec{v}) - \frac{(\vec{u}\cdot\vec{v})^2}{\|\vec{v}\|^4}(\vec{v}\cdot\vec{v}) = \frac{(\vec{u}\cdot\vec{v})^2}{\|\vec{v}\|^2} - \frac{(\vec{u}\cdot\vec{v})^2}{\|\vec{v}\|^2} = 0$$
Hence, $\mathrm{orth}_{\vec{v}}\vec{u}$ and $\mathrm{proj}_{\vec{v}}\vec{u}$ are perpendicular.
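Both formulas, and the perpendicularity of Exercise 7.7, can be checked numerically. A sketch (helper names are my own, not from the notes):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def proj(u, v):
    """Projection of u onto v: ((u.v)/||v||^2) v, as in equation (100)."""
    c = dot(u, v) / dot(v, v)
    return tuple(c * vi for vi in v)

def orth(u, v):
    """Orthogonal component: u - proj_v(u), as in equation (101)."""
    p = proj(u, v)
    return tuple(ui - pi for ui, pi in zip(u, p))

u, v = (3, 4), (1, 0)
print(proj(u, v))                   # (3.0, 0.0)
print(orth(u, v))                   # (0.0, 4.0)
print(dot(proj(u, v), orth(u, v)))  # 0.0: the two components are perpendicular
```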

§7.4 Cross product

Definition 7.8. $\vec{u} \times \vec{v} = \langle u_2v_3 - u_3v_2,\ u_3v_1 - u_1v_3,\ u_1v_2 - u_2v_1\rangle$, which corresponds to
$$\vec{u} \times \vec{v} = \begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k}\\ u_1 & u_2 & u_3\\ v_1 & v_2 & v_3 \end{vmatrix}, \tag{107}$$
where the expression on the right is called the determinant (an operation we will see later on in the course). Note that in higher mathematics classes the notation is $\vec{u} \wedge \vec{v}$.

The vector −→u × −→v is perpendicular to both −→u and −→v and satisfies

$$\|\vec{u} \times \vec{v}\| = \|\vec{u}\|\|\vec{v}\|\sin\theta, \tag{108}$$
which corresponds to the area of the parallelogram with sides $\vec{u}$ and $\vec{v}$. The cross product is anticommutative (i.e., $\vec{u} \times \vec{v} = -(\vec{v} \times \vec{u})$), distributive over vector addition, and bilinear. Furthermore, $\vec{u} \times \vec{u} = \vec{0}$ and
$$\hat{\imath} \times \hat{\imath} = \vec{0} \qquad \hat{\jmath} \times \hat{\jmath} = \vec{0} \qquad \hat{k} \times \hat{k} = \vec{0} \tag{109}$$
$$\hat{\jmath} \times \hat{k} = \hat{\imath} \qquad \hat{k} \times \hat{\imath} = \hat{\jmath} \qquad \hat{\imath} \times \hat{\jmath} = \hat{k} \tag{110}$$

Exercise 7.9. Show that if three vectors −→u , −→v , −→w are coplanar, then their triple scalar product is 0. We can show this easily: |−→u · (−→v × −→w )| = ||−→u ||||−→v ||||−→w ||| sin θ|| cos φ|, where φ = π/2, and thus the scalar is 0.

Exercise 7.10. Prove that $\vec u \times \vec v$ is perpendicular to $\vec u$ and $\vec v$. We compute the dot product of these two vectors:
$$\vec u \cdot (\vec u \times \vec v) = \langle u_1, u_2, u_3\rangle \cdot \langle u_2v_3 - u_3v_2,\ u_3v_1 - u_1v_3,\ u_1v_2 - u_2v_1\rangle \quad (111)$$
$$= u_1u_2v_3 - u_1u_3v_2 + u_2u_3v_1 - u_2u_1v_3 + u_3u_1v_2 - u_3u_2v_1 \quad (112)$$
$$= 0 \quad (113)$$

Similarly for $\vec v$.
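The perpendicularity just proven, along with anticommutativity, is easy to check numerically. A sketch implementing Definition 7.8 directly (the sample vectors are arbitrary):

```python
def cross(u, v):
    # <u2 v3 - u3 v2, u3 v1 - u1 v3, u1 v2 - u2 v1>
    return [u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

u, v = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]
w = cross(u, v)
print(w)                     # [-3.0, 6.0, -3.0]
print(dot(u, w), dot(v, w))  # 0.0 0.0: u x v is perpendicular to u and v
print(cross(v, u))           # [3.0, -6.0, 3.0], i.e. -(u x v): anticommutativity
```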


What we have just computed is the scalar triple product, which corresponds to
$$(\vec u \times \vec v)\cdot\vec w = \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = \vec w \cdot (\vec u \times \vec v) = \begin{vmatrix} w_1 & w_2 & w_3 \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix} \quad (114)$$
and has the following property: $(\vec u\times\vec v)\cdot\vec w = -(\vec u\times\vec w)\cdot\vec v = -(\vec v\times\vec u)\cdot\vec w = (\vec v\times\vec w)\cdot\vec u$. Note that $|(\vec u\times\vec v)\cdot\vec w|$ corresponds to the volume of the parallelepiped formed by these three vectors; in particular, any permutation of the three vectors yields the same result in absolute value.

Theorem 7.11. $\vec a \times (\vec b \times \vec c) = \vec b(\vec a\cdot\vec c) - \vec c(\vec a\cdot\vec b)$. This is known as Lagrange's formula.

Proof.
$$\vec a \times (\vec b \times \vec c) = \langle a_1, a_2, a_3\rangle \times \langle b_2c_3 - b_3c_2,\ b_3c_1 - b_1c_3,\ b_1c_2 - b_2c_1\rangle \quad (115)$$
$$= a_2(b_1c_2 - b_2c_1)\hat\imath - a_3(b_3c_1 - b_1c_3)\hat\imath + a_3(b_2c_3 - b_3c_2)\hat\jmath \quad (116)$$
$$\quad - a_1(b_1c_2 - b_2c_1)\hat\jmath + a_1(b_3c_1 - b_1c_3)\hat k - a_2(b_2c_3 - b_3c_2)\hat k \quad (117)$$
$$= a_2b_1c_2\hat\imath - a_2b_2c_1\hat\imath - a_3b_3c_1\hat\imath + a_3b_1c_3\hat\imath + a_3b_2c_3\hat\jmath - a_3b_3c_2\hat\jmath \quad (118)$$
$$\quad - a_1b_1c_2\hat\jmath + a_1b_2c_1\hat\jmath + a_1b_3c_1\hat k - a_1b_1c_3\hat k - a_2b_2c_3\hat k + a_2b_3c_2\hat k \quad (119)$$
$$= \vec b(\vec a\cdot\vec c) - \langle a_1b_1c_1, a_2b_2c_2, a_3b_3c_3\rangle \quad (120)$$
$$\quad - \vec c(\vec a\cdot\vec b) + \langle a_1b_1c_1, a_2b_2c_2, a_3b_3c_3\rangle \quad (121)$$
$$= \vec b(\vec a\cdot\vec c) - \vec c(\vec a\cdot\vec b) \quad (122)$$

§8 Lines and Planes (Barsheshat)

§8.1 Lines in $\mathbb{R}^2$

The standard equation of a line taught in high school is $y = ax + b$, where $a$ is the slope and $b$ is the $y$-intercept. However, we can rewrite this equation in a form that generalizes nicely to higher dimensions. Consider points $P$ and $Q$ on a line $L$. We have $\vec{OP}$ and $\vec{OQ}$. Let $\vec d = \vec{PQ}$ and $\vec{OP} = \vec p$ for notational simplicity. As a result, the equation of the line $L$ can be written as
$$\vec r(t) = \vec p + t\vec d = \vec{OX}, \quad (123)$$
where $X$ is a point on the line. This can be rewritten in vector function form
$$\vec r(t) = \langle x_0 + td_1,\ y_0 + td_2\rangle, \quad (124)$$
where $\vec p = \langle x_0, y_0\rangle$ (a point on the line) and $\vec d = \langle d_1, d_2\rangle$ (the direction vector of the line).

Exercise 8.1. Say we have a line described by
$$\begin{pmatrix} r_1(t) \\ r_2(t) \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} + t\begin{pmatrix} d_1 \\ d_2 \end{pmatrix}. \quad (125)$$
Convert from vector function form to the form $y = ax + b$, where $a$ and $b$ will be in terms of $x_0, y_0$ and $d_1, d_2$. We have
$$a = \frac{\Delta y}{\Delta x} = \frac{d_2}{d_1} \quad (126)$$
and
$$y_0 = \frac{d_2}{d_1}x_0 + b \quad (127)$$
$$\Rightarrow b = y_0 - \frac{d_2}{d_1}x_0 \quad (128)$$

§8.2 Lines in $\mathbb{R}^3$

This notation generalizes well to higher dimensions. In $n$ dimensions, the equation is
$$\vec r(t) = \langle x_1, x_2, \cdots, x_n\rangle + t\langle d_1, d_2, \cdots, d_n\rangle. \quad (129)$$

For now, however, we are generally concerned with lines in $\mathbb{R}^3$, written in:

• Vector function form: $\vec r(t) = \langle x_0 + td_1,\ y_0 + td_2,\ z_0 + td_3\rangle$

• Parametric equation form
$$x = x_0 + td_1, \qquad y = y_0 + td_2, \qquad z = z_0 + td_3$$

• Symmetric equation form
$$t = \frac{x - x_0}{d_1} = \frac{y - y_0}{d_2} = \frac{z - z_0}{d_3}$$


Exercise 8.2. Find another parametric equation describing the line
$$\vec r(t) = \begin{pmatrix} r_1(t) \\ r_2(t) \\ r_3(t) \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} + t\begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix}. \quad (130)$$
Suppose we introduce a new parameter $s$ such that $t = 3 - 2s$, thereby yielding
$$\vec r(s) = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} + (3 - 2s)\begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix} \quad (131)$$
$$= \begin{pmatrix} 7 \\ -3 \\ -1 \end{pmatrix} + s\begin{pmatrix} -4 \\ 2 \\ 0 \end{pmatrix} \quad (132)$$

Theorem 8.3. The distance between two skew lines is
$$D = \|\operatorname{proj}_{\vec n}\vec{P_1P_2}\|, \quad (133)$$
where $P_1$ is on the first line, $P_2$ is on the second line, and $\vec n = \vec d_1 \times \vec d_2$.

Theorem 8.4. The distance between a line $L$ and a point $M_1(x, y, z)$ is
$$D = \frac{\|\vec{M_0M_1} \times \vec d\|}{\|\vec d\|}, \quad (134)$$
where $M_0$ is on the line and $\vec d$ is the direction vector of the line.

Proof. Let $s\vec d$ (where $s \in \mathbb{R}$) and $\vec{M_0M_1}$ have $M_0$ as an origin. These vectors form the sides of a parallelogram. The endpoints of $s\vec d$ and $\vec{M_0M_1}$ are joined by a vector $\vec D$ which is orthogonal to $s\vec d$. Hence, $\|\vec D\|$ is the distance between the point and the line. Note that
$$A = \|\vec{M_0M_1} \times s\vec d\| \quad (135)$$
is the area of the parallelogram. However, $A$ is also equivalent to $A = \|s\vec d\|\,\|\vec D\|$. As a result,
$$D = \|\vec D\| = \frac{\|\vec{M_0M_1} \times s\vec d\|}{\|s\vec d\|} = \frac{\|\vec{M_0M_1} \times \vec d\|}{\|\vec d\|} \quad (136)$$
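Theorem 8.4 translates directly into code. A sketch (the line and point are a made-up example where the answer is the classic 3-4-5 distance of 5):

```python
import math

def cross(u, v):
    return [u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0]]

def norm(v):
    return math.sqrt(sum(a * a for a in v))

def point_line_distance(m1, m0, d):
    # D = ||M0M1 x d|| / ||d||, with M0 on the line and d its direction
    w = [b - a for a, b in zip(m0, m1)]  # vector M0M1
    return norm(cross(w, d)) / norm(d)

# line through the origin with direction (1, 0, 0); point (0, 3, 4)
print(point_line_distance([0.0, 3.0, 4.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 5.0
```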

§8.3 Parallel lines

Theorem 8.5. Parallel lines $L_1 \parallel L_2$ have $\vec d_1 = k\vec d_2$, where $k \in \mathbb{R}$.


Moreover,
1. (in $\mathbb{R}^2$) $L_1 \parallel L_2$ (with $L_1$ distinct from $L_2$) $\Leftrightarrow$ $L_1$ and $L_2$ never intersect
2. (in $\mathbb{R}^3$) $L_1 \parallel L_2$ (and $L_1 \neq L_2$) $\Leftrightarrow$ $L_1$ and $L_2$ do not intersect

Proof. We will prove that distinct parallel lines do not intersect. Assume $L_1 \parallel L_2$, where
$$L_1 : \vec r(t) = \vec p_1 + t\vec d_1 \quad (137)$$
$$L_2 : \vec r(s) = \vec p_2 + s\vec d_2. \quad (138)$$
Without loss of generality, we may assume both lines have the same direction vector $\vec d$. Assuming the lines are distinct, we may say that $\vec p_2 \notin L_1$:
$$\Rightarrow \vec p_2 \neq \vec r(t) = \vec p_1 + t\vec d \quad (139)$$
$$\Rightarrow \vec p_2 - \vec p_1 \neq t\vec d, \quad t \in \mathbb{R}. \quad (140)$$
We want to show there are no values of $s$ and $t$ such that $\vec r(t) = \vec r(s)$. Assume $\exists s, t$ such that $\vec r(t) = \vec r(s)$:
$$\vec p_1 + t\vec d = \vec p_2 + s\vec d \quad (141)$$
$$\vec p_2 - \vec p_1 = (t - s)\vec d, \quad (t - s) \in \mathbb{R}. \quad (142)$$

This contradicts the fact that $L_1$ and $L_2$ are distinct.

Theorem 8.6. The distance between two parallel lines is
$$D = \frac{\|\vec{P_1P_2} \times \vec d\|}{\|\vec d\|}, \quad (143)$$
where $P_1$ is on the first line, $P_2$ is on the second line, and $\vec d$ is the direction vector of one of the lines. Furthermore, $D$ can be expressed as
$$D = \|\operatorname{orth}_{\vec d}\vec{P_1P_2}\| \quad (144)$$

§8.4 Planes in $\mathbb{R}^3$

Given an initial point $P(x_0, y_0, z_0)$ on a plane (with $\vec r_0 = \vec{OP}$) and a normal vector $\vec n = \langle a, b, c\rangle$, any other vector $\vec r = \langle x, y, z\rangle$ which starts at the origin and ends at a point on the plane satisfies the equation $\vec n \cdot (\vec r - \vec r_0) = 0$.

[Figure: the plane with normal $\vec n$, position vectors $\vec r_0$ and $\vec r$ from the origin $O$, and the difference $\vec r - \vec r_0$ lying in the plane.]


Note that $\vec n$ is not unique; we can describe a plane in $\mathbb{R}^3$ with infinitely many $\vec n$'s. In $\mathbb{R}^n$, $\vec n \cdot (\vec r - \vec r_0) = 0$ defines an $(n-1)$-dimensional hyperplane. Substituting $\vec n = \langle a, b, c\rangle \neq \vec 0$, $\vec r = \langle x, y, z\rangle$ and $\vec r_0 = \langle x_0, y_0, z_0\rangle$ into the normal vector equation gives
$$\langle a, b, c\rangle \cdot \langle x - x_0,\ y - y_0,\ z - z_0\rangle = 0 \quad (145)$$
$$a(x - x_0) + b(y - y_0) + c(z - z_0) = 0 \quad (146)$$
$$ax + by + cz = ax_0 + by_0 + cz_0 = -d, \quad (147)$$
which is the scalar form. On the other hand, a different formalism can be used to describe planes: given an initial vector $\vec r_0$ and two vectors on the plane, say $\vec u, \vec v$ (where $\vec u, \vec v \neq \vec 0$ and $\vec u \neq k\vec v$), any vector ending on the plane (say $\vec r$ from standard position) can be written as
$$\vec r = \vec r_0 + s\vec u + t\vec v, \quad s, t \in \mathbb{R}, \quad (148)$$
where $\vec n = \vec u \times \vec v$. Note that the parametric form (with 2 parameters) always defines a 2-dimensional plane in $\mathbb{R}^n$, provided $\vec u, \vec v \neq \vec 0$ and $\vec u \neq k\vec v$.

Example 8.7. Consider the initial vector $\vec r_0 = \langle 2, 1, -1\rangle$ and $\vec n = \langle 1, 1, 2\rangle$. The equation of the plane in scalar form is
$$\vec n \cdot (\vec r - \vec r_0) = 0 \quad (149)$$
$$\langle 1, 1, 2\rangle \cdot \langle x - 2,\ y - 1,\ z + 1\rangle = 0 \quad (150)$$
$$x + y + 2z = 1 \quad (151)$$

Moreover, we can write the solutions to the scalar form equation of the plane in parametric form:
$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 - s - 2t \\ s \\ t \end{pmatrix} \quad (152)$$
$$= \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + s\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + t\begin{pmatrix} -2 \\ 0 \\ 1 \end{pmatrix}, \quad s, t \in \mathbb{R}, \quad (153)$$
where $y = s$, $z = t$.

Theorem 8.8. The shortest distance $D$ between a point $Q(x, y, z)$ and a plane described by $ax + by + cz + d = 0$ is
$$D = \frac{|ax + by + cz + d|}{\sqrt{a^2 + b^2 + c^2}}, \quad (154)$$
assuming $Q$ is not on the plane.

Proof. Consider a point $P(x_0, y_0, z_0)$ on the plane and let $\vec v = \vec{PQ}$. [Figure: $Q(x, y, z)$ off the plane, $P(x_0, y_0, z_0)$ on the plane, the vector $\vec v$, its projection $\operatorname{proj}_{\vec n}\vec v$, and the normal $\vec n$.] We have
$$D = \|\operatorname{proj}_{\vec n}\vec v\| \quad (155)$$
$$= \left\|\frac{\vec n \cdot \vec v}{\|\vec n\|^2}\,\vec n\right\| \quad (156)$$
$$= \frac{|\vec n \cdot \vec v|}{\|\vec n\|} \quad (157)$$
$$= \frac{|a(x - x_0) + b(y - y_0) + c(z - z_0)|}{\sqrt{a^2 + b^2 + c^2}} \quad (158)$$
$$= \frac{|ax + by + cz - ax_0 - by_0 - cz_0|}{\sqrt{a^2 + b^2 + c^2}} \quad (159)$$
$$= \frac{|ax + by + cz + d|}{\sqrt{a^2 + b^2 + c^2}} \quad (160)$$
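Theorem 8.8 is one line of code. A sketch using the plane of Example 8.7 ($x + y + 2z = 1$, i.e. $a, b, c, d = 1, 1, 2, -1$); the origin as a second test point is an arbitrary choice:

```python
import math

def point_plane_distance(q, a, b, c, d):
    # D = |a x + b y + c z + d| / sqrt(a^2 + b^2 + c^2)
    return abs(a*q[0] + b*q[1] + c*q[2] + d) / math.sqrt(a*a + b*b + c*c)

print(point_plane_distance([2.0, 1.0, -1.0], 1, 1, 2, -1))  # 0.0: r0 lies on the plane
print(point_plane_distance([0.0, 0.0, 0.0], 1, 1, 2, -1))   # 1/sqrt(6), about 0.408
```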

§8.4.1 Parallel planes

In $\mathbb{R}^3$, if $\pi_1$ and $\pi_2$ are parallel and distinct, then they never intersect: $\pi_1 \parallel \pi_2$ (with $\pi_1 \neq \pi_2$) $\Leftrightarrow$ $\pi_1, \pi_2$ never intersect.

Theorem 8.9. In $\mathbb{R}^3$, if two planes $\pi_1, \pi_2$ are not parallel, then they intersect to form a line.

Exercise 8.10. If two planes are not parallel, they intersect to form a line. Assuming
$$\pi_1 : a_1x + b_1y + c_1z + d_1 = 0 \quad (161)$$
$$\pi_2 : a_2x + b_2y + c_2z + d_2 = 0, \quad (162)$$
find a formula for the line formed by the intersection of these planes. The line must have a direction common to both planes: $\vec d = \vec n_1 \times \vec n_2$. The rest (finding a point on both planes) is trivial.


Example 8.11. Find the intersection of $\pi_1$ and $\pi_2$:
$$\pi_1 : 2x - y + z = 3 \quad (163)$$
$$\pi_2 : x + 2y + 3z = 0 \quad (164)$$

We have the augmented matrix $A$ which, after row reduction, yields a matrix $R$ in RREF:
$$A = \begin{pmatrix} 1 & 2 & 3 & 0 \\ 2 & -1 & 1 & 3 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 1 & 6/5 \\ 0 & 1 & 1 & -3/5 \end{pmatrix} = R \quad (165)$$
Hence,
$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 6/5 - t \\ -3/5 - t \\ t \end{pmatrix} = \begin{pmatrix} 6/5 \\ -3/5 \\ 0 \end{pmatrix} + t\begin{pmatrix} -1 \\ -1 \\ 1 \end{pmatrix}, \quad (166)$$
where $z = t \in \mathbb{R}$. However, a faster solution would have utilized the cross product of the normal vectors:
$$\vec n_1 \times \vec n_2 = \langle 2, -1, 1\rangle \times \langle 1, 2, 3\rangle \quad (167)$$
$$= -5\hat\imath - 5\hat\jmath + 5\hat k \quad (168)$$
$$= -5\langle 1, 1, -1\rangle \quad (169)$$
Hence, $\vec d = \langle 1, 1, -1\rangle$. We would simply need a point on the line to be able to describe it mathematically. For simplicity, we will look for a point with $z = 0$:
$$2x - y = 3 \quad (170)$$
$$x + 2y = 0 \quad (171)$$
$$\Rightarrow 2(-2y) - y = 3 \quad (172)$$
$$-4y - y = 3 \quad (173)$$
$$y = -3/5 \quad (174)$$
$$\Rightarrow x = 6/5 \quad (175)$$

Hence, we find an equivalent equation. Note that the equations given by the two methods do not always match, because we could use any point on the line as $\vec r_0$ and any scalar multiple of the direction vector. Both, however, will be correct and describe the same line (if your work is correct).
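The faster method of Example 8.11 (cross the normals for a direction, then find one point with $z = 0$ by substitution) can be sketched as:

```python
def cross(u, v):
    return [u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0]]

n1, n2 = [2.0, -1.0, 1.0], [1.0, 2.0, 3.0]
d = cross(n1, n2)
print(d)  # [-5.0, -5.0, 5.0], a multiple of <1, 1, -1>

# point with z = 0: solve 2x - y = 3 and x + 2y = 0 (substitute x = -2y)
y = 3.0 / (-5.0)
x = -2.0 * y
print([x, y, 0.0])  # [1.2, -0.6, 0.0], i.e. (6/5, -3/5, 0)
```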

§9 Span and Linear Combination, Dependence and Independence (Barsheshat)

Definition 9.1. A vector set is defined as $\mathcal{V} = \{\vec v_1, \vec v_2, \cdots, \vec v_n\} \subseteq \mathbb{R}^m$.

Definition 9.2. A linear combination of vectors $\vec v_1, \cdots, \vec v_n \in \mathbb{R}^m$ is simply a vector sum of arbitrary scalar multiples of these vectors. Note that $n$ is not the dimension of the vector space. A vector $\vec v$ is a linear combination of $\vec v_1, \cdots, \vec v_n$ if
$$\vec v = a_1\vec v_1 + \cdots + a_n\vec v_n = \sum_{i=1}^n a_i\vec v_i \quad (176)$$

Definition 9.3. The linear span of a set of vectors $\{\vec v_1, \cdots, \vec v_n\} \subseteq \mathbb{R}^m$ is denoted by
$$\operatorname{span}(\{\vec v_1, \cdots, \vec v_n\}) = \left\{\sum_{i=1}^n a_i\vec v_i : a_i \in \mathbb{R}\right\} \quad (177)$$
The properties of the linear span of a set of vectors are:
• $\vec 0 \in \operatorname{span}(\{\vec v_1, \cdots, \vec v_n\})$; in particular, $\operatorname{span}(\{\vec 0\}) = \{\vec 0\}$
• $\vec v_1, \cdots, \vec v_n \in \operatorname{span}(\{\vec v_1, \cdots, \vec v_n\})$
• closure under scalar multiplication: if $\vec v \in \operatorname{span}(\{\vec v_1, \cdots, \vec v_n\})$, then $a\vec v \in \operatorname{span}(\{\vec v_1, \cdots, \vec v_n\})$ (for any scalar $a$)
• closure under addition: if $\vec u, \vec v \in \operatorname{span}(\{\vec v_1, \cdots, \vec v_n\})$, then $\vec u + \vec v \in \operatorname{span}(\{\vec v_1, \cdots, \vec v_n\})$

Example 9.4. If $\vec e_1 = \hat\imath$, $\vec e_2 = \hat\jmath$, $\vec e_3 = \hat k$, then $\operatorname{span}(\vec e_1, \vec e_2, \vec e_3)$ corresponds to what space? $\operatorname{span}(\vec e_1, \vec e_2, \vec e_3) = \mathbb{R}^3$, because any vector $\vec v = \langle a, b, c\rangle \in \mathbb{R}^3$ can be written as $\vec v = a\vec e_1 + b\vec e_2 + c\vec e_3$.

Exercise 9.5. Let $\vec v_1 = \langle 1, 2\rangle$, $\vec v_2 = \langle -1, 3\rangle$, $\vec v_3 = \langle 1, 1\rangle$. Find:
1. $\operatorname{span}(\{\vec v_1, \vec v_2\})$
2. $\operatorname{span}(\{\vec v_1, \vec v_2, \vec v_3\})$

For the first exercise, let $\vec v = \langle a, b\rangle \in \mathbb{R}^2$. We want to show that $\vec v = x\vec v_1 + y\vec v_2$, for some $x, y \in \mathbb{R}$. Thus
$$\begin{pmatrix} a \\ b \end{pmatrix} = x\begin{pmatrix} 1 \\ 2 \end{pmatrix} + y\begin{pmatrix} -1 \\ 3 \end{pmatrix} \ \Rightarrow\ A = \begin{pmatrix} 1 & -1 & a \\ 2 & 3 & b \end{pmatrix} \quad (178)$$

If this system has a solution for every $\langle a, b\rangle$, then $\operatorname{span}(\{\vec v_1, \vec v_2\}) = \mathbb{R}^2$. We have
$$A = \begin{pmatrix} 1 & -1 & a \\ 2 & 3 & b \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & \frac{3a+b}{5} \\ 0 & 1 & \frac{-2a+b}{5} \end{pmatrix} = R \quad (179)$$


Hence, by setting $x = \frac{3a+b}{5}$ and $y = \frac{-2a+b}{5}$, we obtain
$$\vec v = \langle a, b\rangle = x\vec v_1 + y\vec v_2 \quad (180)$$

As a result, $\operatorname{span}(\{\vec v_1, \vec v_2\}) = \mathbb{R}^2$.

For the second exercise, we have $\operatorname{span}(\{\vec v_1, \vec v_2, \vec v_3\}) = \mathbb{R}^2$, because it has to include $\operatorname{span}(\{\vec v_1, \vec v_2\}) = \mathbb{R}^2$: $\operatorname{span}(\{\vec v_1, \vec v_2\}) \subseteq \operatorname{span}(\{\vec v_1, \vec v_2, \vec v_3\})$. In other words, $\vec v_3$ did not add any extra information.

Definition 9.6. A set of vectors $\{\vec v_1, \cdots, \vec v_n\}$ is linearly independent if the homogeneous system defined by $\sum_{i=1}^n a_i\vec v_i = \vec 0$ (where the scalars are the variables of the system) has only the trivial solution (i.e., all $a_i = 0$). If there is any non-trivial solution, the set of vectors is linearly dependent.

Exercise 9.7. Show that $\vec v_1 = \langle 1, 2\rangle$, $\vec v_2 = \langle -1, 3\rangle$, $\vec v_3 = \langle 1, 1\rangle$ are linearly dependent. We have
$$a_1\vec v_1 + a_2\vec v_2 + a_3\vec v_3 = \vec 0 \quad (181)$$
$$a_1\langle 1, 2\rangle + a_2\langle -1, 3\rangle + a_3\langle 1, 1\rangle = \vec 0. \quad (182)$$

Hence,
$$a_1 - a_2 + a_3 = 0, \qquad 2a_1 + 3a_2 + a_3 = 0 \quad (183)$$
This is an underdetermined homogeneous system (more unknowns than equations). Hence, it must have infinitely many solutions. As a result, $\vec v_1 = \langle 1, 2\rangle$, $\vec v_2 = \langle -1, 3\rangle$, $\vec v_3 = \langle 1, 1\rangle$ are linearly dependent.

We could have also used the result from the previous example:
$$\vec v = \langle a, b\rangle = \frac{3a+b}{5}\vec v_1 + \frac{-2a+b}{5}\vec v_2. \quad (184)$$
Setting $a = b = 1$, we have
$$\vec v_3 = \langle 1, 1\rangle = \frac{4}{5}\vec v_1 - \frac{1}{5}\vec v_2 \ \Rightarrow\ \frac{4}{5}\vec v_1 - \frac{1}{5}\vec v_2 - \vec v_3 = \vec 0. \quad (185)$$
This is a non-trivial solution.
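Checking dependence mechanically amounts to row-reducing the coefficient matrix whose columns are the vectors and comparing the number of pivots to the number of vectors. A small Gaussian-elimination sketch (a numeric check, not the hand method used above):

```python
def rank(rows):
    # Row-reduce a copy of the matrix and count the pivot rows.
    m = [row[:] for row in rows]
    r = 0
    for col in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if abs(m[i][col]) > 1e-12), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and abs(m[i][col]) > 1e-12:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

# columns v1 = <1,2>, v2 = <-1,3>, v3 = <1,1> from Exercise 9.7
A = [[1.0, -1.0, 1.0],
     [2.0, 3.0, 1.0]]
print(rank(A))  # 2, which is less than 3 vectors: the set is dependent
```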

§10 Span and Bases (Barsheshat)

§10.1 More on span and linear combination, dependence and independence

Proposition 10.1. Let $\mathcal{V} = \{\vec v_1, \cdots, \vec v_n\}$. The following two statements are then equivalent:

(a) $\mathcal{V}$ is linearly independent;

(b) no vector in $\mathcal{V}$ can be expressed as a linear combination of the other vectors in the set.

Proof. We will prove ¬a ⇒ ¬b and ¬b ⇒ ¬a.

We begin by showing $\neg a \Rightarrow \neg b$. If $\mathcal{V}$ is linearly dependent, then there is a non-trivial solution to
$$a_1\vec v_1 + \cdots + a_n\vec v_n = \vec 0. \quad (186)$$

Without loss of generality, assume $a_1 \neq 0$; then we have
$$a_1\vec v_1 = -a_2\vec v_2 - \cdots - a_n\vec v_n \quad (187)$$
$$\vec v_1 = \left(\frac{-a_2}{a_1}\right)\vec v_2 + \left(\frac{-a_3}{a_1}\right)\vec v_3 + \cdots + \left(\frac{-a_n}{a_1}\right)\vec v_n. \quad (188)$$
(b) is therefore false (i.e., $\neg b$ is true).

Now, we show $\neg b \Rightarrow \neg a$. If some vector, say $\vec v_1$, can be expressed as a linear combination of the other vectors, then
$$\vec v_1 = b_2\vec v_2 + \cdots + b_n\vec v_n, \quad (189)$$
for some $b_2, \cdots, b_n \in \mathbb{R}$. We can rearrange the previous equation as follows:
$$\vec v_1 - b_2\vec v_2 - \cdots - b_n\vec v_n = \vec 0. \quad (190)$$

Setting $a_1 = 1$ and $a_i = -b_i$ for $i \geq 2$, we find a non-trivial solution to the independence equation. Thus, $\mathcal{V}$ is not linearly independent (i.e., $\neg a$ is true).

Before continuing, here are some additional properties of span:
• $\vec 0 \in \operatorname{span}(\mathcal{V})$
• $\vec v_i \in \operatorname{span}(\mathcal{V})$, for $i = 1, \cdots, n$
• closure under addition/scalar multiplication: if $\vec u, \vec v \in \operatorname{span}(\mathcal{V})$ and $a, b \in \mathbb{R}$, then $(a\vec u + b\vec v) \in \operatorname{span}(\mathcal{V})$
• if $\mathcal{V} \subseteq \mathcal{U}$, then $\operatorname{span}(\mathcal{V}) \subseteq \operatorname{span}(\mathcal{U})$

The following are properties of linear independence:
• if $\vec 0 \in \mathcal{V}$, then $\mathcal{V}$ is not independent
• any singleton set is independent if its constituent vector is non-zero
• for any set of two vectors, saying it's independent is equivalent to saying that the vectors are not scalar multiples of each other
• if $\mathcal{V} \subseteq \mathcal{U}$, and $\mathcal{U}$ is linearly independent, then so is $\mathcal{V}$


Proposition 10.2. In $\mathbb{R}^m$, the maximum size of a set of independent vectors is $m$.

Proof. Consider the system $\sum_{i=1}^n a_i\vec v_i = \vec 0$. This corresponds to a coefficient matrix
$$C = [\vec v_1 \cdots \vec v_n] \quad (191)$$

$C$ has $m$ rows and $n$ columns. Hence, from our knowledge of row reduction, if $n > m$, then the homogeneous system has infinitely many solutions, which makes the vector set dependent. As a result, for a vector set to be independent, $n \leq m$.

Example 10.3. If $\{\vec u, \vec v, \vec w\}$ is independent, show that $\{\vec u + \vec v,\ \vec u + \vec w,\ \vec v + \vec w\}$ is also independent. We begin by setting up the system with the vectors in the second set. Assume
$$a(\vec u + \vec v) + b(\vec u + \vec w) + c(\vec v + \vec w) = \vec 0 \quad (192)$$
$$(a + b)\vec u + (a + c)\vec v + (b + c)\vec w = \vec 0 \quad (193)$$

Because of the independence of $\vec u, \vec v, \vec w$, we can conclude that $a + b = 0$, $a + c = 0$, $b + c = 0$. We have a homogeneous system. You can check that the only solution to this system is the trivial one ($a = b = c = 0$). This thereby shows that $\{\vec u + \vec v,\ \vec u + \vec w,\ \vec v + \vec w\}$ is independent.

§10.2 Basis for $\mathbb{R}^m$

Definition 10.4. A set of vectors $\mathcal{V} = \{\vec v_1, \cdots, \vec v_n\}$ is called a basis for $\mathbb{R}^m$ if any vector in $\mathbb{R}^m$ can be expressed in exactly one way as a linear combination of vectors in $\mathcal{V}$; in particular, $\operatorname{span}(\mathcal{V}) = \mathbb{R}^m$ ($\mathcal{V}$ is a spanning set for $\mathbb{R}^m$). This is equivalent to:
$$\underbrace{\operatorname{span}(\mathcal{V}) = \mathbb{R}^m}_{(n \geq m)} \quad \text{and} \quad \underbrace{\mathcal{V} \text{ is linearly independent}}_{(n \leq m)} \quad (194)$$

Notice the following fact: if $\mathcal{V}$ is a basis for $\mathbb{R}^m$, then $|\mathcal{V}| = m$ (i.e., $\mathcal{V}$ has $m$ elements).

Example 10.5. $\{\hat\imath, \hat\jmath, \hat k\}$ is called the standard basis for $\mathbb{R}^3$. More generally, if we define $\vec e_i \in \mathbb{R}^m$ to be the vector with all zeros except for a 1 in the $i$'th coordinate, then $\{\vec e_1, \cdots, \vec e_m\}$ is called the standard (orthonormal) basis for $\mathbb{R}^m$.

Exercise 10.6. Consider vectors $\vec v_1 = \langle 2, 0, 1\rangle$, $\vec v_2 = \langle -1, -1, 0\rangle$, $\vec v_3 = \langle 1, -1, 0\rangle$:
1. express $\hat\imath, \hat\jmath, \hat k$ in terms of $\vec v_1, \vec v_2, \vec v_3$.
2. use the above and the work shown to argue that $\vec v_1, \vec v_2, \vec v_3$ form a basis.


We can easily see that:
$$\hat\imath = \frac{1}{2}(\vec v_3 - \vec v_2) \quad (195)$$
$$\hat\jmath = -\frac{1}{2}(\vec v_2 + \vec v_3) \quad (196)$$
$$\hat k = \vec v_1 + \vec v_2 - \vec v_3 \quad (197)$$

However, if it were not as obvious, we could have used row reduction. Consider the system
$$x\vec v_1 + y\vec v_2 + z\vec v_3 = \langle a, b, c\rangle. \quad (198)$$
This system translates to the following augmented matrix:
$$A = \begin{pmatrix} 2 & -1 & 1 & a \\ 0 & -1 & -1 & b \\ 1 & 0 & 0 & c \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 0 & c \\ 0 & 1 & 0 & \frac{-a-b+2c}{2} \\ 0 & 0 & 1 & \frac{a-b-2c}{2} \end{pmatrix} = R \quad (199)$$
The system thereby has a unique solution. To determine the coefficients we need for $\hat\imath$, we set $a = 1$ and $b = c = 0$. This yields:
$$\langle a, b, c\rangle = c\vec v_1 + \frac{-a-b+2c}{2}\vec v_2 + \frac{a-b-2c}{2}\vec v_3 \quad (200)$$
$$\hat\imath = \langle 1, 0, 0\rangle = \frac{-1-0+0}{2}\vec v_2 + \frac{1-0-0}{2}\vec v_3 \quad (201)$$
$$\hat\imath = \frac{1}{2}(\vec v_3 - \vec v_2) \quad (202)$$

Similarly for $\hat\jmath$ and $\hat k$. Furthermore, we have shown that there is exactly one way of expressing any vector $\vec v = \langle a, b, c\rangle \in \mathbb{R}^3$ as a linear combination of $\vec v_1, \vec v_2, \vec v_3$. Thus $\{\vec v_1, \vec v_2, \vec v_3\}$ is a basis for $\mathbb{R}^3$.

§11 Subspaces and Span (Barsheshat)

The fact that $|\mathcal{V}| = m$ when $\mathcal{V}$ is a basis of $\mathbb{R}^m$ is a consequence of the following two propositions:

Proposition 11.1. If $\operatorname{span}(\mathcal{V}) = \mathbb{R}^m$, then $|\mathcal{V}| \geq m$.

Proof. INSERT PROOF

Proposition 11.2. If $\mathcal{V} \subseteq \mathbb{R}^m$ is linearly independent, then $|\mathcal{V}| \leq m$.

Proof. INSERT PROOF

Based on the two previous propositions one can say that a basis is both a minimal spanning set and a maximal independent set. The standard basis for $\mathbb{R}^m$ is $\{\vec e_i\}$, where $\vec e_i$ has 0's in every component except for the $i$'th component, which is 1.

Exercise 11.3. Describe the span of the following three vectors: $\vec v_1 = \langle 1, 0, 2\rangle$, $\vec v_2 = \langle 2, 1, -1\rangle$, $\vec v_3 = \langle 0, 1, -5\rangle$:
1. geometrically (what object is this?)
2. as a span of two vectors

Geometrically, it's a plane with normal $\vec v_1 \times \vec v_2$ which goes through the origin. The span of the three vectors is the set of vectors $\vec v = \langle x, y, z\rangle = a\vec v_1 + b\vec v_2 + c\vec v_3$. Treating $a, b, c$ as the variables of the system, we obtain the following augmented matrix:
$$A = \begin{pmatrix} 1 & 2 & 0 & x \\ 0 & 1 & 1 & y \\ 2 & -1 & -5 & z \end{pmatrix} \sim \begin{pmatrix} 1 & 2 & 0 & x \\ 0 & 1 & 1 & y \\ 0 & 0 & 0 & -2x + 5y + z \end{pmatrix} \quad (203)$$

Hence, for this system to be consistent, we must have $-2x + 5y + z = 0$. This yields the equation of a plane (which goes through the origin) with normal $\vec n = \langle -2, 5, 1\rangle$ (this answers part 1). For part 2 of the exercise, we find $\vec v_2 - 2\vec v_1 = \vec v_3$. As a result, $\vec v_3 \in \operatorname{span}(\{\vec v_1, \vec v_2\})$ and thus $\operatorname{span}(\{\vec v_1, \vec v_2\}) = \operatorname{span}(\{\vec v_1, \vec v_2, \vec v_3\})$.

Definition 11.4. A set $\mathcal{V}$ is called a subspace of $\mathbb{R}^m$ if $\mathcal{V}$ is a non-empty set closed under addition and scalar multiplication.² Properties of subspaces:
• $\vec 0 \in \mathcal{V}$ for any subspace $\mathcal{V}$, since $\mathcal{V}$ is not empty: $\vec v \in \mathcal{V} \Rightarrow 0\vec v = \vec 0 \in \mathcal{V}$.

Moreover, any subspace $\mathcal{V} \subseteq \mathbb{R}^m$ can be written as a span of $n$ linearly independent vectors with $n \leq m$.

²Note the following:
• closure under addition: if $\vec u, \vec v \in \mathcal{V}$, then $\vec u + \vec v \in \mathcal{V}$.
• closure under scalar multiplication: if $\vec v \in \mathcal{V}$ and $t \in \mathbb{R}$, then $t\vec v \in \mathcal{V}$.
• $\mathcal{V}$ is not empty.


• Any span of any collection of vectors in $\mathbb{R}^m$ is a subspace of $\mathbb{R}^m$.
• Any subspace $\mathcal{V} \subseteq \mathbb{R}^m$ can be described as a span of $n$ vectors with $n \leq m$ (tricky to prove, yet important to understand).
• $\{\vec 0\}$ is called the trivial subspace.
• $\mathbb{R}^m$ is a subspace of itself.

Theorem 11.5. For any set of vectors $\mathcal{V} = \{\vec v_1, \cdots, \vec v_n\}$ in a vector space $V$ (which will be $\mathbb{R}^n$ for now), $\operatorname{span}(\mathcal{V})$ is a subspace of $\mathbb{R}^n$.

Proof. Let $\vec u, \vec w \in \operatorname{span}(\mathcal{V})$ and $k \in \mathbb{R}$. Then, there exist $c_1, \cdots, c_n, a_1, \cdots, a_n \in \mathbb{R}$ such that
$$\vec u = c_1\vec v_1 + \cdots + c_n\vec v_n \quad (204)$$
$$\vec w = a_1\vec v_1 + \cdots + a_n\vec v_n \quad (205)$$

Notice that
$$\vec u + \vec w = (c_1 + a_1)\vec v_1 + \cdots + (c_n + a_n)\vec v_n. \quad (206)$$

Hence, the span is closed under addition. Moreover,
$$k\vec u = (kc_1)\vec v_1 + \cdots + (kc_n)\vec v_n. \quad (207)$$

As a result, the span is closed under scalar multiplication. Finally, the span is not empty, as it contains the zero vector, which is obtained by setting all the coefficients in the linear combination to 0. Since the span is a non-empty set closed under addition and scalar multiplication, it is a subspace of $\mathbb{R}^n$.

Exercise 11.6. Are the following subsets subspaces? Justify carefully.

1. $\{\vec v \in \mathbb{R}^m : \vec u \cdot \vec v = 0\}$ for some vector $\vec u \in \mathbb{R}^m$
2. $\{\langle a, b, c\rangle \in \mathbb{R}^3 : b = a + c + 1\}$

For part 1, we show that $\mathcal{V}$ is closed under addition and scalar multiplication. If $\vec v \in \mathcal{V}$ and $\vec w \in \mathcal{V}$, then $\vec v \cdot \vec u = 0$ and $\vec w \cdot \vec u = 0$. Hence, $(\vec v + \vec w) \cdot \vec u = 0$, so $\mathcal{V}$ is closed under addition. As for scalar multiplication: if $\vec v \in \mathcal{V}$, then $\vec u \cdot \vec v = 0$, so $\vec u \cdot (t\vec v) = t(\vec u \cdot \vec v) = 0$. Hence, $t\vec v \in \mathcal{V}$.

For part 2, we see that the zero vector is not in $\mathcal{V}$; hence it is not a subspace. We could also have shown that $\{\langle a, b, c\rangle \in \mathbb{R}^3 : b = a + c + 1\}$ is not closed under addition. Choose $\vec v = \langle 0, 1, 0\rangle$, $\vec u = \langle 1, 2, 0\rangle$. Then $\vec u + \vec v = \langle 1, 3, 0\rangle \notin \{\langle a, b, c\rangle \in \mathbb{R}^3 : b = a + c + 1\}$. We could likewise show that it is also not closed under scalar multiplication. You can try this on your own.

§12 Basis and Dimension for a Subspace (Barsheshat)

§12.1 Basis for a subspace

Given a subspace $\mathcal{V} \subseteq \mathbb{R}^m$, we say that $\mathcal{B} \subseteq \mathcal{V}$ is a basis for $\mathcal{V}$ if
1. $\operatorname{span}(\mathcal{B}) = \mathcal{V}$
2. $\mathcal{B}$ is linearly independent

The following are properties of bases:
• If $\mathcal{B}$ is a basis for $\mathcal{V} \subseteq \mathbb{R}^m$, then $|\mathcal{B}| \leq m$, where $|\mathcal{B}|$ is the cardinality of $\mathcal{B}$.
• If $\mathcal{B}$ and $\mathcal{B}'$ are both bases for $\mathcal{V} \subseteq \mathbb{R}^m$, then $|\mathcal{B}| = |\mathcal{B}'|$.
• Every subspace has a basis (very hard to prove).

Exercise 12.1. What is the basis of $\mathcal{V} = \{\vec 0\} \subseteq \mathbb{R}^m$? We have $\mathcal{B} = \{\}$, which is the empty set.

§12.2 Dimension of a subspace

Definition 12.2. Given a vector subspace $\mathcal{V} \subseteq \mathbb{R}^m$, the dimension of $\mathcal{V}$, $\dim(\mathcal{V})$, is defined to be the number of elements in any basis of $\mathcal{V}$.
• $\{\vec 0\}$ has dimension 0
• $\operatorname{span}(\{\vec v\})$ has dimension 1 for any $\vec v \neq \vec 0$
• $\operatorname{span}(\{\vec v_1, \cdots, \vec v_n\})$ has dimension $\leq n$
• $\mathbb{R}^m$ has dimension $m$

Exercise 12.3. Determine (using the definition) whether each subset is a subspace. If possible, state the dimension and give a basis.

1. $\mathcal{V} = \{\langle a, 0, 0\rangle \in \mathbb{R}^3 : a \in \mathbb{R}\}$
2. $\mathcal{V} = \{\langle a, 1, 1\rangle \in \mathbb{R}^3 : a \in \mathbb{R}\}$
3. $\mathcal{V} = \{\langle a, b, c\rangle \in \mathbb{R}^3 : b = a - c\}$

1. Yes. $\mathcal{V}$ is closed under addition and scalar multiplication. A basis for $\mathcal{V}$ is $\mathcal{B} = \{\langle 1, 0, 0\rangle\}$, so the dimension is 1.

2. No, since when adding $\vec u = \langle 1, 1, 1\rangle$ and $\vec v = \langle 2, 1, 1\rangle$ (both in $\mathcal{V}$) we get $\vec u + \vec v = \langle 3, 2, 2\rangle \notin \mathcal{V}$ ($\mathcal{V}$ is not closed under addition).

3. Yes. Note the closure under addition: $\vec u = \langle a_1, b_1, c_1\rangle$, $\vec v = \langle a_2, b_2, c_2\rangle$ $\Rightarrow$ $\vec u + \vec v = \langle a_1 + a_2,\ b_1 + b_2,\ c_1 + c_2\rangle$ with $b_1 + b_2 = (a_1 - c_1) + (a_2 - c_2) = (a_1 + a_2) - (c_1 + c_2)$, so $\vec u + \vec v \in \mathcal{V}$. Also, note closure under scalar multiplication: if $\vec v \in \mathcal{V}$, then $\vec v = \langle a, b, c\rangle$ where $b = a - c$, and $t\vec v = \langle ta, tb, tc\rangle$ with $tb = t(a - c) = ta - tc$. Thus, $t\vec v \in \mathcal{V}$. Furthermore, we have
$$\mathcal{B} = \{\langle 1, 1, 0\rangle, \langle 0, -1, 1\rangle\} \Rightarrow \dim(\mathcal{V}) = |\mathcal{B}| = 2. \quad (208)$$

Let’s prove that B is a basis for V. We need to show that

36 Tristan Martin (Fall 2017) 201-NYC-05-E (Enriched Linear Algebra I) Lecture Notes

a) $\mathcal{B}$ is independent
b) $\operatorname{span}(\mathcal{B}) = \mathcal{V}$

For a), we want to show that $a\langle 1, 1, 0\rangle + b\langle 0, -1, 1\rangle = \vec 0$ implies $a = b = 0$. The previous equation is equivalent to $a\vec v_1 = -b\vec v_2$, i.e. $\vec v_1 = \left(\frac{-b}{a}\right)\vec v_2$ if $a \neq 0$, which is impossible since $\vec v_2$ is not a scalar multiple of $\vec v_1$. Hence $a = 0$, and then $b\vec v_2 = \vec 0$ forces $b = 0$, so $\mathcal{B}$ is independent. Next, we show $\operatorname{span}(\mathcal{B}) = \mathcal{V}$. We have $\operatorname{span}(\mathcal{B}) = \{s\vec v_1 + t\vec v_2 : s, t \in \mathbb{R}\}$ and we need to show it is the same as $\mathcal{V}$:
$$s\vec v_1 + t\vec v_2 = \langle x, y, z\rangle \quad (209)$$
$$\langle s, s, 0\rangle + \langle 0, -t, t\rangle = \langle x, y, z\rangle \quad (210)$$

We thereby have the system
$$x = s, \qquad y = s - t, \qquad z = t \quad \Rightarrow \quad y = x - z \quad (211)$$

Hence, $\langle x, y, z\rangle = \langle x, x - z, z\rangle$, which is exactly the description of $\mathcal{V}$.

Exercise 12.4. Express $\langle 5, 7, -2\rangle$ as a linear combination of the vectors in $\mathcal{B}$. From our previous work,
$$\langle 5, 7, -2\rangle = s\vec v_1 + t\vec v_2 = 5\vec v_1 - 2\vec v_2 \quad (212)$$

§13 Exercises on Span (Barsheshat)

Exercise 13.1. Consider $\mathcal{V} = \operatorname{span}\left(\left\{\vec a = \langle 1, 0, 1, 0\rangle,\ \vec b = \langle 2, -1, 1, 1\rangle\right\}\right)$. $\mathcal{V}$ is a 2-dimensional space (i.e., a plane that passes through the origin). Find an orthonormal basis $\mathcal{B} = \{\vec v_1, \vec v_2\}$ for $\mathcal{V}$ such that
• $\|\vec v_1\| = \|\vec v_2\| = 1$ (normal)
• $\vec v_1 \cdot \vec v_2 = 0$ (orthogonal)
• $\vec v_1$ is in the direction of $\langle 1, 0, 1, 0\rangle$

We begin with $\vec v_1 = s\vec a = \langle s, 0, s, 0\rangle$, which gives $s^2 + s^2 = 1$ because $\|\vec v_1\| = 1$. Hence $s = \pm\sqrt{2}/2$. Take
$$\vec v_1 = \left\langle \frac{\pm\sqrt 2}{2},\ 0,\ \frac{\pm\sqrt 2}{2},\ 0 \right\rangle \quad (213)$$
For $\vec v_2$: $\vec v_1 \cdot \vec v_2 = 0$, $\|\vec v_2\| = 1$ and $\vec v_2 \in \mathcal{V}$. Note that $\|\vec v_2\| = 1$ is a non-linear equation. Let $\vec v_2 = \langle x, y, z, w\rangle$. We have:
$$\vec v_1 \cdot \vec v_2 = 0 \quad (214)$$
$$\frac{\sqrt 2}{2}x + \frac{\sqrt 2}{2}z = 0 \quad (215)$$
$$x + z = 0 \quad (216)$$

Furthermore, we have:
$$\vec v_2 \in \mathcal{V} \quad (217)$$
$$\vec v_2 = s\vec a + t\vec b \quad (218)$$
$$\langle x, y, z, w\rangle = \langle s, 0, s, 0\rangle + \langle 2t, -t, t, t\rangle \quad (219)$$

The rest of the solution is pretty routine; you can try it yourself.

There's a second technique for finding $\vec v_2$, called Gram-Schmidt, which works as follows in $\mathbb{R}^4$. We have $\|\vec v_1\| = 1$. Consider $\operatorname{orth}_{\vec v_1}\vec b$:
$$\operatorname{orth}_{\vec v_1}\vec b = \vec b - \frac{\vec b \cdot \vec v_1}{\vec v_1 \cdot \vec v_1}\,\vec v_1 \quad (220)$$
$\operatorname{orth}_{\vec v_1}\vec b$ satisfies all conditions for $\vec v_2$ except for the unit length, so take
$$\vec v_2 = \frac{\operatorname{orth}_{\vec v_1}\vec b}{\left\|\operatorname{orth}_{\vec v_1}\vec b\right\|} \quad (221)$$
Now, we have
$$\mathcal{B} = \{\vec v_1, \vec v_2\} \quad (222)$$
with $\|\vec v_1\| = \|\vec v_2\| = 1$ and $\vec v_1 \cdot \vec v_2 = 0$, so $\mathcal{B}$ is independent. But does $\operatorname{span}(\mathcal{B}) = \mathcal{V}$? We know $\vec v_1, \vec v_2 \in \mathcal{V} \Rightarrow \operatorname{span}(\{\vec v_1, \vec v_2\}) \subseteq \mathcal{V} = \operatorname{span}\left(\left\{\vec a, \vec b\right\}\right)$. Finally, to show that $\operatorname{span}(\mathcal{B}) = \mathcal{V}$, you need to show that $\vec a \in \operatorname{span}(\{\vec v_1, \vec v_2\})$ and $\vec b \in \operatorname{span}(\{\vec v_1, \vec v_2\})$, so that $\mathcal{V} \subseteq \operatorname{span}(\{\vec v_1, \vec v_2\})$. The rest you can do.
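The Gram-Schmidt step, applied to the vectors $\vec a$ and $\vec b$ of Exercise 13.1, can be sketched in plain Python (vectors in $\mathbb{R}^4$ as lists):

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

a = [1.0, 0.0, 1.0, 0.0]
b = [2.0, -1.0, 1.0, 1.0]

v1 = normalize(a)                                  # (sqrt(2)/2) <1, 0, 1, 0>
c = dot(b, v1)
orth_b = [bi - c * v1i for bi, v1i in zip(b, v1)]  # b minus its projection on v1
v2 = normalize(orth_b)

print([round(x, 6) for x in v1])                   # [0.707107, 0.0, 0.707107, 0.0]
print(math.isclose(dot(v1, v2), 0.0, abs_tol=1e-12))  # True: orthogonal
print(math.isclose(dot(v2, v2), 1.0))                 # True: unit length
```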


Exercise 13.2. Consider $\mathcal{V} = \{\langle 1, 2, 0, 0\rangle, \langle 2, 0, 1, 1\rangle, \langle -1, -6, 1, 1\rangle, \langle 6, 8, 1, 1\rangle\}$. Let's write $\vec v_1, \cdots, \vec v_4$ as columns in a $4 \times 4$ matrix:
$$A = \begin{pmatrix} 1 & 2 & -1 & 6 \\ 2 & 0 & -6 & 8 \\ 0 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & -3 & 4 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} = R \quad (223)$$
We clearly see that $\vec c_3 = -3\vec c_1 + \vec c_2$ and $\vec c_4 = 4\vec c_1 + \vec c_2$. Hence, by a theorem which will be covered in the following classes,
$$\vec v_3 = -3\vec v_1 + \vec v_2, \qquad \vec v_4 = 4\vec v_1 + \vec v_2 \quad (224)$$
With these relations, we see that $\vec v_3$ and $\vec v_4$ are redundant; thus $\operatorname{span}(\mathcal{V}) = \operatorname{span}(\{\vec v_1, \vec v_2\})$.

§14 Subspaces of Matrices (Barsheshat)

If $A$ is an $m \times n$ matrix, then $A$ can be viewed as a collection of $m$ row vectors in $\mathbb{R}^n$ or $n$ column vectors in $\mathbb{R}^m$.

Definition 14.1. $\operatorname{row}(A) =$ the linear span of the row vectors, a subspace of $\mathbb{R}^n$.

Definition 14.2. $\operatorname{col}(A) =$ the linear span of the column vectors, a subspace of $\mathbb{R}^m$.

Moreover, note that
$$\operatorname{rank}(A) = \dim(\operatorname{row}(A)) = \dim(\operatorname{col}(A)) \quad (225)$$

Theorem 14.3. If $A$ and $R$ are two matrices such that $A$ can be transformed into $R$ through elementary row operations, then $\operatorname{row}(A) = \operatorname{row}(R)$.

Proof. INSERT PROOF

Theorem 14.4. If there exists a dependency relation between the column vectors of a matrix $A$, then carrying out row operations does not change this dependency relation: if
$$A = [\vec c_1 | \cdots | \vec c_n] \sim R = [\vec d_1 | \cdots | \vec d_n], \quad (226)$$
then
$$\sum_{i=1}^n a_i\vec c_i = \vec 0 \iff \sum_{i=1}^n a_i\vec d_i = \vec 0 \quad (227)$$

Proof. INSERT PROOF

Theorem 14.5 (Main theorem about bases, the basis theorem). If $S$ is a subspace of $\mathbb{R}^n$, then all bases of $S$ have the same size (i.e., the same number of vectors).

Proof. Suppose $\mathcal{B} = \{\vec v_1, \cdots, \vec v_r\}$ and $\mathcal{C} = \{\vec u_1, \cdots, \vec u_s\}$ are bases for $S$:
$$\operatorname{span}(\mathcal{B}) = S = \operatorname{span}(\mathcal{C}), \quad (228)$$

and $\mathcal{B}$ and $\mathcal{C}$ are independent. Without loss of generality, assume $r < s$. We will show that the assumption $r < s$ forces the set $\mathcal{C}$ to be dependent. Consider
$$\sum_{i=1}^s c_i\vec u_i = \vec 0 \quad (229)$$


Since $\mathcal{B}$ and $\mathcal{C}$ span the same subspace, we can represent all vectors in $\mathcal{C}$ as the following linear combinations of vectors in $\mathcal{B}$:
$$\vec u_1 = a_{11}\vec v_1 + a_{12}\vec v_2 + \cdots + a_{1r}\vec v_r \quad (230)$$
$$\vec u_2 = a_{21}\vec v_1 + a_{22}\vec v_2 + \cdots + a_{2r}\vec v_r \quad (231)$$
$$\vec u_3 = a_{31}\vec v_1 + a_{32}\vec v_2 + \cdots + a_{3r}\vec v_r \quad (232)$$
$$\vdots \quad (233)$$
$$\vec u_s = a_{s1}\vec v_1 + a_{s2}\vec v_2 + \cdots + a_{sr}\vec v_r \quad (234)$$

We can thereby plug these equations into the previous equation, giving:
$$c_1(a_{11}\vec v_1 + \cdots + a_{1r}\vec v_r) + \cdots + c_s(a_{s1}\vec v_1 + \cdots + a_{sr}\vec v_r) = \vec 0 \quad (235)$$
$$(c_1a_{11} + \cdots + c_sa_{s1})\vec v_1 + \cdots + (c_1a_{1r} + \cdots + c_sa_{sr})\vec v_r = \vec 0 \quad (236)$$

Since $\mathcal{B}$ is independent (it is a basis), the above coefficients are all 0:
$$c_1a_{11} + \cdots + c_sa_{s1} = 0 \quad (237)$$
$$\vdots \quad (238)$$
$$c_1a_{1r} + \cdots + c_sa_{sr} = 0 \quad (239)$$

The previous set of equations is a homogeneous system with $s$ variables and $r$ equations. Since $s > r$, there must be at least one non-trivial solution for $c_1, \cdots, c_s$. This means that $\mathcal{C}$ is dependent, which is a contradiction.

§15 Matrix Operations (Barsheshat)

§15.1 Matrix addition and scalar multiplication

For an $m \times n$ matrix $A = [a_{ij}]$, an $m \times n$ matrix $B = [b_{ij}]$ and $t \in \mathbb{R}$,
$$A + B = [a_{ij} + b_{ij}] \quad (240)$$
$$tA = [ta_{ij}] \quad (241)$$

Subtraction is a combination of addition and scalar multiplication by $-1$.

§15.2 Matrix multiplication

Suppose $A$ is a row matrix (i.e., $1 \times n$), $A = [a_1, a_2, \cdots, a_n]$, and $B$ is a column matrix (i.e., $n \times 1$),
$$B = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}; \quad (242)$$
then $AB$ is a $1 \times 1$ matrix equal to the dot product of $A$ and $B$.

Example 15.1. Consider
$$A = [1, 2, 3, 4] \quad (243)$$
$$B = \begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix}. \quad (244)$$

Then $AB = [1(1) + 2(-1) + 3(1) + 4(-1)] = [-2]$.

More generally, if $A$ is $m \times n$ and $B$ is $n \times r$, then $AB$ is of size $m \times r$ and described by
$$AB = [c_{ij}], \quad (245)$$
where $c_{ij}$ is the dot product of the $i$'th row of $A$ with the $j$'th column of $B$. Explicitly,
$$AB = [c_{ij}], \quad \text{where } c_{ij} = \sum_{k=1}^n a_{ik}b_{kj} \quad (246)$$
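The entrywise formula $c_{ij} = \sum_k a_{ik}b_{kj}$ is a direct triple loop. A sketch using the $1 \times 4$ and $4 \times 1$ matrices of Example 15.1:

```python
def matmul(A, B):
    # A is m x n, B is n x r; c_ij is the dot product of row i of A
    # with column j of B.
    m, n, r = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "inner dimensions must agree"
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(r)]
            for i in range(m)]

A = [[1, 2, 3, 4]]
B = [[1], [-1], [1], [-1]]
print(matmul(A, B))     # [[-2]]  (the 1 x 1 dot product)
print(matmul(B, A)[0])  # [1, 2, 3, 4]: first row of the 4 x 4 product BA
```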

For instance, consider
$$A = [1, 2, 3, 4] \quad (247)$$
$$B = \begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix}. \quad (248)$$

Then
$$BA = \begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix}[1, 2, 3, 4] = \begin{pmatrix} 1(1) & 1(2) & 1(3) & 1(4) \\ -1(1) & -1(2) & -1(3) & -1(4) \\ 1(1) & 1(2) & 1(3) & 1(4) \\ -1(1) & -1(2) & -1(3) & -1(4) \end{pmatrix} \quad (249)$$
Not only is matrix multiplication not commutative, but often, flipping the order makes the operation undefined. Furthermore, if $AB = AC$, then $B$ does not necessarily equal $C$: there is no "cancellation property" in matrix multiplication. The identity matrix $I_n$ is a matrix of size $n \times n$. Simply put, it is a square matrix with 1's along its main diagonal and 0's elsewhere.

Proposition 15.2. For any $n \times n$ matrix $A$,
$$I_nA = AI_n = A \quad (250)$$

Proof. The rows (in order) of $I_n$ are the same as the columns of $I_n$. When regarded as vectors in $\mathbb{R}^n$, they are simply $\{\vec e_1, \vec e_2, \cdots, \vec e_n\}$. Furthermore, for any vector $\vec v = \langle v_1, v_2, \cdots, v_n\rangle \in \mathbb{R}^n$, $\vec v \cdot \vec e_i = v_i$. If $A = [a_{ij}]$ and $I_nA = [c_{ij}]$, then
$$c_{ij} = \sum_{k=1}^n (\vec e_i)_k\, a_{kj} = a_{ij} \quad (251)$$

Similarly for $AI_n$.

§16 Column and Row Representations of Matrix Multiplication and the Properties of the Operation (Barsheshat)

§16.1 Partitioning matrices

It's often useful to regard matrices as being partitioned into sub-matrices. For instance,

1 0 2 1 1 ! 2 0 1 −1 2 A B   M =   = , (252) 3 2 1 0 0  C D 1 2 3 2 1 where 1 0 2 A = (253) 2 0 1  1 1 B = (254) −1 2 3 2 1 C = (255) 1 2 3 0 0 D = (256) 2 1

§16.2 Column representation of matrix multiplication

Given A, an m × n matrix, and B, an n × r matrix, consider partitioning B into its columns:

B = [b1, b2, ···, br]. (257)

Then

AB = A[b1, b2, ···, br] = [Ab1, Ab2, ···, Abr]. (258)

§16.3 Row representation of matrix multiplication

Suppose A is m × n, B is n × r, and we partition A into its rows a1, a2, ···, am:

A = [ a1 ; a2 ; ··· ; am ]. (259)

As a result, we have

AB = [ a1B ; a2B ; ··· ; amB ]. (260)

§16.4 Column-row representation

Suppose A is m × n, partitioned into its columns,

A = [a1, ···, an], (261)

and B is n × r, partitioned into its rows,

B = [ b1 ; b2 ; ··· ; bn ]. (262)

We thereby have

AB = a1b1 + a2b2 + ··· + anbn, (263)

a sum of n matrices of size m × r (each aibi is the product of a column and a row).

Exercise 16.1. If

A = [ 1 2 2 ; 0 1 1 ],   B = [ 3 2 ; 0 1 ; 2 2 ], (264)

find AB using the column-row representation. We have

a1 = ⟨1, 0⟩ (265)
a2 = ⟨2, 1⟩ = a3 (266)
b1 = [3, 2] (267)
b2 = [0, 1] (268)
b3 = [2, 2] (269)

Then

AB = a1b1 + a2b2 + a3b3 = [ 7 8 ; 2 3 ]. (271)
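The column-row (outer product) representation can be checked numerically; a small pure-Python sketch (the helper names `outer` and `mat_add` are ours):

```python
def outer(col, row):
    """Product of a column (length m) and a row (length r): an m x r matrix."""
    return [[c * x for x in row] for c in col]

def mat_add(M, N):
    return [[a + b for a, b in zip(rm, rn)] for rm, rn in zip(M, N)]

A_cols = [[1, 0], [2, 1], [2, 1]]   # columns of A = [1 2 2; 0 1 1]
B_rows = [[3, 2], [0, 1], [2, 2]]   # rows of B

AB = [[0, 0], [0, 0]]
for a, b in zip(A_cols, B_rows):    # AB = a1*b1 + a2*b2 + a3*b3
    AB = mat_add(AB, outer(a, b))
print(AB)                           # [[7, 8], [2, 3]]
```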

§16.5 Algebraic properties of matrix addition and scalar multiplication

Assume A, B, C are matrices of the same size, and assume α and β are scalars ∈ R. Then:

1. Commutativity: A + B = B + A

2. Associativity: A + (B + C) = (A + B) + C

3. The zero matrix is the “identity” of matrix addition: A + 0 = A

4. A + (−A) = 0

5. Distributivity: α(A + B) = αA + αB and (α + β)A = αA + βA.

6. Identity: 1A = A

7. Scalar multiplication associativity: α(βA) = (αβ)A


§16.6 Matrix exponentiation

When A and B are both n × n (square matrices), then AB is also n × n. In the special case A = B, we define A^2 = AA (or more generally A^n = AA···A, with n factors). Furthermore,

A^r A^s = A^{r+s} (272)
(A^r)^s = A^{rs} (273)
A^0 = In (274)

§16.7 Transpose of a matrix

If A is m × n, then A^T is defined as the n × m matrix whose rows are the columns of A (in order); in coordinate notation:

(A^T)ij = (A)ji (elements along the main diagonal are left unchanged). (275)

The following are the properties of the matrix transpose:

1. (A^T)^T = A

2. (A + B)^T = A^T + B^T

3. (kA)^T = kA^T, where k ∈ R

4. (AB)^T = B^T A^T

5. For square matrices, (A^r)^T = (A^T)^r where r > 0.

Furthermore, note that a symmetric matrix has A^T = A and a skew-symmetric (or anti-symmetric) matrix has A^T = −A.
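Property 4, the reversal of order under transposition, is easy to verify numerically; a quick pure-Python sketch (the helpers are ours):

```python
def transpose(M):
    """The rows of the transpose are the columns of M."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2], [3, 4], [5, 6]]   # 3 x 2
B = [[1, 0, 2], [-1, 1, 0]]    # 2 x 3
# Property 4: (AB)^T = B^T A^T  (note the reversed order)
assert transpose(matmul(A, B)) == matmul(transpose(B), transpose(A))
# A symmetric matrix equals its own transpose
S = [[2, 1], [1, 3]]
assert transpose(S) == S
```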

§16.8 Properties of matrix multiplication

Let k ∈ R and let A, B, C be matrices whose sizes are such that the following operations are well defined. Then:

1. Associativity: A(BC) = (AB)C

2. Left-distributivity: A(B + C) = AB + AC

3. Right-distributivity: (A + B)C = AC + BC

4. k(AB) = A(kB)

5. ImA = A = AIn where A is of size m × n

Proof of 2. Let Ai be the i-th row of A, and let bj and cj be the j-th columns of B and C, respectively. Then,

[A(B + C)]ij = Ai · (bj + cj) (276)

= Ai · bj + Ai · cj (277)

= [AB]ij + [AC]ij (278)

= [AB + AC]ij (279)

§17 Linear Systems With Matrices (Barsheshat)

Suppose A is an m × n matrix:

A = [ a11 a12 ··· a1n ; a21 a22 ··· a2n ; ··· ; am1 am2 ··· amn ] = [aij]. (280)

Let x be an arbitrary vector, x = [x1, x2, ···, xn]^T, (281) and let b = [b1, b2, ···, bm]^T. (282) Then the system of equations

a11x1 + a12x2 + ··· + a1nxn = b1
a21x1 + a22x2 + ··· + a2nxn = b2
⋮
am1x1 + am2x2 + ··· + amnxn = bm (283)

can be represented by the equation Ax = b, where A is m × n, x is n × 1, and therefore b is m × 1. For a homogeneous system, we can clearly write Ax = 0 (if A is m × n, then x ∈ R^n and 0 ∈ R^m). Given an m × n matrix A, the nullspace of A, denoted null(A), is the set of all vectors x ∈ R^n such that Ax = 0 (the set of all solutions of the homogeneous system):

null(A) = { x ∈ R^n : Ax = 0 }. (284)

Note that null(A) is a vector subspace: it is

1. closed under addition

2. closed under scalar multiplication

3. not empty: A0 = 0, so 0 ∈ null(A)

Proof. We begin by showing that null(A) is closed under addition. Suppose u, v ∈ null(A). Thus

A(u + v) = Au + Av = 0 + 0 = 0. (285)

Hence, u + v ∈ null(A). Now, we show that null(A) is closed under scalar multiplication. If v ∈ null(A) and t ∈ R, then A(tv) = t(Av) = t0 = 0. Hence, tv ∈ null(A).


Proposition 17.1
A basis for the column space of A is composed of the columns of A that are associated with pivot columns in RREF(A). On the other hand, a basis for the row space of A consists of the pivot (nonzero) rows of RREF(A).

Example 17.2
Let

A = [ 1 2 0 ; 2 −1 1 ; 3 11 −1 ] ∼ [ 1 0 2/5 ; 0 1 −1/5 ; 0 0 0 ]. (286)

Hence, a basis for row(A) is B = {⟨1, 0, 2/5⟩, ⟨0, 1, −1/5⟩}, and a basis for col(A) is C = {⟨1, 2, 3⟩, ⟨2, −1, 11⟩}. Moreover, a basis for the nullspace is

B = { ⟨−2/5, 1/5, 1⟩ }. (287)

As a result, null(A) is

null(A) = span{ ⟨−2/5, 1/5, 1⟩ }. (288)
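Membership in the nullspace can be verified directly; a short check in exact rational arithmetic (pure Python, standard `fractions` module):

```python
from fractions import Fraction as F

A = [[1, 2, 0], [2, -1, 1], [3, 11, -1]]
v = [F(-2, 5), F(1, 5), F(1)]   # the claimed basis vector of null(A)

# Compute Av entry by entry; every entry should be 0.
Av = [sum(F(a) * x for a, x in zip(row, v)) for row in A]
print(Av)   # [Fraction(0, 1), Fraction(0, 1), Fraction(0, 1)] -> v is in null(A)
```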

Theorem 17.3 (Rank-Nullity Theorem)
The rank of a matrix A and its nullity, which is the dimension of the nullspace of A, are related by

rank(A) + nullity(A) = n (289)

Proposition 17.4
If null(A) = {0}, then the column vectors v1, v2, ···, vn of A, which span the column space of A,

col(A) = span(v1, v2, ···, vn), (290)

are linearly independent and form a basis for col(A).

Proposition 17.5 The reduced row echelon form R has the same null space as the original matrix A: null(A) = null(R).

§18 Matrix Inverses (Barsheshat)

Definition 18.1. An n × n (square) matrix A is said to be invertible if there exists another matrix B such that AB = BA = I. B is usually labelled B = A^{-1} and called the inverse of A.

If A = [a] is a 1 × 1 matrix, then A is invertible if and only if a ≠ 0. The inverse will be A^{-1} = [a^{-1}].

Example 18.2
If

A = [ 1 2 ; −1 1 ], (291)

find A^{-1}. Let

A^{-1} = [ a b ; c d ]. (292)

Then,

[ 1 2 ; −1 1 ][ a b ; c d ] = [ 1 0 ; 0 1 ]. (293)

As a result,

[ a + 2c  b + 2d ; c − a  d − b ] = [ 1 0 ; 0 1 ]. (294)

The rest of the problem involves solving a system of equations, which you can do on your own. The answer, however, is

A^{-1} = [ 1/3 −2/3 ; 1/3 1/3 ]. (295)
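A quick sanity check of this answer, using exact arithmetic (pure Python; the `matmul` helper is ours):

```python
from fractions import Fraction as F

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A     = [[F(1), F(2)], [F(-1), F(1)]]
A_inv = [[F(1, 3), F(-2, 3)], [F(1, 3), F(1, 3)]]
I     = [[F(1), F(0)], [F(0), F(1)]]

# Both products must equal the identity for A_inv to be the inverse.
assert matmul(A, A_inv) == I and matmul(A_inv, A) == I
```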

Theorem 18.3
If A is invertible and B is invertible, then AB is invertible.

Proof. Multiply AB by B−1A−1:

(B^{-1}A^{-1})(AB) = ((B^{-1}A^{-1})A)B (296)
= (B^{-1}(A^{-1}A))B (297)
= (B^{-1}I)B (298)
= B^{-1}B (299)
= I (300)

Hence AB is invertible and (AB)^{-1} = B^{-1}A^{-1}.

Theorem 18.4 If A is invertible, then A−1 is unique.


Proof. Suppose B and B′ are both inverses of A. Then

AB = BA = I = AB′ = B′A. (301)

We begin with B′A = I:

B′A = I (302)
(B′A)B = IB (303)
B′(AB) = IB (304)
B′I = IB (305)
B′ = B, (306)

which thereby implies that A^{-1} is unique.

The following are properties of matrix inverses:

• (AB)−1 = B−1A−1

• (kA)^{-1} = (1/k)A^{-1} for k ≠ 0

• (A^n)^{-1} = (A^{-1})^n for n ∈ N

• (A^T)^{-1} = (A^{-1})^T

• The system Ax = b has exactly one solution (x = A^{-1}b).

§18.1 Using row-reduction to find inverses

Given a certain square matrix A, we can both check whether A is invertible and find its inverse with the same procedure. We begin by writing out A and I side by side. For this example, let

A = [ 3 1 0 ; 0 2 −1 ; 2 −1 1 ]. (307)

We consequently have

[A|I] = [ 3 1 0 | 1 0 0 ; 0 2 −1 | 0 1 0 ; 2 −1 1 | 0 0 1 ]. (308)

Now, we row reduce until A has been transformed into I. If a row of zeros is obtained on the left, then A is not invertible. In our example,

3 1 0 1 0 0 1 0 0 1 −1 −1 [A|I] = 0 2 −1 0 1 0 ∼ 0 1 0 −2 3 3  = [I|B]. (309) 2 −1 1 0 0 1 0 0 1 −4 5 6

The matrix B we obtain is actually A−1. You can check that AB = BA = I.
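The [A | I] procedure can be sketched in pure Python with exact rational arithmetic (the function below is our illustration, not the course's notation):

```python
from fractions import Fraction as F

def inverse(A):
    """Invert a square matrix by row-reducing [A | I]; returns None if A is singular."""
    n = len(A)
    # Augment A with the identity, using exact Fractions throughout.
    M = [[F(x) for x in row] + [F(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a pivot in this column; none means A is not invertible.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]   # scale the pivot row to a leading 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]  # clear the column
    return [row[n:] for row in M]          # the right half is now A^{-1}

A = [[3, 1, 0], [0, 2, -1], [2, -1, 1]]
B = inverse(A)
# B equals [[1, -1, -1], [-2, 3, 3], [-4, 5, 6]] (as Fractions), matching (309)
```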

§19 Gauss-Jordan Algorithm for Finding Inverses (Barsheshat)

We must begin by discussing elementary matrices.

Definition 19.1. An elementary matrix is a matrix which corresponds to a row operation:

1. Row swap matrix: Tij is by definition the matrix that swaps row i and row j.

1 ······ 0 . .. . . . 1 . Tij =   (310) . .. . . 1 . . 0 ······ 1

Consider the 3 × 3 example

0 1 0 T12 = 1 0 0 (311) 0 0 1

and the 5 × 5 example

T24 = [ 1 0 0 0 0 ; 0 0 0 1 0 ; 0 0 1 0 0 ; 0 1 0 0 0 ; 0 0 0 0 1 ] (312)

Note that if A has size n × n, then TijA is simply the matrix A with rows i and j swapped.

2. The second operation is multiplying a row by a scalar: Ri → mRi where m ≠ 0. Si(m) is the diagonal matrix with 1's along the diagonal except for the i-th entry, which is m. (313)

3. The final row operation is adding a scalar multiple of one row to another: Ri → Ri + mRj, where m ≠ 0.

1   1 ··· m     1  Eij(m) =   , (314)  ..   .  1

where m is in the j’th position in the i’th row.

The following are properties of elementary matrices:

• If A is an n × n matrix, and E is any elementary matrix with same dimension as A, then EA is simply the matrix A with the corresponding row operation carried out.


• All elementary matrices are invertible:

Tij^{-1} = Tij (315)
(Si(m))^{-1} = Si(1/m) (316)
(Eij(m))^{-1} = Eij(−m) (317)
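These properties are easy to test numerically; a pure-Python sketch (the constructor names `T` and `Eij` mirror the notation above but are our own helpers):

```python
def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def T(n, i, j):
    """Row-swap matrix: the identity with rows i and j interchanged."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def Eij(n, i, j, m):
    """Row-addition matrix: the identity plus an entry m at position (i, j)."""
    E = identity(n)
    E[i][j] = m
    return E

A = [[1, 2], [3, 4]]
assert matmul(T(2, 0, 1), A) == [[3, 4], [1, 2]]          # swaps the two rows
assert matmul(Eij(2, 1, 0, -3), A) == [[1, 2], [0, -2]]   # R2 -> R2 - 3*R1
assert matmul(Eij(2, 1, 0, -3), Eij(2, 1, 0, 3)) == identity(2)  # (317)
```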

Suppose we apply the row operations corresponding to elementary matrices E1, E2, ···, Ek, in that order, to an n × n matrix A and obtain the identity matrix. Then,

In = EkEk−1 ··· E1A. (318)

−1 Therefore, EkEk−1 ··· E1 = A . Also,

A = (A^{-1})^{-1} (319)
= (EkEk−1 ··· E1)^{-1} (320)
= E1^{-1}E2^{-1} ··· Ek−1^{-1}Ek^{-1}. (321)

Theorem 19.2 (Invertibility theorem)
Let A be an n × n matrix. Then the following statements are all equivalent:

1. A is invertible (A^{-1} exists)

2. A is a product of elementary matrices

3. rank(A) = n (A has full rank)

4. A^T is invertible: (A^T)^{-1} = (A^{-1})^T

5. The columns (or rows) of A form a basis of R^n.

6. Ax = 0 has only the trivial solution (x = 0).

7. Ax = b has exactly one solution for each b.

Proof. (1) ⇔ (2): from what we saw with the Gauss-Jordan algorithm, A is invertible iff A can be reduced to the identity with row operations, i.e., A = E1E2 ··· Ek (a product of elementary matrices).
(2) ⇔ (3): if A is a product of elementary matrices, then RREF(A) = I ⇒ all rows were independent ⇒ rank(A) = n.
(3) ⇔ (4): straightforward, since rank(A) = rank(A^T).
(5) is clearly equivalent to (3).
(1) ⇒ (6): if Ax = 0 and A is invertible, then

A^{-1}(Ax) = A^{-1}0 = 0 (322)
⇒ (A^{-1}A)x = x = 0. (323)

(6) ⇒ (7): Suppose Ax = 0 has only the trivial solution, but suppose Ax = b (for some b) has more than one solution, i.e., x1 ≠ x2 with Ax1 = Ax2 = b. Then

Ax1 − Ax2 = 0 (324)
A(x1 − x2) = 0, (325)

so x1 − x2 ≠ 0 is a nontrivial solution of the homogeneous system, thereby contradicting our assumption.

§20 More on Matrix Inverses

Theorem 20.1 If A is an m × n matrix then

1. rank(AT A) = rank(A)

2. AT A is invertible ⇔ rank(A) = n

Proof. By the rank-nullity theorem,

rank(A) + nullity(A) = n = rank(A^T A) + nullity(A^T A). (326)

Therefore we must show that nullity(A) = nullity(A^T A). In fact, we will show an even stronger statement: null(A) = null(A^T A). The first step is proving that

null(A) ⊆ null(A^T A). (327)

Suppose x ∈ null(A); then Ax = 0. As a result,

Ax = 0 (328)
A^T(Ax) = A^T 0 (329)
(A^T A)x = 0. (330)

Hence x ∈ null(A^T A). The second step involves showing that null(A^T A) ⊆ null(A). Suppose x ∈ null(A^T A). Then,

(A^T A)x = 0 (331)
x^T(A^T Ax) = 0 (332)
(x^T A^T) · (Ax) = 0 (333)
(Ax)^T (Ax) = 0 (334)
⇒ ‖Ax‖^2 = 0 (335)
‖Ax‖ = 0 (336)
Ax = 0. (337)

Moreover, we now use the rank-nullity theorem in conjunction with the invertibility theorem to show the second statement of the theorem: A^T A is invertible ⇔ rank(A^T A) = n ⇔ rank(A) = n. We thereby obtain:

1. rank(AT A) = rank(A)

2. AT A is invertible ⇔ rank(A) = n


§21 Determinants (Barsheshat)

The determinant can be thought of as a function whose domain (set of inputs) is the set of all square matrices, and whose range (set of outputs) is simply R.

Definition 21.1. If A is 2 × 2,

A = [ a b ; c d ], then det(A) = ad − bc. (338)

Theorem 21.2
A (which is n × n) is invertible iff det(A) ≠ 0.

Proof. Let

A = [ a b ; c d ] (339)

and u = ⟨a, c⟩, v = ⟨b, d⟩ (the columns of A). (340) Then,

proj_u v = (u · v / ‖u‖^2) u = ((ab + cd)/(a^2 + c^2)) ⟨a, c⟩. (341–342)

Using this, we calculate the orthogonal

orth_u v = ⟨b, d⟩ − ((ab + cd)/(a^2 + c^2)) ⟨a, c⟩. (343)

Now, you can easily verify that

‖u‖ ‖orth_u v‖ = |ad − bc|, (344)

the area of the parallelogram spanned by the columns; it is nonzero exactly when the columns are linearly independent, i.e., when A is invertible.

We will derive a formula for the inverse of a square 2 × 2 matrix. Let

A = [ a b ; c d ]. (345)

We apply the Gauss-Jordan algorithm (note that invertibility forces a ≠ 0 or c ≠ 0). Without loss of generality, assume for simplicity a ≠ 0. Hence,

[ a b | 1 0 ; c d | 0 1 ] ∼ [ 1 0 | d/(ad−bc)  −b/(ad−bc) ; 0 1 | −c/(ad−bc)  a/(ad−bc) ]. (346)

As a result,

A^{-1} = (1/det(A)) [ d −b ; −c a ]. (347)
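The 2 × 2 formula translates directly into code; a minimal sketch (pure Python, exact arithmetic; the helper name is ours):

```python
from fractions import Fraction as F

def inv2(A):
    """Inverse of a 2x2 integer matrix via A^{-1} = (1/det) [d -b; -c a]."""
    (a, b), (c, d) = A
    det = a * d - b * c
    assert det != 0, "matrix is not invertible"
    return [[F(d, det), F(-b, det)], [F(-c, det), F(a, det)]]

print(inv2([[1, 2], [-1, 1]]))   # the Fractions 1/3, -2/3 ; 1/3, 1/3, matching (295)
```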


§21.1 Determinants for 3 × 3 matrices

Suppose

A = [ a b c ; d e f ; g h i ]. (348)

We give a recursive formula:

det(A) = a det[ e f ; h i ] − b det[ d f ; g i ] + c det[ d e ; g h ] (349)
= a(ei − fh) − b(di − fg) + c(dh − eg) (350)
= aei − afh − bdi + bfg + cdh − ceg (351)
= aei + bfg + cdh − afh − bdi − ceg (352)

Notice the pattern is the same as that of the cross product: write the matrix with its first two columns repeated,

a b c a b
d e f d e (353–354)
g h i g h

then add the products along the three down-right diagonals and subtract the products along the three down-left diagonals. As a result, det(A) = aei + bfg + cdh − afh − bdi − ceg.

Exercise 21.3. Let u = ⟨u1, u2, u3⟩, v = ⟨v1, v2, v3⟩, w = ⟨w1, w2, w3⟩. Note that the scalar triple product u · (v × w) is the determinant

det[ u1 u2 u3 ; v1 v2 v3 ; w1 w2 w3 ] (355)

and is also equal to the (signed) volume of a parallelepiped. Show that this proves the equivalence between invertibility and non-zero determinants for 3 × 3 matrices. This exercise is left to the reader.

Before continuing, here are some of the properties of determinants. Assuming A, B are n × n matrices, we have

1. det(A) = det(AT )

2. det(A^{-1}) = 1/det(A)

3. If A is transformed into B by adding multiples of rows to other rows, then det(A) = det(B)

4. det(AB) = det(A)det(B)

§22 More on Determinants (Barsheshat)

Definition 22.1. Minors: Given a matrix A = [aij], Aij is called the (i, j)-minor of A, and it is simply the matrix obtained by removing the i-th row and the j-th column of A. Back to 3 × 3 determinants:

det(A) = a11 det(A11) − a12 det(A12) + a13 det(A13)
= Σ_{j=1}^{3} (−1)^{1+j} a1j det(A1j). (∗)

Definition 22.2. Cofactor: Given a square (n × n) matrix A, we can define the (i, j)-cofactor of A, denoted Cij, as follows:

Cij = (−1)^{i+j} det(Aij). (356)

And (∗) becomes

det(A) = Σ_{j=1}^{3} a1j C1j. (357)

Now on to n × n matrices: if A is n × n and Cij is the (i, j)-cofactor of A, then

det(A) = Σ_{j=1}^{n} a1j · C1j (recursive definition of the n × n determinant).
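The recursive definition translates almost verbatim into code; a sketch (pure Python; `det` is our helper name):

```python
def det(A):
    """Determinant by cofactor expansion along the first row (recursive)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # A_1j: delete row 1 and column j
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)   # (-1)^j since indices start at 0
    return total

print(det([[1, 2, 3], [4, 5, 6], [7, 8, 10]]))   # -3
```

This runs in O(n!) time, which is why the row-reduction method of §23 is preferred for anything beyond small matrices.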

Example 22.3

A = (358) Find det(A) using the above formula. Solution. Since a12 = a14 = 0, we only need to find C11 and C13:

1+1 C11 = (−1) det(A11) = 1· (359)

→ det(A) = a11 · C11 + a12 · C12 + a13 · C13 + a14 · C14 = 1 · 1 = 1

§22.1 Generalized cofactor expansions

It turns out we can "expand" our determinant calculation along any row or any column we want. If A is n × n:

det(A) = Σ_{j=1}^{n} a1j C1j (cofactor expansion along the first row)
= Σ_{j=1}^{n} aij Cij (cofactor expansion along the i-th row)
= Σ_{i=1}^{n} aij Cij (cofactor expansion along the j-th column)

where Cij is the (i, j)-cofactor and aij is the element in the i-th row and j-th column.


Example 22.4

A = (360) Find the determinant using any row or column: det(A) = a14 · C14 + a24 · C24 + a34 · C34 + a44 · C44

4-th column expansion = 1· = (−1) = (−1)

The following are properties of determinants. Assuming all matrices are square and all operations are well-defined, we have:

1. det(A) 6= 0 iff A is invertible

2. det(A) = det(AT )

3. det(In) = 1

4. If A is n × n and k ∈ R, then det(kA) = k^n det(A)

5. det(AB) = det(A) · det(B)

6. det(A^{-1}) = 1/det(A)

§23 Even More On Determinants (Barsheshat)

§23.1 Triangular Matrices

Definition 23.1. An upper triangular (square) matrix is a matrix which has only 0's below the main diagonal, e.g.,

A = [ 1 2 3 ; 0 2 7 ; 0 0 1 ]. (361)

Definition 23.2. A lower triangular (square) matrix is a matrix which has only 0's above the main diagonal, e.g.,

B = [ 1 0 0 ; 0 1 0 ; 1 2 3 ]. (362)

Note that any diagonal matrix, as well as the n × n zero matrix, is both upper and lower triangular.

Example 23.3
Compute the determinants of the triangular matrices A and B defined above. We have:

det(A) = det[ 1 2 3 ; 0 2 7 ; 0 0 1 ] = 1 · det[ 2 7 ; 0 1 ] = 2. (363)

Moreover,

det(B) = det[ 1 0 0 ; 0 1 0 ; 1 2 3 ] = 1 · det[ 1 0 ; 2 3 ] = 3. (364)

Theorem 23.4
If A is upper or lower triangular, then det(A) is simply the product of the elements along the main diagonal.

§23.2 Determinants of elementary matrices

There are three types of elementary matrices:

1. (Row Swapping) If E is an elementary matrix obtained by swapping two rows of I, then det(E) = −1. For instance,

det[ 0 1 0 ; 1 0 0 ; 0 0 1 ] = −1. (365)

2. (Multiplication of a Row by k) If E is obtained by multiplying a row of I by a scalar k, then det(E) = k. For instance,

det[ 1 0 0 ; 0 1 0 ; 0 0 k ] = 1 · 1 · k = k. (366)

59 Tristan Martin (Fall 2017) 201-NYC-05-E (Enriched Linear Algebra I) Lecture Notes

3. (Adding a Scalar Multiple of One Row to Another) If E is obtained by adding a scalar multiple of one row to another, then det(E) = det(I) = 1. For instance,

det[ 1 0 0 ; 0 1 m ; 0 0 1 ] = 1 · 1 · 1 = 1. (367)

Exercise 23.5. A quicker way to find determinants:

A = [ 1 2 3 ; 2 1 4 ; −1 2 1 ]. (368)

We have,

det(A) = det[ 1 2 3 ; 2 1 4 ; −1 2 1 ] (369)
= det[ 1 2 3 ; 0 −3 −2 ; 0 4 4 ]  (R2 → R2 − 2R1, R3 → R3 + R1) (370)
= −det[ 1 2 3 ; 0 4 4 ; 0 −3 −2 ]  (R2 ↔ R3) (371)
= −4 det[ 1 2 3 ; 0 1 1 ; 0 −3 −2 ]  (factor 4 out of R2) (372)
= −4 det[ 1 2 3 ; 0 1 1 ; 0 0 1 ]  (R3 → R3 + 3R2) (373)
= (−4) · 1 · 1 · 1 (374)
= −4 (375)

Recap: To compute a determinant, one can apply row operations, making sure to keep track of how the determinant changes (a swap flips the sign; scaling a row scales the determinant), until an upper or lower triangular matrix is obtained; then simply multiply along the diagonal (and account for the accumulated changes).
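The recap can be sketched as a short routine that tracks sign flips from swaps while reducing to triangular form (pure Python, exact arithmetic; the helper name is ours):

```python
from fractions import Fraction as F

def det_by_elimination(A):
    """Determinant via row reduction to upper triangular form."""
    M = [[F(x) for x in row] for row in A]
    n, sign = len(M), 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return F(0)                      # no pivot in this column -> det is 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign                     # each row swap flips the sign
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            # Adding a multiple of one row to another leaves det unchanged.
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    prod = F(1)
    for i in range(n):
        prod *= M[i][i]                      # product along the main diagonal
    return sign * prod

print(det_by_elimination([[1, 2, 3], [2, 1, 4], [-1, 2, 1]]))   # -4
```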

§23.3 Cramer's Rule

Suppose A is n × n and invertible. Consider the system Ax = b, which has the unique solution x = A^{-1}b.

Definition 23.6. Given A (n × n) and a column vector b (n × 1), Ai(b) is the n × n matrix where column i of A is replaced by b.


Example 23.7
If

A = [ 1 2 1 ; 0 1 0 ; 0 2 1 ] (376)

and b = [1, 2, 3]^T, (377) then

A2(b) = [ 1 1 1 ; 0 2 0 ; 0 3 1 ]. (378)

Cramer's Rule: If A is an invertible n × n matrix, then the solution to Ax = b is given by

xi = det(Ai(b)) / det(A) (379)

for i = 1, ···, n.

Exercise 23.8. Solve Ax = b using Cramer's rule, given

1 2 1 A = 0 1 0 (380) 0 2 1 and 1 −→ b = 2 (381) 3 −→ −→ −→ We have det(A) = 1, det(A1( b )) = −2, det(A2( b )) = 2, det(A3( b )) = −1. Hence,

−2 −→ x =  2  (382) −1

§24 Complex Numbers (Barsheshat)

Definition 24.1. A complex number z is a number of the form z = a + bi (rectangular form), where a, b ∈ R and i is called the imaginary unit, satisfying i^2 = −1.

The set of complex numbers is denoted C:

C = { a + bi : a, b ∈ R, i^2 = −1 }. (383)

Given z = a + bi, a is called the real part and b is called the imaginary part:

a = Re(z) (384) b = Im(z) (385)

Given any real number, say x ∈ R, we conventionally view x as a complex number as well by associating x = x + 0i. In this sense, we can view R ⊆ C. Hence,

R = {z ∈ C : Im(z) = 0} (386)

§24.1 Operations on complex numbers

Addition/Subtraction: Given z1 = a1 + b1i and z2 = a2 + b2i, then we have:

z1 + z2 = (a1 + a2) + (b1 + b2)i (387)

z1 − z2 = (a1 − a2) + (b1 − b2)i (388)

With these definitions, the properties of addition and subtraction of real numbers (e.g., commutativity, associativity) carry over to complex numbers. Moreover, viewing z1 and z2 as vectors in the complex plane, z1 + z2 corresponds to vector addition.

Multiplication: To define multiplication, we assume that the "natural" properties (i.e., commutativity, associativity, distributivity) hold, along with i^2 = −1. Consider z1 = a1 + b1i and z2 = a2 + b2i once more. Then

z1 · z2 = (a1 + b1i) · (a2 + b2i) (389)
= a1a2 + a1(b2i) + (b1i)a2 + (b1i)(b2i) (390)
= a1a2 + b1b2(i^2) + (a1b2 + a2b1)i (391)
= (a1a2 − b1b2) + (a1b2 + a2b1)i (392)

Exercise 24.2. Find the unique complex number such that

z · (3 + 2i) = 13 = 13 + 0i. (393)

Let z = a + bi. Then,

(a + bi) · (3 + 2i) = 13 + 0i (394)
(3a − 2b) + (2a + 3b)i = 13 + 0i. (395)

Thus, we have the system

3a − 2b = 13
2a + 3b = 0 (396)

Solving such a system is trivial at this point in the course. The answer is a = 3, b = −2, i.e., z = 3 − 2i.


Definition 24.3. Given a complex number z = a + bi, the conjugate of z, denoted z* (or with an overbar), is defined as z* = a − bi, so that Re(z*) = Re(z) and Im(z*) = −Im(z). Geometrically, conjugation corresponds to a reflection with respect to the real axis.

Definition 24.4. The modulus, or magnitude, of a complex number z = a + bi is denoted |z| and defined as

|z| = √(a^2 + b^2). (397)

The following are properties of conjugation, given complex numbers z and w:

1. (z + w)* = z* + w*

2. (z · w)* = z* · w*

3. (z*)* = z

4. z · z* = |z|^2

5. |z*| = |z|

Exercise 24.5. Given a complex number z = a1 + b1i ≠ 0 + 0i, find the unique complex number w such that z · w = w · z = 1 + 0i. Recall z · z* = |z|^2 > 0. Hence

z · (z*/|z|^2) = 1, so w = z^{-1} = z*/|z|^2. (398)

Division: Given w ∈ C, z ∈ C, z 6= 0 + 0i, we define

w/z = w · z^{-1} = (w · z*)/|z|^2 (399)

Exercise 24.6. Given z = 1 + i,

1. Find z−1

2. Find w/z, where w = 3 + 2i.

We have

z^{-1} = z*/|z|^2 = (1 − i)/(1^2 + 1^2) = 1/2 − (1/2)i. (400)

Moreover, we have

w/z = w · z^{-1} = (3 + 2i)(1/2 − (1/2)i) = (1/2)(5 − i). (401)

§24.2 Polar form of complex numbers

Given z = a + bi, we can represent it as an arrow in the complex plane. We take the angle θ (known as the argument of z: θ = arg(z)) in standard position. Hence,

a = |z| cos θ (402) b = |z| sin θ (403)


We can consequently express z in terms of θ:

z = a + bi (404) = |z| cos θ + |z|i sin θ (405) = |z|(cos θ + i sin θ) (406) = |z|eiθ = |z| cis θ (407)

Converting from polar to rectangular form is straightforward given the relationships expressed above. To convert from rectangular to polar form is trickier. Given z = a + bi, we have |z| = √(a^2 + b^2). Moreover,

tan θ = b/a, (408)

assuming a, r ≠ 0 (where r = |z|). Then, use the quadrant/signs of a and b to determine θ properly. If a = 0, then θ = π/2 (if b > 0) or θ = 3π/2 (if b < 0). If r = 0, then θ can be anything, but for simplicity we take θ = 0 by convention.

§25 More on Complex Numbers (Barsheshat)

The product of two complex numbers z1 = r1(cos θ1 + i sin θ1) and z2 = r2(cos θ2 + i sin θ2), expressed in polar form, is

z1 · z2 = r1r2(cos θ1 + i sin θ1)(cos θ2 + i sin θ2) (409)

= r1r2(cos θ1 cos θ2 + i cos θ1 sin θ2 + i sin θ1 cos θ2 − sin θ1 sin θ2) (410)

= r1r2(cos(θ1 + θ2) + i sin(θ1 + θ2)) (411)

Moreover, the quotient z1 · z2^{-1} is

z1/z2 = z1 · z2*/|z2|^2 (412)
= r1(cos θ1 + i sin θ1) · (cos θ2 − i sin θ2)/(r2(cos^2 θ2 + sin^2 θ2)) (413)
= (r1/r2)(cos θ1 cos θ2 − i cos θ1 sin θ2 + i sin θ1 cos θ2 + sin θ1 sin θ2) (414)
= (r1/r2)(cos(θ1 − θ2) + i sin(θ1 − θ2)) (415)

Exercise 25.1. Consider f : C → C where f(z) = iz. Describe the action of f. We know i = cos(π/2) + i sin(π/2) and z = r(cos θ + i sin θ). Hence,

f(z) = r(cos(π/2) + i sin(π/2))(cos θ + i sin θ) = r(cos(θ + π/2) + i sin(θ + π/2)), (416)

so f rotates z by π/2 counter-clockwise.

§25.1 De Moivre’s formula

Consider z1 · z2 = r1r2 cis(θ1 + θ2). (417) By setting z1 = z2 = z = r cis θ, we find the powers of a single complex number:

z^2 = r^2 cis(2θ) (418)
z^3 = r^3 cis(3θ) (419)
z^4 = r^4 cis(4θ) (420)
⋮
z^n = r^n cis(nθ) (422)

We can notably use De Moivre's formula to find the roots of complex numbers. In other words, if w = z^n is known, we can use this formula to find the possible values of z. Write w = r cis θ = r cis(θ + 2kπ), k ∈ Z. Going backwards, we take the n-th (positive) root of r and divide the angle by n:

w^{1/n} = r^{1/n} cis((θ + 2kπ)/n), k = 0, 1, 2, ···, n − 1, (423)

which gives exactly n distinct roots.
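Formula (423) is easy to test with Python's standard `cmath` module (the helper `nth_roots` is ours):

```python
import cmath
import math

def nth_roots(w, n):
    """The n distinct n-th roots of a nonzero complex number w,
    via r^(1/n) * cis((theta + 2*k*pi) / n)."""
    r, theta = abs(w), cmath.phase(w)
    return [r ** (1 / n) * cmath.exp(1j * (theta + 2 * k * math.pi) / n)
            for k in range(n)]

# The three cube roots of 8 are 2, 2 cis(2*pi/3) and 2 cis(4*pi/3).
for z in nth_roots(8, 3):
    assert abs(z ** 3 - 8) < 1e-9   # each root cubes back to 8 (up to rounding)
```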


§25.2 Basic polynomial equations and complex numbers

Recall that if ax^2 + bx + c = 0 for a, b, c ∈ R, then

x = (−b ± √(b^2 − 4ac))/(2a). (424)

If we allow x to be complex, then we can adjust the formula (with ∆ = b^2 − 4ac):

x = (−b ± √∆)/(2a), ∆ ≥ 0 (425)
x = (−b ± i√|∆|)/(2a), ∆ < 0 (426)
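The case split on the discriminant can be sketched as follows (pure Python; the helper name is ours):

```python
import math

def quadratic_roots(a, b, c):
    """Roots of ax^2 + bx + c = 0 (a, b, c real); complex when delta < 0."""
    delta = b * b - 4 * a * c
    if delta >= 0:
        s = math.sqrt(delta)
        return ((-b + s) / (2 * a), (-b - s) / (2 * a))
    s = math.sqrt(-delta)                  # sqrt(|delta|)
    return (complex(-b, s) / (2 * a), complex(-b, -s) / (2 * a))

print(quadratic_roots(1, -3, 2))   # (2.0, 1.0): delta = 1 >= 0
print(quadratic_roots(1, -2, 5))   # ((1+2j), (1-2j)): delta = -16 < 0
```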

§26 Integrative Activity: Electrical Circuits (Barsheshat)

The following are the laws that govern electrical circuits. ∆V is the potential difference, I is the current, and R is the resistance.

1. Ohm’s law: ∆V = IR

2. Kirchhoff's laws
• Kirchhoff's current law (junction or node rule): The total current flowing into a node is equal to the current flowing out. Assuming a + sign for current flowing in and a − sign for current flowing out: Σ I = 0.
• Kirchhoff's voltage law (loop rule): The sum of all voltage differences around a loop is 0: Σ ∆V = 0.

(Diagram: a loop with voltage differences ∆V1 through ∆V5.)

Example 26.1 Solve for the currents in the following circuit.

(Circuit: a 10 V battery connected in parallel with a 5 Ω resistor and a 10 Ω resistor.)

Two loops constitute the circuit:

1. Starting at the negative terminal of the battery: ∆V1 = 10 V, ∆V2 = −I2R1 = −(5 Ω)I2 ⇒ 10 V − (5 Ω)I2 = 0 ⇒ I2 = 2 A.

2. For the loop through the two resistors: the drop across the 10 Ω resistor is −I3R2 = −(10 Ω)I3 and the rise across the 5 Ω resistor is (5 Ω)I2 = 10 V. As a result, −(10 Ω)I3 + (5 Ω)I2 = 0 ⇒ −(10 Ω)I3 + 10 V = 0 ⇒ I3 = 1 A.

§27 Enriched Material: Eigenvalues and Eigenvectors (Barsheshat)

Definition 27.1. Given an n × n (square) matrix A, an eigenvector/eigenvalue equation is Av = λv, with v ∈ R^n and λ ∈ R. In this case, v is an eigenvector of A associated with the eigenvalue λ.

Remark 27.2. 0 is trivially an eigenvector of any square matrix A, associated with any eigenvalue λ (A0 = λ0 is always true). Generally, when looking at eigenvectors and eigenvalues, we ignore 0 as an eigenvector, but not necessarily 0 as an eigenvalue.

Remark 27.3. Any v ∈ R^n is an eigenvector of I with eigenvalue 1 (Iv = 1v = v).

Definition 27.4. Given a square matrix A and λ ∈ R, the λ-eigenspace of A, denoted Sλ(A), is the set of all vectors which are eigenvectors of A associated with the eigenvalue λ:

Sλ(A) = { v ∈ R^n : Av = λv }. (427)

Theorem 27.5 n Sλ(A) is a subspace of R

Proof. We have v ∈ Sλ(A)

⇔ Av = λv (428)
⇔ Av = (λI)v (429)
⇔ Av − λIv = 0 (430)
⇔ (A − λI)v = 0 (431)
⇔ v ∈ null(A − λI) (432)

Thus, Sλ(A) = null(A − λI), and it is therefore a subspace.

Terminology: We say that λ is an eigenvalue of A if ∃ v ≠ 0 such that Av = λv.

§27.1 Finding eigenvalues of A

There exists v ≠ 0 with Av = λv ⇔ (A − λI)v = 0 has a nontrivial solution ⇔ det(A − λI) = 0. The left-hand side of the last equation is the characteristic polynomial in λ.

Exercise 27.6. Consider

P = (1/3) [ 2 −1 1 ; −1 2 1 ; 1 1 2 ]. (433)

We have P^T = P (P is symmetric) and P^2 = P (P is a projection matrix). Verify that for any x = ⟨x1, x2, x3⟩ ∈ R^3, (434)

the vector y = Px is contained in the plane x + y − z = 0. By calculating det(P − λI) for arbitrary λ ∈ R, solve det(P − λI) = 0 to determine the eigenvalues of P. We have

P − λI = [ 2/3 − λ  −1/3  1/3 ; −1/3  2/3 − λ  1/3 ; 1/3  1/3  2/3 − λ ].

Expanding along the first row,

det(P − λI) = (2/3 − λ)[(2/3 − λ)^2 − 1/9] + (1/3)[(−1/3)(2/3 − λ) − 1/9] + (1/3)[−1/9 − (1/3)(2/3 − λ)]
= (2/3 − λ)(1/3 − λ)(1 − λ) + (2/3)[(−1/3)(2/3 − λ) − 1/9]
= (2/3 − λ)(1/3 − λ)(1 − λ) − (2/9)(1 − λ)
= (1 − λ)[(2/3 − λ)(1/3 − λ) − 2/9]
= −(1 − λ)^2 λ

Hence, λ ∈ {0, 1}: projection matrices always have 0 and 1 as eigenvalues. Hence, S1(P) = {⟨x, y, z⟩ : x + y − z = 0}. In other words, if v is on the plane, then Pv = v. Moreover, we have S0(P) = span{⟨1, 1, −1⟩}, where ⟨1, 1, −1⟩ is the normal vector of the plane.
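Both eigenspace claims can be checked in exact arithmetic (pure Python sketch; the helper `apply` is ours):

```python
from fractions import Fraction as F

# P = (1/3) * [[2, -1, 1], [-1, 2, 1], [1, 1, 2]]: projection onto x + y - z = 0
P = [[F(2, 3), F(-1, 3), F(1, 3)],
     [F(-1, 3), F(2, 3), F(1, 3)],
     [F(1, 3), F(1, 3), F(2, 3)]]

def apply(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

v = [F(1), F(0), F(1)]    # lies on the plane: 1 + 0 - 1 = 0
n = [F(1), F(1), F(-1)]   # normal vector of the plane
assert apply(P, v) == v                      # eigenvalue 1: P fixes the plane
assert apply(P, n) == [F(0), F(0), F(0)]     # eigenvalue 0: P kills the normal
```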
