Lecture 5: July 3, 2013

Properties of Operations

The first three theorems of Section 1.6 give properties of matrix addition, matrix multiplication, and scalar multiplication. I did not present these theorems in class, but asked you to read them on your own. I list them here, grouping the properties in a slightly different manner.

Properties of Matrix Addition: Let A, B, and C be (m × n) matrices.

(a) Addition is associative: (A + B) + C = A + (B + C).
(b) Addition is commutative: A + B = B + A.
(c) Additive Identity: there exists a unique (m × n) matrix O, called the zero matrix, such that A + O = O + A = A.
(d) Additive inverses exist: there exists an (m × n) matrix (−A) such that A + (−A) = O. (This is easily seen to be the matrix −1 · A.)

Properties of Matrix Multiplication: Let A be an (m × n) matrix, let B be an (n × p) matrix, and let C be a (p × q) matrix.

(a) Multiplication is associative: (AB)C = A(BC).
(b) Multiplicative Identity: there exist a unique (m × m) matrix $I_m$ and a unique (n × n) matrix $I_n$ such that $I_m A = A I_n = A$.

Properties of Scalar Multiplication: Let A be an (m × n) matrix, let B be an (n × p) matrix, and let r and s be scalars.

(a) Scalar multiplication is associative: (rs)A = r(sA) and r(AB) = (rA)B.
(b) Scalar multiplication commutes with matrix multiplication: (rA)B = A(rB).

Distributive Properties: Let A and B be (m × n) matrices, let C and D be (n × p) matrices, and let r and s be scalars.

(a) Matrix multiplication distributes over matrix addition: (A + B)C = AC + BC and A(C + D) = AC + AD.
(b) Scalar multiplication distributes over scalar addition: (r + s)A = rA + sA.
(c) Scalar multiplication distributes over matrix addition: r(A + B) = rA + rB.
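These identities are easy to spot-check numerically. Here is a minimal sketch, assuming NumPy is available (the matrices and scalars are arbitrary choices, not from the text):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.integers(-5, 5, size=(2, 3))   # an (m x n) matrix
    B = rng.integers(-5, 5, size=(2, 3))   # another (m x n) matrix
    C = rng.integers(-5, 5, size=(3, 4))   # an (n x p) matrix
    r, s = 2, 7

    assert np.array_equal(A + B, B + A)                # addition commutes
    assert np.array_equal((A + B) @ C, A @ C + B @ C)  # (A + B)C = AC + BC
    assert np.array_equal((r * s) * A, r * (s * A))    # (rs)A = r(sA)
    assert np.array_equal((r + s) * A, r * A + s * A)  # (r + s)A = rA + sA

Of course, a passing check on one example does not prove the theorems; the proofs are in Section 1.6.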

Special Matrices

Let us introduce some important matrices, the (m × n) zero matrix and the (n × n) identity matrix (these are the same matrices as in the theorems above). The (m × n) zero matrix, denoted by O in our textbook, is the (m × n) matrix consisting entirely of zeros:

 0 0 ··· 0   0 0 ··· 0  O =    . . .   . . .  0 0 ··· 0

It is easy to see that this indeed behaves as the additive identity for (m × n) matrices. The (n × n) identity matrix, denoted $I_n$, is the (n × n) matrix with ones along the main diagonal and zeros everywhere else:

 1 0 0 ··· 0 0   0 1 0 ··· 0 0     0 0 1 ··· 0 0  I =   n  . . . . .   . . . . .     0 0 0 ··· 1 0  0 0 0 ··· 0 1

So, for example,
$$I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

When n is clear from the context, we will omit the subscript and write $I$ instead of $I_n$. The identity matrix behaves as the multiplicative identity for (n × n) matrices and, more generally, if A is an (m × n) matrix, we have $I_m A = A I_n = A$.
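In NumPy (an assumption, not part of the notes), np.eye(n) builds $I_n$, so this behavior is easy to confirm for a rectangular A:

    import numpy as np

    A = np.array([[1., 2., 3.],
                  [4., 5., 6.]])            # a (2 x 3) matrix, so m = 2, n = 3
    assert np.array_equal(np.eye(2) @ A, A)  # I_m A = A
    assert np.array_equal(A @ np.eye(3), A)  # A I_n = A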

What’s Missing?

If you look carefully at Theorems 1.7, 1.8, and 1.9, you will notice that certain nice properties of multiplication do not seem to carry over. In particular,

• Matrix multiplication is NOT commutative: in general, $AB \neq BA$, even if both products are defined and have the same dimensions. For example, in the previous lecture we saw that

$$\begin{pmatrix} 5 & 2 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 0 & -1 \\ 3 & 1 \end{pmatrix} = \begin{pmatrix} 6 & -3 \\ 3 & 0 \end{pmatrix} \quad \text{but} \quad \begin{pmatrix} 0 & -1 \\ 3 & 1 \end{pmatrix} \begin{pmatrix} 5 & 2 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} -1 & -1 \\ 16 & 7 \end{pmatrix}$$

• There was no mention of matrix division. We will discuss matrix inverses in Section 1.9. However, it is worth making a few comments here. We know that every nonzero real number k has a multiplicative inverse, namely 1/k, and this inverse has the property that k · 1/k = 1. The situation with matrices is much more complicated. First, we restrict ourselves to square matrices. In this case, we can define inverses analogously to the real number case: if A and B are two (n × n) matrices, then A and B are inverses if $AB = I_n$. However, it is not true that every nonzero matrix has an inverse. To see what might go wrong, consider the following multiplication:

$$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

Thus, it is possible to find a nonzero matrix whose square is the zero matrix. This phenomenon has no analogue over the real numbers. We’ll discuss matrix inverses in much more detail in a few days, but the lesson from these two examples is that matrix multiplication can be somewhat surprising, and we should be careful not to assume that it behaves just like real number multiplication.
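Both surprises are easy to reproduce numerically; a minimal sketch, assuming NumPy:

    import numpy as np

    A = np.array([[5, 2], [1, 1]])
    B = np.array([[0, -1], [3, 1]])
    print(A @ B)    # [[ 6 -3]
                    #  [ 3  0]]
    print(B @ A)    # [[-1 -1]
                    #  [16  7]]   -- so AB != BA

    N = np.array([[0, 1], [0, 0]])
    print(N @ N)    # [[0 0]
                    #  [0 0]]     -- a nonzero matrix whose square is zero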

The Transpose of a Matrix

Definition: Let $A = (a_{ij})$ be an (m × n) matrix. The transpose of A is the (n × m) matrix $A^T$ obtained by interchanging the rows and columns of A:
$$(A^T)_{ij} = a_{ji}$$

Examples: Let A and B be the matrices

$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix}$$

Then
$$A^T = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix} \quad \text{and} \quad B^T = \begin{pmatrix} 1 & 0 \\ 2 & 3 \end{pmatrix}$$
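In NumPy (again an assumption), the transpose is available as the .T attribute, so the examples above can be checked directly:

    import numpy as np

    A = np.array([[1, 2, 3], [4, 5, 6]])
    B = np.array([[1, 2], [0, 3]])
    print(A.T)   # [[1 4]
                 #  [2 5]
                 #  [3 6]]
    print(B.T)   # [[1 0]
                 #  [2 3]]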

Theorem 1.10: Let A and B be two (m × n) matrices and let C be an (n × p) matrix. Then

1. $(A + B)^T = A^T + B^T$
2. $(AC)^T = C^T A^T$ (order is reversed)
3. $(A^T)^T = A$

Note: Although the reversed order in 2 may seem counter-intuitive at first, it becomes more plausible if we consider the dimensions of these matrices. A is an (m × n) matrix, so $A^T$ is an (n × m) matrix; C is an (n × p) matrix, so $C^T$ is a (p × n) matrix. The product AC is an (m × p) matrix, so $(AC)^T$ will be a (p × m) matrix. The product $C^T A^T$ is also a (p × m) matrix, while the product $A^T C^T$ is not even defined if $m \neq p$. Now that the result at least seems plausible, let’s go ahead and prove it.

Proof of 2: By the remark above, we see that $(AC)^T$ and $C^T A^T$ have the same dimensions. Recall the formula for matrix multiplication:

$$(AC)_{ij} = \sum_{k=1}^{n} (A)_{ik}(C)_{kj}$$
Thus, using the definition of transpose, we have

$$((AC)^T)_{ij} = (AC)_{ji} = \sum_{k=1}^{n} (A)_{jk}(C)_{ki}$$
Also, using the definitions of matrix multiplication and transpose,

$$(C^T A^T)_{ij} = \sum_{k=1}^{n} (C^T)_{ik}(A^T)_{kj} = \sum_{k=1}^{n} (C)_{ki}(A)_{jk}$$

But $(C)_{ki}$ and $(A)_{jk}$ are scalars and scalar multiplication is commutative, so we see that

$$((AC)^T)_{ij} = \sum_{k=1}^{n} (A)_{jk}(C)_{ki} = \sum_{k=1}^{n} (C)_{ki}(A)_{jk} = (C^T A^T)_{ij}$$

It follows that $(AC)^T = C^T A^T$.
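As a numerical sanity check of part 2 (a sketch assuming NumPy; the shapes are arbitrary):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.integers(-5, 5, size=(2, 3))   # (m x n) with m = 2, n = 3
    C = rng.integers(-5, 5, size=(3, 4))   # (n x p) with p = 4
    assert np.array_equal((A @ C).T, C.T @ A.T)   # (AC)^T = C^T A^T
    # Note: A.T @ C.T is not even defined here, since the (3 x 2) and
    # (4 x 3) shapes do not align; NumPy would raise a ValueError.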

Definition: An (m × n) matrix A is symmetric if $A^T = A$.

Note that A is an (m × n) matrix and $A^T$ is an (n × m) matrix. Thus, if $A^T = A$, it must be the case that m = n, so A is a so-called square matrix. Since only square matrices can be symmetric, we may as well refine our definition:

Refined Definition: A symmetric matrix is an (n × n) matrix A satisfying $A^T = A$.
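In code, this definition translates into a one-line test (a sketch assuming NumPy; is_symmetric is a hypothetical helper, not a library function):

    import numpy as np

    def is_symmetric(A):
        """Return True if A is square and equals its transpose."""
        return A.shape[0] == A.shape[1] and np.array_equal(A, A.T)

    print(is_symmetric(np.array([[1, 2], [2, 3]])))   # True
    print(is_symmetric(np.array([[1, 2], [0, 3]])))   # False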

Examples: The matrices
$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}$$
are symmetric. However, the matrices
$$\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 2 & 3 & 4 \\ 1 & 2 & 3 \\ 0 & 1 & 2 \end{pmatrix}$$
are not symmetric. In the definition of symmetric matrix, the only symmetry we are interested in is symmetry about the main diagonal. The matrix may exhibit other forms of symmetry, but these do not count towards the definition of symmetric matrix. The matrices from above are displayed again below with the main diagonal in red in each case:
$$\begin{pmatrix} {\color{red}1} & 2 & 3 \\ 2 & {\color{red}4} & 5 \\ 3 & 5 & {\color{red}6} \end{pmatrix}, \quad \begin{pmatrix} {\color{red}1} & 2 \\ 2 & {\color{red}3} \end{pmatrix}, \quad \begin{pmatrix} {\color{red}1} & 2 & 3 \\ 1 & {\color{red}2} & 3 \\ 1 & 2 & {\color{red}3} \end{pmatrix}, \quad \begin{pmatrix} {\color{red}2} & 3 & 4 \\ 1 & {\color{red}2} & 3 \\ 0 & 1 & {\color{red}2} \end{pmatrix}$$
Exercise: For what values of a, b, and c is the following matrix symmetric?
$$\begin{pmatrix} 1 & 2 & 5 & 4 \\ a & 7 & 1 & b \\ 5 & 1 & c & 3 \\ 4 & 0 & 3 & 8 \end{pmatrix}$$

(As usual, the answers can be found at the end of these lecture notes.)

Scalar Product and Vector Norms

In Math 126, you defined the dot product of two vectors in 2 and 3 dimensions, as well as the magnitude of 2- and 3-dimensional vectors. Here, we will generalize these ideas to n-dimensional vectors and see how to express them nicely using matrix notation. Recall that if $x = \langle x_1, x_2, x_3 \rangle$ and $y = \langle y_1, y_2, y_3 \rangle$, the dot product of x and y was given by
$$x \bullet y = x_1 y_1 + x_2 y_2 + x_3 y_3$$
and the magnitude of x was given by
$$\|x\| = \sqrt{x_1^2 + x_2^2 + x_3^2}$$
(where this formula was derived from the Pythagorean Theorem). Let us generalize these concepts to $\mathbb{R}^n$, using the notation of this class. Let x and y be the n-dimensional vectors
$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \quad \text{and} \quad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}$$

Definition: The scalar product (or dot product) of x and y is the scalar

$$x^T y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$$
The Euclidean norm (or magnitude) of x is the scalar
$$\|x\| = \sqrt{x^T x} = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$$

(Note: technically, $x^T y$ and $x^T x$ are (1 × 1) matrices. By standard abuse of notation, we treat (1 × 1) matrices as scalars.)
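In NumPy (an assumption; the vectors are arbitrary), the scalar product is the expression x @ y and the norm is np.linalg.norm:

    import numpy as np

    x = np.array([1., 2., 2.])
    y = np.array([3., 0., 4.])
    print(x @ y)              # x^T y = 1*3 + 2*0 + 2*4 = 11.0
    print(np.sqrt(x @ x))     # ||x|| = sqrt(1 + 4 + 4) = 3.0
    print(np.linalg.norm(x))  # same norm, computed by the library: 3.0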

Recall from Math 126 that the dot product is symmetric: $x \bullet y = y \bullet x$. This symmetry is apparent from the component formula above. However, we can also prove the symmetry using Theorem 1.10: note that

$$(x^T y)^T = y^T (x^T)^T = y^T x$$

But $x^T y$ is a (1 × 1) matrix, so $(x^T y)^T = x^T y$. Thus,

$$x^T y = y^T x$$
as desired.

Linear Combinations

Definition: Let $v_1, v_2, \ldots, v_n$ be vectors in $\mathbb{R}^m$. A linear combination of these vectors is any sum of the form
$$\sum_{k=1}^{n} a_k v_k = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$$
where $a_1, a_2, \ldots, a_n$ are scalars.

Example: Recall Theorem 1.5: If $A = [A_1 \ A_2 \ \cdots \ A_n]$ is an (m × n) matrix and $x = [x_1 \ x_2 \ \cdots \ x_n]^T$ is a vector in $\mathbb{R}^n$, then the product $Ax$ is given by

$$Ax = x_1 A_1 + x_2 A_2 + \cdots + x_n A_n$$

The $x_i$ are scalars (the components of x) while the $A_i$ are vectors (the columns of A). Thus, Theorem 1.5 tells us that the vector $Ax$ is a linear combination of the columns of A.
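A numerical illustration of Theorem 1.5 (a sketch assuming NumPy; A and x are arbitrary choices):

    import numpy as np

    A = np.array([[1., 2., 3.],
                  [4., 5., 6.]])
    x = np.array([2., -1., 3.])

    direct = A @ x                                      # Ax as a matrix-vector product
    combo = x[0]*A[:, 0] + x[1]*A[:, 1] + x[2]*A[:, 2]  # combination of columns
    assert np.array_equal(direct, combo)
    print(direct)   # [ 9. 21.]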

Answers to Exercise: a = 2; b = 0; c can be any scalar.
