College of Engineering and Computer Science Mechanical Engineering Department Engineering Analysis Notes

Last updated: March 24, 2014 Larry Caretto Introduction to Analysis

Introduction

These notes provide an introduction to the use of matrices in engineering analysis. Matrix notation is used to simplify the representation of systems of linear algebraic equations. In addition, the matrix representation of systems of equations provides important properties regarding the system of equations. The discussion here presents many results without proof. You can refer to a general advanced engineering math text1 or a text on linear algebra for such proofs.

Parts of these notes have been prepared for use in a variety of courses to provide background information on the use of matrices in engineering problems. Consequently, some of the material may not be used in this course and different sections from these notes may be assigned at different times in the course.

Basic matrix definitions

A matrix is represented as a two-dimensional array of elements, aij, where i is the row index and j is the column index. The entire matrix is represented by the single boldface symbol A. In general we speak of a matrix as having n rows and m columns. Such a matrix is called an (n by m) or (n x m) matrix. Equation [1] shows the representation of a typical (n x m) matrix.

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1m} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2m} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nm} \end{bmatrix} \qquad [1]$$

In general the number of rows may be different from the number of columns. Sometimes the matrix is written as A(n x m) to show its size. (Size is defined as the number of rows and the number of columns.) A matrix that has the number of rows equal to the number of columns is called a square matrix.

Matrices are used to represent physical quantities that have more than one number. These are usually used for engineering systems such as structures or networks in which we represent a collection of numbers, such as the individual stiffness of the members of a structure, as a single symbol known as a stiffness matrix. Networks of pipes, circuits, traffic streets, and the like may be represented by a connectivity matrix which indicates which pair of nodes in the matrix are directly joined to each other. The use of matrix notation and formulae for matrices leads to important analytical results. Students taking a vibrations course learn that a matrix property

1 Kreyszig, Advanced Engineering Mathematics (9th edition), Wiley, 2006, Chapter 7.

Jacaranda Hall Room 3314, Mail Code 8348, Phone: 818.677.6448, Fax: 818.677.7062, Email: [email protected]

known as its eigenvalues represents the fundamental vibration frequencies in a mechanical system.

Two matrices can be added or subtracted if both matrices have the same size. If we define a matrix, C, as the sum (or difference) of two matrices, A and B, we can write this sum (or difference) in terms of the matrices as follows.

C  A  B (possible only if A and B havethe same size) [2]

The components of the C matrix are simply the sum (or difference) of the components of the two matrices being added (or subtracted). Thus for the matrix sum (or difference) shown in equation [2], the components of C are given by the following equation.

C  A  B  cij  aij bij (i 1,n; j 1,m) [3]

The product of a matrix, A, with a single number, x, yields a second matrix whose size is the same as that of matrix A. Each component of the new matrix is the component of the original matrix, aij, multiplied by the number x. The number x in this case is usually called a scalar to distinguish it from a matrix or a matrix component.

B  xA if bij  xaij (i 1,n; j 1,m) [4]

We define two special matrices, the null matrix, 0, and the identity matrix, I. The null matrix is an arbitrary size matrix in which all the elements are zero. The identity matrix is a square matrix in which all the diagonal terms are 1 and the off-diagonal terms are zero. These matrices are sometimes written as 0(n x m) or In to specify a particular size for the null or identity matrix. The null matrix and the identity matrix are shown below.

0 0 0   0 1 0 0   0 0 0 0   0 0 1 0   0     0 0 0 0 0 0 1   0 0    I    [5]                         0 0 0   0 0 0 0   1

A matrix that has the same pattern as the identity matrix, but has terms other than ones on its principal diagonal, is called a diagonal matrix. The general term for such a matrix is diδij, where di is the diagonal term for row i and δij is the Kronecker delta; the latter is defined such that δij = 0 unless i = j, in which case δij = 1. A diagonal matrix is sometimes represented in the following form: D = diag(d1, d2, d3,…,dn); this says that D is a diagonal matrix whose diagonal components are given by di.

We call the diagonal for which the row index is the same as the column index the main or principal diagonal. Algorithms in the numerical solution of differential equations lead to matrices whose nonzero terms lie along diagonals. For such a matrix, all the nonzero terms may be represented by symbols like ai,i-k or ai,i+k. Diagonals with subscripts ai,i-k or ai,i+k are said to lie, respectively, below or above the main diagonal.

If the n rows and m columns in a matrix, A, are interchanged, we will have a new matrix, B, with m rows and n columns. The matrix B is said to be the transpose of A, written as AT.

T B  A if bij  a ji [i 1,n; j 1,m;A is (n x m);B is (m x n).] [6]

An example of an original A matrix and its transpose is shown below.

 3 14   3 12  6 T   A    A  12  2 [7] 14  2 0     6 0 

The transpose of a product of matrices equals the product of the transposes of individual matrices, with the order reversed. That is,

$$(\mathbf{AB})^T = \mathbf{B}^T\mathbf{A}^T \qquad (\mathbf{ABC})^T = \mathbf{C}^T\mathbf{B}^T\mathbf{A}^T \qquad (\mathbf{ABCD})^T = \mathbf{D}^T\mathbf{C}^T\mathbf{B}^T\mathbf{A}^T \qquad [8]$$
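
The reversal rule for transposed products can be checked numerically. The Python sketch below (with illustrative helper names) transposes nested-list matrices and verifies that (AB)^T = B^T A^T for one example; the 3x2 matrix B here is made up for the check:

```python
# Transpose of a nested-list matrix and a check of the reversal rule
# (AB)^T = B^T A^T from equation [8].
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    # c_ij = sum over k of a_ik * b_kj, equation [15]
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[3, 12, 6], [14, -2, 0]]   # the 2x3 matrix of equation [7]
B = [[1, 0], [2, 1], [0, 3]]    # an illustrative 3x2 partner

lhs = transpose(matmul(A, B))
rhs = matmul(transpose(B), transpose(A))
print(lhs == rhs)  # True
```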

Matrices with only one row are called row matrices; matrices with only one column are called column matrices.2 Although we can write the elements of such matrices with two subscripts, the subscript of one for the single row or the single column is usually not included. The examples below for the row matrix, r, and the column matrix, c, show two possible forms for the subscripts. In each case, the second matrix has the commonly used notation. When row and column matrices are used in formulas that have two matrix subscripts, the first form of the matrices shown below is implicitly used to give the second subscript for the equation.

c11 c1  c  c   21  2  r  r11 r12 r13   r1m  c  c  c  31  3 [9]  r r r   r        1 2 3 m           cn1  cn 

The transpose of a column matrix is a row matrix; the transpose of a row matrix is a column matrix. This is sometimes used to write a column matrix in the middle of text by saying, for example, that c = [1 3 -4 5]^T.

Matrix Multiplication

The definition of matrix multiplication seems unusual when encountered for the first time. However, it has its origins in the treatment of linear equations. For a simple example, we consider three two-dimensional coordinate systems. The coordinates in the first system are x1

2 Row and column matrices are called row vectors or column vectors when they are used to represent the components of a vector. In these notes we will use upper case boldface letters such as A and B to represent matrices with more than one row or more than one column; we will use lower case boldface letters such as a or b to represent matrices with only one row or only one column. We will generally refer to these matrices as vectors.

and x2. The coordinates for the second system are y1 and y2. The third system has coordinates z1 and z2. Each coordinate system is related by a coordinate transformation given by the following relations.

y1  a11x1  a12x2 z1  b11y1  b12y2 [10] y2  a21x1  a22x2 z2  b21y1  b22y2

We can obtain a relationship between the z coordinate system and the x coordinate system by combining the various components of equation [10] to eliminate the yi coordinates as follows.

z1  b11[a11x1  a12x2 ] b12[a21x1  a22x2 ] [11] z2  b21[a11x1  a12x2 ] b22[a21x1  a22x2 ]

We can rearrange these terms to obtain a set of equations similar to those in equation [10] that relates the z coordinate system to the x coordinate system.

z1  [b11a11  b12a21]x1 [b11a12  b12a22]x2  c11x1  c12x2 [12] z2  [b21a11  b22a21]x1 [b21a12  b22a22]x2  c21x1  c22x2

We see that the coefficients cij, for the new transformation are related to the coefficients for the previous transformations as follows.

c11  [b11a11  b12a21] c12  [b11a12  b12a22] [13] c21  [b21a11  b22a21] c22  [b21a12  b22a22]

There is a general form for each cij coefficient in equation [13]. Each is a sum of products of two terms. The first term from each product is a bik value whose first subscript (i) is the same as the first subscript of the cij coefficient being computed. The second term in each product is an akj value whose second subscript (j) is the same as the second subscript of the c term being computed. In each bikakj product, the second b subscript (k) is the same as the first a subscript. From these observations we can write a general equation for each of the four coefficients in equation [13] as follows.

2 cij  bikakj (i 1,2; j 1,2) [14] k1

The definition of matrix multiplication is a generalization of the simple example in equation [14] to any general sizes of matrices. In this general case, we define the product, C = AB, of two matrices, A with n rows and p columns, and B with p rows and m columns by the following equation.

$$\mathbf{C}_{(n \times m)} = \mathbf{A}_{(n \times p)}\mathbf{B}_{(p \times m)} \;\Rightarrow\; c_{ij} = \sum_{k=1}^{p} a_{ik}b_{kj} \quad (i = 1,\ldots,n;\; j = 1,\ldots,m) \qquad [15]$$

There are two important items to consider in the formula for matrix multiplication. The first is that order is important. The product AB is different from the product BA. In fact, one of the products may not be possible. The second item is the need for compatibility between the first and second matrix in the AB product.3 In order to obtain the product AB the number of columns in A must equal the number of rows in B. A simple example of matrix multiplication is shown below.

3 4 3 0  6   A    B  1 2 4  2 0    6 1 [16] 3(3)  0(1)  6(6) 3(4)  0(2)  6(1)  27 6  AB       4(3)  2(1)  0(6) 4(4)  2(2)  0(1)  10 12

Matrix multiplication is simple to program. The VBA code for multiplying two matrices is shown below.4 This code assumes that all variables have been properly declared and initialized. The code uses the obvious notation to implement equation [15]. The array components are denoted as a(i,k), b(k,j) and c(i,j). The product matrix, C, has the same number of rows, n, as in matrix A and the same number of columns, m, as in matrix B. The number of columns in A is equal to p, which must also equal the number of rows in B.

For i = 1 To n
    For j = 1 To m
        c(i, j) = 0.0
        For k = 1 To p
            c(i, j) = c(i, j) + a(i, k) * b(k, j)
        Next k
    Next j
Next i

We now examine how the coordinate transformations that we used above to introduce matrix multiplication can be represented as matrix equations. We can define matrices, A, B, and C to represent the coefficients that we used in our coordinate transformation equations.

$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \qquad \mathbf{B} = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \qquad \mathbf{C} = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix} \qquad [17]$$

The various coordinate pairs can be represented as column matrices as shown below.

x1  x11  y1   y11 z1  z11 x       y       z       [18] x2  x21 y2  y21 z2  z21

With these matrix definitions, the two sets of simultaneous linear equations shown in equation [10] can be represented by the following pair of matrix equations:

3 The terms premultiply and postmultiply are commonly used to indicate the order of the matrices involved in matrix multiplication. In the matrix product AB, we say that B is premultiplied by A or that A is postmultiplied by B. Alternatively, the terms left multiplied and right multiplied are used. In the AB product, A is right multiplied by B and B is left multiplied by A.
4 The basic code structure is the same in any language. There are three nested loops. The two outer loops cover all possible combinations of i and j to ensure that all the cij components are computed. The inner loop code is the typical code for summing a number of items.

y  Ax and z  By [19]

You can verify that the equations above are correct by applying the general formula for matrix multiplication in equation [15] to the matrix equations in [19]. To do this, you should use the definitions of A, B, x, y, and z, provided in equations [17] and [18]. If we combine the matrix equations in [19] to eliminate the y matrix, we get the following result.

z  By  BAx or z  Cx with C  BA [20]

Note the importance of the order of multiplication. In general, BA is not equal to AB.

There are two cases where the order is not important. These are multiplication by a null matrix, which produces a null matrix, and multiplication by an identity matrix, which produces the original matrix.

0A  A0  0 and AI  IA  A [21]

Although the order is not important here, the actual identity and null matrices used may be different. We can rewrite equations [21] to explicitly show the rows and columns in each matrix.

$$\mathbf{0}_{(p \times n)}\mathbf{A}_{(n \times m)} = \mathbf{0}_{(p \times m)} \qquad \mathbf{A}_{(n \times m)}\mathbf{0}_{(m \times q)} = \mathbf{0}_{(n \times q)} \qquad [22]$$
$$\mathbf{A}_{(n \times m)}\mathbf{I}_{(m \times m)} = \mathbf{I}_{(n \times n)}\mathbf{A}_{(n \times m)} = \mathbf{A}_{(n \times m)}$$

By definition the identity matrix is a square matrix. One size specification for the identity matrix, the number of rows or the number of columns, is set by the compatibility condition for matrix multiplication. Once this is done, the other size is set by the requirement that I is square. For the null matrices in equation [22], the size specifications, n or m, must match the sizes for the A matrix. Although the size specifications p and q, for the null matrices in equation [22] are arbitrary, they are usually taken as p = m and q = n to give a square null matrix as the 0A product.

Simultaneous Linear Algebraic Equations

The coordinate transformation equations are simple examples of a more general case for simultaneous linear algebraic equations. In the general case we can have a set of simultaneous equations that is written as follows

m aij x j  bi i 1,,n [23] j 1

We expect that a well-determined problem will have the number of equations, n, equal to the number of unknowns, m, but in the general case, shown above, n may be different from m. We see that this system of equations in equation [23] can be represented by matrices A(n x m), x(m x 1), and b(n x 1), where A has the form shown in equation [1], x is the column matrix [x1, x2, x3, …, xm]^T and b is the column matrix [b1, b2, b3, …, bn]^T. With these definitions, equation [23] is the same as the general equation for matrix multiplication shown in equation [15]. (Recall that we have omitted the second subscript, which is one, on the components of x and b.) The system of equations shown in equation [23] is written, in matrix form, in equation [24], below.

$$\mathbf{Ax} = \mathbf{b} \qquad [24]$$

These matrices are written out in detail below. Here the column matrix, x, appears to have more rows than the coefficient matrix, A. This is done to emphasize the notion that m may be different from n in general. Of course, m may be equal to or less than n rather than greater than n as implied in the matrices below.

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1m} \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2m} \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3m} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nm} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_m \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_n \end{bmatrix} \qquad [25]$$

In order to solve a set of simultaneous linear equations we use a process that replaces equations in the set by equivalent equations. We can replace any equation in the set by a linear combination of equations in the set without changing the solution of the system of equations. For example, consider the simple set of two equations with two unknowns

3x1  5x2 13 [26] 7x1  4x2  1

You can confirm that x1 = 1 and x2 = 2 is a solution to this set of equations. To solve this set of equations we can replace the second equation by a new equation, which is a linear combination of the two equations without changing the solution. The particular combination we seek is one that will eliminate x1. We can do this by subtracting the first equation, multiplied by 7/3, from the second equation to obtain the following pair of equations, which is equivalent to the original set in equation [26].

3x1  5x2 13 47 94 [27]  x   3 2 3

We can readily solve the second equation to find x2 = 2, and substitute this value of x2 into the first equation to find that x1 = [13 – 5(2)]/3 = 1. The general process for solving the system of equations represented by equations [23], [24], or [25], known as Gauss elimination, is similar to the one just shown. It requires a series of operations on the coefficients aij and bi to produce a set of equations with the form shown in equation [28], below, without changing the solution of the initial problem.

The basic rule in the Gauss elimination process is that we can use a linear combination of two equations to replace one of those equations, without changing the solution to the problem. This is the process that we used above in going from the set of equations in [26] to the set of equations in [27]. Both sets of equations are equivalent in the sense that both sets of equations give the same answers for x1 and x2. However, the second set of equations can be directly solved for all the unknowns.

11 12 13   1n1 1n   x1   1   0         x      22 23 2n1 2n   2   2   0 0 33   3n1 3n   x3   3                     [28]                    0 0 0   n1n1 n1n  xn1  n1         0 0 0   0 nn  xn  n 

The revised coefficient matrix in equation [28] is called an upper triangular matrix. The only nonzero terms are on or above the principal diagonal. The same operations that are used to obtain the revised coefficient matrix are used to obtain the revised right-hand-side matrix.

The revised A and b matrices are obtained in a series of steps. In the first step, the x1 coefficients are eliminated from all equations except the first one. This is done by the following replacement operations on the coefficients in equations 2 to n. The replacement notation (←) from computer programming is used here to indicate that an old value of aij is being replaced by the results of a calculation. This avoids the need to use mathematical notation that would require separate symbols for the two values.

$$a_{ij} \leftarrow a_{ij} - \frac{a_{i1}}{a_{11}}a_{1j} \quad (j = 1,\ldots,n) \qquad \text{and} \qquad b_i \leftarrow b_i - \frac{a_{i1}}{a_{11}}b_1 \qquad (i = 2,\ldots,n) \qquad [29]$$

After equation [29] is applied to all rows below the first row, the only nonzero x1 coefficient is in the first equation (represented by the first row of the matrix.) You can confirm that this will set ai1 = 0 for i > 1. You can also apply the formulae in [29] to equation [26] to see that the result is equation [27]. The elimination process is next applied to make the x2 coefficients on all equations below the second equation zero.

$$a_{ij} \leftarrow a_{ij} - \frac{a_{i2}}{a_{22}}a_{2j} \quad (j = 2,\ldots,n) \qquad \text{and} \qquad b_i \leftarrow b_i - \frac{a_{i2}}{a_{22}}b_2 \qquad (i = 3,\ldots,n) \qquad [30]$$

Equation [30] has the same form as equation [29]; only the starting points for the row and column operations are different. The process described by equations [29] and [30] continues until the form shown in equation [28] is obtained. From equation [28], the various values of x can be found by back substitution. We can simply find xn as βn/αnn. The remaining values of x are found in reverse order by the following equation.

n i  ij x j j i1 xi  i  n 1,n  2,,1 [31] ii

When we are solving for xi, all previous values of xj required in the summation are known.
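
The elimination formulas [29] and [30] and the back substitution formula [31] can be collected into a short Python sketch (no row exchanges, so it assumes the pivots are nonzero; the function name is illustrative). It reproduces the solution x1 = 1, x2 = 2 of the two-equation example:

```python
# A compact (no pivoting) version of equations [29]-[31]; real code
# would swap rows to avoid small or zero pivot elements.
def gauss_solve(a, b):
    n = len(b)
    a = [row[:] for row in a]   # work on copies
    b = b[:]
    # forward elimination: zero the coefficients below each pivot row
    for piv in range(n - 1):
        for i in range(piv + 1, n):
            factor = a[i][piv] / a[piv][piv]
            for j in range(piv, n):
                a[i][j] -= factor * a[piv][j]
            b[i] -= factor * b[piv]
    # back substitution, equation [31]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x

# the two-equation example in equation [26]
print(gauss_solve([[3.0, 5.0], [7.0, -4.0]], [13.0, -1.0]))  # approximately [1.0, 2.0]
```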

The VBA code below shows a simplified version5 of how the Gauss elimination method is applied to the solution of equations. As in previous code examples, all data values are assumed to be properly declared and initialized. The number of equations is equal to the number of unknowns, n. The row that is subtracted from all rows below it is called the pivot row. The main outer loop in the first part of the code uses the variable, pivot, to represent this row. The code execution is simplified by augmenting the a matrix so that ai,n+1 = bi. This allows the code to proceed without separate consideration of similar operations on the A and b matrix components. All operations are performed on the original A matrix so that the original data are lost in this routine.

'augment the a matrix with the b values
For row = 1 To n
    a(row, n + 1) = b(row)
Next row
'forward elimination to get an upper triangular array
For pivot = 1 To n - 1
    'code here to find the row with the largest pivot element
    For row = pivot + 1 To n
        For column = pivot + 1 To n + 1
            a(row, column) = a(row, column) - a(row, pivot) * a(pivot, column) / a(pivot, pivot)
        Next column
    Next row
Next pivot
'upper triangular matrix complete; back substitution for the x values
For row = n To 1 Step -1
    x(row) = a(row, n + 1)
    For column = row + 1 To n
        x(row) = x(row) - a(row, column) * x(column)
    Next column
    x(row) = x(row) / a(row, row)
Next row

The process outlined above for the solution of a set of simultaneous equations is known as the Gaussian elimination procedure. Alternative procedures, such as the Gauss-Jordan method and LU decomposition, work in a similar manner. They produce an upper triangular or diagonal matrix that is then used to solve for the values of xi in reverse order.

Matrix Rank Determines Existence and Uniqueness of Solutions

If the solution process outlined above is used on certain matrices, it may not be possible to obtain a solution. Consider the two sets of equations shown below.

3x1  5x2 13 3x1  5x2 13 and [32] 6x1 10x2  26 6x1 10x2  27

5 Actual code would have to account for the possibility that the system of equations might not have a solution. Typically the code switches rows of the augmented matrix (equivalent to exchanging the order of the equations so that no change is made in the solution). Before a row is considered as a pivot row, the value in the pivot column in each equation below the pivot row is examined. The row with the largest value in the pivot column (in absolute value) is exchanged with the original pivot row before subtracting the pivot row from other rows. If the pivot column does not have a nonzero element (nonzero to within numerical accuracy), the problem has a singular matrix.

In the set of equations on the left, the second equation is simply twice the first equation. If we multiply the first equation by two and subtract it from the second equation, we get the result that 0 = 0. Thus, the second equation gives us no new information on the relationship between x1 and x2. We say that this system of equations has an infinite number of solutions. Any pair of values satisfying x2 = 2.6 – 0.6x1 will satisfy both equations. The second set of equations has no solutions. If we multiply the first equation by two and subtract it from the second equation, we have the result that 0 = 1! Thus, this second set of equations is incompatible and does not have a solution.6

This simple example can be generalized to discuss the existence and uniqueness of solutions for the general set of equations. If we carry out the solution process outlined above to form an upper triangular matrix, we may have the result that one (or more) of the final rows in the coefficient matrix is all zero. This means that we cannot obtain a unique solution. Such a case is called a singular matrix. The rank of a matrix is formally defined as the number of linearly independent rows in a matrix. (This can be shown to be equal to the number of linearly independent columns.) The practical determination of rank is based on the Gauss elimination process outlined above. If the final matrix in the elimination process has n rows of which nzero rows contain all zeros, the rank of the matrix is n – nzero. (This rank is the same for both the original matrix and the upper-triangular matrix because the Gauss elimination operations do not change the matrix rank.) The A matrix for both sets of equations in equation [32] has only one linearly independent row, thus its rank is one. The upper triangular form that results when a matrix is tested for rank is sometimes called the row-echelon form. Sometimes in this form each row is divided by the diagonal element on that row so that all the diagonal elements are one.

The two matrices in equation [33] below have been placed in row-echelon form by using Gauss elimination on the original matrices. Can you determine the rank of the original matrices before looking at the answers below?

The matrix on the left of equation [33] has four rows that are not all zero; thus its rank is four. The one on the right has six rows that are not all zero; thus its rank is six. This rank-six matrix has eight columns. Because the number of linearly independent columns equals the rank of six, the eight columns must satisfy two independent linear relations.

6 0 2 0 0 0 6 0 2 0 0 0 1 6 0 1 7 8 6 2     0 1 7 8 6 2 8 4 0 0 3 4 0 0 0 0 2 0 3 5 8 0     [33] 0 0 0 9 1 2 0 0 0 4 1 0 7 3   0 0 0 0 6 0 3 5 0 0 0 0 0 0     0 0 0 0 0 0 2 1 0 0 0 0 0 0

The existence and uniqueness of solutions are defined in terms of the rank of the augmented matrix, [A,b]. This is the matrix in which the right hand side column matrix, b, is added as the final column in the A matrix. This augmented matrix is shown below for the general case of n equations and m unknowns. The n equations mean that there are n rows in the matrix. The m unknowns give m + 1 columns to the augmented matrix.

6 This result has a geometric interpretation. When we have two simultaneous linear algebraic equations, we can plot each equation in x1–x2 space. The solution to the pair of equations is located at the point where both equations intersect. If we did this for the left set of equations in [32], we would only have a single line; for the right set, we would have two parallel lines that never intersect.

$$[\mathbf{A},\mathbf{b}] = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \cdots & a_{1m} & b_1 \\ a_{21} & a_{22} & a_{23} & \cdots & a_{2m} & b_2 \\ a_{31} & a_{32} & a_{33} & \cdots & a_{3m} & b_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nm} & b_n \end{bmatrix} \qquad [34]$$

The existence and uniqueness of solutions to Ax = b is stated below without proof. If the rank of the original matrix, A, equals the rank of the augmented matrix, [A,b], and both equal the number of unknowns, m, there is a unique solution to the matrix equation, Ax = b. If the rank of the original matrix, A, equals the rank of the augmented matrix, [A,b], but the common rank is less than the number of unknowns, m, there are an infinite number of solutions to the matrix equation, Ax = b. If the rank of the original matrix, A, is not equal to the rank of the augmented matrix, [A,b], there is no solution to the matrix equation, Ax = b.

We can see that these statements are consistent with the examples in equation [32]. A formal proof of these statements is given in linear algebra texts.
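
These cases can be checked numerically by comparing the rank of A with the rank of [A,b]. The Python sketch below (the helper name is illustrative) applies the test to the two systems of equation [32]:

```python
import numpy as np

# Rank test of equation [32]: same coefficient matrix, two right-hand sides.
A = np.array([[3.0, 5.0], [6.0, 10.0]])
b_inf = np.array([[13.0], [26.0]])   # second equation = 2 x first
b_none = np.array([[13.0], [27.0]])  # incompatible pair

def classify(A, b):
    rA = np.linalg.matrix_rank(A)
    rAb = np.linalg.matrix_rank(np.hstack([A, b]))
    m = A.shape[1]
    if rA < rAb:
        return "no solution"
    return "unique solution" if rA == m else "infinite solutions"

print(classify(A, b_inf))   # infinite solutions
print(classify(A, b_none))  # no solution
```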

These guidelines for the existence and uniqueness of solutions to simultaneous linear equations are illustrated in the three sets of equations shown below. Each equation set has three equations in three unknowns. The original equation set, shown in the first column, is converted to an upper triangular form in the second column. We see that the first set has a unique solution. The second and third sets do not have a unique solution; however, there is a difference between these two. The second set has an infinite number of solutions. For any value, α, that we pick for x3 we can determine a value of x1 and x2 that is consistent with the original set of equations. However, for the third set of equations, the upper triangular form gives an inconsistent third equation. Thus this set of equations has no solution.

Set I
    Original equation set:    x1 + 4x2 + 26x3 = 2
                              2x2 + 9x3 = 5
                              7x1 - 3x2 - 8x3 = -13
    Upper triangular form:    x1 + 4x2 + 26x3 = 2
                              2x2 + 9x3 = 5
                              -50.5x3 = 50.5
    Solution:                 x1 = 0, x2 = 7, x3 = -1

Set II
    Original equation set:    x1 + 4x2 + 26x3 = 2
                              2x2 + 9x3 = 5
                              -2x1 - 10x2 - 61x3 = -9
    Upper triangular form:    x1 + 4x2 + 26x3 = 2
                              2x2 + 9x3 = 5
                              0 = 0
    Solution:                 x1 = -8 - 8α, x2 = 2.5 - 4.5α, x3 = α (α arbitrary)

Set III
    Original equation set:    x1 + 4x2 + 26x3 = 2
                              2x2 + 9x3 = 5
                              -2x1 - 10x2 - 61x3 = -8
    Upper triangular form:    x1 + 4x2 + 26x3 = 2
                              2x2 + 9x3 = 5
                              0 = 1
    Solution:                 No solution

These three sets of equations are shown in terms of their A and augmented [A b] matrices in the table below. We see that the set of equations in the table above corresponds to the data in the augmented matrix. The first set of equations has rank A = rank [A b] = 3, the number of unknowns. We have already seen that this provides the unique solution above. The second set of equations has rank A = rank [A b] = 2, less than the number of unknowns. This means that we have an infinite number of solutions. Again, this corresponds to the result above. Finally, the third case below has rank A = 2, but rank [A b] = 3. This difference in rank shows that there are no solutions.

Set I
    Original:     A = [1 4 26; 0 2 9; 7 -3 -8],        [A b] = [1 4 26 2; 0 2 9 5; 7 -3 -8 -13]
    Row-echelon:  A = [1 4 26; 0 2 9; 0 0 -50.5],      [A b] = [1 4 26 2; 0 2 9 5; 0 0 -50.5 50.5]
    Rank A = 3,  Rank [A b] = 3

Set II
    Original:     A = [1 4 26; 0 2 9; -2 -10 -61],     [A b] = [1 4 26 2; 0 2 9 5; -2 -10 -61 -9]
    Row-echelon:  A = [1 4 26; 0 2 9; 0 0 0],          [A b] = [1 4 26 2; 0 2 9 5; 0 0 0 0]
    Rank A = 2,  Rank [A b] = 2

Set III
    Original:     [A b] = [1 4 26 2; 0 2 9 5; -2 -10 -61 -8]
    Row-echelon:  [A b] = [1 4 26 2; 0 2 9 5; 0 0 0 1]
    Rank A = 2,  Rank [A b] = 3

There is one final case to consider; that is the case of homogeneous equations, where the b matrix is all zeros. If there are n equations and the rank of the coefficient matrix is n, then the only solution to the set of equations is that all xi = 0. (This is called the trivial solution.) However, if the rank is less than n, it is possible to have a solution in which the xi are not all zero; such a solution, however, is not unique.

Consider the two sets of homogeneous equations shown below. Each set of equations has a right-hand side that is all zeros. (The two equation sets are identical except for the coefficient of the x1 term in the first equation.)

 x1  4x2  3x3  0 x1  4x2  3x3  0

 4x1 11x2  6x3  0 and  4x1 11x2  6x3  0 [35]

x1 8x2  5x3  0 x1 8x2  5x3  0

If we carry out the usual solution process to create an upper triangular matrix for these two sets of equations, we obtain the following results.

 x1  4x2  3x3  0 x1  4x2  3x3  0

27x2 18x3  0 and  6x2  6x3  0 [36]

0  0  2.8x3  0

For the set of equations on the right, the rank of the coefficient matrix is the same as the number of equations. Here we have a unique solution in which all of the xi = 0. The rank of the coefficient matrix for the equations on the left is less than the number of equations. In this case, we have an infinite number of solutions. If we pick x3 = α, an arbitrary constant, we can satisfy all three

equations if x2 = 2α/3 and x1 = 3α – 4(2α/3) = α/3. One of the infinite solutions, with α = 0, is the trivial solution7 where all xi = 0.
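
NumPy confirms the rank difference between the two homogeneous sets of equation [35] (the signs are as I read the example), and that the one-parameter family above satisfies the rank-2 set:

```python
import numpy as np

# Left coefficient matrix has rank 2, so nontrivial solutions exist;
# the right matrix has full rank 3, so only the trivial solution remains.
left = np.array([[1.0, 4.0, -3.0],
                 [-4.0, 11.0, -6.0],
                 [1.0, -8.0, 5.0]])
right = np.array([[-1.0, 4.0, -3.0],
                  [-4.0, 11.0, -6.0],
                  [1.0, -8.0, 5.0]])
print(np.linalg.matrix_rank(left), np.linalg.matrix_rank(right))  # 2 3

# the family x = (alpha/3, 2*alpha/3, alpha) solves the left set
alpha = 3.0
x = np.array([alpha / 3, 2 * alpha / 3, alpha])
print(np.allclose(left @ x, 0))  # True
```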

Determinants

A determinant is a single numerical value that can be computed for a square array. The values of determinants play a theoretical role in matrix analysis and can be used for calculations on small matrices. For matrices greater than 3x3 or 4x4, alternative calculation methods are used in place of determinants.

Various notations are available for a determinant. If A is a matrix, then Det A is the determinant of the coefficients in that matrix. The determinant for an array of numbers that looks like a matrix can be written using vertical bars, | |, instead of the brackets, [ ], that we have been using for matrix coefficients. The various notations are shown below for a 2x2 array. For this array, the formula for the determinant is particularly simple.

a11 a12  a11 a12  a11 a12 A    Det A  Det    a11a22  a21a12 [37] a21 a22 a21 a22 a21 a22

For a 3x3 array, the determinant is a bit more complex.

    [a11  a12  a13]   |a11  a12  a13|   a11a22a33 + a21a32a13 + a31a12a23
Det [a21  a22  a23] = |a21  a22  a23| =                                       [38]
    [a31  a32  a33]   |a31  a32  a33|   - a11a32a23 - a21a12a33 - a31a22a13
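The six-term formula in equation [38] translates directly into code. The Python sketch below (illustrative; det3 is an invented name) evaluates the formula and confirms that it gives zero for the linearly dependent coefficient array from the left-hand set of equations [35].

```python
def det3(m):
    """Determinant of a 3x3 array from the six-term formula in equation [38]."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
    return (a11*a22*a33 + a21*a32*a13 + a31*a12*a23
            - a11*a32*a23 - a21*a12*a33 - a31*a22*a13)

assert det3([[1, 0, 0], [0, 1, 0], [0, 0, 1]]) == 1   # identity matrix
# Linearly dependent rows (left-hand set of equations [35]) give zero.
assert det3([[1, 4, -3], [-4, 11, -6], [1, -8, 5]]) == 0
```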

The general equation for computing a determinant is given in terms of minors (or cofactors) of a determinant. The minor, Mij, of a determinant is the smaller determinant that results if row i and column j are eliminated from the original determinant. The cofactor, Aij, equals (-1)^(i+j) Mij. For example, if we start with a 3x3 determinant, such as the one shown in equation [38], we can define nine possible minors (and cofactors). Four of these are shown below:

A33 = (-1)^(3+3) M33 = |a11  a12|       A22 = (-1)^(2+2) M22 = |a11  a13|
                       |a21  a22|                              |a31  a33|
                                                                              [39]
A32 = (-1)^(3+2) M32 = -|a11  a13|      A31 = (-1)^(3+1) M31 = |a12  a13|
                        |a21  a23|                             |a22  a23|

The determinant of a matrix can be written in terms of its minors or cofactors as follows.

7 You should make sure that you can place the original sets of equations in [35] and the upper triangular forms in [36] into an A and an augmented [A b] matrix and show that both sets of equations have rank A = rank [A b]. Do both sets of equations produce A matrices with the same rank? What are the ranks of the A and [A b] matrices for the two sets of equations?

n n i j Det A(n x n)  (1) aijM ij   aij Aij i1 i1 n n [40] i j  (1) aijM ij   aij Aij j 1 j 1

Note that the sum is taken over any one row or over any one column. In applying this formula, one seeks rows or columns with a large number of zeros to simplify the calculation of the determinant. We can show that this equation is consistent with the results given previously for the determinants of 2x2 and 3x3 arrays. Applying equation [40] to the third row of a 3x3 array gives the following result.
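The row expansion in equation [40] can be implemented recursively. The Python sketch below is illustrative only (it always expands along the first row rather than searching for a row or column with many zeros), but it shows the structure of the cofactor expansion.

```python
def det(a):
    """Determinant by cofactor expansion along the first row (equation [40])."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j, then recurse on the smaller array.
        minor = [row[:j] + row[j + 1:] for row in a[1:]]
        total += (-1) ** j * a[0][j] * det(minor)
    return total

assert det([[1, 2], [3, 4]]) == -2
assert det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]) == 24
```

The (-1)**j factor is the cofactor sign (-1)^(i+j) for the first row. This recursion requires on the order of n! operations, which is why it is used only for small arrays.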

Det A(3 x 3) = Σ(j=1 to 3) a3j A3j = a31 A31 + a32 A32 + a33 A33     [41]

We could have applied equation [40] to any of the three rows or any of the three columns to compute the determinant. I chose to use the third row since the necessary cofactors can be found in equation [39]. If we use equation [37] to expand the (2 x 2) cofactors in [39] and apply those results to equation [41], we obtain the following result.

Det A(3 x 3) = a31 M31 - a32 M32 + a33 M33

             = a31 |a12  a13| - a32 |a11  a13| + a33 |a11  a12|
                   |a22  a23|       |a21  a23|       |a21  a22|
                                                                              [42]
             = a31(a12a23 - a22a13) - a32(a11a23 - a21a13) + a33(a11a22 - a12a21)

             = a31a12a23 - a31a22a13 - a32a11a23 + a32a21a13 + a33a11a22 - a33a12a21

The final result, after some rearrangement, is the same as the one in equation [38].

Two rules about determinants are apparent from equation [41]:

• A determinant is zero if any row or any column contains all zeros.

• If one row or one column of a determinant is multiplied by a constant, k, the value of the determinant is multiplied by the same constant. Note the implication for matrices: if a matrix is multiplied by a constant, k, then each matrix element is multiplied by k. Thus, if A is an n x n matrix, Det(kA) = k^n Det(A).

Additional rules for and properties of determinants are stated below without proof.

• If one row (or one column) of a determinant is replaced by a linear combination of that row (or column) with another row (or column), the value of the determinant is not changed. This means that the operations of the Gauss elimination process do not change the determinant of a matrix.

• If two rows (or two columns) of a determinant are linearly dependent, the value of the determinant is zero.

• The determinant of the product of two matrices, A and B, is the product of the determinants of the individual matrices: Det(AB) = Det(A) Det(B).

• The determinant of a transposed matrix is the same as the determinant of the original matrix: Det(A^T) = Det(A).

If we apply the column expansion of equation [40] to an upper triangular matrix, A, we find that Det A = a11 A11, since the a11 term is the only nonzero term in the first column. We can apply this expansion repeatedly to the cofactors. Each application shows that the determinant is simply the new term in the upper left of the array times its cofactor. Continuing in this fashion, we see that the determinant of an upper triangular matrix is simply the product of its diagonal terms. We can combine this result with the fact noted above that the operations of the Gauss elimination process do not change the determinant of a matrix to develop a practical method for computing the determinant of any matrix: apply Gauss elimination to put the matrix in upper triangular form; the determinant (of both the original matrix and the one in upper triangular form) is then simply the product of the diagonal elements. (If any rows are interchanged during the elimination, each interchange reverses the sign of the determinant.)
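This procedure is easy to program. The Python sketch below (illustrative; det_gauss is an invented name) performs the elimination with partial pivoting, tracks the sign change from each row interchange, and returns the product of the diagonal (pivot) elements.

```python
def det_gauss(a):
    """Determinant as the product of pivots after Gauss elimination.
    Each row interchange (partial pivoting) flips the sign."""
    a = [row[:] for row in a]       # work on a copy
    n, sign, det = len(a), 1.0, 1.0
    for k in range(n):
        # Bring the largest remaining element in column k up to row k.
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        if a[p][k] == 0.0:
            return 0.0              # no nonzero pivot: singular matrix
        if p != k:
            a[k], a[p] = a[p], a[k]
            sign = -sign
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= f * a[k][c]
        det *= a[k][k]
    return sign * det

assert abs(det_gauss([[1.0, 2.0], [3.0, 4.0]]) - (-2.0)) < 1e-12
```

Unlike the cofactor expansion, this approach needs only on the order of n^3 operations, so it remains practical for large matrices.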

As an example, consider the matrices from Set I in the table above. The original matrix was

[1  -4   2]
[0   2   9]
[7   3   8]

and its upper triangular form was

[1  -4   2  ]
[0   2   9  ]
[0   0  50.5]

We can readily compute the determinant as the product (1)(2)(50.5) = 101. You can show that the same value is obtained by the conventional formula for the evaluation of the original 3 x 3 determinant.

Determinants are not used in normal numerical calculations. However, if you need to find the numerical value of a large determinant, the process outlined above is the most direct numerical approach.

Cramer’s rule gives the solution to a system of linear equations in terms of determinants. According to Cramer’s rule, the solution for a particular unknown xi is the ratio of two determinants. The determinant in the denominator uses the usual matrix coefficients, aij. The determinant in the numerator consists of the aij coefficients except in one column: when we are solving for xi, we replace column i of the aij coefficients by the right-hand-side coefficients, bi. For a set of three equations in three unknowns, Cramer’s rule gives the solutions shown in equation [43].

Cramer’s rule allows us to find an analytical expression for the solution of a set of equations, and it is sometimes used to solve small sets of equations (2 x 2 or 3 x 3). However, it is never used for numerical calculations on larger systems because it becomes extremely time consuming as the number of equations increases.

     |b1  a12  a13|        |a11  b1  a13|        |a11  a12  b1|
     |b2  a22  a23|        |a21  b2  a23|        |a21  a22  b2|
     |b3  a32  a33|        |a31  b3  a33|        |a31  a32  b3|
x1 = --------------   x2 = --------------   x3 = --------------     [43]
     |a11 a12 a13 |        |a11 a12 a13 |        |a11 a12 a13 |
     |a21 a22 a23 |        |a21 a22 a23 |        |a21 a22 a23 |
     |a31 a32 a33 |        |a31 a32 a33 |        |a31 a32 a33 |
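Equation [43] can be implemented directly. The Python sketch below (illustrative; cramer3 is an invented name) forms each numerator determinant by replacing one column of the coefficient array with the right-hand-side column.

```python
def cramer3(a, b):
    """Solve a 3x3 system by Cramer's rule (equation [43])."""
    def det3(m):
        (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = m
        return (a11*a22*a33 + a21*a32*a13 + a31*a12*a23
                - a11*a32*a23 - a21*a12*a33 - a31*a22*a13)
    d = det3(a)
    xs = []
    for col in range(3):
        m = [row[:] for row in a]
        for i in range(3):
            m[i][col] = b[i]        # replace one column with the b column
        xs.append(det3(m) / d)
    return xs

# x + y + z = 6, 2y + 5z = -4, 2x + 5y - z = 27  has solution (5, 3, -2).
x = cramer3([[1, 1, 1], [0, 2, 5], [2, 5, -1]], [6, -4, 27])
assert all(abs(u - v) < 1e-9 for u, v in zip(x, (5.0, 3.0, -2.0)))
```

Note that the denominator determinant is computed once; if it is zero, Cramer’s rule fails, which is consistent with the rank discussion above.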

Determinants are also related to rank. An array in which the rows are not linearly independent will have a zero determinant. As an example of this, consider the left-hand set of equations from equation [35]. Recall that the coefficients for that set of three equations had a rank of two because the equations were not linearly independent. When we evaluate the determinant for this array below, using equation [38] for the determinant of a (3x3) array, we find that the determinant is zero.

1  4 3  (1)(11)(5)  (4)(8)(3)  (1)(4)(6)  4 11  6  (1)(8)(6)  (4)(4)(5)  (1)(11)(3) [44] 1 8 5  55  96  24  (48) 80  33  0

This gives us another approach to determining when a set of equations with all zeros on the right-hand side has a solution other than the trivial one in which all xi are zero. The condition is that the determinant of the coefficient matrix is zero. If Det(A) = 0, then a solution to Ax = b, where b contains all zeros, that does not have all xi = 0 is possible. There are actually an infinite number of such solutions, which differ by an arbitrary multiplier. We will use this idea below when considering matrix eigenvalues.

Inverse of a Matrix

We have defined operations for adding, subtracting and multiplying matrices. Matrix inversion is the matrix analog of division. For a square matrix, A, we define a matrix inverse, A-1, by the following equation.

A A^-1 = A^-1 A = I     [45]

If we have a matrix equation, Ax = b, we can, in principle, solve this equation by premultiplying both sides of the equation by A-1. This gives the following result.

1 1 1 1 If Ax  b, A Ax  A b  Ix  A b  x  A b [46]

The various steps in equation [46] use the definition, in equation [45], that the product of a matrix and its inverse is the identity matrix, and the definition, in equation [21], that the product of any matrix with the identity matrix is the original matrix. Although the result that x = A^-1 b may be written as the solution to the original equation, the actual solution of matrix equations like Ax = b is done by methods other than the direct calculation of the inverse. It is not always possible to find the inverse. A square matrix that has no inverse is called a singular matrix.

It is usually not necessary to find the inverse of a matrix. If necessary, you can find a numerical value of the inverse by the same process used to solve simultaneous linear algebraic equations. To understand how this is done, we define a second matrix, B, as A-1. Then, by the definition of inverse we have the following equation.

If B = A^-1, then AB = I     [47]

Equation [48] shows the matrices involved in this equation.

[a11  a12  a13  ...  a1n] [b11  b12  b13  ...  b1n]   [1  0  0  ...  0]
[a21  a22  a23  ...  a2n] [b21  b22  b23  ...  b2n]   [0  1  0  ...  0]
[a31  a32  a33  ...  a3n] [b31  b32  b33  ...  b3n] = [0  0  1  ...  0]     [48]
[ :    :    :         : ] [ :    :    :         : ]   [:  :  :       :]
[an1  an2  an3  ...  ann] [bn1  bn2  bn3  ...  bnn]   [0  0  0  ...  1]

We have a form similar to the usual problem of solving a set of equations. The coefficient matrix, A, is the same, but we have n right-hand-side columns of known values. Each of these columns of known values corresponds to one column of unknowns in the B matrix that is A^-1. If we use our usual process for solving Ax = b with, for example, b = [1 0 0 0 …0]^T, we will obtain the first column of B = A^-1. Repeating the process for similar b columns, which are all zeros except for a 1 in row k, gives us column k of the inverse. For example, equation [49] shows the solution for the second column of B = A^-1.

[a11  a12  a13  ...  a1n] [b12]   [0]
[a21  a22  a23  ...  a2n] [b22]   [1]
[a31  a32  a33  ...  a3n] [b32] = [0]     [49]
[ :    :    :         : ] [ : ]   [:]
[an1  an2  an3  ...  ann] [bn2]   [0]

Because the operations for solving a set of simultaneous linear equations are based on the A matrix only, the solution for the inverse is actually done simultaneously for all columns.
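The procedure just described, solving [A | I] for all columns at once, can be sketched as a Gauss-Jordan elimination. The Python code below is illustrative (the function name inverse is invented for this example); it augments A with the identity matrix and reduces A to the identity, leaving A^-1 in the augmented columns.

```python
def inverse(a):
    """Invert a matrix by Gauss-Jordan elimination on [A | I].
    Each identity column yields one column of A^-1, as in equation [48]."""
    n = len(a)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for k in range(n):
        # Partial pivoting: move the largest pivot candidate into row k.
        p = max(range(k, n), key=lambda r: abs(aug[r][k]))
        aug[k], aug[p] = aug[p], aug[k]
        piv = aug[k][k]
        if piv == 0.0:
            raise ValueError("matrix is singular")
        aug[k] = [v / piv for v in aug[k]]
        for r in range(n):
            if r != k:
                f = aug[r][k]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[k])]
    return [row[n:] for row in aug]

inv = inverse([[4.0, 7.0], [2.0, 6.0]])
# Known 2x2 inverse: (1/10)[[6, -7], [-2, 4]]
assert all(abs(inv[i][j] - [[0.6, -0.7], [-0.2, 0.4]][i][j]) < 1e-12
           for i in range(2) for j in range(2))
```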

An analytical expression for the inverse can be obtained in terms of the cofactors discussed in the section on determinants. We continue to define B = A^-1; the components of the inverse, bij, are then given in terms of the minors or cofactors of the original A matrix and its determinant.

If B = A^-1, then bij = Aji / Det(A) = (-1)^(i+j) Mji / Det(A)     [50]

The simplest application of this equation is to a 2x2 matrix. For such a matrix,

b11 = (-1)^(1+1) M11 / Det(A) = a22 / Det(A)      b12 = (-1)^(1+2) M21 / Det(A) = -a12 / Det(A)
                                                                                        [51]
b21 = (-1)^(2+1) M12 / Det(A) = -a21 / Det(A)     b22 = (-1)^(2+2) M22 / Det(A) = a11 / Det(A)

Combining the results of equation [51] with equation [37] for a 2x2 determinant, gives the following result for the inverse of a 2x2 matrix.

[a11  a12]^-1          1          [ a22  -a12]
[a21  a22]     = --------------- [-a21   a11]     [52]
                 a11a22 - a21a12
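Equation [52] is easy to check numerically. The Python sketch below (illustrative; inv2 is an invented name) builds the inverse from the formula and verifies that the product with the original matrix is the identity matrix.

```python
def inv2(a):
    """2x2 inverse from equation [52]."""
    (a11, a12), (a21, a22) = a
    d = a11*a22 - a21*a12
    return [[ a22/d, -a12/d],
            [-a21/d,  a11/d]]

a = [[4.0, 7.0], [2.0, 6.0]]
b = inv2(a)
# A times A^-1 should give the identity matrix.
prod = [[sum(a[i][k]*b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
assert all(abs(prod[i][j] - (1.0 if i == j else 0.0)) < 1e-12
           for i in range(2) for j in range(2))
```

Note that d is the determinant from equation [37]; if it is zero the matrix is singular and has no inverse.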

You can easily show that this is correct by multiplying the original matrix by its inverse. You will obtain a unit matrix from either multiplication: A A^-1 or A^-1 A. The same process can be used to find the inverse of a (3x3) matrix; the result is shown below:

[a11  a12  a13]^-1    1   [(a22a33 - a32a23)  (a32a13 - a33a12)  (a12a23 - a22a13)]
[a21  a22  a23]     = - · [(a31a23 - a33a21)  (a11a33 - a31a13)  (a21a13 - a11a23)]     [53]
[a31  a32  a33]       D   [(a21a32 - a31a22)  (a31a12 - a11a32)  (a11a22 - a21a12)]

where D = a11a22a33 + a21a32a13 + a31a12a23 - a11a32a23 - a21a12a33 - a31a22a13

Equations [52] and [53] show the value of determinants in providing analytical expressions for inverses. Although determinants are valuable in such cases, any use of determinants should be avoided in numerical work.

The general rule for the inverse of a matrix product and the inverses of the individual matrices is similar to the same equation for the transpose of a matrix product and the product of the transposes of the individual matrices. This relation is shown below.

1 1 1 1 1 1 1 1 (AB)  B A (ABC)  C B A (ABCD)  [54]

Matrix Eigenvalues and Eigenvectors

If a square matrix can premultiply a column vector and return the original column vector multiplied by a scalar, the scalar is said to be an eigenvalue of the matrix and the column vector is called an eigenvector. In the following equation, the scalar, λ, is an eigenvalue of the matrix A, and x is an eigenvector.

A(n x n) x(n x 1) = λ x(n x 1)     [55]

We can use the identity matrix to rewrite this equation as follows.

[A(n x n) - λ I(n x n)] x(n x 1) = 0(n x 1)     [56]

[(a11 - λ)   a12        a13       ...   a1n      ] [x1]   [0]
[ a21       (a22 - λ)   a23       ...   a2n      ] [x2]   [0]
[ a31        a32       (a33 - λ)  ...   a3n      ] [x3] = [0]     [57]
[  :          :          :              :        ] [: ]   [:]
[ an1        an2        an3       ...  (ann - λ) ] [xn]   [0]

As discussed above in the sections on simultaneous linear equations and determinants, equation [57] has the solution that all values of xi are zero. It may have a nonzero solution if the determinant of the coefficient matrix is zero. That is, Matrix Introduction L. S. Caretto, March 24, 2014 Page 19

                  [(a11 - λ)   a12        a13       ...   a1n      ]
                  [ a21       (a22 - λ)   a23       ...   a2n      ]
Det[A - λI] = Det [ a31        a32       (a33 - λ)  ...   a3n      ] = 0     [58]
                  [  :          :          :              :        ]
                  [ an1        an2        an3       ...  (ann - λ) ]

From the general expression for a determinant, we see that one component of the final expression for a determinant of any size is the product of all the elements on the principal diagonal. In equation [58] this term gives an nth-order polynomial in λ (for our n x n matrix). This nth-order polynomial is known as the characteristic equation of the matrix. The characteristic equation can be solved for n values of λ, not all of which need be distinct. For a two-by-two matrix, setting Det[A - λI] = 0 gives the following quadratic equation.

|a11 - λ     a12   |
|  a21    a22 - λ  | = (a11 - λ)(a22 - λ) - a21a12 = λ^2 - (a11 + a22)λ + a11a22 - a21a12 = 0     [59]

We can solve the quadratic equation in [59] to get two roots that give us the two possible eigenvalues:

    (a11 + a22) ± sqrt[(a11 + a22)^2 - 4(a11a22 - a21a12)]
λ = ------------------------------------------------------     [60]
                              2
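Equation [60] can be evaluated directly. The Python sketch below (illustrative; eig2 is an invented name, and real roots are assumed so the discriminant must be non-negative) returns the two eigenvalues of a 2x2 matrix.

```python
import math

def eig2(a):
    """Eigenvalues of a 2x2 matrix from the quadratic formula, equation [60].
    Assumes the discriminant is non-negative (real eigenvalues)."""
    (a11, a12), (a21, a22) = a
    tr = a11 + a22              # sum of the diagonal terms
    det = a11*a22 - a21*a12     # determinant, equation [37]
    s = math.sqrt(tr*tr - 4.0*det)
    return (tr + s) / 2.0, (tr - s) / 2.0

# The matrix used in the numerical example below has eigenvalues 2 and 1.
assert eig2([[1.0, 5.0], [0.0, 2.0]]) == (2.0, 1.0)
```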

Each eigenvalue will have its own eigenvector, found by solving equation [56]. If we denote the eigenvectors as x(1) and x(2), the components of eigenvector j may be written as x(j)1 and x(j)2. Accordingly, we have to solve the set of equations shown below twice: once for λ1 and once for λ2.

(a11 - λj) x(j)1 + a12 x(j)2 = 0
                                      [61]
a21 x(j)1 + (a22 - λj) x(j)2 = 0

Again, the solution is not unique. Any set of x values, multiplied by an arbitrary constant, will satisfy this set of equations. For simplicity we pick x(j)1 = α. We then have two possible results for the eigenvector component x(j)2, depending on which equation we use.

x(j)1 = α     x(j)2 = [(λj - a11)/a12] α = [a21/(λj - a22)] α     [62]

There appear to be two different solutions for x(j)2, depending on the use of the first or second equation to get this eigenvector component. However, equating these two values for x(j)2 eliminates the arbitrary constant, α, and recovers equation [59], which we solved for λ. Thus the two possible expressions for x(j)2 in equation [62] give the same value.

The approach outlined above for finding eigenvalues and eigenvectors can, in principle, be applied to any size of matrix. However, numerical methods are used to find the eigenvalues and eigenvectors of large matrices. The eig function in MATLAB can be used to find both eigenvalues and eigenvectors.

As a numerical example, consider the determination of the eigenvalues and eigenvectors of the matrix

A = [1  5]
    [0  2]

You can find the answer using equations [61] and [62]. However, we will outline the entire solution process as an example of finding eigenvalues and eigenvectors for larger systems. Solving the equation Det[A - λI] = 0 for this matrix gives the following result.

Det[A - λI] = |1 - λ     5   | = (1 - λ)(2 - λ) - (0)(5) = 0
              |  0     2 - λ |

The roots of this equation are λ1 = 2 and λ2 = 1. We now substitute each eigenvalue into the equation (A - λkI)x(k) = 0 and solve for the components of each eigenvector. For the first eigenvector we obtain

(1 - 2) x(1)1 + 5 x(1)2 = 0
                                 [63]
0 x(1)1 + (2 - 2) x(1)2 = 0

We see that the last equation reduces to 0 = 0, which gives us no useful information. Since we know that the homogeneous equation set has an infinite number of solutions, we pick an arbitrary value, α, for x(1)2. With this value, the first equation gives us the result that x(1)1 = 5α. Thus our first eigenvector is x(1) = [5α  α]^T. We can apply the same procedure to find the second eigenvector.

(1 - 1) x(2)1 + 5 x(2)2 = 0
                                 [64]
0 x(2)1 + (2 - 1) x(2)2 = 0

Here both equations tell us that x(2)2 must be zero. However, there is no constraint on x(2)1. We conclude that it must be an arbitrary quantity that we will call β. This gives our second eigenvector, x(2) = [β  0]^T. We can verify our solution for the eigenvalues and eigenvectors by showing that they satisfy the defining equation, Ax = λx.

1 55 (1)(5)  (5)() 10 5 Ax(1)            2   1x(1) [65] 0 2   (0)(5)  (2)()  2    

1 5 (1)()  (5)(0)   Ax(2)           1   2x(2) [66] 0 20 (0)()  (2)(0) 0 0

The calculations above show that the definition of eigenvalues and eigenvectors is satisfied regardless of our choices for α and β. This is a general result. We are always free to choose one component of the eigenvector; however, the remaining components are then fixed. Typically the eigenvector components are chosen to give a simple expression for the eigenvector (i.e., one in which all the components are integers or simple fractions) or a unit vector.8

In the example of the two-by-two matrix used above, we could express the eigenvectors we found in any of the ways shown immediately below. The last expression shown for each eigenvector is a unit vector. Note that the two eigenvectors are not orthogonal in this example.

 5  5 5  26  1 x(1)    x(1)    x(1)  x(2)    x(2)    [67]    1  1  0 0  26

8 A unit vector, u, is a row or column vector whose two-norm is one: ||u||2 = sqrt(Σk uk^2) = 1. If we have a vector, v, that is not a unit vector, we can convert it into a unit vector by dividing each component by ||v||2. This gives the components of the new unit vector by the following equation: ui = vi / sqrt(Σk vk^2).