Section 5: Linear Systems and Matrices
Washkewicz College of Engineering

Solution Methods – System of Linear Equations

Earlier we saw that a generic system of n equations in n unknowns

$$\begin{aligned}
b_1 &= a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n \\
b_2 &= a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n \\
&\;\;\vdots \\
b_n &= a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n
\end{aligned}$$

could be represented in the following format

$$\begin{Bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{Bmatrix} =
\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{Bmatrix}$$

or compactly

$$\{B\} = [A]\{X\}$$

The elements of the square [A] matrix and the {b} vector will be known, and our goal is finding the elements of the vector {x}.

Finding the elements of the {x} vector can be accomplished with any of an extensive and quite diverse library of methods. All of these methods seek to solve a linear system of equations that can be expressed in matrix format as

Ax  b for the vector {x}. If we could simply “divide” this expression by the matrix [A], i.e.,

x  A1 b then we could easily formulate the vector {x}. As we will see this task is labor intensive. The methods used to accomplish this can be broadly grouped into the following two categories:

1. direct methods
2. iterative methods

Each category contains a number of methods and we will look at several from each. Keep in mind that hybrid methods exist that combine techniques from the two categories.

Basic Definitions

In algebra we easily make use of the concepts of zero and one as follows:

$$\alpha \cdot 0 = 0 \qquad \alpha \cdot 1 = 1 \cdot \alpha = \alpha$$

where $\alpha$ is a scalar quantity. A scalar certainly possesses a reciprocal, or multiplicative inverse, that when applied to the scalar quantity produces one:

$$\alpha^{-1}\,\alpha = \alpha\,\alpha^{-1} = 1$$

The above can be extended to n × n matrices. Here the scalar one (1) becomes the identity matrix [I], and zero becomes the null matrix [0], i.e.,

$$[A] + [0] = [0] + [A] = [A] \qquad [A][I] = [I][A] = [A]$$


At this point we note that if there is an n × n matrix [A]⁻¹ that pre- and post-multiplies the matrix [A] such that

$$[A]^{-1}[A] = [A][A]^{-1} = [I]$$

then the matrix [A]⁻¹ is termed the inverse of the matrix [A] with respect to matrix multiplication. The matrix [A] is said to be invertible, or non-singular, if [A]⁻¹ exists, and non-invertible, or singular, if [A]⁻¹ does not exist.

The concept of matrix inversion is important in the study of structural analysis with matrix methods. We will study this topic in detail several times, and refer to it often throughout the course.

We will formally define the inverse of a matrix through the use of the determinant of the matrix and its adjoint matrix. We will do that in a formal manner after revisiting the properties of the determinant and the cofactors of a matrix.

However, there are a number of methods that enable one to find the solution without finding the inverse of the matrix. Probably the best known of these is Cramer's rule, followed by Gaussian elimination and the Gauss-Jordan method.

Cramer's Rule – Three Equations and Three Unknowns

It is unfortunate that the method for the solution of linear equations that students usually remember from secondary education is Cramer's rule, which is really an expansion by minors (a topic discussed subsequently). This method is rather inefficient and relatively difficult to program. However, as it forms a standard by which other methods can be judged, we will review it here for a system of three equations and three unknowns. The more general formulation is inductive.

Consider the following system of three equations in terms of three unknowns {x1, x2, x3}:

$$\begin{Bmatrix} b_1 \\ b_2 \\ b_3 \end{Bmatrix} =
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
\begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix}$$

where we identify

$$[A] = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

and

$$[A]_1 = \begin{bmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{bmatrix} \qquad
[A]_2 = \begin{bmatrix} a_{11} & b_1 & a_{13} \\ a_{21} & b_2 & a_{23} \\ a_{31} & b_3 & a_{33} \end{bmatrix} \qquad
[A]_3 = \begin{bmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ a_{31} & a_{32} & b_3 \end{bmatrix}$$

The solution is formulated as follows:

$$x_1 = \frac{|A_1|}{|A|} = \frac{\begin{vmatrix} b_1 & a_{12} & a_{13} \\ b_2 & a_{22} & a_{23} \\ b_3 & a_{32} & a_{33} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}} \qquad
x_2 = \frac{|A_2|}{|A|} = \frac{\begin{vmatrix} a_{11} & b_1 & a_{13} \\ a_{21} & b_2 & a_{23} \\ a_{31} & b_3 & a_{33} \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}}$$

and

$$x_3 = \frac{|A_3|}{|A|} = \frac{\begin{vmatrix} a_{11} & a_{12} & b_1 \\ a_{21} & a_{22} & b_2 \\ a_{31} & a_{32} & b_3 \end{vmatrix}}{\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}}$$

Proof follows from the solution of a system of two equations and two unknowns.

For a system of n equations with n unknowns this solution method requires evaluating the determinant of the matrix [A] as well as the augmented matrices (see above) in which the jth column has been replaced by the elements of the vector {B}. Evaluation of the determinant of an n × n matrix requires about $3n^2$ operations, and this must be repeated for each unknown. Thus solution by Cramer's rule requires at least $3n^3$ operations.
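To make the procedure concrete, the following is a minimal Python sketch of Cramer's rule for the 3 × 3 case. The function name and the use of numpy.linalg.det for the determinants are choices made for this illustration, not part of the notes.

```python
import numpy as np

def solve_cramer_3x3(A, b):
    """Solve a 3x3 system [A]{x} = {b} by Cramer's rule.

    Each unknown x_j is the ratio det[A_j]/det[A], where [A_j]
    is [A] with its j-th column replaced by {b}.
    """
    det_A = np.linalg.det(A)
    if np.isclose(det_A, 0.0):
        raise ValueError("det[A] = 0: the matrix is singular")
    x = np.empty(3)
    for j in range(3):
        Aj = A.copy()
        Aj[:, j] = b                      # replace the j-th column with {b}
        x[j] = np.linalg.det(Aj) / det_A
    return x

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 2.0],
              [1.0, 0.0, 0.0]])
b = np.array([4.0, 5.0, 6.0])
print(solve_cramer_3x3(A, b))   # agrees with np.linalg.solve(A, b)
```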


Gaussian Elimination

Let us consider a simpler algorithm, which forms the basis for one of the most reliable and stable direct methods for the solution of linear equations. It also provides a method for the inversion of matrices. Let us begin by describing the method and then trying to understand why it works.

Consider representing the set of linear equations as

$$\left(\begin{array}{cccc|c}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & & \vdots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} & b_n
\end{array}\right)$$

Here we have suppressed the presence of the elements of the solution vector {x}, and parentheses are used in lieu of brackets and braces so as not to imply matrix multiplication in this expression. We will refer to the above as an "augmented matrix."


Now we perform a series of operations on the rows and columns of the coefficient matrix [A], and we shall carry the row operations through to include the elements of the constant vector {B}. The rows are treated as if they were the equations, so that anything done to one element is done to all. Start by dividing the first row, including its element of the vector {B}, by the lead element (initially a11), so that the first row has a one (1) in the first column. The first row is then multiplied by an appropriate constant and subtracted from each of the lower rows, so that all rows but the first have zero in the first column. This is repeated for each succeeding row: the second row is divided by its second element, producing a one in the second column, and this row is multiplied by appropriate constants and subtracted from the lower rows, producing zeros in the second column. The process is repeated until the following matrix is obtained:

1        12 1n  1  0 1  23   2n 2  0 0              1 an1, n n1     0 0 0  1 n  9 Section 5: Linear Systems and Matrices Washkewicz College of Engineering

When the diagonal coefficients are all unity, the last term of the vector {β} contains the value of xn, i.e.,

$$x_n = \beta_n$$

This can be used in the (n − 1)th equation, represented by the second-to-last line, to obtain x_{n-1}, and so on right up to the first line, which yields the value of x1:

$$x_i = \beta_i - \sum_{j=i+1}^{n} \alpha_{ij}\, x_j$$


Gauss-Jordan Elimination

A simple modification to the Gauss elimination method allows us to obtain the inverse of the matrix [A] as well as the solution vector {x}. Consider representing the set of linear equations as

$$\left(\begin{array}{cccc|c|cccc}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 & 1 & 0 & \cdots & 0 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} & b_n & 0 & 0 & \cdots & 1
\end{array}\right)$$

Now the unit matrix [I] is included in the augmented matrix. The procedure is carried out as before, except that the elimination now produces zeros in the columns above as well as below each diagonal element, and the same row operations are conducted on the unit matrix as well. At the end of the procedure the left partition has been reduced to the identity matrix, so we have both solved the system of equations and found the inverse of the original matrix.
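A minimal sketch of the inversion procedure, assuming numpy; the function name is illustrative. Augmenting with {b} as well would solve the system in the same pass.

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Invert [A] by row-reducing the augmented matrix [A | I] to [I | A^-1]."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for k in range(n):
        if M[k, k] == 0.0:
            raise ZeroDivisionError("zero pivot: pivoting would be required")
        M[k, :] /= M[k, k]
        for i in range(n):
            if i != k:                 # eliminate above AND below the diagonal
                M[i, :] -= M[i, k] * M[k, :]
    return M[:, n:]
```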


Example 5.1


Example 5.2


The Determinant of a Matrix

A square matrix of order n (an n × n matrix), i.e.,

$$[A] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$$

possesses a uniquely defined scalar that is designated as the determinant of the matrix, or merely the determinant

$$\det[A] = |A|$$

Observe that only square matrices possess determinants.


Vertical lines, not brackets, designate a determinant, and while det[A] is a number and has no elements, it is customary to represent it as an array of the elements of the matrix:

$$\det[A] = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$

A general procedure for finding the value of a determinant is sometimes called "expansion by minors." We will discuss this method after going over some ground rules for operating with determinants.

Rules for Operating with Determinants

Rules pertaining to the manipulation of determinants are presented in this section without formal proof. Their validity is demonstrated through examples presented at the end of the section.

Rule #1: Interchanging any row (or column) of a determinant with its immediately adjacent row (or column) flips the sign of the determinant.

Rule #2: The multiplication of any single row (column) of a determinant by a scalar constant is equivalent to the multiplication of the determinant by the scalar.

Rule #3: If any two rows (columns) of a determinant are identical, the value of the determinant is zero and the matrix from which the determinant is derived is said to be singular.

Rule #4: If any row (column) of a determinant contains nothing but zeros, then the matrix from which the determinant is derived is singular.

Rule #5: If any two rows (two columns) of a determinant are proportional, i.e., the two rows (two columns) are linearly dependent, then the determinant is zero and the matrix from which the determinant is derived is singular.

Rule #6: If the elements of any row (column) of a determinant are added to or subtracted from the corresponding elements of another row (column), the value of the determinant is unchanged.

Rule #6a: If the elements of any row (column) of a determinant are multiplied by a constant and then added to or subtracted from the corresponding elements of another row (column), the value of the determinant is unchanged.

Rule #7: The value of the determinant of a triangular matrix is equal to the product of the terms on the diagonal.

Rule #8: The value of the determinant of a matrix is equal to the value of the determinant of the transpose of the matrix.

Rule #9: The determinant of the product of two matrices is equal to the product of the determinants of the two matrices.

Rule #10: If the determinant of the product of two square matrices is zero, then at least one of the two matrices is singular.

Rule #11: If an m × n rectangular matrix [A] is post-multiplied by an n × m rectangular matrix [B], the resulting square matrix [C] = [A][B] of order m will, in general, be singular if m > n.

Rule #12: A determinant may be evaluated by summing the products of every element in any row or column with its respective cofactor. This is known as Laplace's expansion.

Rule #13: If all cofactors in a row or a column are zero, the determinant is zero and the matrix from which they are derived is singular.

Rule #14: If the elements in a row or a column of a determinant are multiplied by the cofactors of the corresponding elements of a different row or column, the resulting sum of these products is zero.
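The rules above are easy to spot-check numerically. The short sketch below, which assumes numpy (not part of the notes), verifies Rule #1 and Rule #9 for a random matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((4, 4))
B = rng.random((4, 4))

# Rule #1: swapping two adjacent rows flips the sign of the determinant.
A_swapped = A[[1, 0, 2, 3], :]
assert np.isclose(np.linalg.det(A_swapped), -np.linalg.det(A))

# Rule #9: det([A][B]) = det[A] * det[B]
assert np.isclose(np.linalg.det(A @ B),
                  np.linalg.det(A) * np.linalg.det(B))
```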


Example 5.3


Example 5.4


Example 5.5


Minors and Cofactors

Consider the nth order determinant:

$$\det[A] = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$

The mth order minor of the nth order matrix is the determinant formed by deleting (n − m) rows and (n − m) columns in the nth order determinant. For example, the minor $|M|_{ir}$ of the determinant |A| is formed by deleting the ith row and the rth column. Because |A| is an nth order determinant, the minor $|M|_{ir}$ is of order m = n − 1 and contains $m^2$ elements. In general, a minor formed by deleting p rows and p columns in the nth order determinant |A| is an (n − p)th order minor. If p = n − 1, the minor is of first order and contains only a single element from |A|. From this it is easy to see that the determinant |A| contains $n^2$ first order minors, each containing a single element.

When dealing with minors other than the (n − 1)th order, the designation of the eliminated rows and columns of the determinant |A| must be considered carefully. It is best to consider consecutive rows j, k, l, m, … and consecutive columns r, s, t, u, …, so that the (n − 1)th, (n − 2)th, and (n − 3)th order minors would be designated, respectively, as $|M|_{j,r}$, $|M|_{jk,rs}$, and $|M|_{jkl,rst}$.

The complementary minor, or the complement of the minor, is designated as |N| (with subscripts). This minor is the determinant formed by placing the elements that lie at the intersections of the deleted rows and columns of the original determinant into a square array, in the same order that they appear in the original determinant. For example, given the determinant above, then

$$|N|_{23} = a_{23}$$

$$|N|_{23,31} = \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}$$


The algebraic complement of the minor |M| is the "signed" complementary minor. If a minor is obtained by deleting rows i, k, l and columns r, s, t from the determinant |A|, the minor is designated

$$|M|_{ikl,rst}$$

the complementary minor is designated

$$|N|_{ikl,rst}$$

and the algebraic complement is designated

$$(-1)^{i+k+l+r+s+t}\, |N|_{ikl,rst}$$

The cofactor, designated with capital letters and subscripts, is the signed (n − 1)th order minor formed from the nth order determinant. Suppose that the (n − 1)th order minor is formed by deleting the ith row and jth column from the determinant |A|. Then the corresponding cofactor is

$$A_{ij} = (-1)^{i+j}\, |M|_{ij}$$

Observe that the cofactor has no meaning for minors with orders smaller than (n − 1), unless the minor itself is being treated as a determinant of order one less than the determinant |A| from which it was derived. Also observe that when the minor is of order (n − 1), the product of the cofactor and the complement is equal to the product of the minor and the algebraic complement.

We can assemble the cofactors of a square matrix of order n (an n × n matrix) into a square cofactor matrix, i.e.,

$$[A]^C = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{bmatrix}$$

So when the elements of a matrix are denoted with capital letters, the matrix represents a matrix of cofactors for another matrix.
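As an illustration, a short sketch that assembles the cofactor matrix numerically; the helper name cofactor_matrix and the use of numpy are assumptions of this example, and the same helper is reused later when the inverse is built from the adjoint.

```python
import numpy as np

def cofactor_matrix(A):
    """Build the cofactor matrix: A_ij = (-1)^(i+j) |M|_ij,
    where |M|_ij is the minor with row i and column j deleted."""
    n = A.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C
```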


Example 5.6


Example 5.7


Determinants through Expansion by Minors

Using Rule #12, the determinant of a three by three matrix can be computed via the expansion of the matrix by minors as follows:

$$\det[A] = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
- a_{21}\begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix}
+ a_{31}\begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}$$

This can be confirmed using the classic expansion technique for 3 × 3 determinants. This expression can be rewritten as

$$\det[A] = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11}|M|_{11} - a_{21}|M|_{21} + a_{31}|M|_{31}$$

or, using cofactor notation,

$$\det[A] = |A| = a_{11}A_{11} + a_{21}A_{21} + a_{31}A_{31}$$
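The first-column expansion generalizes to a recursive routine. The sketch below (pure Python, hypothetical function name) evaluates a determinant exactly this way; its cost grows factorially with n, which vividly illustrates why Cramer's rule and expansion by minors are impractical for large systems.

```python
def det_by_minors(A):
    """Determinant by Laplace expansion down the first column (Rule #12).

    A is a list of row lists. Exponential-time: fine for small matrices.
    """
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0.0
    for i in range(n):
        # minor: delete row i and column 0
        minor = [row[1:] for k, row in enumerate(A) if k != i]
        total += (-1) ** i * A[i][0] * det_by_minors(minor)
    return total

print(det_by_minors([[2, 1], [1, 3]]))   # prints 5
```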

Using the Adjoint Matrix to Formulate the Inverse

The adjoint of the matrix [A] is the matrix of transposed cofactors. If we have an nth order matrix [A], this matrix possesses the following matrix of cofactors

$$[A]^C = \begin{bmatrix} A_{11} & A_{12} & \cdots & A_{1n} \\ A_{21} & A_{22} & \cdots & A_{2n} \\ \vdots & \vdots & & \vdots \\ A_{n1} & A_{n2} & \cdots & A_{nn} \end{bmatrix}$$

and the adjoint of the matrix is defined as the transpose of the cofactor matrix

$$\operatorname{adj}[A] = \left([A]^C\right)^T = \begin{bmatrix} A_{11} & A_{21} & \cdots & A_{n1} \\ A_{12} & A_{22} & \cdots & A_{n2} \\ \vdots & \vdots & & \vdots \\ A_{1n} & A_{2n} & \cdots & A_{nn} \end{bmatrix}$$

Suppose this n × n matrix is post-multiplied by its adjoint and the resulting n × n matrix is identified as [P]:

$$[P] = [A]\operatorname{adj}[A] =
\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}
\begin{bmatrix} A_{11} & A_{21} & \cdots & A_{n1} \\ A_{12} & A_{22} & \cdots & A_{n2} \\ \vdots & \vdots & & \vdots \\ A_{1n} & A_{2n} & \cdots & A_{nn} \end{bmatrix}$$

The elements of matrix [P] are divided into two categories, i.e., elements that lie along the diagonal

$$\begin{aligned}
p_{11} &= a_{11}A_{11} + a_{12}A_{12} + \cdots + a_{1n}A_{1n} \\
p_{22} &= a_{21}A_{21} + a_{22}A_{22} + \cdots + a_{2n}A_{2n} \\
&\;\;\vdots \\
p_{nn} &= a_{n1}A_{n1} + a_{n2}A_{n2} + \cdots + a_{nn}A_{nn}
\end{aligned}$$

and those that do not:

$$\begin{aligned}
p_{12} &= a_{11}A_{21} + a_{12}A_{22} + \cdots + a_{1n}A_{2n} \\
p_{13} &= a_{11}A_{31} + a_{12}A_{32} + \cdots + a_{1n}A_{3n} \\
&\;\;\vdots \\
p_{21} &= a_{21}A_{11} + a_{22}A_{12} + \cdots + a_{2n}A_{1n} \\
&\;\;\vdots \\
p_{32} &= a_{31}A_{21} + a_{32}A_{22} + \cdots + a_{3n}A_{2n} \\
&\;\;\vdots \\
p_{n3} &= a_{n1}A_{31} + a_{n2}A_{32} + \cdots + a_{nn}A_{3n}
\end{aligned}$$

The elements of [P] that lie on the diagonal are all equal to the determinant of [A] (see Rule #12 and recognize the Laplace expansion in each diagonal value). Note that the non-diagonal elements will all equal zero, since they involve the expansion of one row of matrix [A] with the cofactors of an entirely different row (see Rule #14).


Thus

$$\begin{aligned}
p_{11} &= a_{11}A_{11} + a_{12}A_{12} + \cdots + a_{1n}A_{1n} = |A| \\
p_{22} &= a_{21}A_{21} + a_{22}A_{22} + \cdots + a_{2n}A_{2n} = |A| \\
&\;\;\vdots \\
p_{nn} &= a_{n1}A_{n1} + a_{n2}A_{n2} + \cdots + a_{nn}A_{nn} = |A|
\end{aligned}$$

and

$$\begin{aligned}
p_{12} &= a_{11}A_{21} + a_{12}A_{22} + \cdots + a_{1n}A_{2n} = 0 \\
p_{13} &= a_{11}A_{31} + a_{12}A_{32} + \cdots + a_{1n}A_{3n} = 0 \\
&\;\;\vdots \\
p_{21} &= a_{21}A_{11} + a_{22}A_{12} + \cdots + a_{2n}A_{1n} = 0 \\
&\;\;\vdots \\
p_{32} &= a_{31}A_{21} + a_{32}A_{22} + \cdots + a_{3n}A_{2n} = 0 \\
&\;\;\vdots \\
p_{n3} &= a_{n1}A_{31} + a_{n2}A_{32} + \cdots + a_{nn}A_{3n} = 0
\end{aligned}$$

which leads to

$$[P] = [A]\operatorname{adj}[A] = \begin{bmatrix} |A| & 0 & \cdots & 0 \\ 0 & |A| & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & |A| \end{bmatrix} = |A|\,[I]$$

or

$$[A]\,\frac{\operatorname{adj}[A]}{|A|} = [I]$$

When this expression is compared to

$$[A][A]^{-1} = [I]$$

then it is evident that

$$[A]^{-1} = \frac{\operatorname{adj}[A]}{|A|}$$

The inverse exists only when the determinant of [A] is not zero, i.e., when [A] is not singular.
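Combining the adjoint with the determinant gives a direct, if brute-force, inversion routine. This sketch reuses the hypothetical cofactor_matrix() helper from the Minors and Cofactors section and assumes numpy.

```python
import numpy as np

def inverse_by_adjoint(A):
    """[A]^-1 = adj[A] / det[A]; the adjoint is the transposed cofactor matrix."""
    det_A = np.linalg.det(A)
    if np.isclose(det_A, 0.0):
        raise ValueError("[A] is singular: no inverse exists")
    return cofactor_matrix(A).T / det_A

A = np.array([[2.0, 1.0], [1.0, 3.0]])
print(inverse_by_adjoint(A))     # agrees with np.linalg.inv(A)
```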

If we count the computations required to find an inverse using adjoints and determinants, this approach is as much of a "brute force" approach as finding the solution of a system of linear equations by Cramer's rule. From a computational standpoint the method is inefficient (though doable) when the matrix is quite large. There are more efficient methods for solving large systems of linear equations that do not involve finding the inverse. Generally these approaches are divided into the following two categories:

• Direct elimination (not inversion) methods (LDU decomposition, Gauss elimination, Cholesky)
• Iterative methods (Gauss-Seidel, Jacobi)

We will look at methods from both categories.


Example 5.8


Direct Elimination Methods

Elimination methods factor the matrix [A] into products of triangular and diagonal matrices, i.e., the matrix can be expressed as

$$[A] = [L][D][U]$$

where [L] and [U] are lower and upper triangular matrices with all diagonal entries equal to "1", and the matrix [D] is a diagonal matrix.

Variations of this decomposition are obtained if the matrix [D] is associated with either the matrix [L] or the matrix [U], i.e.,

$$[A] = [L][U]$$

where [L] and [U] in this last expression are not necessarily the same as the matrices identified in the previous expression.


In an expanded format:

$$\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} =
\begin{bmatrix} l_{11} & 0 & \cdots & 0 \\ l_{21} & l_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ l_{n1} & l_{n2} & \cdots & l_{nn} \end{bmatrix}
\begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1n} \\ 0 & u_{22} & \cdots & u_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_{nn} \end{bmatrix}$$

and using a generalized index notation

$$a_{ij} = \sum_{k=1}^{n} l_{ik}\, u_{kj}$$

The matrices [L] and [U] in this decomposition are not unique. Differences among the many variations of elimination methods are simply differences in how these two matrices are constructed. Consider, for example, i = 4 and j = 3; then for any n × n matrix

$$a_{43} = l_{41}u_{13} + l_{42}u_{23} + l_{43}\underbrace{u_{33}}_{=1} + l_{44}\underbrace{u_{43}}_{=0} + \cdots + l_{4n}\underbrace{u_{n3}}_{=0}$$

where the entries of [U] below the diagonal are zero because [U] is upper triangular.

If by definition we stipulate that the diagonal entries of the upper triangular matrix are all equal to "1", i.e.,

$$u_{jj} = 1$$

then, returning to the previous expression,

$$a_{43} = l_{41}u_{13} + l_{42}u_{23} + l_{43}\cdot 1 + 0 + \cdots + 0 = l_{41}u_{13} + l_{42}u_{23} + l_{43}$$

and in general we can write, for i ≥ j,

$$a_{ij} = l_{ij}\,u_{jj} + \sum_{k=1}^{j-1} l_{ik}\, u_{kj} = l_{ij} + \sum_{k=1}^{j-1} l_{ik}\, u_{kj}$$

Solving for $l_{ij}$:

$$l_{ij} = a_{ij} - \sum_{k=1}^{j-1} l_{ik}\, u_{kj} \qquad i = j, \ldots, n$$


In solving a system of linear equations we can now write, in matrix notation,

$$[A]\{x\} = [L][U]\{x\} = \{b\}$$

If we let

$$[U]\{x\} = \{y\}$$

then

$$[L][U]\{x\} = [L]\{y\} = \{b\}$$

which is an easier computation. Using generalized index notation:

$$\sum_{j=1}^{i} l_{ij}\, y_j = b_i$$

For example, for i = 3:

$$l_{31}y_1 + l_{32}y_2 + l_{33}y_3 = b_3$$


From this we can rearrange the generalized index formulation as

$$l_{ii}\, y_i + \sum_{j=1}^{i-1} l_{ij}\, y_j = b_i$$

Solving this expression for $y_i$ yields

$$y_i = \frac{b_i - \sum_{j=1}^{i-1} l_{ij}\, y_j}{l_{ii}} \qquad i = 1, 2, \ldots, n$$

Similarly, back substitution through $[U]\{x\} = \{y\}$ gives, recalling that $u_{ii} = 1$,

$$x_i = y_i - \sum_{j=i+1}^{n} u_{ij}\, x_j \qquad i = n, \ldots, 1$$

The process of solving for the unknown vector quantities {x} can be completed without computing the inverse of [A].

In general we can write, for i < j,

$$a_{ij} = l_{ii}\, u_{ij} + \sum_{k=1}^{i-1} l_{ik}\, u_{kj}$$

Solving for $u_{ij}$:

$$u_{ij} = \frac{a_{ij} - \sum_{k=1}^{i-1} l_{ik}\, u_{kj}}{l_{ii}} \qquad j = i, \ldots, n$$

If at any stage in this algorithm the lead coefficient $a_{jj}$ (often referred to as the pivot element), or the computed $l_{jj}$, becomes zero, the method fails.
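Gathering the two recurrences and the two substitution sweeps gives a complete Crout-style solver ([U] with unit diagonal). A minimal sketch, assuming numpy and no pivoting, consistent with the failure mode just noted; names are illustrative.

```python
import numpy as np

def crout_solve(A, b):
    """Crout decomposition ([U] has unit diagonal) plus two-stage substitution."""
    n = len(b)
    L = np.zeros((n, n))
    U = np.eye(n)
    for j in range(n):
        for i in range(j, n):                     # l_ij for i >= j
            L[i, j] = A[i, j] - L[i, :j] @ U[:j, j]
        if L[j, j] == 0.0:
            raise ZeroDivisionError("zero pivot: decomposition fails")
        for k in range(j + 1, n):                 # u_jk for k > j
            U[j, k] = (A[j, k] - L[j, :j] @ U[:j, k]) / L[j, j]
    # Forward substitution: [L]{y} = {b}
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    # Back substitution: [U]{x} = {y}, with u_ii = 1
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = y[i] - U[i, i + 1:] @ x[i + 1:]
    return x
```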


Example 5.9


Cholesky's Decomposition – A Direct Elimination Method

In linear algebra, the Cholesky algorithm is a decomposition of a Hermitian, positive-definite (square) matrix into the product of a lower triangular matrix and an upper triangular matrix that is the conjugate transpose of the lower triangular matrix, or

$$[A] = [L][L]^T$$

The approach was derived by André-Louis Cholesky. When applicable, the Cholesky decomposition is roughly twice as efficient as the LU decomposition for solving systems of linear equations.

Finding [L] can be loosely thought of as the matrix equivalent of taking the square root of [A]. Note that [A] is a positive definite matrix if, for all non-zero vectors {z}, the inner product

$$\{z\}^T [A] \{z\} > 0$$

is always greater than zero. This is guaranteed if all the eigenvalues of the matrix are positive.


With A  L LT

a11 a12  a1n  l11 0  0  l11 l21  ln1  a a  a  l l  0   l  l   21 22 2n    21 22   22 n2                          an1 an2  ann ln1 ln2  lnn   lnn then by columns (the second number/letter of the subscripts) 1 1 2 2 2 2 l  a 2 l33  a33  l13  l23  11 11 l22  a22  l21   a21 a32  l31l21 l21  l32  l11 l22 a  l  31 31 l 11 an2  ln1 l21 ln2   l22

an1 ln1  44 l11 Section 5: Linear Systems and Matrices Washkewicz College of Engineering

The decomposition of [A] proceeds by forward substitution. As the decomposition is performed, the following recurrence relationships for each successive column (ith index) value in the lower triangular matrix can be extracted from the previous results:

$$l_{ii} = \sqrt{a_{ii} - \sum_{k=1}^{i-1} l_{ik}^2}$$

$$l_{ji} = \frac{a_{ji} - \sum_{k=1}^{i-1} l_{jk}\, l_{ik}}{l_{ii}} \qquad j = i+1, \ldots, n$$

These expressions can be modified so that there is no need to take a square root (an additional computation) in the first expression. To accomplish this, recast the previous matrix expression such that

$$[A] = [L][D][L]^T$$

where again [D] is a diagonal matrix (not necessarily the identity matrix).

The recurrence relationships for this form of the Cholesky decomposition (LDL) can be expressed as follows for each successive column (ith index) entry:

$$d_{ii} = a_{ii} - \sum_{k=1}^{i-1} d_{kk}\, l_{ik}^2$$

$$l_{ii} = 1$$

$$l_{ji} = \frac{a_{ji} - \sum_{k=1}^{i-1} d_{kk}\, l_{jk}\, l_{ik}}{d_{ii}} \qquad j = i+1, \ldots, n$$

With [A] decomposed into a triple matrix product, the solution to the system of equations proceeds with

$$\{b\} = [A]\{x\} = [L][D][L]^T\{x\} = [L]\{y\}$$

where $\{y\} = [D][L]^T\{x\}$.


Again,

$$y_i = \frac{b_i - \sum_{j=1}^{i-1} l_{ij}\, y_j}{l_{ii}} \qquad i = 1, 2, \ldots, n$$

but now, from $[D][L]^T\{x\} = \{y\}$ with $l_{ii} = 1$,

$$x_i = \frac{y_i}{d_{ii}} - \sum_{j=i+1}^{n} l_{ji}\, x_j \qquad i = n, \ldots, 1$$
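The LDL recurrences and the two substitution sweeps assemble into a root-free solver. A minimal sketch with illustrative names, assuming numpy, a symmetric positive-definite [A], and no pivoting:

```python
import numpy as np

def ldlt_solve(A, b):
    """Root-free Cholesky: factor [A] = [L][D][L]^T (unit-diagonal [L]),
    then solve [L]{y} = {b} and [D][L]^T{x} = {y}."""
    n = len(b)
    L = np.eye(n)
    d = np.zeros(n)
    for i in range(n):
        d[i] = A[i, i] - (d[:i] * L[i, :i]) @ L[i, :i]
        for j in range(i + 1, n):
            L[j, i] = (A[j, i] - (d[:i] * L[j, :i]) @ L[i, :i]) / d[i]
    # Forward substitution (l_ii = 1): y_i = b_i - sum_{j<i} l_ij y_j
    y = np.zeros(n)
    for i in range(n):
        y[i] = b[i] - L[i, :i] @ y[:i]
    # Back substitution: x_i = y_i / d_ii - sum_{j>i} l_ji x_j
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = y[i] / d[i] - L[i + 1:, i] @ x[i + 1:]
    return x
```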


Example 5.10
