Chapter 2

Solving Linear Equations

Po-Ning Chen, Professor

Department of Computer and Electrical Engineering

National Chiao Tung University

Hsin Chu, Taiwan 30010, R.O.C.

2.1 Vectors and linear equations 2-1

• What is this course about?

Algebra (代數): The part of mathematics in which letters and other general symbols are used to represent numbers and quantities in formulas and equations.

Linear Algebra (線性代數): To combine these algebraic symbols (e.g., vectors) in a linear fashion. So, we will not combine these algebraic symbols in a nonlinear fashion in this course!

Example of nonlinear equations for x = [x1; x2]:
  x1 x2 = 1
  x1 / x2 = 1

Example of linear equations for x = [x1; x2]:
  x1 + x2 = 4
  x1 − x2 = 1

2.1 Vectors and linear equations 2-2

The linear equations can always be represented as a matrix operation, i.e., Ax = b. Hence, the central problem of linear algebra is to solve a system of linear equations.

Example of linear equations:

  x − 2y = 1
  3x + 2y = 11
  ⇔ [1 −2; 3 2] [x; y] = [1; 11]

• A linear equation problem can also be viewed as a linear combination problem for column vectors (referred to by the textbook as the column picture view). In contrast, the original linear equations (the system written out above) are referred to by the textbook as the row picture view.

Column picture view:  x [1; 3] + y [−2; 2] = [1; 11]

We want to find the scalar coefficients of the column vectors [1; 3] and [−2; 2] that combine them into another vector [1; 11].

2.1 Matrix equation 2-3

From now on, we will focus on the matrix equation

  Ax = b

because it can be easily extended to higher dimensions (≥ 2). In terminologies,
• Ax = b will be referred to as the matrix equation;
• A will be referred to as the coefficient matrix;
• x will be referred to as the unknown vector.

More terminologies: The elements of a matrix A, as well as its size, are indexed first by row, and then by column.

Example.

  A2×3 = [a1,1 a1,2 a1,3; a2,1 a2,2 a2,3]

2.1 Other representations of matrix 2-4

In MATLAB: MATLAB (“Matrix Laboratory”) is a numerical computing environment, originated by Cleve Moler and developed by MathWorks. It allows matrix manipulation, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs in other languages.

• A matrix

  A2×3 = [a1,1 a1,2 a1,3; a2,1 a2,2 a2,3] = [1 2 3; 2 5 −2]

is represented by

  A=[1 2 3 ; 2 5 -2]   % Again, row first, and then column.
                       % Rows are separated by the semicolon “;”.

• A (column) vector x = [1; −1; 1] is inputted as

  x=[1 ; -1 ; 1]

2.1 Other representations of matrix 2-5

• The product b = Ax is then given by

  b=A*x

Note: The textbook refers to this “*” as the dot product (equivalently, inner product) in MATLAB, which is not entirely correct. MATLAB treats this “*” as the usual matrix multiplication.

• An element of a matrix A can be extracted by row and column indexes.

  A(1,2)       % This is exactly a1,2 in A2×3.

• A row vector of a matrix A can be extracted by its row index.

  A(1,:)       % This is exactly [a1,1 a1,2 a1,3] in A2×3.

• A submatrix of a matrix A can be extracted by row and column index ranges.

  A(1:2,1:2)   % This is exactly [a1,1 a1,2; a2,1 a2,2] in A2×3.

2.1 Other representations of matrix 2-6

• The matrix multiplication b = Ax can then be rewritten in two forms:

  % Inner products with row vectors
  b=[A(1,:)*x ; A(2,:)*x]
  % Linear combination of column vectors
  b=A(:,1)*x(1)+A(:,2)*x(2)+A(:,3)*x(3)

• The above MATLAB rules for matrix manipulation can be easily extended:

  A(:,2)=[]             % Eliminate the second column.
  A(1:2,:)=[]           % Eliminate the first two rows.
  A=[A; a31 a32 a33]    % Add a third row [a3,1 a3,2 a3,3] to matrix A.
  v=[3:5]               % v = [3 4 5]

2.1 Other representations of matrix 2-7

• Some useful MATLAB functions appear in the textbook Problems.

  A=eye(3)     % Generate a 3-by-3 identity matrix A = [1 0 0; 0 1 0; 0 0 1]
  A=ones(3)    % Generate a 3-by-3 all-one matrix A = [1 1 1; 1 1 1; 1 1 1]
  A=zeros(3)   % Generate a 3-by-3 all-zero matrix A = [0 0 0; 0 0 0; 0 0 0]
  v=ones(3,1)  % Generate a 3-by-1 all-one vector v = [1; 1; 1]
  A=zeros(3,1) % Generate a 3-by-1 all-zero vector A = [0; 0; 0]

2.2 The idea of elimination 2-8

• Is there a systematic method to solve the linear equations?

Answer: Forward elimination and back(ward) substitution.

Example.
  x − 2y = 1
  3x + 2y = 11

Step 1: Forward elimination. Eliminate in a forward sequential order, i.e., x, y, ....

  x − 2y = 1   ...(1)
  3x + 2y = 11 ...(2)

⇒ (multiplier = 3)
  x − 2y = 1                               ...(1)
  (3x − 3x) + (2y − (−6y)) = (11 − 3)      ...(2) − 3 × (1)

⇒ x − 2y = 1 ...(1)
  8y = 8     ...(2) − 3 × (1)

2.2 The idea of elimination 2-9

Step 2: Back(ward) substitution. Substitute in a backward sequential order, i.e., ..., y, x.

  x − 2y = 1 ...(i)
  8y = 8     ...(ii)

⇒ x − 2y = 1 ...(i)
  y = 1

⇒ (substitute y = 1 into (i))
  x − 2 = 1
  y = 1

⇒ x = 3
  y = 1

• After forward elimination, the equations form an upper triangle.

  x − 2y = 1
  8y = 8

• In terminology, the element (e.g., x in the first equation) that is used to eliminate the same unknown in the remaining equations is called the pivot. The pivots are on the diagonal of the upper triangle after elimination.
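As a concrete illustration, here is a minimal MATLAB sketch (ours, not the textbook's) of forward elimination followed by back(ward) substitution on the example above; it assumes every pivot turns out non-zero, so no row exchange is handled.

  A = [1 -2; 3 2]; b = [1; 11];
  n = size(A,1);
  for j = 1:n-1                      % forward elimination
      for i = j+1:n
          m = A(i,j)/A(j,j);         % multiplier
          A(i,:) = A(i,:) - m*A(j,:);
          b(i)   = b(i)   - m*b(j);
      end
  end
  x = zeros(n,1);
  for i = n:-1:1                     % back(ward) substitution
      x(i) = (b(i) - A(i,i+1:n)*x(i+1:n))/A(i,i);
  end
  x                                  % displays x = [3; 1]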

2.2 Failure of forward elimination 2-10

• Can the systematic method fail?

Answer: If Ax = b has no solution or has infinitely many solutions, then some pivots will disappear during the process of forward elimination.

Example.

  x − 2y = 1   ...(1)
  2x − 4y = 11 ...(2)

⇒ (multiplier = 2)
  x − 2y = 1                               ...(1)
  (2x − 2x) + (−4y − (−4y)) = (11 − 2)     ...(2) − 2 × (1)

⇒ x − 2y = 1 ...(1)
  0 = 9      ...(2) − 2 × (1)

⇒ No solution since 0 ≠ 9! Hence, the next pivot disappears (or was canceled).  □

If we instead obtain 0 = 0, then there are infinitely many solutions for the linear equations.

2.2 Failure of forward elimination 2-11

• Can the systematic method fail if the solution exists and is unique?

Answer: Yes, if the sequence for forward elimination is not properly ordered.

Example. Sequence of choosing pivots in forward elimination: {x from (1), y from (2), z from (3)}.

  x − 2y + z = 1   ...(1)
  2x − 4y + z = 11 ...(2)     This has a unique solution (x, y, z) = (12, 1, −9).
  x + y − z = 22   ...(3)

⇒ x − 2y + z = 1 ...(1)
  −z = 9         ...(2) − 2 × (1)
  3y − 2z = 21   ...(3) − 1 × (1)

⇒ No y from (2) can be found!  □

In such a case, one can simply switch (2) with any of the remaining equations (e.g., (3)) in which y still exists (i.e., has a coefficient other than zero).

2.3 Forward elimination/back substitution in matrix form 2-12

• The forward elimination/back substitution can be easily done in matrix form.

Example. The pivots are the diagonal elements.

  [ 2  4 −2] [x]   [ 2]   ...(1) with 2 the first pivot
  [ 4  9 −3] [y] = [ 8]   ...(2)
  [−2 −3  7] [z]   [10]   ...(3)

⇔ [2 4 −2] [x]   [ 2]   ...(1) with 2 the first pivot
  [0 1  1] [y] = [ 4]   ...(2) − 2 × (1)
  [0 1  5] [z]   [12]   ...(3) − (−1) × (1)

⇔ [2 4 −2] [x]   [ 2]   ...(1)
  [0 1  1] [y] = [ 4]   ...(2) with 1 the second pivot
  [0 1  5] [z]   [12]   ...(3)

⇔ [2 4 −2] [x]   [2]   ...(1)
  [0 1  1] [y] = [4]   ...(2) with 1 the second pivot
  [0 0  4] [z]   [8]   ...(3) − 1 × (2)

Based on the above, back substitution can straightforwardly follow.  □

The above procedure transforms Ax = b into Ux = c, where U is an upper triangular matrix.

2.3 Forward elimination/back substitution in matrix form 2-13

We summarize the algorithmic procedure of forward elimination for an n × n matrix A and n unknowns as follows.
• Use the first diagonal element as the first pivot to create zeros below it.
  – If the diagonal element is zero, switch the row with any row below it whose respective element at the same column is non-zero.
• Use the new second diagonal element as the second pivot to create zeros below it.
  – If the diagonal element is zero, switch the row with any row below it whose respective element at the same column is non-zero.
• Use the new third diagonal element as the third pivot to create zeros below it.
  – If the diagonal element is zero, switch the row with any row below it whose respective element at the same column is non-zero.
• ...

• Repeat the above procedure until either an upper-triangular U results or the next pivot does not exist even with row switching. (A MATLAB sketch of this procedure is given below.)
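A hedged MATLAB sketch of this procedure (the function name forward_eliminate is ours, not the textbook's); it performs the row switching described above whenever a diagonal element is zero.

  function U = forward_eliminate(A)
  % Forward elimination with row switching when a pivot is zero.
  n = size(A,1);
  for j = 1:n-1
      if A(j,j) == 0                          % pivot missing:
          k = find(A(j+1:n,j) ~= 0, 1) + j;   % first non-zero below it
          if isempty(k), continue; end        % no pivot even with switching
          A([j k],:) = A([k j],:);            % switch the rows
      end
      for i = j+1:n
          A(i,:) = A(i,:) - (A(i,j)/A(j,j))*A(j,:);
      end
  end
  U = A;
  end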

2.3 Forward elimination/back substitution in matrix form 2-14

This algorithmic procedure can be graphically illustrated as (∗ marks a possibly non-zero entry):

  [∗ ∗ ∗ ∗]    [∗ ∗ ∗ ∗]    [∗ ∗ ∗ ∗]    [∗ ∗ ∗ ∗]
  [∗ ∗ ∗ ∗] ⇒  [0 ∗ ∗ ∗] ⇒  [0 ∗ ∗ ∗] ⇒  [0 ∗ ∗ ∗]
  [∗ ∗ ∗ ∗]    [0 ∗ ∗ ∗]    [0 0 ∗ ∗]    [0 0 ∗ ∗]
  [∗ ∗ ∗ ∗]    [0 ∗ ∗ ∗]    [0 0 ∗ ∗]    [0 0 0 ∗]

Terminology: When a set of linear equations has either no solution or infinitely many solutions, it is often referred to as a singular system.

2.3 Elimination using matrices 2-15

Reminder of terminologies (in English)

• Sigma notation: The dot or inner product can be expressed using the so-called “sigma notation”:

  v · w = Σ_{j=1}^{n} vj wj.

• m by n matrix: The terminology to describe the size of a matrix with m rows and n columns.

• Entry of a matrix: ai,j, in the ith row and jth column of matrix A, is referred to as an entry of A.

• Component of a vector: vj, in a vector v = [v1 v2 ··· vn], is referred to as a component of v.

2.3 The matrix form of forward elimination 2-16

• The forward elimination process can be implemented by multiplying by elimination matrices.

Example. The pivots are the diagonal elements.

  [ 2  4 −2] [x]   [ 2]   ...(1) with 2 the first pivot
  [ 4  9 −3] [y] = [ 8]   ...(2)
  [−2 −3  7] [z]   [10]   ...(3)

Left-multiplying by E2,1 = [1 0 0; −2 1 0; 0 0 1] and then by E3,1 = [1 0 0; 0 1 0; −(−1) 0 1]:

⇔ [2 4 −2] [x]   [ 2]   ...(1) with 2 the first pivot
  [0 1  1] [y] = [ 4]   ...(2) − 2 × (1)
  [0 1  5] [z]   [12]   ...(3) − (−1) × (1)

⇔ [2 4 −2] [x]   [ 2]   ...(1)
  [0 1  1] [y] = [ 4]   ...(2) with 1 the second pivot
  [0 1  5] [z]   [12]   ...(3)

Left-multiplying by E3,2 = [1 0 0; 0 1 0; 0 −1 1]:

⇔ [2 4 −2] [x]   [2]   ...(1)
  [0 1  1] [y] = [4]   ...(2) with 1 the second pivot
  [0 0  4] [z]   [8]   ...(3) − 1 × (2)

Based on the above, back substitution can straightforwardly follow.  □

2.3 The matrix form of forward elimination 2-17

The forward elimination process can then be implemented by multiplying the elimination matrices E2,1, E3,1 and E3,2 in sequence:

  [1  0 0] [1 0 0] [ 1 0 0] [ 2  4 −2]     [1  0 0] [1 0 0] [ 1 0 0] [ 2]
  [0  1 0] [0 1 0] [−2 1 0] [ 4  9 −3] x = [0  1 0] [0 1 0] [−2 1 0] [ 8]
  [0 −1 1] [1 0 1] [ 0 0 1] [−2 −3  7]     [0 −1 1] [1 0 1] [ 0 0 1] [10]
   E3,2     E3,1    E2,1                     E3,2     E3,1    E2,1

which immediately gives

  [2 4 −2]     [2]
  [0 1  1] x = [4].  □
  [0 0  4]     [8]

For matrix multiplications, the associative law holds (e.g., (E3,1E2,1)A = E3,1(E2,1A)), but the commutative law fails (e.g., EA ≠ AE in general).

2.3 The matrix form of forward elimination 2-18

• We now understand why the forward elimination process can be implemented by multiplying the elimination matrix.

– What if a row exchange is needed (because the next diagonal element is zero) during the process?

Answer: Use the row exchange matrix. Specifically, insert the row exchange matrix in-between the elimination matrices.
  ∗ Denote by Pi,j the row exchange matrix that exchanges row i and row j by left-multiplying it.
  ∗ Pi,j is an identity matrix with rows i and j reversed. For example,

      P2,3 = [1 0 0;
              0 0 1;   ←− The original row 3.
              0 1 0]   ←− The original row 2.
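A small MATLAB check (ours) of how these matrices act by left-multiplication, using the coefficient matrix of the running example:

  A   = [2 4 -2; 4 8 -3; -2 -3 7];
  P23 = [1 0 0; 0 0 1; 0 1 0];    % identity with rows 2 and 3 reversed
  E21 = [1 0 0; -2 1 0; 0 0 1];   % elimination matrix with multiplier 2
  P23*A      % rows 2 and 3 of A exchanged
  E21*A      % row 2 becomes [0 0 1]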

2.3 The matrix form of forward elimination 2-19

Example. The pivots are the diagonal elements.

  [ 2  4 −2] [x]   [ 2]   ...(1) with 2 the first pivot
  [ 4  8 −3] [y] = [ 8]   ...(2) (Replace the middle 9 by 8)
  [−2 −3  7] [z]   [10]   ...(3)

Left-multiplying by E2,1 = [1 0 0; −2 1 0; 0 0 1] and then by E3,1 = [1 0 0; 0 1 0; −(−1) 0 1]:

⇔ [2 4 −2] [x]   [ 2]   ...(1) with 2 the first pivot
  [0 0  1] [y] = [ 4]   ...(2) − 2 × (1)
  [0 1  5] [z]   [12]   ...(3) − (−1) × (1)

Left-multiplying by P2,3 = [1 0 0; 0 0 1; 0 1 0]:

⇔ [2 4 −2] [x]   [ 2]   ...(1)
  [0 1  5] [y] = [12]   ...(2) with 1 the second pivot
  [0 0  1] [z]   [ 4]   ...(3)

Left-multiplying by E3,2 = [1 0 0; 0 1 0; 0 −0 1] (no further elimination is actually needed):

⇔ [2 4 −2] [x]   [ 2]   ...(1)
  [0 1  5] [y] = [12]   ...(2) with 1 the second pivot
  [0 0  1] [z]   [ 4]   ...(3) − 0 × (2)

Based on the above, back substitution can straightforwardly follow.  □

2.3 The matrix form of forward elimination 2-20

The forward elimination process can then be implemented by multiplying E2,1, E3,1, P2,3 and E3,2 in sequence:

  [1 0 0] [1 0 0] [1 0 0] [ 1 0 0] [ 2  4 −2]     [1 0 0] [1 0 0] [1 0 0] [ 1 0 0] [ 2]
  [0 1 0] [0 0 1] [0 1 0] [−2 1 0] [ 4  8 −3] x = [0 1 0] [0 0 1] [0 1 0] [−2 1 0] [ 8]
  [0 0 1] [0 1 0] [1 0 1] [ 0 0 1] [−2 −3  7]     [0 0 1] [0 1 0] [1 0 1] [ 0 0 1] [10]
   E3,2    P2,3    E3,1     E2,1                    E3,2    P2,3    E3,1     E2,1

which immediately gives

  [2 4 −2]     [ 2]
  [0 1  5] x = [12].  □
  [0 0  1]     [ 4]

2.3 The augmented matrix for forward elimination 2-21

• Since we need to multiply both A and b by E = E3,2 P2,3 E3,1 E2,1 in order to solve Ax = b (the full product is displayed on the previous slide), we can introduce a so-called augmented matrix [A b] so that x can be solved by directly computing

  E [A b] = E [ 2  4 −2 |  2]   [2 4 −2 |  2]
              [ 4  8 −3 |  8] = [0 1  5 | 12]   (again with the middle 9 replaced by 8).
              [−2 −3  7 | 10]   [0 0  1 |  4]
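In MATLAB, the augmented-matrix idea can be sketched with rref (used again in Section 2.5); this is our illustration, not the textbook's code:

  A = [2 4 -2; 4 8 -3; -2 -3 7];  b = [2; 8; 10];
  R = rref([A b]);     % reduced row echelon form of the augmented matrix
  x = R(:, end)        % the last column is the solution: x = [21; -8; 4]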

2.3 Appendix 2-22

• If we place E2,1 on the right, what would happen then?

Answer: Row operations will become column operations. For example,

  [ 1 0 0] [ 2  4 −2]   [ 2  4 −2]        (row 1)′ = (row 1)
  [−2 1 0] [ 4  8 −3] = [ 0  0  1]   with (row 2)′ = (row 2) − 2 × (row 1)
  [ 0 0 1] [−2 −3  7]   [−2 −3  7]        (row 3)′ = (row 3)

becomes

  [ 2  4 −2] [ 1 0 0]   [ −6  4 −2]        (column 1)′ = (column 1) − 2 × (column 2)
  [ 4  8 −3] [−2 1 0] = [−12  8 −3]   with (column 2)′ = (column 2)
  [−2 −3  7] [ 0 0 1]   [  4 −3  7]        (column 3)′ = (column 3)

(Exchange 1 ↔ 2 and row ↔ column.)

2.3 Appendix 2-23

• If we place P2,3 on the right, what would happen then?

Answer: Row operations will become column operations. For example,

  [1 0 0] [ 2  4 −2]   [ 2  4 −2]        (row 1)′ = (row 1)
  [0 0 1] [ 4  8 −3] = [−2 −3  7]   with (row 2)′ = (row 3)
  [0 1 0] [−2 −3  7]   [ 4  8 −3]        (row 3)′ = (row 2)

becomes

  [ 2  4 −2] [1 0 0]   [ 2 −2  4]        (column 1)′ = (column 1)
  [ 4  8 −3] [0 0 1] = [ 4 −3  8]   with (column 2)′ = (column 3)
  [−2 −3  7] [0 1 0]   [−2  7 −3]        (column 3)′ = (column 2)

(Exchange 2 ↔ 3 and row ↔ column.)

2.3 Appendix 2-24

• The below definition gives what is required by Problem 8.

(Problem 8, Section 2.3) The determinant of M = [a b; c d] is det(M) = ad − bc. Subtract ℓ times row 1 from row 2 (i.e., row 2 − ℓ × row 1) to produce a new M∗. Show that det(M∗) = det(M) for every ℓ. When ℓ = c/a, the product of pivots equals the determinant: (a)(d − ℓb) equals ad − bc.

2.3 Appendix 2-25

Definition (Determinant 決定因子): A determinant, often denoted as det(A) or simply det A, is a determining factor associated with a square matrix. For example, for a matrix [v; w] formed by stacking the 2-dimensional (column) vectors v and w as its rows, the determinant is the (signed) area of the parallelogram formed by v and w, and is given by

  det A = det [v1 v2; w1 w2] = |v1 v2; w1 w2| = v1w2 − v2w1

  Area of parallelogram = ‖v‖ ‖w‖ sin(θ)   (θ: from v to w, counterclockwise)
                        = ‖v‖ ‖w‖ √(1 − cos²(θ))
                        = ‖v‖ ‖w‖ √(1 − (v · w)²/(‖v‖²‖w‖²))
                        = √(‖v‖² ‖w‖² − (v · w)²)

  Area of parallelogram = ‖v‖ ‖w‖ sin(2π − θ)  if A = [w1 w2; v1 v2] (i.e., the rows v and w are exchanged)

2.3 Appendix 2-26

• We can say more about determinants.

– The determinant of a matrix [v; w; u], formed by stacking the 3-dimensional (column) vectors v, w and u as its rows, is the (signed) volume of the parallelepiped formed by v, w and u.
– ...

(We will introduce the determinant in great detail in Chapter 5.)

2.3 Appendix 2-27

• The below definition gives what is required by Problem 30.

(Problem 30, Section 2.3) Write M = [3 4; 5 7] as a product of many factors A = [1 0; 1 1] and B = [1 1; 0 1].
(a) What matrix E subtracts row 1 from row 2 to make row 2 of EM smaller?
(b) What matrix F subtracts row 2 of EM from row 1 to reduce row 1 of FEM?
(c) Continue E’s and F’s until (many E’s and F’s) times (M) is (A or B).
(d) E and F are the inverses of A and B! Moving all E’s and F’s to the right side will give you the desired result M = product of A’s and B’s.
This is possible for all matrices M = [a b; c d] > 0 that have ad − bc = 1.

2.3 Appendix 2-28

Definition (Positivity of a matrix): A positive matrix is a matrix in which all the elements are greater than zero. It is often denoted by A > 0. Particularly note that a positive matrix is different from a positive definite matrix, which we will introduce in Chapter 6.

2.4 Rules for matrix operations 2-29

• Let us revisit matrix operations and summarize the rules that they obey.

– Matrix addition is simply a term-by-term addition.
– The definition of matrix multiplication is repeated below.

Definition (Product of matrices): The product of two matrices is defined through the inner products of the row vectors and the column vectors, respectively, from the first matrix and the second matrix (counted from the left). Specifically, as an example,

  A2×3 B3×2 = [a1; a2] [b1 b2] = [a1 · b1  a1 · b2; a2 · b1  a2 · b2]   (a 2 × 2 matrix),

where a1, a2 are the rows of A and b1, b2 are the columns of B; for the two matrices on the extreme left of the above equation, “2” is the number of vectors, and “3” is the dimension of the vectors. Note that the dimensions of the vectors must be the same; otherwise the inner product operation cannot be performed.

2.4 Rules for matrix operations 2-30

• Rules or laws that matrix operations obey.

– Commutative law for addition: A + B = B + A.
– Commutative law for multiplication: AB = BA. (Caution: as already noted, this law generally FAILS for matrix multiplication.)
– Distributive law from the left: A(B + C) = AB + AC.
– Distributive law from the right: (A + B)C = AC + BC.
– Distributive law with respect to scalars: α(A + B) = αA + αB.
– Associative law for addition: (A + B) + C = A + (B + C).
– Associative law for multiplication: (AB)C = A(BC).
– Power law:
    A^p A^q = A^(p+q), if p and q are positive integers;
    A^p A^q = A^(p+q), if p or q is a negative integer and A⁻¹ exists;
    A⁰ = I, if A is a square matrix, where I is an identity matrix.
  Note: We can talk about the product AA only when A is a square matrix.

2.4 Rules for matrix operations 2-31

• The way to prove these rules may require the below two pictures.

– Column picture of matrix product: From the definition of matrix product, we can see that

    A2×3 B3×2 = [a1; a2] [b1 b2] = [[a1 · b1; a2 · b1]  [a1 · b2; a2 · b2]] = [Ab1  Ab2]

  or we can simply write A2×3 B3×2 = A [b1 b2] = [Ab1 Ab2].

In summary, the jth column of the product AB is given by Abj, where bj is the jth column of B.

2.4 Rules for matrix operations 2-32

– Row picture of matrix product: From the definition of matrix product, we can see that

    A2×3 B3×2 = [a1; a2] [b1 b2] = [[a1 b1  a1 b2]; [a2 b1  a2 b2]] = [a1 B; a2 B]

  or we can simply write A2×3 B3×2 = [a1; a2] B = [a1 B; a2 B].

In summary, the ith row of the product AB is given by ai B, where ai is the ith row of A.
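A quick MATLAB verification (ours) of both pictures, with small arbitrary matrices:

  A = [1 2 3; 4 5 6];  B = [1 0; 0 1; 1 1];
  AB = A*B;
  isequal(AB(:,2), A*B(:,2))   % column picture: returns logical 1 (true)
  isequal(AB(1,:), A(1,:)*B)   % row picture: returns logical 1 (true)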

2.4 Block manipulation of matrices 2-33

• The above column or row picture indicates that we can manipulate the matrix in blocks. For example, matrix B can be divided into two blocks when calculating the matrix product:

  AB = A [b1 b2] = [Ab1 Ab2].

• Recall that we can solve the linear equations using the augmented matrix [A b], which consists of two blocks of different sizes.

• To generalize the idea, we can represent matrix A in the form

  A = [1 0 1 0 1 0; 0 1 0 1 0 1; 1 0 1 0 1 0; 0 1 0 1 0 1] = [I I I; I I I],

where I is the 2-by-2 identity matrix.

2.4 Block manipulation of matrices 2-34

• We can treat blocks as number entries in matrix operations, which is usually simpler.

Example.

  [A1,1 A1,2]   [B1,1 B1,2]   [A1,1 + B1,1   A1,2 + B1,2]
  [A2,1 A2,2] + [B2,1 B2,2] = [A2,1 + B2,1   A2,2 + B2,2]

and

  [A1,1 A1,2] [B1,1 B1,2]   [A1,1 B1,1 + A1,2 B2,1   A1,1 B1,2 + A1,2 B2,2]
  [A2,1 A2,2] [B2,1 B2,2] = [A2,1 B1,1 + A2,2 B2,1   A2,1 B1,2 + A2,2 B2,2]
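A MATLAB sketch (ours) confirming that treating 2-by-2 blocks as number entries reproduces the full product:

  A11 = rand(2); A12 = rand(2); A21 = rand(2); A22 = rand(2);
  B11 = rand(2); B12 = rand(2); B21 = rand(2); B22 = rand(2);
  A = [A11 A12; A21 A22];  B = [B11 B12; B21 B22];
  C = [A11*B11 + A12*B21, A11*B12 + A12*B22; ...
       A21*B11 + A22*B21, A21*B12 + A22*B22];
  norm(A*B - C)    % essentially 0 (up to rounding)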

2.4 Block manipulation of matrices 2-35

• By block manipulation, the product AB does not have to be calculated/represented in terms of rows of A and columns of B, but can also be done in a vice-versa way:

  AB = [a1; a2] [b1 b2] = [a1 b1  a1 b2; a2 b1  a2 b2]   (representation based on inner products)

     = [c1 c2] [d1; d2] = c1 d1 + c2 d2                  (representation based on outer products)

where a1, a2 are the rows of A, b1, b2 the columns of B, c1, c2 the columns of A, and d1, d2 the rows of B.

The representation using inner products is convenient in the sense that we can have an entry-wise identification of the elements in Am×n Bn×p. Specifically, the (i, j)th entry of Am×n Bn×p is the inner product of row i of A and column j of B, i.e., ai bj = bj · ai.

Hence, we need to perform m × p inner products to obtain Am×n Bn×p.

2.4 Block manipulation of matrices 2-36

The representation using outer products can represent AB as the sum of all outer products of column i of A and row i of B, i.e.,

  Am×n Bn×p = [c1 c2 ··· cn] [d1; d2; ⋮; dn] = Σ_{i=1}^{n} ci di

Hence, we need to perform n outer products to obtain Am×n Bn×p.
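A MATLAB sketch (ours) of the outer-product representation, accumulating AB as n rank-one matrices:

  A = rand(3,4);  B = rand(4,2);
  C = zeros(3,2);
  for i = 1:size(A,2)
      C = C + A(:,i)*B(i,:);   % outer product: column i of A times row i of B
  end
  norm(C - A*B)                % essentially 0 (up to rounding)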

2.4 Block manipulation of matrices 2-37

• By block manipulation, the forward elimination can be done in a block-by-block fashion.

Example.

  E2,1 F = [1 0; −c a⁻¹ 1] [a b; c d] = [a b; 0  d − c a⁻¹ b]

and

  E2,1 [A B; C D] = [I 0; −C A⁻¹ I] [A B; C D]
                  = [A B; C − C A⁻¹ A   D − C A⁻¹ B]
                  = [A B; 0   D − C A⁻¹ B]

The matrix D − CA⁻¹B is important in linear algebra; hence, it gets its own terminology, the Schur complement, named after Issai Schur.

2.4 Appendix 2-38

What is the (i, j)th entry of ABC?

Solution: Let D = BC.
• The (i, j)th entry of AD is the inner product of row i of A and column j of D, i.e., ai dj.
• Column j of D = BC is the matrix multiplication of matrix B and column j of C, i.e., dj = B cj.
• Therefore, the answer is ai B cj = [ai,1 ai,2 ···] B [c1,j; c2,j; ⋮].

Exercise: What is the (i, j)th entry of ABCD?

2.4 Appendix 2-39

Exercise: Problems 27, 28 and 37 in the textbook.

(Problem 27, Section 2.4) Show that the product of upper triangular matrices is always upper triangular:

  AB = [x x x; 0 x x; 0 0 x] [x x x; 0 x x; 0 0 x] = [_ _ _; 0 _ _; 0 0 _].

Proof using dot products (Row times column): (Row 2 of A) · (column 1 of B) = 0. Which other dot products give zeros?
Proof using full matrices (Column times row): Draw x’s and 0’s in (column 2 of A) times (row 2 of B). Also show (column 3 of A) times (row 3 of B).

(Problem 28, Section 2.4) Draw the cuts in A (2 by 3) and B (3 by 4) and AB to show how each of the four multiplication rules is really a block multiplication:
(1) Matrix A times columns of B. Columns of AB
(2) Rows of A times the matrix B. Rows of AB
(3) Rows of A times columns of B. Inner products (numbers in AB)
(4) Columns of A times rows of B. Outer products (matrices add to AB)

2.4 Appendix 2-40

(Problem 37, Section 2.4) To prove (AB)C = A(BC), use the column vectors b1, ..., bn of B. First suppose that C has only one column c with entries c1, ..., cn:
AB has columns Ab1, ..., Abn, and then (AB)c = c1 Ab1 + ··· + cn Abn.
Bc has one column c1 b1 + ··· + cn bn, and then A(Bc) equals A(c1 b1 + ··· + cn bn).
Linearity gives equality of those two sums. This proves (AB)c = A(Bc). The same is true for all other columns of C. Therefore (AB)C = A(BC).
Apply to inverses: If BA = I and AC = I, prove that the left-inverse B equals the right-inverse C.

2.5 Inverse matrices 2-41

Definition (Inverse matrix): The matrix A is invertible if there exists a matrix A⁻¹ such that A⁻¹A = I and AA⁻¹ = I.

• A is invertible only when A is a square matrix.
• A is invertible if, and only if, Ax = b has a unique solution.
  – This unique solution must be A⁻¹b.
  – If A is invertible, then Ax = 0 cannot have a non-zero solution.

• An×n is invertible if, and only if, forward elimination (possibly with row exchanges) gives n non-zero pivots. (This is the test you can use now!)
• A is invertible if, and only if, the determinant of A (i.e., det(A)) is non-zero. (We will come back to this in Chapter 5.)
  – Hence, for a square matrix, either both left and right inverses do not exist (det(A) = 0) or both left and right inverses exist (det(A) ≠ 0).

2.5 Inverse matrices 2-42

• The inverse matrix of A is unique.

Proof: If L and R are both inverses of A, then

  L = LI = L(AR) = (LA)R = IR = R.  □

– In fact, the above proof also verifies that for a square matrix A, its left inverse L always equals its right inverse R.
– Again, it is not possible for a square matrix without a left (respectively, right) inverse to have a right (respectively, left) inverse.
– Again, it is not possible for a square matrix to have distinct left and right inverses.

2.5 Inverse matrices 2-43

Lemma. Suppose A and B are both square matrices. Then, AB has an inverse if, and only if, both A and B have inverses.

Proof:
1. if-part: If both A and B have inverses, then
     (AB)(B⁻¹A⁻¹) = A(BB⁻¹)A⁻¹ = AIA⁻¹ = AA⁻¹ = I.
   Hence, AB has an inverse B⁻¹A⁻¹.
2. only-if-part: Suppose AB has an inverse C. Then, it is not possible that A or B has no inverse, because if A had no inverse (meaning that both left and right inverses of A do not exist), then ABC = I would imply that the square matrix A has a right inverse BC, a contradiction. On the other hand, if B had no inverse (meaning that both left and right inverses of B do not exist), then CAB = I would imply that the square matrix B has a left inverse CA, a contradiction.  □

2.5 How to find the inverse matrix of A 2-44

• How to find A⁻¹ if it exists?

Answer: The Gauss-Jordan method.

• What is the Gauss-Jordan method?

Answer: To repeat forward elimination and back substitution n times for an n-by-n matrix A.

Example. Suppose A is a 3-by-3 matrix.

– Find the solutions x1, x2 and x3 for

    A x1 = e1 = [1; 0; 0],  A x2 = e2 = [0; 1; 0]  and  A x3 = e3 = [0; 0; 1]

– Then,

    A [x1 x2 x3] = [Ax1 Ax2 Ax3] = [e1 e2 e3] = I,

  which means A⁻¹ = [x1 x2 x3].

2.5 How to find the inverse matrix of A 2-45

• Gauss said, “We can solve the three linear equations at one time by using the augmented matrix.”

  A x1 = e1, A x2 = e2, A x3 = e3  ⇒  form the augmented matrix [A  e1 e2 e3] = [A I].

Example. Suppose A = [2 −1 0; −1 2 −1; 0 −1 2].

  [A I] = [ 2 −1  0 | 1 0 0]   ...(1)
          [−1  2 −1 | 0 1 0]   ...(2)
          [ 0 −1  2 | 0 0 1]   ...(3)

⇒ [2 −1    0 | 1   0 0]   ...(1) = (1′)
  [0  3/2 −1 | 1/2 1 0]   ...(2) − (−1/2) × (1) = (2′)
  [0 −1    2 | 0   0 1]   ...(3) − 0 × (1) = (3′)

⇒ [2 −1    0   | 1   0   0]   ...(1′)
  [0  3/2 −1   | 1/2 1   0]   ...(2′)
  [0  0    4/3 | 1/3 2/3 1]   ...(3′) − (−2/3) × (2′)  □

2.5 How to find the inverse matrix of A 2-46

• What is the next step after obtaining

  [2 −1    0   | 1   0   0]
  [0  3/2 −1   | 1/2 1   0]
  [0  0    4/3 | 1/3 2/3 1] ?

Answer: Back substitution to solve

  x1 = [x1,1; x2,1; x3,1],  x2 = [x1,2; x2,2; x3,2]  and  x3 = [x1,3; x2,3; x3,3].

Example (Continued).

  x1 = [x1,1; x2,1; x3,1] = [(1 − (−1)x2,1 − (0)x3,1)/2; (1/2 − (−1)x3,1)/(3/2); (1/3)/(4/3)] = [3/4; 1/2; 1/4]

and

  x2 = [x1,2; x2,2; x3,2] = [(0 − (−1)x2,2 − (0)x3,2)/2; (1 − (−1)x3,2)/(3/2); (2/3)/(4/3)] = [1/2; 1; 1/2]

and

  x3 = [x1,3; x2,3; x3,3] = [(0 − (−1)x2,3 − (0)x3,3)/2; (0 − (−1)x3,3)/(3/2); 1/(4/3)] = [1/4; 1/2; 3/4]  □

2.5 How to find the inverse matrix of A 2-48

• Jordan said, “We can also solve [x1 x2 x3] by backward elimination.”

Example (Continued).

  [2 −1    0   | 1   0   0]   ...(1)
  [0  3/2 −1   | 1/2 1   0]   ...(2)
  [0  0    4/3 | 1/3 2/3 1]   ...(3)

⇒ [2 −1    0   | 1   0   0  ]   ...(1)
  [0  3/2  0   | 3/4 3/2 3/4]   ...(2) − (−3/4) × (3) = (2′)
  [0  0    4/3 | 1/3 2/3 1  ]   ...(3)

⇒ [2 −1    0   | 1   0   0  ]   ...(1) − (0) × (3) = (1′)
  [0  3/2  0   | 3/4 3/2 3/4]   ...(2′)
  [0  0    4/3 | 1/3 2/3 1  ]   ...(3)

⇒ [2  0    0   | 3/2 1   1/2]   ...(1′) − (−2/3) × (2′) = (1″)
  [0  3/2  0   | 3/4 3/2 3/4]   ...(2′)
  [0  0    4/3 | 1/3 2/3 1  ]   ...(3)

Based on this new augmented matrix, we can solve x1, x2 and x3 simply by dividing each row by its pivot (i.e., to make the pivot equal to 1). The resultant matrix is said to be in its reduced echelon form.

2.5 How to find the inverse matrix of A 2-49

  3 1 1 ...  / 1 00 4 2 4 (1 ) 2 ⇒  1 1  0 1 0 2 1 2 ...(2 )/(3/2) 1 1 3 001 4 2 4 ...(3)/(4/3) Hence,   3 1 1 4 2 4 A−1 1 1 . = 2 1 2 1 1 3 4 2 4 2

Gauss-Jordan method: Multiply [A I] by A⁻¹ to get [I A⁻¹], using forward elimination and backward elimination.

2.5 Terminologies 2-50

• Row echelon form: A form of matrices satisfying:
  – Every all-zero row is below non-all-zero rows.
  – The leading coefficient (the first nonzero number from the left, also called the pivot) of a non-all-zero row is always strictly to the right of the leading coefficient of the row above it.

Example. The upper triangular matrix obtained from forward elimination.

• (Row) Reduced echelon form: A form of matrices in row echelon form with each leading coefficient being one and also being the only non-zero entry in its column.

Example.
  [1 1 0 1]
  [0 0 1 1]
  [0 0 0 0]

2.5 MATLAB revisited 2-51

In MATLAB:

• The Gauss-Jordan method can be performed as:

  I = eye(n);           % Define the n-by-n identity matrix.
  R = rref([A I]);      % Obtain the reduced row echelon form.
  X = R(:, n+1:n+n);    % Pick up the last n columns to form the inverse of A.

• You can also obtain the inverse matrix directly:

  X=inv(A)              % A has to be a square matrix.

• Question: If inv() has already fulfilled the purpose of finding the inverse, why do we need rref()?

2.5 Gauss-Jordan method revisited 2-52

Example. Find the right inverse R3×2 of a non-square matrix A2×3 = [a1 a2 a3]. (Note that this cannot be done by inv().)

  A [r1,1; r2,1; r3,1] = [1; 0]  and  A [r1,2; r2,2; r3,2] = [0; 1]

⇒ Form the augmented matrix [A | I2×2] = [A | 1 0; 0 1]

⇒ Find E2×2 such that E [A | I2×2] = [EA | E] = [[I2×2  E a3] | E]

⇒ [r1,1; r2,1] = E [1; 0] − E a3 r3,1  and  [r1,2; r2,2] = E [0; 1] − E a3 r3,2

• The right inverse is not unique!

2.5 Gauss-Jordan method revisited 2-53

Further suppose A = [2 −1 0; −1 2 −1]. Then,

  [A | I2×2] = [ 2 −1  0 | 1 0]  ⇒  [2 −1    0 | 1   0]   ...Forward elimination
               [−1  2 −1 | 0 1]      [0  3/2 −1 | 1/2 1]

  ⇒ [2  0   −2/3 | 4/3 2/3]   ...Backward elimination
    [0  3/2 −1   | 1/2 1  ]

  ⇒ [1 0 −1/3 | 2/3 1/3]   ...Pivot normalization
    [0 1 −2/3 | 1/3 2/3]

Hence,

  E = [2/3 1/3; 1/3 2/3].

2.5 Gauss-Jordan method revisited 2-54

The problem is reduced to

  E A [r1,1; r2,1; r3,1] = [1 0 −1/3; 0 1 −2/3] [r1,1; r2,1; r3,1] = E [1; 0] = [2/3; 1/3]

and

  E A [r1,2; r2,2; r3,2] = [1 0 −1/3; 0 1 −2/3] [r1,2; r2,2; r3,2] = E [0; 1] = [1/3; 2/3]

We can let r3,1 = r3,2 = 0 and obtain

  [r1,1; r2,1] = E [1; 0]  and  [r1,2; r2,2] = E [0; 1].

So, the right inverse is R = [E; 0 0] = [2/3 1/3; 1/3 2/3; 0 0].

Question: Can we choose r3,1 = r3,2 = 1? Hint: R = [E − E a3 [r3,1 r3,2]; [r3,1 r3,2]].

2.5 Gauss-Jordan method revisited 2-55

In MATLAB: The Gauss-Jordan method can be performed as:

  I = eye(2);             % Define the 2-by-2 identity matrix.
  A = [2 -1 0; -1 2 -1];  % Define A.
  S = rref([A I]);        % Obtain the reduced row echelon form:
                          % S = [1 0 -1/3 2/3 1/3; 0 1 -2/3 1/3 2/3]
  R = [S(:, 4:5); 0 0];   % Pick up the last 2 columns to form R.

Important summary:
• In the process of the Gauss-Jordan method, we solve for and obtain directly the right inverse in the above example.
• Using the same technique, we may obtain the left inverse, which is the product of the forward elimination matrices, the backward elimination matrices and the pivot normalization matrix. In the above example, A2×3 does not have a left inverse because no matrix L3×2 can make L3×2 A2×3 = I3×3.

2.5 Diagonal and tridiagonal matrices 2-56

• (1) The inverse matrix of a diagonal square matrix (if the inverse exists) is a diagonal square matrix.
  (2) The inverse matrix of a tridiagonal square matrix may, however, be a dense square matrix (since a diagonal matrix is also a tridiagonal matrix, the inverse is not always dense).

Definition (Diagonal and tridiagonal matrices): A square matrix is a diagonal matrix if all the off-diagonal entries are zero. Likewise, a tridiagonal matrix is one in which all the entries are zero except those in the main diagonal and the two adjacent diagonals (which can be zeros).
– Some may define a tridiagonal matrix as one in which all entries in the main diagonal and the two adjacent diagonals are non-zero. In such a case, its inverse is always dense.
– It is also worth knowing that the “antidiagonal” entries of a matrix A are a1,n, a2,n−1, a3,n−2, ..., an,1.

Definition (Dense matrix): A matrix is dense if all the entries are non-zero.

2.5 Triangular matrix and singular matrix 2-57

• If L is lower triangular with 1’s on the diagonal, so is its inverse.

A square triangular matrix is invertible if, and only if, all its diagonal entries are non-zero.

Definition (Singular matrix): A square matrix is called singular if it has no inverse.

2.5 Symmetric and determinant 2-58

• The inverse matrix of a symmetric matrix (if the inverse exists) is symmetric.

Definition (Symmetric matrix): A square matrix is symmetric if its (i, j)th entry is equal to its (j, i)th entry for every i and j.

• Recall that the determinant is the signed area (volume, etc.) of the parallelogram (parallelepiped, etc.) formed by the column vectors of a matrix A.

– The Gauss-Jordan method can also be used to determine the determinant, which is equal to the product of all the pivots before normalization.

Example. Suppose A = [2 −1 0; −1 2 −1; 0 −1 2].

  [A I]  ⇒ (G.-J. method)  [2 0 0 | 3/2 1 1/2; 0 3/2 0 | 3/4 3/2 3/4; 0 0 4/3 | 1/3 2/3 1]

  ⇒ det(A) = 2 × (3/2) × (4/3) = 4.  □
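A MATLAB check (ours) of the pivot-product rule for this particular A; note that lu happens to need no row exchange here, while in general row exchanges can flip the sign (det(A) = ± product of pivots):

  A = [2 -1 0; -1 2 -1; 0 -1 2];
  [L, U] = lu(A);      % no row exchange occurs for this A
  prod(diag(U))        % product of pivots: 4
  det(A)               % also 4 (up to rounding)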

2.6 Elimination = Factorization: A = LU 2-59

Definition (Factorization): Factorization is the decomposition of an object (e.g., a number, a polynomial or a matrix) into a product of other objects, named factors, such that when multiplied together, they give the original.

Example.
  15 = 3 × 5
  x² − 3x + 2 = (x − 1)(x − 2)
  A = LU  □

• Triangular factorization: How to find lower triangular L and upper triangular U such that A = LU?

Answer: Gauss-Jordan method (Actually, just forward elimination. Why? See the below example).

– Forward elimination produces a matrix E such that EA = U.
– The inverse of the lower triangular E is also lower triangular: L = E⁻¹.
– This immediately gives A = E⁻¹U = LU.

2.6 Elimination = Factorization: A = LU 2-60

Example.

  [A I] = [ 2  4 −2 | 1 0 0]   ...(1) with 2 the first pivot
          [ 4  9 −3 | 0 1 0]   ...(2)
          [−2 −3  7 | 0 0 1]   ...(3)

Left-multiplying by E2,1 = [1 0 0; −2 1 0; 0 0 1] and then by E3,1 = [1 0 0; 0 1 0; −(−1) 0 1]:

⇔ [2 4 −2 |  1 0 0]   ...(1) with 2 the first pivot
  [0 1  1 | −2 1 0]   ...(2) − 2 × (1)
  [0 1  5 |  1 0 1]   ...(3) − (−1) × (1)

⇔ [2 4 −2 |  1 0 0]   ...(1)
  [0 1  1 | −2 1 0]   ...(2) with 1 the second pivot
  [0 1  5 |  1 0 1]   ...(3)

Left-multiplying by E3,2 = [1 0 0; 0 1 0; 0 −1 1]:

⇔ [2 4 −2 |  1  0 0]   ...(1)
  [0 1  1 | −2  1 0]   ...(2) with 1 the second pivot
  [0 0  4 |  3 −1 1]   ...(3) − 1 × (2)

2.6 Elimination = Factorization: A = LU 2-61

The forward elimination process can then be implemented by multiplying the elimination matrices E2,1, E3,1 and E3,2 in sequence:

  E3,2 E3,1 E2,1 [A I] = [2 4 −2 | 1 0 0; 0 1 1 | −2 1 0; 0 0 4 | 3 −1 1]

Then we have that, after forward elimination,

  E3,2 E3,1 E2,1 [A I] = [EA | E] = [U | E], where E = E3,2 E3,1 E2,1.

Next, we need to determine L = E⁻¹ = E2,1⁻¹ E3,1⁻¹ E3,2⁻¹. This is easy because every Ei,j is formed by “changing” just one off-diagonal entry of an identity matrix.

2.6 Elimination = Factorization: A = LU 2-62

What we need to do is to “negate” the only non-zero off-diagonal entry!

  E3,2⁻¹ = [1 0 0; 0 1 0; 0 −1 1]⁻¹ = [1 0 0; 0 1 0; 0 1 1],
  E3,1⁻¹ = [1 0 0; 0 1 0; 1 0 1]⁻¹ = [1 0 0; 0 1 0; −1 0 1],
  E2,1⁻¹ = [1 0 0; −2 1 0; 0 0 1]⁻¹ = [1 0 0; 2 1 0; 0 0 1].

More surprisingly, when they are multiplied in reverse order, the product simply places all these non-zero entries at their respective positions!

  L = E⁻¹ = E2,1⁻¹ E3,1⁻¹ E3,2⁻¹ = [1 0 0; 2 1 0; −1 1 1]  □

2.6 Elimination = Factorization: A = LU 2-63

Intuitions behind the above example

• Why is the inverse of Ei,j obtained by simply negating the multiplier ℓi,j of Ei,j?

Answer: The inverse action of (i)′ = (i) − ℓi,j × (j) is simply (i) = (i)′ + ℓi,j × (j).

• Why does the product E2,1⁻¹ E3,1⁻¹ E3,2⁻¹ simply place all the non-zero entries at their respective positions?

Answer: Left-multiplying by E3,1⁻¹ only updates row 3 of E3,2⁻¹ by row 1, i.e., (3)′ = (3) + ℓ3,1 × (1). Hence,

  (3)′ = (3),   if (1) = 0 (such as the (1,2)th and (1,3)th entries of E3,2⁻¹)
  (3)′ = ℓ3,1,  if (3) = 0 and (1) = 1

Hint of a formal proof:

  [I + (E3,1⁻¹ − I)][I + (E3,2⁻¹ − I)] = I + (E3,1⁻¹ − I) + (E3,2⁻¹ − I) + (E3,1⁻¹ − I)(E3,2⁻¹ − I)
                                       = I + (E3,1⁻¹ − I) + (E3,2⁻¹ − I),

since the cross term (E3,1⁻¹ − I)(E3,2⁻¹ − I) equals the zero matrix.

– A similar interpretation can be done for E2,1⁻¹ onto E3,1⁻¹ E3,2⁻¹.

We summarize the observation in the below lemma.

2.6 Elimination = Factorization: A = LU 2-64

Lemma: For an invertible A, if no row exchange is necessary during forward elimination, then its lower triangular factor L and upper triangular factor U satisfy:
• A = LU;
• L has 1’s on all main diagonal entries, and the multipliers ℓi,j below the diagonal entries.
  – When a row of A starts with r zeros, so does that row of L (because the multipliers are zeros).
• U has the (non-normalized) pivots on its main diagonal.
  – When a column of A starts with c zeros, so does that column of U (because forward elimination will not change these zeros).

Example. Suppose A = [a1,1 a1,2 a1,3; a2,1 a2,2 a2,3; a3,1 a3,2 a3,3].

2.6 Elimination = Factorization: A = LU 2-65

Then

  [a1,1 a1,2 a1,3]    [a1,1  a1,2                     a1,3                   ]  ...(1)
  [a2,1 a2,2 a2,3] ⇔  [0     a2,2 − (a2,1/a1,1)a1,2   a2,3 − (a2,1/a1,1)a1,3 ]  ...(2) − (a2,1/a1,1) × (1)
  [a3,1 a3,2 a3,3]    [0     a3,2 − (a3,1/a1,1)a1,2   a3,3 − (a3,1/a1,1)a1,3 ]  ...(3) − (a3,1/a1,1) × (1)

Then, we know

  L = [1 0 0; a2,1/a1,1 1 0; a3,1/a1,1 ? 1]  and  U = [a1,1 a1,2 a1,3; 0  a2,2 − (a2,1/a1,1)a1,2  a2,3 − (a2,1/a1,1)a1,3; 0 0 ?]  □

Remember the formula for the multipliers: ℓ2,1 = a2,1/a1,1 and ℓ3,1 = a3,1/a1,1.

By the LU decomposition, Ax = b can be solved by

  Lc = b (forward substitution), then Ux = c (back substitution).
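A MATLAB sketch (ours) of this two-triangular-system recipe, using the L and U found earlier for A = [2 4 −2; 4 9 −3; −2 −3 7]:

  L = [1 0 0; 2 1 0; -1 1 1];
  U = [2 4 -2; 0 1 1; 0 0 4];
  b = [2; 8; 10];
  c = L\b;    % forward substitution: solve Lc = b, giving c = [2; 4; 8]
  x = U\c     % back substitution: solve Ux = c, giving x = [-1; 2; 2]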

2.6 Elimination = Factorization: A = LU 2-66

• What if a row exchange is needed during forward elimination?

Example.

  [A I] = [ 2  4 −2 | 1 0 0]   ...(1) with 2 the first pivot
          [ 4  8 −3 | 0 1 0]   ...(2) (Replace the middle 9 by 8)
          [−2 −3  7 | 0 0 1]   ...(3)

Left-multiplying by E2,1 = [1 0 0; −2 1 0; 0 0 1] and then by E3,1 = [1 0 0; 0 1 0; −(−1) 0 1]:

⇔ [2 4 −2 |  1 0 0]   ...(1) with 2 the first pivot
  [0 0  1 | −2 1 0]   ...(2) − 2 × (1)
  [0 1  5 |  1 0 1]   ...(3) − (−1) × (1)

Left-multiplying by P2,3 = [1 0 0; 0 0 1; 0 1 0]:

⇔ [2 4 −2 |  1 0 0]   ...(1)
  [0 1  5 |  1 0 1]   ...(2) with 1 the second pivot
  [0 0  1 | −2 1 0]   ...(3)

Left-multiplying by E3,2 = [1 0 0; 0 1 0; 0 −0 1] (no further elimination is actually needed):

⇔ [2 4 −2 |  1 0 0]   ...(1)
  [0 1  5 |  1 0 1]   ...(2) with 1 the second pivot
  [0 0  1 | −2 1 0]   ...(3) − 0 × (2)

2.6 Elimination = Factorization: A = LU 2-67

The forward elimination process can then be implemented by multiplying E2,1, E3,1, P2,3 and E3,2 in sequence:

  E3,2 P2,3 E3,1 E2,1 [A I] = [2 4 −2 | 1 0 0; 0 1 5 | 1 0 1; 0 0 1 | −2 1 0]

Then we have that, after forward elimination,

  E3,2 P2,3 E3,1 E2,1 [A I] = [EA | E] = [U | E], where E = E3,2 P2,3 E3,1 E2,1.

Next, we need to determine L = E⁻¹ = E2,1⁻¹ E3,1⁻¹ P2,3⁻¹ E3,2⁻¹. This is easy because every Ei,j is formed by “changing” just one off-diagonal entry of an identity matrix, and the inverse of P2,3 is itself!

2.6 Elimination = Factorization: A = LU 2-68

What we need to do is to “negate” the only non-zero off-diagonal entry!

  E3,2⁻¹ = [1 0 0; 0 1 0; 0 0 1],  E3,1⁻¹ = [1 0 0; 0 1 0; −1 0 1],  E2,1⁻¹ = [1 0 0; 2 1 0; 0 0 1]

  L = E⁻¹ = E2,1⁻¹ E3,1⁻¹ P2,3⁻¹ E3,2⁻¹ = [1 0 0; 2 0 1; −1 1 0]   Not a lower triangular matrix!  □

2.6 Elimination = Factorization: A = LU 2-69

How can we guarantee that no row exchanges are necessary for finding A = LU?

• No row exchanges occur during forward elimination if, and only if, all the upper-left square sub-matrices of A are invertible. (See Problem 24 on p. 106.)

(Problem 24, Section 2.6) Which invertible matrices allow A = LU (elimination without row exchanges)? Good question! Look at each of the square upper left submatrices of A. All upper left k by k submatrices Ak must be invertible (sizes k = 1, ..., n).
Explain that answer: Ak factors into Lk Uk because LU = [Lk 0; ∗ ∗] [Uk ∗; 0 ∗].

• In such a case, Ak = Lk Uk, where Ak (similarly, Lk and Uk) consists of the first k rows and the first k columns of A.

  [a1,1 a1,2 a1,3]   [1    0    0] [u1,1 u1,2 u1,3]
  [a2,1 a2,2 a2,3] = [ℓ2,1 1    0] [0    u2,2 u2,3]
  [a3,1 a3,2 a3,3]   [ℓ3,1 ℓ3,2 1] [0    0    u3,3]

2.6 Elimination = Triple Factorization: A = LDU 2-70

• Now assume again that no row exchange is needed during forward elimination.
• Sometimes, we wish to make the diagonal entries of U all ones. Hence, a diagonal matrix D is introduced.

Example.

  [2 4 −2]   [2 0 0] [1 2 −1]
  [0 1  1] ⇒ [0 1 0] [0 1  1]
  [0 0  4]   [0 0 4] [0 0  1]
                D        U       □

The advantage of the triple factorization is that for a symmetric matrix A = LDU, we have L = Uᵀ, where Uᵀ is the “transpose” of matrix U (i.e., the (i, j)th entry of Uᵀ is the (j, i)th entry of U). Hence, A = LDLᵀ.
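A MATLAB sketch (ours) extracting the triple factorization from the LU factors of the running example by pulling the pivots out of U:

  L  = [1 0 0; 2 1 0; -1 1 1];
  U  = [2 4 -2; 0 1 1; 0 0 4];
  D  = diag(diag(U));    % the pivot matrix diag([2 1 4])
  U1 = D\U;              % unit-diagonal upper factor [1 2 -1; 0 1 1; 0 0 1]
  norm(L*D*U1 - L*U)     % essentially 0: A = LU = L*D*U1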

2.6 The cost of elimination 2-71

Question: Again, assume no row exchanges during elimination. What will be the complexity of using forward elimination and back substitution to solve Ax = b?

Answer: i) Forward elimination for matrix A:

  × operations                     exact             approximation
  ai,1 → 0 for i = 2, ..., n       n(n − 1)          n²
  ai,2 → 0 for i = 3, ..., n       (n − 1)(n − 2)    (n − 1)²
  ...
  ai,n−1 → 0 for i = n             2 · 1             2²
  total                            Σ_{i=2}^{n} i(i − 1) = (1/3)n(n² − 1)    (1/3)n³

Note: For example, n(n − 1) = n operations/row × (n − 1) rows.
Note: The blue-colored numbers are used for step ii) on the next page, which requires, for example, 1 operation/row × (n − 1) rows.

2.6 The cost of elimination 2-72

ii) Forward elimination for vector b: we need Σ_{i=2}^{n} (i − 1) = (1/2)n(n − 1) computations.

iii) Back substitution Ux = c requires:

  × or / operations                                exact
  xn = cn/un,n                                     1
  xn−1 = (cn−1 − un−1,n xn)/un−1,n−1               2
  ...
  x1 = ···                                         n
  total                                            Σ_{i=1}^{n} i = (1/2)n(n + 1)

Hence, ii) and iii) require n² multiplications.

• Using MATLAB, the above algorithm requires about 1 second to solve Ax = b with n = 1000.
• Doubling n makes the algorithm 2³ = 8 times slower.

Example. n = 100 × 1000 → 100³ seconds ≈ 11.574 days.  □
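A rough timing sketch (ours; the absolute numbers are machine-dependent and modern machines are much faster than the figure quoted above):

  n = 1000;
  A = rand(n);  b = rand(n,1);
  tic; x = A\b; toc    % elapsed time grows roughly like n^3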

2.6 The cost of elimination 2-73

Question: How about the complexity of calculating A⁻¹ (without row exchange) using Gauss-Jordan? Does it require n × [(1/3)n(n² − 1) + n²] multiplications, as we need to solve Axi = ei for i = 1, ..., n?

Answer: i) Forward elimination for the augmented matrix [A I]:

  × operations                     exact
  ai,1 → 0 for i = 2, ..., n       2n(n − 1)
  ai,2 → 0 for i = 3, ..., n       (2n − 1)(n − 2)
  ...
  ai,n−1 → 0 for i = n             (n + 2) · 1
  total                            Σ_{i=2}^{n} (n + i)(i − 1) = (1/6)n(n − 1)(5n + 2)

2.6 The cost of elimination 2-74

ii) Backward elimination for the augmented matrix [A I]:

  × operations                      exact
  ai,n → 0 for i = n−1, ..., 1      (n + 1)(n − 1)
  ai,n−1 → 0 for i = n−2, ..., 1    (n + 2)(n − 2)
  ...
  ai,2 → 0 for i = 1                (2n − 1) · 1
  total                             Σ_{i=1}^{n−1} (2n − i)i = (1/6)n(n − 1)(4n + 1)

iii) Pivot normalization: n² divisions.

Three steps in total: (1/6)n(n − 1)(5n + 2) + (1/6)n(n − 1)(4n + 1) + n² = (1/2)n(n − 1)(3n + 1) + n² = (1/2)n(3n² − 1) ≈ (3/2)n³.

• We can also find the inverse of A by solving Axi = ei for i = 1, ..., n separately. In such a case, the forward elimination part on slide 2-71 (step i)) can be shared among all n equations, while steps ii) and iii) on slide 2-72 must be repeated n times. This yields (1/3)n(n² − 1) + n · n² ≈ (4/3)n³ multiplications.
• Backward substitution is more efficient than backward elimination.

2.6 The cost of elimination 2-75

Again, assume no row exchanges during elimination. What will be the complexity of using forward elimination and back substitution to solve Ax = b, if A is a band matrix with w non-zero diagonals both below and above the main diagonal?

Answer: i) Forward elimination for matrix A:

  × operations                           exact     approximation
  ai,1 → 0 for i = 2, ..., w+1           w · w     w²
  ai,2 → 0 for i = 3, ..., w+2           w · w     w²
  ...
  ai,n−w → 0 for i = n−w+1, ..., n       w · w     w²
  ...
  ai,n−1 → 0 for i = n                   1 · 1     1²
  total    Σ_{i=1}^{w−1} i² + w²(n − w) = (1/6)w(6nw − 4w² − 3w + 1)     nw²

Note: The blue-colored numbers are used for step ii).

2.6 The cost of elimination 2-76

Hence, ii)andiii)requiren(1 + 2w) − w(1 + w) computations. A band-(2w +1) matrix A (if no row exchange is needed during forward elimination) satisfies that its factors L and U are both band matrices with bandwidth of w +1. 2.7 Transpose and permutation 2-77

Definition (Matrix transpose): The transpose Aᵀ of a matrix A is simply obtained by exchanging the rows and columns. Hence, the (i, j)th entry of Aᵀ is the (j, i)th entry of A.

Note: A matrix transpose is sometimes denoted by A′ or Aᵀ. For example,

• A′ is used by MATLAB.
• Aᵀ is adopted by the textbook. (We will use this notation in the sequel!)

Properties of transpose

• Sum Property: (A + B)ᵀ = Aᵀ + Bᵀ

Proof: This can be proved by listing all the matrix entries of both sides.  □

2.7 Transpose and permutation 2-78

• Product Property: (AB)ᵀ = BᵀAᵀ

Proof: First, prove (Ab)ᵀ = bᵀAᵀ using the Sum Property of Transpose. For a matrix A = [a1 a2 ···] (in columns) and a vector b,

  (Ab)ᵀ = (b1 a1 + b2 a2 + ···)ᵀ = b1 a1ᵀ + b2 a2ᵀ + ··· = bᵀ [a1ᵀ; a2ᵀ; ⋮] = bᵀAᵀ.

Then,

  (AB)ᵀ = (A [b1 b2 ···])ᵀ = [Ab1 Ab2 ···]ᵀ = [b1ᵀAᵀ; b2ᵀAᵀ; ⋮] = [b1ᵀ; b2ᵀ; ⋮] Aᵀ = BᵀAᵀ.  □

2.7 Transpose and permutation 2-79

• Inverse Property: (Aᵀ)⁻¹ = (A⁻¹)ᵀ

Proof: A⁻¹A = I ⇒ (A⁻¹A)ᵀ = Iᵀ = I; hence, the Product Property of Transpose gives Aᵀ(A⁻¹)ᵀ = I, which implies that (A⁻¹)ᵀ is the inverse of Aᵀ.  □
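Quick numerical checks (ours) of the three transpose properties:

  A = rand(3);  B = rand(3);
  norm((A+B)' - (A'+B'))      % sum property: essentially 0
  norm((A*B)' - B'*A')        % product property: essentially 0
  norm(inv(A') - inv(A)')     % inverse property: essentially 0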

2.7 Inner product revisited 2-80

Recall the difference between the inner product and the outer product (for real-valued vectors):

  Inner product:  v · w = wᵀv = [w1 w2] [v1; v2] = w1v1 + w2v2

and

  Outer product:  v ⊗ w = vwᵀ = [v1; v2] [w1 w2] = [v1w1 v1w2; v2w1 v2w2]

To memorize it,
• for the inner product, ᵀ is placed “inner” (in-between v and w);
• for the outer product, ᵀ is placed “outer” (outside v and w).

2.7 Inner product revisited 2-81

• The “dot product” we introduced in Chapter 1 should only be termed a “product” between two vectors, which is denoted by a “dot”.
• It happens to satisfy the three axioms of an inner product.

Definition: A mapping from V × V to F, denoted by ⟨·, ·⟩, is an inner product if for every x, y, z ∈ V and a ∈ F,
1. Positive-definiteness: ⟨x, x⟩ ≥ 0, with equality only when x = 0
2. Symmetry: ⟨x, y⟩ = ⟨y, x⟩
3. Linearity: ⟨ax, y⟩ = a⟨x, y⟩ and ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩

– The vector dot product is an inner product with V = ℝⁿ and F = ℝ (the set of all real numbers).

• It is then possible to parallel linear algebra with calculus along this conception.

2.7 Inner product revisited 2-82

Example. What is the “calculus” counterpart of the “transpose operation”, such as in (Ax)ᵀy = xᵀ(Aᵀy), where A is the difference matrix?

• The inner product of two vectors can be expressed in terms of “transpose” as ⟨x, y⟩ = x · y = yᵀx.

• Hence, (Ax)ᵀy = xᵀ(Aᵀy) can be expressed as ⟨y, Ax⟩ = ⟨Aᵀy, x⟩.

• The “inner product” of two real functions can be defined as

  ⟨x(t), y(t)⟩ ≜ ∫_{−∞}^{∞} x(t) y(t) dt.

• We then reasonably guess that the calculus counterpart of the difference matrix A is the derivative ∂/∂t. I.e., we guess

  ⟨y(t), (∂/∂t) x(t)⟩ = ⟨(∂/∂t)ᵀ y(t), x(t)⟩.

The big question is: what is (∂/∂t)ᵀ?

2.7 Inner product revisited 2-83

• Given that ∫_{−∞}^{∞} |x(t)| dt < ∞ and ∫_{−∞}^{∞} |y(t)| dt < ∞, integration by parts gives us

  ⟨y(t), (∂/∂t) x(t)⟩ = ∫_{−∞}^{∞} y(t) (∂x(t)/∂t) dt
                      = y(t)x(t) |_{−∞}^{∞} − ∫_{−∞}^{∞} (∂y(t)/∂t) x(t) dt   (integration by parts)
                      = − ∫_{−∞}^{∞} (∂y(t)/∂t) x(t) dt = ⟨−(∂/∂t) y(t), x(t)⟩

Hence, the transpose of differentiation is (∂/∂t)ᵀ = −(∂/∂t).

• Verification:

  Ax = [1 0 0; −1 1 0; 0 −1 1] [x1; x2; x3] = [x1; x2 − x1; x3 − x2]

and

  Aᵀy = [1 −1 0; 0 1 −1; 0 0 1] [y1; y2; y3] = [−(y2 − y1); −(y3 − y2); y3]  □

2.7 Inner product revisited 2-84

Summary

• The inner product is a more general concept and has much more extensive use than what we have introduced here.

2.7 Symmetric matrix 2-85

We can re-define the symmetric matrix (differently from the definition given on slide 2-58) using the transpose.

Definition (Symmetric matrix): A matrix A is said to be symmetric if Aᵀ = A.

Example. Prove that AᵀA is a symmetric matrix.

Proof: (AᵀA)ᵀ = Aᵀ(Aᵀ)ᵀ = AᵀA; hence, it is symmetric by definition.  □

Remarks.
• Based on the above definition, we can even introduce the skew-symmetric matrix, defined by Aᵀ = −A.
• One advantage of symmetric matrices is that their triple factorization has a “symmetric” form. In other words, A = LDLᵀ if A is symmetric (and no row exchange occurs during forward elimination).
• This can save half the effort in, e.g., forward elimination (from n³/3 down to n³/6). (See slide 2-71.)

2.7 Symmetric matrix 2-86

Question: Again, assume no row exchanges during elimination. What will be the complexity of forward elimination for a symmetric A?

Answer: i) Forward elimination for matrix A:

  × operations                   exact
  ai,1 → 0 for i = 2, ..., n     2 + 3 + ··· + (n − 1) + n = (1/2)(n − 1)(n + 2)
  ai,2 → 0 for i = 3, ..., n     2 + 3 + ··· + (n − 1) = (1/2)(n − 2)(n + 1)
  ...
  ai,n−1 → 0 for i = n           2 = (1/2)(1)(4)
  total                          Σ_{i=2}^{n} (1/2)(i − 1)(i + 2) = (1/6)n(n − 1)(n + 4)

ii) During the above eliminations, we only need to retain each multiplier (i.e., ℓi,j) and pivot (i.e., ui,i) to form L and D. (Note that in the above process, we know that A will be reduced to LDLᵀ, which is of course known once L and D are known.)

  [a1,1 a1,2 a1,3]   [1    0    0] [u1,1  —    — ]
  [a2,1 a2,2 a2,3] = [ℓ2,1 1    0] [0    u2,2  — ]     (the “—” entries need not be computed: the right factor is DLᵀ, so they follow from symmetry)
  [a3,1 a3,2 a3,3]   [ℓ3,1 ℓ3,2 1] [0    0   u3,3]
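For symmetric matrices MATLAB provides ldl; a quick check (ours) on the matrix A = [2 −1 0; −1 2 −1; 0 −1 2] used earlier (in general ldl may return 2-by-2 pivot blocks in D, which does not happen here):

  A = [2 -1 0; -1 2 -1; 0 -1 2];
  [L, D] = ldl(A);
  norm(A - L*D*L')    % essentially 0: A = L*D*L'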

2.7 Factorization with row exchange: PA = LU 2-87

• What if a row exchange is needed during forward elimination?

Example. (This is exactly the same example that appears on slide 2-66!)

  [A I] = [ 2  4 −2 | 1 0 0]   ...(1) with 2 the first pivot
          [ 4  8 −3 | 0 1 0]   ...(2) (Replace the middle 9 by 8)
          [−2 −3  7 | 0 0 1]   ...(3)

Left-multiplying by E2,1 = [1 0 0; −2 1 0; 0 0 1] and then by E3,1 = [1 0 0; 0 1 0; −(−1) 0 1]:

⇔ [2 4 −2 |  1 0 0]   ...(1) with 2 the first pivot
  [0 0  1 | −2 1 0]   ...(2) − 2 × (1)
  [0 1  5 |  1 0 1]   ...(3) − (−1) × (1)

Left-multiplying by P2,3 = [1 0 0; 0 0 1; 0 1 0]:

⇔ [2 4 −2 |  1 0 0]   ...(1)
  [0 1  5 |  1 0 1]   ...(2) with 1 the second pivot
  [0 0  1 | −2 1 0]   ...(3)

Left-multiplying by E3,2 = [1 0 0; 0 1 0; 0 −0 1]:

⇔ [2 4 −2 |  1 0 0]   ...(1)
  [0 1  5 |  1 0 1]   ...(2) with 1 the second pivot
  [0 0  1 | −2 1 0]   ...(3) − 0 × (2)

2.7 Factorization with row exchange: PA = LU 2-88

The forward elimination process can then be implemented by multiplying E2,1, E3,1, P2,3 and E3,2 in sequence:

  E3,2 P2,3 E3,1 E2,1 [A I] = [2 4 −2 | 1 0 0; 0 1 5 | 1 0 1; 0 0 1 | −2 1 0]

Then we have that, after forward elimination,

  E3,2 P2,3 E3,1 E2,1 [A I] = [EA | E] = [U | E], where E = E3,2 P2,3 E3,1 E2,1.

Next, we need to determine L = E⁻¹ = E2,1⁻¹ E3,1⁻¹ P2,3⁻¹ E3,2⁻¹. This is easy because every Ei,j is formed by “changing” just one off-diagonal entry of an identity matrix, and the inverse of P2,3 is itself!

2.7 Factorization with row exchange: PA = LU 2-89

What we need to do is to “negate” the only non-zero off-diagonal entry!

  E3,2⁻¹ = [1 0 0; 0 1 0; 0 0 1],  E3,1⁻¹ = [1 0 0; 0 1 0; −1 0 1],  E2,1⁻¹ = [1 0 0; 2 1 0; 0 0 1]

  L̃ = E⁻¹ = E2,1⁻¹ E3,1⁻¹ P2,3⁻¹ E3,2⁻¹ = [1 0 0; 2 0 1; −1 1 0]   Not a lower triangular matrix!

Then, we obtain

  A = L̃U = [1 0 0; 2 0 1; −1 1 0] [2 4 −2; 0 1 5; 0 0 1]

2.7 Factorization with row exchange: PA = LU 2-90

Hence,

  P A = [1 0 0; 0 0 1; 0 1 0] A = P L̃ U = L U = [1 0 0; −1 1 0; 2 0 1] [2 4 −2; 0 1 5; 0 0 1]  □

Factorization with row exchange: Find P such that PA = LU.

• The pre-permutation matrix that can force zero row exchanges is not unique! For (another) example, check the below for the above example:

  [0 1 0]       [ 1   0 0] [4 8 −3  ]
  [0 0 1] A  =  [−1/2 1 0] [0 1 11/2]
  [1 0 0]       [ 1/2 0 1] [0 0 −1/2]

2.7 Factorization with row exchange: PA = LU 2-91

• The MATLAB command for LU-factorization is

  [L,U,P]=lu(A)

Note: MATLAB will make row exchanges such that the next pivot is the largest (in absolute value)! This is the reason why, for

  A = [2 4 −2; 4 8 −3; −2 −3 7],

the first pivot that MATLAB chooses is 4!

2.7 Factorization with row exchange: PA = LU 2-92

Some useful notes

• A permutation matrix P (which has the rows of the identity matrix I in any order) satisfies Pᵀ = P⁻¹.
  Proof: Permutation does not change the inner product; hence (Px)ᵀ(Py) = xᵀy. Since x and y are arbitrary, PᵀP = I.  □
• It is not necessarily true that P⁻¹ = P, as P may not be symmetric.
• There are n! permutation matrices of size n × n.
• The determinant of P is either 1 (when the number of row exchanges is even) or −1 (when the number of row exchanges is odd).
• We can also use forward elimination to “symmetrize” a matrix A, i.e., A = ES, where E is an elementary row operation matrix and S is symmetric.

– The idea is simply to make the lower-triangle entry ai,j the same as aj,i (instead of making ai,j = 0).

Example (Problem 34 on page 119). A = [1 2; 4 9] = [1 0; 2 1] [1 2; 2 5] = ES  □

2.7 Factorization with row exchange: PA = LU 2-93

(Problem 34, Section 2.7) Write A = [1 2; 3 4] as the product EH of an elementary row operation matrix E and a symmetric matrix H.

Definition (Elementary matrix): An elementary (row operation) matrix is a matrix that differs from the identity matrix by one single elementary row operation, which includes:
1. row exchanges;
2. row multiplications; and
3. row additions.

– In fact, we can use the triple factorization to “symmetrize” a matrix:

  A = LDU = [L(Uᵀ)⁻¹] [UᵀDU] = E S,  where E = L(Uᵀ)⁻¹ and S = UᵀDU is symmetric.
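A numerical check (ours) of A = ES using the example above:

  A = [1 2; 4 9];
  E = [1 0; 2 1];    % elementary: one row addition applied to I
  S = [1 2; 2 5];    % symmetric factor
  isequal(A, E*S) && isequal(S, S')    % returns logical 1 (true)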