

2 Advanced Direct Methods

2.1 Use of Permutation (Similarity) in Solving Block-Diagonal Systems

• Suppose we are to solve the following system of linear equations: Ax = b, where
$$A = \begin{pmatrix} D^{11} & D^{12}\\ D^{21} & D^{22} \end{pmatrix} \quad\text{and}\quad D^{ij} = \mathrm{Diag}(D^{ij}_1, \cdots, D^{ij}_n)$$
are n × n diagonal matrices for i, j = 1, 2. Here A is a block matrix of 2 × 2 blocks (each block an n × n diagonal matrix of the same size).

• Written out entrywise:
$$\begin{pmatrix}
D^{11}_1 & & & D^{12}_1 & &\\
& \ddots & & & \ddots &\\
& & D^{11}_n & & & D^{12}_n\\
D^{21}_1 & & & D^{22}_1 & &\\
& \ddots & & & \ddots &\\
& & D^{21}_n & & & D^{22}_n
\end{pmatrix}
\begin{pmatrix} x_1\\ \vdots\\ x_n\\ x_{n+1}\\ \vdots\\ x_{2n} \end{pmatrix}
= \begin{pmatrix} b_1\\ \vdots\\ b_n\\ b_{n+1}\\ \vdots\\ b_{2n} \end{pmatrix}$$

• LU factorization may need O(n³) operations?

• Any strategy you can propose apart from LU factorization?

We consider the following example:
$$A = \begin{pmatrix}
5 & 0 & 0 & 2 & 0 & 0\\
0 & 4 & 0 & 0 & 1 & 0\\
0 & 0 & 5 & 0 & 0 & 2\\
1 & 0 & 0 & 4 & 0 & 0\\
0 & 3 & 0 & 0 & 3 & 0\\
0 & 0 & 1 & 0 & 0 & 5
\end{pmatrix} = L \cdot U
\equiv \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0\\
0 & 1 & 0 & 0 & 0 & 0\\
0 & 0 & 1 & 0 & 0 & 0\\
0.20 & 0 & 0 & 1 & 0 & 0\\
0 & 0.75 & 0 & 0 & 1 & 0\\
0 & 0 & 0.20 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
5 & 0 & 0 & 2 & 0 & 0\\
0 & 4 & 0 & 0 & 1 & 0\\
0 & 0 & 5 & 0 & 0 & 2\\
0 & 0 & 0 & 3.60 & 0 & 0\\
0 & 0 & 0 & 0 & 2.25 & 0\\
0 & 0 & 0 & 0 & 0 & 4.60
\end{pmatrix}$$

• One can consider a re-arrangement of both the equations and the variables and get the following equivalent system of equations Ãx̃ = b̃:
$$\begin{pmatrix}
D^{11}_1 & D^{12}_1 & & & &\\
D^{21}_1 & D^{22}_1 & & & &\\
& & D^{11}_2 & D^{12}_2 & &\\
& & D^{21}_2 & D^{22}_2 & &\\
& & & & \ddots &\\
& & & & & D^{11}_n & D^{12}_n\\
& & & & & D^{21}_n & D^{22}_n
\end{pmatrix}
\begin{pmatrix} x_1\\ x_{n+1}\\ x_2\\ x_{n+2}\\ \vdots\\ x_n\\ x_{2n} \end{pmatrix}
= \begin{pmatrix} b_1\\ b_{n+1}\\ b_2\\ b_{n+2}\\ \vdots\\ b_n\\ b_{2n} \end{pmatrix}$$

• It is clear that the linear system has a unique solution iff
$$\det\begin{pmatrix} D^{11}_i & D^{12}_i\\ D^{21}_i & D^{22}_i \end{pmatrix} \neq 0 \quad\text{for } i = 1, 2, \ldots, n.$$

• The solution is given by (i = 1, 2, . . . , n)
$$\begin{pmatrix} x_i\\ x_{n+i} \end{pmatrix} = \frac{1}{D^{11}_iD^{22}_i - D^{21}_iD^{12}_i}\begin{pmatrix} D^{22}_ib_i - D^{12}_ib_{n+i}\\ D^{11}_ib_{n+i} - D^{21}_ib_i \end{pmatrix}$$

• The computational cost is O(n).

• The memory cost is also O(n).

• There exists a permutation matrix P such that Ã = P · A · P^T, where P^T = P^{-1}.

• The following is an example when n = 2. With
$$P = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix},$$
we have
$$P\begin{pmatrix}
D^{11}_1 & 0 & D^{12}_1 & 0\\
0 & D^{11}_2 & 0 & D^{12}_2\\
D^{21}_1 & 0 & D^{22}_1 & 0\\
0 & D^{21}_2 & 0 & D^{22}_2
\end{pmatrix}P^T
= \begin{pmatrix}
D^{11}_1 & D^{12}_1 & 0 & 0\\
D^{21}_1 & D^{22}_1 & 0 & 0\\
0 & 0 & D^{11}_2 & D^{12}_2\\
0 & 0 & D^{21}_2 & D^{22}_2
\end{pmatrix} = \tilde{A}.$$

• To solve Ax = b, i.e., to solve ÃPx = Pb (since Ã = PAP^T and P^TP = I), we proceed in two stages:
Stage 1: Ãy = Pb (solve for y);
Stage 2: Px = y (recover x).

• The computational cost of applying the permutation P is negligible.

• The idea can be extended to a matrix of m × m blocks, where each block is an n × n diagonal matrix.

• If m ≪ n, then the computational cost will be O(m³n). (Why? After the permutation we solve n independent m × m dense systems, each costing O(m³).) A sketch of the m = 2 case follows below.
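The following is a minimal NumPy sketch of the m = 2 case (the function name `solve_block_diagonal` and the Cramer's-rule formulation are our choices, not from the notes):

```python
import numpy as np

def solve_block_diagonal(D11, D12, D21, D22, b):
    """Solve the 2x2-block system with n x n diagonal blocks in O(n).

    D11, D12, D21, D22 are length-n arrays holding the diagonals
    D^{ij}_1..D^{ij}_n, and b has length 2n.  After the permutation,
    each pair (x_i, x_{n+i}) satisfies an independent 2x2 system,
    solved here by Cramer's rule.
    """
    n = len(D11)
    b1, b2 = b[:n], b[n:]
    det = D11 * D22 - D21 * D12          # determinants of the n 2x2 blocks
    if np.any(det == 0):
        raise ValueError("a 2x2 block is singular: no unique solution")
    x = np.empty(2 * n)
    x[:n] = (D22 * b1 - D12 * b2) / det  # x_i
    x[n:] = (D11 * b2 - D21 * b1) / det  # x_{n+i}
    return x

# The 6x6 example from the notes (n = 3)
b = np.arange(1.0, 7.0)
x = solve_block_diagonal(np.array([5., 4., 5.]), np.array([2., 1., 2.]),
                         np.array([1., 3., 1.]), np.array([4., 3., 5.]), b)
```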

• The method can be implemented on a parallel computer easily. If there are O(n) processors, then the computational cost can be reduced to O(m³).

• LU factorization can be applied wisely in the diagonal block form:
$$\begin{pmatrix}
D^{11}_1 & & & D^{12}_1 & &\\
& \ddots & & & \ddots &\\
& & D^{11}_n & & & D^{12}_n\\
D^{21}_1 & & & D^{22}_1 & &\\
& \ddots & & & \ddots &\\
& & D^{21}_n & & & D^{22}_n
\end{pmatrix}
=
\begin{pmatrix}
1 & & & & &\\
& \ddots & & & &\\
& & 1 & & &\\
L_1 & & & 1 & &\\
& \ddots & & & \ddots &\\
& & L_n & & & 1
\end{pmatrix}
\begin{pmatrix}
U^1_1 & & & U^2_1 & &\\
& \ddots & & & \ddots &\\
& & U^1_n & & & U^2_n\\
& & & U_{n+1} & &\\
& & & & \ddots &\\
& & & & & U_{2n}
\end{pmatrix}$$

• The idea can be extended to a matrix of m × m blocks (each an n × n diagonal matrix). If m ≪ n, the computational cost will be O(m³n).

2.2 Use of Similarity in Solving Circulant Systems

• A matrix is called a circulant matrix if it takes the following form:
$$A_n = \begin{pmatrix}
a_1 & a_2 & \cdots & a_{n-1} & a_n\\
a_n & a_1 & a_2 & \cdots & a_{n-1}\\
a_{n-1} & a_n & a_1 & \cdots & \vdots\\
\vdots & \ddots & \ddots & \ddots & \vdots\\
a_3 & \cdots & \ddots & a_1 & a_2\\
a_2 & a_3 & \cdots & a_n & a_1
\end{pmatrix}.$$

• We note that each row is just the previous row cycled forward one step.

• Thus an n × n circulant matrix is characterized by n coefficients only.

• In an n × n circulant matrix, we have C_{ij} = C_{i'j'} if i − j ≡ i' − j' (mod n).

• In fact, we can write
$$A_n = \sum_{i=0}^{n-1} a_{i+1}C_n^i \qquad (2.1)$$
where C_n is the permutation matrix
$$C_n = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0\\
0 & 0 & 1 & & 0\\
\vdots & & \ddots & \ddots & \vdots\\
& & & \ddots & 1\\
1 & 0 & \cdots & \cdots & 0
\end{pmatrix}.$$

• We note that the powers C_n^i are all permutation matrices and C_n^0 = C_n^n = I_n.

• Moreover, A_n is diagonalizable if C_n is diagonalizable (they share eigenvectors).

• It can be shown that all powers C_n^i can be diagonalized by using the discrete Fast Fourier Transform (FFT) matrix F_n.

• The discrete FFT matrix is defined as follows:
$$F_n = F_n(w_n) = \begin{pmatrix}
w_n^{0\cdot 0} & w_n^{0\cdot 1} & \cdots & w_n^{0\cdot(n-1)}\\
w_n^{1\cdot 0} & w_n^{1\cdot 1} & \cdots & w_n^{1\cdot(n-1)}\\
\vdots & \vdots & & \vdots\\
w_n^{(n-1)\cdot 0} & w_n^{(n-1)\cdot 1} & \cdots & w_n^{(n-1)\cdot(n-1)}
\end{pmatrix}$$
where
$$w_n = e^{-\frac{2\pi i}{n}} = \cos\left(\frac{2\pi}{n}\right) - i\sin\left(\frac{2\pi}{n}\right).$$

• It can be shown that (Exercise)
$$F_n^{-1} = \frac{1}{n}F_n\!\left(\frac{1}{w_n}\right) = \frac{1}{n}\begin{pmatrix}
w_n^{-0\cdot 0} & w_n^{-0\cdot 1} & \cdots & w_n^{-0\cdot(n-1)}\\
w_n^{-1\cdot 0} & w_n^{-1\cdot 1} & \cdots & w_n^{-1\cdot(n-1)}\\
\vdots & \vdots & & \vdots\\
w_n^{-(n-1)\cdot 0} & w_n^{-(n-1)\cdot 1} & \cdots & w_n^{-(n-1)\cdot(n-1)}
\end{pmatrix}.$$

• A demonstration (n = 4):
$$C_4 = \begin{pmatrix} 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\\ 1&0&0&0 \end{pmatrix},\;
C_4^2 = \begin{pmatrix} 0&0&1&0\\ 0&0&0&1\\ 1&0&0&0\\ 0&1&0&0 \end{pmatrix},\;
C_4^3 = \begin{pmatrix} 0&0&0&1\\ 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0 \end{pmatrix},\;
C_4^4 = \begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix},$$

•
$$F_4 = \begin{pmatrix} 1 & 1 & 1 & 1\\ 1 & -i & -1 & i\\ 1 & -1 & 1 & -1\\ 1 & i & -1 & -i \end{pmatrix},\qquad
F_4^{-1} = \begin{pmatrix} 0.25 & 0.25 & 0.25 & 0.25\\ 0.25 & 0.25i & -0.25 & -0.25i\\ 0.25 & -0.25 & 0.25 & -0.25\\ 0.25 & -0.25i & -0.25 & 0.25i \end{pmatrix}.$$

• Here F_4 · F_4^{-1} = I_4 and F_4 · C_4 · F_4^{-1} = Diag(1, i, −1, −i).
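The n = 4 demonstration can be checked numerically; a minimal NumPy sketch (our construction of C_4 and F_4):

```python
import numpy as np

n = 4
C = np.roll(np.eye(n), 1, axis=1)        # the cyclic permutation matrix C_4
w = np.exp(-2j * np.pi / n)              # w_4 = e^{-2*pi*i/4} = -i
j, k = np.indices((n, n))
F = w ** (j * k)                         # the discrete FFT matrix F_4
Finv = (1 / n) * w ** (-j * k)           # F_4^{-1} = (1/n) F_4(1/w_4)

assert np.allclose(F @ Finv, np.eye(n))  # F_4 * F_4^{-1} = I_4
D = F @ C @ Finv                         # should be Diag(1, i, -1, -i)
print(np.round(np.diag(D), 10))          # -> [1.+0.j  0.+1.j -1.+0.j -0.-1.j]
```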

• The product of two circulant matrices is still a circulant matrix (Exercise).

• The inverse of a circulant matrix is still a circulant matrix (Exercise).

• Since
$$\det(\lambda I - C_n) = \lambda^n - 1,$$
the eigenvalues of C_n are distinct (therefore C_n is diagonalizable) and are given by
$$e^{\frac{2\pi ij}{n}} = w_n^{-j} \quad\text{for } j = 0, 1, \ldots, n-1,$$
or stored in the following diagonal matrix
$$D_n = \mathrm{Diag}(1,\ w_n^{-1},\ w_n^{-2},\ \cdots,\ w_n^{-(n-1)}).$$

• We have
$$F_n\cdot C_n = \begin{pmatrix}
w_n^{0\cdot(n-1)} & w_n^{0\cdot 0} & \cdots & w_n^{0\cdot(n-2)}\\
w_n^{1\cdot(n-1)} & w_n^{1\cdot 0} & \cdots & w_n^{1\cdot(n-2)}\\
\vdots & \vdots & & \vdots\\
w_n^{(n-1)\cdot(n-1)} & w_n^{(n-1)\cdot 0} & \cdots & w_n^{(n-1)\cdot(n-2)}
\end{pmatrix}
\quad\text{and}\quad
D_n^{-1} = \mathrm{Diag}(1,\ w_n^{1},\ w_n^{2},\ \cdots,\ w_n^{n-1}).$$

• Multiplying row j of F_n · C_n by w_n^j gives
$$D_n^{-1}\cdot F_n\cdot C_n = \begin{pmatrix}
w_n^{0\cdot(n-1)} & w_n^{0\cdot 0} & \cdots & w_n^{0\cdot(n-2)}\\
w_n^{1\cdot(n-1)+1} & w_n^{1\cdot 0+1} & \cdots & w_n^{1\cdot(n-2)+1}\\
\vdots & \vdots & & \vdots\\
w_n^{(n-1)\cdot(n-1)+(n-1)} & w_n^{(n-1)\cdot 0+(n-1)} & \cdots & w_n^{(n-1)\cdot(n-2)+(n-1)}
\end{pmatrix}.$$

• Then, by comparing the columns (using w_n^{j(n-1)+j} = w_n^{jn} = w_n^{j\cdot 0}), we have
$$D_n^{-1}\cdot F_n\cdot C_n = \begin{pmatrix}
w_n^{0\cdot 0} & w_n^{0\cdot 1} & \cdots & w_n^{0\cdot(n-1)}\\
w_n^{1\cdot 0} & w_n^{1\cdot 1} & \cdots & w_n^{1\cdot(n-1)}\\
\vdots & \vdots & & \vdots\\
w_n^{(n-1)\cdot 0} & w_n^{(n-1)\cdot 1} & \cdots & w_n^{(n-1)\cdot(n-1)}
\end{pmatrix} = F_n.$$

• Thus F_n · C_n · F_n^{-1} = D_n, and hence for k = 0, 1, . . . , n − 1 we have F_n · C_n^k · F_n^{-1} = D_n^k. From Eq. (2.1), we have
$$F_n\cdot A_n\cdot F_n^{-1} = \sum_{k=0}^{n-1} a_{k+1}D_n^k = E_n$$
where the eigenvalues of A_n are stored in the entries of the diagonal matrix E_n.

• The right-hand side is a diagonal matrix and therefore the transformed system is easy to solve. But we have to solve two problems below:

(i) The matrix–vector multiplications F_n x and F_n^{-1} x should be efficient.
(ii) Get the eigenvalues of A_n, i.e., get E_n.

• For (i), it is known that both F_n x and F_n^{-1} x can be computed in O(n log n) operations. We will briefly discuss this fact in the next subsection. In MATLAB you may use fft(x) and ifft(x).

• For (ii), it is straightforward to show that
$$F_n^{-1}\,[1\ 1\ \ldots\ 1]^T = [1\ 0\ \ldots\ 0]^T \quad\text{as}\quad F_n\,[1\ 0\ \ldots\ 0]^T = [1\ 1\ \ldots\ 1]^T.$$
Then one can easily see that
$$F_n\cdot A_n\cdot F_n^{-1}\cdot[1\ 1\ \ldots\ 1]^T = E_n\cdot[1\ 1\ \ldots\ 1]^T$$
and this yields
$$F_n\cdot A_n\cdot[1\ 0\ \cdots\ 0]^T = [\lambda_1\ \lambda_2\ \ldots\ \lambda_n]^T.$$
Then we have

$$F_n\cdot[a_1\ a_n\ a_{n-1}\ \ldots\ a_2]^T = [\lambda_1\ \lambda_2\ \ldots\ \lambda_n]^T,$$
since A_n[1 0 . . . 0]^T is the first column of A_n.

• This means the eigenvalues {λ_i}_{i=1}^n of A_n can be obtained in O(n log n) operations.

• Finally, we conclude that a circulant matrix system can be solved in O(n log n) operations using the discrete Fast Fourier Transform (FFT).

• Given a circulant matrix system Ax = b:

Step 1: Let A1 be the first column of A; then we get all the eigenvalues {λ_i}_{i=1}^n of A by applying the FFT: fft(A1). This requires O(n log n) operations.

Step 2: Then we have A = F_n^{-1} · Diag(λ_1, . . . , λ_n) · F_n. Therefore

Diag(λ1, . . . , λn) · Fnx = Fn · b.

Compute r = Fn·b by using the discrete Fast Fourier Transform (FFT). This requires O(n log n) operations.

Step 3: Let y = Fnx. Then we solve the linear equations

Diag(λ_1, . . . , λ_n) · y = r, which requires O(n) operations.

Step 4: x = F_n^{-1} y, which can be done by ifft(y) and requires O(n log n) operations.

• Finally we conclude that a circulant system can be solved in O(n log n) operations.
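A minimal NumPy sketch of Steps 1–4 (the function name is ours; SciPy ships the same idea as `scipy.linalg.solve_circulant`):

```python
import numpy as np

def solve_circulant(a_col, b):
    """Solve A x = b for a circulant A given by its first column a_col.

    Step 1: the eigenvalues of A are fft(first column).
    Steps 2-4: transform b, divide entrywise, transform back.
    Total cost: O(n log n).
    """
    lam = np.fft.fft(a_col)        # Step 1: eigenvalues lambda_1..lambda_n
    r = np.fft.fft(b)              # Step 2: r = F_n b
    y = r / lam                    # Step 3: Diag(lambda) y = r
    return np.fft.ifft(y)          # Step 4: x = F_n^{-1} y

# Small check against a dense solve
n = 6
a_col = np.random.rand(n)
A = np.column_stack([np.roll(a_col, k) for k in range(n)])  # circulant A
b = np.random.rand(n)
x = solve_circulant(a_col, b)
assert np.allclose(A @ x.real, b)
```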

2.2.1 Evaluating a Polynomial and the Fast Fourier Transform

• Given a polynomial of degree n − 1, assuming n = 2^k for some k ∈ ℕ:
$$P(x) = \sum_{i=0}^{n-1} a_ix^i.$$

• We are asked to evaluate it at n distinct points x_0, x_1, . . . , x_{n−1}. In matrix notation, we write
$$\begin{pmatrix}
1 & x_0 & (x_0)^2 & \cdots & (x_0)^{n-1}\\
1 & x_1 & (x_1)^2 & \cdots & (x_1)^{n-1}\\
\vdots & \vdots & \vdots & & \vdots\\
1 & x_{n-1} & (x_{n-1})^2 & \cdots & (x_{n-1})^{n-1}
\end{pmatrix}
\begin{pmatrix} a_0\\ a_1\\ \vdots\\ a_{n-1} \end{pmatrix}
= \begin{pmatrix} P(x_0)\\ P(x_1)\\ \vdots\\ P(x_{n-1}) \end{pmatrix}$$

• Suppose we are allowed to choose those n distinct points freely. We observe that if we assign x_i = −x_j, then x_i² = x_j², x_i⁴ = x_j⁴, . . . , and we can save half of the computational cost. We hope that we can do the same for all the other pairs of rows.

• We expect to have n special rows such that the computation requires only n/2 vector products.

• We let x_i = −x_{n/2+i} (i = 0, 1, . . . , n/2 − 1). We then divide the problem into two subproblems of size n/2:
$$\begin{pmatrix}
1 & x_0 & (x_0)^2 & \cdots & (x_0)^{n-1}\\
1 & x_1 & (x_1)^2 & \cdots & (x_1)^{n-1}\\
\vdots & \vdots & \vdots & & \vdots\\
1 & x_{n/2-1} & (x_{n/2-1})^2 & \cdots & (x_{n/2-1})^{n-1}\\
1 & -x_0 & (-x_0)^2 & \cdots & (-x_0)^{n-1}\\
1 & -x_1 & (-x_1)^2 & \cdots & (-x_1)^{n-1}\\
\vdots & \vdots & \vdots & & \vdots\\
1 & -x_{n/2-1} & (-x_{n/2-1})^2 & \cdots & (-x_{n/2-1})^{n-1}
\end{pmatrix}
\begin{pmatrix} a_0\\ a_1\\ a_2\\ \vdots\\ a_{n-1} \end{pmatrix}
= \begin{pmatrix} P(x_0)\\ P(x_1)\\ \vdots\\ P(x_{n/2-1})\\ P(-x_0)\\ P(-x_1)\\ \vdots\\ P(-x_{n/2-1}) \end{pmatrix} \qquad (2.2)$$

• Comparing the two subproblems, the even powers of x_i need to be computed only once (saving half of the cost), while the odd powers differ only by a negative sign.

• We note that both P_e(·) and P_o(·) below are of degree n/2 − 1:
$$P(x) = \sum_{i=0}^{n/2-1} a_{2i}x^{2i} + x\sum_{i=0}^{n/2-1} a_{2i+1}x^{2i} \equiv P_e(x^2) + xP_o(x^2).$$

• To evaluate Eq. (2.2), we need to compute both P(x_i) and P(−x_i) for i = 0, 1, . . . , n/2 − 1. Here
$$P(x) = P_e(x^2) + xP_o(x^2) \quad\text{and}\quad P(-x) = P_e(x^2) - xP_o(x^2).$$

• Thus we need to compute n/2 values of the form P_e(x²), n/2 values of the form P_o(x²), n multiplications, n/2 additions and n/2 subtractions.

• Assuming that the process can be further sub-divided into subproblems of size n/4, n/8, . . . , 4, 2, we would like to know the total computational cost.

• Let T(n) be the computational cost of evaluating a polynomial of degree n at n distinct points. Then we have
$$T(n) = 2\,T\!\left(\frac{n}{2}\right) + 2n.$$
By letting n = 2^k we have
$$T(2^k) = 2T(2^{k-1}) + 2\cdot 2^k = 2^2T(2^{k-2}) + 2(1+1)2^k = \cdots = 2^kT(1) + 2k\,2^k.$$
This means T(n) = O(n log n).
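A sketch of the even/odd splitting as a recursive routine, taking the evaluation points to be the roots of unity discussed below so that the splitting can be repeated all the way down (this is the classical radix-2 idea; the comparison with NumPy's FFT is our check):

```python
import numpy as np

def poly_eval_fft(a):
    """Evaluate P(x) = sum a_i x^i at the n points w_n^0, ..., w_n^{n-1},
    w_n = exp(-2*pi*i/n), via the splitting P(x) = Pe(x^2) + x*Po(x^2).
    Requires len(a) to be a power of 2.  Cost: T(n) = 2T(n/2) + O(n).
    """
    n = len(a)
    if n == 1:
        return np.array(a, dtype=complex)
    even = poly_eval_fft(a[0::2])        # Pe at the (n/2)-th roots of unity
    odd = poly_eval_fft(a[1::2])         # Po at the (n/2)-th roots of unity
    x = np.exp(-2j * np.pi * np.arange(n // 2) / n)   # x_0..x_{n/2-1}
    # P(x_j) = Pe(x_j^2) + x_j Po(x_j^2),  P(-x_j) = Pe(x_j^2) - x_j Po(x_j^2)
    return np.concatenate([even + x * odd, even - x * odd])

a = np.random.rand(8)
assert np.allclose(poly_eval_fft(a), np.fft.fft(a))
```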

• Can we further sub-divide the process in computing P_e(x²)? Let us take a look:
$$\begin{pmatrix}
1 & (x_0)^2 & (x_0)^4 & \cdots & (x_0)^{n-2}\\
\vdots & \vdots & \vdots & & \vdots\\
1 & (x_{n/4-1})^2 & (x_{n/4-1})^4 & \cdots & (x_{n/4-1})^{n-2}\\
1 & (x_{n/4})^2 & (x_{n/4})^4 & \cdots & (x_{n/4})^{n-2}\\
\vdots & \vdots & \vdots & & \vdots\\
1 & (x_{n/2-1})^2 & (x_{n/2-1})^4 & \cdots & (x_{n/2-1})^{n-2}
\end{pmatrix}
\begin{pmatrix} a_0\\ a_2\\ \vdots\\ a_{n-4}\\ a_{n-2} \end{pmatrix}
= \begin{pmatrix} P_e(x_0)\\ \vdots\\ P_e(x_{n/4-1})\\ P_e(x_{n/4})\\ \vdots\\ P_e(x_{n/2-1}) \end{pmatrix}$$

• In order to apply the trick again, we must have x_{n/4}² = −(x_0)². Therefore the sub-dividing process cannot be carried further if the x_i are real.

• However, this is possible if the x_i are carefully chosen complex numbers such that x_{j+n/4} = √−1 · x_j for j = 0, 1, . . . , n/4 − 1.

• In particular, if we wish to carry out the sub-division k − 1 times, then the x_i can be chosen as the entries of the Fast Fourier Transform matrix F_{2^k}(w_{2^k}).

2.3 Toeplitz Systems

• A matrix is called a Toeplitz matrix if it takes the following form:
$$A_n = \begin{pmatrix}
a_0 & a_1 & \cdots & a_{n-1} & a_n\\
a_{-1} & a_0 & a_1 & \cdots & a_{n-1}\\
\vdots & a_{-1} & a_0 & \cdots & \vdots\\
\vdots & \ddots & \ddots & \ddots & \vdots\\
a_{-n+1} & \cdots & \ddots & a_0 & a_1\\
a_{-n} & a_{-n+1} & \cdots & a_{-1} & a_0
\end{pmatrix}.$$

• An n × n Toeplitz matrix¹ has constant diagonals; it is characterized by 2n − 1 coefficients and appears in many applications such as numerical P.D.E.

• We note that a circulant matrix is a particular member of the class of Toeplitz matrices.

¹ Usually there is a continuous generating function f(x) such that the entries of the Toeplitz matrix are given by
$$a_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)e^{-ikx}\,dx.$$

• An example in numerical O.D.E. is the finite difference method. We consider the following second order ordinary differential equation:
$$f''(x) = \sin(x^2), \quad f(0) = f(1) = 0, \quad x \in [0, 1]. \qquad (2.3)$$

• One may first divide the interval [0, 1] into n equal parts. We then let
$$x_i = \frac{i}{n}, \quad\text{for } i = 0, 1, 2, \ldots, n.$$

From (2.3) we have f(x0) = f(xn) = 0.

• We then adopt the following approximation of f''(x) at the points x_i for i = 1, 2, . . . , n − 1:
$$f''(x_i) \approx \frac{\dfrac{f(x_{i-1}) - f(x_i)}{1/n} - \dfrac{f(x_i) - f(x_{i+1})}{1/n}}{1/n} = \frac{f(x_{i-1}) - 2f(x_i) + f(x_{i+1})}{1/n^2}. \qquad (2.4)$$

• Therefore from Equation (2.4) we get
$$\begin{pmatrix}
2 & -1 & 0 & \cdots & 0\\
-1 & 2 & -1 & \ddots & \vdots\\
0 & -1 & \ddots & \ddots & 0\\
\vdots & \ddots & \ddots & 2 & -1\\
0 & \cdots & 0 & -1 & 2
\end{pmatrix}
\begin{pmatrix} f(x_1)\\ f(x_2)\\ \vdots\\ \vdots\\ f(x_{n-1}) \end{pmatrix}
= -\frac{1}{n^2}\begin{pmatrix} f''(x_1)\\ f''(x_2)\\ \vdots\\ \vdots\\ f''(x_{n-1}) \end{pmatrix} \qquad (2.5)$$
or
$$A_{n-1}\mathbf{x} = \begin{pmatrix}
2 & -1 & 0 & \cdots & 0\\
-1 & 2 & -1 & \ddots & \vdots\\
0 & -1 & \ddots & \ddots & 0\\
\vdots & \ddots & \ddots & 2 & -1\\
0 & \cdots & 0 & -1 & 2
\end{pmatrix}
\begin{pmatrix} f(x_1)\\ f(x_2)\\ \vdots\\ \vdots\\ f(x_{n-1}) \end{pmatrix}
= -\frac{1}{n^2}\begin{pmatrix} \sin(x_1^2)\\ \sin(x_2^2)\\ \vdots\\ \vdots\\ \sin(x_{n-1}^2) \end{pmatrix}. \qquad (2.6)$$

• By solving the above Toeplitz system, we may get an approximation of the solution of (2.3) at the points x_i ∈ [0, 1].

• It can be shown that the approximation error goes to zero as n goes to infinity. Thus, to get an accurate solution, we need to solve a large matrix system A_n x = b.
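A short NumPy sketch of assembling and solving (2.6) (dense solve for clarity; the O(n) tridiagonal solver of Section 2.5 is what one would use in practice):

```python
import numpy as np

n = 100
x = np.arange(1, n) / n                      # interior points x_1..x_{n-1}
# A_{n-1}: tridiagonal Toeplitz matrix with 2 on the diagonal, -1 off it
A = 2 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)
rhs = -np.sin(x ** 2) / n ** 2               # right-hand side of (2.6)
f = np.linalg.solve(A, rhs)                  # approximations to f(x_i)
```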

Theorem 2.1. We have det(A_n) = n + 1 ≠ 0. Therefore all the leading principal minors of A_n are invertible and hence the Doolittle LU factorization of A_n exists. [This is an exercise in Assignment 1]

Theorem 2.2. We have
$$\mathbf{x}^TA_n\mathbf{x} = 2\sum_{i=1}^{n}x_i^2 - 2\sum_{i=1}^{n-1}x_{i+1}x_i = x_1^2 + x_n^2 + \sum_{i=1}^{n-1}(x_{i+1} - x_i)^2$$
and therefore A_n is symmetric positive definite. Hence A_n has a unique Cholesky factorization.

Theorem 2.3. Write
$$A_n \equiv L_nU_n = \begin{pmatrix}
1 & 0 & \cdots & \cdots & 0\\
l_1 & 1 & \ddots & & \vdots\\
0 & \ddots & \ddots & \ddots & \vdots\\
\vdots & \ddots & l_{n-2} & 1 & 0\\
0 & \cdots & 0 & l_{n-1} & 1
\end{pmatrix}
\begin{pmatrix}
a_1 & u_1 & 0 & \cdots & 0\\
0 & a_2 & u_2 & \ddots & \vdots\\
\vdots & \ddots & \ddots & \ddots & 0\\
\vdots & & \ddots & a_{n-1} & u_{n-1}\\
0 & \cdots & \cdots & 0 & a_n
\end{pmatrix}.$$
Then we have (i)
$$\begin{cases}
a_1 = 2, & a_i = 2 - a_{i-1}^{-1}, \quad i = 2, \ldots, n,\\
l_1 = -1/2, & l_i = -(2 + l_{i-1})^{-1}, \quad i = 2, \ldots, n-1,\\
u_i = -1, & i = 1, \ldots, n-1,
\end{cases}$$
and (ii)
$$a_i = \frac{i+1}{i} \quad\text{and}\quad l_i = \frac{-i}{i+1}.$$

Proof. We have
$$\begin{pmatrix}
1 & 0 & \cdots & \cdots & 0\\
l_1 & 1 & \ddots & & \vdots\\
0 & \ddots & \ddots & \ddots & \vdots\\
\vdots & \ddots & l_{n-2} & 1 & 0\\
0 & \cdots & 0 & l_{n-1} & 1
\end{pmatrix}
\begin{pmatrix}
a_1 & u_1 & 0 & \cdots & 0\\
0 & a_2 & u_2 & \ddots & \vdots\\
\vdots & \ddots & \ddots & \ddots & 0\\
\vdots & & \ddots & a_{n-1} & u_{n-1}\\
0 & \cdots & \cdots & 0 & a_n
\end{pmatrix}
= \begin{pmatrix}
a_1 & u_1 & 0 & \cdots & 0\\
l_1a_1 & l_1u_1 + a_2 & u_2 & \ddots & \vdots\\
0 & \ddots & \ddots & \ddots & 0\\
\vdots & \ddots & l_{n-2}a_{n-2} & l_{n-2}u_{n-2} + a_{n-1} & u_{n-1}\\
0 & \cdots & 0 & l_{n-1}a_{n-1} & l_{n-1}u_{n-1} + a_n
\end{pmatrix}.$$

• Hence, matching entries with A_n, we have

u1 = u2 = ... = un−1 = −1 and

a_1 = 2, a_i = 2 + l_{i−1}; l_1 = −1/2, l_i = −1/a_i.

Thus
$$\begin{cases}
a_1 = 2, & a_i = 2 - a_{i-1}^{-1}, \quad i = 2, \ldots, n,\\
l_1 = -1/2, & l_i = -(2 + l_{i-1})^{-1}, \quad i = 2, \ldots, n-1.
\end{cases}$$

• For (ii), we apply mathematical induction. The result is true when i = 1. Assume that
$$a_i = \frac{i+1}{i} \quad\text{and}\quad l_i = \frac{-i}{i+1};$$
then by the induction assumption
$$a_{i+1} = 2 - \frac{1}{a_i} = 2 - \frac{i}{i+1} = \frac{i+2}{i+1}$$
and
$$l_{i+1} = -\frac{1}{2 + l_i} = -\left(2 - \frac{i}{i+1}\right)^{-1} = -\frac{i+1}{i+2}.$$
This proves the result.

• One can then obtain the Cholesky factorization (Exercise).

2.4 Sparse Toeplitz System

• In this part, we consider the numerical solution (by the finite difference method) U(x, y) of the following partial differential equation in the domain Ω = [0, 1] × [0, 1]:

$$U_{xx}(x, y) + U_{yy}(x, y) = f(x, y) \quad\text{in } \Omega \qquad (2.7)$$
where U(x, y) is zero on the boundary of Ω, i.e., U(x, 0) = U(0, y) = U(1, y) = U(x, 1) = 0 for all x, y ∈ [0, 1].

• This is Poisson's equation with a Dirichlet boundary condition.

• One may first divide the square domain Ω into n² equal squares with grid points (see the figure below for the case n = 4)
$$(x_i, y_j) = \left(\frac{i}{n}, \frac{j}{n}\right).$$

[Figure: the unit square with corners (0,0), (1,0), (0,1), (1,1), covered by a uniform 5 × 5 grid of points.]

• We then adopt the following approximation of Uxx(x, y) and Uyy(x, y) at the interior grid points (xi, yj)

$$\frac{U_{i+1,j} - 2U_{i,j} + U_{i-1,j}}{1/n^2} \quad\text{and}\quad \frac{U_{i,j+1} - 2U_{i,j} + U_{i,j-1}}{1/n^2}.$$

• Hence the partial differential equation can be approximated as follows:
$$\frac{U_{i+1,j} - 2U_{i,j} + U_{i-1,j}}{1/n^2} + \frac{U_{i,j+1} - 2U_{i,j} + U_{i,j-1}}{1/n^2} = f_{i,j}. \qquad (2.8)$$
Here

U_{i,j} = U(x_i, y_j) and f_{i,j} = f(x_i, y_j).

Therefore from Equation (2.8) we get (for n = 4, with the 3 × 3 interior grid ordered row by row)
$$\begin{pmatrix}
4 & -1 & 0 & -1 & 0 & 0 & 0 & 0 & 0\\
-1 & 4 & -1 & 0 & -1 & 0 & 0 & 0 & 0\\
0 & -1 & 4 & 0 & 0 & -1 & 0 & 0 & 0\\
-1 & 0 & 0 & 4 & -1 & 0 & -1 & 0 & 0\\
0 & -1 & 0 & -1 & 4 & -1 & 0 & -1 & 0\\
0 & 0 & -1 & 0 & -1 & 4 & 0 & 0 & -1\\
0 & 0 & 0 & -1 & 0 & 0 & 4 & -1 & 0\\
0 & 0 & 0 & 0 & -1 & 0 & -1 & 4 & -1\\
0 & 0 & 0 & 0 & 0 & -1 & 0 & -1 & 4
\end{pmatrix}
\begin{pmatrix}
U_{1,1}\\ U_{1,2}\\ U_{1,3}\\ U_{2,1}\\ U_{2,2}\\ U_{2,3}\\ U_{3,1}\\ U_{3,2}\\ U_{3,3}
\end{pmatrix}
= -\frac{1}{n^2}\begin{pmatrix}
f_{1,1}\\ f_{1,2}\\ f_{1,3}\\ f_{2,1}\\ f_{2,2}\\ f_{2,3}\\ f_{3,1}\\ f_{3,2}\\ f_{3,3}
\end{pmatrix}. \qquad (2.9)$$

• By solving the above linear system, one may get an approximation of the solution of (2.7) at the points (x_i, y_j) ∈ [0, 1] × [0, 1].

• In general, to get better accuracy of the solution, we have to increase n, i.e., the size of the matrix system.

• In general, the problem is to solve a linear system Bx = b of size n² × n², where
$$B = \begin{pmatrix}
A_n & -I & & & 0\\
-I & A_n & -I & &\\
& \ddots & \ddots & \ddots &\\
& & -I & A_n & -I\\
0 & & & -I & A_n
\end{pmatrix}. \qquad (2.10)$$
Here I is the n × n identity matrix and
$$A_n = 2I + \begin{pmatrix}
2 & -1 & & & 0\\
-1 & 2 & -1 & &\\
& \ddots & \ddots & \ddots &\\
& & -1 & 2 & -1\\
0 & & & -1 & 2
\end{pmatrix}. \qquad (2.11)$$

• We observe that    2I −I 0   A − 2I 0  −I 2I −I   n     −     An 2I  B =  ......  +   .    ...   −I 2I −I  0 A − 2I 0 −I 2I n • −1 Is there an efficient way to diagonalize An (SnAnSn is a diagonal matrix)? • It can be shown that the eigenvalues( of) An are sπ 4 sin2 , s = 1, 2, . . . , n 2(n + 1) and the corresponding eigenvectors are respectively given by [ ( ) ( ) ( )] sπ 2sπ nsπ T sin sin ... sin . (n + 1) (n + 1) (n + 1) We will show it later. • It can also be shown that the eigenvalues can be obtained in O(n log(n)) operations −1 and both Snv and Sn v can be done in O(n log(n)) operations. MATH3602 Chapter 2 Advanced Direct Methods 34 • The strategy here is to consider   −  Dn + 2I I 0   − −  −1 −1  IDn + 2I I  Diag(Sn, ··· ,Sn)·B·Diag(S , ··· ,S ) =   . n n  ...... 

0 −IDn + 2I • There exists a permutation matrix P such that · ··· · · −1 ··· −1 · T ··· P Diag(Sn, ,Sn) B Diag(Sn , ,Sn ) P = Diag(T1,T2, ,Tn) where i = 1, 2, ··· , n   −  [Dn]ii + 2 1 0   − −   1 [Dn]ii + 2 1  Ti =   .  ...... 

0 −1 [Dn]ii + 2 MATH3602 Chapter 2 Advanced Direct Methods 35

• Solving n tri-diagonal systems requires O(n²) operations.

• Applying the permutation matrix P requires minimal operations.

• Applying the fast Sine transform n times requires O(n² log n) operations.

• Applying the inverse fast Sine transform n times requires O(n² log n) operations.

• The overall cost will be O(n² log n) operations; recall that there are n² unknowns to be solved.
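The whole strategy fits in a few lines with a fast sine transform; below is a hedged SciPy sketch (we use the orthonormal DST-I from `scipy.fft`, which applies the sine-eigenvector transform S_n up to a scaling; the function name `fast_poisson_solve` is our choice):

```python
import numpy as np
from scipy.fft import dstn, idstn

def fast_poisson_solve(F):
    """Solve (I x K + K x I) u = f, K = tridiag(-1, 2, -1) of order n,
    i.e. the n^2 x n^2 system B u = f of (2.10), with u, f as n x n grids.

    The orthonormal DST-I diagonalizes K, so the whole solve is two 2-D
    sine transforms plus one entrywise division: O(n^2 log n) in total.
    """
    n = F.shape[0]
    s = np.arange(1, n + 1)
    lam = 4 * np.sin(s * np.pi / (2 * (n + 1))) ** 2   # eigenvalues of K
    Ft = dstn(F, type=1, norm='ortho')                 # transform both axes
    Ut = Ft / (lam[:, None] + lam[None, :])            # eigenvalues lam_i+lam_j
    return idstn(Ut, type=1, norm='ortho')

# Usage: interior values f(x_i, y_j) on an n x n grid, spacing h = 1/(n+1)
n = 32
x = np.arange(1, n + 1) / (n + 1)
X, Y = np.meshgrid(x, x, indexing='ij')
F = -np.sin(X * Y) / (n + 1) ** 2     # right-hand side, scaled as in (2.9)
U = fast_poisson_solve(F)
```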

2.4.1 The Eigenvalues and Eigenvectors of An

• How do we get the eigenvalues and eigenvectors in closed form?

• We consider the relation A_nv = λv for the tridiagonal matrix A_n of Section 2.3; then we have
$$\begin{cases}
(2-\lambda)v_1 - v_2 = 0\\
-v_1 + (2-\lambda)v_2 - v_3 = 0\\
\quad\vdots\\
-v_{j-1} + (2-\lambda)v_j - v_{j+1} = 0\\
\quad\vdots\\
-v_{n-1} + (2-\lambda)v_n = 0.
\end{cases}$$

We define v0 = vn+1 = 0 then we have the difference equations:

$$-v_{j-1} + (2-\lambda)v_j - v_{j+1} = 0 \quad\text{for } j = 1, 2, \ldots, n.$$

• Using the standard result on difference equations, the solution is of the form:

$$v_j = Bm_1^j + Cm_2^j$$
where m_1 and m_2 are roots of the quadratic equation
$$-1 + (2-\lambda)m - m^2 = 0 \qquad (2.12)$$
and B and C are two constants.

• We remark that m_1 ≠ m_2, for if m_1 = m_2 the solution takes the form v_j = (B + Cj)m_1^j, but v_0 = v_{n+1} = 0 implies that
$$v_0 = 0 = B \quad\text{and}\quad v_{n+1} = 0 = C(n+1)m_1^{n+1},$$
and therefore B = C = 0.

• Now we have
$$v_0 = 0 = B + C \quad\text{and}\quad v_{n+1} = 0 = Bm_1^{n+1} + Cm_2^{n+1}.$$
Thus we have
$$B = -C \quad\text{and}\quad \left(\frac{m_1}{m_2}\right)^{n+1} = 1 = e^{2\pi i},$$
where i = √−1. By Equation (2.12), we have

$$m_1m_2 = 1.$$
Here we have
$$m_1 = e^{\frac{s\pi i}{n+1}} \quad\text{and}\quad m_2 = e^{-\frac{s\pi i}{n+1}}.$$

• By Equation (2.12) again, we have

$$m_1 + m_2 = 2 - \lambda,$$
and thus we have, for s = 1, 2, . . . , n,
$$\lambda_s = 2 - \left(e^{\frac{s\pi i}{n+1}} + e^{-\frac{s\pi i}{n+1}}\right)
= 2 - 2\cos\left(\frac{s\pi}{n+1}\right)
= 2 - 2\left(1 - 2\sin^2\left(\frac{s\pi}{2(n+1)}\right)\right)
= 4\sin^2\left(\frac{s\pi}{2(n+1)}\right).$$

• Finally we note that for λ_s we have, for j = 1, 2, . . . , n,
$$v_j = Bm_1^j + Cm_2^j = B\left(e^{\frac{js\pi i}{n+1}} - e^{-\frac{js\pi i}{n+1}}\right) = 2iB\sin\left(\frac{js\pi}{n+1}\right).$$

• As 2iB is just a constant, we may take
$$v_s = \left(\sin\left(\frac{s\pi}{n+1}\right),\ \sin\left(\frac{2s\pi}{n+1}\right),\ \cdots,\ \sin\left(\frac{sn\pi}{n+1}\right)\right)^T.$$
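A quick NumPy check of the closed-form eigenpairs (our sketch):

```python
import numpy as np

n = 8
K = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # tridiag(-1, 2, -1)
s = np.arange(1, n + 1)
lam = 4 * np.sin(s * np.pi / (2 * (n + 1))) ** 2       # claimed eigenvalues
j = np.arange(1, n + 1)
V = np.sin(np.outer(j, s) * np.pi / (n + 1))           # column s is v_s
assert np.allclose(K @ V, V * lam)                     # K v_s = lambda_s v_s
assert np.allclose(np.sort(lam), np.sort(np.linalg.eigvalsh(K)))
```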

• Now we see that there is an S_n such that S_n^{-1} · A_n · S_n = D_n, where D_n is a diagonal matrix.

• Similar to the case of the FFT, it can be shown that both S_nx and S_n^{-1}x can be computed in O(n log n) operations; this is called the discrete Fast Sine Transform.

2.4.2 A Short Note on Second-order Difference Equation

• Suppose you are given a difference equation:

$$x_{n+1} + ax_n + bx_{n-1} = 0, \quad n = 1, 2, \ldots. \qquad (2.13)$$
This is called a linear second-order difference equation. We would like to find solutions of the form x_n = A · α^n.

• Now we have A · (α^{n+1} + aα^n + bα^{n−1}) = 0.

• Assume A and α are not zero. This means that if α satisfies α² + aα + b = 0 (this is called the auxiliary equation), then

x_n = A · α^n can be a non-zero solution.

• Let m_1 and m_2 be two distinct roots; then one can see that
$$x_n = A\cdot m_1^n + B\cdot m_2^n$$
is a solution for any A and B. This is called a general solution of (2.13).

• To fix A and B, we need to know further information such as the values of x0 and x1, i.e., the exact values of two terms.

• If m_1 = m_2, then it can be shown that the general solution is
$$x_n = A\cdot m_1^n + B\cdot n\cdot m_1^n.$$

2.5 Tri-diagonal Linear Systems

• In applications, systems of equations often arise in which the coefficient matrix has a special structure.

• It is usually better to solve these systems using tailor-made algorithms that exploit the special structure.

• We consider one example of this, the tridiagonal system. Tridiagonal matrices arise in many practical problems, including

(i) Numerical approximation of one-dimensional partial differential equations.
(ii) Markov chain problems.

• Here is an example of a tri-diagonal matrix (the superdiagonal and subdiagonal are the entries just above and below the main diagonal):
$$\begin{pmatrix}
2 & -1 & 0 & 0\\
-1 & 2 & -1 & 0\\
0 & -1 & 2 & -1\\
0 & 0 & -1 & 2
\end{pmatrix}$$

Of course, in general, the nonzero entries need not be equal.

Definition 2.1. A matrix A = [a_{ij}] is said to be tridiagonal if

a_{ij} = 0 for all |i − j| > 1.

• Thus, in the ith row, only a_{i,i−1}, a_{i,i} and a_{i,i+1} can be different from 0. In practice, three vectors can be used to store the nonzero elements, which are arranged like this:
$$\begin{pmatrix}
a_{11} & a_{12} & & &\\
a_{21} & a_{22} & a_{23} & &\\
& a_{32} & a_{33} & \ddots &\\
& & \ddots & \ddots & a_{n-1,n}\\
& & & a_{n,n-1} & a_{n,n}
\end{pmatrix}$$
while zero entries are not stored.

• Suppose A is factorized into the form
$$LU = \begin{pmatrix}
1 & & & &\\
L_{21} & 1 & & &\\
& \ddots & \ddots & &\\
& & \ddots & \ddots &\\
& & & L_{n,n-1} & 1
\end{pmatrix}
\begin{pmatrix}
U_{11} & U_{12} & & &\\
& U_{22} & U_{23} & &\\
& & \ddots & \ddots &\\
& & & \ddots & U_{n-1,n}\\
& & & & U_{n,n}
\end{pmatrix}$$

• We note that the following U entries require no computation:

U11 = a11 and

U_{i−1,i} = a_{i−1,i} (i = 2, 3, ···, n).

• In general, we may compute as follows:

For k = 2, . . . , n,

Lk,k−1 = ak,k−1/Uk−1,k−1;

U_{kk} = a_{kk} − L_{k,k−1}U_{k−1,k};
end.

• The following table summarizes the operational cost.

                  ÷        ×        +/−
L_{21}, U_{22}    1        1        1
L_{32}, U_{33}    1        1        1
  ⋮               ⋮        ⋮        ⋮
L_{n,n−1}, U_{nn} 1        1        1
Total             n − 1    n − 1    n − 1

• The overall cost is thus reduced to O(n) instead of O(n³), and hence the cost of solving a tridiagonal linear system is also O(n). (Why? The forward and backward substitutions with the bidiagonal factors L and U also cost O(n).)
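A sketch of the O(n) factor-and-solve routine (our implementation, no pivoting; it assumes all pivots U_kk are nonzero, which holds e.g. for the diagonally dominant matrices of Section 2.12):

```python
import numpy as np

def tridiag_solve(sub, diag, sup, b):
    """Solve a tridiagonal system in O(n) via the LU factorization above.

    sub[k]  = a_{k+1,k}  (length n-1, subdiagonal)
    diag[k] = a_{k,k}    (length n)
    sup[k]  = a_{k,k+1}  (length n-1, superdiagonal; equals U_{k,k+1})
    """
    n = len(diag)
    U = np.array(diag, dtype=float)      # U_{kk}
    L = np.zeros(n - 1)                  # L_{k+1,k}
    y = np.array(b, dtype=float)
    for k in range(1, n):                # factorization + forward substitution
        L[k - 1] = sub[k - 1] / U[k - 1]
        U[k] = diag[k] - L[k - 1] * sup[k - 1]
        y[k] -= L[k - 1] * y[k - 1]
    x = np.empty(n)                      # backward substitution
    x[-1] = y[-1] / U[-1]
    for k in range(n - 2, -1, -1):
        x[k] = (y[k] - sup[k] * x[k + 1]) / U[k]
    return x

# Example: the matrix tridiag(-1, 2, -1) of Section 2.3
n = 5
x = tridiag_solve(-np.ones(n - 1), 2 * np.ones(n), -np.ones(n - 1), np.ones(n))
```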

2.6 Tri-diagonal Systems: Markov Chains

• Consider a random walker performing a random walk on a straight line with grid points 1, 2, . . . , n.

• At each time, the random walker has a probability of p to move one-step forward and a probability of 1 − p to move one-step backward. Here 0 < p < 1.

• At grid point 1, with probability 1 the walker will move to grid point 2, while at grid point n, with probability 1 the walker will move to grid point n − 1.

• The process can be described by an n × n one-step column-stochastic transition probability matrix A as follows:
$$A = \begin{pmatrix}
0 & 1-p & & & & &\\
1 & 0 & 1-p & & & &\\
& p & 0 & 1-p & & &\\
& & p & \ddots & \ddots & &\\
& & & \ddots & 0 & 1-p &\\
& & & & p & 0 & 1\\
& & & & & p & 0
\end{pmatrix}$$

• Let x be the stationary probability distribution of the process. Here x_i is the long-run probability that the random walker is found at grid point i. It can be shown that
$$Ax = x \quad\text{and}\quad \sum_{i=1}^{n} x_i = 1.$$

• The solution strategy is to solve the following linear system (A − I)x = 0:
$$\begin{pmatrix}
-1 & 1-p & & & & &\\
1 & -1 & 1-p & & & &\\
& p & -1 & 1-p & & &\\
& & p & \ddots & \ddots & &\\
& & & \ddots & -1 & 1-p &\\
& & & & p & -1 & 1\\
& & & & & p & -1
\end{pmatrix}
\begin{pmatrix} x_1\\ x_2\\ \vdots\\ \vdots\\ \vdots\\ x_{n-1}\\ x_n \end{pmatrix}
= \begin{pmatrix} 0\\ 0\\ \vdots\\ \vdots\\ \vdots\\ 0\\ 0 \end{pmatrix}$$

• The above matrix is singular, and it can be shown that its rank is n − 1 (Exercise).

• Write x_i in terms of x_1 for i = 2, 3, . . . , n.

• Then, by using the fact that Σ_{i=1}^{n} x_i = 1, x_1 can be solved and hence x.

• This is called the matrix geometric method. It can be extended to the case of block matrices.

• Now from the first equation, x_2 = (1 − p)^{-1}x_1.
From Equation 2, x_1 − x_2 + (1 − p)x_3 = 0, so x_3 = p(1 − p)^{-2}x_1.
From Equation 3, px_2 − x_3 + (1 − p)x_4 = 0, so x_4 = p²(1 − p)^{-3}x_1.
⋮
From Equation n − 2, px_{n−3} − x_{n−2} + (1 − p)x_{n−1} = 0, so x_{n−1} = p^{n−3}(1 − p)^{−n+2}x_1.
From Equation n − 1, px_{n−2} − x_{n−1} + x_n = 0, so
$$x_n = p^{n-3}(1-p)^{-n+2}x_1 - p^{n-3}(1-p)^{-n+3}x_1 = p^{n-2}(1-p)^{-n+2}x_1.$$

• For p ≠ 0.5 (p = 0.5 is left as an exercise), we have
$$\sum_{i=1}^{n} x_i = x_1\left[1 + (1-p)^{-1}\left(1 + p(1-p)^{-1} + \cdots + p^{n-3}(1-p)^{-n+3}\right) + p^{n-2}(1-p)^{-n+2}\right] = 1,$$
and summing the geometric series gives
$$x_1 = \left[1 + \left(\frac{p}{1-p}\right)^{n-2} + \frac{1 - \left(\frac{p}{1-p}\right)^{n-2}}{1-2p}\right]^{-1}.$$

• Its computational cost is O(n).
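A sketch that builds x from the recursion above (assuming n ≥ 4 and p ≠ 0.5) and checks x_1 against the closed form:

```python
import numpy as np

def stationary_walk(n, p):
    """Stationary distribution of the random walk on {1,..,n} in O(n).

    Assumes n >= 4.  Works relative to x_1 and normalizes at the end.
    """
    x = np.empty(n)
    x[0] = 1.0
    x[1] = 1.0 / (1 - p)                 # equation 1
    x[2] = (x[1] - x[0]) / (1 - p)       # equation 2 (full mass leaves point 1)
    for i in range(3, n - 1):            # equations 3..n-2
        x[i] = (x[i - 1] - p * x[i - 2]) / (1 - p)
    x[n - 1] = x[n - 2] - p * x[n - 3]   # equation n-1 (full mass leaves point n)
    return x / x.sum()

x = stationary_walk(10, 0.3)
r = 0.3 / 0.7
x1 = 1.0 / (1 + r ** 8 + (1 - r ** 8) / (1 - 2 * 0.3))
assert np.isclose(x[0], x1)
```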

• Consider solving the following tri-diagonal block transition probability matrix system of infinite size:
$$P\begin{pmatrix} x_0\\ x_1\\ x_2\\ x_3\\ \vdots \end{pmatrix}
= \begin{pmatrix}
B_0 & B_1 & & &\\
C & A & D & &\\
& C & A & D &\\
& & C & A & D\\
& & & \ddots & \ddots & \ddots
\end{pmatrix}
\begin{pmatrix} x_0\\ x_1\\ x_2\\ x_3\\ \vdots \end{pmatrix} = 0,$$
where each block is of size n × n and x_i is an n × 1 vector.

• We assume that there exists an n × n matrix R such that

xi+1 = Rxi, i = 0, 1, 2,....

Therefore we have x_i = R^i x_0, i = 1, 2, . . . .

• Now we have to solve

B0x0 + B1x1 = (B0 + B1R)x0 = 0 and

$$Cx_i + Ax_{i+1} + Dx_{i+2} = (C + AR + DR^2)R^ix_0 = 0, \quad i = 0, 1, 2, \ldots.$$

• This is equivalent to finding an n × n matrix R such that C + AR + DR² = 0; then we solve

(B_0 + B_1R)x_0 = 0 for x_0 and hence x_i = R^i x_0. Finally, normalize the whole vector to get the distribution.

Example: Consider

B_0 = −0.2I_2, B_1 = 0.3I_2, C = 0.2I_2, D = 0.3I_2, and
$$A = \begin{pmatrix} -0.6 & 0.2\\ 0.1 & -0.7 \end{pmatrix},$$
so that
$$P = \begin{pmatrix}
-0.2 & 0.0 & 0.3 & 0.0 & & &\\
0 & -0.2 & 0.0 & 0.3 & & &\\
0.2 & 0.0 & -0.6 & 0.2 & 0.3 & 0.0 &\\
0.0 & 0.2 & 0.1 & -0.7 & 0.0 & 0.3 &\\
& & 0.2 & 0.0 & -0.6 & 0.2 & \ddots\\
& & 0.0 & 0.2 & 0.1 & -0.7 & \ddots\\
& & & & \ddots & \ddots & \ddots
\end{pmatrix}.$$

• Now the key step is to solve for R such that 0.2I_2 + AR + 0.3R² = 0, which we rewrite as R = −A^{-1}(0.2I_2 + 0.3R²), where
$$-A^{-1} = \begin{pmatrix} 1.75 & 0.50\\ 0.25 & 1.50 \end{pmatrix}.$$

• We develop an iterative scheme:
$$R_{n+1} = -A^{-1}(0.2I_2 + 0.3R_n^2) = \begin{pmatrix} 0.35 & 0.10\\ 0.05 & 0.30 \end{pmatrix} + \begin{pmatrix} 0.525 & 0.150\\ 0.075 & 0.450 \end{pmatrix}R_n^2, \quad n = 0, 1, \ldots,$$
and we may use the matrix R_0 = [1 1]^T[1/3 1/3] as an initial guess. It eventually converges to
$$R = \begin{pmatrix} 0.5375 & 0.2583\\ 0.1291 & 0.4084 \end{pmatrix}.$$
Then we solve
$$(B_0 + B_1R)x_0 = \begin{pmatrix} -0.0387 & 0.0775\\ 0.0387 & -0.0775 \end{pmatrix}x_0 = 0$$
and get x_0 = [1 0.5]^T.
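A NumPy sketch of the iteration (the tolerance, iteration cap, and the eigenvector trick for the null space of B_0 + B_1R are our choices):

```python
import numpy as np

B0, B1 = -0.2 * np.eye(2), 0.3 * np.eye(2)
C, D = 0.2 * np.eye(2), 0.3 * np.eye(2)
A = np.array([[-0.6, 0.2], [0.1, -0.7]])
Ainv = np.linalg.inv(-A)                      # -A^{-1}

R = np.full((2, 2), 1 / 3)                    # initial guess R_0
for _ in range(500):                          # R <- -A^{-1}(0.2I + 0.3R^2)
    R_new = Ainv @ (0.2 * np.eye(2) + 0.3 * (R @ R))
    if np.max(np.abs(R_new - R)) < 1e-12:
        break
    R = R_new
assert np.allclose(C + A @ R + D @ (R @ R), 0, atol=1e-10)

# x_0 spans the null space of B0 + B1 R: eigenvector for eigenvalue ~0
M = B0 + B1 @ R
w, V = np.linalg.eig(M)
x0 = np.real(V[:, np.argmin(np.abs(w))])
x0 /= x0[0]                                   # scale so that x0 = [1, 0.5]^T
print(np.round(R, 4), x0)
```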

2.7 Gaussian Elimination

• The LU factorization of a matrix is indeed an abstract version of Gaussian elimination. The traditional form of Gaussian elimination will be described and related to the abstract form here.

Consider
$$A^{(1)}x = \begin{pmatrix}
6 & -2 & 2 & 4\\
12 & -8 & 6 & 10\\
3 & -13 & 9 & 3\\
-6 & 4 & 1 & -18
\end{pmatrix}
\begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{pmatrix}
= \begin{pmatrix} 12\\ 34\\ 27\\ -38 \end{pmatrix} \qquad (1)$$

• First step of the Gaussian algorithm:

◦ Subtract 2 × the 1st equation from the second.
◦ Subtract ½ × the 1st equation from the third.
◦ Subtract −1 × the 1st equation from the fourth.

• Then we get
$$A^{(2)}x = \begin{pmatrix}
6 & -2 & 2 & 4\\
0 & -4 & 2 & 2\\
0 & -12 & 8 & 1\\
0 & 2 & 3 & -14
\end{pmatrix}
\begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{pmatrix}
= \begin{pmatrix} 12\\ 10\\ 21\\ -26 \end{pmatrix} \qquad (2)$$

• The numbers 2, ½, −1 are called multipliers.
• The number 6 used as the divisor in forming each of these multipliers is called the pivot element.
• The first row is used in the process but it is not changed. We refer to it as the pivot row in the first step.
• We repeat a similar process with Row 2 and then Row 3.

• Applying the process to the second row (as pivot row), we get
$$A^{(3)}x = \begin{pmatrix}
6 & -2 & 2 & 4\\
0 & -4 & 2 & 2\\
0 & 0 & 2 & -5\\
0 & 0 & 4 & -13
\end{pmatrix}
\begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{pmatrix}
= \begin{pmatrix} 12\\ 10\\ -9\\ -21 \end{pmatrix} \qquad (3)$$

where −4 is the pivot element and the multipliers are 3 and −1/2.

• Finally we apply the process to the third row:
$$A^{(4)}x = \begin{pmatrix}
6 & -2 & 2 & 4\\
0 & -4 & 2 & 2\\
0 & 0 & 2 & -5\\
0 & 0 & 0 & -3
\end{pmatrix}
\begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{pmatrix}
= \begin{pmatrix} 12\\ 10\\ -9\\ -3 \end{pmatrix} \qquad (4)$$
where 2 is the pivot element and the multiplier is also 2.

• Solving this upper triangular linear system, we get x = [1 −3 −2 1]^T. Linear systems (1), (2), (3) and (4) are equivalent in the sense that they have the same solution x.

• Note that the Gaussian elimination process is equivalent to an LU factorization:
$$\underbrace{\begin{pmatrix}
6 & -2 & 2 & 4\\
12 & -8 & 6 & 10\\
3 & -13 & 9 & 3\\
-6 & 4 & 1 & -18
\end{pmatrix}}_{A}
= \underbrace{\begin{pmatrix}
1 & 0 & 0 & 0\\
2 & 1 & 0 & 0\\
\frac{1}{2} & 3 & 1 & 0\\
-1 & -\frac{1}{2} & 2 & 1
\end{pmatrix}}_{L}
\underbrace{\begin{pmatrix}
6 & -2 & 2 & 4\\
0 & -4 & 2 & 2\\
0 & 0 & 2 & -5\\
0 & 0 & 0 & -3
\end{pmatrix}}_{U}$$
where each multiplier in L is written in the location corresponding to the 0 entry in the matrix it was responsible for creating (the multipliers from steps (1)→(2), (2)→(3) and (3)→(4) fill columns 1, 2 and 3 of L respectively). This result can be generalized to any n × n matrix.

• The equivalence is clear if we consider the reverse process to obtain A from U and regard L as the matrix corresponding to the row operations.

• Interpretation I: Let us denote the rows of A by A_1, A_2, A_3 and A_4, and similarly for the rows of L and U. Then
$$\begin{pmatrix} A_1\\ A_2\\ A_3\\ A_4 \end{pmatrix} \equiv
\begin{pmatrix}
6 & -2 & 2 & 4\\
12 & -8 & 6 & 10\\
3 & -13 & 9 & 3\\
-6 & 4 & 1 & -18
\end{pmatrix}
= \begin{pmatrix}
1 & 0 & 0 & 0\\
2 & 1 & 0 & 0\\
\frac{1}{2} & 3 & 1 & 0\\
-1 & -\frac{1}{2} & 2 & 1
\end{pmatrix}
\begin{pmatrix}
6 & -2 & 2 & 4\\
0 & -4 & 2 & 2\\
0 & 0 & 2 & -5\\
0 & 0 & 0 & -3
\end{pmatrix}
\equiv \begin{pmatrix} L_1\\ L_2\\ L_3\\ L_4 \end{pmatrix}
\begin{pmatrix} U_1\\ U_2\\ U_3\\ U_4 \end{pmatrix}$$

• Since A_1 is not changed at all, A_1 = U_1. So we must have

L1 = [1 0 0 0]

• The elimination process gives U2 = A2 − 2A1 and so

A2 = 2A1 + U2 = 2U1 + U2

Hence the coefficients 2 and 1 occupy L_2, i.e.,

L_2 = [2 1 0 0].

• Similarly, the row operations leading to Row 3 are
$$U_3 = \left(A_3 - \tfrac{1}{2}A_1\right) - 3U_2$$
and we have
$$A_3 = \tfrac{1}{2}A_1 + 3U_2 + U_3 = \tfrac{1}{2}U_1 + 3U_2 + U_3.$$
The coefficients ½, 3 and 1 must therefore occupy L_3 as follows:
$$L_3 = [\tfrac{1}{2}\ \ 3\ \ 1\ \ 0].$$

• Finally, we have
$$U_4 = (A_4 + A_1) + \tfrac{1}{2}U_2 - 2U_3,$$
so
$$A_4 = -U_1 - \tfrac{1}{2}U_2 + 2U_3 + U_4$$
and
$$L_4 = [-1\ \ -\tfrac{1}{2}\ \ 2\ \ 1].$$

• Interpretation II: The connection between the multipliers in the matrix L and Gaussian elimination can also be revealed directly by defining appropriate elimination matrices M^{(i)} in each elimination step.

• Recall that
$$A = A^{(1)} = \begin{pmatrix}
6 & -2 & 2 & 4\\
12 & -8 & 6 & 10\\
3 & -13 & 9 & 3\\
-6 & 4 & 1 & -18
\end{pmatrix}.$$

• The row operations in the first step,

◦ Subtract 2 × the 1st equation from the second.
◦ Subtract ½ × the 1st equation from the third.
◦ Subtract −1 × the 1st equation from the fourth.
can be represented by the elimination matrix M^{(1)} such that

$$M^{(1)}A^{(1)} = \begin{pmatrix}
1 & 0 & 0 & 0\\
-2 & 1 & 0 & 0\\
-\frac{1}{2} & 0 & 1 & 0\\
-(-1) & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix}
6 & -2 & 2 & 4\\
12 & -8 & 6 & 10\\
3 & -13 & 9 & 3\\
-6 & 4 & 1 & -18
\end{pmatrix}
= \begin{pmatrix}
6 & -2 & 2 & 4\\
0 & -4 & 2 & 2\\
0 & -12 & 8 & 1\\
0 & 2 & 3 & -14
\end{pmatrix} = A^{(2)}$$

• We note that M^{(1)} contains the negated multipliers, each in the location corresponding to the 0 entry in the matrix A^{(2)} it was responsible for creating.

• Applying the process to A(2), we get

$$M^{(2)}A^{(2)} = \begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & -3 & 1 & 0\\
0 & \frac{1}{2} & 0 & 1
\end{pmatrix}
\begin{pmatrix}
6 & -2 & 2 & 4\\
0 & -4 & 2 & 2\\
0 & -12 & 8 & 1\\
0 & 2 & 3 & -14
\end{pmatrix}
= \begin{pmatrix}
6 & -2 & 2 & 4\\
0 & -4 & 2 & 2\\
0 & 0 & 2 & -5\\
0 & 0 & 4 & -13
\end{pmatrix} = A^{(3)}$$

• Finally we apply the process to A^{(3)} and get

$$M^{(3)}A^{(3)} = \begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & -2 & 1
\end{pmatrix}
\begin{pmatrix}
6 & -2 & 2 & 4\\
0 & -4 & 2 & 2\\
0 & 0 & 2 & -5\\
0 & 0 & 4 & -13
\end{pmatrix}
= \begin{pmatrix}
6 & -2 & 2 & 4\\
0 & -4 & 2 & 2\\
0 & 0 & 2 & -5\\
0 & 0 & 0 & -3
\end{pmatrix} = A^{(4)} = U$$

• Thus the matrix A is transformed to the upper triangular matrix U by a sequence of multiplications with elimination matrices:

$$M^{(3)}M^{(2)}M^{(1)}A = U.$$

• Since the M^{(i)} are nonsingular, we have
$$A = (M^{(1)})^{-1}(M^{(2)})^{-1}(M^{(3)})^{-1}U.$$

• The matrix (M^{(1)})^{-1}(M^{(2)})^{-1}(M^{(3)})^{-1} must be lower triangular, and
$$(M^{(1)})^{-1}(M^{(2)})^{-1}(M^{(3)})^{-1} = \begin{pmatrix}
1 & 0 & 0 & 0\\
2 & 1 & 0 & 0\\
\frac{1}{2} & 3 & 1 & 0\\
-1 & -\frac{1}{2} & 2 & 1
\end{pmatrix} = L.$$

• We note that the evaluation of (M^{(1)})^{-1}(M^{(2)})^{-1}(M^{(3)})^{-1} requires no computational cost, and this is true for general elimination matrices (elementary row operations).

• We note that (M^{(i)})^{-1} may be obtained from M^{(i)} by flipping the sign of the multipliers, e.g. (verify it),
$$(M^{(2)})^{-1} = \begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & -m_{32} & 1 & 0\\
0 & -m_{42} & 0 & 1
\end{pmatrix}^{-1}
= \begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & m_{32} & 1 & 0\\
0 & m_{42} & 0 & 1
\end{pmatrix}.$$

• And it is easy to verify that
$$(M^{(2)})^{-1}(M^{(3)})^{-1} = \begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & m_{32} & 1 & 0\\
0 & m_{42} & 0 & 1
\end{pmatrix}
\begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & 0 & 1 & 0\\
0 & 0 & m_{43} & 1
\end{pmatrix}
= \begin{pmatrix}
1 & 0 & 0 & 0\\
0 & 1 & 0 & 0\\
0 & m_{32} & 1 & 0\\
0 & m_{42} & m_{43} & 1
\end{pmatrix}.$$

2.8 Gaussian Elimination and LU Factorization

To describe formally the process of the Gaussian algorithm, we interpret it as a sequence of (n − 1) major steps:
$$A = A^{(1)} \rightarrow A^{(2)} \rightarrow \cdots \rightarrow A^{(n)}$$

• Clearly we have
$$A = A^{(1)} = \begin{pmatrix}
a_{11}^{(1)} & a_{12}^{(1)} & \cdots & a_{1n}^{(1)}\\
a_{21}^{(1)} & a_{22}^{(1)} & \cdots & a_{2n}^{(1)}\\
\vdots & \vdots & & \vdots\\
a_{n1}^{(1)} & a_{n2}^{(1)} & \cdots & a_{nn}^{(1)}
\end{pmatrix}
\quad\text{and}\quad
A^{(2)} = \begin{pmatrix}
a_{11}^{(2)} & a_{12}^{(2)} & \cdots & a_{1n}^{(2)}\\
0 & a_{22}^{(2)} & \cdots & a_{2n}^{(2)}\\
\vdots & \vdots & & \vdots\\
0 & a_{n2}^{(2)} & \cdots & a_{nn}^{(2)}
\end{pmatrix}.$$

• The matrix A^{(2)} is obtained from A^{(1)} by
$$\begin{aligned}
&\circ\ a_{1j}^{(2)} = a_{1j}^{(1)}, \quad j = 1, 2, \cdots, n;\\
&\circ\ a_{i1}^{(2)} = 0, \quad i = 2, 3, \cdots, n;\\
&\circ\ a_{ij}^{(2)} = a_{ij}^{(1)} - \frac{a_{i1}^{(1)}}{a_{11}^{(1)}}a_{1j}^{(1)}, \quad i = 2, \cdots, n,\ j = 2, \cdots, n.
\end{aligned}$$

• In general, the matrix A^{(k+1)} is obtained from A^{(k)} by producing 0's in column k below the pivot element a_{kk}^{(k)}.

• We need to subtract multiples of row k from the rows beneath it. Rows 1, 2, ···, k are not altered:
$$A^{(k)} = \begin{pmatrix}
a_{11}^{(k)} & \cdots & a_{1,k-1}^{(k)} & a_{1k}^{(k)} & \cdots & a_{1n}^{(k)}\\
& \ddots & \vdots & \vdots & & \vdots\\
& & a_{k-1,k-1}^{(k)} & a_{k-1,k}^{(k)} & \cdots & a_{k-1,n}^{(k)}\\
0 & \cdots & 0 & a_{k,k}^{(k)} & \cdots & a_{kn}^{(k)}\\
0 & \cdots & 0 & a_{k+1,k}^{(k)} & \cdots & a_{k+1,n}^{(k)}\\
\vdots & & \vdots & \vdots & & \vdots\\
0 & \cdots & 0 & a_{n,k}^{(k)} & \cdots & a_{n,n}^{(k)}
\end{pmatrix}$$

• The formula is therefore
$$a_{ij}^{(k+1)} = \begin{cases}
a_{ij}^{(k)} & \text{if } i \le k,\\
0 & \text{if } i \ge k+1 \text{ and } j \le k,\\
a_{ij}^{(k)} - \dfrac{a_{ik}^{(k)}}{a_{kk}^{(k)}}\times a_{kj}^{(k)} & \text{if } i \ge k+1 \text{ and } j \ge k+1.
\end{cases}$$

• Finally we set U = A^{(n)} and define L by
$$L_{ik} = \begin{cases}
a_{ik}^{(k)}/a_{kk}^{(k)} & \text{if } i \ge k+1;\\
1 & \text{if } i = k;\\
0 & \text{if } i \le k-1.
\end{cases}$$

• We want to obtain A = LU, the LU factorization, by Gaussian elimination.

• We note that the Gaussian elimination process will break down if any of the pivot elements are 0. Now we can prove the following theorem.

Theorem 2.4. If all the pivot elements a_{kk}^{(k)} are non-zero in the Gaussian elimination process, then we have A = LU.

Proof. We first note that
(i) U_{kj} = 0 if k > j;  (iii) U_{kj} = a_{kj}^{(n)} = a_{kj}^{(k)};
(ii) L_{ik} = 0 if k > i;  (iv) a_{ij}^{(k+1)} = a_{ij}^{(k)} if i ≤ k or j ≤ k − 1.

• When i ≤ j, we have
$$\begin{aligned}
(LU)_{ij} &= \sum_{k=1}^{n} L_{ik}U_{kj} = \sum_{k=1}^{i} L_{ik}a_{kj}^{(k)} && (L_{ik} = 0,\ k = i+1, \ldots, n)\\
&= \sum_{k=1}^{i-1} L_{ik}a_{kj}^{(k)} + a_{ij}^{(i)} && (L_{ii} = 1)\\
&= \sum_{k=1}^{i-1} \frac{a_{ik}^{(k)}}{a_{kk}^{(k)}}a_{kj}^{(k)} + a_{ij}^{(i)} && (L_{ik} = a_{ik}^{(k)}/a_{kk}^{(k)})\\
&= \sum_{k=1}^{i-1}\left\{a_{ij}^{(k)} - a_{ij}^{(k+1)}\right\} + a_{ij}^{(i)} && \left(a_{ij}^{(k+1)} = a_{ij}^{(k)} - \frac{a_{ik}^{(k)}}{a_{kk}^{(k)}}\times a_{kj}^{(k)}\right)\\
&= a_{ij}^{(1)} = a_{ij} && \text{(telescoping sum)}.
\end{aligned}$$

• When i > j, we have
$$\begin{aligned}
(LU)_{ij} &= \sum_{k=1}^{n} L_{ik}U_{kj} = \sum_{k=1}^{j} L_{ik}U_{kj} && (U_{kj} = 0 \text{ if } k > j)\\
&= \sum_{k=1}^{j} L_{ik}a_{kj}^{(k)} = \sum_{k=1}^{j} \frac{a_{ik}^{(k)}}{a_{kk}^{(k)}}a_{kj}^{(k)}\\
&= \sum_{k=1}^{j}\left\{a_{ij}^{(k)} - a_{ij}^{(k+1)}\right\} = a_{ij}^{(1)} - a_{ij}^{(j+1)}\\
&= a_{ij}^{(1)} = a_{ij},
\end{aligned}$$
because a_{ij}^{(k)} = 0 if i ≥ j + 1 and k ≥ j + 1, so in particular a_{ij}^{(j+1)} = 0.

2.8.1 Operational Cost of Gaussian Elimination

• Recall the sequence of n × n matrices:
$$A = A^{(1)} \rightarrow A^{(2)} \rightarrow \cdots \rightarrow A^{(n)} = U.$$

• The number of arithmetic operations in each major step is summarized below:

                   ÷            ×                +/−              Total
A^{(1)} → A^{(2)}  n − 1        n(n − 1)         n(n − 1)         2n² − (n + 1)
A^{(2)} → A^{(3)}  n − 2        (n − 1)(n − 2)   (n − 1)(n − 2)   2(n − 1)² − n
  ⋮                ⋮            ⋮                ⋮                ⋮
A^{(n−1)} → A^{(n)} 1           2 × 1            2 × 1            2 × 2² − 3
Total              n(n − 1)/2   n(n² − 1)/3      n(n² − 1)/3      ≈ (2/3)n³

• Knowing the operation counts of the first major step is sufficient since the remaining steps are just a repetition of step 1 on smaller and smaller matrices.

2.9 Hessenberg Linear Systems

• A Hessenberg matrix is one that is “almost” triangular. To be exact,

◦ an upper Hessenberg matrix has zero entries below the first subdiagonal, and ◦ a lower Hessenberg matrix has zero entries above the first superdiagonal.

• For example: Upper Hessenberg Lower Hessenberg 1 4 2 3 1 2 0 0     3 4 1 7 5 2 3 0     0 2 3 4 3 4 3 7 0 0 1 3 5 6 1 1 MATH3602 Chapter 2 Advanced Direct Methods 74 • If the constraints of a problem do not allow a general matrix to be conveniently reduced to a triangular one, a reduction to Hessenberg form is often the next best thing. • In particular, many eigenvalue algorithms reduce their input matrix to Hessenberg form as a first step.

Definition 2.2. A matrix H is said to be a lower Hessenberg matrix if it takes the following form:
$$H = \begin{pmatrix}
h_{11} & h_{12} & 0 & \cdots & 0\\
h_{21} & h_{22} & h_{23} & \ddots & \vdots\\
\vdots & & \ddots & \ddots & 0\\
h_{n-1,1} & \cdots & \ddots & h_{n-1,n-1} & h_{n-1,n}\\
h_{n1} & h_{n2} & \cdots & h_{n,n-1} & h_{nn}
\end{pmatrix}$$
and H^T is called an upper Hessenberg matrix.

• Gaussian elimination for a Hessenberg matrix can be simplified, as for a tridiagonal matrix.

• Here we shall show that a lower Hessenberg matrix H can be turned into a lower triangular matrix with O(n²) operations by a “reversed” Gaussian algorithm.

• Multiplying the nth row by h_{n−1,n}/h_{nn} and subtracting the resulting row from the (n − 1)th row, we get
$$H^{(1)} = \begin{pmatrix}
h_{11}^{(1)} & h_{12}^{(1)} & 0 & \cdots & 0\\
h_{21}^{(1)} & h_{22}^{(1)} & h_{23}^{(1)} & \ddots & \vdots\\
\vdots & & \ddots & \ddots & 0\\
h_{n-1,1}^{(1)} & \cdots & \cdots & h_{n-1,n-1}^{(1)} & 0\\
h_{n1}^{(1)} & h_{n2}^{(1)} & \cdots & h_{n,n-1}^{(1)} & h_{nn}^{(1)}
\end{pmatrix}$$
where
$$h_{n-1,j}^{(1)} = h_{n-1,j} - \frac{h_{n-1,n}}{h_{nn}}h_{nj} \quad (j = 1, 2, \cdots, n-1),$$
while the other nonzero entries are not altered: h_{ij}^{(1)} = h_{ij} for i ≠ n − 1. The cost is 2(n − 1) operations.

• The remaining steps to obtain H^{(2)} up to H^{(n−1)}, which must be a lower triangular matrix, are just a repetition of the first step on smaller and smaller matrices.

• Hence the overall cost is
$$2[(n-1) + (n-2) + \cdots + 1] = n(n-1) = O(n^2).$$

• Since H^{(n−1)} is a lower triangular matrix, by applying the same row operations to b and then using forward substitution, the linear system can be solved in O(n²) operations.
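A sketch of the reversed elimination for a lower Hessenberg system (our naming and implementation; the same row operations are applied to b, then forward substitution finishes in O(n²)):

```python
import numpy as np

def solve_lower_hessenberg(H, b):
    """Solve H x = b for lower Hessenberg H in O(n^2) by eliminating the
    superdiagonal from the bottom row upward ("reversed" Gaussian
    algorithm), then forward substitution.  No pivoting is performed.
    """
    H = np.array(H, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    for i in range(n - 1, 0, -1):          # kill entry H[i-1, i] using row i
        m = H[i - 1, i] / H[i, i]
        H[i - 1, :i + 1] -= m * H[i, :i + 1]
        b[i - 1] -= m * b[i]
    x = np.empty(n)                        # forward substitution on the
    for i in range(n):                     # resulting lower triangular system
        x[i] = (b[i] - H[i, :i] @ x[:i]) / H[i, i]
    return x

H = np.array([[1., 2, 0, 0], [5, 2, 3, 0], [3, 4, 3, 7], [5, 6, 1, 1]])
b = np.ones(4)
assert np.allclose(H @ solve_lower_hessenberg(H, b), b)
```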

2.10 Problems in Pivoting

The Gaussian algorithm, in the simple form just described, is not satisfactory since it fails on systems that are in fact easy to solve.

To illustrate, consider the following examples.

(1)
$$\begin{pmatrix} 0 & 1\\ 1 & 1 \end{pmatrix}\begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 1\\ 2 \end{pmatrix}$$

The pivot element is 0. The Gaussian algorithm fails because there is no way to eliminate the coefficient 1 of x_1 in the second equation.

(2)
$$\begin{pmatrix} 10^{-100} & 1\\ 1 & 1 \end{pmatrix}\begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 2\\ 3 \end{pmatrix}$$

• Suppose that in a computing machine, any x ≥ 10^{100} is recognized as 10^{100}, and any x ≤ −10^{100} is recognized as −10^{100}.

• The problem still persists when the pivot element is nonzero but small. After Gaussian elimination, we get
$$\begin{pmatrix} 10^{-100} & 1\\ 0 & 1 - 10^{100} \end{pmatrix}\begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 2\\ 3 - 2\times 10^{100} \end{pmatrix} \approx \begin{pmatrix} 2\\ -10^{100} \end{pmatrix}$$

• The computed solution is
$$\begin{cases}
x_2 = (-10^{100})/(1 - 10^{100}) \approx 1,\\
x_1 = (2 - \underbrace{x_2}_{\approx 1})\times 10^{100} \approx 10^{100}.
\end{cases}$$

• The correct solution is x_1 ≈ 1 and x_2 ≈ 2, so the computed solution is inaccurate.

Question: How do we solve these problems?

Solution:

(1) Interchange the two rows:
$$\begin{pmatrix} 1 & 1\\ 0 & 1 \end{pmatrix}\begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 2\\ 1 \end{pmatrix}$$

=⇒ x2 = 1 and x1 = 1, the correct answer.

(2) Can we solve the problem by scaling up the entry (multiply the first row by 10^{100})?
$$\begin{pmatrix} 1 & 10^{100}\\ 1 & 1 \end{pmatrix}\begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 2\times 10^{100}\\ 3 \end{pmatrix} \approx \begin{pmatrix} 10^{100}\\ 3 \end{pmatrix}$$
$$\Rightarrow \begin{pmatrix} 1 & 10^{100}\\ 0 & 1 - 10^{100} \end{pmatrix}\begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 10^{100}\\ 3 - 10^{100} \end{pmatrix}$$

• The solution is
$$\begin{cases}
x_2 = (3 - 10^{100})/(1 - 10^{100}) \approx 1,\\
x_1 = 10^{100} - x_2\times 10^{100} \approx 0.
\end{cases}$$
Again the computed solution is wrong.

• How about interchanging the two rows?
$$\begin{pmatrix} 1 & 1\\ 10^{-100} & 1 \end{pmatrix}\begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 3\\ 2 \end{pmatrix}
\Rightarrow \begin{pmatrix} 1 & 1\\ 0 & 1 - 10^{-100} \end{pmatrix}\begin{pmatrix} x_1\\ x_2 \end{pmatrix} = \begin{pmatrix} 3\\ 2 - 3\times 10^{-100} \end{pmatrix}$$

• The solution is then correct:
$$\begin{cases}
x_2 = (2 - 3\times 10^{-100})/(1 - 10^{-100}) \approx 2,\\
x_1 = 3 - 1\times x_2 \approx 1.
\end{cases}$$

2.11 Gaussian Elimination with Scaled Row Pivoting

The conclusion to be drawn from the previous examples is that choosing a suitable pivot row, interchanging two rows to avoid a very small pivot element, is necessary (a wise step).

• We introduce a modified Gaussian algorithm with scaled row pivoting. Consider
$$A = \begin{pmatrix}
10^{-9} & -1 & 2\\
1 & 2 & -4\\
10 & 0 & -1
\end{pmatrix}
\begin{matrix}
\leftarrow r_1 = \max\{10^{-9}, 1, 2\} = 2\\
\leftarrow r_2 = \max\{1, 2, 4\} = 4\\
\leftarrow r_3 = \max\{10, 0, 1\} = 10
\end{matrix}$$

(1) To choose the first pivot row, we begin by computing the scale of each row

ri = max{|ai1|, ··· , |ain|} for i = 1, 2, ··· , n.

(2) We then set
$$s = \max\left\{\frac{|a_{11}|}{r_1}, \frac{|a_{21}|}{r_2}, \cdots, \frac{|a_{n1}|}{r_n}\right\}.$$

• Choose (the first) row i to be the pivot row if |a_{i1}|/r_i = s. In our case,
$$s = \max\left\{\frac{10^{-9}}{2}, \frac{1}{4}, \frac{10}{10}\right\} = 1.$$
Therefore row 3 should be chosen.

• We then interchange row 1 and row 3 and eliminate:
$$\begin{pmatrix}
10 & 0 & -1\\
1 & 2 & -4\\
10^{-9} & -1 & 2
\end{pmatrix}
\xrightarrow{\text{Gaussian elimination}}
\begin{pmatrix}
10 & 0 & -1\\
0 & 2 & -\frac{39}{10}\\
0 & -1 & 2 + 10^{-10}
\end{pmatrix}$$

• In the next step, the selection of a pivot row is made on the basis of the updated matrix A.

(1') We need to apply Gaussian elimination to
$$\begin{pmatrix} 2 & -\frac{39}{10}\\ -1 & 2 + 10^{-10} \end{pmatrix}
\begin{matrix}
\leftarrow r_1 = \max\{2, \frac{39}{10}\} = \frac{39}{10}\\
\leftarrow r_2 = \max\{1, 2 + 10^{-10}\} = 2 + 10^{-10}
\end{matrix}$$

(2') We then compute
$$s = \max\left\{\frac{2}{39/10}, \frac{1}{2 + 10^{-10}}\right\} = \frac{20}{39}.$$

• Hence we choose row 1 as the pivot row, and so no row interchange is required:
$$\begin{pmatrix} 2 & -\frac{39}{10}\\ -1 & 2 + 10^{-10} \end{pmatrix}
\xrightarrow{\text{Gaussian elimination}}
\begin{pmatrix} 2 & -\frac{39}{10}\\ 0 & 10^{-10} + \frac{1}{20} \end{pmatrix}$$

• Finally we have
$$A = \begin{pmatrix}
10^{-9} & -1 & 2\\
1 & 2 & -4\\
10 & 0 & -1
\end{pmatrix}
\longrightarrow
U = \begin{pmatrix}
10 & 0 & -1\\
0 & 2 & -\frac{39}{10}\\
0 & 0 & 10^{-10} + \frac{1}{20}
\end{pmatrix}$$
with
$$P = \begin{pmatrix} 0 & 0 & 1\\ 0 & 1 & 0\\ 1 & 0 & 0 \end{pmatrix} \text{ (interchange of rows)},\qquad
L = \begin{pmatrix} 1 & 0 & 0\\ 10^{-1} & 1 & 0\\ 10^{-10} & -0.5 & 1 \end{pmatrix} \text{ (multipliers)}.$$

• Therefore, at the end of the process, we have an LU factorization of PA (not A!):
$$PA = LU,$$
where the permutation matrix P is formed by permuting the rows of the identity matrix I according to the row interchanges applied.

• The solution to the original linear system Ax = b is obtained from the permuted linear system PAx = Pb, which is again solved in two stages: Ly = Pb and then Ux = y.
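A compact sketch of the whole algorithm returning P, L, U with PA = LU (our implementation; the scales r_i are computed once up front as in step (1)):

```python
import numpy as np

def lu_scaled_row_pivoting(A):
    """LU factorization with scaled row pivoting: returns P, L, U, PA = LU.

    At step k the pivot row maximizes |a_ik| / r_i among the remaining
    rows, where r_i = max_j |a_ij| is the scale of row i.
    """
    A = np.array(A, dtype=float)
    n = A.shape[0]
    r = np.max(np.abs(A), axis=1)            # row scales r_1..r_n
    perm = np.arange(n)                      # row ordering (encodes P)
    L = np.zeros((n, n))
    for k in range(n - 1):
        ratios = np.abs(A[k:, k]) / r[k:]
        p = k + np.argmax(ratios)            # pivot row for column k
        if p != k:                           # interchange rows + bookkeeping
            A[[k, p]], L[[k, p]] = A[[p, k]].copy(), L[[p, k]].copy()
            r[[k, p]], perm[[k, p]] = r[[p, k]].copy(), perm[[p, k]].copy()
        L[k + 1:, k] = A[k + 1:, k] / A[k, k]              # multipliers
        A[k + 1:, k:] -= np.outer(L[k + 1:, k], A[k, k:])  # eliminate
    np.fill_diagonal(L, 1.0)
    P = np.eye(n)[perm]
    return P, L, np.triu(A)

A = np.array([[1e-9, -1., 2.], [1., 2., -4.], [10., 0., -1.]])
P, L, U = lu_scaled_row_pivoting(A)
assert np.allclose(P @ A, L @ U)
```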

2.12 Diagonally Dominant Matrices

Some systems of equations have the property that Gaussian elimination without scaled pivoting can be safely used. One example is diagonally dominant matrices.

Definition 2.3. An n × n matrix is said to be “diagonally dominant” if
$$|a_{ii}| > \sum_{j=1, j\neq i}^{n} |a_{ij}| \quad (1 \le i \le n).$$

$$A_1 = \begin{pmatrix} 2 & -1 & 0\\ -1 & -3 & -1\\ 0 & -1 & 2 \end{pmatrix}
\begin{matrix} 2 > 1\\ 3 > 1 + 1\\ 2 > 1 \end{matrix}
\qquad\qquad
A_2 = \begin{pmatrix} 2 & -1 & 0\\ -1 & 2 & -1\\ 0 & -2 & 2 \end{pmatrix}
\begin{matrix} 2 > 1\\ 2 \not> 1 + 1\\ 2 \not> 2 \end{matrix}$$

Conclusion: A_1 is diagonally dominant but A_2 is not (see the sketch below).

• For a diagonally dominant matrix, in the first step of Gaussian elimination, we can use row 1 as the pivot row since the pivot element a_{11} is not 0 (why? |a_{11}| strictly exceeds a sum of absolute values, so it cannot vanish).
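A minimal check of Definition 2.3 (NumPy sketch):

```python
import numpy as np

def is_diagonally_dominant(A):
    """Return True iff |a_ii| > sum_{j != i} |a_ij| for every row i."""
    A = np.abs(np.asarray(A, dtype=float))
    d = np.diag(A)
    return bool(np.all(d > A.sum(axis=1) - d))

A1 = [[2, -1, 0], [-1, -3, -1], [0, -1, 2]]
A2 = [[2, -1, 0], [-1, 2, -1], [0, -2, 2]]
print(is_diagonally_dominant(A1), is_diagonally_dominant(A2))  # True False
```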

• After completing Step 1, we would like to know that Row 2 can be used as the next pivot row.

Theorem 2.5. Gaussian elimination without pivoting preserves the diagonal dominance of a matrix.

Proof Recall A = A(1) → A(2) → A(3) → · · · → A(n).

• It suffices to consider the first step in Gaussian elimination because the subsequent steps are similar to the first one but of smaller size.

• Let A be an n × n matrix that is diagonally dominant. Taking account of the 0’s created in Column 1, as well as the fact that Row 1 is unchanged, we are going to prove MATH3602 Chapter 2 Advanced Direct Methods 88

$$(1)\qquad |a_{ii}^{(2)}| > \sum_{j=2, j\neq i}^{n} |a_{ij}^{(2)}|, \quad i = 2, \ldots, n.$$

• In terms of A, this means
$$(2)\qquad \left|a_{ii} - \frac{a_{i1}}{a_{11}}\times a_{1i}\right| > \sum_{j=2, j\neq i}^{n} \left|a_{ij} - \frac{a_{i1}}{a_{11}}\times a_{1j}\right|, \qquad (1) \Longleftrightarrow (2)$$

• Using the triangle inequality |x| − |y| ≤ |x − y| ≤ |x| + |y|, it suffices to prove the following inequality, which is stronger than (2):
$$(3)\qquad |a_{ii}| - \left|\frac{a_{i1}}{a_{11}}\times a_{1i}\right| > \sum_{j=2, j\neq i}^{n} \left\{|a_{ij}| + \left|\frac{a_{i1}}{a_{11}}\times a_{1j}\right|\right\}$$
which is equivalent to

$$(4)\qquad |a_{ii}| - \sum_{j=2, j\neq i}^{n} |a_{ij}| > \sum_{j=2}^{n} \left|\frac{a_{i1}}{a_{11}}\times a_{1j}\right|. \qquad (3) \Longleftrightarrow (4)$$

• From the diagonal dominance in the ith row, we have
$$|a_{ii}| - \sum_{j=2, j\neq i}^{n} |a_{ij}| > |a_{i1}|. \qquad (2.14)$$
From the diagonal dominance of row 1, we get
$$|a_{11}| > \sum_{j=2}^{n} |a_{1j}| \Longrightarrow 1 > \sum_{j=2}^{n} \frac{|a_{1j}|}{|a_{11}|}.$$
Hence
$$|a_{i1}| > \sum_{j=2}^{n} \frac{|a_{1j}a_{i1}|}{|a_{11}|}. \qquad (2.15)$$

• Combining the results in Eq. (2.14) and Eq. (2.15), we obtain

$$|a_{ii}| - \sum_{j=2, j\neq i}^{n} |a_{ij}| > \sum_{j=2}^{n} \left|\frac{a_{i1}a_{1j}}{a_{11}}\right|.$$
This completes the proof.

Example 2.1
$$\begin{pmatrix} 3 & 1 & 0\\ -1 & 3 & 1\\ 0 & -1 & 3 \end{pmatrix}
\begin{matrix} 3 > 1\\ 3 > 1 + 1\\ 3 > 1 \end{matrix}
\Longrightarrow
\begin{pmatrix} 3 & 1 & 0\\ 0 & 10/3 & 1\\ 0 & -1 & 3 \end{pmatrix}
\begin{matrix} 3 > 1\\ 10/3 > 1\\ 3 > 1 \end{matrix}
\Longrightarrow
\begin{pmatrix} 3 & 1 & 0\\ 0 & 10/3 & 1\\ 0 & 0 & 33/10 \end{pmatrix}
\begin{matrix} 3 > 1\\ 10/3 > 1\\ 33/10 > 0 \end{matrix}$$

Theorem 2.6. Every diagonally dominant matrix is nonsingular and has an LU factorization.

Proof We recall the sequence in the Gaussian elimination process A = A(1) → A(2) → · · · → A(n) = U.

• Theorem 2.5 together with Theorem 2.4 implies that a diagonally dominant matrix A has an LU decomposition in which L is unit lower triangular.

• The matrix U, by Theorem 2.5, is diagonally dominant. Hence, its diagonal elements are nonzero.

Thus, L and U are nonsingular.

• We give another proof that “a diagonally dominant matrix A is nonsingular”. Let λ be an eigenvalue of A with an eigenvector x. Since x ≠ 0, we have for some r (1 ≤ r ≤ n)

$$|x_r| = \max_i\{|x_i|\} > 0.$$
We then define
$$v = \frac{1}{x_r}x,$$
which is an eigenvector of A such that Av = λv, |v_i| ≤ 1 and v_r = 1.

• Now we consider the rth row of the matrix equation and recall that v_r = 1:
$$\sum_{j=1}^{n} a_{rj}v_j = a_{rr} + \sum_{j=1, j\neq r}^{n} a_{rj}v_j = \lambda v_r = \lambda.$$
Thus we have

$$|a_{rr} - \lambda| = \left|\sum_{j=1, j\neq r}^{n} a_{rj}v_j\right| \le \sum_{j=1, j\neq r}^{n} |a_{rj}v_j| \le \sum_{j=1, j\neq r}^{n} |a_{rj}| < |a_{rr}|,$$
and λ cannot be zero.

2.13 A Brief Summary

• You should know the definition of a block-diagonal matrix and the method of permutation to solve the linear system.

• You should know the definitions and properties of a circulant matrix and a Toeplitz matrix.

• You should know the definitions and properties of a tri-diagonal matrix.

• You should be able to apply the Gaussian elimination (with or without scaled row pivoting) to a linear system.

• You should know the equivalence between the LU factorization and Gaussian elimination.

• You should be able to prove the following: if all the pivot elements are nonzero in the Gaussian elimination process of a given matrix A, then we can get the Doolittle factorization A = LU by using the upper triangular matrix in the final step and also the multipliers obtained in each step.

• You should know the diagonal dominance of a matrix.

• You should be able to prove the following: (i) The Gaussian elimination without pivoting preserves the diagonal dominance of a matrix. (ii) Hence every diagonally dominant matrix has an LU factorization.

• You should know the computational costs of solving various linear systems (triangular, tridiagonal, Hessenberg and general systems) by direct methods.