Sparse LU

Kristin Davies, Peter He, Feng Xie, Hamid Ghaffari

April 4, 2007

Outline

 Introduction to LU Factorization (Kristin)

 LU Transformation Algorithms (Kristin)

 LU and Sparsity (Peter)

 Simplex Method (Feng)

 LU Update (Hamid)

Introduction – Transformations – Sparsity – Simplex – Implementation

Introduction – What is LU Factorization?

 decomposition of a matrix into the product of a lower triangular matrix and an upper triangular matrix: A = LU

 Example for a 3x3 matrix:

[a11 a12 a13]   [l11  0   0 ] [u11 u12 u13]
[a21 a22 a23] = [l21 l22  0 ] [ 0  u22 u23]
[a31 a32 a33]   [l31 l32 l33] [ 0   0  u33]

Introduction – LU Existence

A = [a11 a12 a13; a21 a22 a23; a31 a32 a33]

 LU factorization can be completed on an invertible matrix if and only if all its principal minors are non-zero

Recall: A is invertible if there exists B such that AB = BA = I.
Recall: the principal minors of A are
det(a11), det(a22), det(a33),
det([a11 a12; a21 a22]), det([a22 a23; a32 a33]), det([a11 a13; a31 a33]),
det([a11 a12 a13; a21 a22 a23; a31 a32 a33])

Introduction – LU Unique Existence

 Imposing the requirement that the diagonal of either L or U must consist of ones results in a unique LU factorization:

[a11 a12 a13]   [ 1   0  0] [u11 u12 u13]        [l11  0   0 ] [1 u12 u13]
[a21 a22 a23] = [l21  1  0] [ 0  u22 u23]   or = [l21 l22  0 ] [0  1  u23]
[a31 a32 a33]   [l31 l32 1] [ 0   0  u33]        [l31 l32 l33] [0  0   1 ]

Introduction – Why LU Factorization?

 LU factorization is useful for:
– Solving systems of linear equations (AX = B)
– Computing the inverse of a matrix

 LU factorization is advantageous when there is a need to solve a set of equations for many different values of B

Transformation Algorithms

 Modified forms of Gaussian elimination:

 Doolittle factorization – L has 1's on its diagonal
 Crout factorization – U has 1's on its diagonal
 Cholesky factorization – U = Lᵀ or L = Uᵀ

 Solution to AX=B is found as follows:
– Construct the matrices L and U (if possible)
– Solve LY=B for Y using forward substitution
– Solve UX=Y for X using back substitution

Transformations – Doolittle

 Doolittle factorization – L has 1’s on its diagonal  General algorithm – determine rows of U from top to bottom; determine columns of L from left to right

for i=1:n
    for j=i:n
        Σk Lik Ukj = Aij gives row i of U
    end
    for j=i+1:n
        Σk Ljk Uki = Aji gives column i of L
    end
end
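The loop above can be written out directly; the following is a minimal Python sketch of my own (no pivoting, so it assumes all leading principal minors are nonzero):

```python
def doolittle_lu(A):
    """Doolittle LU factorization: L gets the unit diagonal.

    A is an n x n list of lists; returns (L, U).  No pivoting, so all
    leading principal minors of A must be nonzero.
    """
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # sum_k L[i][k] U[k][j] = A[i][j] gives row i of U
        for j in range(i, n):
            U[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
        # sum_k L[j][k] U[k][i] = A[j][i] gives column i of L
        for j in range(i + 1, n):
            L[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(i))) / U[i][i]
    return L, U
```

On the example A = [2 −1 −2; −4 6 3; −4 −2 8] used on the following slides, this reproduces the L and U derived there.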

Transformations – Doolittle example

 1 0 0u11 u12 u13   2 −1 − 2 for j=i:n      l u u L U =A gives row i of U  21 1 0 0 22 23  = − 4 6 3  ik kj ij      end l31 l32 1 0 0 u33  − 4 − 2 8  u   11  ()1 0 0  0  = 2 ⇒ u11 = 2    0  Similarly  1 0 0u11 u12 u13   2 −1 − 2      l u u  21 1 0 0 22 23  = − 4 6 3       l31 l32 1 0 0 u33  − 4 − 2 8 

⇒ u12 = − ,1 u13 = −2 Introduction – Transformations – Sparsity – Simplex – Implementation

Transformations – Doolittle example

 1 0 02 −1 − 2  2 −1 − 2 for j=i+1:n      l u u L U =A gives column i of L  21 1 00 22 23  = − 4 6 3  jk ki ji      end l31 l32 10 0 u33  − 4 − 2 8  2   ()l21 1 0 0 = −4 ⇒ 2l21 = −4 ⇒ l21 = −2   0 Similarly  1 0 02 −1 − 2  2 −1 − 2      l u u  21 1 00 22 23  = − 4 6 3       l31 l32 10 0 u33  − 4 − 2 8 

⇒ l31 = −2 Introduction – Transformations – Sparsity – Simplex – Implementation

Transformations – Doolittle example

 1 0 0u11 u12 u13   2 −1 − 2  1 0 02 −1 − 2  2 −1 − 2           l21 1 0 0 u22 u23  = − 4 6 3  − 2 1 00 4 −1 = − 4 6 3            l31 l32 1 0 0 u33  − 4 − 2 8  − 2 −1 10 0 3  − 4 − 2 8  3rd row of U 1st row of U  1 0 02 −1 − 2  2 −1 − 2  1 0 02 −1 − 2  2 −1 − 2           l21 1 00 u22 u23  = − 4 6 3  − 2 1 00 4 −1 = − 4 6 3            l31 l32 10 0 u33  − 4 − 2 8  − 2 −1 10 0 u33  − 4 − 2 8  2nd coln of L 1st coln of L  1 0 02 −1 − 2  2 −1 − 2  1 0 02 −1 − 2  2 −1 − 2           2 1 0 0 4 1 4 6 3 − 2 1 00 u22 u23  = − 4 6 3  −  −  = −            − 2 l32 10 0 u33  − 4 − 2 8  − 2 l32 10 0 u33  − 4 − 2 8  2nd row of U Introduction – Transformations – Sparsity – Simplex – Implementation

Transformations – Doolittle example

 Execute algorithm for our example (n=3):

i=1:  j=1  Σk L1k Uk1 = A11 gives row 1 of U
      j=2  Σk L1k Uk2 = A12 gives row 1 of U
      j=3  Σk L1k Uk3 = A13 gives row 1 of U
      j=2  Σk L2k Uk1 = A21 gives column 1 of L
      j=3  Σk L3k Uk1 = A31 gives column 1 of L
i=2:  j=2  Σk L2k Uk2 = A22 gives row 2 of U
      j=3  Σk L2k Uk3 = A23 gives row 2 of U
      j=3  Σk L3k Uk2 = A32 gives column 2 of L
i=3:  j=3  Σk L3k Uk3 = A33 gives row 3 of U

Transformations – Crout

 Crout factorization – U has 1’s on its diagonal  General algorithm – determine columns of L from left to right; determine rows of U from top to bottom (same!?)

for i=1:n
    for j=i:n
        Σk Ljk Uki = Aji gives column i of L
    end
    for j=i+1:n
        Σk Lik Ukj = Aij gives row i of U
    end
end
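A matching Python sketch for Crout (again my own illustration, no pivoting; here U gets the unit diagonal):

```python
def crout_lu(A):
    """Crout LU factorization: U gets the unit diagonal (no pivoting)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        # sum_k L[j][k] U[k][i] = A[j][i] gives column i of L
        for j in range(i, n):
            L[j][i] = A[j][i] - sum(L[j][k] * U[k][i] for k in range(i))
        # sum_k L[i][k] U[k][j] = A[i][j] gives row i of U
        for j in range(i + 1, n):
            U[i][j] = (A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))) / L[i][i]
    return L, U
```

On the same example A, the first column of L comes out as (2, −4, −4), matching the worked example on the next slide.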

Transformations – Crout example

[l11  0   0 ] [1 u12 u13]   [ 2 -1 -2]
[l21 l22  0 ] [0  1  u23] = [-4  6  3]
[l31 l32 l33] [0  0   1 ]   [-4 -2  8]

for j=i:n, Σk Ljk Uki = Aji gives column i of L:
(l11 0 0) · (1, 0, 0)ᵀ = 2 ⇒ l11 = 2
Similarly, l21 = −4, l31 = −4

Transformations – Solution

 Once the L and U matrices have been found, we can easily solve our system AX=B:
A = LU ⇒ LUX = B; let UX = Y, then solve LY = B, then UX = Y

[l11  0   0 ] [y1]   [b1]      [u11 u12 u13] [x1]   [y1]
[l21 l22  0 ] [y2] = [b2]      [ 0  u22 u23] [x2] = [y2]
[l31 l32 l33] [y3]   [b3]      [ 0   0  u33] [x3]   [y3]

Forward Substitution:
y1 = b1 / l11
yi = (bi − Σ_{j=1}^{i−1} lij yj) / lii,  i = 2, …, n

Backward Substitution:
xn = yn / unn
xi = (yi − Σ_{j=i+1}^{n} uij xj) / uii,  i = n−1, …, 1
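The two substitutions can be sketched in Python (my own sketch, matching the formulas above):

```python
def forward_sub(L, b):
    """Solve L y = b for lower triangular L (nonzero diagonal)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][j] * y[j] for j in range(i))) / L[i][i]
    return y

def back_sub(U, y):
    """Solve U x = y for upper triangular U (nonzero diagonal)."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x
```

With the L and U from the Doolittle example and b = (−1, 5, 2), the two calls recover x = (1, 1, 1).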

References - Intro & Transformations

 Module for Forward and Backward Substitution. http://math.fullerton.edu/mathews/n2003/BackSubstitutionMod.hml
 Forward and Backward Substitution. www.ac3.edu/papers/Khoury-thesis/node13.html
 LU Decomposition. http://en.wikipedia.org/wiki/LU_decomposition
 Crout matrix decomposition. http://en.wikipedia.org/wiki/Crout_matrix_decomposition
 Doolittle Decomposition of a Matrix. www.engr.colostate.edu/~thompson/hPage/CourseMat/Tutorials/CompMethods/doolittle.pdf

Definition and Storage of Sparse Matrix

 sparse … many elements are zero; for example, diag(d1, …, dn) with n ≫ 0

 dense … few elements are zero

 In order to reduce the memory burden, special matrix storage formats are introduced

Definition and Storage of Sparse Matrix

 Regular storage structure.

row:    1 1 2 2 4 4
column: 3 5 3 4 2 3
value:  3 4 5 7 2 6

(a list of (row, column, value) triples)

Definition and Storage of Sparse Matrix

– A standard sparse matrix format (compressed column storage):

A = [1 -3  0 -1  0;
     0  0 -2  0  3;
     2  0  0  0  0;
     0  4  0 -4  0;
     5  0 -5  0  6]

Subscripts: 1 2 3 4 5 6 7 8 9 10 11
Colptr: 1 4 6 8 10 12
Rowind: 1 3 5 1 4 2 5 1 4 2 5
Value:  1 2 5 -3 4 -2 -5 -1 -4 3 6
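Building the (Colptr, Rowind, Value) arrays from a dense array is a short exercise; a sketch of my own (it uses 0-based indices, whereas the slide's arrays are 1-based):

```python
def dense_to_ccs(A):
    """Convert a dense m x n list-of-lists to compressed column storage.

    Returns (colptr, rowind, value) with 0-based indices: column j's
    nonzeros live in value[colptr[j]:colptr[j+1]].
    """
    m, n = len(A), len(A[0])
    colptr, rowind, value = [0], [], []
    for j in range(n):
        for i in range(m):
            if A[i][j] != 0:
                rowind.append(i)
                value.append(A[i][j])
        colptr.append(len(value))  # start of the next column
    return colptr, rowind, value
```

Running it on the 5×5 matrix above reproduces the slide's three arrays (shifted down by one for 0-based indexing).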

Structure Decomposition of Sparse Matrix (square)

 A ↔ G(A) ↔ Adj(G), with Adj(G)ᵢⱼ ∈ {0, 1}
 If P is a permutation matrix, then G(A) = G(Pᵀ A P)

opt{ Pᵀ A P : G(Pᵀ A P) = G(A) } ⇔ opt{ Pᵀ adj(G) P : G(Pᵀ adj(G) P) = G(A) }

Structure Decomposition of Sparse Matrix (square)

 Case 1: there is no loop in G(A)
Step 1. No loop; set t = 1.
Step 2. ∃ output vertices S_t = {i₁ᵗ, i₂ᵗ, …, i_{n_t}ᵗ} ⊂ V(G).
Step 3. Delete from adj(G) the rows and columns indexed by i₁ᵗ, …, i_{n_t}ᵗ.

Structure Decomposition of Sparse Matrix (square)

Set t = t + 1 and return to Step 2.
 After p steps, {1, 2, …, n} = S₁ ∪ S₂ ∪ … ∪ S_p (a partition of the index set). There exists Pᵀ = [P_{S₁}ᵀ, P_{S₂}ᵀ, …, P_{S_p}ᵀ] such that Pᵀ adj(G) P is lower triangular. Therefore Pᵀ A P is a lower triangular block matrix.

Structure Decomposition of Sparse Matrix (square)

 Case 2: there is a loop in G(A). If the graph is not strongly connected, the reachability matrix of adj(A) contains zero entries.
Step 1. Choose the j-th column, set t = 1, and let S_t(j) = { i : (R ∩ Rᵀ)ᵢⱼ = 1, 1 ≤ i ≤ n }

Structure Decomposition of Sparse Matrix (square)

S_t(j) = {i₁ᵗ, i₂ᵗ, …, i_{n_t}ᵗ}  (j ∈ S_t ⊂ {1, 2, …, n}); S_t is closed under strong connection.
Step 2. Choose the j₁-th column (j₁ ≠ j), set j = j₁, t = t + 1, and return to Step 1.
 After p steps,

{1, 2, …, n} = S₁ ∪ S₂ ∪ … ∪ S_p

Structure Decomposition of Sparse Matrix (square)

∃ P = (P_{S₁}ᵀ, P_{S₂}ᵀ, …, P_{S_p}ᵀ)ᵀ such that Pᵀ adj(G) P is a lower triangular block matrix. Therefore Pᵀ A P is a lower triangular block matrix.
 A note on Case 2 mentioned above:

                    [Â₁₁ Â₁₂ … Â₁p]
Â = Pᵀ adj(G) P =  [Â₂₁ Â₂₂ … Â₂p]
                    [ ⋮        ⋮  ]
                    [Âp₁  …  … Âpp]

Structure Decomposition of Sparse Matrix (square)

B = (bᵢⱼ) ∈ R^{p×p},  where  bᵢⱼ = 0 if i = j;  bᵢⱼ = 1 if Âᵢⱼ ≠ 0;  bᵢⱼ = 0 if Âᵢⱼ = 0

Structure Decomposition of Sparse Matrix (square)

There is no loop in G(B). By an analysis similar to Case 1, P(j₁ j₂ … j_p)ᵀ B P(j₁ j₂ … j_p) ∈ R^{p×p} is lower triangular. Therefore, Pᵀ A P is a lower triangular block matrix.

Structure Decomposition of Sparse Matrix (square)

 Reduces the computational burden of solving large-scale linear systems
 Generally speaking, for a large-scale system, system stability can easily be judged
 The method is used in hierarchical optimization of large-scale dynamic systems

Structure Decomposition of Sparse Matrix (General)

Dulmage-Mendelsohn Decomposition
 ∃ P, Q such that

          [A_h  ×   ×  ]
P A Qᵀ = [     A_s  ×  ]
          [          A_v]

Structure Decomposition of Sparse Matrix (General)

 Further, a fine decomposition is needed:
A_h → the block diagonal form
A_v → the block diagonal form
A_s → the block upper triangular form
 The D-M decomposition can be seen in reference [3].
 Computation and storage: minimize fill-in. Details can be seen in reference [2].

LU Factorization Method : Gilbert/Peierls

 Left-looking: the kth stage computes the kth column of L and U

1. L = I
2. U = I
3. for k = 1:n
4.     s = L \ A(:,k)
5.     (partial pivoting on s)
6.     U(1:k,k) = s(1:k)
7.     L(k:n,k) = s(k:n) / U(k,k)
8. end
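A dense, no-pivoting Python sketch of the left-looking column algorithm (my own illustration of the idea; the real Gilbert/Peierls code works on sparse data structures and includes the pivoting of line 5):

```python
def left_looking_lu(A):
    """Left-looking LU sketch: stage k solves s = L \\ A(:,k) with the
    L built so far, then splits s into column k of U and column k of L.
    Dense, no pivoting."""
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for k in range(n):
        # s = L \ A(:,k): forward substitution; columns >= k of L are
        # still identity, so they contribute nothing yet
        s = [A[i][k] for i in range(n)]
        for j in range(n):
            for i in range(j + 1, n):
                s[i] -= L[i][j] * s[j]
        # U(1:k,k) = s(1:k);  L(k:n,k) = s(k:n) / U(k,k)
        for i in range(k + 1):
            U[i][k] = s[i]
        for i in range(k + 1, n):
            L[i][k] = s[i] / U[k][k]
    return L, U
```

On the 3×3 example from the Transformations section, this yields the same L and U as Doolittle's method.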

LU Factorization Method : Gilbert/Peierls

 THEOREM (Gilbert/Peierls). The entire algorithm for LU factorization of A with partial pivoting can be implemented to run in O(flops(LU) + m) time on a RAM, where m is the number of nonzero entries of A.
 Note: the theorem says that the LU factorization runs in time within a constant factor of the best possible, but it does not say what the …

Sparse lower triangular solve, x=L\b

x = b;
for j = 1:n
    if x(j) ≠ 0
        x(j+1:n) = x(j+1:n) - L(j+1:n,j) * x(j);
    end
end

--- Total time: O(n + flops)
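The loop above can be sketched in Python with L stored column-wise (my own sketch; the function name and the (row, value) column lists are assumed representations, and L is taken to be unit lower triangular since the loop never divides by a diagonal entry):

```python
def sparse_lower_solve(Lcols, b, n):
    """x = L \\ b for unit lower triangular L.

    Lcols[j] is a list of (i, l_ij) pairs with i > j.
    b is a dict {index: value} (a sparse right-hand side).
    Only columns j with x[j] != 0 do any work, as in the slide's loop.
    """
    x = dict(b)
    for j in range(n):
        xj = x.get(j, 0.0)
        if xj != 0:
            for i, lij in Lcols[j]:
                x[i] = x.get(i, 0.0) - lij * xj
    return x
```

For the L of the earlier example (entries −2, −2, −1 below the unit diagonal) and b = (−1, 5, 2), this returns the same y = (−1, 3, 3) as dense forward substitution.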

Sparse lower triangular solve, x=L\b

 x_j ≠ 0 ∧ l_ij ≠ 0 ⇒ x_i ≠ 0
 b_i ≠ 0 ⇒ x_i ≠ 0

 Let G(L) have an edge j → i iff l_ij ≠ 0
 Let β = { i : b_i ≠ 0 } and χ = { i : x_i ≠ 0 }
 Then χ = Reach_G(L)(β)

--- Total time: O(flops)
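The set χ = Reach_G(L)(β) can be computed by a depth-first search over the column structure, without touching any numerical values (my own sketch; note the symbolic pattern can overestimate χ when exact numerical cancellation occurs):

```python
def reach(Lcols, beta):
    """Vertices reachable in G(L) from the nonzero pattern beta of b,
    where Lcols[j] lists the (i, l_ij) pairs of column j, i.e. the
    edges j -> i of G(L)."""
    seen = set()
    stack = list(beta)
    while stack:
        j = stack.pop()
        if j not in seen:
            seen.add(j)
            for i, _ in Lcols[j]:
                stack.append(i)
    return seen
```

For the small example L (nonzeros l21, l31, l32), a nonzero in b's first position reaches every index, while a nonzero only in the last position reaches nothing new.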


References

 J. R. Gilbert and T. Peierls, Sparse partial pivoting in time proportional to arithmetic operations, SIAM J. Sci. Statist. Comput., 9 (1988), pp. 862-874
 Mihalis Yannakakis' website: http://www1.cs.columbia.edu/~mihalis
 A. Pothen and C. Fan, Computing the block triangular form of a sparse matrix, ACM Trans. on Math. Soft., Vol. 18, No. 4, Dec. 1990, pp. 303-324

Simplex Method – Problem size

min{ cᵀx : Ax = b, x ≥ 0, A ∈ R^{m×n} }

 Problem size determined by A
 On average, 5–10 nonzeros per column

greenbea.mps of NETLIB

Simplex Method – Computational Form, Basis

min{ cᵀx : Ax = b, x ≥ 0, A ∈ R^{m×n} }

Note: A is of full row rank, m ≤ n

Basis (of Rm) : m linearly independent columns of A

Basic variables

x1 x2 x3 x4
[1  0  1  2]
[0  1  2  1]

Simplex Method – Notations

β: index set of basic variables
γ: index set of non-basic variables

B := A_β  (basis)
R := A_γ  (non-basis columns)

A = [B | R],  x = (x_β; x_γ),  c = (c_β; c_γ)

Simplex Method – Basic Feasible Solution

Ax = b  ⇔  B x_β + R x_γ = b

x_β = B⁻¹ (b − R x_γ)

Basic feasible solution:

x_γ = 0,  x_β ≥ 0,  Ax = b

If a problem has an optimal solution, then there is a basic solution which is also optimal.

Simplex Method – Checking Optimality

Objective value: cᵀx = c_βᵀ x_β + c_γᵀ x_γ
              = c_βᵀ B⁻¹ b + ( c_γᵀ − c_βᵀ B⁻¹ R ) x_γ
                 (constant)      (reduced costs)
              = c₀ + dᵀ x_γ

Optimality condition: d_j ≥ 0 for all j ∈ γ
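The reduced-cost computation can be sketched in a few lines (my own illustration, not from the slides; `solve` is a hypothetical dense Gaussian-elimination helper standing in for the basis factorization a real code would use):

```python
def solve(M, rhs):
    """Tiny dense Gaussian elimination with partial pivoting for M x = rhs."""
    n = len(M)
    a = [row[:] + [rhs[i]] for i, row in enumerate(M)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(a[r][k]))  # pivot row
        a[k], a[p] = a[p], a[k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k, n + 1):
                a[r][c] -= f * a[k][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))) / a[i][i]
    return x

def reduced_costs(B, R, c_beta, c_gamma):
    """d_j = c_gamma[j] - y^T R[:, j], where y solves B^T y = c_beta,
    so that d^T = c_gamma^T - c_beta^T B^{-1} R."""
    n = len(B)
    Bt = [[B[i][j] for i in range(n)] for j in range(n)]
    y = solve(Bt, c_beta)
    return [c_gamma[j] - sum(y[i] * R[i][j] for i in range(n))
            for j in range(len(c_gamma))]
```

With B = I, R = [1 2; 2 1], c_β = (1, 1) and c_γ = (0, 0), both reduced costs are −3, so either non-basic variable could enter the basis.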

Simplex Method – Improving Basic Feasible Solution

 Choose x_q (incoming variable) s.t. d_q < 0
 Increase x_q as much as possible (the objective value c₀ + dᵀ x_γ decreases)

If the basic variables remain feasible even as x_q → ∞, the objective value is unbounded.

Simplex Method – Improving Basic Feasible Solution

 Choose x_q (incoming variable) s.t. d_q < 0
 Increase x_q as much as possible

As x_q increases, if some basic variable x_p goes to 0 first, then x_p is the outgoing variable and we move to a neighboring improving basis; otherwise the solution is unbounded.

Simplex Method – Basis Updating

B b,...,b ,..., b  Neighboring bases : = 1 p m 

B= [ b1,...,a ,..., b m ] Introduction – Transformations – Sparsity – Simplex – Implementation


Simplex Method – Basis Updating

B = [b₁, …, b_p, …, b_m],  B̄ = [b₁, …, a, …, b_m]

Write a = Σᵢ vᵢ bᵢ = B v  (v = B⁻¹ a), i.e. a as a linear combination of the basis vectors.

b_p = (1/v_p) ( a − Σ_{i≠p} vᵢ bᵢ )   (v_p ≠ 0)

b_p = B̄ η, where η = ( −v₁/v_p, …, −v_{p−1}/v_p, 1/v_p, −v_{p+1}/v_p, …, −v_m/v_p )ᵀ

B = B̄ E, where E = [e₁, …, e_{p−1}, η, e_{p+1}, …, e_m]  (elementary transformation matrix)

B̄⁻¹ = E B⁻¹

Simplex Method – Basis Updating

  Eee= 1,...,p− 1 ,η , e p + 1 ,..., e m  1 η 1    ETM ⋱   (Elementary Transformation Matrix) = η p    ⋱  m  η 1  10 1 η 1   10       ⋱ ⋮ ⋱ ⋮  ⋱ ⋮  = η p 1  ⋯ 1       ⋮ ⋱ ⋮ ⋱  ⋮ ⋱    m 0 1 0 1  η 1  Introduction – Transformations – Sparsity – Simplex – Implementation

Simplex Method – Basis Updating

Basis tends to get denser after each update (B̄⁻¹ = E B⁻¹)

     [1      η¹      ] [w¹]   [w¹ + η¹ wᵖ]
     [  ⋱    ⋮       ] [⋮ ]   [    ⋮     ]
Ew = [      ηᵖ       ] [wᵖ] = [  ηᵖ wᵖ   ]
     [       ⋮   ⋱   ] [⋮ ]   [    ⋮     ]
     [      ηᵐ      1] [wᵐ]   [wᵐ + ηᵐ wᵖ]

Ew = w if wᵖ = 0
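The O(m) application of E to a vector can be sketched directly from the picture above (my own sketch; `eta_apply` is a hypothetical name):

```python
def eta_apply(eta, p, w):
    """Apply E = [e_1, ..., e_{p-1}, eta, e_{p+1}, ..., e_m] to w.

    (Ew)_i = w_i + eta_i * w_p for i != p, and (Ew)_p = eta_p * w_p.
    If w_p = 0, then Ew = w and no arithmetic is needed.
    """
    if w[p] == 0:
        return list(w)
    out = [wi + ei * w[p] for wi, ei in zip(w, eta)]
    out[p] = eta[p] * w[p]  # the p-th entry is replaced, not accumulated
    return out
```

This is the elementary operation behind both the product form of the inverse and the eta updates in the LU section that follows.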


Simplex Method – Algorithm

Steps                                      Major ops
1. Find an initial feasible basis B
2. Initialization                          B⁻¹ b
3. Check optimality                        c_βᵀ B⁻¹
4. Choose incoming variable x_q            B⁻¹ a_q
5. Choose outgoing variable x_p (pivot step)
6. Update basis                            E B⁻¹

Simplex Method – Algorithm

Choice of pivot (numerical considerations)

 fewer resulting fill-ins
 large pivot element

These goals sometimes conflict.

In practice, compromise.

Simplex Method – Typical Operations in Simplex Method

Typical operations: B⁻¹ w, wᵀ B⁻¹

Challenge: sparsity of B⁻¹ could be destroyed by basis updates

Need a proper way to represent B⁻¹

Two ways:
 Product form of the inverse ( B⁻¹ = E_k E_{k−1} ⋯ E₁ ) (obsolete)
 LU factorization

Simplex Method – LU Factorization

 Reduce complexity using LU update (B = B̄E, B̄⁻¹ = E B⁻¹)

Side effect: more LU factors

 Refactorization (reinstates efficiency and numerical accuracy)

Sparse LU Updates in Simplex Method

Hamid R. Ghaffari

April 10, 2007

Outline

LU Update Methods
– Preliminaries
– Bartels-Golub LU Update
– Sparse Bartels-Golub Method
– Reid's Method
– The Forrest-Tomlin Method
– Suhl-Suhl Method
More Details on the Topic

Revised Simplex Algorithm

Simplex Method vs. Revised Simplex Method:
- Determine the current basis, d → d = B⁻¹b
- Choose x_q to enter the basis based on the greatest cost contribution → c̄ = c_N − c_B B⁻¹N, {q | c̄_q = min_t c̄_t}
- If x_q cannot decrease the cost, d is optimal solution → if c̄_q ≥ 0, d is optimal solution
- Determine x_p that leaves the basis (becomes zero) as x_q increases → w = B⁻¹A_q, p = argmin_t { d_t / w_t : w_t > 0 }
- If x_q can increase without causing another variable to leave the basis, the solution is unbounded → if w_t ≤ 0 for all t, the solution is unbounded
- Update dictionary → update B⁻¹

Note: In general we do not compute the inverse.

Problems with Revised Simplex Algorithm

I The physical limitations of a computer can become a factor.

I Round-off error and significant digit loss are common problems in matrix manipulations (ill-conditioned matrices).

I It also becomes a task in .

I It takes m²(m − 1) multiplications and m(m − 1) additions, a total of m³ − m floating-point (real number) calculations.

Many variants of the Revised Simplex Method have been designed to reduce this O(m³)-time algorithm as well as improve its accuracy.

I How to deal with this? Introducing Spike

I If A_q is the entering column, B the original basis and B̄ the new basis, then we have

B̄ = B + (A_q − B e_p) e_qᵀ,

I Having the LU decomposition B = LU, we have

L⁻¹ B̄ = U + (L⁻¹ A_q − U e_p) e_qᵀ,

I How to deal with this?

The various implementations and variations of the Bartels-Golub method generally diverge at the next step: reduction of the spiked upper triangular matrix back to an upper triangular matrix. (Chvátal, p. 150)

The first variant of the Revised Simplex Method was the Bartels-Golub Method.

Bartels-Golub Method Algorithm

Revised Simplex Method vs. Bartels-Golub:
- d = B⁻¹b → d = U⁻¹L⁻¹b
- c̄ = c_N − c_B B⁻¹N, {q | c̄_q = min_t c̄_t} → c̄ = c_N − c_B U⁻¹L⁻¹N, {q | c̄_q = min_t c̄_t}
- c̄_q ≥ 0 ⇒ d is optimal solution (unchanged)
- w = B⁻¹A_q, p = argmin_t { d_t / w_t : w_t > 0 } → w = U⁻¹L⁻¹A_q, same ratio test
- If w_t ≤ 0 for all t, the solution is unbounded (unchanged)
- Update B⁻¹ → update U⁻¹ and L⁻¹

Bartels-Golub Method Characteristics

I It significantly improved numerical accuracy.

I Can we do better? In sparse case, yes.

 1   1   1   1  l 1 l 1 1 1  21  =  21  ·   ·   l31 1   1  l31 1   1  l41 1 1 1 l41 1

So L can be expressed as the multiplication of single-entry eta matrices, and hence, L−1 is also is the product of the same matrices with off-diagonal entries negated.

Sparse Bartels-Golub Method eta matrices

First take a look at the following facts:

Column-Eta factorization of triangular matrices:

 1  1  1   1  l 1 1 1 l 1  21  =   ·   ·  21  l31 l32 1   1   l32 1  l31 1  l41 l42 l43 1 l43 1 l42 1 l41 1 So L can be expressed as the multiplication of single-entry eta matrices, and hence, L−1 is also is the product of the same matrices with off-diagonal entries negated.

Sparse Bartels-Golub Method eta matrices

First take a look at the following facts:

Column-Eta factorization of triangular matrices:

 1  1  1   1  l 1 1 1 l 1  21  =   ·   ·  21  l31 l32 1   1   l32 1  l31 1  l41 l42 l43 1 l43 1 l42 1 l41 1

Single-Entry-Eta Decomposition:

 1   1   1   1  l 1 l 1 1 1  21  =  21  ·   ·   l31 1   1  l31 1   1  l41 1 1 1 l41 1 Sparse Bartels-Golub Method eta matrices

First take a look at the following facts:

Column-Eta factorization of triangular matrices:

 1  1  1   1  l 1 1 1 l 1  21  =   ·   ·  21  l31 l32 1   1   l32 1  l31 1  l41 l42 l43 1 l43 1 l42 1 l41 1

Single-Entry-Eta Decomposition:

 1   1   1   1  l 1 l 1 1 1  21  =  21  ·   ·   l31 1   1  l31 1   1  l41 1 1 1 l41 1

So L can be expressed as the product of single-entry eta matrices, and hence L⁻¹ is the product of the same matrices, taken in reverse order, with their off-diagonal entries negated.

Sparse Bartels-Golub Method Algorithm

Bartels-Golub Method vs. Sparse Bartels-Golub Method:
- d = U⁻¹L⁻¹b → d = U⁻¹(∏_t η_t)b
- c̄ = c_N − c_B U⁻¹L⁻¹N → c̄ = c_N − c_B U⁻¹(∏_t η_t)N
- {q | c̄_q = min_t c̄_t}; c̄_q ≥ 0 ⇒ d is optimal solution (unchanged)
- w = U⁻¹L⁻¹A_q → w = U⁻¹(∏_t η_t)A_q, same ratio test
- If w_t ≤ 0 for all t, the solution is unbounded (unchanged)
- Update U⁻¹ and L⁻¹ → update U⁻¹ and create any necessary eta matrices; if there are too many eta matrices, completely refactor the basis.


I Instead of having to store the entire matrix, it is only necessary to store the location and value of the off-diagonal element for each matrix. Sparse Bartels-Golub Method Advantages

I It is no more complex than the Bartels-Golub Method.

I Instead of just L and U, the factors become the lower-triangular eta matrices and U.

I The eta matrices were reduced to single-entry eta matrices.

I Instead of having to store the entire matrix, it is only necessary to store the location and value of the off-diagonal element for each matrix.

I Refactorizations occur less than once every m iterations, so the complexity improves significantly to O(m^2).
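The storage advantages above can be made concrete: a single-entry eta matrix is the identity plus one off-diagonal entry, so a (row, column, value) triple suffices to store it, and applying its inverse to a vector is a single multiply-add. A small sketch (class and function names are mine, not from the talk; the order of the η-product is a bookkeeping convention):

```python
import numpy as np

class Eta:
    """Single-entry eta matrix: identity plus `value` at (row, col), row != col."""
    def __init__(self, row, col, value):
        self.row, self.col, self.value = row, col, value

    def solve(self, b):
        """Return eta^-1 @ b: only entry b[row] changes, in O(1) extra work."""
        x = b.copy()
        x[self.row] -= self.value * x[self.col]
        return x

def apply_eta_inverses(etas, b):
    """Apply the product of inverses, as in d = U^-1 (prod_t eta_t^-1) b."""
    for eta in etas:
        b = eta.solve(b)
    return b

etas = [Eta(2, 0, 0.5), Eta(1, 2, -2.0)]
b = np.array([1.0, 2.0, 3.0])
x = apply_eta_inverses(etas, b)  # x = [1.0, 7.0, 2.5]
```

Each update thus appends one triple instead of modifying a stored L^-1.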

Sparse Bartels-Golub Method Disadvantages

I Eventually, the number of eta matrices will become so large that it becomes cheaper to decompose the basis.

I Such a refactorization may occur prematurely in an attempt to promote stability if noticeable round-off errors begin to occur.

I In practice, in solving large sparse problems, the basis is refactorized quite frequently, often after every twenty iterations or so. (Chvátal, p. 111)

I If the spike always occurs in the first column and extends to the bottom row, the Sparse Bartels-Golub Method becomes worse than the Bartels-Golub Method.

I The upper-triangular matrix will always be fully decomposed, resulting in huge amounts of fill-in;

I Large numbers of eta matrices;

I O(n^3)-cost decomposition.

Reid's Suggestion on the Sparse Bartels-Golub Method

Rather than completely refactoring the basis, Reid suggested applying LU-decomposition only to the part of the matrix that remains upper-Hessenberg.

Reid's Method

Task: find a way to reduce the bump before attempting to decompose it.

Row singleton: any row of the bump that only has one non-zero entry.

Column singleton: any column of the bump that only has one non-zero entry.
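Given these definitions, spotting singletons in the bump's nonzero pattern is just a count of entries per row and per column. A sketch on a 0/1 pattern array (illustrative NumPy code; the function name and example bump are mine):

```python
import numpy as np

def find_singletons(pattern):
    """Return (row_singletons, col_singletons) for a 0/1 nonzero pattern.

    A row/column singleton is a row/column of the bump with exactly
    one non-zero entry.
    """
    pattern = np.asarray(pattern)
    rows = np.flatnonzero(pattern.sum(axis=1) == 1)
    cols = np.flatnonzero(pattern.sum(axis=0) == 1)
    return rows.tolist(), cols.tolist()

# A small made-up bump pattern: row 1 and column 2 are singletons.
bump = [[1, 1, 0],
        [0, 1, 0],
        [1, 1, 1]]
```

In a sparse data structure these counts would be maintained incrementally rather than recomputed.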

Method:

I When a column singleton is found in the bump, it is moved to the top-left corner of the bump.

I When a row singleton is found in the bump, it is moved to the bottom-right corner of the bump.

[Figures: Reid's Method column rotation; Reid's Method row rotation]

Reid's Method Characteristics

Advantages:

I It significantly reduces the growth of the number of eta matrices in the Sparse Bartels-Golub Method.

I So, the basis should not need to be decomposed nearly as often.

I The use of LU-decomposition on any remaining bump still allows some attempt to maintain stability.

Disadvantages:

I The rotations make absolutely no allowance for stability whatsoever,

I So, Reid's Method remains numerically less stable than the Sparse Bartels-Golub Method.

The Forrest-Tomlin Method

Bartels-Golub Method (one simplex iteration):

  d = U^-1 L^-1 b
  c̄' = c'_N − c'_B U^-1 L^-1 N, choose q with c̄_q = min_t(c̄_t)
  If c̄_q ≥ 0, d is an optimal solution.
  w = U^-1 L^-1 A_q
  p = argmin_i { d_i / w_i : w_i > 0 }
  If w_i ≤ 0 for all i, the solution is unbounded.
  Update U^-1 and L^-1.

Forrest-Tomlin Method (one simplex iteration):

  d = U^-1 ∏_t R_t^-1 L^-1 b
  c̄' = c'_N − c'_B U^-1 ∏_t R_t^-1 L^-1 N, choose q with c̄_q = min_t(c̄_t)
  If c̄_q ≥ 0, d is an optimal solution.
  w = U^-1 ∏_t R_t^-1 L^-1 A_q
  p = argmin_i { d_i / w_i : w_i > 0 }
  If w_i ≤ 0 for all i, the solution is unbounded.
  Update U^-1, creating a row factor as necessary. If there are too many factors, completely refactor the basis.

The Forrest-Tomlin Method Characteristics

Advantages:

I At most one row-eta matrix factor will occur for each iteration where an unpredictable number occurred before.

I The code can take advantage of such knowledge for predicting necessary storage space and calculations.

I Fill-in should also be relatively slow, since fill-in can only occur within the spiked column.

Disadvantages:

I The Sparse Bartels-Golub Method allowed LU-decomposition to pivot for numerical stability, but the Forrest-Tomlin Method makes no such allowances.

I Therefore, severe calculation errors due to near-singular matrices are more likely to occur.

Suhl-Suhl Method

This method is a modification of the Forrest-Tomlin Method.

For More Detail

Leena M. Suhl and Uwe H. Suhl, "A fast LU update for linear programming," Annals of Operations Research 43 (1993) 33-47.

Stiven S. Morgan, "A Comparison of Simplex Method Algorithms," University of Florida, 1997.

Vašek Chvátal, "Linear Programming," W.H. Freeman & Company, September 1983.

Thanks