Sparse Linear Algebra: LU Factorization
Kristin Davies, Peter He, Feng Xie, Hamid Ghaffari
April 4, 2007

Outline
Introduction to LU Factorization (Kristin)
LU Transformation Algorithms (Kristin)
LU and Sparsity (Peter)
Simplex Method (Feng)
LU Update (Hamid)

Introduction – Transformations – Sparsity – Simplex – Implementation
Introduction – What is LU Factorization?
Matrix decomposition into the product of a lower and upper triangular matrix: A = LU
Example for a 3x3 matrix:
[ a11 a12 a13 ]   [ l11  0   0  ] [ u11 u12 u13 ]
[ a21 a22 a23 ] = [ l21 l22  0  ] [  0  u22 u23 ]
[ a31 a32 a33 ]   [ l31 l32 l33 ] [  0   0  u33 ]
Introduction – LU Existence
LU factorization can be completed on an invertible matrix if and only if all of its leading principal minors are non-zero.

Recall: A is invertible if there exists B such that AB = BA = I.
Recall: for a 3x3 matrix, the leading principal minors are

det(a11),   det[ a11 a12 ; a21 a22 ],   det(A)
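The existence condition can be checked directly on a small matrix. A minimal Python sketch (the `det` and `leading_principal_minors` helpers are illustrative, not from the slides; the 3x3 matrix is the one used in the Doolittle example later in this deck):

```python
def det(M):
    """Determinant by Laplace expansion along the first row
    (fine for the tiny matrices used in these slides)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def leading_principal_minors(A):
    """det of the top-left k-by-k submatrix, for k = 1..n."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]

A = [[2, -1, -2], [-4, 6, 3], [-4, -2, 8]]
# all leading principal minors are non-zero, so A admits an
# LU factorization without pivoting
```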
Introduction – LU Unique Existence
Imposing the requirement that the diagonal of either L or U consist of ones results in a unique LU factorization:

[ a11 a12 a13 ]   [ 1   0   0 ] [ u11 u12 u13 ]   [ l11  0   0  ] [ 1 u12 u13 ]
[ a21 a22 a23 ] = [ l21 1   0 ] [ 0   u22 u23 ] = [ l21 l22  0  ] [ 0  1  u23 ]
[ a31 a32 a33 ]   [ l31 l32 1 ] [ 0   0   u33 ]   [ l31 l32 l33 ] [ 0  0   1 ]
Introduction – Why LU Factorization?
LU factorization is useful in numerical analysis for: – Solving systems of linear equations (AX= B) – Computing the inverse of a matrix
LU factorization is advantageous when the same set of equations must be solved for many different values of B: A is factored once, and each new B costs only two triangular solves.
Transformation Algorithms
Modified form of Gaussian elimination
Doolittle factorization – L has 1's on its diagonal
Crout factorization – U has 1's on its diagonal
Cholesky factorization – U = L^T (equivalently L = U^T); applies to symmetric positive definite matrices
Solution to AX = B is found as follows:
– Construct the matrices L and U (if possible)
– Solve LY = B for Y using forward substitution
– Solve UX = Y for X using back substitution
Transformations – Doolittle
Doolittle factorization – L has 1’s on its diagonal General algorithm – determine rows of U from top to bottom; determine columns of L from left to right
for i = 1:n
    for j = i:n
        solve ∑_k L(i,k) U(k,j) = A(i,j) for U(i,j)   → gives row i of U
    end
    for j = i+1:n
        solve ∑_k L(j,k) U(k,i) = A(j,i) for L(j,i)   → gives column i of L
    end
end

Introduction – Transformations – Sparsity – Simplex – Implementation
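The loop above translates almost directly into code. A minimal Python sketch of Doolittle factorization (no pivoting, so it assumes all leading principal minors are non-zero; the function name is illustrative):

```python
def doolittle(A):
    """Doolittle LU: L has a unit diagonal. Row i of U and column i
    of L are filled in at stage i, exactly as in the slide's loop."""
    n = len(A)
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):       # row i of U
            U[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
        for j in range(i + 1, n):   # column i of L
            L[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(i))) / U[i][i]
    return L, U
```

On the deck's example matrix this reproduces the L and U worked out on the next slides.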
Transformations – Doolittle example
For

[ 1   0   0 ] [ u11 u12 u13 ]   [  2 −1 −2 ]
[ l21 1   0 ] [  0  u22 u23 ] = [ −4  6  3 ]
[ l31 l32 1 ] [  0   0  u33 ]   [ −4 −2  8 ]

row 1 of U: (1 0 0) · (u11, 0, 0)^T = 2  ⇒  u11 = 2.
Similarly, u12 = −1 and u13 = −2.
Column 1 of L: (l21 1 0) · (2, 0, 0)^T = −4  ⇒  2 l21 = −4  ⇒  l21 = −2.
Similarly, l31 = −2.
Continuing in order (row 2 of U, column 2 of L, row 3 of U) fills in the remaining entries:

[ 1   0  0 ] [ 2 −1 −2 ]   [  2 −1 −2 ]
[ −2  1  0 ] [ 0  4 −1 ] = [ −4  6  3 ]
[ −2 −1  1 ] [ 0  0  3 ]   [ −4 −2  8 ]
Transformations – Doolittle example

Execute the algorithm for our example (n = 3):

i = 1:  j = 1, 2, 3 → row 1 of U;   then j = 2, 3 → column 1 of L
i = 2:  j = 2, 3 → row 2 of U;      then j = 3 → column 2 of L
i = 3:  j = 3 → row 3 of U

Introduction – Transformations – Sparsity – Simplex – Implementation
Transformations – Crout
Crout factorization – U has 1's on its diagonal
General algorithm – determine columns of L from left to right, then rows of U from top to bottom (the mirror image of Doolittle)
for i = 1:n
    for j = i:n
        solve ∑_k L(j,k) U(k,i) = A(j,i) for L(j,i)   → gives column i of L
    end
    for j = i+1:n
        solve ∑_k L(i,k) U(k,j) = A(i,j) for U(i,j)   → gives row i of U
    end
end

Introduction – Transformations – Sparsity – Simplex – Implementation
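For comparison with the Doolittle sketch earlier, here is the mirrored Crout loop in Python (again a sketch without pivoting; the function name is illustrative):

```python
def crout(A):
    """Crout LU: U has a unit diagonal. Column i of L, then row i
    of U, are filled in at stage i."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i, n):       # column i of L
            L[j][i] = A[j][i] - sum(L[j][k] * U[k][i] for k in range(i))
        for j in range(i + 1, n):   # row i of U
            U[i][j] = (A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))) / L[i][i]
    return L, U
```

On the deck's example this yields l11 = 2, l21 = −4, l31 = −4, matching the Crout example slide.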
Transformations – Crout example
For

[ l11  0   0  ] [ 1 u12 u13 ]   [  2 −1 −2 ]
[ l21 l22  0  ] [ 0  1  u23 ] = [ −4  6  3 ]
[ l31 l32 l33 ] [ 0  0   1  ]   [ −4 −2  8 ]

column 1 of L: (l11 0 0) · (1, 0, 0)^T = 2  ⇒  l11 = 2.
Similarly, l21 = −4 and l31 = −4.

Introduction – Transformations – Sparsity – Simplex – Implementation
Transformations – Solution
Once the L and U matrices have been found, we can easily solve our system AX = B:

A X = B  ⇒  L (U X) = B  ⇒  solve L Y = B, then U X = Y.

Forward substitution (L Y = B):
y_1 = b_1 / l_11
y_i = ( b_i − ∑_{j=1}^{i−1} l_ij y_j ) / l_ii,   i = 2, ..., n

Backward substitution (U X = Y):
x_n = y_n / u_nn
x_i = ( y_i − ∑_{j=i+1}^{n} u_ij x_j ) / u_ii,   i = n−1, ..., 1

Introduction – Transformations – Sparsity – Simplex – Implementation
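The two substitution formulas can be sketched in a few lines of Python, here applied to the L and U from the Doolittle example with a right-hand side chosen so the solution is (1, 1, 1) (the variable names are illustrative):

```python
def forward_sub(L, b):
    """Solve L y = b for lower triangular L."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][j] * y[j] for j in range(i))) / L[i][i]
    return y

def back_sub(U, y):
    """Solve U x = y for upper triangular U."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

L = [[1, 0, 0], [-2, 1, 0], [-2, -1, 1]]
U = [[2, -1, -2], [0, 4, -1], [0, 0, 3]]
b = [-1, 5, 2]          # = A · (1, 1, 1)^T for the example matrix
x = back_sub(U, forward_sub(L, b))
```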
References - Intro & Transformations
– Module for Forward and Backward Substitution. http://math.fullerton.edu/mathews/n2003/BackSubstitutionMod.hml
– Forward and Backward Substitution. www.ac3.edu/papers/Khoury-thesis/node13.html
– LU Decomposition. http://en.wikipedia.org/wiki/LU_decomposition
– Crout Matrix Decomposition. http://en.wikipedia.org/wiki/Crout_matrix_decomposition
– Doolittle Decomposition of a Matrix. www.engr.colostate.edu/~thompson/hPage/CourseMat/Tutorials/CompMethods/doolittle.pdf

Introduction – Transformations – Sparsity – Simplex – Implementation
Definition and Storage of Sparse Matrix
sparse – many elements are zero; for example diag(d1, ..., dn) with n >> 0
dense – few elements are zero

To reduce the memory burden, specialized matrix storage formats are introduced.

Introduction – Transformations – Sparsity – Simplex – Implementation
Definition and Storage of Sparse Matrix
Storage as a list of (row, column, value) triplets – only the nonzero entries are kept:

row:    1 1 2 2 4 4
column: 3 5 3 4 2 3
value:  3 4 5 7 2 6

Introduction – Transformations – Sparsity – Simplex – Implementation
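The triplet list above can be held as three parallel arrays; a minimal Python sketch (the `entry` helper is illustrative, not part of the slides):

```python
# Triplet (coordinate) storage of the slide's example: only the
# six nonzeros are kept, as parallel (row, col, value) lists.
rows = [1, 1, 2, 2, 4, 4]
cols = [3, 5, 3, 4, 2, 3]
vals = [3, 4, 5, 7, 2, 6]

def entry(i, j):
    """Look up a(i, j); coordinates absent from the lists are
    implicit zeros."""
    for r, c, v in zip(rows, cols, vals):
        if r == i and c == j:
            return v
    return 0
```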
Definition and Storage of Sparse Matrix
– A standard sparse matrix format: compressed sparse column. For

  [ 1 -3  0 -1  0 ]
  [ 0  0 -2  0  3 ]
  [ 2  0  0  0  0 ]
  [ 0  4  0 -4  0 ]
  [ 5  0 -5  0  6 ]

the (1-based) arrays are

subscripts: 1 2 3 4 5 6 7 8 9 10 11
colptr:     1 4 6 8 10 12
rowind:     1 3 5 1 4 2 5 1 4 2 5
value:      1 2 5 -3 4 -2 -5 -1 -4 3 6

Column j occupies positions colptr(j) through colptr(j+1) − 1 of rowind and value.

Introduction – Transformations – Sparsity – Simplex – Implementation
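A matrix-vector product over this format touches only the stored nonzeros, one column at a time. A minimal Python sketch using the slide's arrays (1-based, as on the slide, converted to 0-based inside the loop; the function name is illustrative):

```python
# Compressed-column data copied from the slide (1-based indices)
colptr = [1, 4, 6, 8, 10, 12]
rowind = [1, 3, 5, 1, 4, 2, 5, 1, 4, 2, 5]
value  = [1, 2, 5, -3, 4, -2, -5, -1, -4, 3, 6]

def csc_matvec(colptr, rowind, value, x):
    """y = A x for a square matrix stored column-compressed:
    column j lives in positions colptr[j]..colptr[j+1]-1."""
    n = len(colptr) - 1
    y = [0.0] * n
    for j in range(n):
        for p in range(colptr[j] - 1, colptr[j + 1] - 1):
            y[rowind[p] - 1] += value[p] * x[j]
    return y
```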
Structure Decomposition of Sparse Matrix (square)
A ↔ G(A) ↔ adj(G), where G(A) is the directed graph of A and adj(G) ∈ {0,1}^{n×n} is its adjacency matrix.

If P is a permutation matrix, then G(P^T A P) = G(A) up to relabelling of vertices, so the structural optimization can be carried out on the adjacency matrix:

opt{ P^T A P : G(P^T A P) = G(A) }  ⇔  opt{ P^T adj(G) P : G(P^T adj(G) P) = G(A) }

Introduction – Transformations – Sparsity – Simplex – Implementation
Structure Decomposition of Sparse Matrix (square)
1. There is no loop in G(A).
Step 1. Set t = 1.
Step 2. Find the output vertices S_t = {i_1^t, i_2^t, ..., i_{n_t}^t} ⊆ V(G).
Step 3. Delete the rows and columns indexed by S_t from adj(G).

Introduction – Transformations – Sparsity – Simplex – Implementation
Structure Decomposition of Sparse Matrix (square)
Then t := t + 1, returning to Step 2. After p passes, {1, 2, ..., n} = S_1 ∪ S_2 ∪ ... ∪ S_p (a partition of the index set). There exists P^T = (P_{s1}^T, P_{s2}^T, ..., P_{sp}^T) such that P^T adj(G) P is a lower triangular block matrix. Therefore P^T A P is a lower triangular block matrix.

2. There is a loop in G(A). If the graph is not strongly connected, the reachability matrix R of adj(A) contains zero entries.
Step 1. Choose the j-th column, set t = 1, and let
S_t(j) = { i : (R ∩ R^T)_{ij} = 1, 1 ≤ i ≤ n }

Introduction – Transformations – Sparsity – Simplex – Implementation
Structure Decomposition of Sparse Matrix (square)
     = {i_1^t, i_2^t, ..., i_{n_t}^t},  where j ∈ S_t ⊆ {1, 2, ..., n}; S_t is closed under strong connection.
Step 2. Choose another column j_1 ≠ j, set j := j_1 and t := t + 1, and return to Step 1. After p passes,
{1, 2, ..., n} = S_1 ∪ S_2 ∪ ... ∪ S_p

Introduction – Transformations – Sparsity – Simplex – Implementation
Structure Decomposition of Sparse Matrix (square)
There exists P = (P_{s1}^T, P_{s2}^T, ..., P_{sp}^T)^T such that P^T adj(G) P is a lower triangular block matrix. Therefore P^T A P is a lower triangular block matrix.

Note on case 2 above: the permuted adjacency matrix has the block form

                    [ Â11 Â12 ... Â1p ]
Â = P^T adj(G) P =  [ Â21 Â22 ... Â2p ]
                    [  .   .  ...  .  ]
                    [ Âp1  .  ... Âpp ]

Introduction – Transformations – Sparsity – Simplex – Implementation
Structure Decomposition of Sparse Matrix (square)
Define B = (bij) ∈ R^{p×p} by

bij = 0 if i = j,   bij = 1 if Âij ≠ 0,   bij = 0 if Âij = 0.

Introduction – Transformations – Sparsity – Simplex – Implementation
Structure Decomposition of Sparse Matrix (square)
There is no loop in G(B), so by the same analysis as case 1 there is a permutation P(j1 j2 ... jp) such that P(j1 j2 ... jp)^T B P(j1 j2 ... jp) is lower triangular. Therefore P^T A P is a lower triangular block matrix.

Introduction – Transformations – Sparsity – Simplex – Implementation
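The case-2 construction – grouping vertices that are mutually reachable and ordering the groups – is a strongly connected components computation. A minimal sketch using Kosaraju's algorithm (pure Python; the function name and example graph are illustrative, not from the slides):

```python
def strongly_connected_components(adj):
    """Kosaraju's algorithm. adj[v] lists the successors of vertex v.
    Returns a component label per vertex; ordering rows/columns of the
    matrix by component gives a block triangular form."""
    n = len(adj)
    order, seen = [], [False] * n
    def dfs1(v):                       # first pass: finishing order
        seen[v] = True
        for w in adj[v]:
            if not seen[w]:
                dfs1(w)
        order.append(v)
    for v in range(n):
        if not seen[v]:
            dfs1(v)
    radj = [[] for _ in range(n)]      # reversed graph
    for v in range(n):
        for w in adj[v]:
            radj[w].append(v)
    comp = [-1] * n
    def dfs2(v, c):                    # second pass: label components
        comp[v] = c
        for w in radj[v]:
            if comp[w] == -1:
                dfs2(w, c)
    c = 0
    for v in reversed(order):
        if comp[v] == -1:
            dfs2(v, c)
            c += 1
    return comp
```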
Structure Decomposition of Sparse Matrix (square)
This decomposition reduces the computational burden of solving large-scale linear systems. Generally speaking, it also makes the stability of a large-scale system easier to judge. The method is used in hierarchical optimization of large-scale dynamic systems.

Introduction – Transformations – Sparsity – Simplex – Implementation
Structure Decomposition of Sparse Matrix (General)
Dulmage-Mendelsohn decomposition: there exist permutations P, Q such that

          [ Ah  x  x  ]
P A Q^T = [     As x  ]
          [        Av ]

with Ah the horizontal (underdetermined) block, As square, and Av the vertical (overdetermined) block.

Introduction – Transformations – Sparsity – Simplex – Implementation
Structure Decomposition of Sparse Matrix (General)
Further, a fine decomposition is needed:
– Ah → block diagonal form
– Av → block diagonal form
– As → block upper triangular form
The D-M decomposition is described in reference [3]. Computation and storage (minimizing fill-in) are detailed in reference [2].

Introduction – Transformations – Sparsity – Simplex – Implementation
LU Factorization Method : Gilbert/Peierls
Left-looking: the k-th stage computes the k-th column of L and U.

1. L = I
2. U = I
3. for k = 1:n
4.     s = L \ A(:,k)
5.     (partial pivoting on s)
6.     U(1:k,k) = s(1:k)
7.     L(k:n,k) = s(k:n) / U(k,k)
8. end

Introduction – Transformations – Sparsity – Simplex – Implementation
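A dense Python analogue of this left-looking loop may help fix the structure (the sparse version replaces step 4 with a sparse triangular solve; the pivoting-by-row-swaps bookkeeping here is a sketch, not the Gilbert/Peierls implementation):

```python
def left_looking_lu(A):
    """Dense left-looking column LU with partial pivoting:
    stage k computes column k of L and U from A(:,k).
    Returns perm, L, U with (PA)[i][j] = A[perm[i]][j] = (LU)[i][j]."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for k in range(n):
        # s = L \ A(:,k), using the k columns of L computed so far
        s = [A[perm[i]][k] for i in range(n)]
        for j in range(k):
            for i in range(j + 1, n):
                s[i] -= L[i][j] * s[j]
        # partial pivoting on s(k:n): swap rows k and p everywhere
        p = max(range(k, n), key=lambda i: abs(s[i]))
        s[k], s[p] = s[p], s[k]
        perm[k], perm[p] = perm[p], perm[k]
        L[k], L[p] = L[p], L[k]
        # U(1:k,k) = s(1:k);  L(k:n,k) = s(k:n) / U(k,k)
        for i in range(k + 1):
            U[i][k] = s[i]
        L[k][k] = 1.0
        for i in range(k + 1, n):
            L[i][k] = s[i] / U[k][k]
    return perm, L, U
```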
LU Factorization Method : Gilbert/Peierls
THEOREM (Gilbert/Peierls). The entire algorithm for LU factorization of A with partial pivoting can be implemented to run in O(flops(LU) + m) time on a RAM, where m is the number of nonzero entries of A.
Note: the theorem says the LU factorization runs within a constant factor of the best possible time, but it does not say how large that constant is.

Introduction – Transformations – Sparsity – Simplex – Implementation
Sparse lower triangular solve, x=L\b
x = b;
for j = 1:n
    if (x(j) ≠ 0)
        x(j+1:n) = x(j+1:n) − L(j+1:n,j) * x(j);
    end
end

Total time: O(n + flops)

Introduction – Transformations – Sparsity – Simplex – Implementation
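The same loop in Python, kept faithful to the slide (unit diagonal assumed, so no division; the `x(j) != 0` test skips all work for columns whose solution entry is zero):

```python
def lower_solve_unit(L, b):
    """x = L \\ b for unit lower triangular L, mirroring the slide:
    columns with a zero solution entry are skipped entirely."""
    x = list(b)
    n = len(x)
    for j in range(n):
        if x[j] != 0:
            for i in range(j + 1, n):
                x[i] -= L[i][j] * x[j]
    return x
```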
Sparse lower triangular solve, x=L\b
x_j ≠ 0 ∧ l_ij ≠ 0 ⇒ x_i ≠ 0,  and  b_i ≠ 0 ⇒ x_i ≠ 0.

Let G(L) have an edge j → i whenever l_ij ≠ 0, and let
β = { i : b_i ≠ 0 },   χ = { i : x_i ≠ 0 }.
Then χ = Reach_{G(L)}(β).

Total time: O(flops)

Introduction – Transformations – Sparsity – Simplex – Implementation
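The reachable set χ can be computed symbolically, before any arithmetic, with a depth-first search from β. A minimal sketch (the column-pattern representation and function name are illustrative):

```python
def reach(L_cols, beta):
    """Nonzero pattern of x = L \\ b: the vertices reachable from beta
    in G(L), which has an edge j -> i whenever l_ij != 0 (i > j).
    L_cols[j] lists the rows of the below-diagonal nonzeros in
    column j of L."""
    seen = set()
    stack = list(beta)
    while stack:
        j = stack.pop()
        if j in seen:
            continue
        seen.add(j)
        stack.extend(L_cols[j])
    return seen
```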
References
[1] J. R. Gilbert and T. Peierls, Sparse partial pivoting in time proportional to arithmetic operations, SIAM J. Sci. Statist. Comput., 9 (1988), pp. 862-874.
[2] Mihalis Yannakakis' website: http://www1.cs.columbia.edu/~mihalis
[3] A. Pothen and C. Fan, Computing the block triangular form of a sparse matrix, ACM Trans. on Math. Soft., Vol. 16, No. 4, Dec. 1990, pp. 303-324.

Introduction – Transformations – Sparsity – Simplex – Implementation
Simplex Method – Problem size
min{ c^T x : Ax = b, x ≥ 0, A ∈ R^{m×n} }

Problem size is determined by A; on average there are 5-10 nonzeros per column (e.g. greenbea.mps from NETLIB).

Introduction – Transformations – Sparsity – Simplex – Implementation
Simplex Method – Computational Form, Basis
min{ c^T x : Ax = b, x ≥ 0, A ∈ R^{m×n} }
Note: A is of full row rank and m ≤ n.
Basis (of R^m): m linearly independent columns of A. The corresponding variables are the basic variables; for example, with x1 and x2 basic:

x1 x2 x3 x4
 1  0  1  2
 0  1  2  1

Simplex Method – Notation
β: index set of basic variables;  γ: index set of non-basic variables
B := A_β (basis),  R := A_γ (non-basis columns)
A = [B | R],  x = (x_β, x_γ),  c = (c_β, c_γ)

Simplex Method – Basic Feasible Solution
Ax = b  ⇒  B x_β + R x_γ = b  ⇒  x_β = B^{-1}(b − R x_γ)
Basic feasible solution: x_γ = 0, x ≥ 0, Ax = b.
If a problem has an optimal solution, then there is a basic solution which is also optimal.

Simplex Method – Checking Optimality
Objective value: c^T x = c_β^T x_β + c_γ^T x_γ
                       = c_β^T B^{-1} b + (c_γ^T − c_β^T B^{-1} R) x_γ
                       = c_0 + d^T x_γ
where c_0 = c_β^T B^{-1} b is constant and d holds the reduced costs.
Optimality condition: d_j ≥ 0 for all j ∈ γ.

Simplex Method – Improving a Basic Feasible Solution
Choose an incoming variable x_q with d_q < 0 and increase x_q as much as possible while the basic variables remain feasible (≥ 0).
– If the basic variables remain feasible even as x_q → ∞, the objective value is unbounded.
– Otherwise some basic variable x_p goes to 0 first; x_p is the outgoing variable, and exchanging x_p for x_q gives a neighboring, improving basis.

Simplex Method – Basis Updating
Neighboring bases: B = [b_1, ..., b_p, ..., b_m] and B̄ = [b_1, ..., a, ..., b_m], with the entering column a replacing b_p. Write a as a linear combination of the current basis:

a = ∑_{i=1}^{m} v_i b_i = B v,   where v = B^{-1} a

Provided the pivot element v_p ≠ 0,

b_p = (1/v_p) a − ∑_{i≠p} (v_i/v_p) b_i = B̄ η,
η = ( −v_1/v_p, ..., −v_{p−1}/v_p, 1/v_p, −v_{p+1}/v_p, ..., −v_m/v_p )^T

Hence B = B̄ E with E = [e_1, ..., e_{p−1}, η, e_{p+1}, ..., e_m] (the elementary transformation matrix, ETM), and therefore B̄^{-1} = E B^{-1}.

Simplex Method – Applying the ETM
E is the identity matrix with column p replaced by η, so for any vector w

(E w)_i = w_i + η_i w_p  for i ≠ p,   (E w)_p = η_p w_p,

and E w = w whenever w_p = 0. Even so, the basis inverse tends to get denser after each update B̄^{-1} = E B^{-1}.

Simplex Method – Algorithm Steps (major operations in parentheses)
1. Find an initial feasible basis B
2. Initialization (B^{-1} b)
3. Check optimality (c_β^T B^{-1})
4. Choose incoming variable x_q (B^{-1} a_q)
5. Choose outgoing variable x_p (pivot step)
6. Update basis (E B^{-1})

Simplex Method – Choice of Pivot
Numerical considerations: a pivot should produce few fill-ins and be a large element. These goals sometimes conflict; in practice, a compromise is made.
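The basis update via the elementary transformation matrix can be sketched in a few lines: build η from v = B^{-1} a, then apply E to any vector in O(m) (the function names are illustrative):

```python
def make_eta(v, p):
    """Column p of E, built from v = B^-1 a with pivot v[p] != 0."""
    eta = [-vi / v[p] for vi in v]
    eta[p] = 1.0 / v[p]
    return eta

def apply_eta(eta, p, w):
    """(Ew)_i = w_i + eta_i * w_p for i != p, (Ew)_p = eta_p * w_p,
    so Ew = w whenever w_p = 0."""
    wp = w[p]
    out = [wi + ei * wp for wi, ei in zip(w, eta)]
    out[p] = eta[p] * wp
    return out
```

By construction, applying E to v itself yields e_p, which is exactly what the update requires of the new basis inverse.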
Simplex Method – Typical Operations
Typical operations: B^{-1} w and w^T B^{-1}.
Challenge: the sparsity of B^{-1} can be destroyed by basis updates, so a proper way to represent B^{-1} is needed. Two ways:
– Product form of the inverse, B^{-1} = E_k E_{k−1} ⋯ E_1 (obsolete)
– LU factorization

Simplex Method – LU Factorization
Reduce complexity using LU updates (B = B̄ E, B̄^{-1} = E B^{-1}).
Side effect: more and more LU factors accumulate. Refactorization reinstates efficiency and numerical accuracy.

Sparse LU Updates in Simplex Method
Hamid R. Ghaffari
April 10, 2007

Outline
– LU Update Methods: Preliminaries; Bartels-Golub
– LU Update: Sparse Bartels-Golub Method; Reid's Method; The Forrest-Tomlin Method; Suhl-Suhl Method
– More Details on the Topic

Revised Simplex Algorithm

Simplex Method                                    Revised Simplex Method
Determine the current basis, d                    d = B^{-1} b
Choose x_q to enter the basis based on            c̄ = c_N − c_B B^{-1} N,
the greatest cost contribution                    q such that c̄_q = min_t c̄_t
If x_q cannot decrease the cost,                  If c̄_q ≥ 0,
d is the optimal solution                         d is the optimal solution
Determine the x_p that leaves the basis           w = B^{-1} A_q,
(becomes zero) as x_q increases                   p = argmin_t { d_t / w_t : w_t > 0 }
If x_q can increase without causing another       If w_t ≤ 0 for all t,
variable to leave the basis, unbounded            the solution is unbounded
Update the dictionary                             Update B^{-1}

Note: in general we do not compute the inverse explicitly.

Problems with the Revised Simplex Algorithm
– The physical limitations of a computer can become a factor.
– Round-off error and significant-digit loss are common problems in matrix manipulations (ill-conditioned matrices); numerical stability becomes a task in itself.
– It takes m^2(m − 1) multiplications and m(m − 1) additions, a total of m^3 − m floating-point calculations. Many variants of the Revised Simplex Method have been designed to reduce this O(m^3)-time algorithm as well as to improve its accuracy.

Introducing the Spike
– If A_q is the entering column, B the original basis, and B̄ the new basis (A_q replacing column p), then
  B̄ = B + (A_q − B e_p) e_p^T
– Having the LU decomposition B = LU, we have
  L^{-1} B̄ = U + (L^{-1} A_q − U e_p) e_p^T,
  i.e. U with column p replaced by the "spike" L^{-1} A_q.
– How to deal with this? The various implementations and variations of Bartels-Golub generally diverge at the next step: reduction of the spiked upper triangular matrix back to an upper triangular matrix. (Chvátal, p. 150)
The various implementations and variations of the Bartels-Golub generally diverge with the next step: reduction of the spiked upper triangular matrix back to an upper-triangular matrix. (Chv´atal,p150) Introducing Spike I If Aq is the entering column, B the original basis and B¯ the new basis, then we have ¯ T B = B + (Aq − Bep)eq , I Having LU decomposition B = LU we have −1 ¯ −l T L B = U + (L Aq − Uep)eq , I How to deal with this? The various implementations and variations of the Bartels-Golub generally diverge with the next step: reduction of the spiked upper triangular matrix back to an upper-triangular matrix. (Chv´atal,p150) Introducing Spike I If Aq is the entering column, B the original basis and B¯ the new basis, then we have ¯ T B = B + (Aq − Bep)eq , I Having LU decomposition B = LU we have −1 ¯ −l T L B = U + (L Aq − Uep)eq , The various implementations and variations of the Bartels-Golub generally diverge with the next step: reduction of the spiked upper triangular matrix back to an upper-triangular matrix. (Chv´atal,p150) Introducing Spike I If Aq is the entering column, B the original basis and B¯ the new basis, then we have ¯ T B = B + (Aq − Bep)eq , I Having LU decomposition B = LU we have −1 ¯ −l T L B = U + (L Aq − Uep)eq , I How to deal with this? Introducing Spike I If Aq is the entering column, B the original basis and B¯ the new basis, then we have ¯ T B = B + (Aq − Bep)eq , I Having LU decomposition B = LU we have −1 ¯ −l T L B = U + (L Aq − Uep)eq , I How to deal with this? The various implementations and variations of the Bartels-Golub generally diverge with the next step: reduction of the spiked upper triangular matrix back to an upper-triangular matrix. (Chv´atal,p150) Bartels-Golub Method Illustration The first variant of the Revised Simplex Method was the Bartels-Golub Method. 
Bartels-Golub Method Algorithm Revised Simplex Method Bartels-Golub d = B−1b d = U−1L−1b 0 0 −1 0 0 −1 −1 c¯ = cN − cB B N, c¯ = cN − cB U L N, {q|c¯q = min(¯ct )} {q|c¯q = min(¯ct )} t t c¯q ≥ 0, d is optimal solution c¯q ≥ 0, d is optimal solution −1 −1 −1 w = B Aq, w = U L Aq, n o n o di dt di dt p = min , wt > 0 p = min , wt > 0 wi t wt wi t wt If wp ≤ 0 for all i, the solution is If wp ≤ 0 for all i, the solution is unbounded. unbounded. Update B−1 Update U−1 and L−1 In sparse case, yes. Bartels-Golub Method Characteristics I It significantly improved numerical accuracy. I Can we do better? Bartels-Golub Method Characteristics I It significantly improved numerical accuracy. I Can we do better? In sparse case, yes. Single-Entry-Eta Decomposition: 1 1 1 1 l 1 l 1 1 1 21 = 21 · · l31 1 1 l31 1 1 l41 1 1 1 l41 1 So L can be expressed as the multiplication of single-entry eta matrices, and hence, L−1 is also is the product of the same matrices with off-diagonal entries negated. Sparse Bartels-Golub Method eta matrices First take a look at the following facts: Column-Eta factorization of triangular matrices: 1 1 1 1 l 1 1 1 l 1 21 = · · 21 l31 l32 1 1 l32 1 l31 1 l41 l42 l43 1 l43 1 l42 1 l41 1 So L can be expressed as the multiplication of single-entry eta matrices, and hence, L−1 is also is the product of the same matrices with off-diagonal entries negated. 
Sparse Bartels-Golub Method eta matrices First take a look at the following facts: Column-Eta factorization of triangular matrices: 1 1 1 1 l 1 1 1 l 1 21 = · · 21 l31 l32 1 1 l32 1 l31 1 l41 l42 l43 1 l43 1 l42 1 l41 1 Single-Entry-Eta Decomposition: 1 1 1 1 l 1 l 1 1 1 21 = 21 · · l31 1 1 l31 1 1 l41 1 1 1 l41 1 Sparse Bartels-Golub Method eta matrices First take a look at the following facts: Column-Eta factorization of triangular matrices: 1 1 1 1 l 1 1 1 l 1 21 = · · 21 l31 l32 1 1 l32 1 l31 1 l41 l42 l43 1 l43 1 l42 1 l41 1 Single-Entry-Eta Decomposition: 1 1 1 1 l 1 l 1 1 1 21 = 21 · · l31 1 1 l31 1 1 l41 1 1 1 l41 1 So L can be expressed as the multiplication of single-entry eta matrices, and hence, L−1 is also is the product of the same matrices with off-diagonal entries negated. Sparse Bartels-Golub Method Algorithm Bartels-Golub Method Sparse Bartels-Golub Method −1 −1 −1 Q d = U L b d = U t ηt b 0 0 −1 −1 0 0 −1 Q c¯ = cN − cB U L N, c¯ = cN − cB U t ηt N, {q|c¯q = min(¯ct )} {q|c¯q = min(¯ct )} t t c¯q ≥ 0, d is optimal solution c¯q ≥ 0, d is optimal solution −1 −1 −1 Q w = U L Aq, w = U ηt Aq, n o n t o di dt di dt p = min , wt > 0 p = min , wt > 0 wi t wt wi t wt If wp ≤ 0 for all i, the solution is If wp ≤ 0 for all i, the solution is unbounded. unbounded. Update U−1 and L−1 Update U−1 and create any necessary eta matrices. If there are too many eta matrices, completely refactor the basis. Sparse Bartels-Golub Method Algorithm Bartels-Golub Method Sparse Bartels-Golub Method −1 −1 −1 Q d = U L b d = U t ηt b 0 0 −1 −1 0 0 −1 Q c¯ = cN − cB U L N, c¯ = cN − cB U t ηt N, {q|c¯q = min(¯ct )} {q|c¯q = min(¯ct )} t t c¯q ≥ 0, d is optimal solution c¯q ≥ 0, d is optimal solution −1 −1 −1 Q w = U L Aq, w = U ηt Aq, n o n t o di dt di dt p = min , wt > 0 p = min , wt > 0 wi t wt wi t wt If wp ≤ 0 for all i, the solution is If wp ≤ 0 for all i, the solution is unbounded. unbounded. Update U−1 and L−1 Update U−1 and create any necessary eta matrices. 
If there are too many eta matrices, completely refactor the basis. Sparse Bartels-Golub Method Algorithm Bartels-Golub Method Sparse Bartels-Golub Method −1 −1 −1 Q d = U L b d = U t ηt b 0 0 −1 −1 0 0 −1 Q c¯ = cN − cB U L N, c¯ = cN − cB U t ηt N, {q|c¯q = min(¯ct )} {q|c¯q = min(¯ct )} t t c¯q ≥ 0, d is optimal solution c¯q ≥ 0, d is optimal solution −1 −1 −1 Q w = U L Aq, w = U ηt Aq, n o n t o di dt di dt p = min , wt > 0 p = min , wt > 0 wi t wt wi t wt If wp ≤ 0 for all i, the solution is If wp ≤ 0 for all i, the solution is unbounded. unbounded. Update U−1 and L−1 Update U−1 and create any necessary eta matrices. If there are too many eta matrices, completely refactor the basis. Sparse Bartels-Golub Method Algorithm Bartels-Golub Method Sparse Bartels-Golub Method −1 −1 −1 Q d = U L b d = U t ηt b 0 0 −1 −1 0 0 −1 Q c¯ = cN − cB U L N, c¯ = cN − cB U t ηt N, {q|c¯q = min(¯ct )} {q|c¯q = min(¯ct )} t t c¯q ≥ 0, d is optimal solution c¯q ≥ 0, d is optimal solution −1 −1 −1 Q w = U L Aq, w = U ηt Aq, n o n t o di dt di dt p = min , wt > 0 p = min , wt > 0 wi t wt wi t wt If wp ≤ 0 for all i, the solution is If wp ≤ 0 for all i, the solution is unbounded. unbounded. Update U−1 and L−1 Update U−1 and create any necessary eta matrices. If there are too many eta matrices, completely refactor the basis. I Instead of just L and U, the factors become the lower-triangular eta matrices and U. I The eta matrices were reduced to single-entry eta matrices. I Instead of having to store the entire matrix, it is only necessary to store the location and value of the off-diagonal element for each matrix. I Refactorizations occur less than once every m times, so the complexity improves significantly to O(m2). Sparse Bartels-Golub Method Advantages I It is no more complex than the Bartels-Golub Method. I The eta matrices were reduced to single-entry eta matrices. 
I Instead of having to store the entire matrix, it is only necessary to store the location and value of the off-diagonal element for each matrix. I Refactorizations occur less than once every m times, so the complexity improves significantly to O(m2). Sparse Bartels-Golub Method Advantages I It is no more complex than the Bartels-Golub Method. I Instead of just L and U, the factors become the lower-triangular eta matrices and U. I Instead of having to store the entire matrix, it is only necessary to store the location and value of the off-diagonal element for each matrix. I Refactorizations occur less than once every m times, so the complexity improves significantly to O(m2). Sparse Bartels-Golub Method Advantages I It is no more complex than the Bartels-Golub Method. I Instead of just L and U, the factors become the lower-triangular eta matrices and U. I The eta matrices were reduced to single-entry eta matrices. I Refactorizations occur less than once every m times, so the complexity improves significantly to O(m2). Sparse Bartels-Golub Method Advantages I It is no more complex than the Bartels-Golub Method. I Instead of just L and U, the factors become the lower-triangular eta matrices and U. I The eta matrices were reduced to single-entry eta matrices. I Instead of having to store the entire matrix, it is only necessary to store the location and value of the off-diagonal element for each matrix. Sparse Bartels-Golub Method Advantages I It is no more complex than the Bartels-Golub Method. I Instead of just L and U, the factors become the lower-triangular eta matrices and U. I The eta matrices were reduced to single-entry eta matrices. I Instead of having to store the entire matrix, it is only necessary to store the location and value of the off-diagonal element for each matrix. I Refactorizations occur less than once every m times, so the complexity improves significantly to O(m2). 
Sparse Bartels-Golub Method Disadvantages
- Eventually, the number of eta matrices will become so large that it becomes cheaper to decompose the basis.
- The upper-triangular matrix will always be fully decomposed, resulting in huge amounts of fill-in;
- Large numbers of eta matrices;
- O(n^3)-cost decomposition;
- Such a refactorization may occur prematurely, in an attempt to promote stability, if noticeable round-off errors begin to occur.
- In practice, in solving large sparse problems, the basis is refactorized quite frequently, often after every twenty iterations or so. (Chvátal, p. 111)
- If the spike always occurs in the first column and extends to the bottom row, the Sparse Bartels-Golub Method becomes worse than the Bartels-Golub Method.

Reid's Suggestion on the Sparse Bartels-Golub Method
Rather than completely refactorizing the basis, apply LU-decomposition only to the part that remains upper-Hessenberg.

Reid's Method
Task: find a way to reduce the bump before attempting to decompose it.
- Row singleton: any row of the bump that has only one non-zero entry.
- Column singleton: any column of the bump that has only one non-zero entry.
Method:
- When a column singleton is found in a bump, it is moved to the top left corner of the bump.
- When a row singleton is found in a bump, it is moved to the bottom right corner of the bump.
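The singleton-peeling idea can be sketched as follows. This is our own simplified illustration, not Reid's actual implementation: the bump is assumed to be given as a set of (row, col) positions of non-zero entries, and rotating a singleton to a corner is modelled simply as removing its row and column from the bump.

```python
# Sketch of Reid-style singleton peeling (hypothetical names and
# representation). Repeatedly find a row or column of the bump with exactly
# one non-zero entry and peel it off; whatever remains is the residual bump
# that still needs LU-decomposition.

from collections import Counter

def peel_singletons(nonzeros):
    bump = set(nonzeros)
    peeled = []
    while True:
        rows = Counter(i for i, _ in bump)    # non-zeros per row
        cols = Counter(j for _, j in bump)    # non-zeros per column
        singleton = next(
            ((i, j) for (i, j) in bump if rows[i] == 1 or cols[j] == 1), None
        )
        if singleton is None:
            # No row or column singleton left: the residual bump must be
            # decomposed.
            return peeled, bump
        i, j = singleton
        peeled.append(singleton)
        # Rotating the singleton to a corner makes it a pivot, so its row
        # and column leave the bump.
        bump = {(r, c) for (r, c) in bump if r != i and c != j}
```

For a bidiagonal (triangular) bump the whole pattern peels away and no decomposition is needed, while a fully dense bump has no singletons and is returned unchanged.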
Reid's Method – Column Rotation (figure)
Reid's Method – Row Rotation (figure)

Reid's Method Characteristics
Advantages:
- It significantly reduces the growth of the number of eta matrices in the Sparse Bartels-Golub Method,
- so the basis should not need to be decomposed nearly as often.
- The use of LU-decomposition on any remaining bump still allows some attempt to maintain stability.
Disadvantages:
- The rotations make no allowance for stability whatsoever,
- so Reid's Method remains numerically less stable than the Sparse Bartels-Golub Method.

The Forrest-Tomlin Method
Bartels-Golub Method:
- d = U^-1 L^-1 b
- c̄ = c_N' − c_B' U^-1 L^-1 N; choose q such that c̄_q = min_t c̄_t
- If c̄_q ≥ 0, d is the optimal solution.
- w = U^-1 L^-1 A_q
- p = argmin_i { d_i / w_i : w_i > 0 }
- If w_i ≤ 0 for all i, the solution is unbounded.
- Update U^-1 and L^-1.
Forrest-Tomlin Method:
- d = U^-1 (prod_t R_t^-1) L^-1 b
- c̄ = c_N' − c_B' U^-1 (prod_t R_t^-1) L^-1 N; choose q such that c̄_q = min_t c̄_t
- If c̄_q ≥ 0, d is the optimal solution.
- w = U^-1 (prod_t R_t^-1) L^-1 A_q
- p = argmin_i { d_i / w_i : w_i > 0 }
- If w_i ≤ 0 for all i, the solution is unbounded.
- Update U^-1, creating a row factor as necessary. If there are too many factors, completely refactorize the basis.

The Forrest-Tomlin Method Characteristics
Advantages:
- At most one row-eta matrix factor will occur for each iteration, where an unpredictable number occurred before.
- The code can take advantage of such knowledge for predicting necessary storage space and calculations.
- Fill-in should also be relatively slow, since fill-in can only occur within the spiked column.
Disadvantages:
- The Sparse Bartels-Golub Method allowed LU-decomposition to pivot for numerical stability, but the Forrest-Tomlin Method makes no such allowances.
- Therefore, severe calculation errors due to near-singular matrices are more likely to occur.

Suhl-Suhl Method
This method is a modification of the Forrest-Tomlin Method.

For More Detail
Leena M. Suhl, Uwe H. Suhl, A fast LU update for linear programming.
Annals of Operations Research 43 (1993) 33-47.
Steven S. Morgan, A Comparison of Simplex Method Algorithms, University of Florida, 1997.
Vašek Chvátal, Linear Programming, W.H. Freeman & Company (September 1983).

Thanks