Lecture 9: Linear Programming: Revised Simplex and Interior Point Methods (a nice application of what we have learned so far)
Prof. Krishna R. Pattipati Dept. of Electrical and Computer Engineering University of Connecticut Contact: [email protected] (860) 486-2890
ECE 6435 Fall 2008 Adv Numerical Methods in Sci Comp October 22, 2008
Copyright ©2004 by K. Pattipati

Lecture Outline
What is Linear Programming (LP)? Why do we need to solve Linear-Programming problems?
• L1 and L∞ curve fitting (i.e., parameter estimation using 1- and ∞-norms)
• Sample LP applications
  • Transportation problems, shortest path problems, optimal control, diet problem
Methods for solving LP problems
• Revised Simplex method
• Ellipsoid method ... not practical
• Karmarkar's projective scaling (interior point method)
Implementation issues of the least-squares subproblem of Karmarkar's method ... more in the Linear Programming and Network Flows course
Comparison of Simplex and projective methods
References
1. Dimitris Bertsimas and John N. Tsitsiklis, Introduction to Linear Optimization, Athena Scientific, Belmont, MA, 1997.
2. I. Adler, M. G. C. Resende, G. Veiga, and N. Karmarkar, "An Implementation of Karmarkar's Algorithm for Linear Programming," Mathematical Programming, Vol. 44, 1989, pp. 297-335.
3. I. Adler, N. Karmarkar, M. G. C. Resende, and G. Veiga, "Data Structures and Programming Techniques for the Implementation of Karmarkar's Algorithm," ORSA Journal on Computing, Vol. 1, No. 2, 1989.
What is Linear Programming?

One of the most celebrated problems since 1951
• Major breakthroughs:
  • Dantzig: Simplex method (1947-1949)
  • Khachian: Ellipsoid method (1979) - polynomial complexity, but not competitive with the Simplex → not practical
  • Karmarkar: Projective interior point algorithm (1984) - polynomial complexity and competitive (especially for large problems)
LP Problem Definition
Given: an m x n matrix A, with m ≤ n (A Є R^{m x n}); assume rank(A) = m
       a column vector b with m components: b Є R^m
       a vector c with n components: c Є R^n (c^T is a row vector)
Since m ≤ n, Ax = b has infinitely many solutions; Ax = b expresses b as a combination of the columns of A: b = Σ_{i=1}^n a_i x_i.
Consider the decomposition x = x_r + x_n, where x_r Є R(A^T) and x_n Є N(A) = {x_n : A x_n = 0}; then Ax = A x_r = b.
Standard Form of LP
We impose two restrictions on x :
We want nonnegative solutions of Ax = b: x_i ≥ 0 (or x ≥ 0).
Vectors x such that Ax = b and x ≥ 0 are said to be feasible.
Among all feasible {x}, we want x* such that
   c^T x = c_1 x_1 + c_2 x_2 + ... + c_n x_n
is a minimum.
• This leads to the so-called "standard form of LP":
          min c^T x
(SLP):    s.t. Ax = b
               x ≥ 0
This is a convex programming problem: if a bounded solution exists, then any local minimum is a global minimum, i.e., there is a single minimum value.
• Claim: Any LP problem can be converted into standard form.
Inequality constraints:
a) a_i^T x ≤ b_i  →  a_i^T x + x_{n+1} = b_i;  x_{n+1} ≥ 0;  x_{n+1} ~ slack variable
In general: Ax ≤ b  →  Ax + y = b  →  [A | I] [x; y] = b,  x ≥ 0, y ≥ 0
This increases the number of variables by m, and [A | I] is an m by (n + m) matrix.
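The slack-variable construction above can be sanity-checked mechanically. A minimal numpy sketch (the helper name and the example data are illustrative assumptions, not from the slides):

```python
import numpy as np

def to_standard_form(A, b, c):
    """Convert min c^T x s.t. Ax <= b, x >= 0 into standard form
    min c_s^T z s.t. [A | I] z = b, z >= 0, by appending one slack
    variable per inequality row."""
    m, n = A.shape
    A_s = np.hstack([A, np.eye(m)])          # [A | I]: m by (n + m)
    c_s = np.concatenate([c, np.zeros(m)])   # slack variables carry zero cost
    return A_s, b, c_s

# x1 + 2 x2 <= 4  becomes  x1 + 2 x2 + x3 = 4 with slack x3 >= 0
A = np.array([[1.0, 2.0]])
b = np.array([4.0])
c = np.array([-1.0, -1.0])
A_s, b_s, c_s = to_standard_form(A, b, c)
print(A_s)   # [[1. 2. 1.]]
```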
How to Convert Constraints into a SLP? - 1
Inequality Constraints
b) a_i^T x ≥ b_i  →  a_i^T x − x_{n+1} = b_i,  x_{n+1} ≥ 0;  x_{n+1} ~ surplus variable
   In matrix form: Ax ≥ b  →  [A | −I] [x; y] = b,  y ≥ 0
c) x_i ≥ d_i: define x̂_i = x_i − d_i; then x̂_i ≥ 0
d) x_i ≤ d_i: define x̂_i = d_i − x_i; then x̂_i ≥ 0
e) d_i1 ≤ x_i ≤ d_i2  ⟹  0 ≤ x_i − d_i1 ≤ d_i2 − d_i1:
   define x̂_i = x_i − d_i1 and x̂_i + y_i = d_i2 − d_i1;  y_i ≥ 0 ~ slack
f) b_i1 ≤ a_i^T x ≤ b_i2: use two slacks,
   a_i^T x − y_i1 = b_i1,  a_i^T x + y_i2 = b_i2;  y_i1, y_i2 ≥ 0
g) |a_i^T x| ≤ b_i  ⇔  −b_i ≤ a_i^T x ≤ b_i  →  a_i^T x + y_i1 = b_i,  a_i^T x − y_i2 = −b_i
How to Convert Constraints into a SLP? - 2
xi is a free variable
Define x_i = x̂_i − x̃_i with x̂_i ≥ 0 and x̃_i ≥ 0.
a) Maximization: change max c^T x to min (−c)^T x.
b) min Σ_{i=1}^n |x_i| s.t. Ax ≤ b:
   write x_i = x̂_i − x̃_i, so that |x_i| = x̂_i + x̃_i at the optimum; then
      min Σ_{i=1}^n (x̂_i + x̃_i)
      s.t. [A | −A | I] [x̂; x̃; y] = b;  x̂, x̃, y ≥ 0
   The optimal solution of this problem solves the original problem.

Example of LP Problems
• L1 – curve fitting
Recall that given a set of scalars b_1, b_2, ..., b_m, the estimate x that
minimizes Σ_{i=1}^m |x − b_i| is the median, and that this estimate is insensitive
to outliers in the data {b_i}.
Curve Fitting
In the vector case, we want x such that:
   min_x Σ_{i=1}^m |a_i^T x − b_i| = min_x ||Ax − b||₁
• L1 curve fitting → an LP:
   Write x = x̂ − x̃;  a_i^T x − b_i = u_i − v_i with u_i, v_i ≥ 0, so |a_i^T x − b_i| = u_i + v_i at the optimum.
   Then the LP problem is:
      min_{x̂, x̃, u, v} Σ_{i=1}^m (u_i + v_i) = min e^T (u + v)
      s.t. A (x̂ − x̃) − u + v = b
           x̂ ≥ 0;  x̃ ≥ 0;  u ≥ 0;  v ≥ 0
• L∞ curve fitting: we want x such that
   min_x max_{1≤i≤m} |a_i^T x − b_i| = min_x ||Ax − b||_∞
• L∞ curve fitting → an LP:
   Let max_{1≤i≤m} |a_i^T x − b_i| = w; then the problem is equivalent to:
      min_{x, w} w,  s.t. −w ≤ a_i^T x − b_i ≤ w for i = 1, 2, ..., m
   or, in matrix form:
      min w  s.t. [A  −e] [x; w] ≤ b,  [−A  −e] [x; w] ≤ −b
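The L∞ reformulation can be assembled mechanically; the result can then be handed to any LP solver. A minimal numpy sketch (the helper name and the data are illustrative assumptions; here we only build the constraint matrix and verify that a candidate optimum is feasible):

```python
import numpy as np

def linf_fit_lp(A, b):
    """Build the LP 'min w s.t. A x - w e <= b, -A x - w e <= -b'
    for the L_inf fitting problem min ||Ax - b||_inf."""
    m, n = A.shape
    e = np.ones((m, 1))
    A_ub = np.vstack([np.hstack([A, -e]),      #  a_i^T x - w <= b_i
                      np.hstack([-A, -e])])    # -a_i^T x - w <= -b_i
    b_ub = np.concatenate([b, -b])
    c = np.concatenate([np.zeros(n), [1.0]])   # objective: minimize w only
    return c, A_ub, b_ub

A = np.array([[1.0], [1.0], [1.0]])            # fit a constant to the data
b = np.array([0.0, 1.0, 4.0])
c, A_ub, b_ub = linf_fit_lp(A, b)
# For a constant fit, the L_inf-optimal x is the midrange (0 + 4)/2 = 2,
# with w = max deviation = 2; check that z = (x, w) = (2, 2) is feasible.
z = np.array([2.0, 2.0])
print(np.all(A_ub @ z <= b_ub + 1e-12))   # True
```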
Transportation Problem - 1
• Since the number of constraints is large (2m) and the number of variables (n + 1) is small, typically the dual problem, with (n + 1) constraints and 2m variables, is solved instead:
   max b^T (λ₂ − λ₁)   s.t. A^T (λ₁ − λ₂) = 0,  e^T (λ₁ + λ₂) = 1,  λ₁ ≥ 0, λ₂ ≥ 0
Transportation or Hitchcock problem (a special LP)
   m sources of a commodity or a product, and n destinations
   a_i = amount of commodity to be shipped from source i,  1 ≤ i ≤ m
   b_j = amount of commodity to be received at destination j (sink, terminal node),  1 ≤ j ≤ n
   c_ij = shipping cost from source i to destination j, in dollars per unit of commodity
− Problem: How much commodity should be shipped from source i to destination j to minimize the total cost of transportation?
Transportation Problem - 2

[Figure: bipartite graph with source nodes i = 1, ..., m on the left, destination nodes j = 1, ..., n on the right, and each edge (i, j) labeled (c_ij, x_ij)]

   min Σ_{i=1}^m Σ_{j=1}^n c_ij x_ij
   s.t. Σ_{j=1}^n x_ij = a_i,  i = 1, 2, ..., m
        Σ_{i=1}^m x_ij = b_j,  j = 1, 2, ..., n
Also (conservation constraint): Σ_{i=1}^m a_i = Σ_{j=1}^n b_j
mn variables and (m + n) constraints
m sources: supply nodes;  n terminal nodes: sinks, destination nodes
BIPARTITE GRAPHS: Special LP Problem
If all a_i = b_j = 1, this is the assignment problem or weighted bipartite matching problem.
Shortest Path Problem - 1
Shortest-path problem
• We formulate it as an LP for conceptual reasons only
[Figure: network with source s, intermediate nodes u and v, and terminal t; edge lengths: s-u = 2, s-v = 4, u-v = 1, u-t = 5, v-t = 3]
- s, u, v, t are computers; edge lengths are the costs of sending a message between the corresponding pairs of nodes.
- Q: What is the cheapest way to send a message from s to t?
Intuitively, x_sv = x_ut = 0, i.e., no messages are sent from s to v or from u to t.
Shortest path s-u-v-t: x_su = x_uv = x_vt = 1.  Shortest path length = 2 + 1 + 3 = 6.
Shortest Path Problem - 2
LP problem formulation
Let xij be the fraction of messages sent from i to j
   min 2 x_su + 4 x_sv + x_uv + 5 x_ut + 3 x_vt
   s.t. x_su ≥ 0;  x_sv ≥ 0;  x_uv ≥ 0;  x_ut ≥ 0;  x_vt ≥ 0
        x_su − x_uv − x_ut = 0   (message not lost at u)
        x_sv + x_uv − x_vt = 0   (message not lost at v)
        x_ut + x_vt = 1
• Adding all the equality constraints gives x_su + x_sv = 1, which it must be!!
• Only 3 independent constraints (although there are 4 nodes). In matrix notation:
         [ 1  0  −1  −1   0 ] [x_su]   [0]
   Ax =  [ 0  1   1   0  −1 ] [x_sv] = [0] = b
         [ 0  0   0   1   1 ] [x_uv]   [1]
                              [x_ut]
                              [x_vt]
n nodes → (n − 1) independent equations. Similar to Kirchhoff's laws.
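The formulation can be checked numerically. A small numpy sketch (the edge ordering x_su, x_sv, x_uv, x_ut, x_vt follows the matrix above):

```python
import numpy as np

# Flow-conservation LP for the s-u-v-t network: rows are conservation
# at u, conservation at v, and arrival of one unit at t (the constraint
# at s is linearly dependent and dropped).
A = np.array([[1.0, 0.0, -1.0, -1.0,  0.0],
              [0.0, 1.0,  1.0,  0.0, -1.0],
              [0.0, 0.0,  0.0,  1.0,  1.0]])
b = np.array([0.0, 0.0, 1.0])
cost = np.array([2.0, 4.0, 1.0, 5.0, 3.0])   # edge lengths

# Claimed optimum: route everything along s-u-v-t.
x = np.array([1.0, 0.0, 1.0, 0.0, 1.0])
print(np.allclose(A @ x, b), cost @ x)   # True 6.0
```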
Optimal Control Problem - 1

Shortest path problem as a standard LP:
   min c^T x          NOTE: - A is called the incidence matrix
   s.t. Ax = b              - b is a special vector
        x ≥ 0               - A is unimodular (det = +1 or −1), and so are all invertible submatrices of A
Optimal Control
• Consider a linear time-invariant discrete-time system:
   x_{k+1} = A x_k + b u_k;  u_k ~ scalar for simplicity,  k = 0, 1, ...
   x_k = A^k x_0 + Σ_{l=0}^{k−1} A^{k−l−1} b u_l
Define the terminal error:
   e_N = x_d − x_N = x_d − A^N x_0 − Σ_{l=0}^{N−1} A^{N−l−1} b u_l
Given x_0 and x_d, and given the fact that u_k is constrained by u_min ≤ u_k ≤ u_max, we can formulate various versions of LP.
Optimal Control Problem - 2

Versions of LP
a) 1-norm of error:
   min Σ_{i=1}^n |e_Ni| = min Σ_{i=1}^n |c_i − d_i^T z|
   where c_i = (x_d − A^N x_0)_i
         d_i = vector with components (A^{N−l−1} b)_i,  l = 0, ..., N−1
         z = (u_0, u_1, ..., u_{N−1})^T
   s.t. u_min 1 ≤ z ≤ u_max 1
Convert to standard form via v_i − u_i = d_i^T z − c_i, with v_i ≥ 0, u_i ≥ 0:
   min Σ_{i=1}^n (v_i + u_i)
   s.t. u_min 1 ≤ z ≤ u_max 1
        v_i − u_i = d_i^T z − c_i,  1 ≤ i ≤ n
Optimal solutions:
   v_i* = d_i^T z − c_i if d_i^T z − c_i ≥ 0,  0 otherwise
   u_i* = c_i − d_i^T z if c_i − d_i^T z ≥ 0,  0 otherwise
Can also include constraints on the state variables.
Optimal Control Problem - 3

b) ∞-norm of error:
   min max_{1≤i≤n} |e_Ni| = min max_{1≤i≤n} |c_i − d_i^T z|
   Define v = max_{1≤i≤n} |c_i − d_i^T z|; then: min v
   s.t. u_min 1 ≤ z ≤ u_max 1,  v + c_i − d_i^T z ≥ 0,  v − c_i + d_i^T z ≥ 0
• Proof of equivalence for (a):
   Suppose v_i*, u_i*, and z* are optimal solutions. Claim: v_i* and u_i* cannot simultaneously be nonzero.
   If they are, and v_i* ≥ u_i*, define v̂_i = v_i* − u_i*, û_i = 0.
   Then v̂_i + û_i = v_i* − u_i* < v_i* + u_i* ... a contradiction. Only one of the two can be nonzero.
• Proof of equivalence for (b):
   Let z*, v* be optimal for the revised problem, but suppose z* is not optimal for the original problem, and let ẑ be the optimal solution of the original problem.
   Define v̂ = max_i |d_i^T ẑ − c_i|; then (ẑ, v̂) is feasible for the revised problem with v̂ < v* ... a contradiction.
Diet Problem
Diet Problem • We want to find the most economical diet that meets minimum daily requirements for calories and such nutrients as proteins, calcium, iron, and vitamins. We have n different food items:
   c_j = cost of food item j
   x_j = units of food item j (in grams) included in our economic diet
There are m minimum nutritional requirements:
   b_i = minimum daily requirement of the i-th nutrient
   a_ij = amount of nutrient i provided by a unit of food item j
• The problem is an LP:
   min Σ_{j=1}^n c_j x_j                                   min c^T x
   s.t. Σ_{j=1}^n a_ij x_j ≥ b_i,  i = 1, 2, ..., m   ⇔    s.t. Ax ≥ b
        x_j ≥ 0,  j = 1, 2, ..., n                              x ≥ 0
Classes of Algorithms for LP

Fundamental Property of LP
   An optimal solution x* is such that (n − m) of its components are zero. If we knew which (n − m) components are zero, we could immediately compute the optimal solution (i.e., the remaining m nonzero components) from Ax = b.
• Since we don't know the zeros a priori, the chief task of every algorithm is to discover where they belong.
Three Classes of Algorithms for LP
• Simplex
• Ellipsoid
• Projective transformation (scaling) algorithms
Key Ideas of Simplex Algorithm
   Phase 1: Find a vector x that has (n − m) zero components, with Ax = b and x ≥ 0. This x is feasible, not necessarily optimal.
   Phase 2: Allow one of the zero components to become positive and force one of the positive components to become zero.
Geometry of LP - 1
Simplex Algorithm
• Q: How to pick "entering" and "leaving" components?
  A: The cost c^T x must decrease, and Ax = b, x ≥ 0 must remain satisfied.
• Another key property: we need to look at only the extreme (corner) points of the feasible set; the minimum occurs at one of the corners (vertices) of the feasible set.
Example:
   min −x1 − x2
   s.t. x1 + 2 x2 ≤ 4
        x1 ≥ 0, x2 ≥ 0
[Figure: the triangular feasible set in the (x1, x2) plane, with corner points P and Q and cost planes c^T x = const.]
In n dimensions, the feasible set lies in n-dimensional space, and so do the cost planes c^T x = const.
Inequality constraints: x1 + 2 x2 ≤ 4, and x1 ≥ 0, x2 ≥ 0.
• x ≥ 0 defines a positive cone in R^n.
• a_i^T x ≤ 0 defines a half space on or below the plane a_i^T x = 0.
• Feasible set = positive cone ∩ the half spaces defined by a_i^T x ≤ b_i → a polyhedron (a polygon in 2 dimensions).
• The feasible set is convex: if x1 and x2 are feasible, then α x1 + (1 − α) x2 is also feasible for all α Є [0,1], i.e., the line segment joining them is also in the feasible set.
Geometry of LP - 2
An LP may not have a solution
Example:
   min −x1 − x2
   s.t. x1 + 2 x2 ≤ −4
        x1, x2 ≥ 0
[Figure: the line x1 + 2 x2 = −4 lies in the negative quadrant, away from x ≥ 0]
The feasible set is empty.
An LP may have an unbounded solution
Example:
   min −x1 − x2
   s.t. x1 + 2 x2 ≥ 4
        x1, x2 ≥ 0
[Figure: unbounded feasible region on or above the line x1 + 2 x2 = 4]
(x1, x2) → (∞, ∞) remains feasible ⟹ optimal cost value → −∞.
• So, an algorithm must decide whether an optimal solution exists and find the corner where the optimum occurs.
Revised Simplex Algorithm - 1

Revised Simplex Algorithm
Consider SLP: min z = c^T x s.t. Ax = b and x ≥ 0.
Assume rank(A) = m. Then we can partition A = [B | N], where B contains m linearly independent columns (assume the first m columns, for convenience):
   [B | N] [x_B; x_N] = b;   x_B Є R^m,  x_N Є R^{n−m}
We know (n − m) components of x are zero.
If x_N = 0, x_B = B^{-1} b is said to be the basic solution, and the columns of B form the basis.
If, in addition, x_B ≥ 0, then x_B is called a basic feasible solution.
In terms of x_B and x_N, the cost function is
   z = c_B^T x_B + c_N^T x_N
Using x_B = B^{-1} b − B^{-1} N x_N = B^{-1} b − B^{-1} (a_{m+1} x_{m+1} + ... + a_n x_n):
   z = c_B^T B^{-1} b + (c_N^T − c_B^T B^{-1} N) x_N = z_0 + p^T x_N
     = z_0 + p_1 x_{m+1} + ... + p_{n−m} x_n
Revised Simplex Algorithm - 2
Compute p^T in two steps:
1) Solve: B^T λ = c_B
2) Compute: p^T = c_N^T − λ^T N (or p_k = c_Nk − λ^T a_k); λ is called the vector of simplex multipliers.
p is called the relative cost vector and forms the basis for exchanging basis variables.
If p ≥ 0, then the corner is optimal: since p^T x_N ≥ 0 for any x_N ≥ 0, it doesn't pay to increase x_N.
If a component p_k < 0, then the cost can be decreased by increasing the corresponding component of x_N, that is, x_k, m+1 ≤ k ≤ n.
• The simplex method chooses one entering variable:
• One with the most negative pk (or)
• The first negative pk (avoids cycling)
• Simplex allows the component xk to increase from zero.
Revised Simplex Algorithm - 3
Revised Simplex Algorithm
Q: Which component x_l should leave?
A: The first one to reach zero, so that Ax = b is satisfied again at the new point.
Assume p_k < 0, and consider what happens when we increase x_Nk from zero.
Let x_B^old = B^{-1} b = x_0 ~ initial basic feasible solution. Then
   x_B^new = B^{-1} b − B^{-1} a_k x_Nk = x_0 − y x_Nk,  where By = a_k
The i-th component of x_B^new will be zero when y_i x_Nk and the i-th component of B^{-1} b = x_0 are equal. This happens when
   x_Nk = (i-th component of B^{-1} b) / (i-th component of y) = x_0i / y_i
So, among all the y_i such that y_i > 0, the smallest of these ratios determines how large x_Nk can become.
If the l-th ratio is the smallest, then the leaving variable will be x_l.
At the new corner, x_Nk > 0 and x_l = 0.
x_l → nonbasic set, and column a_l joins the nonbasic matrix N.
x_k → basic set, and column a_k joins the basic matrix B.
Thus,
   x_0l / y_l = min_{1≤i≤m} { x_0i / y_i : y_i > 0 }
Revised Simplex Algorithm Steps
One Iteration of Revised Simplex Algorithm
• Step 1: Given the basis B, compute x_B = B^{-1} b ≥ 0.
• Step 2: Solve B^T λ = c_B for the vector of simplex multipliers λ.
• Step 3: Select a column a_k of N such that p_k = c_Nk − λ^T a_k < 0. We may, for example, select
  the a_k which gives the most negative value of p_k, or the first k with negative p_k.
  If p^T = c_N^T − λ^T N ≥ 0, stop: the current solution is optimal.
• Step 4: Solve for y: By = a_k.
• Step 5: Find x_0l / y_l = min { x_0i / y_i : 1 ≤ i ≤ m and y_i > 0 }.
  - Look at x_Bi^new = x_0i − y_i x_Nk.
  - If none of the y_i are positive, then the set of solutions to Ax = b, x ≥ 0 is unbounded, and the cost z can be made an arbitrarily large negative number.
  - Terminate the computation → unbounded solution.
• Step 6: Update the basic solution:
   x_i ← x_0i − y_i (x_0l / y_l), i ≠ l; set the new basic variable x_k = x_0l / y_l (x_l goes out).
• Step 7: Update the basis and return to Step 1.
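The seven steps can be collected into a compact dense implementation. The following numpy sketch is illustrative only (the example problem and starting basis are our assumptions, and no anti-cycling rule is included):

```python
import numpy as np

def revised_simplex(A, b, c, basis, max_iter=100):
    """Revised simplex for min c^T x, Ax = b, x >= 0, given a starting
    basic feasible basis (list of column indices)."""
    m, n = A.shape
    basis = list(basis)
    for _ in range(max_iter):
        B = A[:, basis]
        x_B = np.linalg.solve(B, b)                  # Step 1: basic solution
        lam = np.linalg.solve(B.T, c[basis])         # Step 2: multipliers
        nonbasic = [j for j in range(n) if j not in basis]
        p = np.array([c[j] - lam @ A[:, j] for j in nonbasic])  # Step 3
        if np.all(p >= -1e-10):                      # optimal corner reached
            x = np.zeros(n)
            x[basis] = x_B
            return x, c @ x
        k = nonbasic[int(np.argmin(p))]              # entering variable
        y = np.linalg.solve(B, A[:, k])              # Step 4
        if np.all(y <= 1e-10):
            raise ValueError("unbounded")            # Step 5, no positive y_i
        pos = y > 1e-10
        ratios = np.where(pos, x_B / np.where(pos, y, 1.0), np.inf)
        l = int(np.argmin(ratios))                   # leaving variable
        basis[l] = k                                 # Steps 6-7: update basis
    raise RuntimeError("iteration limit reached")

# min -x1 - x2  s.t.  x1 + 2 x2 + x3 = 4, x >= 0; start from the slack basis
A = np.array([[1.0, 2.0, 1.0]])
b = np.array([4.0])
c = np.array([-1.0, -1.0, 0.0])
x, z = revised_simplex(A, b, c, basis=[2])
print(x, z)   # [4. 0. 0.] -4.0
```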
Phase I of LP

How to get an initial feasible solution ... Phase I of LP
• An LP problem for Phase I:
   min Σ_{i=1}^m ŷ_i   s.t. Ax + I_m ŷ = b;  x ≥ 0, ŷ ≥ 0
   ŷ ~ artificial variables
If we can find an optimal solution such that Σ_{i=1}^m ŷ_i = 0, then we have a basic feasible x_B.
If Σ_{i=1}^m ŷ_i > 0, then there is no feasible solution to Ax = b, x ≥ 0 → infeasible problem.
Solve via revised simplex starting with x = 0, ŷ = b, and B = I_m.
• Another approach is to combine Phases I and II by solving:
   min_{x,y} c^T x + M e^T y   (where M is a large number)
   s.t. Ax + y = b;  x ≥ 0, y ≥ 0
− This is called the “big-M” method.
Basis Updates

How to Update the Basis:
• NOTE: We need to solve B^T λ = c_B and By = a_k, where the B's differ by only one column between any two subsequent iterations (column a_k replaces column a_l).
• A simple way to solve these equations is to propagate B^{-1} from one iteration to the next.
Recall: B_new = B_old − (column a_l) + (column a_k) = B_old + (a_k − a_l) e_l^T ... a rank-one update.
By the Sherman-Morrison formula, with B_old^{-1} a_k = y and B_old^{-1} a_l = e_l:
   B_new^{-1} = [I + B_old^{-1} (a_k − a_l) e_l^T]^{-1} B_old^{-1}
              = [I − (y − e_l) e_l^T / y_l] B_old^{-1}
              = E B_old^{-1} ... product form of the inverse (PFI)
where
   E = I − (y − e_l) e_l^T / y_l =
       [ 1  0  ...  −y_1/y_l  ...  0 ]
       [ 0  1  ...  −y_2/y_l  ...  0 ]
       [ 0  0  ...    1/y_l   ...  0 ]   (the l-th column is modified)
       [ 0  0  ...  −y_m/y_l  ...  1 ]
E is called an "Elementary Matrix."
For large-scale problems, store each E_p as a vector and update λ^T and y sequentially as follows:
   λ^T = c_B^T E_p E_{p−1} ... E_1   or   y = E_p ... E_2 E_1 a_k
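The elementary-matrix update can be checked numerically. A small numpy sketch (the matrices are illustrative assumptions):

```python
import numpy as np

def elementary_matrix(y, l):
    """E = I - (y - e_l) e_l^T / y_l: identity except for its l-th column."""
    m = len(y)
    E = np.eye(m)
    E[:, l] = -y / y[l]     # off-diagonal entries of column l: -y_i / y_l
    E[l, l] = 1.0 / y[l]    # pivot entry: 1 / y_l
    return E

# Replace column l of B by a_k; then B_new^{-1} = E B^{-1}, with y = B^{-1} a_k.
B = np.array([[2.0, 0.0, 0.0],
              [0.0, 3.0, 0.0],
              [0.0, 0.0, 4.0]])
a_k = np.array([1.0, 1.0, 2.0])
l = 1
y = np.linalg.solve(B, a_k)
E = elementary_matrix(y, l)
B_new = B.copy()
B_new[:, l] = a_k
print(np.allclose(E @ np.linalg.inv(B), np.linalg.inv(B_new)))   # True
```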
• What if y_l is small? This creates a numerical stability problem...
• Modern revised simplex methods use LU or QR decompositions instead.
Sequential LU and QR
LU Decomposition • If B is the current basis:
   B = [a_1 a_2 ... a_m] = L_old U_old = B_old
   B_new = [a_1 a_2 ... a_{l−1} a_{l+1} ... a_m a_k]
NOTE: H = L_old^{-1} B_new = [u_1 u_2 ... u_{l−1} u_{l+1} ... u_m  L_old^{-1} a_k]
is an upper Hessenberg matrix.
• Use a sequence of elimination steps on H to get:
   U_new = M_{m−1} ... M_{l+1} M_l H  ⟹  B_new = L_new U_new
   Store: L_new^{-1} = M_{m−1} ... M_l L_old^{-1}
QR Decomposition
   B_new = [a_1 a_2 ... a_{l−1} a_{l+1} ... a_m a_k];  H = Q_old^T B_new
   Do Givens rotations on H:
      J_{m−1} ... J_l H = R_new   and   Q_new = Q_old J_l^T ... J_{m−1}^T
Theoretically, revised simplex is an exponential algorithm: O(2^n) in the worst case.
In practice, it takes approximately 2(n + m) iterations, and each iteration takes approximately O(m² + m(n − m)) operations.
Sensitivity Analysis

Duality and Sensitivity Analysis
Recall that the basic feasible solution x = [x_B; x_N] = [B^{-1} b; 0] is the solution of SLP
"min c^T x s.t. Ax = b, x ≥ 0" if and only if:
   λ^T = c_B^T B^{-1} ~ vector of simplex (Lagrange) multipliers or dual variables
   p^T = c_N^T − c_B^T B^{-1} N ≥ 0 ~ non-negative relative cost vector
• Note that the optimal cost is given by
   z* = c^T x = c_B^T B^{-1} b = λ^T b
So, z* can be gotten by knowing the optimal x_B or the optimal λ.
Q: Is there another way to get λ?
A: Yes, by solving an equivalent LP, called the dual LP problem.
Dual LP Problems
Dual of an SLP Primal Problem Dual Problem
   min c^T x                      max λ^T b
   s.t. Ax = b, x ≥ 0             s.t. A^T λ ≤ c          ASYMMETRIC DUAL
   n variables                    n constraints
   m constraints                  m variables
   non-negativity                 no constraints on
   constraints on x               the sign of {λ_i}
• Duality of an Inequality constrained LP (InLP) Primal Dual
   min c^T x                      max λ^T b
   s.t. Ax ≥ b                    s.t. A^T λ ≤ c          SYMMETRIC DUAL
        x ≥ 0                          λ ≥ 0
Duality Properties

• Dual of a Dual = Primal
- For any primal feasible x and dual feasible λ:
   (SLP):  λ^T b = λ^T Ax ≤ c^T x   ... weak duality lemma
   (InLP): λ^T b ≤ λ^T Ax ≤ c^T x
Dual feasible solution ≤ primal feasible solution
• Very useful concept in deriving efficient algorithms for large integer programming problems (e.g., scheduling) with separable structures.
Complementary Slackness Conditions
   1) (c^T − λ^T A) x = 0  ⟹  for each j: p_j = c_j − λ^T a_j = 0 or x_j = 0
   2) λ^T (Ax − b) = 0  ⟹  for each i: q_i = a_i^T x − b_i = 0 or λ_i = 0
   λ^T a_j ~ synthetic cost of variable x_j
• For variables in the optimal basis, relative cost pj = 0 synthetic cost = real cost
• For variables not in optimal basis, relative cost pj ≥ 0 synthetic cost ≤ real cost
Simplex Multipliers & Sensitivity - 1

Interpretation of Simplex Multipliers
Suppose b → b + Δb without changing the optimal basis. The change in the optimal objective function value is
   Δz = c_B^T B^{-1} Δb = λ^T Δb
   ∂z/∂b_i = λ_i ~ marginal price (value) of the i-th resource (i.e., the right-hand side b_i)
{λ_i} are also called shadow prices, dual variables, Lagrange multipliers, or equilibrium prices.
• Sensitivity (post-optimality) analysis
− Q: How much can we change {c_i} and {b_i} without changing the optimal basis?
Consider: min (c + α d)^T x;  s.t. Ax = b, x ≥ 0
   α is the parameter to be varied; nominal value of α = 0.
   d = e_j → we want to find the range for the j-th cost coefficient.
Simplex Multipliers & Sensitivity - 2

• Fact: Basis B will remain optimal as long as the nonbasic reduced costs {p_k} remain non-negative (recall that the reduced costs for basic variables are zero).
Split c and d as c^T = [c_B^T | c_N^T] and d^T = [d_B^T | d_N^T]. The required condition is:
   (c_N^T − c_B^T B^{-1} N) + α (d_N^T − d_B^T B^{-1} N) ≥ 0
   p^T + α q^T ≥ 0,  where q^T = d_N^T − d_B^T B^{-1} N
So the range of α is (α_min, α_max), where
   α_max = min { −p_j / q_j : q_j < 0 and j is nonbasic }  (+∞ if no such j)
   α_min = max { −p_j / q_j : q_j > 0 and j is nonbasic }  (−∞ if no such j)
• If α Є (αmin, αmax), the new optimal cost is:
   z*(α) = (c_B + α d_B)^T B^{-1} b = z*(0) + α d_B^T x_B
Simplex Multipliers & Sensitivity - 3
• Consider parametric changes in b
   min c^T x
   s.t. Ax = b + α d;  α = 0 nominal
        x ≥ 0
• If B is the optimal basis, then we need
   x^T = (x_B^T, x_N^T) = [(b̄ + α d̄)^T, 0^T] ≥ 0
   where b̄ = B^{-1} b and d̄ = B^{-1} d
• The range of α = (αmin , αmax) is given by:
   α_min = max_{1≤i≤m} { −b̄_i / d̄_i : d̄_i > 0 }  (−∞ if none)
   α_max = min_{1≤i≤m} { −b̄_i / d̄_i : d̄_i < 0 }  (+∞ if none)
• If α Є (αmin, αmax), then
   z*(α) = c_B^T B^{-1} (b + α d) = λ^T (b + α d) = z*(0) + α λ^T d
Karmarkar's Interior Point Method

Karmarkar's Interior Point Algorithm
• We discuss not the original Karmarkar algorithm, but an equivalent (and more general) formulation based on barrier functions:
   SLP:  min c^T x          Barrier problem:  min f(x, μ) = c^T x − μ Σ_{j=1}^n ln x_j;  μ > 0
         s.t. Ax = b                          s.t. Ax = b
              x ≥ 0
   optimal solution x*                        optimal solution x*(μ)
• Key: x*(μ) → x* as the barrier parameter μ → 0
• There are many variations of barrier function formulations. We will discuss them later.
Newton's Method for NLP
Consider the general NLP: min f(x) s.t. Ax = b.
Suppose x is feasible; then x̄ = x + d, d ~ search direction.
Pick d such that Ax̄ = b (the new point is feasible) and f(x̄) < f(x).
What does Newton's method do for this problem?
Feasibility: Ax̄ = Ax + Ad = b ⟹ Ad = 0.
Newton's method fits a quadratic to f(x) at the current point:
   f(x + d) ≈ f(x) + g^T d + (1/2) d^T H d,  where g = ∇f(x); H = ∇²f(x)
Newton's method solves a quadratic problem to find d (a weighted least-squares problem).
Optimality Conditions
• Consider
   min_d  g^T d + (1/2) d^T H d    ⇔    min_d (1/2) || H^{1/2} d + H^{-1/2} g ||²₂;  H^{1/2} ~ symmetric square root
   s.t. Ad = 0                           s.t. Ad = 0
• Define the Lagrangian function:
   L(d, λ) = g^T d + (1/2) d^T H d − λ^T A d;  λ ~ Lagrange multiplier
• Karush-Kuhn-Tucker necessary conditions of optimality:
   ∂L/∂d = 0 ⟹ g + H d − A^T λ = 0
   ∂L/∂λ = 0 ⟹ Ad = 0  ⟹  λ = (A H^{-1} A^T)^{-1} A H^{-1} g
• Special NLP = barrier formulation of LP:
   g = ∇f(x) = c − μ D^{-1} e   and   H = ∇²f(x) = μ D^{-2}
   where D = Diag(x_j), j = 1, 2, ..., n, and e = (1 1 1 ... 1)^T
Optimality Conditions for LP

• Karush-Kuhn-Tucker conditions for the special NLP are:
   μ D^{-2} d + (c − μ D^{-1} e − A^T λ) = 0
   Ad = 0
• So,
   d = −(1/μ) D² (c − μ D^{-1} e − A^T λ)      (1)
• Using Ad = 0 in (1), we get
   λ = (A D² A^T)^{-1} A D² (c − μ D^{-1} e)   (2)
• So, λ is the solution of the weighted least-squares (WLS) problem:
   min_λ || D (c − μ D^{-1} e − A^T λ) ||²₂
Barrier Function Algorithm

Barrier Function Algorithm
Choose a strictly feasible solution x and a constant μ > 0. Let the tolerance parameter be ε, and let σ be a parameter associated with the update of μ.
For k = 0, 1, 2, ... DO
   Let D = Diag(x_j)
   Compute λ, the solution to (A D² A^T) λ = A D² (c − μ D^{-1} e) ... WLS solution
   Let p = c − A^T λ
   d = −D² (p − μ D^{-1} e) / μ
   x ← x + d
   If x^T p < ε, stop: x is a near-optimal solution ... complementary slackness condition
   else μ ← μ (1 − σ/√n)
   end if
end DO
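The loop above can be sketched in a few lines of numpy. This is an illustrative sketch, not the slides' implementation: the damped step that keeps x strictly positive is an added safeguard (a full Newton step can leave the interior), and the test problem is our assumption.

```python
import numpy as np

def barrier_lp(A, b, c, x, mu=1.0, sigma=0.5, eps=1e-8, max_iter=200):
    """Newton-barrier sketch for min c^T x, Ax = b, x > 0,
    starting from a strictly feasible interior point x."""
    n = len(x)
    for _ in range(max_iter):
        D2 = np.diag(x**2)
        # WLS solve for the multipliers: (A D^2 A^T) lam = A D^2 (c - mu D^{-1} e)
        lam = np.linalg.solve(A @ D2 @ A.T, A @ D2 @ (c - mu / x))
        p = c - A.T @ lam                     # reduced-cost estimate
        d = -(x**2) * (p - mu / x) / mu       # Newton direction (A d = 0)
        # Added safeguard: damp the step so that x + alpha d stays positive
        neg = d < 0
        alpha = min(1.0, 0.9 * np.min(-x[neg] / d[neg])) if neg.any() else 1.0
        x = x + alpha * d
        if abs(x @ p) < eps:                  # complementary slackness gap
            return x, c @ x
        mu *= (1.0 - sigma / np.sqrt(n))
    return x, c @ x

# min -x1 - x2  s.t.  x1 + 2 x2 + x3 = 4;  x = (1,1,1) is strictly feasible
A = np.array([[1.0, 2.0, 1.0]])
b = np.array([4.0])
c = np.array([-1.0, -1.0, 0.0])
x, z = barrier_lp(A, b, c, x=np.array([1.0, 1.0, 1.0]))
print(round(z, 4))
```

The optimal cost of this small problem is −4 (at the corner x = (4, 0, 0)), which the iterates approach as μ shrinks.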
Practicalities & Insights - 1
1) Finding a feasible point
   Select any x⁰ > 0 and define s = (b − A x⁰) / ||b − A x⁰||₂
   and solve
      min_{x, ξ} ξ   s.t. [A  s] [x; ξ] = b;  x ≥ 0, ξ ≥ 0
   The solution: ξ = 0, or stop when ξ starts becoming negative.
   Suggestion: x⁰ = ||b|| e.
2) Since the method uses Newton directions, expect quadratic convergence near the minimum.
3) Major computational step: the least-squares subproblem
   (A D² A^T) λ = A D² (c − μ D^{-1} e)
   Generally A is sparse.
   We will discuss the computational aspects of the least-squares subproblem later.
Practicalities & Insights - 2

4) The algorithm (theoretically) requires O(√n L) iterations, with overall complexity O(n³ L), where
   L = Σ_{i=0}^m Σ_{j=0}^n log(|a_ij| + 1)  ~ the input length
5) In practice, the method typically takes 20-50 iterations, even for very large problems (> 20,000 variables). Simplex, on the other hand, takes an increasingly large number of iterations as the problem size n grows.
6) Initialize μ = O(2^L) and σ = 1/6. In practice, one needs to experiment with the parameters.
7) Other potential functions: f(x, q) = r ln(c^T x − q) − Σ_j ln x_j,
   where r ≥ n (e.g., r = n + √n) and q is a lower bound on the optimal cost.
Practicalities & Insights - 3

Variants of the algorithm
• Problems with the barrier function approach:
  – update of μ
  – selection of the initial μ and the parameter σ
• Two classes of algorithms:
  – affine scaling
  – power series approximation
    . views the affine scaling directions as a set of differential equations
    . not competitive with affine scaling methods
• We do not know if the variants have polynomial complexity. But they work well in practice!!
Affine Scaling Method - 1

Affine scaling:
• Typically, the affine scaling methods are used on the dual problem.
   Primal:  min c^T x       Dual:  max λ^T b        Modified dual:  max λ^T b
            s.t. Ax = b            s.t. A^T λ ≤ c                   s.t. A^T λ + p = c
                 x ≥ 0                                                   p ≥ 0
• Suppose we have a strictly feasible λ and the corresponding reduced cost vector (slack vector) p > 0.
• Define p̂ = P^{-1} p, where P = Diag(p_1, p_2, ..., p_n).
• So, the dual problem is:
   max λ^T b   s.t. A^T λ + P p̂ = c;  p̂ ≥ 0
Affine Scaling Method - 2

• From the equality constraint:
   p̂ = P^{-1} (c − A^T λ) = P^{-1} c − P^{-1} A^T λ
• Assuming full column rank of A^T, i.e., full row rank of A (linearly independent constraints in the primal):
   (A P^{-2} A^T) λ = A P^{-1} (P^{-1} c − p̂)  ⟹  λ = (A P^{-2} A^T)^{-1} A P^{-1} (P^{-1} c − p̂) = M (P^{-1} c − p̂)
• Note that R(M^T) = R(P^{-1} A^T).
• Eliminating λ from the dual problem, we have:
   max_p̂ b^T M (P^{-1} c − p̂) = f(p̂)   ⇔   min_p̂ b^T M p̂
   s.t. H (p̂ − P^{-1} c) = 0                 s.t. H (p̂ − p̄) = 0
        p̂ ≥ 0                                     p̂ ≥ 0
   where p̄ = P^{-1} c and H = I − P^{-1} A^T M ~ an asymmetric projection matrix (H² = H)
Affine Scaling Method - 3

• In addition, we have A P^{-1} H = 0 ⟹ the columns of H lie in N(A P^{-1}).
• Note that we want the search direction to lie in N(H) = R(P^{-1} A^T).
• But R(P^{-1} A^T) = R(M^T).
• The gradient of f(p̂) w.r.t. the scaled reduced costs p̂ is
   ĝ_p̂ = −M^T b Є R(M^T) = R(P^{-1} A^T)
• Result: the gradient w.r.t. the scaled reduced costs p̂ already lies in the range space of P^{-1} A^T, making projection unnecessary.
• In terms of the original unscaled reduced costs, the projected gradient is
   g_p = P ĝ_p̂ = −A^T (A P^{-2} A^T)^{-1} b
Affine Scaling Method - 4

• The corresponding feasible direction with respect to λ is:
   d_λ = (A P^{-2} A^T)^{-1} b,  so that g_p = −A^T d_λ
• If g_p ≥ 0, the dual problem is unbounded ⟹ the primal is infeasible (assuming b ≠ 0).
• Otherwise, we replace λ by λ + α d_λ, where
   α = γ / max { −g_pi / p_i : g_pi < 0, i = 1, 2, ..., n },  γ ≈ 0.95
• Note that the primal solution x is:
   x = −P^{-2} g_p = P^{-2} A^T (A P^{-2} A^T)^{-1} b,  since it satisfies Ax = b.
Dual Affine Scaling Algorithm

Dual affine scaling algorithm:
Start with a strictly feasible λ, a stopping criterion ε, and γ (≈ 0.95).
   z_old = λ^T b
For k = 0, 1, 2, ... DO
   p = c − A^T λ;  P = Diag(p_1, p_2, ..., p_n)
   Compute the solution d_λ of (A P^{-2} A^T) d_λ = b;  g_p = −A^T d_λ
   If g_p ≥ 0
      Stop: unbounded dual solution ⟹ the primal is infeasible
   else
      α = γ · min { −p_i / g_pi : g_pi < 0, i = 1, 2, ..., n }
      λ ← λ + α d_λ  (p ← p + α g_p in the next step);  z_new = λ^T b
      If |z_new − z_old| ≤ ε · max(1, |z_old|)
         stop: found an optimal solution;  x = −P^{-2} g_p
      else
         z_old = z_new
      end if
   end if
end DO
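A minimal numpy sketch of this loop (the test problem, starting point, and function name are illustrative assumptions):

```python
import numpy as np

def dual_affine_scaling(A, b, c, lam, gamma=0.95, eps=1e-9, max_iter=100):
    """Dual affine scaling sketch: max b^T lam s.t. A^T lam <= c,
    starting from a strictly dual-feasible lam (p = c - A^T lam > 0)."""
    z_old = b @ lam
    for _ in range(max_iter):
        p = c - A.T @ lam                        # dual slacks
        Pinv2 = np.diag(1.0 / p**2)
        d = np.linalg.solve(A @ Pinv2 @ A.T, b)  # ascent direction for lam
        g = -A.T @ d                             # induced change in the slacks
        if np.all(g >= 0):
            raise ValueError("dual unbounded => primal infeasible")
        alpha = gamma * np.min(p[g < 0] / -g[g < 0])   # stay strictly feasible
        lam = lam + alpha * d
        z_new = b @ lam
        if abs(z_new - z_old) <= eps * max(1.0, abs(z_old)):
            x = Pinv2 @ A.T @ d                  # primal estimate, Ax = b
            return lam, x, z_new
        z_old = z_new
    raise RuntimeError("iteration limit reached")

# Dual of min -x1 - x2 s.t. x1 + 2 x2 + x3 = 4, x >= 0:
# max 4*lam s.t. lam <= -1, 2*lam <= -1, lam <= 0  (optimum at lam = -1)
A = np.array([[1.0, 2.0, 1.0]])
b = np.array([4.0])
c = np.array([-1.0, -1.0, 0.0])
lam, x, z = dual_affine_scaling(A, b, c, lam=np.array([-2.0]))
print(round(z, 3))
```

Starting from λ = −2 (strictly feasible, since c − A^T λ = (1, 3, 2) > 0), the iterates approach λ = −1 with dual objective −4, matching the primal optimum.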
Initial Feasible Solution

Finding an initial strictly feasible solution for the dual affine scaling algorithm:
• Want to find λ such that p = c − A^T λ > 0.
• Select the initial λ⁰ as
   λ⁰ = (||c||₂ / ||A^T b||₂) b
• Select the initial artificial variable ξ⁰ as
   ξ⁰ = 2 min { (c − A^T λ⁰)_i }
• Solve an (m+1)-variable LP in (λ, ξ):
   max λ^T b + ρ ξ
   s.t. A^T λ + e ξ ≤ c
   with the weight ρ chosen large, e.g., ρ ≈ 10^5 |λ⁰^T b|.
• The initial (λ⁰, ξ⁰) are feasible for this problem.
• Notes:
  – If ξ ≥ 0 at some iteration k, we have found a feasible λ.
  – If the algorithm terminates with optimal ξ < 0, the dual is infeasible ⟹ the primal is unbounded.
Least Squares Subproblem

Least-squares subproblem: implementation issues
• Generally A is sparse.
• Major computational step at each iteration:
   (A P^{-2} A^T) d_λ = b                                     ... affine scaling
   (A D² A^T) λ = A D² (c − μ D^{-1} e) = A D (D c − μ e)     ... barrier function method
• Key: in both cases we need to solve a symmetric positive definite system Φ y = β.
Solution Approaches - 1

Solution approaches:
Direct methods:
   a) Cholesky factorization: Φ = S S^T, S lower triangular
   b) LDL^T factorization: Φ = L D L^T, L unit lower triangular
   c) QR factorization of P^{-1} A^T or D A^T
Methods to speed up the factorization:
• During each iteration only D (or P^{-1}) changes, while A remains unaltered.
  – The nonzero structure of Φ is static throughout.
  – So, during the first iteration, keep track of the list of numerical operations performed.
• Perform the factorization only if the diagonal scaling matrix has changed significantly.
• Consider Φ = A P^{-2} A^T and replace P by P̃, where
Solution Approaches - 2

   P̃_ii = P_ii^new  if |P_ii^new − P_ii^old| / |P_ii^old| > δ  (δ ≈ 0.1)
   P̃_ii = P_ii^old  otherwise
• Define ΔP_ii = P̃_ii^new − P̃_ii^old; then
   Φ_new = Φ_old + Σ_{i: ΔP_ii ≠ 0} Δ(P_ii^{-2}) a_i a_i^T,   a_i ~ i-th column of A
  – So, use the rank-one modification methods discussed in Lecture 8.
• Perform pivoting to reduce fill-ins (nonzero elements appearing in the factors where Φ has zero elements).
  – Recall that (Π Φ Π^T)(Π y) = Π β for a permutation matrix Π.
  – Unfortunately, finding the optimal permutation matrix to reduce fill-in is NP-complete.
  – However, heuristics exist:
    . minimum degree
    . minimum local fill-in
Solution Approaches - 3

• Combine with an iterative method if there are a few dense columns in A that would make Φ impracticably dense (recall the outer-product representation).
• Hybrid factorization and conjugate gradient method → called a preconditioned conjugate gradient method.
  Idea: At iteration k, split the columns of A into two parts, A_s and A_d, where the columns of A_s are sparse (i.e., have density < λ ≈ 0.3).
  – Form Z_s = A_s P_s^{-2} A_s^T
  – Find an incomplete Cholesky factor L such that Z_s = A_s P_s^{-2} A_s^T ≈ L L^T
  – Basically, the idea is to step through the Cholesky decomposition, but set l_ij = 0 if the corresponding s_ij = 0.
Incomplete Cholesky Algorithm - 1

Incomplete Cholesky Algorithm
For k = 1, ..., m DO
   l_kk = sqrt(s_kk)
   For i = k+1, ..., m DO
      If s_ik ≠ 0
         l_ik = s_ik / l_kk
      end if
   end DO
   For j = k+1, ..., m DO
      For i = j, ..., m DO
         If s_ij ≠ 0
            s_ij = s_ij − l_ik l_jk
         end if
      end DO
   end DO
end DO
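A direct numpy transcription of the pseudocode above (the tridiagonal test matrix is an illustrative assumption; on a matrix whose exact Cholesky factor has no fill-in, the incomplete factor coincides with the exact one):

```python
import numpy as np

def incomplete_cholesky(S):
    """Incomplete Cholesky: standard elimination, but l_ij is kept at 0
    wherever s_ij == 0, so L inherits the sparsity pattern of S."""
    S = S.copy().astype(float)   # work on a copy; only the lower triangle is used
    m = S.shape[0]
    L = np.zeros((m, m))
    for k in range(m):
        L[k, k] = np.sqrt(S[k, k])
        for i in range(k + 1, m):
            if S[i, k] != 0:
                L[i, k] = S[i, k] / L[k, k]
        for j in range(k + 1, m):
            for i in range(j, m):
                if S[i, j] != 0:
                    S[i, j] -= L[i, k] * L[j, k]
    return L

# Tridiagonal SPD matrix: no fill-in, so IC equals the exact factor.
S = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
L = incomplete_cholesky(S)
print(np.allclose(L @ L.T, S))   # True
```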
Incomplete Cholesky Algorithm - 2

Now consider the original problem Φ y = (A P^{-2} A^T) y = b. Preconditioning with L:
   [L^{-1} Φ (L^{-1})^T] (L^T y) = L^{-1} b  ⇔  Q u = f
   where Q = L^{-1} Φ (L^{-1})^T;  u = L^T y;  f = L^{-1} b
Solve Qu f via conjugate gradient algorithm (see Lecture 5)
Conjugate Gradient Algorithm

Conjugate gradient algorithm:
   u = f                     ... initial solution
   c = ||f||²                ... squared norm of the RHS
   r = f − Qu                ... initial residual (negative gradient of (1/2) u^T Q u − u^T f)
   δ = ||r||²                ... squared norm of the initial residual
   d = r                     ... initial direction
   k = 0
   do while δ/c > ε and k < k_max
      w = Qd
      τ = δ / (d^T Q d)      ... step length
      u = u + τ d            ... new solution
      r = r − τ w            ... new residual, r = f − Qu
      δ_new = ||r||²
      β = δ_new / δ          ... parameter to update the direction
      d = r + β d            ... new direction
      δ = δ_new
      k = k + 1
   end DO
• Computational load: O(m² + 10m) per iteration
• Need to store only four vectors: u, r, d, and w
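A direct numpy transcription of the loop (the 2x2 test system is an illustrative assumption; CG converges in at most n steps for an SPD system in exact arithmetic):

```python
import numpy as np

def conjugate_gradient(Q, f, eps=1e-10, k_max=100):
    """Conjugate gradient for Qu = f with SPD Q, following the loop above."""
    u = f.copy()              # initial solution
    c = f @ f                 # squared norm of the RHS
    r = f - Q @ u             # initial residual
    delta = r @ r             # squared norm of the residual
    d = r.copy()              # initial direction
    k = 0
    while delta / c > eps and k < k_max:
        w = Q @ d
        tau = delta / (d @ w)         # step length
        u = u + tau * d               # new solution
        r = r - tau * w               # new residual, r = f - Q u
        delta_new = r @ r
        d = r + (delta_new / delta) * d   # new direction
        delta = delta_new
        k += 1
    return u

Q = np.array([[4.0, 1.0], [1.0, 3.0]])
f = np.array([1.0, 2.0])
u = conjugate_gradient(Q, f)
print(np.allclose(Q @ u, f))   # True
```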
Simplex vs. Dual Affine Scaling - 1

Comparison of simplex and dual affine scaling methods
• Three types of test problems
NETLIB test problems
• 31 test problems
• The library and test problems can be accessed via electronic mail: netlib@anl-mcs (ARPANET/CSNET) or research!netlib (UNIX network)
• # of variables n ranged from 51 to 5533
• # of constraints m ranged from 27 to 1151
• # of non-zero elements in A ranged from 102 to 16276
• Comparisons on an IBM 3090
Simplex vs. Dual Affine Scaling - 2

                                      Simplex           Affine scaling
Iterations                            (6, 7157)         (19, 55)
Ratio of time per iteration           (0.093, 0.356)    1
Total CPU time range (secs)           (0.01, 217.67)    (0.05, 31.70)
Ratio of CPU times (simplex/affine)   (0.2, 10.7)       1
Multi-commodity Network Flow problems
• Specialized LP algorithms exist that are better than simplex
• A program called MNETGN was used to generate random multi-commodity network flow problems
• 11 problems were generated
• # of variables n in the range (2606, 8800)
• # of constraints m in the range (1406, 4135)
• # of non-zero elements in A ranged from 5212 to 22140
Simplex vs. Dual Affine Scaling - 3

                                Simplex             Specialized Simplex   Affine scaling
                                (MINOS 4.0)         (MCNF85)
Total # of iterations           (940, 21915)        (931, 16624)          (28, 35)
Ratio of time per iteration     (0.010, 0.069)      (0.0018, 0.0404)      1
(w.r.t. affine scaling)
Total CPU time (secs)           (12.73, 1885.34)    (7.42, 260.44)        (6.51, 309.50)
Ratio of CPU times              (1.96, 11.56)       (0.59, 4.15)          1
(w.r.t. affine scaling)
Simplex vs. Dual Affine Scaling - 4

Timber Harvest Scheduling problems
• 11 timber harvest scheduling problems using a program called FORPLAN
• # of variables ranged from 744 to 19991
• # of constraints ranged from 55 to 316
• # of non-zero elements in A ranged from 6021 to 176346

                                Simplex (MINOS 4.0)   Affine scaling (default pricing)
Total # of iterations           (534, 11364)          (38, 71)
Ratio of time per iteration     (0.0141, 0.2947)      1
Total CPU time (secs)           (2.74, 123.62)        (0.85, 43.80)
Ratio of CPU times              (1.52, 5.12)          1
Promising approach to large real-world LP problems
Summary
Methods for solving LP problems • Revised Simplex method • Ellipsoid method….not practical • Karmarkar’s projective scaling (interior point method)
Implementation issues of the Least-Squares subproblem of Karmarkar’s method ….. More in Linear Programming and Network Flows course
Comparison of Simplex and projective methods