ELE539A: Optimization of Communication Systems

Lecture 18: Interior Point Algorithm

Professor M. Chiang Electrical Engineering Department, Princeton University

March 30, 2007

Lecture Outline

• Newton’s method

• Equality constrained convex optimization

• Inequality constrained convex optimization

• Barrier function and central path

• Barrier method

• Summary of topics covered in the course

• Final projects

• Conclusions

Newton Method

Newton step:

∆x_nt = −∇²f(x)^{-1} ∇f(x)

Positive definiteness of ∇²f(x) implies that ∆x_nt is a descent direction

Interpretation: linearize optimality condition ∇f(x∗) = 0 near x,

∇f(x + v) ≈ ∇f(x) + ∇²f(x)v = 0

Solving this linear equation in v gives v = ∆x_nt. The Newton step is the increment that must be added to x to satisfy the linearized optimality condition.

Main Properties

• Affine invariance: given nonsingular T ∈ R^{n×n}, let f̄(y) = f(Ty). Then the Newton step for f̄ at y (where x = Ty):

∆y_nt = T^{-1} ∆x_nt and

x + ∆x_nt = T(y + ∆y_nt)

• Newton decrement:

λ(x) = (∇f(x)^T ∇²f(x)^{-1} ∇f(x))^{1/2} = (∆x_nt^T ∇²f(x) ∆x_nt)^{1/2}

Let f̂ be the second-order approximation of f at x. Then

f(x) − p* ≈ f(x) − inf_y f̂(y) = f(x) − f̂(x + ∆x_nt) = λ(x)²/2

Newton Method

GIVEN a starting point x ∈ dom f and tolerance ε > 0

REPEAT

1. Compute Newton step and decrement: ∆x_nt = −∇²f(x)^{-1} ∇f(x) and λ = (∇f(x)^T ∇²f(x)^{-1} ∇f(x))^{1/2}

2. Stopping criterion: QUIT if λ²/2 ≤ ε

3. Line search: choose a step size t > 0

4. Update: x := x + t∆x_nt

Advantages of Newton's method: fast, robust, scalable
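As a concrete illustration, here is a minimal Python sketch of the damped Newton method above. The function name and the backtracking parameters α, β are illustrative assumptions, not from the slides; f, grad, and hess are assumed to be supplied as callables.

```python
import numpy as np

def newton_method(f, grad, hess, x, eps=1e-8, max_iter=50,
                  alpha=0.25, beta=0.5):
    """Damped Newton's method with backtracking line search.
    A minimal sketch: f should return +inf outside dom f so the
    line search stays inside the domain."""
    for _ in range(max_iter):
        g, H = grad(x), hess(x)
        dx = -np.linalg.solve(H, g)      # Newton step: -H^{-1} g
        lam2 = float(-g @ dx)            # squared Newton decrement
        if lam2 / 2 <= eps:              # stopping criterion
            break
        t = 1.0                          # backtracking line search
        while f(x + t * dx) > f(x) - alpha * t * lam2:
            t *= beta
        x = x + t * dx
    return x
```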

Error vs. Iteration for Problems in R^100 and R^10000

minimize c^T x − Σ_{i=1}^M log(b_i − a_i^T x), for x ∈ R^100 and x ∈ R^10000

[Figure: error f(x^(k)) − p* versus iteration k; the R^100 problem converges in about 10 iterations and the R^10000 problem in about 20.]

Equality Constrained Problems

Solve a convex optimization with equality constraints:

minimize f(x)
subject to Ax = b

f : R^n → R is twice differentiable

A ∈ R^{p×n} with rank p

Optimality condition: KKT equations, n + p equations in the n + p variables x*, ν*:

Ax* = b,  ∇f(x*) + A^T ν* = 0

Approach 1: Can be turned into an unconstrained optimization after eliminating the equality constraints

Approach 2: Dual Solution

Dual function (f* is the conjugate of f):

g(ν) = −b^T ν − f*(−A^T ν)

Dual problem:

maximize −b^T ν − f*(−A^T ν)

Example: solve

minimize −Σ_{i=1}^n log x_i
subject to Ax = b

Dual problem:

maximize −b^T ν + Σ_{i=1}^n log(A^T ν)_i

Recover primal variable from dual variable:

x_i(ν) = 1/(A^T ν)_i

Example With Analytic Solution

Convex quadratic minimization with equality constraints:

minimize (1/2)x^T P x + q^T x + r
subject to Ax = b

Optimality condition:

[ P   A^T ] [ x* ]   [ −q ]
[ A    0  ] [ ν* ] = [  b ]

If KKT matrix is nonsingular, there is a unique optimal primal-dual pair x∗,ν∗

If the KKT matrix is singular but the system is solvable, any solution gives an optimal x*, ν*. If the KKT system has no solution, the primal problem is unbounded below.
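When the KKT matrix is nonsingular, the optimality condition above is a single linear solve; a minimal Python sketch (function name illustrative):

```python
import numpy as np

def eq_qp_solve(P, q, A, b):
    """Solve minimize (1/2) x^T P x + q^T x + r subject to Ax = b
    by forming the KKT system above; a minimal sketch assuming the
    KKT matrix is nonsingular."""
    n, p = P.shape[0], A.shape[0]
    kkt = np.block([[P, A.T],
                    [A, np.zeros((p, p))]])
    rhs = np.concatenate([-q, b])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], sol[n:]              # optimal primal x*, dual nu*
```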

Approach 3: Direct Derivation of Newton Method

Make sure the initial point is feasible and A∆x_nt = 0. Replace the objective with its second-order Taylor approximation near x:

minimize f̂(x + v) = f(x) + ∇f(x)^T v + (1/2)v^T ∇²f(x)v
subject to A(x + v) = b

Find Newton step ∆xnt by solving:

[ ∇²f(x)  A^T ] [ ∆x_nt ]   [ −∇f(x) ]
[ A        0  ] [   w   ] = [    0   ]

where w is the associated optimal dual variable of Ax = b
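The Newton step for the equality constrained problem is again one KKT solve; a minimal sketch under the stated feasibility assumption (function name illustrative):

```python
import numpy as np

def newton_step_eq(grad, hess, A, x):
    """One feasible-start Newton step for minimize f(x) s.t. Ax = b:
    solve the KKT system above for (dx, w); a minimal sketch that
    assumes x is already feasible, so the second block of the
    right-hand side is zero and A dx = 0 holds for the step."""
    g, H = grad(x), hess(x)
    p, n = A.shape
    kkt = np.block([[H, A.T],
                    [A, np.zeros((p, p))]])
    rhs = np.concatenate([-g, np.zeros(p)])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], sol[n:]              # Newton step dx_nt, dual w
```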

Newton's method (Newton decrement, affine invariance, and stopping criterion) stays the same.

Inequality Constrained Minimization

Let f_0, ..., f_m : R^n → R be convex and twice continuously differentiable, and A ∈ R^{p×n} with rank p

minimize f_0(x)
subject to f_i(x) ≤ 0, i = 1, ..., m
Ax = b

Assume the problem is strictly feasible and an optimal x∗ exists

Idea: reduce it to a sequence of linear equality constrained problems and apply Newton’s method

First, need to approximately formulate the inequality constrained problem as an equality constrained problem

Barrier Function

Make inequality constraints implicit in the objective:

minimize f_0(x) + Σ_{i=1}^m I_−(f_i(x))
subject to Ax = b

where I_− is the indicator function:

I_−(u) = 0 for u ≤ 0, and I_−(u) = ∞ for u > 0

No inequality constraints, but the objective function is not differentiable

Approximate indicator function by a differentiable, closed, and convex function:

Î_−(u) = −(1/t) log(−u), dom Î_− = −R_{++}

where a larger parameter t gives a more accurate approximation

Î_− increases to ∞ as u increases to 0

Log Barrier

[Figure: the approximation Î_−(u) plotted for u ∈ [−3, 1]; as t grows, Î_− approaches the indicator I_−.]

Use Newton's method to solve the approximation:

minimize f_0(x) + Σ_{i=1}^m Î_−(f_i(x))
subject to Ax = b

Log Barrier

Log barrier function:

φ(x) = −Σ_{i=1}^m log(−f_i(x)), dom φ = {x ∈ R^n | f_i(x) < 0, i = 1, ..., m}

The approximation is better if t is large, but then the Hessian of f_0 + (1/t)φ varies rapidly near the boundary of the feasible set: an accuracy-stability tradeoff.

Solve a sequence of approximations with increasing t, using Newton's method for each problem in the sequence.

Gradient and Hessian of log barrier function:

∇φ(x) = Σ_{i=1}^m (1/(−f_i(x))) ∇f_i(x)

∇²φ(x) = Σ_{i=1}^m (1/f_i(x)²) ∇f_i(x)∇f_i(x)^T + Σ_{i=1}^m (1/(−f_i(x))) ∇²f_i(x)
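These formulas translate directly into code; a minimal sketch (names illustrative; the f_i, their gradients, and their Hessians are assumed to be supplied as lists of callables):

```python
import numpy as np

def log_barrier(fs, grads, hessians, x):
    """Value, gradient, and Hessian of phi(x) = -sum_i log(-f_i(x)),
    using the formulas above; a minimal sketch assuming x is strictly
    feasible (f_i(x) < 0 for all i)."""
    vals = np.array([fi(x) for fi in fs])            # all must be < 0
    phi = -np.sum(np.log(-vals))
    grad = sum(gi(x) / (-v) for gi, v in zip(grads, vals))
    hess = sum(np.outer(gi(x), gi(x)) / v**2 + Hi(x) / (-v)
               for gi, Hi, v in zip(grads, hessians, vals))
    return phi, grad, hess
```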

Central Path

Consider the family of optimization problems parameterized by t > 0:

minimize tf_0(x) + φ(x)
subject to Ax = b

Central path: the solutions x*(t) of the above problem, characterized by:

1. Strict feasibility:

Ax*(t) = b,  f_i(x*(t)) < 0, i = 1, ..., m

2. Centrality condition: there exists ν̂ ∈ R^p such that

t∇f_0(x*(t)) + Σ_{i=1}^m (1/(−f_i(x*(t)))) ∇f_i(x*(t)) + A^T ν̂ = 0

Every central point gives a dual feasible point. Let

λ_i*(t) = −1/(t f_i(x*(t))), i = 1, ..., m,   ν*(t) = ν̂/t

Central Path

Dual function:

g(λ*(t), ν*(t)) = f_0(x*(t)) − m/t

which implies the duality gap is m/t. Therefore, the suboptimality gap:

f_0(x*(t)) − p* ≤ m/t
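The m/t figure follows in one line from the definitions of λ*(t), ν*(t) and the centrality condition (which makes x*(t) minimize the Lagrangian at this dual point, so the dual function is attained there); as a sketch:

```latex
g(\lambda^*(t), \nu^*(t))
  = f_0(x^*(t)) + \sum_{i=1}^m \lambda_i^*(t)\, f_i(x^*(t))
    + \nu^*(t)^T \bigl(A x^*(t) - b\bigr)
  = f_0(x^*(t)) + \sum_{i=1}^m \Bigl(-\frac{1}{t}\Bigr) + 0
    % since \lambda_i^*(t) f_i(x^*(t)) = -1/t and A x^*(t) = b
  = f_0(x^*(t)) - \frac{m}{t}
```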

Interpretation as modified KKT conditions: x is a central point x*(t) iff there exist λ, ν such that

Ax = b,  f_i(x) ≤ 0, i = 1, ..., m,  λ ⪰ 0

∇f_0(x) + Σ_{i=1}^m λ_i ∇f_i(x) + A^T ν = 0

−λ_i f_i(x) = 1/t, i = 1, ..., m

Complementary slackness is relaxed from 0 to 1/t

Example

Inequality form LP:

minimize c^T x
subject to Ax ⪯ b

Log barrier function:

φ(x) = −Σ_{i=1}^m log(b_i − a_i^T x)

with gradient and Hessian:

∇φ(x) = Σ_{i=1}^m a_i/(b_i − a_i^T x) = A^T d

∇²φ(x) = Σ_{i=1}^m a_i a_i^T/(b_i − a_i^T x)² = A^T diag(d)² A

where d_i = 1/(b_i − a_i^T x). The centrality condition becomes:

tc + A^T d = 0

so c is parallel to ∇φ(x).

Therefore, the hyperplane c^T x = c^T x*(t) is tangent to the level set of φ through x*(t).
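The compact matrix forms of the LP barrier derivatives above are convenient computationally; a minimal sketch (function name illustrative):

```python
import numpy as np

def lp_barrier_derivatives(A, b, x):
    """Gradient and Hessian of the log barrier for Ax <= b, using the
    matrix forms above; a minimal sketch assuming x strictly satisfies
    Ax < b componentwise."""
    d = 1.0 / (b - A @ x)            # d_i = 1/(b_i - a_i^T x)
    grad = A.T @ d                   # A^T d
    hess = A.T @ np.diag(d**2) @ A   # A^T diag(d)^2 A
    return grad, hess
```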

[Figure: level sets of φ for a small LP, with the central point x*(t) and the objective direction c; the hyperplane c^T x = c^T x*(t) is tangent to the level set through x*(t).]

Barrier Method

GIVEN a strictly feasible point x ∈ dom f, t := t^(0) > 0, µ > 1, and tolerance ε > 0

REPEAT

1. Centering step: compute x*(t) by minimizing tf_0 + φ subject to Ax = b, starting at x

2. Update: x := x*(t)

3. Stopping criterion: QUIT if m/t ≤ ε

4. Increase t: t := µt

Other names: Sequential Unconstrained Minimization Technique (SUMT) or path-following method

Usually Newton's method is used for the centering step.
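Putting the pieces together, a minimal sketch of the outer loop; the centering routine, here an assumed callable center(t, x), would itself be the equality constrained Newton method above:

```python
def barrier_method(center, x, m, t=1.0, mu=10.0, eps=1e-6):
    """Outer loop of the barrier method (SUMT); a minimal sketch.
    center(t, x) is assumed to minimize t*f0 + phi subject to Ax = b
    starting from x; m is the number of inequality constraints."""
    while True:
        x = center(t, x)     # centering step: compute x*(t)
        if m / t <= eps:     # duality gap bound m/t reached
            return x
        t *= mu              # increase t
```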

Remarks

• Each centering step does not need to be exact

• Choice of µ: tradeoff number of inner iterations with number of outer iterations

• Choice of t(0): tradeoff number of inner iterations within the first outer iteration with number of outer iterations

• Number of centering steps required:

log(m/(εt^(0))) / log µ

where m is the number of inequality constraints and ε is the desired accuracy
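For example, with hypothetical values m = 100, ε = 10⁻⁴, t^(0) = 1, and µ = 20 (all illustrative, not from the slides), the bound gives:

```python
import math

# Hypothetical values: m = 100 inequalities, accuracy eps = 1e-4,
# initial t0 = 1, and mu = 20.
m, eps, t0, mu = 100, 1e-4, 1.0, 20.0
steps = math.ceil(math.log(m / (eps * t0)) / math.log(mu))
print(steps)   # 5 centering steps
```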

Progress of Barrier Method for an LP Example

[Figure: duality gap (from 10² down to 10⁻⁶) versus cumulative Newton iterations (0–80) for an LP, with three curves c1, c2, c3.]

Three curves for µ = 2, 50, 150

Tradeoff of µ Parameter for a Small LP

[Figure: total number of Newton iterations (0–140) versus µ (0–200) for a small LP.]

Progress of Barrier Method for a GP Example

[Figure: duality gap (from 10² down to 10⁻⁶) versus Newton iterations (0–120) for a GP example, with three curves c1, c2, c3.]

Insensitive to Problem Size

[Figure: duality gap versus Newton iterations for three problem sizes (left), and number of Newton iterations versus m on a log scale from 10¹ to 10³ (right), the latter staying between roughly 15 and 35.]

Three curves for m = 50, 500, 1000, with n = 2m.

Phase I Method

How to compute a strictly feasible point to start barrier method?

Consider a phase I optimization problem in the variables x ∈ R^n, s ∈ R:

minimize s
subject to f_i(x) ≤ s, i = 1, ..., m
Ax = b

Strictly feasible point: for any x^(0), any s^(0) > max_i f_i(x^(0)) makes (x^(0), s^(0)) strictly feasible

1. Apply the barrier method to solve the phase I problem (stop when s < 0)

2. Use the resulting strictly feasible point to start the barrier method for the original problem
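The s^(0) construction above is one line of code; a minimal sketch (helper name and margin are illustrative):

```python
def phase_one_start(fs, x0, margin=1.0):
    """Strictly feasible starting point for the phase I problem above;
    a minimal sketch. Any s0 > max_i f_i(x0) makes (x0, s0) strictly
    feasible for f_i(x) <= s; the margin is an arbitrary choice."""
    s0 = max(fi(x0) for fi in fs) + margin
    return x0, s0
```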

Not Covered

• Self-concordance analysis and complexity analysis (polynomial time)

• Numerical linear algebra (large scale implementation)

• Generalized inequalities (SDP)

Other (sometimes more efficient) algorithms:

• Primal-dual interior point method

• Ellipsoid methods

• Analytic center cutting plane methods

Lecture Summary

• Convert equality constrained optimization into unconstrained optimization

• Solve a general convex optimization with interior point methods

• Turn an inequality constrained problem into a sequence of equality constrained problems that are increasingly accurate approximations of the original problem

• Polynomial time (in theory) and much faster (in practice): roughly the effort of 25-50 least-squares solves for a wide range of problem sizes

Readings: Sections 9.1-9.3, 9.5, 10.1-10.2, 11.1-11.4 in Boyd and Vandenberghe

Why Take This Course

• Learn the tools and mentality of optimization (surprisingly useful for other studies you may engage in later on)

• Learn classic and recent results on optimization of communication systems (over a surprisingly wide range of problems)

• Enhance the ability to do original research in academia or industry

• Have a fun and productive time on the final project

Where We Started

[Diagram: a point-to-point communication system (source encoder, channel encoder, modulator, channel, demodulator, channel decoder, source decoder) side by side with the protocol stack (application, presentation, session, transport, network, link, physical), all abstracted as:]

minimize f(x)
subject to x ∈ C

Topics Covered: Methodologies

• Nonlinear optimization (linear, convex, nonconvex), Pareto optimization

• Convex optimization, convex sets and convex functions

• Lagrange duality and KKT optimality condition

• LP, QP, QCQP, SOCP, SDP, GP, SOS

• Newton's method, interior point method

• Distributed algorithms and decomposition methods

• Nonconvex optimization and relaxations

• Stochastic network optimization and Lyapunov function

• Robust LP and GP

• Network flow problems

• NUM and generalized NUM Topics Covered: Applications

• Information theory problems: channel capacity and rate distortion for discrete memoryless models

• Detection and estimation problems: MAP and ML detector, multi-user detection

• Physical layer: DSL spectrum management

• Wireless networks: resource allocation, power control, joint power and rate allocation

• Network algorithms and protocols: multi-commodity flow problems, max flow, shortest path routing, network rate allocation, TCP congestion control, IP routing

• Reverse engineering, heterogeneous protocols

• Congestion, collision, interference management

• Layering as optimization decomposition

• Eigenvalue and norm optimization problems

Some of the Topics Not Covered

• Circuit switching and optimal routing

• Packet switch architecture and algorithms

• Multi-user information theory problems

• More digital signal processing, such as blind equalization

• Large deviations theory and queuing theory applications

• System stability and control theoretic improvement of TCP

• Inverse optimization, robust optimization

• Self concordance analysis of interior point method’s complexity

• Cutting plane method, simplex method ...

Final Projects: Timeline

March 30 – May 24:

• Talk to me as often as needed

• Start early

April 27: Interim project report due

Mid May: Project Presentation Day

May 24: Final project report due

Many students continue to extend their projects.

Final Projects: Topics

Wenjie Jiang: Content distribution networking

Suhas Mathur: Wireless network security

Hamed Mohsenian: Multi-channel MAC

Martin Suchara: Stochastic simulation for TCP/IP

Guanchun Wang: Google with feedback

Yihong Wu: Practical stepsize

Yiyue Wu: Coding

Yongxin Xi: Image analysis

Minlan Yu: Network embedding

Dan Zhang: To be decided

The Optimization Mentality

• What can be varied and what cannot?

• What are the objectives?

• What are the constraints?

• What is the dual?

Build models, analyze models, prove theorems, compute solutions, design systems, verify with data, refine models ...

Warnings:

• Remember the restrictive assumptions for application needs

• Understand the limitations: discrete, nonconvex, complexity, robustness ... (a lot can still be done on these difficult issues)

Optimization of Communication Systems

Three ways to apply optimization theory to communication systems:

• Formulate the problem as an optimization problem

• Interpret a given solution as an optimizer/algorithm for an optimization problem

• Extend the underlying theory with optimization-theoretic techniques

A remarkably powerful, versatile, widely applicable, and not yet fully recognized framework

Applications in communication systems also stimulate new developments in optimization theory and algorithms

Optimization of Communication Systems

[Diagram: optimization (OPT) connecting problems, solutions, and theories.]

Optimization Beyond Optimality

• Modeling (reverse engineering, fairness, etc.)

• Architecture

• Robustness

• Design for Optimizability

This Is The Beginning

Quote from Thomas Kailath:

"A course is not about what you've covered, but what you've uncovered."

To Everyone In ELE539A: You've Been Amazing