Sequential Convex Programming


Sequential Convex Programming
EE364b, Stanford University

• sequential convex programming
• alternating convex optimization
• convex-concave procedure

Methods for nonconvex optimization problems

• convex optimization methods are (roughly) always global, always fast
• for general nonconvex problems, we have to give one of these up:
  – local optimization methods are fast, but need not find the global solution (and even when they do, cannot certify it)
  – global optimization methods find the global solution (and certify it), but are not always fast (indeed, are often slow)
• this lecture: local optimization methods that are based on solving a sequence of convex problems

Sequential convex programming (SCP)

• a local optimization method for nonconvex problems that leverages convex optimization
  – convex portions of a problem are handled 'exactly' and efficiently
• SCP is a heuristic:
  – it can fail to find an optimal (or even a feasible) point
  – results can (and often do) depend on the starting point (one can run the algorithm from many initial points and take the best result)
• SCP often works well, i.e., finds a feasible point with a good, if not optimal, objective value

Problem

we consider the nonconvex problem

    minimize    f_0(x)
    subject to  f_i(x) ≤ 0,  i = 1, ..., m
                h_i(x) = 0,  i = 1, ..., p

with variable x ∈ R^n

• f_0 and f_i (possibly) nonconvex
• h_i (possibly) non-affine

Basic idea of SCP

• maintain an estimate of the solution x^(k), and a convex trust region T^(k) ⊆ R^n
• form a convex approximation f̂_i of f_i over the trust region T^(k)
• form an affine approximation ĥ_i of h_i over the trust region T^(k)
• x^(k+1) is an optimal point of the approximate convex problem

    minimize    f̂_0(x)
    subject to  f̂_i(x) ≤ 0,  i = 1, ..., m
                ĥ_i(x) = 0,  i = 1, ..., p
                x ∈ T^(k)

Trust region

• a typical trust region is a box around the current point:

    T^(k) = { x : |x_i − x_i^(k)| ≤ ρ_i, i = 1, ..., n }

• if x_i appears only in convex inequalities and affine equalities, one can take ρ_i = ∞
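As a concrete illustration of the loop above, here is a minimal SCP sketch in Python for a one-dimensional nonconvex problem, minimizing f(x) = x⁴ − 3x² + x with a box trust region. The test function, the simple accept/shrink rule, and all parameter values are illustrative assumptions, not from the lecture; the convex model uses the PSD part of the (here scalar) Hessian, as described later in the slides.

```python
import numpy as np

def f(x):   return x**4 - 3*x**2 + x          # nonconvex test objective (assumed)
def df(x):  return 4*x**3 - 6*x + 1
def d2f(x): return 12*x**2 - 6

def scp_step(x, rho):
    """Minimize a convex model of f over the box trust region [x - rho, x + rho]."""
    g, H = df(x), d2f(x)
    p = max(H, 0.0)                            # PSD part of the (scalar) Hessian
    if p > 0:
        cand = x - g / p                       # minimizer of the convex quadratic model
    else:
        cand = x - np.sign(g) * rho            # affine model: move to the box boundary
    return float(np.clip(cand, x - rho, x + rho))

x, rho = 0.0, 0.5
for _ in range(50):
    x_new = scp_step(x, rho)
    if f(x_new) < f(x):                        # progress: accept step, grow trust region
        x, rho = x_new, 1.1 * rho
    else:                                      # no progress: reject step, shrink trust region
        rho = 0.5 * rho
```

Starting from x = 0 (where the Hessian is negative), the iterates move to the local minimizer near x ≈ −1.3008; as with any SCP run, a different starting point may give a different local solution.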
Affine and convex approximations via Taylor expansions

• (affine) first-order Taylor expansion:

    f̂(x) = f(x^(k)) + ∇f(x^(k))^T (x − x^(k))

• (convex part of) second-order Taylor expansion:

    f̂(x) = f(x^(k)) + ∇f(x^(k))^T (x − x^(k)) + (1/2)(x − x^(k))^T P (x − x^(k)),
    P = ∇²f(x^(k))_+, the PSD part of the Hessian

• these give local approximations, which do not depend on the trust region radii ρ_i

Quadratic trust regions

• full second-order Taylor expansion:

    f̂(x) = f(x^(k)) + ∇f(x^(k))^T (x − x^(k)) + (1/2)(x − x^(k))^T ∇²f(x^(k)) (x − x^(k))

• the trust region is a compact ellipsoid around the current point: for some P ≻ 0,

    T^(k) = { x : (x − x^(k))^T P (x − x^(k)) ≤ ρ }

• the update is any x^(k+1) for which there is a λ ≥ 0 such that

    ∇²f(x^(k)) + λP ⪰ 0,
    λ ( (x^(k+1) − x^(k))^T P (x^(k+1) − x^(k)) − ρ ) = 0,
    (∇²f(x^(k)) + λP)(x^(k+1) − x^(k)) = −∇f(x^(k))

Particle method

• particle method:
  – choose points z_1, ..., z_K ∈ T^(k) (e.g., all vertices, some vertices, a grid, random points, ...)
  – evaluate y_i = f(z_i)
  – fit the data (z_i, y_i) with a convex (or affine) function (using convex optimization)
• advantages:
  – handles nondifferentiable functions, or functions for which evaluating derivatives is difficult
  – gives regional models, which depend on the current point and the trust region radii ρ_i

Fitting affine or quadratic functions to data

fit a convex quadratic function to the data (z_i, y_i):

    minimize    Σ_{i=1}^K ( (z_i − x^(k))^T P (z_i − x^(k)) + q^T (z_i − x^(k)) + r − y_i )²
    subject to  P ⪰ 0

with variables P ∈ S^n, q ∈ R^n, r ∈ R

• can use other objectives, and add other convex constraints
• no need to solve exactly
• this problem is solved for each nonconvex constraint, at each SCP step

Quasi-linearization

• a cheap and simple method for affine approximation
• write h(x) as A(x)x + b(x) (there are many ways to do this)
• use ĥ(x) = A(x^(k))x + b(x^(k))
• example: h(x) = (1/2)x^T P x + q^T x + r = ((1/2)Px + q)^T x + r
• ĥ_ql(x) = ((1/2)P x^(k) + q)^T x + r
• ĥ_tay(x) = (P x^(k) + q)^T (x − x^(k)) + h(x^(k))
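A small numpy sketch of two ingredients above, on random data (all values illustrative): the PSD part of an indefinite matrix via an eigendecomposition, and the two affine approximations ĥ_ql and ĥ_tay of a quadratic h, both of which agree with h at x^(k) but generally have different slopes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
B = rng.standard_normal((n, n))
P = (B + B.T) / 2                     # symmetric, generally indefinite
q = rng.standard_normal(n)
r = 0.3
xk = rng.standard_normal(n)           # current iterate x^(k)

# PSD part of P: clip the negative eigenvalues to zero
w, V = np.linalg.eigh(P)
P_plus = V @ np.diag(np.maximum(w, 0.0)) @ V.T

def h(x):     return 0.5 * x @ P @ x + q @ x + r
def h_ql(x):  return (0.5 * P @ xk + q) @ x + r           # quasi-linearization
def h_tay(x): return (P @ xk + q) @ (x - xk) + h(xk)      # first-order Taylor
```

Both affine models match h at x^(k); ĥ_tay also matches the gradient ∇h(x^(k)) = P x^(k) + q there, while ĥ_ql in general does not.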
Example

• nonconvex QP:

    minimize    f(x) = (1/2)x^T P x + q^T x
    subject to  ‖x‖_∞ ≤ 1

  with P symmetric but not PSD
• use the approximation

    f(x^(k)) + (P x^(k) + q)^T (x − x^(k)) + (1/2)(x − x^(k))^T P_+ (x − x^(k))

• example with x ∈ R^20
• SCP with ρ = 0.2, started from 10 different points
• [figure: f(x^(k)) versus iteration k for the 10 runs]
• runs typically converge to points with values between −60 and −50
• a dashed line shows the lower bound on the optimal value, ≈ −66.5

Lower bound via Lagrange dual

• write the constraints as x_i² ≤ 1 and form the Lagrangian

    L(x, λ) = (1/2)x^T P x + q^T x + Σ_{i=1}^n λ_i (x_i² − 1)
            = (1/2)x^T (P + 2 diag(λ)) x + q^T x − 1^T λ

• g(λ) = −(1/2) q^T (P + 2 diag(λ))^{−1} q − 1^T λ; need P + 2 diag(λ) ≻ 0
• solve the dual problem to get the best lower bound:

    maximize    −(1/2) q^T (P + 2 diag(λ))^{−1} q − 1^T λ
    subject to  λ ⪰ 0,  P + 2 diag(λ) ≻ 0

Some (related) issues

• the approximate convex problem can be infeasible
• how do we evaluate progress when x^(k) isn't feasible? we need to take into account
  – the objective f_0(x^(k))
  – the inequality constraint violations f_i(x^(k))_+ = max{f_i(x^(k)), 0}
  – the equality constraint violations |h_i(x^(k))|
• controlling the trust region size:
  – ρ too large: approximations are poor, leading to a bad choice of x^(k+1)
  – ρ too small: approximations are good, but progress is slow

Exact penalty formulation

• instead of the original problem, we solve the unconstrained problem

    minimize  φ(x) = f_0(x) + λ ( Σ_{i=1}^m f_i(x)_+ + Σ_{i=1}^p |h_i(x)| )

  where λ > 0
• for λ large enough, the minimizer of φ is a solution of the original problem
• for SCP, use the convex approximation

    φ̂(x) = f̂_0(x) + λ ( Σ_{i=1}^m f̂_i(x)_+ + Σ_{i=1}^p |ĥ_i(x)| )

• the approximate problem is always feasible
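Returning to the nonconvex QP example above: any λ ⪰ 0 with P + 2 diag(λ) ≻ 0 already gives a valid lower bound g(λ), even without solving the dual problem. A numpy sketch on random data (the problem instance and this particular choice of λ are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20
B = rng.standard_normal((n, n))
P = (B + B.T) / 2                              # symmetric, not PSD
q = rng.standard_normal(n)

def f(x): return 0.5 * x @ P @ x + q @ x

# pick a uniform lam that shifts the smallest eigenvalue of P + 2*diag(lam) to 0.1
lam_min = np.linalg.eigvalsh(P).min()
lam = np.full(n, max(0.0, (0.1 - lam_min) / 2))
M = P + 2.0 * np.diag(lam)
g = -0.5 * q @ np.linalg.solve(M, q) - lam.sum()   # g(lam), a lower bound

# sanity check: g lower-bounds f over random feasible points in the box
xs = rng.uniform(-1.0, 1.0, size=(1000, n))
gap_ok = all(f(x) >= g for x in xs)
```

Maximizing over all such λ (the dual problem) gives the best bound of this form; in the lecture's instance that bound is ≈ −66.5.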
Trust region update

• judge algorithm progress by the decrease in φ, using the solution x̃ of the approximate problem
• decrease with the approximate objective: δ̂ = φ(x^(k)) − φ̂(x̃) (called the predicted decrease)
• decrease with the exact objective: δ = φ(x^(k)) − φ(x̃)
• if δ ≥ αδ̂: ρ^(k+1) = β_succ ρ^(k), x^(k+1) = x̃
  (α ∈ (0, 1), β_succ ≥ 1; typical values α = 0.1, β_succ = 1.1)
• if δ < αδ̂: ρ^(k+1) = β_fail ρ^(k), x^(k+1) = x^(k)
  (β_fail ∈ (0, 1); typical value β_fail = 0.5)
• interpretation: if the actual decrease is more (less) than a fraction α of the predicted decrease, then increase (decrease) the trust region size

Nonlinear optimal control

[figure: 2-link robot, with link lengths and masses l_1, m_1 and l_2, m_2, joint angles θ_1, θ_2, and joint torques τ_1, τ_2]

• 2-link system, controlled by torques τ_1 and τ_2 (no gravity)
• dynamics given by M(θ)θ̈ + W(θ, θ̇)θ̇ = τ, with

    M(θ) = [ (m_1 + m_2) l_1²                   m_2 l_1 l_2 (s_1 s_2 + c_1 c_2) ]
           [ m_2 l_1 l_2 (s_1 s_2 + c_1 c_2)    m_2 l_2²                        ]

    W(θ, θ̇) = [ 0                                       m_2 l_1 l_2 (s_1 c_2 − c_1 s_2) θ̇_2 ]
              [ m_2 l_1 l_2 (s_1 c_2 − c_1 s_2) θ̇_1     0                                   ]

  where s_i = sin θ_i, c_i = cos θ_i
• nonlinear optimal control problem:

    minimize    J = ∫_0^T ‖τ(t)‖_2² dt
    subject to  θ(0) = θ_init, θ̇(0) = 0, θ(T) = θ_final, θ̇(T) = 0
                ‖τ(t)‖_∞ ≤ τ_max, 0 ≤ t ≤ T

Discretization

• discretize with time interval h = T/N
• J ≈ h Σ_{i=1}^N ‖τ_i‖_2², with τ_i = τ(ih)
• approximate the derivatives as

    θ̇(ih) ≈ (θ_{i+1} − θ_{i−1}) / (2h),    θ̈(ih) ≈ (θ_{i+1} − 2θ_i + θ_{i−1}) / h²

• approximate the dynamics as a set of nonlinear equality constraints:

    M(θ_i) (θ_{i+1} − 2θ_i + θ_{i−1}) / h² + W(θ_i, (θ_{i+1} − θ_{i−1}) / (2h)) (θ_{i+1} − θ_{i−1}) / (2h) = τ_i

• θ_0 = θ_1 = θ_init; θ_N = θ_{N+1} = θ_final
• discretized nonlinear optimal control problem:

    minimize    h Σ_{i=1}^N ‖τ_i‖_2²
    subject to  θ_0 = θ_1 = θ_init, θ_N = θ_{N+1} = θ_final
                ‖τ_i‖_∞ ≤ τ_max, i = 1, ..., N
                M(θ_i) (θ_{i+1} − 2θ_i + θ_{i−1}) / h² + W(θ_i, (θ_{i+1} − θ_{i−1}) / (2h)) (θ_{i+1} − θ_{i−1}) / (2h) = τ_i

• replace the equality constraints with the quasilinearized versions

    M(θ_i^(k)) (θ_{i+1} − 2θ_i + θ_{i−1}) / h² + W(θ_i^(k), (θ_{i+1}^(k) − θ_{i−1}^(k)) / (2h)) (θ_{i+1} − θ_{i−1}) / (2h) = τ_i
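The centered finite-difference formulas above are second-order accurate; here is a quick numpy check on the smooth test trajectory θ(t) = sin t (the test function, step size, and grid point are illustrative assumptions):

```python
import numpy as np

h = 0.01                                   # time step
t = np.arange(0.0, 10.0, h)
theta = np.sin(t)                          # smooth test trajectory

i = 500                                    # interior grid point, t[i] = 5.0
dtheta  = (theta[i + 1] - theta[i - 1]) / (2 * h)                # approximates cos(t[i])
ddtheta = (theta[i + 1] - 2 * theta[i] + theta[i - 1]) / h**2    # approximates -sin(t[i])

err1 = abs(dtheta - np.cos(t[i]))          # O(h^2) error of the first-derivative formula
err2 = abs(ddtheta + np.sin(t[i]))         # O(h^2) error of the second-derivative formula
```

Halving h cuts both errors by roughly a factor of four, consistent with the O(h²) truncation error of centered differences.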
• trust region: only on the θ_i
• initialize with the linear interpolation θ_i = θ_init + ((i − 1)/(N − 1))(θ_final − θ_init), i = 1, ..., N

Numerical example

• m_1 = 1, m_2 = 5, l_1 = 1, l_2 = 1
• N = 40, T = 10
• θ_init = (0, −2.9), θ_final = (3, 2.9)
• τ_max = 1.1
• α = 0.1, β_succ = 1.1, β_fail = 0.5, ρ^(1) = 90°
• λ = 2

SCP progress

[figure: φ(x^(k)) versus iteration k, decreasing from about 70 to about 10 over 40 iterations]

Convergence of J and torque residuals

[figure: left, J^(k) versus k, converging to about 10.5; right, sum of torque residuals versus k on a log scale, decreasing to about 10^−3]

Predicted and actual decreases in φ

[figure: left, predicted decrease δ̂ (dotted) and actual decrease δ (solid) versus k; right, trust region size ρ^(k) (degrees) versus k]

Trajectory plan

[figure: planned joint angles θ_1, θ_2 and torques τ_1, τ_2 versus time t ∈ [0, 10]]

Convex composite

• general form: for h : R^m → R convex and c : R^n → R^m smooth,

    f(x) = h(c(x))

• example: the exact penalty formulation of
    minimize f(x) subject to c(x) = 0
• approximate f locally by a convex approximation: near x,

    f(y) ≈ f̂_x(y) = h(c(x) + ∇c(x)^T (y − x))

Convex composite (prox-linear) algorithm

given a function f = h ∘ c and a convex domain C, line search parameters α ∈ (0, 0.5), β ∈ (0, 1), and a stopping tolerance ε > 0

k := 0
repeat
    use the model f̂ = f̂_{x^(k)}
    set x̂^(k+1) = argmin_{x ∈ C} f̂(x), and the direction Δ^(k+1) = x̂^(k+1) − x^(k)
    set δ^(k) = f̂(x^(k) + Δ^(k+1)) − f(x^(k))
    set t = 1
    while f(x^(k) + tΔ^(k+1)) ≥ f(x^(k)) + αtδ^(k):
        t := βt
    set x^(k+1) = x^(k) + tΔ^(k+1)
    if ‖Δ^(k+1)‖_2 / t ≤ ε, quit
    k := k + 1

Nonlinear measurements (phase retrieval)

• phase retrieval problem: for a_i ∈ C^n and x⋆ ∈ C^n, we observe

    b_i = |a_i^* x⋆|²

• the goal is to find x; natural objectives are of the form
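A minimal prox-linear iteration in Python, for the toy convex composite f(x) = |x² − 2| with h = |·|, c(x) = x² − 2, and C = [0.1, 3] (this instance and the parameter values are illustrative assumptions, not from the slides). In one dimension the model minimizer over C is just a clipped Newton step on c:

```python
import numpy as np

h  = abs                                   # convex outer function
c  = lambda x: x**2 - 2.0                  # smooth inner map
dc = lambda x: 2.0 * x                     # derivative of c
f  = lambda x: h(c(x))                     # composite objective, zero at sqrt(2)

lo, hi = 0.1, 3.0                          # convex domain C = [0.1, 3]
alpha, beta, eps = 0.25, 0.5, 1e-10
x = 3.0
for _ in range(100):
    # minimize the model h(c(x) + dc(x)*(y - x)) over C: a clipped Newton step on c
    y = float(np.clip(x - c(x) / dc(x), lo, hi))
    step = y - x
    if abs(step) <= eps:
        break
    delta = h(c(x) + dc(x) * step) - f(x)  # predicted decrease (nonpositive)
    t = 1.0
    while f(x + t * step) > f(x) + alpha * t * delta:   # backtracking line search
        t *= beta
    x = x + t * step
```

From x = 3 the iterates converge rapidly to √2; on this easy instance the line search accepts the full step, but in general it is what guarantees descent.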
    f(x) = ‖ |Ax|² − b ‖    (square taken elementwise, for some norm ‖·‖)

• "robust" phase retrieval problem:

    f(x) = Σ_{i=1}^m | |a_i^* x|² − b_i |

  or the quadratic objective

    f(x) = (1/2) Σ_{i=1}^m ( |a_i^* x|² − b_i )²

Numerical example

• m = 200, n = 50, over the reals R (sign retrieval)
• generate 10 independent examples, A ∈ R^{m×n}, b = |Ax⋆|², with A_ij ∼ N(0, 1) and x⋆ ∼ N(0, I)
• two sets of experiments: initialize at x^(0) ∼ N(0, I) or x^(0) ∼ N(x⋆, I)
• use h(z) = ‖z‖_1 or h(z) = ‖z‖_2², with c(x) = (Ax)² − b
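A numpy sketch of the sign-retrieval setup (the dimensions are from the slides; the seed and the Jacobian helper are illustrative assumptions). It checks the sign ambiguity — x⋆ and −x⋆ both have zero residual — and the Jacobian of c used to build the prox-linear model f̂_x(y) = h(c(x) + ∇c(x)^T(y − x)):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 200, 50
A = rng.standard_normal((m, n))            # A_ij ~ N(0, 1)
x_star = rng.standard_normal(n)            # x_star ~ N(0, I)
b = (A @ x_star)**2                        # observed squared magnitudes (no signs)

def c(x):  return (A @ x)**2 - b           # smooth inner map
def Jc(x): return 2.0 * (A @ x)[:, None] * A   # Jacobian of c: 2 diag(Ax) A
def f(x):  return np.abs(c(x)).sum()       # robust (l1) phase retrieval objective

# sign ambiguity: x_star and -x_star are both global minimizers
sign_ok = f(x_star) < 1e-9 and f(-x_star) < 1e-9

# central finite-difference check of the Jacobian along a random direction
x0, d = rng.standard_normal(n), rng.standard_normal(n)
t = 1e-6
fd = (c(x0 + t * d) - c(x0 - t * d)) / (2 * t)
jac_ok = np.allclose(fd, Jc(x0) @ d, atol=1e-4)
```

The sign ambiguity means any method can recover x⋆ only up to sign, which is why initializing near x⋆ (the second set of experiments) and near −x⋆ are equally good.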