
The Crawling Phenomenon in Sequential Convex Programming

Taylor P. Reynolds and Mehran Mesbahi

Abstract— The paper examines the so-called crawling phenomenon for a class of sequential convex programming algorithms. These algorithms are designed to solve non-convex optimization problems by using convex approximations, trust regions and relaxations. The crawling phenomenon occurs when the iterates of the algorithm get as close as permitted to each other, yet these iterates are not close to a stationary point of the original non-convex problem. It is shown that once the design parameters for a general class of such iterative algorithms are fixed, there are generic problem instances that lead to the crawling phenomenon. A simple example and potential remedies to address this phenomenon are also presented.

Index Terms— Non-convex optimization; sequential convex programming; crawling phenomenon

I. INTRODUCTION

Solving non-convex optimization problems is difficult. Alas, non-convex problems are pervasive in control synthesis problems, and more generally, engineering system design. Examples include optimal control problems with nonlinear dynamics (represented as non-convex algebraic or differential equality constraints), or problems that have non-convex constraints that cannot be "losslessly convexified" [1], [2], [3]. These problems are found in fields ranging from aerospace guidance [4], [5], [6] and mechanical truss design [7] to power grid optimization [8] and computer vision [9]. For such non-convex optimization problems, sequential convex programming (SCP) is a powerful framework with which one can design algorithms that find the desired solution(s). Other such techniques include nonlinear programming [10], [11], sum of squares optimization [12], [13] and evolutionary algorithms.

Sequential convex programming is a natural approach for solving non-convex optimization problems; convex programming is generally thought of as "easy" (from a computational perspective, due to the availability of interior point methods) and the associated theoretical backing (such as strong duality) is well established [14], [15]. Different variations of SCP techniques stem from the same idea: solve a sequence of convex approximations to the original non-convex problem, each time using the solution of a previous iteration's convex problem to improve the approximation. The main challenge lies in how the convex approximations are formulated, what structure is devised for measuring progress towards an optimal solution and updating the approximations, and how all of this lends itself to theoretical analysis.

Likely the simplest class of SCP methods is that of sequential linear programming (SLP). Algorithms in this class linearize all nonlinear functions about a current reference solution to obtain a linear program. The linear program is then solved (with a trust region) to obtain a new reference, and the process repeats [16], [17]. From a computational perspective, SLP became attractive early on due to the maturity of linear programming solvers. However, over time solvers for more general classes of convex optimization problems have advanced to the point that restricting oneself to linear programs to save computational resources is unnecessary (except perhaps for extremely large problems). More generally, difference of convex functions, or D.C., programming is a class of SCP methods [18]. The convex-concave procedure introduced in [19] decomposes each non-convex inequality constraint into a sum of convex and concave functions, linearizing the concave part while leaving the convex part intact. This procedure remains popular in modern applications, in particular for support vector machines and principal component analysis in the field of machine learning [20].

Another important class of SCP methods is that of sequential quadratic programming (SQP). The collective works of Han [21], Powell [22], [23], Boggs and Tolle [24], [25] and Fukushima [26] exerted significant influence on the early development of SQP algorithms, and their impact remains evident today. SQP methods approximate a non-convex problem with a quadratic program by approximating the Hessian of the non-convex problem's Lagrangian at a reference solution. The quadratic program is solved to obtain a new reference solution, and the process repeats. SQP methods are arguably among the most mature class of SCP methods [27], [28].

The use of quadratic programs requires that all constraints are affine in the solution variable. Many problems of interest are, however, subject to nonlinear constraints (both convex and non-convex). The class of SCP methods discussed herein are therefore those that solve a more general convex problem at each iteration (i.e., no a priori restriction to an LP or QP). More importantly, we discuss algorithms that are of trust region type and use slack variables to ensure that each convex approximation is always feasible. This class of SCP methods has been developed largely over the last decade, and represents one of the most active areas of current development [29], [30], [31], [32]. A subset of SLP and SQP methods are contained within this class, and our discussion holds for them as well.

The purpose of this paper is to demonstrate that this general class of SCP algorithms is susceptible to what we call the crawling phenomenon. The crawling phenomenon is defined as slow progress towards a stationary point of the non-convex problem when the algorithm is not close to any such solution. We show that the crawling phenomenon is a generic property of algorithms that use a Lagrangian-like function to measure the accuracy of the convex approximations at each iteration and update the trust region and reference solution accordingly. Essentially, the trust region and solution update rules that form a key component of the theoretical convergence analysis induce the undesirable crawling phenomenon. Arriving at a local minimum indeed implies that adjacent iterates will be close together (in the sense of their normed difference), but what we show is that for a certain general class of SCP algorithms, the converse does not necessarily hold.

This paper is organized as follows. First, we describe the generic class of SCP algorithms, our nomenclature and notation in §II. In §III we define the crawling phenomenon and prove that algorithms in this class are susceptible to it, followed by a simple example. Potential remedies for avoiding the crawling phenomenon are discussed in §IV, and §V offers concluding remarks.

This research has been supported by NASA grant NNX17AH02A SUP02 and NSERC grant PGSD3-502758-2017. The authors are with the W.E. Boeing Department of Aeronautics and Astronautics, University of Washington, Seattle, WA, USA. {tpr6,mesbahi}@uw.edu

Fig. 1: A depiction of the two causes of artificial infeasibility. (a) F1 = ∅, see (4a). (b) F2 = ∅, see (4b).

II. SEQUENTIAL CONVEX PROGRAMMING

Consider the class of optimization problems of the form,

    min_z  φ(z)
    s.t.   gi(z) ≤ 0,  i ∈ Icvx,          (P)
           hi(z) ≤ 0,  i ∈ Incvx,
           fj(z) = 0,  j ∈ Encvx,

where z ∈ R^nz, Icvx = {1, . . . , mi} and Incvx = {mi + 1, . . . , mi + ni} represent the indices of the convex and non-convex inequality constraints, and Encvx = {1, . . . , ne} represents the indices of the non-convex equality constraints. We assume that each hi and fj are at least once differentiable and, without loss of generality, that the cost function φ is convex. To simplify the notation, we assume that any convex (affine) equality constraints are represented by a pair of convex inequality constraints. Problem P is typically referred to as a nonlinear programming problem, but we shall refer to it as the "original" problem. While there exist general purpose solvers that are able to solve the original problem (see, e.g., [10], [11]), we focus on iterative schemes that are based on convex optimization.

At each iteration, an SCP method approximates the non-convex constraints hi and fj with convex functions of the solution variable of the form

    h̄i(z, z̄) ≤ 0,  i ∈ Incvx,            (1a)
    f̄j(z, z̄) = 0,  j ∈ Encvx,            (1b)

where z̄ ∈ R^nz is some reference. There are several choices for h̄i: first-order Taylor series, second-order Taylor series with (possibly approximated) positive semi-definite Hessian, inner convex approximations, or any other convex function that locally approximates the non-convex hi. The f̄j on the other hand must be affine functions of z. The approximations h̄i and f̄j are therefore local; their accuracy decreases as ‖z − z̄‖ increases. As a separate issue, consider the possible scenario where φ and each gi are linear functions, and we choose each h̄i to be the linearization of hi around z̄. In this case, the resulting optimization problem is unbounded below, a phenomenon referred to as artificial unboundedness. To address the local nature of the approximations (1) and to avoid artificial unboundedness, we add a trust region constraint of the form,

    ‖z − z̄‖q ≤ η,  q ∈ {1, 2, ∞},        (2)

where η ∈ R++ is a positive trust region radius. We may then construct the following convex approximation to the original problem Problem P:

    min_z  φ(z)                           (3a)
    s.t.   gi(z) ≤ 0,  i ∈ Icvx,          (3b)
           h̄i(z, z̄) ≤ 0,  i ∈ Incvx,     (3c)
           f̄j(z, z̄) = 0,  j ∈ Encvx,     (3d)
           ‖z − z̄‖q ≤ η.                  (3e)

Problem (3) is not necessarily well-defined due to artificial infeasibility that can arise in two forms. The following independent cases, depicted in Figure 1, can happen:

    F1 = {z | (3c), (3d) and (3e) are satisfied} = ∅,   (4a)
    F2 = {z | (3b), (3c) and (3d) are satisfied} = ∅.   (4b)

Artificial infeasibility can be avoided by adding so-called virtual control to (3c) and (3d) as,

    h̄i(z, z̄) − σi ≤ 0,  σi ≥ 0,          (5a)
    f̄j(z, z̄) − νj = 0.                   (5b)

The vectors σ ∈ R^ni_+ and ν ∈ R^ne are added as solution variables, and the resulting augmented convex program assumes the form,

    min_{z,σ,ν}  φ(z) + λP(σ, ν)
    s.t.   gi(z) ≤ 0,  i ∈ Icvx,
           h̄i(z, z̄) − σi ≤ 0,  i ∈ Incvx,     (C)
           f̄j(z, z̄) − νj = 0,  j ∈ Encvx,
           ‖z − z̄‖q ≤ η,

where P : R^ni × R^ne → R+ is an exact positive-definite penalty function and λ > 0 is a user-selected weight. Typically λ ≫ 1 to limit the use of virtual control only to cases when it is required to avoid artificial infeasibility. We shall refer to Problem C as a convex subproblem, and its optimal solution as an iterate, denoted by (z∗, σ∗, ν∗).

We focus on the class of SCP algorithms that iteratively solve Problem C and use the resulting iterate to update z̄ and the value of η, based on some assessment of the accuracy of the approximations (1). The resulting sequence of solutions z̄ approaches a locally optimal solution of Problem P under certain conditions and assumptions. There have been numerous papers that study convergence properties of such a setup, offering conditions/assumptions under which convergence is guaranteed to various definitions of minima [29], [30], [31], [32]. This paper does not discuss whether convergence is achieved or not, but instead demonstrates how the crawling phenomenon can affect the iterates as they converge.

A. Trust Region Updates

Most trust region methods dynamically update the value of η at each iteration according to a set of rules. These rules are designed to improve convergence and facilitate formal proofs, and use a performance metric that is a function of (z∗, σ∗, ν∗) to do so. The performance metric provides key feedback to the iterative process and guides convergence. To introduce two common performance metrics, we first define the functions,

    J(z) = φ(z) + λP(h(z), f(z)),          (6a)
    L(z) = φ(z) + λP(h̄(z, z̄), f̄(z, z̄)),  (6b)

where h(z) ∈ R^ni and f(z) ∈ R^ne are the concatenations of the non-convex constraints from Problem P, and similarly for h̄(z, z̄) and f̄(z, z̄). The functions in (6) can be thought of as Lagrangian-like functions for the original problem and the convex subproblem, respectively.

The first performance metric is simply the relative error

    ρ = (J(z∗) − L(z∗)) / L(z∗),           (7)

which measures the normalized difference between the functions in (6). The ideal value of the relative error is zero. A second performance metric is the relative decrease

    ρ = (J(z̄) − J(z∗)) / (J(z̄) − L(z∗)),  (8)

which measures how well the convex subproblem predicts the change in (6a) at the new iterate. The ideal value of the relative decrease is one. Once chosen, the scalar-valued performance metric ρ is used to grow, shrink, or maintain the radius of the trust region. The parameter ρ is also used to reject an iterate z∗ in certain cases. If ρ indicates poor prediction, the same convex approximation is kept (by rejecting z∗) but the trust region is shrunk, and the convex subproblem is re-solved. Note that we can assume, without loss of generality, that φ(z) > 0 for any z. Hence (7) and (8) can also be represented using absolute values in the numerator and denominator without affecting the discussion.

In order to decide which case we are in for a given iteration (accept/reject, shrink/keep/grow), the user defines three real numbers ρ0, ρ1, ρ2 ∈ (0, 1) that split the real number line into four segments. For each iterate, the value of ρ lies in one of these four segments, and the trust region and reference solution are updated according to the rules in Figure 2. The constants α, β > 1 and ηl > 0 are user-selected shrink/growth rates and a lower bound on the trust region radius. Definition 2.1 defines the class of SCP algorithms studied herein.

Definition 2.1: An algorithm for solving Problem P belongs to the class A_C if it iteratively solves Problem C and uses the performance metric (7) or (8) in conjunction with the appropriate update rules in Figure 2. Moreover, to denote the dependence of this class of algorithms on a set of parameters Π, we write A_C(Π).

Note that in our setup, the set Π in Definition 2.1 is composed of parameters such as ρ0, ρ1, ρ2, α, β, ηl, P, λ and q, but not the initial reference and trust region size. As a brief technical aside, the denominators in (7) and (8) must be verified to be non-zero before computing the value of ρ. In the case of (7), the condition φ(z) > 0 for all feasible z is sufficient. In the case of (8), a zero denominator actually implies that the iterations can be terminated, since L(z∗) ≤ L(z̄) = J(z̄) (note that this implies the denominator is always nonnegative). If J(z̄) − L(z∗) = 0, then the optimal solution found by solving Problem C is exactly the reference solution, and the iterations can be stopped [30].

III. THE CRAWLING PHENOMENON

An algorithm exhibits the crawling phenomenon when it is forced to take incrementally smaller steps, as measured by ‖z∗ − z̄‖q, for an iterate z∗ that is not close to a stationary point of the original problem. This section explores the relationship between the crawling phenomenon and the class of SCP algorithms A_C.

A. The Relative Error Case

Consider the sub-class of algorithms A_C^e ⊂ A_C that use the relative error (7) as the performance metric. From Figure 2, any algorithm of this type will shrink the trust region if ρ ≥ ρ1, and reject the iterate if ρ ≥ ρ2. Hence the trust region is shrunk if

    (J(z∗) − L(z∗)) / L(z∗) ≥ ρ1  ⇒  J(z∗) ≥ (1 + ρ1)L(z∗).   (9)

Using (6) this can be rewritten as

    P(h(z∗), f(z∗)) ≥ (ρ1/λ)φ(z∗) + (1 + ρ1)P(h̄(z∗, z̄), f̄(z∗, z̄)).   (10)

If inequality (10) is satisfied, we find ourselves in the two top-right cases in Figure 2: z∗ is either accepted or rejected, and the trust region is shrunk. A sufficient condition for the iterate to be rejected is P(h(z∗), f(z∗)) ≥ φ(z∗)/λ + 2P(h̄(z∗, z̄), f̄(z∗, z̄)), which states that the distance to the feasible set of Problem P for all iterates rejected by an algorithm in A_C^e is at least the sum of the (weighted) original cost and twice the distance to the feasible set of Problem (3).
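To make the merit functions (6) and the performance metrics (7)–(8) concrete, here is a small numeric sketch for a one-dimensional toy instance (the instance, numbers and function names are illustrative assumptions, not taken from the paper):

```python
# Toy illustration of the merit functions (6) and metrics (7)-(8) for a
# 1-D instance with one non-convex equality f(z) = z^2 - 1 = 0, exact
# penalty P = |.|, and a first-order Taylor convexification f_bar.

lam = 10.0                 # penalty weight lambda (illustrative)

def phi(z):                # convex cost, assumed positive
    return z + 2.0

def f(z):                  # non-convex equality residual
    return z * z - 1.0

def f_bar(z, z_ref):       # linearization of f about the reference z_ref
    return f(z_ref) + 2.0 * z_ref * (z - z_ref)

def J(z):                  # Lagrangian-like function (6a)
    return phi(z) + lam * abs(f(z))

def L(z, z_ref):           # convexified counterpart (6b)
    return phi(z) + lam * abs(f_bar(z, z_ref))

z_ref, z_star = 1.5, 1.1   # reference and candidate iterate

rho_error = (J(z_star) - L(z_star, z_ref)) / L(z_star, z_ref)          # (7), ideal 0
rho_decrease = (J(z_ref) - J(z_star)) / (J(z_ref) - L(z_star, z_ref))  # (8), ideal 1
```

With these numbers the linearization under-predicts the true penalty at `z_star`, so the relative error is well above zero while the relative decrease stays below one, which is exactly the kind of reading the Figure 2 rules act on.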

η ← βη η ← η η ← η/α η ← η/α Rel. error z¯ ← z∗ z¯ ← z∗ z¯ ← z∗ z¯ ← z¯ η = max{η, ηl} R η ← η/α η ← η/α η ← η η ← βη Rel. decrease z¯ ← z¯ z¯ ← z∗ z¯ ← z∗ z¯ ← z∗

Fig. 2: Update rules for two different performance metrics of a general sequential convex programming algorithm. and twice the distance to the feasible set of Problem3. Stated z2 another way, the distance to the feasible set of Problem P cost lines for all iterates accepted by the algorithm is bounded above current trust region by the same quantity. Suppose now that the reference z¯ is feasible for Prob- f(z) = 0 z¯ lem P, in the sense that P h(¯z), f(¯z) = 0. This implies that z¯ is a feasible point for Problem3 as well. Thus the optimal solution of Problem C will not use any virtual z1 1 ∗ effective max control provided that λ is large enough. The solution z will f¯(z, z¯) = 0 therefore satisfy h¯(z∗, z¯) ≤ 0 and f¯(z∗, z¯) = 0. If h¯(·, z¯) trust region ¯ (resp. f(·, z¯)) does not adequately capture the behaviour Fig. 3: The effective maximum trust region size for q = 2 of h(·) (resp. f(·)), then inequality (10) will be satisfied. that induces the crawling phenomenon. The green circle is This implies that for each instance of Problem P, for every the largest trust region such that (11) or (14) is not satisfied feasible z¯ there is an effective maximum trust region size that at the resulting iterate. The red trust region would be shrunk is implicitly defined by λ, the method used to “convexify” until it is a subset of the green trust region. the original problem and the value of the original cost. This effective trust region can be quite small when, for example, f(z) has a large Lipschitz constant in the set defined by the The inequality (13) states that the distance to the feasible trust region. Figure3 provides a depiction of this scenario; set of Problem P for any accepted iterate is bounded above and this discussion leads to the following result. by the (weighted) decrease in original cost plus the distance Lemma 3.1: Given an instance of Problem P and a ref- between z¯ and the feasible set of Problem P. 
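One possible reading of the Figure 2 update rules as code (a sketch; the default threshold values `rho0 < rho1 < rho2` and rates are placeholders, not the paper's recommendations):

```python
# A sketch of the Figure 2 trust region / reference update rules.
# `metric` selects the row of the table: "rel_error" (ideal rho is 0)
# or "rel_decrease" (ideal rho is 1).

def update(rho, eta, z_ref, z_star, metric,
           rho0=0.1, rho1=0.5, rho2=0.9, alpha=1.5, beta=2.0, eta_l=1e-3):
    """Return the updated trust region radius and reference solution."""
    if metric == "rel_error":
        if rho < rho0:                       # very accurate: grow, accept
            eta, z_ref = beta * eta, z_star
        elif rho < rho1:                     # accurate enough: keep, accept
            z_ref = z_star
        elif rho < rho2:                     # inaccurate: shrink, accept
            eta, z_ref = eta / alpha, z_star
        else:                                # poor prediction: shrink, reject
            eta = eta / alpha
    else:                                    # relative decrease
        if rho < rho0:                       # poor decrease: shrink, reject
            eta = eta / alpha
        elif rho < rho1:                     # modest decrease: shrink, accept
            eta, z_ref = eta / alpha, z_star
        elif rho < rho2:                     # good decrease: keep, accept
            z_ref = z_star
        else:                                # excellent decrease: grow, accept
            eta, z_ref = beta * eta, z_star
    return max(eta, eta_l), z_ref            # enforce the lower bound eta_l
```

Note that the two rows are mirror images of one another, since small ρ is good for the relative error but bad for the relative decrease.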
erence solution z¯ that is feasible for Problem P, the iterate Suppose now that z¯ is feasible for the original problem. By z∗ of any algorithm in the sub-class Ae (Π) will be used to C the same logic described in §III-A, we know that the solution shrink the trust region if z∗ will satisfy h¯(z∗, z¯) ≤ 0 and f¯(z∗, z¯) = 0. In this case the n o final two terms in the right hand side of (12) are zero, as is the z∗ ∈ z P h(z), f(z) ≥ ρ1φ(z) . (11) λ second term on the right hand side of (13). We are thus led to the existence of a maximum effective trust region size that B. The Relative Decrease Case is implicitly defined by λ, the method used to “convexify” d Consider now the sub-class of algorithms AC ⊂ AC that the original problem and the largest possible reduction in the use the relative decrease (8) as the performance metric. In original cost within the feasible set of Problem3. this case, the trust region is shrunk if ρ < ρ1 and rejected Remark 3.2: The inequality (13) also reveals that if the when ρ < ρ0. Using (6), the case ρ < ρ1 can be rewritten as, reference z¯ is feasible for Problem P, then any accepted iterate must (strictly) decrease the original cost. A similar P h(z∗), f(z∗) > 1−ρ1 (φ(¯z) − φ(z∗)) λ fact was noted for SLP in [17], and (13) shows that it holds  ¯ ∗ ¯ ∗  d + (1 − ρ1)P h(¯z), f(¯z) + ρ1P h(z , z¯), f(z , z¯) . for the more general class of algorithms AC. (12) Lemma 3.3: Given an instance of Problem P and a ref- If inequality (12) is satisfied, we find ourselves in the two erence solution z¯ that is feasible for Problem P, the iterate bottom-left cases in Figure2: the iterate z∗ is either accepted ∗ d z of any algorithm in the sub-class AC(Π) will be used to or rejected, and the trust region is shrunk. A sufficient shrink the trust region if condition for the iterate to be rejected is ∗   1−ρ1 z ∈ z P h(z), f(z) > λ (φ(¯z) − φ(z)) . (14) ∗ P h(z∗), f(z∗) > φ(¯z)−φ(z ) + P h(¯z), f(¯z). 
(13) λ The main result of this discussion is summarized in Theorem 3.4. We note that Theorem 3.4 states that the 1There exists a non-empty feasible set that uses no virtual control, solutions that are not in this set necessarily increase the cost for large values crawling phenomenon may be observed, while Lemmas 3.1 of λ. and 3.3 indicate when it is observed. Theorem 3.4: Given an instance of Problem P, any al- accepted iterates k − 1, k − 2, k − 3, (iii) ρ ∈ [ρlb, ρub] gorithm in the class AC(Π) can exhibit the crawling phe- for iterates k − 1, k − 2, k − 3 and (iv) cases (i)-(iii) have nomenon. been met twice, then the crawling phenomenon is occurring. C. A Simple Example The interval [ρlb, ρub] is a subset of [ρ1, ρ2] for the relative error and of [ρ0, ρ1] for the relative decrease. Based on This section provides a simple example that exhibits the Theorem 3.4, we must turn to algorithms that do not belong crawling phenomenon. Consider the following non-convex 2 to the class AC if we are to avoid the crawling phenomenon optimization problem in the variable z = (z1, z2) ∈ R , once it has been detected. In general, algorithms that belong min z1 + z2 (15a) to AC(Π) are quite good at quickly reaching feasibility −2≤z≤2 for Problem P2. Hence we may propose hybrid algorithms s.t. g (z) = −z − 4 z − 2 ≤ 0 (15b) 1 2 3 1 3 that are formed by combining one algorithm in the class AC 4 3 2 f(z) = z2 − z1 − 2z1 + 1.2z1 + 2z1 = 0 (15c) with a second that is not, but which assumes a feasible initial guess for the original problem. To illustrate this capability, Following the procedure outlined in §II, we can convex- when the crawling phenomenon is detected in the example ify Problem 15 in the form of Problem C by using a reference from §III-C, the algorithm is switched to the feasible SQP z¯ ∈ 2 and virtual control ν ∈ . The parameters used for R R method described in [33]. 
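The effective maximum trust region size can be made concrete with a small numeric sketch of the shrink condition (11). The instance below (f(z) = z2 − z1², feasible reference z̄ = (1, 1), positive cost φ(z) = z1 + z2 + 4) is a toy assumption, not from the paper:

```python
# Numeric illustration of the "effective maximum trust region" induced by
# condition (11): along the linearized constraint the true violation grows
# quadratically with the step, so long enough steps force a shrink.

lam, rho1 = 400.0, 0.5     # illustrative parameter values

def phi(z):                # convex cost, positive on the region of interest
    return z[0] + z[1] + 4.0

def f(z):                  # non-convex equality residual f(z) = z2 - z1^2
    return z[1] - z[0] ** 2

def step_on_linearization(t):
    # Points z = (1 + t, 1 + 2t) satisfy the linearization of f about
    # z_bar = (1, 1) exactly, so no virtual control is needed there;
    # the true violation at such a point is |f(z)| = t^2.
    return (1.0 + t, 1.0 + 2.0 * t)

def triggers_shrink(z):
    # membership in the set (11), with P(h, f) = |f(z)| here
    return abs(f(z)) >= (rho1 / lam) * phi(z)

# A short step stays below the threshold; a longer one forces a shrink,
# even though both lie exactly on the linearized constraint.
small = step_on_linearization(0.05)
large = step_on_linearization(0.3)
```

Because the violation grows like t² while the threshold (ρ1/λ)φ(z) is roughly constant near z̄, there is a largest usable step length: precisely the effective maximum trust region depicted in Figure 3.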
This algorithm guarantees that all this particular example are an initial trust region of η = 0.1, subsequent iterates are feasible for Problem P, but is not of an initial reference of z¯ = (1.5, 1.5), a penalty function trust region type, and does not convexify the problem in the P (ν) = |ν| and λ = 400. The values used to update the trust same way as in Problem C, hence does not belong to the class region size are ρ , ρ , ρ  = (0, 0.1, 0.9) and ρ , ρ , ρ ) = 0 1 2 0 1 2 A . The resulting iterates of this hybrid algorithm are shown (0.1, 0.9, 1.0) for the relative decrease and relative error C in Figure5. One can see that the crawling phenomenon, once respectively, and α = 1.5, β = 2.0, q = 2. Figure4 detected, is avoided, and the overall number of iterations is shows the resulting sequence of iterates for both performance greatly reduced. This hybrid algorithm effectively blends the metrics, as well as the value of ρ versus the iteration number. advantages of both algorithms. Figures 4a and 4b demonstrate the crawling phenomenon; progress towards a solution slows down considerably once V. CONCLUSIONS the curvature of the equality constraint increases towards the This paper has shown that all SCP algorithms that belong bottom, and Figure 4c confirms that this is a result of a to the class A are susceptible to the crawling phenomenon. shrinking trust region. In both cases, Figure 4c reveals a cycle C We have given an algebraic description of when two sub- that is reached whereby the approximation accuracy slowly classes of A will exhibit the behaviour, and offered geomet- worsens until the trust region is shrunk (the sharp changes C ric interpretations. As SCP algorithms become increasingly in ρ). This is a result of the nonlinear equality constraint popular, it is important to recognize their limitations. At bending away from the affine approximation, eventually the same time, knowledge of said limitations provides an leading to (11) or (14). 
opportunity to design better algorithms if needed. The use of It is important to resist the temptation to draw overly gen- hybrid algorithms was shown to be a promising candidate to eral conclusions from this simple example. For other choices mitigate the effect of the crawling phenomenon in sequential of initial reference z¯, the crawling phenomenon is avoided. convex programming. For example, convergence to the local minimum from an initial reference in the (infeasible) shaded gray region is ACKNOWLEDGMENT quite rapid, often terminating in ten or fewer iterations. The authors thank Miki Szmuk, Danylo Malyuta and There are, however, distinct regions in the upper-middle and Behc¸et Ac¸ıkmes¸e for many valuable discussions on the upper-right of the area plotted in Figure4 from which the subject of sequential convex programming. phenomenon is observed. This is inline with Theorem 3.4, which states only that the crawling phenomenon can happen. REFERENCES Moreover, it is clear that for this example the crawling [1] B. Ac¸ıkmes¸e and L. Blackmore, “Lossless Convexification of a Class phenomenon is occurring. This is only obvious because we of Optimal Control Problems with Non-Convex Control Constraints,” know exactly where each minimum is (i.e., we can plot the Automatica, vol. 47, no. 2, pp. 341–347, 2011. cost lines, constraints and iterates). The trouble, in general, [2] L. Blackmore, B. Ac¸ıkmes¸e, and J. M. Carson III, “Lossless Convexifi- cation of Control Constraints for a Class of Nonlinear Optimal Control is that it is not possible to know (or plot) the solutions Problems,” Systems and Control Letters, vol. 61, no. 8, pp. 863–870, to Problem P. Algorithms in the class AC are designed to 2012. find these solutions, often for the first time, and so it may [3] M. W. Harris and B. 
Ac¸ıkmes¸e, “Lossless Convexification of Non- Convex Optimal Control Problems for State Constrained Linear Sys- be unknown whether slow progress means one is close to a tems,” Automatica, vol. 50, no. 9, pp. 2304–2311, 2014. stationary point, or if it is the crawling phenomenon. [4] M. Szmuk, T. P. Reynolds, and B. Ac¸ıkmes¸e, “Successive Convexi- fication for Real-Time 6-DoF Powered Descent Guidance with State- IV. POSSIBLE REMEDIES Triggered Constraints,” arXiv e-prints, 2018. arXiv:1811.10803. The following scheme can be used to identify if the 2Prior to feasibility, the solutions of Problem C are the closest point in crawling phenomenon is occurring. If the algorithm at it- the trust region to the projection of z¯ onto the feasible set of Problem3. In eration k > 3: (i) chooses to shrink the trust region, (ii) the relative decrease case, in fact, one can show that J(¯z) > J(z∗). (a) Relative error solution. (b) Relative decrease solution. (c) The performance metric ρ saturated to the [0, 1] interval solely for plotting. Coloured regions correspond to Figure2.

Fig. 4: Iterates of two algorithms from A_C that show the crawling phenomenon for Problem (15). (a) Relative error solution. (b) Relative decrease solution. (c) The performance metric ρ, saturated to the [0, 1] interval solely for plotting; coloured regions correspond to Figure 2.

Fig. 5: Iterates of two hybrid algorithms that avoid the crawling phenomenon for Problem (15). (a) Relative error solution. (b) Relative decrease solution.
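The detection scheme of §IV can be sketched as follows; the per-iteration history record `(shrunk, accepted, rho)` is an assumption made purely for illustration:

```python
# A sketch of the Section IV crawling detector: at iteration k > 3, flag a
# hit if (i) the trust region is being shrunk, (ii) iterates k-1, k-2, k-3
# were accepted, and (iii) rho stayed in [rho_lb, rho_ub] at those
# iterations; declare crawling once (iv) this has happened twice.

def detect_crawling(history, rho_lb, rho_ub):
    """history: list of (shrunk, accepted, rho) tuples, one per iteration."""
    hits = 0
    for k in range(3, len(history)):
        shrunk_now = history[k][0]                             # (i)
        prev = history[k - 3:k]
        accepted = all(h[1] for h in prev)                     # (ii)
        in_band = all(rho_lb <= h[2] <= rho_ub for h in prev)  # (iii)
        if shrunk_now and accepted and in_band:
            hits += 1
            if hits == 2:                                      # (iv)
                return True
    return False
```

Per §IV, `[rho_lb, rho_ub]` should be chosen inside [ρ1, ρ2] for the relative error metric and inside [ρ0, ρ1] for the relative decrease metric, so that the band captures "accepted, but with a shrinking trust region" behaviour.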

[5] U. Lee and M. Mesbahi, "Constrained Autonomous Precision Landing via Dual Quaternions and Model Predictive Control," Journal of Guidance, Control, and Dynamics, vol. 40, no. 2, pp. 292–308, 2017.
[6] X. Liu and P. Lu, "Solving Nonconvex Optimal Control Problems by Convex Optimization," Journal of Guidance, Control, and Dynamics, vol. 37, no. 3, pp. 750–765, 2014.
[7] A. Beck, A. Ben-Tal, and L. Tetruashvili, "A Sequential Parametric Convex Approximation Method with Applications to Nonconvex Truss Topology Design Problems," Journal of Global Optimization, vol. 47, pp. 29–51, 2010.
[8] W. Wei, J. Wang, N. Li, and S. Mei, "Optimal Power Flow of Radial Networks and Its Variations: A Sequential Convex Optimization Approach," IEEE Transactions on Smart Grid, vol. 8, no. 6, pp. 2974–2987, 2017.
[9] H. Jiang, M. S. Drew, and Z.-N. Li, "Matching by Linear Programming and Successive Convexification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 959–975, 2007.
[10] P. E. Gill, W. Murray, and M. H. Wright, Practical Optimization. Academic Press, Inc., 1981.
[11] J. Nocedal and S. J. Wright, Numerical Optimization. New York, NY: Springer Science & Business Media, 2006.
[12] P. A. Parrilo, "Semidefinite Programming Relaxations for Semialgebraic Problems," Math. Program., Ser. B, vol. 96, pp. 293–320, 2003.
[13] G. Blekherman, P. A. Parrilo, and R. R. Thomas, eds., Semidefinite Optimization and Convex Algebraic Geometry. SIAM, 2012.
[14] R. T. Rockafellar, Convex Analysis. Princeton University Press, 1970.
[15] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, UK: Cambridge University Press, 2004.
[16] R. H. Byrd, N. I. M. Gould, J. Nocedal, and R. A. Waltz, "An Algorithm for Nonlinear Optimization Using Linear Programming and Equality Constrained Subproblems," Math. Program., Ser. B, vol. 100, pp. 27–48, 2004.
[17] F. Palacios-Gomez, L. Lasdon, and M. Engquist, "Nonlinear Optimization by Successive Linear Programming," Management Science, vol. 28, no. 10, pp. 1106–1120, 1982.
[18] R. Horst and N. V. Thoai, "DC Programming: Overview," Journal of Optimization Theory and Applications, vol. 103, no. 1, pp. 1–43, 1999.
[19] A. Yuille and A. Rangarajan, "The Concave-Convex Procedure (CCCP)," in Advances in Neural Information Processing Systems, pp. 1033–1040, 2002.
[20] G. R. Lanckriet and B. K. Sriperumbudur, "On the Convergence of the Concave-Convex Procedure," in Advances in Neural Information Processing Systems 22, pp. 1759–1767, 2009.
[21] S. P. Han, "A Globally Convergent Method for Nonlinear Programming," Journal of Optimization Theory and Applications, vol. 22, no. 3, pp. 297–309, 1977.
[22] M. J. Powell, "Algorithms for Nonlinear Constraints that use Lagrangian Functions," Mathematical Programming, vol. 14, no. 1, pp. 224–248, 1978.
[23] M. J. Powell and Y. Yuan, "A Recursive Quadratic Programming Algorithm that uses Differentiable Exact Penalty Functions," Mathematical Programming, vol. 35, pp. 265–278, 1986.
[24] P. T. Boggs and W. J. Tolle, "A Strategy for Global Convergence in a Sequential Quadratic Programming Algorithm," SIAM Journal on Numerical Analysis, vol. 26, no. 3, pp. 600–623, 1989.
[25] P. T. Boggs and W. J. Tolle, "Sequential Quadratic Programming," Acta Numerica, vol. 4, pp. 1–52, 1996.
[26] M. Fukushima, "A Successive Quadratic Programming Algorithm with Global and Superlinear Convergence Properties," Mathematical Programming, vol. 35, pp. 253–264, 1986.
[27] P. E. Gill, W. Murray, and M. A. Saunders, "SNOPT: An SQP Algorithm for Large-Scale Constrained Optimization," SIAM Review, vol. 47, no. 1, pp. 99–131, 2005.
[28] J. T. Betts and W. P. Huffman, "Path-Constrained Trajectory Optimization using Sparse Sequential Quadratic Programming," Journal of Guidance, Control, and Dynamics, vol. 16, pp. 59–68, Jan. 1993.
[29] Y. Mao, M. Szmuk, and B. Açıkmeşe, "Successive Convexification of Non-Convex Optimal Control Problems and its Convergence Properties," in IEEE Conference on Decision and Control, (Las Vegas, NV), 2016.
[30] Y. Mao, M. Szmuk, and B. Açıkmeşe, "Successive Convexification: A Superlinearly Convergent Algorithm for Non-convex Optimal Control Problems," arXiv e-prints, 2018. arXiv:1804.06539.
[31] R. Bonalli, A. Bylard, A. Cauligi, T. Lew, and M. Pavone, "Trajectory Optimization on Manifolds: A Theoretically-Guaranteed Embedded Sequential Convex Programming Approach," in Robotics: Science and Systems, June 2019.
[32] R. Bonalli, A. Cauligi, A. Bylard, and M. Pavone, "GuSTO: Guaranteed Sequential Trajectory Optimization via Sequential Convex Programming," in IEEE International Conference on Robotics and Automation (ICRA), 2019.
[33] C. T. Lawrence and A. L. Tits, "A Computationally Efficient Feasible Sequential Quadratic Programming Algorithm," SIAM Journal on Optimization, vol. 11, no. 4, pp. 1092–1118, 2003.