A Tutorial on Convex Optimization
Total Page:16
File Type:pdf, Size:1020Kb
ATutorial on Convex Optimization Haitham Hindi Palo Alto Research Center (PARC), Palo Alto, California email: [email protected] Abstract—Inrecent years, convex optimization has be- hence tractability) of many problems, often by come a computational tool of central importance in engi- inspection; neering, thanks to it’s ability to solve very large, practical 2) to review and introduce some canonical opti- engineering problems reliably and efficiently. The goal of this tutorial is to give an overview of the basic concepts of mization problems, which can be used to model convex sets, functions and convex optimization problems, so problems and for which reliable optimization code that the reader can more readily recognize and formulate can be readily obtained; engineering problems using modern convex optimization. 3) emphasize the modeling and formulation aspect; This tutorial coincides with the publication of the new book we do not discuss the aspects of writing custom on convex optimization, by Boyd and Vandenberghe [7], who have made available a large amount of free course codes. material and links to freely available code. These can be We assume that the reader has a working knowledge of downloaded and used immediately by the audience both linear algebra and vector calculus, and some (minimal) for self-study and to solve real problems. exposure to optimization. I. INTRODUCTION Our presentation is quite informal. Rather than pro- vide details for all the facts and claims presented, our Convex optimization can be described as a fusion goal is instead to give the reader a flavor for what is of three disciplines: optimization [22], [20], [1], [3], possible with convex optimization. Complete details can [4], convex analysis [19], [24], [27], [16], [13], and be found in [7], from which all the material presented numerical computation [26], [12], [10], [17]. It has here is taken. Thus we encourage the reader to skip recently become a tool of central importance in engi- sections that might not seem clear and continue reading; neering, enabling the solution of very large, practical the topics are not all interdependent. engineering problems reliably and efficiently. In some sense, convex optimization is providing new indispens- Motivation able computational tools today, which naturally extend Avast number of design problems in engineering can our ability to solve problems such as least squares and be posed as constrained optimization problems, of the linear programming to a much larger and richer class of form: problems. Our ability to solve these new types of problems minimize f0(x) ≤ comes from recent breakthroughs in algorithms for solv- subject to fi(x) 0,i=1,...,m (1) ing convex optimization problems [18], [23], [29], [30], hi(x)=0,i=1,...,p. coupled with the dramatic improvements in computing where x is a vector of decision variables, and the power, both of which have happened only in the past functions f0, fi and hi, respectively, are the cost, in- decade or so. Today, new applications of convex op- equality constraints, and equality constraints. However, timization are constantly being reported from almost such problems can be very hard to solve in general, every area of engineering, including: control, signal especially when the number of decision variables in x processing, networks, circuit design, communication, in- is large. There are several reasons for this difficulty: formation theory, computer science, operations research, First, the problem “terrain” may be riddled with local economics, statistics, structural design. See [7], [2], [5], optima. Second, it might be very hard to find a fea- [6], [9], [11], [15], [8], [21], [14], [28] and the references sible point (i.e.,anx which satisfy all equalities and therein. inequalities), in fact, the feasible set which needn’t even The objectives of this tutorial are: be fully connected, could be empty. Third, stopping 1) to show that there is are straight forward, sys- criteria used in general optimization algorithms are often tematic rules and facts, which when mastered, arbitrary. Forth, optimization algorithms might have very allow one to quickly deduce the convexity (and poor convergence rates. Fifth, numerical problems could cause the minimization algorithm to stop all together or where A = a1 ··· aq ;orasthe nullspace of a wander. matrix It has been known for a long time [19], [3], [16], [13] nullspace(B)={x | Bx =0} that if the fi are all convex, and the hi are affine, then { | T T } the first three problems disappear: any local optimum = x b1 x =0,...,bp x =0 is, in fact, a global optimum; feasibility of convex op- T timization problems can be determined unambiguously, where B = b1 ··· bp .Asanexample, let × at least in principle; and very precise stopping criteria Sn = {X ∈ Rn n | X = XT } denote the set are available using duality.However, convergence rate of symmetric matrices. Then Sn is a subspace, since and numerical sensitivity issues still remain a potential symmetric matrices are closed under addition. Another problem. waytosee this is to note that Sn can be written as n×n It was not until the late ’80’s and ’90’s that researchers {X ∈ R | Xij = Xji, ∀i, j} which is the nullspace in the former Soviet Union and United States discovered of linear function X − X T . n that if, in addition to convexity, the fi satisfied a property A set S ⊆ R is affine if it contains line through any known as self-concordance, then issues of convergence two points in it, i.e., and numerical sensitivity could be avoided using interior ∈ ∈ ⇒ ∈ point methods [18], [23], [29], [30], [25]. The self- x, y S, λ, µ R,λ+ µ =1= λx + µy S. concordance property is satisfied by a very large set of important functions used in engineering. Hence, it is now p possible to solve a large class of convex optimization λ =1.5 px problems in engineering with great efficiency. p λ =0.6 py II. CONVEX SETS p In this section we list some important convex sets λ = −0.5 and operations. It is important to note that some of these sets have different representations. Picking the Geometrically, an affine set is simply a subspace which right representation can make the difference between a is not necessarily centered at the origin. Two common tractable problem and an intractable one. representations for of an affine set are: the range of affine We will be concerned only with optimization prob- function q lems whose decision variables are vectors in Rn or S = {Az + b | z ∈ R }, × matrices in Rm n. Throughout the paper, we will make frequent use of informal sketches to help the reader or as the solution of a set of linear equalities: develop an intuition for the geometry of convex opti- T T S = {x | b1 x = d1,...,b x = dp} mization. p n m = {x | Bx = d}. A function f : R → R is affine if it has the form linear plus constant f(x)=Ax + b.IfF is a matrix n A set S ⊆ R is a convex set if it contains the line F : n → p×q F valued function, i.e., R R , then is affine segment joining any of its points, i.e., if it has the form x, y ∈ S, λ, µ ≥ 0,λ+ µ =1=⇒ λx + µy ∈ S F (x)=A0 + x1A1 + ···+ x A n n convex not convex p×q where Ai ∈ R .Affine functions are sometimes loosely refered to as linear. S ⊆ n y Recall that R is a subspace if it contains the x plane through any two of its points and the origin, i.e., x, y ∈ S, λ, µ ∈ R =⇒ λx + µy ∈ S. Geometrically, we can think of convex sets as always Two common representations of a subspace are as the bulging outward, with no dents or kinks in them. Clearly range of a matrix subspaces and affine sets are convex, since their defini- tions subsume convexity. { | ∈ q} range(A)= Aw w R A set S ⊆ Rn is a convex cone if it contains all = {w1a1 + ···+ wqaq | wi ∈ R} rays passing through its points which emanate from the 2 origin, as well as all line segments joining any points points; Co(S) is the unit simplex which is the triangle on those rays, i.e., joining the vectors along with all the points inside it; 3 Cone(S) is the nonnegative orthant R+. x, y ∈ S, λ, µ ≥ 0, =⇒ λx + µy ∈ S Recall that a hyperplane, represented as {x | aT x = Geometrically, x, y ∈ S means that S contains the entire b} (a =0 ), is in general an affine set, and is a subspace ‘pie slice’ between x, and y. if b =0. Another useful representation of a hyperplane T is given by {x | a (x − x0)=0}, where a is normal vector; x0 lies on hyperplane. Hyperplanes are convex, x since they contain all lines (and hence segments) joining any of their points. A halfspace, described as {x | aT x ≤ b} (a =0 )is y 0 generally convex and is a convex cone if b =0. Another T n useful representation is {x | a (x − x0) ≤ 0}, where The nonnegative orthant, R+ is a convex cone. The set n n a is (outward) normal vector and x0 lies on boundary. S+ = {X ∈ S | X 0} of symmetric positive Halfspaces are convex. semidefinite (PSD) matrices is also a convex cone, since aT x = jb any positive combination of semidefinite matrices is n semidefinite. Hence we call S+ the positive semidefinite a cone. p A convex cone K ⊆ Rn is said to be proper if it is x0 closed, has nonempty interior, and is pointed, i.e., there T is no line in K.Aproper cone K defines a generalized T a x ≥ b n a x ≤ b inequality K in R : x K y ⇐⇒ y − x ∈ K We now come to a very important fact about prop- ≺ ⇐⇒ − ∈ int (strict version x K y y x K).