2. Subgradients

L. Vandenberghe, ECE236C (Spring 2020)

Outline:
• definition
• subgradient calculus
• duality and optimality conditions
• directional derivative

Basic inequality

recall the basic inequality for differentiable convex functions:

    f(y) ≥ f(x) + ∇f(x)ᵀ(y − x)   for all y ∈ dom f

(figure: graph of f with the supporting hyperplane at (x, f(x)) and normal direction (∇f(x), −1))

• the first-order approximation of f at x is a global lower bound
• ∇f(x) defines a non-vertical supporting hyperplane to the epigraph of f at (x, f(x)):

    [∇f(x); −1]ᵀ [y − x; t − f(x)] ≤ 0   for all (y, t) ∈ epi f

Subgradient

g is a subgradient of a convex function f at x ∈ dom f if

    f(y) ≥ f(x) + gᵀ(y − x)   for all y ∈ dom f

(figure: graph of f(y) with affine lower bounds f(x₁) + g₁ᵀ(y − x₁), f(x₁) + g₂ᵀ(y − x₁), f(x₂) + g₃ᵀ(y − x₂); g₁, g₂ are subgradients at x₁, g₃ is a subgradient at x₂)

Subdifferential

the subdifferential ∂f(x) of f at x is the set of all subgradients:

    ∂f(x) = {g | gᵀ(y − x) ≤ f(y) − f(x) for all y ∈ dom f}

Properties

• ∂f(x) is a closed convex set (possibly empty); this follows from the definition: ∂f(x) is an intersection of halfspaces
• if x ∈ int dom f, then ∂f(x) is nonempty and bounded (proofs below)

Proof: ∂f(x) is nonempty when x ∈ int dom f

• (x, f(x)) is in the boundary of the convex set epi f
• therefore there exists a supporting hyperplane to epi f at (x, f(x)):

    ∃(a, b) ≠ 0:   [a; b]ᵀ [y − x; t − f(x)] ≤ 0   for all (y, t) ∈ epi f

• b > 0 gives a contradiction as t → ∞
• b = 0 gives a contradiction for y = x + εa with small ε > 0
• therefore b < 0, and g = a/|b| is a subgradient of f at x

Proof: ∂f(x) is bounded when x ∈ int dom f

• for small r > 0, define a set of 2n points

    B = {x ± re_k | k = 1, …, n} ⊂ dom f

  and define M = max_{y ∈ B} f(y) < ∞
• for every g ∈ ∂f(x), there is a point y ∈ B with

    r‖g‖∞ = gᵀ(y − x)

  (choose an index k with |g_k| = ‖g‖∞, and take y = x + r sign(g_k) e_k)
• since g is a subgradient, this implies that

    f(x) + r‖g‖∞ = f(x) + gᵀ(y − x) ≤ f(y) ≤ M

• we conclude that ∂f(x) is bounded:

    ‖g‖∞ ≤ (M − f(x)) / r   for all g ∈ ∂f(x)

Example

    f(x) = max{f₁(x), f₂(x)},   with f₁, f₂ convex and differentiable

• if f₁(x̂) = f₂(x̂), the subdifferential at x̂ is the line segment [∇f₁(x̂), ∇f₂(x̂)]
• if f₁(x̂) > f₂(x̂), the subdifferential at x̂ is {∇f₁(x̂)}
• if f₁(x̂) < f₂(x̂), the subdifferential at x̂ is {∇f₂(x̂)}

Examples

Absolute value: f(x) = |x|

    ∂f(x) = {−1} if x < 0,   ∂f(x) = [−1, 1] if x = 0,   ∂f(x) = {1} if x > 0

Euclidean norm: f(x) = ‖x‖₂

    ∂f(x) = {x / ‖x‖₂} if x ≠ 0,   ∂f(x) = {g | ‖g‖₂ ≤ 1} if x = 0

Monotonicity

the subdifferential of a convex function is a monotone operator:

    (u − v)ᵀ(x − y) ≥ 0   for all x, y, and u ∈ ∂f(x), v ∈ ∂f(y)

Proof: by definition,

    f(y) ≥ f(x) + uᵀ(y − x),   f(x) ≥ f(y) + vᵀ(x − y)

combining the two inequalities shows monotonicity
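As a quick numerical sanity check of these definitions (a minimal Python/NumPy sketch, not part of the original slides), the function below returns one subgradient of the Euclidean norm using the formula above and verifies the defining inequality f(y) ≥ f(x) + gᵀ(y − x) at random points:

```python
import numpy as np

def subgrad_norm2(x):
    """Return one subgradient of f(x) = ||x||_2."""
    nrm = np.linalg.norm(x)
    if nrm > 0:
        return x / nrm           # unique subgradient (the gradient) for x != 0
    return np.zeros_like(x)      # at x = 0, any g with ||g||_2 <= 1 works

rng = np.random.default_rng(0)
x = rng.standard_normal(5)
g = subgrad_norm2(x)
for _ in range(1000):
    y = rng.standard_normal(5)
    # subgradient inequality: f(y) >= f(x) + g^T (y - x)
    assert np.linalg.norm(y) >= np.linalg.norm(x) + g @ (y - x) - 1e-12
```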
Examples of non-subdifferentiable functions

the following functions are not subdifferentiable at x = 0:

• f : R → R, dom f = R₊,

    f(x) = 1 if x = 0,   f(x) = 0 if x > 0

• f : R → R, dom f = R₊,

    f(x) = −√x

in each case, the only supporting hyperplane to epi f at (0, f(0)) is vertical

Subgradients and sublevel sets

if g is a subgradient of f at x, then

    f(y) ≤ f(x)   ⟹   gᵀ(y − x) ≤ 0

the nonzero subgradients at x define supporting hyperplanes to the sublevel set {y | f(y) ≤ f(x)}

Subgradient calculus

Weak subgradient calculus: rules for finding one subgradient
• sufficient for most nondifferentiable convex optimization algorithms
• if you can evaluate f(x), you can usually compute a subgradient

Strong subgradient calculus: rules for finding ∂f(x) (all subgradients)
• some algorithms, optimality conditions, etc., need the entire subdifferential
• can be quite complicated

we will assume that x ∈ int dom f

Basic rules

Differentiable functions: ∂f(x) = {∇f(x)} if f is differentiable at x

Nonnegative linear combination: if f(x) = α₁f₁(x) + α₂f₂(x) with α₁, α₂ ≥ 0, then

    ∂f(x) = α₁ ∂f₁(x) + α₂ ∂f₂(x)

(the right-hand side is an addition of sets)

Affine transformation of variables: if f(x) = h(Ax + b), then

    ∂f(x) = Aᵀ ∂h(Ax + b)

Pointwise maximum

    f(x) = max{f₁(x), …, f_m(x)}

define I(x) = {i | fᵢ(x) = f(x)}, the 'active' functions at x

Weak result: to compute a subgradient at x, choose any k ∈ I(x), then any subgradient of f_k at x

Strong result:

    ∂f(x) = conv ( ∪_{i ∈ I(x)} ∂fᵢ(x) )

• the convex hull of the union of the subdifferentials of the 'active' functions at x
• if the fᵢ are differentiable, ∂f(x) = conv{∇fᵢ(x) | i ∈ I(x)}

Example: piecewise-linear function

    f(x) = max_{i = 1, …, m} (aᵢᵀx + bᵢ)

the subdifferential at x is a polyhedron

    ∂f(x) = conv{aᵢ | i ∈ I(x)},   with I(x) = {i | aᵢᵀx + bᵢ = f(x)}

Example: ℓ₁-norm

    f(x) = ‖x‖₁ = max_{s ∈ {−1, 1}ⁿ} sᵀx

the subdifferential is a product of intervals

    ∂f(x) = J₁ × ⋯ × J_n,   J_k = [−1, 1] if x_k = 0,   {1} if x_k > 0,   {−1} if x_k < 0

for example, in R²:

    ∂f(0, 0) = [−1, 1] × [−1, 1],   ∂f(1, 0) = {1} × [−1, 1],   ∂f(1, 1) = {(1, 1)}

Pointwise supremum

    f(x) = sup_{α ∈ A} f_α(x),   f_α(x) convex in x for every α

Weak result: to find a subgradient at x̂,
• find any β for which f(x̂) = f_β(x̂) (assuming the supremum is attained)
• choose any g ∈ ∂f_β(x̂)

(Partial) strong result: define I(x) = {α ∈ A | f_α(x) = f(x)}; then

    conv ( ∪_{α ∈ I(x)} ∂f_α(x) ) ⊆ ∂f(x)

equality requires extra conditions (for example, A compact, f_α continuous in α)

Exercise: maximum eigenvalue

Problem: explain how to find a subgradient of

    f(x) = λ_max(A(x)) = sup_{‖y‖₂ = 1} yᵀA(x)y

where A(x) = A₀ + x₁A₁ + ⋯ + x_nA_n with symmetric coefficients Aᵢ

Solution: to find a subgradient at x̂,
• choose any unit eigenvector y with eigenvalue λ_max(A(x̂))
• the gradient of yᵀA(x)y at x̂ is a subgradient of f:

    (yᵀA₁y, …, yᵀA_ny) ∈ ∂f(x̂)
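The eigenvalue exercise translates directly into code. Below is a minimal NumPy sketch (the function name and calling convention are illustrative assumptions, not from the slides) that evaluates f(x̂) = λ_max(A(x̂)) and returns the subgradient built from a unit eigenvector:

```python
import numpy as np

def lambda_max_subgrad(x, A):
    """f(x) = lambda_max(A0 + x[0]*A1 + ... + x[n-1]*An) and one subgradient.

    A is a list [A0, A1, ..., An] of symmetric matrices.
    """
    Ax = A[0] + sum(xi * Ai for xi, Ai in zip(x, A[1:]))
    w, V = np.linalg.eigh(Ax)       # eigenvalues in ascending order
    y = V[:, -1]                    # unit eigenvector for lambda_max(A(x))
    g = np.array([y @ Ai @ y for Ai in A[1:]])   # (y^T A1 y, ..., y^T An y)
    return w[-1], g
```

When λ_max(A(x̂)) is simple, the eigenvector is unique up to sign and this subgradient is the gradient; for a repeated top eigenvalue, different eigenvectors give different subgradients.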
Minimization

    f(x) = inf_y h(x, y),   h jointly convex in (x, y)

Weak result: to find a subgradient at x̂,
• find ŷ that minimizes h(x̂, y) (assuming the minimum is attained)
• find a subgradient (g, 0) ∈ ∂h(x̂, ŷ)

Proof: for all x, y,

    h(x, y) ≥ h(x̂, ŷ) + gᵀ(x − x̂) + 0ᵀ(y − ŷ) = f(x̂) + gᵀ(x − x̂)

therefore

    f(x) = inf_y h(x, y) ≥ f(x̂) + gᵀ(x − x̂)

Exercise: Euclidean distance to convex set

Problem: explain how to find a subgradient of

    f(x) = inf_{y ∈ C} ‖x − y‖₂

where C is a closed convex set

Solution: to find a subgradient at x̂,
• if f(x̂) = 0 (that is, x̂ ∈ C), take g = 0
• if f(x̂) > 0, find the projection ŷ = P(x̂) of x̂ on C and take

    g = (x̂ − ŷ) / ‖ŷ − x̂‖₂ = (x̂ − P(x̂)) / ‖x̂ − P(x̂)‖₂

Composition

    f(x) = h(f₁(x), …, f_k(x)),   h convex and nondecreasing, fᵢ convex

Weak result: to find a subgradient at x̂,
• find z ∈ ∂h(f₁(x̂), …, f_k(x̂)) and gᵢ ∈ ∂fᵢ(x̂)
• then g = z₁g₁ + ⋯ + z_kg_k ∈ ∂f(x̂)

this reduces to the standard chain rule for differentiable h, fᵢ

Proof:

    f(x) ≥ h( f₁(x̂) + g₁ᵀ(x − x̂), …, f_k(x̂) + g_kᵀ(x − x̂) )
         ≥ h( f₁(x̂), …, f_k(x̂) ) + zᵀ( g₁ᵀ(x − x̂), …, g_kᵀ(x − x̂) )
         = h( f₁(x̂), …, f_k(x̂) ) + (z₁g₁ + ⋯ + z_kg_k)ᵀ(x − x̂)
         = f(x̂) + gᵀ(x − x̂)

(the first inequality uses fᵢ(x) ≥ fᵢ(x̂) + gᵢᵀ(x − x̂) and monotonicity of h; the second uses the definition of z)

Optimal value function

define f(u, v) as the optimal value of the convex problem

    minimize    f₀(x)
    subject to  fᵢ(x) ≤ uᵢ,   i = 1, …, m
                Ax = b + v

(the functions fᵢ are convex; the optimization variable is x)

Weak result: suppose f(û, v̂) is finite and strong duality holds with the dual

    maximize    inf_x ( f₀(x) + Σᵢ λᵢ(fᵢ(x) − ûᵢ) + νᵀ(Ax − b − v̂) )
    subject to  λ ⪰ 0

if λ̂, ν̂ are optimal dual variables (for right-hand sides û, v̂), then

    (−λ̂, −ν̂) ∈ ∂f(û, v̂)

Proof: by weak duality for the problem with right-hand sides u, v,

    f(u, v) ≥ inf_x ( f₀(x) + Σᵢ λ̂ᵢ(fᵢ(x) − uᵢ) + ν̂ᵀ(Ax − b − v) )
            = inf_x ( f₀(x) + Σᵢ λ̂ᵢ(fᵢ(x) − ûᵢ) + ν̂ᵀ(Ax − b − v̂) ) − λ̂ᵀ(u − û) − ν̂ᵀ(v − v̂)
            = f(û, v̂) − λ̂ᵀ(u − û) − ν̂ᵀ(v − v̂)

Expectation

    f(x) = E h(x, u),   u random, h convex in x for every u

Weak result: to find a subgradient at x̂,
• choose a function u ↦ g(u) with g(u) ∈ ∂ₓh(x̂, u)
• then g = E_u g(u) ∈ ∂f(x̂)

Proof: by convexity of h and the definition of g(u),

    f(x) = E h(x, u) ≥ E ( h(x̂, u) + g(u)ᵀ(x − x̂) ) = f(x̂) + gᵀ(x − x̂)

Optimality conditions — unconstrained

x⋆ minimizes f(x) if and only if 0 ∈ ∂f(x⋆)

this follows directly from the definition of subgradient:

    f(y) ≥ f(x⋆) + 0ᵀ(y − x⋆) for all y   ⟺   0 ∈ ∂f(x⋆)

Example: piecewise-linear minimization

    f(x) = max_{i = 1, …, m} (aᵢᵀx + bᵢ)

Optimality condition:

    0 ∈ conv{aᵢ | i ∈ I(x⋆)},   where I(x) = {i | aᵢᵀx + bᵢ = f(x)}

• in other words, x⋆ is optimal if and only if there is a λ with

    λ ⪰ 0,   1ᵀλ = 1,   Σᵢ λᵢaᵢ = 0,   λᵢ = 0 for i ∉ I(x⋆)

• these are the optimality conditions for the equivalent linear program (left) and its dual (right):

    minimize    t                  maximize    bᵀλ
    subject to  Ax + b ⪯ t1        subject to  Aᵀλ = 0
                                               λ ⪰ 0,   1ᵀλ = 1

Optimality conditions — constrained

    minimize    f₀(x)
    subject to  fᵢ(x) ≤ 0,   i = 1, …, m

assume dom fᵢ = Rⁿ, so the functions fᵢ are subdifferentiable everywhere

Karush–Kuhn–Tucker conditions: if strong duality holds, then x⋆, λ⋆ are primal, dual optimal if and only if
1. x⋆ is primal feasible: fᵢ(x⋆) ≤ 0 for i = 1, …, m
2. λ⋆ ⪰ 0
3. λ⋆ᵢ fᵢ(x⋆) = 0 for i = 1, …, m (complementary slackness)
4. x⋆ minimizes the Lagrangian f₀(x) + Σᵢ λ⋆ᵢ fᵢ(x) over x, that is,

    0 ∈ ∂f₀(x⋆) + Σᵢ λ⋆ᵢ ∂fᵢ(x⋆)
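To close the loop on the piecewise-linear example, the optimality condition 0 ∈ conv{aᵢ | i ∈ I(x⋆)} can be tested numerically as a small feasibility LP in λ. The sketch below assumes SciPy is available; the helper name and tolerance are illustrative, not from the slides:

```python
import numpy as np
from scipy.optimize import linprog

def pwl_optimality_check(A, b, x, tol=1e-8):
    """Check 0 in conv{a_i : i in I(x)} for f(x) = max_i (a_i^T x + b_i).

    A is an m-by-n array whose rows are a_i^T; b has length m.
    """
    vals = A @ x + b
    active = vals >= vals.max() - tol        # active set I(x)
    Ai = A[active]
    k, n = Ai.shape
    # feasibility LP: find lambda >= 0 with 1^T lambda = 1 and Ai^T lambda = 0
    A_eq = np.vstack([Ai.T, np.ones((1, k))])
    b_eq = np.concatenate([np.zeros(n), [1.0]])
    res = linprog(c=np.zeros(k), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.status == 0                   # feasible <=> x satisfies the condition
```

A feasible λ returned here is exactly the multiplier vector in the condition above, extended by zeros outside the active set.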