Mind the Duality Gap: Safer Rules for the Lasso


Olivier Fercoq, Alexandre Gramfort, Joseph Salmon
Institut Mines-Télécom, Télécom ParisTech, CNRS LTCI, 46 rue Barrault, 75013, Paris, France

Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 2015. JMLR: W&CP volume 37. Copyright 2015 by the author(s).

Abstract

Screening rules make it possible to discard irrelevant variables early from the optimization in Lasso problems, or its derivatives, making solvers faster. In this paper, we propose new versions of the so-called safe rules for the Lasso. Based on duality gap considerations, our new rules create safe test regions whose diameters converge to zero, provided that one relies on a converging solver. This property helps screen out more variables, for a wider range of regularization parameter values. In addition to faster convergence, we prove that we correctly identify the active sets (supports) of the solutions in finite time. While our proposed strategy can cope with any solver, its performance is demonstrated using a coordinate descent algorithm particularly adapted to machine learning use cases. Significant computing time reductions are obtained with respect to previous safe rules.

1. Introduction

Since the mid-1990s, high dimensional statistics has attracted considerable attention, especially in the context of linear regression with more explanatory variables than observations: the so-called $p > n$ case. In such a context, the least squares with $\ell_1$ regularization, referred to as the Lasso (Tibshirani, 1996) in statistics, or Basis Pursuit (Chen et al., 1998) in signal processing, has been one of the most popular tools. It enjoys theoretical guarantees (Bickel et al., 2009), as well as practical benefits: it provides sparse solutions and fast convex solvers are available. This has made the Lasso a popular method in modern data-science toolkits. Among successful fields where it has been applied, one can mention dictionary learning (Mairal, 2010), biostatistics (Haury et al., 2012) and medical imaging (Lustig et al., 2007; Gramfort et al., 2012) to name a few.

Many algorithms exist to approximate Lasso solutions, but it is still a burning issue to accelerate solvers in high dimensions. Indeed, although some other variable selection and prediction methods exist (Fan & Lv, 2008), the best performing methods usually rely on the Lasso. For stability selection methods (Meinshausen & Bühlmann, 2010; Bach, 2008; Varoquaux et al., 2012), hundreds of Lasso problems need to be solved. For non-convex approaches such as SCAD (Fan & Li, 2001) or MCP (Zhang, 2010), solving the Lasso is often a required preliminary step (Zou, 2006; Zhang & Zhang, 2012; Candès et al., 2008).

Among possible algorithmic candidates for solving the Lasso, one can mention homotopy methods (Osborne et al., 2000), LARS (Efron et al., 2004), and approximate homotopy (Mairal & Yu, 2012), which provide solutions for the full Lasso path, i.e., for all possible choices of tuning parameter $\lambda$. More recently, particularly for $p > n$, coordinate descent approaches (Friedman et al., 2007) have proved to be among the best methods to tackle large scale problems.

Following the seminal work by El Ghaoui et al. (2012), screening techniques have emerged as a way to exploit the known sparsity of the solution by discarding features prior to starting a Lasso solver. Such techniques are coined safe rules when they screen out coefficients guaranteed to be zero in the targeted optimal solution. Zeroing those coefficients allows one to focus more precisely on the non-zero ones (likely to represent signal) and helps reduce the computational burden. We refer to (Xiang et al., 2014) for a concise introduction on safe rules. Other alternatives have tried to screen the Lasso by relaxing the "safety". Potentially, some variables are wrongly disregarded and post-processing is needed to recover them. This is for instance the strategy adopted for the strong rules (Tibshirani et al., 2012).

The original basic safe rules operate as follows: one chooses a fixed tuning parameter $\lambda$, and before launching any solver, tests whether a coordinate can be zeroed or not (equivalently if the corresponding variable can be disregarded or not). We will refer to such safe rules as static safe rules. Note that the test is performed according to a safe region, i.e., a region containing a dual optimal solution of the Lasso problem. In the static case, the screening is performed only once, prior to any optimization iteration. Two directions have emerged to improve on static strategies.

• The first direction is oriented towards the resolution of the Lasso for a large number of tuning parameters. Indeed, practitioners commonly compute the Lasso over a grid of parameters and select the best one in a data-driven manner, e.g., by cross-validation. As two consecutive $\lambda$'s in the grid lead to similar solutions, knowing the first solution may help improve screening for the second one. We call such strategies sequential safe rules; they are also referred to as recursive safe rules in (El Ghaoui et al., 2012). This road has been pursued in (Wang et al., 2013; Xu & Ramadge, 2013; Xiang et al., 2014), and can be thought of as a "warm start" of the screening (in addition to the warm start of the solution itself). When performing sequential safe rules, one should keep in mind that generally, only an approximation of the previous dual solution is computed. However, the safety of the rule is guaranteed only if one uses the exact solution. Neglecting this issue leads to "unsafe" rules: relevant variables might be wrongly disregarded.

• The second direction aims at improving the screening by interlacing it throughout the optimization algorithm itself: although screening might be useless at the beginning of the algorithm, it might become (more) efficient as the algorithm proceeds towards the optimal solution. We call these strategies dynamic safe rules following (Bonnefoy et al., 2014a;b).

Based on convex optimization arguments, we leverage duality gap computations to propose a simple strategy unifying both sequential and dynamic safe rules. We call such safe rules GAP SAFE rules.

The main contributions of this paper are 1) the introduction of new safe rules which demonstrate a clear practical improvement compared to prior strategies and 2) the definition of a theoretical framework for comparing safe rules by looking at the convergence of their associated safe regions.

In Section 2, we present the framework and the basic concepts which guarantee the soundness of static and dynamic screening rules. Then, in Section 3, we introduce the new concept of converging safe rules. Such rules identify in finite time the active variables of the optimal solution (or equivalently the inactive variables), and the tests become more and more precise as the optimization algorithm proceeds. We also show that our new GAP SAFE rules, built on dual gap computations, are converging safe rules since their associated safe regions have a diameter converging to zero. We also explain how our GAP SAFE tests are sequential by nature. Application of our GAP SAFE rules with a coordinate descent solver for the Lasso problem is proposed in Section 4. Using standard data-sets, we report the time improvement compared to prior safe rules.

1.1. Model and notation

We denote by $[d]$ the set $\{1, \ldots, d\}$ for any integer $d \in \mathbb{N}$. Our observation vector is $y \in \mathbb{R}^n$ and the design matrix $X = [x_1, \cdots, x_p] \in \mathbb{R}^{n \times p}$ has $p$ explanatory variables (or features) column-wise. We aim at approximating $y$ as a linear combination of few variables $x_j$'s, hence expressing $y$ as $X\beta$ where $\beta \in \mathbb{R}^p$ is a sparse vector. The standard Euclidean norm is written $\|\cdot\|$, the $\ell_1$ norm $\|\cdot\|_1$, the $\ell_\infty$ norm $\|\cdot\|_\infty$, and the matrix transposition of a matrix $Q$ is denoted by $Q^\top$. We denote $(t)_+ = \max(0, t)$.

For such a task, the Lasso is often considered (see Bühlmann & van de Geer (2011) for an introduction). For a tuning parameter $\lambda > 0$, controlling the trade-off between data fidelity and sparsity of the solutions, a Lasso estimator $\hat\beta^{(\lambda)}$ is any solution of the primal optimization problem

$$\hat\beta^{(\lambda)} \in \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p} \underbrace{\frac{1}{2}\|X\beta - y\|_2^2 + \lambda\|\beta\|_1}_{=P_\lambda(\beta)}. \tag{1}$$

Denoting $\Delta_X = \{\theta \in \mathbb{R}^n : |x_j^\top \theta| \le 1, \ \forall j \in [p]\}$ the dual feasible set, a dual formulation of the Lasso reads (see for instance Kim et al. (2007) or Xiang et al. (2014)):

$$\hat\theta^{(\lambda)} = \operatorname*{arg\,max}_{\theta \in \Delta_X \subset \mathbb{R}^n} \underbrace{\frac{1}{2}\|y\|_2^2 - \frac{\lambda^2}{2}\left\|\theta - \frac{y}{\lambda}\right\|_2^2}_{=D_\lambda(\theta)}. \tag{2}$$

We can reinterpret Eq. (2) as $\hat\theta^{(\lambda)} = \Pi_{\Delta_X}(y/\lambda)$, where $\Pi_C$ refers to the projection onto a closed convex set $C$. In particular, this ensures that the dual solution $\hat\theta^{(\lambda)}$ is always unique, unlike the primal solution $\hat\beta^{(\lambda)}$.

1.2. A KKT detour

For the Lasso problem, a primal solution $\hat\beta^{(\lambda)} \in \mathbb{R}^p$ and the dual solution $\hat\theta^{(\lambda)} \in \mathbb{R}^n$ are linked through the relation:

$$y = X\hat\beta^{(\lambda)} + \lambda\hat\theta^{(\lambda)}. \tag{3}$$

The Karush-Kuhn-Tucker (KKT) conditions state:

$$\forall j \in [p], \quad x_j^\top \hat\theta^{(\lambda)} \in \begin{cases} \{\operatorname{sign}(\hat\beta_j^{(\lambda)})\} & \text{if } \hat\beta_j^{(\lambda)} \neq 0, \\ [-1, 1] & \text{if } \hat\beta_j^{(\lambda)} = 0. \end{cases} \tag{4}$$

| Rule | Center | Radius | Ingredients |
|---|---|---|---|
| Static Safe (El Ghaoui et al., 2012) | $y/\lambda$ | $R_\lambda(y/\lambda_{\max})$ | $\lambda_{\max} = \lVert X^\top y\rVert_\infty = \lvert x_{j^\star}^\top y\rvert$ |
| Dynamic ST3 (Xiang et al., 2011) | $y/\lambda - \delta x_{j^\star}$ | $(R_\lambda(\theta_k)^2 - \delta^2)^{1/2}$ | $\delta = \left(\frac{\lambda_{\max}}{\lambda} - 1\right)/\lVert x_{j^\star}\rVert$ |
| Dynamic Safe (Bonnefoy et al., 2014a) | $y/\lambda$ | $R_\lambda(\theta_k)$ | $\theta_k \in \Delta_X$ (e.g., as in (11)) |
| Sequential (Wang et al., 2013) | $\hat\theta^{(\lambda_{t-1})}$ | $\left\lvert \frac{1}{\lambda_{t-1}} - \frac{1}{\lambda_t} \right\rvert \lVert y\rVert$ | exact $\hat\theta^{(\lambda_{t-1})}$ required |
| GAP SAFE sphere (proposed) | $\theta_k$ | $r_{\lambda_t}(\beta_k, \theta_k) = \frac{1}{\lambda_t}\sqrt{2 G_{\lambda_t}(\beta_k, \theta_k)}$ | dual gap for $\beta_k, \theta_k$ |

Table 1.
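As a rough illustration of the GAP SAFE sphere row of Table 1, the sketch below evaluates the primal objective of Eq. (1) and the dual objective of Eq. (2), forms the duality gap $G_\lambda(\beta, \theta) = P_\lambda(\beta) - D_\lambda(\theta)$, and uses the radius $\frac{1}{\lambda}\sqrt{2 G_\lambda(\beta, \theta)}$ to test each feature. Several ingredients are assumptions that the text above does not spell out: all function names are hypothetical, the feasible dual point is built by rescaling the residual, the sphere test discards feature $j$ when $|x_j^\top \theta| + r\|x_j\| < 1$ (which rules out $|x_j^\top \hat\theta^{(\lambda)}| = 1$ in Eq. (4)), and the small coordinate descent loop merely stands in for any converging solver.

```python
import numpy as np


def primal(X, y, beta, lam):
    """P_lambda(beta) as in Eq. (1)."""
    return 0.5 * np.sum((X @ beta - y) ** 2) + lam * np.sum(np.abs(beta))


def dual(y, theta, lam):
    """D_lambda(theta) as in Eq. (2)."""
    return 0.5 * np.sum(y ** 2) - 0.5 * lam ** 2 * np.sum((theta - y / lam) ** 2)


def feasible_dual_point(X, y, beta, lam):
    """Rescale the residual so that |x_j^T theta| <= 1 for all j (assumed choice)."""
    residual = y - X @ beta
    return residual / max(lam, np.max(np.abs(X.T @ residual)))


def gap_safe_sphere_screen(X, y, beta, lam):
    """Return (gap, radius, mask); mask[j] is True when feature j passes the test."""
    theta = feasible_dual_point(X, y, beta, lam)
    # Weak duality gives a non-negative gap; the max() only guards rounding.
    gap = max(primal(X, y, beta, lam) - dual(y, theta, lam), 0.0)
    radius = np.sqrt(2.0 * gap) / lam
    # Assumed sphere test: the largest |x_j^T t| over the ball centered at
    # theta is |x_j^T theta| + radius * ||x_j||; if it is below 1, Eq. (4)
    # forces the j-th optimal coefficient to be zero.
    scores = np.abs(X.T @ theta) + radius * np.linalg.norm(X, axis=0)
    return gap, radius, scores < 1.0


def coordinate_descent(X, y, lam, n_epochs=200):
    """Plain cyclic coordinate descent with soft-thresholding (illustrative)."""
    beta = np.zeros(X.shape[1])
    residual = y.copy()
    col_sq = np.sum(X ** 2, axis=0)
    for _ in range(n_epochs):
        for j in range(X.shape[1]):
            if col_sq[j] == 0.0:
                continue
            old = beta[j]
            z = old + X[:, j] @ residual / col_sq[j]
            beta[j] = np.sign(z) * max(abs(z) - lam / col_sq[j], 0.0)
            if beta[j] != old:
                residual -= (beta[j] - old) * X[:, j]
    return beta


# Synthetic data, made up for this sketch only.
rng = np.random.default_rng(0)
X = rng.standard_normal((30, 50))
y = X[:, :3] @ np.array([2.0, -1.5, 1.0]) + 0.1 * rng.standard_normal(30)
lam = 0.5 * np.max(np.abs(X.T @ y))  # half of lambda_max

beta = coordinate_descent(X, y, lam)
gap, radius, discard = gap_safe_sphere_screen(X, y, beta, lam)
print(f"gap={gap:.2e}, radius={radius:.2e}, screened={discard.sum()}/{X.shape[1]}")
```

Because the radius shrinks with the gap, calling `gap_safe_sphere_screen` on successive solver iterates should flag more features as the iterates improve, which is the dynamic behavior described above. For $\lambda > \lambda_{\max}$ and $\beta = 0$, the gap is zero and every feature passes the test.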