Chapter 4 Duality
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Boosting Algorithms for Maximizing the Soft Margin
Boosting Algorithms for Maximizing the Soft Margin Manfred K. Warmuth∗ Karen Glocer Gunnar Ratsch¨ Dept. of Engineering Dept. of Engineering Friedrich Miescher Laboratory University of California University of California Max Planck Society Santa Cruz, CA, U.S.A. Santa Cruz, CA, U.S.A. T¨ubingen, Germany Abstract We present a novel boosting algorithm, called SoftBoost, designed for sets of bi- nary labeled examples that are not necessarily separable by convex combinations of base hypotheses. Our algorithm achieves robustness by capping the distribu- tions on the examples. Our update of the distribution is motivated by minimizing a relative entropy subject to the capping constraints and constraints on the edges of the obtained base hypotheses. The capping constraints imply a soft margin in the dual optimization problem. Our algorithm produces a convex combination of hypotheses whose soft margin is within δ of its maximum. We employ relative en- ln N tropy projection methods to prove an O( δ2 ) iteration bound for our algorithm, where N is number of examples. We compare our algorithm with other approaches including LPBoost, Brown- Boost, and SmoothBoost. We show that there exist cases where the numberof iter- ations required by LPBoost grows linearly in N instead of the logarithmic growth for SoftBoost. In simulation studies we show that our algorithm converges about as fast as LPBoost, faster than BrownBoost, and much faster than SmoothBoost. In a benchmark comparison we illustrate the competitiveness of our approach. 1 Introduction Boosting methods have been used with great success in many applications like OCR, text classifi- cation, natural language processing, drug discovery, and computational biology [13]. -
On the Dual Formulation of Boosting Algorithms
On the Dual Formulation of Boosting Algorithms Chunhua Shen, and Hanxi Li Abstract—We study boosting algorithms from a new perspective. We show that the Lagrange dual problems of ℓ1 norm regularized AdaBoost, LogitBoost and soft-margin LPBoost with generalized hinge loss are all entropy maximization problems. By looking at the dual problems of these boosting algorithms, we show that the success of boosting algorithms can be understood in terms of maintaining a better margin distribution by maximizing margins and at the same time controlling the margin variance. We also theoretically prove that, approximately, ℓ1 norm regularized AdaBoost maximizes the average margin, instead of the minimum margin. The duality formulation also enables us to develop column generation based optimization algorithms, which are totally corrective. We show that they exhibit almost identical classification results to that of standard stage-wise additive boosting algorithms but with much faster convergence rates. Therefore fewer weak classifiers are needed to build the ensemble using our proposed optimization technique. Index Terms—AdaBoost, LogitBoost, LPBoost, Lagrange duality, linear programming, entropy maximization. ✦ 1 INTRODUCTION that the hard-margin LPBoost does not perform well in OOSTING has attracted a lot of research interests since most cases although it usually produces larger minimum B the first practical boosting algorithm, AdaBoost, was margins. More often LPBoost has worse generalization introduced by Freund and Schapire [1]. The machine learn- performance. In other words, a higher minimum margin ing community has spent much effort on understanding would not necessarily imply a lower test error. Breiman [11] how the algorithm works [2], [3], [4]. -
The Shadow Price of Health
This PDF is a selection from an out-of-print volume from the National Bureau of Economic Research Volume Title: The Demand for Health: A Theoretical and Empirical Investigation Volume Author/Editor: Michael Grossman Volume Publisher: NBER Volume ISBN: 0-87014-248-8 Volume URL: http://www.nber.org/books/gros72-1 Publication Date: 1972 Chapter Title: The Shadow Price of Health Chapter Author: Michael Grossman Chapter URL: http://www.nber.org/chapters/c3486 Chapter pages in book: (p. 11 - 30) II THE SHADOW PRICE OF HEALTH In the previous chapter, 1 showed how a consumer selects the optimal quantity of health in any period of his life. In this chapter, 1 explore the effects of changes in the supply and demand prices of health in the context of the pure investment model. In Section 1, I comment on the demand curve for health capital in the investment model.. In Section 2, I relate variations in depreciation rates with age to life cycle patterns of health and gross investment. I also examine the impact of changes in depreciation rates among individuals of the same age and briefly incorporate uncertainty into the model. In the third section, I consider the effects of shifts in market efficiency, measured by the wage rate, and nonmarket efficiency, measured by human capital, on supply and demand prices. 1. THE INVESTMENT DEMAND CURVE if the marginal utility of healthy days or the marginal disutility of sick days were equal to zero, health would be solely an investment commodity. The optimal amount of could then be found by equating the marginal monetary rate of return on an investment in health to the cost of health capital: = = r — + (2-1) Setting Uh1 =0in equation (1-13'), one derives equation (2-1). -
Duality, Part 1: Dual Bases and Dual Maps Notation
Duality, part 1: Dual Bases and Dual Maps Notation F denotes either R or C. V and W denote vector spaces over F. Define ': R3 ! R by '(x; y; z) = 4x − 5y + 2z. Then ' is a linear functional on R3. n n Fix (b1;:::; bn) 2 C . Define ': C ! C by '(z1;:::; zn) = b1z1 + ··· + bnzn: Then ' is a linear functional on Cn. Define ': P(R) ! R by '(p) = 3p00(5) + 7p(4). Then ' is a linear functional on P(R). R 1 Define ': P(R) ! R by '(p) = 0 p(x) dx. Then ' is a linear functional on P(R). Examples: Linear Functionals Definition: linear functional A linear functional on V is a linear map from V to F. In other words, a linear functional is an element of L(V; F). n n Fix (b1;:::; bn) 2 C . Define ': C ! C by '(z1;:::; zn) = b1z1 + ··· + bnzn: Then ' is a linear functional on Cn. Define ': P(R) ! R by '(p) = 3p00(5) + 7p(4). Then ' is a linear functional on P(R). R 1 Define ': P(R) ! R by '(p) = 0 p(x) dx. Then ' is a linear functional on P(R). Linear Functionals Definition: linear functional A linear functional on V is a linear map from V to F. In other words, a linear functional is an element of L(V; F). Examples: Define ': R3 ! R by '(x; y; z) = 4x − 5y + 2z. Then ' is a linear functional on R3. Define ': P(R) ! R by '(p) = 3p00(5) + 7p(4). Then ' is a linear functional on P(R). R 1 Define ': P(R) ! R by '(p) = 0 p(x) dx. -
Applications of LP Duality
Lecture 6 Duality of LPs and Applications∗ Last lecture we introduced duality of linear programs. We saw how to form duals, and proved both the weak and strong duality theorems. In this lecture we will see a few more theoretical results and then begin discussion of applications of duality. 6.1 More Duality Results 6.1.1 A Quick Review Last time we saw that if the primal (P) is max c>x s:t: Ax ≤ b then the dual (D) is min b>y s:t: A>y = c y ≥ 0: This is just one form of the primal and dual and we saw that the transformation from one to the other is completely mechanical. The duality theorem tells us that if (P) and (D) are a primal-dual pair then we have one of the three possibilities 1. Both (P) and (D) are infeasible. 2. One is infeasible and the other is unbounded. 3. Both are feasible and if x∗ and y∗ are optimal solutions to (P) and (D) respectively, then c>x∗ = b>y∗. *Lecturer: Anupam Gupta. Scribe: Deepak Bal. 1 LECTURE 6. DUALITY OF LPS AND APPLICATIONS 2 6.1.2 A Comment about Complexity Note that the duality theorem (and equivalently, the Farkas Lemma) puts several problems related to LP feasibility and solvability in NP \ co-NP. E.g., Consider the question of whether the equational form LP Ax = b; x ≥ 0 is feasible. If the program is feasible, we may efficiently verify this by checking that a \certificate” point satisfies the equations. By taking this point to be a vertex and appealing to Hwk1 (Problem 4), we see that we may represent this certificate point in size polynomial in the size of the input. -
Duality Theory and Sensitivity Analysis
4 Duality Theory and Sensitivity Analysis The notion of duality is one of the most important concepts in linear programming. Basically, associated with each linear programming problem (we may call it the primal problem), defined by the constraint matrix A, the right-hand-side vector b, and the cost vector c, there is a corresponding linear programming problem (called the dual problem) which is constructed by the same set of data A, b, and c. A pair of primal and dual problems are closely related. The interesting relationship between the primal and dual reveals important insights into solving linear programming problems. To begin this chapter, we introduce a dual problem for the standard-form linear programming problem. Then we study the fundamental relationship between the primal and dual problems. Both the "strong" and "weak" duality theorems will be presented. An economic interpretation of the dual variables and dual problem further exploits the concepts in duality theory. These concepts are then used to derive two important simplex algorithms, namely the dual simplex algorithm and the primal dual algorithm, for solving linear programming problems. We conclude this chapter with the sensitivity analysis, which is the study of the effects of changes in the parameters (A, b, and c) of a linear programming problem on its optimal solution. In particular, we study different methods of changing the cost vector, changing the right-hand-side vector, adding and removing a variable, and adding and removing a constraint in linear programming. 55 56 Duality Theory and Sensitivity Analysis Chap. 4 4.1 DUAL LINEAR PROGRAM Consider a linear programming problem in its standard form: Minimize cT x (4.1a) subject to Ax = b (4.1b) x:::O (4.1c) where c and x are n-dimensional column vectors, A an m x n matrix, and b an m dimensional column vector. -
On the Ekeland Variational Principle with Applications and Detours
Lectures on The Ekeland Variational Principle with Applications and Detours By D. G. De Figueiredo Tata Institute of Fundamental Research, Bombay 1989 Author D. G. De Figueiredo Departmento de Mathematica Universidade de Brasilia 70.910 – Brasilia-DF BRAZIL c Tata Institute of Fundamental Research, 1989 ISBN 3-540- 51179-2-Springer-Verlag, Berlin, Heidelberg. New York. Tokyo ISBN 0-387- 51179-2-Springer-Verlag, New York. Heidelberg. Berlin. Tokyo No part of this book may be reproduced in any form by print, microfilm or any other means with- out written permission from the Tata Institute of Fundamental Research, Colaba, Bombay 400 005 Printed by INSDOC Regional Centre, Indian Institute of Science Campus, Bangalore 560012 and published by H. Goetze, Springer-Verlag, Heidelberg, West Germany PRINTED IN INDIA Preface Since its appearance in 1972 the variational principle of Ekeland has found many applications in different fields in Analysis. The best refer- ences for those are by Ekeland himself: his survey article [23] and his book with J.-P. Aubin [2]. Not all material presented here appears in those places. Some are scattered around and there lies my motivation in writing these notes. Since they are intended to students I included a lot of related material. Those are the detours. A chapter on Nemyt- skii mappings may sound strange. However I believe it is useful, since their properties so often used are seldom proved. We always say to the students: go and look in Krasnoselskii or Vainberg! I think some of the proofs presented here are more straightforward. There are two chapters on applications to PDE. -
Optimization Basics for Electric Power Markets
Optimization Basics for Electric Power Markets Presenter: Leigh Tesfatsion Professor of Econ, Math, and ECpE Iowa State University Ames, Iowa 50011-1070 http://www.econ.iastate.edu/tesfatsi/ [email protected] Last Revised: 20 September 2020 Important Acknowledgement: These lecture materials incorporate edited slides and figures originally developed by R. Baldick, S. Bradley et al., A. Hallam, T. Overbye, and M. Wachowiak 1 Topic Outline ⚫ ISO Market Optimization on a Typical Operating Day D ⚫ Alternative modeling formulations ⚫ Optimization illustration: Real-time economic dispatch ⚫ Classic Nonlinear Programming Problem (NPP): Minimization subject to equality constraints ⚫ NPP via the Lagrange multiplier approach ⚫ NPP Lagrange multipliers as shadow prices ⚫ Real-time economic dispatch: Numerical example ⚫ General Nonlinear Programming Problem (GNPP): Minimization subject to equality and inequality constraints ⚫ GNPP via the Lagrange multiplier approach ⚫ GNPP Lagrange multipliers as shadow prices ⚫ Necessary versus sufficient conditions for optimization ⚫ Technical references 2 Key Objective of EE/Econ 458 ◆ Understand the optimization processes undertaken by participants in restructured wholesale power markets ◆ For ISOs, these processes include: ◼ Security-Constrained Unit Commitment (SCUC) for determining which Generating Companies (GenCos) will be asked to commit to the production of energy the following day ◼ Security-Constrained Economic Dispatch (SCED) for determining energy dispatch and locational marginal price (LMP) levels -
Use of Multiple Regression Analysis to Summarize and Interpret Linear Programming Shadow Prices in an Economic Planning Model
/Îj^'/T'f USE OF MULTIPLE REGRESSION ANALYSIS TO SUMMARIZE AND INTERPRET LINEAR PROGRAMMING SHADOW PRICES IN AN ECONOMIC PLANNING MODEL Daniel G. Williams U.S. Department of Agriculture Economics, Statistics, and Cooperatives Service Technical Bulletin No. 1622 Use of Multiple Regression Analysis to Summarize and Interpret Linear Program- ming Shadow Prices in an Economic Planning Model. By Daniel G. Williams, Economic Development Division; Economics, Statistics, and Cooperatives Service, U.S. Department of Agriculture. Technical Bulletin No. 1622. ABSTRACT A simple method is presented for evaluating the benefit to a region (regional objective function) of new manufacturing firms. These firms are subsets of the more aggregated 4-digit SIC manufacturing industries, some included and some not, in a rural multicounty economic planning model. The model must contain many types of industries to include the full range of industry; such addition is costly. Multiple regression analysis can summarize and interpret shadow prices of export industries so that local planners in their decisionmaking can use the underlying economic characteristics of the model industries rather than use only their industry product classifications. Keywords: Linear programming, shadow prices, objective function, multiple regression, regional exports, rural development. Washington, D.C. 20250 May 1980 CONTENTS Summary iü Introduction 1 Objective Function and the Generalized Shadow Price 4 The Multiple Regression and Linear Programming Models 5 Interpretation of the Multiple -
Lagrangian Duality
Lagrangian Duality Richard Lusby Department of Management Engineering Technical University of Denmark Today's Topics (jg Lagrange Multipliers Lagrangian Relaxation Lagrangian Duality R Lusby (42111) Lagrangian Duality 2/30 Example: Economic Order Quantity Model (jg Parameters I Demand rate: d I Order cost: K I Unit cost: c I Holding cost: h Decision Variable: The optimal order quantity Q Objective Function: dK hQ minimize + dc + Q 2 Optimal Order Quantity: r 2dK Q∗ = h R Lusby (42111) Lagrangian Duality 3/30 EOQ Model Consider the following extension(jg Suppose we have several items with a space constraint q The problem P dj Kj hQj minimize + dj cj + j Qj 2 P subject to: Qj q j ≤ We have the following possibilities 1 The constraint is non-binding/slack, i.e r X 2dj Kj < q h j 2 The constraint is binding/active R Lusby (42111) Lagrangian Duality 4/30 Lagrange Multiplier (jg The problem P dj Kj hQj minimize T (Q ; Q ;:::; Qn) = + dj cj + 1 2 j Qj 2 P subject to: j Qj = q Lagrange multiplier λ X minimize T (Q ; Q ;:::; Qn) + λ( Qj q) 1 2 − j Differentiate with respect to Qj : @L dj Kj h = 2 + + λ = 0 j @Qj − Qj 2 8 Solving for Qj r 2dj Kj Qj = j h + 2λ 8 R Lusby (42111) Lagrangian Duality 5/30 Lagrange multiplier Continued(jg Substituting this into the constraint we have r X 2dj Kj = q h + 2λ j Solving for λ and Qj gives: P 2 p2dj Kj j h q − λ = λ∗ = 2 r 2dj Kj Q∗ = j j h + 2λ∗ 8 R Lusby (42111) Lagrangian Duality 6/30 Interpretation of λ (jg In linear programming a dual variable is a shadow price: ∗ ∗ @Z yi = @bi Similarly, in the EOQ -
Arxiv:2011.09194V1 [Math.OC]
Noname manuscript No. (will be inserted by the editor) Lagrangian duality for nonconvex optimization problems with abstract convex functions Ewa M. Bednarczuk · Monika Syga Received: date / Accepted: date Abstract We investigate Lagrangian duality for nonconvex optimization prob- lems. To this aim we use the Φ-convexity theory and minimax theorem for Φ-convex functions. We provide conditions for zero duality gap and strong duality. Among the classes of functions, to which our duality results can be applied, are prox-bounded functions, DC functions, weakly convex functions and paraconvex functions. Keywords Abstract convexity · Minimax theorem · Lagrangian duality · Nonconvex optimization · Zero duality gap · Weak duality · Strong duality · Prox-regular functions · Paraconvex and weakly convex functions 1 Introduction Lagrangian and conjugate dualities have far reaching consequences for solution methods and theory in convex optimization in finite and infinite dimensional spaces. For recent state-of the-art of the topic of convex conjugate duality we refer the reader to the monograph by Radu Bot¸[5]. There exist numerous attempts to construct pairs of dual problems in non- convex optimization e.g., for DC functions [19], [34], for composite functions [8], DC and composite functions [30], [31] and for prox-bounded functions [15]. In the present paper we investigate Lagrange duality for general optimiza- tion problems within the framework of abstract convexity, namely, within the theory of Φ-convexity. The class Φ-convex functions encompasses convex l.s.c. Ewa M. Bednarczuk Systems Research Institute, Polish Academy of Sciences, Newelska 6, 01–447 Warsaw Warsaw University of Technology, Faculty of Mathematics and Information Science, ul. -
Recent Developments in the Theory of Duality in Locally Convex Vector Spaces
[ VOLUME 6 I ISSUE 2 I APRIL– JUNE 2019] E ISSN 2348 –1269, PRINT ISSN 2349-5138 RECENT DEVELOPMENTS IN THE THEORY OF DUALITY IN LOCALLY CONVEX VECTOR SPACES CHETNA KUMARI1 & RABISH KUMAR2* 1Research Scholar, University Department of Mathematics, B. R. A. Bihar University, Muzaffarpur 2*Research Scholar, University Department of Mathematics T. M. B. University, Bhagalpur Received: February 19, 2019 Accepted: April 01, 2019 ABSTRACT: : The present paper concerned with vector spaces over the real field: the passage to complex spaces offers no difficulty. We shall assume that the definition and properties of convex sets are known. A locally convex space is a topological vector space in which there is a fundamental system of neighborhoods of 0 which are convex; these neighborhoods can always be supposed to be symmetric and absorbing. Key Words: LOCALLY CONVEX SPACES We shall be exclusively concerned with vector spaces over the real field: the passage to complex spaces offers no difficulty. We shall assume that the definition and properties of convex sets are known. A convex set A in a vector space E is symmetric if —A=A; then 0ЄA if A is not empty. A convex set A is absorbing if for every X≠0 in E), there exists a number α≠0 such that λxЄA for |λ| ≤ α ; this implies that A generates E. A locally convex space is a topological vector space in which there is a fundamental system of neighborhoods of 0 which are convex; these neighborhoods can always be supposed to be symmetric and absorbing. Conversely, if any filter base is given on a vector space E, and consists of convex, symmetric, and absorbing sets, then it defines one and only one topology on E for which x+y and λx are continuous functions of both their arguments.