Incremental Aggregated Proximal and Augmented Lagrangian Algorithms
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
4. Convex Optimization Problems
Convex Optimization — Boyd & Vandenberghe 4. Convex optimization problems optimization problem in standard form • convex optimization problems • quasiconvex optimization • linear optimization • quadratic optimization • geometric programming • generalized inequality constraints • semidefinite programming • vector optimization • 4–1 Optimization problem in standard form minimize f0(x) subject to f (x) 0, i =1,...,m i ≤ hi(x)=0, i =1,...,p x Rn is the optimization variable • ∈ f : Rn R is the objective or cost function • 0 → f : Rn R, i =1,...,m, are the inequality constraint functions • i → h : Rn R are the equality constraint functions • i → optimal value: p⋆ = inf f (x) f (x) 0, i =1,...,m, h (x)=0, i =1,...,p { 0 | i ≤ i } p⋆ = if problem is infeasible (no x satisfies the constraints) • ∞ p⋆ = if problem is unbounded below • −∞ Convex optimization problems 4–2 Optimal and locally optimal points x is feasible if x dom f and it satisfies the constraints ∈ 0 ⋆ a feasible x is optimal if f0(x)= p ; Xopt is the set of optimal points x is locally optimal if there is an R> 0 such that x is optimal for minimize (over z) f0(z) subject to fi(z) 0, i =1,...,m, hi(z)=0, i =1,...,p z x≤ R k − k2 ≤ examples (with n =1, m = p =0) f (x)=1/x, dom f = R : p⋆ =0, no optimal point • 0 0 ++ f (x)= log x, dom f = R : p⋆ = • 0 − 0 ++ −∞ f (x)= x log x, dom f = R : p⋆ = 1/e, x =1/e is optimal • 0 0 ++ − f (x)= x3 3x, p⋆ = , local optimum at x =1 • 0 − −∞ Convex optimization problems 4–3 Implicit constraints the standard form optimization problem has an implicit -
Convex Optimisation Matlab Toolbox
UNLOCBOX: USER’S GUIDE MATLAB CONVEX OPTIMIZATION TOOLBOX Lausanne - February 2014 PERRAUDIN Nathanaël, KALOFOLIAS Vassilis LTS2 - EPFL Abstract Nowadays the trend to solve optimization problems is to use specific algorithms rather than very gen- eral ones. The UNLocBoX provides a general framework allowing the user to design his own algorithms. To do so, the framework try to stay as close from the mathematical problem as possible. More precisely, the UNLocBoX is a Matlab toolbox designed to solve convex optimization problem of the form K min ∑ fn(x); x2C n=1 using proximal splitting techniques. It is mainly composed of solvers, proximal operators and demonstra- tion files allowing the user to quickly implement a problem. Contents 1 Introduction 1 2 Literature 2 3 Installation and initialization2 3.1 Dependencies.........................................2 3.2 GPU computation.......................................2 4 Structure of the UNLocBoX2 5 Problems of interest4 6 Solvers 4 6.1 Defining functions......................................5 6.2 Selecting a solver.......................................5 6.3 Optional parameters......................................6 6.4 Plug-ins: how to tune your algorithm.............................6 6.5 Returned arguments......................................8 7 Proximal operators9 7.1 Constraints..........................................9 8 Example 10 References 16 LTS2 - EPFL 1 Introduction During your research, you have most likely faced the situation where you need to solve a convex optimiza- tion problem. Maybe you were trying to find an optimal combination of parameters, you were transforming some process into something automatic or, even more simply, your algorithm was equivalent to solving convex problem. In this situation, you usually face a dilemma: you need a specific algorithm to solve your problem, but you do not want to spend hours writing it. -
A Branch-And-Price Approach with Milp Formulation to Modularity Density Maximization on Graphs
A BRANCH-AND-PRICE APPROACH WITH MILP FORMULATION TO MODULARITY DENSITY MAXIMIZATION ON GRAPHS KEISUKE SATO Signalling and Transport Information Technology Division, Railway Technical Research Institute. 2-8-38 Hikari-cho, Kokubunji-shi, Tokyo 185-8540, Japan YOICHI IZUNAGA Information Systems Research Division, The Institute of Behavioral Sciences. 2-9 Ichigayahonmura-cho, Shinjyuku-ku, Tokyo 162-0845, Japan Abstract. For clustering of an undirected graph, this paper presents an exact algorithm for the maximization of modularity density, a more complicated criterion to overcome drawbacks of the well-known modularity. The problem can be interpreted as the set-partitioning problem, which reminds us of its integer linear programming (ILP) formulation. We provide a branch-and-price framework for solving this ILP, or column generation combined with branch-and-bound. Above all, we formulate the column gen- eration subproblem to be solved repeatedly as a simpler mixed integer linear programming (MILP) problem. Acceleration tech- niques called the set-packing relaxation and the multiple-cutting- planes-at-a-time combined with the MILP formulation enable us to optimize the modularity density for famous test instances in- cluding ones with over 100 vertices in around four minutes by a PC. Our solution method is deterministic and the computation time is not affected by any stochastic behavior. For one of them, column generation at the root node of the branch-and-bound tree arXiv:1705.02961v3 [cs.SI] 27 Jun 2017 provides a fractional upper bound solution and our algorithm finds an integral optimal solution after branching. E-mail addresses: (Keisuke Sato) [email protected], (Yoichi Izunaga) [email protected]. -
A Lagrangian Decomposition Approach Combined with Metaheuristics for the Knapsack Constrained Maximum Spanning Tree Problem
MASTERARBEIT A Lagrangian Decomposition Approach Combined with Metaheuristics for the Knapsack Constrained Maximum Spanning Tree Problem ausgeführt am Institut für Computergraphik und Algorithmen der Technischen Universität Wien unter der Anleitung von Univ.-Prof. Dipl.-Ing. Dr.techn. Günther Raidl und Univ.-Ass. Dipl.-Ing. Dr.techn. Jakob Puchinger durch Sandro Pirkwieser, Bakk.techn. Matrikelnummer 0116200 Simmeringer Hauptstraße 50/30, A-1110 Wien Datum Unterschrift Abstract This master’s thesis deals with solving the Knapsack Constrained Maximum Spanning Tree (KCMST) problem, which is a so far less addressed NP-hard combinatorial optimization problem belonging to the area of network design. Thereby sought is a spanning tree whose profit is maximal, but at the same time its total weight must not exceed a specified value. For this purpose a Lagrangian decomposition approach, which is a special variant of La- grangian relaxation, is applied to derive upper bounds. In the course of the application the problem is split up in two subproblems, which are likewise to be maximized but easier to solve on its own. The subgradient optimization method as well as the Volume Algorithm are deployed to solve the Lagrangian dual problem. To derive according lower bounds, i.e. feasible solutions, a simple Lagrangian heuristic is applied which is strengthened by a problem specific local search. Furthermore an Evolutionary Algorithm is presented which uses a suitable encoding for the solutions and appropriate operators, whereas the latter are able to utilize heuristics based on defined edge-profits. It is shown that simple edge-profits, derived straightforward from the data given by an instance, are of no benefit. -
Conditional Gradient Sliding for Convex Optimization ∗
CONDITIONAL GRADIENT SLIDING FOR CONVEX OPTIMIZATION ∗ GUANGHUI LAN y AND YI ZHOU z Abstract. In this paper, we present a new conditional gradient type method for convex optimization by calling a linear optimization (LO) oracle to minimize a series of linear functions over the feasible set. Different from the classic conditional gradient method, the conditional gradient sliding (CGS) algorithm developed herein can skip the computation of gradients from time to time, and as a result, can achieve the optimal complexity bounds in terms of not only the number of callsp to the LO oracle, but also the number of gradient evaluations. More specifically, we show that the CGS method requires O(1= ) and O(log(1/)) gradient evaluations, respectively, for solving smooth and strongly convex problems, while still maintaining the optimal O(1/) bound on the number of calls to LO oracle. We also develop variants of the CGS method which can achieve the optimal complexity bounds for solving stochastic optimization problems and an important class of saddle point optimization problems. To the best of our knowledge, this is the first time that these types of projection-free optimal first-order methods have been developed in the literature. Some preliminary numerical results have also been provided to demonstrate the advantages of the CGS method. Keywords: convex programming, complexity, conditional gradient method, Frank-Wolfe method, Nesterov's method AMS 2000 subject classification: 90C25, 90C06, 90C22, 49M37 1. Introduction. The conditional gradient (CndG) method, which was initially developed by Frank and Wolfe in 1956 [13] (see also [11, 12]), has been considered one of the earliest first-order methods for solving general convex programming (CP) problems. -
Chapter 8 Stochastic Gradient / Subgradient Methods
Chapter 8 Stochastic gradient / subgradient methods Contents (class version) 8.0 Introduction........................................ 8.2 8.1 The subgradient method................................. 8.5 Subgradients and subdifferentials................................. 8.5 Properties of subdifferentials.................................... 8.7 Convergence of the subgradient method.............................. 8.10 8.2 Example: Hinge loss with 1-norm regularizer for binary classifier design...... 8.17 8.3 Incremental (sub)gradient method............................ 8.19 Incremental (sub)gradient method................................. 8.21 8.4 Stochastic gradient (SG) method............................. 8.23 SG update.............................................. 8.23 Stochastic gradient algorithm: convergence analysis....................... 8.26 Variance reduction: overview................................... 8.33 Momentum............................................. 8.35 Adaptive step-sizes......................................... 8.37 8.5 Example: X-ray CT reconstruction........................... 8.41 8.1 © J. Fessler, April 12, 2020, 17:55 (class version) 8.2 8.6 Summary.......................................... 8.50 8.0 Introduction This chapter describes two families of algorithms: • subgradient methods • stochastic gradient methods aka stochastic gradient descent methods Often we turn to these methods as a “last resort,” for applications where none of the methods discussed previously are suitable. Many machine learning applications, -
12. Coordinate Descent Methods
EE 546, Univ of Washington, Spring 2014 12. Coordinate descent methods theoretical justifications • randomized coordinate descent method • minimizing composite objectives • accelerated coordinate descent method • Coordinate descent methods 12–1 Notations consider smooth unconstrained minimization problem: minimize f(x) x RN ∈ n coordinate blocks: x =(x ,...,x ) with x RNi and N = N • 1 n i ∈ i=1 i more generally, partition with a permutation matrix: U =[PU1 Un] • ··· n T xi = Ui x, x = Uixi i=1 X blocks of gradient: • f(x) = U T f(x) ∇i i ∇ coordinate update: • x+ = x tU f(x) − i∇i Coordinate descent methods 12–2 (Block) coordinate descent choose x(0) Rn, and iterate for k =0, 1, 2,... ∈ 1. choose coordinate i(k) 2. update x(k+1) = x(k) t U f(x(k)) − k ik∇ik among the first schemes for solving smooth unconstrained problems • cyclic or round-Robin: difficult to analyze convergence • mostly local convergence results for particular classes of problems • does it really work (better than full gradient method)? • Coordinate descent methods 12–3 Steepest coordinate descent choose x(0) Rn, and iterate for k =0, 1, 2,... ∈ (k) 1. choose i(k) = argmax if(x ) 2 i 1,...,n k∇ k ∈{ } 2. update x(k+1) = x(k) t U f(x(k)) − k i(k)∇i(k) assumptions f(x) is block-wise Lipschitz continuous • ∇ f(x + U v) f(x) L v , i =1,...,n k∇i i −∇i k2 ≤ ik k2 f has bounded sub-level set, in particular, define • ⋆ R(x) = max max y x 2 : f(y) f(x) y x⋆ X⋆ k − k ≤ ∈ Coordinate descent methods 12–4 Analysis for constant step size quadratic upper bound due to block coordinate-wise -
Process Optimization
Process Optimization Mathematical Programming and Optimization of Multi-Plant Operations and Process Design Ralph W. Pike Director, Minerals Processing Research Institute Horton Professor of Chemical Engineering Louisiana State University Department of Chemical Engineering, Lamar University, April, 10, 2007 Process Optimization • Typical Industrial Problems • Mathematical Programming Software • Mathematical Basis for Optimization • Lagrange Multipliers and the Simplex Algorithm • Generalized Reduced Gradient Algorithm • On-Line Optimization • Mixed Integer Programming and the Branch and Bound Algorithm • Chemical Production Complex Optimization New Results • Using one computer language to write and run a program in another language • Cumulative probability distribution instead of an optimal point using Monte Carlo simulation for a multi-criteria, mixed integer nonlinear programming problem • Global optimization Design vs. Operations • Optimal Design −Uses flowsheet simulators and SQP – Heuristics for a design, a superstructure, an optimal design • Optimal Operations – On-line optimization – Plant optimal scheduling – Corporate supply chain optimization Plant Problem Size Contact Alkylation Ethylene 3,200 TPD 15,000 BPD 200 million lb/yr Units 14 76 ~200 Streams 35 110 ~4,000 Constraints Equality 761 1,579 ~400,000 Inequality 28 50 ~10,000 Variables Measured 43 125 ~300 Unmeasured 732 1,509 ~10,000 Parameters 11 64 ~100 Optimization Programming Languages • GAMS - General Algebraic Modeling System • LINDO - Widely used in business applications -
Linear Programming Notes X: Integer Programming
Linear Programming Notes X: Integer Programming 1 Introduction By now you are familiar with the standard linear programming problem. The assumption that choice variables are infinitely divisible (can be any real number) is unrealistic in many settings. When we asked how many chairs and tables should the profit-maximizing carpenter make, it did not make sense to come up with an answer like “three and one half chairs.” Maybe the carpenter is talented enough to make half a chair (using half the resources needed to make the entire chair), but probably she wouldn’t be able to sell half a chair for half the price of a whole chair. So, sometimes it makes sense to add to a problem the additional constraint that some (or all) of the variables must take on integer values. This leads to the basic formulation. Given c = (c1, . , cn), b = (b1, . , bm), A a matrix with m rows and n columns (and entry aij in row i and column j), and I a subset of {1, . , n}, find x = (x1, . , xn) max c · x subject to Ax ≤ b, x ≥ 0, xj is an integer whenever j ∈ I. (1) What is new? The set I and the constraint that xj is an integer when j ∈ I. Everything else is like a standard linear programming problem. I is the set of components of x that must take on integer values. If I is empty, then the integer programming problem is a linear programming problem. If I is not empty but does not include all of {1, . , n}, then sometimes the problem is called a mixed integer programming problem. -
The Gradient Sampling Methodology
The Gradient Sampling Methodology James V. Burke∗ Frank E. Curtisy Adrian S. Lewisz Michael L. Overtonx January 14, 2019 1 Introduction in particular, this leads to calling the negative gra- dient, namely, −∇f(x), the direction of steepest de- The principal methodology for minimizing a smooth scent for f at x. However, when f is not differentiable function is the steepest descent (gradient) method. near x, one finds that following the negative gradient One way to extend this methodology to the minimiza- direction might offer only a small amount of decrease tion of a nonsmooth function involves approximating in f; indeed, obtaining decrease from x along −∇f(x) subdifferentials through the random sampling of gra- may be possible only with a very small stepsize. The dients. This approach, known as gradient sampling GS methodology is based on the idea of stabilizing (GS), gained a solid theoretical foundation about a this definition of steepest descent by instead finding decade ago [BLO05, Kiw07], and has developed into a a direction to approximately solve comprehensive methodology for handling nonsmooth, potentially nonconvex functions in the context of op- min max gT d; (2) kdk2≤1 g2@¯ f(x) timization algorithms. In this article, we summarize the foundations of the GS methodology, provide an ¯ where @f(x) is the -subdifferential of f at x [Gol77]. overview of the enhancements and extensions to it To understand the context of this idea, recall that that have developed over the past decade, and high- the (Clarke) subdifferential of a locally Lipschitz f light some interesting open questions related to GS. -
Integer Linear Programming
Introduction Linear Programming Integer Programming Integer Linear Programming Subhas C. Nandy ([email protected]) Advanced Computing and Microelectronics Unit Indian Statistical Institute Kolkata 700108, India. Introduction Linear Programming Integer Programming Organization 1 Introduction 2 Linear Programming 3 Integer Programming Introduction Linear Programming Integer Programming Linear Programming A technique for optimizing a linear objective function, subject to a set of linear equality and linear inequality constraints. Mathematically, maximize c1x1 + c2x2 + ::: + cnxn Subject to: a11x1 + a12x2 + ::: + a1nxn b1 ≤ a21x1 + a22x2 + ::: + a2nxn b2 : ≤ : am1x1 + am2x2 + ::: + amnxn bm ≤ xi 0 for all i = 1; 2;:::; n. ≥ Introduction Linear Programming Integer Programming Linear Programming In matrix notation, maximize C T X Subject to: AX B X ≤0 where C is a n≥ 1 vector |- cost vector, A is a m× n matrix |- coefficient matrix, B is a m × 1 vector |- requirement vector, and X is an n× 1 vector of unknowns. × He developed it during World War II as a way to plan expenditures and returns so as to reduce costs to the army and increase losses incurred by the enemy. The method was kept secret until 1947 when George B. Dantzig published the simplex method and John von Neumann developed the theory of duality as a linear optimization solution. Dantzig's original example was to find the best assignment of 70 people to 70 jobs subject to constraints. The computing power required to test all the permutations to select the best assignment is vast. However, the theory behind linear programming drastically reduces the number of feasible solutions that must be checked for optimality. -
IEOR 269, Spring 2010 Integer Programming and Combinatorial Optimization
IEOR 269, Spring 2010 Integer Programming and Combinatorial Optimization Professor Dorit S. Hochbaum Contents 1 Introduction 1 2 Formulation of some ILP 2 2.1 0-1 knapsack problem . 2 2.2 Assignment problem . 2 3 Non-linear Objective functions 4 3.1 Production problem with set-up costs . 4 3.2 Piecewise linear cost function . 5 3.3 Piecewise linear convex cost function . 6 3.4 Disjunctive constraints . 7 4 Some famous combinatorial problems 7 4.1 Max clique problem . 7 4.2 SAT (satisfiability) . 7 4.3 Vertex cover problem . 7 5 General optimization 8 6 Neighborhood 8 6.1 Exact neighborhood . 8 7 Complexity of algorithms 9 7.1 Finding the maximum element . 9 7.2 0-1 knapsack . 9 7.3 Linear systems . 10 7.4 Linear Programming . 11 8 Some interesting IP formulations 12 8.1 The fixed cost plant location problem . 12 8.2 Minimum/maximum spanning tree (MST) . 12 9 The Minimum Spanning Tree (MST) Problem 13 i IEOR269 notes, Prof. Hochbaum, 2010 ii 10 General Matching Problem 14 10.1 Maximum Matching Problem in Bipartite Graphs . 14 10.2 Maximum Matching Problem in Non-Bipartite Graphs . 15 10.3 Constraint Matrix Analysis for Matching Problems . 16 11 Traveling Salesperson Problem (TSP) 17 11.1 IP Formulation for TSP . 17 12 Discussion of LP-Formulation for MST 18 13 Branch-and-Bound 20 13.1 The Branch-and-Bound technique . 20 13.2 Other Branch-and-Bound techniques . 22 14 Basic graph definitions 23 15 Complexity analysis 24 15.1 Measuring quality of an algorithm .