A New Class of Scaling Matrices for Scaled Trust Region Algorithms

Total Page:16

File Type:pdf, Size:1020Kb

A New Class of Scaling Matrices for Scaled Trust Region Algorithms ISSN: 2377-3316 http://dx.doi.org/10.21174/josetap.v3i1. 44 A New Class of Scaling Matrices for Scaled Trust Region Algorithms Aydin Ayanzadeh*1, Shokoufeh Yazdanian2, Ehsan Shahamatnia3 1 Department of Applied Informatics, Informatics Institute, Istanbul Technical University, Istanbul, Turkey 2Department of Computer science, University of Tabriz, Tabriz, Iran 3Department of Computer and Electrical Engineering, Universidade Nova de Lisboa, Lisbon, Portugal Abstract— A new class of affine scaling matrices for the interior point Newton-type methods is considered to solve the nonlinear systems with simple bounds. We review the essential properties of a scaling matrix and consider several well- known scaling matrices proposed in the literature. We define a new scaling matrix that is the convex combination of these matrices. The proposed scaling matrix inherits those interesting properties of the individual matrices and satisfies additional desired requirements. The numerical experiments demonstrate the superiority of the new scaling matrix in solving several important test problems. Keywords— Diagonal Scaling; Bound-Constrained Equations; Scaling Matrices; Trust Region Methods; Affine Scaling. I. INTRODUCTION where FX: n is a continuously differentiable Diverse applications including signal processing mapping, X n is an open set containing the n- [36] and compressive sensing [3,4] incorporate many dimensional box ={x n | l x u }. The vectors convex nonlinear optimization problems. Although nn are lower and upper specialized algorithms have been developed for some of lu( { − }) , ( { + }) these problems, the interior-point methods are still the bounds on the variables such that has a nonempty main tool to tackle them. These methods require a interior. feasible initial point [19]. It should be paid attention that Efficient methods for the solution of this problem proper data collection plays an important role in with good local convergence behavior have been assessing the results obtained [39]. In addition, a vast proposed. The affine scaling trust region approach forms variety of soft-computing techniques such as a practical framework for smooth and nonsmooth box evolutionary computing methods [2, 18, 28, 37, 38] and constrained systems of nonlinear equations [6-9]. These neural networks [1, 31] include optimization problems, kinds of methods use ellipsoidal trust region defined by which are sensitive to the initial points. Like verifying a diagonal scaling matrix [12]. The diagonal scaling solution uniqueness conditions, these tasks convert into handles the bounds while at each iteration a quadratic linear feasibility problems with strict inequalities or model of the object function 1 || F ||2 is minimized nonlinear feasibility problems with bound constraints 2 [16, 34]. The latter is often challenging so that the within a trust region around the current iteration. existing algorithms need to be theoretically and The main motivation for the current work is a series computationally improved. The nonlinear minimization of papers by Bellavia et al [6-11]. The methods they problem with bound constraints is: introduced, STRN and CODOSOL have very good n numerical properties [6,7]. These methods are widely F( x )= 0, x = { x | l x u ; i = 1,..., n }. iii (1) used in practice [14, 35] and their efficiency has been * Corresponding author can be contacted via the journal website. Journal of Software Engineering: Theories and Practices 3 (1): 1-8, 2018 | https://lsp.institute Journal of Software Engineering: Theories and Practices proved in several papers [7, 23]. In [8] the authors studied global and fast convergence of an inexact dogleg method. They did not investigate the choice of a suitable scaling matrix and only reported the preliminary results. Later in [11] they focused on medium scale problems (4) and replaced the inexact Newton step by the exact solution of the Newton equation. They considered for all i=1,...,n and all x Î W. In affine scaling several diagonal scaling matrices and showed the methods in order to handle the bounds the direction of assumptions required to ensure the convergence. The name of the method is Constrained Dogleg (CoDo) the scaled gradient is defined by method which is freely accessible through the website Given an iterate and the trust region size http://codosol.de.unifi.it. The effectiveness of (CoDo) is xk Î int (W) verified by comparing it to STRSCNE [7] and to IATR D > 0, the trust region subproblem for (2) is: [9]. k minm ( p ) ; || G p || , x + p int ( ) (5) In these scaling-based algorithms, the performance p n k k k k is influenced by the selection of a scaling matrix. We where is the norm of the linear model for F(x) at introduce a new class of scaling matrices, which is mk xk obtained by the convex combination of current known , i.e. m ( p) =|| F + F p || and G= G() x nn with matrices. We analyze the numerical performance of k k k ¢ kk n n n . For the standard spherical trust CoDoSol (The Matlab Solver of CoDo) for different G : Gk = I 1 convex combinations of the scaling matrices and - region and for G = D 2 the elliptical trust region compare the results using performance profile approach. k k We also use a Projected Affine Scaling Interior Point achieves. In order to solve this subproblem different algorithm to check the local convergence properties of approaches have been proposed like STRN [6] which the new scaling matrix. combines ideas from the classical trust-region Newton method for unconstrained nonlinear equations and the In section 2 we explain the role of the scaling interior affine scaling approach for constrained matrices. In section 3 we consider several scaling optimization problems or CoDoSol [11] which is based matrices and review the requirements and assumptions. on a dogleg procedure tailored for constrained problems. In section 4, we introduce a new class of scaling matrices and prove that the new matrices satisfy the III. SCALING MATRICES required assumptions. Finally, in section 5 we report our conclusion of computational results. We consider the following well known matrices: • D CL (x ) given by Coleman and Li [12]. The II. SCALED TRUST REGIONS diagonal elements are: In this section we describe the idea behind using scaled matrices to solve problem (1). First note that (1) is closely related to the box constrained optimization problem: 1 2 min x ÎW f (x ) = || F (x ) || s.t . x Î W. 2 (2) (6) Every solution of (1) is a global minimum of (2.2) and DxHUU () * • given by Heinkenschloss et al [21]: if x * is a minimum of (2) such that f (x *) = 0, then x is a solution of (1). The first order optimality conditions of (2) are equivalent to the nonlinear system of equations D( x ) f ( x )= 0, x , where (7) and D is a suitable scaling matrix of order n where p >1 is a fixed constant. (3) • D KK given by Kanzow and Klug [26, 18]: D(x) = diag (d1(x),...,d n (x)). Coleman and Li [12] considered only one choice of the scaling matrix, then Heinkenschloss et al. [21] noted that the optimality conditions holds for a general class of scaling matrices satisfying the conditions: 20 Journal of Software Engineering: Theories and Practices 01a DCON= a D CL + a D HUU + a D KK + a D HMZ ,,i (11) 1 2 3 4 i =1,2,3,4 4 ai =1. i=1 This scaling matrix demonstrates the advantages of (8) all the scaling matrices involved in its definition and seems an appropriate matrix with combined properties. we first have to verify that this matrix satisfies four For a given constant g > 0. - D HMZ (x ) by Hager et desired requirements as follows al [20]: (i) Clearly D CON satisfies (4). Table 1. Test problems. (9) where (10) and a(x ) is a continuous function, strictly positive for any x and uniformly bounded away from zero. This matrix has been used by authors in study of a cyclic Barzilai-Borwein gradient method for bound constrained minimization problems they replaced the (ii) As Bellavia et al. [11] showed, the above three Hessian of the objective function with , where lk I lk matrices are bounded in for WÇ Br (x ) x Î W is the classical Barzilai-Borwein parameter. Then, in and . This implies that the combination of order to compute the new iterate, they move along the r > 0 scaled gradient , with these matrices is also bounded. (iii) It has been shown in [11] that all the matrices a(x ) = l . k k except D HUU join this property. Thus if we assume Bellavia et al in [11] introduced the following then CON satisfies this condition. a2 = 0 D assumptions for a scaling matrix and verified that all above scaling matrices satisfy almost all the (iv) Same as before, since all the matrices satisfy requirements specified by the following assumption. condition (iv), the convex combination also verifies this property. A. Assumption (i) D(x) satisfies (4), It is worth mentioning that D CL is discontinuous at (ii) D(x) is bounded in for any x points where there exists an index i for which WÇ Br (x ) KK HUU and r > 0, but the matrices D and D are locally Lipschitz continuous and continuous (iii) There exists a such that the step size l to k respectively. The new matrix D CON is continuous and the boundary from xk along satisfies so we can have a matrix having all the advantages of 3 matrices, a matrix that can be as fast and efficient as the l > l whenever is uniformly bounded k first one and as smooth and continuous as second one. above, (iv) For any x in int (W) there exist two positive IV. NUMERICAL EXPERIMENTS constants and c such that We ran two experiments, CoDoSol on 15 problems x Br (x ) Ì int (W) and Projected Affine Scaling Interior Point on 2 and -1 for any x in .
Recommended publications
  • Advances in Interior Point Methods and Column Generation
    Advances in Interior Point Methods and Column Generation Pablo Gonz´alezBrevis Doctor of Philosophy University of Edinburgh 2013 Declaration I declare that this thesis was composed by myself and that the work contained therein is my own, except where explicitly stated otherwise in the text. (Pablo Gonz´alezBrevis) ii To my endless inspiration and angels on earth: Paulina, Crist´obal and Esteban iii Abstract In this thesis we study how to efficiently combine the column generation technique (CG) and interior point methods (IPMs) for solving the relaxation of a selection of integer programming problems. In order to obtain an efficient method a change in the column generation technique and a new reoptimization strategy for a primal-dual interior point method are proposed. It is well-known that the standard column generation technique suffers from un- stable behaviour due to the use of optimal dual solutions that are extreme points of the restricted master problem (RMP). This unstable behaviour slows down column generation so variations of the standard technique which rely on interior points of the dual feasible set of the RMP have been proposed in the literature. Among these tech- niques, there is the primal-dual column generation method (PDCGM) which relies on sub-optimal and well-centred dual solutions. This technique dynamically adjusts the column generation tolerance as the method approaches optimality. Also, it relies on the notion of the symmetric neighbourhood of the central path so sub-optimal and well-centred solutions are obtained. We provide a thorough theoretical analysis that guarantees the convergence of the primal-dual approach even though sub-optimal solu- tions are used in the course of the algorithm.
    [Show full text]
  • Using Ant System Optimization Technique for Approximate Solution to Multi- Objective Programming Problems
    International Journal of Scientific & Engineering Research, Volume 4, Issue 9, September-2013 ISSN 2229-5518 1701 Using Ant System Optimization Technique for Approximate Solution to Multi- objective Programming Problems M. El-sayed Wahed and Amal A. Abou-Elkayer Department of Computer Science, Faculty of Computers and Information Suez Canal University (EGYPT) E-mail: [email protected], Abstract In this paper we use the ant system optimization metaheuristic to find approximate solution to the Multi-objective Linear Programming Problems (MLPP), the advantageous and disadvantageous of the suggested method also discussed focusing on the parallel computation and real time optimization, it's worth to mention here that the suggested method doesn't require any artificial variables the slack and surplus variables are enough, a test example is given at the end to show how the method works. Keywords: Multi-objective Linear. Programming. problems, Ant System Optimization Problems Introduction The Multi-objective Linear. Programming. problems is an optimization method which can be used to find solutions to problems where the Multi-objective function and constraints are linear functions of the decision variables, the intersection of constraints result in a polyhedronIJSER which represents the region that contain all feasible solutions, the constraints equation in the MLPP may be in the form of equalities or inequalities, and the inequalities can be changed to equalities by using slack or surplus variables, one referred to Rao [1], Philips[2], for good start and introduction to the MLPP. The LP type of optimization problems were first recognized in the 30's by economists while developing methods for the optimal allocations of resources, the main progress came from George B.
    [Show full text]
  • A New Trust Region Method with Simple Model for Large-Scale Optimization ∗
    A New Trust Region Method with Simple Model for Large-Scale Optimization ∗ Qunyan Zhou,y Wenyu Sunz and Hongchao Zhangx Abstract In this paper a new trust region method with simple model for solv- ing large-scale unconstrained nonlinear optimization is proposed. By employing generalized weak quasi-Newton equations, we derive several schemes to construct variants of scalar matrices as the Hessian approx- imation used in the trust region subproblem. Under some reasonable conditions, global convergence of the proposed algorithm is established in the trust region framework. The numerical experiments on solving the test problems with dimensions from 50 to 20000 in the CUTEr li- brary are reported to show efficiency of the algorithm. Key words. unconstrained optimization, Barzilai-Borwein method, weak quasi-Newton equation, trust region method, global convergence AMS(2010) 65K05, 90C30 1. Introduction In this paper, we consider the unconstrained optimization problem min f(x); (1.1) x2Rn where f : Rn ! R is continuously differentiable with dimension n relatively large. Trust region methods are a class of powerful and robust globalization ∗This work was supported by National Natural Science Foundation of China (11171159, 11401308), the Natural Science Fund for Universities of Jiangsu Province (13KJB110007), the Fund of Jiangsu University of Technology (KYY13012). ySchool of Mathematics and Physics, Jiangsu University of Technology, Changzhou 213001, Jiangsu, China. Email: [email protected] zCorresponding author. School of Mathematical Sciences, Jiangsu Key Laboratory for NSLSCS, Nanjing Normal University, Nanjing 210023, China. Email: [email protected] xDepartment of Mathematics, Louisiana State University, USA. Email: [email protected] 1 methods for solving (1.1).
    [Show full text]
  • An Annotated Bibliography of Network Interior Point Methods
    AN ANNOTATED BIBLIOGRAPHY OF NETWORK INTERIOR POINT METHODS MAURICIO G.C. RESENDE AND GERALDO VEIGA Abstract. This paper presents an annotated bibliography on interior point methods for solving network flow problems. We consider single and multi- commodity network flow problems, as well as preconditioners used in imple- mentations of conjugate gradient methods for solving the normal systems of equations that arise in interior network flow algorithms. Applications in elec- trical engineering and miscellaneous papers complete the bibliography. The collection includes papers published in journals, books, Ph.D. dissertations, and unpublished technical reports. 1. Introduction Many problems arising in transportation, communications, and manufacturing can be modeled as network flow problems (Ahuja et al. (1993)). In these problems, one seeks an optimal way to move flow, such as overnight mail, information, buses, or electrical currents, on a network, such as a postal network, a computer network, a transportation grid, or a power grid. Among these optimization problems, many are special classes of linear programming problems, with combinatorial properties that enable development of efficient solution techniques. The focus of this annotated bibliography is on recent computational approaches, based on interior point methods, for solving large scale network flow problems. In the last two decades, many other approaches have been developed. A history of computational approaches up to 1977 is summarized in Bradley et al. (1977). Several computational studies established the fact that specialized network simplex algorithms are orders of magnitude faster than the best linear programming codes of that time. (See, e.g. the studies in Glover and Klingman (1981); Grigoriadis (1986); Kennington and Helgason (1980) and Mulvey (1978)).
    [Show full text]
  • Constrained Multifidelity Optimization Using Model Calibration
    Struct Multidisc Optim (2012) 46:93–109 DOI 10.1007/s00158-011-0749-1 RESEARCH PAPER Constrained multifidelity optimization using model calibration Andrew March · Karen Willcox Received: 4 April 2011 / Revised: 22 November 2011 / Accepted: 28 November 2011 / Published online: 8 January 2012 c Springer-Verlag 2012 Abstract Multifidelity optimization approaches seek to derivative-free and sequential quadratic programming meth- bring higher-fidelity analyses earlier into the design process ods. The method uses approximately the same number of by using performance estimates from lower-fidelity models high-fidelity analyses as a multifidelity trust-region algo- to accelerate convergence towards the optimum of a high- rithm that estimates the high-fidelity gradient using finite fidelity design problem. Current multifidelity optimization differences. methods generally fall into two broad categories: provably convergent methods that use either the high-fidelity gradient Keywords Multifidelity · Derivative-free · Optimization · or a high-fidelity pattern-search, and heuristic model cali- Multidisciplinary · Aerodynamic design bration approaches, such as interpolating high-fidelity data or adding a Kriging error model to a lower-fidelity function. This paper presents a multifidelity optimization method that bridges these two ideas; our method iteratively calibrates Nomenclature lower-fidelity information to the high-fidelity function in order to find an optimum of the high-fidelity design prob- A Active and violated constraint Jacobian lem. The
    [Show full text]
  • An Affine-Scaling Pivot Algorithm for Linear Programming
    AN AFFINE-SCALING PIVOT ALGORITHM FOR LINEAR PROGRAMMING PING-QI PAN¤ Abstract. We proposed a pivot algorithm using the affine-scaling technique. It produces a sequence of interior points as well as a sequence of vertices, until reaching an optimal vertex. We report preliminary but favorable computational results obtained with a dense implementation of it on a set of standard Netlib test problems. Key words. pivot, interior point, affine-scaling, orthogonal factorization AMS subject classifications. 65K05, 90C05 1. Introduction. Consider the linear programming (LP) problem in the stan- dard form min cT x (1.1) subject to Ax = b; x ¸ 0; where c 2 Rn; b 2 Rm , A 2 Rm£n(m < n). Assume that rank(A) = m and c 62 range(AT ). So the trivial case is eliminated where the objective value cT x is constant over the feasible region of (1.1). The assumption rank(A) = m is not substantial either to the proposed algorithm,and will be dropped later. As a LP problem solver, the simplex algorithm might be one of the most famous and widely used mathematical tools in the world. Its philosophy is to move on the underlying polyhedron, from a vertex to adjacent vertex, along edges until an optimal vertex is reached. Since it was founded by G.B.Dantzig [8] in 1947, the simplex algorithm has been very successful in practice, and occupied a dominate position in this area, despite its infiniteness in case of degeneracy. However, it turned out that the simplex algorithm may require an exponential amount of time to solve LP problems in the worst-case [29].
    [Show full text]
  • Computing a Trust Region Step
    Distribution Category: Mathematics and Computers ANL--81-83 (UC-32) DE82 005656 ANIr81483 ARGONNE NATIONAL LABORATORY 9700 South Cass Avenue Argonne, Illinois 60439 COMPUTING A TRUST REGION STEP Jorge J. Mord and D. C. Sorensen Applied Mathematics Division DISCLAIMER - -. ".,.,... December 1981 Computing a Trust Region Step Jorge J. Mord and D. C. Sorensen Applied Mathematics Division Argonne National Laboratory Argonne Illinois 60439 1. Introduction. In an important class of minimization algorithms called "trust region methods" (see, for example, Sorensen [1981]), the calculation of the step between iterates requires the solution of a problem of the form (1.1) minij(w): 1|w |isAl where A is a positive parameter, I- II is the Euclidean norm in R", and (1.2) #(w) = g w + Mw7Bw, with g E R", and B E R a symmetric matrix. The quadratic function ' generally represents a local model to the objective function defined by interpolatory data at an iterate and thus it is important to be able to solve (1.1) for any symmetric matrix B; in particular, for a matrix B with negative eigenvalues. In trust region methods it is sometimes helpful to include a scaling matrix for the variables. In this case, problem (1.1) is replaced by (1.3) mint%(v):IIDu Is A where D E R""" is a nonsingular matrix. The change of variables Du = w shows that problem (1.3) is equivalent to (1.4) minif(w):1|w| Is Al where f(w) _%f(D-'w), and that the solutions of problems (1.3) and (1.4) are related by Dv = w.
    [Show full text]
  • On Some Polynomial-Time Algorithms for Solving Linear Programming Problems
    Available online a t www.pelagiaresearchlibrary.com Pelagia Research Library Advances in Applied Science Research, 2012, 3 (5):3367-3373 ISSN: 0976-8610 CODEN (USA): AASRFC On Some Polynomial-time Algorithms for Solving Linear Programming Problems B. O. Adejo and H. S. Adaji Department of Mathematical Sciences, Kogi State University, Anyigba _____________________________________________________________________________________________ ABSTRACT In this article we survey some Polynomial-time Algorithms for Solving Linear Programming Problems namely: the ellipsoid method, Karmarkar’s algorithm and the affine scaling algorithm. Finally, we considered a test problem which we solved with the methods where applicable and conclusions drawn from the results so obtained. __________________________________________________________________________________________ INTRODUCTION An algorithm A is said to be a polynomial-time algorithm for a problem P, if the number of steps (i. e., iterations) required to solve P on applying A is bounded by a polynomial function O(m n,, L) of dimension and input length of the problem. The Ellipsoid method is a specific algorithm developed by soviet mathematicians: Shor [5], Yudin and Nemirovskii [7]. Khachian [4] adapted the ellipsoid method to derive the first polynomial-time algorithm for linear programming. Although the algorithm is theoretically better than the simplex algorithm, which has an exponential running time in the worst case, it is very slow practically and not competitive with the simplex method. Nevertheless, it is a very important theoretical tool for developing polynomial-time algorithms for a large class of convex optimization problems. Narendra K. Karmarkar is an Indian mathematician; renowned for developing the Karmarkar’s algorithm. He is listed as an ISI highly cited researcher.
    [Show full text]
  • Welcome to the 2013 MOPTA Conference!
    Welcome to the 2013 MOPTA Conference! Mission Statement The Modeling and Optimization: Theory and Applications (MOPTA) conference is an annual event aiming to bring together a diverse group of people from both discrete and continuous optimization, working on both theoretical and applied aspects. The format consists of invited talks from distinguished speakers and selected contributed talks, spread over three days. The goal is to present a diverse set of exciting new developments from different optimization areas while at the same time providing a setting that will allow increased interaction among the participants. We aim to bring together researchers from both the theoretical and applied communities who do not usually have the chance to interact in the framework of a medium- scale event. MOPTA 2013 is hosted by the Department of Industrial and Systems Engineering at Lehigh University. Organization Committee Frank E. Curtis - Chair [email protected] Tamás Terlaky [email protected] Ted Ralphs [email protected] Katya Scheinberg [email protected] Lawrence V. Snyder [email protected] Robert H. Storer [email protected] Aurélie Thiele [email protected] Luis Zuluaga [email protected] Staff Kathy Rambo We thank our sponsors! 1 Program Wednesday, August 14 – Rauch Business Center 7:30-8:10 - Registration and continental breakfast - Perella Auditorium Lobby 8:10-8:20 - Welcome: Tamás Terlaky, Department Chair, Lehigh ISE - Perella Auditorium (RBC 184) 8:20-8:30 - Opening remarks: Patrick Farrell, Provost, Lehigh University
    [Show full text]
  • Tutorial: PART 2
    Tutorial: PART 2 Optimization for Machine Learning Elad Hazan Princeton University + help from Sanjeev Arora & Yoram Singer Agenda 1. Learning as mathematical optimization • Stochastic optimization, ERM, online regret minimization • Online gradient descent 2. Regularization • AdaGrad and optimal regularization 3. Gradient Descent++ • Frank-Wolfe, acceleration, variance reduction, second order methods, non-convex optimization Accelerating gradient descent? 1. Adaptive regularization (AdaGrad) works for non-smooth&non-convex 2. Variance reduction uses special ERM structure very effective for smooth&convex 3. Acceleration/momentum smooth convex only, general purpose optimization since 80’s Condition number of convex functions # defined as � = , where (simplified) $ 0 ≺ �� ≼ �+� � ≼ �� � = strong convexity, � = smoothness Non-convex smooth functions: (simplified) −�� ≼ �+� � ≼ �� Why do we care? well-conditioned functions exhibit much faster optimization! (but equivalent via reductions) Examples Smooth gradient descent The descent lemma, �-smooth functions: (algorithm: �012 = �0 − ��4 ) + � �012 − � �0 ≤ −�0 �012 − �0 + � �0 − �012 1 = − � + ��+ � + = − � + 0 4� 0 Thus, for M-bounded functions: ( � �0 ≤ �) 1 −2� ≤ � � − �(� ) = > �(� ) − � � ≤ − > � + ; 2 012 0 4� 0 0 0 Thus, exists a t for which, 8�� � + ≤ 0 � Smooth gradient descent 2 Conclusions: for � = � − �� and T = Ω , finds 012 0 4 C + �0 ≤ � 1. Holds even for non-convex functions ∗ 2. For convex functions implies � �0 − � � ≤ �(�) (faster for smooth!) Non-convex stochastic gradient descent The descent lemma, �-smooth functions: (algorithm: �012 = �0 − ��N0) + � � �012 − � �0 ≤ � −�4 �012 − �0 + � �0 − �012 + + + + = � −�P0 ⋅ ��4 + � �N4 = −��4 + � �� �N4 + + + = −��4 + � �(�4 + ��� �P0 ) Thus, for M-bounded functions: ( � �0 ≤ �) M� T = � ⇒ ∃ . � + ≤ � �+ 0Y; 0 Controlling the variance: Interpolating GD and SGD Model: both full and stochastic gradients. Estimator combines both into lower variance RV: �012 = �0 − � �P� �0 − �P� �[ + ��(�[) Every so oFten, compute Full gradient and restart at new �[.
    [Show full text]
  • An Affine-Scaling Interior-Point Method for Continuous Knapsack Constraints with Application to Support Vector Machines∗
    SIAM J. OPTIM. c 2011 Society for Industrial and Applied Mathematics Vol. 21, No. 1, pp. 361–390 AN AFFINE-SCALING INTERIOR-POINT METHOD FOR CONTINUOUS KNAPSACK CONSTRAINTS WITH APPLICATION TO SUPPORT VECTOR MACHINES∗ MARIA D. GONZALEZ-LIMA†, WILLIAM W. HAGER‡ , AND HONGCHAO ZHANG§ Abstract. An affine-scaling algorithm (ASL) for optimization problems with a single linear equality constraint and box restrictions is developed. The algorithm has the property that each iterate lies in the relative interior of the feasible set. The search direction is obtained by approximating the Hessian of the objective function in Newton’s method by a multiple of the identity matrix. The algorithm is particularly well suited for optimization problems where the Hessian of the objective function is a large, dense, and possibly ill-conditioned matrix. Global convergence to a stationary point is established for a nonmonotone line search. When the objective function is strongly convex, ASL converges R-linearly to the global optimum provided the constraint multiplier is unique and a nondegeneracy condition holds. A specific implementation of the algorithm is developed in which the Hessian approximation is given by the cyclic Barzilai-Borwein (CBB) formula. The algorithm is evaluated numerically using support vector machine test problems. Key words. interior-point, affine-scaling, cyclic Barzilai-Borwein methods, global convergence, linear convergence, support vector machines AMS subject classifications. 90C06, 90C26, 65Y20 DOI. 10.1137/090766255 1. Introduction. In this paper we develop an interior-point algorithm for a box-constrained optimization problem with a linear equality constraint: (1.1) min f(x) subject to x ∈ Rn, l ≤ x ≤ u, aTx = b.
    [Show full text]
  • An Active–Set Trust-Region Method for Bound-Constrained Nonlinear Optimization Without Derivatives Applied to Noisy Aerodynamic Design Problems Anke Troltzsch
    An active–set trust-region method for bound-constrained nonlinear optimization without derivatives applied to noisy aerodynamic design problems Anke Troltzsch To cite this version: Anke Troltzsch. An active–set trust-region method for bound-constrained nonlinear optimization with- out derivatives applied to noisy aerodynamic design problems. Optimization and Control [math.OC]. Institut National Polytechnique de Toulouse - INPT, 2011. English. tel-00639257 HAL Id: tel-00639257 https://tel.archives-ouvertes.fr/tel-00639257 Submitted on 8 Nov 2011 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. ฀ ฀฀฀฀฀ ฀ %0$503"5%&%0$503"5%&-6/*7&34*5²-6/*7฀&34*5²%&506-064&%&506-064& ฀฀ Institut National Polytechnique฀ de Toulouse (INP Toulouse) ฀฀฀ Mathématiques Informatique฀ Télécommunications ฀ ฀฀฀฀฀฀ Anke TRÖLTZSCH฀ ฀ mardi 7 juin 2011 ฀ ฀ ฀ ฀ An active-set trust-region method for bound-constrained nonlinear optimization without฀ derivatives applied to noisy aerodynamic฀ design problems ฀ ฀฀฀ Mathématiques Informatique฀ Télécommunications (MITT) ฀฀฀฀ CERFACS ฀฀฀
    [Show full text]