A New Class of Scaling Matrices for Scaled Trust Region Algorithms
ISSN: 2377-3316
http://dx.doi.org/10.21174/josetap.v3i1.44

Aydin Ayanzadeh*1, Shokoufeh Yazdanian2, Ehsan Shahamatnia3
1 Department of Applied Informatics, Informatics Institute, Istanbul Technical University, Istanbul, Turkey
2 Department of Computer Science, University of Tabriz, Tabriz, Iran
3 Department of Computer and Electrical Engineering, Universidade Nova de Lisboa, Lisbon, Portugal

Abstract— A new class of affine scaling matrices for interior point Newton-type methods is considered for solving nonlinear systems with simple bounds. We review the essential properties of a scaling matrix and consider several well-known scaling matrices proposed in the literature. We define a new scaling matrix that is a convex combination of these matrices. The proposed scaling matrix inherits the interesting properties of the individual matrices and satisfies additional desired requirements. Numerical experiments demonstrate the superiority of the new scaling matrix on several important test problems.

Keywords— Diagonal Scaling; Bound-Constrained Equations; Scaling Matrices; Trust Region Methods; Affine Scaling.

I. INTRODUCTION

Diverse applications, including signal processing [36] and compressive sensing [3, 4], give rise to many convex nonlinear optimization problems. Although specialized algorithms have been developed for some of these problems, interior-point methods are still the main tool for tackling them. These methods require a feasible initial point [19].

Throughout, F: X → ℝⁿ is a continuously differentiable mapping, X ⊆ ℝⁿ is an open set containing the n-dimensional box Ω = {x ∈ ℝⁿ | l ≤ x ≤ u}, and the vectors l ∈ (ℝ ∪ {−∞})ⁿ and u ∈ (ℝ ∪ {+∞})ⁿ are lower and upper bounds on the variables such that Ω has a nonempty interior.
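The box Ω with possibly infinite bounds and a nonempty interior can be represented directly; a minimal sketch (the function name and the sample bounds are ours, chosen for illustration):

```python
import numpy as np

def box_has_interior(l, u):
    """Check that the box {x : l <= x <= u} has a nonempty interior,
    i.e. l_i < u_i componentwise; bounds may be -inf or +inf."""
    l = np.asarray(l, dtype=float)
    u = np.asarray(u, dtype=float)
    return bool(np.all(l < u))

# Hypothetical 3-dimensional box with some infinite bounds.
l = np.array([0.0, -np.inf, 1.0])
u = np.array([np.inf, 5.0, 2.0])
print(box_has_interior(l, u))  # True: l_i < u_i for every i
```

Representing ±∞ as floating-point infinities keeps the componentwise comparison valid without any special-casing of one-sided bounds.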
It should be noted that proper data collection plays an important role in assessing the results obtained [39]. In addition, a vast variety of soft-computing techniques, such as evolutionary computing methods [2, 18, 28, 37, 38] and neural networks [1, 31], involve optimization problems that are sensitive to the initial points. Like verifying solution uniqueness conditions, these tasks convert into linear feasibility problems with strict inequalities or nonlinear feasibility problems with bound constraints [16, 34]. The latter are often challenging, so the existing algorithms need to be improved both theoretically and computationally. The nonlinear minimization problem with bound constraints is:

F(x) = 0,  x ∈ Ω = {x | lᵢ ≤ xᵢ ≤ uᵢ, i = 1, …, n}.  (1)

* Corresponding author can be contacted via the journal website.
Journal of Software Engineering: Theories and Practices 3 (1): 1-8, 2018 | https://lsp.institute

Efficient methods for the solution of this problem with good local convergence behavior have been proposed. The affine scaling trust region approach forms a practical framework for smooth and nonsmooth box-constrained systems of nonlinear equations [6-9]. These methods use an ellipsoidal trust region defined by a diagonal scaling matrix [12]. The diagonal scaling handles the bounds, while at each iteration a quadratic model of the objective function ½||F||² is minimized within a trust region around the current iterate.

The main motivation for the current work is a series of papers by Bellavia et al. [6-11]. The methods they introduced, STRN and CoDoSol, have very good numerical properties [6, 7]. These methods are widely used in practice [14, 35], and their efficiency has been demonstrated in several papers [7, 23]. In [8] the authors studied the global and fast convergence of an inexact dogleg method.
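The objective ½||F(x)||² minimized by these methods can be made concrete with a small sketch; the mapping F below is a made-up two-dimensional example, not one of the paper's test problems:

```python
import numpy as np

def F(x):
    # Hypothetical smooth mapping F: R^2 -> R^2 (illustrative only).
    return np.array([x[0]**2 + x[1] - 1.0, x[0] - x[1]])

def merit(x):
    # f(x) = 0.5 * ||F(x)||^2: the quadratic-model objective of problem (1);
    # f(x*) = 0 exactly when x* solves F(x) = 0.
    r = F(x)
    return 0.5 * float(r @ r)

x = np.array([0.6, 0.6])  # a point inside the box 0 <= x <= 1
print(merit(x))           # small positive value (about 8e-4)
```

For this F the system F(x) = 0 has a root with both components equal to (√5 − 1)/2, where the merit function vanishes, matching the equivalence between (1) and the minimization formulation.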
They did not investigate the choice of a suitable scaling matrix and reported only preliminary results. Later, in [11], they focused on medium-scale problems and replaced the inexact Newton step by the exact solution of the Newton equation. They considered several diagonal scaling matrices and established the assumptions required to ensure convergence. The resulting method is the Constrained Dogleg (CoDo) method, which is freely accessible through the website http://codosol.de.unifi.it. The effectiveness of CoDo was verified by comparing it to STRSCNE [7] and to IATR [9].

In these scaling-based algorithms, the performance is influenced by the selection of the scaling matrix. We introduce a new class of scaling matrices, obtained by the convex combination of currently known matrices. We analyze the numerical performance of CoDoSol (the Matlab solver of CoDo) for different convex combinations of the scaling matrices and compare the results using the performance profile approach. We also use a Projected Affine Scaling Interior Point algorithm to check the local convergence properties of the new scaling matrix.

In Section 2 we explain the role of the scaling matrices. In Section 3 we consider several scaling matrices and review the requirements and assumptions. In Section 4 we introduce a new class of scaling matrices and prove that the new matrices satisfy the required assumptions. Finally, in Section 5 we report our conclusions from the computational results.

II. SCALED TRUST REGIONS

In this section we describe the idea behind using scaling matrices to solve problem (1). First note that (1) is closely related to the box-constrained optimization problem:

min f(x) = ½ ||F(x)||²  s.t.  x ∈ Ω.  (2)

Every solution of (1) is a global minimum of (2), and if x* is a minimum of (2) such that f(x*) = 0, then x* is a solution of (1). The first-order optimality conditions of (2) are equivalent to the nonlinear system of equations D(x)∇f(x) = 0, x ∈ Ω, where D is a suitable diagonal scaling matrix of order n,

D(x) = diag(d₁(x), …, d_n(x)).  (3)

Coleman and Li [12] considered only one choice of the scaling matrix; Heinkenschloss et al. [21] later noted that the optimality conditions hold for a general class of scaling matrices satisfying conditions (4) for all i = 1, …, n and all x ∈ Ω. In affine scaling methods, the bounds are handled by moving along the direction of the scaled gradient, −D(x)∇f(x).

Given an iterate x_k ∈ int(Ω) and the trust region size Δ_k > 0, the trust region subproblem for (2) is:

min_{p ∈ ℝⁿ} m_k(p)  s.t.  ||G_k p|| ≤ Δ_k,  x_k + p ∈ int(Ω),  (5)

where m_k is the norm of the linear model of F at x_k, i.e. m_k(p) = ||F_k + F′_k p||, and G_k = G(x_k) with G: ℝⁿ → ℝ^{n×n}. For G_k = I the standard spherical trust region is obtained, and for G_k = D_k^{−1/2} the elliptical trust region. To solve this subproblem, different approaches have been proposed, such as STRN [6], which combines ideas from the classical trust-region Newton method for unconstrained nonlinear equations with the interior affine scaling approach for constrained optimization, or CoDoSol [11], which is based on a dogleg procedure tailored to constrained problems.

III. SCALING MATRICES

We consider the following well-known matrices:

• D^CL(x), given by Coleman and Li [12]; its diagonal elements are defined in (6).

• D^HUU(x), given by Heinkenschloss et al. [21] in (7), where p > 1 is a fixed constant.

• D^KK(x), given by Kanzow and Klug [26, 18] in (8), for a given constant γ > 0.

• D^HMZ(x), given by Hager et al. [20] in (9), where the function α(x) in (10) is continuous, strictly positive for any x, and uniformly bounded away from zero. This matrix was used by its authors in the study of a cyclic Barzilai-Borwein gradient method for bound-constrained minimization problems: they replaced the Hessian of the objective function with λ_k I, where λ_k is the classical Barzilai-Borwein parameter, and then, to compute the new iterate, moved along the scaled gradient with α(x) = λ_k.

Bellavia et al. [11] introduced the following assumptions for a scaling matrix and verified that all of the above scaling matrices satisfy almost all of the requirements.

A. Assumption

(i) D(x) satisfies (4);

(ii) D(x) is bounded in Ω ∩ B_r(x) for any x ∈ Ω and r > 0;

(iii) there exists a λ̄ > 0 such that the step size λ_k to the boundary from x_k along p_k satisfies λ_k > λ̄ whenever ||p_k|| is uniformly bounded above;

(iv) for any x̄ ∈ int(Ω) there exist two positive constants r and c such that B_r(x̄) ⊂ int(Ω) and ||D(x)⁻¹|| ≤ c for any x ∈ B_r(x̄).

We define the new scaling matrix as the convex combination

D^CON = a₁ D^CL + a₂ D^HUU + a₃ D^KK + a₄ D^HMZ,  0 ≤ aᵢ ≤ 1, i = 1, 2, 3, 4,  Σᵢ₌₁⁴ aᵢ = 1.  (11)

This scaling matrix combines the advantages of all the scaling matrices involved in its definition and appears to be an appropriate matrix with combined properties. We first have to verify that it satisfies the four desired requirements:

(i) Clearly D^CON satisfies (4).

(ii) As Bellavia et al. [11] showed, the above three matrices are bounded in Ω ∩ B_r(x) for x ∈ Ω and r > 0; this implies that the combination of these matrices is also bounded.

(iii) It has been shown in [11] that all the matrices except D^HUU enjoy this property. Thus, if we assume a₂ = 0, then D^CON satisfies this condition.

(iv) As before, since all the matrices satisfy condition (iv), their convex combination also satisfies it.

It is worth mentioning that D^CL is discontinuous at points where there exists an index i at which the defining cases in (6) switch, whereas the matrices D^KK and D^HUU are locally Lipschitz continuous and continuous, respectively. The new matrix D^CON is continuous, so we obtain a matrix having all the advantages of the three matrices: one that can be as fast and efficient as the first and as smooth and continuous as the second.

IV. NUMERICAL EXPERIMENTS

Table 1. Test problems.

We ran two experiments: CoDoSol on 15 problems and the Projected Affine Scaling Interior Point algorithm on 2
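The convex combination (11) can be sketched directly. The four diagonals below stand in for D^CL, D^HUU, D^KK, and D^HMZ evaluated at some point; their numeric values are invented for illustration, and the function name is ours:

```python
import numpy as np

def convex_combination(diags, a):
    """Form the diagonal of D^CON = sum_i a_i * D_i from diagonal scaling
    matrices, enforcing 0 <= a_i <= 1 and sum(a_i) = 1 as required by (11)."""
    a = np.asarray(a, dtype=float)
    assert np.all(a >= 0.0) and np.all(a <= 1.0), "weights must lie in [0, 1]"
    assert abs(a.sum() - 1.0) < 1e-12, "weights must sum to 1"
    return sum(w * np.asarray(d, dtype=float) for w, d in zip(a, diags))

# Stand-ins for the diagonals of D^CL, D^HUU, D^KK, D^HMZ at one point.
d_cl  = np.array([0.2, 1.0]); d_huu = np.array([0.4, 1.0])
d_kk  = np.array([0.3, 0.8]); d_hmz = np.array([0.5, 0.9])

# Setting a2 = 0 drops D^HUU, as needed for requirement (iii).
d_con = convex_combination([d_cl, d_huu, d_kk, d_hmz], [0.5, 0.0, 0.25, 0.25])
print(d_con)  # [0.3, 0.925]
```

Because the weights are nonnegative and sum to one, boundedness of each individual diagonal immediately carries over to the combination, which is the substance of requirement (ii).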
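Assumption (iii) involves the step size λ_k to the boundary of Ω from x_k along p_k. A minimal sketch of that computation follows; the point, bounds, and direction are invented for illustration, and the componentwise ratio formula is the standard one for a box:

```python
import numpy as np

def step_to_boundary(x, p, l, u):
    """Largest lam >= 0 such that x + lam*p stays in the box {l <= x <= u}:
    the step size to the boundary appearing in assumption (iii)."""
    lam = np.inf
    for xi, pi, li, ui in zip(x, p, l, u):
        if pi > 0 and np.isfinite(ui):
            lam = min(lam, (ui - xi) / pi)   # moving toward the upper bound
        elif pi < 0 and np.isfinite(li):
            lam = min(lam, (li - xi) / pi)   # moving toward the lower bound
    return lam

x = np.array([0.5, 0.5]); p = np.array([1.0, -0.25])
l = np.zeros(2); u = np.ones(2)
print(step_to_boundary(x, p, l, u))  # 0.5: the upper bound u_1 = 1 is hit first
```

If no finite bound lies in the direction of travel, the step size is infinite, which is consistent with the one-sided bounds allowed in the definition of Ω.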