Line-Search Methods for Smooth Unconstrained Optimization

Daniel P. Robinson
Department of Applied Mathematics and Statistics, Johns Hopkins University
September 17, 2020

Outline
1. Generic linesearch framework
2. Computing a descent direction p_k (search direction)
   - steepest-descent direction
   - modified Newton direction
   - quasi-Newton directions for medium-scale problems
   - limited-memory quasi-Newton directions for large-scale problems
   - linear CG method for large-scale problems
3. Choosing the step length α_k (linesearch)
   - backtracking-Armijo linesearch
   - Wolfe conditions
   - strong Wolfe conditions
4. Complete algorithms
   - steepest-descent backtracking-Armijo linesearch method
   - modified Newton backtracking-Armijo linesearch method
   - modified Newton linesearch method based on the Wolfe conditions
   - quasi-Newton linesearch method based on the Wolfe conditions

The problem

    minimize f(x) over x ∈ R^n,

where the objective function f : R^n → R.
- We assume that f ∈ C^1 (sometimes C^2) and is Lipschitz continuous; in practice this assumption may be violated, but the algorithms we develop may still work.
- In practice it is very rare to be able to provide an explicit minimizer.
- We consider iterative methods: given a starting guess x_0, generate a sequence {x_k} for k = 1, 2, ...
- AIM: ensure that (a subsequence of) the iterates has some favorable limiting properties:
  - satisfies first-order necessary conditions
  - satisfies second-order necessary conditions
- Notation: f_k = f(x_k), g_k = g(x_k), H_k = H(x_k).

The basic idea
- Consider descent methods, i.e., methods with f(x_{k+1}) < f(x_k).
- Calculate a search direction p_k from x_k and ensure that it is a descent direction, i.e.,

    g_k^T p_k < 0  whenever g_k ≠ 0,

  so that, for small steps along p_k, the objective function f is reduced.
- Calculate a suitable steplength α_k > 0 so that f(x_k + α_k p_k) < f_k.
- The computation of α_k is the linesearch; it may itself require an iterative procedure.
- The generic update for linesearch methods is

    x_{k+1} = x_k + α_k p_k.

Issue 1: steps might be too long

Figure: the objective function f(x) = x^2 and the iterates x_{k+1} = x_k + α_k p_k generated by the descent directions p_k = (−1)^{k+1} and steps α_k = 2 + 3/2^{k+1} from x_0 = 2. The decrease in f is not proportional to the size of the directional derivative!

Issue 2: steps might be too short

Figure: the objective function f(x) = x^2 and the iterates x_{k+1} = x_k + α_k p_k generated by the descent directions p_k = −1 and steps α_k = 1/2^{k+1} from x_0 = 2. Again, the decrease in f is not proportional to the size of the directional derivative!
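To make the two failure modes concrete, here is a short Python sketch (not from the original slides; the helper name run is illustrative) that applies the generic update x_{k+1} = x_k + α_k p_k with the direction and step rules stated in the two figure captions:

```python
def run(f, x0, direction, step, n_iter=6):
    """Apply the generic linesearch update x_{k+1} = x_k + alpha_k p_k."""
    x = x0
    history = [(x, f(x))]
    for k in range(n_iter):
        p = direction(k)        # descent direction p_k
        alpha = step(k)         # step length alpha_k
        x = x + alpha * p
        history.append((x, f(x)))
    return history

f = lambda x: x**2

# Issue 1: steps too long -- p_k = (-1)^(k+1), alpha_k = 2 + 3/2^(k+1), x_0 = 2
too_long = run(f, 2.0, lambda k: (-1.0)**(k + 1), lambda k: 2 + 3 / 2**(k + 1))

# Issue 2: steps too short -- p_k = -1, alpha_k = 1/2^(k+1), x_0 = 2
too_short = run(f, 2.0, lambda k: -1.0, lambda k: 1 / 2**(k + 1))

print([round(fx, 4) for _, fx in too_long])   # f decreases, but the iterates oscillate toward +/-1
print([round(fx, 4) for _, fx in too_short])  # f decreases, but the iterates stall near x = 1
```

In both runs f decreases at every iteration, yet neither sequence approaches the minimizer x = 0, which is exactly why the decrease must be tied to the size of the directional derivative.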
What is a practical linesearch?
- In the "early" days, exact linesearches were used, i.e., pick α_k to minimize f(x_k + α p_k).
  - An exact linesearch requires univariate minimization.
  - It is cheap if f is simple, e.g., a quadratic function.
  - It is generally very expensive and not cost effective.
  - An exact linesearch may not be much better than an approximate linesearch.
- Modern methods use inexact linesearches, which
  - ensure steps are neither too long nor too short;
  - make sure that the decrease in f is proportional to the directional derivative;
  - try to pick an "appropriate" initial stepsize for fast convergence, related to how the search direction p_k is computed.
- The descent direction (search direction) is also important:
  - "bad" directions may not converge at all;
  - more typically, "bad" directions may converge very slowly.

Definition 2.1 (Steepest-descent direction)
For a differentiable function f, the search direction

    p_k := −∇f(x_k) ≡ −g_k

is called the steepest-descent direction.

- p_k is a descent direction provided g_k ≠ 0, since g_k^T p_k = −g_k^T g_k = −‖g_k‖_2^2 < 0.
- p_k solves the problem

    minimize_{p ∈ R^n} m_k^L(x_k + p)  subject to  ‖p‖_2 = ‖g_k‖_2,

  where m_k^L(x_k + p) := f_k + g_k^T p and m_k^L(x_k + p) ≈ f(x_k + p).
- Any method that uses the steepest-descent direction is a method of steepest descent.

Observation: the steepest-descent direction is also the unique solution to

    minimize_{p ∈ R^n}  f_k + g_k^T p + (1/2) p^T p.

- This model approximates the second-derivative (Hessian) matrix by the identity matrix. How often is this a good idea? Is it a surprise that convergence is typically very slow?

Idea: choose a positive-definite B_k and compute the search direction as the unique minimizer

    p_k = argmin_{p ∈ R^n} m_k^Q(p) := f_k + g_k^T p + (1/2) p^T B_k p.

- p_k satisfies B_k p_k = −g_k.
- Why must B_k be positive definite?
  - B_k ≻ 0 implies that m_k^Q is strictly convex, which implies a unique solution.
  - If g_k ≠ 0, then p_k ≠ 0 and is a descent direction, since

        p_k^T g_k = −p_k^T B_k p_k < 0.

- Pick an "intelligent" B_k that has "useful" curvature information.
- If H_k ≻ 0 and we choose B_k = H_k, then p_k is the Newton direction. Awesome!

Question: How do we choose the positive-definite matrix B_k?
- Ideally, B_k is chosen such that ‖B_k − H_k‖ is "small".
- B_k = H_k when H_k is "sufficiently" positive definite.

Comments:
- For the remainder of this section, we omit the suffix k and write H = H_k, B = B_k, and g = g_k.
- For a symmetric matrix A ∈ R^{n×n}, we use the matrix norm ‖A‖_2 = max_{1≤j≤n} |λ_j|, with {λ_j} the eigenvalues of A.
- The spectral decomposition of H is H = VΛV^T, where V = [v_1 v_2 ··· v_n] and Λ = diag(λ_1, λ_2, ..., λ_n), with H v_j = λ_j v_j and λ_1 ≥ λ_2 ≥ ··· ≥ λ_n.
- H is positive definite if and only if λ_j > 0 for all j.
- Computing the spectral decomposition is, in general, very expensive!

Method 1: small-scale problems (n < 1000)

Algorithm 1: Compute modified Newton matrix B from H
 1: input H
 2: Choose β > 1, the desired bound on the condition number of B.
 3: Compute the spectral decomposition H = VΛV^T.
 4: if H = 0 then
 5:   set ε = 1
 6: else
 7:   set ε = ‖H‖_2 / β > 0
 8: end if
 9: Compute Λ̄ = diag(λ̄_1, λ̄_2, ..., λ̄_n) with
        λ̄_j = λ_j  if λ_j ≥ ε,   and   λ̄_j = ε  otherwise.
10: return B = VΛ̄V^T ≻ 0, which satisfies cond(B) ≤ β.

- Algorithm 1 replaces the eigenvalues of H that are "not positive enough" with ε.
- Λ̄ = Λ + D, where D = diag(max(0, ε − λ_1), max(0, ε − λ_2), ..., max(0, ε − λ_n)) ⪰ 0.
- B = H + E, where E = VDV^T ⪰ 0.
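A minimal NumPy sketch of Algorithm 1 (an illustration, not the author's code; the function name and the default value of β are assumptions):

```python
import numpy as np

def modified_newton_matrix_method1(H, beta=1e3):
    """Algorithm 1 sketch: replace eigenvalues of H below eps by eps,
    where eps = ||H||_2 / beta, so that B is positive definite with cond(B) <= beta."""
    lam, V = np.linalg.eigh(H)           # spectral decomposition H = V diag(lam) V^T
    norm_H = np.max(np.abs(lam))         # ||H||_2 = max_j |lambda_j|
    eps = 1.0 if norm_H == 0 else norm_H / beta
    lam_bar = np.where(lam >= eps, lam, eps)   # eigenvalues "not positive enough" -> eps
    return V @ np.diag(lam_bar) @ V.T
```

Solving B p = −g (e.g., with np.linalg.solve) with this B then yields a descent direction whenever g ≠ 0.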
Question: What are the properties of the resulting search direction?

With B from Algorithm 1, the search direction is

    p = −B^{−1} g = −V Λ̄^{−1} V^T g = − Σ_{j=1}^n (v_j^T g / λ̄_j) v_j.

Taking norms and using the orthogonality of V gives

    ‖p‖_2^2 = ‖B^{−1} g‖_2^2 = Σ_{j=1}^n (v_j^T g)^2 / λ̄_j^2
            = Σ_{λ_j ≥ ε} (v_j^T g)^2 / λ_j^2 + Σ_{λ_j < ε} (v_j^T g)^2 / ε^2.

Thus, we conclude that

    ‖p‖_2 = O(1/ε)  provided v_j^T g ≠ 0 for at least one λ_j < ε.

- The step p becomes unbounded as ε → 0 provided v_j^T g ≠ 0 for at least one λ_j < ε.
- Any indefinite matrix will generally satisfy this property.
- The next method that we discuss is better!

Method 2: small-scale problems (n < 1000)

Algorithm 2: Compute modified Newton matrix B from H
 1: input H
 2: Choose β > 1, the desired bound on the condition number of B.
 3: Compute the spectral decomposition H = VΛV^T.
 4: if H = 0 then
 5:   set ε = 1
 6: else
 7:   set ε = ‖H‖_2 / β > 0
 8: end if
 9: Compute Λ̄ = diag(λ̄_1, λ̄_2, ..., λ̄_n) with
        λ̄_j = λ_j    if λ_j ≥ ε,
        λ̄_j = −λ_j   if λ_j ≤ −ε,
        λ̄_j = ε      otherwise.
10: return B = VΛ̄V^T ≻ 0, which satisfies cond(B) ≤ β.

- Algorithm 2 replaces small eigenvalues λ_j of H with ε and "sufficiently negative" eigenvalues λ_j with −λ_j.
- Λ̄ = Λ + D, where D = diag(max(0, −2λ_1, ε − λ_1), ..., max(0, −2λ_n, ε − λ_n)) ⪰ 0.
- B = H + E, where E = VDV^T ⪰ 0.

Suppose that B = H + E is computed from Algorithm 2, so that

    E = B − H = V(Λ̄ − Λ)V^T,

and, since V is orthogonal,

    ‖E‖_2 = ‖V(Λ̄ − Λ)V^T‖_2 = ‖Λ̄ − Λ‖_2 = max_j |λ̄_j − λ_j|.

By definition,

    λ̄_j − λ_j = 0        if λ_j ≥ ε,
    λ̄_j − λ_j = ε − λ_j  if −ε < λ_j < ε,
    λ̄_j − λ_j = −2λ_j    if λ_j ≤ −ε,

which implies that

    ‖E‖_2 = max_{1≤j≤n} { 0, ε − λ_j, −2λ_j }.

However, since λ_1 ≥ λ_2 ≥ ··· ≥ λ_n, we know that

    ‖E‖_2 = max { 0, ε − λ_n, −2λ_n }.

- If λ_n ≥ ε, i.e., H is sufficiently positive definite, then E = 0 and B = H.
- If λ_n < ε, then B ≠ H and it can be shown that ‖E‖_2 ≤ 2 max(ε, |λ_n|).
- Regardless, B is well conditioned by construction.

Example 2.2
Consider

    g = (2, 4)^T   and   H = [1 0; 0 −2].

The Newton direction is

    p^N = −H^{−1} g = (−2, 2)^T,  so that  g^T p^N = 4 > 0 (an ascent direction),

and p^N is a saddle point of the quadratic model g^T p + (1/2) p^T H p since H is indefinite.

Algorithm 1 (Method 1) generates

    B = [1 0; 0 ε],  so that  p = −B^{−1} g = (−2, −4/ε)^T  and  g^T p = −4 − 16/ε < 0 (a descent direction).

Algorithm 2 (Method 2) generates

    B = [1 0; 0 2],  so that  p = −B^{−1} g = (−2, −2)^T  and  g^T p = −12 < 0 (a descent direction).

Figure: the Newton direction p^N and the modified Newton directions p produced by Method 1 and Method 2.

Question: What is the geometric interpretation?
Answer: Let the spectral decomposition of H be H = VΛV^T and assume that λ_j ≠ 0 for all j.
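For comparison with Method 1, here is a matching NumPy sketch of Algorithm 2, used to reproduce Example 2.2 above (again an illustration; the function name and β value are assumptions, not part of the original notes):

```python
import numpy as np

def modified_newton_matrix_method2(H, beta=1e3):
    """Algorithm 2 sketch: keep lambda_j if lambda_j >= eps, flip the sign of
    eigenvalues with lambda_j <= -eps, and raise the remaining ones to eps."""
    lam, V = np.linalg.eigh(H)
    norm_H = np.max(np.abs(lam))
    eps = 1.0 if norm_H == 0 else norm_H / beta
    lam_bar = np.where(lam >= eps, lam, np.where(lam <= -eps, -lam, eps))
    return V @ np.diag(lam_bar) @ V.T

# Example 2.2: g = (2, 4)^T and H = diag(1, -2)
g = np.array([2.0, 4.0])
H = np.diag([1.0, -2.0])
B = modified_newton_matrix_method2(H)      # approximately diag(1, 2)
p = np.linalg.solve(B, -g)
print(p, g @ p)                            # approximately [-2, -2] and -12 < 0: a descent direction
```

Unlike Method 1, the resulting direction stays bounded as ε → 0, consistent with the bound ‖E‖_2 ≤ 2 max(ε, |λ_n|) derived above.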