ILNumerics Optimization Toolbox
1 INTRODUCTION

Optimization deals with the minimization or maximization of functions. The ILNumerics Optimization Toolbox consists of functions that perform minimization (or maximization) of general nonlinear functions and problems. An optimization problem is the problem of finding the best solution from all feasible solutions. Optimization problems can be divided into two categories depending on whether the variables are continuous or discrete. Here, we focus on continuous optimization problems. The standard form of a continuous optimization problem is

    min  f(x)
    subject to  g_i(x) ≤ 0,  i = 1, ..., m,
                h_j(x) = 0,  j = 1, ..., p,
                x ∈ X,

where f : R^n → R is the objective function to be minimized over the variable x, the g_i : R^n → R are called the inequality constraints, the h_j : R^n → R are called the equality constraints, and X is a convex set in R^n defining the bound constraints. By convention, the standard form defines a minimization problem. A maximization problem can be handled by negating the objective function. Based on the description of the function f and the feasible set, the problem can be classified as a linear, quadratic, nonlinear, semi-infinite, semi-definite, multiple-objective, or discrete optimization problem. However, in its current state, the ILNumerics Optimization Toolbox provides only nonlinear unconstrained and constrained optimization functions.

2 UNCONSTRAINED OPTIMIZATION

The function available for unconstrained optimization problems in ILNumerics is called optimUnconst. The optimUnconst function solves optimization problems with nonlinear objectives, without bound constraints on the unknown variables. It implements a quasi-Newton method which uses the Broyden-Fletcher-Goldfarb-Shanno (BFGS) formula to update the approximate Hessian matrix. The quasi-Newton method has an O(n^2) memory requirement. For the moment, only the BFGS algorithm and the classical adaptive Newton method are available for unconstrained optimization problems. optimUnconst gives the option to provide user-defined functions for the computation of the Hessian or the gradient. By default, the gradient is computed using finite differences based on an optimal step size. The optimUnconst function is essentially an unconstrained nonlinear optimization solver:

• xopt = optimUnconst(objfunc, x0);
• xopt = optimUnconst(objfunc, x0, gradFunc: gradient);
• xopt = optimUnconst(objfunc, x0, hessianFunc: hessian);
• xopt = optimUnconst(objfunc, x0, gradFunc: gradient, hessianFunc: hessian);

where

• objfunc is the objective function,
• x0 is the initial guess,
• gradient is the gradient of the objective function,
• hessian is the Hessian function giving the explicit expression of the Hessian matrix,
• xopt is the optimal point, or the minimizer.

2.1 THE COST FUNCTION

The cost function is passed directly to optimUnconst as a function parameter. In most cases, this will be an ILNumerics function, but it is also fine to pass the cost function as an anonymous function. Requirements on the cost function: the cost function has to be "smooth enough", i.e. the 2nd derivative of the cost function (the Hessian matrix) is expected to exist and to be non-zero on the whole definition set.

2.2 GETTING STARTED WITH UNCONSTRAINED OPTIMIZATION

The simplest use of the optimUnconst algorithm is as follows:

xopt = optimUnconst(objfunc, x0);

where

• objfunc is the objective function,
• x0 is the initial guess,
• xopt is the optimal point, or the minimizer.
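The anonymous-function option mentioned in section 2.1 can be sketched as follows. This is only a minimal, hypothetical sketch: it assumes that optimUnconst accepts any delegate mapping ILInArray<double> to ILRetArray<double> (the exact delegate type is not spelled out in this document), that sum from ILMath is in scope like ones and array in the other examples, and the quadratic objective f(x) = sum((x - 3)^2) is made up purely for illustration:

// Hypothetical inline cost function with minimum at x = (3, 3), started from (1, 1).
ILArray<double> xopt = Optimization.optimUnconst(
    x => {
        using (ILScope.Enter(x)) {
            return sum((x - 3.0) * (x - 3.0));
        }
    },
    ones(2, 1));

For anything beyond a one-liner, a named ILNumerics function as in the examples below keeps the memory management pattern easier to follow.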
2.2.1 Example

In the following example, we compute the unconstrained minimum of the Rosenbrock function [1]. The function is given by

    f(x) = 100 (x_2 - x_1^2)^2 + (x_1 - 1)^2

[1] http://en.wikipedia.org/wiki/Rosenbrock_function

In C# the algorithm looks as follows:

public static ILRetArray<double> Rosenbrock(ILInArray<double> x) {
    using (ILScope.Enter(x)) {
        // 100 * (x2 - x1^2)^2 + (x1 - 1)^2
        return 100 * (x[1] - x[0] * x[0]) * (x[1] - x[0] * x[0]) + (x[0] - 1) * (x[0] - 1);
    }
}

The minimum of the Rosenbrock function is known to lie at [1, 1]. In order to find this minimum programmatically, we pass the function to optimUnconst, together with a point to start looking for the minimum, let's say x0 = (-5, -5):

ILArray<double> xopt = Optimization.optimUnconst(Rosenbrock, -5 * ones(2, 1));

>xopt
> <Double> [2,1]
>  [0]: 1.00000
>  [1]: 1.00000

The value returned from optimUnconst is called the minimizer of the objective function. It corresponds to the point x where the objective function reaches a minimum. For non-convex problems, that minimum will be a local minimum with respect to the given starting point, or initial guess. For convex functions f, the minimizer corresponds to the global minimum. The minimum, i.e. the value of the objective function at the minimizer, can be found by simply evaluating the objective function:

>Rosenbrock(xopt)
> <Double> (:,:) 1e-025 *
>  1.51685

2.3 OPTIMUNCONST WITH GRADIENT AND/OR HESSIAN FOR UNCONSTRAINED OPTIMIZATION

By default, the gradient and the Hessian approximations (i.e. first and 2nd derivatives) of the objective function are computed automatically by optimUnconst. In order to speed up this computation, user-defined functions for the Hessian and/or the gradient can be provided:

xopt = optimUnconst(objfunc, x0, gradFunc: userGradientFunction);
xopt = optimUnconst(objfunc, x0, hessianFunc: userDefinedHessian);
xopt = optimUnconst(objfunc, x0, gradFunc: userGradientFunction, hessianFunc: userDefinedHessian);

where userGradientFunction and userDefinedHessian are the user-provided gradient and Hessian functions. When the Hessian is provided, the algorithm becomes a classical Newton method [2], using the exact Hessian instead of the BFGS approximation of the Hessian.

[2] For an introduction, see: http://en.wikipedia.org/wiki/Newton_method_in_optimization

2.3.1 Example

We would like to minimize the function

    f(x) = ||x - a||_2 = sqrt((x_1 - 1)^2 + (x_2 - 2)^2 + (x_3 + 4)^2),  with a = (1, 2, -4).

The derivative of the function is

    ∇f(x) = (x_1 - 1, x_2 - 2, x_3 + 4)^T / ||x - a||_2 = (x - a) / ||x - a||_2.

The C# implementation of the function is as follows:

public ILRetArray<double> NewExampleFunction(ILInArray<double> x) {
    using (ILScope.Enter(x)) {
        ILArray<double> a = array(1.0, 2.0, -4.0);
        return norm(x - a, 2);
    }
}

The gradient is given by

public ILRetArray<double> GradientNewExampleFunction(ILInArray<double> x) {
    using (ILScope.Enter(x)) {
        ILArray<double> a = array(1.0, 2.0, -4.0);
        return (x - a) / norm(x - a, 2);
    }
}

Given the starting point (10.0, 100.0, 1000.0), the minimum of the function is found by:

ILArray<double> xopt = Optimization.optimUnconst(
    NewExampleFunction,
    array(10.0, 100.0, 1000.0),
    gradFunc: GradientNewExampleFunction);

>xopt
> <Double> [3, 1]
>  [0]: 1
>  [1]: 2
>  [2]: -4

The value of the function at the minimizer is found by

> NewExampleFunction(xopt)
> <Double> 0

In a similar way, the Hessian function can be provided for the computation. In this case, the BFGS approximation of the Hessian is replaced by the user-defined function, resulting in faster convergence of the optimization algorithm. Note that the Hessian function must return symmetric matrices of size n x n, where n is the dimensionality of the given starting point.
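The documentation does not show a Hessian for this example; the following is only a sketch of what one could look like, illustrating the n x n symmetry requirement. It uses the analytic Hessian of the Euclidean norm, ∇²f(x) = (I - u u^T) / ||x - a|| with u = (x - a) / ||x - a|| (undefined at x = a), and assumes that eye and multiply from ILMath, as well as the .T transpose property, are available in the surrounding class just like array and norm above:

public ILRetArray<double> HessianNewExampleFunction(ILInArray<double> x) {
    using (ILScope.Enter(x)) {
        ILArray<double> a = array(1.0, 2.0, -4.0);
        ILArray<double> r = norm(x - a, 2);      // distance to a (assumed non-zero)
        ILArray<double> u = (x - a) / r;         // unit direction, 3 x 1
        // symmetric 3 x 3 Hessian of the Euclidean norm: (I - u*u^T) / r
        return (eye(3, 3) - multiply(u, u.T)) / r;
    }
}

Such a function would then be passed via hessianFunc: HessianNewExampleFunction, analogous to the scalar Hessian in the next example.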
2.3.2 Example

We would like to minimize the function

    f(x) = 2 x^2.

The derivative of the function is

    f'(x) = 4 x,

and the Hessian is

    f''(x) = 4.

The C# code for the computation is as follows:

public ILRetArray<double> objfunc(ILInArray<double> x) {
    using (ILScope.Enter(x)) {
        return 2 * x * x;
    }
}

The gradient function is defined by:

public ILRetArray<double> gradfunc(ILInArray<double> x) {
    using (ILScope.Enter(x)) {
        return 4 * x;
    }
}

The Hessian function is defined by:

public ILRetArray<double> hessianfunc(ILInArray<double> x) {
    using (ILScope.Enter(x)) {
        // constant 1 x 1 Hessian
        return array(4.0);
    }
}

Given the starting point 1000.0, the method can be called as follows:

ILArray<double> xopt = Optimization.optimUnconst(
    objfunc, 1000.0,
    gradFunc: gradfunc,
    hessianFunc: hessianfunc);

In the immediate window, the result appears as

> <Double> (:,:) 1e-012 *
>  1.42109

i.e. the computed minimizer is numerically zero, as expected.

2.4 FEATURES

optimUnconst provides an efficient optimization solver based on the robust BFGS algorithm. BFGS is a quasi-Newton algorithm based on updates using a Gauss C-G algorithm, and the line search of optimUnconst is based on the Golden Section Search algorithm. Even with "big" numbers, BFGS will manage to adapt. However, it is recommended to normalize / scale functions with large values to a better floating point precision range (i.e. values 'near 1.0').

3 CONSTRAINED NONLINEAR OPTIMIZATION ALGORITHMS

ILNumerics provides the method optim to obtain the value of x that minimizes a nonlinear objective function f(x), allowing different kinds of constraints. The simplest call of optim is:

xopt = optim(objfunc, x0);

where xopt is the minimizer of the cost function, x0 is the initial guess and objfunc is the cost function. In fact, this call is equivalent to the optimUnconst function call. However, the Hessian and the gradient cannot be provided using this call; the Newton method can still be used through an approximation of the Hessian matrix instead of the BFGS approximation.

3.1 NONLINEAR OPTIMIZATION WITH BOUNDARY CONSTRAINTS

optim solves bound-constrained optimization problems defined by

    min  f(x)
    subject to  a ≤ x ≤ b.

The calling sequence for such a problem is:

xopt = optim(objfunc, x0, lowerBound: a, upperBound: b);

where

• objfunc is the objective function,
• x0 is the initial guess,
• lowerBound is the lower bound constraint (a),
• upperBound is the upper bound constraint (b).

3.1.1 Example

Let us consider the minimization of an objective function f subject to the bounds

    -5 ≤ x ≤ 0.

The C# function to solve the problem can be written as follows:

public ILRetArray<double> objfunc(ILInArray