
A Fast and Reliable Hybrid Algorithm for Numerical Nonlinear Global Optimization

Charlie Vanaret, Jean-Baptiste Gotteland and Nicolas Durand
Laboratoire de Mathématiques Appliquées, Informatique, Automatique pour l'Aérien
École Nationale de l'Aviation Civile, Toulouse, France
{vanaret, gottelan, durand}@recherche.enac.fr

Jean-Marc Alliot
Institut de Recherche en Informatique de Toulouse, France
[email protected]

Abstract

Highly nonlinear and ill-conditioned numerical optimization problems take their toll on the convergence of existing resolution methods. Stochastic methods such as Evolutionary Algorithms carry out an efficient exploration of the search-space at low cost, but often get trapped in local minima and do not prove the optimality of the solution. Deterministic methods such as Interval Branch and Bound algorithms guarantee bounds on the solution, yet struggle to converge within a reasonable time on high-dimensional problems. The contribution of this paper is a hybrid algorithm in which a Differential Evolution algorithm and an Interval Branch and Contract algorithm cooperate. Bounds and solutions are exchanged through shared memory to accelerate the proof of optimality. The hybrid prevents premature convergence toward local optima and outperforms existing deterministic and stochastic approaches. We demonstrate its efficiency on two currently unsolved problems: first by presenting new certified optimal results for the Michalewicz function for up to 75 dimensions, and then by proving that the putative minimum of the Lennard-Jones cluster of 5 atoms is optimal.

1 Motivation

Evolutionary Computation (EC) algorithms have been widely used by the global optimization community for their ability to handle complex and high-dimensional problems with no assumption on continuity or differentiability. They carry out a fast exploration of the search-space and generally converge toward satisfactory solutions. However, EC may get trapped in local optima and provide suboptimal solutions. Moreover, their convergence remains hard to control due to their stochastic nature.

On the other hand, Interval Branch and Bound Algorithms (IBBA) guarantee rigorous bounds on the solutions to nonlinear and nonconvex numerical problems, but are limited by their exponential complexity with respect to the number of variables and by the dependency problem inherent to Interval Analysis.

Few approaches have attempted to hybridize EC algorithms and Interval Branch and Bound algorithms. (Sotiropoulos, Stavropoulos, and Vrahatis 1997) and (Zhang and Liu 2007) devised integrative methods that embedded one algorithm within the other: Sotiropoulos et al. used a Branch and Bound algorithm to reduce the domain to a list of ε-large subspaces; a Genetic Algorithm (GA) was then initialized within each subspace to improve the upper bound of the global minimum. Zhang and Liu used a Genetic Algorithm within the Branch and Bound algorithm to improve the bounds and the order of the list of remaining subspaces to process. (Alliot et al. 2012) proposed a cooperative approach combining the efficiency of a GA and the reliability of an IBBA to guarantee the optimality of solutions to highly nonlinear bound-constrained problems. Original optimal results were achieved on benchmark functions, demonstrating the validity of the approach. However, local monotonicity and constraint programming techniques, which exploit the analytical form of the objective function, were left out of the basic formulation of the algorithm. In this paper, we propose an advanced cooperative algorithm in which a Differential Evolution algorithm cooperates with Interval Constraint Programming. It is reliable in that it guarantees bounds on the global optimum. New optimal results achieved on two highly multimodal functions – the Michalewicz function and the Lennard-Jones cluster problem – attest to the substantial gain in performance.

In this study, we consider only bound-constrained numerical minimization problems:

$$\min_{x \in D} f(x) \qquad (1)$$

where $f : (D = \prod_{i=1}^{n} [l_i, u_i] \subset \mathbb{R}^n) \to \mathbb{R}$ is the objective function to be minimized. We assume that $f$ is differentiable, and that the analytical forms of $f$ and its partial derivatives are available.

The standard Differential Evolution algorithm is presented in section 2, and details about our Interval Branch and Contract Algorithm are given in section 3. The implementation of our hybrid algorithm is detailed in section 4. In section 5, we present new optimal results for the Michalewicz function and we prove that the putative minimum of the Lennard-Jones cluster of 5 atoms is optimal.
2 Differential Evolution

Evolutionary Computation (EC) techniques are stochastic iterative optimization algorithms that usually mimic natural processes. In particular, Evolutionary Algorithms (EA) are based on the theory of evolution (survival of the fittest): stochastic operators iteratively improve a population of individuals (candidate solutions) according to an adaptation criterion (the objective function), in order to converge toward satisfactory solutions. Numerous EC techniques have recently emerged: Particle Swarm Optimization (Kennedy and Eberhart 1995), Ant Colony Optimization (Dorigo, Maniezzo, and Colorni 1996), CMA-ES (Hansen and Kern 2004), etc. (Alliot et al. 2012) used a Genetic Algorithm, but clearly stated that this algorithm had been chosen because it was the most familiar to the authors, and that other EC algorithms might be better suited to the hybridization. In the following, we use a Differential Evolution algorithm, as it greatly improved the original results.

Differential Evolution (DE) is a simple yet powerful EC algorithm introduced by Storn and Price (1997). It has proved to be particularly efficient on difficult black-box optimization problems. DE combines the coordinates of existing individuals with a given probability to generate new potential solutions. Unlike Genetic Algorithms, it offers the possibility to move individuals along only part of the dimensions. Algorithm 1 describes DE for an n-dimensional problem.

Algorithm 1 Differential Evolution algorithm (DE)
  Randomly initialize each individual
  Evaluate each individual
  while termination criterion is not met do
    for each individual x^(i), i ∈ {1, ..., NP} do
      Randomly pick r, x^(i0), x^(i1), x^(i2)
      for each dimension j ∈ {1, ..., n} do
        if j = r or rand(0, 1) < CR then
          y_j^(i) = x_j^(i0) + W × (x_j^(i1) − x_j^(i2))
        else
          y_j^(i) = x_j^(i)
        end if
      end for
      if f(y^(i)) < f(x^(i)) then
        x^(i) ← y^(i)
      end if
    end for
  end while
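As a complement to Algorithm 1, the following Python sketch implements the same generation step (rand/1 mutation, binomial crossover); the names (de_step, objective, population) and the default parameter values are illustrative choices, not taken from the original implementation.

```python
import random

def de_step(population, fitness, objective, W=0.7, CR=0.5):
    """One DE generation (rand/1 mutation, binomial crossover), mirroring Algorithm 1."""
    NP, n = len(population), len(population[0])
    for i in range(NP):
        # pick three mutually distinct individuals, all different from i
        i0, i1, i2 = random.sample([k for k in range(NP) if k != i], 3)
        r = random.randrange(n)  # at least one coordinate comes from the mutant
        y = list(population[i])
        for j in range(n):
            if j == r or random.random() < CR:
                y[j] = population[i0][j] + W * (population[i1][j] - population[i2][j])
        fy = objective(y)
        if fy < fitness[i]:  # greedy one-to-one replacement
            population[i], fitness[i] = y, fy
    return population, fitness
```

A complete solver would wrap this step in a generation loop and handle the bound constraints, as discussed in section 4.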
3 Interval Branch and Contract Algorithm

Interval Analysis (IA) (Moore 1966) computes with intervals instead of real numbers, which makes it possible to bound rounding errors. The lower (resp. upper) bound of an interval X is noted $\underline{X}$ (resp. $\overline{X}$). A box denotes an interval vector. The width of an interval is $w(X) = \overline{X} - \underline{X}$, and the width of a box $(X_1, \ldots, X_n)$ is $\max_{1 \le i \le n} w(X_i)$. The midpoint of an interval is $m(X) = \frac{1}{2}(\underline{X} + \overline{X})$, and the midpoint of a box $(X_1, \ldots, X_n)$ is $(m(X_1), \ldots, m(X_n))$. The convex hull of a set of boxes S is the smallest box that contains S.

Interval arithmetic extends the binary operators (+, −, ×, /) and the elementary functions (exp, log, cos, etc.) to intervals. Interval computations are carried out with outward rounding. An interval extension F of a real-valued function f guarantees a rigorous enclosure of its range:

$$\forall X \in \mathbb{IR}^n, \quad f(X) = \{ f(x) \mid x \in X \} \subset F(X) \qquad (2)$$

The natural interval extension $F_N$ is computed by replacing the real elementary operations by interval arithmetic operations. IA generally computes a large overestimation of the image due to the dependency problem: when a variable appears more than once in an expression, each occurrence is treated as a different variable. However, if f is continuous inside a box, its natural interval extension yields the exact range when each variable occurs only once in its expression:

Example 1  Let $f_1(x) = x^2 - x$ and $X = [0, 2]$. The functions $f_2(x) = x(x-1)$ and $f_3(x) = (x - \frac{1}{2})^2 - \frac{1}{4}$ are equivalent to $f_1$. However,

$$F_1(X) = [0,2]^2 - [0,2] = [0,4] - [0,2] = [-2,4]$$
$$F_2(X) = [0,2] \times ([0,2] - 1) = [0,2] \times [-1,1] = [-2,2] \qquad (3)$$
$$F_3(X) = \left([0,2] - \tfrac{1}{2}\right)^2 - \tfrac{1}{4} = \left[-\tfrac{1}{2}, \tfrac{3}{2}\right]^2 - \tfrac{1}{4} = \left[-\tfrac{1}{4}, 2\right]$$

We thus have $F_3(X) \subset F_2(X) \subset F_1(X)$. $F_3(X)$ is the best computable enclosure, i.e. it is the exact range of $f(X)$.

Interval Branch and Bound Algorithms (IBBA) exploit the conservative properties of interval extensions to rigorously bound the global optima of numerical optimization problems (Hansen 1992). However, their exponential complexity with respect to the number of variables remains a limiting factor.
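Example 1 can be reproduced with a few lines of Python; the toy Interval class below is our own illustration and ignores outward rounding, which a rigorous implementation must perform.

```python
class Interval:
    """Toy interval arithmetic; real implementations round outward."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def __add__(self, other):
        other = self._coerce(other)
        return Interval(self.lo + other.lo, self.hi + other.hi)
    def __sub__(self, other):
        other = self._coerce(other)
        return Interval(self.lo - other.hi, self.hi - other.lo)
    def __mul__(self, other):
        other = self._coerce(other)
        p = [a * b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return Interval(min(p), max(p))
    def sqr(self):
        # x^2 handles its single occurrence exactly (no dependency effect)
        lo, hi = sorted((abs(self.lo), abs(self.hi)))
        return Interval(0.0 if self.lo <= 0.0 <= self.hi else lo * lo, hi * hi)
    @staticmethod
    def _coerce(v):
        return v if isinstance(v, Interval) else Interval(v, v)
    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

X = Interval(0.0, 2.0)
print(X.sqr() - X)             # F1(X)
print(X * (X - 1))             # F2(X)
print((X - 0.5).sqr() - 0.25)  # F3(X), the exact range
```

Running it on X = [0, 2] reproduces the three nested enclosures of Example 1.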

Algorithm 3 Interval Branch and Contract Algorithm
  f̃ ← +∞                                  ⊲ best found upper bound
  L ← {X_0}                                ⊲ priority queue of boxes to process
  S ← {}                                   ⊲ list of solutions
  repeat
    Extract a box X from L                 ⊲ selection rule
    Compute F(X)                           ⊲ bounding rule
    if $\underline{F}(X) \le$ f̃ − ǫ_f then  ⊲ cut-off test
      CONTRACT(X, f̃)                       ⊲ filtering algorithms
      f̃ ← min($\overline{F}(m(X))$, f̃)     ⊲ midpoint test
      if w(X) > ǫ_x and w(F(X)) > ǫ_f then
        Bisect X into X_1 and X_2           ⊲ branching rule
        L ← L ∪ {X_1} ∪ {X_2}
      else
        S ← S ∪ {X}                         ⊲ termination rule
      end if
    else
      Discard X
    end if
  until L = ∅
  return S

HC4Revise (Example 2) carries out a double exploration of the syntax tree of a constraint to contract each occurrence of a variable. It consists of an evaluation (bottom-up) phase followed by a propagation (top-down) phase, illustrated in Figure 2.

[Figure 2: HC4Revise: propagation phase]

4 Cooperative hybrid algorithm

The original cooperative algorithm (Alliot et al. 2012) combined an EA and an IBBA that ran independently and cooperated by exchanging information through shared memory (Figure 3) in order to accelerate the convergence. The EA carries out a fast exploration of the search-space and quickly finds satisfactory solutions. The best known evaluation is used to improve the upper bound of the global minimum, and allows the IBBA to prune parts of the search-space more efficiently. A third process periodically projects the EA's individuals trapped in local minima onto the closest admissible box of the IBBA.

[Figure 3: Original cooperative hybrid algorithm. The EA and the IBBA run in parallel and exchange the best individual x̃ and the upper bound f̃ through shared memory; an update process projects the population onto the remaining feasible domain and cleans up the queue.]

We propose to replace the EA of the original paper by a DE algorithm, which has shown a better convergence toward the global optimum, and to replace the IBBA by our Interval Branch and Contract Algorithm (IBCA), which integrates efficient contraction techniques.
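The cooperation of Figure 3 boils down to a small amount of shared state. The Python sketch below is a schematic illustration under our own naming (SharedBounds, de_worker, ibca_worker) and is not the original implementation.

```python
import math
import threading

class SharedBounds:
    """Shared memory between the DE thread and the IBCA thread."""
    def __init__(self):
        self._lock = threading.Lock()
        self.best_ub = math.inf   # certified upper bound f~
        self.best_x = None        # best known point x~

    def update(self, ub, x):
        # keep only improvements; callers must pass a *certified* upper bound
        with self._lock:
            if ub < self.best_ub:
                self.best_ub, self.best_x = ub, list(x)

    def read(self):
        with self._lock:
            return self.best_ub, self.best_x

def de_worker(shared, stop):
    while not stop.is_set():
        ...  # run one DE generation, then shared.update(certified_ub, best_individual)

def ibca_worker(shared, stop):
    while not stop.is_set():
        f_tilde, x_tilde = shared.read()
        ...  # extract a box, contract it with f_tilde, then bisect or discard it
```

Only certified upper bounds, computed with interval arithmetic as described in section 4.1, should ever reach SharedBounds.update.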

4.1 DE thread

A coordinate of a new individual y^(i) generated by DE may fall outside the domain. The base individual satisfies x^(i0) ∈ X; the offending coordinate is therefore reset at random between the violated bound and the corresponding coordinate of x^(i0):

$$y_j^{(i)} = \begin{cases} x_j^{(i0)} + \mathrm{rand}(0,1)\,(l_j - x_j^{(i0)}) & \text{if } y_j^{(i)} < l_j \\ x_j^{(i0)} + \mathrm{rand}(0,1)\,(u_j - x_j^{(i0)}) & \text{if } y_j^{(i)} > u_j \end{cases}$$

[…] a constant or adaptive amount (depending on the number […]

DE's individuals are evaluated using the real-valued objective function in round-to-nearest mode. This value may be lower than the (theoretical) exact evaluation; it should therefore not be used to update f̃, at the risk of losing a rigorous enclosure of the global minimum. An evaluation of the interval extension is therefore required whenever the best known evaluation is improved. The upper bound of the image interval computed by IA is stored in the shared memory, which guarantees certified bounds on the solution.
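The certified update described in the previous paragraph can be sketched as a small guard; f_float and F_interval stand for the floating-point objective and its interval extension, and shared is the structure of the previous sketch. All names are illustrative.

```python
def certified_update(x, best_float, shared, f_float, F_interval):
    """Let DE work with float evaluations, but only store certified upper bounds."""
    fx = f_float(x)                   # round-to-nearest evaluation used by DE
    if fx < best_float:
        best_float = fx
        lo, hi = F_interval([(xi, xi) for xi in x])  # enclosure of f(x) on the degenerate box [x, x]
        shared.update(hi, x)          # the *upper* bound is a rigorous bound on the minimum
    return best_float
```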

4.2 IBCA thread

A good upper bound provided by the DE is retrieved at each iteration from the shared memory and compared to the current best upper bound f̃. If the latter is improved, it is updated so as to prune more efficiently the parts of the search-space that cannot contain a global minimizer. We use the following steps in our Interval Branch and Contract Algorithm:

Selection rule  The priority with which boxes are inserted in the queue L determines the exploration strategy of the search-space: smallest lower bound first, largest box first (breadth-first search), stack (depth-first search). Our IBCA uses the location of the current best solution x̃ to extract from L the box X for which the distance to x̃ is maximum:

$$\mathrm{dist}(X, \tilde{x}) = \sqrt{\sum_{i=1}^{n} \epsilon_i(X_i, \tilde{x}_i)^2} \qquad (5)$$

where

$$\epsilon_i(X_i, \tilde{x}_i) = \begin{cases} \tilde{x}_i - \overline{X_i} & \text{if } \overline{X_i} < \tilde{x}_i \\ \underline{X_i} - \tilde{x}_i & \text{if } \tilde{x}_i < \underline{X_i} \\ 0 & \text{otherwise} \end{cases}$$

Cut-off test  If f̃ − ǫ_f < $\underline{F}(X)$, the box X cannot contain a solution that improves the best known upper bound by at least ǫ_f; it is discarded.

To update f̃, the objective function is evaluated at a point c of the box, e.g. the midpoint m(X) used in the midpoint test of Algorithm 3. In practice, f(c) is also computed using IA to bound rounding errors. We denote by $F_T$ the multivariate Taylor interval extension, with c ∈ X:

$$F_T(X) = F(c) + \sum_{i=1}^{n} (X_i - c_i)\, G_i(X_1, \ldots, X_n) \qquad (8)$$

The quality of inclusion of equation 8 depends on the point c. The Baumann centered form, for which Baumann (1988) gave an analytical expression of the optimal center, computes the greatest lower bound of f(X):

$$\underline{F_B}(X) = \underline{F}(c_B) + \sum_{i=1}^{n} \frac{L_i U_i}{U_i - L_i}\, w(X_i) \qquad (9)$$

where $\frac{\partial f}{\partial x_i}(X) \in G_i(X) = [L_i, U_i]$, $c_B = (c_1, \ldots, c_n)$ and $c_i = (\overline{X_i} U_i - \underline{X_i} L_i)/(U_i - L_i)$. Note that computing $F(c_B)$ offers the possibility of updating f̃ (see the midpoint test in Algorithm 3).

The order of approximation k of an interval extension F indicates the speed at which the interval inclusion approaches the exact range of f:

$$w(F(X)) - w(f(X)) = O(w(X)^k) \qquad (10)$$

Centered forms have a quadratic order of approximation (k = 2), while that of natural interval extensions is only linear (k = 1). Consequently, the enclosure $F_B(X)$ of $f(X)$ becomes sharper when the width of X approaches 0. In practice, it is not known beforehand whether the Baumann form computes a better inclusion than the natural interval extension. To exploit the quadratic convergence of $F_B$, we set an arbitrary threshold σ on the width of the boxes, under which the Baumann inclusion is computed in addition to the natural inclusion. Intersecting both inclusions may yield a tighter enclosure of the image:

$$\underline{F}(X) = \begin{cases} \max(\underline{F_N}(X), \underline{F_B}(X)) & \text{if } w(X) < \sigma \\ \underline{F_N}(X) & \text{otherwise} \end{cases} \qquad (11)$$

For small boxes, the additional information supplied by the lower bound is generally worth the cost of the extra computations (see results in section 5.2); a sketch of the computation of $\underline{F_B}$ is given after the remaining rules.

Termination rule  Parameters ǫ_x and ǫ_f determine the desired precision of the solution. X is stored in the solution list S when w(X) ≤ ǫ_x or w(F(X)) ≤ ǫ_f. Otherwise, X is bisected, and the insertion priority is computed for the resulting subboxes X_1 and X_2.

Bisection rule  X is bisected along one dimension after another (round-robin method for each individual box). The two resulting subboxes are inserted in L to be subsequently processed.
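The following sketch computes the Baumann bound of equation (9), assuming that interval enclosures [L_i, U_i] of the partial derivatives over the box are available (for instance from an interval gradient); interval_f is a placeholder for the interval extension F, and the function name is ours.

```python
def baumann_lower_bound(box, grad_enclosures, interval_f):
    """Lower bound of f over `box` via the Baumann centered form (equation 9).

    box:             list of (lo, hi) pairs, one per variable
    grad_enclosures: list of (L_i, U_i) pairs enclosing df/dx_i over `box`
    interval_f:      interval extension F, mapping a box to a (lo, hi) pair
    """
    center, correction = [], 0.0
    for (lo, hi), (L, U) in zip(box, grad_enclosures):
        if U == L:
            center.append(0.5 * (lo + hi))  # degenerate enclosure: keep the midpoint, no correction
            continue
        center.append((hi * U - lo * L) / (U - L))  # Baumann's optimal center c_B
        correction += (L * U) / (U - L) * (hi - lo)
    f_at_cB = interval_f([(c, c) for c in center])  # rigorous enclosure of f(c_B)
    return f_at_cB[0] + correction
```

Intersecting this value with the lower bound of the natural extension, as in equation (11), can only tighten the enclosure.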
4.3 Update thread

The individuals of the DE population are periodically projected onto the admissible domain to avoid exploring infeasible parts of the search-space. If an individual x lies outside the remaining domain, it is randomly reinitialized within the closest box X:

$$x_i \leftarrow \begin{cases} \mathrm{rand}(\underline{X_i}, m(X_i)) & \text{if } x_i < \underline{X_i} \\ \mathrm{rand}(m(X_i), \overline{X_i}) & \text{if } x_i > \overline{X_i} \\ x_i & \text{otherwise} \end{cases} \qquad (12)$$

5 Results

We present two new results:

1. We compute the optima of the Michalewicz function for up to 75 variables, while the best known results were previously computed for up to 50 variables only. We then prove the optimality of these solutions, whereas optimality had previously been proven only up to 12 variables. Last, we present an improvement of Adorio's formula for the putative values of the optima of the Michalewicz function in all dimensions, with R² = 0.9999999133.

2. We prove that the currently putative solution to the Lennard-Jones cluster problem with 5 atoms is the global optimum, a result which had up to now never been proved.

5.1 Michalewicz function

The Michalewicz function (Michalewicz 1996) is a highly multimodal function (n! local optima) with domain [0, π]^n:

$$f_n(x) = -\sum_{i=1}^{n} \sin(x_i) \left[ \sin\!\left( \frac{i\, x_i^2}{\pi} \right) \right]^{20} \qquad (13)$$
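Equation (13) translates directly into a few lines of Python; this transcription is ours and is only meant to make the benchmark reproducible at a glance.

```python
import math

def michalewicz(x):
    """Michalewicz function of equation (13), defined on [0, pi]^n."""
    return -sum(math.sin(xi) * math.sin(i * xi * xi / math.pi) ** 20
                for i, xi in enumerate(x, start=1))
```

For instance, michalewicz([2.20, 1.57]) is close to −1.80, the commonly quoted two-dimensional minimum.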

The best found solutions for up to 50 dimensions are given in (Mishra 2006), using a repulsive particle swarm algorithm: f*_10 = −9.6602, f*_20 = −19.6370, f*_30 = −29.6309, f*_50 = −49.6248. Very few results regarding deterministic methods are available: Alliot et al. proved the optimality of the solution for n = 12 with precisions ǫ_x = 10⁻³ and ǫ_f = 10⁻⁴ in 6000 s: f*_12 = −11.64957. Our advanced version of the cooperative algorithm significantly accelerates the proof of optimality: the convergence on the same problem is achieved after 0.03 s. The proved minima¹ for n = 10 to 75 are presented in Table 1.

n    f*_n                 n    f*_n
10   -9.66015171564       50   -49.62483231828
20   -19.63701359935      60   -59.62314622857
30   -29.63088385032      70   -69.62222020764
40   -39.62674886468      75   -74.62181118757

Table 1: Proved minima of the Michalewicz function

As an example, a comparison between DE alone, IBCA alone and our hybrid algorithm for n = 20 illustrates the gain in performance achieved by our approach: the DE algorithm alone converges toward a local optimum (−19.6356), the IBCA alone achieves convergence in 64 s, and the hybrid algorithm in 0.09 s.

Table 2 presents the average and maximum CPU times (in seconds) and the average numbers of evaluations (NE) of f, F and its partial derivatives G_i, over 100 executions of the hybrid algorithm for n = 10 to 75.

n    Av. time   Max. time   NE f     NE F    NE G_i
10   0.02       0.02        14,516   851     6,480
20   0.1        0.1         26,055   1,549   22,972
30   0.4        0.4         56,851   3,357   74,201

Table 2: CPU times (s) and numbers of evaluations of the hybrid algorithm on the Michalewicz function

It is claimed in (Adorio 2005) that f*_n = −0.966n. An interpolation of f*_n over [10, 75] rather suggests that f*_n = −0.9995371232 n + 0.3486088434 (R² = 0.9999999133).

¹ Parameters of the algorithm were: ǫ_x (precision, width of box) = 10⁻¹⁰, ǫ_f (precision, width of image) = 10⁻¹⁰, σ (Baumann threshold) = 0, NP (population size) = 20 to 100, W (weighting factor) = 0.7, CR (crossover rate) = 0.

5.2 Lennard-Jones clusters

The Lennard-Jones potential is a simplified model proposed by (Jones 1924) to describe pairwise interactions between atoms. It is deemed an accurate model of clusters of noble gas atoms. Given r_ij the distance between the centers of atoms i and j, the pairwise potential is defined by

$$v(r_{ij}) = 4\left( \frac{1}{r_{ij}^{12}} - \frac{1}{r_{ij}^{6}} \right) \qquad (14)$$

Finding the most stable configuration of a cluster of k atoms amounts to minimizing

$$f_n(x) = \sum_{1 \le i < j \le k} v\!\left( \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2} \right) \qquad (15)$$

Atom   x           y           z
1      0           0           0
2      1.1240936   0           0
3      0.5620468   0.9734936   0
4      0.5620468   0.3244979   0.9129386
5      0.5620468   0.3244979   -0.9129385

Table 3: Coordinates of the optimal solution (5 atoms)

Average time (s)         1436
Maximum time (s)         1800
Maximal queue size       46
F evaluations (IBCA)     7,088,758
∇F evaluations (IBCA)    78,229,737
f evaluations (DE)       483,642,320

Table 4: Statistics of the hybrid algorithm on the Lennard-Jones cluster (5 atoms)
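For reference, equations (14) and (15) can be transcribed as follows; the function names are ours, and the coordinates of Table 3 can be passed to lj_cluster_energy to check the energy of the reported configuration.

```python
import math

def lj_pair(r):
    """Pairwise Lennard-Jones potential of equation (14)."""
    return 4.0 * (r ** -12 - r ** -6)

def lj_cluster_energy(atoms):
    """Total energy of equation (15); `atoms` is a list of (x, y, z) triples."""
    energy = 0.0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            energy += lj_pair(math.dist(atoms[i], atoms[j]))
    return energy

# e.g. lj_cluster_energy([(0, 0, 0), (1.1240936, 0, 0), ...]) with the Table 3 coordinates
```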

² Parameters of the algorithm were: ǫ_x (precision, width of box) = 10⁻⁶, ǫ_f (precision, width of image) = 10⁻⁶, σ (Baumann threshold) = 10⁻⁴, NP (population size) = 40, W (weighting factor) = 0.7, CR (crossover rate) = 0.4.

References

Adorio, E. P. 2005. MVF – Multivariate test functions library in C for unconstrained global optimization. Technical report, Department of Mathematics, U.P. Diliman.

Alliot, J.-M.; Durand, N.; Gianazza, D.; and Gotteland, J.-B. 2012. Finding and proving the optimum: Cooperative stochastic and deterministic search. In 20th European Conference on Artificial Intelligence.

Araya, I.; Trombettoni, G.; and Neveu, B. 2010. Exploiting monotonicity in interval constraint propagation. In AAAI.

Baumann, E. 1988. Optimal centered forms. BIT Numerical Mathematics 28:80–87.

Benhamou, F.; Goualard, F.; Granvilliers, L.; and Puget, J.-F. 1999. Revising hull and box consistency. In International Conference on Logic Programming, 230–244. MIT Press.

Chabert, G., and Jaulin, L. 2009. Contractor programming. Artificial Intelligence 173:1079–1100.

Dorigo, M.; Maniezzo, V.; and Colorni, A. 1996. The Ant System: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics Part B: Cybernetics 26(1):29–41.

Hansen, N., and Kern, S. 2004. Evaluating the CMA evolution strategy on multimodal test functions. In Proceedings of the 8th International Conference on Parallel Problem Solving from Nature, 282–291.

Hansen, E. 1992. Global Optimization Using Interval Analysis. Dekker.

Hoare, M., and Pal, P. 1971. Physical cluster mechanics: Statics and energy surfaces for monatomic systems. Advances in Physics 20(84):161–196.

Jones, J. E. 1924. On the determination of molecular fields. I. From the variation of the viscosity of a gas with temperature. Proceedings of the Royal Society of London, Series A 106(738):441–462.

Kennedy, J., and Eberhart, R. 1995. Particle swarm optimization. In Proceedings of the IEEE International Conference on Neural Networks.

Leary, R. 1997. Global optima of Lennard-Jones clusters. Journal of Global Optimization 11(1):35–53.

Locatelli, M., and Schoen, F. 2003. Efficient algorithms for large scale global optimization: Lennard-Jones clusters. Computational Optimization and Applications 26(2):173–190.

Michalewicz, Z. 1996. Genetic Algorithms + Data Structures = Evolution Programs (3rd ed.). Springer-Verlag.
Mishra, S. K. 2006. Some new test functions for global optimization and performance of repulsive particle swarm method. Technical report, University Library of Munich, Germany.

Moore, R. E. 1966. Interval Analysis. Prentice-Hall.

Northby, J. A. 1987. Structure and binding of Lennard-Jones clusters: 13 ≤ N ≤ 147. The Journal of Chemical Physics 87(10):6166–6177.

Price, K.; Storn, R.; and Lampinen, J. 2006. Differential Evolution – A Practical Approach to Global Optimization. Natural Computing. Springer-Verlag.

Sloane, N.; Hardin, R.; Duff, T.; and Conway, J. 1995. Minimal-energy clusters of hard spheres. Discrete & Computational Geometry 14:237–259.

Sotiropoulos, G. D.; Stavropoulos, C. E.; and Vrahatis, N. M. 1997. A new hybrid genetic algorithm for global optimization. In Proceedings of the Second World Congress of Nonlinear Analysts, 4529–4538. Elsevier Science Publishers Ltd.

Storn, R., and Price, K. 1997. Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization 341–359.

Van Hentenryck, P. 1997. Numerica: a modeling language for global optimization. In Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence – Volume 2, IJCAI'97, 1642–1647.

Vavasis, S. A. 1994. Open problems. Journal of Global Optimization 4:343–344.

Wales, D. J., and Doye, J. P. K. 1997. Global optimization by basin-hopping and the lowest energy structures of Lennard-Jones clusters containing up to 110 atoms. The Journal of Physical Chemistry A 101(28):5111–5116.

Zhang, X., and Liu, S. 2007. A new interval-genetic algorithm. International Conference on Natural Computation 4:193–197.