
NUMERICAL ALGEBRA, CONTROL AND OPTIMIZATION, Volume 2, Number 1, March 2012, pp. 69–90. doi:10.3934/naco.2012.2.69

UNIVARIATE GEOMETRIC LIPSCHITZ GLOBAL OPTIMIZATION

Dmitri E. Kvasov and Yaroslav D. Sergeyev1

DEIS, University of Calabria, Via P. Bucci, Cubo 42C, 87036 Rende (CS), Italy, and Software Department, N.I. Lobachevsky State University, Gagarin Av. 23, 603950 Nizhni Novgorod, Russia

(Communicated by David Gao)

Abstract. In this survey, univariate global optimization problems are considered where the objective function or its first derivative can be multiextremal black-box costly functions satisfying the Lipschitz condition over an interval. Such problems are frequently encountered in practice. A number of geometric methods based on constructing auxiliary functions with the usage of different estimates of the Lipschitz constants are described in the paper.

1. Introduction. Decision-making problems stated as problems of optimization of an objective function subject to a set of constraints arise in various fields of human activity such as engineering design, economic models, biology studies, etc. Optimization problems characterized by functions with several local optima (typically, their number is unknown and can be very high) are of great importance for practical applications. These problems are usually referred to as multiextremal, or global optimization, ones. Both the objective function and constraints can be black-box, hard to evaluate functions with unknown analytical representations. Functions of this type are frequently met in real-life applications, but the problems related to them often cannot be solved by traditional optimization techniques (see, e.g., [13, 20, 22, 26, 27, 35, 50, 68, 81, 91] and the references given therein), which usually make strong suppositions (convexity, differentiability, etc.) that cannot be used with multiextremal problems. This explains the growing interest of researchers in developing numerical global optimization methods able to tackle this difficult class of problems (see, e.g., [4, 21, 37, 42, 48, 51, 73, 82, 85, 86, 96, 111, 113, 123]). A priori assumptions on the objective function serve as mathematical tools for obtaining estimates of the global solution related to a finite number of function evaluations (trials) and, therefore, play a key role in the construction of any efficient global search algorithm. Competitive global optimization methods are, as a rule,

2000 Mathematics Subject Classification. Primary: 65K05, 90C26; Secondary: 90C56.
Key words and phrases. Global optimization, black-box function, Lipschitz condition, geometric approach.
This work was supported by the grants 1960.2012.9 and MK-3473.2010.1 awarded by the President of the Russian Federation for supporting the leading research groups and young researchers, respectively, as well as by the grant 11-01-00682-a awarded by the Russian Foundation for Fundamental Research.
1 Corresponding author

sequential (see, e.g., [116]), i.e., the choice of new trials depends on the information obtained by such an algorithm at previous iterations. Since each function trial is a resource-consuming operation, it is desirable to obtain a required approximation of the problem solution in relatively few iterations.

One of the natural suppositions on the global optimization problem, valid from both the theoretical and the applied points of view, is that the objective function and eventual constraints have bounded slopes. In other words, any limited change in the object parameters gives rise to some limited changes in the characteristics of the objective behavior. This can be justified by the fact that in technical systems the energy of change is always limited (see the related discussion in [113, 115]). One of the most popular and simple mathematical formulations of this property is the Lipschitz continuity condition, which assumes that the difference (in the sense of a chosen norm) of any two function values is majorized by the difference of the corresponding function arguments, multiplied by a positive factor L. In this case, the function is said to be Lipschitz and the corresponding factor L is said to be the Lipschitz constant. The problem with either a Lipschitz objective function or an objective function having a multiextremal Lipschitz first derivative is said to be a Lipschitz global optimization (LGO) problem.

The Lipschitz continuity assumption, being quite realistic for many practical problems (see, e.g., [52, 53, 83, 85, 86, 96, 113, 123]), is also a suitable tool for developing, studying and applying the so-called geometric LGO methods, i.e., sequential methods that use auxiliary functions to estimate the objective function behavior over the search region. Together with other techniques for solving LGO problems (see, e.g., [12, 38, 41, 52, 53, 57, 73, 80, 113, 118, 123, 125]), the geometric idea has proved to be very fruitful, and many algorithms based on constructing and improving auxiliary functions built by using Lipschitz constant estimates have been proposed. Many of these methods can be studied within a general framework (such as the branch-and-bound scheme [53, 54, 85] or the divide-the-best approach [96, 98, 105]), making them even more attractive for both theoretical and applied research. Different geometric LGO algorithms will be considered in this paper.

In order to give an insight into the class of geometric LGO methods, in what follows we shall restrict our attention to one-dimensional problems. In global optimization, these problems play a very important role both in theory and practice and, therefore, have been intensively studied in the last decades (see, e.g., [21, 33, 38, 52, 85, 96, 113, 115, 116, 123]). In fact, on the one hand, theoretical analysis of one-dimensional problems is quite useful since mathematical approaches developed to solve them can very often be generalized to the multidimensional case by numerous schemes (see, e.g., [24, 52, 53, 56, 61, 64, 72, 82, 84, 85, 96, 106, 113, 123]). On the other hand, there exists a large number of real-life applications where it is necessary to solve these problems (see, e.g., [82, 85, 86, 90, 96, 113, 123]). Electrical engineering and electronics are among the fields where the usage of efficient one-dimensional global optimization methods is often required (see, e.g., [18, 33, 90, 93, 113]).
Let us consider, for example, the following common problem in electronic measurements and electrical engineering. There exists a device whose behavior depends on a characteristic f(x), x ∈ [a,b], where the function f(x) may be, for instance, an electrical signal obtained by a complex computer-aided simulation over a time interval [a,b] (see the function graph drawn by a thick line in Fig. 1). The function f(x) is often multiextremal and Lipschitz (it can also be differentiable with the Lipschitz first derivative). The device works correctly while f(x) > 0. Of course, at the initial moment x = a we have f(a) > 0.

Figure 1. The problem of finding the minimal root of equation f(x) = 0 with multiextremal non-differentiable left part arising in electrical engineering

It is necessary to describe the performance of the device over the time interval [a,b], either determining the point x∗ such that

f(x∗) = 0, f(x) > 0, x ∈ [a, x∗), x∗ ∈ (a,b], (1)

or demonstrating that x∗ satisfying (1) does not exist in [a,b] (in this case the device works correctly for the whole time period; moreover, information about the global minimum of f(x) could be useful in practice to measure the device reliability). This problem is equivalent to the problem of finding the minimal root (the first root from the left) of the equation f(x) = 0, x ∈ [a,b], in the presence of certain initial conditions, and can be reformulated as a global optimization problem.

There is a simple approach to solving this problem based on a grid technique. It produces a dense mesh starting from the left margin of the interval and proceeding by a small step until the signal becomes less than zero. For an acquired signal, the determination of the first zero crossing point by this technique is rather slow, especially if the search accuracy is high. Since the objective function f(x) is multiextremal (see Fig. 1), the problem is even more difficult because many roots can exist in [a,b] and, therefore, classical root finding techniques can be inappropriate.

The rest of the paper is organized as follows. In Section 2, the Lipschitz global optimization problem is formally stated (for both non-differentiable and differentiable objective functions) and an overview of geometric ideas for its solution is given. A number of geometric LGO methods are described in Section 3 (in the case of Lipschitz non-differentiable functions) and in Section 4 (in the case of differentiable functions with the Lipschitz first derivatives).

It should be noted that, to expose the principal ideas of some known geometric approaches to solving the stated problem, box-constrained LGO problems will be considered in the paper. This special case lies at the basis of the global optimization methods managing general multiextremal constraints. For example, such a promising global optimization approach as the index scheme (see, e.g., [9, 97, 107, 112, 113]) reduces the general constrained problem to a (discontinuous) box-constrained one having a special nice structure.
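As a point of reference for the geometric methods considered in the following Sections, the grid technique mentioned above admits a direct transcription (a minimal Python sketch; the signal f and the step h are user-supplied placeholders):

```python
def first_zero_by_grid(f, a, b, h):
    """Scan [a, b] left to right with a fixed small step h and return
    the first mesh point where the signal f becomes non-positive.

    The cost grows as (b - a) / h trials, which is exactly what makes
    this approach slow for costly signals and high accuracies."""
    x = a
    while x <= b:
        if f(x) <= 0.0:
            return x          # approximate first zero crossing
        x += h
    return None               # no crossing found: device works on [a, b]
```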

2. Lipschitz global optimization problem. Formally, a box-constrained one-dimensional Lipschitz global optimization problem can be stated as follows (for the sake of certainty, we shall consider the minimization problem). Given a small positive constant ε, it is required to find an ε-approximation of the global minimum point (global minimizer) x∗ of a multiextremal, black-box (and, often, hard to evaluate) objective function f(x) over a closed interval [a,b]:

f∗ = f(x∗) = min f(x), x ∈ [a,b]. (2)

It can be supposed either that the objective function f(x) is not necessarily differentiable and satisfies the Lipschitz condition with the (unknown) Lipschitz constant L, 0 < L < ∞,

|f(x′) − f(x′′)| ≤ L |x′ − x′′|, x′, x′′ ∈ [a,b], (3)

or that f(x) is differentiable and its first derivative f′(x) satisfies the Lipschitz condition with the (unknown) Lipschitz constant K, 0 < K < ∞,

|f′(x′) − f′(x′′)| ≤ K |x′ − x′′|, x′, x′′ ∈ [a,b]. (4)

The Lipschitz condition (3) has a simple geometric interpretation. If two trial points xi−1 and xi with the values zi−1 = f(xi−1) and zi = f(xi) are given, then f(x) is bounded from below over the sub-interval [xi−1, xi] by the piecewise linear minorant function

φi(x) = max{zi−1 − L(x − xi−1), zi + L(x − xi)}, x ∈ [xi−1, xi]. (5)

The minimal value of φi(x) over [xi−1, xi] is therefore the lower bound of f(x) over this interval (sometimes called its characteristic). It is calculated as

Ri = R[xi−1,xi] = φi(x̂i) = (zi−1 + zi)/2 − L (xi − xi−1)/2 (6)

and is obtained at the point (see Fig. 2)

x̂i = (xi−1 + xi)/2 − (zi − zi−1)/(2L). (7)

The described geometric interpretation lies at the basis of geometric LGO methods. They iteratively construct and update auxiliary minorant functions by evaluating f(x) at minimum points of these bounding functions (as x̂i in Fig. 2). The Lipschitz condition is used to obtain the lower (and, eventually, upper) bound of the global minimum value at each iteration of a geometric LGO algorithm, thus allowing one to construct global optimization algorithms and to prove their convergence in a unified manner (see, e.g., [2, 7, 28, 33, 53, 58, 59, 85, 88, 96, 105, 113, 123]).
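For illustration, the characteristic (6) and the trial point (7) can be computed directly from the data of a single sub-interval; the following minimal Python sketch (with illustrative names) assumes that L is a valid overestimate of the Lipschitz constant:

```python
def characteristic(x_prev, x_next, z_prev, z_next, L):
    """Lower bound (6) of f over [x_prev, x_next] and the point (7)
    at which the piecewise linear minorant (5) attains it."""
    R = 0.5 * (z_prev + z_next) - 0.5 * L * (x_next - x_prev)
    x_hat = 0.5 * (x_prev + x_next) - (z_next - z_prev) / (2.0 * L)
    return R, x_hat
```

If L really majorizes the slopes of f, then x̂ falls inside the sub-interval and R never exceeds the true minimum of f over it.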

Figure 2. Geometric interpretation of the Lipschitz condition

Similar geometric ideas can also be used for other classes of optimization problems. For example, various aspects of the so-called αBB-algorithm for solving constrained nonconvex optimization problems with twice-differentiable functions have been examined, e.g., in [1, 3, 19, 69] (see also [36]). The αBB-algorithm embeds a convex lower bounding procedure (based on a parameter α that can be interpreted in a way similar to the Lipschitz constant) within the branch-and-bound framework, whereby nonconvex terms are underestimated through quadratic functions derived from second-order information.

One of the main issues to be considered in solving the stated LGO problem is estimating the Lipschitz constant L from (3). There are several approaches to specify the Lipschitz constant. For example, it can be given a priori (see, e.g., [11, 29, 32, 47, 52, 53, 70, 72, 88, 109, 122]). This case is very important in the theory but is not easily applied in practice. More practical approaches are based on an adaptive estimation of L in the course of the search. In such a way, algorithms can use either an adaptive global estimate of the Lipschitz constant (see, e.g., [53, 61, 78, 85, 113, 114, 115, 121]) valid for the whole interval [a,b], or adaptive local estimates Li valid only for some sub-intervals [ai,bi] ⊂ [a,b] (see, e.g., [60, 76, 96, 101, 102, 113]). Estimating local Lipschitz constants during the work of a global optimization algorithm allows one to significantly accelerate the global search (the importance of the estimation of local Lipschitz constants has been highlighted by many authors, see, e.g., [7, 71, 85, 96, 113]). Naturally, balancing between local and global information must be performed in an appropriate way (see, e.g., [96, 101, 113]) since an excessive unjustified usage of local information can lead to the loss of the global solution (see, e.g., [110]). Finally, multiple estimates of L can be chosen during the search from a certain set of admissible values from zero to infinity (see, e.g., [34, 40, 49, 56, 95, 96]).

It should be stressed that either the Lipschitz constant is known and an algorithm is constructed correspondingly, or it is not known but there exists a sufficiently large set of parameter values of the considered algorithm ensuring its convergence. With this in mind, the significant influence that the Lipschitz constant estimates have on the convergence speed of numerical algorithms should be taken into account. An underestimate of the Lipschitz constant L can lead to the loss of the global solution. In contrast, accepting a very high value of L for a concrete objective function means (due to the Lipschitz condition (3)) assuming that the function has a complicated structure with sharp peaks and narrow attraction regions of minimizers within the whole search interval.

Table 1. Some geometric LGO methods characterized by different ways of estimating the Lipschitz constant

Lipschitz constant estimate | Non-differentiable | Differentiable (non-smooth minorants) | Differentiable (smooth minorants)
A priori  | [32, 47, 88, 109]   | [7, 16, 43]     | [67, 99, 104]
Multiple  | [40, 49, 56, 95]    | [62]            | —
Global    | [53, 85, 113, 115]  | [43, 100, 104]  | [99, 104, 113]
Local     | [96, 101, 102, 113] | [100, 103, 113] | [96, 99, 104]

Thus, an overestimate of L (if it does not correspond to the real behavior of the objective function) leads to a slow convergence of the algorithm to the global minimizer.

In the next two Sections, different geometric methods for solving both the problems (2), (3) and (2), (4) will be considered, classified according to the way of obtaining the Lipschitz information (a priori given, multiple, global, and local estimates of the Lipschitz constant). References to some algorithms from each group are reported in Table 1 where, in the case of the differentiable objective function (2), (4), the geometric methods are further differentiated by the type of minorant function used, which can be either non-smooth or smooth. The choice of these particular algorithms is explained mainly by the following two aspects. First, they are sufficiently representative for demonstrating various approaches used in the literature to estimate the Lipschitz constant. Second, they manifest a good performance on various sets of test and practical functions from the literature and, therefore, are often chosen as worthy candidates for multidimensional extensions.

3. Geometric LGO methods for non-differentiable functions. One of the first and most well studied methods for solving the one-dimensional LGO problem (2), (3) is the Piyavskij–Shubert method (see [87, 88, 109]), discussed in a vast literature (see, e.g., the survey [47] and the references given therein; various modifications of this algorithm have also been proposed, see, e.g., [11, 33, 47, 53, 92, 108, 117, 119]). We start our description with this algorithm due to its methodological importance in the analysis of geometric LGO approaches.

The Piyavskij–Shubert method is a sequential algorithm that uses a priori given information about the Lipschitz constant (namely, a given value of the Lipschitz constant or its overestimate) to adaptively construct minorant auxiliary functions for the objective function f(x) as described in the previous Section (see Fig. 2). At the first iteration k = 1 of the algorithm, initial trials are performed at several points x0, x1, ..., xn(1) (usually, the left and right margins of the search interval [a,b] are chosen for this scope, with n(1) = 1). At each successive iteration k > 1, given the trial points xi (in the order of growth of their coordinates) and the corresponding function values zi = f(xi), 0 ≤ i ≤ k, first, the current lower bounding function Φk(x) is constructed as the union of the minorants φi(x) from (5) over all

Figure 3. Sub-intervals which can contain the global minimizer of a Lipschitz (with the Lipschitz constant L) function f(x) during the work of the Piyavskij–Shubert algorithm are drawn by hatching

sub-intervals [xi−1, xi], 1 ≤ i ≤ k; then, a sub-interval with the minimal characteristic Rt is determined (see Fig. 3) taking into account (6); and, finally, a new trial point x̂t is calculated on this sub-interval by formula (7). In such a way, new information about the objective function is acquired by evaluating f(x̂t) and the piecewise linear minorant function is updated, whereas the value

x∗k = arg min{ zi : 0 ≤ i ≤ k }

is chosen as a new approximation of the global minimizer from (2), with

f∗k = min{ zi : 0 ≤ i ≤ k }

being the current approximation of the global minimum value f∗. This iterative process is repeated until some stopping criterion is satisfied (e.g., until the sub-interval containing a new trial point becomes small enough).

In Fig. 3, an example of the lower bounding function Φk(x) for f(x) after six function trials (or k = 5 iterations) is shown by a continuous thin line. The black dots on the objective function graph (thick line) indicate function values at the ordered trial points x0, x1, ..., x5. The next, i.e., the seventh, trial will be performed at the point x̂t = x̂1. After executing this new trial, the lower bounding function will be reconstructed (precisely, Φ6(x) will differ from Φ5(x) over the sub-interval [x0, x1] in Fig. 3), thus improving the current approximation of the problem solution. Note that some sub-intervals (namely, those whose characteristic value is greater than the current minimum value f∗k; see, for instance, the sub-interval [x1, x2] in Fig. 3) can be eliminated in order to reduce the required computational resources. More precisely, the global minimizer x∗ can be found only within the set X∗(k) defined as

X∗(k) = { x ∈ [a,b] : Φk(x) ≤ f∗k }.

In Fig. 3, the sub-intervals forming the set X∗(k) are indicated by hatching drawn over the x axis.

As observed, e.g., in [25, 55, 116], this algorithm is optimal in one step, i.e., the choice of the current evaluation point ensures the maximal improvement of the lower bound of the global minimum value of f(x) with respect to a number of reasonable criteria (see, e.g., [53, 113, 116, 123] for related discussions on optimality principles in Lipschitz global optimization).
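The whole method fits in a few lines. Below is a minimal, hedged Python sketch of the Piyavskij–Shubert iteration (function and parameter names are illustrative; L is assumed to be a valid overestimate of the Lipschitz constant, which keeps every new point x̂ inside its sub-interval):

```python
import math

def piyavskij_shubert(f, a, b, L, eps=1e-4, max_trials=1000):
    """Sketch of the Piyavskij-Shubert method with an a priori
    overestimate L of the Lipschitz constant from (3)."""
    xs = [a, b]          # ordered trial points
    zs = [f(a), f(b)]    # corresponding function values z_i = f(x_i)
    while len(xs) < max_trials:
        # select the sub-interval with the minimal characteristic (6)
        t, R_t = 1, math.inf
        for i in range(1, len(xs)):
            R = 0.5 * (zs[i - 1] + zs[i]) - 0.5 * L * (xs[i] - xs[i - 1])
            if R < R_t:
                t, R_t = i, R
        if xs[t] - xs[t - 1] < eps:   # stopping rule: interval small enough
            break
        # new trial at the minimum point (7) of the selected minorant
        x_new = 0.5 * (xs[t - 1] + xs[t]) - (zs[t] - zs[t - 1]) / (2.0 * L)
        xs.insert(t, x_new)
        zs.insert(t, f(x_new))
    k = min(range(len(zs)), key=zs.__getitem__)
    return xs[k], zs[k]               # current approximation of (x*, f*)
```

For example, piyavskij_shubert(lambda x: math.sin(x) + math.sin(10*x/3), 2.7, 7.5, L=6.0) locates the global minimizer x∗ ≈ 5.15 of a classical multiextremal test function.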

Figure 4. Lower bounds Ri of f(x) over sub-intervals [ai,bi] corresponding to a particular estimate L̂ of the Lipschitz constant

Like many other geometric LGO methods, the Piyavskij–Shubert algorithm can be viewed as a branch-and-bound algorithm (see, e.g., [53, 54]) or, more generally, as a divide-the-best algorithm [105]. Within this theoretical framework, the convergence of the sequence of trial points generated by the method to the global minimizer (global convergence) can be easily established (see, e.g., [88, 105]). Similar branch-and-bound ideas are used in the method of nonuniform coverings proposed in [32, 33] (see also [31]) for functions with a priori given Lipschitz constants (see also [29, 30] for high-performance realizations of this approach).

An interesting variation of the Piyavskij–Shubert algorithm has been proposed in [56], where the DIRECT method has been introduced. It iteratively selects several sub-intervals of [a,b] for partitioning and subdivides each of them into thirds, with subsequent evaluations of the objective function f(x) at the central points of the new sub-intervals. The selection procedure is based on estimates of the lower bounds of f(x) over sub-intervals obtained by using a set of possible values for the Lipschitz constant, from zero to infinity. In terms of the geometric approach, it is possible to say that all admissible minorant functions (in this case, they are piecewise linear discontinuous functions) are examined during the current iteration of the algorithm without constructing a specific one.

In Fig. 4, an example of a partition of [a,b] into 5 sub-intervals performed by the DIRECT method is represented. The objective function f(x) has been evaluated at the central points c1, c2, c3, c4, and c5 of the corresponding sub-intervals (trial points are indicated by black dots in Fig. 4). Note that at the first iteration the interval [a,b] has been partitioned into three equal sub-intervals and the first three trials have been executed at the points c3, c1, and c5. Then, the central sub-interval among the three has been divided into thirds and two new trials have been performed at the central points c2 and c4 of the left and the right of these thirds, respectively. Given an overestimate L̂ of the Lipschitz constant from (3), the objective function is bounded from below over [a,b] by a piecewise linear discontinuous minorant function Φ5(L̂, x) (see Fig. 4; on the vertical axis the sub-interval characteristics are indicated).
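The dependence of such center-sampled lower bounds on the chosen estimate can be seen with made-up numbers (a toy Python illustration, not taken from the paper): for a sub-interval with center value f(ci) and half-length (bi − ai)/2, the bound is f(ci) − L̂(bi − ai)/2, and which sub-interval "wins" changes with L̂:

```python
def lower_bound(fc, half_len, L_hat):
    """Center-sampled lower bound of f over a sub-interval of
    half-length half_len, for a trial Lipschitz estimate L_hat."""
    return fc - L_hat * half_len

# five sub-intervals as (f(c_i), (b_i - a_i)/2) pairs -- invented numbers
intervals = [(2.0, 0.45), (1.2, 0.15), (1.5, 0.15), (1.9, 0.15), (2.4, 0.45)]
for L_hat in (1.0, 10.0):
    bounds = [lower_bound(fc, hl, L_hat) for fc, hl in intervals]
    best = bounds.index(min(bounds))
    print(f"L_hat = {L_hat}: smallest bound in sub-interval {best + 1}")
```

With the small estimate the sub-interval with the lowest sampled value wins; with the large one a wide sub-interval takes over, which is precisely why the DIRECT keeps all estimates in play.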

For the known overestimate L̂ of the Lipschitz constant, new trials should be executed (as in the Piyavskij–Shubert algorithm) within the sub-interval [a2,b2] having the minimal characteristic R2, in order to obtain an improvement of the current estimate of the minimal function value. But since in practical applications the exact Lipschitz constant L (or its overestimate) is often unknown, it is not possible to indicate with certainty such a 'promising' sub-interval to be subdivided. For example, the sub-interval [a2,b2] in Fig. 4 has the smallest lower bound of f(x) with respect to the estimate L̂ of the Lipschitz constant. However, if a higher estimate L̃ ≫ L̂ is taken, the slope of the lines in Fig. 4 increases and the sub-interval [a1,b1] becomes preferable to all others, since the lower bound of f(x) over this sub-interval becomes the smallest one with respect to this new estimate L̃.

Jones et al. [56] have proposed to use various estimates of the Lipschitz constant from zero to infinity at each iteration of the DIRECT. This corresponds to an examination of all possible slopes L̂ when auxiliary functions are considered (which in this case are not always minorants for f(x)) and lower bounds are calculated. Such a consideration leads to the basic idea of the DIRECT: to select for partitioning and sampling (i.e., performing the function trials) the so-called potentially optimal sub-intervals, i.e., sub-intervals over which f(x) could have the best improvement with respect to a particular estimate of the Lipschitz constant. Their determination is simplified by representing each sub-interval [ai,bi] of the current partition of [a,b] as a dot in a two-dimensional diagram with horizontal coordinate (bi − ai)/2 and vertical coordinate f(ci): a dot representing some potentially optimal sub-interval is located on the lower right convex hull of all the dots (see [56]).

Thus, the DIRECT method is essentially the Piyavskij–Shubert algorithm modified to use a center-sampling strategy and to subdivide all potentially optimal sub-intervals. Since during the search it uses a set of possible estimates of L and does not use a single overestimate, only the so-called everywhere dense convergence (i.e., convergence of the sequence of trial points to every point of the search interval) can be established for this method; it is also difficult to apply to the DIRECT some meaningful stopping criterion, such as, for example, stopping on achieving a desired accuracy in solution. Nevertheless, due to its relative simplicity and a satisfactory performance on several test functions and applied problems, the DIRECT has been widely adopted in practical applications (see, e.g., [10, 15, 17, 23, 39, 49, 66, 75, 120]) and has attracted the attention of researchers (for its theoretical and experimental analysis and several modifications see, e.g., [22, 23, 34, 40, 49, 56, 65, 95]).

As already observed, an assumption that the objective function satisfies the Lipschitz condition raises the question of estimating the corresponding Lipschitz constant. Strongin's algorithm (see, e.g., [113, 115]) answers this question by an adaptive estimation of the Lipschitz constant during the global search. It has a good convergence rate compared with global optimization methods using only values of the objective function during the search (see, e.g., [44, 113, 115]) and, therefore, has often been chosen as a good candidate for multidimensional extensions (see, e.g., [38, 61, 63, 76, 85, 94, 96, 113]).
Formally, this algorithm belongs to the class of the so-called information-statistical algorithms (or, simply, information algorithms). The information approach originated in the works [79, 114] (see also [113, 115]) and, together with the Piyavskij–Shubert algorithm, it has consolidated the foundations of Lipschitz global optimization. The main idea of this approach is to apply the theory of random functions to building a mathematical representation of available (certain or uncertain) a priori information on the objective function behavior. A systematic approach to the description of some uncertain information on this behavior is to accept that the unknown black-box function to be minimized is a sample of some known random function. Generally, to provide an efficient analytical technique for deriving some estimates of the global optimum with a finite number of trials, i.e., for obtaining by Bayesian reasoning some conditional (with respect to the trials performed) estimations, the random function should have some special structure. It is then possible to deduce the decision rules for performing new trials as some optimal decision functions (see, e.g., [14, 46, 57, 73, 74, 113, 115, 118, 123, 124, 125]).

In the Strongin algorithm, a global estimate M of the Lipschitz constant L is adaptively calculated by using the obtained results zi = f(xi) of the function trials at the points xi, 0 ≤ i ≤ k,

M = M(k) = r max{ |zi − zi−1| / (xi − xi−1) : 1 ≤ i ≤ k },

where r > 1 is the algorithm parameter (a value of this parameter can always be found that guarantees the convergence of the sequence of trial points generated by the method to the global minimizer of any fixed function f(x) satisfying the Lipschitz condition (3); see [113, 115]). The Strongin algorithm can be considered in the framework of divide-the-best algorithms (see [105]). As demonstrated, e.g., in [76, 113], there is a firm relation between the information and geometric approaches. In fact, the characteristics Ri of the Strongin information algorithm associated with each sub-interval [xi−1, xi], 1 ≤ i ≤ k, of the search interval [a,b] (see Fig. 2) can be rewritten (see [76, 113]) in a form similar to that of the Piyavskij–Shubert algorithm; this allows one to interpret the Strongin method as a geometric algorithm adaptively constructing auxiliary piecewise linear functions during its work.

The usage of only global information about the behavior of the objective function, like the just mentioned global estimate of the Lipschitz constant, can lead to a slow convergence of algorithms to global minimizers. One of the traditional ways of overcoming this difficulty (see, e.g., [5, 52, 53] and the references given therein) recommends stopping the global procedure and switching to a local minimization method in order to improve the solution and to accelerate the search during its final phase. Applying this technique can result in some problems related to the combination of the global and local phases, the main problem being that of determining when to stop the global procedure and to start the local one. A premature arrest can provoke the loss of the global solution whereas a late one can slow down the search.

For example, it is well known that the DIRECT method balances global and local information during its work. However, the local phase is too pronounced in this balancing. The DIRECT executes too many function trials in regions of local optima and, therefore, manifests a slow convergence to the global minimizers when the objective function has many local minimizers. In [95], a new geometric algorithm inspired by the DIRECT ideas has been proposed to solve difficult multiextremal LGO problems. To accomplish this task, a two-phase approach consisting of explicitly defined global and local phases has been incorporated in this method, thus providing a faster convergence to the global minimizers. Another type of local improvement strategy has been introduced in [63, 64]. This technique forces the global optimization method to make a local improvement of the best approximation of the global minimum immediately after a new approximation better than the current one is found.

In [101, 102] (see also [96, 113]), another fruitful approach (the so-called local tuning approach), which allows global optimization algorithms to tune their behavior to the shape of the objective function at different sub-intervals, has been proposed for solving the LGO problem (2), (3). The main idea behind this approach lies in the adaptive balancing of local and global information obtained during the search for every sub-interval [xi−1, xi], 1 ≤ i ≤ k, formed by the trial points xi. When a sub-interval [xi−1, xi] is narrow, the local information obtained within the near vicinity of the trial points xi−1 and xi mainly influences the method behavior.
In this case, the results of trials executed at points lying far from the interval [xi−1, xi] are insignificant for the method. When working with a wide sub-interval, the method takes into consideration the global search information obtained from the whole search interval. Both the comparison and the balancing of global and local data are effected by estimating local Lipschitz constants for each sub-interval [xi−1, xi], 1 ≤ i ≤ k, as

µi = r max{ λi, γi, ξ }, 1 ≤ i ≤ k, (8)

where r > 1 is the method parameter ensuring its global convergence and ξ > 0 is a small technical constant that guarantees the correct algorithm execution when f(xi) is equal to a constant for all trial points xi. Here the values λi and γi, 1 ≤ i ≤ k, are calculated as follows:

λi = max{ |zj − zj−1| / (xj − xj−1) : j = i − 1, i, i + 1 }

(when i = 1 or i = k we consider only j = i, i + 1 or j = i − 1, i, respectively),

γi = λmax (xi − xi−1)/Xmax,

λmax = max{ |zi − zi−1| / (xi − xi−1) : 1 ≤ i ≤ k },

Xmax = max{ xi − xi−1 : 1 ≤ i ≤ k }.

The value λmax is an estimate of the global Lipschitz constant L over the interval [a,b]. The estimate µi of the local Lipschitz constant Li over an interval [xi−1, xi] contains the following two fundamental parts: λi, which accounts for local properties, and γi, which accounts for global ones. When the interval [xi−1, xi] is large, the global part increases, because in this case the local information may not be reliable. In the opposite case, the global part decreases, because the local information is of major importance and the global one loses its influence. Thus, at every sub-interval a balancing of local and global information is performed automatically.

Similarly to the Piyavskij–Shubert method, this algorithm constructs auxiliary functions approximating the objective function. Although these functions are not always minorants for f(x) over the whole search interval, they are iteratively improved during the search in order to obtain appropriate bounds for the global minimum from (2). In Fig. 5, an example of the auxiliary function Φ̂k(x) for a Lipschitz function f(x) over [a,b], constructed by using estimates of local Lipschitz constants over sub-intervals of [a,b], is shown by a solid thin line; a lower bounding function Φk(x) for f(x) over [a,b], constructed by using an overestimate of the global Lipschitz constant, is represented by a dashed line.
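The estimates (8) can be computed in a single pass over the ordered trials; the following Python sketch (with illustrative parameter values) mirrors the formulas above:

```python
def local_tuning_estimates(xs, zs, r=1.5, xi=1e-8):
    """Local Lipschitz constant estimates (8) for the sub-intervals
    [xs[i-1], xs[i]] of sorted trial points xs with values zs."""
    k = len(xs) - 1
    # slope of each sub-interval i = 1..k (stored at index i - 1)
    slopes = [abs(zs[i] - zs[i - 1]) / (xs[i] - xs[i - 1])
              for i in range(1, k + 1)]
    lam_max = max(slopes)                                    # global estimate
    x_max = max(xs[i] - xs[i - 1] for i in range(1, k + 1))  # widest interval
    mu = []
    for i in range(1, k + 1):
        lam_i = max(slopes[max(i - 2, 0):min(i + 1, k)])     # j = i-1, i, i+1
        gamma_i = lam_max * (xs[i] - xs[i - 1]) / x_max      # global part
        mu.append(r * max(lam_i, gamma_i, xi))
    return mu
```

On a narrow sub-interval γi is small and the local slopes λi dominate; on a wide one the global term takes over, reproducing the automatic balancing described above.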

Figure 5. An auxiliary function Φˆ k(x) (solid thin line) and a lower bounding function Φk(x) (dashed line) for a Lipschitz function f(x) over [a,b], constructed by using local Lipschitz estimates and by using the global Lipschitz constant, respectively

Note that Φ̂k(x) estimates the behavior of f(x) over [a,b] more accurately than Φk(x), especially over sub-intervals where the corresponding local Lipschitz constants are smaller than the global one.

The local tuning approach enjoys the following properties (see [101, 102, 113]):
(1) the problem of determining when to stop the global procedure does not arise because the local information is taken into consideration throughout the whole duration of the global search;
(2) the local information is taken into account not only in the neighborhood of a global minimizer but also over the whole search interval, thus allowing the acceleration of the global search;
(3) in order to guarantee convergence to the global minimizer x∗ from (2) it is not necessary to know the exact Lipschitz constant over the whole search interval; on the contrary, only an overestimate of a local Lipschitz constant in a neighborhood of x∗ is needed;
(4) geometric local tuning algorithms can be successfully parallelized (see, e.g., [94, 113]) and easily extended to the multidimensional case (see, e.g., [60, 63, 96, 101, 113]).
These advantages allow one to adopt the local tuning approach for efficiently solving different univariate and multidimensional LGO problems (see, e.g., [60, 63, 76, 96, 101, 102, 104, 106, 113]).

To conclude our presentation of the Lipschitz geometric ideas for solving the LGO problem (2), (3), let us return to the problem of finding the minimal root of an equation with multiextremal non-differentiable left part (see (1) and Fig. 1 in the Introduction). A possible fast and efficient algorithm for solving this important practical problem can be developed as follows (see, e.g., [96, 113]). Let us suppose that the objective function f(x) has already been evaluated at some trial points xi, 0 ≤ i ≤ n, with zi = f(xi) (see Fig. 6). For every interval [xi−1, xi], 1 ≤ i ≤ n, a piecewise linear function φi(x) is constructed (it is drawn by a dashed line in Fig. 6) by using the Lipschitz information in such a way that φi(x) ≤ f(x), x ∈ [xi−1, xi]. By knowing the structure of the auxiliary functions φi(x), 1 ≤ i ≤ n, it is possible to determine the minimal index k ≥ 1 such that the equation φk(x) = 0 has a solution (point x̃n in Fig. 6) over [xk−1, xk]. Adaptively improving the set of functions φi(x), 1 ≤ i ≤ n, by adding new trial points x̃n, n > 1, we improve both our lower approximation of f(x) and the current solution to the problem. In this manner, such a geometric method either finds the minimal root of the equation f(x) = 0 or determines the global minimizer of f(x) (in the case when the equation under consideration has no roots on the given interval).

Figure 6. Finding the minimal root of equation f(x) = 0 with multiextremal non-differentiable left part by a geometric method

The performance of such a method is significantly faster in comparison with the methods traditionally used by engineers for solving this problem. The usage of the local tuning technique on the behavior of f(x) allows one to obtain a further acceleration of the search (see, e.g., [77, 96, 113]).
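A hedged sketch of one step of this minimal-root strategy, under the assumption that a valid overestimate L of the Lipschitz constant is available (helper names are illustrative): the leftmost sub-interval whose minorant (5) can reach zero is located, and the zero of the minorant's left branch is proposed as the next trial point.

```python
def next_root_candidate(xs, zs, L):
    """Leftmost point where the piecewise linear minorants of f
    could vanish, given sorted trials xs with values zs."""
    for i in range(1, len(xs)):
        if zs[i] <= 0.0:               # a non-positive value was sampled
            return xs[i]
        # characteristic (6): the minimal value of phi_i over the interval
        R = 0.5 * (zs[i - 1] + zs[i]) - 0.5 * L * (xs[i] - xs[i - 1])
        if R <= 0.0:                   # phi_i(x) = 0 is solvable here
            return xs[i - 1] + zs[i - 1] / L   # zero of the left branch
    return None   # all minorants are positive: f > 0 is proved on [a, b]
```

Evaluating f at the returned point, inserting the new trial, and repeating either drives the candidate to the minimal root or, when None is returned, certifies through the minorants that no root exists on the interval.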

4. Geometric LGO methods for differentiable functions with the Lipschitz first derivatives. The restriction of the class of the objective functions (i.e., the examination of the LGO problem (2), (4) rather than the problem (2), (3)) opens new opportunities for developing efficient geometric LGO methods. In fact, at each point x ∈ [a,b] it is now possible to evaluate both the objective function f(x) and its first derivative f′(x), thus obtaining more information about the problem (especially regarding its local properties expressed by the derivative values). The usage of this information allows one to construct auxiliary functions that fit the objective function more closely and to accelerate the global search.

The geometric approach to solving the LGO problem (2), (4) received a strong impulse after the papers [16, 43] appeared. In these works, non-smooth piecewise quadratic minorants have been used to approximate the behavior of the objective function f(x) from (2) by using the Lipschitz condition (4).

In Fig. 7, an example of such a lower bounding function ψi(x) over a sub-interval [xi−1, xi] is given, where both the objective function f(x) from (2) and its first derivative f′(x) satisfying the Lipschitz condition (4) have been evaluated at two points xi−1 and xi of the search interval [a,b], with the corresponding values zi−1 = f(xi−1), z′i−1 = f′(xi−1) and zi = f(xi), z′i = f′(xi). If an overestimate m ≥ K is known, the lower bounding function ψi(x) for f(x) over [xi−1, xi] can be constructed (due to the Lipschitz condition (4) and the Taylor formula for f(x)) as

ψi(x) = max{ zi−1 + z′i−1(x − xi−1) − 0.5m(x − xi−1)², zi − z′i(xi − x) − 0.5m(xi − x)² }. (9)
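Evaluating the minorant (9) at a point is a direct transcription of the formula (a minimal Python sketch; dz_prev and dz_next denote the derivative values z′i−1 and z′i, and m is assumed to overestimate K from (4)):

```python
def psi(x, x_prev, x_next, z_prev, z_next, dz_prev, dz_next, m):
    """Non-smooth piecewise quadratic minorant (9) over [x_prev, x_next]."""
    left = z_prev + dz_prev * (x - x_prev) - 0.5 * m * (x - x_prev) ** 2
    right = z_next - dz_next * (x_next - x) - 0.5 * m * (x_next - x) ** 2
    return max(left, right)   # f(x) >= psi(x) whenever m >= K
```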

The lower bound value Ri of f(x) over [xi−1, xi] (the sub-interval characteristic; see Fig. 7), which can be explicitly calculated in a way similar to (6), is in this case closer to the minimal value of f(x) over [xi−1, xi] than the bound given by the piecewise linear minorants (5). This fact, as a rule, accelerates the global search, which is a natural consequence of the availability and usage of more complete information about the function in the LGO problem (2), (4) with respect to the problem (2), (3).

Figure 7. Non-smooth piecewise quadratic auxiliary function ψi(x) which can be constructed over a sub-interval [xi−1, xi] for a function f(x) with the Lipschitz first derivative

As in the case of the LGO problem (2), (3), various approaches can be used to obtain an estimate m of the Lipschitz constant K for constructing auxiliary functions over [a,b] as the union of the functions ψi(x) from (9). For example, in [16] (see also [6, 8]), the constant K from (4) is supposed to be a priori known. In [43] (see also [113]), an adaptive global estimate m of the constant K during the function minimization is proposed. In [100, 103], it was shown that the Lipschitz constant K can be estimated more accurately over the whole interval [a,b] in comparison with [43] and that the local tuning approach can be used in a manner similar to (8), thus providing a significant acceleration of the global search (see also [96, 113]). It is important to emphasize that, in order to ensure the convergence to the global minimizer x∗ from (2) during the execution of the local tuning algorithm, it is not necessary to estimate the global Lipschitz constant K correctly (it may be underestimated); it is sufficient to have an overestimate only of the local Lipschitz constant over a sub-interval containing the point x∗.

It is evident from Fig. 7 that the auxiliary functions based on (9) are not smooth at the points x̂i in spite of the smoothness of the objective function f(x) over [a,b]. In [99, 104], it has been demonstrated how to obtain smooth auxiliary functions, making them closer to f(x) than the previously used ones and, therefore, accelerating the global search (see also [6, 67] where similar constructions are discussed). A general scheme describing the methods using smooth bounding procedures has been presented in [99, 104] with several approaches for the Lipschitz constant estimation (a priori given, global, and local estimates were considered).

The construction of a smooth auxiliary function over [xi−1, xi] is based on the following considerations. The objective function f(x) is above the function ψi(x) for all x ∈ (yi, y′i) (see Fig. 8) because, due to (4), its curvature is bounded by a parabola

πi(x) = 0.5mx² + bix + ci,

Figure 8. Smooth piecewise quadratic auxiliary function θi(x) (dashed line) for the objective function f(x) (thick line) with the Lipschitz first derivative over [xi−1, xi]

where the unknowns yi, y′i, bi, and ci can be determined by solving the following system of equations:

ψi(y′i) = πi(y′i),
ψi(yi) = πi(yi),
ψ′i(y′i) = π′i(y′i),
ψ′i(yi) = π′i(yi).

Here the first equation provides the coincidence of ψi(x) and πi(x) at the point y′i and the third one provides the coincidence of their derivatives ψ′i(x) and π′i(x) at the same point. The second and fourth equations provide the fulfilment of these conditions at the point yi.

Thus, once the values yi, y′i, bi, and ci are determined (see [96, 99, 104, 113] for details), it may be concluded that the following function

θi(x) = ψi(x), x ∈ [xi−1, yi) ∪ [y′i, xi];  θi(x) = πi(x), x ∈ [yi, y′i),

is a smooth piecewise quadratic auxiliary function for f(x) over [xi−1, xi], i.e., the first derivative θ′i(x) exists over [xi−1, xi] and

f(x) ≥ θi(x), x ∈ [xi−1, xi].

As demonstrated by numerical experiments (see, e.g., [96, 104, 113]), the performance of geometric methods for solving the LGO problem (2), (4) with smooth auxiliary functions surpasses that of geometric methods with non-smooth minorants. The usage of the local tuning technique, in its turn, ensures a further speed-up of the methods, especially when a high accuracy of the problem solution is required. Geometric methods based on the construction of the Lipschitz piecewise quadratic auxiliary functions can also be applied to efficiently solve the problem of finding the minimal root of an equation with multiextremal differentiable left part (see, e.g., [93, 96, 113]).
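The four tangency conditions above form a small polynomial system (two linear and two quadratic equations); closed-form expressions are given in the cited works, but a symbolic solve reproduces them directly. The sketch below is an assumption-laden illustration (SymPy, invented function name), taking yi on the left branch of ψi and y′i on the right one, as in Fig. 8:

```python
import sympy as sp

def smoothing_parameters(x_prev, x_next, z_prev, z_next, dz_prev, dz_next, m):
    """Solve the tangency system for y_i, y'_i, b_i, c_i symbolically."""
    x, y1, y2, b, c = sp.symbols('x y1 y2 b c', real=True)
    # the two branches of the non-smooth minorant (9) and the parabola pi_i
    left = z_prev + dz_prev * (x - x_prev) - m * (x - x_prev) ** 2 / 2
    right = z_next - dz_next * (x_next - x) - m * (x_next - x) ** 2 / 2
    parab = m * x ** 2 / 2 + b * x + c
    eqs = [
        sp.Eq(left.subs(x, y1), parab.subs(x, y1)),      # values agree at y_i
        sp.Eq(sp.diff(left, x).subs(x, y1),
              sp.diff(parab, x).subs(x, y1)),            # slopes agree at y_i
        sp.Eq(right.subs(x, y2), parab.subs(x, y2)),     # values agree at y'_i
        sp.Eq(sp.diff(right, x).subs(x, y2),
              sp.diff(parab, x).subs(x, y2)),            # slopes agree at y'_i
    ]
    return sp.solve(eqs, [y1, y2, b, c], dict=True)
```

Among the returned solutions, the admissible one satisfies xi−1 ≤ yi ≤ y′i ≤ xi; with it, θi(x) can be assembled exactly as in the displayed definition.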

Figure 9. Subdivision of a sub-interval [at,bt] in the situation where f(x) and f′(x) are evaluated at the point pt since they have been previously evaluated at the point bt; a discontinuous piecewise quadratic auxiliary function for f(x) is drawn by a dashed line

Up to now, geometric methods for solving the LGO problem (2), (4) that use in their work an a priori given estimate of K from (4), its adaptive global estimate, or adaptive estimates of local Lipschitz constants have been considered. Algorithms working with multiple estimates of the Lipschitz constant for f′(x) chosen from a set of possible values were not known until 2009 (see [62]), in spite of the fact that a geometric method working in this way with Lipschitz objective functions (the DIRECT method described in the previous Section) was proposed in 1993 (see [56]). The main obstacle in implementing such an algorithm for differentiable objective functions was the lack of an efficient procedure for determining the sub-intervals in which to perform new trials. A new geometric method resolving this problem in a simple way and evolving the DIRECT ideas to the case of the objective function having the Lipschitz first derivative has been introduced and studied in [62]. In this algorithm, the partition of the search interval [a,b] into sub-intervals [ai,bi] is performed iteratively by subdividing a selected sub-interval [at,bt] (see Fig. 9) into three equal parts of length (bt − at)/3, i.e.,

[at,bt] = [at,pt] ∪ [pt, qt] ∪ [qt,bt],

pt = at + (bt − at)/3, qt = bt − (bt − at)/3.

A new trial is carried out either at the point pt (if both the objective function f(x) and its first derivative f′(x) have been evaluated over the sub-interval [at,bt] at the point bt, see Fig. 9), or at the point qt (if f(x) and f′(x) have been evaluated over the sub-interval [at,bt] at the point at). Thus, an efficient partition strategy is adopted, since each selected sub-interval is subdivided by the new trial point into three new sub-intervals.

A series of non-smooth (discontinuous) piecewise quadratic functions corresponding to different estimates K̂ of the Lipschitz constant K from (4) is then taken into account to find approximations Ri = R[ai,bi] of the lower bounds of f(x) over the sub-intervals [ai,bi] (see Fig. 9); given an estimate K̂, a lower bound Ri of the function values over the sub-interval [ai,bi] can be calculated as

Ri = f(ci) ± f′(ci)(bi − ai) − 0.5K̂(bi − ai)², (10)

where the sign '−' is used in the case of the right-end function evaluation, i.e., ci = bi, and the sign '+' is used in the case of the left-end function evaluation, i.e., ci = ai (see Fig. 9).
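A small Python sketch of (10), together with a brute-force selection over a grid of estimates (illustrative names; a finite invented grid stands in for "from zero to infinity"); the sub-intervals achieving the smallest bound for at least one K̂ are the nondominated ones discussed below:

```python
def lower_bound_10(f_c, df_c, length, right_end, K_hat):
    """Lower bound (10) over one sub-interval; right_end tells whether
    f and f' were evaluated at b_i (sign '-') or at a_i (sign '+')."""
    sign = -1.0 if right_end else 1.0
    return f_c + sign * df_c * length - 0.5 * K_hat * length ** 2

def nondominated(intervals, K_grid):
    """Indices of sub-intervals achieving the smallest bound (10) for
    at least one estimate K_hat of the grid; intervals is a list of
    (f(c_i), f'(c_i), b_i - a_i, right_end) tuples."""
    winners = set()
    for K_hat in K_grid:
        bounds = [lower_bound_10(*iv, K_hat) for iv in intervals]
        winners.add(bounds.index(min(bounds)))
    return sorted(winners)
```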

Figure 10. Different ways of graphical representation of sub-intervals in geometric methods for solving the LGO problem (2), (4) based on multiple estimates of the Lipschitz constant (4) with non-smooth piecewise quadratic auxiliary functions

In the DIRECT method for solving the problem (2), (3), the potentially optimal sub-intervals are determined as candidates for partitioning at each iteration by varying the estimates L̂ of the Lipschitz constant L from zero to infinity. The determination of these sub-intervals is a relatively simple technical task (it is sufficient to represent each sub-interval [ai,bi] as a dot in a two-dimensional diagram with horizontal coordinate (bi − ai)/2 and vertical coordinate f(ci) and to locate the dots on the lower right convex hull of all the dots). When solving the problem (2), (4), the same operation gives trouble due to the non-linear part in (10). The difficulty in establishing the relation of domination (in terms of the lower bounds Ri) between sub-intervals of a current partition generated by the method is illustrated in Fig. 10a. Here, the sub-intervals [a1,b1] and [a3,b3] are the so-called nondominated sub-intervals (i.e., sub-intervals having the smallest lower bound for some particular estimate of the Lipschitz constant for f′(x); see [62] for details); but they are not represented as dots on a lower right convex hull of all the dots in an intuitive diagram similar to that of the DIRECT (see Fig. 10a). An efficient solution to this inconvenience is given by a new diagram in Fig. 10b, where the intersection of the line with slope K̂ passing through any dot representing a sub-interval with the vertical coordinate axis gives us the lower bound (10) of f(x) over the corresponding sub-interval. Thus, the procedure of selecting the sub-intervals to be partitioned becomes simple in the case of differentiable objective functions too, and another practical geometric method for solving the LGO problem (2), (4) thus becomes available (see [62]). The usage of derivatives allows one to obtain, as expected, an acceleration in comparison with the DIRECT method. The development of a geometric method constructing smooth auxiliary functions with multiple estimates of the Lipschitz constant (4) for derivatives still remains an open question.

REFERENCES

[1] C. S. Adjiman, S. Dallwig, C. A. Floudas and A. Neumaier, A global optimization method, αBB, for general twice-differentiable constrained NLPs – I. Theoretical advances, Comput. Chem. Engng., 22 (1998), 1137–1158.
[2] M. Yu. Andramonov, A. M. Rubinov and B. M. Glover, Cutting angle methods in global optimization, Appl. Math. Lett., 12 (1999), 95–100.
[3] I. P. Androulakis, C. D. Maranas and C. A. Floudas, αBB: A global optimization method for general constrained nonconvex problems, J. Global Optim., 7 (1995), 337–363.
[4] C. Audet, P. Hansen and G. Savard, "Essays and Surveys in Global Optimization," GERAD 25th Anniversary, Springer–Verlag, New York, 2005.
[5] A. M. Bagirov, A. M. Rubinov and J. Zhang, Local optimization method with global multidimensional search, J. Global Optim., 32 (2005), 161–179.
[6] W. Baritompa and A. Cutler, Accelerations for global optimization covering methods using second derivatives, J. Global Optim., 4 (1994), 329–341.
[7] W. Baritompa, Customizing methods for global optimization – A geometric viewpoint, J. Global Optim., 3 (1993), 193–212.
[8] W. Baritompa, Accelerations for a variety of global optimization methods, J. Global Optim., 4 (1994), 37–45.
[9] K. A. Barkalov and R. G. Strongin, A global optimization technique with an adaptive order of checking for constraints, Comput. Math. Math. Phys., 42 (2002), 1289–1300.
[10] M. C. Bartholomew-Biggs, Z. J. Ulanowski and S. Zakovic, Using global optimization for a microparticle identification problem with noisy data, J. Global Optim., 32 (2005), 325–347.
[11] P. Basso, Iterative methods for the localization of the global maximum, SIAM J. Numer. Anal., 19 (1982), 781–792.
[12] G. Beliakov and A. Ferrer, Bounded lower subdifferentiability optimization techniques: Applications, J. Global Optim., 47 (2010), 211–231.
[13] D. P. Bertsekas, "Nonlinear Programming," Athena Scientific, Belmont, Massachusetts, 1999.
[14] B. Betrò, Bayesian methods in global optimization, J. Global Optim., 1 (1991), 1–14.
[15] M. Björkman and K. Holmström, Global optimization of costly nonconvex functions using radial basis functions, Optim. Eng., 1 (2000), 373–397.
[16] L. Breiman and A. Cutler, A deterministic algorithm for global optimization, Math. Program., 58 (1993), 179–199.
[17] R. G. Carter, J. M. Gablonsky, A. Patrick, C. T. Kelley and O. J. Eslinger, Algorithms for noisy problems in gas transmission pipeline optimization, Optim. Eng., 2 (2001), 139–157.
[18] L. G. Casado, I. García and Ya. D. Sergeyev, Interval algorithms for finding the minimal root in a set of multiextremal non-differentiable one-dimensional functions, SIAM J. Sci. Comput., 24 (2002), 359–376.
[19] M. H. Chang, Y. C. Park and T. Y. Lee, A new global optimization method for univariate constrained twice-differentiable NLP problems, J. Global Optim., 39 (2007), 79–100.
[20] F. H. Clarke, "Optimization and Nonsmooth Analysis," John Wiley & Sons, New York, 1983. Reprinted by SIAM Publications, 1990.
[21] J. J. Cochran, "Wiley Encyclopedia of Operations Research and Management Science (8 Volumes)," Wiley, New York, 2011.
[22] A. R. Conn, K. Scheinberg and L. N. Vicente, "Introduction to Derivative-Free Optimization," SIAM, Philadelphia, USA, 2009.
[23] S. E. Cox, R. T. Haftka, C. A. Baker, B. Grossman, W. H. Mason and L. T. Watson, A comparison of global optimization methods for the design of a high-speed civil transport, J. Global Optim., 21 (2001), 415–433.
[24] A. E. Csallner, T. Csendes and M. Cs. Markót, Multisection in interval branch-and-bound methods for global optimization – I.
Theoretical results, J. Global Optim., 16 (2000), 371–392.
[25] Yu. M. Danilin, Estimation of the efficiency of an absolute-minimum-finding algorithm, USSR Comput. Math. Math. Phys., 11 (1971), 261–267.
[26] V. F. Demyanov and V. N. Malozemov, "Introduction to Minimax," John Wiley & Sons, New York, 1974. (The 2nd English-language edition: Dover Publications, 1990).
[27] V. F. Demyanov and A. M. Rubinov, "Quasidifferential Calculus," Optimization Software Inc., Publication Division, New York, 1986.
[28] S. M. Elsakov and V. I. Shiryaev, Homogeneous algorithms for multiextremal optimization, Comput. Math. Math. Phys., 50 (2010), 1642–1654.

[29] Yu. G. Evtushenko, V. U. Malkova and A. A. Stanevichyus, Parallel global optimization of functions of several variables, Comput. Math. Math. Phys., 49 (2009), 246–260.
[30] Yu. G. Evtushenko, M. A. Posypkin and I. Kh. Sigal, A framework for parallel large-scale global optimization, Comp. Sci. – Res. Dev., 23 (2009), 211–215.
[31] Yu. G. Evtushenko and M. A. Posypkin, Coverings for global optimization of partial-nonlinear problems, Doklady Mathematics, 83 (2011), 268–271.
[32] Yu. G. Evtushenko, Numerical methods for finding global extrema (Case of a non-uniform mesh), USSR Comput. Math. Math. Phys., 11 (1971), 38–54.
[33] Yu. G. Evtushenko, "Numerical Optimization Techniques," Translations Series in Mathematics and Engineering, Springer–Verlag, New York, 1985.
[34] D. E. Finkel and C. T. Kelley, Additive scaling and the DIRECT algorithm, J. Global Optim., 36 (2006), 597–608.
[35] R. Fletcher, "Practical Methods of Optimization," John Wiley & Sons, New York, 2000.
[36] C. A. Floudas and C. E. Gounaris, A review of recent advances in global optimization, J. Global Optim., 45 (2009), 3–38.
[37] C. A. Floudas, P. M. Pardalos, C. S. Adjiman, W. Esposito, Z. Gümüs, S. Harding, J. Klepeis, C. Meyer and C. Schweiger, "Handbook of Test Problems in Local and Global Optimization," Kluwer Academic Publishers, Dordrecht, 1999.
[38] C. A. Floudas and P. M. Pardalos, "Encyclopedia of Optimization (6 Volumes)," Kluwer Academic Publishers, 2001. (The 2nd edition: Springer, 2009).
[39] K. R. Fowler, J. P. Reese, C. E. Kees, J. E. Dennis Jr., C. T. Kelley, C. T. Miller, C. Audet, A. J. Booker, G. Couture, R. W. Darwin, M. W. Farthing, D. E. Finkel, J. M. Gablonsky, G. Gray and T. G. Kolda, Comparison of derivative-free optimization methods for groundwater supply and hydraulic capture community problems, Adv. Water Res., 31 (2008), 743–757.
[40] J. M. Gablonsky and C. T. Kelley, A locally-biased form of the DIRECT algorithm, J. Global Optim., 21 (2001), 27–37.
[41] D. Y. Gao and H. D. Sherali, Canonical duality theory: Connection between nonconvex mechanics and global optimization, in "Advances in Applied Mathematics and Global Optimization" (eds. D. Y. Gao and H. D. Sherali), Springer, New York, (2009), 257–326.
[42] D. Y. Gao, "Duality Principles in Nonconvex Systems: Theory, Methods, and Applications," Kluwer Academic Publishers, Dordrecht, 2000.
[43] V. P. Gergel, A global search algorithm using derivatives, in "Systems Dynamics and Optimization" (ed. Yu. I. Neimark), NNGU Press, Nizhni Novgorod, Russia, (1992), 161–178. In Russian.
[44] V. A. Grishagin, Operating characteristics of some global search algorithms, in "Problems of Stochastic Search," Zinatne, Riga, 7 (1978), 198–206. In Russian.
[45] I. E. Grossmann, "Global Optimization in Engineering Design," Kluwer Academic Publishers, Dordrecht, 1996.
[46] H.-M. Gutmann, A radial basis function method for global optimization, J. Global Optim., 19 (2001), 201–227.
[47] P. Hansen and B. Jaumard, Lipschitz optimization, in "Handbook of Global Optimization" (eds. R. Horst and P. M. Pardalos), Kluwer Academic Publishers, Dordrecht, 1 (1995), 407–493.
[48] E. M. T. Hendrix and B. G.-Tóth, "Introduction to Nonlinear and Global Optimization," Springer, New York, 2010.
[49] J. He, L. T. Watson, N. Ramakrishnan, C. A. Shaffer, A. Verstak, J. Jiang, K. Bae and W. H. Tranter, Dynamic data structures for a direct search algorithm, Comput. Optim. Appl., 23 (2002), 5–25.
[50] J. B. Hiriart-Urruty and C.
Lemaréchal, "Convex Analysis and Minimization Algorithms (Parts I and II)," Springer–Verlag, Berlin, 1996.
[51] R. Horst, P. M. Pardalos and N. V. Thoai, "Introduction to Global Optimization," Kluwer Academic Publishers, Dordrecht, 1995. (The 2nd edition: Kluwer Academic Publishers, 2001).
[52] R. Horst and P. M. Pardalos, "Handbook of Global Optimization," volume 1, Kluwer Academic Publishers, Dordrecht, 1995.
[53] R. Horst and H. Tuy, "Global Optimization – Deterministic Approaches," Springer–Verlag, Berlin, 1996.
[54] R. Horst, Deterministic global optimization with partition sets whose feasibility is not known: Application to concave minimization, reverse convex constraints, DC-programming, and Lipschitzian optimization, J. Optim. Theory Appl., 58 (1988), 11–37.

[55] V. V. Ivanov, On optimal algorithms for the minimization of functions of certain classes, Cybernetics, 4 (1972), 81–94. In Russian.
[56] D. R. Jones, C. D. Perttunen and B. E. Stuckman, Lipschitzian optimization without the Lipschitz constant, J. Optim. Theory Appl., 79 (1993), 157–181.
[57] D. R. Jones, M. Schonlau and W. J. Welch, Efficient global optimization of expensive black-box functions, J. Global Optim., 13 (1998), 455–492.
[58] O. V. Khamisov, Global optimization of functions with a concave support minorant, Comput. Math. Math. Phys., 44 (2004), 1473–1483.
[59] A. G. Korotchenko, An algorithm for seeking the maximum value of univariate functions, USSR Comput. Math. Math. Phys., 18 (1978), 34–45.
[60] D. E. Kvasov, C. Pizzuti and Ya. D. Sergeyev, Local tuning and partition strategies for diagonal GO methods, Numer. Math., 94 (2003), 93–106.
[61] D. E. Kvasov and Ya. D. Sergeyev, Multidimensional global optimization algorithm based on adaptive diagonal curves, Comput. Math. Math. Phys., 43 (2003), 40–56.
[62] D. E. Kvasov and Ya. D. Sergeyev, A univariate global search working with a set of Lipschitz constants for the first derivative, Optim. Lett., 3 (2009), 303–318.
[63] D. Lera and Ya. D. Sergeyev, An information global minimization algorithm using the local improvement technique, J. Global Optim., 48 (2010), 99–112.
[64] D. Lera and Ya. D. Sergeyev, Lipschitz and Hölder global optimization using space-filling curves, Appl. Numer. Math., 60 (2010), 115–129.
[65] G. Liuzzi, S. Lucidi and V. Piccialli, A DIRECT-based approach exploiting local minimizations for the solution of large-scale global optimization problems, Comput. Optim. Appl., 45 (2010), 353–375.
[66] K. Ljungberg, S. Holmgren and Ö. Carlborg, Simultaneous search for multiple QTL using the global optimization algorithm DIRECT, Bioinformatics, 20 (2004), 1887–1895.
[67] D. MacLagan, T. Sturge and W. Baritompa, Equivalent methods for global optimization, in "State of the Art in Global Optimization" (eds. C. A. Floudas and P. M. Pardalos), Kluwer Academic Publishers, Dordrecht, (1996), 201–211.
[68] O. L. Mangasarian, "Nonlinear Programming," McGraw–Hill, New York, 1969. Reprinted by SIAM Publications, 1994.
[69] C. D. Maranas and C. A. Floudas, Global minimum potential energy conformations of small molecules, J. Global Optim., 4 (1994), 135–170.
[70] C. C. Meewella and D. Q. Mayne, An algorithm for global optimization of Lipschitz continuous functions, J. Optim. Theory Appl., 57 (1988), 307–322.
[71] C. C. Meewella and D. Q. Mayne, Efficient domain partitioning algorithms for global optimization of rational and Lipschitz continuous functions, J. Optim. Theory Appl., 61 (1989), 247–270.
[72] R. H. Mladineo, An algorithm for finding the global maximum of a multimodal multivariate function, Math. Program., 34 (1986), 188–200.
[73] J. Mockus, W. Eddy, A. Mockus, L. Mockus and G. Reklaitis, "Bayesian Heuristic Approach to Discrete and Global Optimization," Kluwer Academic Publishers, Dordrecht, 1996.
[74] J. Mockus, "Bayesian Approach to Global Optimization," Kluwer Academic Publishers, Dordrecht, 1989.
[75] C. G. Moles, P. Mendes and J. R. Banga, Parameter estimation in biochemical pathways: A comparison of global optimization methods, Genome Res., 13 (2003), 2467–2474.
[76] A. Molinaro, C. Pizzuti and Ya. D. Sergeyev, Acceleration tools for diagonal information global optimization algorithms, Comput. Optim. Appl., 18 (2001), 5–26.
[77] A. Molinaro and Ya. D.
Sergeyev, Finding the minimal root of an equation with the multiex- tremal and nondifferentiable left-hand part, Numer. Algorithms, 28 (2001), 255–272. [78] V. N. Nefedov, Some problems of solving Lipschitzian global optimization problems using the method, Comput. Math. Math. Phys., 32 (1992), 433–445. [79] Yu. I. Neimark and R. G. Strongin, The information approach to the problem of search of extrema of functions, Engineering Cybernetics, 1 (1966), 17–26. [80] A. Neumaier, Complete search in continuous global optimization and constraint satisfaction, In “Acta Numerica 2004”(ed. A. Iserles), Cambridge University Press, UK, 13 (2004), 271– 369. [81] J. Nocedal and S. J. Wright, “Numerical Optimization,” Springer–Verlag, Dordrecht, 1999. (The 2nd edition: Springer, 2006). UNIVARIATE GEOMETRIC LIPSCHITZ GLOBAL OPTIMIZATION 89

[82] P. M. Pardalos and M. G. C. Resende, “Handbook of Applied Optimization,” Oxford University Press, New York, 2002.
[83] P. M. Pardalos and H. E. Romeijn, “Handbook of Optimization in Medicine,” Springer, New York, 2009.
[84] R. Paulavičius, J. Žilinskas and A. Grothey, Investigation of selection strategies in branch and bound algorithm with simplicial partitions and combination of Lipschitz bounds, Optim. Lett., 4 (2010), 173–183.
[85] J. D. Pintér, “Global Optimization in Action (Continuous and Lipschitz Optimization: Algorithms, Implementations and Applications),” Kluwer Academic Publishers, Dordrecht, 1996.
[86] J. D. Pintér, Global Optimization: Scientific and Engineering Case Studies, Nonconvex Optimization and Its Applications, Springer–Verlag, Berlin, 85 (2006).
[87] S. A. Piyavskij, An algorithm for finding the absolute minimum of a function, In “Optimum Decision Theory,” Inst. Cybern. Acad. Science Ukrainian SSR, Kiev, 2 (1967), 13–24. In Russian.
[88] S. A. Piyavskij, An algorithm for finding the absolute extremum of a function, USSR Comput. Math. Math. Phys., 12 (1972), 57–67. (In Russian: Zh. Vychisl. Mat. Mat. Fiz., 12 (1972), 888–896.)
[89] S. Rebennack, P. M. Pardalos, M. V. F. Pereira and N. A. Iliadis, “Handbook of Power Systems I,” Springer, New York, 2010.
[90] M. G. C. Resende and P. M. Pardalos, “Handbook of Optimization in Telecommunications,” Springer, New York, 2006.
[91] R. T. Rockafellar, “Convex Analysis,” Princeton University Press, Princeton, NJ, USA, 1970. Reprinted in 1996.
[92] F. Schoen, On a sequential search strategy in global optimization problems, Calcolo, 19 (1982), 321–334.
[93] Ya. D. Sergeyev, P. Daponte, D. Grimaldi and A. Molinaro, Two methods for solving optimization problems arising in electronic measurements and electrical engineering, SIAM J. Optim., 10 (1999), 1–21.
[94] Ya. D. Sergeyev and V. A. Grishagin, Parallel asynchronous global search and the nested optimization scheme, J. Comput. Anal. Appl., 3 (2001), 123–145.
[95] Ya. D. Sergeyev and D. E. Kvasov, Global search based on efficient diagonal partitions and a set of Lipschitz constants, SIAM J. Optim., 16 (2006), 910–937.
[96] Ya. D. Sergeyev and D. E. Kvasov, “Diagonal Global Optimization Methods,” FizMatLit, Moscow, 2008. In Russian.
[97] Ya. D. Sergeyev and D. L. Markin, An algorithm for solving global optimization problems with nonlinear constraints, J. Global Optim., 7 (1995), 407–419.
[98] Ya. D. Sergeyev, “Divide the best” algorithms for global optimization, Technical Report 2–94, Department of Mathematics, University of Calabria, Rende (CS), Italy, 1994.
[99] Ya. D. Sergeyev, Global optimization algorithms using smooth auxiliary functions, Technical Report 5, ISI–CNR, Institute of Systems and Informatics, Rende (CS), Italy, 1994.
[100] Ya. D. Sergeyev, A global optimization algorithm using derivatives and local tuning, Technical Report 1, ISI–CNR, Institute of Systems and Informatics, Rende (CS), Italy, 1994.
[101] Ya. D. Sergeyev, An information global optimization algorithm with local tuning, SIAM J. Optim., 5 (1995), 858–870.
[102] Ya. D. Sergeyev, A one-dimensional deterministic global minimization algorithm, Comput. Math. Math. Phys., 35 (1995), 705–717.
[103] Ya. D. Sergeyev, A method using local tuning for minimizing functions with Lipschitz derivatives, In “Developments in Global Optimization” (eds. I. M. Bomze, T. Csendes, R. Horst, and P. M. Pardalos), Kluwer Academic Publishers, (1997), 199–216.
[104] Ya. D. Sergeyev, Global one-dimensional optimization using smooth auxiliary functions, Math. Program., 81 (1998), 127–146.
[105] Ya. D. Sergeyev, On convergence of “Divide the Best” global optimization algorithms, Optimization, 44 (1998), 303–325.
[106] Ya. D. Sergeyev, Multidimensional global optimization using the first derivatives, Comput. Math. Math. Phys., 39 (1999), 711–720.
[107] Ya. D. Sergeyev, Univariate global optimization with multiextremal non-differentiable constraints without penalty functions, Comput. Optim. Appl., 34 (2006), 229–248.
[108] Z. Shen and Y. Zhu, An interval version of Shubert's iterative method for the localization of the global maximum, Computing, 38 (1987), 275–280.
[109] B. O. Shubert, A sequential method seeking the global maximum of a function, SIAM J. Numer. Anal., 9 (1972), 379–388.
[110] C. P. Stephens and W. Baritompa, Global optimization requires global information, J. Optim. Theory Appl., 96 (1998), 575–588.
[111] A. S. Strekalovsky, “Elements of Nonconvex Optimization,” Nauka, Novosibirsk, 2003. In Russian.
[112] R. G. Strongin and D. L. Markin, Minimization of multiextremal functions with nonconvex constraints, Cybernetics, 22 (1986), 486–493.
[113] R. G. Strongin and Ya. D. Sergeyev, “Global Optimization with Non-Convex Constraints: Sequential and Parallel Algorithms,” Kluwer Academic Publishers, Dordrecht, 2000.
[114] R. G. Strongin, Multiextremal minimization for measurements with interference, Engineering Cybernetics, 16 (1969), 105–115.
[115] R. G. Strongin, “Numerical Methods in Multiextremal Problems (Information-Statistical Algorithms),” Nauka, Moscow, 1978. In Russian.
[116] A. G. Sukharev, “Minimax Algorithms in Problems of Numerical Analysis,” Nauka, Moscow, 1989. In Russian.
[117] L. N. Timonov, An algorithm for search of a global extremum, Engineering Cybernetics, 15 (1977), 38–44.
[118] A. Törn and A. Žilinskas, “Global Optimization,” Lecture Notes in Computer Science, Springer–Verlag, Berlin, 350 (1989).
[119] R. J. Vanderbei, Extension of Piyavskii's algorithm to continuous global optimization, J. Global Optim., 14 (1999), 205–216.
[120] L. T. Watson and C. Baker, A fully-distributed parallel global search algorithm, Engineering Computations, 18 (2001), 155–169.
[121] G. R. Wood and B. P. Zhang, Estimation of the Lipschitz constant of a function, J. Global Optim., 8 (1996), 91–103.
[122] G. R. Wood, Multidimensional bisection applied to global optimisation, Comput. Math. Appl., 21 (1991), 161–172.
[123] A. A. Zhigljavsky and A. Žilinskas, “Stochastic Global Optimization,” Springer, N. Y., 2008.
[124] A. Žilinskas, Axiomatic approach to statistical models and their use in multimodal optimization theory, Math. Program., 22 (1982), 104–116.
[125] A. Žilinskas, “Global Optimization. Axiomatics of Statistical Models, Algorithms, and Applications,” Mokslas, Vilnius, 1986. In Russian.

Received May 2011; 1st revision June 2011; 2nd revision August 2011.

E-mail address: [email protected]
E-mail address: [email protected]