
Proceedings of the 42nd IEEE Conference on Decision and Control, Maui, Hawaii USA, December 2003. ThM07-2

Stochastic Optimization with Inequality Constraints Using Simultaneous Perturbations and Penalty Functions

I-Jeng Wang* and James C. Spall**, The Johns Hopkins University Applied Physics Laboratory, 11100 Johns Hopkins Road, Laurel, MD 20723-6099, USA

Abstract— We present a stochastic approximation algorithm based on the penalty function method and a simultaneous perturbation gradient estimate for solving stochastic optimization problems with general inequality constraints. We present a general convergence result that applies to a class of penalty functions including the quadratic penalty function, the augmented Lagrangian, and the absolute value penalty function. We also establish an asymptotic normality result for the algorithm with smooth penalty functions under minor assumptions. Numerical results are given to compare the performance of the proposed algorithm with different penalty functions.

I. INTRODUCTION

In this paper, we consider a constrained stochastic optimization problem for which only noisy measurements of the cost function are available. More specifically, we aim to solve the following optimization problem:

    min_{θ ∈ G} L(θ),                                                  (1)

where L: ℝ^d → ℝ is a real-valued cost function, θ ∈ ℝ^d is the parameter vector, and G ⊂ ℝ^d is the constraint set. We also assume that the gradient of L(·) exists and is denoted by g(·), and that there exists a unique solution θ* for the constrained optimization problem defined by (1). We consider the situation where no explicit closed-form expression of the function L is available (or it is very complicated even if available), and the only information consists of noisy measurements of L at specified values of the parameter vector θ. This scenario arises naturally in simulation-based optimization, where the cost function L is defined as the expected value of a random cost associated with the stochastic simulation of a complex system. We also assume that significant costs (in terms of time and/or computational resources) are involved in obtaining each measurement (or sample) of L(θ). These constraints prevent us from estimating the gradient (or Hessian) of L(·) accurately, and hence prohibit the application of effective nonlinear programming techniques for inequality constraints, for example, sequential quadratic programming methods (see, for example, Section 4.3 of [1]). Throughout the paper we use θ_n to denote the nth estimate of the solution θ*.

Several results have been presented for constrained optimization in the stochastic domain. In the area of stochastic approximation (SA), most of the available results are based on the simple idea of projecting the estimate θ_n back to its nearest point in G whenever θ_n lies outside the constraint set G. These projection-based SA algorithms are typically of the following form:

    θ_{n+1} = π_G[θ_n − a_n ĝ_n(θ_n)],                                 (2)

where π_G: ℝ^d → G is the set projection operator and ĝ_n(θ_n) is an estimate of the gradient g(θ_n); see, for example, [2], [3], [5], [6]. The main difficulty with this projection approach lies in the implementation (calculation) of the projection operator π_G. Except for simple constraints such as interval or linear constraints, calculation of π_G(θ) for an arbitrary vector θ is a formidable task.

Other techniques for dealing with constraints have also been considered: Hiriart-Urruty [7] and Pflug [8] present and analyze SA algorithms based on the penalty function method for stochastic optimization of a convex function with convex inequality constraints; Kushner and Clark [3] present several SA algorithms based on the Lagrange multiplier method, the penalty function method, and a combination of both. Most of these techniques rely on the Kiefer-Wolfowitz (KW) [4] type of gradient estimate when the gradient of the cost function is not readily available. Furthermore, the convergence of these SA algorithms based on "non-projection" techniques generally requires complicated assumptions on the cost function L and the constraint set G. In this paper, we present and study the convergence of a class of algorithms based on the penalty function method and the simultaneous perturbation (SP) gradient estimate [9]. The advantage of the SP gradient estimate over the KW-type estimate for unconstrained optimization has been demonstrated with the simultaneous perturbation stochastic approximation (SPSA) algorithms. Whenever possible, we present sufficient conditions (as remarks) that can be more easily verified than the much weaker conditions used in our convergence proofs.

We focus on general explicit inequality constraints, where G is defined by

    G ≜ {θ ∈ ℝ^d : q_j(θ) ≤ 0, j = 1, …, s},                           (3)

This work was supported by the JHU/APL Independent Research and Development Program.
*Phone: 240-228-6204; E-mail: i-jeng.wang@jhuapl.edu.
**Phone: 240-228-4960; E-mail: james.spall@jhuapl.edu.
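For the simple constraint sets mentioned above, the projection π_G in (2) is trivial; for interval (box) constraints it reduces to componentwise clipping. A minimal sketch of the projection-based recursion (the function names, the quadratic toy cost, and the gain sequence are our own illustration, not from the paper):

```python
import numpy as np

def projected_sa(grad_est, project, theta0, a=lambda n: 0.1 / (n + 1), n_iter=500):
    """Projection-based SA: theta_{n+1} = pi_G[theta_n - a_n * ghat_n(theta_n)]."""
    theta = np.asarray(theta0, dtype=float)
    for n in range(n_iter):
        theta = project(theta - a(n) * grad_est(theta))
    return theta

# Box constraint G = [0, 1]^2: the projection is just componentwise clipping.
project_box = lambda th: np.clip(th, 0.0, 1.0)

# Noisy gradient of L(theta) = ||theta - (2, -1)||^2; the unconstrained
# minimum (2, -1) lies outside G, so the iterates pile up on the boundary.
rng = np.random.default_rng(0)
grad = lambda th: 2.0 * (th - np.array([2.0, -1.0])) + rng.normal(0.0, 0.1, size=2)

theta_hat = projected_sa(grad, project_box, theta0=[0.5, 0.5])
# Constrained minimum is the corner (1, 0) of the box.
```

For a general nonlinear G as in (3), no such closed-form projection exists, which is exactly the difficulty the penalty approach below avoids.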

where q_j: ℝ^d → ℝ are continuously differentiable real-valued functions. We assume that the analytical expression of the functions q_j is available. We extend the result presented in [10] to incorporate a larger class of penalty functions based on the augmented Lagrangian method. We also establish the asymptotic normality for the proposed algorithm. Simulation results are presented to illustrate the performance of the technique for stochastic optimization.

II. CONSTRAINED SPSA ALGORITHMS

A. Penalty Functions

The basic idea of the penalty-function approach is to convert the originally constrained optimization problem (1) into an unconstrained one defined by

    min_θ L_r(θ) ≜ L(θ) + r P(θ),                                      (4)

where P: ℝ^d → ℝ is the penalty function and r is a positive real number normally referred to as the penalty parameter. The penalty functions are defined such that P is an increasing function of the constraint functions q_j; P > 0 if and only if q_j > 0 for some j; P → ∞ as q_j → ∞; and P → l (l ≥ 0) as q_j → −∞. In this paper, we consider a penalty function method based on the augmented Lagrangian function defined by

    L_r(θ, λ) = L(θ) + (1/2r) Σ_{j=1}^s { [max{0, λ_j + r q_j(θ)}]² − λ_j² },   (5)

where λ ∈ ℝ^s can be viewed as an estimate of the Lagrange multiplier vector. The associated penalty function is

    P(θ, λ) = (1/2r²) Σ_{j=1}^s { [max{0, λ_j + r q_j(θ)}]² − λ_j² }.          (6)

Let {r_n} be a positive and strictly increasing sequence with r_n → ∞, and let {λ_n} be a bounded nonnegative sequence in ℝ^s. It can be shown (see, for example, Section 4.2 of [1]) that the minimum of the sequence of functions {L_n}, defined by

    L_n(θ) ≜ L(θ) + r_n P(θ, λ_n),

converges to the solution of the original constrained problem (1). Since the penalized cost function (or the augmented Lagrangian) (5) is a differentiable function of θ, we can apply the standard stochastic approximation technique with the SP gradient estimate for L to minimize {L_n(·)}. In other words, the original problem can be solved with an algorithm of the following form:

    θ_{n+1} = θ_n − a_n ∇̂L_n(θ_n) = θ_n − a_n ĝ_n(θ_n) − a_n r_n ∇P(θ_n),

where ĝ_n is the SP estimate of the gradient g(·) at θ_n that we shall specify later. Note that since we assume the constraints are explicitly given, the gradient of the penalty function P(·) is used directly in the algorithm.

Note that when λ_n = 0, the penalty function defined by (6) reduces to the standard quadratic penalty function discussed in [10]:

    L_r(θ, 0) = L(θ) + (r/2) Σ_{j=1}^s [max{0, q_j(θ)}]².

Even though the convergence of the proposed algorithm only requires that {λ_n} be bounded (hence we can set λ_n = 0), we can significantly improve the performance of the algorithm with an appropriate choice of the sequence {λ_n} based on concepts from Lagrange multiplier theory. Moreover, it has been shown [1] that, with the standard quadratic penalty function, the penalized cost function L_n = L + r_n P can become ill-conditioned as r_n increases (that is, the condition number of the Hessian of L_n at θ_n* diverges to ∞ with r_n). The use of the general penalty function defined in (6) can prevent this difficulty if {λ_n} is chosen so that it is close to the true Lagrange multipliers. In Section IV, we will present an update rule based on the method of multipliers (see, for example, [11]) to update λ_n and compare its performance with the standard quadratic penalty function.

B. An SPSA Algorithm for Inequality Constraints

In this section, we present the specific form of the algorithm for solving the constrained stochastic optimization problem. The algorithm we consider is defined by

    θ_{n+1} = θ_n − a_n ĝ_n(θ_n) − a_n r_n ∇P(θ_n),                    (7)

where ĝ_n(θ_n) is an estimate of the gradient of L, g(·), at θ_n, {r_n} is an increasing sequence of positive scalars with lim_{n→∞} r_n = ∞, ∇P(θ) is the gradient of P(θ) at θ, and {a_n} is a positive scalar sequence satisfying a_n → 0 and Σ_{n=1}^∞ a_n = ∞. The gradient estimate is obtained from two noisy measurements of the cost function L by

    ĝ_n(θ_n) = [(L(θ_n + c_n Δ_n) + ε_n⁺) − (L(θ_n − c_n Δ_n) + ε_n⁻)] / (2 c_n) · Δ_n⁻¹,   (8)

where Δ_n ∈ ℝ^d is a random perturbation vector, c_n → 0 is a positive sequence, ε_n⁺ and ε_n⁻ are the noise in the measurements, and Δ_n⁻¹ denotes the vector [1/Δ_{n1}, …, 1/Δ_{nd}]^T. For analysis, we rewrite the algorithm (7) as

    θ_{n+1} = θ_n − a_n g(θ_n) − a_n r_n ∇P(θ_n) + a_n d_n − a_n (ε_n / 2c_n) Δ_n⁻¹,   (9)

where d_n and ε_n are defined by

    d_n ≜ g(θ_n) − [L(θ_n + c_n Δ_n) − L(θ_n − c_n Δ_n)] / (2 c_n) · Δ_n⁻¹,
    ε_n ≜ ε_n⁺ − ε_n⁻,

respectively.

We establish the convergence of the algorithm (7) and the associated asymptotic normality under appropriate assumptions in the next section.
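One iteration of (7) with the SP gradient estimate (8) can be sketched as follows. The symmetric Bernoulli ±1 choice for Δ_n and the small toy problem in the usage example are our own assumptions for illustration, not prescriptions from the paper:

```python
import numpy as np

def spsa_penalty_step(theta, n, y, grad_P, a, c, r, rng):
    """One iteration of (7): theta - a_n * ghat_n(theta) - a_n * r_n * grad P(theta).

    y(theta) returns a noisy measurement L(theta) + eps; grad_P is available
    analytically because the constraints q_j are explicitly given.
    """
    a_n, c_n, r_n = a(n), c(n), r(n)
    delta = rng.choice([-1.0, 1.0], size=theta.shape)  # symmetric Bernoulli perturbation
    ghat = (y(theta + c_n * delta) - y(theta - c_n * delta)) / (2.0 * c_n) * (1.0 / delta)
    return theta - a_n * ghat - a_n * r_n * grad_P(theta)

# Toy usage: minimize ||theta||^2 subject to q_1(theta) = 1 - theta_1 <= 0,
# with the quadratic penalty P = max{0, q_1}^2 / 2, grad P = -max{0, 1 - theta_1} e_1.
rng = np.random.default_rng(1)
y = lambda th: float(th @ th) + rng.normal(0.0, 0.01)
grad_P = lambda th: np.array([-max(0.0, 1.0 - th[0]), 0.0])
theta = np.array([2.0, 2.0])
for n in range(1, 3001):
    theta = spsa_penalty_step(theta, n, y, grad_P,
                              a=lambda k: 0.1 * (k + 100) ** -0.602,
                              c=lambda k: k ** -0.101,
                              r=lambda k: 10.0 * k ** 0.1,
                              rng=rng)
# The constrained minimum is (1, 0); with a finite r_n the iterate sits
# slightly outside the feasible set, as is typical of penalty methods.
```

Note that because Δ_n has ±1 entries, elementwise division by Δ_n is the same as elementwise multiplication; the explicit `1.0 / delta` is kept to mirror (8).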

III. CONVERGENCE AND ASYMPTOTIC NORMALITY

A. Convergence Theorem

To establish convergence of the algorithm (7), we need to study the asymptotic behavior of an SA algorithm with a "time-varying" regression function. In other words, we need to consider the convergence of an SA algorithm of the following form:

    θ_{n+1} = θ_n − a_n f_n(θ_n) + a_n d_n + a_n e_n,                  (10)

where {f_n(·)} is a sequence of functions. We state here without proof a version of the convergence theorem given by Spall and Cristion in [13] for an algorithm in the generic form (10).

Theorem 1: Assume the following conditions hold:

(A.1) For each n large enough (n ≥ N for some N ∈ ℕ), there exists a unique θ_n* such that f_n(θ_n*) = 0. Furthermore, lim_{n→∞} θ_n* = θ*.

(A.2) d_n → 0, and Σ_{k=1}^n a_k e_k converges.

(A.3) For some N < ∞, any ρ > 0, and for each n ≥ N, if ‖θ − θ*‖ ≥ ρ, then there exists a δ_n(ρ) > 0 such that (θ − θ*)^T f_n(θ) ≥ δ_n(ρ) ‖θ − θ*‖, where δ_n(ρ) satisfies Σ_{n=1}^∞ a_n δ_n(ρ) = ∞ and d_n δ_n(ρ)⁻¹ → 0.

(A.4) For each i = 1, 2, …, d, and any ρ > 0, if |θ_{ni} − (θ*)_i| > ρ eventually, then either f_{ni}(θ_n) ≥ 0 eventually or f_{ni}(θ_n) < 0 eventually.

(A.5) For any τ > 0 and nonempty S ⊂ {1, 2, …, d}, there exists a ρ′(τ, S) > τ such that for all θ ∈ {θ ∈ ℝ^d : |(θ − θ*)_i| < τ when i ∉ S, |(θ − θ*)_i| ≥ ρ′(τ, S) when i ∈ S},

    lim sup_{n→∞} | Σ_{i∉S} (θ − θ*)_i f_{ni}(θ) | / Σ_{i∈S} (θ − θ*)_i f_{ni}(θ) < 1.

Then the sequence {θ_n} defined by the algorithm (10) converges to θ*.

Based on Theorem 1, we give a convergence result for algorithm (7) by substituting ∇L_n(θ_n) = g(θ_n) + r_n ∇P(θ_n), d_n, and −(ε_n / 2c_n) Δ_n⁻¹ into f_n(θ_n), d_n, and e_n in (10), respectively. We need the following assumptions:

(C.1) There exists K₁ ∈ ℕ such that for all n ≥ K₁, we have a unique θ_n* ∈ ℝ^d with ∇L_n(θ_n*) = 0.

(C.2) {Δ_{ni}} are i.i.d. and symmetrically distributed about 0, with |Δ_{ni}| ≤ α₀ a.s. and E|Δ_{ni}⁻¹| ≤ α₁.

(C.3) Σ_{n=1}^∞ (a_n ε_n / 2c_n) Δ_n⁻¹ converges almost surely.

(C.4) If ‖θ − θ*‖ ≥ ρ, then there exists a δ(ρ) > 0 such that
    (i) if θ ∈ G, then (θ − θ*)^T g(θ) ≥ δ(ρ) ‖θ − θ*‖ > 0;
    (ii) if θ ∉ G, then at least one of the following two conditions holds:
        • (θ − θ*)^T g(θ) ≥ δ(ρ) ‖θ − θ*‖ and (θ − θ*)^T ∇P(θ) ≥ 0;
        • (θ − θ*)^T g(θ) ≥ −M for some constant M ≥ 0, and (θ − θ*)^T ∇P(θ) ≥ δ(ρ) ‖θ − θ*‖ > 0.

(C.5) a_n r_n → 0; g(·) and ∇P(·) are Lipschitz. (See comments below.)

(C.6) ∇L_n(·) satisfies condition (A.5).

Theorem 2: Suppose that assumptions (C.1)-(C.6) hold. Then the sequence {θ_n} defined by (7) converges to θ* almost surely.

Proof: We only need to verify the conditions (A.1)-(A.5) in Theorem 1 to show the desired result:

• Condition (A.1) basically requires that the stationary points of the sequence {∇L_n(·)} converge to θ*. Assumption (C.1) together with existing results on penalty function methods establishes this desired convergence.

• From the results in [9], [14] and assumptions (C.2)-(C.3), we can show that condition (A.2) holds.

• Since r_n → ∞, condition (A.3) holds from assumption (C.4).

• From (9) and assumptions (C.1) and (C.5), we have |(θ_{n+1} − θ_n)_i| < |(θ_n − θ*)_i| for large n if |θ_{ni} − (θ*)_i| > ρ. Hence for large n, the sequence {θ_{ni}} does not "jump" over the interval between (θ*)_i and θ_{ni}. Therefore, if |θ_{ni} − (θ*)_i| > ρ eventually, then the sequence {f_{ni}(θ_n)} does not change sign eventually; that is, condition (A.4) holds.

• Assumption (A.5) holds directly from (C.6). ∎

Theorem 2 given above is general in the sense that it does not specify the exact type of penalty function P(·) to adopt. At first glance, assumption (C.4) may seem difficult to satisfy. In fact, assumption (C.4) is fairly weak, and it does address an inherent limitation of penalty-function-based algorithms. For example, suppose that a constraint function q_k(·) has a local minimum at θ′ with q_k(θ′) > 0. Then for every θ with q_j(θ) ≤ 0, j ≠ k, we have (θ − θ′)^T ∇P(θ) > 0 whenever θ is close enough to θ′. As r_n gets larger, the term r_n ∇P(θ) would dominate the behavior of the algorithm and could result in convergence to a wrong solution. We would also like to point out that assumption (C.4) is satisfied if the cost function L and the constraint functions q_j, j = 1, …, s, are convex and satisfy the Slater condition, that is, the minimum cost L(θ*) is finite and there exists a θ ∈ ℝ^d such that q_j(θ) < 0 for all j (this is the case studied in [8]).

Assumption (C.6) ensures that for n sufficiently large each element of g(θ) + r_n ∇P(θ) makes a non-negligible contribution to products of the form (θ − θ*)^T (g(θ) + r_n ∇P(θ)) when (θ − θ*)_i ≠ 0. A sufficient condition for (C.6) is that, for each i, g_i(θ) + r_n ∇_i P(θ) be uniformly bounded both away from 0 and away from ∞ when |(θ − θ*)_i| ≥ ρ > 0 for all i.

Theorem 2 in the stated form does require that the penalty function P be differentiable. However, it is possible to extend the stated results to the case where P is Lipschitz but not differentiable on a set of points with zero measure, for example, the absolute value penalty function

    P(θ) = max_{j=1,…,s} { max{0, q_j(θ)} }.
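Where the maximal constraint value is strictly positive, this absolute value penalty is differentiable with gradient ∇q_{J(θ)} for the maximizing index J(θ), and it is identically zero on the interior of G. A small sketch of the penalty and this almost-everywhere gradient (the helper names are ours):

```python
import numpy as np

def abs_penalty(q_vals):
    """P(theta) = max_j max{0, q_j(theta)}: zero exactly on the feasible set G."""
    return max(0.0, float(np.max(q_vals)))

def abs_penalty_grad(q_vals, q_grads):
    """Almost-everywhere gradient: grad q_J at the maximizing index J when P > 0."""
    J = int(np.argmax(q_vals))
    if q_vals[J] <= 0.0:                     # theta in G: the penalty is flat at 0
        return np.zeros_like(np.asarray(q_grads[J], dtype=float))
    return np.asarray(q_grads[J], dtype=float)

# Two constraints q_1(theta) = theta_1 - 1 and q_2(theta) = -theta_2 at theta = (3, 5):
q_vals = np.array([2.0, -5.0])               # only q_1 is violated, so J(theta) picks q_1
q_grads = [np.array([1.0, 0.0]), np.array([0.0, -1.0])]
g = abs_penalty_grad(q_vals, q_grads)
```

The ties and kinks where two constraints attain the maximum simultaneously form the zero-measure set on which P is not differentiable, which is exactly the technical point discussed next.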

In the case where the density function of the measurement noise (ε_n⁺ and ε_n⁻ in (8)) exists and has infinite support, we can take advantage of the fact that the iterations of the algorithm visit any zero-measure set with zero probability. Assuming that the set D ≜ {θ ∈ ℝ^d : ∇P(θ) does not exist} has Lebesgue measure 0 and the random perturbation Δ_n follows a Bernoulli distribution (P(Δ_{ni} = 1) = P(Δ_{ni} = −1) = 1/2), we can construct a simple proof to show that

    P{θ_n ∈ D i.o.} = 0

if P{θ_0 ∈ D} = 0. Therefore, the convergence result in Theorem 2 applies to penalty functions with non-smoothness on a set with measure zero. Hence in any practical application, we can simply ignore this technical difficulty and use

    ∇P(θ) = 1{q_{J(θ)}(θ) > 0} ∇q_{J(θ)}(θ),

where J(θ) = argmax_{j=1,…,s} q_j(θ) (note that J(θ) is uniquely defined for θ ∉ D). An alternative approach to handle this technical difficulty is to apply the SP gradient estimate directly to the penalized cost L(θ) + r P(θ) and adopt the convergence analysis presented in [15] for nondifferentiable optimization with additional convexity assumptions.

Use of non-differentiable penalty functions might allow us to avoid the difficulty of ill-conditioning as r_n → ∞ without using more complicated penalty function methods such as the augmented Lagrangian method used here. The rationale is that there exists a constant r̄ = Σ_{j=1}^s λ_j* (where λ_j* is the Lagrange multiplier associated with the jth constraint) such that the minimum of L + rP is identical to the solution of the original constrained problem (1) for all r > r̄, based on the theory of exact penalties (see, for example, Section 4.3 of [1]). This property of the absolute value penalty function allows us to use a constant penalty parameter r > r̄ (instead of r_n → ∞) to avoid the issue of ill-conditioning. However, it is difficult to obtain a good estimate for r̄ in our situation, where the analytical expression of g(·) (the gradient of the cost function L(·)) is not available. And it is not clear that the application of exact penalty functions with a constant r would lead to better performance than the augmented-Lagrangian-based technique. In Section IV we will also illustrate (via numerical results) the potentially poor performance of the algorithm with an arbitrarily chosen large r.

B. Asymptotic Normality

When differentiable penalty functions are used, we can establish the asymptotic normality for the proposed algorithm. In the case where q_j(θ*) < 0 for all j = 1, …, s (that is, there is no active constraint at θ*), the asymptotic behavior of the algorithm is exactly the same as that of the unconstrained SPSA algorithm, and has been established in [9]. Here we consider the case where at least one of the constraints is active at θ*, that is, the set A ≜ {j = 1, …, s : q_j(θ*) = 0} is not empty. We establish the asymptotic normality for the algorithm with smooth penalty functions of the form

    P(θ) = Σ_{j=1}^s p_j(q_j(θ)),

which includes both the quadratic penalty and augmented Lagrangian functions.

Assume further that E[ε_n | F_n, Δ_n] = 0 a.s., E[ε_n² | F_n] → σ² a.s., E[(Δ_{ni})⁻²] → ρ², and E[(Δ_{ni})²] → ξ², where F_n is the σ-algebra generated by θ_1, …, θ_n. Let H(θ) denote the Hessian matrix of L(θ), and let

    H_P(θ) = Σ_{j∈A} ∇²( p_j(q_j(θ)) ).

The next proposition establishes the asymptotic normality for the proposed algorithm with the following choice of parameters: a_n = a n^{−α}, c_n = c n^{−γ}, and r_n = r n^{η} with a, c, r > 0, β = α − η − 2γ > 0, and 3γ − α/2 + 3η/2 ≥ 0.

Proposition 1: Assume that conditions (C.1)-(C.6) hold. Let P be orthogonal with P H_P(θ*) P^T = a⁻¹ r⁻¹ diag(λ₁, …, λ_d). Then

    n^{β/2} (θ_n − θ*) → N(μ, P M P^T) in distribution as n → ∞,

where M = (1/4) a² c⁻² σ² ρ² diag[(2λ₁ − β⁺)⁻¹, …, (2λ_d − β⁺)⁻¹] with β⁺ = β < 2 min_i λ_i if α = 1 and β⁺ = 0 if α < 1, and

    μ = 0                                  if 3γ − α/2 + 3η/2 > 0,
    μ = (a r H_P(θ*) − (β⁺/2) I)⁻¹ T       if 3γ − α/2 + 3η/2 = 0,

where the lth element of T is

    −(1/6) a c² ξ² [ L_{lll}^{(3)}(θ*) + 3 Σ_{i=1, i≠l}^d L_{iil}^{(3)}(θ*) ].

Proof: For n large enough, the recursion (7) can be written in the generic form treated in [16]. Following the techniques used in [9] and the general normality results from [16], we can establish the desired result. ∎

Note that, based on the result in Proposition 1, the convergence rate of order n^{1/3} is achieved with α = 1 and γ = 1/6 − η/2 > 0.
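The rate claim above is plain arithmetic on the gain exponents; a quick check (the specific value η = 0.1, matching the penalty growth used later in Section IV, is our choice for illustration):

```python
# Exponents for a_n = a * n**-alpha, c_n = c * n**-gamma, r_n = r * n**eta.
alpha, eta = 1.0, 0.1
gamma = 1.0 / 6.0 - eta / 2.0      # boundary of the constraint 3*gamma - alpha/2 + 3*eta/2 >= 0

beta = alpha - eta - 2.0 * gamma   # n**(beta/2) is the convergence-rate order
constraint = 3.0 * gamma - alpha / 2.0 + 3.0 * eta / 2.0

# beta = 2/3, so n**(beta/2) = n**(1/3): the same rate order as unconstrained SPSA,
# with the penalty growth eta paid for by a smaller perturbation-decay gamma.
```

The same computation with η = 0 recovers the classical unconstrained SPSA choice γ = 1/6.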

IV. NUMERICAL EXPERIMENTS

We test our algorithm on a constrained optimization problem described in [17, p. 352]:

    min_θ L(θ) = θ₁² + θ₂² + 2θ₃² + θ₄² − 5θ₁ − 5θ₂ − 21θ₃ + 7θ₄

subject to

    q₁(θ) = 2θ₁² + θ₂² + θ₃² + 2θ₁ − θ₂ − θ₄ − 5 ≤ 0,
    q₂(θ) = θ₁² + θ₂² + θ₃² + θ₄² + θ₁ − θ₂ + θ₃ − θ₄ − 8 ≤ 0,
    q₃(θ) = θ₁² + 2θ₂² + θ₃² + 2θ₄² − θ₁ − θ₄ − 10 ≤ 0.

The minimum cost L(θ*) = −44 under the constraints occurs at θ* = [0, 1, 2, −1]^T, where the constraints q₁(·) ≤ 0 and q₂(·) ≤ 0 are active. The Lagrange multiplier is [λ₁*, λ₂*, λ₃*]^T = [2, 1, 0]^T. The problem had not been solved to satisfactory accuracy with deterministic search methods that operate directly with the constraints (as claimed in [17]). Further, we increase the difficulty of the problem by adding i.i.d. zero-mean Gaussian noise to L(θ) and assume that only noisy measurements of the cost function L are available (without gradient). The initial point is chosen at [0, 0, 0, 0]^T, and the standard deviation of the added Gaussian noise is 4.0 (roughly equal to the initial error).

We consider three different penalty functions:

Quadratic penalty function:

    P(θ) = (1/2) Σ_{j=1}^s [max{0, q_j(θ)}]².                          (11)

In this case the gradient of P(·) required in the algorithm is

    ∇P(θ) = Σ_{j=1}^s max{0, q_j(θ)} ∇q_j(θ).                          (12)

Augmented Lagrangian:

    P(θ) = (1/2r_n²) Σ_{j=1}^s { [max{0, λ_{nj} + r_n q_j(θ)}]² − λ_{nj}² }.   (13)

In this case, the actual penalty function used varies over the iterations, depending on the specific values selected for r_n and λ_n. The gradient of the penalty function required in the algorithm for the nth iteration is

    ∇P(θ) = (1/r_n) Σ_{j=1}^s max{0, λ_{nj} + r_n q_j(θ)} ∇q_j(θ).     (14)

To properly update λ_n, we adopt a variation of the multiplier method [11]:

    λ_{n+1, j} = min{ max{0, λ_{nj} + r_n q_j(θ_n)}, M },              (15)

where λ_{nj} denotes the jth element of the vector λ_n, and M ∈ ℝ⁺ is a large constant scalar. Since (15) ensures that {λ_n} is bounded, convergence of the minimum of {L_n(·)} remains valid. Furthermore, {λ_n} will be close to the true Lagrange multiplier as n → ∞.

Absolute value penalty function:

    P(θ) = max_{j=1,…,s} { max{0, q_j(θ)} }.                           (16)

As discussed earlier, we will ignore the technical difficulty that P(·) is not differentiable everywhere. The gradient of P(·), where it exists, is

    ∇P(θ) = 1{q_{J(θ)}(θ) > 0} ∇q_{J(θ)}(θ),                           (17)

where J(θ) = argmax_{j=1,…,s} q_j(θ).

For all the simulations we use the following parameter values: a_n = 0.1(n + 100)^{−0.602} and c_n = n^{−0.101}. These parameters for a_n and c_n are chosen following the practical implementation guidelines recommended in [18]. For the augmented Lagrangian method, λ_n is initialized as the zero vector. For the quadratic penalty function and the augmented Lagrangian, we use r_n = 10 n^{0.1} for the penalty parameter. For the absolute value penalty function, we consider two possible values for the constant penalty: r_n = r = 3.01 and r_n = r = 10. Note that in our experiment, r̄ = λ₁* + λ₂* = 3. Hence the first choice of r at 3.01 is close to the theoretical optimum but is not practical, since there is no reliable way to estimate r̄. The second choice of r represents a more typical scenario, where an upper bound on r̄ is estimated.

Fig. 1. Error to the optimum (‖θ_n − θ*‖) averaged over 100 independent simulations.

Figure 1 plots the errors to the optimum (averaged over 100 independent simulations) over 4000 iterations of the algorithms. The simulation results in Figure 1 suggest that the proposed algorithm with the quadratic penalty function and with the augmented Lagrangian led to comparable performance (the augmented Lagrangian method performed slightly better than the standard quadratic technique). This suggests that a more effective update scheme for λ_n than (15) is needed for the augmented Lagrangian technique. The absolute value penalty function with r = 3.01 (≈ r̄ = 3) has the best performance.

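The experiment above with the quadratic penalty (11)-(12) can be reproduced in outline as follows. The gain sequences match the values stated in the text; the random seed and the verification of the known optimum are our own additions (an illustrative sketch, not the authors' original code):

```python
import numpy as np

theta_star = np.array([0.0, 1.0, 2.0, -1.0])   # known solution, L(theta*) = -44

def L(t):
    return t[0]**2 + t[1]**2 + 2*t[2]**2 + t[3]**2 - 5*t[0] - 5*t[1] - 21*t[2] + 7*t[3]

def q(t):
    """Constraint values q_1, q_2, q_3; feasibility means q <= 0 componentwise."""
    return np.array([
        2*t[0]**2 + t[1]**2 + t[2]**2 + 2*t[0] - t[1] - t[3] - 5,
        t[0]**2 + t[1]**2 + t[2]**2 + t[3]**2 + t[0] - t[1] + t[2] - t[3] - 8,
        t[0]**2 + 2*t[1]**2 + t[2]**2 + 2*t[3]**2 - t[0] - t[3] - 10,
    ])

def grad_q(t):
    """Rows are the gradients of q_1, q_2, q_3."""
    return np.array([
        [4*t[0] + 2, 2*t[1] - 1, 2*t[2],     -1.0],
        [2*t[0] + 1, 2*t[1] - 1, 2*t[2] + 1, 2*t[3] - 1],
        [2*t[0] - 1, 4*t[1],     2*t[2],     4*t[3] - 1],
    ])

def grad_P_quadratic(t):
    """Gradient (12) of the quadratic penalty (11)."""
    return np.maximum(0.0, q(t)) @ grad_q(t)

rng = np.random.default_rng(0)
theta = np.zeros(4)                              # initial point [0, 0, 0, 0]
for n in range(1, 4001):
    a_n = 0.1 * (n + 100) ** -0.602
    c_n = n ** -0.101
    r_n = 10.0 * n ** 0.1
    delta = rng.choice([-1.0, 1.0], size=4)
    y_plus = L(theta + c_n * delta) + rng.normal(0.0, 4.0)   # noisy measurements only
    y_minus = L(theta - c_n * delta) + rng.normal(0.0, 4.0)
    g_hat = (y_plus - y_minus) / (2.0 * c_n) / delta
    theta = theta - a_n * g_hat - a_n * r_n * grad_P_quadratic(theta)

error = np.linalg.norm(theta - theta_star)
```

A single run of this sketch only illustrates the qualitative behavior; the curves in Figure 1 average 100 such independent runs.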
However, when an arbitrary upper bound on r̄ is used (r = 10), the performance is much worse than with both the quadratic penalty function and the augmented Lagrangian. This illustrates a key difficulty in the effective application of the exact penalty theorem with the absolute value penalty function.

V. CONCLUSIONS AND REMARKS

We present a stochastic approximation algorithm based on the penalty function method and a simultaneous perturbation gradient estimate for solving stochastic optimization problems with general inequality constraints. We also present a general convergence result and the associated asymptotic normality for the proposed algorithm. Numerical results are included to demonstrate the performance of the proposed algorithm with the standard quadratic penalty function and a more complicated penalty function based on the augmented Lagrangian method.

In this paper, we consider explicit constraints for which the analytical expressions of the constraint functions are available. It is also possible to apply the same algorithm with an appropriate gradient estimate for P(θ) to problems with implicit constraints, where the constraints can only be measured or estimated with possible errors. The success of this approach would depend on efficient techniques to obtain an unbiased gradient estimate of the penalty function. For example, if we can measure or estimate a value of the penalty function P(θ_n) at an arbitrary location with zero-mean error, then the SP gradient estimate can be applied. Of course, in this situation further assumptions on r_n need to be satisfied (in general, we would at least need Σ_{n=1}^∞ (a_n r_n / c_n)² < ∞). However, in a typical application, we most likely can only measure the value of the constraint q_j(θ_n) with zero-mean error. Additional bias would be present if the standard finite-difference or SP techniques were applied to estimate ∇P(θ_n) directly in this situation. A novel technique to obtain an unbiased estimate of ∇P(θ_n) based on a reasonable number of measurements is required to make the algorithm proposed in this paper feasible for dealing with implicit constraints.

VI. REFERENCES

[1] D. P. Bertsekas, Nonlinear Programming. Athena Scientific, Belmont, MA, 1995.
[2] P. Dupuis and H. J. Kushner, "Asymptotic behavior of constrained stochastic approximations via the theory of large deviations," Probability Theory and Related Fields, vol. 75, pp. 223-274, 1987.
[3] H. Kushner and D. Clark, Stochastic Approximation Methods for Constrained and Unconstrained Systems. Springer-Verlag, 1978.
[4] J. Kiefer and J. Wolfowitz, "Stochastic estimation of the maximum of a regression function," Ann. Math. Statist., vol. 23, pp. 462-466, 1952.
[5] H. Kushner and G. Yin, Stochastic Approximation Algorithms and Applications. Springer-Verlag, 1997.
[6] P. Sadegh, "Constrained optimization via stochastic approximation with a simultaneous perturbation gradient approximation," Automatica, vol. 33, no. 5, pp. 889-892, 1997.
[7] J. Hiriart-Urruty, "Algorithms of penalization type and of dual type for the solution of stochastic optimization problems with stochastic constraints," in Recent Developments in Statistics (J. Barra, ed.), pp. 183-219, North Holland Publishing Company, 1977.
[8] G. C. Pflug, "On the convergence of a penalty-type stochastic optimization procedure," Journal of Information and Optimization Sciences, vol. 2, no. 3, pp. 249-258, 1981.
[9] J. C. Spall, "Multivariate stochastic approximation using a simultaneous perturbation gradient approximation," IEEE Transactions on Automatic Control, vol. 37, pp. 332-341, March 1992.
[10] I-J. Wang and J. C. Spall, "A constrained simultaneous perturbation stochastic approximation algorithm based on penalty function method," in Proceedings of the 1999 American Control Conference, vol. 1, pp. 393-399, San Diego, CA, June 1999.
[11] D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods. Academic Press, NY, 1982.
[12] E. K. P. Chong and S. H. Żak, An Introduction to Optimization. New York, NY: John Wiley and Sons, 1996.
[13] J. C. Spall and J. A. Cristion, "Model-free control of nonlinear stochastic systems with discrete-time measurements," IEEE Transactions on Automatic Control, vol. 43, no. 9, pp. 1178-1200, 1998.
[14] I-J. Wang, Analysis of Stochastic Approximation and Related Algorithms. PhD thesis, School of Electrical and Computer Engineering, Purdue University, August 1996.
[15] Y. He, M. C. Fu, and S. I. Marcus, "Convergence of simultaneous perturbation stochastic approximation for nondifferentiable optimization," IEEE Transactions on Automatic Control, vol. 48, no. 8, pp. 1459-1463, 2003.
[16] V. Fabian, "On asymptotic normality in stochastic approximation," The Annals of Mathematical Statistics, vol. 39, no. 4, pp. 1327-1332, 1968.
[17] H.-P. Schwefel, Evolution and Optimum Seeking. John Wiley and Sons, Inc., 1995.
[18] J. C. Spall, "Implementation of the simultaneous perturbation algorithm for stochastic optimization," IEEE Transactions on Aerospace and Electronic Systems, vol. 34, no. 3, pp. 817-823, 1998.
