
MATHEMATICAL OPTIMIZATION TECHNIQUES

S. Russenschuck CERN, 1211 Geneva 23, Switzerland

Abstract

From the beginning the ROXIE program was structured such that mathematical optimization techniques can be applied to the design of superconducting magnets. With the concept of features it is possible to create the complex coil assemblies in 2 and 3 dimensions from only a small number of engineering data, which can then be addressed as design variables of the optimization problem. In this chapter some background information on the application of mathematical optimization techniques is given.

1 Historical overview

Mathematical optimization, including numerical techniques such as linear and nonlinear programming, network flow theory and dynamic optimization, has its origin in operations research developed in World War II, e.g., Morse and Kimball 1950 [45]. Most real-world optimization problems involve multiple conflicting objectives which should be considered simultaneously, so-called vector-optimization problems. The solution process for vector-optimization problems is threefold, based on decision-making methods, methods to treat nonlinear constraints, and optimization algorithms to minimize the objective function.

Methods for decision-making, based on the optimality criterion by Pareto in 1896 [48], have been introduced and applied to a wide range of problems in economics by Marglin 1966 [42], Geoffrion 1968 [18] and Fandel 1972 [12]. The theory of nonlinear programming with constraints is based on the optimality criterion by Kuhn and Tucker, 1951 [37]. Methods for the treatment of nonlinear constraints have been developed by Zoutdendijk 1960 [70], Fiacco and McCormick 1968 [13] and Rockafellar 1973 [54], among others. Numerous optimization algorithms, using both deterministic and stochastic elements, were developed in the sixties and are covered in the books by Wilde 1964 [67], Rosenbrock 1966 [55], Himmelblau 1972 [25], Brent 1973 [5], and Schwefel 1977 [62]. Recently, researchers have returned to genetic and evolutionary algorithms, as these are suited for parallel processing and for finding global optima, and are reported to be suitable for a large number of design variables, Fogel 1994 [15], Holland 1992 [26].

Mathematical optimization techniques have been applied in computational electromagnetics for decades already. Halbach 1967 [23] introduced a method for optimizing coil arrangements and pole shapes of magnets by means of finite element (FE) field calculation. Armstrong, Fan, Simkin and Trowbridge 1982 [2] combined optimization algorithms with the volume integral method for the pole-profile optimization of an H-magnet. Girdinio, Molfino, Molinari and Viviani 1983 [20] optimized the profile of an electrode. These attempts tended to be application-specific, however. Only since the late 1980s have numerical field calculation packages for both 2d and 3d applications been placed in an optimization environment. Reasons for this delay have included constraints in computing power, problems with discontinuities and nondifferentiabilities in the objective function arising from FE meshes, the accuracy of the field solution, and software implementation problems. A small selection of papers can be found in the references.

The variety of methods applied shows that no general method exists to solve nonlinear optimization problems in computational electromagnetics in the same way that the simplex algorithm exists to solve linear programming problems. There are many different applications in computational electromagnetics and each one requires its own particular procedure. Some optimization procedures are described in the following sections that have proven efficient for problems in computational electromagnetics and are provided for general use in the ROXIE program.

2 Pareto-optimality

Most real-world optimization problems involve multiple conflicting objectives that must be mutually reconciled. Characteristic for these so-called vector-optimization problems is the appearance of an objective conflict, where the individual solutions for each single objective function differ and no solution exists where all the objectives reach their individual minimum.

A vector-optimization problem in a standardized mathematical form reads:

\min_{\vec{x}} \{ f_1(\vec{x}), f_2(\vec{x}), \ldots, f_K(\vec{x}) \}    (1)

subject to

g_m(\vec{x}) \leq 0, \qquad m = 1, \ldots, M    (2)

h_n(\vec{x}) = 0, \qquad n = 1, \ldots, N    (3)

x_{il} \leq x_i \leq x_{iu}, \qquad i = 1, \ldots, I    (4)

with the design vector \vec{x} \in \mathbf{R}^I and the in general nonlinear objective functions f_k(\vec{x}) arranged in the vector \vec{f}(\vec{x}). The x_{il} and x_{iu} are the lower respectively upper bounds for the design variables. For the definition of the optimal solution of the vector-optimization problem we apply the optimality criterion by Pareto, originally introduced for problems in economics, Pareto [48], Stadler [65].

A Pareto-optimal solution \vec{x}^* is given when there exists no solution \vec{x} in the feasible domain

X = \{ \vec{x} \in \mathbf{R}^I : g_m(\vec{x}) \leq 0, \; h_n(\vec{x}) = 0, \; x_{il} \leq x_i \leq x_{iu} \}

for which

f_k(\vec{x}) \leq f_k(\vec{x}^*) \qquad \forall k \in \{1, \ldots, K\}    (5)

and

f_k(\vec{x}) < f_k(\vec{x}^*) \qquad \text{for at least one } k \in \{1, \ldots, K\}    (6)

A design where the improvement of one objective causes the degradation of at least one other objective is therefore an element of the Pareto-optimal solution set. It is clear that this definition yields a set of solutions rather than one unique solution. Fig. 1 shows a geometric interpretation of Pareto-optimal solutions for two conflicting objectives.

3 Methods of decision-making

Applying mathematical optimization routines requires a decision-making method that guarantees a solution from the Pareto-optimal solution set. Below some methods are described that have been applied to computational electromagnetics, see the author's papers [56], [57]. A comprehensive overview can be found in Cohon [7].


Fig. 1: Pareto-optimal solutions. Points 1 and 2 are not Pareto-optimal because one objective can always be improved without deteriorating the other.

3.1 Objective weighting

The objective weighting function, Kuhn and Tucker [37], is the sum of the weighted objectives and results in the minimization problem:

\min_{\vec{x} \in X} \sum_{k=1}^{K} w_k f_k(\vec{x})    (7)

with the weighting factors w_k representing the user's preference. For convex objective functions and strictly positive weighting factors, w_k > 0 for all k, it can be proved indirectly, Fandel [12], that eq. (7) is a minimization problem with a unique Pareto-optimal solution. The problem is to find the appropriate weighting factors, in particular when the objectives have different numerical values and sensitivities. Using objective weighting therefore results in an iterative solution process where a number of optimizations have to be performed with updated weighting factors.

3.2 Distance function

The problem of choosing the weighting factors appropriately also occurs when the distance function method, Charnes and Cooper [6], is applied. Most common is a least-squares objective function. The f_{rk} are the requirements for the optimum design. With the norm

\| \vec{z} \|_p = \left( \sum_i |z_i|^p \right)^{1/p}

the minimization problem reads for p = 2:

\min_{\vec{x} \in X} \sum_{k=1}^{K} w_k \left( f_k(\vec{x}) - f_{rk} \right)^2    (8)

For convex functions and for the f_{rk} taken as the minimal individual solutions it can be proved, in the same manner as for the objective weighting function, that (8) has a unique Pareto-optimal solution. The disadvantage of least-squares objective functions with the Euclidean L_2 norm is the low sensitivity for residuals smaller than one. Therefore sufficiently high weighting factors w_k have to be introduced. If the absolute-value L_1 norm is applied, the disadvantage is the nondifferentiable objective function in the optimum.

3.3 Constraint formulation

The problem with the weighting factors can be overcome by defining the problem in the constraint formulation, Marglin [42]. Only one of the objectives is minimized and the others are considered by constraints. The resulting optimization problem reads:

\min_{\vec{x}} f_a(\vec{x})    (9)

f_k(\vec{x}) - r_k \leq 0, \qquad k \in \{1, \ldots, K\}, \; k \neq a    (10)

with the additional constraints, eq. (2)-(4). The r_k represent the minimum request value specified by the user for the k-th objective. Combining (10) and (2) and, because they can be treated separately, omitting the bounds for the design variables (4) yields in vector notation, with the combined inequality constraints \tilde{\vec{g}} and right-hand sides \vec{c} and \vec{d}:

\min_{\vec{x}} f_a(\vec{x})    (11)

\tilde{\vec{g}}(\vec{x}) - \vec{c} \leq \vec{0}    (12)

\vec{h}(\vec{x}) - \vec{d} = \vec{0}    (13)

3.4 Sensitivity analysis

The constraint formulation has the advantage that a sensitivity analysis can be performed using the necessary optimality conditions at the optimum point \vec{x}^*, which read, see Luenberger [38]:

\nabla f_a(\vec{x}^*) + \sum_k \lambda_k \nabla \tilde{g}_k(\vec{x}^*) + \sum_n \mu_n \nabla h_n(\vec{x}^*) = \vec{0}    (14)

\tilde{\vec{g}}(\vec{x}^*) - \vec{c} \leq \vec{0}    (15)

\vec{h}(\vec{x}^*) - \vec{d} = \vec{0}    (16)

\vec{\lambda} \geq \vec{0}    (17)

where \tilde{\vec{g}} and \vec{h} collect the inequality and equality constraints of the constraint formulation with right-hand sides \vec{c} and \vec{d}. The \vec{\lambda} and \vec{\mu} are the vectors of the corresponding Lagrange multipliers. Equations (14)-(17) are the Kuhn-Tucker equations. The gradient of the Lagrange function has to be zero, and the Lagrange multipliers of the active inequality constraints have to take values greater than zero; otherwise it would be possible to decrease the value of a constraint without increasing the objective function, which is of course not characteristic for an optimal point. By means of the corresponding Lagrange function L it can also be proved that (11)-(13) is a minimization problem with a unique Pareto-optimal solution if all constraints are active. A non-active constraint would be equivalent to a zero weight in the weighting function.

The Lagrange multipliers are estimated by solving the linear system (14) by means of the variational problem

\min_{\vec{\lambda}, \vec{\mu}} \| \nabla f_a(\vec{x}) + \mathbf{A} \vec{\lambda} + \mathbf{B} \vec{\mu} \|_2^2    (18)

with the gradients of the constraints arranged as the columns of the matrices \mathbf{A} and \mathbf{B}. The Lagrange multipliers are a measure of the price which has to be paid when a constraint is decreased. Mathematically this relationship is expressed by [38]:

\nabla_{\vec{c}} f_a(\vec{x}^*) = -\vec{\lambda}    (19)

\nabla_{\vec{d}} f_a(\vec{x}^*) = -\vec{\mu}    (20)
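A minimal numpy sketch of the least-squares multiplier estimation of eq. (18); the toy problem and all names are invented for illustration and are not part of ROXIE:

```python
import numpy as np

# Hypothetical toy problem: min f(x) = x1^2 + x2^2  s.t.  h(x) = x1 + x2 - 1 = 0.
# At the optimum x* = (0.5, 0.5) the multiplier solves eq. (18):
#   min_mu || grad f(x*) + B mu ||^2,  with B holding the constraint gradient.
x_star = np.array([0.5, 0.5])
grad_f = 2.0 * x_star                      # gradient of the objective at x*
B = np.array([[1.0], [1.0]])               # gradient of h as a column of B

# Least-squares estimate of the multiplier:
mu, *_ = np.linalg.lstsq(B, -grad_f, rcond=None)
# mu[0] is approximately -1: relaxing the constraint level d by one unit
# changes the optimal objective value by -mu, cf. eq. (20).
```

The same least-squares estimate is what the Davidon-Fletcher-Powell check described in section 4.3.1 exploits.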

3.5 Payoff table

A tool which provides the decision maker with a lot of information about the hidden resources of a design is the payoff table. To create this table, K individual optimization problems are solved to find the best solution for each of the K objectives (\vec{x}_k^* being the minimizer of the problem \min_{\vec{x}} f_k(\vec{x})):

          f_1            f_2            ...   f_K
x_1^*     f_1(x_1^*)     f_2(x_1^*)     ...   f_K(x_1^*)
x_2^*     f_1(x_2^*)     f_2(x_2^*)     ...   f_K(x_2^*)
...       ...            ...            ...   ...
x_K^*     f_1(x_K^*)     f_2(x_K^*)     ...   f_K(x_K^*)

Table 1: Payoff table for K objectives

Best compromise solutions can then be found by minimizing the distance from the in general non-feasible "perfect" solution on the diagonal of the payoff table, cf. Fig. 1. By applying different norms, e.g. the L_1, L_2 and L_\infty norms, different optimal compromise solutions can be found. The payoff table can also help to set up constraint problems with Pareto-optimal solutions (i.e. finding feasible solutions for constraint problems with active constraints).

3.6 Fuzzy sets

Considering the often imprecise nature of judgements in multiobjective programming problems, the fuzzy set approach, Bellman and Zadeh [3], looks promising. A fuzzy subset A of X is defined by its membership function

\mu_A : X \to [0, 1]    (21)

which assigns to each element x \in X a value in the interval [0,1], where the value of \mu_A(x) represents the degree of membership of x in A. The idea behind fuzzy set theory is therefore not whether an element is in the subset or not, but whether it is more or less a member of the subset. A constraint which may be expressed by, e.g., "the value has to be considerably larger than 10" could be associated with the membership function

\mu_A(x) = \begin{cases} 0 & x \leq 10 \\ \left( 1 + (x - 10)^{-2} \right)^{-1} & x > 10 \end{cases}    (22)

Bellman and Zadeh [3] introduced three basic concepts: fuzzy goal, fuzzy constraint and fuzzy decision. Let G_1, \ldots, G_m be the m fuzzy goals represented by their membership functions \mu_{G_1}, \ldots, \mu_{G_m} and C_1, \ldots, C_m the m fuzzy constraints represented by their membership functions \mu_{C_1}, \ldots, \mu_{C_m}; then the fuzzy decision is the element with the maximum degree of membership of the intersection of the fuzzy goals and constraints:

\max_x \mu_D(x) = \max_x \min \left( \mu_{G_1}(x), \ldots, \mu_{G_m}(x), \mu_{C_1}(x), \ldots, \mu_{C_m}(x) \right)    (23)

There are two drawbacks to the method, however. The first is the appropriate choice of the membership functions to be associated with fuzzy statements like small, big, very big, lower, considerably lower, etc. The second is the flat function topology in areas with violated constraints, where the membership function of the intersection is zero, thus making the application of algorithms that can cope with such flat topologies necessary.
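The max-min decision of eq. (23) is easy to state in code. A small sketch (the second membership function and the search grid are invented for illustration):

```python
def mu_larger_than_10(x):
    """Membership function of eq. (22): 'considerably larger than 10'."""
    return 0.0 if x <= 10 else 1.0 / (1.0 + (x - 10.0) ** -2)

def fuzzy_decision(x, goals, constraints):
    """Degree of membership of the fuzzy decision (eq. 23): the minimum
    (intersection) of all goal and constraint membership values at x."""
    return min(m(x) for m in list(goals) + list(constraints))

# Example: goal 'x considerably larger than 10', hypothetical constraint
# 'x below about 20' with a smoothly decaying membership beyond 20:
mu_below_20 = lambda x: 1.0 if x <= 20 else 1.0 / (1.0 + (x - 20.0) ** 2)

best = max(range(11, 31),
           key=lambda x: fuzzy_decision(x, [mu_larger_than_10], [mu_below_20]))
# The decision settles at the boundary where goal and constraint balance.
```

Note the flat-topology drawback: wherever any membership value is zero, `fuzzy_decision` is zero and gives the optimizer no gradient information.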

4 Solution methods

4.1 Bounds for design variables

As the design variables of the optimization problem can usually only be varied between upper and lower bounds, a modified objective function is applied:

F(\vec{x}) = \begin{cases} f(\vec{x}) & \text{no bound violated} \\ f(\vec{x}^*) + P(\vec{x}) & \text{bound violated} \end{cases}    (24)

where \vec{x}^* is the design vector projected onto the bounds, i.e. x_i^* = x_{iu} if x_i > x_{iu} (upper bound violated) and x_i^* = x_{il} if x_i < x_{il} (lower bound violated). The added penalty term reads:

P(\vec{x}) = \sum_i \begin{cases} r_i (x_i - x_{iu}) & \text{if } x_i > x_{iu} \\ r_i (x_{il} - x_i) & \text{if } x_i < x_{il} \\ 0 & \text{otherwise} \end{cases}    (25)

with sufficiently high penalty parameters r_i. The advantage of this procedure is that the violation of the bounds is checked before a function evaluation is carried out (violated bounds may even cause an abort of the numerical field calculation routines), and that the existing algorithms for unconstrained minimization can be applied without modifications.
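The projection-plus-penalty scheme of eqs. (24)-(25) can be wrapped around any objective. A minimal sketch (not the ROXIE implementation; names are illustrative):

```python
def bounded_objective(f, lower, upper, penalty=1e6):
    """Wrap an objective with the bound handling of eqs. (24)-(25):
    project violated components onto the bounds, evaluate f at the
    projected point, and add a linear penalty for the violation."""
    def F(x):
        projected = [min(max(xi, lo), up)
                     for xi, lo, up in zip(x, lower, upper)]
        violation = sum(abs(xi - pi) for xi, pi in zip(x, projected))
        if violation == 0.0:
            return f(x)                         # no bound violated: plain f
        return f(projected) + penalty * violation
    return F

sphere = lambda x: sum(xi ** 2 for xi in x)
F = bounded_objective(sphere, lower=[-1.0, -1.0], upper=[1.0, 1.0])
# F([0.5, 0.5]) equals sphere([0.5, 0.5]); F([2.0, 0.0]) adds a large penalty.
```

Crucially, the expensive function `f` is never called at an out-of-bounds point, mirroring the check-before-evaluation behavior described above.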

4.2 Treatment of nonlinear constraints

With the constraint formulation, the problem of the treatment of the nonlinear constraints arises. One method is the transformation of the constrained problem into an unconstrained problem by means of a penalty function method. The main idea is to add a penalty term to the objective function which depends on the degree to which the constraints are violated and vanishes if the constraints are satisfied.

The optimization problem (11)-(13) transformed into the penalty function reads:

P(\vec{x}) = f_a(\vec{x}) + \sum_k r_k \left( \max[0, \tilde{g}_k(\vec{x})] \right)^2 + \sum_n r_n \left( h_n(\vec{x}) \right)^2    (26)

In order to prove the feasibility of the result, the penalty factors r_k, r_n have to be chosen infinite. Large penalty factors, however, lead to ill-conditioned problems. Replacing the square terms in (26) by the modulus results in the exact penalty function. Here the weighting factors can be finite (r_k > |\lambda_k|, r_n > |\mu_n|, the \lambda and \mu being the Lagrange multipliers, eq. (19, 20)) to guarantee feasibility of the solution. The disadvantage is the nondifferentiability of the objective function in the optimum, the same problem as when applying the distance function method with the L_1 norm.

It was therefore proposed to solve a sequence of unconstrained minimization problems with increasing penalty factors. This is called the sequential unconstrained minimization technique, SUMT, Fiacco and McCormick [13]. However, as the objective function is discontinuous from step to step, the optimization algorithm has to be restarted with updated termination criteria. In practical applications it seems more reliable to adjust the penalty factors once and keep these values for the subsequent runs. Using finite penalty factors introduces a certain fuzziness into the solution; the constraints will not be exactly fulfilled.

If the penalty term is not added to the objective function but to the Lagrangian function, the constrained problem (11)-(13) can be solved by minimizing the augmented Lagrangian function, Rockafellar [54]:

L_A(\vec{x}, \vec{\lambda}, \vec{\mu}) = f_a(\vec{x}) + \sum_n \left( \mu_n h_n(\vec{x}) + r_n (h_n(\vec{x}))^2 \right) + \sum_k \frac{1}{4 r_k} \left( \left( \max[0, \lambda_k + 2 r_k \tilde{g}_k(\vec{x})] \right)^2 - \lambda_k^2 \right)    (27)

This leads to an iterative procedure with

1. Estimation of a suitable penalty factor and estimation of the Lagrange multipliers by minimizing (18).

2. Minimization of the augmented Lagrangian function (27) in \vec{x}.

3. Updating the Lagrange multipliers by \lambda_k \leftarrow \max[0, \lambda_k + 2 r_k \tilde{g}_k(\vec{x})] and \mu_n \leftarrow \mu_n + 2 r_n h_n(\vec{x}), and returning to the second step.

Although this procedure leads to well-conditioned function topologies, the convergence of the procedure is very dependent on the accuracy with which step 2 is performed, as step 3 assumes \nabla_x L_A = 0, which is only true if the minimum of step 2 is found.

4.3 Algorithms for the minimization of scalar unconstrained objective functions

The objective weighting function, the distance function and the fuzzy set decision allow the immediate application of an algorithm for finding the minimum of an unconstrained objective function. It is most important to find the suitable minimization method to fit the method of decision-making and the treatment of the nonlinear constraints. The special problem of optimization in electromagnetism is the time-consuming evaluation of the objective function (the electromagnetic field) using the finite-element method. The advantages of stochastic algorithms are the possible treatment of problems with a high number of design variables, the possibility to treat non-convex and discrete problems, and the ease of use. The deterministic algorithms converge much faster (usually around 200 function evaluations) if the search is started from a carefully chosen design (we recall that in the literature, e.g. Schwefel [62], Himmelblau [25], usually more than 1000 function evaluations are carried out for test examples). Some of the algorithms frequently used for our optimization problems in the design of superconducting magnets are described below.

4.3.1 Deterministic methods

The optimization algorithm EXTREM by Jacob [31] consists of one-dimensional minimizations by means of Powell extrapolations [51] in a main search direction (user supplied) and an orthogonal direction evaluated by Gram-Schmidt orthogonalization. After these one-dimensional searches have been carried out (end of a search step), the main search direction is updated by a vector pointing from the initial point to the minimum of the search step. The user has to supply an initial step size, taken relative to the range between the lower and upper bounds. This user-friendly algorithm is suitable for practically all applications including unconstrained scalar functions, distance functions, penalty functions, and augmented Lagrange functions. Besides the initial step size there are no user-supplied parameters which could influence the convergence rate.

The Levenberg-Marquardt algorithm was originally developed for nonlinear regression problems using least-squares objective functions. It can efficiently be applied to the minimization of the distance function. Writing the objective function in vector notation as

f(\vec{x}) = \vec{r}(\vec{x})^T \vec{r}(\vec{x})

with the residuals r_k(\vec{x}) = f_k(\vec{x}) - f_{rk} arranged in the vector \vec{r}(\vec{x}) and the Jacobi matrix \mathbf{J}(\vec{x}) of \vec{r}(\vec{x}), we get

\nabla f(\vec{x}) = 2 \mathbf{J}(\vec{x})^T \vec{r}(\vec{x})    (28)

\nabla^2 f(\vec{x}) = 2 \mathbf{J}(\vec{x})^T \mathbf{J}(\vec{x}) + 2 \sum_k r_k(\vec{x}) \nabla^2 r_k(\vec{x})    (29)

Using a quadratic approximation of f(\vec{x}) and neglecting the second term in (29), step size and direction are given by:

\vec{d} = - \left[ \mathbf{J}(\vec{x})^T \mathbf{J}(\vec{x}) + \nu \mathbf{I} \right]^{-1} \mathbf{J}(\vec{x})^T \vec{r}(\vec{x})    (30)

The term \nu \mathbf{I} can be regarded as an approximation for the neglected term. With a high \nu the algorithm takes small steps in the direction of steepest descent; \nu is decreased during the optimization procedure, so that the steps approach the Gauss-Newton direction, because the neglected term gets less and less important with smaller and smaller residuals.
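One Levenberg-Marquardt step of eq. (30) is a single linear solve. A numpy sketch on an invented linear toy problem (so the Jacobian is trivially the identity; all names are illustrative):

```python
import numpy as np

def lm_step(residual, jacobian, x, nu):
    """One Levenberg-Marquardt step (eq. 30): solve
    (J^T J + nu*I) d = -J^T r for the step d."""
    r = residual(x)
    J = jacobian(x)
    A = J.T @ J + nu * np.eye(len(x))
    return np.linalg.solve(A, -J.T @ r)

# Toy least-squares problem: r(x) = [x1 - 2, x2 + 3] -> solution (2, -3).
residual = lambda x: np.array([x[0] - 2.0, x[1] + 3.0])
jacobian = lambda x: np.eye(2)

x = np.zeros(2)
nu = 10.0
for _ in range(50):
    x = x + lm_step(residual, jacobian, x, nu)
    nu = max(nu * 0.5, 1e-12)   # decrease nu as the residuals shrink
# x converges to (2, -3): small damped steps at first, Gauss-Newton later.
```

In a real application `nu` is adapted based on whether the step actually reduced the residual norm; the fixed halving schedule here is a simplification.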

The Davidon-Fletcher-Powell algorithm uses a quadratic approximation of the objective function where step size and direction are given by \vec{d} = -\mathbf{H} \nabla f(\vec{x}). The approximation \mathbf{H} of the inverse Hessian matrix does not have to be calculated in each point but is updated iteratively, beginning with the unity matrix \mathbf{I}. The derivatives of the objective function have to be approximated by differential quotients, though, which makes the convergence behavior dependent upon the function topology. In recent publications an efficient method for the calculation of the derivatives in FE solutions has been proposed by Gitosusastro et al. [21] and Park et al. [49]. The idea is as follows: The objective function can be expressed as f(x, \vec{A}(x)) with the vector potential \vec{A} at the nodes of the finite element mesh. Then we get for a design variable x the total derivative

\frac{df}{dx} = \frac{\partial f}{\partial x} + \left( \frac{\partial f}{\partial \vec{A}} \right)^T \cdot \frac{d\vec{A}}{dx}    (31)

f is an explicit function of x and \vec{A}, and therefore the terms \partial f / \partial x and \partial f / \partial \vec{A} are known. Differentiating the system equation \mathbf{S} \vec{A} = \vec{T} yields:

\mathbf{S} \cdot \frac{d\vec{A}}{dx} = \frac{\partial \vec{T}}{\partial x} - \frac{\partial \mathbf{S}}{\partial x} \vec{A}    (32)

As the solution for \vec{A} is already known from the FE solution, we have a linear equation system for d\vec{A}/dx without another FE calculation, as \partial \mathbf{S} / \partial x and \partial \vec{T} / \partial x can be computed directly from the finite element mesh.

Even if these procedures are not available in the FE package applied, the Davidon-Fletcher-Powell algorithm is very well suited to check the optimality conditions by means of a Lagrange multiplier estimation, minimizing the variational problem (18). As here the "design variables" are the Lagrange multipliers \vec{\lambda} and \vec{\mu}, the derivatives can be approximated with a high accuracy, thus resulting in a good convergence behavior.
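The direct sensitivity scheme of eqs. (31)-(32) can be demonstrated on a tiny invented 2x2 "system matrix" standing in for an FE system (none of this is ROXIE code; the matrices are chosen only so the result can be checked against a finite difference):

```python
import numpy as np

# Hypothetical system S(x) A = T with S = [[x, 1], [1, 2]], T = [1, 0],
# and objective f = A_1^2 (square of the first "node" value).
def solve_state(x):
    S = np.array([[x, 1.0], [1.0, 2.0]])
    T = np.array([1.0, 0.0])
    return S, T, np.linalg.solve(S, T)

def df_dx(x):
    S, T, A = solve_state(x)
    dS = np.array([[1.0, 0.0], [0.0, 0.0]])  # dS/dx, known analytically
    dT = np.zeros(2)                         # dT/dx
    dA = np.linalg.solve(S, dT - dS @ A)     # eq. (32): same matrix, new rhs
    df_dA = np.array([2.0 * A[0], 0.0])      # df/dA for f = A_1^2
    return float(df_dA @ dA)                 # eq. (31); explicit df/dx is 0 here

# Verify against a central finite difference:
f_of = lambda x: solve_state(x)[2][0] ** 2
fd = (f_of(3.0 + 1e-6) - f_of(3.0 - 1e-6)) / 2e-6
```

The key point carried over from the text: eq. (32) reuses the already-assembled (and, in practice, already-factorized) system matrix, so each design-variable sensitivity costs only one extra back-substitution, not a new field solution.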

4.3.2 Stochastic methods

Genetic algorithms are specially suited to solve discrete problems, as each trial solution is coded as a vector (chromosome) \vec{x} with elements described as genes. Holland [26] suggested that the chromosomes should be represented by binary strings. From two sets of chromosomes, offspring are produced by the use of genetic operators such as crossover and mutation. Crossover means the interchanging of sections of the chromosomes between two parent chromosomes. The position where the chromosomes are split into two sections is chosen randomly. Mutation means that with a certain likelihood single bits in the chromosomes are flipped. The third operator is the random selection, where the chance of selection is proportional to the individual's fitness (objective function value). Therefore, even the weakest individual has a chance of being selected. The principle and usage of this algorithm is explained in detail in the next chapter.

Evolution strategies go back to the work by Schwefel [62], Rechenberg [53] and Fogel [16]. The methods are based on the link between reproductive populations rather than genetic links. The representation of the individuals is done with floating-point vectors (\vec{x}, \vec{\sigma}), where \vec{\sigma} is a vector of standard deviations (in accordance with the biological fact that smaller changes occur more often than larger ones). From a population of \mu parents (multi-membered evolution strategy), the offspring is created by adding to each component of \vec{x} a Gaussian random variable with mean zero and standard deviation \sigma_i:

x_i^{new} = x_i + N(0, \sigma_i)    (33)

From the offspring, the \mu vectors that represent the lowest objective function values (for minimization) are chosen as parents of the next generation. Different from the genetic approach, least fit individuals are immediately removed from the population. The remaining ones all have the same mating probabilities.

In the multi-membered evolution strategy we find a similarity to the crossover in genetic algorithms, here called recombination, where some elements of the design variable vector are swapped between two members of the population. An extension is the (\mu + \lambda) strategy, where \mu parents produce \lambda offspring; the parents survive and compete with the offspring. If the parents are completely replaced in each generation we have the so-called (\mu, \lambda) strategy. The problem with the evolution strategy is the choice of the number of parents and offspring and the adjustment of the search step size.
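A minimal (mu + lambda) strategy with the Gaussian mutation of eq. (33) can be sketched as follows (a deliberately simplified illustration: one shared, deterministically shrinking step size instead of self-adapted per-component sigmas, and no recombination):

```python
import random

def evolution_strategy(f, dim, mu=5, lam=20, sigma=0.5,
                       generations=200, seed=1):
    """Minimal (mu + lambda) evolution strategy sketch (eq. 33): mutate
    parents with Gaussian noise, then let parents and offspring compete."""
    rng = random.Random(seed)
    population = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(mu)]
    for _ in range(generations):
        offspring = []
        for _ in range(lam):
            parent = rng.choice(population)
            offspring.append([xi + rng.gauss(0.0, sigma) for xi in parent])
        # (mu + lambda) selection: parents survive and compete with offspring
        population = sorted(population + offspring, key=f)[:mu]
        sigma *= 0.98    # simplistic deterministic step-size reduction
    return population[0]

best = evolution_strategy(lambda x: sum(xi ** 2 for xi in x), dim=2)
# best approaches the minimum (0, 0) of the sphere test function.
```

The step-size adjustment singled out as the main practical difficulty in the text is precisely what the crude `sigma *= 0.98` line papers over here; real strategies self-adapt the sigma vector.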

The simulated annealing algorithm, Kirkpatrick et al. [34], simulates the slow cool-down of thermodynamic systems in order to achieve the lowest energy state. Starting from a given search point \vec{x}, new design variable vectors \vec{x}_{new} are created by applying random moves along each coordinate direction. Let \Delta f = f(\vec{x}_{new}) - f(\vec{x}); if \Delta f < 0 the new point is accepted, else it is accepted with a probability of e^{-\Delta f / T} with a certain temperature T. For a high temperature basically all new trials are accepted, whereas for T = 0 only new points with lower function values are accepted, which does not yield a global optimum (comparable to a rapidly cooled material which does not show the crystalline state of lowest energy but glass-like intrusions). The problem here is the choice of the starting temperature and of the cool-down process.

REFERENCES

[1] Appelbaum, J., Fuchs, E.F., White, J.C.: Optimization of three-phase induction motor design, IEEE Transactions on Energy Conversion, 1987

[2] Armstrong, A.G.A.M. , Fan, M.W., Simkin, J., Trowbridge, C.W.: Automated optimization of magnet design using the boundary integral method. IEEE-TMAG, 1982

[3] Bellman, R.E., Zadeh, L.A.: Decision making in a fuzzy environment, Management Science, 1970

[4] Bertsekas, D.P.: Constrained optimization and Lagrange multiplier methods, Academic Press, 1982

[5] Brent, R.P.: Algorithms for minimization without derivatives, Prentice Hall, 1973

[6] Charnes, A., Cooper, W.: Management Models and Industrial Applications of Linear Programming, Wiley, 1961

[7] Cohon, J.L.: Multiobjective Programming and Planning, Academic Press, New York, 1978

[8] Davidon, W.C.: Variance algorithm for minimization, Computer Journal, 1968

[9] Di Barba, P. Navarra, P., Savini, A., Sikora, R.: Optimum design of iron-core electromagnets. IEEE-TMAG, 1990

[10] Dughiero, F., Guarnieri, M., Lupi, S. An optimization procedure for electromagnetic confinement and levitation systems. IEEE-TMAG ,1993

[11] Dyck, D.N., Lowther, D.A.: The optimization of electromagnetic devices using sensitivity information. Proceedings of the 6th International IGTE Symposium, Graz Technical University, 1994

[12] Fandel, G.: Optimale Entscheidung bei mehrfacher Zielsetzung, Springer, Berlin 1972

[13] Fiacco, A.V., McCormick, G.P.: Sequential unconstrained minimization techniques, Wiley, 1968

[14] Fletcher, R., Powell, M.J.D.: A rapidly convergent descent method for minimization, Computer Journal, Vol. 6, 1963

[15] Fogel, D.B.: An Introduction to Simulated Evolutionary Optimization, IEEE Transactions on Neural Networks, 1994

[16] Fogel L.J.: ”Autonomous automata”, Industrial Research, 1962

[17] Fletcher, R., Reeves, C.M.: Function minimization by conjugate gradients, Computer Journal, Vol. 7, 1964

[18] Geoffrion, A.M.: Proper Efficiency and the Theory of Vector Maximization, Journal of Mathemat- ical Analysis and Applications, 1968

[19] Gill, P.E., Murray, W., Wright, M.H.: Practical optimization, Academic Press, 1981

[20] Girdinio, P., Molfino, P., Molinari, G., Viviani, A.: A package for computer aided design for power electrical engineering. IEEE-TMAG, 1983

[21] Gitosusastro, S., Coulomb, J.L., Sabonnadiere, J.C.: Performance derivative calculations and optimization process. IEEE-TMAG, 1989

[22] Gottvald, A.: Comparative analysis of optimization methods for magnetostatics, IEEE-TMAG, 1988

[23] Halbach, K.: A Program for Inversion of System Analysis and its Application to the Design of Mag- nets, Proceedings of the International Conference on Magnet Technology (MT2), The Rutherford Laboratory, 1967

[24] Hestenes, M.R.: The conjugate-gradient method for solving linear systems, Proceedings of the Symposium on Applied Mathematics, Vol. VI, McGraw Hill, 1956

[25] Himmelblau, D.M.: Applied nonlinear Programming, McGraw-Hill, 1972

[26] Holland, J.H.: Genetic algorithms, Scientific American, 1992

[27] Holsinger R.F., Iselin Ch.: The CERN-POISSON Program Package User Guide, Cern Computer Center, Program Library, 1984

[28] Hooke, R., Jeeves, T.A.: Direct search solution of numerical and statistical problems, Journal of the Association for Computing Machinery, Vol. 8, 1961

[29] Hoole, S.R.H. Optimal design, inverse problems and parallel computers. IEEE-TMAG, 1991

[30] Ishiyama, A., Yokoi, T., Takamori, S., Onuki, T.: An optimal design technique for MRI superconducting magnets using mathematical programming method. IEEE-TMAG, 1988

[31] Jacob, H.G.: Rechnergestützte Optimierung statischer und dynamischer Systeme, Springer, 1982

[32] Kasper, M.: Optimization of FEM Models by Stochastic Methods, International Journal of Applied Electromagnetics in Materials 4, Elsevier, 1993

[33] Kickert, J.M.: Fuzzy theories on decision-making, Frontiers in Systems Research, Martinus Nijhoff Social Science Division, Leiden, 1978

[34] Kirkpatrick, S., Gelatt, C.D., Vecchi, M.P.: Optimization by simulated annealing, Science 220, 1983

[35] Krebs, M.: Vergleich unterschiedlicher Methoden zur Behandlung nichtlinearer Restriktionen in Optimierungsproblemen, Diplomarbeit am Institut fuer Elektrische Energiewandlung, Technische Hochschule Darmstadt, Germany, 1990.

[36] Kuester, J., Mize, J.H.: Optimization techniques with Fortran, Mc Graw-Hill, 1973

[37] Kuhn, H.W., Tucker, A.W.: Nonlinear Programming, Proceedings of the 2nd Berkeley Symposium on Mathematical Statistics and Probability, University of California, Berkeley, 1951

[38] Luenberger, D.G.: Introduction to linear and nonlinear programming, Addison-Wesley, 1973

[39] Leboucher, L., Marty, P., Alemany, A., Masse, P. An inverse method in electromagnetism applied to the optimization of inductors. IEEE-TMAG, 1992

[40] Liu, J., Freeman, E.M. Yang, X., Sheng, J. Optimization of electrode shape using the boundary element method. IEEE-TMAG, 1990

[41] Manella, A., Nervi, M., Repetto, M.: Response Surface Method in Magnetic Optimization, International Journal of Applied Electromagnetics in Materials 4, Elsevier, 1993

[42] Marglin, S.A.: Objectives of Water-Resource Development in Maass, A. et al. Design of Water- Resource Systems, Cambridge, 1966

[43] Marquardt, D.W.: An algorithm for least-squares estimation of nonlinear parameters, Journal of the Society for Industrial and Applied Mathematics, 1963

[44] Mohammed, O.A., Park, D.C., Uler, F.G., Ziqiang, C.: Design optimization of electromagnetic devices using artificial neural networks. IEEE Transactions on Magnetics, 1992

[45] Morse P.M., Kimball G.E.: Methods of operations research, Wiley & Sons, New York, 1950.

[46] Nelder, J.A., Mead, R.: A simplex method for function minimization, Computer Journal, Vol. 7, 1964

[47] Ogle, M.D., D'Angelo, J.: Design optimization method for a ferromagnetically self-shielded MR magnet. IEEE-TMAG, 1991

[48] Pareto, V.: Cours d’Economie Politique, Pouge 1896 or translation by Schwier, A.S.: Manual of Political Economy, The Macmillan Press, 1971

[49] Park, I.-h., Lee, B.-t., Hahn, S.-Y.: Design sensitivity analysis for nonlinear magnetostatic problems using finite element method, IEEE-TMAG, 1992

[50] Perin, R.: The Superconducting Magnet System for the LHC, IEEE-TMAG, 1991

[51] Powell, M.J.D.: An efficient method for finding the minimum of a function of several variables without calculating derivatives, Computer Journal, 1964

[52] Preis, K., Biro, O., Friedrich, M. Gottvald, A., Magele, C. Comparison of different optimization strategies in the design of electromagnetic devices. IEEE-TMAG, 1991

[53] Rechenberg, I.: Evolutionsstrategie: Optimierung technischer Systeme nach Prinzipien der biolo- gischen Evolution, Frommann Verlag, 1973

[54] Rockafellar, R.T.: The multiplier method of Hestenes and Powell applied to convex programming, Journal of Optimization Theory and Applications, Vol 12, 1973

[55] Rosenbrock, H.H.: An automatic method for finding the greatest or least value of a function, Com- puter Journal, Vol. 3, 1960

[56] Russenschuck, S.: Mathematical optimization techniques for the design of permanent magnet synchronous machines based on numerical field calculation, IEEE-TMAG, Vol-Mag 26, 1990

[57] Russenschuck, S.: Application of Lagrange - multiplier estimation to the design optimization of permanent magnet synchronous machines, IEEE-TMAG, Vol-Mag 28, 1992

[58] Russenschuck, S., Tortschanoff, T.: Optimization of the Coil of the LHC main Dipole, IEEE TMAG, Vol-Mag 29, 1993

[59] Saldanha, R.R., Coulomb, J.L., Foggia, A., Sabonnadiere, J.C. A dual method for constrained optimization design in magnetostatic problems. IEEE-TMAG, 1991

[60] Sattler, H.J.: Ersatzprobleme für Vektoroptimierungsaufgaben und ihre Anwendung in der Strukturmechanik, Fortschritt-Berichte VDI, 1982

[61] Schaefer-Jotter, M., Mueller, W.: Optimization of electrotechnical devices using a numerical laboratory, IEEE-TMAG, 1990

[62] Schwefel, H.P.: Numerische Optimierung von Computer-Modellen mittels der Evolutionsstrategie, Birkhäuser Verlag, 1977

[63] Siebold, H. , Huebner, H., Soeldner, L., Reichert, T.: Performance and results of a computer pro- gram for optimizing magnets with iron. IEEE-TMAG, 1988

[64] Simkin, J., Trowbridge, C.W.: Optimization problems in electromagnetics. IEEE-TMAG, 1991

[65] Stadler, W.: Preference optimality and application of Pareto-optimality in Marzollo, Leitmann: Multicriterion Decision Making, CISM Courses and Lectures, Springer, 1975

[66] Takahashi, N., Nakata, T., Uchiyama, N.: Optimal design method of 3-D nonlinear magnetic circuit by using magnetization integral equation method, IEEE-TMAG, 1989

[67] Wilde, D.G.: Optimum seeking methods, Prentice Hall, 1964

[68] Weeber, K., Johnson, E., Sinniah, S., Holte, K., Sabonnadiere, J.C., Hoole, S.R.H.: Design sensitivity for skin effect and minimum volume optimization of grounded and ungrounded shields. IEEE-TMAG, 1992

[69] Yokoi, T., Ebihara, D.: An optimal design technique for high speed single-sided linear induction motors using mathematical programming method. IEEE-TMAG, 1989

[70] Zoutdendijk, G.: Methods of feasible directions: a study in linear and nonlinear programming, Elsevier, 1960
