A novel elitist multiobjective optimization algorithm: multiobjective extremal optimization

Min-Rong Chen*, Yong-Zai Lu

Department of Automation, Shanghai Jiao Tong University, Dongchuan Road 800, Class A0403291, 200240 Shanghai, China

Abstract: Recently, a general-purpose local-search heuristic method called Extremal Optimization (EO) has been successfully applied to some NP-hard combinatorial optimization problems. This paper investigates the application of EO to multiobjective optimization and proposes a novel elitist (1+λ) multiobjective algorithm, called Multiobjective Extremal Optimization (MOEO). To extend EO to multiobjective optimization problems, the Pareto dominance strategy is introduced into the fitness assignment of the proposed approach. We also present a new hybrid mutation operator that enhances the exploratory capabilities of our algorithm. The proposed approach is validated using five popular benchmark functions. The simulation results indicate that the proposed approach is highly competitive with the state-of-the-art multiobjective evolutionary algorithms. Thus MOEO can be considered a good alternative for solving multiobjective optimization problems.

Keywords: Multiple objective programming; Extremal optimization; Self-organized criticality

1. Introduction

Most real-world engineering optimization problems are multiobjective in nature, since they normally have several (possibly conflicting) objectives that must be satisfied at the same time. Instead of aiming at a single solution, multiobjective optimization methods try to produce a set of good trade-off solutions from which the decision maker may select one. The Operations Research (OR) community has developed many mathematical programming techniques to solve multiobjective optimization problems (MOPs) since the 1950s [1,2]. However, mathematical programming techniques have some limitations when dealing with MOPs [3]. For example, many of them may not work when the Pareto front is concave or disconnected. Some require differentiability of the objective functions and the constraints. In addition, most of them only generate a single solution from each run.

______
*Corresponding author. E-mail address: [email protected]

During the past two decades, a considerable number of multiobjective evolutionary algorithms (MOEAs) have been presented to solve MOPs [3-7]. Evolutionary algorithms seem particularly suitable for MOPs because they deal with a set of possible solutions (a so-called population) simultaneously. This allows several members of the Pareto-optimal set to be found in a single run of the algorithm [3]. There is also no requirement for differentiability of the objective functions and the constraints. Moreover, evolutionary algorithms are less susceptible to the shape of the Pareto front and can easily deal with discontinuous or concave Pareto fronts.

Recently, a general-purpose local-search heuristic algorithm named Extremal Optimization (EO) was proposed by Boettcher and Percus [8]. EO is based on the Bak-Sneppen (BS) model [9], which shows the emergence of self-organized criticality (SOC) [10] in ecosystems. The evolution in this model is driven by a process in which the weakest species in the population, together with its nearest neighbors, is always forced to mutate. The dynamics of this extremal process exhibits the characteristics of SOC, such as punctuated equilibrium [9]. EO opens the door to applying non-equilibrium processes, whereas simulated annealing (SA) applies equilibrium statistical mechanics. In contrast to genetic algorithms (GAs), which operate on an entire "gene pool" of a huge number of possible solutions, EO successively eliminates the worst components in sub-optimal solutions. Its large fluctuations provide significant hill-climbing ability, which enables EO to perform well particularly at phase transitions. EO has been successfully applied to some NP-hard combinatorial optimization problems such as graph bi-partitioning [8], graph coloring [11], spin glasses [12], MAXSAT [13], production scheduling [14,15], function optimization [16] and dynamic combinatorial problems [17].

So far there have been some studies of multiobjective optimization using extremal dynamics. Ahmed and Elettreby [18,19] introduced a random version of the BS model. They also generalized the single-objective BS model to a multiobjective one by the weighted sum aggregation method. That method is easy to implement, but its most serious drawback is that it cannot generate proper members of the Pareto-optimal set when the Pareto front is concave, regardless of the weights used [20]. Galski et al. [21] applied the Generalized Extremal Optimization (GEO) algorithm [22] to design a spacecraft thermal control system. The design procedure was tackled as a multiobjective optimization problem, and they also resorted to the weighted sum aggregation method to solve it. In order to extend GEO to solve MOPs effectively, Galski et al. [23] further presented a revised multiobjective version of the GEO algorithm, called M-GEO. The M-GEO algorithm does not use the weighted sum aggregation method. Instead, the Pareto dominance concept was introduced to M-GEO in order to find the approximate Pareto front, which is stored and updated during each run. Since the fitness assignment in M-GEO is not based on the Pareto dominance strategy, M-GEO belongs to the non-Pareto approaches [23]. M-GEO was successfully applied to the inverse design of a remote sensing satellite constellation [23].

In this work, we develop a novel elitist multiobjective optimization method, called Multiobjective Extremal Optimization (MOEO).
Our approach does not use the weighted sum aggregation method to solve MOPs. Instead, we adopt a fitness assignment method based on the Pareto dominance strategy; thus, MOEO is a Pareto-based approach. It is interesting to note that, similar to the (1+λ) Pareto Archived Evolution Strategy (PAES) [24], MOEO is a single-parent, λ-offspring multiobjective optimization algorithm. Furthermore, we propose a new hybrid mutation operator that enhances the exploratory capabilities of our algorithm. Our approach has been validated using five benchmark functions reported in the specialized literature and compared with four competitive MOEAs: the Nondominated Sorting Genetic Algorithm-II (NSGA-II) [25], the Pareto Archived Evolution Strategy (PAES) [24], the Strength Pareto Evolutionary Algorithm (SPEA) [26] and the Strength Pareto Evolutionary Algorithm 2 (SPEA2) [27]. The simulation results demonstrate that the proposed approach is highly competitive with the state-of-the-art MOEAs. Hence, MOEO may be a good alternative for solving multiobjective optimization problems.

This paper is organized as follows. In Section 2, we give the problem formulation of the multiobjective optimization problem. Section 3 describes four state-of-the-art elitist multiobjective evolutionary algorithms. The extremal optimization algorithm is introduced in Section 4. In Section 5, we propose the MOEO algorithm and describe its main components in detail. In Section 6, the proposed approach is validated using five popular benchmark functions, and the simulation results are compared with those of four state-of-the-art multiobjective evolutionary algorithms. Finally, Section 7 concludes the paper with an outlook on future work.

2. Problem formulation

Without loss of generality, MOPs are mathematically defined as follows: find $x$ which minimizes

$F(x) = (f_1(x), f_2(x), \ldots, f_k(x))$

subject to:

$g_i(x) \ge 0, \quad i = 1, 2, \ldots, m$
$h_j(x) = 0, \quad j = 1, 2, \ldots, p$    (1)

where $x = (x_1, x_2, \ldots, x_n)^T \in \Omega$ is a vector of decision variables, each decision variable being bounded by lower and upper limits $l_l \le x_l \le u_l$, $l = 1, \ldots, n$; $k$ is the number of objectives, $m$ is the number of inequality constraints and $p$ is the number of equality constraints. The following four concepts are of importance [28]:

Definition 1. Pareto dominance: A vector $u = (u_1, \ldots, u_k)$ is said to dominate another vector $v = (v_1, \ldots, v_k)$ (denoted by $u \prec v$) if and only if $u$ is partially less than $v$, i.e.,

$\forall i \in \{1, \ldots, k\}: u_i \le v_i \ \wedge \ \exists j \in \{1, \ldots, k\}: u_j < v_j$.

Definition 2. Pareto optimality: A solution $x \in \Omega$ is said to be Pareto optimal with respect to $\Omega$ if and only if there is no $x' \in \Omega$ for which $v = F(x') = (f_1(x'), \ldots, f_k(x'))$ dominates $u = F(x) = (f_1(x), \ldots, f_k(x))$. The phrase "Pareto optimal" is taken to mean with respect to the entire decision variable space unless otherwise specified.

Definition 3. Pareto-optimal set: The Pareto-optimal set $P_S$ is defined as the set of all Pareto optimal solutions, i.e., $P_S = \{x \in \Omega \mid \neg \exists x' \in \Omega : F(x') \prec F(x)\}$.

Definition 4. Pareto-optimal front: The Pareto-optimal front $P_F$ is defined as the set of all objective function values corresponding to the solutions in $P_S$, i.e., $P_F = \{F(x) = (f_1(x), \ldots, f_k(x)) \mid x \in P_S\}$.
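Definition 1 translates directly into a dominance test. The following small sketch is ours, not part of the original formulation, and assumes all objectives are minimized:

```python
def dominates(u, v):
    """Return True if objective vector u Pareto-dominates v (Definition 1):
    u is no worse than v in every objective and strictly better in at least one."""
    return all(ui <= vi for ui, vi in zip(u, v)) and \
           any(ui < vi for ui, vi in zip(u, v))

# Example: (1, 2) dominates (1, 3); (1, 3) and (2, 2) are mutually nondominated.
assert dominates((1, 2), (1, 3))
assert not dominates((1, 3), (2, 2)) and not dominates((2, 2), (1, 3))
```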
3. Elitist multiobjective evolutionary algorithms

As opposed to single-objective optimization, where the best solution is always copied into the next population, the incorporation of elitism in MOEAs is substantially more complex. Often used is the concept of maintaining an external archive of solutions that are nondominated among all individuals generated so far. Another promising elitism approach is the so-called $(\mu+\lambda)$ selection, where parents and offspring compete against each other. In the study of Zitzler et al. [29], it was clearly shown that elitism helps in achieving better convergence in MOEAs. Among the existing elitist MOEAs, Deb et al.'s NSGA-II [25], Zitzler et al.'s SPEA [26] and SPEA2 [27], and Knowles and Corne's PAES [24] have enjoyed the most attention.

1) Nondominated Sorting Genetic Algorithm II (NSGA-II): This algorithm was proposed by Deb et al. [25]. NSGA-II is a revised version of the nondominated sorting genetic algorithm (NSGA) proposed by Srinivas and Deb [30]. The original NSGA is based on several layers of classifications of the individuals, as suggested by Goldberg [31]. Before selection, all nondominated individuals are classified into one category. This group of classified individuals is then ignored and another layer of nondominated individuals is considered. The process continues until all individuals in the population are classified. Since individuals in the first front have the maximum fitness value, they always get more chance of surviving than the remainder of the population. NSGA-II is more efficient than the original NSGA in terms of computational complexity. To keep diversity, NSGA-II uses a crowded comparison operator that requires no additional parameters, while the original NSGA used fitness sharing. Besides, NSGA-II adopts $(\mu+\lambda)$ selection as its elitist mechanism [32].

2) Strength Pareto Evolutionary Algorithm (SPEA) and Strength Pareto Evolutionary Algorithm 2 (SPEA2): Zitzler et al. [26] suggested SPEA. They proposed maintaining, at every generation, an external population storing all nondominated solutions discovered so far. This external population participates in all genetic operations. Presented by Zitzler et al. [27], SPEA2 is an improvement of SPEA. In contrast to SPEA, SPEA2 uses a fine-grained fitness assignment strategy which incorporates density information. Furthermore, the archive size in SPEA2 is fixed, i.e., whenever the number of nondominated individuals is less than the predefined archive size, the archive is filled up with dominated individuals. In addition, the clustering technique, which is used in SPEA as the archive truncation method, has been replaced by an alternative truncation method which has similar features but does not lose boundary points. Finally, another difference from SPEA is that only members of the archive participate in the mating selection process.

3) Pareto Archived Evolution Strategy (PAES): Knowles and Corne [24] suggested a simple MOEA using a (1+1) evolution strategy (a single parent generates a single offspring). PAES relies purely on the mutation operator to search for new individuals. An archive is used to store the nondominated solutions found in the evolutionary process; this historical archive is the elitist mechanism adopted in PAES. An interesting aspect of this algorithm is the adaptive grid method used as the archive truncation method. They also presented (1+λ) and (μ+λ) variations of the basic approach. The former is identical to (1+1) PAES except that λ offspring are generated from the current solution. The (μ+λ) version maintains a population of size μ from which λ copies are made; the fittest μ of the μ+λ solutions then replace the current population.

4. Extremal optimization

4.1. Bak-Sneppen model

EO is based on the Bak-Sneppen (BS) model of biological evolution, which simulates far-from-equilibrium dynamics in statistical physics. The BS model is one of the models that exhibit SOC. SOC means that, regardless of the initial state, the system always tunes itself to a critical point showing power-law behavior, without any tuning control parameter. In the BS model, species are located on the sites of a lattice. Each species is randomly assigned a fitness value with a uniform distribution. At each update step, the worst adapted species is always forced to mutate. The change in the fitness of the worst adapted species alters the fitness landscape around it, so the fitness values of its neighbors are changed as well, even if they are well adapted. After a number of iterations, the system evolves to a highly correlated state known as SOC. In that state, almost all species have fitness values above a certain threshold, and a small change in one species results in co-evolutionary chain reactions called "avalanches". The probability distribution of the sizes $K$ of these avalanches follows a power law $P(K) \sim K^{-\tau}$, where $\tau$ is a positive parameter. That is, smaller avalanches are more likely to occur than bigger ones, but even avalanches as big as the whole system occur with a small but non-negligible probability. Therefore, the large fluctuations make any possible configuration accessible.

4.2. Extremal optimization

Inspired by the BS model, Boettcher and Percus proposed the EO algorithm for a minimization problem as follows [8]:

1) Generate a solution $S$ randomly. Set the best solution $S_{best} = S$.
2) For the current solution $S$,
   a) evaluate the fitness $\lambda_i$ for each component (i.e., decision variable) $x_i$,
   b) rank all the components by their fitness values and find the component $x_j$ with the "worst fitness", i.e., $\lambda_j \le \lambda_i$ for all $i$,
   c) choose a solution $S'$ in the neighborhood of $S$, i.e., $S' \in N(S)$, such that the worst component $x_j$ must change its state,
   d) accept $S = S'$ unconditionally,
   e) if the current cost function value is less than the minimum cost so far, i.e., $C(S) < C(S_{best})$, then set $S_{best} = S$.
3) Repeat Step 2) as long as desired.
4) Return $S_{best}$ and $C(S_{best})$.
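For concreteness, the basic EO loop can be sketched as follows. This is an illustrative toy implementation, not the authors' code: the cost function, the per-component fitness $\lambda_i$ (here, the negated squared contribution of a variable to the cost) and the move that changes the worst component are placeholder choices that are problem-dependent.

```python
import random

def extremal_optimization(n=10, iterations=1000):
    """Minimal EO sketch on the toy cost C(S) = sum(x_i^2), x_i in [-1, 1]."""
    cost = lambda s: sum(x * x for x in s)
    S = [random.uniform(-1.0, 1.0) for _ in range(n)]    # step 1
    S_best = list(S)
    for _ in range(iterations):                          # steps 2-3
        fitness = [-x * x for x in S]                    # 2a) lower fitness = worse component
        j = min(range(n), key=lambda i: fitness[i])      # 2b) worst component x_j
        S[j] = random.uniform(-1.0, 1.0)                 # 2c-2d) force x_j to change, accept unconditionally
        if cost(S) < cost(S_best):                       # 2e) keep the best-so-far solution
            S_best = list(S)
    return S_best, cost(S_best)                          # step 4

S_best, c_best = extremal_optimization()
print(c_best)  # typically small after enough iterations
```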

Note that in the EO algorithm, each decision variable in the current solution $S$ is considered a "species"; in this study, we adopt the term "component" in place of the biological term "species". From the above algorithm, it can be seen that, unlike genetic algorithms, which work with a population of candidate solutions, EO evolves a single solution $S$ and makes local modifications to its worst components. This requires a representation in which the solution components can be assigned a quality measure (i.e., fitness). It differs from holistic approaches such as evolutionary algorithms, which assign equal fitness to all components of a solution based on their collective evaluation against an objective function. It is important to note that the governing principle behind the EO algorithm is improvement through successively removing low-quality components and changing them randomly. This is obviously at odds with genetic algorithms, which select good solutions in an attempt to make better ones. By always mutating the worst adapted component, EO can evolve solutions quickly and systematically, and at the same time preserve the possibility of probing different regions of the design space via avalanches.

5. Multiobjective extremal optimization

In order to extend EO to multiobjective problems, in this work we propose a novel elitist multiobjective optimization algorithm, called Multiobjective Extremal Optimization (MOEO), by introducing the Pareto dominance strategy into EO. Similar to EO, MOEO performs only one operation, i.e., mutation, on each decision variable of the current solution. It is well known that there are two fundamental goals in multiobjective optimization: one is to minimize the distance of the generated solutions to the Pareto-optimal set; the other is to maximize the diversity of the achieved Pareto set approximation. In this study, MOEO consists mainly of three components: fitness assignment, diversity preservation and an external archive. A good fitness assignment is beneficial for guiding the search towards the Pareto-optimal set. In order to increase the diversity of the nondominated solutions, we introduce a diversity-preserving mechanism into our approach. To prevent nondominated solutions from being lost, we also adopt an external archive to store the nondominated solutions found during the evolutionary process. In the following sections, the main algorithm, fitness assignment, diversity preservation, external archive and mutation operator are addressed in some detail.

5.1. Main algorithm

1) Randomly generate an initial solution $S = (x_1, x_2, \ldots, x_n)$. Set the external archive empty. Set $iteration = 0$.
2) Generate $n$ offspring of the current solution $S$ by performing mutation on each decision variable one by one.
3) Perform dominance ranking on the $n$ offspring and obtain their rank numbers $r_j \in [0, n-1]$, $j \in \{1, \ldots, n\}$.
4) Assign the fitness $\lambda_j = r_j$ to each variable $x_j$, $j \in \{1, \ldots, n\}$.
5) If there is only one variable with a fitness value of zero, that variable is considered the worst component; otherwise, the diversity preservation mechanism is invoked. Denote the worst component by $x_w$, with fitness $\lambda_w = 0$, $w \in \{1, \ldots, n\}$.
6) Perform mutation only on $x_w$ while keeping the other variables unchanged, obtaining a new solution $S_w$.
7) Accept $S = S_w$ unconditionally.
8) Apply UpdateArchive(S, archive) to update the external archive (see Figure 4).
9) If the iterations reach the predefined maximum number of generations, go to Step 10); otherwise, set $iteration = iteration + 1$ and go to Step 2).
10) Output the external archive as the Pareto-optimal set.

Figure 1. Pseudo-code of the MOEO algorithm

For a multiobjective optimization problem, the proposed MOEO algorithm works as shown in Figure 1; the flowchart of MOEO is given in Figure 2. A compact sketch of one MOEO generation follows.
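To show how the steps of Figure 1 fit together, one generation of MOEO might look like the sketch below. This is our illustrative reconstruction, not the authors' code: the toy bi-objective function, the Gaussian mutation and the random tie-break among nondominated offspring (which Section 5.3 replaces with a region-based rule) are placeholder choices.

```python
import random

def dominates(u, v):
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def moeo_generation(S, evaluate, mutate, archive_update):
    """One MOEO generation (steps 2-8 of Figure 1)."""
    n = len(S)
    # Step 2: n offspring, the j-th obtained by mutating only variable x_j.
    offspring = [S[:j] + [mutate(S[j])] + S[j + 1:] for j in range(n)]
    objs = [evaluate(c) for c in offspring]
    # Steps 3-4: Fonseca-Fleming rank of offspring j becomes the fitness of x_j.
    ranks = [sum(dominates(q, p) for q in objs) for p in objs]
    nondom = [j for j in range(n) if ranks[j] == 0]
    # Step 5: unique worst component, else a tie-break among nondominated offspring
    # (random here; Section 5.3 describes the actual region-based rule).
    w = nondom[0] if len(nondom) == 1 else random.choice(nondom)
    # Steps 6-8: accept S_w unconditionally and update the archive.
    S = offspring[w]
    archive_update(S)
    return S

# Toy demo: minimize f1 = x1 and f2 = 1 - x1 + sum(x_i^2, i >= 2), x_i in [0, 1].
evaluate = lambda s: (s[0], 1.0 - s[0] + sum(x * x for x in s[1:]))
mutate = lambda x: min(1.0, max(0.0, x + random.gauss(0.0, 0.1)))
archive, S = [], [random.random() for _ in range(5)]
for _ in range(100):
    S = moeo_generation(S, evaluate, mutate, archive.append)
```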

[Figure 2 shows the MOEO flowchart: randomly generate a solution and set the archive empty (gen = 0); evaluate the fitness of each component of the current solution; if more than one worst component exists, invoke diversity preservation; perform mutation on the worst component; update the external archive using the archiving logic; increment gen; repeat until gen reaches the maximum, then output the external archive as the Pareto-optimal set.]

Figure 2. Flowchart of the MOEO algorithm

5.2. Fitness assignment

In this study, to extend EO to solve multiobjective optimization problems efficiently, we introduce a Pareto-based fitness assignment strategy into EO. We use the dominance ranking method proposed by Fonseca and Fleming [33], i.e., the fitness value of a solution equals the number of other solutions by which it is dominated. Therefore, nondominated solutions are ranked zero, whilst the worst possible rank is the number of all solutions minus one. It is important to note that only one solution exists in the search process of MOEO. In order to identify the worst component via fitness assignment, we generate a population of new offspring by performing mutation on the components of the current solution one by one. Dominance ranking is then carried out on the newly generated offspring. In this work, the component corresponding to a nondominated offspring (i.e., one whose fitness value equals zero) is considered the weakest one, and that nondominated offspring replaces the current solution in the next generation. From the above, we can see that, similar to (1+λ) PAES, MOEO is also a (1+λ) multiobjective optimization approach: λ offspring are generated from the current solution by performing mutation on all the components in turn, and the offspring with the lowest fitness then replaces the current solution. Here, λ equals the number of decision variables.

To be clearer, we illustrate the process of dominance ranking in MOEO using Figure 3(a). Assume a two-objective optimization problem in which both objectives are to be minimized. The location of the current solution in the objective space (marked with a black solid circle) changes in the next generation by mutating its weakest component. For example, given the current solution $S_i = (x_1, x_2, x_3, x_4)$, whose location in the objective space is denoted by the circle $i$, we can identify the weakest component by mutating the four variables one by one and then performing dominance ranking on the newly generated offspring. First, a new offspring $S_{iA} = (x_1', x_2, x_3, x_4)$ is obtained by mutating $x_1$ to $x_1'$ while keeping the other variables unchanged. Similarly, the other three offspring, i.e., $S_{iB} = (x_1, x_2', x_3, x_4)$, $S_{iC} = (x_1, x_2, x_3', x_4)$ and $S_{iD} = (x_1, x_2, x_3, x_4')$, are generated. The four white dashed circles (A, B, C, D) in Figure 3(a) stand for the locations of the four offspring ($S_{iA}$, $S_{iB}$, $S_{iC}$, $S_{iD}$) in the objective space, respectively. The next location of the current solution $S_i$ in the objective space depends on the dominance rank numbers of the four newly generated offspring. It can be seen from Figure 3(a) that circle A is not dominated by any of the other three dashed circles (B, C, D). The rank numbers of A, B, C and D are 0, 1, 1 and 3, respectively. Hence, the component $x_1$ corresponding to A is considered the weakest one, and the current solution $S_i$ changes to $S_{iA}$ in the next generation. If there exists more than one worst component, i.e., at least two components with the same fitness value of zero, the following diversity-preserving mechanism is invoked.
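The rank numbers in this example can be checked mechanically. In the sketch below the objective values of A, B, C and D are hypothetical, chosen only so that the dominance relations match Figure 3(a):

```python
def dominates(u, v):
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

# Hypothetical (F1, F2) values for the four offspring of Figure 3(a).
points = {"A": (1, 1), "B": (2, 3), "C": (3, 2), "D": (4, 4)}
ranks = {name: sum(dominates(q, p) for q in points.values())
         for name, p in points.items()}
print(ranks)  # {'A': 0, 'B': 1, 'C': 1, 'D': 3}
```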

[Figure 3: two panels in the (F1, F2) objective space; panel (a) shows the parent i and offspring A, B, C, D; panel (b) shows the four regions I-IV around the parent i.]

Figure 3. (a) Dominance ranking in MOEO; (b) diversity preservation in MOEO

5.3. Diversity preservation

The goal of introducing the diversity preservation mechanism is to maintain a good spread of solutions in the obtained set. In this study, we propose a new approach to keeping good diversity of nondominated solutions during the search. It is worth pointing out that our approach does not require any user-defined parameter for maintaining diversity among population members. Note that this diversity preservation mechanism is only invoked when more than one nondominated offspring has been generated. Another diversity preservation strategy is used when archiving the Pareto set in Section 5.4. As can be seen from Figure 3(b), the locations of the nondominated offspring (marked with white dashed circles) relative to that of the parent i (the black circle) can be categorized into four cases, i.e., regions I, II, III and IV. An offspring residing in region II (circle B) dominates its parent; those in regions I and III (circles A and C) neither dominate the parent nor are dominated by it; and one in region IV (circle D) is dominated by the parent. It should be pointed out that offspring lying in region IV increase the search time needed to approach the Pareto-optimal set. Thus we do not choose a nondominated offspring residing in region IV as the new solution for the next generation unless no nondominated offspring lies in the other three regions. In addition, for the purpose of keeping a good spread of nondominated solutions, we regard not only the nondominated offspring lying in region I but also those in regions II and III as candidates for the new solution in the next generation. If more than one nondominated offspring lies in these three regions, we randomly choose one as the new solution. In short, if there exists more than one nondominated offspring, MOEO first picks one randomly from those nondominated offspring that do not reside in region IV; in the worst case, where all the nondominated offspring reside in region IV, MOEO chooses one of them randomly.

5.4. External archive

The main objective of the external archive is to keep a historical record of the nondominated solutions found along the search process. Note that the archive size is fixed in MOEO. This external archive provides the elitist mechanism for MOEO. The archive in our approach involves two main components, as follows.

1) Archiving logic: The function of the archiving logic is to decide whether the current solution should be added to the archive or not. It works as follows. The newly generated solution S is compared with the archive to check whether it dominates any member of the archive. If some members of the archive are dominated by S, all the dominated solutions are eliminated from the archive and S is added to the archive. If at least one member of the archive dominates S, the archive does not need to be updated and the iteration continues. However, if S and the members of the archive do not dominate each other, there are three cases:
a) If the archive is not full, S is added to the archive.
b) If the archive is full and S resides in the most crowded region in the objective space among the members of the archive, the archive does not need to be updated.
c) If the archive is full and S does not reside in the most crowded region of the archive, the member in the most crowded region of the archive is replaced by S.
Based on the above archiving logic, the pseudo-code of the function UpdateArchive(S, archive) is shown in Figure 4.

2) Crowding-distance metric: In our approach, we adopt the crowding-distance metric proposed by Deb et al. [25] to judge whether the current solution resides in the most crowded region of the archive. It is interesting to note that this metric is capable of preserving the boundary solutions in the archive. By sorting the current solution S and all members of the archive together according to their crowding distances in ascending order, the one residing in the most crowded region can be found. If S is the one with the lowest rank, S is considered to lie in the most crowded region; otherwise, the solution with the lowest rank is removed from the archive and S is added to the archive. As shown in [25], the complexity of the crowding-distance computation is $O(k(2A)\log(2A))$, where $A$ is the archive size and $k$ is the number of objectives; this is a modest cost. For more details about the crowding-distance metric, interested readers are referred to [25].
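A minimal sketch of the crowding-distance computation of Deb et al. [25], as used here for archive truncation, follows; boundary solutions receive infinite distance, which is what preserves them, and interior solutions accumulate the normalized gap between their two neighbors along each objective.

```python
def crowding_distances(objs):
    """Crowding distance [25] for a list of objective vectors."""
    N, k = len(objs), len(objs[0])
    dist = [0.0] * N
    for m in range(k):
        order = sorted(range(N), key=lambda i: objs[i][m])
        lo, hi = objs[order[0]][m], objs[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = float("inf")   # keep boundary points
        if hi == lo:
            continue
        for pos in range(1, N - 1):
            dist[order[pos]] += (objs[order[pos + 1]][m]
                                 - objs[order[pos - 1]][m]) / (hi - lo)
    return dist

# The member with the smallest crowding distance lies in the most crowded region.
objs = [(0.0, 1.0), (0.2, 0.8), (0.25, 0.75), (1.0, 0.0)]
d = crowding_distances(objs)
print(d.index(min(d)))  # 1: the point (0.2, 0.8)
```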

function UpdateArchive(S, archive)
begin function
    if the current solution S is dominated by at least one member of the archive, then
        the archive does not need to be updated;
    else if some members of the archive are dominated by S, then
        remove all the dominated members from the archive and add S to the archive;
    else if the archive is not full, then
        add S to the archive;
    else if S resides in the most crowded region of the archive, then
        the archive does not need to be updated;
    else
        replace the member in the most crowded region of the archive by S;
    end if
end function

Figure 4. Pseudo-code of the function UpdateArchive(S, archive)

5.5. Mutation operator

Since mutation is the only search operator in MOEO, it plays a key role: new solutions are generated by perturbing the decision variables of the current solution. In this work, we present a new mutation method based on mixing Gaussian mutation and Cauchy mutation and adopt it as the mutation operator in MOEO. The mechanisms of Gaussian and Cauchy mutation have been studied by Yao et al. [35]. They pointed out that Cauchy mutation can reach the global optimum faster than Gaussian mutation due to its higher probability of making longer jumps. However, they also indicated that Cauchy mutation spends less time exploiting the local neighbourhood and thus has a weaker fine-tuning ability than Gaussian mutation in small to mid-range regions. Therefore, Cauchy mutation is better at coarse-grained search while Gaussian mutation is better at fine-grained search. They also pointed out that Cauchy mutation performs better when the current search point is far away from the global optimum, while Gaussian mutation is better at finding a local optimum in a good region. Thus, it would be ideal if Cauchy mutation were used when search points are far away from the global optimum and Gaussian mutation were adopted when search points are in the neighbourhood of the global optimum. Unfortunately, the global optimum is usually unknown in practice, making the ideal switch from Cauchy to Gaussian mutation very difficult. A method based on mixing (rather than switching) different mutation operators was therefore proposed by Yao et al. [35]. The idea is to mix the different search biases of Cauchy and Gaussian mutation: the method generates two offspring from each parent, one by Cauchy mutation and the other by Gaussian mutation, and the better one is chosen as the offspring.

Inspired by the above idea, we present a new mutation method based on mixing Gaussian mutation and Cauchy mutation, which we call "hybrid GC mutation" for convenience. It must be noted that, unlike the method in [35], this mutation does not compare the outcomes of Gaussian mutation and Cauchy mutation, owing to the characteristics of EO. In the hybrid GC mutation, Cauchy mutation is adopted first; that is, a large step size is tried first at each mutation. If the newly generated variable falls outside the interval of the corresponding decision variable, Cauchy mutation is applied repeatedly, up to TC times, until the newly generated variable falls into the interval. If it still lies outside the interval, Gaussian mutation is then applied repeatedly, up to TG times, until the variable satisfies the requirement; that is, the step size becomes smaller than before. If the newly generated variable still lies outside the interval after these attempts, the upper or lower bound of the decision variable is taken as the new value. Thus, our approach combines the advantages of coarse-grained and fine-grained search. The above analysis shows that the hybrid GC mutation is simple yet effective. Unlike switching algorithms, which have to decide when to switch between different mutations during the search, the hybrid GC mutation does not need to make such decisions.
In this paper, the Gaussian mutation is performed as follows:

$x_k' = x_k + N_k(0,1)$    (2)

where $x_k$ and $x_k'$ denote the k-th decision variable before and after mutation, respectively, and $N_k(0,1)$ denotes a Gaussian random number with mean zero and standard deviation one, generated anew for the k-th decision variable. The Cauchy mutation is performed as follows:

$x_k' = x_k + \delta_k$    (3)

where $\delta_k$ denotes a Cauchy random variable with scale parameter equal to one, generated anew for the k-th decision variable.

In the hybrid GC mutation, the values of the parameters TC and TG are set by the user beforehand. The value of TC determines the coarse-grained search time, while the value of TG affects the fine-grained search time. Neither value should be large, because large values prolong the search process and hence increase the computational overhead. According to our experimental experience, moderate values of TC and TG can be set to 2~4.
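Under this reading of the procedure, the hybrid GC mutation of a single decision variable can be sketched as follows. The retry-then-clip logic is our interpretation of the text above, and the standard Cauchy deviate is drawn with the inverse-CDF transform:

```python
import math
import random

def hybrid_gc_mutation(x, lower, upper, TC=3, TG=3):
    """Hybrid GC mutation sketch: Cauchy first (coarse), then Gaussian (fine).

    If every trial falls outside [lower, upper], the violated bound is
    returned (our reading of Section 5.5)."""
    y = x
    for _ in range(TC):                   # coarse-grained search: Cauchy, Eq. (3)
        y = x + math.tan(math.pi * (random.random() - 0.5))
        if lower <= y <= upper:
            return y
    for _ in range(TG):                   # fine-grained search: Gaussian, Eq. (2)
        y = x + random.gauss(0.0, 1.0)
        if lower <= y <= upper:
            return y
    return upper if y > upper else lower  # clip to the violated bound

print(hybrid_gc_mutation(0.5, 0.0, 1.0))
```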

6. Experiments and test results

6.1. Test problems

We choose five of the six benchmark problems proposed by Zitzler et al. [29], called ZDT1, ZDT2, ZDT3, ZDT4 and ZDT6. All problems have two objective functions, and none of them has any inequality or equality constraints. We describe these problems in Table 1. The table also shows the number of variables, their bounds, the Pareto-optimal solutions, and the nature of the Pareto-optimal front for each problem. All objective functions in Table 1 are to be minimized. Deb et al. compared the performance of NSGA-II with PAES and SPEA and reported their experimental results in [25]. In this paper, we apply MOEO to the five difficult test problems in Table 1 and compare our simulation results with those obtained in [25] under the same conditions. Furthermore, our approach is also compared with SPEA2, proposed by Zitzler et al. [27] as an improvement of SPEA. To be fair, we adopt the parameter settings suggested in [25]. All approaches are run for a maximum of 25 000 fitness function evaluations (FFE), and 10 independent runs are carried out. The source codes of all experiments are written in JAVA.

For MOEO, the algorithm is encoded in the floating-point representation and uses an archive of size 100. The maximum number of generations is 830 for ZDT1, ZDT2 and ZDT3 (FFE = maximum generations × number of decision variables = 830 × 30 = 24 900 ≈ 25 000) and 2500 for ZDT4 and ZDT6 (FFE = 2500 × 10 = 25 000). The parameters of the hybrid GC mutation, TC and TG, are both set to 3.

For SPEA2, the algorithm is also encoded in the floating-point representation. We use a population of size 80 and an external population of size 20 (this 4:1 ratio is suggested by the developers of SPEA2 to maintain an adequate selection pressure for the elite solutions), so that the overall population size becomes 100. We use the simulated binary crossover (SBX) operator and polynomial mutation [34]. A crossover probability of $p_c = 0.9$ and a mutation probability of $p_m = 1/n$ (where n is the number of variables) are used. The distribution indexes [34] for the crossover and mutation operators are set to $\eta_c = 20$ and $\eta_m = 20$, respectively. The maximum number of generations for all the test problems is 250 (FFE = maximum generations × population size = 250 × 100 = 25 000).

Table 1. Test problems used in this study; n is the number of decision variables. All objectives are to be minimized.

ZDT1 (n = 30; bounds [0,1]; convex front; optimal solutions: $x_1 \in [0,1]$, $x_i = 0$ for $i = 2, \ldots, n$):
  $f_1(X) = x_1$
  $f_2(X) = g(X)\,[1 - \sqrt{x_1/g(X)}]$
  $g(X) = 1 + 9\,(\sum_{i=2}^{n} x_i)/(n-1)$

ZDT2 (n = 30; bounds [0,1]; nonconvex front; optimal solutions: $x_1 \in [0,1]$, $x_i = 0$ for $i = 2, \ldots, n$):
  $f_1(X) = x_1$
  $f_2(X) = g(X)\,[1 - (x_1/g(X))^2]$
  $g(X) = 1 + 9\,(\sum_{i=2}^{n} x_i)/(n-1)$

ZDT3 (n = 30; bounds [0,1]; convex, disconnected front; optimal solutions: $x_1 \in [0,1]$, $x_i = 0$ for $i = 2, \ldots, n$):
  $f_1(X) = x_1$
  $f_2(X) = g(X)\,[1 - \sqrt{x_1/g(X)} - (x_1/g(X))\sin(10\pi x_1)]$
  $g(X) = 1 + 9\,(\sum_{i=2}^{n} x_i)/(n-1)$

ZDT4 (n = 10; bounds $x_1 \in [0,1]$, $x_i \in [-5,5]$ for $i = 2, \ldots, n$; nonconvex front; optimal solutions: $x_1 \in [0,1]$, $x_i = 0$ for $i = 2, \ldots, n$):
  $f_1(X) = x_1$
  $f_2(X) = g(X)\,[1 - \sqrt{x_1/g(X)}]$
  $g(X) = 1 + 10(n-1) + \sum_{i=2}^{n} [x_i^2 - 10\cos(4\pi x_i)]$

ZDT6 (n = 10; bounds [0,1]; nonconvex, nonuniformly spaced front; optimal solutions: $x_1 \in [0,1]$, $x_i = 0$ for $i = 2, \ldots, n$):
  $f_1(X) = 1 - \exp(-4x_1)\sin^6(6\pi x_1)$
  $f_2(X) = g(X)\,[1 - (f_1(X)/g(X))^2]$
  $g(X) = 1 + 9\,[(\sum_{i=2}^{n} x_i)/(n-1)]^{0.25}$
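As an illustration, ZDT1 from Table 1 reads directly as code; this is a sketch, and the other four ZDT problems follow the same $f_1/f_2/g$ pattern.

```python
import math

def zdt1(x):
    """ZDT1 (n = 30, x_i in [0, 1]); convex Pareto front at x_i = 0 for i >= 2."""
    n = len(x)
    f1 = x[0]
    g = 1.0 + 9.0 * sum(x[1:]) / (n - 1)
    f2 = g * (1.0 - math.sqrt(f1 / g))
    return f1, f2

# On the Pareto-optimal front (x_i = 0 for i >= 2), f2 = 1 - sqrt(f1).
print(zdt1([0.25] + [0.0] * 29))  # (0.25, 0.5)
```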

6.2. Performance measures

In this article, we use two performance metrics proposed by Deb et al. [25] to assess the performance of our approach; for more details, readers may refer to [25]. The first metric, ϒ, measures the extent of convergence to a known set of Pareto-optimal solutions. In all simulations, we report the mean ϒ and variance $\sigma_\Upsilon$ of this metric over ten independent runs. Deb et al. [25] have pointed out that even when all solutions converge to the Pareto-optimal front, the convergence metric does not take the value zero; it is zero only when each obtained solution lies exactly on one of the chosen solutions. The second metric, Δ, measures the extent of spread of the obtained nondominated solutions. It is desirable to get a set of solutions that spans the entire Pareto-optimal region. The metric Δ is calculated as follows:

$\Delta = \dfrac{d_f + d_l + \sum_{i=1}^{N-1} |d_i - \bar{d}|}{d_f + d_l + (N-1)\,\bar{d}}$    (4)

where $d_f$ and $d_l$ are the Euclidean distances between the extreme solutions and the boundary solutions of the obtained nondominated set, $d_i$ is the Euclidean distance between consecutive solutions in the obtained nondominated set, and $\bar{d}$ is the average of all distances $d_i$ ($i = 1, 2, \ldots, N-1$), assuming that there are $N$ solutions on the best nondominated front. Note that a good distribution would make all distances $d_i$ equal to $\bar{d}$ and would make $d_f = d_l = 0$ (with the extreme solutions present in the nondominated set). Consequently, for the most widely and uniformly spread-out set of nondominated solutions, Δ would be zero.
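Equation (4) also translates directly. The sketch below assumes a biobjective front sorted by the first objective and takes the extreme solutions of the true Pareto-optimal front as given, since $d_f$ and $d_l$ are measured against them:

```python
import math

def delta_metric(front, extreme_lo, extreme_hi):
    """Diversity metric of Eq. (4) for a biobjective front sorted by f1."""
    dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    d = [dist(front[i], front[i + 1]) for i in range(len(front) - 1)]
    d_mean = sum(d) / len(d)
    d_f = dist(extreme_lo, front[0])    # gap to one extreme of the true front
    d_l = dist(extreme_hi, front[-1])   # gap to the other extreme
    num = d_f + d_l + sum(abs(di - d_mean) for di in d)
    return num / (d_f + d_l + (len(front) - 1) * d_mean)

# A perfectly uniform front that reaches both extremes gives Delta = 0.
front = [(i / 10, 1 - i / 10) for i in range(11)]
print(delta_metric(front, (0.0, 1.0), (1.0, 0.0)))  # 0.0
```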

6.3. Discussion of the results

Table 2 shows the mean and variance of the convergence metric ϒ obtained using five algorithms: MOEO, NSGA-II (real-coded), PAES, SPEA and SPEA2. In this study, all the experimental results of NSGA-II (real-coded), SPEA and PAES shown in Table 2 and Table 3 are taken from [25].

Table 2. Mean (first rows) and variance (second rows) of the convergence metric ϒ

Algorithm             ZDT1       ZDT2       ZDT3       ZDT4       ZDT6
MOEO                  0.001277   0.001355   0.004385   0.008145   0.000630
                      0.000697   0.000897   0.001910   0.004011   3.26E-05
NSGA-II (real-coded)  0.033482   0.072391   0.114500   0.513053   0.296564
                      0.004750   0.031689   0.007940   0.118460   0.013135
PAES                  0.082085   0.126276   0.023872   0.854816   0.085469
                      0.008679   0.036877   0.000010   0.527238   0.006664
SPEA                  0.001799   0.001339   0.047517   7.340299   0.221138
                      0.000001   0.000000   0.000047   6.572516   0.000449
SPEA2                 0.001448   0.000743   0.003716   0.028492   0.011643
                      0.000317   8.33E-05   0.000586   0.047482   0.002397

It can be observed from Table 2 that MOEO converges better than any other algorithm on three problems (ZDT1, ZDT4 and ZDT6). For problem ZDT2, MOEO performs better than NSGA-II and PAES, but slightly worse than SPEA and SPEA2 in terms of convergence. For problem ZDT3, MOEO outperforms NSGA-II, PAES and SPEA with respect to convergence to the Pareto-optimal front, while SPEA2 performs best. In all cases, the variance of MOEO's convergence metric over ten runs is very small. Table 3 shows the mean and variance of the diversity metric Δ obtained using all the algorithms. As can be seen from Table 3, MOEO is capable of finding a better spread of solutions than any other algorithm on all the problems except ZDT3. This indicates that our approach finds a better-distributed set of nondominated solutions than many other state-of-the-art MOEAs. In all cases, the variance of MOEO's diversity metric over ten runs is also small. It is worth noting that for problems with a discontinuous front, the distribution of the nondominated solutions found by our approach is not very good; we will improve our approach to handle this issue in future studies.

Table 3. Mean (first rows) and variance (second rows) of the diversity metric ∆

Algorithm             ZDT1       ZDT2       ZDT3       ZDT4       ZDT6
MOEO                  0.327140   0.285062   0.965236   0.275664   0.225468
                      0.065343   0.056978   0.046958   0.183704   0.033884
NSGA-II (real-coded)  0.390307   0.430776   0.738540   0.702612   0.668025
                      0.001876   0.004721   0.019706   0.064648   0.009923
PAES                  1.229794   1.165942   0.789920   0.870458   1.153052
                      0.004839   0.007682   0.001653   0.101399   0.003916
SPEA                  0.784525   0.755148   0.672938   0.798463   0.849389
                      0.004440   0.004521   0.003587   0.014616   0.002713
SPEA2                 0.472254   0.473808   0.606826   0.705629   0.670549
                      0.097072   0.093900   0.191406   0.266162   0.077009

It is interesting to note that MOEO performs well with respect to both convergence and diversity of solutions on problem ZDT4, where there exist $21^9$ different local Pareto-optimal fronts in the search space, of which only one corresponds to the global Pareto-optimal front. This indicates that MOEO is capable of escaping from local Pareto-optimal fronts and approaching the global nondominated front. Furthermore, MOEO is well suited to problems with a nonuniformly spaced Pareto front, e.g., problem ZDT6. For illustration, we also show one of the ten runs of MOEO on three test problems (ZDT1, ZDT2 and ZDT3) in Figures 5-7, respectively. The figures show all nondominated solutions obtained after 25 000 fitness function evaluations with MOEO. From Figures 5-7, we can see that MOEO is able to converge to the true Pareto-optimal front on all three problems. Moreover, MOEO finds a well-distributed set of nondominated solutions on ZDT1 and ZDT2, but the diversity of the nondominated solutions found by MOEO on ZDT3 is not very good. Figure 8, which is taken from [25], shows one of ten runs with NSGA-II and PAES on ZDT4. From Figure 8, we can see that NSGA-II achieves better convergence and spread of solutions than PAES on ZDT4, but neither of them converges to the true Pareto-optimal front. Figure 9 shows one of ten runs with MOEO and SPEA2 on ZDT4. As can be observed from Figure 9, MOEO finds better convergence and spread of solutions than SPEA2 on ZDT4. It is interesting to notice that MOEO converges to the true Pareto-optimal front and finds a good spread of solutions covering the whole front. From Figures 8 and 9, we can see that MOEO outperforms NSGA-II, PAES and SPEA2 in terms of convergence and diversity of solutions on ZDT4.

Figure 5. Nondominated solutions with MOEO on ZDT1

Figure 6. Nondominated solutions with MOEO on ZDT2

Figure 7. Nondominated solutions with MOEO on ZDT3

Figure 8. (Taken from [25].) NSGA-II finds better convergence and spread of solutions than PAES on ZDT4, but neither converges to the true Pareto-optimal front.

Figure 9. MOEO converges to the true Pareto-optimal front and finds a better spread of solutions than SPEA2 on ZDT4.

Figure 10, taken from [25], shows one of ten runs with NSGA-II and SPEA on ZDT6. As can be observed from Figure 10, NSGA-II finds a better spread of solutions than SPEA on ZDT6, but SPEA has better convergence. Figure 11 shows one of ten runs with MOEO and SPEA2 on ZDT6. From Figure 11, we can see that MOEO achieves better convergence and spread of solutions than SPEA2 on ZDT6. It is worth noting that our approach converges to the true Pareto-optimal front and finds a well-distributed set of nondominated solutions covering the whole front on ZDT6. It can be seen from Figures 10 and 11 that MOEO performs better than NSGA-II, SPEA and SPEA2 with respect to convergence and diversity of solutions on ZDT6.

Figure 10. (Taken from [25].) NSGA-II finds a better spread of solutions than SPEA on ZDT6, but SPEA has better convergence.

Figure 11. MOEO converges to the true Pareto-optimal front and finds a better spread of solutions than SPEA2 on ZDT6.

In order to compare the running times of MOEO with those of the other three algorithms, i.e., NSGA-II, SPEA2 and PAES, we also report the mean and variance of the running times of each algorithm over ten runs in Table 4. To avoid any bias or misinterpretation in implementing the other three approaches, we adopted the public-domain versions of NSGA-II, PAES and SPEA2 (the former two are available at http://www.lania.mx/~ccoello/EMOO/EMOOsoftware.html and the last at http://www.tik.ee.ethz.ch/pisa/selectors/spea2/spea2_c_source.html). Note that all the algorithms were run on the same hardware (an Intel Pentium M with 900 MHz CPU and 256 MB memory) and software (JAVA) platform. The number of fitness function evaluations for all the algorithms is 25 000. SPEA2 and PAES adopt the clustering method [27] and the adaptive grid method [24], respectively, as their archive truncation methods. Although no archive is used in NSGA-II, NSGA-II adopts the crowding-distance metric [25] to keep diversity of solutions. In this paper, MOEO adopts the crowding-distance metric to truncate the archive. Thus, to compare the running times fairly, we use the crowding-distance metric as the archive truncation method for MOEO, SPEA2 and PAES.

Table 4. Mean (first rows) and variance (second rows) of the running times (in milliseconds)

Algorithm              ZDT1       ZDT2       ZDT3       ZDT4       ZDT6
MOEO                    427.93     436.20     330.81    1971.27    2741.30
                         35.56      61.17      39.66     466.79     100.26
NSGA-II (real-coded)   2906.90    2889.83    3154.90    2475.57    2746.61
                        159.51     287.70     489.04     166.31     401.19
SPEA2                  2782.37    2802.36    2884.46    2303.01    2687.53
                        303.95     167.87     474.73     267.51     302.44
PAES                  12051.63   12088.77    8890.13    2661.80   25716.33
                       1044.91     240.52     357.17     139.36    1685.56

From Table 4, we can see that MOEO is the fastest of the four algorithms on problems ZDT1, ZDT2, ZDT3 and ZDT4. It is important to notice that MOEO is about 6 times faster than NSGA-II and SPEA2, and approximately 30 times as fast as PAES, on problems ZDT1, ZDT2 and ZDT3. For problem ZDT6, MOEO runs almost as fast as NSGA-II and SPEA2, and nearly 10 times as fast as PAES. MOEO thus runs much faster than PAES even when the same archive truncation method is adopted. The results in Table 4 are remarkable given that NSGA-II and SPEA2 are normally considered "very fast" algorithms. Therefore, MOEO may be considered a very fast approach.

6.4. Advanced features of the proposed approach

From the above analysis, it can be seen that our approach has the following advanced features:
- Similar to multiobjective evolutionary algorithms, our approach is not susceptible to the shape of the Pareto front.
- Only one operator, i.e., the mutation operator, exists in our approach, which makes it simple and easy to implement.
- The hybrid GC mutation operator suggested in our approach combines the advantages of coarse-grained and fine-grained search.
- The historical external archive provides the elitist mechanism for our approach.
- Our approach provides good performance in both convergence and distribution of solutions.
- Our approach is capable of handling problems with multiple local Pareto-optimal fronts or a nonuniformly spaced Pareto-optimal front.
- Compared with three competitive MOEAs in terms of running times on five test problems, our approach proves to be very fast.

7. Conclusions and future work

In this paper, we have presented a novel elitist Pareto-based multiobjective algorithm, called Multiobjective Extremal Optimization (MOEO). The proposed algorithm extends EO to handle multiobjective optimization problems with satisfactory results. The fitness assignment of our approach is based on the Pareto dominance strategy. We also present a new hybrid mutation operator that enhances the exploratory capabilities of our algorithm. The experimental studies on five benchmark problems show that MOEO is highly competitive with the state-of-the-art MOEAs. Moreover, compared with three competitive MOEAs in terms of running times on the five test problems, our approach is shown to be very fast. Thus, MOEO may be a good alternative for dealing with multiobjective optimization problems. It is worth pointing out that the dimensionality of the five test problems studied in this work is not very large; we will explore the efficiency of our approach on problems with a large number of decision variables in the future. Future work also includes studies on extending MOEO to constrained or discrete multiobjective optimization problems, and it is desirable to further apply MOEO to complex real-world engineering optimization problems.

References

1. Miettinen K M. Nonlinear multiobjective optimization. Kluwer Academic Publishers, Boston, Massachusetts, 1999.
2. Ehrgott M. Multicriteria optimization. Springer, second edition, 2005.
3. Coello C A C. Evolutionary multiobjective optimization: a historical view of the field. IEEE Computational Intelligence Magazine 2006; 1(1); 28-36.
4. Sarker R, Liang K H, Newton C. A new multiobjective evolutionary algorithm. European Journal of Operational Research 2002; 140; 12-23.
5. Beausoleil R P. "MOSS" multiobjective scatter search applied to non-linear multiple criteria optimization. European Journal of Operational Research 2006; 169; 426-449.
6. Hanne T. A multiobjective evolutionary algorithm for approximating the efficient set. European Journal of Operational Research 2007; 176; 1723-1734.
7. Elaoud S, Loukil T, Teghem J. The Pareto fitness genetic algorithm: Test function study. European Journal of Operational Research 2007; 177; 1703-1719.
8. Boettcher S, Percus A G. Nature's way of optimizing. Artificial Intelligence 2000; 119; 275-286.
9. Bak P, Sneppen K. Punctuated equilibrium and criticality in a simple model of evolution. Physical Review Letters 1993; 71(24); 4083-4086.
10. Bak P, Tang C, Wiesenfeld K. Self-organized criticality. Physical Review Letters 1987; 59; 381-384.
11. Boettcher S, Percus A G. Extremal optimization at the phase transition of the 3-coloring problem. Physical Review E 2004; 69; 066703.
12. Boettcher S. Extremal optimization for the Sherrington-Kirkpatrick spin glass. European Physics Journal B 2005; 46; 501-505.
13. Menai M E, Batouche M. Efficient initial solution to extremal optimization algorithm for weighted MAXSAT problem. IEA/AIE 2003, pp. 592-603, 2003.
14. Chen Yu-Wang, Lu Yong-Zai, Yang Genke. Hybrid evolutionary algorithm with marriage of genetic algorithm and extremal optimization for production scheduling. International Journal of Advanced Manufacturing Technology. Accepted.
15. Lu Yong-Zai, Chen Min-Rong, Chen Yu-Wang. Studies on extremal optimization and its applications in solving real world optimization problems. 2007 IEEE Symposium Series on Computational Intelligence, Hawaii, USA, April 1-5, 2007.
16. Chen Min-Rong, Lu Yong-Zai, Yang Genke. Population-based extremal optimization with adaptive Lévy mutation for constrained optimization. Proceedings of the 2006 International Conference on Computational Intelligence and Security (CIS'2006), pp. 258-261, 2006.
17. Moser I, Hendtlass T. Solving problems with hidden dynamics - comparison of extremal optimization and ant colony system. Proceedings of the 2006 IEEE Congress on Evolutionary Computation (CEC'2006), pp. 1248-1255, 2006.
18. Ahmed E, Elettreby M F. On multiobjective evolution model. International Journal of Modern Physics C 2004; 15(9); 1189-1195.
19. Elettreby M F. Multiobjective optimization of an extremal evolution model. Available at http://www.ictp.trieste.it/~pub_off
20. Das I, Dennis J. A closer look at drawbacks of minimizing weighted sums of objectives for Pareto set generation in multicriteria optimization problems. Structural Optimization 1997; 14(1); 63-69.
21. Galski R L, de Sousa F L, Ramos F M, Muraoka I. Spacecraft thermal design with the generalized extremal optimization algorithm. Proceedings of the Inverse Problems, Design and Optimization Symposium, Rio de Janeiro, Brazil, 2004.
22. De Sousa F L, Vlassov V, Ramos F M. Generalized extremal optimization: An application in heat pipe design. Applied Mathematical Modelling 2004; 28; 911-931.
23. Galski R L, de Sousa F L, Ramos F M. Application of a new multiobjective evolutionary algorithm to the optimum design of a remote sensing satellite constellation. Proceedings of the 5th International Conference on Inverse Problems in Engineering: Theory and Practice, Cambridge, UK, Vol. II, G01, 11-15 July 2005.
24. Knowles J, Corne D. The Pareto archived evolution strategy: A new baseline algorithm for multiobjective optimization. Proceedings of the 1999 Congress on Evolutionary Computation. Piscataway, NJ: IEEE Press, pp. 98-105, 1999.
25. Deb K, Pratap A, Agarwal S, Meyarivan T. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 2002; 6(2); 182-197.
26. Zitzler E, Thiele L. Multiobjective optimization using evolutionary algorithms - a comparative case study. Proceedings of the Fifth International Conference on Parallel Problem Solving from Nature (PPSN-V), Berlin: Springer, 1998.
27. Zitzler E, Laumanns M, Thiele L. SPEA2: improving the performance of the strength Pareto evolutionary algorithm. Technical Report 103, Computer Engineering and Communication Networks Lab (TIK), Swiss Federal Institute of Technology (ETH) Zurich, Gloriastrasse 35, CH-8092 Zurich, May 2001.
28. Fonseca C M, Fleming P J. An overview of evolutionary algorithms in multiobjective optimization. Evolutionary Computation 1995; 3(1); 1-16.
29. Zitzler E, Deb K, Thiele L. Comparison of multiobjective evolutionary algorithms: Empirical results. Evolutionary Computation 2000; 8(2); 173-195.
30. Srinivas N, Deb K. Multiobjective optimization using nondominated sorting in genetic algorithms. Evolutionary Computation 1994; 2(3); 221-248.
31. Goldberg D E. Genetic Algorithms in Search, Optimization and Machine Learning. Reading, MA: Addison-Wesley, 1989.
32. Coello C A C, Pulido G T, Lechuga M S. Handling multiple objectives with particle swarm optimization. IEEE Transactions on Evolutionary Computation 2004; 8(3); 256-279.
33. Fonseca C M, Fleming P J. Genetic algorithms for multiobjective optimization: Formulation, discussion and generalization. Proceedings of the Fifth International Conference on Genetic Algorithms. S. Forrest, Ed. San Mateo, CA: Morgan Kaufmann, pp. 416-423, 1993.
34. Deb K, Agrawal R B. Simulated binary crossover for continuous search space. Complex Systems 1995; 9; 115-148.
35. Yao X, Liu Y, Lin G. Evolutionary programming made faster. IEEE Transactions on Evolutionary Computation 1999; 3(2); 82-102.