On the Time Complexity of Algorithm Selection Hyper-Heuristics for Multimodal Optimisation
Total Page:16
File Type:pdf, Size:1020Kb
The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19) On the Time Complexity of Algorithm Selection Hyper-Heuristics for Multimodal Optimisation Andrei Lissovoi, Pietro S. Oliveto, John Alasdair Warwicker Rigorous Research, Department of Computer Science The University of Sheffield, Sheffield S1 4DP, United Kingdom fa.lissovoi,p.oliveto,j.warwickerg@sheffield.ac.uk Abstract of hard combinatorial optimisation problems (see (Burke et al. 2010; 2013) for surveys of results). Selection hyper-heuristics are automated algorithm selec- Despite their numerous successes, very limited rigor- tion methodologies that choose between different heuristics ous theoretical understanding of their behaviour and per- during the optimisation process. Recently selection hyper- ¨ heuristics choosing between a collection of elitist randomised formance is available (Lehre and Ozcan 2013; Alanazi and local search heuristics with different neighbourhood sizes Lehre 2014). Recently, it has been proved that a simple se- have been shown to optimise a standard unimodal benchmark lection hyper-heuristic called Random Descent (Cowling, function from evolutionary computation in the optimal ex- Kendall, and Soubeiga 2001; 2002), choosing between eli- pected runtime achievable with the available low-level heuris- tist randomised local search (RLS) heuristics with differ- tics. In this paper we extend our understanding to the domain ent neighbourhood sizes, optimises a standard unimodal of multimodal optimisation by considering a hyper-heuristic benchmark function from evolutionary computation called from the literature that can switch between elitist and non- LEADINGONES in the best possible runtime (up to lower elitist heuristics during the run. We first identify the range order terms) achievable with the available low-level heuris- of parameters that allow the hyper-heuristic to hillclimb ef- ficiently and prove that it can optimise a standard hillclimb- tics (Lissovoi, Oliveto, and Warwicker 2018). For this to oc- ing benchmark function in the best expected asymptotic time cur, it is necessary to run the selected low-level heuristics achievable by unbiased mutation-based randomised search for a sufficient amount of time, called the learning period, to heuristics. Afterwards, we use standard multimodal bench- allow the hyper-heuristic to accurately determine how use- mark functions to highlight function characteristics where ful the chosen operator is at the current stage of the opti- the hyper-heuristic is efficient by swiftly escaping local op- misation process. If the learning period is not sufficiently tima and ones where it is not. For a function class called long, the hyper-heuristic fails to identify whether the chosen CLIFFd where a new gradient of increasing fitness can be low-level heuristics are actually useful, leading to random identified after escaping local optima, the hyper-heuristic is choices of which heuristics to apply and a degradation of extremely efficient while a wide range of established elitist the overall performance. Since the optimal duration of the and non-elitist algorithms are not, including the well-studied Metropolis algorithm. We complete the picture with an anal- learning period may change during the optimisation, (Doerr et al. 2018) have recently introduced a self-adjusting mecha- ysis of another standard benchmark function called JUMPd as an example to highlight problem characteristics where the nism and rigorously proved that it allows the hyper-heuristic hyper-heuristic is inefficient. Yet, it still outperforms the well- to track the optimal learning period throughout the optimi- established non-elitist Metropolis algorithm. sation process for LEADINGONES. In this paper we aim to extend the understanding of the be- haviour and performance of hyper-heuristics to multimodal 1 Introduction optimisation problems. In order to evaluate their capability Selection hyper-heuristics are automated algorithm selec- at escaping local optima, we consider elitist and non-elitist tion methodologies designed to choose which of a set of selection operators that have been used in hyper-heuristics low-level heuristics to apply in the next steps of the opti- in the literature. (Cowling, Kendall, and Soubeiga 2001; misation process (Cowling, Kendall, and Soubeiga 2001). 2002) introduced two variants of move acceptance oper- Rather than deciding in advance which heuristics to apply ators: the elitist ONLYIMPROVING (OI) operator, which for a problem, the aim is to automate the process at run- only accepts moves that improve the current solution, and time. Originally shown to effectively optimise scheduling the non-elitist ALLMOVES (AM) operator, which accepts problems, such as the scheduling of a sales summit and uni- any new solution independent of its quality. Another se- versity timetabling (Cowling, Kendall, and Soubeiga 2001; lection operator that has been considered in the literature 2002), they have since been successfully applied to a variety is the IMPROVINGANDEQUAL (IE) acceptance operator which in addition to accepting improving solutions also ac- Copyright c 2019, Association for the Advancement of Artificial cepts solutions of equal quality (Ayob and Kendall 2003; Intelligence (www.aaai.org). All rights reserved. Bilgin, Ozcan,¨ and Korkmaz 2007; Ozcan,¨ Bilgin, and Kork- 2322 maz 2006). In the mentioned works, the acceptance operator Algorithm 1 Move Acceptance Hyper-Heuristic remains fixed throughout the run, with the hyper-heuristic (MAHHOI) (Lehre and Ozcan¨ 2013) only switching between different mutation operators. How- 1: Choose x 2 f0; 1gn uniformly at random ever, it would be more desirable that hyperheuristics are al- 2: while termination criteria not satisfied do lowed to decide to use elitist selection for hillclimbing in 3: x0 FLIPRANDOMBIT(x) exploitation phases of the search and non-elitism for explo- ALLMOVES with probability p ration, for instance to escape from local optima. 4: ACC ONLYIMPROVING otherwise (Qian, Tang, and Zhou 2016) analysed selection hyper- 0 0 heuristics in the context of multi-objective optimisation. 5: if ACC(x; x ) then x x They considered a hyper-heuristic selecting between elitist (IE) and strict elitist (OI) acceptance operators and pre- sented a function where it is necessary to mix the two ac- based unbiased randomised search heuristic (Lehre and Witt ceptance operators. Lehre and Ozcan¨ presented a hyper- 2012) even though it does not rely on elitism. Afterwards we heuristic which chooses between two different low level highlight the power of the hyper-heuristic by analysing its heuristics that use the above described OI (elitist) and AM performance for standard benchmark function classes cho- (non-elitist) acceptance operators (Lehre and Ozcan¨ 2013). sen because they allow us to isolate important properties of The considered Simple Random hyper-heuristic uses 1-bit multimodal optimisation landscapes. flips as mutation operator and selects the AM acceptance Firstly, we consider the CLIFFd class of functions, consist- operator with probability p and the OI acceptance operator ing of a local optimum that, once escaped, allows the iden- with probability 1 − p. Essentially the algorithm is a Ran- tification of a new slope of increasing fitness. For this class dom Local Search algorithm that switches between strict of functions we rigorously prove that the hyper-heuristic ef- elitism, by only accepting improvements, and extreme non- ficiently escapes the local optima and even achieves the best elitism by accepting any new found solution. For the stan- known expected runtime of O(n log n) (for general purpose dard ROYALROADk benchmark function from evolutionary randomised search heuristics) on the hardest instances of the computation, which consists of several blocks of k bits each function class (Corus, Oliveto, and Yazdani 2017). Thus, that have to be set correctly to observe a fitness improve- we prove that it considerably outperforms established eli- ment, and k ≥ 2 they theoretically proved that it is necessary tist and non-elitist evolutionary algorithms and the popular to mix the acceptance operators for the hyper-heuristic to METROPOLIS algorithm. We complete the picture by con- be efficient, because by only accepting improvements (i.e., sidering the standard JUMPk multimodal instance class of p = 0) the runtime is infinite due to the hyper-heuristic not functions to provide an example of problem characteristics being able to cross the plateaus of equal fitness while by al- where the hyper-heuristic is not efficient. This class is cho- ways accepting any move (i.e., p = 1), the algorithm simply sen to isolate problems where escaping local optima is very performs a random search. By choosing the value of the pa- hard because it is difficult to identify a new slope of increas- rameter p appropriately they provide an upper bound on the ing gradient. Nevertheless, the hyper-heuristic is efficient for expected runtime of the hyper-heuristic of O(n3·k2k−3) ver- k instances of moderate jump size (i.e., constant) where it still sus the O(n log(n) · (2 =k)) expected time required by evo- considerably outperforms METROPOLIS. lutionary algorithms with standard bit mutation (i.e., each Due to space constraints in this extended abstract, we omit bit is flipped with probability 1=n) and with the standard some proofs. selection operator that accepts new solutions if their fitness is as good as that of their parent (Doerr, Sudholt, and Witt 2013). In particular, just using the IE acceptance operator 2 Preliminaries