Steps Towards a Science of Heuristic Search
Total Page:16
File Type:pdf, Size:1020Kb
STEPS TOWARDS A SCIENCE OF HEURISTIC SEARCH BY Christopher Makoto Wilt MS in Computer Science, University of New Hampshire 2012 AB in Mathematics and Anthropology, Dartmouth College, 2007 DISSERTATION Submitted to the University of New Hampshire in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Computer Science May, 2014 ALL RIGHTS RESERVED c 2014 Christopher Makoto Wilt This dissertation has been examined and approved. Dissertation director, Wheeler Ruml, Associate Professor of Computer Science University of New Hampshire Radim Bartˇos, Associate Professor, Chair of Computer Science University of New Hampshire R. Daniel Bergeron, Professor of Computer Science University of New Hampshire Philip J. Hatcher, Professor of Computer Science University of New Hampshire Robert Holte, Professor of Computing Science University of Alberta Date ACKNOWLEDGMENTS Since I began my doctoral studies six years ago, Professor Wheeler Ruml has provided the support, guidance, and patience needed to take me from “never used a compiler” to “on the verge of attaining the highest degree in the field of computer science”. I would also like to acknowledge the contributions my committee have made to my development over the years. The reality of graduate school is that it requires long hours and low pay, but one factor that makes it more bearable is good company, and I have the University of New Hampshire Artificial Intelligence Research Group to thank for making those long days more bearable. The work in this dissertation was partially supported by the National Science Founda- tion (NSF) grants IIS-0812141 and IIS-1150068. In addition to NSF, this work was also partially supported by the Defense Advanced Research Projects Agency (DARPA) grant N10AP20029 and the DARPA CSSG program (grant HR0011-09-1-0021). iv TABLE OF CONTENTS ACKNOWLEDGMENTS ............................... iv LISTOFTABLES................................... viii LISTOFFIGURES .................................. x ABSTRACT ...................................... xi Chapter 1 Introduction 1 1.1 FormalBackground................................ 1 1.2 SearchAlgorithmProliferation . ..... 3 1.3 BetterSearchingthroughScience . ..... 4 1.4 DissertationOutline ............................. 5 Chapter 2 Bidirectional Search 9 2.1 Heuristic Improvement through Bidirectional Search . ........... 11 2.2 IncrementalKKAdd ............................... 14 2.3 A*withAdaptiveIncrementalKKAdd. 20 2.4 IDA*withIncrementalKKAdd . 21 2.5 EmpiricalEvaluation. 22 2.6 Limitations .................................... 30 2.7 Conclusion .................................... 30 Chapter 3 Managing the Open List 32 3.1 PreviousWork .................................. 33 3.2 PriorityQueues.................................. 35 3.3 FringeSearch ................................... 41 3.4 SelectinganImplementation. 47 3.5 Conclusion .................................... 49 v Chapter 4 Building Effective Heuristics for Greedy Best-First Search 51 4.1 Introduction.................................... 51 4.2 When is increasing w bad? ........................... 53 4.3 HeuristicRequirements . 55 4.4 Greedy Best-First Search with A* Heuristics . ....... 58 4.5 Quantifying Effectiveness in Greedy Heuristics . ......... 68 4.6 Building a Heuristic by Hill Climbing on Greedy Distance Rank Correlation 78 4.7 RelatedWork................................... 81 4.8 Conclusion .................................... 83 Chapter 5 Why d is Better than h 85 5.1 OperatorCosts .................................. 85 5.2 The importance of d ............................... 96 5.3 RelatedWork................................... 119 5.4 Conclusion .................................... 121 Chapter 6 Conclusion 123 vi LIST OF TABLES 2-1 CPU seconds (T), expansions (E), and heuristic evaluations (H) (both in thousands) required to find optimal solutions. 27 3-1 Average performance of Fringe Search and A* with different open list implementations on five benchmark domains. A* (All Nodes) denotes the number of expansions done by A* expanding all nodes in the final f layer, instead of stopping at the first provably optimal solution. Bold denotesthebestvalueforeachmetric. 37 3-2 Average total solving time of A* using a binary heap and a Fast Float Heapondifferentproblems. 41 3-3 Sign test between FFH and binary heap A* solving the Towers of Hanoi 41 3-4 A* using 3 different kinds of open lists solving the inverse15puzzle . 48 4-1 Domain Attributes for benchmark domains considered . ....... 53 4-2 Average number of nodes expanded to solve 51 12 disk Towers of Hanoi problems. ................................... 58 4-3 Amount of work required by Greedy best-first search and A* to solve 3x4 tile instances with different pattern databases. DNF denotes at least one instance would require more than 8GB to solve. 66 4-4 Average % error and correlation between h(n) and h∗(n)......... 69 4-5 Correlation between h(n) and d∗(n) .................... 74 4-6 Average % error, correlation between h(n) and h∗(n), and correlation between h(n) and d∗(n) inCityNavigation55 . 76 4-7 Expansions to solve TopSpin problem with the stripe cost function using differentPDBs ................................ 79 vii 5-1 DifficultyofsolvingthepuzzlefromFigure5-1 . ..... 86 5-2 Failure rate of A* on tile puzzles with different cost functions, and confi- denceintervalforfailurerate . 87 5-3 A*expansionsondifferentproblems . 88 5-4 Sizes of local minima and average number of expansions required to find asolution.................................... 94 5-5 Expansions required by greedy best-first search to solve different pancake puzzlesusingaconstantsize(7cake)PDB. 95 5-6 Correlation of heuristic with search effort in various domains using Kendall’s τ, Spearman’s ρ, and Pearson’s r ...................... 98 5-7 Algorithmsonunitcostdomains . 115 5-8 Solving the 4x4 face3 slidingtilepuzzle. 115 5-9 Solving14-pancake(cost=sum)problems. 117 5-10 GridPathPlanningwithLifeCosts. 118 viii LIST OF FIGURES 1-1 ExamplenavigationproblemfromWarcraft2 . .... 2 1-2 Graphcreatedfromnavigationproblem . ... 2 2-1 TopSpinpuzzle ................................ 10 2-2 TowersofHanoiproblemwith4pegsand6disks . 10 2-3 Solving time of A* and four bidirectional searches on the Towers of Hanoi problem .................................... 12 2-4 Algorithm1: A*withIncrementalKKAdd . 15 2-5 Algorithm 2: Forwards Search for Incremental KKAdd . ...... 15 2-6 Algorithm 3: Backwards Search for Incremental KKAdd . ...... 16 2-7 A city navigation problem with np = nc = 3, with 15 cities and 15 locationsineachcity.............................. 24 3-1 FastFloatHeap................................ 39 4-1 Domains where increasing the weight speeds up search . ....... 55 4-2 Domains where increasing the weight slows down search . ....... 56 4-3 The minimum h value on open as the search progresses, using different patterndatabases............................... 59 4-4 Two Towers of Hanoi states, one near a goal (top) and one far from a goal (bottom). Note that the goal peg is the leftmost peg. 60 4-5 The minimum h value on open as the search progresses solving a Tow- ers of Hanoi problem using disjoint pattern databases with different cost functions, square on left, reverse square on right. ...... 63 4-6 TopSpinpuzzlewithdifferentheuristics . ..... 64 4-7 Different tile abstractions. DNF denotes at least one instance would re- quiremorethan8GBtosolve.. 66 ix 4-8 Plot of h(n) vs h∗(n), and h(n) vs d∗(n) for City Navigation 4 4 . 70 4-9 Average log of expansions with different heuristics, plotted with different GDRC..................................... 75 4-10 Average log of expansions with each of the possible 462 5/6 disjoint PDB heuristics for the 3x4 sliding tile puzzle, plotted against GDRC ..... 76 4-11 Expansions with differing number of neighbors for citiesandplaces . 77 4-12 HillClimbingPDBBuilder . 78 5-1 Left: 15 puzzle instance with a large heuristic minimum. Right: State in whichthe14tilecanbemoved. 86 5-2 Examples of heuristic error accumulation along a path . ........ 91 5-3 Picture of one or two of local minima, depending on how they are defined 99 5-4 Example of how to remove extra nodes from a supergraph . 102 5-5 Anexampleofashortcuttree. 103 5-6 Asearchtreewithalocalminimum. 107 5-7 The minimum h value on open as the search progresses, using a disjoint PDB....................................... 110 x ABSTRACT STEPS TOWARDS A SCIENCE OF HEURISTIC SEARCH by Christopher Makoto Wilt University of New Hampshire, May, 2014 Heuristic search is one of the most important graph search techniques, allowing people to find paths through graphs that can have more nodes than there are atoms in the universe. There has been much research in developing heuristic search algorithms, but much less work on when and why those new heuristic search algorithms work well. Every heuristic search algorithm makes assumptions about the nature of the graph to be searched and the heuristic, and these assumptions allow the algorithm to perform well in certain circumstances, but perform poorly when those assumptions turn out to be false. The thesis of this dissertation is that understanding the assumptions behind heuristic search algorithms can be used to better select heuristic search algorithms, and to improve heuristic search algorithms. We begin by discussing how assumptions made during optimal heuristic search impact performance, and how directly addressing these assumptions can lead to improved perfor- mance. We next turn our attention to implementation