Symbolic Search in Planning and General Game Playing
Total Page:16
File Type:pdf, Size:1020Kb
Symbolic Search in Planning and General Game Playing Peter Kissmann Dissertation zur Erlangung des Grades eines Doktors der Ingenieurwissenschaften – Dr.-Ing. – Vorgelegt im Fachbereich 3 (Mathematik und Informatik) der Universitat¨ Bremen im Mai 2012 ii Gutachter: Prof. Dr. Stefan Edelkamp Universitat¨ Bremen Prof. Dr. Malte Helmert Universitat¨ Basel Datum des Kolloquiums: 17. September 2012 Contents Acknowledgements vii Zusammenfassung ix 1 Introduction 1 1.1 Motivation . .1 1.2 State Space Search . .3 1.3 Binary Decision Diagrams and Symbolic Search . .4 1.4 Action Planning . .5 1.5 General Game Playing . .6 1.6 Running Example . .8 I Binary Decision Diagrams9 2 Introduction to Binary Decision Diagrams 11 2.1 The History of BDDs . 12 2.2 Efficient Algorithms for BDDs . 13 2.2.1 Reduce . 13 2.2.2 Apply . 14 2.2.3 Restrict, Compose, and Satisfy . 14 2.3 The Variable Ordering . 15 3 Symbolic Search 19 3.1 Forward Search . 20 3.2 Backward Search . 21 3.3 Bidirectional Search . 22 4 Limits and Possibilities of BDDs 23 4.1 Exponential Lower Bound for Permutation Games . 24 4.1.1 The Sliding Tiles Puzzle . 24 4.1.2 Blocksworld . 24 4.2 Polynomial Upper Bounds . 27 4.2.1 Gripper . 27 4.2.2 Sokoban . 30 4.3 Polynomial and Exponential Bounds for Connect Four . 31 4.3.1 Polynomial Upper Bound for Representing Reachable States . 32 4.3.2 Exponential Lower Bound for the Termination Criterion . 35 4.3.3 Experimental Evaluation . 37 4.4 Conclusions and Future Work . 37 iii iv CONTENTS II Action Planning 41 5 Introduction to Action Planning 43 5.1 Planning Tracks . 44 5.2 The Planning Domain Definition Language PDDL . 44 5.2.1 The History of PDDL . 45 5.2.2 Structure of a PDDL Description . 47 6 Classical Planning 53 6.1 Step-Optimal Classical Planning . 54 6.1.1 Symbolic Breadth-First Search . 54 6.1.2 Symbolic Bidirectional Breadth-First Search . 55 6.2 Cost-Optimal Classical Planning . 56 6.2.1 Running Example with General Action Costs . 57 6.2.2 Symbolic Dijkstra Search . 58 6.2.3 Symbolic A* Search . 61 6.3 Heuristics for Planning . 66 6.3.1 Pattern Databases . 67 6.3.2 Partial Pattern Databases . 73 6.4 Implementation and Improvements . 74 6.4.1 Replacing the Matrix . 75 6.4.2 Reordering the Variables . 76 6.4.3 Incremental Abstraction Calculation . 78 6.4.4 Amount of Bidirection . 78 6.5 International Planning Competitions (IPCs) and Experimental Evaluation . 79 6.5.1 IPC 2008: Participating Planners . 80 6.5.2 IPC 2008: Results . 82 6.5.3 Evaluation of the Improvements of Gamer . 82 6.5.4 IPC 2011: Participating Planners . 85 6.5.5 IPC 2011: Results . 87 7 Net-Benefit Planning 91 7.1 Running Example in Net-Benefit Planning . 92 7.2 Related Work in Over-Subscription and Net-Benefit Planning . 94 7.3 Finding an Optimal Net-Benefit Plan . 95 7.3.1 Breadth-First Branch-and-Bound . 96 7.3.2 Cost-First Branch-and-Bound . 97 7.3.3 Net-Benefit Planning Algorithm . 98 7.4 Experimental Evaluation . 100 7.4.1 IPC 2008: Participating Planners . 100 7.4.2 IPC 2008: Results . 100 7.4.3 Improvements of Gamer . 101 III General Game Playing 105 8 Introduction to General Game Playing 107 8.1 The Game Description Language GDL . 108 8.1.1 The Syntax of GDL . 109 8.1.2 Structure of a General Game . 110 8.2 Differences between GDL and PDDL . 113 8.3 Comparison to Action Planning . 114 CONTENTS v 9 Instantiating General Games 117 9.1 The Instantiation Process . 118 9.1.1 Normalization . 118 9.1.2 Supersets . 119 9.1.3 Instantiation . 123 9.1.4 Mutex Groups . 124 9.1.5 Removal of Axioms . 126 9.1.6 Output . 126 9.2 Experimental Evaluation . 128 10 Solving General Games 133 10.1 Solved Games . 134 10.2 Single-Player Games . 141 10.3 Non-Simultaneous Two-Player Games . 143 10.3.1 Two-Player Zero-Sum Games . 143 10.3.2 General Non-Simultaneous Two-Player Games . 144 10.3.3 Layered Approach to General Non-Simultaneous Two-Player Games . 147 10.4 Experimental Evaluation . 150 10.4.1 Single-Player Games . 150 10.4.2 Two-Player Games . 153 11 Playing General Games 157 11.1 The Algorithm UCT . 158 11.1.1 UCB1 . 158 11.1.2 Monte-Carlo Search . 159 11.1.3 UCT . 159 11.1.4 UCT Outside General Game Playing . 160 11.1.5 Parallelization of UCT . 163 11.2 Other General Game Players . 165 11.2.1 Evaluation Function Based Approaches . 165 11.2.2 Simulation Based Approaches . 168 11.3 The Design of the General Game Playing Agent Gamer . 171 11.3.1 The GGP Server . 171 11.3.2 The Communicators . 172 11.3.3 The Instantiators . ..