TECHNISCHE UNIVERSITÄT MÜNCHEN
Lehrstuhl für Bildverstehen und wissensbasierte Systeme
Institut für Informatik

Representation and parallelization techniques for classical planning

Tim-Christian Schmidt

Complete reprint of the dissertation approved by the Fakultät für Informatik of the Technische Universität München for the award of the academic degree of Doktor der Naturwissenschaften (Dr. rer. nat.).

Chair: Univ.-Prof. Dr. Felix Brandt
Examiners of the dissertation:
1. Univ.-Prof. Michael Beetz, Ph.D.
2. Univ.-Prof. Dr. Bernhard Nebel, Albert-Ludwigs-Universität Freiburg

The dissertation was submitted to the Technische Universität München on 20.09.2011 and accepted by the Fakultät für Informatik on 14.02.2012.

Abstract

Artificial Intelligence as a field concerns itself with the study and design of intelligent agents. Such agents or systems perceive their environment, reason over their accumulated perceptions and, through this reasoning, derive a course of action that achieves their goals or maximizes their performance measure. In many cases this planning process boils down to a principled evaluation of potential sequences of actions by exploring the resulting anticipated situations of the agent and its environment, that is, to some form of combinatorial search in an implicit graph representing the interdependencies of agent actions and potential situations.

This thesis focuses on the search algorithms that implement this reasoning process for intelligent agents. The first part covers a novel subclass of such reasoning problems in which a goal-driven agent must find a sequence of actions to achieve its goal such that some value function of the sequence is closest to a reference value. Such problems occur, for example, in recommender systems and self-diagnosing agents. The contribution here is an algorithm and a family of heuristic functions based on memoization that improve on the current state of the art by several orders of magnitude.
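To make this problem class (target value search) concrete, the following toy sketch enumerates all paths from a start to a goal node in a small hand-made graph and returns the one whose accumulated cost lies closest to a given reference value. The graph, names and the exhaustive depth-first enumeration are illustrative assumptions; this is not the heuristic estimators or the BFTVS/DFTVS algorithms contributed later in the thesis.

    # Toy target value search: among all start-goal paths, pick the one whose
    # summed edge cost is closest to the reference ("target") value.
    GRAPH = {
        "s": [("a", 2), ("b", 5)],
        "a": [("b", 2), ("g", 7)],
        "b": [("g", 3)],
        "g": [],
    }

    def target_value_search(graph, start, goal, target):
        """Return (path, cost) minimizing |cost - target| over all start-goal paths."""
        best_path, best_cost = None, None

        def dfs(node, path, cost):
            nonlocal best_path, best_cost
            if node == goal:
                # Keep the path whose cost deviates least from the target value.
                if best_cost is None or abs(cost - target) < abs(best_cost - target):
                    best_path, best_cost = list(path), cost
                return
            for succ, edge_cost in graph[node]:
                path.append(succ)
                dfs(succ, path, cost + edge_cost)
                path.pop()

        dfs(start, [start], 0)
        return best_path, best_cost

    if __name__ == "__main__":
        path, cost = target_value_search(GRAPH, "s", "g", target=8)
        print(path, cost)  # ['s', 'b', 'g'] with cost 8

Note that, unlike shortest-path search, the best solution is not necessarily the cheapest one, which is what makes standard admissible heuristics inapplicable without modification.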
The second part focuses on cost- and step-optimal planning for goal-driven agents, in which the problem is to derive the least-cost action sequence that achieves the agent's goals. Here, dynamic programming can be employed to avoid redundant evaluations. The contributions of this thesis address two challenges associated with the use of dynamic programming algorithms.

The first challenge is the memory consumption of these techniques. At the heart of such search algorithms lies a memoization component that keeps track of all generated situations and the best known way to attain them, and from which potential new situations are derived. In practice, the rapid growth of this component is the limiting factor for the applicability of this class of algorithms, relegating their use to problems of lower complexity or to instances where the necessary graph traversal can be limited by other means, such as very strong but often computationally expensive heuristics. The contribution of this thesis is a data structure that significantly reduces the amount of memory required to represent such a memoization component and that can be integrated into a wide range of search algorithms.

Another challenge lies in the parallelization of such algorithms to take advantage of concurrent hardware. Conceptually, their soundness relies on maintaining a coherent memoization state across all participating threads or processes. As a state-of-the-art planner can explore on the order of millions of planning states per second, each of which has to be tested against and potentially updates said state, standard synchronization primitives generally result in prohibitive overhead. The second contribution is a technique that adaptively compartmentalizes this state based on problem structure, guaranteeing consistency across participating threads with negligible synchronization overhead. Both techniques are applicable to a wide range of search algorithms. In combination, they make it possible to exploit the significant computational advantages of dynamic programming on problems of higher complexity while profiting from the inherent parallelism of current and future computing platforms.
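To give a flavor of this compartmentalization idea, the sketch below splits a duplicate-detection table into disjoint partitions selected by an abstraction of the state, each partition guarded by its own lock, so threads expanding states in different partitions never contend. All names, the partitioning scheme and the abstraction function are assumptions made for illustration; this is only a minimal picture of per-partition synchronization, not the parallel edge partitioning technique developed in Chapter 5.

    import threading

    class PartitionedClosedSet:
        """Duplicate-detection table split into independently locked partitions."""

        def __init__(self, num_partitions, abstraction):
            self._abstraction = abstraction          # maps a state to a partition index
            self._tables = [set() for _ in range(num_partitions)]
            self._locks = [threading.Lock() for _ in range(num_partitions)]

        def insert_if_new(self, state):
            """Return True iff the state has not been seen before."""
            i = self._abstraction(state)
            with self._locks[i]:                     # contention only within one partition
                if state in self._tables[i]:
                    return False
                self._tables[i].add(state)
                return True

    if __name__ == "__main__":
        # States are tuples of variable assignments; the abstraction keeps only the
        # first variable (mod 4), so states agreeing on it share a partition.
        closed = PartitionedClosedSet(4, abstraction=lambda s: s[0] % 4)
        print(closed.insert_if_new((0, 1, 2)))   # True  - first occurrence
        print(closed.insert_if_new((0, 1, 2)))   # False - duplicate detected
        print(closed.insert_if_new((1, 1, 2)))   # True  - lands in a different partition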
Kurzfassung

Intelligent software agents are characterized by the ability to select and carry out actions on their own, using reasoning processes and knowledge about their environment, such that these actions are suited to achieving the goals they have been given. In many cases these reasoning processes are implemented at their core as heuristic search over a model of the environment and the available actions, and the scalability of this search directly affects both the complexity of the environments that can be handled and the quality of the agent's behavior. This dissertation takes up this challenge: its first part describes a novel search procedure for a class of complex search problems that arise in the context of self-diagnosing agents, and its second part describes a novel data structure and a parallelization technique for search procedures based on dynamic programming that enable their use in more complex environments.

Contents

Abstract
Kurzfassung
Contents
List of Figures
List of Tables
List of Algorithms

1 Introduction
1.1 On AI planning
1.2 Genealogy
1.3 Classical Planning
1.4 State of the Art
1.5 Motivation
1.6 Contributions

2 Preliminaries
2.1 Classical Planning
2.1.1 Propositional STRIPS
2.1.2 The Apartment Domain
2.1.3 Classical Planning - Assumptions, Classification and Complexity
2.2 Classical planning as a problem of combinatorial optimization
2.3 Graph-traversal algorithms
2.3.1 Depth-first search
2.3.2 Breadth-first search
2.4 Terminology and Conventions
2.5 Heuristics
2.5.1 Admissibility
2.5.2 Best-first search and A∗
2.5.3 Consistency

3 Target Value Search
3.1 Example Domains
3.1.1 Pervasive Diagnosis for manufacturing systems
3.1.2 Consumer Recommender Systems
3.2 Problem definition
3.2.1 Conventions
3.2.2 Complexity
3.3 Heuristics for Target Value Search
3.3.1 A straightforward approach
3.3.2 An Admissible Estimator for Target Value Search
3.3.3 Multi-interval Heuristic for Target Value Search
3.3.4 Computing the Interval Store
3.4 Algorithms for Target Value Search
3.4.1 Best-First Target Value Search
3.4.2 Depth-First Target Value Search
3.5 Empirical Evaluation
3.5.1 The Test Domains
3.5.2 Comparison of HS, BFTVS and DFTVS
3.5.3 Scaling of DFTVS
3.5.4 Interval Store Evaluation
3.6 Summary

4 State-set Representation
4.1 Background
4.1.1 Pattern Databases
4.1.2 State Representation
4.1.3 State-Sets in Unit-Cost Best-First Search
4.1.4 Set-representation techniques
4.1.5 Explicit set representations
4.1.6 Implicit Set Representations
4.2 LOES - the Level-Ordered Edge Sequence
4.2.1 Conventions
4.2.2 Prefix Tree minimization
4.2.3 Sampling representative states
4.2.4 Analyzing the sample set
4.2.5 On Prefix Tree Encodings
4.2.6 The LOES Encoding
4.2.7 Size Bounds of LOES Encodings
4.2.8 Mapping Tree Navigation and Set Operations to LOES
4.2.9 Building a Dynamic Data-Structure for Dynamic Programming based on LOES
4.2.10 Construction of a LOES code from a Lexicographically-ordered Key-Sequence
4.2.11 Virtual In-Place Merging
4.2.12 Practical Optimizations
4.2.13 Empirical Comparison of LOES and BDD in BFS-DD
4.2.14 Pattern Database Representations
4.2.15 Combined Layer Sets
4.2.16 Inverse Relation
4.2.17 Compressed LOES
4.2.18 Empirical Comparison of LOES and BDD for Pattern Database Representations
4.3 Summary

5 Parallelization of Heuristic Search
5.1 Background
5.1.1 Parallel Structured Duplicate Detection
5.2 Parallel Edge Partitioning
5.2.1 Parallel Breadth-First Search with Duplicate Detection and Edge Partitioning - an Integration Example
5.3 Empirical Comparison of PEP, PSDD, PBNF and HDA∗
5.3.1 Successor Generation
5.3.2 Performance and Scaling
5.4 Summary

6 Conclusion
7 Outlook
Bibliography

List of Figures

1.1 Schematic view of intelligent agents
1.2 Stanford Research Institute's Shakey autonomous robot
1.3 2007 DARPA Urban Challenge
1.4 Screenshot of the video game "Empire"
1.5 GOAP example plan
2.1 The apartment domain
2.2 Graphical notation for apartment domain states
2.3 Abstract architectural view of a planning system
2.4 Complexity of satisficing planning in pSTRIPS
2.5 Complexity of optimal planning in propositional STRIPS
2.6 Excerpt of the apartment domain-graph
2.7 IDDFS apartment example
2.8 BFS-DD apartment example
2.9 A∗ apartment example
2.10 Monotonicity of consistent heuristics
3.1 A modular printing system
3.2 Hiking map of the Yosemite Valley National Park
3.3 Search- and connection graph
3.4 Target value search example
3.5 Non-admissibility of admissible SP heuristics in TVS
3.6 Principle of interval heuristics
3.7 Single-interval heuristic example
3.8 Multiple-interval heuristic example