Efficiency and Abstraction in Task and Motion Planning

by William Vega-Brown

Submitted to the Department of Mechanical Engineering in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Mechanical Engineering at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

February 2020

©Massachusetts Institute of Technology 2020. All rights reserved.

Author: Signature redacted
Department of Mechanical Engineering, January 8, 2020

Certified by: Signature redacted
Nicholas Roy, Professor of Aeronautics and Astronautics, Thesis Supervisor

Certified by: Signature redacted
John Leonard, Professor of Mechanical Engineering

Accepted by: Signature redacted
Nicolas Hadjiconstantinou, Professor of Mechanical Engineering
Chairman, Committee on Graduate Students


Efficiency and Abstraction in Task and Motion Planning
by William Vega-Brown

Submitted to the Department of Mechanical Engineering on January 8, 2020, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Mechanical Engineering

Abstract

Modern robots are capable of complex and highly dynamic behaviors, yet the decision-making algorithms that drive them struggle to solve problems involving complex behaviors like manipulation. The combination of continuous and discrete dynamics induced by contact creates severe computational challenges, and most known practical approaches rely on hand-designed discrete representations to mitigate computational issues. However, the relationship between the discrete representation and the physical robot is poorly understood and cannot easily be empirically verified, and so many planning systems are brittle and prone to failure when the robot encounters situations not anticipated by the model designer.

This thesis addresses the limitations of conventional representations for task and motion planning by introducing a constraint-based representation that explicitly places continuous and discrete dynamics on equal footing. We argue that the challenges in modelling problems with both discrete and continuous dynamics can be reduced to a trade-off between model complexity and empirical accuracy. We propose the use of abstraction to combine models that balance those two constraints differently, and we claim that by using abstraction we can build systems that reliably generate high-quality plans, even in complex domains with many objects.

Using our representation, we construct and analyze several new algorithms, providing new insight into long-standing open problems about the decidability and complexity of motion planning. We describe algorithms for sampling-based planning in hybrid domains, and show that these algorithms are complete and asymptotically optimal for systems that can be defined by analytic constraints. We also show that the reachability problem can be decided using polynomial space for systems described by polynomial constraints satisfying certain technical conditions. This class of systems includes many important robotic planning problems, and our results show that the decision problem for several benchmark task and motion planning languages is PSPACE-complete.

Thesis Supervisor: Nicholas Roy
Title: Professor of Aeronautics and Astronautics

Committee Chair: John Leonard
Title: Professor of Mechanical Engineering

Acknowledgments

This work would not have been possible without the help of many people. I thank my advisor, Nick Roy. His insight has been invaluable, and his guidance helped shape both the ideas in this thesis and the way I approach research.

I thank my committee: John Leonard, for his willingness to guide me through the hidden paths of my department, as well as for his help and support in all my years at MIT; Leslie Kaelbling, for many conversations about planning in robotics, for helping me to frame my work, and for her guidance in better understanding the planning literature; Dan Koditschek, for his many technical and theoretical insights, for sharing the resources of his lab, and especially for steering me towards the tools of semialgebraic and o-minimal geometry; Alberto Rodriguez, for his help and advice in our meetings; and Marc Toussaint, for helping to expand my understanding of what a planning problem is and what planning algorithms can do.

I thank my colleagues and fellow students: Rohan Paul, whose ideas and insights helped shape both the problems I tackled and the way I approached them; Vasileios Vasilopoulos and Turner Topping, for their tireless efforts in putting the ideas in this thesis to work on the Minitaur; Charlie Richter, who taught me the importance of doing one thing at a time; as well as all the members of the Robust Robotics Group.

I thank Laura Hallock, whose patience as I walked this long and winding road was nothing less than saintly, and whose support, both emotional and practical, meant more to me than she knows.

Finally, I thank my parents, without whom I could be neither where I am nor who I am. I am eternally grateful for all you have done for me.

Author's Note

The version of this document you are reading was submitted to the MIT libraries as a printed copy. The most recent version of this document, which includes color figures as well as digital tools like hyperlinks and cross-references, is available online at https://people.csail.mit.edu/wrvb/.

Contents

1 Introduction 1

1.1 The challenges of deliberative planning ...... 3

1.2 Planning in robotics ...... 4

1.2.1 Complete algorithms for path planning ...... 5

1.2.2 Algorithms with weaker guarantees ...... 7

1.2.3 Factored discretization ...... 8

1.2.4 Discrete planning ...... 9

1.2.5 Combined Task and Motion Planning ...... 11

1.3 Representation, Efficiency, and Verification ...... 16

1.4 Statement of contributions ...... 19

1.5 Structure of this document ...... 20

2 The Continuous Constraint Contract Representation 23

2.1 States and variables ...... 25

2.2 Actions ...... 26

2.3 Axioms, knowledge, and constraints ...... 29

2.4 Planning problems ...... 30

2.5 Conclusions ...... 31

3 Randomized Algorithms for Planning 35

3.1 Asymptotic Optimality under Piecewise-Analytic Differential Constraints 36

3.2 Demonstration ...... 44

3.3 Compact discretizations with factored graphs ...... 45

3.3.1 Discretizing C3 ...... 45
3.4 A Proof that SRGG is Asymptotically Optimal ...... 51

3.4.1 Construction of a sequence of paths ...... 51

3.4.2 Construction of balls on the intersections between strata ...... 52
3.4.3 Construction of balls on an arbitrary leaf ...... 52
3.4.4 Bounding the cost of the path returned ...... 55
3.4.5 Proofs of propositions ...... 56

4 Planning with Abstraction 69
4.1 Admissible angelic semantics ...... 70
4.1.1 Problem Formulation ...... 71
4.1.2 Angelic Semantics ...... 73
4.1.3 Admissible Abstractions ...... 77
4.1.4 Abstractions in C3 ...... 85
4.2 Abstract planning algorithms ...... 86
4.2.1 Angelic A* ...... 87
4.2.2 Acyclic Angelic A* ...... 95
4.2.3 Approximate Angelic A* ...... 98
4.3 Demonstration ...... 102
4.4 Analysis ...... 105
4.5 Notes and Extensions ...... 108

5 Exact Algorithms for Robot Planning 111
5.1 Transition systems ...... 112

5.2 Undecidability of Semialgebraic Transition Systems ...... 115
5.3 Uniform Accessibility, Decidability, and Complexity ...... 118

5.3.1 Stratification ...... 118
5.3.2 Uniform accessibility ...... 119

5.3.3 Uniform stratified accessibility ...... 121

5.3.4 Deciding stratifiable semialgebraic transition systems ...... 122

5.4 Relationships between representations ...... 127

5.4.1 Analytic functions and semianalytic sets ...... 128
5.4.2 Duality for analytic representations ...... 130

5.4.3 Computability and approximation ...... 134
5.5 Implications ...... 136

6 Conclusions 139

6.1 Summary of contributions ...... 139

6.2 Open problems ...... 140

6.3 A vision for verifiable performant systems ...... 141

Bibliography 143

List of Figures

3.1 Example planning domain ...... 38

3.2 Comparison ...... 45

3.3 Coupled action subgraphs ...... 48

4.1 Angelic constraints ...... 74

4.2 Convex angelic bounds ...... 82

4.3 Door puzzle ...... 84

4.4 Successor operation ...... 89

4.5 Plan tree solution ...... 90

4.6 Plan tree problem ...... 91
4.7 Cyclic paths ...... 94

4.8 Search trees ...... 103
4.9 Quantitative evaluation of A** ...... 104

List of Tables

4.1 Quantitative evaluation of A** ...... 103

List of Algorithms

1 Stratified Random Geometric Graph construction (SRGG) ...... 40
2 Sampling from a C3 domain ...... 46
3 Angelic A* ...... 88
4 Acyclic angelic A* ...... 96
5 Checking for cycles ...... 96
6 Approximate angelic A* ...... 100
7 Approximate angelic search priority ...... 101
8 Semialgebraic reachability ...... 124
9 An algorithm for determining plan existence ...... 125

Chapter 1

Introduction

The proliferation of complex robotic platforms has had a profound impact on science, industry, transportation, and medicine, and this impact can only be expected to grow over time. However, building complex physical systems that behave intelligently when interacting with their environment, even at the level of a human child, remains a stubbornly difficult task. State-of-the-art systems still rely heavily on the guidance of expert operators to make high-level decisions. This reliance on operators constrains the domains where robots can be deployed, and, ultimately, limits the impact of robotics.

Imagine that two hundred and fifty million kilometers away, a robot toils away extracting iron from an asteroid. It is impractical and unsafe for a human to accompany the robot, and so its operator remains on earth. It takes fourteen minutes for a message sent from its operator on earth to reach the robot; it takes another fourteen minutes to get a response. Its fuel supply is starkly limited; so far from the sun, solar panels provide far less power than they do in earth orbit, and on-board fuel reserves only last so long. In the half-hour between communications, the robot must act with complete autonomy, making the best use of its dwindling fuel to take steps that bring the robot closer to achieving its goals, while avoiding catastrophic missteps.

Even when it is safe for an operator to directly command a robot, it may not be economically feasible. Imagine a commercial housekeeping robot [78]; an untrained user wants to command a robot to clean a cluttered room. A human would hardly need to think before beginning to work, and the odds of catastrophically bad performance would be minimal. The robot, on the other hand, may be stymied by the sheer number of decisions required to make progress: which object should it reach for first? With which arm should it reach? How should it grasp the object? Where should it place it? Without human intuition for which questions are worth thinking about and which decisions are likely to lead to success, the robot might treat every decision as critical, leading to long delays as it thinks, or treat no decision as critical, leading to poor performance as the robot chooses a poor plan.

Finally, even when an expert is available, some planning problems are too difficult for human operators to solve. Imagine attempting to construct an optimal plan for manufacturing or assembling a part. The number of possible plans is staggeringly large, and at large enough scale even small improvements could have an enormous economic impact. We have already seen automated decision-making systems surpass humans at classical discrete games, and it is common in industry for automated process optimization to yield results that surpass human plans. Physical reasoning has not yet achieved such superhuman performance.

Three threads unify these examples. First, for various reasons (lack of expertise, physical limitations, the limitations of human mental capacity) the robot must make decisions about how to interact with the world without substantial human intervention. Second, the robot must interact with the physical world, making and breaking contact with unfamiliar objects in unfamiliar scenarios, and making decisions about continuous and highly dynamic physical processes. Finally, the decisions the robot must make only indirectly affect its ultimate objectives: we cannot know whether the choice of which object to grasp first matters unless we first grasp the entire plan.

These scenarios evoke concrete technical challenges. How are we to design systems meant to interact with a wide variety of complex objects? How are we to verify that a system composed of many complex, interacting components will behave as we intend? How are we to make decisions in problems with a long planning horizon, where the first few steps of a plan and the ultimate objective are related only by a long sequence of subgoals and contingencies? Taken collectively, these challenges present serious difficulties to the designers of autonomous systems. These are problems the robotics community has sought to address since its earliest days, and yet many foundational questions have gone unaddressed.

1.1 The challenges of deliberative planning

Fundamentally, any autonomous system capable of addressing these challenges must be able to distinguish between good plans and bad plans, and it must be able to do so within the bounds of its resource constraints. A model for planning thus serves two distinct purposes: first, it allows a decision-making system to evaluate a plan without first executing it in the real world, and second, it structures the space of possible plans, so that the search for a good plan can be done efficiently.

These two requirements are in tension. Modern simulators [38, 62, 136, 181, 190] can accurately predict the effects of plans for most robotic systems, and are limited primarily by the challenges associated with measuring the physical properties of systems. However, even if we knew the physical properties exactly, simulators do not expose sufficient structure to allow efficient planning. In fact, planning is undecidable for the full range of dynamic models handled by modern simulators [6]: any algorithm must either be incorrect on some instance or must fail to return an answer on some instance.

Conversely, highly structured representations like PDDL [58] can be quite concise and do enable efficient planning, but it is difficult to verify that they are accurate representations of a given physical system. Without additional structure, there is no guarantee that a plan from a classical structured representation will be realizable by a physical system, nor can we guarantee that if a realizable plan exists, it will be considered feasible by the classical representation. Most modern approaches to planning in complex domains combine a structured representation to guide search and a high-fidelity model to validate results; for example, see the recent work of Lagriffoul et al. [131]. While the combination of approaches does dramatically accelerate planning in practice, it does not circumvent the underlying issues of efficiency and verifiability. Intricate geometry remains a challenge even for modern systems, and the interplay between an abstract discrete representation and the underlying high-fidelity model is poorly understood, making it difficult to verify the completeness or correctness of a combined planning system.

Disambiguating between what must be included in a model for it to be "correct" and what can be freely chosen as "advice" to the planner is hard, since "correct" is not well-defined. As a result, we have an important practical question (how should we represent planning problems?) and minimal theoretical guidance with which to answer it. An empirical answer is precluded by the fact that we cannot disentangle properties of the particular solver from properties of the planning domain.

In this thesis, we attempt to resolve these difficult representational questions. We introduce a new formalism for planning, which models the world in terms of contracts defined by constraints. Our representation addresses the tension between verifiability and efficiency by supporting abstraction with well-defined semantics. Abstraction allows us to model a system at varying levels of resolution and use lower-resolution models to guide the search at higher levels of resolution. We show that with a better understanding of the theoretical issues at hand, all non-computational issues evaporate, giving a new way to understand existing research. Moreover, our representational framework allows us to make precise the question of whether a given model is a "correct" representation of a physical system and reduce the answer to objective and empirical facts.

In the remainder of this chapter, we first formalize the planning problem. We then trace the historical development of major families of solutions. We articulate in more detail the limitations and open questions around conventional representations.

Finally, we review our contributions, and outline the remainder of this document.

1.2 Planning in robotics

When we speak of planning in this document, we specifically mean high-level deliberative planning, the sort of planning that uses a model of the world and a specified task to decide how the robot ought to behave. Informally, the planning problem is to find a plan, a sequence of decisions (control inputs, waypoints, operators) that will, according to a model of how the world evolves, take the robot and the world from its initial state to some state consistent with a specified objective. The planning problem has been studied in various forms since the earliest days of robotics [139, 157], and it is a fundamental building block of intelligent embodied systems. In this document, we focus exclusively on deterministic, fully observable systems, those in which the state of the world is fully known and the effects of each decision are fully known. While determinism and observability are very limiting assumptions, they capture much of the complexity of the planning problem and serve as a useful lower bound on the more general problem of planning under uncertainty.
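For concreteness, the deterministic, fully observable setting just described admits a standard generic formalization (given here only as a reference point; the representation developed in this thesis is richer): a problem is a tuple $\mathcal{P} = (S, U, f, s_0, S_{\mathrm{goal}})$, where $S$ is the set of world states, $U$ the set of available decisions, $f : S \times U \to S$ a deterministic transition model, $s_0 \in S$ the initial state, and $S_{\mathrm{goal}} \subseteq S$ the goal set. A plan is a finite sequence of decisions $u_1, \dots, u_T$ whose induced states $s_t = f(s_{t-1}, u_t)$ satisfy $s_T \in S_{\mathrm{goal}}$.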

1.2.1 Complete algorithms for path planning

Perhaps the best-known example of a robotic planning problem is the path planning problem, in which the robot comprises one or more rigid bodies linked together by hinges and is tasked with traversing Euclidean space while avoiding obstacles. The path planning problem is sometimes called the "piano movers problem" due to its clear relation to the question of whether an unwieldy object like a piano can be made to fit through a given environment. Despite its apparent simplicity (there are no nontrivial dynamics at play, and the workspace may only be two-dimensional) the path planning problem is already computationally intractable. The complexity of the path planning problem is due to the fact that the configuration space [140] of the robot grows exponentially with the number of links, and as a result the length of the shortest path can be exponential in the number of links.

The exponential worst-case complexity of motion planning is unavoidable. Reif [162] showed that the path planning problem is PSPACE-hard (that is, as hard as any problem that can be solved using only polynomial space). The general path planning problem was solved by Schwartz and Sharir [170], who took advantage of a geometric technique called the cylindrical algebraic decomposition [35] to decompose the configuration space into a finite collection of cells, reducing the path planning problem to a search through a finite graph. Unfortunately, the cylindrical algebraic decomposition for a problem with d links might contain O(2^{2^d}) cells, meaning that the algorithm of Schwartz and Sharir required doubly-exponential time. The doubly-exponential time bound was later improved to singly-exponential time by Ben-Or et al. [14], and finally the problem was shown by Canny [27] to be PSPACE-complete.

The algorithms required to solve the path planning problem exactly are incredibly complex, and generally held to be impractical for problems of realistic size [110]. Still, the analysis of the piano movers problem illustrates several recurring themes in robotic planning. First, the complexity of planning is often dominated by the size of the state space, not by the number of actions or the geometric complexity of the environment.

There are any number of problems [48, 89, 96] that are quite simple in every respect except the size of the state space, yet are asymptotically as difficult as the generalized piano movers problem. The exponential growth in complexity with the size of the state space is of particular concern if we want to solve planning problems in which a robot interacts with the world around it: in domains with many movable objects, each object carries its own degrees of freedom, suggesting there is little hope of a tractable, complete algorithm for domains with more than a few objects.

Second, the key step in deciding if a continuous planning problem has a solution is the construction of a suitable discretization. Many continuous planning problems become easy if we can reduce them to the search for a path through a small directed graph; much of the complexity of modern motion planning stems from the challenges inherent to constructing that discretization, or from the challenges in searching the discretization when it becomes large. Often, the key step in formulating a practical algorithm with theoretical guarantees is devising a clever way to construct a discretization that is not exponentially large for most problems.

While it is a useful illustration of the complexity of motion planning, the early analysis of the piano movers problem does not address domains with nontrivial dynamics, nor does it address the problem of finding the best path (according to some optimization criteria) to a goal set. For example, many real-world problems require a robot to interact with its environment through contact. Contact immediately introduces complex non-holonomic dynamics [114], even in systems that are otherwise slow and well-behaved. Though problems involving contact have received some attention [32, 51, 198], key questions around decidability and complexity of planning with dynamics remain unsolved.

1.2.2 Algorithms with weaker guarantees

The limitations of complete algorithms for motion planning (most notably, the complexity of implementation, the computational cost of planning, and the lack of support for nontrivial dynamics) led most practitioners to abandon complete planning algorithms in favor of weaker guarantees. For example, Brooks and Lozano-Pérez [22] employ a subdivision algorithm that guarantees resolution completeness: their algorithm will find a plan if one exists at a given resolution, and, as that resolution decreases to zero, will always find a plan if one exists (subject to certain technical conditions). Barraquand et al. [11] guarantee probabilistic completeness: as the amount of time available to search increases, the probability that the algorithm finds a plan if one exists approaches one. Guarantees like probabilistic completeness are often good enough in practice and have driven the bulk of motion planning research over the past three decades.

By relaxing the requirement that a planning algorithm prove no solution exists for infeasible instances, we obviate the need for an explicit representation of the reachable workspace. Avoiding such an explicit representation can in turn enable far more compact representations and greatly simplify the planning process. For example, the probabilistic roadmap (PRM) [109, 111] and the rapidly-exploring random tree (RRT) [133, 135] both represent the planning domain as a graph of configurations, whose edges represent feasible paths. In both cases, the structure of the search space is revealed exclusively by checking if there is a path between two configurations (for example, by checking for collisions along the straight-line path). Collision checking is both faster and easier to implement than typical cell decomposition algorithms.
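To make the construction concrete, the sketch below builds a small PRM-style roadmap in Python; the unit-square world, the circular obstacle, and all function names are illustrative assumptions rather than any particular published implementation.

    import math
    import random

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def straight_line_free(a, b, point_is_free, step=0.01):
        # Visibility check: sample evenly spaced points along the segment from a to b.
        n = max(2, int(dist(a, b) / step))
        return all(point_is_free(tuple(x + (k / n) * (y - x) for x, y in zip(a, b)))
                   for k in range(n + 1))

    def build_prm(sample_free, edge_is_free, n_samples=200, radius=0.3):
        # Sample free configurations, then connect nearby pairs whose
        # straight-line path is collision-free.
        nodes = [sample_free() for _ in range(n_samples)]
        edges = {i: [] for i in range(n_samples)}
        for i in range(n_samples):
            for j in range(i + 1, n_samples):
                if dist(nodes[i], nodes[j]) < radius and edge_is_free(nodes[i], nodes[j]):
                    edges[i].append(j)
                    edges[j].append(i)
        return nodes, edges

    # Toy 2D world: a unit square with one circular obstacle.
    def point_is_free(q):
        return dist(q, (0.5, 0.5)) > 0.2

    def sample_free():
        while True:
            q = (random.random(), random.random())
            if point_is_free(q):
                return q

    nodes, edges = build_prm(sample_free,
                             lambda a, b: straight_line_free(a, b, point_is_free))

Planning then reduces to a graph search over the roadmap, which is exactly where the discrete search techniques of section 1.2.4 enter.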

Sampling-based approaches have proven effective in high-dimensional search spaces, and many researchers have developed extensions and improvements. Later work has extended sampling-based methods to optimal [106] and near-optimal [143] planning, as well as to path planning under differential constraints [107, 134]. Furthermore, careful analysis [21, 97, 102] has shown that randomness is not essential to the performance characteristics of sampling-based planning; deterministic sequences of samples can be as or more effective. The essential feature of sampling-based planning is the ability to generate compact discrete representations of the underlying continuous space. Uniform sampling turns out to yield a good discretization in many problems, though numerous authors [7, 20, 98, 99] have shown that drawing samples non-uniformly can have significant impact on performance without impacting theoretical guarantees. For example, deliberately sampling near narrow gaps can dramatically decrease the number of samples needed for a probabilistic roadmap to contain a solution to hard problems.

Of course, the summary in this section is not an exhaustive survey of techniques for motion planning. In many cases, theoretical guarantees take a back seat to empirical performance. For example, local optimization techniques like CHOMP [202], GPMP [16, 54, 151], and TrajOpt [168] have been shown to generate high-quality plans, and to do so very quickly, even if they sometimes fail to generate a plan when one exists.

Approaches using local optimization are particularly suitable for problems that are dynamically complex or that have many degrees of freedom.

1.2.3 Factored discretization

The core technical challenge in applying conventional motion planning algorithms to task and motion planning domains is that the system dynamics for problems like manipulation are non-smooth and non-holonomic [114]. In problems with non-smooth dynamics, two configurations can be arbitrarily close together as measured by Euclidean distance while the shortest feasible path connecting the configurations remains arbitrarily long. Conventional kinodynamic planning techniques rely on the local continuity of the planning problem, and are thus ineffective in more general settings. Alami et al. [3, 4] recognized that it is possible to build a discretization compatible with the dynamics of manipulation problems. Later work [178, 179, 180] showed that this could be done using sampling-based techniques, paving the way for the first probabilistically-complete [84] and asymptotically optimal [197] algorithms. By combining these techniques with kinodynamic planners capable of sampling from low-dimensional contact constraints [15], it has become possible to solve challenging problems involving contact. Modern sampling-based manipulation planning has been demonstrated on robotic systems with many degrees of freedom [112, 173], as well as in multi-robot systems [174, 175].

The key to making these approaches tractable lies in building factored representations. Rather than building a graph of configurations in the joint configuration space of the robot and all movable objects, we can instead build lower-dimensional manipulation graphs and stitch them together into a unified graph of the configuration space. This allows for a compact (that is, polynomially sized) representation of the high-dimensional joint configuration space. In problems with only a few movable objects, directly searching this discretized representation can be effective. However, for domains with many objects, a direct search is prohibitively slow. If there are n objects to move, then simply deciding in which order they should be moved requires considering O(n!) plans. Without heuristic guidance or some other algorithmic optimization, direct search algorithms are paralyzed by the size of the discrete configuration space.
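To make the scale concrete: with only n = 10 movable objects there are already 10! = 3,628,800 candidate orderings, and with n = 15 there are more than 1.3 trillion.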

1.2.4 Discrete planning

The challenges of combinatorial search (the search for a solution in a space that is the product of many simpler spaces) were recognized early in the history of artificial intelligence. Many problems can in principle be solved by searching a sufficiently large finite state space. For example, Fikes and Nilsson [65] described the Stanford Research Institute Problem Solver (STRIPS), which used Boolean predicates to describe the state of the world and heavily influences planning research to this day. In their representation, the state is defined by a collection of N facts, each of which can be either true or false; the state space is therefore of size 2^N.
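As a concrete illustration of this factored state representation, a state can be stored as the set of propositions that currently hold, and an operator as precondition, add, and delete sets; the blocks-world predicates below are our own toy example, not drawn from the thesis.

    def apply_action(state, action):
        # Apply a STRIPS-style operator if its preconditions hold;
        # return the successor state, or None if it is inapplicable.
        pre, add, delete = action
        if not pre <= state:
            return None
        return (state - delete) | add

    # With N propositions there are 2^N possible states.
    state = frozenset({"on(A,B)", "clear(A)", "handempty"})
    unstack_A_B = (
        frozenset({"on(A,B)", "clear(A)", "handempty"}),  # preconditions
        frozenset({"holding(A)", "clear(B)"}),            # add list
        frozenset({"on(A,B)", "clear(A)", "handempty"}),  # delete list
    )
    state = apply_action(state, unstack_A_B)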

Complete search algorithms (those that are guaranteed to find a solution if one exists) have been known since before the digital computer,[1] and optimal algorithms, which find the lowest-cost sequence of decisions, have been known since Dijkstra's seminal work [52]. It turns out that we can construct planning instances for which the shortest plan visits every state exactly once; it follows that in the worst case, any complete algorithm must at least enumerate the full state space, and thus the worst-case complexity of discrete search is linear in the size of the state space. This lower bound is attained by Dijkstra's algorithm, and nearly all complete algorithms are asymptotically equivalent in the worst case.

In practice, however, exponentially-long plans are rare, and often of little use. In addition, we often have some idea of the likely cost of a solution to a planning problem, and by encoding this knowledge in a heuristic estimate of the cost, we can greatly accelerate search. This intuition was formalized by Hart et al. [79], who developed the A* search algorithm and showed that A* and most derived algorithms can be used to find optimal plans if the chosen heuristic is admissible, that is, if the heuristic estimate of the cost to move between states is always a lower bound on the true cost. Admissible domain-agnostic heuristics are known, and in at least some cases they are competitive with the computational performance of inadmissible heuristics [100, 183]. Importantly, while inadmissible heuristics may yield solutions that are of arbitrarily high cost, there are approximate planning algorithms that use admissible heuristics to exchange bounded sub-optimality for dramatically improved computational performance [119, 161]. The power and generality of A* has led to an enormous variety of related algorithms, supporting interleaved execution and planning [121], fast replanning [115, 116, 117, 118, 186, 187], reduced memory consumption [120, 165], and hierarchies of heuristics [2, 63, 145]. All of these algorithms can be used to search a discretization of a continuous domain, allowing us to adapt sampling-based methods to these different settings.
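The sketch below shows the basic A* loop with an admissible heuristic; the grid domain and the Manhattan-distance heuristic are illustrative assumptions, chosen only because Manhattan distance never overestimates the true cost on a 4-connected grid.

    import heapq
    from itertools import count

    def astar(start, goal, successors, h):
        # A* search; if h never overestimates the remaining cost (admissibility),
        # the first plan returned is optimal. States must be hashable.
        tie = count()
        frontier = [(h(start), 0.0, next(tie), start, [])]
        best_g = {start: 0.0}
        while frontier:
            f, g, _, s, plan = heapq.heappop(frontier)
            if s == goal:
                return plan
            if g > best_g.get(s, float("inf")):
                continue  # stale queue entry
            for action, s2, cost in successors(s):
                g2 = g + cost
                if g2 < best_g.get(s2, float("inf")):
                    best_g[s2] = g2
                    heapq.heappush(frontier, (g2 + h(s2), g2, next(tie), s2, plan + [action]))
        return None  # reachable state space exhausted: no plan exists

    # Example: a 10x10 4-connected grid with unit step costs.
    def successors(s):
        x, y = s
        return [((dx, dy), (x + dx, y + dy), 1.0)
                for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]
                if 0 <= x + dx < 10 and 0 <= y + dy < 10]

    goal = (7, 5)
    plan = astar((0, 0), goal, successors,
                 lambda s: abs(s[0] - goal[0]) + abs(s[1] - goal[1]))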

While the heuristics typically used in motion planning leverage a good deal of domain-specific information, decades of research have yielded powerful domain-agnostic heuristics for factored domains. While there is no way around exponential worst-case complexity, these techniques perform quite well in practice. Most modern task planners use the same core architecture; they accept problems defined using a standard language like PDDL [58], use domain-independent heuristic estimates of the length of the shortest plan [39, 90, 94], and exploit those heuristics in some kind of highly-optimized search routine. While there are exceptions (the graph-planning framework introduced by Blum [18] is a noteworthy example), heuristic search remains the state of the art [194, 199].

[1] The earliest record of breadth-first search, for example, was the rejected PhD thesis of Konrad Zuse, submitted in 1945. This thesis also introduced the first high-level programming language.

1.2.5 Combined Task and Motion Planning

The ideas of sections 1.2.3 and 1.2.4 can be combined, and this combination underlies most state-of-the-art approaches to solving complicated planning problems involving factored search spaces. This is the core idea behind aSyMov [25, 26, 75], a generalization of the earlier work of Siméon et al. [179] and a useful example of this class of planner. ASyMov uses the result of a heuristic search [93] through a model derived from a carefully coded task representation to guide the search over metric paths through a densely sampled discretization. This allows for solving problems with significant discrete structure, such as the continuous analogues of classical challenges like the Towers of Hanoi puzzle.

Dozens of authors have investigated related approaches to this combined task and motion planning (TMP) problem, including very recent work. The key unifying feature of this class of algorithm is the presence of both a recognizable task planning domain, typically, though not always, represented using a language like PDDL, in conjunction with a geometric representation. However, apart from this key feature, approaches are quite diverse. In the remainder of this section, we survey some of the key distinctions.

Linking the task and motion layers

Perhaps the most fundamental choice when designing a task and motion planning algorithm is how the relationship between the task layer and the motion layer is encoded. Many, if not most, approaches augment the task planning layer to include support for arbitrary predicates like path existence. This allows the task planner to query a motion planner, which can in turn be treated as a black box. These arbitrary predicates are sometimes called semantic attachments [57] or extended action semantics [69]. The inclusion of these predicates enables a tight integration of geometric constraints into the task planning layer, perhaps leading to earlier backtracking and faster search [55, 56, 76, 91].

An alternative to arbitrary predicates is to encode bidirectional maps between the discrete state space of the task planning layer and the continuous state space of the motion planning layer. Dantam et al. [40, 42] call this approach domain semantics, while Srivastava et al. [184] describe it as a planner-independent interface layer. Plaku and Hager [160] use a unidirectional map, assigning a discrete task state to each configuration; by sampling from the preimage of a given abstract state, they can effectively guide the construction of a discretization. These strategies have the advantage of allowing greater independence between the task planner and the motion planner.

Others have employed more structured links. Garrett et al. [70, 71, 72] have defined conditional generators, which facilitate efficient sampling-based planning in complex domains. Several authors have investigated reachability maps [33, 67, 159], which provide a useful and efficient way for a planner to extract information from the geometric search.

Discretization of the action space

TMP problems include a motion planning component, and thus a defining characteristic of TMP problems is the presence of a continuous space of actions. In order to make use of fast task planners, many approaches reduce the action space to a finite set of actions over which the planner will search. This process of discretization is done in several ways.

Perhaps the most common approach to action discretization is to simply fix a set of samples before planning, either by sampling the parameter space or constructing a grid [17, 55, 56, 57, 59, 108, 129, 130]. This has the advantage of simplicity; once a discretization has been fixed, the problem reduces to deciding whether any sequence of sampled actions achieves the goal. Unfortunately, if the discretization is fixed before planning, there is no way to use information gleaned during planning to improve the discretization.

A closely related approach interleaves sampling and search. For example, Garrett et al. [68, 69, 70, 72] describe efficient incremental algorithms that alternate between search and sampling phases, using powerful heuristics to make search fast and avoiding indiscriminate discretization. Dearden and Burbridge [47] also interleave a backtracking search over an incrementally sampled discretization. When the search fails to find a suitable value for some action parameter, they first attempt to sample a few more abstract states; only if that fails do they backtrack and find a new abstract plan.

Finally, some approaches [67, 141, 158, 184, 185] explicitly avoid constructing a discretization of the action space. Rather than search over actions that correspond to motion plans, these planners search over sequences of parameterized actions (sometimes called plan skeletons), deferring instantiating the parameters of a plan until a reasonable sequence has been generated. Deferring the selection of parameters has at least two advantages. First, if the geometric parameters cannot be instantiated, it is possible to determine why and use that information in the search for a new plan skeleton. Srivastava et al. [184, 185] use this introspection to reduce the time spent backtracking on complicated instances. Second, it is often more efficient to determine that a plan skeleton is infeasible when the entire skeleton is available at once.

The most common way to instantiate the parameters of a plan skeleton is by sampling, but that is far from the only way. Several authors [87, 191, 192, 193] have explored using continuous optimization techniques to jointly optimize over the parameters of multiple actions at once. While local optimization is not guaranteed to find feasible parameters if they exist, empirically it works quite well. Optimization has the key advantage of quickly failing when no parameters exist; early failure can reduce the amount of time the planner must spend exploring bad candidate plans, at the cost of occasionally missing good plans.

Others have explicitly formulated and attempted to solve a continuous constraint problem. For example, Lozano-Pérez [141] extracts a constraint satisfaction problem from a plan skeleton, which can be solved using high-quality off-the-shelf solvers. Erdogan and Stilman [60, 61] and Fernandez et al. [64] both construct relaxed convex optimization problems, which can be solved exactly.

Searching for task plans

The main reason to include a task planning layer in a planning system is to facilitate efficient high-level search. The most common way to execute the high-level search involves a backtracking heuristic search. However, a growing body of work takes advantage of fast solvers for various logical satisfiability formalisms. For example, many authors [23, 29, 43, 153, 200] have explored using satisfiability modulo theories (SMT) solvers like Z3 [45] to implement the search over task sequences, generally relying on external motion planners like RRT [135] or RRT-Connect [125] to determine if a given task is realizable. Others have used answer set programming [88, 128, 137], context-free grammars [41], or non-monotonic causal logic [59] for similar purposes. Alternatively, Toussaint et al. [87, 191, 192, 193] sacrifice the guarantees of complete solvers and take advantage of fast incomplete solvers like Monte Carlo tree search (MCTS).

Other distinctions

One fundamental challenge of deliberative planning is that the robot may suffer from decision paralysis, spending an inordinate amount of time deliberating rather than acting. The "planning in the now" of Kaelbling and Lozano-Pérez [105] avoids decision paralysis by interleaving planning and execution: committing to a particular branch of the search tree early dramatically reduces the amount of search required. Committing early comes at a cost: the robot might commit to a bad plan that only appeared promising. In a well-designed abstraction, however, promising plans rarely turn out too badly, and in the kinds of manipulation systems in common use, bad plans are often reversible.

Most TMP planners do not maintain a single, dense geometric graph; if dense sampling is needed to check if an action is feasible, that check is made and the results discarded (or perhaps cached). However, some approaches [5, 25, 26, 67, 72, 75, 160] do maintain a dense representation internally.

Some systems take advantage of additional user-specified structure in the form of hierarchical task networks [73, 176, 177, 201]. Planning with an HTN is only complete if the supplied hierarchy satisfies strong assumptions; if the user provides an incorrect or ineffective hierarchy, the system can perform arbitrarily poorly. Unlike physical models of the robot, however, there is no objective standard by which to verify a proposed hierarchy.

Specialized problems

Most of the results described so far address prehensile manipulation, in which the interactions between the robot and the objects are limited to the presence or absence of rigid links. Non-prehensile manipulation planning presents its own challenges. Although the problem in its full generality remains open, several authors [1, 12, 36, 53, 95, 113, 156] have studied the special case of push-planning, where the robot can push a (typically circular) object in the plane but cannot pull or rotate the object in place. Multiple authors have shown how to leverage contact with the environment to perform dexterous maneuvers [30, 95, 113]. However, there has been little success at proving completeness, or even probabilistic completeness, for these tasks, and most approaches have not been shown to scale to complex problems involving many objects.

Other special cases of manipulation planning have been studied. One common special case is navigation among movable objects (NAMO), which shares many of the same challenges as prehensile manipulation. However, NAMO problems are less general, which has accelerated algorithmic development. Practical algorithms for NAMO predate effective generic manipulation planning algorithms [31], and probabilistically complete algorithms for NAMO [155, 196] predate probabilistically complete algorithms for the general problem of prehensile manipulation. Moreover, unlike more general TMP problems, a general complete decision algorithm for NAMO is available [150].

Another special case that has attracted interest is rearrangement planning. Krontiris et al. [122, 123, 124] explored more powerful motion primitives than the usual pick and place operators for rearrangement planning, and found dramatic performance improvements. Hauser [82, 83] studied asymptotically optimal algorithms for specialized cost functions related to minimizing the number of objects affected. Rearrangement planning highlights the benefits of bespoke algorithms tailor-made for specific problems; these special classes of planning problem can be solved far more efficiently than more general planning problems.

1.3 Representation, Efficiency, and Verification

In this section, we summarize the review from section 1.2 by identifying three concrete challenges not yet addressed by the literature.

First, there is a need for a representational framework that clarifies the role of discrete structure in robot motion planning. Frameworks for task and motion planning universally include some representation of the discrete elements of the problem domain, in addition to its continuous, geometric structure. Not all discrete elements are the same, however; they can serve at least three distinct purposes. First, problems involving hybrid dynamics, such as manipulation or legged locomotion, involve discrete information as part of the essential structure of the dynamics of a problem domain; in manipulation problems, we care about containment and stacking relations because objects behave differently when they are stacked than when they are not stacked. Second, semantic properties of the physical system often create discrete structure of their own; knowing that a region of space corresponds to a room or a hallway is useful in understanding and achieving objectives. Finally, discrete abstractions can be useful for purely computational reasons, as a tool to structure the search for a plan.

Specifying these various kinds of discrete structure, along with their relationship to the underlying continuous problem, is therefore a crucial step in modeling complex robotic planning problems. Unfortunately, it is not clear in general how semantic properties or abstractions should be encoded, nor how they relate to the underlying continuous representation. Unlike physical properties of the robot or system, there are no experiments we can perform to verify a discrete model is suitable for planning, which makes the combined properties of an intelligent system difficult to assess.

The lack of objective criteria for evaluating a model independently of an associated planning algorithm has led to the evolution of many different mutually incompatible frameworks. Dozens of authors have penned hundreds of papers on the problem of task and motion planning, and each paper seemingly presents its own strategy for linking a discrete representation to a geometric one. Some authors treat discrete abstractions as measures from which we can sample; others assume a bijection between discrete states and metric states; others use yet more exotic functional relationships. These representational issues have had an important practical impact on the robotic planning community, by stymieing the establishment of benchmark problems and the development of shared software frameworks for planning [131].

There is unlikely to be an objectively correct way to specify planning problems, as the representational challenges are not sufficiently well-specified. Still, if we want to design general-purpose intelligent agents with verifiable correctness or completeness properties, we need planning algorithms that operate on a representation flexible enough to model most systems of interest, and we need conditions under which these algorithms can be guaranteed to find acceptable plans. The lack of a clear framework for defining or evaluating discrete representations presents a barrier to the formal analysis of task and motion planning, makes it difficult to compare algorithms from different research groups, and makes it challenging in practice to use TMP tools as part of a larger system.

Second, there is a need for an improved understanding of when and how planning with abstraction can be guaranteed to return correct results. If there is a single conclusion that can be drawn from the literature on task and motion planning, it is this: planning is hard, both in theory and in practice. Because the hardness of planning grows exponentially with model complexity, generating long-term plans by reasoning only with high-fidelity, physically accurate models is likely not possible, even with fast search implementations and algorithms and powerful domain-independent heuristics. For example, imagine a legged robot tasked with fetching objects from around a building. Deciding foot placements is easily done using a high-fidelity model of the robot and neglecting all non-local geometric structure. Deciding where to navigate is largely independent of foot placement. An algorithm that cannot reason over which model features are important for which subgoals is likely to struggle to generate high-quality plans quickly enough to be useful.

Existing approaches express these relationships using hierarchy, often encoded in hierarchical task networks. Unfortunately, most approaches to planning in a hierarchy can perform arbitrarily badly if the hierarchy is imperfect. The sensitivity of planning algorithms to details of an abstraction places an inordinate amount of pressure on the designer of the hierarchy, who lacks any objective criteria to verify that an abstraction is suitable. We argue that in order to address the needs of complex systems, planning algorithms must be able to incorporate models at different levels of resolution. If we can define clear representational semantics, we can enable an explicit trade-off between model fidelity and model complexity. Moreover, the representational semantics of such a hierarchy or abstraction must be unambiguous and empirically verifiable.

Finally, even if we assume a model of appropriate fidelity and complexity, many of the key questions around algorithmic correctness remain open. When we began this line of research, it was unknown whether there existed general-purpose algorithms for task and motion planning that guaranteed even probabilistic completeness, let alone optimality or completeness. The lack of sound algorithms makes the task of adopting TMP algorithms daunting: if the algorithm fails to return a solution, is it because the model was wrong, there was insufficient computation time allotted, or the algorithm itself is flawed? These questions are central to the design of intelligent systems, especially if we are to trust those systems with critical tasks.

1.4 Statement of contributions

In this dissertation, we will resolve these three challenges. Our main contribution is a novel class of sampling-based planning algorithms for the general robotic planning problem. These algorithms have strong asymptotic theoretical guarantees on correctness, completeness, complexity, and optimality, and are capable of exploiting abstraction to dramatically accelerate the search process. In the process of proving these properties, we resolve several open questions around the decidability and complexity of the robotic planning problem.

More formally, the contributions of this thesis are as follows. We introduce a new representation for planning, the continuous constraint contract (C3) representation. The C3 representation is simple enough to be amenable to formal analysis yet expressive enough to model complex systems; it also exposes enough structure to enable efficient planning. We then introduce algorithms for efficient discretization and search using abstraction. We show that it is possible to construct a small (that is, polynomially sized) discretization that is faithful to the underlying continuous problem, in the sense that as the discretization grows finer, metric properties like the lengths of shortest paths and topological properties like path existence all converge to the corresponding properties of the underlying continuous problem. We use our results to design algorithms that are probabilistically complete and asymptotically optimal.

We also introduce the concept of admissible abstractions, which allow us to take advantage of approximate structure in a planning problem without sacrificing optimality or completeness. We provide illustrative examples of admissible abstractions for realistic domains, and we give algorithms for optimal, near-optimal, and anytime planning, including optimal planning with a mix of admissible and inadmissible abstractions.

Finally, we use the C3 formalism to show that a broad class of planning problems in which all constraints are polynomials can be decided using polynomial space. This class includes most common frameworks for task and motion planning and for prehensile manipulation; a consequence of our result is that the decision formulation of these problems is also PSPACE-complete.

The broader aim of this thesis is to establish a theoretical framework for understanding the role of abstraction in planning. Empirical evidence has clearly demonstrated the importance of discrete structure for difficult planning problems, but that structure historically has served two roles, simultaneously encoding the discrete elements of the problem dynamics and offering high-level structure. Our analysis clearly disambiguates between these two roles, paving the way for efficient planners with verifiable performance. This separation represents an important avenue towards addressing the increasing complexity of systems for automated decision-making: by combining empirically-verified models with complex inadmissible abstraction, we can design systems that perform well while still preserving critical safety guarantees.

1.5 Structure of this document

This dissertation is organized as follows. In Chapter 2, we introduce the C3 representation and describe its semantics. We describe the limitations of conventional approaches, and present the C3 representation as an alternative.

Next, in Chapter 3, we give deterministic and randomized approximate algorithms that are both asymptotically optimal and efficient in practice. In section 3.1, we give a simple dense sampling algorithm, which incrementally constructs a graph of the joint configuration space. In section 3.2, we describe an empirical demonstration that our approach is sensible for sufficiently small problems. In section 3.3, we improve on this algorithm with a factored construction, which builds an exponentially smaller discretization yet maintains the same asymptotic properties as the dense algorithm. In section 3.4, we analyze these algorithms and show they are complete.

Then, in Chapter 4, we show how we can extend C3 to plan using abstraction by using multiple models and endowing our discretization with angelic semantics. In section 4.1, we describe angelic semantics and give an admissibility condition under which the semantics are valid. In section 4.2, we give algorithms for exact and approximate planning with abstraction. In section 4.3, we demonstrate the advantage of angelic planning in a toy problem. Finally, in section 4.5, we suggest useful extensions to the angelic planning framework.

Finally, in Chapter 5, we analyze the decidability and complexity of C3 planning and establish relationships between C3 and known classes of planning problems, including path planning and planning for hybrid systems. We first define transition systems, a simplification of C3 with identical representational power but simpler semantics. Then, in section 5.2 we show that a broad class of C3 instances in which all constraints are polynomials is undecidable. In section 5.3 we introduce the notion of uniform stratified accessibility, which captures an intuitive notion of local controllability for problems where only part of the configuration space is controllable from any configuration. We give a polynomial space algorithm to decide plan existence in domains that have uniform stratified accessibility, implying that many common categories of task and motion planning problems are PSPACE-complete. In section 5.4, we conclude our analysis by establishing relationships between several standard classes of motion planning problem and classes of C3. Finally, we briefly discuss the implications of our theoretical results in section 5.5.

We conclude with a summary of our contributions, and observe that our results have illuminated several new open problems (section 6.2). We close with a discussion of our broader aims (section 6.3), and suggest ways in which this line of research could enable the design of complex decision-making systems that are both performant and verifiably correct.

Chapter 2

The Continuous Constraint Contract Representation

In this chapter, we present a formalism for task and motion planning, the continuous constraint contract (C3) formalism. A useful formalism for task and motion planning must address three key challenges. First, the formalism must address the fact that for an intelligent agent that interacts with the physical world, space and time are continuous. Consequently, decisions represent continuous-valued functions, and ultimately a planning algorithm must instantiate these functions. Second, the formalism must be expressive, allowing a practitioner to describe complicated domains compactly. Third, the formalism must expose enough structure to enable the design of efficient planning algorithms.

There are many ways to deal with these challenges. For example, a common way to convert a continuous control system

\dot{x} = f(x, u) \qquad (2.1)

into a discrete-time control system

x_{t+1} = \hat{f}(x_t, u_t) \qquad (2.2)

is through the use of numerical integration methods. A simple scheme might use a fixed step size \delta t and integrate the dynamics using Euler integration:

\hat{f}(x_t, u_t) = x_t + f(x_t, u_t)\,\delta t. \qquad (2.3)

Numerical integration for simulation is well studied and can be quite effective, especially when the dynamics are well-behaved. However, problems with discontinuous or non-smooth dynamics may be difficult to simulate or may require a very fine temporal discretization to be accurate. Unfortunately, using a fine temporal discretization in a planning problem comes at a heavy computational cost. Each time step corresponds to a distinct decision to be made, and so the computational cost of planning in continuous systems grows exponentially with the temporal resolution used for simulation.
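As a concrete illustration of the scheme in equation (2.3), the following sketch integrates a simple control system with a fixed step size. The dynamics model, step size, and control sequence are illustrative placeholders rather than part of the C3 formalism.

# Minimal sketch of the fixed-step Euler discretization in equation (2.3).
# The dynamics f, the step size dt, and the example model are placeholders.
import numpy as np

def euler_step(f, x, u, dt):
    """One step of the discrete-time system x_{t+1} = x_t + f(x_t, u_t) * dt."""
    return x + f(x, u) * dt

def simulate(f, x0, controls, dt):
    """Roll out a sequence of controls from x0; returns the state trajectory."""
    states = [np.asarray(x0, dtype=float)]
    for u in controls:
        states.append(euler_step(f, states[-1], u, dt))
    return states

# Example: a unicycle-like model, x = (px, py, heading), u = (speed, turn rate).
def unicycle(x, u):
    v, w = u
    return np.array([v * np.cos(x[2]), v * np.sin(x[2]), w])

trajectory = simulate(unicycle, [0.0, 0.0, 0.0], [(1.0, 0.1)] * 50, dt=0.05)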

Many motion planning algorithms avoid a fine temporal discretization by reasoning about the reachable set of states rather than about the underlying control space used to reach those states. For example, path planning algorithms often assume that visibility implies reachability: if the straight-line path between two points does not intersect an obstacle, then the robot can move from one point to the next. The usual motivation for the visibility assumption is that in unconstrained systems, any path can be approximated arbitrarily well by a piecewise-linear path, and thus it follows that for path planning on manifolds we need only consider piecewise-linear planning.1

The C3 formalism extends this idea of reasoning about reachable sets by defining the dynamics of a system using constraints between the decisions made and the state of the world at different times. Representing planning problems with constraints addresses all three challenges simultaneously. By constraining the state of the world at different times, we implicitly discretize time without fixing the discretization ahead of time. We can further discretize the space of decisions using sampling, as described in chapter 3. By defining the constraints in semantically meaningful ways, we can describe complex systems compactly. Finally, by using sparse constraints, we can

1 See Karaman and Frazzoli [106], for example, for an explicit instantiation of this argument.

dramatically accelerate the planning process by factoring the state space. In the remainder of this chapter, we give a precise description of the semantics of the C3 representation.

2.1 States and variables

We assume the domain of discourse has a time-varying state s drawn from a compact set S ⊂ R^n. The state vector includes all aspects of the problem domain, both continuous and discrete. We endow parts of the state vector with semantic meaning by defining variables. Each variable v takes on a value s[v] drawn from a particular domain D_v. Formally, the domain can be any arbitrary compact set, and the relationships between states and variable values are defined by an injective map f_v : S → D_v. This map is primarily relevant for analysis, as in practice we work exclusively in the domain of the variable. We will use the shorthand s[v] to denote the value of the given variable in the state s. For example, in a three-dimensional planning problem, we often need to reason about the poses of arbitrary objects. A three-dimensional pose in SE(3) can be smoothly embedded into R^12, subject to the constraint that nine of those numbers constitute a valid rotation matrix. This embedding g_v : R^12 → SE(3) defines the variable.
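A minimal sketch of this kind of pose variable is given below; the row-major layout of the rotation matrix and the numerical tolerance are implementation choices made for illustration, not part of the formalism.

# Sketch of the R^12 embedding of an SE(3) pose described above: the first nine
# numbers are a row-major rotation matrix and the last three are a translation.
import numpy as np

def pose_to_vec(R, t):
    """Embed a rotation matrix R (3x3) and translation t (3,) as a 12-vector."""
    return np.concatenate([np.asarray(R).reshape(9), np.asarray(t, dtype=float)])

def is_valid_pose_vec(x, tol=1e-8):
    """Check the constraint that the first nine entries form a valid rotation."""
    R = x[:9].reshape(3, 3)
    return (np.allclose(R.T @ R, np.eye(3), atol=tol)
            and abs(np.linalg.det(R) - 1.0) < tol)

def vec_to_pose(x):
    """Recover (R, t) from a valid 12-vector; inverse of pose_to_vec."""
    return x[:9].reshape(3, 3), x[9:]

x = pose_to_vec(np.eye(3), [1.0, 2.0, 3.0])
assert is_valid_pose_vec(x)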

Observe that we have defined the state as a vector of real numbers. Of course, discrete information is often essential to modelling problems. Although in practice we will treat discrete and continuous data separately, we stress that this is an optimization. In principle we could model discrete information by defining the domain of some variables to be finite. For example, we could model the state space of a propositional STRIPS instance by associating each proposition p with an element of S, and restricting the domain D_p = {0, 1}. This simplifies theoretical analysis and does not affect the semantics of our representation.

By defining suitable functions, we can represent complicated spaces, including arbitrary manifolds and complex geometry. In our implementation, we support the use of arbitrary functions to define variables, although we can go quite far with discrete data, Euclidean vectors, rigid transforms, and collision detection. In later chapters, we will introduce additional constraints on the kinds of functions that are permissible in order to establish a theoretical foothold for analysis. The only consistent requirement we impose is that the variable domains be compact. Following Bäckström and Nebel [8], we augment the domain D_v of each variable v with the special symbol u to designate that the value for that variable is undefined. A state that contains undefined values is called a partial state. We adopt set notation to relate partial states: if the value of every defined variable in s is equal to its value in s', we write s ⊆ s'.

s ⊆ s'  ⟺  ∀v ∈ V, s[v] = u ∨ s[v] = s'[v]    (2.4)

Furthermore, we use the notation ← to denote assigning a value to a state.
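The sketch below illustrates partial states, the containment relation of equation (2.4), and assignment; the dictionary-based representation and the variable names are assumptions made purely for illustration.

# Minimal sketch of partial states with an undefined value u, the containment
# relation of equation (2.4), and the assignment notation described above.
U = object()  # sentinel for the undefined value u

def contains(s, s_prime):
    """s ⊆ s': every variable defined in s is defined with the same value in s'."""
    return all(v is U or s_prime.get(k, U) == v for k, v in s.items())

def assign(s, var, value):
    """s[var] ← value, returning a new partial state."""
    out = dict(s)
    out[var] = value
    return out

partial = {"pose_box1": (1.0, 2.0, 0.0), "held": U}
full = {"pose_box1": (1.0, 2.0, 0.0), "held": "box1", "pose_box2": (0.0, 0.0, 0.0)}
assert contains(partial, full)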

2.2 Actions

We define the capabilities of a robot using parameterized actions a ∈ A. Much as in standard languages like PDDL, each action is defined in terms of requirements and effects. However, unlike in PDDL, each action has a set of continuous parameters, and the requirements and effects are expressed in terms of these parameters. Actions describe the capabilities of the modeled robotic system in terms of parameterized conditional contracts between a planner and a controller. Actions are contracts in the sense that there is a guarantee that if certain requirements hold, then there is an executable policy that will terminate in finite time, incur only bounded cost, and ensure that certain effect conditions hold. Actions are parameterized in the sense that a single action represents a continuum of possible control policies, and the specific effect conditions depend on the choice of parameters. For example, an action place_object_1 might represent placing a particular object at some specified coordinates. The choice of target pose would be represented by a parameter θ_a ∈ Θ_a ⊂ R^m for some m ∈ N. Much as we represented arbitrary variables using real numbers, here we represent arbitrary parameter spaces using constrained subsets of the reals.

Formally, an action is just a set

Valid(a) ⊆ S × Θ_a    (2.5)

together with a function

f_a : S × Θ_a → S    (2.6)

associating each valid pair of state and parameters with a unique successor state s' = f_a(s, θ_a). The semantics of the action can be interpreted as: if the system is in state s and executes action a with parameters θ_a, for any θ_a such that (s, θ_a) ∈ Valid(a), then after some finite amount of time the system will reach state s' = f_a(s, θ_a). The set of states reachable by an action is called the image of the action, denoted Post(a) = {s' | ∃s, θ_a : f_a(s, θ_a) = s'}. The function g_a(s, θ_a) defines the requirements; an action is feasible from a state s if and only if there exist parameters θ_a such that g_a(s, θ_a) = 0. The set of states for which an action is feasible is called the preimage of the action, denoted Pre(a) = {s | ∃θ_a : g_a(s, θ_a) = 0}. Finally, we define the set of pairs of states that are connected by an action as the connected set Con(a) = {(s, s') | ∃θ_a : f_a(s, θ_a) = s' ∧ g_a(s, θ_a) = 0}. For the purposes of analysis, it is sufficient to define actions in terms of a valid set of state-parameter pairs and a transition function. However, in order to expose enough structure to enable efficient search, and in particular to take advantage of existing research on heuristic search, it is beneficial to define our actions in a more structured way. Accordingly, we define each action as a tuple (R, C, E), where

Requirements R is a collection of variable-value pairs (v, x) that must be contained in the previous state s,

Checks C is a collection of arbitrary Boolean-valued functions g_{a,i} : S × Θ_a → {0, 1}, and

Effects E are pairs (v, f_{a,v}) of variables and functions f_{a,v} : S × Θ_a → D_v that compute a new value for that variable from the previous state and action parameters.

Requirements are a special kind of check that only supports checking equality of a single variable against a fixed value; implementations can leverage this restriction for efficiency. These are used to define action semantics in the natural way. In particular, the valid set Valid(a) of state-parameter pairs contains precisely those pairs in which each requirement is satisfied (that is, s[v] = x for all (v, x) ∈ R) and in which all checks hold (that is, g_{a,i}(s, θ_a) = 0 for all g_{a,i} ∈ C). The transition function constructs a state s' such that s'[v] = f_{a,v}(s, θ_a) for each (v, f_{a,v}) ∈ E, and s'[v] = s[v] for all variables v not in E. Importantly, and unlike many planning formalisms, we do permit actions to delete variables that were previously defined (that is, the function f_{a,v} may return the unknown value u).

In our implementation, we allow arbitrary functions to be used as part of the specification of checks and effects. Allowing arbitrary functions gives expressive capacity similar to that of representations that use semantic attachments or similar concepts.

Designing useful action contracts is challenging, and there are a considerable variety of reasonable approaches. For example, the model for place_object_1 most closely aligned with the bulk of existing task and motion planning literature would include a requirement that a motion plan reaching the target location exist (as judged by a particular motion planner). Conceptually, we could choose a discretization of the parameter space, and perform a backtracking search over plans, taking the usual steps to avoid the cost of excessive backtracking.

However, we could also define a higher-resolution model, in which the action place_object_1 corresponds directly to a primitive command that could be issued to the robot. The requirements for such an action might be as simple as a collision check along the straight-line trajectory to the target location. Modelling the problem using such contracts would have dramatically different performance characteristics; it would involve substantially longer plans and thus greater search depth, but the checks involved in each stage of the plan would be considerably faster.

2.3 Axioms, knowledge, and constraints

The final component of our domain representation is a set of auxiliary state constraints analogous to the axioms commonly used in PDDL planning. Functionally, an axiom is no different from an action, with the exception that it is always applied.

Like an action, we define an axiom using requirements, checks, and effects, each of which may include references to arbitrary user-defined functions. If the requirements and checks for a state hold, then the effects are assumed to hold as well.

Axioms change the semantics of the variables they affect. We require that each variable be assigned either by the effects of one or more actions, or by the effects of one or more axioms, but not both. If a variable has its value assigned by actions, it is called a primary variable, and its value is presumed to remain constant when not otherwise mentioned by the contract describing an action. If a variable has its value assigned by an axiom, it is called a secondary variable, and its value is erased after each action, before axioms are applied.

Axioms simplify the representation of a wide variety of concepts. There is a large body of literature on planning with axioms in the discrete case; for a recent example, see Haslum et al. [81], which also includes numerical state values. We emphasize one important use case for axioms in task and motion planning: representing scene graphs.

A scene graph is a collection of relative poses between distinct frames. Typically, the scene graph will be a tree relative to some fixed origin. Because relative poses form a Lie group, we can use the scene graph to compute the relative pose between any two frames in the graph, even if they are not directly linked. Importantly, a scene graph makes it trivial to describe fixed relative poses. Consider an object held by a gripper; if we describe the pose of an object in absolute coordinates, then an action that moves the gripper would also need to update the pose of the object. If more than one object is being moved-imagine a robot carrying a tray-then each object must be mentioned in the action contract.

If we instead encode the object pose in relative coordinates, things are simpler.

Recall the semantic assumption that primary variable values do not change during an action unless they are specifically updated by the effects of the action. Thus, if there are many objects on a tray, and that tray is moved by the action, the baseline assumption is that the objects will move with the tray.

There are two problems with using relative coordinates, however. First, many important preconditions, such as collision checks against static environment geometry, are most easily expressed in world coordinates, that is, relative to a fixed coordinate frame. Second, as objects are rearranged and stacked, the sensible relative poses to track vary; if the robot places an object on a table, we would want to stop tracking the relative pose of the object relative to the gripper. We can solve both problems using an axiom that allows us to deduce the world frame coordinates from relative poses. In particular, the axiom encodes the fact that if the world frame coordinates of frame i and the relative pose of frame j from frame i are both defined, then we can deduce the world frame coordinates of frame j. To stop tracking a relative pose after an action ends, the action can simply include as an effect that the value of that relative pose should become u.
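A minimal sketch of this axiom is shown below, using planar poses and a hypothetical frame naming scheme purely for illustration.

# Sketch of the scene-graph axiom described above: world-frame poses are
# secondary variables deduced from relative poses. Poses are simplified to 2D
# (x, y, heading) for illustration.
import math

def compose(a, b):
    """World pose of frame j, given world pose a of frame i and relative pose b
    of frame j expressed in frame i."""
    x, y, th = a
    bx, by, bth = b
    return (x + bx * math.cos(th) - by * math.sin(th),
            y + bx * math.sin(th) + by * math.cos(th),
            th + bth)

def deduce_world_poses(world, relative):
    """Repeatedly apply the axiom: if world[i] and relative[(i, j)] are defined,
    then world[j] is defined as their composition."""
    world = dict(world)
    changed = True
    while changed:
        changed = False
        for (i, j), rel in relative.items():
            if i in world and j not in world:
                world[j] = compose(world[i], rel)
                changed = True
    return world

world = deduce_world_poses({"map": (0.0, 0.0, 0.0)},
                           {("map", "tray"): (1.0, 0.5, 0.0),
                            ("tray", "cup"): (0.1, 0.0, 0.0)})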

Of course, C3 axioms are hardly the only way to track and update scene graphs, and there are far more interesting things that can be modeled with axioms. Still, scene graphs serve to illustrate the power of the idea.

2.4 Planning problems

We define a C3 instance I as a 6-tuple (S, A, K, s_0, S_g, c), where S is the set of possible world configurations, A is a collection of actions, K is a collection of axioms, s_0 ∈ S is an initial state, S_g ⊆ S is a set of possible goal states, and c : S × ⋃_{a∈A} Θ_a → R_{≥0} is a cost function mapping each pair of state and parameterized action to the cost of executing the action from that state. A C3 instance contains all the information needed to specify a well-defined planning problem.

A plan p is a sequence of actions (a_1, ..., a_N) together with their parameters. The length of a plan |p| is the number of actions it contains. The trace of a plan Trace(p) is the sequence of states the system reaches when executed from the initial state s_0. A feasible plan is a plan p = (a_1, ..., a_N) whose trace Trace(p) = (s_0, ..., s_N) satisfies

• s_0 is the initial state for the instance,

• ∀i ∈ [1, N], s_i = f_{a_i}(s_{i-1}, θ_{a_i}),

• ∀i ∈ [1, N], g_{a_i}(s_{i-1}, θ_{a_i}) = 0, and

• s_N ∈ S_g.

An optimal plan is a feasible plan whose cost c(p) = Σ_i c(s_{i-1}, a_i) is no greater than the cost of any other feasible plan. An ε-near optimal solution is a feasible plan whose cost is no more than 1 + ε times the cost of an optimal plan. We can then formally define two important decision problems related to planning.

First, the problem of plan existence is to decide, given a C3 instance I, whether a feasible plan exists. Second, the decision form of the optimization problem asks whether, given a C3 instance I, a plan of cost less than c exists. Note that if we can solve the optimization problem, then we can compute the cost of the optimal plan arbitrarily accurately by repeatedly solving the optimization problem for a sequence of cost thresholds chosen by (for example) binary search.
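The sketch below illustrates this reduction, assuming a caller-supplied decision oracle and an initial upper bound on the plan cost.

# Sketch of approximating the optimal cost with a decision oracle for
# "does a plan of cost less than c exist?" via binary search.
def approximate_optimal_cost(plan_of_cost_below_exists, upper_bound, tol=1e-3):
    lo, hi = 0.0, upper_bound
    if not plan_of_cost_below_exists(hi):
        return None  # no feasible plan below the initial bound
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if plan_of_cost_below_exists(mid):
            hi = mid  # a plan cheaper than mid exists; tighten the upper bound
        else:
            lo = mid  # no plan cheaper than mid; raise the lower bound
    return hi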

2.5 Conclusions

In this chapter, we described our framework for modelling problems. Our representation is built around three core ideas. First, the distinction between a task-level representation and a motion-level representation is artificial, and can be treated as an implementation detail. Second, many important physical properties can be modeled as state constraints, rather than encoded directly into the actions. Third, we define actions with parameterized contracts defined by arbitrary functions, which allows us to place actions in direct correspondence with actual control policies executable by the robot. Contracts enable a lower-level controller to subsume much of the complexity of the underlying continuous system, and provide an unambiguous interface between the deliberative planner and the controller.

Our framework should be understood not as an innovation in and of itself, but as a coherent framework through which to view task and motion planning. Every algorithm described in this thesis either operates on a C3 instance or on something that can be directly derived from a C3 instance, and as we will argue in later chapters, every task and motion planning problem can be modeled as a C3 instance. Moreover, much, if not all, of the existing task and motion planning literature can be captured within our framework.

Many of the ideas in this chapter have been explored before; the idea of modelling controllers as contracts dates back at least to the funnel metaphor introduced by Mason [146], and perhaps before. If there is one key insight we would highlight here, it is the important role that state constraints play in planning for continuous domains. We construct a single, unified representation that contains both discrete variables and continuous variables; using state constraints, we can define the relationships between the continuous and discrete variables precisely. Our representation allows us to unify a large number of interrelated ideas, including semantic attachments [57], extended actions [69], and domain semantics [40].

State constraints allow us to do more than link a continuous and a discrete representation, however. Many common-sense physical phenomena are quite difficult to model using explicit differential equations. For example, imagine a robot using a tray. Any object sitting atop that tray will move whenever the tray moves. If the tray is held level, the relative pose of the objects to the tray will remain fixed (under reasonable assumptions about friction and geometry). Modelling this behavior using explicit constraints on the pose of the object would require each operator to constrain the pose of each object that may be on the tray, greatly increasing the complexity of the resulting constraint graph. Using state constraints, we can derive the absolute pose of an object from a fixed relative pose defined when the object is placed on the tray.

Finally, state constraints can also be used to define abstraction. Suppose we have two different continuous models; for example, in a manipulation problem, it may sometimes be beneficial to plan in the space of all possible joint configurations, and

at other times it may be helpful to model the robot as a floating hand. Using state constraints, we can freely define complex relationships between these two models. As we will show in chapter 4, planning with abstraction can speed up search considerably.

Chapter 3

Randomized Algorithms for Planning

In this chapter, we consider a class of randomized algorithms for solving problems modeled using the C3 formalism introduced in chapter 2. Later, in chapter 5, we will discuss exact combinatorial algorithms for determining feasibility in C3 instances. However, conventional wisdom is that combinatorial motion planning is "extremely difficult to use in practice" and "mainly of theoretical interest" [110]. The difficulty described stems from the need to construct an explicit decomposition of the configuration space. When the configuration space is high-dimensional, any such discretization will be exponentially large.1 Moreover, these exact decompositions are quite complex to construct.

The challenge of constructing exact decompositions is often addressed in the general motion planning literature using sampling-based methods. By considering only a finite collection of actions, the implementation and semantics of actions are considerably simplified. Importantly, if we are careful in how we select these actions, we can show that if there exists a plan that reaches a state s, then with high probability a randomly selected set of actions can be composed into a plan to reach s [98, 135]. Further analysis showed this property could be attained even in the presence of dynamic constraints [134]. Karaman and Frazzoli [106] showed that a similar property holds for optimality: with high probability, not only does a plan exist but it is nearly

1 This is an instance of the so-called curse of dimensionality; some authors [164] have noted it is an open question whether this curse is related to other well-known curses [28].

optimal.

The underlying reason for these good properties is that in the limit of many samples, the set of sampled actions densely fill the space of possible actions. In most problems of interest, no particular action in a plan needs to be chosen exactly to ensure feasibility; if a plan exists at all, there is some neighborhood around the parameters of each action in that plan for which the plan is also valid. Given these observations, it is unsurprising that random sampling works so well.

In this chapter, we extend these results to the case of C3 planning, and in particular to the case where the constraints require the planner to explore low-dimensional manifolds, also known as "non-expansive spaces" [85]. We first define an asymptotically optimal algorithm that operates directly on differentially-constrained systems, which highlights the challenges involved in motion planning without access to the structure of the planning problem. We then show how the C3 representation both greatly simplifies the analysis and improves the efficiency of asymptotically optimal planning. Our analysis is largely based on our previously published work [197], and does not explicitly reference the C3 formalism.

3.1 Asymptotic Optimality under Piecewise-Analytic Differential Constraints

A differentially-constrained planning domain is given by a tuple (f, X), where X is an N-dimensional Riemannian manifold with tangent bundle TX defining the configuration space, and f : X × TX → R^k is a set of k constraints on the allowable motions, with k < N. We assume this manifold is specified by an embedding into a higher-dimensional Euclidean space, such that it can be represented as the zero level set of some function g : X → R^M, N < M. Then a continuous function p : [0, T] → X is a feasible path if f(p(t), ṗ(t)) = 0 for all t ∈ [0, T]. We denote the set of feasible paths in a given domain by Π_f. Note that this formulation can model kinematic constraints, such as collisions between objects or joint limits, as well as dynamic constraints. Although the systems we consider are non-holonomic, this analysis does not address kinodynamic constraints or systems with drift.

We define a planning problem as a tuple (x_0, X_G, c), where x_0 ∈ X is an initial configuration, X_G is a set of goal configurations, and c : X → R_+ is a piecewise Lipschitz continuous cost function. We restrict our attention to problems where the cost of a path is independent of the time taken to traverse the path; accordingly, the cost of a plan is the line integral of the cost function over the path.

Definition 1. A solution to a planning problem (x_0, X_G, c) in a domain (f, X) is a continuously differentiable path p* : [0, T] → X, where

p* = arg min_{p ∈ Π_f} ∫_0^T c(p(t)) ||ṗ(t)|| dt    (3.1)

s.t.  f(p(τ), ṗ(τ)) = 0  ∀τ ∈ [0, T]    (feasibility)

      p(0) = x_0,  p(T) ∈ cl(X_G).

We can solve problems of this form under a variety of conditions, by constructing and searching a random geometric graph. A random geometric graph G = (V, E) is a graph whose vertices are a randomly chosen finite set of configurations V ⊂ X, and whose edges are paths between nearby configurations, with adjacency determined using simple geometric rules.

As a motivating example, consider a robot navigating among movable obstacles (figure 3.1). A holonomic robot is tasked with moving K boxes. When in contact with a box, the robot can rigidly grasp the box, so that the robot-box pair behave as a single rigid body as long as the box is grasped. Any box not grasped by the robot does not move. The robot can release a grasped box at any time. This model avoids much of the complexity of real contact dynamics; there is no consideration of momentum, or even of form or force closure. Yet already, the dynamics are non-analytic and beyond the scope of traditional motion planning algorithms.

Figure 3.1: (a) In this block pushing problem, the triangular robot seeks to move both the red blocks into the shaded room. (b) A locally-optimal decision early on in the plan can have dramatic consequences later. Because it placed the first block just over the threshold of the door to the target room, the robot cannot move the second block without grasping the first block again.

We can write the differential constraints for this problem explicitly. Define the indicator function f_contact : SE(2) × SE(2) → {0, 1} such that for any object o, f_contact(x_r, x_o) = 0 if the robot can grasp the object when the robot pose is x_r and the object pose is x_o, and f_contact(x_r, x_o) = 1 otherwise. Then the constraints on permissible motions can be expressed as

f_contact(x_r, x_o) ẋ_o = 0  ∀o.    (3.2)

Because the support of the function f_contact(x_r, x_o) is compact, the function is not analytic. Note that although f_contact is not smooth, a similar result holds for smooth constraints, provided they have compact support. Non-analytic constraints cause motion planning algorithms to fail for a simple reason: arbitrary pairs of configurations can be quite close in geodesic distance but require very long trajectories to connect. Consider two world configurations where the position of each object is perturbed slightly from one configuration to the other; to move the world from the first configuration to the second, the robot must travel to each object in turn. Typical PRM* implementations use either straight-line connections or two-point boundary value solvers as local planners; in order to apply an algorithm like PRM* to this domain, we would need a more powerful local planner capable of computing the very long path between the original and perturbed configurations. Similarly, the rapid exploration of algorithms like the RRT* is dependent on a steer function that finds a plan that moves toward an arbitrary configuration; in

manipulation planning problems, simple implementations of the steer function may not bring the world configuration closer to a perturbed configuration, leading to slow exploration. Note this does not constitute a proof of incompleteness of RRT* or PRM* in problems with non-analytic constraints; little is known about the completeness or optimality of sampling-based motion planning in settings where the geodesic distance does not accurately capture the complexity of moving between configurations.

Although the conditional constraints that are a fundamental characteristic of manipulation planning domains cannot be represented as analytic differential constraints, they can be represented using the more general class of piecewise-analytic constraints.

Constraints are piecewise analytic if there is a semianalytic stratification Σ such that the restriction of f to each σ ∈ Σ is an analytic function. Because the constraints are analytic on each stratum σ, there is a collection F_σ of disjoint analytic submanifolds of the same dimensionality [148] such that for each L ∈ F_σ, any two states x, x' in L are connected by a path p ⊂ L.

In our block-pushing problem, a block moves only if grasped by the robot, and thus it is impossible to move between arbitrary configurations on a stratum without grasping and releasing blocks, that is, without traversing multiple strata. Consider the stratum defined by which block the robot has grasped; the reachable configurations within this stratum are defined by the locations of all the other blocks. This submanifold of reachable configurations is called a leaf, and the collection of leaves is a foliation. In the block-moving problem with k blocks, there are k + 1 strata, one for each graspable object and one for the case where no object is grasped. The leaves of the foliation correspond to the poses of every non-grasped object, which are held fixed.

This structure, with infinitely many leaves on each of a finite collection of manifolds, is the hallmark of piecewise-analytic systems. We can build a finite approximation of this structure using sampling. To extend the asymptotic optimality of PRM* to problems with piecewise-analytic constraints, we must ensure that as the graph size n tends to infinity, the number of leaves on the connected component containing x_0 tends to infinity and that the number of samples on each of those leaves tends to infinity. Building on the Random-MMP algorithm described by Hauser and Ng-Thow-Hing [86], we provide a graph construction and search algorithm that incrementally builds the graph such that as the sample count n increases, the graph contains an increasing number of samples from an increasing number of leaves yet remains connected with probability one as n → ∞. Our algorithm for doing this, the stratified random geometric graph, is described in algorithm 1.

Algorithm 1 Stratified Random Geometric Graph construction (SRGG)

1:  function SEARCH(initial state x_0, goal set X_G, sample count n, parameters ν ∈ (0, 1), η > 0)
2:      V ← {x_0}                                   ▷ vertices
3:      Q ← {(x_0, 0)}                              ▷ queue
4:      C ← {(x_0, 0)}                              ▷ cost to reach
5:      while card(Q) > 0 do
6:          (x, c) ← arg min_{(x,c) ∈ Q} c
7:          Q ← Q \ {(x, c)}
8:          if x ∈ X_G then
9:              return c                            ▷ success
10:         N ← {}
11:         for s ∈ strata(x) do
12:             L ← leaf(x, s)
13:             k ← dim L
14:             if L ∩ V = ∅ then
15:                 V ← V ∪ sample(L, νn)
16:                 V ← V ∪ sample_boundary(L, (1 − ν)n)
17:             N ← N ∪ {y ∈ L ∩ V | d(x, y) ≤ (1 + η) r_L(n)}
18:         for y ∈ N do
19:             c' ← local_plan(x, y)
20:             c_0 ← inf{c'' | (y, c'') ∈ C}
21:             if c + c' < c_0 then add (y, c + c') to Q and to C    ▷ update cost to reach

Note that as with most sampling-based motion planning algorithms, the SRGG2 construction does not require an explicit geometric representation of the stratification or the foliations in order to plan; we require only the ability to sample from them. This may be non-trivial, as standard rejection-sampling approaches will fail if the strata or leaves have dimension less than that of the configuration space. However, for many

2 The authors recommend pronouncing this pile of consonants like the English word "surge"

problems of interest, it is straightforward to write subroutines that uniformly sample configurations from a given leaf. These subroutines are also a prerequisite for many other manipulation planning algorithms.

Concretely, we assume that the strata and leaves are exogenously given in a form that permits sampling. That is, we assume that given a configuration x, there exists a subroutine strata(x) that returns a list of identifiers of the strata that contain x, a subroutine sample(x) that generates a configuration chosen uniformly at random from leaves containing x, and a subroutine sample_boundary(x) that generates a configuration uniformly at random from the boundary of one of the leaves containing x, such that the probability of the generated sample belonging to any boundary manifold is greater than zero. In addition to the subroutines sample and sample_boundary, we assume we have a local planner available, which must satisfy several regularity conditions described in section 3.4.
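The sketch below illustrates one possible form this interface could take for the block-pushing example; the configuration layout, the grasp threshold, and the specific sampling choices are illustrative assumptions, not the interface used in our implementation.

# Sketch of the exogenously supplied subroutines assumed above, specialized to
# the block-pushing example.
import math, random

GRASP_RANGE = 0.5  # illustrative grasp-contact threshold

def can_grasp(x, name):
    rx, ry, _ = x["robot"]
    bx, by, _ = x["boxes"][name]
    return math.hypot(rx - bx, ry - by) <= GRASP_RANGE

def strata(x):
    """Identifiers of the strata containing x: 'free' plus one per graspable box."""
    return ["free"] + ["grasp_" + n for n in x["boxes"] if can_grasp(x, n)]

def random_pose(rng=random):
    return (rng.uniform(0, 10), rng.uniform(0, 10), rng.uniform(-math.pi, math.pi))

def sample_leaf(x, stratum, rng=random):
    """Sample from the leaf through x in the given stratum: ungrasped boxes keep
    their poses; the robot (and any grasped box) is resampled."""
    y = {"robot": random_pose(rng), "boxes": dict(x["boxes"])}
    if stratum.startswith("grasp_"):
        name = stratum[len("grasp_"):]
        bx, by, bth = random_pose(rng)
        y["boxes"][name] = (bx, by, bth)
        y["robot"] = (bx + 0.5 * GRASP_RANGE, by, bth)  # keep the grasp contact
    return y

def sample_boundary(x, stratum, rng=random):
    """Sample a configuration on the boundary of the leaf: the robot is placed in
    grasping range of some box, so an adjacent stratum is also accessible."""
    y = sample_leaf(x, "free", rng)
    name = rng.choice(sorted(y["boxes"]))
    bx, by, bth = y["boxes"][name]
    y["robot"] = (bx + 0.5 * GRASP_RANGE, by, bth)
    return y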

SRGG follows the same basic procedure as PRM*: we construct an implicit random geometric graph by choosing a set of configurations (the vertex set) and specifying a subroutine to generate the neighboring configurations for a given vertex. However, in order to ensure the resulting graph is asymptotically connected, we interleave sampling and search such that samples are always generated from leaves containing at least one vertex in the set. At each iteration of the search, immediately before expanding a vertex v, we check whether each leaf containing v already contains a sample (lines 11-17); for any leaf that does not, we generate n samples from that leaf and its boundary.

We then collect the neighbors or successors for v by finding all nearby vertices on an accessible leaf (line 17). Two vertices x and x' are connected by an edge in our graph if the geodesic distance on some leaf including x and x' is less than a critical threshold value depending on n. For the box pushing problem, if v is a grasping configuration, this includes nearby poses where the robot has moved the grasped box, as well as nearby poses where the robot has released the box and left it behind to move elsewhere. If v is a non-grasping configuration, the accessible leaves include both nearby grasps and movement without grasping an object. In practice this is implemented using fast spatial indices like k-d trees.

Finally, we expand the vertex x (lines 18-21) using Dijkstra's algorithm. We use the local planner to determine the cost of an edge; if the local planner cannot find a feasible path, perhaps due to the presence of an obstacle, the edge is assigned a cost of ∞. If the cost of the path to a neighboring vertex y through x is less than the best cost previously recorded for y, we add y to the search queue and update its cost to reach.

Together, the vertex set and the neighbor function specify an implicit random geometric graph. An eager PRM* implementation would explicitly evaluate the cost of every edge; instead, we lazily evaluate only those edges that may be part of an optimal path. Only when a vertex is expanded do we actually invoke the local planner to determine whether it can generate a collision-free feasible trajectory to any of the neighbors of the expanded vertex; if it can, the neighboring vertex is added to a priority queue with priority equal to the minimal cost to reach that vertex. Note that this idea of lazily searching a random geometric graph is not novel; the core idea was described by Bohlin and Kavraki [19], and recent work from Janson et al. [103], among others, suggests this approach can be significantly more efficient than RRT* or searching an explicit graph constructed with the PRM* algorithm. The innovation in our approach is the way in which we construct the graph, which ensures that the graphs constructed on each leaf are constructed in a way that ensures asymptotic optimality.

In addition to a problem specification (x_0, X_G, c), we take as input an integer parameter n ∈ N, a parameter η > 0, and a parameter ν ∈ (0, 1). As the number n of samples on each leaf increases, the size of the graph increases, which increases the computational resources required to search for a trajectory but also improves the quality of the path returned. Increasing η increases the number of edges in the graph by inflating the radius defining whether two vertices are connected. The parameter ν represents a trade-off between exploring the interior of each leaf and exploring the relations between different leaves.

The SRGG algorithm is asymptotically optimal for any ν, η if the set of neighbors of a given vertex includes all configurations within a distance r_L(n), for each leaf L containing the configuration.

Theorem 1 (Optimality of SRGG). Given a planning problem (x_0, X_G, c) in a domain (X, f), let c_n be the cost of the shortest path between x_0 and X_G on a stratified random geometric graph G_n with n vertices, built using a locally complete local planner. Then

P({lim sup_{n→∞} c_n = c*}) = 1.

Proof. See section 3.4. □

The function r_L(n) is determined by ν, η, the dimensionality d_L, and the Lebesgue measure vol(L) of the leaf, where the measure is induced by the volume form of the manifold L. As with other sampling-based motion planners, the Lebesgue measure of the leaf can be approximated using rejection sampling.

r_L(n) = (1 + η) γ_L (log n / n)^{1/d_L}    (3.3)

γ_L = 4 ((1 + 1/d_L) vol(L) / ζ_{d_L})^{1/d_L},  where ζ_{d_L} is the volume of the unit d_L-ball    (3.4)

Note this connectivity radius is nearly identical to that presented by Karaman and Frazzoli; the only change is the factor of 4 in γ_L, which is needed to ensure optimality when the path obtained is on a different leaf from the optimal path.
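The sketch below computes the connection radius of equations (3.3) and (3.4) as written above; the leaf volume and dimension are assumed to be supplied by the caller, for example from rejection-sampling estimates.

# Sketch of the per-leaf connection radius in equations (3.3) and (3.4).
import math

def unit_ball_volume(d):
    """Lebesgue measure of the unit d-ball."""
    return math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)

def gamma_L(d_L, vol_L):
    """The constant in equation (3.4), including the extra factor of 4."""
    return 4.0 * ((1.0 + 1.0 / d_L) * vol_L / unit_ball_volume(d_L)) ** (1.0 / d_L)

def connection_radius(n, d_L, vol_L, eta):
    """r_L(n) = (1 + eta) * gamma_L * (log n / n)^(1/d_L), equation (3.3)."""
    return (1.0 + eta) * gamma_L(d_L, vol_L) * (math.log(n) / n) ** (1.0 / d_L)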

SRGG is different from Random-MMP in two important ways. First, SRGG obtains provably asymptotically optimal paths across each leaf by constructing an optimal random geometric graph on each leaf. Random-MMP instead adds a single path across a leaf to a configuration on another stratum; finding this path is sufficient to guarantee completeness but not optimality. Second, by choosing a fixed fraction of samples from each leaf to be from the intersection of the leaf and the adjacent strata, SRGG ensures that enough leaves are considered to guarantee optimality.

In addition to the piecewise-analyticity of the constraints, we require an additional technical condition on the local planner π used to connect nearby states. First, there must exist a radius r_0 > 0 such that the planner will return a feasible path if invoked to connect two configurations that lie inside an open geodesic ball of free space with radius less than r_0. Second, for any ε > 0 there must exist r_ε > 0 such that for all x, x' ∈ M with d_M(x, x') < r_ε, L_M(π(x, x')) ≤ (1 + ε) d_M(x, x'). If a local planner has these two properties, we say it is locally complete. Note that a local planner that connects configurations with geodesic curves is locally complete.
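A simple local planner with these two properties is sketched below under the assumption of a Euclidean configuration space, so that straight lines are geodesics; the collision predicate and interpolation resolution are supplied by the domain.

# Sketch of a straight-line local planner: it connects configurations with
# geodesics of the Euclidean metric and reports failure on collision.
import numpy as np

def straight_line_plan(x, y, collision_free, resolution=0.01):
    """Return the straight-line path from x to y as a list of waypoints, or
    None if any interpolated waypoint is in collision."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    steps = max(2, int(np.linalg.norm(y - x) / resolution) + 1)
    path = [x + t * (y - x) for t in np.linspace(0.0, 1.0, steps)]
    if all(collision_free(q) for q in path):
        return path
    return None

def path_length(path):
    return sum(np.linalg.norm(b - a) for a, b in zip(path, path[1:]))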

3.2 Demonstration

We implemented SRGG for the simplified block-pushing problem (figure 3.2a). The goal is to move the block labeled 'box1' into the region shaded red, past a moveable obstacle. The planner must either decide to go around or must choose where to place the moveable object to get it out of the way. As an illustrative comparison, we considered a toy task and motion planning approach (labeled TAMP), which invokes a motion planner as a subroutine to determine if a high-level plan is feasible.

As expected, we find that while TAMP can often quickly find a solution, more computational time does not allow that solution to be improved. In contrast, SRGG continues to perform better as the available computational time increases. Note that TAMP is suboptimal because it can only consider a finite set of goal locations; this set does not grow as n increases. By injecting domain knowledge in the form of a better task hierarchy, it is likely the TAMP planner could find plans as good as SRGG; SRGG finds these plans without such domain knowledge.

The quantitative comparison in figure 3.2c highlights the main deficiency of SRGG: it is computationally demanding. As the parameter n increased above 1000, the implementation exhausted available memory and the algorithm failed due to space constraints. Improving the computational efficiency of SRGG is an important avenue for future work. Augmenting our algorithm with intelligent heuristics or nonuniform sampling strategies derived from domain knowledge could greatly increase computational efficiency.


Figure 3.2: Plans obtained with SRGG and an independent task and motion planner for a simple block pushing problem. The goal is to move the block labeled 'box1' to the region shaded in red, but the most direct path is blocked by another box. (a) The four-step plan obtained by SRGG. (b) A simpler, but more expensive, plan returned by a sequential task and motion planning algorithm. (c) Quantitative comparison of the cost of the plan returned and the computational time used for various graph sizes with both methods.

3.3 Compact discretizations with factored graphs

We now return to the C3 formalism, and show how the added structure can be used to improve the discretization. Let D be a C3 domain, with action subgraphs G_a = (V_a, F_a). A discretization of D is a finite collection of parameterized actions Â, where each â ∈ Â is a valid set of parameters θ_a for some action a. The size of the discretization is just the cardinality card(Â) of the set of discrete actions.

3.3.1 Discretizing C3

There are two key challenges to overcome in constructing a discretization. First, in the presence of dimensionality-reducing constraints, uniformly sampling from the space of feasible action parameters will yield many actions that cannot be reached from the starting configuration. Second, in C3 planning, the state before and after an action is executed can be arbitrarily far apart, making local planning difficult. We claim that algorithm 2 constructs a discretization that is asymptotically optimal and probabilistically complete. Before we prove this claim, we present and discuss the algorithm.

Algorithm 2 Sampling from a C3 domain

1:  function BUILDDISCRETIZATION(C3 domain D, count n ∈ N)
2:      Â ← {}
3:      D_v ← {} for each variable v
4:      for i ∈ [1, n] do
5:          Pick an action a ∈ A
6:          s_pre, θ_a, s_post ← SAMPLEACTION({D_v | v ∈ V}, a)
7:          Â ← Â ∪ {θ_a}
8:          for v ∈ V_post do
9:              D_v ← D_v ∪ {s_post}
10:     return Â, {D_v | v ∈ V}

11: function SAMPLEACTION(existing samples D_v for each variable, action a)
12:     s_pre ← ∅
13:     conflict ← true
14:     while conflict is true do
15:         conflict ← false
16:         for v ∈ V_pre do
17:             if s_pre[v] = u then
18:                 Choose a random sample s ∈ D_v
19:                 if s_pre is compatible with s then
20:                     s_pre ← s_pre ⊕ s
21:                 else
22:                     conflict ← true
23:                     break
24:     s_post, θ_a ← SAMPLE(a, s_pre)
25:     return s_pre, θ_a, s_post

Recall that samples from the subgraph include both parameter values as well as the values of variables before and after the action. For example, we could parameterize a stacking action in terms of the final pose of the upper object in global coordinates; however, for the robot controller to ensure feasibility, we would need to constrain that final pose to be atop the bottom object given its pose before the action takes place.

These previous state values may be coupled to the action parameters.

Algorithm 2 repeatedly chooses an action a, and generates a sample from its action subgraph. It generates this sample conditioned on known reachable values for the prior state. This addresses the first key challenge, and ensures that every action generated will be reachable from some reachable state. We select compatible sets of prior state values by rejection sampling, ensuring that asymptotically every sample will be chosen infinitely often.

Note that the coupling of the final robot pose to the initial object pose in the 'jump' example is in a sense an artifact of the parameterization we chose. Had we instead chosen to parameterize the action with the relative pose of the robot to the object, there would be no such coupling to the object pose. This situation is not uncommon; often, a clever representation of a task allows us to exploit symmetry in the controller, creating smaller and sparser graphs. However, this is not always the case. Consider a block-stacking task, where a robot aims to make a pyramidal arrangement like the one in figure 3.3c. This requires the block labeled 'C' be atop both the block labeled 'A' and the block labeled 'B'. The natural parameterization of the action links the pose of object 'C' to both blocks; by representing the pose of 'C' relative to one of the other blocks, we can eliminate one of these constraints, but not both. Algorithm 2 is necessary to discretize general problems representable in our formalism.

We observe first that the requirement of stratified accessibility immediately trivializes the problem of asymptotic optimality.

Theorem 2. Construct a discretization by choosing n points uniformly at random from the image Post(a) of each action a, for a total of card(A)n samples. Let c* be the cost of the best plan solving a given instance, and let ĉ be the cost of the best feasible plan that can be formed by sequencing sampled actions. Then

Pr(ĉ > (1 + ε)c*) = O(2^{-nε^d}).    (3.5)

Corollary 3. The sample count n_{ε,δ} required to find a plan of cost at most (1 + ε)c* with probability at least 1 − δ is

n_{ε,δ} = O(ε^{-d} log(1/δ)).    (3.6)

(a) Action subgraph for jump. (b) Factored subgraph for jump. (c) No factorization possible.

Figure 3.3: Algorithm 2 is necessary if the parameters are coupled to the previous state by constraints (3.3a). Sometimes, by re-parameterizing the action we can make the parameters independent from the previous state (3.3b); the fewer connections to the previous state we have, the fewer samples are needed to ensure good performance. There exist constraint graphs that cannot be rewritten to avoid coupling, however (3.3c).

Proof. For any r, there is a cover B_r of the space of states by O(r^{-d}) balls of radius r. If a δ-accessible stratified domain is solvable, it contains a plan of length at most N* = |Σ| |A| ⌈(1/δ)^d⌉. If r < δ and, for each action, every ball B ∈ B_r whose intersection with Post(a) is nonempty contains a sample, then the discretization will contain a plan. If the cost is Lipschitz-continuous with constant C, then the gap between the cost of that plan and the cost of the best plan is at most Cr per action. Thus for r < δ, a sample in each ball means that the total cost is at most 2CN*r greater than the optimal cost. Put differently, there is a constant C_0 such that if

r < min(C_0 ε, δ)    (3.7)

then the cost of the best discretized plan is at most (1 + ε) times the cost of the best plan.

Let V_a be the volume of Post(a) and d_a be the dimension of Post(a). If samples are generated uniformly at random from Post(a), then the probability that a single sample lands in any given ball of radius r is ζ_{d_a} r^{d_a} / V_a, and the probability of the event Ē_n that some ball contains no sample is

Pr(Ē_n) ≤ Σ_{B ∈ B_r} (1 − vol(B)/vol(Post(a)))^n ≤ card(B_r) exp(−n vol(B)/vol(Post(a))) ≤ C_1 exp(−C_2 n r^{d_a})    (3.8)

where C_1 and C_2 are constants independent of r and n. Consequently, the probability that the discretization does not contain a plan of cost at most (1 + ε) times the cost of the best plan is at most C_1 exp(−C_0^d C_2 n ε^d). □

This same proof can immediately be used to construct a deterministic sampling algorithm; if the collection of samples for Post(a) has dispersion r, then the discretization contains a plan of cost at most (1 + O(r))c*. There exist low-dispersion sequences, like the Halton [77] or Sobol [182] sequences, with dispersion asymptotically O(n^{-1/d}), as opposed to the expected O((log n / n)^{1/d}) of uniformly random point sequences.
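The sketch below generates such a low-dispersion (Halton) sequence in the unit cube, which could then be projected onto Post(a) as in the theorem that follows; the choice of prime bases is the usual one.

# Sketch of a Halton low-dispersion sequence in the unit cube.
def halton(index, base):
    """The index-th element (1-based) of the van der Corput sequence in `base`."""
    result, f, i = 0.0, 1.0, index
    while i > 0:
        f /= base
        result += f * (i % base)
        i //= base
    return result

def halton_point(index, primes=(2, 3, 5, 7, 11, 13)):
    """A point of the len(primes)-dimensional Halton sequence in the unit cube."""
    return tuple(halton(index, p) for p in primes)

samples = [halton_point(i, primes=(2, 3)) for i in range(1, 101)]  # 2-D points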

Theorem 4. Construct a discretization by projecting n points from a low-dispersion sequence onto the image Post(a) of each action a, for a total of card(A)n samples. Let c* be the cost of the best plan solving a given instance, and let ĉ be the cost of the best feasible plan that can be formed by sequencing sampled actions. Then

ĉ ≤ (1 + O(n^{-1/d}))c*.    (3.9)

Proof. A low-dispersion sequence has dispersion D_n = O(n^{-1/d}) by construction. If a sequence has dispersion D, then every ball of radius D contains at least one point. The analysis of theorem 2 then ensures the result holds. □

Observe how simple these proofs are, compared to the proof in section 3.4. The trick that simplifies our analysis is that we have not taken any steps to ensure locality; there is no restriction here that we consider only the O(log n) nearest neighbors, as in PRM* or as in SRGG. This means that we consider O(n) successor actions in each state, as opposed to the O(log n) that are necessary. We can overcome this limitation

by adding locality preconditions to each action, but we will need to ensure that the

resulting graph remains connected: that is, we must ensure that the union of the

preimages of each sampled parameterized action â_i covers the preimage of a.

Pre(a) = ⋃_{â_i ∈ Â} Pre(â_i)    (3.10)

The challenge here, first understood and addressed by Karaman and Frazzoli [106], is that in order to maintain o(n) connections, the volume of Pre(â_i) must shrink as n increases. The key to effective sampling-based planning is balancing the rate at which the preimage shrinks against the number of actions with which to cover the space.

This motivates the second technical problem: because there is no constraint that C3 actions connect nearby states, there is no guarantee that we can shrink the volume of Pre(â_i) and retain connectivity. Imagine a robot jumping onto a platform; modelling the dynamics while the robot is in the air would be quite difficult, so we instead place requirements only on its position while still on the ground. If we simply intersected Pre(â_i) with a ball around the final pose whose radius shrinks as n increases, then there will be some n* such that for all n > n*, every jump action is infeasible from any point on the ground. We resolve this issue by distinguishing between local actions and nonlocal actions based on whether the strata linked by the action are adjacent. Local actions satisfy a stronger property than stratified accessibility: there is a radius δ such that if two states s, s' lie inside of a ball B(z, δ) ∩ σ, then (s, s') ∈ Con(a). If this holds, then every path on σ can be traced by a plan of any length n by selecting n points on the path and executing an action to move between each sequential pair. It is not hard to show that the analysis of section 3.4 applies to this case, and thus that if there are n action samples it is sufficient to consider only the sampled actions that lead to the O(log n) nearest posterior states.

3.4 A Proof that SRGG is Asymptotically Optimal

In this section, we prove the asymptotic optimality of SRGG.

3.4.1 Construction of a sequence of paths

Let σ* be an optimal solution to the planning problem. Decompose σ* into a sequence of M paths {σ*_m}_{m∈[1,M]}, each lying on a single manifold.

σ* = ⊕_{m=1}^{M} σ*_m    (3.11)

Define s_m as the stratum on which the path σ*_m lies. Let d_m be the dimensionality of the leaf containing σ*_m. Let d̄ be the maximum dimension of any intersection leaf expressed in the path: dim(s_m ∩ s_{m+1}) ≤ d̄ for all m.

Because the strata are analytic manifolds, there exists δ̄ > 0 such that each path σ*_m is homotopy-equivalent to a path that lies in the union of the δ̄-interior of the stratum s_m, an open ball of radius δ̄ whose closure contains σ*_m(0), and an open ball of radius δ̄ whose closure contains σ*_m(1). Define the weakly monotonically decreasing sequence {δ_n}_{n∈Z}:

δ_n = min(δ̄, n^{-1/d̄})    (3.12)

Clearly, this sequence satisfies 0 < δ_n ≤ δ̄ and lim_{n→∞} δ_n = 0; let n_0 = min{n ∈ Z : δ_n < δ̄}. Because the problem is δ-robust, there exists a sequence {σ_n}_{n∈N} such that σ_n has δ_n-clearance. Decompose each path σ_n into a sequence of M paths σ_{n,m}, just as with σ*.

3.4.2 Construction of balls on the intersections between strata

Define r_{0,n,m} = a_m δ_n / 2, where a_m is recursively defined to ensure that if a leaf intersects B_{0,n,m}, it also intersects B_{0,n,m+1}:

a_M = δ̄    (3.13)

a_m = sup{a > 0 : ∀y ∈ B(σ_{n,m}(0), a), sup_{t∈(0,1)} inf_{y'∈O_m(y)} d(σ_{n,m}(t), y') < a_{m+1}}    (3.14)

Note that r_{0,n,m} ≤ δ_n for all m for large n. For each path σ_{n,m}, define the region B_{0,n,m} as the geodesic ball of radius r_{0,n,m} centered at σ_{n,m}(0) on the manifold containing σ_{n,m}(0). Let E_{0,n,m} be the event that the ball B_{0,n,m} contains a sample: that is, E_{0,n,m} occurs when the intersection of the vertex set V_n and the ball B_{0,n,m} is nonempty. Let A_{0,n} = ⋂_m E_{0,n,m} be the event that each ball B_{0,n,m} contains a sample.

3.4.3 Construction of balls on an arbitrary leaf

Fix θ ∈ (0, 1) and r > 0; let σ : [0, 1] → M be a feasible path on a stratum M such that there exist real numbers t_−, t_+ with 0 ≤ t_− < t_+ ≤ 1 and the following conditions hold: for all t ∈ (t_−, t_+), σ(t) ∈ Int_r(M); for all t ∈ (0, t_−], σ(t) ∈ B_r(σ(t_−)); and for all t ∈ [t_+, 1), σ(t) ∈ B_r(σ(t_+)). Then there exists a finite collection of configurations Y(σ, y_0, r) = {y_k} drawn from the leaf containing y_0 with the property that if z_k ∈ B(y_k, r/(4+θ)) and z_{k+1} ∈ B(y_{k+1}, r/(4+θ)) are two vertices in a stratified random geometric graph, the SRGG algorithm will call the local planner for the pair (z_k, z_{k+1}), and the local planner will succeed. We provide a construction of Y in two steps. First, we consider the part of the path that lies in the r-interior of the manifold. Define a strictly monotonically increasing sequence (τ_k) as follows.

τ_0 = t_−    (3.15)

τ_{k+1} = sup{τ ∈ (τ_k, t_+) : d_M(σ(τ_k), σ(τ)) ≤ θr/(4+θ)}    (3.16)

Let K be the smallest integer k such that τ_k = t_+. Define (x_k)_{k∈[K]} so that x_k = σ(τ_k). Define (y_k)_{k∈[K]} so that d(y_k, x_k) ≤ r/(4+θ); by the assumptions on the leaf, such a sequence must exist. Define the set of balls (B_k)_{k∈[K]}, where B_k = B(y_k, r/(4+θ)).

Let z_k be an arbitrary configuration in B_k.

d(z_k, x_k) ≤ d(z_k, y_k) + d(y_k, x_k) ≤ r/(4+θ) + r/(4+θ) = 2r/(4+θ) ≤ r    (3.17)

d(z_{k+1}, x_k) ≤ d(z_{k+1}, y_{k+1}) + d(y_{k+1}, x_{k+1}) + d(x_{k+1}, x_k) ≤ r/(4+θ) + r/(4+θ) + θr/(4+θ) = (2+θ)r/(4+θ) ≤ r    (3.18)

d(z_k, z_{k+1}) ≤ d(z_k, x_k) + d(x_k, z_{k+1}) ≤ 2r/(4+θ) + (2+θ)r/(4+θ) = r    (3.19)

From equation (3.19), if the set of vertices includes a configuration in each of the pair of balls B_k and B_{k+1}, the local planner will be invoked for the pair; from equation (3.17) and equation (3.18), both samples lie inside the ball B(x_k, r), and therefore by the assumptions of the theorem the local planner will succeed if called. Note that with the exception of τ_K, sequential centers σ(τ_k) and σ(τ_{k+1}) are separated by θr/(4+θ); if L(σ) is the length of the path, it follows that

K ≤ ⌈(4+θ)/(θr) L(σ)⌉ + 1.    (3.20)

Next, we consider the part of the path that lies near the boundary. We will prove the result for t ∈ (0, t_−), assuming t_− ≠ 0; the proof for t ∈ (t_+, 1) is similar. Fix an arbitrary y_0 ∈ ∂M ∩ L such that d(σ(0), y_0) ≤ r/(4+θ). Define a chart φ_{y_0} : U → S × V × W in collar coordinates, such that S ⊂ R_{≥0}, V ⊂ R^{k−1}, W ⊂ R^{n−k}. Note that the coordinate s is equal to the minimum distance of a configuration to the boundary of the manifold. Note also that S × V is diffeomorphic to a subset of the leaf L; any curve for which the coordinates in W are constant will be a feasible path. For sufficiently small r, such a chart must exist [148]. Without loss of generality assume φ_{y_0}(y_0) = (0, 0, 0). Assume r is small enough that B_r(σ(t_−)) ⊂ U. Choose y_− ∈ L such that d(y_−, σ(t_−)) ≤ r/(4+θ), and define φ_{y_0}(y_−) = (s_−, v_−, 0). Let y_k = φ_{y_0}^{-1}(a_k s_−, a_k v_−, 0), where a_1 = 0 and the sequence (a_k) is defined recursively as follows.

a_{k+1} = sup{a ∈ (a_k, 1) : d(y_k, φ_{y_0}^{-1}(a s_−, a v_−, 0)) ≤ 2r/(4+θ)}    (3.21)

Note that the total distance from y_0 to y_− is upper-bounded:

d(y_0, y_−) ≤ d(y_0, σ(0)) + d(σ(0), σ(t_−)) + d(σ(t_−), y_−)    (3.22)

≤ r/(4+θ) + 2r + r/(4+θ) = (10+2θ)r/(4+θ).    (3.23)

Since, with the exception of the first and last centers, the distance between successive centers (y_k) is at least 2r/(4+θ), this part of the construction adds at most 2 + ((10+2θ)r/(4+θ)) / (2r/(4+θ)) = 7 + θ < 8 configurations to the set. The following claim is proven as proposition 5 in section 3.4.5; we omit the proof here. If the set of vertices includes configurations z_k ∈ B(y_k, r/(4+θ)) and z_{k+1} ∈ B(y_{k+1}, r/(4+θ)), these configurations will be connected by an edge. Similarly, if the set of vertices includes y_0 and a configuration z_1 ∈ B(y_1, r/(4+θ)), those vertices will be connected by an edge.

If t_+ ≠ 1, we can apply a similar construction at the other end of the path. In total, we have constructed a set of at most

K_σ ≤ ⌈(4+θ)/(θr) L(σ)⌉ + 17    (3.24)

balls, such that if each ball contains a sample, the resulting graph will contain a path with the desired properties. Then the cardinality of the set Y(σ, y_0, r) is upper-bounded by ⌈(4+θ)/(θr) L(σ)⌉ + 17. We now apply the construction Y(σ, y_0, r) to each stratum. If the event E_{0,n,m}

occurs, there exists some y_{n,m,0} ∈ B_{0,n,m} ∩ V_n. Let r_{n,m} = γ_m (log n / n)^{1/d_m}. Let Y_{n,m} = Y(σ_{n,m}, y_{n,m,0}, r_{n,m}). Let K_{n,m} = card(Y_{n,m}). Let B_{n,m,k} be the geodesic ball centered at y_{n,m,k} of radius r_{n,m}/(4+θ). Let E_{n,m,k} be the event that the intersection of the vertex set V_n and the ball B_{n,m,k} is nonempty. Let A_n be the event that every ball on each stratum contains a sample. Note that A_n occurs only if A_{0,n} occurs, as A_n is meaningful only if there exists a y_{n,m,0} ∈ B_{0,n,m} to define the leaf on which Y_{n,m} is defined. By construction, if A_n occurs, then algorithm SRGG will return a solution with finite cost.

3.4.4 Bounding the cost of the path returned

Fix an arbitrary β ∈ (0, 1), and assume there exists x_m ∈ B_{0,n,m} for all m. For each y_{n,m,k}, define a smaller ball B̂_{n,m,k} with the same center and radius β r_{n,m}/(4+θ). Let I_{n,m,k} be the indicator for the event that the intersection of the vertex set V_n and the ball B̂_{n,m,k} is nonempty.

I_{n,m,k} = { 1  if card(B̂_{n,m,k} ∩ V_n) > 0;  0  otherwise }    (3.25)

Let S_n = Σ_{m=1}^{M} Σ_{k=1}^{K_{n,m}} I_{n,m,k} be the number of smaller balls {B̂_{n,m,k}} containing a configuration, and let K_n = Σ_{m=1}^{M} K_{n,m} be the total number of smaller balls. If the cost function c is Lipschitz continuous, then for any ε > 0, β > 0 there exist α > 0, θ > 0, n_0 > 0 such that if S_n > αK_n, then for all n > n_0, c_n ≤ (1+ε)c* (proposition 6 in section 3.4.5). Let A_{n,α,β} be the event that S_n > αK_n. We can then upper-bound the probability that the path returned by SRGG has cost more than 1 + ε times the optimal cost in terms of the probabilities P(Ā_{0,n}), P(Ā_n | A_{0,n}), and P(Ā_{n,α,β} | A_{0,n}) (proposition 7 in the supplement).

P(c ;> (1 + )c*) P(An,n) + P(AnAn,n) + P(A,O,|An,n) (3.26)

Since each of the terms on the right side of proposition 3.26 is summable (proposi- tions 8, 10, and 12) the term on the left side of equation 3.26 is summable. Conse-

55 quently, the term on the left side of equation (3.26) is summable.

0000 00

EZ]P(A,n) < 00, ZIP(A IA,n) < 00, IP(An,A,IA,n) < 00 (3.27) n=1 n=1 n=1 00 .. P(cn > (1 + E)c*) < 00 (3.28) n=1

Therefore, by the Borel-Cantelli lemma, the event that the algorithm returns a feasible path with cost less than (1+ E)c* occurs infinitely often as n -+ o. The sequence cn then converges almost surely to c*.

3.4.5 Proofs of propositions

Proposition 5. Let{Yk} be a set of configurations near the boundary constructed as in the proof of theorem 1. Then if the set of vertices includes configurationsZk E B(yk, ') andZk+1 E B(yk+1, ), these configurations will be connected by an edge.

Similarly, if the set of vertices includes yo and a configuration zk E B(yk, those vertices will be connected by an edge.

Proof. Let zk be an arbitrary configuration drawn from B(yk, r) n L. For all k > 1, a pair of configurations (zk, Zk+1) is separated by a distance less than r and lies in an open ball centered atyk+-

d(zk, Zk+1) d(zky k) + d(yk, yk+1) + d(yk+1, Zk+1)

r + +2- < r (3.29) 4+0 4+0 4+0- d(zk, yk+1) d(zk, yk) + d(yk, yk+1) r +2rr< 3r (.0 < + 2rr < 3r< d(yk+1, B.M) (3.30) 4+6 4+ -4+0~

From equation (3.29), if a random sample exists in the pair of balls Bk and Bk+1, the local planner will be invoked for the pair; from equation (3.30), both samples lie inside the open ball B(yk, '), and therefore by the assumptions of the theorem the local planner will succeed if called.

56 Similarly, any configuration zi E B(yi, ) is separated from yo by a distance less than r, and both zi and yo lie in the closure of an open ball centered at y1.

r 3r d(zi,yo) d(zi, y) + d(y1, yo) ) + r (3.31) 4+0 4+0

Therefore, if the set of sampled configurations include both yo and any configuration in the ball B(yi, '), the local planner will be invoked and will succeed for the pair.

Proposition 6. Fix a,# E (0,1). Let Yn,m = (yn,m,k) be an ordered sequence of

Kn,, configurations such the distance fromYm,n,k to the center of the ball Bn,,,k is less than 14rn,, for all k (each large ball contains a configuration) and is greater than than -$arn,,mfor at most aKn,, of the configurations (aKn,, of the small balls do not contain a configuration). Let ulm be the concatenation of the result of an ri-complete local planner called for each pair (yn,,,» Yn,m,k+1) in the sequence. Then for any objective function

c(o) = h(o-(r)) g~o (&(), &(r)) dT (3.32) with h(x) a Lipschitz continuous cost function, and any E > 0, there exists a constant nE > 0 such that for all n > n, c(o-,m) < (1+E)c(o-n,m)-

Proof. Consider a path a on a Riemannian manifold .M with metric g. We assume without loss of generality that the path is parameterized such that it has constant speed: if l(o-, T) denotes the length of the path which follows a and stops at o-(T), then we assume l(o-,T) = l(o-, 1).3

3Note that an objective function of the form

c(-) = h(o-(r)) fg((r), 6(r)) do- (3.33) is invariant to a re-parameterization of the path: if r' is a strictly monotonically increasing function of r, then

h(a-(T')) g( (6'), (r')) dr' h((r)) g((r), (-r)) dr (3.34) and we can freely re-parameterize the path to be constant speed without changing its cost.

57 Let o-' be a constant-speed path segment with the following properties.

dM (o,(0) , a-'(0)) < ro (3.35)

dm(a(1), o'(1)) < ri (3.36)

It follows that we can bound the distance between corresponding points on either path; let L = lM (-) and L' =lm (o).

dM(oj(r), o'(r)) < rL +L' + ro r E [0, i] (3.37) (1 - )L +(1 - r)L +r1 rE [1, 1]

Since the cost function h is Lipschitz-continuous, there exists a constant Ch such that h(x') < h(x) + ChdM(x, x'). We can then bound the cost of the path a' in terms of the cost of the path o-.

c(o-') = JOh(o-'(r))V/9M(&'(r),&'(,)) dr (3.38)

= h(o-'(r))L'dr (3.39)

< j[h(o(T)) + ChdM(o(T),o'(r))]L'dT (3.40)

< c(-) + ChL' r(L +L') + ro dr + L' (1 - )(L +L) + ri d L L JJ1 (3.41)

< )_ L+ ro+r1. Sc(-)-+ 4 2

Let 0on,m,kbe the segment of the path o-n,m between the centers of the balls Bn,m,k and Bn,m,k+1, andletmkbe the result of an y-complete local planner called on the samples inside Bn,m,kand Bn,m,k+1. Let Ln,m,k =cpm(Unk) and Lmk - lm(i,m,k)

Recall that for an q-complete planner ir, for any c > 0 there exists rE > 0 such that for all x, ' e .M : dM(x,x') < r,, Lm(7r(x, x')) < (1 + e)dM(x,x'). Because the optimal path is differentiable, for any E > 0 there exists nE > 0 such that for all

58 4 n > n, Ln,m,k (1+E)- -rnLe. t ,=ED ,mk;because c(o1 Eo-2 ) = c(ol)+c(-2), we can bound c(u,m) for n sufficiently large.

C(Onm) = C(o4,m,k) (3.43) k

c(n,m,k)_____L(n.k ~~ + ChL',mn,k('Ca~m Ln,m,k + Ln,m,k + rk + rk+1) (.4(3.44) k n~m,k n,4 2 ( + 2# + 2a(1 - #) ( c + r (3.45)

cC(On,m)(1 + E)(1 + -(a + # - #)) + f2 (e, 6) ) (3.46)

Here, fi and f2 are functions of c and 0 independent of n. Since log n/n is decreasing in n, and since by assumption Co-,am is finite and nonzero, this implies that for any C' > 0, there exists nE, > 0 such that for all n > n, we havec(om) (1+')c(-,m). l

Proposition 7. For anye, there exist a,0, such that for sufficiently large n, the following inequality holds.

P(cn > (1+E)c*) P(Aa,n)+P(An|An,n)+P(Ana,,/ 3An,n) (equation (3.26) revisited)

Proof. The proof consists of three arguments. First, if An,,,nAn occurs, for any c > 0 there exists n, > 0 such that for all n > n, there exists a feasible path (section 3.4.1) through the graph gn with cost less than (1+ E)c* (proposition 6).

Second, A , n An occurs only if Aan occurs. In order to have samples from any ball on an leaf, we must first include that leaf in our sampling.

59 Finally, we conclude the following.

P(ca ;> (1+C)c*)

" P((An, nAnA n,,,)U (An,n)) (De Morgan's laws)

P(An n An,,,|A+,n) +P(A(An,n) (Union bound)

((P(An|An,n) + P(AnajAn,n))P(An,n) + P(An,n) (Union bound)

< P(An,) + P(An An,n) + P(A,,,pAn,n) (P(An,n ! 1))

Proposition 8. There exists A, B G R+ such that P(Aa,n) is upper bounded by

exp(-AnB).

Proof. Let An,n,m =lE,_1 En,n,m, be the event that the first m intersection balls

contain a sample. We can bound the probability of An,n in terms of the sum of

probabilities of the partial events An,n,m-

P(An,n) =1 - P(An,n) (3.47) M =1 - ]JP(En,m|A m1) (3.48) m=1 M SEP(En,m|Am1) (3.49) m=1

Let An,n,o be a placeholder event with probability1 to simplify notation.

If An,n,m occurs, the set Bn,n,m n Vn is nonempty; let x be a configuration in

Bn,n,m n Vn. Let Mn = OQm(x) n m+1, and let let do be the dimensionality of

Mn. When the vertex x is expanded, SRGG will sample un configurations from O Om(x) n Wm+1- P(En,mIA _) is the probability that none of the vn uniformly generated samples from M nlie inside B,n,m(x). Because M nis a Riemannian manifold, for any E > 0,

60 there exists rE > 0 such that

(3.50) vol(Bn,m) > (1 - E)(dnT n,m Vr < rE where the volume is taken on Mn. If we sample uniformly from Mn, we can upper bound P(En,mA_).

(P(En,mA_)= - vol(Bn,m) )n (3.51) 1P(nnjn~- 1 = ( -Vol (Mn))

exp(-un "'"M ) (3.53) < exp (-un (d ~~

-p (VO(Mn) (3.54)

< exp (-Amn 2 d) (3.55)

= exp(-Amnsm) (3.56)

Here, Am and Bm are constants that do not depend on n. Since by definition do < j,

Bm = 1 d > 0.

Finally,sinceP(A )<_ m_ P(En,mjA n), we can conclude there exist finite, positive A and B such that P(A,)< exp(--AnB) l

Corollary 9. P(An,n) is summable.

P(A,) < 00 (3.57) n=1

61 Proof.

00 P(A ) <;Zexp(-an') (3.58) n=1 n=1

< f exp(- at') di (3.59)

= a-b s-exp(-s) ds (3.60) b J0 (3.61) b b <00 Li

Proposition 10. There exists A > 0, B >1 such that P(AAn,n) is upper bounded by An-B.

Corollary 11. P(An|Aa,n) is summable.

00 ZIP(AnjAn,n) < 00 (3.62) n=1

Proof. We can bound P(AIAn,n) with the union bound.

M Kn,m P(AnAn,n) < EzEP(En,m,kA,n) (3.63) m=1 k=1

Because (Pmis a Riemannian manifold, for any c> 0, there exists r, > 0 such that

vol(Bn,m,k) > (1 - e)dmrnm Vrn < r (3.64) where the volume is taken on Pm. SRGG selects n samples uniformly from each leaf

62 it considers; we can then upper bound P(En,m,k|A,n).

P(En,m,kAf,n) =(I- vol(Bn,m,k) )n (3.65) Vole ( .O(Bm))

-exp n (1 -)(k 7 ,logn /d-)dm (3.67) Vol (OWm(xm))4+6 n f

=n-Bm (3.69)

Here, Bm = (,)dmTro f VOi(Om(Xmt)(i-)dm is a constant that does not depend on n. Therefore, for sufficiently large n, we have an upper boundoilP(AjA,n).

M Kn,m P(l~A,n)

Here, A and B are constants that do not depend on n; the last line follows by taking

B = minm{Bm - -}. Define -m.

(Xm))) im = 4 ((1 + Vol(O (3.75)

Provided we choose 7m > 7m for each m, there exists e, such that B > 1. O

Proposition 12. For any a E (0, 1), # sufficiently small, there exists A > 0 such

63 that for all n sufficiently large, the following inequality holds.

P(An,,|An,n) exp(-Ani/a) (3.76)

Corollary 13. P(An,,|JAn,n) is summable.

00 SIP(An,a,,JAn,n) < 00 (3.77) n=1

Proof. The proof employs a Poissonization argument. Since An,n occurs, for any m E [M] there exists m E B,n,m. Let Vn = {vi}iEN be a setof cnfigurations drawn independently and uniformly from Om(xm). Fix A E (0,1), and let Poisson(An) be a Poisson random variable with parameter An; then n= {vi}iE[poisson(n)] is an homogeneous Poisson point process with intensity n/vol(O (m). We refer to this process asPun.

In,m,k Recall is the event that 5n,m,k(Xm), the small ball of radius 3m4+0 and center

Xn,m,k, does not contains a sample. Let Sn,m =Z -i'm In,m,k be the total number of small balls for the segment Bn,m(m, -) that contain no points. Let in,m,k be the indicator for the event that fn n n,m,k is empty, and 5n,m= _"'" "' n,m,k. Then the event {Sn,m > aKn,m} is a decreasing event: if it occurs for some n, it occurs for all n' > n. Consequently, we can bound the probability of the event{Sn,m >! aKnm} in terms of the probability of the Poissonized event {Sn,m> aKn,m}-

P({5n,m > aKn,m}) P({Sn,m ;> aKn,m}) + P({Poisson(An) > n}) (3.78)

The latter term decays exponentially with n; this can be derived from a Chernoff bound argument.

(1 -A) 2 P({Poisson(An) > n}) < exp(- n) (3.79) 1 + A

The first term can be bounded in several ways; the simplest is to note that for sufficiently small 3, the balls are disjoint and the number of samples in each ball are

64 independent. The probability of In,m,k can be computed directly.

P(In,m,k) = exp ( Vol(.Bn,m,k) An) (3.80)

5 Because in,m,k are disjoint, n,m is a binomial random variable with parameters Kn,m and exp (-VOo Xmk An). We can then upper bound the probability of {Snm > aKn,m} for any a > Pn,m-

IP({Sn,m aKn,m}) exp(-Kn,mD(|pn,m)) (3.81)

exp(-2(a - Pn,m) 2Kn,m) (3.82)

exp(-Amn1/dm) (3.83)

Note that Am c R+ is a constant independent of n. SincePn,m is decreasing in n, for any a there is a finite no such that for all n > no, Pn,m < a.

Combining results, we can bound the probability of the desired event.

(1 - A)2 P({Sn,m > aK,m}) exp(-Amn/dm) + exp(-1 n) (3.84) 1 + A

Since Sn= _'m Sn,m, we can bound P(An,,An,n) with an application of the union bound.

P(An,ap3An,n) P({Sn,m aKn,m}) exp(-Anl/d) l m

Proposition 14. Let {(M1 ,g1)}L_1 be a set of L Riemannian manifolds, and let

X M 1 x ... x ML be their Cartesian product endowed with the product metric dx :X x X R+, defined as dx(x,x') 2 = Zw di(xi,x')2. Let r E R+, and fix a set {xm E X}_ 1 . Defnexm,i M suchthatxm= Xm,1 X -- X Xm,L. Let

Bi = {x c X : dx(x,xi) < r} be the ball of radius r centered at xm. Let P, be a

Poisson point process with intensity vi, defined on manifold M 1. Let P be the product of those L Poisson point processes, such that if X is a point set drawn from P1,

65 X 1 x ... x XL is a point set drawn from P. Then the probability that that more than aM balls do not contain any points in a draw from P is bounded from above by a function of v, a, and r. In particular,for any f > 0, there exists K, > 0 such that

P(Sx ;> aM) <1+ exp(-v vol(Bi)) (3.85) if v vol(BI) > KVl E [1, L].

Proof. Define the ball Bj, := {x E Mi : d(xi,) < L}. Let Iij be the indicator for the event that Bj,1 contains no points; let Ii be the indicator for the event that Bi contains no points. Note that ][_L Bj, c Bi, and therefore if there is a point in each of Bj,Vl E [1, L], then there is a point in Bi. Define S, = EZiI zfiiandSx= it follows that Sx < E S1 .

Let i = exp(-vvol(B)). We can compute the expected value of Sx from the linearity of expectation, independently of the location or shape of the regions.

M

E[S]= ZE[Ii] = Mp1 (3.86)

The variance of the sums can be computed from the expectations of the indicators and the expectations of their pairwise products.

M M Var[S] = E[Ij]- E[I%]E[IJ] (3.87) i=1 j=1

For i= j, we have E[I1Ij]= E[I] = p. For i j, we cannot compute the indicator product independently of the location or shape of the regions; however, we can obtain upper and lower bounds. The product of indicators I ly3 = 1 if and only if there is no sample in Bi or Bj; this is the same as the event that there is no sample in the region Bi U Bj. The volume of the union of two regions is more than the volume of either region and less than the volume of both regions; thus, vol(B) < vol(Ri U Rj) <

66 2vol(B), and in turn we can then bound E[Ij].

exp(-2vvol(B)) E[Ij] exp(-vvol(B)) (3.88)

Thus, the variance Var[Sn] can be bounded from above and below.

M(PI - el < Var[SI] M2p - p) (3.89)

The sums S, are independent; we can obtain the mean and variance of their sum.

L L IEES,] = MY i (3.90) l=1 l=1 L L L Var[E Sj] Var[Si] M 2 Y -p2 (3.91) 1=1 1=1 1=1

We can then bound the probability that more than aM balls have no samples from above.

L >M IP(Sx>aM) 5P(E:S,> T~ (3.92) 1=1 L L L S] > (a- pi)M) (3.93) 1=1 1=1 i=1

Var[E L S] 1 (Cantelli's inequality) 2 2 Var[E'_ 1S]+(a- 1=1 ) M

< EL:E1±(a-z_ 1 (equation (3.91) revisited)

The third line holds only if a > p.Note that the expression in the last line is on the order of E1= 1fp as pi tends to zero. In particular, for any E > 0, there exists K, > 0 such that

P(Sx aM) 1 exp(-vvol(B)) (3.94) 1=1 if v vol(BI) > KE V1 c [1, L].

67 68 Chapter 4

Planning with Abstraction

The C3 representation introduced in chapter 2 provides a powerful way to model problems in robotics, yet as we argued in chapter 1, there is no single "correct" model.

The worst-case time to solve a planning problem grows exponentially with the size of the description of domain, and so while a complicated, high-fidelity model may yield better plans, search is likely to be intractable. Domain-agnostic heuristics can help mitigate the computational burden of search, but cannot always avoid it effectively.

Conversely, the best plan in a smaller, simpler model may be easy to find, yet could lead to the selection of an arbitrarily poor plan-or the erroneous conclusion that no plan exists-if the model is missing some crucial piece of relevant information. In a perfect world, we could carefully craft a model that contains only the information needed to solve a given planning problem-yet that would require constructing a new model of how the world works every time our goal or state changed. While the challenges of designing an appropriate model are not unique to robot motion planning, the computational complexity of deliberative search makes it especially important in domains with many objects.

The key insight of methods that use abstraction or hierarchy is that large, com- plex problems can often be broken up into smaller parts and solved separately. We claim that the set of problems where this assumption is approximately true is much larger than the set where it holds exactly. A natural thing to do, therefore, is to use a small, easily solved model to guide the search through a large, complicated model.

69 Unfortunately, it is unclear how to specify the relationship between models of differ- ent levels of fidelity. Methods that operate on constructs like hierarchical temporal networks [1661 are limited to guarantees of hierarchical completeness: if the hierarchy does not capture the true dynamics of the problem, there is no guarantee that algo- rithm will find a plan if one exists. The lack of formal guarantees puts practitioners in the unfortunate position of needing to formulate a model with no way to verify their model is not wrong.

In this chapter, we show how we can augment the C3 representation to incorporate abstraction. We define abstraction as a relationship between actions; an abstract ac- tion represents a (possibly uncountable) collection of more primitive refinements. We then give sufficient conditions under which planning with abstraction is guaranteed to return either the optimal plan or a near-optimal plan. Underlying our approach are the angelic semantics proposed by Marthi et al.1144, 145]. Briefly stated, these conditions are that the abstraction must define accurate bounds on the cost of any re- finement of an abstract action, and that if a primitive plan is contained in an abstract action, it is also contained in some refinement of that action. We say an abstraction that meets these conditions is admissible, in a direct analogy to heuristic search. Ad- missibility gives objective criteria by which to evaluate an abstraction, and enables verifiable planning to be efficient in domains which have approximate structure. We demonstrate several algorithms that take advantage of an abstraction to speed up search, then describe future work which would employ multiple abstractions, only one of which need be admissible, and allow efficient anytime planning.

4.1 Admissible angelic semantics

3 Suppose we have two C domains, Di and D2 . We can trivially combine them into a single domain by taking the union of their associated variables, actions, and axioms.

The resulting domain will be no easier than either domain separately, and possibly significantly harder as the joint domain is much larger. If one domain is easier to plan in-due to additional structure, fewer constraints, a simpler model, or any other

70 reason-then under some circumstances, we can define additional structure to make planning in the union of domains only slightly harder than as planning in the easier domain. That structure will be our abstraction. In the remainder of this section, we introduce the notation and semantics of angelic abstraction. We then show how to define angelic abstractions for a collection of C3 domains.

4.1.1 Problem Formulation

We are interested in planning problems involving some underlying continuous con- figuration space X, such as the position of a robot or the configuration of its joints. Our task is to find a path through free space that starts in a specified state so and ends in a goal set Sgoa. The goal set may be specified as the set of states satisfying a given collection of constraints.

A path is a continuous map p : [0, 1] -+ X. We define a concatenation operator o for paths.

(P1 o p 2 )(s) p(2t) if (4.1) p2(2t - 1) if 1 < t <1

Let Hx(S, S') be the set of all paths starting in S CX and ending in S' C X. Let c : X x TX -+ Ro be a cost function, where TX is the tangent space of X. We can define an associated cost functional C : P- R>o.

C[p] = jc(p(t), P(t)) dt (4.2)

Because C is additive, C[p 1 o p2] = C[p1 ] + C[p2 ]. We define the optimal cost function c* : 2x x 2x - R>o as

c*(S, S') = inf{C(p) : p E Icx(S, S')}. (4.3)

We define the 6-approximate planning problem as the search for a path P E

IIx({so}, S) with cost less than (1 + c) the optimal cost for any c E R>o U {oo}.

71 E{p EG x ({so}, Sg) : C($) <; (1 +e)c*(so, Sg)} (4.4)

The case where c = oo, when we wish to find any feasible path to the goal set, is the problem of satisficing planning. The case where E = 0 is optimal planning.

The set Hx(X, X) of all possible paths from all possible start and goal locations is continuous and topologically complex. To simplify planning, we assume we have available a finite set Ao of primitive operators, low-level actions that can be executed in the real world. The problem of constructing such a set of operators in continuous motion planning domains is well studied; in this chapter, we will assume the set of operators is given by a finite discretization like the ones described in chapter 3. That is, we randomly sample a finite set of configurations V C X, and for each such configuration v, we define an operator p,. The operator pv ensures that the robot will end at the state v if executed from any state in the open ball of radius rn around v, where rn oc (log n/n)l/d is a radius that increases slowly with the size of the discretization. Any feasible plan can be well-approximated by a sequence of these randomly sampled operators as the number of sampled configurations tends to infinity.

lim inf{C[p] : p E Afn lO ({so}, Sg)} = inf{C[p] : p E Ux({so}, Sg)}. (4.5) n-+oo

See chapter 3 for details.

Because the set of primitive operators can grow quite large, especially in problems with high-dimensional configuration spaces, a direct search for primitive plans is computationally intractable. Instead, we will use angelic semantics to encode bounds on the cost of large groups of plans. We can use these bounds to plan efficiently while preserving optimality.

72 4.1.2 Angelic Semantics

An abstract operator a E A represents a set a C lIx of primitive plans. Because the space of plans is infinite, we define operators implicitly, using constraints on the underlying primitive plans. For example, in a navigation problem, we might define an operator as any primitive plan that remains inside a given set of configuration space and ends in a different set of configuration space. The definition of abstract operators is depicted graphically in figure 4.1a: the operator a2 3 contains every path that is contained in region 2 and ends in region 3.

The concatenation of two operators ai o a is an abstract plan containing all possible concatenations of primitive plans in the operators.

ai o aj = {p o pj : pi E a,,pj E aj,pi(1) = pj(O)} (4.6)

The condition pi(l) = pj(O) is necessary to enforce that only feasible plans are con- tained in ai o a. In a problem with nontrivial dynamic constraints, the condition would need to be more complex. Note that the condition means that the set of plans contained in ai x aj is smaller than the set product of plans in ai and aj: not all pairs of plans can be concatenated. Moreover, ai o a may be empty, even if neither ai nor aj is empty. In figure 4.1b, we show samples from the plan a1 2 o a2 3 o a3 4 o a4g, which contains paths that move from region 1 to 2 to 3 to 4 to the goal. The concatenation operation allows us to express complicated sets of plans in a compact way.

Because our operators are defined implicitly, it can be difficult to find the best plan in the abstract plan, or even to decide if there exists a plan consistent with the constraints of an abstract plan. Note that it is easy to write down abstract plans that are empty; in the toy navigation example in figure 4.1, the plan a 12 o a3 4 contains no primitive plans, as the intersection of regions 1, 2, and 3 is empty. For planning with abstract operators to be feasible, we need a way to reason about the primitive plans contained by an abstract plan without first enumerating those primitive plans.

Specifically, we will define a valuation of an operator or plan, and use valuations to compare abstract plans. Then, we will show that this comparison is sufficient

73 2 Start Start 4 1 Goal Goal 3 3

(a) Plans in operator a23 (b) Plans in a 12 o a23 o a3 4 o a4g

2 2 Start Start I4 Goal G oal 3

(c) Bounds for V[as4] (d) Bounds for V[a12 o a23 o a34 o a4g]

Start Start 1 3 3

(e) Plans in a 12 o ACT (f) Plans in a12 o a23 o ACT

Figure 4.1: A schematic description of angelic semantics. Abstract operators (a) are sets of primitive plans, and can be defined implicitly in terms of constraints. For example, the operator a23 contains all plans that end in region 3 and do not leave region 2. We can sequence abstract operators into abstract plans (b). The red lines link the centroids of successive regions, while the black lines are randomly sampled primitive plans representative of the abstract plan that move from regions 1 to 2 to 3 to 4 to the goal. We can use domain-specific information to compute bounds on the cost of any plan in an operator stating from a specific set of states (c). Here, lower bounds are drawn using dashed lines, while upper bounds are drawn in dotted lines. Note the dependence on the initial state: the cost of a plan starting in 1f 3 is strictly higher than the cost of a plan stating in 2n 3. We can sequence these bounds (d) to compute bounds on the cost of an abstract plan. Finally, a refinement of an abstract plan p (e) is a less abstract plan (f) p'C p. Primitive plans in p' are shown with red lines, while plans in p \ p' are shown with green lines.

74 for planning. A valuation V[a] for an operator or plan a is the unique map V[a] : X x X -+ R>o that takes a pair of states and gives the cost of the lowest cost path between the pair.

V[a](si, s2) = inf{C(-) : a E a,o(O) = si, -(1) = s21 (4.7)

Note that if there are no paths in a linking si and S2, then V[a](si, s 2) = inf 0 = oo.

Valuations allow us to compare abstract plans without reference to the primitive plans they contain. Given two abstract plans p and p', if we can prove that for any pair of states x,x, either VIp](x, x') < V[p'](x, x') or V[p'](x, x') = oo, then either there is a solution to our planning problem in p, or there is no solution in p or p'.

Either way, we do not need to consider any plan in p'; we can prune p' from our search space. Under such a condition, we say that p dominates p' and we write

V[p] -< V[p']. Similarly, if either V[p](x,x') ; V[p'](x,x') or V[p'](x,x') = oo, then we say that p weakly dominates p' and we write V[p] - V[p'].

Unfortunately, determining the valuation of an operator is itself an optimization problem, and one that is not necessarily any easier than the planning problem we are trying to solve. In many cases, however, we can derive a computational advantage from reasoning about bounds on the valuation of an abstract operator. By computing these bounds from a description of the abstract state and operator, we will be able to reason without reference to the underlying states or plans.

We first define bounds on the valuation of an operator between two sets of states.

VL[a](s, s')= inf{inf{V[p](s, s') : s' c s'} s E s} (4.8)

Vu[a](s, s') =sup{inf{V[p](s, s') : s' E s'} s E s} (4.9)

An abstract valuation bound #[a] is a set of tuples {(s, s', 1, u)}, where s, s' are ab-

75 stract states and I < u E R>o U{oo}. An abstract bound V[a] is admissible if

3(s, s', l, u) E [a] 1 < VL [a] (s, s') (4.10)

V(s, s', l, u) E [a] u > Vu [a](s, s'). (4.11)

In words, a bound (s, s', 1, u) is admissible if for any state x in s there exists a plan

p ending in some state x' in s' with cost c= C[p] bounded above by u and below by 1. We can also interpret a valuation bound as a bound over sets of states. We can

use an abstract valuation bound to approximate the true upper and lower bounds on the valuation of an operator.

VL[a](s, s') = inf{l : (so, si, l, u) E V[a), s n so #o, s'fn si # 0} (4.12)

Vu[a](s, s') =inf{u :(so, si, l, u) E V[a], s C so, s' C si}. (4.13)

Note that ifV[a] is admissible, then VL[a](s,s') < V[a](s,s') and Vu[a](s, s') <

Vu[a](s, s') for all abstract state pairs s, s' (see section 4.4, proposition 22).

The relationships between abstract valuation bounds have important consequences

in a few interesting limiting cases. A bound /[a] contains at least one element

(ss',l, u) where u is finite only if then there must be some plan in the operator a. A bound f[a] does not contain an element (s, s', 1, u) where 1 is finite only if a is empty. Similarly, V[a](s, s') < oc implies a contains feasible plans connecting each state in

s to some state in s', while L[a] (s,s') = oo implies a contains no plan cnnecting a state in s to a state in s'. In addition, if 9[p] and 9[p'] are admissible, then V[p U p'] = V[p] U V[p'] is admissible (see section 4.4, proposition 24 for a proof).

It is also important to recognize the state-dependence of valuation bounds. Con- sider the operator as4 in figure 4.1c; the operator is defined as containing any plan contained in region 3 that ends in region 4. Because regions 3 and 4 intersect, the global lower bound on the cost of a plan in this operator is zero. However, we can compute nontrivial bounds for specific states, or for specific sets of states. For ex-

76 ample, paths achieving lower and upper bounds are drawn from the abstract states

R2 nR 3 and R1 nR3 to the termination set of the operator.

As we will see in sections 4.1.3 and 4.1.3, for many domains we will not need to

write down a valuation explicitly. Instead, we can use domain information to make metric computations and generate the necessary elements of a valuation procedurally.

Moreover, by working with bounds we can efficiently compute bounds on the cost

of plans consisting of sequences of abstract operators, without reference to a dense

discretization of the underlying space of plans. For example, if we have bounds on a plan V[a] and an operator #[a'], we can compute a bound V[a o a'].

[a o a'] ={(s, s"', 1 + 1', u + u') : (s, s', 1, u) E V[a],

(s",7s'" /', n' 7U) E $[a'], s' Cs"} U

{(s, s"', 1 + 1', u) (s, s', 1, u') E $[a],

(" s" oo)', E #[a'], s'n s" o} (4.14)

If V[a] and V[a'] are admissible, then V[a o a'] is admissible as well (see section 4.4, proposition 23 for a proof). We call this process propagation. The process of propa- gation is depicted graphically in figure 4.1d.

4.1.3 Admissible Abstractions

We will use angelic semantics to specify abstractions that enable efficient planning.

Suppose that p, p' are abstract plans, with p c p'. Then p' -d p, since any plan in p is also in p'-but because p is a smaller set than p', our bounds may tighter. If

Vu[p'] -< L[p], then we can also conclude that p'\ p -< p'. We can incrementally construct an increasingly accurate estimate of V[p] by iteratively considering smaller

and smaller subsets of an operator p and pruning those subsets that cannot contain an optimal plan. The pruning process is depicted graphically in figures 4.le and 4.1f.

We can make precise the construction of these increasingly fine subsets by in- troducing a refinement relation 7 c A* x A*, where * denotes the Kleene closure.

77 The elements of R are ordered pairs (p, p') such that p' C p. We can construct a relation R by defining a procedure to generate plans p' given a plan p. First, define an operation HEAD : A* - A, which takes a plan p and selects a single operator a from it to replace with a more constrained refinement. We then define operations

BASE : A* -+ A* and EXT : A* -+ A* that return the part of p before and af- ter HEAD(p), respectively. Together, the three operators split a plan p into three segments so that p = BASE(p) o HEAD(p) o EXT(p). Finally, we define a domain- specific relation T CA x A*; this relation can be thought of as a function mapping an abstract operator to a set of abstract plans. Then (p, p') E R if and only if p' = BASE(p) o p" o EXT(p) and (HEAD(p), p") E N. If (a, p) E N, we call p a refinement of a; similarly, if (p, p') E R, we call p' a refinement of p.

We can combine these elements into an abstraction over a problem domain (X, c, so,S). Formally, an abstraction is a tuple (S, A, ,Z), where

" S is a collection of propositions,

" A is a collection of operators, including a distinguished top-level operator Act,

" 1 c A x A* is a refinement relation, and

o V is a valuation bound.

The valuation bound encodes both the cost and the dynamics of our problem domain.

The refinement relation structures the space of abstract plans.

Angelic planning algorithms accept an abstraction as an argument in much the same way that the A* search algorithm 179] accepts a heuristic. The analogy to heuris- tic search raises an important question: under what circumstances will an abstraction

(S, A, R, V) allow us to find the optimal primitive plan for a domain (X, c, so, Sg), and to prove we have done so? We will generalize the idea of an admissible heuristic to define an admissible abstraction. As we will show in section 4.2, two properties suffice.

Definition 2. An abstraction (S, A, , ), defined over a planning domain (X,c), is admissible if

78 1. For each abstract operator a E A, for each primitive plan p in a, there is a

refinement p of a such that p E p, i.e.,

Va E A,Vp E a, 3(a,p) E R: p E p. (4.15)

2. V is admissible, i.e., L[p] 3 V[p] V U[p] for each abstract operator p E A.

The first property ensures that we do not "lose track" of any primitive plans while refining a plan. Plans are only removed from consideration when they are deliberately

pruned. The second property ensures that if abstract plans p, p' E P, where P is a collection of abstract plans, and 1[p] -< 9[p], then no optimal plan is in p' and thus the best plan in p is also in the set P' = P \ {p'}. Taken together, these properties ensure that if P' is the result of refining and pruning a collection of plans

P, then for every plan in P there is a plan that is no worse in P'. If we start with the set Po = {Act}, no sequence of refinement and pruning operations will discard

an optimal solution. To construct complete planning algorithms, we simply need to

choose an order in which to refine and prune, and keep track of bounds to know when we can terminate the search.

The Flat Abstraction for Graph Search

We illustrate the construction of an admissible abstraction with graph search. Let

g = (V, S) be a graph, where each edge e E S has an associated cost ce. Suppose our objective is to find the shortest path to a goal node v9 E V and we have an

admissible heuristic h : V R>o. Then the abstraction Ag = (V, E U {ACT}, 9, z) is admissible, where

SV[ae] {({eo}, {e1}, ce, ce)},

" V[ACT] = {({v}, {vg}, h(v, v'), oc) : v c V)}

" 7 is the union of{(ACT, e o ACT)Ve c £} and {(ACT, e)Ve E S : ei = vg}.

Admissibility of r follows immediately from the admissibility of h, and the admissi- bility of k is easily proven. By definition, any primitive plan p is contained in ACT.

79 Every primitive plan in the abstract plan p o ACT is of the form p o p'. Suppose the first primitive operator in p' is e. For each such p and p', (ACT,e o ACT) E

Therefore 7 is admissible, and so Ag is admissible.

This abstraction demonstrates that the machinery of angelic abstractions is at

least as general as heuristics in graph search: every graph search problem can be reformulated as an abstract search, using the edges to define a refinement operation

and an admissible heuristic to define lower bounds. Often, however, we can use domain-specific information to devise even more informative abstractions. In the

remainder of this section, we will provide concrete examples of admissible abstractions

for a pair of simple continuous planning problems.

An Abstraction for Navigation

A common problem in robotics is navigating to some specified goal location in a

structured environment. Simple heuristics like the Euclidean distance to the goal

work well in environments that are cluttered but largely unstructured, where the

distance is a good proxy for the true cost. In highly structured environments, however, the Euclidean distance can be quite a bad proxy for cost. Consider the example in

figure 4.8, in which the robot starts just on the other side of a wall from the goal.

Using A* with a Euclidean heuristic requires searching almost the entire space.

We can plan more efficiently by taking advantage of structure in the environment.

Suppose we have a decomposition of the environment into a finite set of overlapping regions, as in figure 4.1, and we know which regions overlap. These regions can

be derived from a semantic understanding of the environment, such as rooms and

doorways, or they can be automatically extracted using (for example) the constrained

Delaunay triangulation. Then any plan can be described by the sequence of regions it moves through. We can use this region decomposition to define an abstraction.

Let S = {Ri}, where UiRi = X, and let A = Ao U {aij : Ri n R # 0}U{ACT}, where p E aij if p(s) E RiVs E [0, 1) A p() E R. The refinement can be defined as

80 follows. T=U{(ACT, aij o ACT), (ACT, aij)} U

{(aij, a o aij) : a(t) E RVt} U (4.16)

{(aij, a) : a(t) E RjVt, a(1) E cl(Rj)} We can use spatial indices like k-D trees and R-trees to quickly find the operators that are valid from a particular state. It is straightforward to show this refinement relation is admissible (see section 4.4, proposition 25).

If the cost function is path length, then we can compute bounds using geometric operations. Executing the action aij from a state in RknlRi would incur a cost at least as great as the set distance inf{IIs - s'I| : s E Rfn Rk, s' e Rf n R}. This function is always a lower bound on the true valuation of the action; if the intersections between sets are small and well-separated, it will also be an accurate estimate, in the sense that the cost of the best plan in the operator will be well-approximated by the lower bound. Planning with this lower bound has the effect of heuristically guiding the search towards the next region, allowing us to perform a search in the

(small) space of abstract plans rather than the (large) space of primitive plans. The Euclidean heuristic can deal with things like clutter and unstructured obstacles, while the abstraction can take advantage of structure in the environment.

Note that we have made no reference to the shape of the regions, nor even to their connectedness. If regions can be disconnected, for instance by an obstacle, abstract operators can have no upper bound, which can lead the search to be inefficient. On the other hand, if we require the regions to be convex, then executing the action aij from a state in Rk n Ri would incur a cost no greater than the Hausdorff distance dH (Ri n Rk, R, n Rj), where

dH(X, Y) = max(sup inf ||x - yI, sup inf ||x - yl|). (4.17) xX yEY yY xEX

Convexity is quite a strong requirement. In a cluttered environment, a convex representation may need to contain many regions. We can relax the requirement of

81 (a) (b)

(c) (d)

Figure 4.2: Useful regions for defining abstract operators are nearly convex. In all four examples here, the lower bound is given by the Euclidean distance in work space. In a convex region (a), the gap between the lower bound on the cost of a plan and the true optimal cost is zero. In an E-convex region (b), the gap between the lower bound 1 on the cost of a plan and the true optimal cost c is small: 1 < c < (1+ E)l. Some regions are not c-convex for any finite ; for example, the region might not be connected (c). A region can fail to be e-convex even if the region is connected in the work space (d) if it is not connected in configuration space. Here, the object cannot fit through the narrow gap, and so the region is not E-convex. In the presence of dynamic constraints, regions can fail to be path-connected even if they are connected in the workspace.

82 convexity, and generalize to costs besides path length, by defining E-convexity. A region R is c-convex if

inf C[p] < (1 + c)IIx - x'I|. (4.18) PEflR(X,X')

E-convexity is shown graphically in figure 4.2. Intuitively, a region is E-convex if the

shortest path between any two points is only slightly longer than the distance between the points. For example, if X c R", a convex region R cluttered with convex objects

of diameter less than d is E-convex, with c = 7r this fact is an elementary

consequence of Jung's theorem [104].

An Abstraction for the Door Puzzle

The door puzzle introduced in the introduction combines the motion-planning aspects

of navigation with a high-level task planning problem: the choice of which doors to open and in which order. Unlike in the navigation problem, the configuration space for the door problem involves discrete components: X c R2 x {0,}N, where where N is the number of doors. The discrete elements of the door puzzle create an element

of combinatorial search that is not present in the navigation problem.

We use the same region-based abstraction to guide the search for motion plans, and construct a relaxed representation of the effects of toggling switches in PDDL by

omitting geometric constraints like collision. Using the region-based representation, we can quickly compute a partial ordering on the sequence of switches that need to be pressed in order to reach the goal. For example, in figure 4.3, the path to the

goal is blocked by six doors. Before we can move towards the goal, we must move to and press each of the six switches. Knowing that each switch must be pressed leaves

us with the task of computing a lower bound on the cost to reach and toggle each

switch. We can find such a bound in two steps. First, we construct a directed graph

whose vertices are the possible effects of executing each operator, and whose edges have weights that lower bound the cost of executing each operator. By constructing this graph, we reduce the problem of finding a lower bound to solving a travelling

83 V9.101 03 05

4tart oal

(a) Problem

00 Om al

(b) Lower bound

01 03 05

tart oal

0204 06

(c) Solution

Figure 4.3: In the problem shown in (a), it is easy to conclude that all N= 6 doors must be opened before the robot can reach the goal. However, there are N! = 720 possible orders in which we might press the switches. We can bound the cost of any sequence by solving a travelling salesperson problem (b, dotted lines), where the edge costs are the minimal distance the robot must travel to move between switches. Although the TSP is an NP-hard problem, we can compute a lower bound on the cost of a solution in polynomial time by computing a minimum spanning tree (b, solid red line). Using the MST to guide search allows the planner to quickly find a near-optimal solution (c).

814 salesperson problem (TSP). While solving a TSP requires exponential time, we can

compute a lower bound on the cost of the optimal solution by computing a minimum spanning tree of the directed graph, which can be done using polynomial time with

standard methods. The directed graph, and the associated minimum spanning tree, are drawn in figure 4.3b. Although the minimum spanning tree bound neglects pos-

sible interactions between the operators, it is admissible; in fact, it is an admissible

special case of the more general (and inadmissible) hadd heuristic 180]. We can use

minimum spanning tree bound to guide the search for a more detailed motion plan (figure 4.3c).

4.1.4 Abstractions in C3

We can use the structured nature of C3 to simplify constructing abstractions. Sup-

pose we have a collection of domains {Di}. Semantically, nothing stops us from

planning jointly in these domains by allowing actions from any domain to be concate-

nated into a plan. Because the actions and axioms in each domain in the ensemble

will only reference variables from one domain, a direct combination will confer no computational advantage. The resulting plans will simply be interleaved plans from

the constituent domains. However, if we add auxiliary axioms defining relationships

between domains, then a plan in one domain can affect the validity of an action in

another domain. For example, suppose we have two domains, one in which the states

are discrete regions and another in which they are continuous poses. Then we can

define an axiom that assigns a region in the former domain to each pose in the latter domain.

Auxiliary axioms are very nearly enough to allow us to specify an abstraction. We can define a refinement relation R as a directed acyclic graph whose vertices are the individual domains, and an edge from domain i to j implies that Di is an abstraction of domain j. Any vertex of out-degree zero represents a domain whose actions can actually be executed on a physical robot; any other vertex represents an

abstract domain. Then we can define a function REFINELEFT(a, s) that takes a from domain i and a state s containing variables from all domains, and returns the list of all

85 actions a' taken from child domains of domain i in the tree R such that s E Pre(a') and

Apply(a', s) E Pre(a'). That is, it returns precisely the list of actions such that a' o a is a valid plan from s. Similarly, we can define REFINEDOWN(a, s) to generate those

actions a' from child domains such that s E Pre(a) and Apply(a', s) E Apply(a', s). That is, it returns precisely the list of actions such that a' satisfies the contract of a from s.

We claim that it is possible to construct an admissible refinement relation from

these functions. Specifically, the union of REFINELEFT and REFINEDOWN will lead to a forward search; if we always refine the leftmost action of a plan first, the prefix

of a plan will be concrete while the suffix will be abstract. Conversely, the union of REFINERIGHT and REFINEDOWN will lead to a backward search; if we always

refine the rightmost action of a plan first, the suffix of a plan will be concrete while the prefix will be abstract. These two relations are not the only possible refinement relations; one could easily imagine refining a single high-level action into an entire plan, in order to take advantage of efficient planners leveraging special structure in one domain. However, in our experiments we have found that using REFINELEFT to implement forward search yields an effective refinement relation.

4.2 Abstract planning algorithms

We now describe several algorithms that leverage an admissible angelic abstraction to search efficiently, even in high-dimensional continuous spaces. Our algorithms are all derived from the angelic hierarchical A* algorithm developed by [144]. We begin by reviewing this algorithm (section 4.2.1), then discuss a subtle variation that dramatically improves efficiency in some common degenerate or nearly degenerate cases (section 4.2.2). Finally, we discuss an extension that solves the approximately optimal planning problem, embracing the key insights of weighted A* (section 4.2.3).

1 We could also define REFINERGHT, to generate those actions where a o a' is valid, in order to implement backwards planning.

86 4.2.1 Angelic A*

Angelic A* (algorithm 3) is a reformulation of the angelic hierarchical A* algorithm

[1441. This algorithm solves the optimal planning problem using a best-first forward search over abstract plans.

The primary data structure maintained by our algorithm is a tree. Each node in the tree is a tuple (a, p_, BASE(p), [p]) representing a plan p = p_ o a, where

* ais an abstract operator,

" p_ is a pointer to the predecessor of the node,

" BASE(p) is a pointer to the base plan, which is used in choosing refinements, and

* V[p] is an admissible bound on the valuation of p.

The root of the tree is the node (0, 0, 0, {({x}, {x,}, 0, 0)}, representing the start of any plan.

The main entry point for the algorithm is the SEARCH routine, which first con- structs the root plan node (line 2) then computes an initial abstract plan that includes all possible primitive plans (line 5). This abstract plan is then added to the plan queue (line 6). Then, as long as a plan remains on the queue, AA* repeatedly finds the abstract plan in Q with the lowest lower bound (line 8). If this plan is dominated by a previously discovered plan, then the algorithm returns successfully, as any remaining plan on the queue is also dominated. Otherwise, AA* expands the active plan by computing its successors and adding them to the queue if they cannot be pruned

(lines 12-17). If the queue becomes empty without discovering a primitive plan that reaches the goal, then no plan exists and the algorithm returns failure.

AA* generates successors to a plan using the refinement relation. It then con- structs a set of child plans by selecting one operator from the plan and replacing it with its refinements. Any successor plan that cannot possibly contain an acceptable solution is pruned, while any plan that could contain an acceptable solution is added

87 Algorithm 3 Angelic A* 1: function SEARCH (abstraction (S, A,7NV)) 2: root = (0, 0,0, {(xxS,0,0)}) 3: p* = 0 4: BOUND(0) = V[ACT] 5: po P ROPAGATE(root, [ACT]) 6: Q {po} 7: while |QI> 0 do 8: p = argmin{V[p]({x,}, X,) : p EQ 9: if PRIMITIVE(p*) and fu[p*] -< L[p] then 10: return p* 11: else 12: Q <- Q\{p} 13: S +- SUCCESSORS(p) 14: for p'E S do 15: if ¶u[p']

88 [1.6, 1.61 13.3, 4.0] [4.3, 5.0] [7.3, 9.1]1 [11.2, 15.01

-(-6.3,5.9)-G0()- T 5OGGLE(S1)-Go(R1,R2)-Go(R2,X ) BASE(p) HEAD(p) ExT(p)

[1.6, 1.6] [0, 0]

•(6.3, 5.9) (-7.2. 7.1)+Go(S 1 ) TOGGLE(S1)+-Go(R1, R2 )-Go(R 2, Xg) BASE(p) p' E REF(HEAD(p)) EXT(p)

[1.6,1.6] [3.1,3.1] [3.5,4.2] [4.5, 5.2] [7.5, 9.3] [11.4,15.2 [0, 01 -(-6.3, 5.9)-(-7.2, 7.1)+-GO(S1)-TOGGLE(S1)-Go(R1, R2)4--Go(R2, Xg) BASE(Pref) HEAD(Pref) EXT(pref)

Figure 4.4: A schematic illustration of the process by which we construct successor plans. A plan is represented by a collection of nodes representing operators. Each node has a pointer to its predecessor, and represents the concatenation of the prede- cessor with its operator. Each node also has a pointer to a base node. To form the successors of a plan, we first break plan into three pieces: the base, the node after the base (called the head), and the rest of the plan (called the extension). We then replace the head with a valid refinement, chosen to be optimistically feasible. Finally, we propagate, creating new nodes corresponding to the operators in the refinement and the extension.

89 Go(5.83, 0.36)

Go(-7.29, -2.49) Go(RO, R1 )

Go(R2, X,9)

Go(5.83,0.36) Go(R2, X,)

G (Go( R2, X.. )

Go(-2.63, -4.49) Go(1.77, 4.09)

Go(R, R2)

Go(3.13, -4.47)

Go(7.00, -3.00)..

Go(-0.75,0.00) Go(RD, RI)..

Go(5.83,0.36)..

Go(8.13, 3.39)..

Go(5.83,0.36)..

Go(8.13, 3.39)..

Go(1.77, 4.09)..

Go(0.75, 0.00) Go(7.00, -3.00)..

Go(RI, R2)..

-Go(-4.14, 4.31) G(R2, X,)

Go(-1.59,0.71) Go(Ro, RI)

Go(Ro, RI) Go(Ro, RI)..

Go(-5.02,0.94) Go(RI, R2)..

Go(Ro,S5) ... ~ Go(-9.00, 2.00) Go(RD, R1)..

AcT .. Co(R&,RI)

Figure 4.5: Part of the plan tree constructed by AA* for the problem shown in figure 4.6. Each node represents a plan; edges link a node to its predecessor. Nodes that are part of the optimal plan are highlighted in red. Branches of the tree not drawn are indicated with an ellipsis. The act of opening the door is referred to as TOGGLE(S). Primitive motion operators are referred to as Go(x, y), where x and y are coordinates. An abstract motion to a region Rj through a region Ri is referred to as Go(Ri, Rj). The top level operator is labeled ACT.

90 Figure 4.6: The problem solved by the tree in figure 4.5. The plan first toggles the switch in the lower left hand corner, then moves through the opened door to reach the goal. Primitive operators (edges) are displayed as dotted lines, while the optimal plan is highlighted in red.

91 to the priority queue. The algorithm terminates when we remove a plan from the queue that is dominated by a previously expanded primitive plan. We compute the valuation of each new plan incrementally (line 28). If that new plan does not optimistically reach some state with lower cost than a previously ex- plored plan ending with the same extension, we discard it (line 37). Otherwise, we update the bounds on any plan with the current extension to include the new plan (line 39). Next, if the upper bound on the cost of reaching the goal under the new plan is better than any previous plan, we record this new plan as the best yet found (line 16). Finally, if the lower bound on the cost of reaching the goal under the new plan is better than the upper bound under any previous plan, we add it to the set Q of active plans.

Marthi et al. showed this algorithm will return the optimal refinement of the top- level operator ACT after a finite number of iterations, provided the lower bound on the cost of every operator is greater than zero.

Theorem 15. Algorithm 3 willreturn the optimal primitive refinement of the abstract plan ACT, provided the lower bound on the cost of every operator is strictly positive [144].

However, if the abstraction is admissible, we can prove the following stronger claim.

Theorem 16. If the abstraction A is admissible and a feasible plan exists, then algorithm 3 returns an optimal sequence of primitive operators in finite time, provided the lower bound on the cost of every operator is greater than zero.

Corollary 17. If the set of primitive operators A 0 ,n is asymptotically optimal (equa- tion 4.5), then

lim Pr(C[SEARCH(An)] < (1 + e)c*) = 1. (4.19) n-*oo

Proof. Define the condition 1i(p) as

Ii(p) - 3p Q:p C p V u[p] -< V[p]. (4.20)

92 We will first show that if Ii(p) holds before line 12, it holds after line 17. If Ii(p) holds initially, there is a plan p E Q such that p E p. If p f arg min{ZL[p] : p E Q}, then p E Q after line 17. Otherwise, because the abstraction is admissible, there is a

plan p' such that p E p' and (p, p') E R. If P- p', there is a plan p" E Q such that

Vu[p") - V[p). Otherwise, p' will be added to the queue before line 17. Therefore if Ii(p) holds before line 12, it holds after line 17.

Initially, Q = {ACT}. By construction, the operator ACT contains every valid

sequence of primitive operators. Therefore the proposition Vp Ii(p) holds after

line 6. If i(p) holds at line 8, it holds before line 12; therefore Vp :Ii(p) holds every time line 9 is reached.

If Vp: Ii(p) and lu[p*]-- L[p]Vp E Q, then Vpip* E p* : V[p*]

If there is a feasible plan, it has finite cost, and therefore the cost c* of an optimal

plan is finite. Let N E N be the number of plans with lower bound less than c*;

because the abstraction is finite and every operator has a positive lower bound, N

is finite. Each plan is added to the queue at most once, and is therefore removed at most once. Then after N iterations, the lower bound on any plan in Q is greater than c*, and algorithm 3 will return a plan.

The distinction between these claims is subtle, but important. Theorem 15 implies hierarchical optimality: if a plan is returned, no better plan can be expressed as a refinement of the top-level operator. Theorem 16 implies primitive optimality: if a plan is returned, no better plan exists. If we can ensure our abstraction is admissible, then using our abstraction provides the same guarantees as a direct search over the space of primitive plans, but may be much faster.

93 02 2

0 0

2

Goal x

(c) (d) (e)

Figure 4.7: An illustration of the problems with cyclic paths. Many natural operators in continuous domains have a cost with a lower bound of zero and no upper bound. For example, deciding whether the irregularly-shaped object (a) can reach the blue region requires detailed geometric analysis. Since regions 0 and 2 touch, the greatest lower bound on the cost of a plan in a oi o a1 2 is the same as the bound on aoi o a1 2 0aio oa 1 2 -

Any number of repetitions of the cycle aio o a12 will have the same cost, and so if aoi o a1 2 is ever selected for expansion, the algorithm will only ever refine this infinite sequence of cyclic plans. Separating the regions (b) eliminates the infinite recursion, but remains inefficient; each cycle of aio o a1 2 adds only a small cost E to the lower bound, meaning that if the next plan on the queue has a cost J greater, the planner will consider [6/E cycles before considering the next acyclic plan. We cannot simply ignore these 'cyclic' plans; in some scenarios, the best plan (c) or any feasible plan (d) is cyclic. Cyclic plans can be important even if the regions defining our operators are connected in configuration space: in the diagram in (e), although there is a feasible plan in a 12 o a29 , the optimal plan is in a1 2 o a21 o a29 .

94 4.2.2 Acyclic Angelic A*

Algorithm 3 requires strictly positive lower bounds on the cost of any operator. In discrete problems, this requirement is reasonable restriction, but it presents challenges in continuous problems. For example, suppose we have a plan consisting of two oper- ators a o ag from our navigation abstraction. If the destination regions intersect-if

R nR 0-then the largest possible lower bound for the valuation of aj,, is zero. This phenomenon can lead to a zero-cost cycle: a sequence of operators that can optimistically returns to a given state with zero cost (figure 4.7a). Even positive cost-cycles are problematic, if the lower bound 1 on the cost of a cycle is much smaller than the upper bound u: the algorithm can only prune a plan if it executes the cycle [u/li times (figure 4.7b). Unfortunately, we cannot simply discard any abstract plan with a cycle: the optimal plan may leave and return to an abstract state if the state is non-convex, even if the state is connected (figure 4.7c-4.7e). Often, cyclic plans indicate a poor choice of abstraction, but they can arise even with natural choices of abstraction, especially in domains with topologically complex configuration spaces.

We can deal with such edge cases while still avoiding cycles with a minor modification to the algorithm.

We define an acyclic plan as any plan p that cannot be partitioned into two plans p_0 ∘ p_1 such that ¬(V_L[p_0] ≺ V_L[p]) (algorithm 5). When we compute the successors of a plan p, if we find that the extension p_ext would create a cyclic (i.e., not acyclic) plan when propagated on top of BASE(p), we do not add p ∘ p_ext to the set of successors.

Instead, we add (BASE(p), p_ext) to the set of deferred plans (algorithm 4, line 12).

When any descendant of p is expanded, we consider activating any deferred extension of p by propagating it on top of the descendant plan. If the resulting plan is no longer cyclic, we add it to the set of successors (line 19). Doing so ensures that only acyclic plans will ever be added to the queue of plans, while also ensuring that all plans that are not pruned will eventually be considered.

Theorem 18. If the abstraction A is admissible and a feasible plan exists, then the acyclic angelic A* algorithm returns a sequence of primitive operators with cost no greater than c*({x_s}, X_g) in finite time.

Algorithm 4 Acyclic angelic A*
 1: function SUCCESSORS(plan node p)
 2:   ▷ D is a global variable, initially set to ∅.
 3:   POST(BASE(p)) ← {s' : (s, s', l, u) ∈ 𝒱[BASE(p)]}
 4:   S ← ∅
 5:   a ← OPERATOR(HEAD(p))
 6:   for p' : (a, p') ∈ R, s ∈ POST(BASE(p)) : HEAD(p') ∩ s ≠ ∅ do
 7:     p_ref ← PROPAGATE(BASE(p), p' ∘ EXT(p))
 8:     if V_L[p_ref](x_s, X_g) < ∞ then
 9:       if ACYCLIC(p_ref, ∅) then
10:         S ← S ∪ {p_ref}
11:       else
12:         D ← D ∪ {(BASE(p), EXT(p_ref))}
13:   p_a ← p
14:   while BASE(PARENT(p_a)) ≠ ∅ do
15:     p_a ← BASE(PARENT(p_a))
16:     for p_ext : (p_a, p_ext) ∈ D do
17:       p_ref ← PROPAGATE(BASE(p), p_ext)
18:       if ACYCLIC(p_ref, ∅) and V_U[p_ref] < ∞ then
19:         S ← S ∪ {p_ref}
20:   return S

Algorithm 5 Checking for cycles
1: function ACYCLIC(plan nodes p, p')
2:   if p = ∅ then
3:     return true
4:   else if p' = ∅ then
5:     p_ ← PREDECESSOR(p)
6:     return ACYCLIC(p_, ∅) ∧ ACYCLIC(p_, p)
7:   else
8:     p_ ← PREDECESSOR(p)
9:     return ¬(V_L[p] ⪰ V_L[p']) ∧ ACYCLIC(p_, p')
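To make the cycle test concrete, the following Python sketch mirrors algorithm 5 under two simplifying assumptions that are not part of the original pseudocode: each plan node stores a scalar stand-in for its lower-bound valuation, and a plan is treated as cyclic whenever some suffix fails to strictly increase that bound.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    lower: float                 # scalar stand-in for the lower-bound valuation V_L
    pred: Optional["Node"]       # predecessor plan node, or None for the empty plan

def acyclic(p: Optional[Node], later: Optional[Node] = None) -> bool:
    # Empty plans are trivially acyclic.
    if p is None:
        return True
    # First pass: check the prefix, then check p against all of its predecessors.
    if later is None:
        return acyclic(p.pred) and acyclic(p.pred, p)
    # A cycle exists if the later node does not strictly increase the lower bound.
    return later.lower > p.lower and acyclic(p.pred, later)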


Corollary 19. If A_{0,n} is asymptotically optimal, then

∀ε > 0, lim_{n→∞} Pr(C[SEARCH(A_n)] ≤ (1 + ε)·c*) = 1.    (4.21)

Proof. Define the parent of a plan as

PARENT(p') = p ⟺ (p, p') ∈ R.    (4.22)

Define the ancestors of a plan recursively.

ANCESTOR(p, 0) = p    (4.23)

ANCESTOR(p, n) = BASE(PARENT(ANCESTOR(p, n − 1)))    (4.24)

ANCESTORS(p) = {ANCESTOR(p, n) : n ∈ ℕ}    (4.25)

Let DEF(p, D) be the set of abstract plans p' such that (BASE(p), p') ∈ D. Define the condition I_2(Q, D, p) as true if one of the following holds:

• ∃p' ∈ Q : p ∈ p', or

• ∃p' ∈ Q : ∃p'' ∈ DEF(p', D) : p ∈ BASE(p') ∘ p'', or

• ∃p' ∈ Q : V_U[p'] ≤ V[p].

First, we show that if I_2 holds before removing a plan from the queue, it holds after the successors to that plan have been added to the queue. If I_2(Q, D, p) holds before removing a plan q, then either

1. ∃p' ∈ Q, p' ≠ q : p ∈ p', or

2. ∃p' ∈ Q, p' ≠ q : V_U[p'] ≤ V[p], or

3. ∃p' ∈ Q : ∃p'' ∈ ANCESTORS(p') : ∃p''' ∈ DEF(p'', D) : p ∈ BASE(p'') ∘ p''', or

4. p ∈ q, or

5. ∃p' ∈ ANCESTORS(q) : ∃p'' ∈ DEF(p', D) : p ∈ BASE(p') ∘ p''.

In cases 1, 2, or 3, p' ∈ Q after expansion and the invariant holds. In case 4, because the abstraction is admissible, there is a plan p' such that p ∈ p' and (q, p') ∈ R; this plan will be considered for expansion. In case 5, the plan p' = BASE(q) ∘ p'' will be considered for expansion. If p' is acyclic, then either p' ∈ Q after expansion and the invariant holds, or p' is dominated by some plan already in Q and the invariant holds. If p' is cyclic, its extension will be added to DEF(PARENT(q), D). Since there is at least one refinement of q that will be added to the queue, there will be a plan on the queue with q as an ancestor, and thus the invariant will hold.

Since I_2 holds initially, and holds after each expansion, it holds until the algorithm terminates. If ∀p : I_2(Q, D, p) and V_U[p*] ≤ V_L[p] for every p ∈ Q, then V_U[p*] ≤ V[p'] for every primitive plan p', and thus p* contains an optimal primitive plan. Then any time line 10 is reached, p* contains an optimal primitive plan; in other words, if algorithm 4 returns a plan, that plan is an optimal plan.

If there is a feasible plan, it has finite cost, and therefore the cost c* of an optimal plan is finite. Let N ∈ ℕ be the number of acyclic plans with lower bound less than c*; because the abstraction is finite and every acyclic plan has a cost strictly greater than its predecessor, N is finite. Only acyclic plans are added to the queue; each plan is added at most once, and is therefore removed at most once. Then after N iterations, the lower bound on any plan in Q is greater than c*, and algorithm 3 will return a plan.

4.2.3 Approximate Angelic A*

Even with a good abstraction, finding an optimal solution may be intractable for many problems. By modifying the order in which plans are expanded and the conditions under which the algorithm terminates, we can accelerate the search process while still ensuring approximate optimality. This modification is described in algorithm 6.

Often, admissible valuations are unduly optimistic: the lower bound V_L[p] on the cost of a plan is much less than the true optimal cost c*({x_s}, X_g). This problem is well understood in the context of graph search, where it is often mitigated by using an admissible heuristic that has been inflated, as in weighted A* ([161]). WA* keeps a queue of states, and expands the state minimizing

KEY_WA*(x; w) = g(x) + w·h(x),    (4.26)

where g(x) is the estimated cost to reach a state x and h(x) is an admissible estimate of the cost to reach the goal from a state x. This biases the search towards plans that pessimistically reach states close to the goal. If h(x) − c*({x}, X_g) has only shallow local minima, weighted A* will explore far fewer states before finding a path to the goal than A* would. Moreover, when a path to the goal is found, it will have cost less than w·c*({x_s}, X_g).

Unfortunately, we cannot directly apply the inflated priority of WA* in the context of angelic search. When we use angelic semantics, we may not have a distinct cost g(x) to reach a state; we only have bounds on the cost of plans. In order to apply the idea of WA* to angelic search, we need to compute a priority that satisfies the same properties as KEY_WA*(x; w). A naive approach, such as inflating the lower bound on each operator, does not have the desired effect: it would inflate the priority of all plans equally and would not affect the order in which plans are expanded.

A more reasonable approach might be to inflate the lower bounds on each non-primitive operator, which would exactly equal the priority computed by WA* for a flat abstraction, but this approach does not properly take upper bounds into account. Consider an operator a for which V_U[a](s, s') = (1 + ε)·V_L[a](s, s') for some abstract state pair (s, s'). If ε is zero, the operator would be treated as primitive and its lower bound would not be inflated. If ε is small but positive, the operator would be treated as non-primitive. This would artificially bias the search away from operators that are almost, but not quite, primitive.

To avoid this undesirable bias, we recursively compute a priority KEY(p; w). If p_ = PREDECESSOR(p),

KEY(p; w) = min(KEY(p_; w) + w·(V_L[p] − V_L[p_]), V_U[p]),    (4.27)

with KEY(∅; w) = 0. KEY(p; w) has several useful properties. For each plan p, KEY(p; w) is no greater than the upper bound V_U[p] or the inflated lower bound w·V_L[p]. If w = 1, then KEY(p; w) = V_L[p]. For a primitive plan, KEY(p; w) equals the cost of the plan. In a flat abstraction, KEY(p; w) is precisely equal to the cost estimate used by WA*.
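As a concrete illustration of equation 4.27, the sketch below computes the recursive priority for a chain of plan nodes. It simplifies the bookkeeping: each node stores scalar lower and upper bounds rather than full valuations, and a predecessor pointer stands in for PREDECESSOR; the numbers are purely illustrative.

from dataclasses import dataclass
from typing import Optional

@dataclass
class PlanNode:
    lower: float                 # V_L of the plan prefix ending at this node
    upper: float                 # V_U of the plan prefix ending at this node
    pred: Optional["PlanNode"]   # predecessor node, or None for the empty plan

def key(p: Optional[PlanNode], w: float) -> float:
    # Recursive priority of equation 4.27; KEY of the empty plan is 0.
    if p is None:
        return 0.0
    prev_lower = p.pred.lower if p.pred else 0.0
    return min(key(p.pred, w) + w * (p.lower - prev_lower), p.upper)

a = PlanNode(lower=2.0, upper=10.0, pred=None)
b = PlanNode(lower=5.0, upper=6.0, pred=a)
print(key(b, 1.0), key(b, 2.0))   # 5.0 and 6.0: with w = 1 the priority is the lower
                                  # bound; with w = 2 the upper bound caps the inflation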

We refer to this approach as approximate angelic A*, and present pseudocode in algorithm 6. The pseudocode is substantially similar to algorithm 3, and identical subroutines have been omitted. Changes are highlighted in red.

Algorithm 6 Approximate angelic A*
 1: function SEARCH(abstraction (S, A, R, 𝒱), weight w)
 2:   root ← (∅, ∅, ∅, {(x_s, x_g, 0, ∞)})
 3:   p* ← ∅
 4:   BOUND(∅) ← 𝒱[ACT]
 5:   p_0 ← PROPAGATE(root, [ACT])
 6:   Q ← {p_0}
 7:   while |Q| > 0 do
 8:     p ← arg min{KEY(p', w) : p' ∈ Q}
 9:     if PRIMITIVE(p*) and V_U[p*] ≤ V_L[p] then
10:      return p*
11:    else
12:      Q ← Q \ {p}
13:      S ← SUCCESSORS(p)
14:      for p' ∈ S do
15:        if V_U[p'] < V_U[p*] then
16:          p* ← p'
17:      Q ← Q ∪ S
18:  return ∅

Algorithm 6 is approximately optimal, in the sense that any plan it returns has cost less than w times the cost of the optimal plan. If we combine the acyclic generation of successor plans with the approximate search, the resulting algorithm is approximately optimal even if there are zero-cost operators.

Algorithm 7 Approximate angelic search priority
1: function KEY(node p, weight w ∈ ℝ≥1)
2:   if p = ∅ then
3:     return 0
4:   else
5:     p_ ← PREDECESSOR(p)
6:     return min(KEY(p_; w) + w·(V_L[p] − V_L[p_]), V_U[p])

Theorem 20. If the abstraction A is admissible and a feasible plan exists, then algorithm 6 returns a sequence of primitive operators with cost no greater than w·c*({x_s}, X_g) in finite time.

Corollary 21. If A_{0,n} is asymptotically optimal, then

∀ε > 0, lim_{n→∞} Pr(C[SEARCH(A_n)] ≤ (1 + ε)·w·c*) = 1.    (4.28)

Proof. Two invariants hold while the algorithm is running. First, invariant I_2 still holds; second, the priority of any plan on the queue is at most w times its lower bound:

∀p : I_2(Q, D, p)    (4.29)

KEY(p; w) ≤ w·V_L[p]  ∀p ∈ Q    (4.30)

When the algorithm terminates,

V_U[p*](x_s, X_g) ≤ KEY(p; w)  ∀p ∈ Q.    (4.31)

Consequently,

V_U[p*](x_s, X_g) ≤ w·V_L[p](x_s, X_g)  ∀p ∈ Q.    (4.32)

Since the algorithm must terminate by the same argument as for theorem 18, algorithm 6 will return a w-optimal solution in finite time. □

If we use the approximation idea in the angelic A* algorithm, rather than the acyclic angelic A* algorithm, the same theorem holds with the additional constraint that all operators must have a strictly positive lower bound.

4.3 Demonstration

We implemented algorithms 3-6 and the abstractions described in sections 4.1.3 and 4.1.3 in the Python programming language. We then compared the performance of the planner to the original angelic A* search algorithm ([144]) and to a search without abstraction using A*.

In the navigation domain, we constructed a random discretization with 10^4 states.

Examples of the search trees constructed by A* and by algorithm 4 are given in figure 4.8. By using the abstraction, the algorithm can avoid exploring large parts of the configuration space. Our quantitative results bear out this hypothesis: using abstraction allows us to reduce the number of states explored by a factor of three and the number of plans considered by several orders of magnitude.

Using abstraction in the door puzzle domain resulted in even larger speedups. Even in easy problem instances with only a few doors, search without abstraction quickly became infeasible (figure 4.9). Using abstraction reduced the number of states explored by orders of magnitude. However, the unmodified angelic search spent a great deal of time exploring plans with cycles. By deferring these plans, our algorithms were able to reduce the number of plans expanded by an order of magnitude. In fact, only our algorithm was able to solve problem instances with more than ten doors. We were able to find 2-optimal plans for instances with up to 32 doors and 10^4 sampled configurations (corresponding to a discretized state space with approximately 40 trillion states). Unfortunately, software limitations prevented us from experimenting on instances with more than 32 doors.


Figure 4.8: The search trees constructed by A* (a) and by algorithm 4 (b). Note that the A* search needs to explore almost the entire space, due to limitations of the Euclidean distance as a heuristic. In contrast, when provided with a decomposition of the world into nearly-convex regions, angelic A* can find a path to the goal while exploring far fewer states. By avoiding plans with cycles, our modified angelic planning algorithm can explore these states while expanding far fewer plans.

              cost     time     plans   states
A*            33.430   42.119   11807   7948
Angelic A*    33.430   160.256  25770   4758
AAA* (w=1.0)  33.430   4.159    706     3068
AAA* (w=2.5)  35.586   0.697    48      1443

Table 4.1: Quantitative performance on a problem instance in the navigation domain. The discretized state space includes 10^4 sampled configurations. We see that abstraction and approximation result in expanding fewer plans and exploring fewer states, yielding a faster search and optimal or nearly optimal results.


Figure 4.9: Quantitative evaluation on an easy instance of the door puzzle domain with only two doors; the curves compare AAA* (w=1.0), AAA* (w=2.5), angelic A*, A*, and WA* (w=2.5). More difficult instances could not be solved by any algorithm considered except algorithm 4. The abscissa measures the number of randomly sampled states in the discretization of the configuration space. The ordinate axes measure the number of plans expanded by each algorithm and the number of distinct configurations explored during search.

4.4 Analysis

Proposition 22. Define the operator bounds

V_L*[a](s, s') = inf{ inf{ V[a](x, x') : x' ∈ s' } : x ∈ s }    (4.33)

V_U*[a](s, s') = sup{ inf{ V[a](x, x') : x' ∈ s' } : x ∈ s }    (4.34)

V_L[a](s, s') = inf{ l : (s_0, s_1, l, u) ∈ 𝒱[a], s ∩ s_0 ≠ ∅, s' ∩ s_1 ≠ ∅ }    (4.35)

V_U[a](s, s') = inf{ u : (s_0, s_1, l, u) ∈ 𝒱[a], s ⊆ s_0, s' ⊆ s_1 }.    (4.36)

If 𝒱[a] is admissible, then V_L[a](s, s') ≤ V_L*[a](s, s') and V_U[a](s, s') ≥ V_U*[a](s, s') for all abstract state pairs s, s'.

Proof. For the lower bound, fix any x ∈ s and x' ∈ s'. Because 𝒱[a] is admissible,

∃(s_0, s_1, l, u) ∈ 𝒱[a] : l ≤ V[a](x, x'), x ∈ s_0, x' ∈ s_1    (4.37)

x ∈ s ∧ x ∈ s_0 ⟹ s ∩ s_0 ≠ ∅    (4.38)

x' ∈ s' ∧ x' ∈ s_1 ⟹ s' ∩ s_1 ≠ ∅    (4.39)

V_L[a]({x}, {x'}) ≤ V[a](x, x')    (4.40)

∴ V_L[a](s, s') ≤ V_L*[a](s, s')    (4.41)

For the upper bound, fix any x ∈ s and x' ∈ s'. Because 𝒱[a] is admissible,

∀(s_0, s_1, l, u) ∈ 𝒱[a] : x ∈ s_0, x' ∈ s_1 ⟹ u ≥ V[a](x, x')    (4.42)

s ⊆ s_0 ∧ x ∈ s ⟹ x ∈ s_0    (4.43)

s' ⊆ s_1 ∧ x' ∈ s' ⟹ x' ∈ s_1    (4.44)

V_U[a]({x}, {x'}) ≥ V[a](x, x')    (4.45)

∴ V_U[a](s, s') ≥ V_U*[a](s, s')    (4.46)

Then V_L[a](s, s') ≤ V_L*[a](s, s') and V_U[a](s, s') ≥ V_U*[a](s, s') for all abstract state pairs s, s'.

Proposition 23. Suppose 𝒱[a] and 𝒱[a'] are admissible. Then the valuation bound

𝒱[a ∘ a'] = {(s, s''', l + l', u + u') : (s, s', l, u) ∈ 𝒱[a], (s'', s''', l', u') ∈ 𝒱[a'], s' ⊆ s''}
          ∪ {(s, s''', l + l', ∞) : (s, s', l, u) ∈ 𝒱[a], (s'', s''', l', u') ∈ 𝒱[a'], s' ∩ s'' ≠ ∅}    (4.47)

is admissible.

Proof. Let p ∘ p' be a plan in a ∘ a', where p ∈ a and p' ∈ a'. Then because 𝒱[a] and 𝒱[a'] are admissible,

∃(s, s', l, u) ∈ 𝒱[a] : p(0) ∈ s, p(1) ∈ s', l ≤ C[p]    (4.48)

∃(s'', s''', l', u') ∈ 𝒱[a'] : p'(0) ∈ s'', p'(1) ∈ s''', l' ≤ C[p']    (4.49)

Because p ∘ p' is continuous, s' and s'' intersect:

p(1) = p'(0) ⟹ s' ∩ s'' ≠ ∅    (4.50)

Because the cost function is additive, l + l' is a lower bound on the cost of p ∘ p':

C[p ∘ p'] = C[p] + C[p'] ≥ l + l'    (4.51)

By construction, it follows that (s, s''', l + l', u + u') ∈ 𝒱[a ∘ a'], and therefore

∀p ∘ p' ∈ a ∘ a', ∃(s, s''', l, u) ∈ 𝒱[a ∘ a'] : p(0) ∈ s, p'(1) ∈ s''', l ≤ C[p ∘ p'].    (4.52)

Thus 𝒱[a ∘ a'] is an admissible lower bound.

Let x, x', x'' be primitive states, and let s, s', s'', s''' be abstract states with x ∈ s, x' ∈ s' ⊆ s'', and x'' ∈ s'''. Then

V[a ∘ a'](x, x'') ≤ V[a ∘ a'](x, x') + V[a ∘ a'](x', x'')    (4.53)

                 ≤ V[a](x, x') + V[a'](x', x'')    (4.54)

                 ≤ V_U[a](s, s') + V_U[a'](s'', s''')    (4.55-4.56)

                 ≤ u + u'  for all (s, s', l, u) ∈ 𝒱[a] with x ∈ s, x' ∈ s' and all (s'', s''', l', u') ∈ 𝒱[a'] with x' ∈ s'', x'' ∈ s'''    (4.57)

                 ≤ inf{u'' : (s, s''', l'', u'') ∈ 𝒱[a ∘ a']}    (4.58)

                 = V_U[a ∘ a'](x, x'')    (4.59)

Thus 𝒱[a ∘ a'] is an admissible upper bound. □

Proposition 24. If 𝒱[p] and 𝒱[p'] are admissible, then 𝒱[p ∪ p'] = 𝒱[p] ∪ 𝒱[p'] is admissible.

Proof. Observe that

V[p ∪ p'](s, s') = min(V[p](s, s'), V[p'](s, s')).    (4.60)

Suppose the infimum defining V_U[p ∪ p']({s}, {s'}) is attained by a tuple of 𝒱[p]. Then

V_U[p ∪ p']({s}, {s'}) = V_U[p]({s}, {s'})    (4.61)

                       ≥ V[p](s, s')    (4.62)

                       ≥ V[p ∪ p'](s, s')    (4.63)

so 𝒱[p ∪ p'] is a valid upper bound. Note that the same argument applies when the infimum is attained by a tuple of 𝒱[p'].

Then

V_L[p ∪ p']({s}, {s'}) = min(V_L[p]({s}, {s'}), V_L[p']({s}, {s'}))    (4.64)

                       ≤ min(V[p](s, s'), V[p'](s, s'))    (4.65)

                       = V[p ∪ p'](s, s')    (4.66)

so 𝒱[p ∪ p'] is a valid lower bound. □

Proposition 25. Let S = {R_i} be a decomposition of the state space X into N regions, such that ∪_{i∈[N]} R_i = X. The refinement relation

R = ∪_{i,j ∈ [N]} ( {(ACT, a_ij ∘ ACT), (ACT, a_ij)}
      ∪ {(a_ij, a ∘ a_ij) : a(t) ∈ R_i ∀t}
      ∪ {(a_ij, a) : a(t) ∈ R_i ∀t, a(1) ∈ cl(R_j)} )    (4.67)

is admissible.

Proof. Choose any p ∈ ACT. Because ∪_{i∈[N]} R_i = X, there exists some R_i such that p(0) ∈ R_i. If p(t) ∈ R_i ∀t, then p ∈ a_ii. Otherwise, there is some t' ∈ [0, 1) such that p(t') ∈ ∂R_i and p(t) ∈ R_i ∀t < t'. Because ∪_{j∈[N]} R_j = X, there exists some R_j such that p(t') ∈ cl(R_j). Then we can partition p into p_1 ∘ p_2, such that p_1(t) ∈ R_i ∀t and p_2(0) ∈ cl(R_j). Then p_1 ∈ a_ij and p_2 ∈ ACT, so p ∈ a_ij ∘ ACT.

Choose any p ∈ a_ij. Because p ∈ a_ij, either we can partition p into a ∘ p', or p = a. If p = a, then (a_ij, p) ∈ R. Otherwise, p' ∈ a_ij, so (a_ij, a ∘ p') = (a_ij, p) ∈ R. □

4.5 Notes and Extensions

We have defined conditions on an abstraction that allow us to accelerate planning while preserving the ability to find an optimal or near-optimal solution to complex motion planning problems. We motivate these conditions by deriving two admissible abstractions and showing they improve the efficiency of search without adversely affecting the quality of the resulting solutions. We view the work in this chapter as a proof of concept, demonstrating that a good abstraction can render optimal planning feasible even on large problems. The classical planning community has developed several powerful families of admissible heuristics [80]; by reformulating these heuristics to employ angelic abstractions, we may be able to obtain optimal or near-optimal solutions to practical manipulation planning problems.

We briefly describe two powerful extensions to the algorithms proposed here. First, we have described a strategy for near-optimal planning with an arbitrarily complex admissible abstraction. In many cases, however, admissibility is too strong of a condition; it is often far easier to estimate cost than it is to compute a lower bound. Following Aine et al. [2], we observe that it is possible to combine admissible and inadmissible abstractions while still retaining near-optimality. The basic idea is to keep multiple queues, one per abstraction, and alternate search between the admissible and inadmissible abstractions. By sharing a common cost-to-reach, we can prune branches of the search tree where the inadmissible abstractions have already found good plans, resulting in bounded suboptimality in any case and fast search in the case where the inadmissible heuristic is a good estimate of the true cost.

Second, we note that in cases where optimizing search is too expensive or time constraints are hard, we can use anytime algorithms like ANA* [195] to quickly find a satisficing path, then use any remaining time to optimize. Anytime search is a powerful way to avoid the need to enumerate many reasonable high-level plans when no good upper bounds are available. Other ideas from graph search, such as efficient replanning, can also be incorporated. Taken together, an anytime multi-abstraction search algorithm could be a useful way to implement bounded rationality, where a system achieves the best plan possible given its available resources.

Chapter 5

Exact Algorithms for Robot Planning

In this chapter, we study two important theoretical questions in robotic planning. First, we study the decidability and complexity of continuous constraint-based planning. In line with previous research [6, 50], we find that constraint-based planning is undecidable in general, even if all constraints can be expressed as low-order polynomials. We then obtain sufficient conditions under which constraint-based planning is decidable, and we give a polynomial-space algorithm for deciding plan existence in any domain that satisfies those conditions. Our complexity result implies that most standard formulations of task and motion planning can be solved using only polynomial space, and thus are PSPACE-complete problems.

We then study relationships among various representations for planning problems. We show that there is a duality between the constraint-based representations used in this thesis and the differential representations commonly used in control theory and in modeling hybrid systems. Broadly speaking, we show that systems with analytic dynamics can be equivalently represented as transition systems with analytic constraints. Our result is not constructive; it ensures equivalent representations exist, but does not say anything about how to find them. The main utility of our duality result is to motivate the use of constraint-based representations as, in some sense, no less natural than differential representations.

Together, these results show that if we can control a system locally, and if we can model the effects of local controllers using well-behaved polynomials, then we can always decide if it is possible to steer a system between any two states. Moreover, because analytic functions can be approximated arbitrarily well by low-order polynomials, our complexity results imply that for many domains, a relaxed form of the planning problem is PSPACE-complete. While the antecedents of this argument exclude many systems, they include many systems of practical interest, including many classes of system for which no algorithms for deciding global controllability are known at present.

This chapter is structured as follows. First, in section 5.1 we review the notation and terminology of planning with transition systems, and show that transition systems have equivalent expressive power to C3. We introduce several important families of transition system, including discrete, semialgebraic, and semianalytic systems. Next, in section 5.2, we argue that while the reachability problem for discrete transition systems is PSPACE-complete, the reachability problem for continuous transition systems is undecidable, even if the system is described by polynomial constraints. In section 5.3, we introduce a special class of semialgebraic transition systems for which the reachability problem is decidable using only polynomial space. We show that this class of problems includes prehensile manipulation planning as a special case, implying that many standard formulations of task and motion planning are PSPACE-complete. Finally, in section 5.4, we describe the relationships between different formal representations for planning, and prove our duality result. We conclude in section 5.5 with a brief discussion of the implications of these results.

5.1 Transition systems

The results in this chapter are formulated for transition systems, defined by a tuple T = (U, V, Φ, B) where

• U ⊆ ℝ^N is a space of possible states,

• V ⊆ ℝ^{N'} is a space of possible decisions,

• Φ is a set of real-valued functions on U × V × U, and

• B is a quantified Boolean formula whose atomic formulae are of the form φ(u, v, u') ⋈ 0, where ⋈ represents one of the symbols {<, =, >} and φ is a function drawn from Φ.

We use the notation B[u, v, u'] to indicate that the formula B is satisfied by the tuple (u, v, u'). The transition system formalism generalizes both C3 and more conventional formalisms like discrete-time dynamical systems [101]. In particular, note that any C3 instance is trivially equivalent to a transition system, where each v corresponds to a particular parameterized action, and the u represent the states after each action is applied. Recall that a C3 action a can be defined by the valid set of parameters Valid(a) ⊆ U × Θ_a together with a successor function f_a : U × Θ_a → U (equations 2.5 and 2.6). If the decision space V = ⊔_a Θ_a and the state space U = S, then we can define the set of feasible transitions as precisely those tuples (u, v, u') = (s, a, s') such that

(s, a) ∈ Valid(a) ∧ f_a(s, a) − s' = 0.

Axioms and knowledge can be modeled similarly.

We use the transition system formalism here because we can analyze the computational properties of C3 planning without the need to consider the full semantics of the framework. In particular, by restricting the types of functions that can be included in Φ, or, in the language of C3, by restricting the types of functions that can appear in action contracts, we derive sufficient conditions under which C3 planning is decidable.

The reachability problem for a transition system is a tuple I = (U, V, Φ, B, u_0, B_g), where u_0 is an initial state and B_g is a Boolean formula defining a goal set. The reachability problem has a solution if there exists a natural number N ∈ ℕ and a sequence ((u_1, v_1), ..., (u_N, v_N)) such that

• B[u_{i−1}, v_i, u_i] for all i ∈ [1, N], and

• B_g[u_N].

In general, there is no upper bound on N; plans may be arbitrarily long.
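As a toy illustration of the formalism (not a construction used in this thesis), the sketch below encodes a small transition system in Python, with B as a predicate over (u, v, u') tuples, and checks reachability by brute-force enumeration up to a fixed horizon; the grid world and the move set are hypothetical.

from itertools import product

# A toy transition system: states are cells of a 10x10 grid, decisions are unit moves.
U = set(product(range(10), repeat=2))
V = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def B(u, v, u_next):
    # B[u, v, u'] holds when u' = u + v and u' remains a valid state.
    return u_next == (u[0] + v[0], u[1] + v[1]) and u_next in U

def reachable(u0, goal, max_steps):
    # Breadth-first enumeration of all states reachable within max_steps decisions.
    seen, frontier = {u0}, {u0}
    for _ in range(max_steps):
        frontier = {(u[0] + v[0], u[1] + v[1])
                    for u in frontier for v in V
                    if B(u, v, (u[0] + v[0], u[1] + v[1]))}
        seen |= frontier
    return any(u in goal for u in seen)

print(reachable((0, 0), {(9, 9)}, max_steps=18))   # True: 18 unit moves suffice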

If the set Φ contains only polynomial functions, we say the transition system is semialgebraic. The most common representations for task and motion planning problems can be described as a semialgebraic transition system, as common operations like forward or inverse kinematics and collision-checking against a mesh are semialgebraic. Consequently, we spend the bulk of this chapter studying semialgebraic transition systems.

Our analysis makes use of several tools from semialgebraic geometry. A set S is semialgebraic if it can be described as the set of values of the free variables in a Tarski sentence, a quantified Boolean combination of polynomial equalities and inequalities. For example, the sentence ∃x : ax² + bx + c = 0 ∧ x > 0 defines the set of possible coefficients of a quadratic equation that has a positive real solution. A sentence with no free variables is either true or false. ∀x : x² ≥ 0 and ∀y ∃x : x² < y are both Tarski sentences; the former happens to be true, while the latter is false.
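Such sentences can be checked mechanically with a solver for nonlinear real arithmetic. The snippet below is a small demonstration using the z3 Python bindings (an external tool, not one used in this thesis); it instantiates the quadratic example with concrete coefficients and checks the first closed sentence by refuting its negation.

from z3 import Real, Solver, And, sat, unsat   # pip install z3-solver

x = Real('x')

# Exists x : x^2 - 3x + 2 = 0 and x > 0  (this quadratic has a positive real root)
s = Solver()
s.add(And(x * x - 3 * x + 2 == 0, x > 0))
print(s.check() == sat)          # True

# Forall x : x^2 >= 0 is valid iff its negation, Exists x : x^2 < 0, is unsatisfiable
s = Solver()
s.add(x * x < 0)
print(s.check() == unsat)        # True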

It is a remarkable and important observation that any set defined by a Tarski sentence that includes the quantifiers ∀ or ∃ can also be defined without quantifiers: real algebraic geometry supports quantifier elimination. The existence of quantifier elimination for semialgebraic sets is known as the Tarski-Seidenberg theorem [171, 189], and can be used to show that the closure, interior, and complement of a semialgebraic set are all semialgebraic sets, as is the distance between semialgebraic sets. In addition, there are effective algorithms for deciding a variety of questions about semialgebraic sets, such as whether a set is empty (that is, whether a system of polynomial equalities and inequalities has a solution) or whether two points lie on the same connected component.¹ These facts underlie modern results on the decidability and complexity of motion planning [27, 162, 169].

Semialgebraic sets are quite flexible, and standard task and motion planning problems can be modeled by semialgebraic transition systems. For example, we can use semialgebraic constraints to describe both continuous and discrete phenomena. Any finite propositional STRIPS instance can be expressed as a semialgebraic transition system: a domain with N propositions could have U = {0, 1}^N ⊂ ℝ^N, where U can be defined as a semialgebraic set using the N constraints u_i(1 − u_i) = 0. Because the transition relation can include arbitrary Boolean functions, transition systems can clearly model STRIPS or PDDL.

¹For a more thorough introduction to semialgebraic geometry, we recommend the short book by Coste [37].

Most important manifolds for robotics, such as the special orthogonal or special Euclidean groups, are semialgebraic sets, and object geometry (including meshes) can be modeled as semialgebraic sets. A binary variable can be represented by a real number subject to the constraint x_i(x_i − 1) = 0, ensuring the permissible values are 0 and 1. Using these constrained variables, we can represent complicated ideas. Because semialgebraic sets can be defined using the Boolean operators {∧, ∨, ¬} as well as the quantifiers {∃, ∀}, any sentence from propositional logic can be represented as a semialgebraic set. If x, y, z are binary variables, the propositional sentence x ∨ ¬y → z is equivalent to the polynomial equality (x² + (1 − y)²)(1 − z)² + x²(1 − x)² + y²(1 − y)² + z²(1 − z)² = 0. The statement "the robot does not collide with an obstacle when it follows a given spline" can be captured by the Tarski sentence ∀t ∈ [0, 1], ∀x ∈ ℝ³ : (x ∉ X_robot(ξ(t)) ∨ x ∉ X_obs), where ξ(t) is the pose of the robot at time t, and X_robot and X_obs are semialgebraic sets representing the robot and obstacle geometry.
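The propositional encoding above can be checked by exhaustive enumeration, since only binary assignments are relevant; the short script below (an illustration, not part of the thesis implementation) verifies that the polynomial vanishes exactly on the satisfying assignments of x ∨ ¬y → z.

from itertools import product

def sentence(x, y, z):
    # (x or not y) implies z
    return (not (x or not y)) or bool(z)

def poly(x, y, z):
    return ((x**2 + (1 - y)**2) * (1 - z)**2
            + x**2 * (1 - x)**2 + y**2 * (1 - y)**2 + z**2 * (1 - z)**2)

for x, y, z in product((0, 1), repeat=3):
    assert sentence(x, y, z) == (poly(x, y, z) == 0)
print("the polynomial vanishes exactly on the satisfying binary assignments")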

5.2 Undecidability of Semialgebraic Transition Systems

In this section, we show that there is no algorithm to decide the general problem of plan existence for semialgebraic transition system instances. In fact, we can show that for any Turing machine T and string w, there is a semialgebraic transition system instance I that has a solution if and only if T accepts w. The undecidability of transition systems is unsurprising, given known undecidability results in manipulation planning [50] and hybrid systems [6]. However, what may be surprising is that undecidability can manifest even in bounded problems where the transition functions are free from obvious pathologies.

Our proof of undecidability is derived from a connection between semialgebraic transition systems and Diophantine equations, i.e., equations of the form p(x) = 0 with x ∈ ℕ^k and p a polynomial with integer coefficients.

Theorem 26. Let p(x) = 0 be a Diophantine equation in k variables of degree d, i.e., a polynomial in x_1, ..., x_k with integer coefficients. Then there is a semialgebraic transition system instance I with k variables and degree at most kd that has a solution if and only if there exist natural numbers x_1, ..., x_k such that p(x_1, ..., x_k) = 0.

Corollary 27. There is no algorithm to determine whether or not a semialgebraic transition system instance has a solution.

Proof. The proof is constructive. We will create a transition system instance for which the reachable states are precisely the inverse natural numbers, i.e., of the form (1/n_1, ..., 1/n_k). We then construct a polynomial of degree at most kd that has a solution in the inverse natural numbers if and only if p(x) = 0 has a solution in the natural numbers.

The instance has k variables u = (u_1, ..., u_k), each with a domain u_i ∈ (0, 1]. The decision space is the finite set V = {1, ..., k}, and the transition relation enforces that

B[u, v, u'] ⟺ u'_i = u_i ∀i ≠ v ∧ u_v u'_v + u'_v − u_v = 0.

That is, the decision chooses a single variable to update, which is updated such that the polynomial equality u_v u'_v + u'_v − u_v = 0 holds, and leaves the other variables constant.

Note that this transition system instance is semialgebraic and has only integer coefficients. In addition, if we apply v = k from a state where u_k = 1/n for some natural number n, then u'_k = 1/(n + 1). Consequently, if our instance has initial state u_0 = (1, ..., 1), then a state is reachable if and only if it is of the form (1/n_1, ..., 1/n_k) for some natural numbers n_1, ..., n_k ∈ ℕ.

Next, construct a polynomial q(u) that has a solution if and only if p(x) has a solution. Let d_i be the maximal exponent of x_i in p, and let q(u_1, ..., u_k) = p(1/u_1, ..., 1/u_k) ∏_i u_i^{d_i}. Then since u_i > 0 ∀i, ∏_i u_i^{d_i} > 0. Consequently, we have 0 = p(1/u_1, ..., 1/u_k) ⟺ q(u_1, ..., u_k) = 0. Furthermore, if ∃u : q(u) = 0 with 1/u_i ∈ ℕ ∀i, then p(x_1, ..., x_k) = 0 with x_i ∈ ℕ ∀i. Let B_g[u] ⟺ q(u) = 0, and let u_0 = (1, ..., 1). Then the transition system instance I_p = (U, V, Φ, B, u_0, B_g) has a solution if and only if B_g contains a point in the inverse natural numbers, which is true if and only if p(x) has a solution in the natural numbers.

Theorem 26 implies that an algorithm to decide if a transition system instance is solvable would imply an algorithm to decide if a Diophantine equation is solvable. Hilbert's tenth problem [92] asked for such an algorithm, and Matiyasevich, Robinson, Davis, and Putnam [144, 147] proved no such algorithm can exist. Thus, there is no algorithm to decide if an arbitrary semialgebraic transition system instance has a solution. □
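The transition relation in this construction is easy to simulate numerically. The following sketch (illustrative only; the variable names are not from the thesis) applies the update implied by u_v u'_v + u'_v − u_v = 0 with exact rational arithmetic and shows that repeated decisions walk a coordinate down the inverse natural numbers.

from fractions import Fraction

def step(u, v):
    # Solve u_v * u'_v + u'_v - u_v = 0 for u'_v; other coordinates are unchanged.
    u = list(u)
    u[v] = u[v] / (1 + u[v])        # maps 1/n to 1/(n + 1)
    return tuple(u)

u = (Fraction(1), Fraction(1))      # initial state (1, 1)
for _ in range(4):
    u = step(u, 0)                  # repeatedly update the first coordinate
print(u)                            # (Fraction(1, 5), Fraction(1, 1))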

The essential feature of the Diophantine construction is that it permits infinite trajectories in a bounded space. We can take advantage of the same idea to construct even simpler undecidable examples. For example, we can define a class of systems for which reachability is undecidable, for which the state space has only five dimensions and all constraints are linear. We do so by showing that linear transition systems can model problems from arithmetical billiards. A closely related example was described by Alur et al. [6]; we extend that analysis. A billiards trajectory on a triangular table is closed if and only if the angles of the triangle are rational multiples of π. Otherwise, the trajectory is unbounded, and the ball will eventually reach every point. Consequently, by choosing a triangle with rational coordinates whose angles are not rational multiples of π, such as {(0, 0), (2, 0), (1, 2)}, which has interior angles of approximately (1.1072, 1.1072, 0.9272) radians, we can show that the ball will pass arbitrarily close to each point in the triangle, yet might have to bounce arbitrarily many times to do so. A two-dimensional table can therefore support unbounded trajectories, and by using Boolean gates we can define actions that allow us to increment the counter, decrement it, or check if its value is zero. Two such counters, together with a finite-domain variable to represent an instruction set, would allow us to implement Minsky's universal 2-tag system [149], through which we can simulate any Turing machine. Consequently, there is no algorithm to decide reachability for semialgebraic transition systems, even in just five dimensions and with all linear constraints.

5.3 Uniform Accessibility, Decidability, and Complexity

The undecidability results of the previous section seem to limit the potential for exact algorithms for transition systems. However, it turns out that if we can preclude unbounded paths, we can not only prove decidability, but in fact can give an explicit algorithm that uses only polynomial space to determine reachability. In particular, we show that if the instance admits a stratification for which the actions are uniformly accessible (technical terms that we define in this section), then our algorithm uses only polynomial space. We begin by defining stratification and uniform accessibility, then introduce the notion of uniform stratified accessibility, a particular kind of compatibility between a stratification and a transition system instance. We show that if a transition system instance has the property of uniform stratified accessibility, then we can bound the length of any feasible plan. Finally, we use the bound on plan length in an algorithm that decides if a stratifiable semialgebraic transition system instance has a feasible plan, and show that our algorithm requires only polynomial space.

5.3.1 Stratification

Unlike manifolds, semialgebraic sets need not have constant dimension: consider the set defined by x² + y² < 1 ∨ x = y, which is locally one-dimensional at some points and two-dimensional at others. However, it is provable that semialgebraic sets can be decomposed into a finite collection of pieces of constant dimension. Such a decomposition is called a stratification, and the pieces of constant dimension are called strata. Consider a square; the interior of the square looks like a two-dimensional space, while each of its four edges looks like a one-dimensional space. The boundaries of the edges are the corners of the square, which look like zero-dimensional spaces.

Formally, a stratification of the configuration space U of a transition system instance I is a finite collection Σ of subsets of U called strata satisfying three technical conditions:

1. each σ ∈ Σ has constant dimension;

2. the strata are pairwise disjoint and cover the space, so that every u ∈ U is an element of exactly one stratum σ ∈ Σ; and

3. the boundary ∂σ of each stratum σ ∈ Σ is the union of some members of Σ with dimensions strictly smaller than dim σ.

A semialgebraic stratification is a stratification in which each stratum is defined by a semialgebraic formula. Every semialgebraic set is compatible with a semialgebraic stratification.

5.3.2 Uniform accessibility

As transition systems generalize discrete-time dynamical systems, we can study transition systems using control-theoretic tools. In particular, Jakubczyk and Sontag [101] showed that accessibility is a sufficient condition to avoid most kinds of exotic behavior. Loosely speaking, a system is accessible if the transition relation behaves like a control signal: if some decision v takes the system to a state u, we can reach any nearby u' with small changes to v.

Definition 3 (Accessible transition system). Let POST(u) = {u' | ∃v : B[u, v, u']}, and let PRE(u') = {u | ∃v : B[u, v, u']}. A transition system is accessible if

1. for any u ∈ U, POST(u) is either empty or has non-empty interior, and

2. for any u' ∈ U, PRE(u') is either empty or has non-empty interior.

In an accessible transition system, the set of states reachable from a given state can be determined without finding an explicit decision sequence. For example, if u ∈ POST(u) for all u such that POST(u) is nonempty, then the set of reachable states from a state u is precisely the connected component of the set

⋃_{u ∈ U} POST(u)

containing u. However, the shortest plan required to reach a given state may be arbitrarily long. Accessibility does not bound plan length, because the amount of free space that is reachable from any point can become arbitrarily small. Consider a transition system with B[u, v, u'] ⟺ u' = u/(v + 1), where U = [0, 1] and V = [0, 1]. This transition system is accessible, but as the state u approaches zero, the distance it can cover in a single transition decreases. Consequently, it takes n steps to reach the state 2^{−n}, and there is no bound on the number of steps needed to reach any reachable state.
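A few lines of Python make the unbounded plan length concrete. The sketch assumes the reconstructed update rule u' = u/(v + 1) from the example above and greedily takes the largest admissible step; reaching the state 2^{-n} still requires n transitions.

from fractions import Fraction

def step(u, v):
    # One transition of the example system: u' = u / (1 + v), with v in [0, 1].
    return u / (1 + v)

u, steps = Fraction(1), 0
target = Fraction(1, 2) ** 20
while u > target:
    u, steps = step(u, Fraction(1)), steps + 1   # always take the largest step
print(steps)                                     # 20 transitions to reach 2**-20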

Most physical systems do not permit these Cauchy-like infinite convergent sequences. We will specifically preclude infinite convergent sequences on bounded domains through the notion of uniform accessibility.²

Definition 4 (Uniformly Accessible Transition System). A system is uniformly accessible if there exists a radius δ > 0, independent of the state u, such that for any point u, every point in some ball B(z', δ) is reachable from u in one step, and u is reachable from every point in some ball B(z, δ) in one step.

For an example of a uniformly accessible system, consider a system path modeling path planning for a point robot. The space U models the space of collision-free configurations; the set of feasible transitions includes any two configurations u and u' for which the line su + (1 − s)u' ∈ U for all s ∈ [0, 1]. Then path is uniformly accessible for any δ small enough that every point in U lies inside of a ball of radius δ that does not leave U.

²The terminology is a direct analogy to standard notions like uniform convergence and uniform continuity.

Uniform accessibility is important because (as we will show), if a system is uniformly accessible and its state space is bounded, then every reachable state is reachable by a plan of length exponential in the dimension of U. We can take advantage of this bound on plan length to draw conclusions about the decidability and complexity of planning problems with nontrivial dynamics by extending known results about problems in semialgebraic geometry.

5.3.3 Uniform stratified accessibility

For robotic planning, accessibility is often too stringent of a condition to require. In any problem involving contact-based manipulation, the accessible space will be restricted to the subspace where non-grasped objects do not move. We accommodate these limitations by introducing a notion of uniform stratified accessibility, which generalizes accessibility by requiring that transitions between strata have uniform accessibility. Concretely, we require that there is some positive radius δ such that for any feasible transition (u, u') from a stratum σ to a stratum σ', we can reach any point on σ' within the ball B(u', δ) from some point on σ within the ball B(u, δ).

Definition 5 (Uniform stratified accessibility). Let T = (U, V, Φ, B) be a transition system, and let Σ be a stratification. Let CON[T] = {(u, u') | ∃v : B[u, v, u']} ⊆ U × U be the connected set for the transition system T. T has uniform stratified accessibility with constant δ if

∀σ, σ' ∈ Σ, ∀(y, y') ∈ CON[T] ∩ (σ × σ'),
∀x ∈ B(y, δ) ∩ σ,
∃x' ∈ B(y', δ) ∩ σ' : (x, x') ∈ CON[T] ∩ (σ × σ').    (5.1)

We say that a transition system T is uniformly stratifiable if it has uniform stratified accessibility with some constant δ > 0.

The condition in equation 5.1 is closely related to the notion of stratified controllability developed by Goodwine and Burdick [74]. Observe that if the stratification is semialgebraic, then equation 5.1 is a Tarski sentence, which means there is an effective algorithm using only polynomial space both to decide if an instance is stratifiable and to choose the largest δ for which it is stratifiable.

Though complex, the condition in equation 5.1 meshes nicely with our intuitive understanding of controllability. If we can break the configuration space into a finite number of pieces such that on each piece we can move around more or less freely, then the instance is stratifiable. Most standard formulations of TMP are stratifiable, including all instances with semialgebraic geometry where each action either leaves an object fixed or allows free motion through free space. The constant δ is often related to the size of the geometry; a problem with very fine geometry might require a very small δ, in order to ensure that an action can actually reach every point in a ball of radius δ.

Note that the instance we constructed to prove theorem 26 was not stratifiable: the preimage of each action was the whole space, yet there is no δ such that a ball of radius δ includes only reachable states. Furthermore, we cannot restrict that preimage to just the set of inverse natural numbers; that set cannot be semialgebraic, as a semialgebraic set always has a finite number of connected components. While the Diophantine construction used in the theorem is not stratifiable and clearly does not correspond to a real robotics problem, it is not pathological in any obvious way. The coefficients involved in every polynomial are integers; it did not involve any quantifiers, or indeed any inequalities.

5.3.4 Deciding stratifiable semialgebraic transition systems

Using the uniform stratified accessibility condition, we can formulate the main result of this section: if a semialgebraic transition system instance is stratifiable, then we can decide if a plan exists using polynomial space. Note that the algorithm we present does not need to know the stratification itself, just its size; the decision procedure we give accepts as input an instance I, a natural number |Σ|, and a constant δ > 0.

First, we show that if an instance is stratifiable and has a feasible plan, then there is an upper bound on the length of the shortest plan.

Theorem 28. Let Σ be a stratification compatible with a transition system instance I that has uniform stratified accessibility with constant δ. Then either the instance I has no solution, or it has a solution of length at most N* = ⌈|Σ| C_d^{−1} (1/δ)^d⌉, where d is the dimension of U and C_d is the volume of the unit ball in ℝ^d.

Proof. Note first that for any δ > 0 there is a set of N = ⌈C_d^{−1}(1/δ)^d⌉ balls of radius δ that cover the unit cube. Thus if there is a plan p = (v_1, ..., v_N) of length N > N* with trace (u_0, ..., u_N), there must exist i < j such that ||u_i − u_j|| < 2δ and u_i, u_j ∈ σ' for some stratum σ'. Since u_i, u_j ∈ B(z', δ) ∩ σ' for some ball of the cover, it follows from the definition of uniform stratified accessibility that there is a state u' and a decision v such that B[u_{i−1}, v, u'] and u' ∈ B(u_j, δ) ∩ σ'. Because u' ∈ B(u_j, δ) ∩ σ', we can repeat the argument: there must be a state u'' and a decision v' such that B[u', v', u''] and u'' ∈ B(u_{j+1}, δ) ∩ σ_{j+1}. This argument can be repeated until we reach the end of the plan. The result is a feasible plan (v_1, ..., v_i, v_{j+1}, v_{j+2}, ..., v_N) of length N − j + i, which is strictly less than N. The process can be repeated whenever N > N*, meaning that we can iteratively shorten any plan until we arrive at a plan of length at most N*. Therefore, provided a plan exists, we can construct a plan of length at most N*. □
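For concreteness, the sketch below evaluates the plan-length bound as reconstructed above, N* = ⌈|Σ| (1/δ)^d / C_d⌉, using the closed form for the volume of the d-dimensional unit ball; the numbers are purely illustrative.

from math import ceil, gamma, pi

def unit_ball_volume(d):
    # C_d = pi^(d/2) / Gamma(d/2 + 1)
    return pi ** (d / 2) / gamma(d / 2 + 1)

def plan_length_bound(num_strata, delta, d):
    # N* = ceil(|Sigma| * (1/delta)^d / C_d)
    return ceil(num_strata * (1.0 / delta) ** d / unit_ball_volume(d))

print(plan_length_bound(num_strata=3, delta=0.1, d=2))   # 3 * 100 / pi, rounded up to 96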

The length bound from theorem 28 immediately ensures the decidability of stratifiable semialgebraic transition systems.

Theorem 29. The reachability problem for uniformly stratifiable semialgebraic transition systems is decidable.

Proof. Because we have a finite upper bound on plan length, we can define the semialgebraic set of all plans of length less than that bound. Because it is decidable whether a semialgebraic set is empty, it is decidable whether a plan exists. □

The "algorithm" used in the proof of theorem 29 requires exponential space and doubly-exponential time in the size of the transition system instance, but it is guaranteed to terminate.

We now define a decision algorithm that closely resembles Savitch's canonical algorithm [167]. Recall that for discrete planning, we can check for the existence of a plan of length less than N linking two states by recursively checking if there exists a plan of length N/2 to some state, and a plan of length N/2 from that state to the goal. The process involves a maximum recursion depth of log₂(N) (using constant space at each level). Since the longest possible plan in a PDDL instance has length exponential in the number of predicates, a logarithmic recursion depth means we can decide plan existence using only polynomial space.

In the context of semialgebraic reachability, there are uncountably many possible intermediate states u', so we cannot simply enumerate them. However, given a semialgebraic set S_0, we can decompose the set of states reachable from some state in S_0 into a finite number of cells, each of which is a semialgebraic set. Furthermore, we can enumerate these cells using only polynomial space [13]. Recursive enumeration of reachable sets forms the core of our algorithm; starting with the bound N* on the length of the shortest plan that we calculated in theorem 28, we recursively enumerate the semialgebraic cells reachable using a plan of length at most N*/2. Our algorithm is defined in detail in algorithm 8.

Algorithm 8 An algorithm to enumerate reachable subsets in semialgebraic transition systems.
 1: function REACH(n ∈ ℕ, semialgebraic sets S_0, S_1)
 2:   if n = 1 then
 3:     S ← POST(S_0)
 4:     for set C in DECOMPOSE(S) do
 5:       yield INTERSECTION(C, S_1)
 6:   else
 7:     for set S in REACH(⌊n/2⌋, S_0, U) do
 8:       S' ← POST(S)
 9:       for set C in REACH(⌈n/2⌉, S', S_1) do
10:        yield C

Note that the yield keyword inherits its meaning from the Python programming language: it returns the indicated value, then suspends execution until the next time the caller requests a value. Algorithm 8 invokes several subroutines not precisely defined in this thesis. First, the subroutine POST(S) computes the semialgebraic set {u' | u ∈ S, ∃v ∈ V : B[u, v, u']} of states reachable from any state in S in one transition. Next, the subroutine INTERSECTION(S_1, S_2) returns the intersection of two semialgebraic sets as a semialgebraic set. Finally, the subroutine DECOMPOSE(X) generates a decomposition of the semialgebraic set X into basic semialgebraic cells. It can be implemented using the algorithms in chapter 14 of Basu et al. [13].

We claim that algorithm 8 will generate all semialgebraic sets reachable in a plan of length n. The first branch of the function (lines 2-5) covers the base case, generating a decomposition of the space reachable from S_0 in a single step. The second branch (lines 6-10) is the recursive case; it generates a decomposition of the subset of S_1 reachable from S_0 in at most n + 1 steps. The function recurses to enumerate the reachable subsets of the set from which each action can be executed, then recurses again to enumerate the subsets of S_1 that can be reached after executing that action.

Algorithm 8 has a recursion depth of O(log n), and uses polynomial space at any point in the computation. That is, if the transition system instance can be encoded in a string of length m on some Turing machine, algorithm 8 requires space polynomial in m and time exponential in m.
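The recursive structure of algorithm 8 is easy to express as a Python generator. In the sketch below the semialgebraic subroutines POST, DECOMPOSE, and INTERSECTION are passed in as opaque callables, since their actual implementations (chapter 14 of Basu et al. [13]) are far more involved; the skeleton only illustrates the divide-and-conquer recursion and the lazy yield semantics.

def reach(n, S0, S1, post, decompose, intersect, universe):
    # Lazily enumerate cells of S1 reachable from S0; see algorithm 8.
    if n == 1:
        for cell in decompose(post(S0)):
            yield intersect(cell, S1)
    else:
        # First half of the plan: enumerate intermediate reachable cells.
        for mid in reach(n // 2, S0, universe, post, decompose, intersect, universe):
            # Second half: continue from the successors of each intermediate cell.
            for cell in reach(n - n // 2, post(mid), S1,
                              post, decompose, intersect, universe):
                yield cell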

The correctness of algorithm 8 and the upper bound given in theorem 28 suggest a straightforward decision procedure: invoke algorithm 8 to check for plans of length at most ⌈|Σ| C_d^{−1} (1/δ)^d⌉ linking u_0 to some state in B_g. We describe this procedure in algorithm 9.

Algorithm 9 An algorithm for determining plan existence.
1: function PLANEXISTS(instance (U, V, Φ, B, u_0, B_g), δ ∈ ℝ>0)
2:   n ← ⌈|Σ| C_d^{−1} (1/δ)^d⌉
3:   if REACH(n, {u_0}, S_g) yields anything then
4:     return True
5:   else
6:     return False

Algorithm 9 returns True if and only if there is a state in B_g reachable from u_0 via a plan of length less than n. Since a plan exists if and only if a plan of length less than n exists, algorithm 9 is correct. Since our bound on plan length is exponential in the size of the input and algorithm 9 requires space logarithmic in the length bound and polynomial in the size of the input, it follows that algorithm 9 uses only polynomial space. Specifically, if the description involves polynomials of degree at most r in a space of dimension d, algorithm 9 requires O((|

Theorem 30. Stratified semialgebraic C3 is PSPACE-complete.

Proof. Because algorithm 9 decides plan existence using only polynomial space, plan existence is in PSPACE. As argued in section 5.1, any finite PDDL instance with binary variables can be expressed as a uniformly stratifiable semialgebraic transition system by replacing binary variables with real variables subject to the constraint x_i(1 − x_i) = 0. Since the resulting transition system has a description only polynomially larger than the original propositional PDDL instance and since planning using PDDL is PSPACE-complete [24], plan existence is PSPACE-hard. Because plan existence for stratified semialgebraic C3 is PSPACE-hard and in PSPACE, it is PSPACE-complete. □

Finally, observe that it is straightforward to extend algorithm 9 into an algorithm for the optimization problem. Given a transition system instance I and any cost threshold c̄, we can construct an augmented transition system instance Ī from I by adding an additional dimension to U, and defining the feasible transitions to only permit that value to increase by the cost associated with each decision v. If the goal of Ī includes the semialgebraic constraint c ≤ c̄, we can use algorithm 9 to decide if a plan exists that incurs cost at most c̄. This construction requires only one additional variable and thus only polynomially more space than deciding if a plan exists at all; accordingly, near-optimal planning for stratified semialgebraic transition systems is also PSPACE-complete.
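A sketch of this augmentation in Python is given below; B, cost, goal, and threshold are hypothetical placeholders for a concrete instance, and the last state coordinate plays the role of the added cost dimension.

def augment_with_cost(B, cost, goal, threshold):
    # Wrap a transition relation so the final state coordinate accumulates cost,
    # and strengthen the goal with the constraint c <= threshold.
    def B_aug(u, v, u_next):
        *q, c = u
        *q_next, c_next = u_next
        return B(tuple(q), v, tuple(q_next)) and c_next == c + cost(tuple(q), v)

    def goal_aug(u):
        *q, c = u
        return goal(tuple(q)) and c <= threshold

    return B_aug, goal_aug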

5.4 Relationships between representations

We close this chapter by studying the relationships between constraint-based representations for planning problems and the more conventional differential representations. For example, a standard way to model the dynamics of a physical system is to define a system state x ∈ X, a continuous control input u ∈ U, and a differential equation ẋ = f(x, u) determining the evolution of the state in response to the input. If the dynamics f are Lipschitz continuous, then state trajectories are guaranteed to exist and to be unique, and we can thus define motion planning problems as the search for a control tape u(t) that will steer the system from an initial state x_0 to a goal set X_g.

There are many advantages to differential representations, including a clear physical motivation: Newton's equations take the form of ordinary differential equations. However, differential planning problems can be quite complicated to analyze, and require more sophisticated mathematical tools than the real algebraic geometry used in this chapter. Tools from differential geometry and geometric control theory allow formal proofs of global controllability for certain kinds of systems. Unfortunately, the same features that make these tools appealing for formal analysis make them unfriendly to algorithmic analysis. The solutions for dynamical systems often involve analytic functions, and many simple queries involving analytic functions are undecidable; for example, there is no algorithm to determine whether two analytic functions are equivalent. Consequently, general-purpose algorithms for studying dynamical systems typically rely on numerical approximations.

Observe that if we could transform some representation of a dynamical system into a semialgebraic transition system, we could use the results of this chapter to construct algorithms for deciding plan existence in most physical systems. As we will show, such a transformation is not possible in general. However, in this section we take several steps towards determining when such a construction might exist, and thus towards providing an algorithm for determining plan existence in a more general setting than the transition systems considered here. First, we show that any piecewise-analytic locally-controllable hybrid system has an equivalent transition system with transitions defined by Boolean combinations of analytic functions. This result establishes a duality between transition systems and certain classes of differential representations. Unfortunately, our proof is not constructive: there is no general method for finding a transition system equivalent to a given differential system. Second, we show that any semianalytic transition system can be approximated arbitrarily well by a semialgebraic transition system. Together with the results of section 5.3, approximability implies that we could solve planning problems for semianalytic transition systems using only polynomial space if we were willing to accept some arbitrarily small fixed precision in our result.

5.4.1 Analytic functions and semianalytic sets

Suppose we have a differential representation of a dynamical system u̇ = f(u, v). We might naively assume that if the function f is semialgebraic, then the set of states reachable from some initial state u_0 is also semialgebraic. Unfortunately, that assumption would be false. Consider the most elementary example of a differential equation:

u̇ = Au.    (5.2)

Here, A is a matrix. Equation 5.2 is a linear equality constraint, and is solved by functions of the form

u(t) = exp(At) u_0.    (5.3)

The exponential function is not semialgebraic; instead, it belongs to a broader class of functions, the analytic functions.
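Equation 5.3 can be verified numerically with a matrix exponential routine; the snippet below uses scipy.linalg.expm on an illustrative rotation generator (the particular matrix is not from the thesis).

import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-1.0, 0.0]])      # a simple rotation generator
u0 = np.array([1.0, 0.0])
t = 0.5
u_t = expm(A * t) @ u0                        # u(t) = exp(At) u0 solves du/dt = A u
print(u_t)                                    # approximately [cos(t), -sin(t)]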

Definition 6 (Analytic functions). A function f : ℝ → ℝ is analytic, written f ∈ Cω, if, for every point x_0, the Taylor series of f at x_0 converges to f(x) for all x in some neighborhood of x_0.

Note that analyticity implies the existence of infinitely many derivatives, but is an even stronger condition than smoothness; for example, it can be shown that if an analytic function is constant on some open subset of its domain, it is constant everywhere.

Just as we can define the semialgebraic sets using polynomials, we can define the class of semianalytic sets using analytic functions.

Definition 7 (Semianalytic sets). A set A ⊆ ℝ^d is called semianalytic if for any x ∈ ℝ^d, there are a neighborhood U ∋ x and analytic functions f_i, g_ij : U → ℝ such that

A ∩ U = ⋃_i { x ∈ U | f_i(x) = 0, g_ij(x) > 0 for all j }.    (5.4)

Semianalytic sets inherit many of the nice properties of semialgebraic sets; for example, the union or intersection of semianalytic sets is semianalytic. A foundational result in real analytic geometry is that every semianalytic set admits a semianalytic stratification.

Theorem 31 (Łojasiewicz [138]). Any semianalytic set is compatible with a semianalytic stratification, that is, a stratification whose strata are semianalytic sets.

The simplicity of this statement belies its importance. Similar statements made without recourse to real algebraic geometry must introduce numerous technical conditions related to the ways in which strata are permitted to intersect. Analyticity alone is sufficient to capture these complex regularity conditions.

Unlike the semialgebraic case, there is no quantifier elimination or Tarski-Seidenberg theorem for semianalytic sets. The projection of a semianalytic set need not be semianalytic. For example [49], consider the semianalytic set

A = {(u, v, uv, v exp(-u)) | (u, v) ∈ [0, 1] × [0, 1]},

and let

B = {(v, w, x) | (u, v, w, x) ∈ A},

that is, the image of A under projection onto the last three coordinates. Observe that w = uv and x = v exp(-u), and therefore B can also be characterized as B = {(v, w, v exp(-w/v)) | 0 ≤ w ≤ v ≤ 1}, which is not semianalytic near 0. The function exp(-1/w) is a well-known example of a function that is infinitely differentiable yet not analytic, as every derivative vanishes at 0 yet the function is not identically zero.
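As a quick sanity check of this last claim (a small illustrative sketch in Python with SymPy; it is not part of the original argument), one can verify symbolically that the first several derivatives of exp(-1/w) all tend to zero as w approaches 0 from the right, so the Taylor series at the origin is identically zero even though the function is not:

import sympy as sp

w = sp.symbols('w', positive=True)
f = sp.exp(-1 / w)

# Each one-sided limit below evaluates to 0: every derivative of exp(-1/w)
# vanishes at the origin, so the Taylor series there cannot recover f.
for n in range(5):
    print(n, sp.limit(sp.diff(f, w, n), w, 0, '+'))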

A broader class of sets, the subanalytic sets, are sometimes useful tools when projection is necessary. Subanalytic sets have previously found use in control theory [188] and hybrid systems [127], and are intimately related to a powerful formalism for constructive geometry and analysis, the theory of o-minimal sets. We refer the reader to the survey by Denkowska and Denkowski [49] for an introduction to the topic; we give only the essential definitions here. A subanalytic set is a set that is locally the projection of a semianalytic set.

Definition 8 (Subanalytic sets). A set X ⊂ ℝ^d is called a subanalytic set if for any x ∈ X there exists a neighborhood U ∋ x such that X ∩ U = π(A), where A ⊂ ℝ^{d+k} is a semianalytic set and π : ℝ^{d+k} → ℝ^d is the projection onto the first d coordinates.

Semianalytic sets are closed under the Boolean operations of union and intersection; subanalytic sets are additionally closed under the operations of closure, boundary, and complement.

5.4.2 Duality for analytic representations

In this section, we give sufficient conditions on a differential representation for a dynamical system to ensure the existence of an equivalent transition system. In particular, we prove that any piecewise-analytic dynamical system defined in a compact space can be described by an analytic transition system.

Consider a dynamical system represented as a first-order differential equation

u̇ = f(u, v) (5.5)

where u ∈ U ⊂ ℝ^d is the configuration of the system, v ∈ V ⊂ ℝ^k is the control signal, and U is compact. Recall from chapter 3 that if there is a semianalytic stratification of U such that the restriction of f to each stratum is analytic, then we say f is piecewise analytic. If we envision f as defining the dynamics of a simplified manipulation problem, where the pose of each of N objects changes only when the object is rigidly grasped by a robot and is otherwise fixed, then there are N+1 strata: one stratum for each possible grasp, and one stratum for the part of the state space where no object is grasped. On each stratum, the dynamics are trivial: either the robot moves alone, or the robot and the grasped object move as one rigid body.

We first show that because the dynamics are analytic on each stratum S, we can construct a collection of analytic functions Φ_S on each piece such that for any feasible path p ⊂ S through a configuration u ∈ S, φ(u) = φ(u') for all u, u' ∈ p and all φ ∈ Φ_S. In the language of differential geometry, Φ_S defines a foliation: the equivalence relation u ~ u' ⟺ φ(u) = φ(u') for all φ ∈ Φ_S partitions S into an infinite collection of submanifolds called leaves, such that any feasible path on a stratum lies entirely in the closure of one element of the collection. In the manipulation example, every possible combination of poses of the non-grasped objects defines a distinct leaf.
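To make this structure concrete, the following toy sketch (illustrative Python with hypothetical names, not code from the thesis) labels each configuration of the simplified manipulation domain with its stratum, determined by which object is grasped, and its leaf, determined by the poses of the objects that cannot move on that stratum:

from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Config:
    robot: Tuple[float, float]                 # planar robot position (toy)
    objects: Tuple[Tuple[float, float], ...]   # poses of the N movable objects
    grasped: Optional[int] = None              # index of the grasped object, or None

def stratum(q: Config):
    # One stratum per possible grasp plus one with nothing grasped: N + 1 in total.
    return q.grasped

def leaf(q: Config):
    # Along any feasible path within a stratum, only the robot and the grasped
    # object move; the remaining object poses are invariant and label the leaf.
    return tuple(pose for i, pose in enumerate(q.objects) if i != q.grasped)

q = Config(robot=(0.0, 0.0), objects=((1.0, 0.0), (2.0, 1.0)), grasped=0)
print(stratum(q))   # 0: the stratum on which object 0 is rigidly grasped
print(leaf(q))      # ((2.0, 1.0),): object 1 stays fixed on this leaf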

Next, we use results from geometric control theory [188] to show that on each leaf, if there is a control law that takes the system from a state u ∈ L to a state u' ∈ L, then there is a piecewise-analytic control law that takes u to u'. That is, we only ever need to consider piecewise-analytic control laws to move around each leaf, and thus around the full configuration space. We can therefore define a finite collection of vector fields and show that for any reachable point in the entire configuration space, there is a trajectory defined by the flow along a suitable piecewise-analytic vector field.

Finally, we can show that strata, leaves, and switching conditions for the vector fields can all be defined in terms of finite Boolean formulae whose atomic formulae are given by inequalities of analytic functions. Thus, we can show that if a dynamical system u̇ = f(u, v) is piecewise-analytic, then there exists an equivalent analytic transition system (U, V, Φ, B).

The primary utility of our existence theorem is that it shows a kind of universality to constraint-based representations. The class of systems we can model with piecewise-analytic differential equations is identical to the class of systems we can model using analytic transition systems. Put less precisely, we can use transition systems to model essentially any dynamical system that is sufficiently well-behaved.

We now introduce the technical ideas required to make and prove our claim. Recall the following definitions from chapter 3.

Definition 9 (Piecewise analytic control system). If there is a semianalytic stratification of U such that for each v ∈ V, the restriction of f(·, v) to each stratum is analytic, then we say f is piecewise analytic.

Definition 10 (Path space). The path space Π_f associated with a dynamical system f is the set of all continuous paths p : [0, 1] → U satisfying ∃v ∈ V : ṗ(s) = f(p(s), v) for all s.

Definition 11 (Reachable set). The reachable set R_f(u_0; t) is the set of all configurations u reachable from the configuration u_0 within time t:

R_f(u_0; t) = {u | ∃p ∈ Π_f : p(0) = u_0, p(t) = u}. (5.6)
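For intuition, the reachable set can be under-approximated numerically by rolling the dynamics forward under sampled controls. The sketch below (illustrative Python; the piecewise-constant controls, Euler integration, and the toy control set are assumptions of the sketch, not part of the definition) collects states reached within time t:

import numpy as np

def sample_reachable_set(f, u0, t, num_controls=200, steps=50, rng=None):
    # Monte Carlo under-approximation of R_f(u0; t): integrate the dynamics
    # forward under randomly sampled piecewise-constant controls.
    rng = np.random.default_rng() if rng is None else rng
    dt = t / steps
    reached = []
    for _ in range(num_controls):
        u = np.array(u0, dtype=float)
        for _ in range(steps):
            v = rng.uniform(-1.0, 1.0)      # toy control set V = [-1, 1]
            u = u + dt * f(u, v)            # explicit Euler step
            reached.append(u.copy())
    return np.array(reached)

# Example: single integrator with u' = v; the true reachable set from 0 within
# time 1 is [-1, 1], and every sampled state lies inside it.
pts = sample_reachable_set(lambda u, v: np.array([v]), u0=[0.0], t=1.0)
print(pts.min(), pts.max())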

In order to prove our result, we will first construct a simpler system, where the feasible directions {f(u, v) | v ∈ V} form a subspace of T_u U. The technical term for such a structure is a distribution.

Definition 12 (Distribution). A distribution on a manifold X is a smooth map assigning to each x ∈ X a linear subspace of T_x X.

Definition 13 (Involutive closure). The involutive closure Δ̄ of a distribution Δ is the smallest distribution containing Δ that is closed under the Lie bracket, that is, for all X_1, X_2 ∈ Δ̄, [X_1, X_2] ∈ Δ̄. Explicitly, define a sequence Δ_i recursively with Δ_0 = Δ and Δ_{i+1} = Δ_i ∪ ⋃_{X_1, X_2 ∈ Δ_i} [X_1, X_2]; then Δ̄ = ⋃_i Δ_i.
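For a concrete, purely illustrative instance of these definitions (a Python/SymPy sketch, not taken from the thesis), consider the two control vector fields of a unicycle-like system; their Lie bracket does not lie in their span, so the involutive closure is strictly larger than the original distribution:

import sympy as sp

x, y, theta = sp.symbols('x y theta')
q = sp.Matrix([x, y, theta])

# Unicycle-style control vector fields: drive forward and turn in place.
X1 = sp.Matrix([sp.cos(theta), sp.sin(theta), 0])
X2 = sp.Matrix([0, 0, 1])

def lie_bracket(f, g, coords):
    # [f, g] = (Dg) f - (Df) g
    return g.jacobian(coords) * f - f.jacobian(coords) * g

X3 = sp.simplify(lie_bracket(X1, X2, q))
print(X3.T)                                    # [sin(theta), -cos(theta), 0]
print(sp.Matrix.hstack(X1, X2, X3).rank())     # 3: the bracket escapes span{X1, X2}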

We now have the tools required to prove the main result.

Theorem 32 (Duality for analytic transition systems). For any piecewise-analytic control system

u̇ = f(u, v) (5.7)

there exists a transition system (U, V, Φ, B) such that for any initial state u_0 and definable goal set U_g, either

1. the transition system instance (Φ, B, u_0, U_g) has no solution and U_g ∩ R_f(u_0) = ∅, or

2. there exists a solution ((u_1, v_1), ..., (u_T, v_T)) to the transition system, and there exists a path p ∈ Π_f and a sequence 0 < t_1 < ... < t_T such that p(0) = u_0, p(t_T) ∈ U_g, and p(t_1) = u_1, ..., p(t_T) = u_T.

That is, a solution to the transition system exists if and only if the goal set is reachable.

Proof. Because the control system f is piecewise analytic, it admits a definable semianalytic stratification Σ of U such that the restriction f_σ of f to each stratum σ of Σ is analytic. The stratification Σ is finite, and thus is definable using only a finite set Φ_Σ of analytic functions.

The function f_σ induces a family X_f of (locally-defined) analytic vector fields: for any set S ⊂ σ open in σ, let X_{f,S} = {(u, f(u, V(u))) : u ∈ S, V ∈ C^∞(S, V)}, and let X_f = ⋃_S X_{f,S}. This family X_f in turn induces a distribution Δ_f = ⋃_{u∈σ} span{X(u) : X ∈ X_f}. By construction, the involutive closure Δ̄_f of Δ_f is closed under involution, and we can construct a basis for Δ̄_f of analytic vector fields by restricting X_f to those vector fields whose local control laws are analytic; because f_σ is analytic, this restriction contains only real analytic vector fields. Because Δ̄_f is closed under involution and spanned by analytic vector fields, it is integrable [152].³

³ The cited theorem is a generalization of Frobenius' theorem [34, 46, 66] that requires analyticity rather than regularity. For a more recent exposition on the topic, see Michor [148], theorem 3.28.

Because Δ̄_f is integrable, for any point u ∈ σ there is a unique maximal integral manifold O_u ⊂ σ containing u. O_u is an immersed analytic submanifold of σ, and is thus a bounded semianalytic set. We can then define an analytic foliation whose leaves are the manifolds O_u. There must then be an analytic foliated atlas of σ, implying the existence of a finite set of functions Φ_σ such that two points u, u' ∈ σ are connected by a path p ∈ Π_{Δ̄_f} with p ⊂ σ if and only if φ(u) = φ(u') for all φ ∈ Φ_σ. Because Π_f ⊆ Π_{Δ̄_f}, any feasible path p ∈ Π_f through u is a subset of O_u. Consequently, the collection Φ = Φ_Σ ∪ ⋃_{σ∈Σ} Φ_σ can be used to define a transition system that contains a solution if (but not only if) the goal set is reachable under the dynamics f: if there is no path between two points on the same connected component of the same leaf of the same stratum, then there is no feasible path under the dynamics.

To prove that a path in the transition system exists only if there is a corresponding path in the dynamical system, observe that for small time t, the reachable set R_f(u; t) is semianalytic. We can therefore define the semianalytic set S_t = {(u, u') | u' ∈ R_f(u; t)}. For any feasible path p ∈ Π_f with p ⊂ σ there is thus a sequence (u_1, ..., u_n) such that (u_i, u_{i+1}) ∈ S_t. Because S_t is semianalytic, we can define it in terms of a finite collection of analytic functions Φ_t. Combining the finite collections used thus far, we have shown that the collection Φ = Φ_Σ ∪ ⋃_{σ∈Σ} Φ_σ ∪ Φ_t can be used to define a transition system that contains a solution if and only if the goal set is reachable under the dynamics f. □

5.4.3 Computability and approximation

While theorem 32 ensures the existence of a transition system, the proof is not constructive: it does not provide a procedure to find the analytic transition system corresponding to a given dynamical system. In practice, it may be difficult or impossible to construct a transition system from a given set of dynamical equations. Unfortunately, there is unlikely to be a general algorithm for performing this task, as a consequence of Richardson's theorem [163], which shows that there is no algorithm to determine whether two analytic functions are equivalent.

There are many variations and statements of Richardson's theorem. Among the most modern and compact is due to Laczkovich [126], who showed that if F is the set of functions formed by substituting x, sin x^n, and sin(x sin x^n) into an arbitrary polynomial with integer coefficients, then there is no algorithm to decide, given f ∈ F, whether there exists x such that f(x) > 0, nor is there an algorithm to decide whether there exists x such that f(x) = 0. An immediate consequence is that there is no algorithm to decide if two functions f, f' ∈ F are equal for all x.

Richardson's theorem has two important ramifications for transition system planning. First, there exist pathological instances of transition systems with semianalytic constraints for which we cannot even decide if a plan of length one exists. Second, because we cannot determine whether two analytic functions are equivalent, we also cannot determine whether two analytic transition systems are equivalent, nor whether an analytic transition system is equivalent to a given analytic dynamical system. These undecidability results pose a significant barrier to formulating general algorithms to solve robotic motion planning problems.

One path around the barriers due to undecidability is to relax the definition of feasibility. Analytic functions on a compact domain can be well-approximated by algebraic functions. In particular, Baouendi and Goulaouic [9, 10] showed that if f is an analytic function defined on a compact set K ⊂ ℝ^d, then there exist constants α, β > 0 such that for any n ∈ ℕ, there is a polynomial p_n of degree n with

sup_{x ∈ K} |f(x) − p_n(x)| ≤ α 2^{−βn}.

While constructing such a polynomial may be difficult, this theorem is sufficient to show that we can solve a relaxed version of the plan existence problem for compact semianalytic domains.
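The following numerical sketch (plain NumPy, not the construction of the cited theorem) illustrates the qualitative behavior being invoked: polynomial fits to the analytic function exp on a compact interval have worst-case error that shrinks roughly geometrically as the degree grows:

import numpy as np

# Approximate exp on the compact interval [-1, 1] by Chebyshev-basis
# polynomial fits of increasing degree and record the worst-case error.
xs = np.linspace(-1.0, 1.0, 2001)
ys = np.exp(xs)

for deg in range(1, 11):
    p = np.polynomial.Chebyshev.fit(xs, ys, deg)
    err = np.max(np.abs(ys - p(xs)))
    print(deg, err)   # the error decays roughly geometrically in the degree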

Theorem 33. For any semianalytic transition instance I and positive δ, there is a corresponding semialgebraic instance I' with maximal degree O(log(1/δ)) such that any plan p' feasible for I' is nearly feasible for I. Let d(p, p') = sup_s inf_t ‖Trace(p)(s) − Trace(p')(t)‖ be the distance between two plans. If feas[I] is the set of feasible plans for I, then

p' ∈ feas[I'] ⟹ ∃p ∈ feas[I] : d(p, p') < δ.
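For intuition about the distance used in the theorem, the sketch below (illustrative Python; the finite sampling of the sup and inf is an assumption of the sketch, not part of the statement) computes d(p, p') for plans represented as sampled traces:

import numpy as np

def plan_distance(trace_p, trace_q):
    # d(p, q) = sup_s inf_t ||Trace(p)(s) - Trace(q)(t)||, approximated on samples.
    trace_p = np.asarray(trace_p)   # shape (num_samples_p, dim)
    trace_q = np.asarray(trace_q)   # shape (num_samples_q, dim)
    dists = np.linalg.norm(trace_p[:, None, :] - trace_q[None, :, :], axis=-1)
    return dists.min(axis=1).max()

# Two toy traces that stay within 0.05 of one another.
ts = np.linspace(0.0, 1.0, 101)
p = np.stack([ts, np.sin(ts)], axis=1)
q = np.stack([ts, np.sin(ts) + 0.05], axis=1)
print(plan_distance(p, q))   # approximately 0.05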

If we are ultimately going to implement our algorithms using finite-precision arithmetic, then in practice semialgebraic transition systems have as much representational power as semianalytic systems: every semianalytic system is an arbitrarily small perturbation of a semialgebraic instance. The size of the semialgebraic approximation grows with the logarithm of the tolerance, meaning that even very small tolerances can be achieved by reasonably small descriptions. We could therefore define a polynomial-space partial algorithm to determine the feasibility of analytic transition systems that always returns the correct result when it returns a definitive answer, and that fails to return a definitive answer only if feasibility hinges on features smaller than machine precision. Probing the bounds of what can be decided by approximate algorithms of this type is a promising avenue for future research.

5.5 Implications

In this chapter, we have made a sequence of theoretical claims.

1. Undecidability: reachability for semialgebraic transition systems is undecidable.

2. PSPACE-completeness: reachability for stratifiable semialgebraic transition systems, a class of problems that includes most task and motion planning formalisms, is PSPACE-complete.

3. Duality: any analytic, locally controllable hybrid dynamical system can equivalently be expressed as a semianalytic transition system.

4. Approximability: any semianalytic transition system can be approximated arbitrarily well by a semialgebraic system.

These results advance our understanding of long-standing open problems around the intersection of computation and robotics. In particular, they expand the class of motion planning problems that are formally solvable. These results also demonstrate the computational equivalence of large classes of planning problems; the feasibility problem for every planning formalism considered was shown to be either undecidable or else PSPACE-complete.

We note that the approach taken in algorithm 9 is not intended to be a practical strategy for solving planning problems; it discards most of the structure in the planning problem, and makes no use of heuristics or other standard techniques. Still, its existence has several important ramifications. First, the duality between semianalytic transition systems and piecewise-analytic dynamical systems implies a kind of sufficient condition for a specification language. Any language that can specify an arbitrary semianalytic transition system is, in a sense, "as good as" any other language for modeling problems. These conditions are highly relevant to the ongoing effort to standardize a specification language for TMP, and highlight the theoretical relevance of C3.

Second, our algorithm 9 solves motion planning problems as a special case. The existence of a PSPACE algorithm for motion planning is unsurprising; however, our algorithm is considerably simpler than the known algorithms for semialgebraic motion planning, and solves a broader class of problems. There is no publicly known implementation of Canny's roadmap algorithm [27] for motion planning, and it may be that the algorithm in its general form has never been implemented [132]. While there is no implementation of algorithm 9 either, it is considerably simpler than Canny's algorithm; at each iteration, it requires only a single round of quantifier elimination and a decomposition of the resulting set into simple pieces. The comparative simplicity of our approach suggests that a practical implementation may be feasible, which has implications for the rare situations where exact results are important. For example, in problems like manufacturing optimization, a motion plan will be executed many times and even small improvements in cost can be extremely valuable.
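As a purely illustrative example of what a single round of quantifier elimination produces (not an example drawn from the thesis), eliminating the quantifier from the formula ∃x ∈ ℝ : x² + bx + c ≤ 0 yields the equivalent quantifier-free condition b² − 4c ≥ 0 on the parameters; decomposing the resulting semialgebraic set in the (b, c)-plane into simple pieces then amounts to distinguishing the regions b² − 4c > 0 and b² − 4c = 0.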

Third, our results show that task and motion planning, if formalized using semialgebraic transition systems for systems with uniform stratified accessibility, is asymptotically no harder than task planning or motion planning alone. This hardness bound is the best result we could have hoped for; provided we restrict our attention to controllable systems, there is no subtle interaction that makes hybrid planning even harder than the already-hard constituent planning problems. Moreover, although problems in PSPACE are generally believed to be intractable, PSPACE has the distinction of being among the hardest classes of problems for which we cannot prove there is no polynomial-time algorithm. In addition, it is known that every problem in PSPACE is solvable by an interactive proof system with a polynomial-time verifier [172]. It is easy to imagine the descendants of contemporary results in deep reinforcement learning being used to construct an enormously complex decision-making engine to solve task and motion planning problems. Such an engine might seem to work empirically, but be too complex to verify. Problems in PSPACE are precisely those problems for which the results can be verified in a polynomial number of interactions between an untrusted prover and a resource-constrained verifier, with only a bounded probability of error.

Finally, Corollary 27 highlights the importance of discrete structure; without a stratification, the problem quickly stops even resembling motion planning. Our undecidability result points to the fundamental importance of symbolic reasoning, interpreted as meaningful discrete structure, as a tool for understanding continuous decision-making. The importance of discrete structure has been appreciated since the early days of robotics. For example, Newell and Simon [154] famously conjectured that "A physical symbol system has the necessary and sufficient means for general intelligent action." Malcolm and Smithers [142] are even more explicit:

Any mechanically embodied intelligent process will be comprised of structural ingredients that 1) we as external observers naturally take to represent a propositional account of the knowledge that the overall process exhibits, and 2) independent of such external semantical attribution, play a formal but causal and essential role in engendering the behaviour that manifests that knowledge.

Corollary 27 can be interpreted as a partial proof of these statements. At least in the limited setting of deterministic planning under piecewise-analytic differential constraints, the set of problems for which planning is possible in general is precisely the set of problems that possess a suitable symbolic structure.

Chapter 6

Conclusions

6.1 Summary of contributions

In this thesis, we have sought to address deficiencies in representations for task and motion planning. Our main contribution is a new representation, the C3 formalism, that generalizes several existing representations. We argue that this representation can be a unifying lens through which to view existing task and motion planning research. Furthermore, we showed that our representation supports abstraction, which we argue is an effective way to move beyond the limitations of traditional heuristics. We defined admissibility conditions for abstractions, under which planning with abstraction is guaranteed to yield nearly-optimal results.

We also used this representation to resolve several long-standing open questions about the decidability and complexity of robot motion planning under general dynamics. We presented asymptotically optimal algorithms for task and motion planning, and used them as the backbone of a planning system capable of solving real-world planning problems. One of our algorithms takes advantage of factored structure in the problem, and allows exponentially smaller discretizations than a naive approach.

Furthermore, we used tools from real algebraic geometry to show that a broad class of planning problems is PSPACE-complete. Our result applies to most formalizations of task and motion planning in common use today. In addition, we showed that every piecewise-analytic differential system can be well-approximated by a C3 instance.

6.2 Open problems

We close by noting that as we have resolved a long-standing open question around the complexity of motion planning, several new questions can now be asked.

Decidability of continuous planning under uncertainty

All of our theoretical results have centered around deterministic planning. We observe that if we have a transition or observation model that is semialgebraic, then basic closure properties of semialgebraic sets under the operations of calculus ensure the value function is also semialgebraic, and thus that the optimal policy is semialgebraic. It is therefore likely that our analysis could be extended to study the complexity of generalized planning under uncertainty. We conjecture that planning under uncertainty in semialgebraic domains is EXPTIME-complete, while partially observable planning is EXPSPACE-complete. The intuition here is straightforward. If our discretization results hold even in the stochastic or partially observable case, then we ought to be able to reduce the problem to discrete planning with an exponentially large state space. Planning under uncertainty requires time polynomial in the size of the state space, while partially observable planning requires space polynomial in the size of the state space; if the state space is exponentially large, this implies EXPTIME and EXPSPACE complexity, respectively. The main technical barrier to proving these conjectures is establishing a bound on the size of the smallest policy that satisfies the constraints of the planning problem.

Necessary conditions for decidability

In chapter 5 we showed that uniform stratified accessibility is sufficient to ensure the decidability of C3. However, it remains unknown if uniform stratified accessibility is the weakest possible condition, or if it is just a special case of a yet more general class. Observe that we could trivially construct a larger class of decidable problems by picking any analytic planning problem I* for which the solution is known and defining the set that contains the semialgebraic stratifiable transition systems and I*. We could therefore ask: is there a "natural" class of decidable planning problems that is a strict superset of the semialgebraic stratifiable transition systems we have studied? The proper definition of "natural" is likely the key technical barrier to answering this question.

Constructive proofs of duality

In chapter 5, we proved that any piecewise-analytic dynamical system can be represented arbitrarily accurately by a semialgebraic transition system. However, our proof was not constructive; we have no algorithm which would take some specification of a piecewise-analytic dynamical system and construct a corresponding analytic transition system. In general, no such algorithm is likely to exist, but there may be classes of dynamical systems for which we can construct an explicit transition system, such as differentially flat systems or Pfaffian systems.

6.3 A vision for verifiable performant systems

We close by noting that the work in this thesis represents a very small step down an exciting road. Deliberative planning is an essential component of intelligent systems, yet the algorithmic hardness of planning for hybrid systems ensures that planning will always be intractable in the worst case. Abstraction represents a powerful framework for addressing this challenge.

Incremental improvement to the algorithms described here could already lead to interesting places. Efficient anytime planning, taking advantage of powerful admissible heuristics and the rich litany of upper bounds available in the control theory literature, could enable even very complicated robots to behave with apparent intelligence. Focussed sampling could solve challenging manipulation problems in the same way that it solves challenging motion planning problems. Minor extensions could allow fast replanning, or the synthesis of policies robust to uncertain actions. Symbolic reasoning could improve perception, and facilitate communication and interaction with humans.

Beyond these incremental advances, though, lies a monumental and important challenge: abstraction learning. We have shown that while there is fundamental and important symbolic structure inherent to any hybrid domain, there is also a purely cognitive role for symbolic reasoning. In all of our work, and in all of the immediately visible next steps, the symbols used for reasoning were specified by hand, typically as a result of careful analysis. Yet the evidence is mounting that learned policies can outperform humans even in domains we think we understand deeply.

The limitations of learning policies have traditionally been the cost of data and the lack of provable safety; the former makes autonomous data collection necessary, and the latter makes it impossible. Angelic abstractions provide a substrate on which we can build intelligent systems, providing a way to ensure good performance while the learned model improves and to prevent the worst of the errors that learned policies can exhibit. We hypothesize, and hope to one day prove, that learned cognitive symbols are the essential missing ingredient for general embodied intelligence.

Bibliography

[1] Pankaj Agarwal et al. "Nonholonomic path planning for pushing a disk among obstacles". In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). 1997.

12] Sandip Aine et al. "Multi-heuristic A*". In: The International Journal of Robotics Research 35.1-3 (2016), pp. 224-243. 131 Rachid Alami, Jean-Paul Laumond, and Thierry Simeon. "Two manipulation planning algorithms". In: Proceedings of the International Workshop on the Algorithmic Foundations of Robotics (WAFR). 1994. [4] Rachid Alami, Thierry Simeon, and Jean-Paul Laumond. "A geometrical ap- proach to planning manipulation tasks. The case of discrete placements and grasps". In: Proceedings of the InternationalSymposium on Robotics Research (ISRR). 1990. [5] Samir Alili et al. "Interleaving symbolic and geometric reasoning for a robotic assistant". In: Workshops at the International Conference on Automated Plan- ning and Scheduling. 2010. 16] Rajeev Alur et al. "The algorithmic analysis of hybrid systems". In: Theoretical 138.1 (1995), pp. 3-34. 17] Nancy M Amato et al. "OBPRM: An obstacle-based PRM for 3D workspaces". In: Proceedings of the International Workshop on Algorithmic Foundations of Robotics (WAFR). 1998, pp. 155-168.

[8] Christer Bäckström and Bernhard Nebel. "Complexity results for SAS+ planning". In: Computational Intelligence 11.4 (1995), pp. 625-655. [9] Salah Baouendi and Charles Goulaouic. "Approximation of analytic functions on compact sets and Bernstein's inequality". In: Transactions of the American Mathematical Society 189 (1974), pp. 251-261. DOI: 10.2307/1996858. [10] Salah Baouendi and Charles Goulaouic. "Approximation polynomiale de fonctions C∞ et analytiques". French. In: Annales de l'institut Fourier. Vol. 21. 4. 1971, pp. 149-173. DOI: 10.5802/aif.396. [11] Jérôme Barraquand et al. "A random sampling scheme for path planning". In: The International Journal of Robotics Research 16.6 (1997), pp. 759-774.

143 1121 Jennifer Barry, Leslie Pack Kaelbling, and Tomis Lozano-P6rez. "A hierar- chical approach to manipulation with diverse actions". In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). 2013. 113] Saugata Basu, Richard Pollack, and Marie-Frangoise Roy. Algorithms in Real Algebraic Geometry. Vol. 10. Algorithms and Computation in Mathematics. Springer, 2016.

[14] Michael Ben-Or, Dexter Kozen, and John Reif. "The complexity of elementary algebra and geometry". In: Journal of Computer and System Sciences 32.2 (1986), pp. 251-264. 115] Dmitry Berenson, Siddhartha Srinivasa, and James Kuffner. "Task space re- gions: A framework for pose-constrained manipulation planning". In: The In- ternational Journal of Robotics Research 30.12 (2011), pp. 1435-1460. DOI: 10.1177/0278364910396389. 116] Mohak Bhardwaj, Byron Boots, and Mustafa Mukadam. Differentiable Gaus- sian process motion planning. 2019. arXiv: 1907.09591 [cs. RO]. 117] Julien Bidot et al. "Geometric backtracking for combined task and motion planning in robotic systems". In: Artificial Intelligence 247 (2017), pp. 229- 265. [18] Avrim Blum and Merrick Furst. "Fast planning through planning graph anal- ysis". In: Artificial intelligence 90.1-2 (1997), pp. 281-300. 119] Robert Bohlin and . "Path planning using lazy PRM". In: Pro- ceedings of the IEEE International Conference on Robotics and Automation (ICRA). 2000. DOI: 10.1109/ROBOT.2000.844107. 120] Valerie Boor, Mark H Overmars, and A Frank Van Der Stappen. "The Gaussian sampling strategy for probabilistic roadmap planners". In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). 1999. 121] Michael Branicky et al. "Quasi-randomized path planning". In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). 2001.

1221 Rodney Brooks and TomA Lozano-P6rez. "A subdivision algorithm in config- uration space for findpath with rotation". In: IEEE Transactions on Systems, Man, and Cybernetics SMC-15.2 (Mar. 1985), pp. 224-233. DOI: 10. 1109/ TSMC.1985.6313352.

123] Daniel Bryce et al. "SMT-based nonlinear PDDL+ planning". In: Workshops at the AAAI Conference on Artificial Intelligence. 2015. 124] Tom Bylander. "The computational complexity of propositional STRIPS plan- ning". In: Artificial Intelligence 69.1-2 (1994), pp. 165-204. 125] Stephane Cambon, Rachid Alami, and Fabien Gravot. "A hybrid approach to intricate motion, manipulation and task planning". In: The International Journal of Robotics Research 28.1 (2009), pp. 104-126.

[26] Stéphane Cambon, Fabien Gravot, and Rachid Alami. "A robot task planner that merges symbolic and geometric reasoning". In: Proceedings of the European Conference on Artificial Intelligence (ECAI). 2004. [27] John Canny. The complexity of robot motion planning. MIT Press, 1988. [28] Michael Carreras. The Curse of the Mummy's Tomb. 1964. [29] Michael Cashmore et al. "A compilation of the full PDDL+ language into SMT". In: Workshops at the AAAI Conference on Artificial Intelligence. 2016.

130] Nikhil Chavan-Dafle and Alberto Rodriguez. Sampling-based Planning of In- Hand Manipulation with External Pushes. 2017. arXiv: 1707.00318 Ecs. RO]. 131] Pang Chen and Yong Hwang. "Practical path planning among movable obsta- cles". In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). 1991.

132] Peng Cheng, George Pappas, and Vijay Kumar. "Decidability of motion plan- ning with differential constraints". In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). 2007. [331 Jaesik Choi and Eyal Amir. "Combining planning and motion planning". In: Proceedings of the IEEE InternationalConference on Robotics and Automation (ICRA). 2009. [341 Alfred Clebsch. "Ober die simultane Integration linearer partieller Differential- gleichungen". In: Journalfir die reine und angewandte Mathematik 65 (1866), pp. 257-268. 135] George Collins. "Quantifier elimination for real closed fields by cylindrical al- gebraic decompostion". In: Proceedings of the Conference on Automata Theory and Formal Languages. 1975. [36] Akansel Cosgun et al. "Push planning for object placement on cluttered table surfaces". In: Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS). 2011. 137] Michel Coste. An introduction to semialgebraic geometry. Istituti Editoriali e Poligrafici Internazionali, 2000. [38] Erwin Coumans et al. Bullet physics library. 2013. URL: https: //bulletphysics. org. [39] Joseph Culberson and Jonathan Schaeffer. "Pattern databases". In: Computa- tional Intelligence 14.3 (1998), pp. 318-334. 1401 Neil Dantam, Swarat Chaudhuri, and Lydia Kavraki. "The task motion kit". In: Robotics and Automation Magazine (2018). [41] Neil Dantam and Mike Stilman. "The motion grammar: Analysis of a linguistic method for robot control". In: Transactions on Robotics 29.3 (2013), pp. 704- 718.

145 142] Neil Dantam et al. "An Incremental Constraint-Based Framework for Task and Motion Planning". In: The InternationalJournal of Robotics Research 37 (10 2018), pp. 1134-1151. DOI: 10.1177/0278364918761570. 1431 Neil Dantam et al. "Incremental Task and Motion Planning: A Constraint- Based Approach". In: Robotics: Science and Systems. 2016. 144] Martin Davis. "Hilbert's Tenth Problem is Unsolvable". In: The American Mathematical Monthly 80.3 (1973), pp. 233-269. URL: http: //www. jstor. org/stable/2318447. 1451 Leonardo De Moura and Nikolaj Bjorner. "Z3: An efficient SMT solver". In: Proceedings of the International Conference on Tools and Algorithms for the Construction and Analysis of Systems. 2008. 146] Feodor Deahna. "Ober die Bedingungen der Integrabilitat lineirer Differential- gleichungen erster Ordnung zwischen einer beliebigen Anzahl verinderlicher Gr6Een". In: Journal fir die reine und angewandte Mathematik 20 (1840), pp. 340-349.

[47] Richard Dearden and Chris Burbridge. "An approach for efficient planning of robotic manipulation tasks". In: Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS). 2013. [48] Erik Demaine et al. "Pushing blocks is hard". In: Computational Geometry 26.1 (2003), pp. 21-36. DOI: 10.1016/S0925-7721(02)00170-0. [49] Zofia Denkowska and Maciej Denkowski. "A long and winding road to definable sets". In: Journal of Singularities 13 (2015), pp. 57-86. DOI: 10.5427/jsing.2015.13d. [50] Ashwin Deshpande. "Exact Geometry Algorithms for Robotic Motion Planning". PhD thesis. Massachusetts Institute of Technology, 2019. [51] Ashwin Deshpande, Leslie Pack Kaelbling, and Tomás Lozano-Pérez. "Decidability of semi-holonomic prehensile task and motion planning". In: Proceedings of the International Workshop on the Algorithmic Foundations of Robotics (WAFR). 2016. [52] Edsger W Dijkstra. "A note on two problems in connexion with graphs". In: Numerische Mathematik 1.1 (1959), pp. 269-271.

153] Mehmet Dogar and Siddhartha Srinivasa. "A framework for push-grasping in clutter". In: Robotics: Science and Systems. Vol. 1. 2011. 1541 Jing Dong et al. "Motion Planning as Probabilistic Inference using Gaussian Processes and Factor Graphs". In: Robotics: Science and Systems. 2016. 1551 Christian Dornhege, Andreas Hertle, and Bernhard Nebel. "Lazy Evaluation and Subsumption Caching for Search-Based Integrated Task and Motion Plan- ning". In: Workshops at the InternationalConference on Intelligent Robots and Systemsl (IROS). Tokyo, Japan, 2013.

146 [56] Christian Dornhege et al. "Integrating task and motion planning using seman- tic attachments". In: Proceedings of the AAAI Conference on Bridging the Gap Between Task and Motion Planning. 2010.

1571 Christian Dornhege et al. "Semantic attachments for domain-independent plan- ning systems". In: Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS). 2009. 1581 Stefan Edelkamp and J6rg Hoffmann. "PDDL2: The language for the classical part of the 4th International Planning Competition". In: InternationalPlan- ning Competition at the International Conference on Automated Planningand Scheduling. 2004.

[591 Esra Erdem et al. "Combining high-level causal reasoning with low-level ge- ometric reasoning and motion planning for robotic manipulation". In: Pro- ceedings of the IEEE International Conference on Robotics and Automation (ICRA). 2011. 1601 Can Erdogan and Mike Stilman. "Autonomous design of functional structures". In: Advanced Robotics 29.9 (2015), pp. 625-638. 1611 Can Erdogan and Mike Stilman. "Planning in constraint space: Automated de- sign of functional structures". In: Proceedings of the IEEE InternationalCon- ference on Robotics and Automation (ICRA). 2013. [62] Tom Erez, Yuval Tassa, and Emanuel Todorov. "Simulation tools for model- based robotics: Comparison of Bullet, Havok, MuJoCo, ODE and PhysX". In: Proceedings of the IEEE InternationalConference on Robotics and Automation (ICRA). 2015. 163] Pedro Felzenszwalb and David McAllester. "The generalized A* architecture". In: Journal of Artificial Intelligence Research 29 (2007), pp. 153-190. DOI: 10.1613/jair.2187. 164] Enrique Fernandez-Gonzilez, Brian Williams, and Erez Karpas. "ScottyActiv- ity: mixed discrete-continuous planning with convex optimization". In: Journal of Artificial Intelligence Research 62 (2018), pp. 579-664. [65] Richard Fikes and Nils Nilsson. "STRIPS: A new approach to the application of theorem proving to problem solving". In: Artificial intelligence 2.3-4 (1971), pp. 189-208. [66] Ferdinand Georg Frobenius. "Ober das Pfaffsche Problem". In: Journalfirdie Reine und Angewandte Mathematik 82 (1877), pp. 230-315. [67] Caelan Reed Garrett, Tomis Lozano-P6rez, and Leslie Pack Kaelbling. "Backward- forward search for manipulation planning". In: Proceedings of the IEEE Inter- national Conference on Intelligent Robots and Systems (IROS). 2015. 1681 Caelan Reed Garrett, Tomis Lozano-P6rez, and Leslie Pack Kaelbling. "FFROB: An efficient heuristic for task and motion planning". In: Proceedings of the In- ternational Workshop on the Algorithmic Foundations of Robotics (WAFR). 2014.

147 169] Caelan Reed Garrett, Tomis Lozano-Prez, and Leslie Pack Kaelbling."FFRob: Leveraging symbolic planning for efficient task and motion planning". In: The International Journal of Robotics Research 37.1 (Nov. 2017), pp. 104-136. ISSN:1741-3176. DOI:10.1177/0278364917739114. [70] Caelan Reed Garrett, Tomds Lozano-P6rez, and Leslie Pack Kaelbling. PDDL- Stream: Integrating Symbolic Planners and Blackbox Samplers. 2018. arXiv: 1802.08705 [cs.AI].

171] Caelan Reed Garrett, Tomas Lozano-Prez, and Leslie Pack Kaelbling. "Sample- Based Methods for Factored Task and Motion Planning". In: Robotics: Science and Systems (RSS). 2017. 172] Caelan Reed Garrett, Tomis Lozano-Perez, and Leslie Pack Kaelbling. "Sampling- based methods for factored task and motion planning". In: The International Journal of Robotics Research 37.13-14 (2018), pp. 1796-1825. DOI: 10.1177/ 0278364918802962. 173] Mamoun Gharbi, Raphael Lallement, and Rachid Alami. "Combining symbolic and geometric planning to synthesize human-aware plans: toward more efficient combined search." In: Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS). 2015. [74] Bill Goodwine and Joel Burdick. "Controllability of kinematic control systems on stratified configuration spaces". In: IEEE Transactions on Automatic Con- trol 46.3 (2001), pp. 358-368. [75] Fabien Gravot, Stephane Cambon, and Rachid Alami. "aSyMov: a planner that deals with intricate symbolic and geometric problems". In: Proceedings of the InternationalSymposium on Robotics Research (ISRR). 2005. [761 Peter Gregory et al. "Planning modulo theories: Extending the planning paradigm". In: Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS). 2012. [77] John Halton. "On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals". In: Numerische Mathematik 2.1 (1960), pp. 84-90. DOI: 10.1007/BF01386213. [781 William Hanna and Joseph Barbera. The Jetsons. 1962. [79] Peter Hart, Nils Nilsson, and Bertram Raphael. "A formal basis for the heuris- tic determination of minimum cost paths". In: IEEE Transactions on Systems Science and Cybernetics 4.2 (1968), pp. 100-107.

180] Patrik Haslum and Hector Geffner. "Admissible heuristics for optimal plan- ning". In: Proceedings of the InternationalConference on Artificial Intelligence Planning (AIPS). 2000. 181] Patrik Haslum et al. "Extending classical planning with state constraints: Heuristics and search for optimal planning". In: Journal of Artificial Intel- ligence Research 62 (2018), pp. 373-431. DOI: 10.1613/jair.1.11213.

148 182] Kris Hauser. "Minimum Constraint Displacement Motion Planning". In: Robotics: Science and Systems (RSS). 2013. 183] Kris Hauser. "The minimum constraint removal problem with three robotics applications". In: The International Journal of Robotics Research 33.1 (2014), pp. 5-17. 184] Kris Hauser and Jean-Claude Latombe. "Integrating task and PRM motion planning: Dealing with many infeasible motion planning queries". In: Work- shops at the International Conference on Automated Planning and Scheduling (ICAPS). 2009. [85] Kris Hauser and Jean-Claude Latombe. "Multi-modal motion planning in non- expansive spaces". In: The International Journal of Robotics Research 29 (7 2009), pp. 897-915. DOI: 10.1177/0278364909352098. [86] Kris Hauser and Victor Ng-Thow-Hing. "Randomized multi-modal motion planning for a humanoid robot manipulation task". In: The InternationalJour- nal of Robotics Research 30.6 (2011), pp. 678-698. 187] Joshua Haustein et al. Object Placement Planning and Optimizationfor Robot Manipulators. 2019. arXiv: 1907.02555 [cs. RO]. 188] Giray Havur et al. "Hybrid reasoning for geometric rearrangement of multiple movable objects on cluttered surfaces". In: Proceedings of the IEEE Interna- tional Conference on Robotics and Automation (ICRA). 2014. 189] Robert Hearn. "Games, puzzles, and computation". PhD thesis. Massachusetts Institute of Technology, June 2006.

[90] Malte Helmert. "The fast downward planning system". In: Journal of Artificial Intelligence Research 26 (2006), pp. 191-246. DOI: 10.1613/jair.1705. [91] Andreas Hertle et al. "Planning with Semantic Attachments: An Object-Oriented View." In: Proceedings of the European Conference on Artificial Intelligence (ECAI). 2012. 192] David Hilbert. "Mathematical problems". In: Bulletin of the American Math- ematical Society 8.10 (1902), pp. 437-479. DOI: 10.1090/S0002-9904-1902- 00923-3. 193] J6rg Hoffmann. "The Metric-FF Planning System: Translating "Ignoring Delete Lists" to Numeric State Variables". In: Journal of Artificial Intelligence Re- search 20 (2003), pp. 291-341.

194] J6rg Hoffmann and Bernhard Nebel. "The FF planning system: Fast plan gen- eration through heuristic search". In: Journal of Artificial Intelligence Research 14 (2001), pp. 253-302. 195] Frangois Robert Hogan and Alberto Rodriguez. Feedback control of the pusher- slider system: A story of hybrid and underactuated contact dynamics. 2016. arXiv: 1611.08268 [cs.R0].

149 1961 John Hopcroft, Jacob Schwartz, and Micha Sharir. "On the Complexity of Motion Planning for Multiple Independent Objects; PSPACE-Hardness of the "Warehouseman's Problem"". In: The International Journal of Robotics Re- search 3.4 (1984), pp. 76-88. DOI: 10.1177/027836498400300405. 197] David Hsu, Jean-Claude Latombe, and Hanna Kurniawati. "On the proba- bilistic foundations of probabilistic roadmap planning". In: The International Journal of Robotics Research 25.7 (2006), pp. 627-643.

198] David Hsu et al. "On finding narrow passages with probabilistic roadmap plan- ners". In: Proceedings of the International Workshop on the Algorithmic Foun- dations of Robotics (WAFR). 1998. 199] David Hsu et al. "The bridge test for sampling narrow passages with probabilis- tic roadmap planners". In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). 2003. 1100] Franc Ivankovic and Patrik Haslum. "Optimal planning with axioms". In: Pro- ceedings of the InternationalJoint Conference on Artificial Intelligence. 2015. 1101] Bronislaw Jakubczyk and Eduardo Sontag. "Controllability of nonlinear discrete- time systems: A Lie-algebraic approach". In: SIAM Journal on Control and Optimization 28.1 (1990), pp. 1-33. DOI: 10.1137/0328001. 1102] Lucas Janson, Brian Ichter, and Marco Pavone. "Deterministic sampling-based motion planning: Optimality, complexity, and performance". In: The Interna- tional Journal of Robotics Research 37.1 (2018), pp. 46-61. DOI: 10 . 1177/ 0278364917714338. [103] Lucas Janson et al. "Fast marching tree: A fast marching sampling-based method for optimal motion planning in many dimensions". In: The Interna- tional Journal of Robotics Research 34 (7 2015), pp. 883-921. DOI: 10.1177/ 0278364915577958. 1104] Heinrich Jung. "Uber die kleinste kugel die eine rsumliche figur einschliesst". PhD thesis. Universitst Marburg, 1899. 1105] Leslie Pack Kaelbling and Tomis Lozano-Perez. "Hierarchical task and motion planning in the now". In: Proceedings of the IEEE InternationalConference on Robotics and Automation (ICRA). 2011. 1106] Sertac Karaman and Emilio Frazzoli. "Sampling-based algorithms for optimal motion planning". In: The International Journal of Robotics Research 30.7 (2011), pp. 846-894.

1107] Sertac Karaman and Emilio Frazzoli. "Sampling-based optimal motion plan- ning for non-holonomic dynamical systems". In: Proceedings of the IEEE In- ternational Conference on Robotics and Automation (ICRA). 2013. 1108] Lars Karlsson et al. "Combining task and path planning for a humanoid two- arm robotic system". In: Proceedings of the ICAPS Workshop on Combining Task and Motion Planningfor Real-World Applications. 2012.

150 1109] Lydia Kavraki, Mihail Kolountzakis, and Jean-Claude Latombe. "Analysis of probabilistic roadmaps for path planning". In: IEEE Transactions on Robotics and Automation 14.1 (1998), pp. 166-171. [110] Lydia Kavraki and Steven LaValle. "Motion planning". In: Springer Handbook of Robotics (2008), pp. 109-131. 11111 Lydia Kavraki et al. "Probabilistic roadmaps for path planning in high-dimensional configuration spaces". In: IEEE Transactions on Robotics and Automation 12.4 (1996), pp. 566-580. 1112] Andrew Kimmel et al. "Fast, anytime motion planning for prehensile manip- ulation in clutter". In: Proceedigs of the IEEE International Conference on Humanoid Robots (ICHR). 2018. [113] Jennifer King and Siddhartha Srinivasa. RearrangementPlanning via Heuristic Search. 2016. arXiv: 1603.08642 [cs.RO]. 1114] Daniel Koditschek. "Robot assembly: Another source of nonholonornic control problems". In: Proceedings of the American Control Conference (ACC). 1991. [115] Sven Koenig and Maxim Likhachev. "D* lite". In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). 2002. [1161 Sven Koenig and Maxim Likhachev. "Incremental A*". In: Advances in Neural Information Processing Systems (NeurIPS). 2002. 1117] Sven Koenig, Maxim Likhachev, and David Furcy. "Lifelong planning A*". In: Artificial Intelligence 155.1-2 (2004), pp. 93-146. [118] Sven Koenig et al. "Incremental heuristic search in Al". In: Al Magazine 25.2 (2004), pp. 99-99. [119] Richard Korf. "Depth-first iterative-deepening: An optimal admissible tree search". In: Artificial Intelligence 27.1 (1985), pp. 97-109. 1120] Richard Korf. "Linear-space best-first search". In: Artificial Intelligence 62.1 (1993), pp. 41-78. [121] Richard Korf. "Real-time heuristic search". In: Artificial intelligence 42.2-3 (1990), pp. 189-211. [122] Athanasios Krontiris and Kostas Bekris. "Dealing with difficult instances of object rearrangement". In: Robotics: Science and Systems. 2015. 1123] Athanasios Krontiris and Kostas Bekris. "Efficiently solving general rearrange- ment tasks: A fast extension primitive for an incremental sampling-based plan- ner". In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). 2016. DOI: 10. 1109/ICRA. 2016.7487581. [124] Athanasios Krontiris et al. "Rearranging similar objects with a manipulator using pebble graphs". In: 2014 IEEE International Conference on Humanoid Robots (ICHR). 2014.

151 1125] James Kuffner and Steven LaValle. "RRT-connect: An efficient approach to single-query path planning". In: Proceedings of the IEEE International Con- ference on Robotics and Automation (ICRA). 2000.

[126] Miklós Laczkovich. "The removal of π from some undecidable problems involving elementary functions". In: Proceedings of the American Mathematical Society 131.7 (2003), pp. 2235-2240. DOI: 10.1090/S0002-9939-02-06753-9. [127] Gerardo Lafferriere, George Pappas, and Shankar Sastry. "Subanalytic stratifications and bisimulations". In: Proceedings of the International Workshop on Hybrid Systems: Computation and Control. 1998. [128] Fabien Lagriffoul and Benjamin Andres. "Combining task and motion planning: A culprit detection problem". In: The International Journal of Robotics Research 35.8 (2016), pp. 890-927. DOI: 10.1177/0278364915619022. [129] Fabien Lagriffoul et al. "Constraint propagation on interval bounds for dealing with geometric backtracking". In: Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS). 2012. [130] Fabien Lagriffoul et al. "Efficiently combining task and motion planning using geometric constraints". In: The International Journal of Robotics Research 33.14 (2014), pp. 1726-1747. DOI: 10.1177/0278364914545811. [131] Fabien Lagriffoul et al. "Platform-Independent Benchmarks for Task and Motion Planning". In: IEEE Robotics and Automation Letters 3.4 (2018), pp. 3765-3772. [132] Steven LaValle. Planning algorithms. Cambridge University Press, 2006. [133] Steven LaValle. Rapidly-exploring random trees: A new tool for path planning. Tech. rep. TR 98-11. Computer Science Department, Iowa State University, 1998.

[134] Steven LaValle and James Kuffner Jr. "Randomized kinodynamic planning". In: The International Journal of Robotics Research 20.5 (2001), pp. 378-400. [135] Steven LaValle and James Kuffner Jr. Rapidly-exploring random trees: Progress and prospects. 2000. [136] Jeongseok Lee et al. "DART: Dynamic Animation and Robotics Toolkit". In: The Journal of Open Source Software 3.22 (Feb. 2018), p. 500. DOI: 10.21105/joss.00500. [137] Vladimir Lifschitz. "Answer set planning". In: Proceedings of the International Conference on Logic Programming and Nonmonotonic Reasoning. 1999. [138] Stanislaw Lojasiewicz. Ensembles semi-analytiques. Tech. rep. Institut des Hautes Études Scientifiques, 1965. URL: https://perso.univ-rennes1.fr/michel.coste/Lojasiewicz.pdf. [139] Tomás Lozano-Pérez. "Automatic planning of manipulator transfer movements". In: IEEE Transactions on Systems, Man, and Cybernetics 11 (1981), pp. 681-698. URL: http://lis.csail.mit.edu/pubs/tlp/automatic-planning-transfer-motions.pdf.

[140] Tomás Lozano-Pérez. "The design of a mechanical assembly system". MA thesis. Massachusetts Institute of Technology, 1976. [141] Tomás Lozano-Pérez and Leslie Pack Kaelbling. "A constraint-based method for solving sequential manipulation planning problems". In: Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS). 2014. [142] Chris Malcolm and Tim Smithers. "Symbol grounding via a hybrid architecture in an autonomous assembly system". In: Robotics and Autonomous Systems 6.1-2 (1990), pp. 123-144.

1143] James Marble and Kostas Bekris. "Asymptotically Near-Optimal is Good Enough for Motion Planning". In: Proceedingsof the InternationalSymposium on Robotics Research. 2011. 1144] Bhaskara Marthi, Stuart Russell, and Jason Wolfe. "Angelic Hierarchical Plan- ning: Optimal and Online Algorithms". In: Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS). 2008. 1145] Bhaskara Marthi, Stuart Russell, and Jason Wolfe. Angelic hierarchicalplan- ning: Optimal and online algorithms (revised). Tech. rep. UCB/EECS-2009- 122. EECS Department, University of California, Berkeley, 2009. [146] Matthew Mason. "The mechanics of manipulation". In: Proceedings of the IEEE InternationalConference on Robotics and Automation (ICRA). 1985. [147] Yuri Vladimirovie Matiyasevie. "The Diophantineness of enumerable sets". Russian. In: Doklady Akademii Nauk. Vol. 191. 2. Russian Academy of Sci- ences. 1970, pp. 279-282. 1148] Peter W Michor. Topics in differential geometry. Vol. 93. American Mathe- matical Soc., 2008. 11491 Marvin Minsky. "Recursive Unsolvability of Post's Problem of "Tag" and Other Topics in Theory of Turing Machines". In: The Annals of Mathematics 74.3 (1961), pp. 437-455. 1150] Shokraneh Moghaddam and Ellips Masehian. "Planning Robot Navigation among Movable Obstacles (NAMO) through a Recursive Approach". In: Jour- nal of Intelligent and Robotic Systems 83 (2016), pp. 603-634. DOI: 10.1007/ s10846-016-0344-1. 11511 Mustafa Mukadam, Xinyan Yan, and Byron Boots. "Gaussian process motion planning". In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). 2016. [152] Tadashi Nagano. "Linear differential systems with singularities and an appli- cation to transitive Lie algebras". In: Journal of the Mathematical Society of Japan 18.4 (1966), pp. 398-404. 1153] Srinivas Nedunuri et al. "SMT-based synthesis of integrated task and motion plans from plan outlines". In: Proceedings of the IEEE InternationalConference on Robotics and Automation (ICRA). 2014.

153 1154] Allen Newell and Herbert Simon. "Computer Science As Empirical Inquiry: Symbols and Search". In: Communications of the ACM 19.3 (1976), pp. 113- 126. DOI: 10.1145/360018.360022. 1155] Dennis Nieuwenhuisen, A Frank van der Stappen, and Mark Overmars. "An effective framework for path planning amidst movable obstacles". In: Proceed- ings of the International Workshop on the Algorithmic Foundations of Robotics (WAFR). 2008. 1156] Dennis Nieuwenhuisen, A Frank van der Stappen, and Mark Overmars. "Path planning for pushing a disk using compliance". In: Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS). 2005. 1157] Nils Nilsson. Shakey the robot. Tech. rep. 323. Al Center, SRI International, 1984. 11581 Amit Kumar Pandey et al. "Towards planning human-robot interactive ma- nipulation tasks: Task dependent and human oriented autonomous selection of grasp and placement". In: Proceedings of the IEEE International Conference on Biomedical Robotics and Biomechatronics (BioRob). 2012. 1159] Erion Plaku. "Path planning with probabilistic roadmaps and co-safe linear temporal logic". In: Proceedings of the IEEE International Conference on In- telligent Robots and Systems (IROS). 2012. 11601 Erion Plaku and Gregory Hager. "Sampling-based motion and symbolic action planning with geometric and differential constraints". In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). 2010. [161] Ira Pohl. "Heuristic search viewed as path finding in a graph". In: Artificial intelligence 1.3-4 (1970), pp. 193-204. [162] John Reif. "Complexity of the mover's problem and generalizations". In: Pro- ceedings of the IEEE Annual Symposium on Foundations of Computer Science (SFCS). 1979. 1163] Daniel Richardson. "Some undecidable problems involving elementary func- tions of a real variable". In: The Journal of Symbolic Logic 33.4 (1969), pp. 514- 520. DOI: 10. 2307/2271358. 11641 Nicholas Roy. "Finding approximate POMDP solutions through belief com- pression". PhD thesis. Carnegie Mellon University, Aug. 2003.

1165] Stuart Russell. "Efficient memory-bounded search methods". In: Proceedings of the European Conference on Artificial Intelligence (ECAI). 1992.

1166] Earl Sacerdoti. "Planning in a hierarchy of abstraction spaces". In: Artificial Intelligence 5.2 (1974), pp. 115-135. DOI: 10. 1016/0004-3702(74) 90026-5. 1167] Walter Savitch. "Relationships between nondeterministic and deterministic tape complexities". In: Journal of computer and system sciences 4.2 (1970), pp. 177-192.

[168] John Schulman et al. "Motion planning with sequential convex optimization and convex collision checking". In: The International Journal of Robotics Research 33.9 (2014), pp. 1251-1270. [169] Jacob Schwartz and Micha Sharir. "Motion planning and related geometric algorithms in robotics". In: Proceedings of the International Congress of Mathematicians. 1986. [170] Jacob Schwartz and Micha Sharir. "On the "piano movers" problem. II. General techniques for computing topological properties of real algebraic manifolds". In: Advances in Applied Mathematics 4.3 (1983), pp. 298-351. [171] Abraham Seidenberg. "A new decision method for elementary algebra". In: Annals of Mathematics (1954), pp. 365-374. [172] Adi Shamir. "IP=PSPACE". In: Proceedings of the Annual Symposium on Foundations of Computer Science. 1990. [173] Rahul Shome and Kostas E. Bekris. Anytime Multi-arm Task and Motion Planning for Pick-and-Place of Individual Objects via Handoffs. 2019. arXiv: 1905.03179 [cs.RO]. [174] Rahul Shome et al. "dRRT*: Scalable and informed asymptotically-optimal multi-robot motion planning". In: Autonomous Robots (2019), pp. 1-25. DOI: 10.1007/s10514-019-09832-9.

[175] Rahul Shome et al. Fast, high-quality dual-arm rearrangement in synchronous, monotone tabletop setups. 2018. arXiv: 1810.12202 [cs.RO].

[176] Lavindra de Silva, Amit Kumar Pandey, and Rachid Alami. "An interface for interleaved symbolic-geometric planning and backtracking". In: Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS). 2013.

[177] Lavindra de Silva et al. Towards combining HTN planning and geometric task planning. 2013. arXiv: 1307.1482 [cs.AI].

[178] Thierry Siméon et al. "A General Manipulation Task Planner". In: Proceedings of the International Workshop on the Algorithmic Foundations of Robotics (WAFR). 2002.

[179] Thierry Siméon et al. "A general manipulation task planner". In: Proceedings of the International Workshop on the Algorithmic Foundations of Robotics (WAFR). 2004.

[180] Thierry Siméon et al. "Manipulation planning with probabilistic roadmaps". In: The International Journal of Robotics Research 23.7-8 (2004), pp. 729-746.

[181] Russell Smith et al. Open Dynamics Engine. 2001. URL: https://ode.org.

[182] Ilya Sobol. "On the distribution of points in a cube and the approximate evaluation of integrals". In: Zhurnal Vychislitel'noi Matematiki i Matematicheskoi Fiziki 7.4 (1967), pp. 784-802.

[183] David Speck et al. "Symbolic Planning with Axioms". In: Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS). 2019.

[184] Siddharth Srivastava et al. "Combined task and motion planning through an extensible planner-independent interface layer". In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). 2014.

[185] Siddharth Srivastava et al. "Using classical planners for tasks with continuous operators in robotics". In: Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS). 2013.

[186] Anthony Stentz et al. "The focussed D* algorithm for real-time replanning". In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI). 1995.

[187] Xiaoxun Sun, Sven Koenig, and William Yeoh. "Generalized Adaptive A*". In: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems. 2008.

[188] Héctor Sussmann. "Subanalytic sets and feedback control". In: Journal of Differential Equations 31.1 (1979), pp. 31-52.

[189] Alfred Tarski. A decision method for elementary algebra and geometry. Tech. rep. R-109. RAND Corp, 1948.

[190] Emanuel Todorov, Tom Erez, and Yuval Tassa. "MuJoCo: A physics engine for model-based control". In: Proceedings of the IEEE International Conference on Intelligent Robots and Systems (IROS). 2012.

[191] Marc Toussaint. "Logic-geometric programming: An optimization-based approach to combined task and motion planning". In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI). 2015.

[192] Marc Toussaint and Manuel Lopes. "Multi-Bound Tree Search for Logic-Geometric Programming in Cooperative Manipulation Domains". In: Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). 2017.

[193] Marc Toussaint et al. "Differentiable Physics and Stable Modes for Tool-Use and Manipulation Planning". In: Robotics: Science and Systems. 2018.

[194] Mauro Vallati et al. "The 2014 International Planning Competition: Progress and Trends". In: AI Magazine 36.3 (2015), pp. 90-98.

[195] Jur Van Den Berg et al. "ANA*: Anytime Nonparametric A*". In: Proceedings of the AAAI Conference on Artificial Intelligence (AAAI). 2011.

[196] Jur Van Den Berg et al. "Path planning among movable obstacles: a probabilistically complete approach". In: Proceedings of the International Workshop on the Algorithmic Foundations of Robotics (WAFR). 2008.

[197] William Vega-Brown and Nicholas Roy. "Asymptotically optimal planning under piecewise-analytic constraints". In: Proceedings of the International Workshop on the Algorithmic Foundations of Robotics (WAFR). 2016.

[198] Marilena Vendittelli, Jean-Paul Laumond, and Bud Mishra. "Decidability of robot manipulation planning: Three disks in the plane". In: Proceedings of the International Workshop on the Algorithmic Foundations of Robotics (WAFR). 2014. DOI: 10.1007/978-3-319-16595-0_37.

[199] Vincent Vidal. "YAHSP3 and YAHSP3-MT in the 8th International Planning Competition". In: Proceedings of the International Planning Competition (IPC). 2014.

[200] Yue Wang et al. "Task and Motion Policy Synthesis as Liveness Games". In: Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS). 2016.

[201] Jason Wolfe, Bhaskara Marthi, and Stuart Russell. "Combined Task and Motion Planning for Mobile Manipulation". In: Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS). 2010.

[202] Matt Zucker et al. "CHOMP: Covariant Hamiltonian optimization for motion planning". In: The International Journal of Robotics Research 32.9-10 (2013), pp. 1164-1193.
