SAMPLING-BASED MOTION PLANNING ALGORITHMS: ANALYSIS AND DEVELOPMENT

by NATHAN ALEXANDER WEDGE

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Dissertation Advisor: Dr. Michael S. Branicky

Department of Electrical Engineering and Computer Science
CASE WESTERN RESERVE UNIVERSITY

May, 2011

CASE WESTERN RESERVE UNIVERSITY

SCHOOL OF GRADUATE STUDIES

We hereby approve the dissertation of

Nathan Alexander Wedge
candidate for the Doctor of Philosophy degree∗.

Michael S. Branicky (committee chair)

M. Cenk Çavuşoğlu (committee member)

Wyatt S. Newman (committee member)

Soumya Ray (committee member)

March 25, 2011

∗We also certify that written approval has been obtained for any proprietary material contained herein.

Table of Contents

List of Tables

List of Figures

List of Algorithms

Acknowledgements

Abstract

1 Introduction
  1.1 Motion Planning
    1.1.1 Problem Characterizations
  1.2 Experimental Problems
  1.3 Contributions and Outline

2 Background
  2.1 Combinatorial Planning
  2.2 Sampling-based Planning Methodology
    2.2.1 Sampling and Randomization
    2.2.2 Distance and Metrics
    2.2.3 Local Planning
    2.2.4 Collision Detection
  2.3 Sampling-based Planning Algorithms
    2.3.1 Randomized Potential Fields
    2.3.2 Ariadne's Clew
    2.3.3 Single-query Bidirectional Lazy (SBL)
    2.3.4 Probabilistic Roadmap (PRM)
    2.3.5 Rapidly-exploring Random Tree (RRT)

3 Rapidly-exploring Random Tree Analysis
  3.1 Simplified Models
    3.1.1 One-dimensional Model
    3.1.2 Markov Chain Model
  3.2 Exponential and Power Decay Regimes
  3.3 Parameterization and Heuristics
    3.3.1 Step Size
    3.3.2 Extend versus Connect
  3.4 Implications for Problem Difficulty
  3.5 Summary

4 Distributions and Restarts
  4.1 Distributions to Runtimes and Restarts
  4.2 Continuous Problems
    4.2.1 Experiments
  4.3 Discrete Problems
    4.3.1 Planner Variations
    4.3.2 Experiments
  4.4 Generalizing Restarts
    4.4.1 General Queries
    4.4.2 Task-based Decomposition
    4.4.3 Algorithmic Measures
  4.5 Summary

5 Neighborhood-based Expansion
  5.1 Locally-isolated Expansion
  5.2 Path-length Annexed Random Tree (PART)
  5.3 Performance
    5.3.1 Demonstrative Problems
    5.3.2 Realistic Benchmarks
  5.4 Roadmaps and Paths
  5.5 Summary

6 Local Obstacle Adaptation
  6.1 Cost-to-come Thresholds
  6.2 Potentially-reachable Regions
  6.3 Adaptive PART (APART)
  6.4 Performance
    6.4.1 Demonstrative Problems
    6.4.2 Realistic Benchmarks
  6.5 Roadmaps and Paths
  6.6 Summary

7 Conclusion
  7.1 Future Work
    7.1.1 Extended Algorithm Models
    7.1.2 Informed Restart Strategies
    7.1.3 APART with Disconnected Components
    7.1.4 Balancing APART Exploration
    7.1.5 Collision Checking in APART Path Processing

A Derivations
  A.1 One-dimensional RRT Model
    A.1.1 Recurrence
    A.1.2 Distribution
    A.1.3 Approximations
      Geometric
      Negative Binomial
  A.2 Power Decay
  A.3 Constant Restart Intervals
    A.3.1 Mean
    A.3.2 Variance
    A.3.3 Usefulness
    A.3.4 Optimality
  A.4 Naïve Neighbor Scaling

B Implementation
  B.1 Algorithm Terminology and Notation
  B.2 Software

Bibliography

List of Tables

5.1 Maze (bidirectional) simulation results evaluating the PART planner.
5.2 Kinked tunnel (bidirectional) simulation results evaluating the PART planner.
5.3 Bug trap (unidirectional) simulation results evaluating the PART planner.
5.4 Flange 0.95 (unidirectional) simulation results evaluating the PART planner.
5.5 Alpha puzzle 1.1 (bidirectional) simulation results evaluating the PART planner.

6.1 Maze (bidirectional) simulation results evaluating the APART planner.
6.2 Kinked tunnel (bidirectional) simulation results evaluating the APART planner.
6.3 Bug trap (unidirectional) simulation results evaluating the APART planner.
6.4 Flange 0.95 (unidirectional) simulation results evaluating the APART planner.
6.5 Alpha puzzle 1.1 (bidirectional) simulation results evaluating the APART planner.
6.6 Mean local tree counts for the PART and APART planners.

List of Figures

1.1 An example mapping from work space to configuration space.
1.2 Two-dimensional experimental problems.
1.3 Six-dimensional experimental problems.

2.1 An example of combinatorial planning for a two-dimensional setting.
2.2 The top four layers of an oriented bounding box hierarchy.
2.3 A potential field function for two dimensions.
2.4 Path segment trees in the SBL planner.
2.5 Roadmaps generated by the PRM planner.
2.6 Non-uniform sampling distributions for the PRM planner.
2.7 Trees created by the RRT planner.
2.8 Voronoi decompositions for in-progress instances of the RRT planner.
2.9 Effective sampling regions from the DD-RRT planner.

3.1 Setup of the one-dimensional model of the RRT planner.
3.2 Edge effects in the (bidirectional) RRT planner.
3.3 Computed performance of the RRT planner in one dimension via Markov chain fundamental matrix.
3.4 Example discretized Markov chain models for the RRT planner.
3.5 Discretized Markov chain model for the RRT planner.

3.6 Dynamics of the RRT planner in the nine-state environment as modeled via Markov chain.
3.7 Diversely-sized minimum (Euclidean) Voronoi regions in different narrow passage situations.
3.8 Generalized Voronoi region volume changes with sampling.
3.9 Sequence of Voronoi visibility during the RRT planner's expansion on a tube.
3.10 Transition between power and exponential decay in the RRT planner.
3.11 Comparison of small and large step size in the RRT planner.
3.12 Densities of nodes created by the RRT planner with Manhattan metric at specific iterations.
3.13 Densities of nodes created by the RRT planner with Euclidean metric at specific iterations.
3.14 Densities of nodes created by the RRT planner with Chebyshev metric at specific iterations.
3.15 Average-case RRT instances over metric and step size.
3.16 Effects of step size and heuristic on overall RRT planner performance.
3.17 Collision checking performance of the RRT planner on a bug trap versus query states.
3.18 Node density in the RRT planner by performance.
3.19 "Trick" states for the RRT planner on realistic disassembly/assembly problems.
3.20 RRT planner growth bias due to faraway "trick" states.

4.1 Balanced universal restart strategy.
4.2 Restart statistical diagrams.
4.3 The usefulness of restarts on various versions of the kinked tunnel.
4.4 Restart benefits for two easier versions of the alpha puzzle.

4.5 Impact of nearest neighbor computation method on runtimes and restarts in the RRT planner.
4.6 Discrete example problems.
4.7 Performance and restart impact for RRT variants on exchange.
4.8 Performance and restart impact for RRT variants on 15-puzzle.
4.9 Constant and universal restart strategy performance of the RRT planner on the kinked tunnel with a random sequence of queries.
4.10 RRT planner instances solving tasks.
4.11 Runtime survivor functions for RRT planner solving tasks.
4.12 Voronoi visibility issues on the task-based portions of problems.
4.13 Runtime survivor functions for the RRT planner on the tube.
4.14 Cumulative measures of RRT planner coverage.

5.1 Metric and cost-to-go relationships.
5.2 Locally-isolated expansion.
5.3 Multiple selection in the PART planner.
5.4 PART planner branching.
5.5 Solution paths for the flange and the alpha puzzle.
5.6 Roadmaps generated by the PART planner.

6.1 PART planner collision checking performance by cost-to-come threshold.
6.2 Behavior of the RRT planner with an optimal metric.
6.3 APART planner comparative performance on the SE(3) benchmarks.
6.4 APART planner local trees on the SE(3) benchmarks.
6.5 Compared roadmaps from the PART and APART planners.
6.6 Densities of local trees created by the APART planner.
6.7 Solution path lengths from the PART and APART planners.

B.1 Design diagram for the simulation software.

List of Algorithms

2.1 The Single-query Bidirectional Lazy (SBL) planner.
2.2 The Single-query Bidirectional Lazy (SBL) planner's progressive collision detection.
2.3 The Probabilistic Roadmap Method (PRM) planner.
2.4 The Rapidly-exploring Random Tree (RRT) planner.

5.1 The Path-length Annexed Random Tree (PART) planner.
5.2 The Path-length Annexed Random Tree (PART) planner's branching and connection strategy.

6.1 The Adaptive Path-length Annexed Random Tree (APART) planner.
6.2 The Adaptive Path-length Annexed Random Tree (APART) planner's branching and connection strategy.
6.3 The Adaptive Path-length Annexed Random Tree (APART) planner's shrinking function.

Acknowledgements

In just over a decade at Case Western Reserve University, I have benefitted from the influence of numerous individuals, both personally and professionally. Chief among them is Dr. Michael S. Branicky, who has served as my advisor throughout all three of my degrees. The breadth and depth of his knowledge is immediately obvious, and as such, his guidance has been of unparalleled quality. Further, it has always been forthcoming; I have always felt prioritized regardless of his workload or situation. In my appreciation, it is exceptionally gratifying to depart Case with him helming the department in which I have studied.

The broader environment of Case has treated me well during my tenure as a student. Most recently, my dissertation committee (Drs. M. Cenk Çavuşoğlu, Wyatt S. Newman, and Soumya Ray) has been accommodating to my schedule and has provided beneficial input on my work. Before that, I had the singular opportunity to play a significant role in Case's Urban Challenge team, an experience which was both particularly enjoyable and indescribably valuable to my professional development. In general and over the course of that entire time, I have received superb instruction, generous financial support, and courteous assistance from the University. In particular, Beth Fuller Murray has been my first point of contact for any and all issues in my academic career; her glowingly positive attitude has made it a pleasure to have a problem or to file a form.

Beyond the University, I owe approximately four years of my study to the considerable assistance of the National Defense Science and Engineering Graduate Fellowship. The freedom it has provided has been an important determinant of the course of my study, from allowing me to participate in the Urban Challenge to shaping the beginnings of the research herein.

Finally, I would be remiss if I did not also acknowledge the contributions of my parents. They have provided invaluable support, from simple encouragements to practical advice and financial security. I cannot remember a time when they have not been fully available in every respect. For every positive moral, sensible thought, and academic curiosity they have instilled in me, they have my undying gratitude.

Sampling-based Motion Planning Algorithms: Analysis and Development

Abstract
by NATHAN ALEXANDER WEDGE

Robotic motion planning, which concerns the computation of paths and controls that drive an autonomous agent from one configuration to another, is quickly becoming a vitally important field of research as its applications diversify and become increasingly public. Many algorithms have been proposed to deal with this central problem; sampling-based approaches like the Rapidly-exploring Random Tree (RRT) and Probabilistic Roadmap Method (PRM) planners are among the most successful. Still, these algorithms are not fully understood and suffer from pathologically poorly-performing instances resulting from the contributions of random sampling and qualitative obstacle features like narrow passages. The large means and variances that result from these issues continue to motivate the development of new algorithms and adaptations to increase consistency and to allow more difficult problems to be solved.

This research examines these performance issues with a focus on the Rapidly-exploring Random Tree (RRT) planner. Fundamental analysis establishes that the interaction of its Voronoi bias with particular obstacle features can compromise its efficacy and illustrates the types of distributions on its performance that result. It further provides guidance on the types of problems amenable to solutions by the algorithm and on the use of its alternative extend and connect heuristics and step size parameter. Observations from this analysis prompt an investigation of the use of restart strategies to manage issues of both scaling in computation and exploratory missteps. In turn, their impact provides a foundation for the introduction of a novel algorithm, the Path-length Annexed Random Tree (PART) planner, that directs its exploration on a local basis. This algorithm and its environment-adaptive successor, the Adaptive PART (APART) planner, demonstrate competitive performance on instructive examples and dramatic improvements on difficult benchmarks, while also supplementing their utility with the output of a connected roadmap.

Chapter 1

Introduction

The study of robotics has evolved into an extremely broad field involving a variety of disciplines across engineering and science and encompassing research issues in topics such as design, sensing, learning, and control. Simultaneously, the increasing introduction of practical consumer robotic products in the hobbyist/entertainment and household maintenance segments, along with high-profile efforts like the DARPA Grand Challenges [63, 51, 55], is advancing the acceptance of and demand for more complex robots. One of the chief goals in this endeavor calls for robots to graduate from simply automated to fully autonomous. Among other components, the course of that goal requires robust and adaptive methods for various robotic systems to discover feasible trajectories through their environments, a process called motion planning.

1.1 Motion Planning

Generally, the motion planning problem is the problem of locating a path through an obstacle-laced environment that can be realized by an agent. For example, a child's maze is one instance of the motion planning problem. However, applications for motion planning reach far beyond this simple, two-dimensional setting to include agents with various motion capabilities, varying environments with intricately complex geometric constraints and higher dimension, and both discrete and continuous state. In fact, motion planning is occasionally referred to as the "piano mover's problem," which evokes the notion of a grand piano with legged supports being moved over small objects and through doorways using translational and rotational motions. Predictably, this is not the limit of the complexity of motion planning, as agents and their environments take on many interesting forms including the redundant rotational joints of the human arm and the directional constraints of the typical automobile. However, the primary goal of this research lies in building understanding of and creating novel strategies for the difficulties induced by obstacle constraints.

1.1.1 Problem Characterizations

A preliminary consideration in undertaking the problem of motion planning lies in identifying and managing the motion capabilities of the agent. In the most straightforward cases, the agent has an unrestrained ability to drive its state incrementally in any direction in its state space like the "agent" in a child's maze (i.e., the point drawn by the pencil) or an industrial arm. These and other types of agents that have differential constraints that are integrable are referred to as holonomic systems. Other agents with differential constraints that are not integrable have limitations on their motion in state space and are referred to as nonholonomic systems. Common examples include (the non-skidding motion models of) automobiles and other wheeled vehicles, for which movement is constrained by their turning radius and corresponding lack of direct, side-to-side motion. More complex motion models for automobiles and helicopters that include phenomena like speed-induced skidding and velocity/acceleration constraints are kinodynamic systems. Other cases in motion planning involve scenarios with moving obstacles that add time to the state space and coordinated planning with multiple robots that compound the state space by adding state variables for each agent, potentially leading to high-dimensional planning problems.
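For concreteness, a standard textbook example of a nonholonomic system (not one drawn from this dissertation's experiments) is the kinematic unicycle, or simple car, with state (x, y, θ), forward speed v, and turning rate ω:

ẋ = v cos θ,    ẏ = v sin θ,    θ̇ = ω,    subject to the rolling constraint ẋ sin θ − ẏ cos θ = 0.

The constraint is not integrable, i.e., it cannot be rewritten as a restriction on the reachable states themselves, so the vehicle can attain any pose but cannot translate directly sideways; this is exactly the kind of motion limitation described above.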


Figure 1.1: An example mapping from work space to configuration space. The example agent in (a) is a two-link robotic arm operating in a work space with a six-rung ladder. Obstacles in the configuration space in (b) arise from the intersections between the robot and the boundary (red), and the base and arm links and the rungs (green and blue, respectively). The result in (c) is a configuration space with complex obstacle shapes.

In addition to the issues arising from the motion and dynamics of an agent, the geometry of both agent and environment is of central importance to locating collision-free paths. The notion of a configuration space [41] abstracts away these geometric issues, allowing any instance to be viewed as a problem involving a point agent in a transformed environment. In this configuration space, the agent is represented by a point at its current state (e.g., its spatial coordinates, rotational position, joint angles, etc.), and the obstacles are "swelled" (or otherwise reshaped, depending on the topology of the space) to block any state for which a part of the agent intersects an obstacle. Since agents may have a range of state variables that transform their location and geometry, the mapping from work space to configuration space, as in the example in Figure 1.1, is frequently not straightforward. However, the configuration space abstraction allows the construction of general algorithms that can plan motions for a variety of agents and environments, independent of their geometry.

The general motion planning problem is defined in terms of an agent, its work space, and the corresponding configuration space. Thus, given an agent/environment pair (i.e., a configuration space C), a motion planner must locate a path γ, which provides a continuous mapping (γ : [0, 1] → C_free) into the obstacle-free portion of the configuration space that connects the initial and final query states x_i and x_f (i.e., γ(0) = x_i and γ(1) = x_f). Alternatively, the motion planner must (properly) assert that such a path does not exist. This basic definition of the motion planning problem, formalized in Problem 1.1, allows the specification of general algorithms designed to solve arbitrary (but compliant) problems.

Problem 1.1 (Motion Planning) Find a realizable path from an initial state, through an obstacle environment, and to a final state.

Given:
- State space S that defines the position and pose of the agent. This may include, but is not limited to, joint angles (for robotic arms), translational and rotational positions (for freely-moving parts), and velocity components (for vehicles).
- Transition model ẋ = f(x, u) that captures how the agent moves incrementally in its state space from a state x with an input u.
- Geometric representations A and O that model the agent and any obstacles in its work space (e.g., 3D triangular polygonal meshes).
- Configuration space C that describes the set of states that may be occupied by the agent, such that the models A and O do not intersect.

- Query states x_i and x_f that specify the initial and final states (in the free portion of the configuration space) between which the agent should transition.

Provide:
- Path γ that lists a continuous sequence of states and/or inputs that steer the agent from initial to final state; a planner may alternately assert that the query is unsolvable.
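As a concrete (and deliberately simplified) reading of Problem 1.1, the sketch below bundles the listed inputs into a single query object for a holonomic agent; the class and field names are illustrative assumptions for this sketch and do not reflect the notation or software used elsewhere in this work.

from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

State = Tuple[float, ...]                      # a point in the state space S

@dataclass
class MotionPlanningQuery:
    """Inputs of Problem 1.1, restricted to a holonomic agent (illustrative only)."""
    bounds: Sequence[Tuple[float, float]]      # per-dimension limits of S
    is_in_collision: Callable[[State], bool]   # stands in for A, O, and the resulting C
    x_i: State                                 # initial query state
    x_f: State                                 # final query state

    def is_valid(self) -> bool:
        # Both query states must lie in the free portion of the configuration space.
        return not (self.is_in_collision(self.x_i) or self.is_in_collision(self.x_f))

# A planner consumes a query and provides a discretized path γ (a list of states),
# or None to assert that it could not solve the query.
Planner = Callable[[MotionPlanningQuery], Optional[List[State]]]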

Several aspects of the motion planning problem present a particular challenge. First, the dimension of the state space S, and therefore of the configuration space C, is essentially unconstrained. While agents operate in the physical world, and therefore, in two, three, or four dimensions (e.g., if restricted to the Earth's surface or dependent on time), dimensions can always be added to the state spaces of those agents. For example, a revision to a robotic arm could add an additional joint and a corresponding dimension, or the addition of a second mobile robot requiring coordination with a first could double the effective dimension. Unfortunately, volume increases exponentially with dimension, rapidly increasing the amount of potentially-explorable space (and leading to the so-called "curse of dimensionality"). Second, the dynamics of the agent itself may represent a complex control problem. If this is the case, obstacle-free paths in the configuration space may not be achievable by the agent, requiring further search. Third, obstacle constraints can be arbitrarily tight. For example, planning the motion of a bolt being inserted into a corresponding pre-tapped hole is likely to have negligible clearance compared to its own diameter. The correspondingly small width and volume of the configuration space passage can make it difficult to locate. This research is primarily motivated by the final difficulty of locating and negotiating such narrow passages in the configuration space.

1.2 Experimental Problems

Several problems are used repeatedly to illustrate and to benchmark the motion planning methods discussed in this work. The first three, pictured in Figure 1.2, are straightforward, two-dimensional problems that can be solved quickly with modern algorithms and hardware. Each one is defined for a holonomic point robot with a state space of [0, 1]^2 (the unit square). The first, the maze, requires the robot to navigate a weakly-constrained but circuitous route between query states at the start and at the end (e.g., x_i = (0.125, 0.875) and x_f = (0.875, 0.875)). The second, the kinked tunnel, involves crossing an extended narrow passage of width 2^−6 in order to connect query states on opposite sides (e.g., x_i = (0.125, 0.125) and x_f = (0.875, 0.875)). The third, the bug trap, has only a short narrow passage at its mouth (of opening width 2^−6), but an initial state inside the trap (e.g., x_i = (0.5, 0.363)) also allows the robot to explore regions in the lobes that surround that mouth before escaping toward a final state on the outside (e.g., x_f = (0.125, 0.5)). The remaining two, shown in Figure 1.3, are more complex and involve freely-moving objects that can translate and rotate in three-dimensional space (leading to state spaces of six dimensions). The first, the flange, is an assembly problem that calls for the insertion of a curved pipe into a socket; its translational state space can be arbitrarily large, but the experiments herein define it as a cylinder of radius 1.5 and height 3.0 centered at (0.0, 0.0, 0.5) and measured relative to the center of the lipped top of the pipe. The second, the alpha puzzle, is a disassembly problem that requires an articulated part to be extricated from another, fixed copy. Like the flange, its translational state space can be arbitrarily large, but the experiments herein use a sphere of radius 72.0 centered at (0.0, 0.0, 0.0) and measured relative to the center of the loop of the agent. Both problems use quaternions [37] to parameterize the rotational component. These problems provide a range of examples on which to quantify the behavior and performance of motion planning methods.
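Purely for reference, the queries and bounds just described can be collected into a small table; the sketch below does so in Python, with key names and layout chosen for this illustration rather than taken from the simulation software of Appendix B.

# Query states and state-space bounds of the experimental problems (illustrative layout).
EXPERIMENTAL_PROBLEMS = {
    "maze":          {"dim": 2, "x_i": (0.125, 0.875), "x_f": (0.875, 0.875)},
    "kinked_tunnel": {"dim": 2, "x_i": (0.125, 0.125), "x_f": (0.875, 0.875),
                      "passage_width": 2 ** -6},
    "bug_trap":      {"dim": 2, "x_i": (0.5, 0.363), "x_f": (0.125, 0.5),
                      "passage_width": 2 ** -6},
    # SE(3) problems: 3D translation (bounded as noted) plus a unit-quaternion rotation.
    "flange":        {"dim": 6, "translation": "cylinder, radius 1.5, height 3.0, center (0.0, 0.0, 0.5)"},
    "alpha_puzzle":  {"dim": 6, "translation": "sphere, radius 72.0, center (0.0, 0.0, 0.0)"},
}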

1.3 Contributions and Outline

This research explores the issues of robustness and performance with respect to sampling-based motion planning algorithms in qualitatively varied environments. Detailed and practical analysis is presented to illustrate the expansion behavior and resulting performance issues observed in the Rapidly-exploring Random Tree (RRT) planner. This, in turn, provides guidance on the various features of the algorithm and motivates an examination of probabilistic performance-improving techniques and the development of a new algorithm intended to address problems of particular difficulty. Finally, an environment-adaptive mechanism supplements this algorithm to conclude the presentation.

Figure 1.2: Two-dimensional experimental problems. Configuration spaces for the maze in (a), the kinked tunnel in (b), and the bug trap in (c), admit solution paths like the ones pictured to solve the example queries shown.

Figure 1.3: Six-dimensional experimental problems. On the flange in (a), the agent (red) must be inserted into the fixed socket (black). On the alpha puzzle in (b), the agent (red) must be removed from the fixed part (black). Both agents are pictured in their initial states.

Chapter 1 introduces the motion planning problem and the fundamental, underlying idea of navigating the multi-dimensional configuration space. The primary challenges of the problem are specified, including the need to locate and navigate narrow passages, which is established as the motivation for this research. Experimental problems used throughout are detailed, and the contributions and layout of the work are outlined.

Chapter 2 overviews established tools and procedures used in solving the motion planning problem. The two main approaches, combinatorial and sampling-based, are defined and examples of both are provided. For the latter, a detailed description of the components used to implement its algorithms provides the framework for an explicit specification of several of its most popular algorithms. An examination of their weaknesses supplements the motivation for this research.

Chapter 3 presents a model-based analysis of the expansion behavior of the RRT planner and examines the consequences of the step size parameter and the extend and connect heuristics. Simple models and experiments also outline the qualitative characteristics of the distributions of computation required to solve individual problems. These evaluations supply information about the classes of problems that are amenable to the RRT planner.

Chapter 4 investigates the applicability of probabilistic restart techniques to the performance distributions that arise from the RRT planner in various types of problems. This includes an examination of the relevant computational issues (e.g., nearest neighbor, collision detection, etc.) in both continuous and discrete contexts. Experiments assert a general link between problem difficulty and the potential of these restart-based methods. Related material has been previously published [64].

Chapter 5 introduces the idea of localized instances of the RRT planner and leverages it in the design of a new algorithm called the Path-length Annexed Random Tree (PART) planner. This algorithm excels on difficult problem instances while also providing the utility of a roadmap output. Experiments demonstrate its performance advantage over other planning algorithms. Related material is to be published [65].

Chapter 6 expands the PART planner with an adaptive method of determining appropriate thresholds for its localized exploration and introduces the resulting Adaptive PART (APART) planner. Further experiments study the performance impact of a range of constant thresholds and reveal that the adaptive method performs competitively with the best constant threshold across various problems.

Chapter 7 concludes the presentation with a summary of the work and an outline of possible future research directions.

Appendix A provides complete derivations for the introduced equations and models. Appendix B summarizes the software structure and hardware used for experiments.

The general impetus for this research is the need for flexible and well-performing algorithms for continuous motion planning applications, and as such, its focus is a proven sampling-based motion planning algorithm: the Rapidly-exploring Random Tree planner. This makes a presentation in the context of single-query algorithms natural; however, the introduced algorithms also share a multiple-query element. These hybrid algorithms combine strengths of both approaches and broaden the potential applicability of this work. Nevertheless, comparisons to and conclusions on other sampling-based algorithms are included throughout.

Chapter 2

Background

Myriad solutions to the problem of motion planning exist, falling into one of two general categories: combinatorial and sampling-based. The former explicitly uses delimiting features of the configuration space (e.g., corners) in order to capture its connectivity, while the latter implicitly explores it via the influence of samples from the state space. Modern research (including this work) chiefly pursues sampling-based approaches, many of which share a small set of fundamental building blocks that draw on other fields of research (e.g., sampling from statistical methods and collision detection from computational geometry). Their combined straightforwardness and power has led to wide adoption of sampling-based planning algorithms such as the Probabilistic Roadmap Method (PRM) and Rapidly-exploring Random Tree (RRT) planners. However, these algorithms also have recognized weaknesses that preclude them from generating solutions to some problems in reasonable time.

2.1 Combinatorial Planning

Early efforts at addressing the motion planning problem took the direct approach of quantifying the full connectivity of the configuration space. Such algorithms generally capture this connectivity by building a roadmap, which is a graph to which arbitrary (reachable) locations in the configuration space can be linked easily. As a result, answering a particular query is a matter of locating appropriate nodes in the roadmap to which the query states can be linked, followed by a discrete search over the roadmap to establish a path between those nodes. Generally, combinatorial planning algorithms produce a roadmap with comprehensive coverage, which implies that they are complete (i.e., they either return a feasible path or correctly declare that one does not exist). However, they frequently also suffer from practical issues like high complexity of implementation and poor theoretical performance.

The weaknesses of combinatorial planning can arise from the use of various cell decomposition strategies to break down the problem into simple segments for which connectivity can be easily ascertained. One straightforward such decomposition is triangulation, which breaks a two-dimensional environment containing polygonal obstacles into simply-connected triangular regions. An example is shown in Figure 2.1. However, a decomposition scheme that can properly characterize the nonlinear effects of rotational components present in many problems is more complex. That process, known as cylindrical algebraic decomposition [16], has complexity that is doubly exponential in dimension (i.e., O(e^(e^d))) but is capable of representing the general motion planning problem [44]. Fortunately, algorithms in this class are not necessarily limited to such a performance bound; Canny's algorithm [13] circumvents an explicit decomposition to improve its theoretical performance to singly exponential in dimension (i.e., O(e^d)). While combinatorial planning algorithms are of theoretical use in analyzing the difficulty of the motion planning problem and of practical use in specific scenarios, their weaknesses have led more recent research toward a qualitatively different approach.


Figure 2.1: An example of combinatorial planning for a two-dimensional setting. The decomposition (triangulation) in (a) partitions the obstacle-free configuration space into simply-connected cells. These cells, in turn, can form a roadmap, like the one in (b) that fully covers the space by connecting the midpoints of shared cell edges.

2.2 Sampling-based Planning Methodology

Unlike combinatorial motion planning, which focuses on explicit representations of the configuration space, sampling-based methods focus on individual states within the space. In the pursuit of widely-applicable algorithms, sampling-based planning draws from a set of tools discussed in the following sections that can be broadly applied to varied problems. Their use has resulted in algorithms that can solve dissimilar problems by leveraging the appropriate choice of established tools for the agent or environment. For example, different agents may call for different methods of measuring distance; Euclidean distance is appropriate for planning a path through a single-story building floor plan, but the lack of sideways motion in a nonholonomic car calls for a different measure. These tools, which encompass sampling schemes, metrics, local planning, and collision detection, are generally addressed as simple black boxes by sampling-based planning algorithms.

2.2.1 Sampling and Randomization

Sampling techniques, which generate a sequence of states drawn from a distribution on the state space, are central to determining and guiding the exploration of sampling-based planners. These techniques are divided into deterministic sequences and random distributions, with the latter seeing more common use in motion planning. Deterministic sequences supply an ordered set of samples across a space that have predictable values. Established sequences, such as the van der Corput sequence and two of its generalizations to higher dimensions, Halton and Hammersley points, have been used in sampling-based planning algorithms [11]. In such cases, a primary advantage of deterministic sequences lies in their low discrepancy; they are constructed to avoid leaving large, unoccupied gaps, which can translate to guarantees on the resolution of coverage of the sampling scheme. Random distributions, while not able to provide such guarantees, have still seen significant success in sampling-based planning.

Randomization provides a convenient method of generating samples with known probabilistic characteristics over a configuration space. In spite of its unpredictability compared to deterministic sequences, randomization has a number of attractive qualities. The theory and study of probability is deep, so many distributions useful in motion planning predate its study and have been analyzed in detail. For example, a method of generating uniform random quaternions (points on a four-dimensional sphere), which parameterize the rotations of three-dimensional rigid bodies, has been available since 1956 [54], while attention to the problem of robotic motion planning originated in 1979 [59]. Further, common distributions like the Gaussian are available to naturally and probabilistically focus sampling in particular regions. Finally, randomization has been used to great advantage for other problems, including sorting [30] and volume computation [18]. The usefulness and flexibility of the concept have influenced many well-known sampling-based planning algorithms to include random sampling.
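To make the deterministic sequences mentioned above concrete, the sketch below generates van der Corput and Halton points; it is a generic low-discrepancy sampler written for illustration and is not taken from this dissertation's implementation.

def van_der_corput(i: int, base: int = 2) -> float:
    """The i-th element of the van der Corput sequence: reverse the base-b digits of i."""
    value, denominator = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denominator *= base
        value += digit / denominator
    return value

def halton(i: int, bases=(2, 3)) -> tuple:
    """The i-th Halton point: one van der Corput coordinate per (pairwise coprime) base."""
    return tuple(van_der_corput(i, base) for base in bases)

# The first few 2D Halton points already spread across the unit square without large gaps.
samples = [halton(i) for i in range(1, 9)]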

2.2.2 Distance and Metrics

Because sampling-based planning algorithms function by introducing a discretization on the configuration space that links sampled states, a measure of distance is important to providing some prudence to the choices of these links. The concept of a metric generalizes the traditional concept of distance to the varied spaces that arise (for one) in motion planning. As such, a metric ρ provides a mapping from two elements of the configuration space into the real numbers (i.e., ρ : C × C → R). They retain a practical resemblance to ordinary distance by obeying four important properties: nonnegativity, reflexivity, symmetry, and triangularity. Nonnegativity guarantees that the resulting values are zero or positive (i.e., 0 ≤ ρ(x_1, x_2)), while reflexivity defines that the metric can only return zero if the arguments coincide (i.e., ρ(x_1, x_2) = 0 ⇐⇒ x_1 = x_2). Symmetry makes a metric reversible with respect to its arguments (i.e., ρ(x_1, x_2) = ρ(x_2, x_1)). Finally, triangularity reflects the notion that the distance between two points is no larger than the same distance through an intermediate point (i.e., ρ(x_1, x_3) ≤ ρ(x_1, x_2) + ρ(x_2, x_3)). These properties allow for the construction of metrics that measure distance in an intuitive way in motion planning problems.

A variety of metrics are in common use in motion planning. Many, including the traditional Euclidean distance, are members of the L_p family of metrics, which provide general distance measures for n-dimensional spaces. These metrics, which are defined as ρ(x_1, x_2) = (Σ_{d=1}^{n} |x_1[d] − x_2[d]|^p)^{1/p} (where x[d] indexes the dth component of x), include the Manhattan (for p = 1), the Euclidean (for p = 2), and the Chebyshev or max-norm (in the limit as p → ∞). Another simple expression provides a usable metric for three-dimensional rotational states (quaternions) with ρ(q_1, q_2) = 1 − |q_1 · q_2| [37]. Further, existing metrics can be combined via (positively) weighted sum to provide metrics over spaces composed of the concatenation of simpler ones. For example, the composition of the Euclidean and the aforementioned quaternion metrics given by Equation (2.1) is a valid metric for three-dimensional rigid bodies. Other works have explored the construction of metrics for automobiles [42] and other human-operated vehicles [21]. In these and other cases, metrics measure (or approximate) the cost of steering an agent between two states, informing the decision of which possible motions are explored in the process of a given algorithm.

ρ({x_1, q_1}, {x_2, q_2}) = c_t ρ_euclidean(x_1, x_2) + c_r ρ_quaternion(q_1, q_2)    (2.1)
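A direct transcription of these metrics, assuming states stored as flat tuples and rotations as unit quaternions, could look like the following sketch; the function names and the weights c_t and c_r are illustrative, and no claim is made that this mirrors the implementation used for the experiments in this work.

def lp_metric(x1, x2, p=2.0):
    """L_p distance between n-dimensional states (p = 1 Manhattan, p = 2 Euclidean)."""
    return sum(abs(a - b) ** p for a, b in zip(x1, x2)) ** (1.0 / p)

def chebyshev_metric(x1, x2):
    """Limit of the L_p family as p → ∞ (max-norm)."""
    return max(abs(a - b) for a, b in zip(x1, x2))

def quaternion_metric(q1, q2):
    """ρ(q1, q2) = 1 − |q1 · q2| for unit quaternions; zero exactly when the rotations coincide."""
    return 1.0 - abs(sum(a * b for a, b in zip(q1, q2)))

def rigid_body_metric(s1, s2, c_t=1.0, c_r=1.0):
    """Weighted composition of Equation (2.1); each state s is a (translation, quaternion) pair."""
    (x1, q1), (x2, q2) = s1, s2
    return c_t * lp_metric(x1, x2, p=2.0) + c_r * quaternion_metric(q1, q2)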

2.2.3 Local Planning

Another central component of sampling-based planning is the ability to steer the agent from one state to another. This local planning is often as simple as interpolation for holonomic agents. Realizable paths between states for robotic arms can be produced via linear interpolation (i.e., the intermediate states are specified as x(τ) = (1 − τ)x_start + τx_end for τ ∈ [0, 1]). Rigid bodies in three dimensions can be translated via the same linear interpolation while being rotated via spherical linear interpolation, or SLERP (i.e., q(τ) = (sin((1 − τ) cos⁻¹|λ|)q_start + sgn(λ) sin(τ cos⁻¹|λ|)q_end)/sin(cos⁻¹|λ|) for λ = q_start · q_end and τ ∈ [0, 1]) [37]. It is also possible to perform the rotational part of the motion before, during, or after the translation; this has been suggested to minimize the volume swept out by the agent in the process [3]. Motions of this kind along only some (or a single one) of the dimensions can be used as a simplifying mechanism. Local planning for holonomic systems tends to be straightforward; however, nonholonomic systems will generally require more complex methods.

Control methods for nonholonomic systems analogous to those required for local planning have been studied extensively outside the field of motion planning. In some cases, including specific mathematical models of automobiles [42, 58], there are explicit methods that yield the appropriate control inputs needed to steer between two configurations. Other nonholonomic but simple robots, such as ones equipped with differential drives allowing them to rotate in place, have obvious steering strategies (e.g., rotate toward the target, drive, rotate to the target's heading). While likely suboptimal with respect to time cost, they may be easily constructed by human users. For more general classes of human-operated (or similar) vehicles, control strategies can be computed offline using learning methods (namely, value iteration) [21]. Fortunately, not all sampling-based planning algorithms necessarily require absolute precision in steering strategies. These methods are generally incremental and focus on making partial progress toward a target rather than reaching it. As such, approaches that discretize the control variables and apply them over a short interval of time [17] have been successful as local planners. Still, the steering complexity of nonholonomic systems makes local planning an important consideration in the design of maximally general algorithms in sampling-based motion planning.
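A minimal local planner for a holonomic rigid body, following the interpolation expressions above, is sketched below; it assumes unit quaternions and uniformly spaced parameter values, and it is an illustration rather than the local planner used in this work.

import math

def lerp(x_start, x_end, tau):
    """Linear interpolation of a translational state at parameter τ ∈ [0, 1]."""
    return tuple((1.0 - tau) * a + tau * b for a, b in zip(x_start, x_end))

def slerp(q_start, q_end, tau):
    """Spherical linear interpolation between unit quaternions, with λ = q_start · q_end."""
    lam = sum(a * b for a, b in zip(q_start, q_end))
    theta = math.acos(min(1.0, abs(lam)))          # cos⁻¹|λ|
    if theta < 1e-9:                               # nearly identical rotations
        return q_start
    w_start = math.sin((1.0 - tau) * theta) / math.sin(theta)
    w_end = math.copysign(1.0, lam) * math.sin(tau * theta) / math.sin(theta)
    return tuple(w_start * a + w_end * b for a, b in zip(q_start, q_end))

def local_plan(s_start, s_end, steps=16):
    """Discretized straight-line motion for a rigid-body state (translation, quaternion)."""
    (x0, q0), (x1, q1) = s_start, s_end
    return [(lerp(x0, x1, k / steps), slerp(q0, q1, k / steps)) for k in range(steps + 1)]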

2.2.4 Collision Detection

The final typical component of sampling-based planning algorithms, collision detection, is fundamental to exploring in compliance with the obstacle constraints of the environment. Depending on the algorithm, this may reasonably require the validation of a specific state (i.e., checking the agent's position against obstacles in a particular state) or of a path segment (i.e., checking the agent's positions against obstacles over a continuous range of states). Real-world problems (or more specifically, environments and agents) in planning are normally represented using three-dimensional models composed of triangular polygon meshes, so collision detection methods must be designed around this parameterization. However, this allows for both simple, brute force methods and more complex solutions to both types of collision detection problems.

Figure 2.2: The top four layers of an oriented bounding box hierarchy. Each oriented bounding box (OBB) encloses a set of triangles from the object model, beginning with a root box that encloses the entire model and continuing down hierarchically to boxes that enclose individual triangles.

Collision detection for a specific state has a straightforward brute force solution, but more complicated methods are normally employed. A brute force approach would simply perform an intersection calculation between all pairs of triangles in the agent and environment models (and, if necessary, between all pairs of triangles in different agents). Unfortunately, the models are frequently complex (in order to reasonably represent real-world objects) and many collision detection calculations must generally be performed in order to solve a problem, so this brute force method is often computationally intractable. Instead, various hierarchical bounding methods are available. These methods enclose a model with a simple shape (e.g., an oriented bounding box [29]), split the model in half, and continue the process recursively. An example of the coarsest levels of this process is shown in Figure 2.2 on the alpha puzzle model. Because the bounding shapes have simple intersection calculations and the depths of the hierarchies are of the order of the logarithm of the model's triangle count, the performance of these methods is significantly better than a brute force method. This is also advantageous because a simple and approximate method for path segment collision detection can be built out of a sequence of this state collision detection. Like state collision detection, path segment collision detection has two important

varieties: naïve, approximate methods and more complex, accurate methods. The former assumes that it is sufficient to check a suitably densely-spaced set of individual states along the path segment in order to validate it. The actual sequence of these states can be chosen in a variety of ways, including both depth-first and breadth-first search in subdivisions of the path segment to a specified resolution limit. Each of these traditional search methods has attractive properties. Depth-first search checks the states in order along the path segment, so the first one found to be in collision yields the relevant collision-free portion of the path segment and the closest obstacle edge with no wasted work. This is well-suited to incremental planning methods. Breadth-first search, on the other hand, is well-matched with planning methods that focus on connecting states end-to-end because it replicates the van der Corput sequence and checks states in order with increasing resolution (i.e., {0, 1, 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8, ...}) [44]. This maximizes the likelihood of locating obstacles placed at unknown positions within the path segment. These techniques require only one of the aforementioned Boolean-returning state-collision-checking modules; comprehensive methods require information beyond that level. The computational consequences of obtaining that information (e.g., computing the closest separation distance between complex models) have generally motivated the use of naïve methods in sampling-based planning.

Exact methods of collision checking a path segment must compare the separation between models over their travel along a path segment or ensure that the volume swept out by moving agents does not intersect obstacles. Bounding hierarchies (e.g., ones using rectangle swept spheres [41], which surround an object with a box with rounded edges) can compute the closest separation between models in a computationally tractable way. Pairing this information with bounds on the displacement of points on the model is sufficient to compute whether a path is collision-free. General methods that adaptively compute this information across a path segment to validate it against collisions (e.g., [61]) are available. Other methods based on incrementally tracking proximate features [15] and on composing swept volumes and convex enclosures thereof [20] have also been presented. Still, these methods are computationally expensive and dependent on the particular motion of the agent compared to the simplicity of the naïve methods, so the latter remain viable in sampling-based planning.
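The naïve, resolution-limited path check described above can be sketched as follows; the breadth-first (van der Corput) ordering and the resolution limit follow the text, while the function names and the state-level checker passed in are assumptions of this sketch.

def check_fractions(levels):
    """Yield 0, 1, 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8, ... down to 2^(-levels) spacing."""
    yield 0.0
    yield 1.0
    for level in range(1, levels + 1):
        step = 1.0 / (2 ** level)
        for numerator in range(1, 2 ** level, 2):   # odd numerators: states new to this level
            yield numerator * step

def segment_in_collision(x_a, x_b, is_in_collision, interpolate, levels=6):
    """Approximately validate the segment from x_a to x_b against obstacles.

    is_in_collision(x) is a Boolean state-level collision check;
    interpolate(x_a, x_b, tau) returns the intermediate state at parameter tau.
    """
    for tau in check_fractions(levels):
        if is_in_collision(interpolate(x_a, x_b, tau)):
            return True                              # collision found at this resolution
    return False                                     # clear down to the resolution limit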

2.3 Sampling-based Planning Algorithms

Leveraging the power of just a few basic ingredients (sampling, metrics, local planning, and collision detection), many algorithms in the sampling-based planning family have been developed. While these algorithms have largely supplanted earlier work in combinatorial motion planning, they differ with respect to a critical property: completeness. While combinatorial algorithms pursue exact solutions to the motion planning problem, sampling-based algorithms settle for lesser notions of completeness. These weaker guarantees on the solution take two common forms: resolution completeness and probabilistic completeness. The former, resolution completeness, specifies that an algorithm will return a viable solution if one exists at a given resolution. By virtue of the deterministic sampling sequences that drive them, the resolution of coverage of algorithms with this property increases over time, implying that they will also return a viable solution at some point in the future (if one exists). The latter, probabilistic completeness, is used when the sampling sequence is random. In lieu of guarantees on finding solutions at increasing resolution over time, probabilistic completeness guarantees that the probability of locating a solution increases over time with a limit of one (again assuming one exists). Both notions rely on their sampling being dense (i.e., drawing a sample within an arbitrarily small radius of a given point with sufficient time) in order to locate and sample within arbitrarily narrow passages in the configuration space. In addition to these new standards of completeness, sampling-based planning algorithms vary in their approach to solving the motion planning problem.

Two major approaches reconcile the significant motivation for sampling-based planning (quickly solving difficult motion planning queries) with limited computational resources. Each approach (multiple-query and single-query) takes a different tack in answering motion planning queries (pairs of initial and final states to be connected by solution paths). Multiple-query algorithms have much in common with combinatorial methods; they assume resources are available offline and in advance of the actual requests to solve the problem. As such, they perform a preprocessing phase intended to capture the connectivity of the configuration space. Presuming that this phase is reasonably comprehensive in ascertaining this connectivity, individual motion planning queries are straightforward and need only be connected to states located during that preprocessing. Answering the query is then a matter of a discrete search over the already-established connectivity. In contrast, single-query algorithms attempt only to directly answer a particular query. Their approach can be better suited to environments that change frequently or significantly, which might otherwise necessitate a multiple-query algorithm to recreate or update its preprocessing. Many single-query algorithms are also compatible with local planners that do not precisely connect states, so their applications with respect to agent types are broader. Under these two approaches, many distinct archetypes have emerged in sampling-based planning.

2.3.1 Randomized Potential Fields

One technique that can solve continuous motion planning problems takes advantage of a discretization of the configuration space onto a grid, which can then be searched via traditional discrete search methods. The randomized potential field planner [5] builds upon this idea by introducing a potential function over the configuration space that serves the combined purpose of driving the planner away from obstacles and toward the goal. Since typical potential fields have local minima in addition to the global minimum to be reached (as in the example shown in Figure 2.3), the algorithm pairs motions down the gradient of the field with random motions intended to escape these minima. The resulting graph over the sources and destinations of these motions is searched via depth-first discrete search (with uniformly-random backtracking when a newer and better minimum cannot be located) to yield a solution to the original motion planning query. This method represents an early success in solving planning queries for robots with more than a few degrees of freedom, but it also carries significant drawbacks.

Figure 2.3: A potential field function for two dimensions. The potential field in (a) is a weighted sum of the reciprocal of the distance to the closest obstacle and the distance to the goal (starred in the upper right). The set of points in (b) shows the local (as opposed to global) minima of this function.

While the randomized potential field planner demonstrated the ability to solve motion planning problems on real, practical robots (e.g., manipulators with 10 and 31 degrees of freedom), its implementation is complicated by the need to determine a complex set of parameters. The most fundamental of these parameters is the definition and computation of the potential field function itself, which should drive the exploration toward the goal while preferring a minimum number of local minima.

34 However, other parameters that govern the number and length of random motion attempts per source and the dynamics of the gradient motions are important as well. Additional complications include the frequently excessive lengths of path segments generated by the random motions and a reliance on explicit discretization of the configuration space. Nonetheless, the randomized potential field planner is an early illustration of the considerable potential of sampling-based approaches to motion planning.
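A toy version of the potential function of Figure 2.3 and of the gradient-plus-random-motion loop just described, for a two-dimensional point agent, is sketched below; the weights, step size, and escape policy are assumptions of this illustration and do not reproduce the planner of [5].

import math
import random

def potential(x, goal, clearance, w_goal=1.0, w_obstacle=1.0):
    """Weighted sum of distance-to-goal and reciprocal of obstacle clearance (cf. Figure 2.3)."""
    return w_goal * math.dist(x, goal) + w_obstacle / max(clearance(x), 1e-6)

def descend_with_random_escapes(x, goal, clearance, step=0.01, iterations=1000):
    """Follow the numerical negative gradient; take a random step when stuck in a local minimum."""
    for _ in range(iterations):
        gx = (potential((x[0] + step, x[1]), goal, clearance)
              - potential((x[0] - step, x[1]), goal, clearance)) / (2.0 * step)
        gy = (potential((x[0], x[1] + step), goal, clearance)
              - potential((x[0], x[1] - step), goal, clearance)) / (2.0 * step)
        norm = math.hypot(gx, gy)
        if norm < 1e-3:                          # local minimum: random motion to escape
            angle = random.uniform(0.0, 2.0 * math.pi)
            x = (x[0] + step * math.cos(angle), x[1] + step * math.sin(angle))
        else:                                    # otherwise, move down the gradient
            x = (x[0] - step * gx / norm, x[1] - step * gy / norm)
        if math.dist(x, goal) < step:            # close enough to the goal
            break
    return x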

2.3.2 Ariadne’s Clew

Another early attempt at sampling-based motion planning is the Ariadne's Clew algorithm [7], which uses simple motions along individual axes in combination with exploration driven by genetic algorithms. Its operation is divided into two alternating modes (exploration and searching), both of which are expressed as optimization problems for a genetic algorithm. The exploration mode adds a landmark with the intent that it is as far as possible from existing landmarks; this is executed by evaluating the novelty of locations reached by fixed-length sequences of incremental motions from each landmark. The search mode evaluates similar motion sequences to determine if the landmark can be used to reach the final state. This balance provides an explicit recognition of the need to balance exploration versus exploitation in motion planning; however, the presence of optimization problems and the use of genetic algorithms weakens its appeal by requiring an extensive, problem-dependent selection of parameters.

2.3.3 Single-query Bidirectional Lazy (SBL)

One more recent algorithm, the Single-query Bidirectional Lazy (SBL) planner [60], which is listed as Algorithms 2.1 and 2.2, employs a similar mechanism to the exploration mode of the Ariadne's Clew algorithm, albeit somewhat simpler. Rather than optimizing with respect to the locations that can be reached via simple paths from known locations, the SBL planner selects a source milestone in the configuration space with probability (π) inversely proportional to the density of local milestones. A new milestone is then generated by picking a random state in a fixed radius (ρ_limit) around the source milestone; this process is repeated while linearly shrinking the radius until a collision-free state is picked for the new milestone. In this way, new milestones are added in novel locations without the use of an optimization problem as in the Ariadne's Clew planner. This method of generating sampled states is supplemented by a lazy collision detection routine that progressively checks path segments in order to provide the complete algorithm.

Algorithm 2.1: The Single-query Bidirectional Lazy (SBL) planner.
γ ← plan(x_i, x_f)
 1  T_i::initialize(node(x_i)), T_f::initialize(node(x_f));
 2  while true do
 3      T_α ← random element({T_i, T_f}), T_β ← {T_i, T_f} \ T_α;
 4      for i ← 1 to ∞ do                          // sample states at decreasing radius until valid
 5          n_s ← random element(T_α; T_α.n.π);
 6          x_r ← random state(n_s.x, ρ_limit/i);
 7          if !detect collision(x_r) then
 8              n_α ← T_α::insert(n_s, node(x_r)), n_α.κ = 0, n_α.λ = metric(n_s.x, n_α.x);
 9              break;
10      n_n ← nearest neighbor(T_β, x_r);
11      if metric(n_n.x, x_r) < ρ_limit then       // trees overlap; do lazy collision checking
12          n_β ← T_β::insert(n_n, node(x_r)), n_β.κ = 0, n_β.λ = metric(n_n.x, n_β.x);
13          γ ← compose path(T_α, n_α, n_β, T_β; x_i, x_f);
14          [c, n_c] ← bolster collision(γ);
15          if !c then                             // path located
16              return γ;
17          else                                   // remove colliding segment by moving nodes to other tree
18              if n_c ∈ T_α then                  // consistently label colliding tree
19                  [T_β, T_α] ← swap(T_α, T_β), [n_β, n_α] ← swap(n_α, n_β);
20              for n_i ← n_β::parent() to n_c ∈ γ do   // adopt along path up to collision
21                  n_α ← T_α::adopt(n_α, n_i);
22              T_β.erase(n_β);

Algorithm 2.2: The Single-query Bidirectional Lazy (SBL) planner's progressive collision detection.
[c, n_c] ← bolster collision(γ)
 1  Q::initialize(γ, γ.n.λ/power(2, γ.n.κ));       // queue sorted by checked resolution
 2  while !Q::empty() do
 3      n_q ← Q::front(), Q::pop();
 4      for i ← 1 to power(2, n_q.κ) by 2 do       // check states that increase resolution
 5          [x_q, u_q] ← local plan(n_q::parent().x, n_q.x, n_q.λ × i/power(2, n_q.κ));
 6          if detect collision(x_q) then
 7              return [true, n_q];
 8      n_q.κ ← n_q.κ + 1, w ← n_q.λ/power(2, n_q.κ);
 9      if δ < w then                              // continue checking if not sufficiently dense
10          Q::push(n_q, w);
11  return [false, ∅];

The SBL planner's lazy collision detection distributes the computational work across entire solution paths in order to maximize its usefulness. When one tree adds a milestone sufficiently close to the other tree, a bridging path segment joins the trees and collision detection begins on the resulting candidate solution path. The actual efforts of this collision detection are spread along the candidate solution path by focusing on the path segment that has been previously checked at the lowest resolution. This path segment is then checked at interstitial states to double its resolution (e.g., a path segment already checked at {0, 1/2, 1} is checked at {1/4, 3/4}). This collision detection process is repeated until the solution path is completely verified to a sufficient resolution (δ) or a collision is encountered. In the latter case, the offending path segment is destroyed and any milestones descending from it (in its respective tree) are adopted into the opposing tree. Examples of the resulting exploration process are shown in Figure 2.4; collision detection can be quite effectively isolated to constructive exploration in the SBL planner. As a result, the SBL planner can be highly efficient in a variety of problems without the large set of complex parameters seen in some earlier algorithms.


Figure 2.4: Path segment trees in the SBL planner. The thickness of line segments is proportional to the resolution at which they have been checked for collisions (so completely unchecked segments are not visible). Solution paths tend to be circuitous but can often be easily differentiated from other segments due to the higher degree to which they have been (lazily) collision checked.

2.3.4 Probabilistic Roadmap (PRM)

One of the most widely-recognized modern sampling-based planning algorithms is the multiple-query Probabilistic Roadmap Method (PRM) planner [35]. The algorithm, which is listed as Algorithm 2.3, preprocesses the configuration space in order to create a roadmap of its connectivity. While it can be executed from scratch to answer a particular motion planning query (examples of this exploration are shown in Figure 2.5), the process of answering queries is considerably easier when a densely-connected roadmap is already available. In that case, a minimal amount of exploration is required to link the query states to the algorithm's existing knowledge of the environment. The roadmap generation method is quite natural: it repeatedly draws new (collision-free) sample states and connects them in order of increasing distance to existing samples (nodes), up to a fixed distance limit (ρlimit) and only when a connection would link two separate components of the roadmap. Alternatively, some implementations replace this notion of improving connectedness with a fixed maximum number of connections per node, which can translate into shorter solution paths (due to the larger number of options in the roadmap). Despite this relative simplicity, the PRM planner does have its own set of recognized weaknesses.

Algorithm 2.3: The Probabilistic Roadmap Method (PRM) planner. γ ← plan(xi, xf )

1  G::initialize(), ni ← G::add(node(xi)), nf ← G::add(node(xf));     // undirected graph
2  while true do
3      xr ← random state();
4      if !detect collision(xr) then
5          nr ← G::add(node(xr));
6          Qn::initialize();                                 // queue sorted by distance for nearby nodes
7          for ni ← G \ nr do                                // locate nearby nodes
8              w ← metric(nr.x, ni.x);
9              if w < ρlimit then
10                 Qn::push(ni, w);
11         while !Qn::empty() do
12             if !path exists(Qn::front(), G, nr) then      // link unconnected components
13                 [xt, ut] ← local plan(Qn::front().x, nr.x, metric(Qn::front().x, nr.x));
14                 if !detect collision(Qn::front().x; ut, xt) then
15                     Qn::front()::link(nr), nr::link(Qn::front());
16             Qn::pop();
17     if path exists(ni, nf) then                           // query states are connected
18         return compose path(ni, G, nf; xi, xf);
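Since the path exists test in the pseudocode amounts to asking whether two nodes already share a connected component, a disjoint-set (union-find) structure is one common way to answer it quickly. The sketch below is a generic illustration, not the dissertation's implementation; the node indices and the neighbors_by_distance, edge_collision_free, and add_edge helpers in the usage comment are hypothetical.

    class DisjointSet:
        """Union-find over roadmap node indices, used here to apply the PRM rule of
        only attempting connections that would join two separate components."""
        def __init__(self, n):
            self.parent = list(range(n))

        def find(self, i):
            while self.parent[i] != i:
                self.parent[i] = self.parent[self.parent[i]]   # path halving
                i = self.parent[i]
            return i

        def union(self, i, j):
            ri, rj = self.find(i), self.find(j)
            if ri != rj:
                self.parent[ri] = rj

    # Hypothetical use inside roadmap construction: for a new node r and its
    # neighbors sorted by increasing distance, skip candidates already connected.
    #
    #   components = DisjointSet(num_nodes)
    #   for j in neighbors_by_distance(r):
    #       if components.find(r) != components.find(j) and edge_collision_free(r, j):
    #           add_edge(r, j)
    #           components.union(r, j)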

In contrast to previous algorithms, the weaknesses of the PRM planner are generally defined in terms of the motion planning problems it faces. With only one genuine parameter (the limit on connection distance), more investigative focus can be given to the performance consequences of particular problem characteristics rather than to the design of specific implementations and parameters to address them. One weakness arising from the algorithm's underpinnings involves the types of agents it can handle. Because it requires a local planner that can connect two states end-to-end, it is most frequently applied to holonomic agents, but it can also be used for nonholonomic agents whose local planners satisfy this condition. However, the most significant issue for the PRM planner lies in the interplay between its uniform random sampling and a specific characteristic of the configuration space obstacles: narrow passages.


Figure 2.5: Roadmaps generated by the PRM planner. The roadmap in (a) is connected only when separate components can be linked (thus, many nodes are only connected to a single other node to link with the overall roadmap). The roadmap in (b) is connected over all nodes that satisfy the distance limit parameter.

The original sampling scheme of the PRM planner is uniform and random, which creates a predicament when capturing the connectivity of the configuration space requires coverage of specific regions of small volume. Such a situation, which occurs when the configuration space contains narrow passages that link regions in which queries may appear, presents difficulty because the PRM planner must draw samples in these low-probability passages in order to succeed [39]. Since narrow passages appear in common types of motion planning problems (e.g., peg-in-hole insertion/assembly problems), this is an important issue to address. In order to rectify it, non-uniform sampling strategies that condition their distributions on information obtained from collision detection have been introduced. Two examples are the Gaussian sampler [10], which tries to sample along obstacle boundaries, and the bridge sampler [31], which tries to sample in narrow passages. Figure 2.6 shows examples of the distributions generated by these two samplers. Generating these distributions carries its own computational cost (due to the interaction of the collision detection in the sampling process), but they can markedly improve the waiting time to locate narrow passage samples. With these enhancements, the PRM planner is useful in solving many realistic and difficult planning problems.

Figure 2.6: Non-uniform sampling distributions for the PRM planner. Sampling along the obstacle boundaries by the Gaussian sampler is shown in (a), and sampling in narrow passages by the bridge sampler is shown in (b). The bridge sampler should be supplemented by the uniform sampler to provide full coverage.

Outside of the development of non-uniform sampling methods, the PRM planner has been the subject of considerable other research efforts toward improving its performance. One approach modifies the sampling distribution after the fact by temporarily allowing samples with only shallow penetration into obstacles. Any colliding samples and connections are later used as focal locations to resample new milestones and connection paths [32]. The Lazy PRM planner [9] uses a similar technique organized around a lazy collision detection routine similar to that of the SBL planner. In that case, an initial roadmap is created without collision detection and its hypothetical connections are searched by the A* algorithm for the shortest possible path. When collisions are discovered, the shortest-path search repeats; when no path exists, the algorithm returns to adding new nodes to the roadmap, with the difference that they are normally distributed around existing nodes with collision-broken connections. Another algorithm, the Obstacle Based PRM (OBPRM) planner [2], takes advantage of the three-dimensional models of the agent and obstacles to generate samples in which the two are in contact. As with the non-uniform sampling strategies, the pervading theme of many PRM-inspired algorithms is improving the placement of nodes and the ease with which they (and their separate components) are connected.
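A compact way to see how the two samplers condition on collision detection is the following Python sketch; in_collision and sample_space are hypothetical callables for the point-collision test and the uniform state sampler, and sigma controls the perturbation scale. This is a sketch of the underlying ideas, not the reference implementations of [10] or [31].

    import random

    def gaussian_sample(in_collision, sample_space, sigma):
        """Gaussian-style sampler (a sketch of the idea in [10]): draw a state and a
        nearby Gaussian perturbation of it, and keep the free one only when the
        other collides, which concentrates samples near obstacle boundaries."""
        while True:
            x = sample_space()
            y = tuple(c + random.gauss(0.0, sigma) for c in x)
            x_free, y_free = not in_collision(x), not in_collision(y)
            if x_free != y_free:
                return x if x_free else y

    def bridge_sample(in_collision, sample_space, sigma):
        """Bridge-test sampler (a sketch of the idea in [31]): keep the midpoint of
        two nearby colliding states when that midpoint is free, which concentrates
        samples inside narrow passages."""
        while True:
            x = sample_space()
            if not in_collision(x):
                continue
            y = tuple(c + random.gauss(0.0, sigma) for c in x)
            if not in_collision(y):
                continue
            mid = tuple((a + b) / 2 for a, b in zip(x, y))
            if not in_collision(mid):
                return mid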

2.3.5 Rapidly-exploring Random Tree (RRT)

Like the PRM planner, the Rapidly-exploring Random Tree (RRT) planner [43] is a widely-recognized and successful algorithm in sampling-based motion planning. However, it takes a quite different tactic in solving the motion planning problem. Unlike the PRM planner, which applies only to agents with local planners that can reasonably connect arbitrary state pairs to form a path, the single-query RRT planner relies on incremental exploration, which provides more freedom in the range of applicable agents. Like the PRM planner, the RRT planner functions on its own natural method of exploration: using each sampled state as a target, it incrementally explores from the nearest existing node. This process builds trees with a satisfying fractal-like pattern (like the ones shown in Figure 2.7). To create bidirectional implementations [46], like the one listed as Algorithm 2.4, the two trees (rooted at the initial and final states) are alternately expanded toward a sampled target state and toward the resulting node, respectively. When the first expansion toward a new node in the opposing tree reaches its target, a solution path has been located. The actual expansions are performed in two different ways: by incrementally extending toward a target (labeled extend) or by iterating the extension until an obstacle or the target is reached (labeled connect) [38]. The nearest neighbor basis for these expansions rests on the greedy assumption that the shortest path is the best way to reach a given target; this has been proven reasonable by the many successful applications of the RRT planner, but it is also the root of a noted limitation.


Figure 2.7: Trees created by the RRT planner. The example in (a) explores incrementally toward each sampled target state (i.e., the extend operation), while the one in (b) iterates until it encounters a collision or reaches the target (i.e., the connect operation).

Algorithm 2.4: The Rapidly-exploring Random Tree (RRT) planner. γ ← plan(xi, xf )

1  Ti::initialize(node(xi)), Tf::initialize(node(xf));
2  Tα ← Ti, Tβ ← Tf;                                        // aliases for swapping trees
3  while true do
4      xr ← random state();
5      [ρ, ns] ← nearest neighbor(Tα, xr);
6      [xα, uα] ← local plan(ns.x, xr, ε);                  // grow toward random state
7      if !detect collision(ns.x; uα, xα) then
8          nα ← Tα::insert(ns, node(xα));
9          [ρ, ns] ← nearest neighbor(Tβ, nα.x);
10         [xβ, uβ] ← local plan(ns.x, nα.x, ε);            // grow toward previous node
11         if !detect collision(ns.x; uβ, xβ) then
12             nβ ← Tβ::insert(ns, node(xβ));
13             if nα.x = nβ.x then                          // trees are connected
14                 return compose path(Tα, nα, Tβ, nβ; xi, xf);
15     [Tβ, Tα] ← swap(Tα, Tβ);

The use of nearest neighbor selection in the RRT planner creates a two-fold weakness based on the increasing computation time of the process and its interaction with the metric and obstacles. The issue of increasing computation time can be addressed in two chief ways. First, the connect operation markedly decreases the number of nearest neighbor computations per expansion [38] on the presumption that each step yields a node that replaces its parent as the nearest neighbor to the target state. Second, improved nearest neighbor algorithms outperform the brute force method (of computing and comparing the distance to every node) and restrict the growth rate of its computation time. While the brute force method grows linearly (i.e., O(n)) in the number of nodes, more refined methods such as the k-d tree [22] or the cover tree [8] grow only logarithmically (i.e., O(log n)) in the number of nodes. The practical need for and consequences of these solutions are problem-dependent; problems involving very complex three-dimensional models will spend more time performing collision detection and likely reduce the relative impact of nearest neighbor computations. In those situations, the direct effects of nearest neighbor selection on the choices of explored path segments and the coverage performance of the algorithm are more important.

In addition to the computational issues associated with the nearest neighbor process, the RRT planner also has difficulties with exploration on certain types of problems that arise from its nearest neighbor node selection. Several examinations have linked these difficulties with the interaction between the metric and the obstacles [46] and, more specifically, with the appearance of thin obstacles and narrow passages [1, 62]. The underlying explanation lies in the RRT planner's fundamental operational principle: the so-called Voronoi bias. Since expansions originate at the nearest neighbor of each target state, the Voronoi decomposition of the set of existing nodes supplies a representation of the possible sources and destinations for exploration. As shown in the examples in Figure 2.8, this results in a probabilistic bias toward unoccupied regions based on their volume. Furthermore, nodes along the frontier of the algorithm's exploration are more likely to be selected as the source for future expansion. Unfortunately, metrics commonly measure a distance idealized to obstacle-free situations, so they may have little correspondence to true distance when obstacles are considered. Hence, the Voronoi regions of individual nodes may over- or under-value their usefulness in exploring the configuration space. The resulting performance of the RRT planner worsens as this phenomenon becomes more severe.
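The brute-force and tree-based options can be contrasted directly. The sketch below (using NumPy and SciPy's cKDTree, under the assumption of a Euclidean metric on a fixed batch of hypothetical planar nodes) illustrates the O(n) versus roughly O(log n) query structure; in an actual RRT implementation, the tree would need periodic rebuilding or an incremental structure as nodes are inserted.

    import numpy as np
    from scipy.spatial import cKDTree

    nodes = np.random.rand(10_000, 2)        # hypothetical planar node set
    target = np.random.rand(2)               # a sampled target state

    # Brute force: compute every distance and take the minimum, O(n) per query.
    brute_idx = int(np.argmin(np.linalg.norm(nodes - target, axis=1)))

    # k-d tree: build once, then answer queries in roughly O(log n) for
    # low-dimensional Euclidean metrics.
    tree = cKDTree(nodes)
    _, kd_idx = tree.query(target)

    assert brute_idx == int(kd_idx)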


Figure 2.8: Voronoi decompositions for in-progress instances of the RRT planner. Nodes adjacent to unexplored areas of the configuration space have large regions and are more likely to be selected to expand into these open regions.

In order to address the RRT planner's problem of over-valuing nodes resting against obstacles, the Dynamic Domain RRT (DD-RRT) planner [67] revises the original algorithm to include an additional parameter restricting the range at which a sample can be accepted for expansion if its nearest neighbor node has previously experienced a collision. As a result, the DD-RRT planner effectively samples within a subset of the state space that is determined by the presence of obstacles. Figure 2.9 illustrates these effective sampling regions. If obstacles are not present or the range parameter is chosen to be arbitrarily large, the DD-RRT planner behaves identically to the original RRT planner. In order to increase its robustness, a more recent revision of the algorithm [34] adds an adaptation parameter (a small percentage factor) that further scales the range at which samples are accepted for a given node (i.e., decreasing as it experiences further collisions and increasing when it executes collision-free expansions). These revisions embody the larger effort in sampling-based motion planning research to provide algorithms that apply to a wide variety of problems with minimal user intervention.


Figure 2.9: Effective sampling regions from the DD-RRT planner. The dynamic domain sampling region for each node is its Voronoi region if it has not experienced a collision. Otherwise, it is the intersection of that Voronoi region and the set of states within a metric-measured (here, the Euclidean) range given by a constant parameter. Later versions adjust the range with successful exploration or further collisions.
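The acceptance rule itself is simple to state in code. The following sketch (with hypothetical arguments, and assuming a Euclidean metric) captures the dynamic-domain test as described above rather than any reference DD-RRT implementation.

    import math

    def dd_rrt_accepts(sample, nearest_node, boundary_nodes, radius):
        """Dynamic-domain acceptance test: a sample is used for expansion only if
        its nearest node has never caused a collision, or if it lies within the
        node's restricted range.  `boundary_nodes` is a set of nodes that have
        previously experienced collisions; the adaptive variant would also shrink
        or grow `radius` per node as collisions or successful expansions occur."""
        if nearest_node not in boundary_nodes:
            return True
        return math.dist(sample, nearest_node) <= radius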

Chapter 3

Rapidly-exploring Random Tree Analysis

A primary goal of sampling-based motion planning is to provide algorithms that are simultaneously functional, implementable, and understandable; the Rapidly-exploring Random Tree (RRT) planner exemplifies all of these, but it is difficult to differentiate its behavior from the influence of random sampling. The widespread use of randomization in sampling-based motion planning has made this a general truth in the field and has motivated the introduction and study of quasi-random and derandomized versions of its most popular algorithms (namely, the PRM [11, 45] and RRT [47, 48] planners). In the case of the PRM planner, the sampling mechanism can be simply replaced by a quasi-random technique, and comparisons have demonstrated that randomization, while potentially helpful, is not an essential ingredient in the algorithm [23]. While the PRM planner draws nodes directly from its sampler, the RRT planner generates nodes conditioned on its complete state and thus merits somewhat more complex derandomization techniques.

The focus in derandomizing the RRT planner is on preserving and emphasizing the Voronoi bias in a deterministic setting.

For specific spaces and metrics (e.g., two-dimensional and Euclidean), the Voronoi decomposition can be explicitly computed; in general, it can be approximated by computing the nearest neighbors of a finite set of samples. Derandomized RRT planners have been evaluated using both methods, but it is notable that the exact RRT planner they approach, which expands only from the node with the largest Voronoi region to the centroid of that region, is an incomplete algorithm [47]. To remedy this, the derandomized versions add an important feature to the algorithm: obstacle nodes that block expansion when selected as a nearest neighbor. These obstacle nodes help restore the expansion variety created by randomization in the original algorithm. However, this also raises the issue that the dynamics of the RRT planner's probabilistic expansion are both critical to the algorithm's success and poorly understood.

The complexity of the RRT planner's behavior in practice has limited the available analysis to a small set of results: the algorithm is known to be probabilistically complete, to have an exponential bound on its convergence rate to solutions, and to approach the distribution of its input samples [46]. However, these notions provide no explicit method of appraising the RRT planner's performance on a given problem and little guarantee that it will even be capable of completing in reasonable time. Instead, the application and parameterization of the RRT planner is commonly conditioned on experimental testing on the relevant problems. To provide practical recommendations in making these choices, this chapter explores the RRT planner and its parameters and heuristics via several simple modeling approaches and experimental demonstration.

3.1 Simplified Models

In seeking a more thorough understanding of the operation of any algorithm, straightforward models can provide useful insight. Since the expansion of the RRT planner is dependent on its current coverage of the configuration space (and hence the obstacles in that space), the metric, and the local planner, in-depth analysis is impractical for its behavior on real problems. Instead, this section introduces and evaluates two modeling techniques that provide valuable observations about the RRT planner.

3.1.1 One-dimensional Model

One aspect of the RRT planner that is important to its overall function is the dynamics of its expansion. This expansion is customarily governed by a fixed maximum step size (denoted by the symbol ε), which encourages the addition of shorter and more likely collision-free path segments. As a result, expansions fall into one of three categories: collisions (in which no new node is added), partial steps (in which the target state is within the step size of the expanding node), and full steps (in which the target is at or beyond the step size and the expansion stops at the step size). In the general use of the RRT planner, small steps add more nodes than large ones (because shorter path segments cover less distance, cause denser coverage, and are more likely to be collision-free); however, they also require additional samples to cover the distance that could be covered by longer expansions. Further, this growth mechanism implies that its rate of progress toward a given point generally decreases over its approach. This phenomenon can be modeled by an instance of the RRT planner in a single dimension (i.e., the number line); this approach necessarily ignores obstacles in order to represent the algorithm's basic operation when using the extend heuristic.

The one-dimensional environment offers a crucial advantage in modeling the RRT planner: the process can be simplified to a function of a single random variable measuring the progress of the search along the number line. Figure 3.1 depicts the organization of this model, in which the search starts at the initial state and proceeds in the direction of the final state. Expansions targeting states that have already been covered (i.e., that are to the left of the search progress) make no useful growth toward the final state and hence are irrelevant. In contrast, useful expansions target states in the direction of the final state from the search progress; such states become less frequent as the algorithm advances. The step size parameter then governs the proportion of partial versus full steps taken during the growth of the search. Two features are of chief interest in the model: the probabilistic behavior of the search progress and the distribution of waiting lengths for arrival at a given final state.

Figure 3.1: Setup of the one-dimensional model of the RRT planner. The search progress at iteration n is denoted x_n as the search moves rightward from x_i (the initial state) to x_f (the final state) in the "pursuit" of x_r (a random target state).

The one-dimensional model can be treated directly using probability theory; Appendix A.1 includes complete details. The first step in the derivation presents the recurrence shown in Equation (3.1), which relates the density (or the distribution) function of the search progress at adjacent iterations. In the density interpretation, the terms correspond to the events no progress, partial step progress, and full step progress, respectively. The solution to this recurrence is shown in Equation (3.2) (with the definition T = ⌈(x − x_i)/ε⌉ − 1), though its form is quite complex. However, the most useful outcome of this model is the somewhat simpler density of iterations required to complete the problem (i.e., for the search progress to reach the final state), shown in Equation (3.3), which is analogous to the runtime of the RRT planner on general problems. This form can provide some intuition into the basic dynamics of the RRT planner.

f_{x_{n+1}}(x) = x\,f_{x_n}(x) + \int_{x-\varepsilon}^{x} f_{x_n}(u)\,du + (1-x)\,f_{x_n}(x-\varepsilon)

F_{x_{n+1}}(x) = x\,F_{x_n}(x) + (1-x)\,F_{x_n}(x-\varepsilon)     (3.1)

F_{x_n}(x) = \sum_{t=0}^{T} \frac{1}{t!\,\varepsilon^t\,\bigl(1-(x-t\varepsilon)\bigr)} \left[\prod_{k=0}^{t} \bigl(1-(x-k\varepsilon)\bigr)\right] \left[\sum_{k=0}^{t} (-1)^k \binom{t}{k} (x-k\varepsilon)^n\right]     (3.2)

p_T[n] = \frac{1}{T!\,\varepsilon^T} \left[\prod_{k=0}^{T} \bigl(1-(x_f-k\varepsilon)\bigr)\right] \left[\sum_{k=0}^{T} (-1)^k \binom{T}{k} (x_f-k\varepsilon)^{n-1}\right]     (3.3)

The density of iterations to completion in the one-dimensional model of the RRT planner has two useful relationships to well-known probabilistic quantities: geometric and negative binomial random variables. Both forms arise as a result of extreme values of the step size parameter: the density of iterations to completion approaches the geometric when the step size is large (Equation (3.4)), while it approaches the negative binomial when it and the distance to the final state are small (Equation (3.5)). In the former case, the one-dimensional model conceptually simplifies to the problem of drawing a single sample in a fixed-size region (i.e., beyond the final state). This approximation also models the action of the connect heuristic, since its growth in the one-dimensional setting is only limited by the position of the random target states. Further, this notion is directly related to the RRT planner's probabilistic completeness, since that property relies on the fact that there is always a (minimum) fixed-size region in which samples induce useful exploratory growth. The latter case is somewhat more specific owing to the restriction that the final state must be nearby, but it is also related to the RRT planner's growth dynamics. When exploration is incremental, growth through short and narrow passages involves drawing random samples in a roughly fixed-size region (that causes collision-free expansion), which is a negative binomial random quantity. Thus, both types of random variables play a central part in the general RRT planner.

p[n] = (1-x_f)\,x_f^{\,n-1}     (3.4)

p_T[n] = \binom{n-1}{T} (1-x_f)^{T+1}\,x_f^{\,n-1-T}     (3.5)
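These approximations can be checked by simulating the model directly. The sketch below (with arbitrarily chosen x_f and step size values) advances the search progress exactly as described above and compares the empirical mean waiting time against the geometric mean 1/(1 − x_f) that Equation (3.4) predicts for a large step size.

    import random

    def iterations_to_reach(x_f, epsilon, x_i=0.0):
        """Simulate the one-dimensional model on [0, 1]: targets are uniform, and
        progress advances to min(target, progress + epsilon) whenever the target
        lies ahead of the current progress.  A sketch of the model, not a planner."""
        progress, n = x_i, 0
        while progress < x_f:
            n += 1
            target = random.random()
            if target > progress:                      # useful expansion
                progress = min(target, progress + epsilon)
        return n

    # Large step size: the waiting time is approximately geometric with success
    # probability (1 - x_f), so the empirical mean should approach 1 / (1 - x_f).
    runs = [iterations_to_reach(x_f=0.9, epsilon=1.0) for _ in range(10_000)]
    print(sum(runs) / len(runs), "versus", 1 / (1 - 0.9))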

With these observations, the one-dimensional model of the RRT planner exposes aspects that must be captured in any more general model. First, such a model must describe the consequences of the decreasing probability of progress with that progress. Unfortunately, in multiple dimensions, this phenomenon is influenced in a complex way by the topology of the state space and its edges (e.g., two-dimensional state spaces on the unit square versus the unit circle) and their interaction with the obstacles. Second, a more general model would need to include both geometric and negative binomial random variables as (components of) its required iterations for completion. However, the associated probabilities and other parameters for these random variables are also dependent on the obstacles. From these characteristics, it is reasonable to expect that any comprehensive model of the RRT planner would be tractable only in very specific state space and obstacle scenarios.

Absent a superior model, the one-dimensional version can still impart some useful information about the general performance of the RRT planner. Much as the probability of growth is a decreasing function of that growth in one dimension, the RRT planner can have difficulties when the obstacles require it to explore toward or along the edges of a configuration space. Figure 3.2 demonstrates this concept using a two-dimensional tunnel and two types of query states, one in which the initial and final states are aligned with the tunnel and one in which they are centered. For both types of query and variants of the bidirectional RRT planner that search with the extend heuristic, there is a sharp increase in mean iterations to completion as the tunnel nears the edge of the state space. In contrast, the variant that searches using the connect heuristic does not suffer from this weakness, which is consistent with the behavior of the geometric approximation of the one-dimensional model. Clearly, these edge effects can have significant sway over the performance of some variants of the RRT planner. However, as suggested by the non-monotonic nature of several of the performance curves in Figure 3.2, there are other important features that are not captured in the one-dimensional model.

Figure 3.2: Edge effects in the (bidirectional) RRT planner. The test environment in (a) involves query states that are either centered vertically (fixed) or aligned with the tunnel. The mean number of iterations required for the RRT planner to complete the problem, plotted in (b), displays a marked increase when the tunnel is positioned near the edge of the space. The version that searches with the connect heuristic does not display this tendency.

3.1.2 Markov Chain Model

An alternate method of representing the RRT planner involves the use of Markov chains, which can model its interaction with an environment by approximating its coverage with a set of discrete states. In order to properly capture the RRT planner with a Markov chain, it is first necessary to define the notion of state for the latter.

Since a Markov chain is composed of a directed graph of discrete states interconnected by transition probabilities, an individual state in the Markov chain must fully describe the immediate state of the RRT planner. The algorithm's exploratory behavior is determined by the set of nodes it has previously searched, so this coverage state appropriately represents its current situation and serves as a state in the Markov chain model. Thus, if the RRT planner can occupy n (discretely approximated) states in the state space of its environment, the corresponding Markov chain model may have up to 2^n coverage states (since the RRT planner it models may have reached or not reached each state). Practically, the count of coverage states is lower than this upper limit because exploration is restricted to adding states that are in the intersections of the visibility and Voronoi regions of existing nodes. This model also presents the difficulty of computing the transition probabilities, which include the probabilistic dynamics of the exploration due to the Voronoi decomposition and the action of the local planning and collision detection. However, like the one-dimensional model, the Markov chain approach does reveal several informative lessons about the action of the RRT planner.

Modeling the RRT planner with a Markov chain offers a more general approach than the one-dimensional model, but it is still capable of reproducing the latter. In this instance of the model, the coverage state is defined in a manner analogous to that of the search progress in the one-dimensional model. Constructing the corresponding matrix of transition probabilities (P_n(ε)) yields the formulation in Equation (3.6) (where the element in row i and column j is the probability of moving from coverage state i to j in an environment of n states, and the step size ε is measured in terms of the number of states covered by one step) for the extend heuristic. Analogously to the one-dimensional model, setting the step size parameter to a large value (e.g., at least the number of states n) models the connect heuristic. The theory of Markov chains [50] allows the portion of this matrix corresponding to transient states (here, coverage states not including the final state) to be transformed into the fundamental matrix M = (I − Q)^{-1}, where Q is the submatrix of transient states. The elements of this fundamental matrix (m_{i,j} in row i and column j) give the mean number of iterations spent in one coverage state (j) when the model is initiated in a given coverage state (i). Figure 3.3 shows examples of the occupancy and performance measures that can be computed from this information. Consistent with the one-dimensional model, states corresponding to progress by multiples of the step size are occupied for the longest length, and final states nearing the edge of the state space require sharply increasing iterations to reach. Beyond replicating the one-dimensional model, several important qualitative properties of the RRT planner are represented by the Markov chain model.

P_n(\varepsilon):\quad p_{i,j} =
\begin{cases}
\dfrac{i}{n}, & j = i \\[4pt]
\dfrac{1}{n}, & i < j < i+\varepsilon \\[4pt]
1 - \dfrac{i+\varepsilon-1}{n}, & j = i+\varepsilon
\end{cases}     (3.6)

Significant properties of the RRT planner are observable from the general construction of the Markov chain model, including its tendency to fully cover the reachable space and its probabilistic completeness. These properties arise from the notion of the coverage states used in the definition of the model. Coverage of the state space can only increase up to the point at which the entire reachable space is occupied; an ordering representing this results in an upper triangular transition matrix (in which the ordering is chosen to have increasing coverage). Diagonal elements of this matrix (the self transitions) correspond to the incidence of collisions, for which no exploration has occurred. Because the matrix is upper triangular based on the coverage ordering, these self transition probabilities are also its eigenvalues; in Markov chain theory, the corresponding eigenvectors represent stable distributions. However, forward substitution shows that eigenvectors other than the one belonging to the complete coverage state necessarily have a sum of zero and are not realizable distributions. The final eigenvector (which has an eigenvalue of one) has a single nonzero element corresponding to certainty of being in the complete coverage state, which represents the RRT planner's probabilistic completeness independent of final state. Using these general properties and the coverage ordering, illustrative Markov chain models for specific problems can be expressed.
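As a numerical companion to Equation (3.6), the sketch below builds P_n(ε) with NumPy and computes the expected iterations to full coverage through the fundamental matrix. The clipping of full steps at the end of the line is an assumption of this sketch about boundary handling, and the 256-state, step-size-8 setting mirrors the configuration reported for Figure 3.3.

    import numpy as np

    def transition_matrix(n, eps):
        """P_n(eps) from Equation (3.6), with coverage state i meaning that the
        first i discrete states are covered.  Boundary handling (clipping full
        steps at the end of the line) is an assumption of this sketch."""
        P = np.zeros((n, n))
        for i in range(1, n):                      # transient coverage states
            full = min(i + eps, n)                 # column reached by a full step
            P[i - 1, i - 1] = i / n                # no progress: target already covered
            for j in range(i + 1, full):           # partial steps
                P[i - 1, j - 1] = 1 / n
            P[i - 1, full - 1] += 1 - (full - 1) / n
        P[n - 1, n - 1] = 1.0                      # full coverage is absorbing
        return P

    def mean_iterations(n, eps, start=1):
        Q = transition_matrix(n, eps)[:-1, :-1]    # transient submatrix
        M = np.linalg.inv(np.eye(n - 1) - Q)       # fundamental matrix (I - Q)^{-1}
        return M[start - 1].sum()

    print(mean_iterations(256, 8))                 # the setting used in Figure 3.3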


Figure 3.3: Computed performance of the RRT planner in one dimension via the Markov chain fundamental matrix. The Markov chain approximates the one-dimensional environment with a set of 256 states, and the algorithm uses a step size of 8 states. The curves in (a) show the mean iterations spent in each coverage state for three initial states; states at multiples of the step size from the initial state see the most occupancy. The curves in (b) show the mean iterations required to complete the problem for three final states; performance worsens significantly as the final state nears the edge.

Formulating Markov chain representations of the RRT planner on simple environments allows the inclusion of the effects of obstacles on its expansion. Figure 3.4 presents two such examples with state spaces of nine and six discrete states, respectively. On both problems, the exploration of the RRT planner is constrained by surrounding obstacles to be one-dimensional, leading to a minimal number of coverage states (three and five, respectively). In order to reach the final states (which are last in the order of exploration), random states must be selected that draw the growth forward along the corridor. The associated transition probabilities do not strictly increase or decrease, but they do decrease as the exploration moves toward a given edge of its state space, as in the one-dimensional model. Compiling these probabilities (and augmenting them with the effects of a goal biasing probability p_f) results in the transition and fundamental matrices in Equation (3.7). The straightforwardness of these examples allows for convenient and explicit analysis of their Markov chains' properties.

Figure 3.4: Example discretized Markov chain models for the RRT planner. These models constrain the planner's growth to one dimension, making them straightforward for analysis.

P_9(p_f) =
\begin{bmatrix}
\frac{2-2p_f}{3} & \frac{1+2p_f}{3} & 0 \\
0 & \frac{7-7p_f}{9} & \frac{2+7p_f}{9} \\
0 & 0 & 1
\end{bmatrix}
\qquad
P_6(p_f) =
\begin{bmatrix}
\frac{3+p_f}{4} & \frac{1-p_f}{4} & 0 & 0 & 0 \\
0 & \frac{2+p_f}{3} & \frac{1-p_f}{3} & 0 & 0 \\
0 & 0 & \frac{3-p_f}{4} & \frac{1+p_f}{4} & 0 \\
0 & 0 & 0 & \frac{5-5p_f}{6} & \frac{1+5p_f}{6} \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}

M_9(p_f) =
\begin{bmatrix}
\frac{3}{1+2p_f} & \frac{9}{2+7p_f} \\
0 & \frac{9}{2+7p_f}
\end{bmatrix}
\qquad
M_6(p_f) =
\begin{bmatrix}
\frac{4}{1-p_f} & \frac{3}{1-p_f} & \frac{4}{1+p_f} & \frac{6}{1+5p_f} \\
0 & \frac{3}{1-p_f} & \frac{4}{1+p_f} & \frac{6}{1+5p_f} \\
0 & 0 & \frac{4}{1+p_f} & \frac{6}{1+5p_f} \\
0 & 0 & 0 & \frac{6}{1+5p_f}
\end{bmatrix}     (3.7)

On simple problems that permit the explicit construction and computation of the Markov chain transition and fundamental matrices, it is possible to determine the expected length of the RRT planner's runs and the optimal choice of goal biasing factor. On the nine- and six-state examples, post-multiplying the fundamental matrix by a column vector of ones and taking the first element yields expected durations of

\frac{15 + 39p_f}{2 + 11p_f + 14p_f^2} \qquad \text{and} \qquad \frac{17 + 58p_f + 9p_f^2}{1 + 5p_f - p_f^2 - 5p_f^3},

respectively. In turn, these expressions can be differentiated and set equal to zero to locate the minima of the RRT planner's expected duration as a function of the goal biasing factor. Clearly, on the nine-state example, the optimal goal biasing factor is one (leading to an expected duration of 2.0 iterations compared to the base expected duration of 7.5 iterations); the resulting polynomial (182p_f^2 + 140p_f + 29) accordingly has no real roots. On the six-state example, that polynomial (45p_f^4 + 580p_f^3 + 358p_f^2 + 52p_f − 27) has a real root at p_f = 0.191 that leads to an expected duration that drops slightly from the base value of 17.0 iterations to 15.1 iterations. These examples capture the behavior and performance of the RRT planner on basic problems, though the Markov chain model is not limited to such one-dimensional cases.
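The six-state figures can be reproduced numerically from Equation (3.7). The sketch below inverts I − Q for the transient submatrix of the six-state example and prints the expected durations at p_f = 0 and at the near-optimal bias; the matrix entries transcribe the reconstruction of P_6(p_f) given above.

    import numpy as np

    def expected_iterations_six_state(p_f):
        """Expected iterations to absorption for the six-state corridor example,
        via the fundamental matrix M = (I - Q)^{-1}; the entries of Q transcribe
        the transient part of P_6(p_f) as reconstructed in Equation (3.7)."""
        Q = np.array([
            [(3 + p_f) / 4, (1 - p_f) / 4, 0.0,           0.0],
            [0.0,           (2 + p_f) / 3, (1 - p_f) / 3, 0.0],
            [0.0,           0.0,           (3 - p_f) / 4, (1 + p_f) / 4],
            [0.0,           0.0,           0.0,           (5 - 5 * p_f) / 6],
        ])
        M = np.linalg.inv(np.eye(4) - Q)
        return M[0].sum()                   # start in the first coverage state

    print(expected_iterations_six_state(0.0))    # ~17.0 iterations with no goal bias
    print(expected_iterations_six_state(0.191))  # ~15.1 iterations near the optimum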

58 Figure 3.5: Discretized Markov chain model for the RRT planner. Coverage increases to the right as the algorithm attempts to expand from the initial state on the left edge to the final state in the center of the environment. Transition probabilities (dark) are listed along the arrows connecting coverage states, while completion probabilities (light) are listed adjacent to their respective coverage states.

Examples containing more diverse exploration options for the RRT planner lead to more complex Markov chain models that provide other insights into its operation. Figure 3.5 shows such an example based on a state space with nine discrete states and using a step size of one state. The initial and final states (left center and center, respectively) are separated by only a single step, but the planner is also free to explore the surrounding area. In order to minimize the size of the resulting Markov chain, coverage states that are symmetric with respect to the horizontal are reduced to single states with the appropriate adjustments to transition probabilities. Furthermore, the exploration has a specific initial state, rather than including all possible initial states and corresponding coverage states and transition probabilities. Compiling the transition probabilities and computing the fundamental matrix for this instance of the model yields an expectation of 5.376 iterations to complete this problem; however, examining the components of this figure provides more insight. The success probabilities of the RRT planner in this example display a marked decrease with its increasing coverage of the environment, which leads to the notable outcome (shown in Figure 3.6) that the algorithm spends the greatest average number of iterations in the full coverage state. In fact, that state is responsible for more than a third of the total expected iterations required for completion. Based on its performance consequences, the decreasing probability of useful exploration exposed by this Markov chain model is also of general interest in understanding the RRT planner.

3.2 Exponential and Power Decay Regimes

Early explorations of the RRT planner established that its distribution of iterations required for completion of any problem is subject to an exponential bound [46]. Correspondingly, an exponential density is associated with sampling in a fixed-size region in the one-dimensional model.


Figure 3.6: Dynamics of the RRT planner in the nine-state environment as modeled via Markov chain. The transition probabilities to the completion coverage state, plotted in (a), generally decrease with coverage of the state space. This results in greater expected length spent in some coverage states with higher magnitude of coverage, as plotted in (b).

In fact, the probabilistic completeness of the algorithm relies on the related notion that, given any sampling of nodes, there is a set of possible states which lie in the nearest node's visibility region and have smaller optimal cost-to-go (denoted ρ*) to the final state than any node. More formally,

\inf\Bigl\{\rho^*\Bigl(\bigcup_{k=0}^{N} R_{x_k} \cap S_{x_k},\; x_f\Bigr)\Bigr\} < \min_k\bigl\{\rho^*(x_k, x_f)\bigr\}

(for N nodes at states x_k and with Voronoi regions R_{x_k} and visibility regions S_{x_k}), which simply stipulates that the algorithm cannot reach a point at which further progress is impossible. In terms of actual performance, these regions of useful sampling are a complex function of the obstacles and the algorithm's growth, but they can have a critical role in determining performance.

Subject to the geometry of the surrounding obstacles, regions of useful sampling (and thus the probability of worthwhile exploration) may be particularly small or large relative to the state space as a whole. As the algorithm's coverage becomes increasingly dense (as it frequently does when needing to explore a narrow passage) without making forward progress (in terms of true cost-to-go to the final state), these Voronoi regions of useful sampling shrink to a minimum size that is determined by the environment. Figure 3.7 shows the Voronoi decomposition and examples of two such minimum size Voronoi regions in qualitatively dissimilar examples, a bug trap and a kinked tunnel, which both contain a narrow passage. When the exploration can easily surround important areas (like the mouth of the bug trap), these minimum size Voronoi regions can be quite small, resulting in a low probability of picking individual random target states that improve the situation by making progress. This results in poor overall performance (13,546 iterations required to solve the complete bug trap, on average). In contrast, when obstacles prevent such growth, the minimum size Voronoi regions are large and do not significantly hamper performance (3,612 iterations required to solve the complete kinked tunnel, on average). While the obstacles are an additional restriction on the subset of these regions that can actually realize useful exploration, these critical Voronoi regions represent an influence over performance in the algorithm itself.


Figure 3.7: Diversely-sized minimum (Euclidean) Voronoi regions in different narrow passage situations. In the bug trap in (a), nodes can easily reach areas surrounding the passage, leading to a small useful Voronoi region. In the kinked tunnel in (b), the obstacles prevent such coverage and yield large minimum size Voronoi regions.

While the critical Voronoi regions appear as a result of dense coverage of the environment precipitated by the inability of the RRT planner to escape a given region, the dynamics of growth that approach that situation are also important to overall performance. Unfortunately, the complexity of quantifying the volume of Voronoi regions (for which few theoretical results are available [4]) and the unique distribution of nodes induced by the RRT planner's visibility-influenced growth make modeling this phenomenon difficult. Still, analytical and experimental studies related to crystal growth have shown that the mean volume of (Euclidean) Voronoi regions in three or fewer dimensions decays with the inverse of the number of nodes [24, 36]. Figure 3.8 demonstrates experimentally that this is also consistent with generalized Voronoi regions due to other two- and three-dimensional metrics and with particular scenarios arising from sampling by the RRT planner. As a result, it is reasonable to expect some influence from inverse power terms in the RRT planner's density of iterations required for completion in certain scenarios with narrow passages.


Figure 3.8: Generalized Voronoi region volume changes with sampling. The mean volume of the generalized Voronoi regions of uniform, random states in the open unit square due to three Lp metrics, plotted in (a), decays as the inverse of the sample count. When the RRT planner starts at the mouths of the narrow passages in (b), the mean Voronoi volume of the critically-located initial node (e.g., at the mouth of the bug trap), plotted in (c), also (roughly) decays like the inverse of the number of iterations. The corresponding power-like distribution of waiting duration holds until useful progress is made.

In order to observe a probability of useful exploration that decays with the inverse of the number of nodes in the RRT planner, specific qualities are required in the environment. First, the notion rests on a single Voronoi region (i.e., a single node) being the focus of useful exploration, which can be provided by embedding it in an arbitrarily narrow passage. Second, the sampling of the RRT planner should be (roughly) consistent with uniform sampling. This requires an open and adjacent area around the passage that can be covered without directly progressing through the passage. Further, a large open area provides minimal deviation between node and iteration counts because there are few collisions. Figure 3.9 illustrates such a situation, in which the initial state is at the threshold of a narrow passage. While the planner has complete visibility of the passage from this point, it is comparatively easier for it to explore toward the adjoining open areas, an action which quickly cannibalizes the Voronoi region of the original node and curtails the planner's ability to immediately enter the passage. When the RRT planner faces such a situation, it can experience a probability of useful exploration that decays as the inverse of its iteration count and is likely to perform poorly as a result.

Figure 3.9: Sequence of Voronoi visibility during the RRT planner's expansion on a tube. As the tree grows (top row) from initialization of the tree through its early growth (through 8, 16, and 24 iterations across the columns), the addition of certain nodes can crowd out useful Voronoi visibility (bottom row) and chances for the planner to explore the useful area inside the narrow passage.

Based on the observations of inverse power decay of the volume of a single Voronoi region and its approach to a critical value, a simple expression can portray the intermediate success probabilities to which the algorithm may be subjected. Equation (3.8) provides such an approximate model of a useful Voronoi region that shrinks with the inverse of the iteration count (from an initial position n_0 ≥ 0) but approaches a constant (p_∞) representing an applicable critical Voronoi volume. Appendix A.2 includes complete derivations for Equation (3.9), which expresses the resulting density of waiting length to obtain a success with such a decaying probability, and Equation (3.10), which shows its limit as the constant approaches zero. In the latter formulation, the density approaches an inverse square function of iteration count. However, as suggested by the former, the general performance of the RRT planner involves a balance of these qualitatively different forces.

p_s[n] \triangleq \frac{1-p_\infty}{n+n_0+1} + p_\infty     (3.8)

n_s[n] = \sum_{k=1}^{\infty} \left[\frac{1-p_\infty}{k+n_0+1} + p_\infty\right] (1-p_\infty)^{k-1}\, \frac{1+n_0}{k+n_0}\, \delta[n-k]     (3.9)

\lim_{p_\infty \to 0} n_s[n] = \sum_{k=1}^{\infty} (1+n_0) \left[\frac{1}{k+n_0+1}\right] \left[\frac{1}{k+n_0}\right] \delta[n-k]     (3.10)

The real performance of the RRT planner is a complex sequence of elements driven by various types of probabilistic forces. The mean expressions for the common distributions identified by the presented models of the algorithm demonstrate that their relative impact is central to that performance. In particular, the means of the exponential-decay geometric and negative binomial distributions ((1 − p)/p and r(1 − p)/p for success probability p and success count r) approximately double when an already small success probability is halved, but a true power decay distribution (e.g., the Zeta or Zipf distributions) can have infinite mean depending on its exponent. Unfortunately, as Figure 3.10 shows, the appearance of and transition between these elements can be a simple matter of a minor change in initial state.
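The qualitative contrast between the two regimes can be seen directly from Equation (3.9). In the sketch below, the partial mean over a large but finite horizon keeps growing when p_∞ = 0 (the power-decay tail) but settles to a modest finite value for any small positive p_∞ (the eventual geometric tail); the horizon and the particular p_∞ value are arbitrary choices for illustration.

    def waiting_density(k, n0=0, p_inf=0.0):
        """Density n_s[k] from Equation (3.9) for the decaying success probability
        of Equation (3.8), transcribed from the reconstruction above."""
        p_k = (1 - p_inf) / (k + n0 + 1) + p_inf
        survival = (1 - p_inf) ** (k - 1) * (1 + n0) / (k + n0)
        return p_k * survival

    for p_inf in (0.0, 1e-3):
        partial_mean = sum(k * waiting_density(k, p_inf=p_inf) for k in range(1, 10**6))
        print(p_inf, round(partial_mean, 2))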


Figure 3.10: Transition between power and exponential decay in the RRT planner. With a single instance of the RRT planner initiated at the colored states in (a), the (complement of the) distribution of iterations required to separate the two parts moves from primarily exponential (linear on the semi-log axes used in (b)) to primarily power (linear on the log-log axes used in (c)) as the initial state is drawn inward to a more connected condition.

In this case, that change is a movement toward a state in which the alpha puzzle parts are more connected (i.e., are farther from being separated, as defined by the true cost-to-go). Since the precise interaction of these elements is not predictable with the current theory, it is appropriate to examine the practical performance effects of the parameters and heuristics of the RRT planner and the qualitative characteristics of its poorly- and well-performing instances.

3.3 Parameterization and Heuristics

While one of the original design features of the RRT planner is to have "as few heuristics and arbitrary parameters as possible" [43], the chief exceptions to that idea, the step size parameter and the extend and connect heuristics, are quite powerful. Both choices determine the distribution of nodes during the planning process and hence determine the overall performance of the algorithm. While the ultimate distribution of nodes follows the sampling distribution, this can only occur following full, dense coverage of the reachable space (to a resolution finer than the step size), so it does not factor into the effective performance of a single-query algorithm. Rather, a performance-driven interest should be concerned with the initial distribution of nodes that arises up to the point at which the final state becomes reachable. With the RRT planner, the metric, local planning, step size, and heuristic all impact this distribution; since the former two are chiefly associated with the agent's dynamics and the environment, it is most valuable to study the latter two, which are inherent parts of the algorithm.

3.3.1 Step Size

Consistent with its character as an incrementally-exploring algorithm, common implementations of the RRT planner utilize a step size (in distance, time, etc.) that restricts the length of path segments and hence increases the probability that they will be obstacle-free. This property is, in fact, a fundamental principle in the design of the SBL planner [60] and shows up in the PRM planner as the maximum connection radius and the distance-based ordering of attempted connections. In the RRT planner, smaller step sizes have two chief advantages: they sample the explored paths more densely (resulting in shorter average distance to a given sample), and their individual path segments are less likely to stretch over obstacles (making them easier to collision check because less time can be wasted checking collision-free states that are past an obstacle). Figure 3.11 verifies these claims experimentally for RRT instances with similar total distance of exploration. Despite the apparent benefits of a small step size, the parameter's effect on the distribution of nodes (and path segments) must also be considered in determining an opportune choice.

In the RRT planner, the influence of the step size goes well beyond simply assuring that collision-free segments can be generated cheaply and frequently. Its Voronoi bias, which quickly grows toward unoccupied areas, is regulated by the chosen step size.


Figure 3.11: Comparison of small and large step size in the RRT planner. The two instances have expanded an approximately equal total distance. Still, the small step size used in (a) tends to have nearest neighbors that are closer to random samples (0.205 versus 0.219 units of length per sample) due to its denser coverage, and requires less collision checking to validate path segments (9.380 versus 24.353 checks per unit length) than its counterpart large step size used in (b), based on averages from 10,000 random samples.

Smaller step sizes change the overall arrangement of nodes very little per iteration, resulting in a bulk behavior that grows toward the centroids (volumetric averages) of unoccupied areas. In contrast, larger step sizes produce nodes that more closely follow the sampling distribution, since the reach of the longer path segments is more likely to connect to any sample. These characteristics lead to biases in the initial distribution of nodes that become increasingly significant as the step size decreases. Figures 3.12, 3.13, and 3.14 show the instantaneous distributions of nodes at particular iterations in an open environment as induced by small, medium, and large step sizes (and the Manhattan, Euclidean, and Chebyshev metrics, respectively). In each case, the distributions associated with the small step size contain concentrated peaks in metric-dependent locations. These localizations in initial exploration can be harmful if they do not coincide with useful growth through the obstacles.

Since the RRT planner's growth is not directly obstacle-aware, similar initial distributions of nodes appear in any problem, which can be disastrous for its performance if the occurrence of the peaks predisposes the algorithm to produce nodes that cannibalize the Voronoi regions of ones with useful visibility.

Figure 3.12: Densities of nodes created by the RRT planner with Manhattan metric at specific iterations. The rows present step sizes of 2^{-9}, 2^{-6}, and 2^{-3} (from the top down); the columns render the density at iterations 1/ε, 2/ε, 3/ε, and 4/ε (from left to right). Localized concentrations in initial growth become more severe at smaller step sizes.

Figure 3.13: Densities of nodes created by the RRT planner with Euclidean metric at specific iterations. The rows present step sizes of 2^{-9}, 2^{-6}, and 2^{-3} (from the top down); the columns render the density at iterations 1/ε, 2/ε, 3/ε, and 4/ε (from left to right). Localized concentrations in initial growth become more severe at smaller step sizes.

Figure 3.14: Densities of nodes created by the RRT planner with Chebyshev metric at specific iterations. The rows present step sizes of 2^{-9}, 2^{-6}, and 2^{-3} (from the top down); the columns render the density at iterations 1/ε, 2/ε, 3/ε, and 4/ε (from left to right). Localized concentrations in initial growth become more severe at smaller step sizes.

For example, Figure 3.15 presents an example bug trap that is solved much more easily with a moderate step size than with a small one if the Euclidean metric is used, and with a small step size rather than a moderate one if the Chebyshev metric is used. Unfortunately, it is likely to be difficult to choose a particular step size that optimizes over the presence of these initial distributions (or, alternatively, to design a metric that prevents them from having a negative effect).

Therefore, it is also helpful to consider the alternative growth heuristic, connect, which represents a compromise between the lack of distributional peaks at large step sizes and the collision checking advantages of the small step sizes.

3.3.2 Extend versus Connect

Two alternative approaches to seeking samples are defined for the RRT planner: extend, which takes a single step toward the sample with a fixed maximum size, and connect, which iterates extend until the sample or an obstacle is reached.



Figure 3.15: Average-case RRT instances over metric and step size. The instances with large and medium step size (in (a) and (b) for Euclidean metric and in (d) and (e) for Chebyshev metric) have similar coverage density when escaping the bug trap. However, distributional peaks due to the small step size cause the Euclidean metric version in (c) to reach the lobes first and delay its progress, while the Chebyshev metric version in (f) escapes during its initial growth.

Several weaknesses observed in extend are remedied by connect because the latter places no artificial limits on the total exploration toward a given target. This allows the use of a small step size without introducing significant distributional peaks, because the overall arrangement of nodes can change noticeably due to a single target. It further alleviates the issues that extend has with approaching and following the edges of its space, because a single (visible) target provides the necessary growth force to reach or explore along that edge. connect also has the computational advantage that fewer nearest neighbor calculations must be executed per step. Clearly, these observations only apply for small step sizes, since the difference between the heuristics diminishes when the step size is large. Despite these benefits, care must be taken to determine whether connect's less restricted growth behavior leads it to more quickly and densely fill areas (like the bug trap's lobes) that mislead later exploration.
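The relationship between the two heuristics can be made explicit in code: connect is simply extend iterated against the same target, paying for only one nearest-neighbor search. The sketch below is generic; the metric, interpolate, and collision_free callables and the list-based tree are hypothetical placeholders, not a planner implementation.

    def extend(tree, target, step, metric, interpolate, collision_free):
        """One EXTEND operation: take a single bounded step from the nearest node
        toward `target`, adding the new state if the segment is collision-free."""
        nearest = min(tree, key=lambda node: metric(node, target))
        dist = metric(nearest, target)
        new = target if dist <= step else interpolate(nearest, target, step / dist)
        if collision_free(nearest, new):
            tree.append(new)
            return new
        return None

    def connect(tree, target, step, metric, interpolate, collision_free):
        """CONNECT iterates the stepping until the target is reached or a segment
        collides; only the first step performs a nearest-neighbor search."""
        new = extend(tree, target, step, metric, interpolate, collision_free)
        while new is not None and new != target:
            dist = metric(new, target)
            nxt = target if dist <= step else interpolate(new, target, step / dist)
            if not collision_free(new, nxt):
                break
            tree.append(nxt)
            new = nxt
        return new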

In order to fully qualify the strength of connect as compared to extend, their performance on difficult problems must be compared. To the extent that their exploration covers areas that are not useful in locating the solution path and limits helpful exploration, each strategy is weakened. Figure 3.16 compares the two heuristics across step sizes on three problems with a varying degree of this potential; the tunnel has a low propensity for this issue, while the bug trap and tube increase it to a medium and high level, respectively. In all cases, the previously-identified issues with extend cause it to perform badly at small step sizes, while connect's performance is nearly monotonic with its step size. These experiments suggest that any tendency presented by the connect heuristic to prematurely cover areas that are deceptive to its exploration is outweighed by its other benefits. As a result, both conceptual analysis and experimental evidence support connect as the superior heuristic across a variety of different situations.


Figure 3.16: Effects of step size and heuristic on overall RRT planner performance. The connect heuristic places no limits on the attempted exploration per iteration, so it is very robust to shrinking step size. In contrast, the extend heuristic tends to perform badly at very small step sizes because its initial coverage becomes more deterministic (and less generally-applicable).

3.4 Implications for Problem Difficulty

Due to the RRT planner's distinctive method of exploration, it requires unique definitions of problem difficulty in order to account for the likelihood of its specific drawbacks. Unfortunately, these standards must also be somewhat less intuitive than those traditionally applied to other algorithms. For example, the PRM planner is modeled with a probabilistic concept known as the "coupon collector's problem" [39], which relates its need to obtain samples in a set of important regions (e.g., near corners) to the problem of collecting a complete set of coupons via random chance. Clearly, a minimal set of coupons that links the entire (reachable) configuration space can be completely determined by the obstacles. With the RRT planner, this concept can only apply under the recognition that its coupons must be collected in order (due to its incremental growth) and with varying probability (due to the reliance of its growth on past exploration). Due to this additional complexity, the difficulty of a given problem for the RRT planner can be markedly influenced by aspects like the position of the query states that might not hold great sway over the performance of other algorithms.


Figure 3.17: Collision checking performance of the RRT planner on a bug trap versus query states. When the final state is fixed outside the bug trap, as in (a), an initial state inside it requires significantly more exploration effort. When the initial state is fixed outside the bug trap, as in (b), increased exploration effort is still required to enter it, but the algorithm performs much better when solving the problem in the entering direction than the exiting one.

One important aspect that must be considered in characterizing problem difficulty for the RRT planner is its directionality of exploration, which is determined by the query states. Environments including obstacles like a bug trap are only difficult for the RRT planner if it is initiated inside and must escape; Figure 3.17 illustrates this by varying the query states. Clearly, any query requiring the bug trap to be escaped could be well-handled by a bidirectional RRT planner; however, it is straightforward to invent a revised problem (e.g., involving multiple bug traps) that is not (much) easier for a bidirectional planner. Thus, its notion of difficulty must be fully aware of whether the query states predispose exploration to face obstacles like the bug trap interior. Still, it is especially complex to view a problem in the context of all possible query states; it is more straightforward to recognize the ability of various types of obstacles to create these situations.


Figure 3.18: Node density in the RRT planner by performance. The fastest and second fastest 6.25% of instances represented in (a) and (b), respectively, place very few nodes in the lobes of the bug trap before escaping it. Moving through the third and fourth fastest 6.25% of instances in (c) and (d), respectively, this tendency begins to disappear.

The potential difficulty presented to the RRT planner by a given obstacle can be extrapolated reasonably through the context of its performance. Since that performance is strongly biased by the actual distribution of nodes that arises from the combination of the RRT planner's growth dynamics and the obstacles' constraints, the relationship between these distributions and performance provides a useful gauge of where problems occur. Figure 3.18 contrasts the distributions of nodes at completion on the bug trap for increasing increments of runtime. The fastest-running instances largely avoid placing nodes in the lobes of the bug trap and near the inner surfaces of the mouth, but as runtime increases, this quality fades into largely uniform coverage of the interior. As a result, these areas can be classified as "trick" states; if they are reached before nearby (and useful) exploration occurs, their presence hampers subsequent exploration. Regrettably, as Figure 3.19 demonstrates, these types of "trick" states also arise plausibly on realistic problems. Consequently, their effects on the RRT planner's performance are widespread.

While the label "trick" states intuitively applies to those in absolute proximity to the solution path, it is important to recognize that the general effect is a factor in virtually all problems. In fact, any obstacle environment that requires exploration in the rough direction of previous coverage is subject to a degree of this phenomenon.


Figure 3.19: “Trick” states for the RRT planner on realistic disassembly/assembly problems. These common benchmarks, the alpha puzzle in (a) and the flange in (b), have easy-to-access “trick” states (red) that are close to essential states along the solution path (green) but do not have direct access to them.

Figure 3.20 provides an example of a maze with a winding solution path; coverage of the early portions of the environment causes the RRT planner's later exploration to follow walls as it (implicitly) avoids that coverage in favor of the uncovered parts of the environment. This gives rise to an unusual consequence: while increasing the thickness of the walls in this environment narrows the passages and the planner's movement freedom, it decreases the actual exploration effort required (as measured by collision checking) by lessening the bias due to earlier exploration. As a result, it is crucial to be mindful of the influence of the progression of coverage when applying the RRT planner to virtually any problem.

Fundamentally, the primary source of difficult problems for the RRT planner arises from a lack of correspondence between the metric and the true cost-to-go. Because that metric is employed globally, obstructive biases can arise in many problems to at least some extent. Furthermore, it has counterintuitive effects: easy-to-reach areas away from the solution path should ideally be avoided because their coverage can bias exploration against reaching difficult areas, while adding artificial restrictions on the environment can actually improve the performance of the algorithm. Though the RRT planner has many amenable cases on which it performs well, its use should be carefully conditioned on any observable characteristics of the problem that might indicate whether these negative bias effects could impair its performance.


Figure 3.20: RRT planner growth bias due to faraway "trick" states. The exploration (and solution) of the algorithm, in (a), is biased away from earlier nodes, resulting in a predilection toward following walls and corners. While thicker walls narrow the passages and hence increase the number of iterations required to complete the problem, as shown in (b), they ameliorate the bias from earlier nodes by forcing more separation and reduce the overall exploration effort in (c).

3.5 Summary

This chapter has presented an analysis of the RRT planner from both probabilistic and practical perspectives. The former view has included several straightforward models centering on a pivotal point: the exploratory progress of the RRT planner can decrease over time in situations involving either state space edges or particular obstacle types. The latter has examined the impact of the RRT planner's parameterization (step size and expansion heuristics) on both the initial distribution of exploration and its resulting performance. The particular difficulties arising from the RRT planner's nearest neighbor growth mechanism have been attributed to the presence of obstacles that have the capacity to mislead the metric's measurements between existing nodes and other space. In such situations, the RRT planner creates nodes at "trick" states that are closer to (but blocked from) states in unexplored space than nodes that

could actually expand toward that unexplored space. These "trick" states occlude the Voronoi regions of nodes that might otherwise provide useful exploration, potentially introducing power-like decay in the RRT planner's performance distribution and hindering its growth on problems like the bug trap and the alpha puzzle.

Chapter 4

Distributions and Restarts

Fundamentally, motion planning is a problem of search, for which a much broader and more extensive body of research exists. Therefore, it is sensible to look to techniques established in other facets of search for inspiration on improving sampling-based motion planning algorithms like the RRT planner. One such search problem is that of Boolean satisfiability, which involves locating a series of true/false values that satisfy a given formula (e.g., x ∨ y ∨ ¬z), on which randomized procedures have been applied to great success [26]. As in motion planning, complete solvers exist in Boolean satisfiability, but the advantages of randomized approaches have made them a central part of many modern methods [26]. However, Boolean satisfiability algorithms leverage a number of techniques not seen in motion planning, including the notion of a restart, which disposes of all current progress in favor of a completely new attempt at solving the problem. The rationale for the use of restarts is based on the notion that an algorithm makes mistakes in judgment that successively increase its workload. More formally, these errors must take a certain form in order to be exploited to benefit without any other knowledge: work in Boolean satisfiability capitalizes on the appearance of heavy-tailed (e.g., power decay) distributions in the runtime of its algorithms [28, 27].

The presence of these distributions is justified by the fact that they can arise from a fixed probability of poor choices, each of which has a multiplicative effect on the total effort required to complete the solution [14]. This exponential distribution of exponential workloads results in a heavy-tailed distribution; restarts can reduce both its runtime mean and variance (a recognized secondary goal in sampling-based planning research [33]). Based on the analysis in Chapter 3, the RRT planner can also be considered to make similar errors that increase its workload in the form of the bias of past exploration. This suggests that restarts are a viable method for improving the performance of the RRT planner, but their use requires an examination of the relationship between the algorithm's functionality, computational components, and runtime.
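The multiplicative-mistake argument can be checked with a toy simulation. The model below is purely illustrative (a geometric number of "mistakes," each doubling the remaining work) and is not drawn from the planners or solvers studied here; its only purpose is to show a fixed per-mistake probability producing a slowly-decaying runtime distribution that a constant restart interval improves.

    import random

    def runtime(p_mistake=0.3, base=1.0, factor=2.0):
        """Each mistake (probability p_mistake) multiplies the remaining work by factor,
        so a geometric number of mistakes yields an exponentially-growing workload."""
        t = base
        while random.random() < p_mistake:
            t *= factor
        return t

    def restarted_runtime(interval, **kwargs):
        """Abandon any attempt that exceeds the restart interval and begin anew."""
        total = 0.0
        while True:
            t = runtime(**kwargs)
            if t <= interval:
                return total + t
            total += interval

    trials = 100_000
    plain = sum(runtime() for _ in range(trials)) / trials
    cut = sum(restarted_runtime(interval=2.0) for _ in range(trials)) / trials
    print(f"mean without restarts: {plain:.2f}, with restarts at tau = 2: {cut:.2f}")

Even with these arbitrary parameters, the restarted mean comes out noticeably below the unrestarted one, which is exactly the effect the heavy-tailed analysis predicts.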

4.1 Distributions to Runtimes and Restarts

The most general theory of restarts treats the subject process as a black box with a random runtime (a so-called Las Vegas algorithm, which provides correct results but with random required resources). In this context and with complete information about the distribution of runtimes, it can be proven that the optimal restart strategy simply begins anew at a constant interval. Alternatively, the universal strategy [49], which devotes balanced amounts of computation to instances of varied length and is described in Figure 4.1, performs within O(t log t) of the expected runtime of the best possible strategy. In either case, the various processes involved in the RRT planner make its runtime effectively a continuous quantity, so a like treatment of the characteristics of relevant runtime distributions is applied. If experimental data is available to gauge the exact distribution of runtimes for a particular setup of problem and query solved via the RRT planner, the (optimal) constant interval restart strategy can be used; complete derivations are included in Appendix A.3.

79 Figure 4.1: Balanced universal restart strategy. This strategy spends an equal amount of time on instances of lengths of powers of two. At the pictured time (vertical line), four instances have run to length one, two to length two, and one to length four, while no longer runs have been executed.
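One standard construction that realizes the balance sketched in Figure 4.1 is the Luby sequence (1, 1, 2, 1, 1, 2, 4, ...), which devotes roughly equal total time to each power-of-two cutoff. The generator below is a generic sketch, scaled by a basis (unit) runtime; it is not code taken from the experiments in this chapter.

    def luby(i):
        """i-th term (1-indexed) of the Luby universal restart sequence."""
        k = 1
        while (1 << k) - 1 < i:
            k += 1
        if (1 << k) - 1 == i:
            return 1 << (k - 1)
        return luby(i - (1 << (k - 1)) + 1)

    def universal_cutoffs(n, unit=1):
        """First n restart cutoffs, scaled by the basis (unit) runtime."""
        return [unit * luby(i) for i in range(1, n + 1)]

    print(universal_cutoffs(15))
    # [1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8]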

Given a random runtime density $f_t(t)$ (and the associated distribution $F_t(t) = \int_0^t f_t(u)\,du$), Equation (4.1) defines the density of output runtimes from a restart strategy with a constant interval ($\tau$). This density is composed of exponentially-scaled copies of the lower portion of the underlying density (with base $1 - F_t(\tau)$ and decay rate $\tau^{-1}$); the tail of the underlying distribution has been truncated. As a result, slowly-decaying distributions with power-like tails can be improved by such a restart strategy. Specifically, Equations (4.2) and (4.3) give their altered means and variances. These relationships allow the construction of comprehensive diagrams of the effects of a constant restart interval on the mean and variation (such as the ones in Figure 4.2), which establish the optimal value for the restart interval. Equation (4.4) provides a resulting condition on the range of useful (mean-reducing) restart intervals, and Equation (4.5) describes the optimal mean in terms of the optimal restart interval. Still, the viability of this type of restart strategy depends on either experimental data or analytical knowledge of the runtime distribution.

\[ f_{t_r}(t, \tau) = \bigl(1 - F_t(\tau)\bigr)^{\lfloor t/\tau \rfloor}\, f_t\!\left(t - \tau \left\lfloor \tfrac{t}{\tau} \right\rfloor\right) \tag{4.1} \]

\[ \mu_{t_r}(\tau) = \frac{1}{F_t(\tau)} \left[ \int_0^{\tau} u\, f_t(u)\, du + \tau \bigl(1 - F_t(\tau)\bigr) \right] \tag{4.2} \]


Figure 4.2: Restart statistical diagrams. These diagrams plot the mean in (a) and the coefficient of variation in (b) of a given performance measure (e.g., runtime) as a function of the constant restart interval. The optimal restart interval is then chosen as the value that minimizes the mean of the performance measure.

\[ \mu_{t^-}(\tau) \triangleq \frac{1}{F_t(\tau)} \int_0^{\tau} u\, f_t(u)\, du \]
\[ \sigma_{t_r}^2(\tau) = \frac{1}{F_t(\tau)} \int_0^{\tau} \bigl(u - \mu_{t^-}(\tau)\bigr)^2 f_t(u)\, du + \tau^2\, \frac{1 - F_t(\tau)}{F_t(\tau)^2} \tag{4.3} \]

\[ \mu_{t^+}(\tau) \triangleq \frac{1}{1 - F_t(\tau)} \int_{\tau}^{\infty} u\, f_t(u)\, du \]
\[ \frac{\tau}{F_t(\tau)} \le \mu_{t^+}(\tau) - \mu_{t^-}(\tau) \tag{4.4} \]

\[ \mu_{t_r}(\tau^*) = \frac{1 - F_t(\tau^*)}{f_t(\tau^*)} \tag{4.5} \]
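When only an empirical sample of runtimes is available, the restarted mean of Equation (4.2) can be estimated directly and minimized over candidate intervals. The sketch below is a rough empirical version of that computation, under the assumptions that the sample is representative and that searching only the observed runtimes suffices (the empirical estimate attains its interval-wise minima there); it is not the procedure used to generate the diagrams in this chapter.

    def restarted_mean(runtimes, tau):
        """Empirical estimate of Equation (4.2): mean runtime under constant restarts at tau."""
        finished = [t for t in runtimes if t <= tau]
        if not finished:
            return float("inf")            # restarting below the minimum runtime never finishes
        p = len(finished) / len(runtimes)  # estimate of F_t(tau)
        mean_below = sum(finished) / len(finished)
        return mean_below + tau * (1.0 - p) / p

    def best_constant_interval(runtimes):
        """Search the observed runtimes themselves as candidate restart intervals."""
        return min(sorted(set(runtimes)), key=lambda tau: restarted_mean(runtimes, tau))

    runtimes = [0.3, 0.4, 0.5, 0.6, 12.0, 15.0]   # illustrative numbers only
    tau = best_constant_interval(runtimes)
    print(tau, restarted_mean(runtimes, tau))     # 0.6 and 0.75, versus an unrestarted mean of 4.8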

Several informative, analytical characteristics are available to expose the form of the RRT planner's runtime distribution. Most fundamentally, it has a proven exponential convergence bound on its underlying distribution of iterations [46]. While this may weaken the case for external restarts, it is also important to consider other factors. First, the runtime of the RRT planner as a function of the number of iterations has a linear lower bound because the nearest neighbor computation becomes more demanding over time. For example, if a naïve nearest neighbor approach is used, its total runtime is based on the square of the number of iterations (given that the number of nodes is roughly proportional to the number of iterations and that $\sum_{k=0}^{n} k = \frac{n(n+1)}{2}$). In turn, a random variable that is the square of another exponential random variable is Weibull distributed, which is a known heavy-tailed distribution [19]; Appendix A.4 presents the details of this relation. Second, distributional tails heavier than exponential are not necessarily required for restarts to be useful. For example, a runtime distribution that chooses uniformly between values one and five (with mean of three) benefits from a constant restart interval without having any gradually-decaying tail. Restarting at intervals of one transforms that original distribution into a geometric distribution with success probability one half (and resulting mean of two). Based on these factors, it is practical to examine the distributions associated with the RRT planner's runtime more directly in the form of experiments.
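The claim about squared exponential runtimes can also be illustrated numerically. In the toy model below, iteration counts are drawn from a unit-rate exponential distribution and the runtime is taken to be their square, whose survivor function exp(-sqrt(q)) is a Weibull (shape 1/2) tail that decays far more slowly than the exponential exp(-q); this is only a sanity check of the distributional statement, not a fit to any planner data.

    import math
    import random

    random.seed(0)
    iters = [random.expovariate(1.0) for _ in range(200_000)]
    runtimes = [x * x for x in iters]          # quadratic per-run cost model

    def survivor(xs, q):
        """Empirical survivor function P(X > q)."""
        return sum(1 for x in xs if x > q) / len(xs)

    for q in (1, 4, 9, 16, 25):
        # Empirical tail of the squared variable versus its theoretical value exp(-sqrt(q)).
        print(q, round(survivor(runtimes, q), 4), round(math.exp(-math.sqrt(q)), 4))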

4.2 Continuous Problems

For problems with a continuous state space, the central goal of motion planning is to locate a viable path through the configuration space obstacles. The continuum of complexity in the workspace geometry of these obstacles and the specificity of paths between them implies that there is no particular balance between the computational requirements of a planner's collision checking and its other processes. Collision checking is recognized as the significant cost in many practical cases [9, 60], though any algorithm using a process that demands increasing computation with time (e.g., nearest neighbor) eventually overwhelms the costs of non-increasing processes. However, in randomized algorithms, this may occur either in general due to the problem's complexity or only in particularly long-running cases. As a result, problem difficulty is a probable factor in determining the viability of restarts for the class of continuous problems.

4.2.1 Experiments

While the existence of tight constraints due to obstacles is not specific to the RRT planner, it is a principal indicator of problem difficulty in motion planning and therefore a reasonable candidate to introduce distributions with slowly-decaying tails. To examine this notion in a representative setting, Figure 4.3 visualizes the results of sets of 10,000 runs of the unidirectional RRT planner using the connect heuristic and a cover tree for nearest neighbor on the kinked tunnel. For each tested combination of tunnel width and dimension (with tunnel widths decreasing from $2^{-2.0}$ by unit increments in the exponent to $2^{-7.0}$ in 2D, to $2^{-5.0}$ in 3D, and to $2^{-3.0}$ in 4D), the runtime survivor function decays more slowly than the exponential and has small minimum values, leading to situations in which restarts are useful. Further, picking the optimal constant restart interval for each case provides increasing performance improvements as the tunnel width narrows or the dimension increases. While improvements in the coefficient of variation decline at the smallest tunnel widths, the impact is still consistently positive. With this observed tendency, more realistic but similar narrow passage problems in higher dimension are also reasonable candidates for the application of restarts.

With the general conclusion that narrow passages are an indicator of the usefulness of restarts, they provide a potentially valuable solution for many difficult problems in motion planning that involve a narrow passage and a higher-dimensional space. One such problem is the alpha puzzle, a typical benchmark with a six-dimensional state space (spatial rotation and translation) and a narrow passage whose width is defined by a set of four scaled versions (1.0, 1.1, 1.2, and 1.5, from narrow to wide). Figure 4.4 plots


Figure 4.3: The usefulness of restarts on various versions of the kinked tunnel. The runtime survivor functions of kinked tunnels with various tunnel widths and dimensions in (a) have positive concavity on semi-log axes (i.e., they decay more slowly than the exponential) and small minimum values; a properly-chosen restart interval benefits each one. The curves in (b) show that the potential improvement in both runtime mean and coefficient of variation due to restarts generally improves as the tunnel width shrinks.

the restart mean and variation diagrams for sets of 1,000 runs of the unidirectional

RRT planner using the connect heuristic and a cover tree for nearest neighbor. Notably, the easiest version with the widest passage (the 1.5-scaled version) displays no possible improvement due to restarts, while the next harder instance (the 1.2-scaled version) can be improved to a mean runtime and coefficient of variation of 59% and 42% of their base values, respectively. While these settings plausibly illustrate the role of narrow passages in justifying restarts, practical factors like nearest neighbor computation method and the availability of well-performing solution instances can also contribute to their utility.

Since the merits of restarts are based on the subject distribution in a complex way, a slow decay rate in its tail is not a sufficient qualification; rather, restarts require a preponderance of particularly fast and slow instances to be viable. This notion indicates that problems in the RRT planner's specific difficulty class (that have sparse well-performing instances) are not constructive cases for restarts. A rather extreme


Figure 4.4: Restart benefits for two easier versions of the alpha puzzle. While the mean in (a) and the coefficient of variation in (b) do not improve due to restarts in the easiest (1.5-scaled) version, they have a marked impact on the harder (1.2-scaled) version and its narrower passage. These judgements follow from the restart performance at the plotted optimal constant restart intervals (vertical dotted lines).

example (though two-dimensional) is the tube, on which the RRT planner (using the extend heuristic) has substantial freedom and likelihood to cover areas that surround and hamper its exploration through the narrow passage. Based on 10,000 runs using a cover tree for nearest neighbor, the mean runtime is 26 seconds; however, there is only a 1% chance of observing a run under 13 seconds. Unsurprisingly, restarts are not useful in that case. Further, in cases of more intermediate difficulty, such as the bug trap, the computational effects of the nearest neighbor method can make the sole difference between restarts being potentially beneficial and ineffective. Figure 4.5 illustrates the differences in runtime distribution and restart usefulness due to the nearest neighbor method, based on 10,000 runs of the RRT planner using the connect heuristic on the bug trap. In this case, the growth rate of the brute force list-based nearest neighbor method extends the distributional tail and makes restarts viable, but only (approximately) enough to correct for the excess runtime compared to using a cover tree. For this reason, it is also reasonable to examine other cases on which the RRT planner depends more acutely on its nearest neighbor component.


Figure 4.5: Impact of nearest neighbor computation method on runtimes and restarts in the RRT planner. The two survivor functions in (a) depict the difference in overall runtime and distributional tail on the bug trap between using a cover tree and a list for nearest neighbor computations. The restart mean diagram in (b) shows that this particular case is amenable to restarts solely due to the computational impact of the list growth; the optimal constant restart interval only corrects for this gap in performance.

4.3 Discrete Problems

Aside from the more typical continuous case, the RRT planner has also been applied to discrete search problems, on which it has demonstrated performance advantages compared to more traditional search algorithms [53, 52]. Unlike continuous problems, many discrete problems have no direct equivalent of collision checking; rather, the central goal is to locate a path that complies with the connectivity of the state transitions. However, much like the continuous case, the associated metrics provide only an indication of (and frequently, a lower bound on) true distance. This creates a similar overall problem (locate a path through a space using an inaccurate measure of distance) where the primary computational cost is redirected toward the management of explored nodes and calculations of the metric. Therefore, the runtime distributional tails are likely to be more significant than for many continuous problems and more amenable to restarts.

86 4.3.1 Planner Variations

In order to address a discrete environment, it is necessary to adapt the RRT planner to its unique challenges. Unlike the continuous case, in which the availability of a local planner is assumed, discrete problems frequently simply enumerate the successor states of a given parent state. By assuming that a target state is best reached from a parent state by a move to the closest one of these successor states, a "strict" interpretation of the RRT planner for discrete space is obtained. Unfortunately, this behavior can lead to failing expansions, in which a source is selected as the nearest neighbor but no successor is available closer to the target (either due to a lack of precision in the metric or the dynamics of the system's transitions). To avoid this occurrence, the set of source nodes (i.e., those that are used in nearest neighbor computation) can be limited to only those with unexpanded successor states, and the closest among these successors is chosen as the destination for expansion. This "thorough" approach yields the original discrete RRT planner [53, 52]. Finally, if the set of unexpanded successor states is used directly in the nearest neighbor computation (to select the destination for expansion and the associated parent node as the source), this leads to a "leafy" discrete RRT planner that outperforms the more basic version [53, 52]. Further, expanded states must be checked against existing nodes to avoid duplication, resulting in additional computational cost. These variations and complications for the algorithm make the discrete setting a separate and attractive application for restarts.
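The essential difference among the variants is which states enter the nearest neighbor computation. The fragment below sketches a single iteration of the "leafy" selection under assumed interfaces: successors(x) enumerates the legal moves of the problem and metric is its distance estimate, and the dictionary-based frontier is an implementation convenience rather than a detail taken from [53, 52].

    def leafy_step(tree, visited, frontier, target, successors, metric):
        """One 'leafy' iteration: the unexpanded successor nearest to the target is chosen
        directly, and the node that generated it supplies the connecting edge."""
        if not frontier:
            return None                                   # nothing left to expand
        new_state = min(frontier, key=lambda s: metric(s, target))
        parent = frontier.pop(new_state)                  # frontier maps state -> generating node
        node = {"state": new_state, "parent": parent}
        tree.append(node)
        visited.add(new_state)
        for s in successors(new_state):                   # avoid duplicating existing states
            if s not in visited and s not in frontier:
                frontier[s] = node
        return node

    # Tiny usage on a one-dimensional grid (illustrative only).
    tree = [{"state": 0, "parent": None}]
    visited = {0}
    frontier = {s: tree[0] for s in (-1, 1)}
    node = leafy_step(tree, visited, frontier, target=5,
                      successors=lambda x: [x - 1, x + 1],
                      metric=lambda a, b: abs(a - b))
    print(node["state"])   # 1: the unexpanded successor closest to the target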

4.3.2 Experiments

While in typical motion planning applications the difficulty is in navigating among obstacles, discrete problems can present a combination of this challenge and that of a large state space (in terms of the count of unique states). Thus, applicable problems can be constructed that mimic typical continuous problems, but others have no


Figure 4.6: Discrete example problems. The exchange in (a) involves switching the positions of two boxes that must move through a narrow tunnel. The 15-puzzle in (b) is a sliding puzzle that requires tiles (frequently depicting gridded segments of a picture that is properly arranged at the final state) to be sorted by moving those that are adjacent to a single empty space.

explicit obstacles. Example problems from each category are shown in Figure 4.6: the exchange, which requires two boxes to switch positions via a tunnel, and the 15-puzzle, which requires sliding tiles to be sorted by swapping them with a single empty tile position. The RRT planner performs well on many continuous problems resembling the former, while its greedy behavior, which focuses the majority of its initial effort on quickly expanding into the state space, is a natural method of addressing the large state space size on the latter.

The exchange is similar in nature to the kinked tunnel, though the explicit enumeration of its discrete states makes it a relatively modest problem with some complications. Its formulation is defined on a $(32 + 1)^3$ grid with two boxes that occupy single cells, yielding a total of $33^{3 \cdot 2}$ (about 1.3 billion) states, roughly half of which are in obstacles. However, it is made more complex by the potential for deadlock if both boxes attempt to enter the tunnel concurrently, as its width (of a single state) is insufficient to allow them to pass one another. Using a metric that sums each box's Euclidean distance between the two states, conditions in which the boxes


Figure 4.7: Performance and restart impact for RRT variants on the exchange. The runtime survivor functions for all three planner variants in (a) display significantly slower-than-exponential decay. As a result, the restart mean diagram in (b) shows that constant restart intervals offer notable improvements.

enter the tunnel appear closer to the final state than reality, since one box must back out of the tunnel. Figure 4.7 presents the results of 1,000 runs of each bidirectional planner variant on this problem, along with their corresponding potential restart performance. While the "leafy" planner is markedly weaker on this problem than the others, all three have potential improvements of roughly a factor of six due to restarts (the ratio of their best altered to base mean runtime is 17.7% for "strict," 15.5% for "thorough," and 16.9% for "leafy"). While this outcome is promising for the potential of restarts on discrete problems, its conceptual correspondence to continuous problems with similar distributions and restart potential makes it prudent to examine a qualitatively different example.

Unlike the exchange, the 15-puzzle has no restricted obstacle states, but its state space is significantly larger, making space-filling behavior a pivotal aspect of planner performance.

With a reachable state space of $16!/2$ (about 10 million million) states, it has a factor of 8,000 more states than the exchange (closer to a version of the exchange including a third moving box).


Figure 4.8: Performance and restart impact for RRT variants on 15-puzzle. While their runtime survivor decay in (a) is more moderate than on other problems, the restart mean diagram in (b) reveals the significant potential of restarts.

While the decay of the runtime survivor functions of 1,000 runs of the bidirectional RRT variants (with a single randomly-chosen query) shown in Figure 4.8 is more moderate than on the exchange, restarts can still provide up to a factor of three performance advantage (the ratio of their best altered to base mean runtime is 35.4% for "strict," 35.4% for "thorough," and 38.2% for "leafy"). However, this is a case in which a list-based nearest neighbor computation makes the difference in the viability of restarts: basing their use on the number of iterations offers no improvement in that measure. As with continuous problems, only examples with restrictive obstacle constraints or considerable growth in per-iteration computation are useful applications for restarts in the RRT planner.

4.4 Generalizing Restarts

To the extent that restarts can provide a performance enhancement for the RRT planner, minimizing the quantity of information needed to pick a sensible strategy is essential to facilitating their use. This is a two-fold challenge: it encompasses both appraising the relevance of restart intervals and strategies determined from specific situations (i.e., for a specific problem and query) to the larger problem and understanding the underlying causes of restart-feasible distributions. Further, any (online)

integrated and intelligent mechanism for restarts requires some form of signaling from the algorithm that apparent "mistakes" have occurred that indicate a restart is potentially beneficial. In total, the treatment of these issues establishes a basis for a more general handling of restarts in the RRT planner.

4.4.1 General Queries

While the RRT planner can perform quite differently when initiated from different positions relative to the same set of obstacles, positive performance impact can still be realized by applying a restart strategy across multiple queries in a fixed environment. The constant restart interval strategy can be disastrous if designed poorly (e.g., if the restart interval is below the minimum solution time), but its dynamics are generally lenient provided that the restart interval overestimates the magnitude of typical runtimes. As a result, the distribution of runtimes on a particularly difficult specific query can generalize to the process of solving a set of queries in a given environment. When a query is representative of such a difficult case, the restart strategy performs well, while it has little effect on comparatively easy queries. The universal restart strategy can apply similarly, though it has the advantage that no fixed cap is placed on runtime, allowing positive performance effects over a broader range of small restart intervals. Therefore, a reasonable restart strategy can be derived from a relatively small amount of information about the magnitude of possible runtimes on a given problem.

The kinked tunnel provides a good illustration of the capability of these general restart strategies; queries come in easy and hard versions based on whether the initial and final state lie on the same side of the tunnel. Figure 4.9 shows the mean runtime performance of the two restart strategies versus their basis restart intervals (i.e., the constant restart interval, or the unit runtime in the universal strategy) for a random sequence of 1,000 queries on the 2D and 3D kinked tunnel. Since the queries are


Figure 4.9: Constant and universal restart strategy performance of the RRT planner on the kinked tunnel with a random sequence of queries. For the 2D version (tunnel width of $2^{-6}$) in (a) and the 3D version (tunnel width of $2^{-4}$) in (b), both strategies provide performance improvements over a wide range of restart intervals around the marked value (vertical line) derived from the distribution from a single query. However, the universal restart strategy is more tolerant than the constant one without sacrificing any relative performance benefits.

uniform, approximately half require a solution path through the tunnel; most others only link two states on the same side. Both strategies surpass the performance of the RRT planner alone over a wide range of basis restart intervals around the value derived from a single query spanning the tunnel. However, the universal strategy allows an order of magnitude smaller basis restart interval than the constant strategy before its performance becomes worse than the RRT planner alone. Notably, the constant restart strategy offers little to no benefit over the universal strategy in this case. Still, this case does allow a straightforward definition (and clear delineation) of easy and hard queries, so a less straightforward example is instructive.

Unlike the kinked tunnel, the 15-puzzle contains no obstacles or narrow passages, so any conceptual differentiation between easy and hard query states (beyond the basis of the metric) is difficult. Accordingly, it is prudent to apply the universal restart strategy to arbitrary query states. For this purpose, the optimal constant restart interval from previously-discussed experiments with a fixed initial state provides a

gauge of the magnitude of the general runtime and a basis for the universal strategy. This initial state has an optimal solution using 53 moves (as solved by bidirectional A∗ [57]). In comparison, the longest optimal solution for the 15-puzzle uses 80 moves, for which there are relatively few initial states (on the order of ten known [12]). While bidirectional A∗ solves the sample query quickly (1.9 seconds compared to the 25.5 seconds taken by the bidirectional "leafy" RRT variant), it exceeds the 32-bit memory ceiling (at an average 156.723 second runtime) nearly 10% of the time when addressing random queries. The RRT planner does not suffer from this issue, and memory usage tends to be further constrained by a restart strategy in addition to the performance potential.

Using only a bare concept of the magnitude of the runtime of a given problem, as is the case on the 15-puzzle, a universal restart strategy can provide observable performance gains for the discrete RRT planner. With unit runtimes (4.478 seconds for "strict," 1.430 seconds for "thorough," and 0.917 seconds for "leafy") based on fixed initial state runtimes, the universal strategy cuts mean runtimes for the discrete RRT variants to approximately half over 1,000 randomly-chosen queries (58.3% for "strict," 51.9% for "thorough," and 48.2% for "leafy"). While not providing optimal paths, the resulting "thorough" and "leafy" variant average runtimes (12.597 seconds for "thorough" and 15.058 seconds for "leafy") are competitive with the average 19.746 seconds taken for bidirectional A∗ to complete only the 913 queries solvable under the 32-bit memory ceiling. Including the runtimes at which the remaining 87 queries failed in the average indicates that the true mean runtime of bidirectional A∗ is greater than 31.663 seconds. Thus, the wide range of applicable unit runtimes in the universal restart strategy allows it to have a positive impact on discrete RRT variant performance across a variety of queries with little information.

While this general application of constant and universal restart strategies has the potential for practical use in reducing the RRT planner's growing incremental

runtime and exploratory missteps, it also exposes that these effects have a clear negative impact on its performance. In the former case, that negative impact can be addressed using improved nearest neighbor methods like the cover tree. However, the latter case arises from the RRT planner's sampling and selection scheme, which is an integral part of the algorithm. Restarts provide a method of externally correcting these exploratory mistakes with the consequence that any useful information is also lost. As a result, their effectiveness could be increased if applied only to appropriate components of a problem in order to avoid the loss of any other useful exploration.

4.4.2 Task-based Decomposition

Associating the feasibility of restarts in the RRT planner with the presence of narrow passages in its problems leads to the issue of how the exploratory behavior of the algorithm produces the applicable runtime distributions. In simple problems, this exploration can be decomposed into a series of tasks (e.g., locate a passage, negotiate that passage, etc.) to investigate this question. For example, Figure 4.10 shows overlaid copies of RRT planner instances used to solve the individual tasks of the bug trap (reach the mouth, reach the final state) and the kinked tunnel (reach the tunnel, navigate the tunnel, reach the final state). The corresponding runtime survivor functions of 10,000 runs each are shown in Figure 4.11 along with the theoretical sum of the task runtimes. On both problems, the task of first locating the entrance to the narrow passage consumes the major share of runtime. The key difference with respect to restarts is that restarts are useful on the most time-consuming task on the kinked tunnel (locating the tunnel), while they are not on the comparable task on the bug trap (reaching the mouth). Notably, the tasks on which restarts are useful coincide with those that have the potential for exploratory impediments due to occlusion of the Voronoi visibility of useful nodes.

Runtime distributions that are amenable to restarts consist of significant densities


Figure 4.10: RRT planner instances solving tasks. On the bug trap in (a), the red instance reaches the mouth, while the blue instance explores from the mouth to the final state. On the kinked tunnel in (b), the red instance reaches the tunnel, the green instance traverses it, and the blue instance connects the tunnel exit to the final state.


Figure 4.11: Runtime survivor functions for RRT planner solving tasks. On the bug trap in (a), the two tasks both play a significant role in total runtime, and while locating the mouth takes more work, the navigation to the final state has a marked potential advantage for restarts. On the kinked tunnel in (b), each task is individually amenable to restarts, though the plot reveals that most runtime is devoted to locating the tunnel and a minor portion to navigating it.


Figure 4.12: Voronoi visibility issues on the task-based portions of problems. While growing from the mouth to the final state on the bug trap in (a), the RRT planner may explore backward, hampering its bias out of and around the bug trap. At a smaller scale, it may also have issues caused by the tunnel corners on the kinked tunnel in (b) and (c).

of short runs that can be used to cancel the effects of a significant density of long runs; the RRT planner supplies the latter side of this balance by occasionally compromising the Voronoi visibility of its most potentially useful nodes. Figure 4.12 displays several instances of this compromised Voronoi visibility arising on the task-based portions of the bug trap and kinked tunnel problems. In each corresponding task, the runtime survivor function has visible decay that is slower than exponential and could be improved via restarts. In contrast, the task of initially locating the mouth from inside the bug trap displays an approximately exponential runtime survivor function, suggesting that there is no incentive to erase any past exploration. This is consistent with the notion that it is quite likely the RRT planner will explore the lobes of the bug trap first and simply reproduce any exploratory circumstances with compromised Voronoi visibility. The overall viability of restarts on a complete problem is shaped by the appearance and severity of these effects.

Though the runtime distribution of the RRT planner on a partial problem may be beneficially modified via restarts, this is not necessarily indicative of the degree of their usefulness on the environment in general.


Figure 4.13: Runtime survivor functions for the RRT planner on the tube. For the task of exploring through the tube in (a), the survivor functions expose the decreasing presence of midrange instances (i.e., the widening horizontal portion) as the tube walls shrink. However, the complete problem's survivor functions in (b) show that the short instances are largely eliminated by the early exploratory behavior before reaching the tube entrance.

Figure 4.13 illustrates a series of runtime survivor functions of 10,000 runs each of the unidirectional RRT planner

(using the extend heuristic and a cover tree for nearest neighbor) on the tube. The runtime of the task of exploring from the mouth of the tube to the final state on the opposite side can be increasingly improved by restarts as the tube walls shrink (down to 0.6% of the base runtime when the tube’s wall thickness is equal to its passage width); however, the overall problem has more moderate potential improvement (to 42.3% of base runtime in the same obstacle conditions). While the short runs (under 8.0 milliseconds) that make these gains possible represent 30.1% of instances on the task, their presence on the overall problem is comparatively minor at 0.2%. These short runs have the advantage of not (randomly) encountering the Voronoi visibility issues that impair the performance of the long ones. As a result, a fully general (online) application of restarts to the RRT planner requires a method of identifying the presence of these Voronoi visibility issues as they occur.

4.4.3 Algorithmic Measures

Distinguishing problems with the Voronoi visibility issue in a growing RRT instance necessitates a method of reliably measuring it over time. While recognizing overestimation is relatively straightforward because it manifests as frequent collisions when expanding specific nodes, even this measure is ambiguous, as frequent collisions are also associated with narrow passage nodes that are potentially quite valuable for exploration. Recognizing underestimation is comparatively difficult, since the primary symptom is that such nodes are rarely selected for expansion and little information is available. Further, with such little information, nodes with Voronoi regions underestimating their visibility have no clear demarcation from those that are simply surrounded by other nodes. In fact, the latter class of nodes can easily have total visibility that is comparable to or greater than the former, depending on the particular obstacles that surround each one. In the fixed context of the RRT planner's operation, this implies strongly that any measure of these inaccurate estimations must be indirect.

As the RRT planner grows incrementally into its configuration space, continually increasing coverage is a positive performance indicator. An approximate definition of this coverage can also be derived from the sampling action of the algorithm by tracking an upper bound on the distance to colliding states. For each discovered colliding state, that upper bound is updated at the expanding node; the set of random samples for which the distance to the nearest neighbor node is below that neighbor's upper bound distance is then a reasonable gauge of coverage. Figure 4.14 plots the cumulative portion of samples that falls inside these coverage regions in the best and worst cases (out of 100 runs) on the bug trap and the kinked tunnel. While they do not provide an unambiguous indication of when to trigger a restart, total coverage and the expandability of nodes toward distant target states present clear qualitative signs of overall performance.
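A minimal version of this bookkeeping might look as follows; the dictionary-based node representation, the Euclidean distance, and the helper names are assumptions made for illustration rather than details of the measured implementation.

    import math

    def record_collision(node, colliding_state):
        """Tighten the expanding node's upper bound on its distance to colliding states."""
        d = math.dist(node["state"], colliding_state)
        node["obstacle_bound"] = min(node.get("obstacle_bound", math.inf), d)

    def sample_covered(sample, nearest_node):
        """A sample counts as covered if it lies within its nearest node's obstacle bound."""
        return math.dist(nearest_node["state"], sample) < nearest_node.get("obstacle_bound", math.inf)

    def coverage_fraction(covered_flags):
        """Cumulative portion of samples that fell inside the covered region."""
        return sum(covered_flags) / max(len(covered_flags), 1)

    node = {"state": (0.0, 0.0)}
    record_collision(node, (0.5, 0.0))        # an observed collision 0.5 away
    print(sample_covered((0.2, 0.1), node))   # True: inside the 0.5 bound
    print(sample_covered((1.0, 1.0), node))   # False: beyond the bound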


Figure 4.14: Cumulative measures of RRT planner coverage. On the bug trap in (a) and the kinked tunnel in (b), the total portion of samples falling inside the RRT planner’s covered region (coverage) consistently decreases on poor runs, while the portion of samples outside that region that stimulate successful expansion of new nodes (uncovered expanding) stays small. Horizontal lines measure the portions of collision-free space (for the bug trap interior and total space on the bug trap and the before-tunnel area and total space on the kinked tunnel), and vertical lines indicate the completion of the tasks (crossing the mouth on the bug trap and entering and exiting the tunnel on the kinked tunnel).

The evolving coverage of the RRT planner leaves useful clues to its ultimate performance. First, the total share of samples that falls in covered space approaches that of the area before the narrow passages that block the planner's progress on poor runs. Second, samples in the uncovered area provide a discernible fraction of nodes on good runs and next to none on poor runs. Notably, the best runs on the kinked tunnel complete without placing any collision-free samples in the narrow passage; rather, all nodes created there represent incremental growth toward far-away target states, reflecting the importance of their bias to the algorithm. The chief consequence for restarts is that the presence of misleading nodes is marked by especially low rates of node creation for target states outside covered space. Unfortunately, the precise rates are a complex function of the shapes, sizes, and balance of open areas and narrow passages in the configuration space (e.g., the best runs on the bug trap create more than 20% of their nodes in seeking random samples outside their current coverage

region, while such runs on the kinked tunnel create fewer than 10% of their nodes in this way). Further, narrow passages remain difficult to identify by this measure, as their undiscovered sections represent unmeasured information to the incremental exploration of the RRT planner. Properly-executed restarts allow the RRT planner to circumvent the negative consequences of Voronoi visibility issues signaled by these coverage measures.

4.5 Summary

This chapter has examined the use of restart strategies in countering the slow decay and resulting high variance in the performance distributions of the RRT planner. These effects exist across both continuous and discrete problems due to factors including the increasing time required by the nearest neighbor computation and the propensity toward growth into "trick" states as observed in the previous chapter. In many of these situations, restarts based on the universal strategy have demonstrated the capacity to improve performance across queries when informed by an approximation of the RRT planner's runtimes on a given environment and specific query. These strategies are also markedly more consistent across queries than ones strictly based on an optimal constant restart interval. Additionally, the RRT planner's performance on decomposed tasks has shown that restart-amenable distributions arise on individual portions of problems, so ultimately, the most general restart strategies must identify these tasks and the associated useful nodes. Signatures of the delays between useful exploration caused by coverage of "trick" states are present in the RRT planner's frequency of expansion toward distant target states; however, these signs are difficult to differentiate from normal stagnation associated with finding and negotiating narrow passages.

Chapter 5

Neighborhood-based Expansion

In sampling-based motion planning, the success of an algorithm rests on the appropriateness of its sampling and the subsequent explored path segments to its environment. In the case of the RRT planner, these characteristics are a function of its greedy behavior in exploring toward unoccupied areas; this greedy behavior is constructive in many cases, but it is also responsible for poor and highly variable performance in others. However, these negative performance aspects are not inherently guaranteed by the algorithm's incremental nearest neighbor expansion, as the experiments in Chapter 4 demonstrate its affinity for handling component tasks of problems with less total effort than the complete problems. Thus, reasonably identifying these tasks and leveraging the RRT planner's particular strengths as appropriate to the environment is an important challenge in improving the algorithm.

5.1 Locally-isolated Expansion

Given a measure of distance that is a function of two arguments in the state space rather than the configuration space, as is typical, it is natural to question its accuracy over large relative values if the environment is complex. Figure 5.1 shows the relationships between the Euclidean metric and true cost-to-go on two problems, which


Figure 5.1: Metric and cost-to-go relationships. On both the maze in (a) and the tube in (b), randomly-selected pairs of states show that the Euclidean metric tends to become a less reliable measure of true cost-to-go as the states separate. illustrate that the metric tends to be more reliable at smaller distances. While the RRT planner determines its expansion via an application of the metric on a global scale, other sampling-based planning algorithms enforce a limited trust of the metric that is more consistent with this observation. The PRM planner, for instance, only attempts connections that are measured to be less than a fixed radius parameter. More recent variants, the Roadmap of Trees planners [1, 6, 56], have also recognized the usefulness of localized, incremental search. Similarly, the SBL planner generates configurations with a receding radius that is initialized to a fixed radius parameter and only creates connections within that distance. While the RRT planner commonly limits the distance it explores via the step size, the bias and direction of its exploration are influenced by its global use of the metric. A more sensible use of the metric is therefore a realistic avenue to achieve improved performance from the RRT planner. While the exploration radius horizon in the DD-RRT planner [67] addresses a part of the antecedent algorithm’s performance issues, an alternate mechanism is desirable to retain a strong bias for nodes with small Voronoi regions but useful visibility. In contrast to the dynamic domain concept, which caps the distance from nodes to acceptable exploratory targets, a cap on the distance from the origin of the expansion

to acceptable expansion nodes establishes a form of locally-isolated expansion. Since the RRT planner has an observed tendency to produce path lengths within a small factor of the optimal [43], its cost-to-come (i.e., the path length from the root to a given node) is likely a more suitable measure than absolute distance as it accounts for paths that are forced by obstacles to be jagged or twisted within a small absolute area. Forbidding the expansion of nodes meeting or exceeding a certain cost-to-come from the root creates an RRT instance that has initially strong outward bias, but soon focuses on local exploration as nodes are created that violate that cost-to-come threshold. In such a case, nodes near the edges of obstacles have difficulty reaching this threshold and continue to probe the obstacle for narrow passages, while those in empty space grow to the threshold and curtail further expansion. While not a complete algorithm on its own, this mechanism combines the bias that makes the RRT planner apt for solving certain narrow passage problems with a local focus that ameliorates the potential of the metric to mislead the exploration.

Applying locally-isolated expansion in cases that typically cause problems for the RRT planner allows it to discover important locations sooner and realize more complete exploration earlier. Unrestricted, it errs on difficult problems by exploring areas that hurt its overall Voronoi visibility; imposing a cost-to-come threshold (labeled

$\rho_{limit}$) reduces its ability to cover these misleading areas by locking the expansion of nodes connected to the root by paths longer than that threshold. Figure 5.2 shows a sample of this locally-isolated expansion applied to the RRT planner on the bug trap, along with the viable and nonviable Voronoi regions and the survivor functions of the number of nodes required to cross the mouth. Clearly, versions with the smaller cost-to-come thresholds frequently cannot reach that destination at all; however, those with intermediate values perform better than the unrestricted one. These observations can be reconciled and the power of locally-isolated expansion leveraged by composing a complete algorithm that includes a method of initiating and growing


Figure 5.2: Locally-isolated expansion. The growing tree in (a) can expand nodes (such as the one at $x_s$) with paths from the root (at $x_i$) that are less than the cost-to-come threshold ($\rho_{limit}$). The corresponding sampling region in (b) reflects that Voronoi regions of nodes violating that cost-to-come threshold are nonviable (red) while others continue to be viable (green). The survivor functions of nodes created before crossing the mouth with various values of this threshold in (c) show that this measure of performance improves until the threshold becomes so small that it is difficult to reach the mouth at all (it is impossible below $2^{-3}$).

groups of these local trees. Such an algorithm resolves the inability of the individual local tree to expand over distance by introducing additional ones, while simultaneously benefiting from their superior coverage.
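In code, the gate amounts to a single comparison per candidate expansion; the schematic fragment below uses a hypothetical dictionary-based node structure and an illustrative threshold value, and is not the PART implementation itself.

    import math

    RHO_LIMIT = 4.0   # cost-to-come threshold (rho_limit); the value is illustrative only

    def expandable(node):
        """Only nodes whose path length from their root is within the threshold may expand."""
        return node["cost_to_come"] <= RHO_LIMIT

    def add_child(tree, parent, child_state):
        """Insert a child and accumulate its cost-to-come along the tree path."""
        child = {"state": child_state,
                 "parent": parent,
                 "cost_to_come": parent["cost_to_come"] + math.dist(parent["state"], child_state)}
        tree.append(child)
        return child

    root = {"state": (0.0, 0.0), "parent": None, "cost_to_come": 0.0}
    tree = [root]
    node = add_child(tree, root, (1.0, 1.0))
    print(expandable(node))   # True while the accumulated path length stays within RHO_LIMIT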

5.2 Path-length Annexed Random Tree (PART)

The Path-length Annexed Random Tree (PART) planner, which is listed as Algorithm 5.1, uses locally-isolated expansion by allocating multiple local trees in its environment. During each iteration, the algorithm compiles the set of local trees for which the nearest neighbor node to the target state has a cost-to-come from its root that is less than a constant cost-to-come threshold parameter ($\rho_{limit}$). These neighbors are then expanded incrementally in random order and in a manner analogous to that of the RRT planner. This process terminates only when the target state is reached or no non-colliding neighbors remain, resulting in functionality close to that

of the connect heuristic that maximizes the algorithm's ability to reach edges of the space. Unlike the RRT planner, which expands a single node during each iteration, the PART planner chooses from a set of candidate neighbors (one from each local tree) in order to increase its coverage among complex obstacles.

Algorithm 5.1: The Path-length Annexed Random Tree (PART) planner.

    γ ← plan(x_i, x_f)
     1   T_i::initialize(node(x_i)), T_f::initialize(node(x_f));
     2   T_α::initialize(T_i), T_β::initialize(T_f), x_r ← random_state();
     3   while true do
     4       T_s::initialize();                                 // set of expandable local trees
     5       for T_α ← T_α do                                   // each local tree in the current set
     6           [ρ, T_α.n_s] ← nearest_neighbor(T_α, x_r);
     7           if T_α.n_s.ρ ≤ ρ_limit then                    // local tree's neighbor is expandable
     8               T_s::insert(T_α);
     9       while !T_s::empty() ∧ (n_α.x ≠ x_r) do
    10           T_α ← random_element(T_s);
    11           [x_α, u_α] ← local_plan(T_α.n_s.x, x_r, ε);
    12           if !detect_collision(T_α.n_s.x; u_α, x_α) then
    13               n_α ← T_α::insert(T_α.n_s, node(x_α)), n_α.ρ ← T_α.n_s.ρ + metric(T_α.n_s.x, n_α.x);
    14               if n_α.ρ ≤ ρ_limit then                    // update still-valid neighbor
    15                   T_α.n_s ← n_α;
    16               else                                       // branch at invalid node
    17                   T_s::erase(T_α);                       // local tree's neighbor is no longer expandable
    18                   [c, T_c, n_c; T_s] ← branch(T_α ∪ T_β, T_α, n_α; T_s, x_r);
    19                   if !c then                             // new local tree created
    20                       T_α::insert(T_c), T_s::insert(T_c), T_c.n_s ← n_c;
    21                   else if T_c ∈ T_β then                 // branching connected the two sets
    22                       return compose_path(T_α, T_α, n_α, n_c, T_c, T_β; x_i, x_f);
    23           else                                           // colliding local tree no longer expandable
    24               T_s::erase(T_α);
    25       if n_α.x = x_r then
    26           if n_β.x = x_r then                            // both sets of local trees reached the target
    27               return compose_path(T_α, T_α, n_α, n_β, T_β, T_β; x_i, x_f);
    28           else                                           // one set of local trees can't reach the target
    29               x_r ← random_state();
    30       [T_β, T_α] ← swap(T_α, T_β), [n_β, n_α] ← swap(n_α, n_β);

The multiple selection strategy applied by the PART planner is based on the intuitive notion that multiple logical choices for expansion exist when searching for narrow passages. For example, a reasonable observer might choose among any node for which


Figure 5.3: Multiple selection in the PART planner. On the bug trap in (a) and the kinked tunnel in (b), each local tree contributes a nearest neighbor node, resulting in expansions that are roughly normal to obstacle surfaces and that speed the search for hard-to-find narrow passages.

the direction toward the target state is roughly normal to the interior surface of the nearby obstacle; Figure 5.3 illustrates the appearance of such alternatives during runs of the PART planner. Since it is awkward to define the concept of an obstacle surface via discrete samples, the PART planner uses the separation induced by the cost-to-come threshold to base that definition on the absence of other nearby nodes instead. The resulting selection of multiple neighbor nodes assumes that pursuing that set, while requiring more effort per target state than the single nearest neighbor node, is more conducive to useful coverage of the configuration space. However, this selection mechanism further requires a method to create and place multiple local trees.

By defining a branching and connection strategy, which is listed as Algorithm 5.2, the PART planner allocates local trees as needed. Branching is initiated whenever the expansion of a node creates a child node that newly exceeds the cost-to-come threshold. In order to preserve the expandability of that branching node in the search as a whole, the process requires that a new local tree be rooted at the branching state if no other local tree can reach that state in compliance with the cost-to-come threshold. The verification process for the latter possibility is conducted in a three-tiered manner:


Figure 5.4: PART planner branching. Each plot visualizes one condition (dotted lines) in the branching process; passing local trees are indicated (bold lines). When one local tree's new node (red) exceeds the cost-to-come threshold, it is branched, and other local tree roots (blue) are first checked for proximity against the cost-to-come threshold in (a). Nearest neighbor nodes are computed for the complying local trees and their paths to the branching node are checked against the cost-to-come threshold in (b). Finally, those paths are collision-checked in order of increasing distance in (c) until one is located. If any step fails completely, a new local tree is rooted at the branching state.

first by metric, second by cost-to-come, and third by collision checking. First, the set of all local trees is narrowed to those whose roots are within the cost-to-come threshold as judged by the metric. Second, remaining local trees are pared down to include only those with nearest neighbor nodes offering the capability to reach the branching state within the cost-to-come threshold. Third, qualifying local trees are checked in order of increasing distance to determine if they can realize a collision-free path to the branching state. Figure 5.4 outlines an example of this progression. This branching and connection process ensures that the PART planner retains one expandable instance of each node at all times and mimics many characteristics of the conventional RRT planner.

Altogether, the PART planner represents a generalization of the RRT planner designed to improve performance in cases that are uniquely difficult for the latter. Specifically, if the cost-to-come threshold parameter is set above the range of

Algorithm 5.2: The Path-length Annexed Random Tree (PART) planner's branching and connection strategy. [c, Tc, nc; Ts] ← branch(T, Tb, nb; Ts, xr)

 1  Tp::initialize(T \ Tb);
 2  for Tp ∈ Tp do                                            // root cost-to-come step
 3      if ρlimit < metric(Tp::root().x, nb.x) then
 4          Tp::erase(Tp);
 5  if !Tp::empty() then
 6      for Tp ∈ Tp do                                        // neighbor cost-to-come step
 7          [ρ, Tp.np] ← nearest_neighbor(Tp, nb.x);          // nearest in each local tree
 8          if ρlimit < Tp.np.ρ + metric(Tp.np.x, nb.x) then
 9              Tp::erase(Tp);
10  while !Tp::empty() do                                     // collision checking step
11      [ρn, Tn] ← nearest_neighbor(Tp.T.np, nb.x);           // nearest of valid local trees
12      while true do
13          [xc, uc] ← local_plan(Tn.np.x, nb.x, ε);
14          if !detect_collision(Tn.np.x; uc, xc) then
15              Tn.np ← Tn::insert(Tn.np, node(xc));
16              Tn.np.ρ ← Tn.np::parent().ρ + metric(Tn.np::parent().x, Tn.np.x);
17              if metric(Tn.np.x, xr) < metric(Tn.ns.x, xr) then
18                  Tn.ns ← Tn.np;
19              if (Tn ∉ Ts) ∧ (Tn.ns.ρ ≤ ρlimit) then
20                  Ts::insert(Tn);
21              if Tn.np.x = nb.x then                        // connection found
22                  return [true, Tn, Tn.np; Ts];
23          else
24              Tp::erase(Tn);
25              break;
26  Tc::initialize(node(nb.x)), Tc::root().ρ ← 0;             // no connection; create new local tree
27  return [false, Tc, Tc::root(); Ts];

path lengths achievable in a given environment (or to infinity), the PART planner

reproduces the conventional RRT planner with the connect heuristic because no branching occurs. For smaller values, local trees grow only to a finite size based on the cost-to-come threshold and then expand only to local target states; however, the branching process guarantees that any node exceeding the cost-to-come threshold has a coinciding copy below the threshold in a different local tree. This assures that the global nearest neighbor node can always be expanded toward a given target state if no other neighbor nodes can reach it, thereby retaining the expansion dynamics and completeness of the RRT planner. Additionally, a bidirectional version of the PART planner can be provided via the same procedure as the RRT planner: by expanding one tree to a random state and expanding the opposing tree toward the resulting node while alternating between trees. However, the PART planner can benefit by opening its entire set of local trees to connection in the branching process, diversifying the attempts to link the two trees. Despite these similarities, the PART planner performs at or beyond the standard of the RRT planner on a variety of problems by means of its locally-isolated expansion.
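One compact way to view the branching test of Algorithm 5.2 is as three successively more expensive filters over the other local trees. The sketch below, reusing the hypothetical LocalTree class from the earlier sketch plus a caller-supplied collision-checking local planner, is an illustrative approximation only: it reports whether some existing tree can claim the branching state, whereas Algorithm 5.2 additionally grows the connecting tree toward that state.

    def can_reach_branch_state(trees, branch_tree, x_b, rho_limit, metric, collision_free_path):
        """Return a local tree able to reach the branching state x_b within rho_limit, else None."""
        # Tier 1 (metric): keep trees whose roots are within the cost-to-come threshold of x_b.
        candidates = [t for t in trees
                      if t is not branch_tree and metric(t.root.x, x_b) <= rho_limit]
        # Tier 2 (cost-to-come): keep trees whose nearest node can reach x_b within the threshold.
        reachable = []
        for t in candidates:
            d, n = t.nearest_neighbor(x_b, metric)
            if n.rho + metric(n.x, x_b) <= rho_limit:
                reachable.append((d, t, n))
        # Tier 3 (collision checking): test candidates in order of increasing distance until one succeeds.
        for d, t, n in sorted(reachable, key=lambda item: item[0]):
            if collision_free_path(n.x, x_b):
                return t
        return None  # no tree qualifies, so a new local tree is rooted at x_b

Only the third tier pays for collision checks, which is why the cheaper metric and cost-to-come filters are applied first.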

5.3 Performance

The capabilities of new sampling-based motion planning algorithms are best illustrated through experiments on recognizable benchmark problems. The sections that follow compare the PART planner to several existing single-query algorithms, including the conventional RRT, Dynamic Domain RRT (DD-RRT), and Single-query Bidirectional Lazy (SBL) planners. In all cases, algorithms with a step size use an informally-estimated "incremental" value; the PART planner then uses a cost-to-come threshold of eight times that step size. Summary statistics are presented to gauge several aspects of their performance (collision checks for overall exploration, search nodes for memory usage, and runtime for computational effort); iteration counts are included for completeness but should not be compared between algorithms due to their differing investment of effort per iteration. The tested implementations of the unidirectional RRT-based planners record the nearest neighbor node to the final state and attempt to connect to it whenever that neighbor changes, while the bidirectional ones balance their trees by node count and implement the extcon or concon procedures based on experimental evidence of their suitability for holonomic problems [38, 46]. The choice between unidirectional and bidirectional versions of the algorithms is based on the conditions of each problem, which fall into two categories: ones in 2D that demonstrate the functionality of the PART algorithm and ones in SE(3) that better display its particular strengths.
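The tables that follow report each measure as a mean together with its coefficient of variation. A minimal sketch of that summary, assuming the raw per-run measurements are simply collected in a list (the values shown are hypothetical), is:

    import statistics

    def summarize(runs):
        """Return (mean, coefficient of variation) for a list of per-run measurements."""
        mu = statistics.mean(runs)
        sigma = statistics.pstdev(runs)    # standard deviation over the recorded runs
        return mu, sigma / mu

    collision_checks = [4647, 4388, 4912]  # hypothetical per-run values
    mu, cv = summarize(collision_checks)
    print(f"{mu:.0f} ({cv:.2f})")          # formatted like the table entries that follow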

5.3.1 Demonstrative Problems

Experiments on 2D problems provide a convenient visualization of the operation of a motion planning algorithm while also allowing the construction of cases that elicit pathological responses from certain planners. For this purpose, the tests that follow examine three qualitatively different problems: the maze, the kinked tunnel, and the bug trap. These problems, which have been fully specified and illustrated in Chapter 1, represent three distinctive characteristics of possible planning problems: the maze contains no narrow passages but its solution meanders through the configuration space, the kinked tunnel involves an extended narrow passage, and the bug trap presents a situation that typically misleads the RRT planner due to the convenient access to areas surrounding its short narrow passage. Since each of these problems shares the [0, 1]^2 state space, similar parameters are applied across all algorithms: the step sizes are 2^-6 (ε), the metric-measured limits (i.e., the DD-RRT planner's basis domain radius, the SBL planner's initial sampling radius, and the PART planner's cost-to-come threshold) are 2^-3 (ρlimit), and the DD-RRT planner's radius adjustment

                   Collision Checks   Search Nodes   Iterations    Runtime (s)
RRT (extcon)         4647 (0.17)       774 (0.15)    1699 (0.21)   0.028 (0.31)
RRT (concon)         2638 (0.19)       617 (0.18)     307 (0.39)   0.007 (0.36)
DD-RRT (extcon)      4151 (0.18)       996 (0.19)    2399 (0.24)   0.044 (0.37)
DD-RRT (concon)      2749 (0.19)       711 (0.19)     470 (0.40)   0.012 (0.42)
SBL                  4554 (0.16)      1431 (0.22)    1428 (0.22)   0.014 (0.33)
PART                 4400 (0.28)      1097 (0.26)      76 (0.45)   0.009 (0.37)

Table 5.1: Maze (bidirectional) simulation results evaluating the PART planner. Entries present mean (µ) and coefficient of variation (σ/µ).

factor is 2^-4 (α). The Euclidean metric is also used on all problems. All reported statistics are computed from 1,000 runs of each algorithm on each problem; because collision checks estimate exploration and dominate runtime for many realistic cases, they are highlighted as the relevant performance measure. In total, these problems provide a broad picture of the behavior of each of the compared algorithms.

The maze serves a dual purpose; while it is a relatively easy problem on which to validate a planning algorithm, the circuitousness of its solution path induces a limited degree of the misleading bias that is responsible for poor performance by the RRT planner. The ease with which the problem is solved is reflected in the statistics presented in Table 5.1: coefficients of variation across all measures and algorithms are quite small. Furthermore, all the planners perform similarly with the exception of the RRT-based planners using the connect heuristic for their exploration. The extended growth provided by connect alleviates the minor growth bias and explores back toward older nodes more quickly than the extend heuristic; it also covers the long, straight parts of the maze very quickly. Dynamic domain sampling is also of some use, as it avoids collision checks for exploration that would only attempt to probe the solid walls of the maze, though this is not advantageous with connect exploration. While the extra multiple selection and expansion work done by the PART planner on an easy problem like the maze is unnecessary, it does not sacrifice significant performance compared to the field of planners as a result.

The kinked tunnel introduces a narrow passage to the comparison but still does not

                   Collision Checks   Search Nodes   Iterations    Runtime (s)
RRT (extcon)         4761 (1.06)       705 (1.05)    2251 (1.30)   0.059 (3.29)
RRT (concon)         5826 (0.81)      1125 (0.70)    2050 (1.30)   0.075 (2.95)
DD-RRT (extcon)      8058 (0.40)      2266 (0.51)    8473 (0.54)   0.355 (1.18)
DD-RRT (concon)      8411 (0.33)      2527 (0.41)    6711 (0.59)   0.340 (1.21)
SBL                  9665 (0.28)      6298 (0.31)    6295 (0.31)   0.161 (0.61)
PART                 6787 (0.90)      1304 (0.53)     472 (1.90)   0.033 (2.93)

Table 5.2: Kinked tunnel (bidirectional) simulation results evaluating the PART planner. Entries present mean (µ) and coefficient of variation (σ/µ).

include any significant potential to mislead the RRT planner's growth. As a result, the statistics in Table 5.2 show that the conventional RRT planner performs best, but unlike on the maze, the extend heuristic is a superior choice for exploration due to the greater pull the trees exert on each other. Dynamic domain sampling and the pseudo-incremental sampling of the SBL planner do not perform as well. Since collisions are likely in the tunnel, the former frequently blocks large parts of the Voronoi regions of nodes in that tunnel and prevents beneficial expansion. The latter sampling scheme is likely to generate a collision-free sample in the open area outside the tunnel even when selecting a node near the entrance, as the area of the tunnel is comparatively small. The PART planner's performance is second only to the conventional RRT planners because it retains the beneficial bias that pulls them through the tunnel quickly. Its additional effort is again unnecessary but carries a minimal cost on a problem that is relatively easy for the conventional RRT planner.

The bug trap adds an example of the type of obstacle situation that misleads the RRT planner. The PART planner then outperforms the conventional RRT planner using the connect heuristic in the statistics in Table 5.3, though the other newer planners perform better. Dynamic domain sampling is particularly useful on the bug trap: as the planner fills the interior area and awaits a useful sample in the small area near the mouth, the dynamic domain avoids collision checking for target states outside the bug trap. The SBL planner in this case is a unidirectional version that prevents the tree outside the bug trap from adding samples, but this does not

                    Collision Checks   Search Nodes   Iterations     Runtime (s)
RRT (connect)        14362 (0.56)      2170 (0.47)    10005 (0.73)   0.588 (1.41)
DD-RRT (connect)      7047 (0.36)      2161 (0.48)     9955 (0.72)   0.529 (1.37)
SBL                   4782 (0.70)      1224 (0.69)     1221 (0.69)   0.018 (1.16)
PART                 11963 (1.01)      1639 (0.53)     2437 (1.48)   0.175 (2.15)

Table 5.3: Bug trap (unidirectional) simulation results evaluating the PART planner. Entries present mean (µ) and coefficient of variation (σ/µ).

create a truly fair comparison because the algorithm simply reassigns any samples whose connecting paths collide with the bug trap to the exterior tree. The result is an amalgam of unidirectional and bidirectional search that has an unanswered advantage over the remaining algorithms. Nonetheless, the bug trap provides a conceptual example of a problem on which the RRT planner performs poorly and can be constructively replaced by the PART planner.

5.3.2 Realistic Benchmarks

Realistic problems involving 3D translation and rotation offer a gauge of the performance of a planning algorithm in real-world situations. The experiments in this section evaluate the algorithms on two such commonly-available assembly/disassembly problems: the flange and the alpha puzzle [66]. Both are available in a set of four scaled versions providing a range of difficulty (1.0, 0.95, 0.9, and 0.85 for the flange and 1.0, 1.1, 1.2, and 1.5 for the alpha puzzle). For the flange, all versions have the same qualitative solution, but the width of clearance (i.e., of the narrow passage) increases with the smaller scale factors, making the problem easier. For the alpha puzzle, there are two distinct qualitative solutions: the two harder versions (1.0 and 1.1) have a narrower gap and require a complex twisting motion to solve, while the two easier versions (1.2 and 1.5) have a wider gap and can be solved by a more straightforward sliding motion. Figure 5.5 depicts these particular problems and their solution paths.

Figure 5.5: Solution paths for the flange and the alpha puzzle. The top row shows the removal of the pipe from its socket on the flange, the middle row shows the removal of one alpha part from an easier, wide-gap alpha part (1.5 and 1.2), and the bottom row shows the removal of one alpha part from a harder, narrow-gap alpha part (1.1 and 1.0).

For their SE(3) state spaces, the metric used is the weighted sum of the Euclidean metric and the quaternion dot product [37] shown in Equation (2.1). Reported statistics are calculated from 100 runs of each algorithm on each problem; cases in which a problem is effectively unsolvable are terminated at a four-hour runtime and limited to 10 attempts. While collision checking dominates runtime for short runs, all algorithms use some type of growing computation (e.g., nearest neighbor) to select nodes, which eventually takes over for longer runs. Therefore, runtime is a fitting measure of performance on these more realistic problems. These benchmarks underline the PART planner's ability to solve realistic problems with difficult elements.

The flange is an instance of a typical assembly/disassembly situation that could arise under real-world conditions. It contains an extended narrow passage, at the end of which is the final state. As the base (1.0) versions of the models have very strict clearance (only about 1% of the pipe's diameter), the second-hardest (0.95) versions are used. The benchmark concentrates on the insertion version of the problem (i.e.,

where the initial state has the parts separated and the final state has them inserted); incremental exploration methods can solve the opposite (i.e., removal) problem quite easily because they simply have nowhere else to explore except the solution path. Accordingly, unidirectional versions of the algorithms are tested, including the pseudo-unidirectional version of the SBL planner used on the bug trap. These algorithms use step sizes of 2^-3 (ε) and metric-measured limits (i.e., the DD-RRT planner's basis domain radius, the SBL planner's initial sampling radius, and the PART planner's cost-to-come threshold) of 2^0 (ρlimit), while the DD-RRT planner's radius adjustment factor is 2^-4 (α). Their metric is a one-to-two weighting of Euclidean distance and quaternion dot product (i.e., ct = 1.0 and cr = 2.0). The algorithms' performance on this example is an indicator of their usefulness on difficult real-world problems.

The PART planner excels amidst the difficult conditions of the flange. Table 5.4 shows that while it solves the problem in an average of approximately 17.0 minutes, the other planners consistently fail to locate a solution within four hours. The relative dominance of the unidirectional RRT planner solving the removal problem compared to its consistent failure on the insertion problem is similar in nature to the asymmetry observed in problems like the bug trap. As a result of this phenomenon, the RRT-based planners have a weak bias to explore into the narrow passage once the surrounding area is explored (which occurs early as it is unobstructed). The SBL planner has a similar issue: as on the kinked tunnel, it is simply much easier (i.e., more likely) to add samples in the open area than in the narrow passage as the pseudo-unidirectional version cannot sample directly around the (inserted) final state. In contrast, the PART planner fills the space but does not compromise the bias of useful nodes into the narrow passage. In difficult problems like the flange, this mechanism allows the PART planner to significantly outperform other algorithms.

Much like the flange, the alpha puzzle is a narrow passage assembly/disassembly problem available in a range of difficulties; however, it does not become straightfor-

                    Collision Checks        Search Nodes          Iterations            Runtime (s)
RRT (connect)        ≥ 3.34 × 10^6          ≥ 8.11 × 10^5         ≥ 7.98 × 10^5         ≥ 14400
DD-RRT (connect)     ≥ 3.28 × 10^6          ≥ 8.08 × 10^5         ≥ 7.94 × 10^5         ≥ 14400
SBL                  ≥ 2.76 × 10^6          ≥ 1.79 × 10^6         ≥ 1.79 × 10^6         ≥ 14400
PART                 2.47 × 10^6 (0.40)     3.47 × 10^5 (0.31)    3.54 × 10^3 (0.40)    1019 (0.48)
RRT (connect)*       3.01 × 10^4 (0.38)     3.00 × 10^2 (0.16)    9.94 × 10^3 (0.38)    33 (0.35)

Table 5.4: Flange 0.95 (unidirectional) simulation results evaluating the PART planner. Entries present mean (µ) and coefficient of variation (σ/µ). *The final row summarizes the performance of the RRT planner in solving the reversed (removal) problem in order to validate the implementation.

ward when addressed in a particular direction. The two easier (1.2 and 1.5) versions of the models provide sufficient clearance to remove one from the other (i.e., negotiate the narrow passage) by only translational motion, while the two harder (1.0 and 1.1) versions require a more complex solution including rotation. To preserve this source of challenge while reducing solution times, the second-hardest (1.1) version is used. Since incremental exploration does not trivialize the alpha puzzle as it does the flange, bidirectional algorithms are tested. These algorithms use step sizes of 2^0 (ε) and metric-measured limits (i.e., the SBL planner's initial sampling radius and the PART planner's cost-to-come threshold) of 2^3 (ρlimit), while the DD-RRT planner's radius adjustment factor is 2^-4 (α). In this problem, the DD-RRT planner's basis domain radius is raised to a larger value of 2^5 (ρlimit) as smaller values dramatically increase the rate at which it rejects sample states. The metric is a one-to-32 weighting of Euclidean distance and quaternion dot product (i.e., ct = 1.0 and cr = 32.0). This example provides an illustration of the algorithms' performance in a practical, bidirectional setting.

When applied to a complex, bidirectional planning benchmark like the alpha puzzle, the PART planner further displays its advantages over the compared algorithms. As shown in Table 5.5, it completes the problem in approximately 18.5 minutes, while the other algorithms are not capable of solving it in reasonable runtime. Comparatively, the solution introduced by the slightly wider gap of the next-easier (1.2) version

                    Collision Checks        Search Nodes          Iterations            Runtime (s)
RRT (extcon)         ≥ 1.83 × 10^7          ≥ 4.93 × 10^4         ≥ 1.79 × 10^7         ≥ 14400
DD-RRT (extcon)      ≥ 7.45 × 10^5          ≥ 5.24 × 10^4         ≥ 1.98 × 10^7         ≥ 14400
SBL                  ≥ 3.05 × 10^6          ≥ 1.10 × 10^6         ≥ 1.10 × 10^6         ≥ 14400
PART                 5.71 × 10^6 (0.17)     5.31 × 10^5 (0.10)    5.98 × 10^3 (0.30)    1107 (0.23)
RRT (extcon)*        1.25 × 10^5 (2.56)     2.63 × 10^3 (0.65)    1.04 × 10^5 (2.95)    24 (3.59)

Table 5.5: Alpha puzzle 1.1 (bidirectional) simulation results evaluating the PART planner. Entries present mean (µ) and coefficient of variation (σ/µ). *The final row summarizes the performance of the RRT planner in solving the easier (1.2) problem in order to validate the implementation.

of the problem allows the conventional RRT planner to solve it quite easily (in less than 30 seconds). The RRT-based planners' two trees face different varieties of the same issue: when the exterior tree fills the surrounding space, it compromises its bias to probe the smaller region of states in which the two parts are connected, while the interior tree can more easily reach connected "trick" states that obstruct its expansion into the narrow passage. As on other examples, this tendency does not affect the PART planner, allowing it to outperform other algorithms on problems including potentially misleading obstacles like the alpha puzzle.
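The SE(3) distance used throughout these benchmarks is a weighted combination of a translational and a rotational term with weights ct and cr. The sketch below is one plausible reading of that weighted sum; the exact rotational term is defined by Equation (2.1) in Chapter 2, and the form used here (based on the magnitude of the quaternion dot product, which treats q and -q as the same rotation) is an assumption.

    import math

    def se3_distance(p1, q1, p2, q2, c_t=1.0, c_r=2.0):
        """Weighted SE(3) distance: Euclidean translation plus a quaternion-based rotation term.

        p1, p2 are 3D positions; q1, q2 are unit quaternions. The default weights match the
        flange settings above (c_t = 1.0, c_r = 2.0); the alpha puzzle uses c_r = 32.0.
        """
        translation = math.sqrt(sum((a - b) ** 2 for a, b in zip(p1, p2)))
        rotation = 1.0 - abs(sum(a * b for a, b in zip(q1, q2)))   # assumed form of the rotation term
        return c_t * translation + c_r * rotation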

5.4 Roadmaps and Paths

In addition to the performance advantages offered by the PART planner, its branching mechanism has the side effect of creating the type of explicit interconnections characteristic of roadmap planning methods. The algorithm can then output a roadmap of the reachable, explored space in addition to a simple solution path. Figure 5.6 shows examples of these roadmaps on 2D problems; owing to the constant cost-to-come threshold and the straight path segments produced by its connect-like operation, the distribution of local trees is very regular. Furthermore, the algorithm produces consistently- and moderately-sized roadmaps across the benchmark problems: on all 2D problems, the mean number of local trees is less than 100 (coefficients of variation of less than 0.2); on the flange, it averages 1,424 (coefficient of variation of 0.17); and on the alpha puzzle, it averages 11,084 (coefficient of variation of 0.08). These roadmaps can be retained for use on future queries if the environment is static, but they also have use in improving the solution path quality of the query for which they were generated.

Figure 5.6: Roadmaps generated by the PART planner. For ease of visualization, the roadmaps on the kinked tunnel in (a) and the maze in (b) include only the roots of local trees and their connections. Since these connections are actually composed of non-straight paths, this interpretation may intersect with obstacles.

The PART planner's roadmaps provide two methods of extracting a solution path at the completion of a run. Its foundation as a tree-based method means that a trivial solution path can be extracted by following the unambiguous parent/child relationship of the local trees up from the final node (with each local tree root connecting to the local tree node from which it was branched). Alternatively, the connections between local trees can be searched via a discrete search algorithm (e.g., A*) to produce a solution path of possibly improved quality. For the latter purpose, the roadmap can be interpreted in its fullest form, where both local tree roots and branching nodes are roadmap nodes, and connections linking local tree roots to their own branching nodes and branching nodes to one another are roadmap connections. Compared to the cost of actual planning, the cost to search those moderately-sized roadmaps is inconsequential.

While the trivial solution paths generated by the PART planner are not particularly circuitous, the discrete search approach can provide noticeable improvements in their total length. The PART planner's local tree exploration allows localized growth that is not biased by surrounding areas, so its solution paths can be improved by following those local trees' connections in the proper way. On the 2D problems, where the RRT-based planners provided solutions, this post-processing allows the solution paths from the PART planner to close a slight quality gap compared to those from the RRT planner. Specifically, discrete search processing takes the PART planner's solution paths from 9% longer to 1% shorter on the maze and from 6% longer to 2% shorter on the kinked tunnel, as compared to the RRT planner's solution paths. The margin of improvement increases on problems with large and open space; the PART planner's solution paths go from 16% to 30% shorter than those of the RRT planner on the bug trap. While the compared algorithms do not reasonably provide solution paths on the SE(3) problems, discrete search processing is more helpful in their case: it shortens flange solution paths by an average of 44% and alpha puzzle solution paths by an average of 35%. Thus, simple discrete search post-processing allows the PART planner to supply superior quality solutions without additional exploration of its environment.
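The discrete search itself can be as simple as Dijkstra's algorithm (or A* with an admissible heuristic) over a graph whose vertices are local tree roots and branching nodes and whose edge weights are the path lengths of the stored connections. A minimal sketch, assuming the roadmap has already been exported as an adjacency dictionary, is:

    import heapq

    def shortest_roadmap_path(adj, start, goal):
        """Dijkstra over a roadmap given as adj[u] = [(v, edge_length), ...]; returns a node list."""
        dist, prev = {start: 0.0}, {}
        frontier = [(0.0, start)]
        while frontier:
            d, u = heapq.heappop(frontier)
            if u == goal:
                break
            if d > dist.get(u, float("inf")):
                continue                       # stale queue entry
            for v, w in adj.get(u, []):
                nd = d + w
                if nd < dist.get(v, float("inf")):
                    dist[v], prev[v] = nd, u
                    heapq.heappush(frontier, (nd, v))
        if goal not in dist:
            return None                        # the roadmap does not connect start and goal
        path, node = [goal], goal
        while node != start:
            node = prev[node]
            path.append(node)
        return list(reversed(path))

The resulting sequence of roadmap nodes is then expanded back into the stored local-tree path segments that realize each connection.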

5.5 Summary

This chapter has introduced the Path-length Annexed Random Tree (PART) planner, which is based on the notion of localizing the expansion of the RRT planner in order to prevent the influence of "trick" states. The cost-to-come threshold that limits the expansion of the planner's local trees prevents individual ones from exploring deeply into the territory of such "trick" states, improving performance on difficult cases without sacrificing it in straightforward ones. This technique also represents a method of approximating and separately addressing the tasks discussed in the previous chapter by quantizing them with the cost-to-come threshold and leveraging the RRT planner's strength in quickly completing many such tasks. The cost-to-come thresholds tested experimentally have represented informally-estimated values, suggesting that advantageous selections can be derived from experience with a given problem. Further, these values are not likely to represent the best possible performance of the PART planner on each environment, as they have not been tuned in any way. Instead, their selections are small enough to avoid the weaknesses of the RRT planner (where they arise) and large enough to avoid unnecessarily dense coverage of local trees. The corresponding experiments have demonstrated the PART planner's strengths in solving benchmarks that are difficult for other, established algorithms and in providing roadmaps of the environment in addition to competitive-quality solution paths.

Chapter 6

Local Obstacle Adaptation

While the Path-length Annexed Random Tree (PART) planner addresses pathological cases that typically deceive the RRT planner into poor exploratory behavior, like the latter algorithm, it does not directly react to the presence of obstacles. Rather, the bias for its exploration is localized by an additional cost-to-come threshold parameter, and its compensation for obstacles is limited to the effects of its distribution of nodes in the same way as the RRT planner. As a consequence, implementations of the PART planner require that parameter to be selected, even in situations without convenient and complete visualizations (e.g., the SE(3) state space). The experiments in Chapter 5 use an informal estimation of this value; however, performance may suffer if a suboptimal choice is made. Therefore, it is important to thoroughly examine the effects of the cost-to-come threshold parameter and to investigate methods to select it automatically in the context of an individual problem.

6.1 Cost-to-come Thresholds

The chief purpose of the cost-to-come threshold in the PART planner is to isolate nodes from others that would damage their likelihood to provide useful exploration. At the same time, the local trees it separates also have overlapping regions of coverage, since each (non-root) local tree is rooted at a previously-branched node and can cover up to the cost-to-come threshold in growing back in the direction of that local tree's root. This implies that the number of local trees should be minimized and their size maximized, provided both aspects are reconciled with the possibility of misleading their RRT-like growth. With a constant cost-to-come threshold, these aspects are linked; smaller values lead to small and numerous local trees, while larger values lead to large and sparse local trees. In this context, the values should be selected to reduce or eliminate the possibility of a single local tree surrounding an obstacle in a way that compromises its Voronoi visibility, while also recognizing that smaller values can have their own negative effect by creating small (and densely-connected) local trees.

In light of the PART planner's equivalence to the RRT planner at large cost-to-come thresholds and the latter's history of experimental success, picking a cost-to-come threshold becomes a matter of assessing the value of purely greedy, nearest neighbor node selection on a given problem. On easy problems without misleading obstacles, the RRT planner performs well, so large cost-to-come thresholds are likely to lead to better performance than small ones. On hard problems that thwart the RRT planner's approach, small cost-to-come thresholds diminish that possibility. These assertions are borne out by the collision checking performance of varied cost-to-come thresholds shown in Figure 6.1, which tests the PART planner on the maze, the kinked tunnel, and the bug trap (at 1,000 runs per parameter value). On the easiest problem, the maze, the PART planner's collision-check count decreases monotonically as the cost-to-come threshold grows, while there is a moderate range of beneficial thresholds on the kinked tunnel and a wider and more favorable one on the bug trap. While these observations provide guidance on selecting the cost-to-come threshold, it is nevertheless straightforward to envision situations in which the choice is still made difficult by factors such as the complexity of obstacles or high dimensionality. In addition to the potential difficulty of picking a cost-to-come threshold for the


Figure 6.1: PART planner collision checking performance by cost-to-come threshold. On the maze in (a), the kinked tunnel in (b), and the bug trap in (c), the range of beneficial cost-to-come thresholds widens with increasing problem difficulty for the RRT planner. Mean values and standard deviation error bars are plotted; values used in the previous chapter are indicated with vertical lines.

PART planner, it is clear that the value should also vary across the configuration space to respond to the demands of nearby obstacles. For example, an ideal local tree used to escape the mouth of the bug trap should be small so it cannot explore into the lobes and create the same situation that hurts the RRT planner's performance. In contrast, a local tree entering it can be large because the bug trap's negative effects on growth have little influence in the reverse direction. Further, such a large local tree covers a greater region of space, reducing the total count of local trees and, therefore, the number of neighbor nodes that must be tested for expansion during each iteration. These facts highlight the potential effectiveness of an adaptive method of picking cost-to-come thresholds per local tree and based on local obstacle conditions.

6.2 Potentially-reachable Regions

Choosing a suitable cost-to-come threshold for a given local tree's conditions requires a method of determining the degree to which it can be misled by the interaction between the metric and nearby obstacles. The appearance of a target state that is not reachable by that local tree but is reachable in the broader view of the configuration space is a possible indication that the local tree is too loosely constrained by its cost-to-come threshold to properly explore its neighborhood. Conversely, absent such events, the local tree should be arbitrarily large to copy the RRT planner's behavior and avoid any overlapping coverage. A viable strategy should therefore mimic the RRT planner until enough information is available about the obstacles to influence an alternate method of operation. Such a strategy then allows greedy, nearest neighbor exploration in environments with little or no obstacle-based potential to hinder it and those with precise metrics.

The notion of using unreachable target states to prompt changes in the cost-to-come is challenging because many such states lie in unknown regions during the planning process; truly determining their reachability is equivalent to the general motion planning problem. Instead, a practical strategy can initialize arbitrarily large cost-to-come thresholds, liberally assume that any target state is reachable if it is also collision-free, and reduce those cost-to-come thresholds based on distances to such perceived reachable but blocked states. While this clearly results in smaller (and more numerous) local trees than would be appropriate with full reachability information, this strategy only requires the negligible cost of a single additional collision check per target state. Simultaneously, it is well-behaved on problems that are amenable to the conventional RRT planner. In obstacle-free environments, it never reduces the cost-to-come threshold, resulting in exploration that copies the RRT planner. If the optimal metric is available, as is displayed by the instances of the RRT planner shown in Figure 6.2, the cost-to-come threshold also remains large because only expansions with unreachable target states (that are infinitely far away) result in collisions. While necessarily liberal in decreasing cost-to-come thresholds, this strategy provides a practical method to determine them in response to local obstacles.
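The update itself is inexpensive: when a target state survives the single extra collision check but a colliding local tree failed to expand toward it, that tree's threshold is pulled down to (at most) the metric distance from its root to the blocked target. A minimal sketch of this rule, using hypothetical names for the tree fields and callbacks, is:

    def update_thresholds(colliding_trees, x_r, metric, in_collision):
        """Shrink the cost-to-come thresholds of local trees whose expansions toward x_r were blocked.

        Returns the trees whose thresholds actually shrank; they may now hold nodes that
        exceed the new threshold and therefore require re-branching.
        """
        if in_collision(x_r):          # the one additional collision check per target state
            return []                  # the target itself is in an obstacle: no evidence of a missed region
        shrunk = []
        for tree in colliding_trees:
            rho = metric(tree.root.x, x_r)
            if rho < tree.rho_limit:   # target looks reachable by the metric but is blocked: tighten
                tree.rho_limit = rho
                shrunk.append(tree)
        return shrunk

Any tree returned by this update may contain nodes whose cost-to-come now violates the reduced threshold; creating the local trees that such violations warrant is exactly the job of the shrinking process described in the next section.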


Figure 6.2: Behavior of the RRT planner with an optimal metric. While easily-computed metrics create a significant gap in difficulty between the maze in (a) and the tube in (b) for the RRT planner, access to the optimal metric largely erases it. Accordingly, good strategies for determining cost-to-come thresholds for local trees should account for this by preserving a large value and hence the minimal exploration of the RRT planner.

6.3 Adaptive PART (APART)

The adaptive PART (APART) algorithm, which is listed as Algorithm 6.1, builds upon the PART algorithm by replacing its constant cost-to-come thresholds with varying values stored with each local tree and updated based on local obstacle conditions. By presuming that potentially-reachable regions exist when blocked but collision-free target states are located, it decreases cost-to-come thresholds only as the exploration suggests. The core algorithm functions in much the same way as the original PART version in the previous chapter, with one exception: a list of colliding local trees is kept and used to shrink the cost-to-come thresholds of component local trees if the target state is collision-free. For simplicity, this only occurs at the conclusion of each iteration as a bulk process that updates any outstanding cost-to-come thresholds and creates any new local trees required as a result. These adjustments also merit small changes to the branching and connection operation. The branching and connection strategy for the APART planner, which is listed

Algorithm 6.1: The Adaptive Path-length Annexed Random Tree (APART) planner. γ ← plan(xi, xf)

 1  Ti::initialize(node(xi)), Tf::initialize(node(xf)), Tα::initialize(Ti), Tβ::initialize(Tf);
 2  while true do
 3      xr ← random_state();
 4      Ts::initialize(), To::initialize();                   // sets of expandable and colliding local trees
 5      for Tα ∈ Tα do
 6          [ρ, Tα.ns] ← nearest_neighbor(Tα, xr);
 7          if Tα.ns.ρ ≤ Tα.ρlimit then                       // local tree's neighbor is expandable
 8              Ts::insert(Tα);
 9      while !Ts::empty() ∧ (nα.x ≠ xr) do
10          Tα ← random_element(Ts);
11          [xα, uα] ← local_plan(Tα.ns.x, xr, ε);
12          if !detect_collision(Tα.ns.x; uα, xα) then
13              nα ← Tα::insert(Tα.ns, node(xα)), nα.ρ ← Tα.ns.ρ + metric(Tα.ns.x, nα.x);
14              if nα.ρ ≤ Tα.ρlimit then                      // update still-valid neighbor
15                  Tα.ns ← nα;
16              else                                          // branch at invalid node
17                  Ts::erase(Tα);                            // local tree's neighbor is no longer expandable
18                  [c, Tc, nc; Ts, Th] ← branch(Tα ∪ Tβ, Tα, nα; Ts, xr);
19                  if !c then                                // new local tree created
20                      Tα::insert(Tc), Ts::insert(Tc), Tc.ns ← nc;
21                  else if Tc ∈ Tβ then                      // branching connected the two sets
22                      return compose_path(Tα, Tα, nα, nc, Tc, Tβ; xi, xf);
23          else                                              // colliding local tree no longer expandable, may shrink
24              Ts::erase(Tα), To::insert(Tα);
25      if !detect_collision(xr) then
26          for To ∈ To do                                    // adjust thresholds for colliding local trees
27              ρ ← metric(To::root().x, xr);
28              if ρ < To.ρlimit then                         // shrink local tree threshold
29                  To.ρlimit ← ρ;
30                  Th::insert(To);
31      shrink(Tα ∪ Tβ, Th);                                  // process receding thresholds
32      [Tβ, Tα] ← swap(Tα, Tβ);

as Algorithm 6.2, updates the version applied by the PART planner to include cost-to-come threshold updates. Aside from adding consistency with the core algorithm, these updates have an important implication for the bidirectional algorithm: direct growth impetus between the two groups of local trees is no longer necessary to connect them. Since the complete set of local trees is used for connection and they shrink in response to collisions, blockages between the two groups shrink local trees along the edges of their coverage and precipitate further branching. As a result, the bidirectional APART planner can rely on this mechanism to link the two groups of local trees (i.e., to complete the query) and dispense with the need to directly grow them toward one another. However, the cost-to-come threshold updates also require a process to create the local trees that warrant branching as a result of these shrinking values.

The final component of the updated APART algorithm is the shrinking function, which is listed as Algorithm 6.3 and is responsible for enforcing the consistent branching of new local trees. It follows the completion of an iteration of the core algorithm by checking any local trees with recently-decreased cost-to-come thresholds to determine whether the decrease has moved any of that tree's nodes from inside the cost-to-come threshold to outside it. While such a node exists (and therefore requires branching), the process branches it and moves any descendant nodes into the new local tree (if one is created) so that the branching local tree's progress is preserved. By cycling until all necessary local trees are examined, the shrinking operation adds any local trees or connections that would have been created had the cost-to-come thresholds had the smaller values originally. Thus, the operation of the APART planner is consistent with the cost-to-come conditions of the original PART planner while also adapting to the specific environment and avoiding the need to select a governing parameter.

Algorithm 6.2: The Adaptive Path-length Annexed Random Tree (APART) planner's branching and connection strategy. [c, Tc, nc; Ts, Th] ← branch(T, Tb, nb; Ts, xr)

 1  Tp::initialize(T \ Tb), To::initialize(), Th::initialize();
 2  for Tp ∈ Tp do                                            // root cost-to-come step
 3      if Tp.ρlimit < metric(Tp::root().x, nb.x) then
 4          Tp::erase(Tp);
 5  if !Tp::empty() then
 6      for Tp ∈ Tp do                                        // neighbor cost-to-come step
 7          [ρ, Tp.np] ← nearest_neighbor(Tp, nb.x);          // nearest in each local tree
 8          if Tp.ρlimit < Tp.np.ρ + metric(Tp.np.x, nb.x) then
 9              Tp::erase(Tp);
10  while !Tp::empty() do                                     // collision checking step
11      [ρn, Tn] ← nearest_neighbor(Tp.T.np, nb.x);           // nearest of valid local trees
12      while true do
13          [xc, uc] ← local_plan(Tn.np.x, nb.x, ε);
14          if !detect_collision(Tn.np.x; uc, xc) then
15              Tn.np ← Tn::insert(Tn.np, node(xc));
16              Tn.np.ρ ← Tn.np::parent().ρ + metric(Tn.np::parent().x, Tn.np.x);
17              if metric(Tn.np.x, xr) < metric(Tn.ns.x, xr) then
18                  Tn.ns ← Tn.np;
19              if (Tn ∉ Ts) ∧ (Tn.ns.ρ ≤ Tn.ρlimit) then
20                  Ts::insert(Tn);
21              if Tn.np.x = nb.x then                        // connection found
22                  return [true, Tn, Tn.np; Ts, Th];
23          else
24              Tp::erase(Tn);
25              if !path_exists(Tn.np, T, nb) then            // unconnected local trees only
26                  To::insert(Tn);
27              break;
28  for To ∈ To do                                            // adjust thresholds for colliding local trees
29      ρ ← metric(To::root().x, nb.x);
30      if ρ < To.ρlimit then                                 // shrink local tree threshold
31          To.ρlimit ← ρ;
32          Th::insert(To);
33  Tc::initialize(node(nb.x)), Tc::root().ρ ← 0;             // no connection; create new local tree
34  return [false, Tc, Tc::root(); Ts, Th];

Algorithm 6.3: The Adaptive Path-length Annexed Random Tree (APART) planner's shrinking function. shrink(T, Th)

 1  while !Th::empty() do
 2      Th ← Th::front(), ρh ← ∞, nh ← ∅;
 3      for nr ∈ Th do                                        // find node violating threshold with shortest path
 4          if nr.v ∧ (Th.ρlimit < nr.ρ < ρh) then
 5              ρh ← nr.ρ, nh ← nr;
 6      if nh ∈ Th then                                       // need to process a violating node
 7          for nk ∈ nh::descendants() do                     // mark descendants to avoid repetition
 8              nk.v ← false;
 9          [c, Tc, nc; ∅, T′h] ← branch(T, Th, nh);
10          Th::insert(T′h);                                  // include local trees shrunk in branching
11          if !c then                                        // new local tree created
12              for nk ∈ nh::children() do                    // new local tree inherits old's children
13                  Tc::adopt(Tc::root(), nk);
14              for nk ∈ Tc do                                // all nodes initially valid in new, large local tree
15                  nk.v ← true;

6.4 Performance

Since the APART planner is a successor to the original PART planner, the previous algorithm provides a helpful benchmark for its performance. Accordingly, the experiments of the previous chapter can be augmented by more comprehensive tests with various constant cost-to-come thresholds and compared to the performance of the APART planner. This supplies a reference for the adaptive algorithm's relative performance and an idea of the suitability of both the previously-used estimated cost-to-come thresholds and their range as a whole. Results in the sections that follow reproduce data from those experiments as applicable, and all parameters and settings (e.g., run counts per algorithm) are identical unless otherwise stated. The labels "previous" and "best" are used to denote constant cost-to-come thresholds used in earlier experiments and those that minimize the relevant performance measure (collision checks on the 2D problems and runtime on the SE(3) problems), respectively.
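The "best" labels are the outcome of a simple parameter sweep; a minimal sketch of that protocol, assuming a hypothetical run_planner function that executes one run with a constant threshold and returns the relevant performance measure, is:

    import statistics

    def best_constant_threshold(run_planner, exponents, runs_per_value=1000):
        """Sweep constant cost-to-come thresholds 2**e and return the value with the lowest mean measure."""
        means = {}
        for e in exponents:
            rho_limit = 2.0 ** e
            means[rho_limit] = statistics.mean(run_planner(rho_limit) for _ in range(runs_per_value))
        return min(means, key=means.get), means

    # For the 2D sweeps below, the exponents run from -5.0 to 1.0 in quarter-point steps:
    exponents = [-5.0 + 0.25 * i for i in range(25)]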

                          Collision Checks   Search Nodes   Iterations   Runtime (s)
RRT (concon)                2638 (0.19)       617 (0.18)     307 (0.39)  0.007 (0.36)
DD-RRT (concon)             2749 (0.19)       711 (0.19)     470 (0.40)  0.012 (0.42)
SBL                         4554 (0.16)      1431 (0.22)    1428 (0.22)  0.014 (0.33)
PART (previous: 2^-3.0)     4400 (0.28)      1097 (0.26)      76 (0.45)  0.009 (0.37)
PART (best: 2^1.0)          2575 (0.19)       600 (0.17)     290 (0.38)  0.007 (0.34)
APART                       4772 (0.37)      1187 (0.36)     116 (0.45)  0.011 (0.48)

Table 6.1: Maze (bidirectional) simulation results evaluating the APART planner. Entries present mean (µ) and coefficient of variation (σ/µ).

6.4.1 Demonstrative Problems

While a mostly straightforward problem for the RRT planner, the maze is actually a potentially difficult case for the adaptive nature of the APART planner. Since the problem does not have significant potential to mislead the conventional RRT planner, the results in Table 6.1 show that the best choice of constant cost-to-come threshold (out of values from 2^-5.0 to 2^1.0 in quarter-point increments in the exponent) is a value large enough to ensure equivalence to the conventional RRT planner. As such, the widespread thin obstacles have the potential to spur the creation of many small local trees and hurt the algorithm's overall performance. However, the simplicity of the problem and the quick exploration of its connect-like growth allow the APART planner to solve the problem before this becomes an issue in practice. Its performance, while still worse than the conventional RRT planner (or the large cost-to-come threshold PART planner), is then only slightly poorer than the performance that results from the previously-used estimated cost-to-come threshold. The adaptive method thus tolerates a case that might otherwise be expected to result in poor performance, balancing the problem's relatively low difficulty against the indications to the contrary presented by its thin obstacles.

The kinked tunnel introduces a narrow passage with corners that provide the APART planner with the opportunity to introduce varied cost-to-come thresholds that assist in negotiating those corners. While the reliability of the metric in this

                          Collision Checks   Search Nodes   Iterations   Runtime (s)
RRT (concon)                5826 (0.81)      1125 (0.70)    2050 (1.30)  0.075 (2.95)
DD-RRT (concon)             8411 (0.33)      2527 (0.41)    6711 (0.59)  0.340 (1.21)
SBL                         9665 (0.28)      6298 (0.31)    6295 (0.31)  0.161 (0.61)
PART (previous: 2^-3.0)     6787 (0.90)      1304 (0.53)     472 (1.90)  0.033 (2.93)
PART (best: 2^-1.5)         4603 (0.75)       857 (0.51)     535 (1.39)  0.020 (2.19)
APART                       4485 (0.49)       839 (0.38)     603 (0.76)  0.017 (0.99)

Table 6.2: Kinked tunnel (bidirectional) simulation results evaluating the APART planner. Entries present mean (µ) and coefficient of variation (σ/µ).

environment results in strong performance from the RRT planner and a relatively large best cost-to-come threshold (out of values from 2^-5.0 to 2^1.0 in quarter-point increments in the exponent) for the PART planner, the data in Table 6.2 shows that the APART planner's local adaptations provide performance superior to any of the compared algorithms. Those local adaptations allow it to resemble the RRT planner in the open areas to either side of the tunnel while also altering its behavior as appropriate in the tunnel. As a result, the APART planner can provide improved performance even on examples for which the conventional RRT planner excels.

The bug trap combines features of the maze and kinked tunnel with respect to the adaptive method: it has thin obstacles, though their effects on the RRT planner's expansion depend on their location. While the thin walls of the mouth create unfavorable circumstances for the RRT planner, it can explore into the convex portion of free space bounded by the bug trap's back wall without issue. The implication, expressed by the results in Table 6.3, is that the best constant cost-to-come threshold (out of values from 2^-5.0 to 2^1.0 in quarter-point increments in the exponent) is an intermediate value that is too large to absolutely prevent a local tree from growing around the mouth and creating similar conditions to those that hamper the conventional RRT planner. These competing factors also cause the APART planner to provide roughly equal performance to the PART planner's previously-used estimated cost-to-come threshold. While the dynamic domain sampling restriction still has greater impact, the APART planner realizes a noticeable improvement in performance over

                          Collision Checks   Search Nodes   Iterations     Runtime (s)
RRT (connect)               14362 (0.56)      2170 (0.47)   10005 (0.73)   0.588 (1.41)
DD-RRT (connect)             7047 (0.36)      2161 (0.48)    9955 (0.72)   0.529 (1.37)
SBL                          4782 (0.70)      1224 (0.69)    1221 (0.69)   0.018 (1.16)
PART (previous: 2^-3.0)     11963 (1.01)      1639 (0.53)    2437 (1.48)   0.175 (2.15)
PART (best: 2^-1.75)         9551 (0.93)      1259 (0.58)    2967 (1.28)   0.162 (2.21)
APART                       11725 (0.93)      2203 (0.66)    1171 (1.37)   0.165 (1.90)

Table 6.3: Bug trap (unidirectional) simulation results evaluating the APART planner. Entries present mean (µ) and coefficient of variation (σ/µ).

the conventional RRT planner on the bug trap.

6.4.2 Realistic Benchmarks

The requirement to provide an estimated cost-to-come threshold to implement the PART planner in a nontrivial environment allows the APART planner to perform comparatively well on the flange and the alpha puzzle. Figure 6.3 illustrates the mean runtimes of the two algorithms over a range of constant cost-to-come thresholds (in quarter-point increments in the exponent from 2^-0.5 through 2^1.0 for the flange and in half-point increments in the exponent from 2^3.0 through 2^5.5 for the alpha puzzle); Tables 6.4 and 6.5 present the corresponding statistics. While the previously-used cost-to-come threshold offers close to the best performance of the tested values on the flange, on the alpha puzzle, the previously-used value underestimates the best value significantly enough that it slows the mean runtime of the PART planner by more than a factor of eight compared to its best experimental performance. On both problems, there is an obvious well-performing basin of constant cost-to-come thresholds; smaller values create small local trees and denser interconnections than are needed to solve the problems, while larger ones match the RRT planner too closely and fall into its issues with "trick" states. Notably, the latter case results in increased variance as some instances of the PART planner are not misled by the "trick" states and others are greatly affected. In contrast, the APART planner avoids these issues


Figure 6.3: APART planner comparative performance on the SE(3) benchmarks. The APART planner outperforms all tested cost-to-come thresholds for the PART planner on the flange in (a); it is competitive among the field of possible values and resulting performance on the alpha puzzle in (b). Mean values and standard deviation error bars are plotted.

                          Collision Checks        Search Nodes          Iterations            Runtime (s)
PART (previous: 2^0.0)     2.47 × 10^6 (0.40)     3.47 × 10^5 (0.31)    3.54 × 10^3 (0.40)    1019 (0.48)
PART (best: 2^0.25)        2.36 × 10^6 (0.44)     3.08 × 10^5 (0.36)    4.71 × 10^3 (0.45)     984 (0.53)
APART                      1.39 × 10^6 (0.55)     2.21 × 10^5 (0.49)    3.88 × 10^3 (0.45)     627 (0.65)

Table 6.4: Flange 0.95 (unidirectional) simulation results evaluating the APART planner. Entries present mean (µ) and coefficient of variation (σ/µ).

by adapting to the environment and provides performance competitive with the best tested cost-to-come thresholds: it surpasses their runtime on the flange and is within a factor of two of it on the alpha puzzle. These results are achieved in part due to the prudent distribution of local trees created by the APART planner, such as those shown in Figure 6.4, which allocate larger quantities near the edges of configuration space obstacles. These automatically-allocated local trees provide strong performance while also striking a resemblance to the non-uniform sampling methods of the PRM planner. The distribution of local trees and the corresponding roadmap arising in the APART planner appear qualitatively similar to those produced by non-uniform sampling methods in the PRM planner, calling into question their comparative ability

                          Collision Checks        Search Nodes          Iterations            Runtime (s)
PART (previous: 2^3.0)     5.71 × 10^6 (0.17)     5.31 × 10^5 (0.10)    5.98 × 10^3 (0.30)    1107 (0.23)
PART (best: 2^5.0)         7.97 × 10^5 (0.74)     4.66 × 10^4 (0.33)    3.01 × 10^4 (0.77)     133 (0.81)
APART                      1.34 × 10^6 (0.34)     1.11 × 10^5 (0.28)    1.31 × 10^4 (0.29)     233 (0.41)

Table 6.5: Alpha puzzle 1.1 (bidirectional) simulation results evaluating the APART planner. Entries present mean (µ) and coefficient of variation (σ/µ).


Figure 6.4: APART planner local trees on the SE(3) benchmarks. On the flange in (a) and the alpha puzzle in (b), local trees are denser near states that have the agent touching the obstacle and sparse where the two are separated.

to complete these problems. Without carrying out complete simulations, the potential performance of these non-uniform sampling methods can be simply examined by taking advantage of the fact that the algorithm's nodes are created directly from its samples. Since fulfilling the query on a problem like the alpha puzzle requires samples that describe a connected path through the narrow passage, the computation required to draw a single sample state with visibility of another sample state in that narrow passage represents an essential portion of the total computation. For a given basis narrow passage state, the region of visible states can be (approximately) bounded by incrementally exploring toward a dense sample set and recording the most distant state reached. This bound can then be used to easily trim the set of possible samples to ones that could be visible. In turn, samples complying with the metric bound can be collision checked to assess their visibility of the basis state. This procedure can be used to inform the usability of non-uniform sampling methods and the PRM planner in solving these problems.

Estimating the visibility of an alpha puzzle narrow passage state to evaluate the potential of the PRM planner reveals that the incidence of samples usable in negotiating its narrow passage is quite rare. Based on an example state from an APART planner solution path and a sample of 2^18 (262,144) target states, distances of up to 25.341 in the metric can connect narrow passage states. Testing states (i.e., candidate roadmap nodes) generated uniformly until 2^20 (1,048,576) samples fall within the bound requires 197,658,917 total samples, of which three (less than one in 65 million) have visibility of the basis state. Applying a similar procedure to Gaussian-distributed bridge sampling with a standard deviation of 2.0 (in the metric) requires 392,801,461 attempts to locate 2^20 (1,048,576) state triplets in which the center state falls within the bound. Of these, 2,421 comply with the bridge sampling collision checking conditions (i.e., that the center state is collision-free and the others are collisions) and none have visibility of the basis state. Based on an average run-

time for one collision check of 12.908 microseconds (which is an underestimate of the work per iteration for the PRM planner) and these estimated rates, neither method is capable of spanning the narrow passage near the order of the APART planner's runtime, making the latter better able to provide roadmaps on the alpha puzzle.
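A rough back-of-the-envelope calculation using only the figures quoted above makes this gap concrete; it charges each candidate sample a single collision check, which is already a loose lower bound on the PRM planner's per-sample work.

    check_time = 12.908e-6                      # seconds per collision check (measured average above)

    # Uniform sampling: 3 visible samples in 197,658,917 draws, i.e., roughly one per 65.9 million.
    samples_per_visible = 197_658_917 / 3
    cost_per_visible = samples_per_visible * check_time
    print(round(cost_per_visible))              # about 850 s for a single narrow-passage-visible node

    # Bridge sampling fared worse: 392,801,461 attempts produced no visible sample at all.

Since spanning the passage requires several mutually visible samples plus their connection checks, even this optimistic floor of roughly 850 seconds per useful node compares poorly with the APART planner's mean runtime of 233 seconds for the entire query (Table 6.5).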

6.5 Roadmaps and Paths

Like the original PART planner, the APART planner creates a roadmap during its execution that can be processed to improve path quality and retained to augment performance in answering future queries. However, the adaptive cost-to-come thresholds have a marked effect on the distribution of local trees that results. While the PART planner's constant value leads to a high degree of uniformity, the APART planner's adjustments concentrate a greater density of local trees near (potentially) difficult areas like thin obstacles and corners. These characteristics can be observed from the two algorithms' roadmap instances shown in Figure 6.5 on the bug trap. Local trees appear with greater relative frequency in potentially difficult regions near thin obstacles and sparsely in open areas. Unsurprisingly, these environment-responsive distributions in the APART planner can provide better coverage of the configuration space than tends to arise with the constant cost-to-come thresholds of the PART planner.

Corresponding to the performance results of the APART planner, its allocation of local trees rarely outperforms the best instance of the PART planner, but it is competitive with that of the broad range of possible cost-to-come threshold values. However, it provides a more intelligent method of distributing those local trees such that they are concentrated in regions that may present the greatest difficulty for their exploration mechanism. Two examples of their densities are shown in Figure 6.6; the algorithm generates more numerous (and smaller) local trees near thin obstacles


Figure 6.5: Compared roadmaps from the PART and APART planners. In (a), the PART planner distributes local trees very uniformly as it expands, while in (b), the APART planner concentrates more around thin obstacles and fewer in large regions away from obstacles.

and corners to assist its exploration. This behavior leads to an intuitively-satisfying pattern that reduces the count of local trees, as presented in Table 6.6, compared to the previously-used constant cost-to-come thresholds in all tested cases. It further extends to higher-dimensional cases such as the SE(3) benchmarks, which place more local trees near states that cause the agent and obstacle to touch.

                  Maze   Kinked Tunnel   Bug Trap   Flange   Alpha Puzzle
PART (previous)    49         42            48       1424       11084
PART (best)         3         10            12        886         165
APART              28          8            34        628         510

Table 6.6: Mean local tree counts for the PART and APART planners.

Unsurprisingly, these trends also carry over to the resulting solution path lengths. As the APART planner produces local trees in sizes and numbers that tend to span those of constant cost-to-come thresholds, its solution paths follow a similar dynamic. As reflected in Figure 6.7, the lengths of these solution paths display a gradual decreasing trend as the cost-to-come threshold drops, while the ones from the APART planner fall in or near that range. Like the original PART planner, the APART planner improves over the solution paths generated by the conventional RRT


planner, provided that they are post-processed by a form of discrete search. However, crossing an open area along a single one of the larger local trees preserved by the APART planner in such areas follows the type of suboptimal paths created by the RRT planner and hence does somewhat worse than following a more densely-connected roadmap that might be created using a small cost-to-come threshold. Consequently, solution paths from the APART planner may be of lesser quality compared to those of the PART planner, but as with other aspects of its performance, it remains competitive without requiring additional parameters over the conventional RRT planner.

Figure 6.6: Densities of local trees created by the APART planner. On the bug trap in (a) and the kinked tunnel in (b), the adaptive shrinking of cost-to-come thresholds results in denser local trees in potentially-difficult areas near the edges of thin obstacles and around sharp corners.

6.6 Summary

This chapter has introduced the improved Adaptive Path-length Annexed Random Tree (APART) planner, a revision of the planner introduced in the previous chapter that adapts to the local conditions of its configuration space. While liberal in choosing small values for the cost-to-come thresholds of individual local trees (to suit the limited information that is easily computed during the course of the algorithm), it performs at or beyond the standard of the original version in many cases.


In fact, experiments have shown that it is competitive with the best constant cost-to-come thresholds of the original version across problems in both performance and solution path quality. Further, the APART planner has demonstrated performance beyond any of the compared algorithms on cases that admit its specific adjustment method (i.e., that contain thin obstacles only in locations that actually merit small local trees, such as the kinked tunnel’s corners).

Chapter 7

Conclusion

The preceding material has presented an exploration of the performance issues of sampling-based motion planning with a focus on the Rapidly-exploring Random Tree (RRT) algorithm. To this end, it has provided analysis of the planner and its interactions with various environments, along with the general properties thereof that trigger detrimental performance effects. These effects have motivated an investigation into the usefulness of various restart strategies to lessen their impact and undo exploratory missteps made by the planner. In turn, the subsequent observation that the RRT planner frequently remains well-behaved on limited tasks in otherwise difficult environments has inspired a method of neighborhood-limited exploration and the corresponding Path-length Annexed Random Tree (PART) algorithm, which offers notable performance advantages on difficult problems. As this planner contains an additional parameter governing the size of those neighborhoods, its effects have been explored and weighed against a liberal strategy of setting it locally and adaptively. This research has thus demonstrated the particular strengths of the RRT planner and has developed methods of addressing its weaknesses.

Chapter 1 has defined the problem of motion planning and the difficulties of dimensionality, control, and obstacle constraints associated with it.

Chapter 2 has summarized the approaches to motion planning and the issues associated with each. For the more modern, sampling-based approach, an overview of the typical tools has provided the framework for a discussion of a selection of its current algorithms. The weaknesses of these algorithms have in turn motivated the overall direction of this research.

Chapter 3 has analyzed the exploration dynamics and resulting performance of the RRT planner and tied those dynamics, via several probabilistic models, to the qualitative types of distributions that arise in that performance.

A survey of the implications of the step size parameter and the extend and connect heuristics has established that the latter eases the biases experienced by the former in its initial coverage, as well as the issues it has in approaching the edge of the state space. Experimental data have corroborated that easy access to regions of the configuration space that are close in metric but distant in cost-to-go to those that must be explored to solve a problem creates issues for the planner's Voronoi visibility. Direct examples have illustrated that similar cases can arise with thin obstacles, though the phenomenon has broad-ranging impact on the planner's overall growth.

Chapter 4 has considered the use of restart strategies to improve the performance of the RRT planner on difficult problems. Experiments on various continuous and discrete problems have exposed that the slowly-decaying distributions amenable to restarts can arise from both the previously-observed Voronoi visibility issues and the computational processes of the algorithm. Further tests have indicated that many cases can be addressed by a universal restart strategy that balances the total time invested toward attempts of various lengths, and that such strategies can be consistently beneficial on varied queries when designed with the environment's potential difficulty in mind. Finally, the relative consistency of the planner on quantized tasks within broader problems has shown that its growth mechanism is primarily hampered by situations that require it to interfere with itself and that lead to a low incidence of successful incremental expansion due to distant target states.

Chapter 5 has leveraged the idea that nearest neighbor expansion is best applied on a local rather than a global scale to achieve superior coverage. This notion has established the basis for a new algorithm, the Path-length Annexed Random Tree (PART) planner, that defines a set of localized, interconnected trees based on a fixed cost-to-come threshold. Experiments have outlined its operation and suitability on difficult benchmarks. While outperforming several well-known planners on the most challenging examples, the new algorithm has also introduced the benefit of a roadmap output that can reduce the effort required to satisfy future queries and improve the quality of solution paths.

Chapter 6 has investigated the effects of the constant cost-to-come thresholds for local exploration and developed a method to choose the values adaptively and locally as demanded by the environment, resulting in the Adaptive Path-length Annexed Random Tree (APART) planner. While this strategy has been shown to be necessarily liberal in reducing these cost-to-come thresholds, experiments have demonstrated that it performs well in comparison to previously-used estimated values while removing the need for a potentially complex choice of parameter. The supplied data have validated the APART planner by illustrating that it performs competitively with the best possible constant cost-to-come threshold across measures of runtime, solution path length, and output roadmap size.

Chapter 7 has concluded the presentation with an overview of its topics. Propositions for future research directions follow.

7.1 Future Work

This research has presented analysis and development of sampling-based motion planning algorithms with the goal of evolving their ability to quickly and consistently solve difficult problems. While several of the presented techniques have shown great promise toward this end, their results are neither unvaryingly positive nor proven optimal. As a result, there are a number of reasonable future directions available to continue in the same vein. The sections that follow present a short overview of these possibilities.

7.1.1 Extended Algorithm Models

The presented modeling and analysis of the RRT planner could be extended to mechanisms such as bidirectional search and variants such as the DD-RRT planner. The one-dimensional model could be supplemented with an additional variable recording the (reversed) progress of a second tree and an altered completion condition that marks the crossing of the two progress variables to model bidirectional search. Similarly, a (worst-case) one-dimensional model for the original DD-RRT planner could be created by requiring random states within a fixed distance of the progress variable to change its value. The Markov chain model is additionally compatible with the DD-RRT planner; however, it would require augmented occupancy states that include whether each node had experienced a collision and that reduce the corresponding transition probabilities. Such models could facilitate direct comparisons between the conventional RRT planner and the many existing variations.
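As a concrete illustration of this proposed extension, the following sketch simulates the one-dimensional model with a second, reversed progress variable and a crossing-based completion condition. The alternation between the two trees, the parameter values, and the exact extension rule are illustrative assumptions rather than part of the model analyzed earlier.

#include <algorithm>
#include <cstdio>
#include <random>

// Sketch: one-dimensional RRT model extended with a second, reversed tree.
// The forward tree grows from 0 toward 1; the reversed tree grows from 1
// toward 0.  Each iteration draws a uniform target and extends one tree by
// at most epsilon toward it; completion is declared when the two progress
// variables cross.  These choices are assumptions for illustration only.
int main() {
    const double epsilon = 0.05;   // step size of the simplified model
    const int trials = 100000;
    std::mt19937 gen(1);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);

    double total_iterations = 0.0;
    for (int trial = 0; trial < trials; ++trial) {
        double forward = 0.0, reverse = 1.0;
        int n = 0;
        bool grow_forward = true;              // alternate trees, as in RRT-Connect
        while (forward < reverse) {
            const double target = uniform(gen);
            if (grow_forward) {
                if (target > forward)          // only samples ahead of the frontier help
                    forward = std::min(forward + epsilon, target);
            } else {
                if (target < reverse)
                    reverse = std::max(reverse - epsilon, target);
            }
            grow_forward = !grow_forward;
            ++n;
        }
        total_iterations += n;
    }
    std::printf("mean iterations to crossing: %f\n", total_iterations / trials);
    return 0;
}

Comparing the iteration counts produced by such a sketch against those of the single-tree model would give a first, informal indication of how much bidirectional growth changes the distributions derived in Appendix A.1.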

7.1.2 Informed Restart Strategies

Though many problems can be addressed using a simple, external restart strategy that treats the RRT planner as a black box, benefit could be derived from a concise identification of the usefulness of individual nodes.

As some problems undoubtedly entail a series of potentially challenging obstacles that must be surmounted by an incremental planner in order, a restart strategy that recognizes and retains the nodes that have surmounted such obstacles could outperform those that do not. Alternatively, a restart strategy that interacts directly with the RRT planner could test separate explorations rooted at novel initial states located during previous runs. In the RRT planner, the information that would inform such decisions is likely based on, but not limited to, its ability to persistently expand toward distant target states. Such informed restart strategies operating on the RRT planner could serve as useful single-query algorithms by circumventing the planner's weaknesses on certain types of problems.
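For reference, a minimal sketch of the external, black-box form of restarting contrasted above appears below. The Planner interface (reset() and a step() that returns an optional path) and the constant interval are placeholder assumptions; an informed strategy would add logic between attempts to inspect the tree and retain nodes judged useful.

#include <chrono>
#include <optional>

// Sketch of an external restart wrapper that treats a planner as a black box.
// `Planner` is a hypothetical type exposing reset() and step(); step() is
// assumed to return std::optional<Path>.  An informed strategy would inspect
// the tree between attempts instead of discarding it wholesale.
template <typename Planner, typename Path>
std::optional<Path> plan_with_restarts(Planner& planner,
                                       double restart_interval_seconds,
                                       int max_attempts) {
    using clock = std::chrono::steady_clock;
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        planner.reset();                                   // discard previous exploration
        const auto deadline = clock::now() +
            std::chrono::duration<double>(restart_interval_seconds);
        while (clock::now() < deadline) {
            if (auto path = planner.step())                // one incremental extension
                return path;                               // solved within this attempt
        }
        // The attempt exceeded the interval; fall through and restart from scratch.
    }
    return std::nullopt;
}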

7.1.3 APART with Disconnected Components

While the APART planner facilitates rapid incremental expansion among problematic obstacles, its framework provides an obvious point at which to add local trees at unreachable, collision-checked target states. This would introduce disconnected components and would require that the algorithm track its progress toward creating a connected roadmap in a manner similar to the PRM planner. It could also increase the uniformity of sampling before the algorithm has covered the reachable space. However, it also has a potential negative consequence: on problems like the alpha puzzle that strictly constrain one tree while freeing the other in unbounded space, such unconnected local trees would shift computation toward largely fruitless collision checking in open space. Similarly, they complicate the idea of using balancing to eliminate this eventuality, as can be done for ordinary bidirectional search.

7.1.4 Balancing APART Exploration

On the assumption that the primary purpose of the planner is to explore, the APART planner exhaustively tests local tree neighbors for expansion to any given target state; however, other options could provide better balancing or superior solution paths.

One implication of incremental growth is that the earliest covered regions also generally see the greatest expenditure of time on exploration. Despite this, it is reasonable to direct greater focus to covering newly-discovered regions (e.g., suspending exploration before and in a tunnel when the search first exits into an open area). This would require a probabilistic or absolute bias based on the age of or attention given to local trees or their nodes (similar to tabu search [25]) that also does not disrupt the spatial bias of nearest neighbor expansion. Further, replacing the metric distance used to order the branching connection tests with a measure including cost-to-come (analogous to the A∗ algorithm) could result in connections that minimize solution path length at the expense of additional collision checking.
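A minimal sketch of that last suggestion, ordering connection tests by cost-to-come plus metric distance in the style of A∗, is shown below; the Candidate structure and its fields are hypothetical stand-ins for the planner's internal data.

#include <algorithm>
#include <vector>

// Sketch of the A*-like ordering suggested above: candidate connections are
// sorted by cost-to-come plus metric distance to the target instead of by
// metric distance alone.  `Candidate` and its fields are hypothetical.
struct Candidate {
    double cost_to_come;     // path length accumulated from the local tree root
    double metric_to_target; // metric() distance from the node to the target state
};

void order_connection_tests(std::vector<Candidate>& candidates) {
    std::sort(candidates.begin(), candidates.end(),
              [](const Candidate& a, const Candidate& b) {
                  // f = g + h, by analogy with A*; ties favor the closer node.
                  const double fa = a.cost_to_come + a.metric_to_target;
                  const double fb = b.cost_to_come + b.metric_to_target;
                  return fa != fb ? fa < fb : a.metric_to_target < b.metric_to_target;
              });
}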

7.1.5 Collision Checking in APART Path Processing

While the presentation of the APART planner uses only its direct output to create solution paths, other motion planning algorithms (e.g., the SBL planner) implement path smoothing operations that also use collision checking. The naïve method of processing a solution path in this way (validating paths between all pairs of nodes along the path and searching for the minimum-cost sequence) is potentially expensive; however, the density and size of the local trees created by the APART planner suggest that their roots alone could provide a roadmap with useful coverage. Connecting just these nodes with straight paths could preserve the overall connectivity of the output and allow the extraction of superior solution paths. Where the local tree roots do not preserve connectivity on their own, the proper connecting paths remain available to supplement this information.
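The following sketch illustrates the roots-only roadmap suggested above under simplifying assumptions: the State type, the collision and metric callbacks (named after the functions of Appendix B.1), and the connection radius are placeholders, and the straight-line connection test stands in for the planner's actual local planning.

#include <cstddef>
#include <vector>

// Sketch of the roots-only post-processing roadmap suggested above: connect
// local tree roots pairwise with straight, collision-checked segments.  The
// callbacks follow the notation of Appendix B.1 but are stand-ins here, not
// the planner's implementation; detect_collision returns true on collision.
template <typename State>
std::vector<std::vector<std::size_t>>
build_root_roadmap(const std::vector<State>& roots,
                   bool (*detect_collision)(const State&, const State&),
                   double (*metric)(const State&, const State&),
                   double connection_radius) {
    std::vector<std::vector<std::size_t>> adjacency(roots.size());
    for (std::size_t i = 0; i < roots.size(); ++i) {
        for (std::size_t j = i + 1; j < roots.size(); ++j) {
            if (metric(roots[i], roots[j]) > connection_radius)
                continue;                                  // too far to bother checking
            if (!detect_collision(roots[i], roots[j])) {   // straight segment is free
                adjacency[i].push_back(j);
                adjacency[j].push_back(i);
            }
        }
    }
    return adjacency;   // a discrete search (e.g., A*) over this graph extracts paths
}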

Appendix A

Derivations

A.1 One-dimensional RRT Model

A.1.1 Recurrence

The recurrence relation on the distribution (or density) of progress for the one-dimensional model of the RRT planner is developed by formulating the definition of that distribution function.

\[
\begin{aligned}
F_{x_{n+1}}(x) &= P\{x_{n+1} \le x\} = \iint_{(u,v)\in D_x} f_{x_n}(u)\, f_{x_r}(v)\, du\, dv,\\
D_x &= \big(\{0 \le x_n \le x\} \wedge \{0 \le x_r \le x\}\big) \vee \big(\{0 \le x_n \le x-\varepsilon\} \wedge \{x \le x_r \le 1\}\big),\\
F_{x_{n+1}}(x) &= \int_0^x\!\!\int_0^x f_{x_n}(u)\, du\, dv + \int_x^1\!\!\int_0^{x-\varepsilon} f_{x_n}(u)\, du\, dv\\
&= x\int_0^x f_{x_n}(u)\, du + (1-x)\int_0^{x-\varepsilon} f_{x_n}(u)\, du\\
F_{x_{n+1}}(x) &= x\,F_{x_n}(x) + (1-x)\,F_{x_n}(x-\varepsilon)\\
f_{x_{n+1}}(x) &= x\,f_{x_n}(x) + \int_{x-\varepsilon}^{x} f_{x_n}(u)\, du + (1-x)\,f_{x_n}(x-\varepsilon)
\end{aligned}
\]
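A direct numerical iteration of this recurrence offers a simple check of the closed forms that follow; the sketch below discretizes the progress variable on a grid and reads off p[n] = F_{x_{n-1}}(x_f) - F_{x_n}(x_f). The grid resolution and the values of ε and x_f are arbitrary illustrative choices.

#include <cmath>
#include <cstdio>
#include <vector>

// Numerical sketch of the recurrence F_{x_{n+1}}(x) = x F_{x_n}(x)
// + (1 - x) F_{x_n}(x - eps), iterated on a grid over [0, 1] from the
// initial condition x_0 = 0 (so F_{x_0}(x) = 1 for all x >= 0).  The grid
// size, eps, and x_f below are illustrative choices.
int main() {
    const int grid = 10000;
    const double eps = 0.05, x_f = 0.9;
    std::vector<double> F(grid + 1, 1.0);       // F_{x_0}
    double previous_at_xf = 1.0;

    for (int n = 1; n <= 200; ++n) {
        std::vector<double> next(grid + 1);
        const int shift = static_cast<int>(std::round(eps * grid));
        for (int i = 0; i <= grid; ++i) {
            const double x = static_cast<double>(i) / grid;
            const double F_shifted = (i - shift) >= 0 ? F[i - shift] : 0.0;
            next[i] = x * F[i] + (1.0 - x) * F_shifted;
        }
        F.swap(next);
        const double at_xf = F[static_cast<int>(std::round(x_f * grid))];
        // p[n]: probability that the progress first exceeds x_f at iteration n.
        std::printf("n = %3d  p[n] = %.6f\n", n, previous_at_xf - at_xf);
        previous_at_xf = at_xf;
    }
    return 0;
}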

A.1.2 Distribution

To derive the density of iterations required for completion in the one-dimensional RRT planner, a proposed solution to the recurrence for the distribution of progress is first verified.

\[
F_{x_n}(x) = \frac{1}{T!\,\varepsilon^T}\left[\prod_{k=0}^{T}\big(1-(x-k\varepsilon)\big)\right]\left[\sum_{k=0}^{T}(-1)^k\binom{T}{k}(x-k\varepsilon)^n\right],
\qquad T = \left\lceil\frac{x-x_i}{\varepsilon}\right\rceil - 1
\]

\[
\begin{aligned}
F_{x_{n+1}}(x) &= \frac{1}{T!\,\varepsilon^T}\left[\prod_{k=0}^{T}\big(1-(x-k\varepsilon)\big)\right]\left[\sum_{k=0}^{T}(-1)^k\binom{T}{k}(x-k\varepsilon)^{n+1}\right]\\
&= \frac{1}{T!\,\varepsilon^T}\left[\prod_{k=0}^{T}\big(1-(x-k\varepsilon)\big)\right]\left[x\sum_{k=0}^{T}(-1)^k\binom{T}{k}(x-k\varepsilon)^{n} - \varepsilon\sum_{k=1}^{T}(-1)^k k\binom{T}{k}(x-k\varepsilon)^{n}\right]\\
&= \frac{1}{T!\,\varepsilon^T}\left[\prod_{k=0}^{T}\big(1-(x-k\varepsilon)\big)\right]\left[x\sum_{k=0}^{T}(-1)^k\binom{T}{k}(x-k\varepsilon)^{n} + \varepsilon\sum_{k=0}^{T-1}(-1)^k(k+1)\binom{T}{k+1}\big(x-(k+1)\varepsilon\big)^{n}\right]\\
&= \frac{1}{T!\,\varepsilon^T}\left[\prod_{k=0}^{T}\big(1-(x-k\varepsilon)\big)\right]\left[x\sum_{k=0}^{T}(-1)^k\binom{T}{k}(x-k\varepsilon)^{n} + T\varepsilon\sum_{k=0}^{T-1}(-1)^k\binom{T-1}{k}\big((x-\varepsilon)-k\varepsilon\big)^{n}\right]\\
&= \frac{x}{T!\,\varepsilon^T}\left[\prod_{k=0}^{T}\big(1-(x-k\varepsilon)\big)\right]\left[\sum_{k=0}^{T}(-1)^k\binom{T}{k}(x-k\varepsilon)^{n}\right]\\
&\qquad + \frac{1-x}{(T-1)!\,\varepsilon^{T-1}}\left[\prod_{k=0}^{T-1}\big(1-((x-\varepsilon)-k\varepsilon)\big)\right]\left[\sum_{k=0}^{T-1}(-1)^k\binom{T-1}{k}\big((x-\varepsilon)-k\varepsilon\big)^{n}\right]\\
F_{x_{n+1}}(x) &= x\,F_{x_n}(x) + (1-x)\,F_{x_n}(x-\varepsilon)
\end{aligned}
\]

While this form satisfies the recurrence, it does not obey the properties of a distribution function (i.e., that \(\lim_{x\to\infty} F_{x_n}(x) = 1\)); the proper distribution of progress is a summation of these terms. This form provides the chief part of the density of iterations required for completion.

\[
\begin{aligned}
p[n] &= P\{x_f < x_n\} - P\{x_f < x_{n-1}\}\\
&= \big(1 - F_{x_n}(x_f)\big) - \big(1 - F_{x_{n-1}}(x_f)\big)\\
&= F_{x_{n-1}}(x_f) - F_{x_n}(x_f)\\
&= \sum_{t=0}^{T}\frac{1}{t!\,\varepsilon^t\big(1-(x_f-t\varepsilon)\big)}\left[\prod_{k=0}^{t}\big(1-(x_f-k\varepsilon)\big)\right]\left[\sum_{k=0}^{t}(-1)^k\binom{t}{k}(x_f-k\varepsilon)^{n-1} - \sum_{k=0}^{t}(-1)^k\binom{t}{k}(x_f-k\varepsilon)^{n}\right]\\
&= \sum_{t=0}^{T}\frac{1}{t!\,\varepsilon^t\big(1-(x_f-t\varepsilon)\big)}\left[\prod_{k=0}^{t}\big(1-(x_f-k\varepsilon)\big)\right]\left[\big(1-(x_f-t\varepsilon)\big)\sum_{k=0}^{t}(-1)^k\binom{t}{k}(x_f-k\varepsilon)^{n-1} - t\varepsilon\sum_{k=0}^{t-1}(-1)^k\binom{t-1}{k}(x_f-k\varepsilon)^{n-1}\right]\\
&= \sum_{t=0}^{T}\frac{1}{t!\,\varepsilon^t}\left[\prod_{k=0}^{t}\big(1-(x_f-k\varepsilon)\big)\right]\left[\sum_{k=0}^{t}(-1)^k\binom{t}{k}(x_f-k\varepsilon)^{n-1}\right]\\
&\qquad - \sum_{t=0}^{T-1}\frac{1}{t!\,\varepsilon^{t}}\left[\prod_{k=0}^{t}\big(1-(x_f-k\varepsilon)\big)\right]\left[\sum_{k=0}^{t}(-1)^k\binom{t}{k}(x_f-k\varepsilon)^{n-1}\right]\\
p[n] &= \frac{1}{T!\,\varepsilon^T}\left[\prod_{k=0}^{T}\big(1-(x_f-k\varepsilon)\big)\right]\left[\sum_{k=0}^{T}(-1)^k\binom{T}{k}(x_f-k\varepsilon)^{n-1}\right]
\end{aligned}
\]

A.1.3 Approximations

Geometric

The distribution of iterations required for completion approximates the geometric for large step size, as the underlying RRT planner must sample beyond the final state.

\[
T = \left\lceil\frac{x_f-x_i}{\varepsilon}\right\rceil - 1,\qquad (x_f-x_i \le \varepsilon) \Rightarrow T = 0
\]
\[
\begin{aligned}
p[n] &= \frac{1}{T!\,\varepsilon^T}\left[\prod_{k=0}^{T}\big(1-(x_f-k\varepsilon)\big)\right]\left[\sum_{k=0}^{T}(-1)^k\binom{T}{k}(x_f-k\varepsilon)^{n-1}\right]\\
&= \frac{1}{0!\,\varepsilon^0}\,\big[1-x_f\big]\,\big[x_f^{\,n-1}\big]\\
p[n] &= (1-x_f)\,x_f^{\,n-1}
\end{aligned}
\]
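For a concrete (illustrative) instance, taking x_i = 0, x_f = 0.95, and ε ≥ 0.95 gives T = 0, so that
\[
p[n] = 0.05\,(0.95)^{\,n-1}, \qquad \mathbb{E}[n] = \frac{1}{1-x_f} = 20,
\]
a geometric distribution whose mean of 20 iterations reflects the probability 1 − x_f of sampling beyond the final state on any given iteration.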

Negative Binomial

The distribution of iterations required for completion approximates the negative binomial for small step size. This distribution is parameterized by n trials requiring T + 1 successes at probability 1 − x_f, as the underlying RRT planner must make T + 1 steps at a roughly fixed probability of 1 − x_f to reach the final state.

\[
\frac{\nabla^n[f](x)}{\varepsilon^n} = \frac{d^n f(x)}{dx^n} + O(\varepsilon)
\;\Rightarrow\;
\sum_{k=0}^{n}(-1)^k\binom{n}{k}f(x-k\varepsilon) \to \varepsilon^n\,\frac{d^n f(x)}{dx^n}
\quad\text{as } \varepsilon \to 0
\]
\[
f(x) = x^{n-1} \;\Rightarrow\; \frac{d^T f(x)}{dx^T} = \frac{(n-1)!}{(n-1-T)!}\,x^{\,n-1-T}
\]
\[
\begin{aligned}
\lim_{\varepsilon\to 0} p[n] &= \lim_{\varepsilon\to 0}\left(\frac{1}{T!\,\varepsilon^T}\left[\prod_{k=0}^{T}\big(1-(x_f-k\varepsilon)\big)\right]\left[\sum_{k=0}^{T}(-1)^k\binom{T}{k}(x_f-k\varepsilon)^{n-1}\right]\right)\\
&= \lim_{\varepsilon\to 0}\left(\frac{1}{T!\,\varepsilon^T}\left[\prod_{k=0}^{T}\big(1-(x_f-k\varepsilon)\big)\right]\varepsilon^T\,\frac{(n-1)!}{(n-1-T)!}\,x_f^{\,n-1-T}\right)\\
&= \lim_{\varepsilon\to 0}\left(\left[\prod_{k=0}^{T}\big(1-(x_f-k\varepsilon)\big)\right]\frac{(n-1)!}{T!\,(n-1-T)!}\,x_f^{\,n-1-T}\right)\\
\lim_{\varepsilon\to 0} p[n] &= \binom{n-1}{T}\,(1-x_f)^{T+1}\,x_f^{\,n-1-T}
\end{aligned}
\]

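The two regimes can also be examined empirically with a Monte Carlo sketch of the one-dimensional model itself; the parameter values are arbitrary, and the comparison targets (geometric behavior for ε ≥ x_f − x_i, negative binomial-like behavior for small ε) restate the approximations above rather than exact equalities.

#include <algorithm>
#include <cstdio>
#include <random>

// Monte Carlo sketch of the one-dimensional RRT model: progress starts at
// x_i = 0, a uniform target is drawn each iteration, and the frontier moves
// at most eps toward any target ahead of it.  For eps >= x_f the empirical
// iteration count should look geometric; for small eps it should resemble
// the negative binomial approximation above.  Parameter values are arbitrary.
int main() {
    const double eps = 0.05, x_f = 0.9;
    const int trials = 200000;
    std::mt19937 gen(7);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);

    double sum = 0.0;
    for (int trial = 0; trial < trials; ++trial) {
        double x = 0.0;
        int n = 0;
        while (x <= x_f) {
            const double target = uniform(gen);
            if (target > x)
                x = std::min(x + eps, target);   // extend by at most eps toward the target
            ++n;
        }
        sum += n;
    }
    std::printf("empirical mean iterations: %f\n", sum / trials);
    return 0;
}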
A.2 Power Decay

Defining a power-like form for the success probability (i.e., Voronoi region) yields a density of iterations required for the first success that contains power terms.

\[
p_s[n] \triangleq \frac{1-p_\infty}{n+n_0+1} + p_\infty \qquad (0 \le n_0)
\]
\[
\begin{aligned}
n_s[n] &= \sum_{k=1}^{\infty} p_s[k]\left(\prod_{m=1}^{k-1}\big(1-p_s[m]\big)\right)\delta[n-k]\\
&= \sum_{k=1}^{\infty}\left(\frac{1-p_\infty}{k+n_0+1}+p_\infty\right)\prod_{m=1}^{k-1}\left(1-\frac{1-p_\infty}{m+n_0+1}-p_\infty\right)\delta[n-k]\\
&= \sum_{k=1}^{\infty}\left(\frac{1-p_\infty}{k+n_0+1}+p_\infty\right)\prod_{m=1}^{k-1}\frac{(m+n_0)(1-p_\infty)}{m+n_0+1}\;\delta[n-k]\\
n_s[n] &= \sum_{k=1}^{\infty}\left(\frac{1-p_\infty}{k+n_0+1}+p_\infty\right)(1-p_\infty)^{k-1}\,\frac{1+n_0}{k+n_0}\;\delta[n-k]
\end{aligned}
\]
\[
\begin{aligned}
\lim_{p_\infty\to 0} n_s[n] &= \lim_{p_\infty\to 0}\sum_{k=1}^{\infty}(1+n_0)\left(\frac{1}{k+n_0+1}+\frac{p_\infty}{1-p_\infty}\right)(1-p_\infty)^{k}\left(\frac{1}{k+n_0}\right)\delta[n-k]\\
&= \lim_{p_\infty\to 0}\sum_{k=1}^{\infty}(1+n_0)\left(\frac{1}{k+n_0+1}\right)(1-p_\infty)^{k}\left(\frac{1}{k+n_0}\right)\delta[n-k]\\
\lim_{p_\infty\to 0} n_s[n] &= \sum_{k=1}^{\infty}(1+n_0)\left(\frac{1}{k+n_0+1}\right)\left(\frac{1}{k+n_0}\right)\delta[n-k]
\end{aligned}
\]

A.3 Constant Restart Intervals

A.3.1 Mean

For a constant restart interval, the mean of the modified distribution is characterized using the density of that modified distribution (i.e., repeating, exponentially-decaying copies of the portion of the original coming before the restart interval).

\[
\begin{aligned}
\mu_{t_r}(\tau) &= \int_0^{\infty} u\,f_{t_r}(u)\,du\\
&= \int_0^{\tau} u\,f_t(u)\,du + \int_{\tau}^{2\tau} u\,\big(1-F_t(\tau)\big)f_t(u-\tau)\,du + \int_{2\tau}^{3\tau} u\,\big(1-F_t(\tau)\big)^2 f_t(u-2\tau)\,du + \cdots\\
&= \int_0^{\tau} u\,f_t(u)\,du + \int_0^{\tau} (u+\tau)\big(1-F_t(\tau)\big)f_t(u)\,du + \int_0^{\tau} (u+2\tau)\big(1-F_t(\tau)\big)^2 f_t(u)\,du + \cdots\\
&= \int_0^{\tau}\Big[u + (u+\tau)\big(1-F_t(\tau)\big) + (u+2\tau)\big(1-F_t(\tau)\big)^2 + \cdots\Big]f_t(u)\,du\\
&= \int_0^{\tau}\left(u\sum_{k=0}^{\infty}\big(1-F_t(\tau)\big)^k + \tau\sum_{k=0}^{\infty}k\big(1-F_t(\tau)\big)^k\right)f_t(u)\,du\\
&= \int_0^{\tau}\left(\frac{u}{1-\big(1-F_t(\tau)\big)} + \tau\,\frac{1-F_t(\tau)}{\big(1-(1-F_t(\tau))\big)^2}\right)f_t(u)\,du\\
&= \frac{1}{F_t(\tau)}\int_0^{\tau} u\,f_t(u)\,du + \tau\,\frac{1-F_t(\tau)}{F_t(\tau)^2}\int_0^{\tau} f_t(u)\,du\\
\mu_{t_r}(\tau) &= \frac{1}{F_t(\tau)}\left(\int_0^{\tau} u\,f_t(u)\,du + \tau\big(1-F_t(\tau)\big)\right)
\end{aligned}
\]
\[
\mu_{t^-}(\tau) \triangleq \frac{1}{F_t(\tau)}\int_0^{\tau} u\,f_t(u)\,du
\;\Rightarrow\;
\mu_{t_r}(\tau) = \mu_{t^-}(\tau) + \tau\,\frac{1-F_t(\tau)}{F_t(\tau)}
\]
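The identity can be checked numerically; the sketch below estimates both sides from samples of an assumed heavy-tailed runtime distribution (a Weibull with shape parameter 1/2, as arises in Section A.4). The distribution, its parameters, and the interval are illustrative choices only.

#include <cstdio>
#include <random>

// Numerical sketch of the restart-interval mean: compare a direct simulation
// of the restarted process against mu_{t-}(tau) + tau (1 - F_t(tau)) / F_t(tau),
// with both sides estimated from samples of an assumed heavy-tailed runtime
// distribution (a Weibull with shape 1/2, as in Section A.4).
int main() {
    const double tau = 4.0;
    const int samples = 500000;
    std::mt19937 gen(11);
    std::weibull_distribution<double> runtime(0.5, 1.0);   // shape 1/2, scale 1

    // Estimate F_t(tau) and mu_{t-}(tau) from unrestarted samples.
    double below = 0.0, below_sum = 0.0;
    for (int i = 0; i < samples; ++i) {
        const double t = runtime(gen);
        if (t <= tau) { below += 1.0; below_sum += t; }
    }
    const double F_tau = below / samples;
    const double mu_minus = below_sum / below;
    const double predicted = mu_minus + tau * (1.0 - F_tau) / F_tau;

    // Simulate the restarted process directly: abandon any attempt that
    // exceeds tau (accumulating tau for it) and rerun until one finishes.
    double total = 0.0;
    for (int i = 0; i < samples; ++i) {
        double elapsed = 0.0, t;
        while ((t = runtime(gen)) > tau) elapsed += tau;
        total += elapsed + t;
    }
    std::printf("predicted %.4f  simulated %.4f\n", predicted, total / samples);
    return 0;
}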

A.3.2 Variance

Similar to the modified mean, the modified variance is derived via the modified distribution.

\[
\begin{aligned}
\sigma_{t_r}^2(\tau) &= \int_0^{\infty}\big(v-\mu_{t_r}(\tau)\big)^2 f_{t_r}(v)\,dv\\
&= \int_0^{\tau}\big(v-\mu_{t_r}(\tau)\big)^2 f_t(v)\,dv + \int_{\tau}^{2\tau}\big(v-\mu_{t_r}(\tau)\big)^2\big(1-F_t(\tau)\big)f_t(v-\tau)\,dv\\
&\qquad + \int_{2\tau}^{3\tau}\big(v-\mu_{t_r}(\tau)\big)^2\big(1-F_t(\tau)\big)^2 f_t(v-2\tau)\,dv + \cdots\\
&= \int_0^{\tau}\Big[\big(v-\mu_{t_r}(\tau)\big)^2 + \big(v+\tau-\mu_{t_r}(\tau)\big)^2\big(1-F_t(\tau)\big) + \big(v+2\tau-\mu_{t_r}(\tau)\big)^2\big(1-F_t(\tau)\big)^2 + \cdots\Big]f_t(v)\,dv\\
&= \int_0^{\tau}\left(\big(v-\mu_{t_r}(\tau)\big)^2\sum_{k=0}^{\infty}\big(1-F_t(\tau)\big)^k + 2\tau\big(v-\mu_{t_r}(\tau)\big)\sum_{k=0}^{\infty}k\big(1-F_t(\tau)\big)^k + \tau^2\sum_{k=0}^{\infty}k^2\big(1-F_t(\tau)\big)^k\right)f_t(v)\,dv\\
&= \int_0^{\tau}\left(\frac{\big(v-\mu_{t_r}(\tau)\big)^2}{F_t(\tau)} + 2\tau\big(v-\mu_{t_r}(\tau)\big)\frac{1-F_t(\tau)}{F_t(\tau)^2} + \tau^2\,\frac{\big(1-F_t(\tau)\big)\big(1+(1-F_t(\tau))\big)}{F_t(\tau)^3}\right)f_t(v)\,dv\\
&= \frac{1}{F_t(\tau)}\int_0^{\tau}\left(v-\mu_{t_r}(\tau)+\tau\,\frac{1-F_t(\tau)}{F_t(\tau)}\right)^2 f_t(v)\,dv + \tau^2\,\frac{1-F_t(\tau)}{F_t(\tau)^2}\\
\sigma_{t_r}^2(\tau) &= \frac{1}{F_t(\tau)}\int_0^{\tau}\big(v-\mu_{t^-}(\tau)\big)^2 f_t(v)\,dv + \tau^2\,\frac{1-F_t(\tau)}{F_t(\tau)^2}
\end{aligned}
\]

A.3.3 Usefulness

Useful restart intervals are defined here as those that reduce the mean of the original.

\[
\begin{aligned}
\mu_{t_r}(\tau) &\le \mu\\
\frac{1}{F_t(\tau)}\left(\int_0^{\tau} u\,f_t(u)\,du + \tau\big(1-F_t(\tau)\big)\right) &\le \int_0^{\infty} u\,f_t(u)\,du\\
\int_0^{\tau} u\,f_t(u)\,du + \tau\big(1-F_t(\tau)\big) &\le F_t(\tau)\int_0^{\infty} u\,f_t(u)\,du\\
\tau\big(1-F_t(\tau)\big) &\le F_t(\tau)\int_0^{\infty} u\,f_t(u)\,du - \int_0^{\tau} u\,f_t(u)\,du\\
\tau\big(1-F_t(\tau)\big) &\le F_t(\tau)\int_{\tau}^{\infty} u\,f_t(u)\,du - \big(1-F_t(\tau)\big)\int_0^{\tau} u\,f_t(u)\,du\\
\frac{\tau}{F_t(\tau)} &\le \frac{1}{1-F_t(\tau)}\int_{\tau}^{\infty} u\,f_t(u)\,du - \frac{1}{F_t(\tau)}\int_0^{\tau} u\,f_t(u)\,du
\end{aligned}
\]
\[
\mu_{t^+}(\tau) \triangleq \frac{1}{1-F_t(\tau)}\int_{\tau}^{\infty} u\,f_t(u)\,du
\;\Rightarrow\;
\frac{\tau}{F_t(\tau)} \le \mu_{t^+}(\tau) - \mu_{t^-}(\tau)
\]

A.3.4 Optimality

Local minima of the restart interval modified mean determine the optimal value.

\[
\begin{aligned}
\frac{d}{d\tau}\,\mu_{t_r}(\tau^*) &= 0\\
-\frac{f_t(\tau^*)}{F_t(\tau^*)^2}\left(\int_0^{\tau^*} u\,f_t(u)\,du + \tau^*\big(1-F_t(\tau^*)\big)\right) + \frac{1-F_t(\tau^*)}{F_t(\tau^*)} &= 0\\
-\frac{f_t(\tau^*)}{F_t(\tau^*)}\,\mu_{t_r}(\tau^*) + \frac{1-F_t(\tau^*)}{F_t(\tau^*)} &= 0\\
\mu_{t_r}(\tau^*) &= \frac{1-F_t(\tau^*)}{f_t(\tau^*)}
\end{aligned}
\]
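As an illustrative check (a standard property rather than a new result), the memoryless exponential density \(f_t(t) = \lambda e^{-\lambda t}\) gives
\[
\mu_{t^-}(\tau) = \frac{1}{\lambda} - \frac{\tau e^{-\lambda\tau}}{1-e^{-\lambda\tau}}, \qquad
\mu_{t_r}(\tau) = \mu_{t^-}(\tau) + \tau\,\frac{e^{-\lambda\tau}}{1-e^{-\lambda\tau}} = \frac{1}{\lambda} = \frac{1-F_t(\tau)}{f_t(\tau)},
\]
so every interval satisfies the stationarity condition and restarts neither help nor hurt, consistent with the usefulness criterion of Section A.3.3.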

A.4 Naïve Neighbor Scaling

To derive the density of possible runtimes for a nearest neighbor process, the runtime is approximated as proportional to the square of the number of iterations (which is in turn approximately proportional to the number of nodes).

\[
\sum_{k=1}^{N} k = \frac{N(N+1)}{2} = \frac{N^2+N}{2} \sim N^2
\]
\[
\begin{aligned}
t = n^2\;(0 \le n) \;\Rightarrow\; F_t(t) &= P\{n \le \sqrt{t}\}\\
&= F_n(\sqrt{t})\\
f_t(t) &= \frac{1}{2\sqrt{t}}\,f_n\!\left(\sqrt{t}\right)
\end{aligned}
\]

An exponential density of iterations is transformed into a Weibull density of runtimes (with a shape parameter of 1/2, making it heavy-tailed).

\[
\begin{aligned}
f_n(n) = \lambda e^{-\lambda n} \;\Rightarrow\; f_t(t) &= \frac{1}{2\sqrt{t}}\,\lambda e^{-\lambda\sqrt{t}}\\
&= \frac{\lambda^2}{2}\,\big(\lambda^2 t\big)^{-1/2}\,e^{-(\lambda^2 t)^{1/2}}\\
f_t(t) &= \frac{1/2}{1/\lambda^2}\left(\frac{t}{1/\lambda^2}\right)^{1/2-1} e^{-\left(\frac{t}{1/\lambda^2}\right)^{1/2}}
\end{aligned}
\]

Appendix B

Implementation

B.1 Algorithm Terminology and Notation

Algorithms described in this work use a common set of terminology and notation as follows. The syntax resembles that used in the C++ language, in which the scope resolution operator (::) represents functions of a class/object and the dot operator (.) accesses object data.

Symbols:
γ - solution path
δ - threshold collision checking resolution (in the SBL planner)
κ - collision checking specificity (in the SBL planner)
λ - metric separation from parent (in the SBL planner)
ρ - path length
n - node (i.e., a structure containing state/transition and other data)
v - viability (i.e., whether an APART planner node complies with its local tree’s cost-to-come threshold)
G - graph
Q - queue
T - tree
𝒯 - set of trees

Subscripts:
α/β - initial and final tree/node labels (used for swapping behavior)
b - branching (e.g., for branching nodes in the PART planner)
c - connecting (e.g., for a node and its local tree in the PART planners that reach a branched state)
h - horizon (e.g., for nodes on the “edge” of an APART planner local tree that have recently moved outside of its cost-to-come threshold)
i/f - initial/final (e.g., query states and trees/nodes rooted there)
o - obstacle-adjacent (e.g., for the set of local trees in the PART planners that have recently collided)
p - proximity (e.g., for nearby local trees in the PART planners’ branching processes)
r - random (e.g., for random target states)
s - selection (e.g., for a node selected for expansion)

Functions:
adopt() - take a supplied node (and its descendants) into the calling structure at a supplied position
compose path() - compose a solution path using the supplied data structures with optional ordering
decendants() - return the set of nodes descending from the calling node
detect collision() - collision check a state or path segment with optional transition and destination parameters
erase() - erase a contained object from the calling structure
initialize() - create a structure with only the supplied content
insert() - insert a new child object into the calling structure
local plan() - generate a path segment progressing from the supplied origin to the supplied destination with a maximum length
metric() - calculate the appropriate metric for the state space
nearest neighbor() - compute the minimum distance and identity of the closest node from the supplied set to the supplied state
parent() - return the next-highest node on the tree from the calling node
path exists() - check if a path exists that links the supplied nodes through the supplied data structure (e.g., a graph)
random element() - select a random element from the supplied set with optional weighting factors
random state() - generate a random state with optional location and radius parameters
root() - return the root node of the calling tree
swap() - reverse the ordering of the arguments
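For illustration only, the notation above might map onto C++ declarations along the following lines; these are hypothetical types, not the structures used by the software described in Section B.2.

#include <memory>
#include <vector>

// Illustrative declarations for the notation above (hypothetical, not the
// dissertation's actual data structures).  State and Transition would be
// supplied by a particular system.
template <typename State, typename Transition>
struct Node {                         // n: state/transition plus bookkeeping data
    State state;
    Transition transition;            // motion that reached this state from the parent
    Node* parent = nullptr;           // parent() in the notation
    double path_length = 0.0;         // rho: cost-to-come along the tree
    bool viable = true;               // v: complies with the local tree's threshold
    std::vector<std::unique_ptr<Node>> children;
};

template <typename State, typename Transition>
struct Tree {                         // T: a single (local) tree
    std::unique_ptr<Node<State, Transition>> root;   // root() in the notation
};

template <typename State, typename Transition>
using TreeSet = std::vector<Tree<State, Transition>>;  // the set of trees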

B.2 Software

Experimental results presented throughout this research originate from a common framework implemented using the C++ language. The design of this framework is conceptually visualized in Figure B.1 and is composed of separated sections for system and planner. Each problem is built as a statically-linked library containing a class that exposes functions of general use in motion planning such as collision detection and local planning; each planner is a template conditioned on the state and transition types that uses a system's basic functions to execute the procedure of a given algorithm. This allows all tested planners to execute using a fixed set of separately-implemented code to ensure consistency between tests. Both system and planner also output independent sets of summary statistics and solution path information to stored files as needed. This construction provides a common form for all data contained herein.

All software is specifically designed and written for use in this research, with the exceptions of the Proximity Query Package (PQP) [40] and the Standard Template Library (STL). This includes implementations of the various tested problems and algorithms along with support structures such as the cover tree (for nearest neighbor search) [8] and Fibonacci heap (for A∗ sorting). All testing and presented data comes from a computer equipped with an Intel® Core™ 2 Duo E6850 dual-core 3.0GHz CPU and 4GB of RAM, running Microsoft® Windows® 7 x64 Professional. All programs are compiled as single-threaded, 32-bit (x86) executables and are accordingly limited to a maximum of 2GB of addressable memory. Unless specifically noted, no premature terminations or cutoffs apply to the simulations, to ensure the validity of included distributions and statistics. All quoted runtimes are measured as CPU time.

Figure B.1: Design diagram for the simulation software. Each planner is abstracted to a general definition of a system that has state and transition types and exposes functions for collision detection, metric computation, local planning, and random sampling. Both planner and system then output statistics and performance data related to their execution.
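A minimal, self-contained sketch of this separation is given below; the class and function names are illustrative stand-ins, not the framework's actual interfaces, and the trivial system models an obstacle-free unit square.

#include <cmath>
#include <random>
#include <vector>

// Minimal sketch of the system/planner separation described above; the names
// are illustrative stand-ins, not the framework's actual interfaces.  A system
// supplies the state type and the basic motion planning operations.
struct ExampleSystem {
    using State = std::vector<double>;

    // Trivial stand-ins: a 2-D obstacle-free unit square with straight motions.
    bool detect_collision(const State&, const State&) const { return false; }
    double metric(const State& a, const State& b) const {
        return std::hypot(a[0] - b[0], a[1] - b[1]);
    }
    State random_state() {
        std::uniform_real_distribution<double> u(0.0, 1.0);
        return {u(gen), u(gen)};
    }
    std::mt19937 gen{42};
};

// A planner is a template conditioned on the system's types and calls only its
// basic functions, so every tested algorithm runs against the same system code.
template <typename System>
class ExamplePlanner {
public:
    explicit ExamplePlanner(System& system) : system_(system) {}
    void iterate() {
        // A full planner would extend a tree toward this target and
        // collision check the resulting motion via the system.
        auto target = system_.random_state();
        (void)system_.detect_collision(target, target);
    }
private:
    System& system_;
};

int main() {
    ExampleSystem system;
    ExamplePlanner<ExampleSystem> planner(system);
    for (int i = 0; i < 10; ++i) planner.iterate();
    return 0;
}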

Bibliography

[1] M. Akinc, K. E. Bekris, B. Y. Chen, A. M. Ladd, E. Plaku, and L. E. Kavraki, “Probabilistic roadmaps of trees for parallel computation of multiple query roadmaps,” in Robotics Research: The 11th International Symposium, Siena, Italy, October 2003, pp. 80–89.

[2] N. M. Amato, O. B. Bayazit, L. K. Dale, C. Jones, and D. Vallejo, “OBPRM: An obstacle-based PRM for 3D workspaces,” in Robotics: The Algorithmic Perspective, Houston, Texas, USA, March 1998, pp. 155–168.

[3] N. M. Amato, O. B. Bayazit, L. K. Dale, C. Jones, and D. Vallejo, “Choosing good distance metrics and local planners for probabilistic roadmap methods,” IEEE Transactions on Robotics and Automation, vol. 16, no. 4, pp. 442–447, August 2000.

[4] F. Aurenhammer, “Voronoi diagrams-a survey of a fundamental geometric data structure,” ACM Computing Surveys, vol. 23, no. 3, pp. 345–405, September 1991.

[5] J. Barraquand and J. Latombe, “A Monte-Carlo algorithm for path planning with many degrees of freedom,” in Proceedings of the 1990 IEEE International Conference on Robotics and Automation, Cincinnati, Ohio, USA, May 1990, pp. 1712–1717.

[6] K. E. Bekris, B. Y. Chen, A. M. Ladd, E. Plaku, and L. E. Kavraki, “Multiple query probabilistic roadmap planning using single query planning primitives,” in Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems, Las Vegas, Nevada, USA, October 2003, pp. 656–661.

[7] P. Bessière, J. Ahuactzin, E. Talbi, and E. Mazer, “The Ariadne’s clew algorithm: Global planning with local methods,” in Proceedings of the 1993 IEEE/RSJ International Conference on Intelligent Robots and Systems, Yokohama, Japan, July 1993, pp. 1373–1380.

[8] A. Beygelzimer, S. Kakade, and J. Langford, “Cover trees for nearest neighbor,” in Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, Pennsylvania, USA, June 2006, pp. 97–104.

[9] R. Bohlin and L. E. Kavraki, “Path planning using lazy PRM,” in Proceedings of the 2000 IEEE International Conference on Robotics and Automation, San Francisco, California, USA, April 2000, pp. 521–528.

[10] V. Boor, M. H. Overmars, and A. F. van der Stappen, “The Gaussian sampling strategy for probabilistic roadmap planners,” in Proceedings of the 1999 IEEE International Conference on Robotics and Automation, Detroit, Michigan, USA, May 1999, pp. 1018–1023.

[11] M. S. Branicky, S. M. LaValle, K. Olson, and L. Yang, “Quasi-randomized path planning,” in Proceedings of the 2001 IEEE International Conference on Robotics and Automation, Seoul, Korea, May 2001, pp. 1481–1487.

[12] A. Brüngger, A. Marzetta, K. Fukuda, and J. Nievergelt, “The parallel search bench and its applications,” Annals of Operations Research, vol. 90, no. 0, pp. 45–63, January 1999.

[13] J. F. Canny, “The complexity of robot motion planning,” Ph.D. dissertation, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA, May 1987.

[14] H. Chen, C. Gomes, and B. Selman, “Formal models of heavy-tailed behavior in combinatorial search,” in Proceedings of the Seventh International Conference on Principles and Practices of Constraint Programming, Paphos, Cyprus, November 2001, pp. 408–421.

[15] J. D. Cohen, M. C. Lin, D. Manocha, and M. Ponamgi, “I-COLLIDE: An interactive and exact collision detection system for large-scale environments,” in Proceedings of the 1995 Symposium on Interactive 3D Graphics, Monterey, California, USA, April 1995, pp. 189–196.

[16] G. E. Collins, “Quantifier elimination for real closed fields by cylindrical algebraic decomposition,” in Proceedings of the Second GI Conference on Automata Theory and Formal Languages (Lecture Notes in Computer Science), Berlin, Germany, May 1975, pp. 134–183.

[17] M. M. Curtiss, “Motion planning and control using RRTs,” Master’s thesis, Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, Ohio, USA, May 2002.

[18] M. Dyer, A. Frieze, and R. Kannan, “A random polynomial-time algorithm for approximating the volume of convex bodies,” Journal of the ACM, vol. 38, no. 1, pp. 1–17, January 1991.

[19] P. Embrechts and H. Schmidli, “Modelling of extremal events in insurance and finance,” Mathematical Methods of Operations Research, vol. 39, no. 1, pp. 1–34, February 1994.

[20] A. Foisy and V. Hayward, “A safe swept volume method for collision detection,” in Robotics Research: The 6th International Symposium, Hidden Valley, Pennsylvania, USA, October 1993, pp. 61–68.

[21] E. Frazzoli, “Robust hybrid control for autonomous vehicle motion planning,” Ph.D. dissertation, Department of Aeronautics and Astronautics, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA, June 2001.

[22] J. H. Friedman, J. L. Bentley, and R. A. Finkel, “An algorithm for finding best matches in logarithmic expected time,” ACM Transactions on Mathematical Software, vol. 3, no. 3, September 1977.

[23] R. Geraerts and M. H. Overmars, “A comparative study of probabilistic roadmap planners,” in Algorithmic Foundations of Robotics V (The 2002 Workshop on the Algorithmic Foundation of Robotics), Nice, France, December 2002, pp. 43–58.

[24] E. N. Gilbert, “Random subdivisions of space into crystals,” The Annals of Mathematical Statistics, vol. 33, no. 3, pp. 958–972, September 1962.

[25] F. Glover and M. Laguna, Tabu Search. Norwell, Massachusetts, USA: Kluwer Academic Publishers, June 1998, ch. Tabu Search Background, pp. 1–24.

[26] C. P. Gomes, Constraint and Integer Programming: Toward a Unified Methodology. Norwell, Massachusetts, USA: Kluwer Academic Publishers, November 2003, ch. Randomized Backtrack Search, pp. 233–291.

[27] C. P. Gomes, B. Selman, and H. Kautz, “Boosting combinatorial search through randomization,” in Proceedings of the 15th National Conference on Artificial Intelligence, Madison, Wisconsin, USA, July 1998, pp. 431–437.

[28] C. P. Gomes, B. Selman, N. Crato, and H. Kautz, “Heavy-tailed phenomena in satisfiability and constraint satisfaction problems,” Journal of Automated Reasoning, vol. 24, no. 1–2, pp. 67–100, February 2000.

[29] S. Gottschalk, M. C. Lin, and D. Manocha, “OBBTree: A hierarchical structure for rapid interface detection,” in Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques, New Orleans, Louisiana, USA, August 1996, pp. 171–180.

[30] C. A. Hoare, “Quicksort,” The Computer Journal, vol. 5, no. 1, pp. 10–16, May 1962.

[31] D. Hsu, T. Jiang, J. Reif, and Z. Sun, “The bridge test for sampling narrow passages with probabilistic roadmap planners,” in Proceedings of the 2003 IEEE International Conference on Robotics and Automation, Taipei, Taiwan, September 2003, pp. 4420–4426.

[32] D. Hsu, L. E. Kavraki, J. Latombe, R. Motwani, and S. Sorkin, “On finding narrow passages with probabilistic roadmap planners,” in Robotics: The Algorithmic Perspective (1998 Workshop on the Algorithmic Foundations of Robotics), Houston, Texas, USA, March 1998, pp. 141–153.

[33] P. Isto, M. Mäntylä, and J. Tuominen, “On addressing the run-cost variance in randomized motion planners,” in Proceedings of the 2003 IEEE International Conference on Robotics and Automation, Taipei, Taiwan, September 2003, pp. 2934–2939.

[34] L. Jaillet, A. Yershova, S. M. LaValle, and T. Siméon, “Adaptive tuning of the sampling domain for dynamic-domain RRTs,” in Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, Edmonton, Alberta, Canada, August 2005, pp. 2851–2856.

[35] L. E. Kavraki, P. Švestka, J. Latombe, and M. H. Overmars, “Probabilistic roadmaps for path planning in high-dimensional configuration spaces,” IEEE Transactions on Robotics and Automation, vol. 12, no. 4, pp. 566–580, August 1996.

[36] T. Kiang, “Random fragmentation in two and three dimensions,” Zeitschrift für Astrophysik, vol. 64, no. 1, pp. 433–439, June 1966.

[37] J. J. Kuffner, “Effective sampling and distance metrics for 3D rigid body path planning,” in Proceedings of the 2004 IEEE International Conference on Robotics and Automation, New Orleans, Louisiana, USA, April 2004, pp. 3993–3998.

[38] J. J. Kuffner and S. M. LaValle, “RRT-Connect: An efficient approach to single- query path planning,” in Proceedings of the 2000 IEEE International Conference on Robotics and Automation, San Francisco, California, USA, April 2000, pp. 995–1001.

[39] A. Ladd and L. E. Kavraki, “Generalizing the analysis of PRM,” in Proceed- ings of the 2002 IEEE International Conference on Robotics and Automation, Washington, DC, USA, May 2002, pp. 2120–2125.

[40] E. Larsen and S. Gottschalk, “Proximity query package,” PQP - A Proximity Query Package, May 2002. [Online]. Available: http://gamma.cs.unc.edu/SSV/

[41] E. Larsen, S. Gottschalk, M. C. Lin, and D. Manocha, “Fast distance queries with rectangular swept sphere volumes,” in Proceedings of the 2000 IEEE International Conference on Robotics and Automation, San Francisco, California, USA, April 2000, pp. 3719–3726.

[42] J. Laumond and P. Souères, “Metric induced by the shortest paths for a car-like mobile robot,” in Proceedings of the 1993 IEEE/RSJ International Conference on Intelligent Robots and Systems, Yokohama, Japan, July 1993, pp. 1299–1304.

[43] S. M. LaValle, “Rapidly-exploring random trees: A new tool for path planning,” Department of Computer Science, Iowa State University, Ames, Iowa, USA, Tech. Rep. 98-11, October 1998.

[44] S. M. LaValle, Planning Algorithms. New York, New York, USA: Cambridge University Press, July 2006, ch. Motion Planning, pp. 77–432.

[45] S. M. LaValle, M. S. Branicky, and S. R. Lindemann, “On the relationship between classical grid search and probabilistic roadmaps,” The International Journal of Robotics Research, vol. 23, no. 7–8, pp. 673–692, August 2004.

[46] S. M. LaValle and J. J. Kuffner, “Rapidly-exploring random trees: Progress and prospects,” in Algorithmic and Computational Robotics: New Directions (The 4th Workshop on the Algorithmic Foundations of Robotics), Hanover, New Hampshire, USA, March 2000, pp. 293–308.

[47] S. R. Lindemann and S. M. LaValle, “Incrementally reducing dispersion by increasing Voronoi bias in RRTs,” in Proceedings of the 2004 IEEE International Conference on Robotics and Automation, New Orleans, Louisiana, USA, April 2004, pp. 3251–3257.

[48] S. R. Lindemann and S. M. LaValle, “Steps toward derandomizing RRTs,” in Proceedings of the 4th International Workshop on Robot Motion and Control, Puszczykowo, Poland, June 2004, pp. 271–277.

[49] M. Luby, A. Sinclair, and D. Zuckerman, “Optimal speedup of Las Vegas algo- rithms,” Information Processing Letters, vol. 47, no. 4, pp. 173–180, September 1993.

[50] D. G. Luenberger, Introduction to Dynamic Systems: Theory, Models, and Applications. Hoboken, New Jersey, USA: John Wiley and Sons, May 1979, ch. Markov Chains, pp. 224–253.

[51] M. Montemerlo et al., “Junior: The Stanford entry in the Urban Challenge,” Journal of Field Robotics, vol. 25, no. 9, pp. 569–597, September 2008.

[52] S. Morgan and M. S. Branicky, “Sampling-based planning for discrete spaces,” in Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems, Sendai, Japan, September 2004, pp. 1938–1945.

[53] S. B. Morgan, “Sampling-based planning for discrete spaces,” Master’s thesis, Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, Ohio, USA, May 2004.

[54] M. M. Muller, “Some continuous Monte Carlo methods for the Dirichlet problem,” The Annals of Mathematical Statistics, vol. 27, no. 3, pp. 569–589, September 1956.

[55] W. Newman, “Team Case and the 2007 DARPA Urban Challenge,” Semifinalist Technical Papers, November 2007. [Online]. Available: http://archive.darpa.mil/grandchallenge/resources.asp

[56] E. Plaku, K. E. Bekris, B. Y. Chen, A. M. Ladd, and L. E. Kavraki, “Sampling-based roadmap of trees for parallel motion planning,” IEEE Transactions on Robotics, vol. 21, no. 4, pp. 597–608, August 2005.

[57] I. Pohl, “Bi-directional and heuristic search in path problems,” Ph.D. dissertation, Department of Computer Science, Stanford University, Stanford, California, USA, May 1969.

[58] J. A. Reeds and L. A. Shepp, “Optimal paths for a car that goes both forwards and backwards,” Pacific Journal of Mathematics, vol. 145, no. 2, pp. 367–393, October 1990.

[59] J. H. Reif, Planning, Geometry, and Complexity of Robot Motion. Norwood, New Jersey, USA: Ablex Publishing Corporation, May 1987, ch. Complexity of the Generalized Mover’s Problem, pp. 267–281.

[60] G. Sánchez and J. Latombe, “A single-query bi-directional probabilistic roadmap planner with lazy collision checking,” in Robotics Research: The 10th International Symposium, Lorne, Australia, November 2001, pp. 403–417.

[61] F. Schwarzer, M. Saha, and J. Latombe, “Adaptive dynamic collision checking for single and multiple articulated robots in complex environments,” IEEE Transactions on Robotics, vol. 21, no. 3, pp. 338–353, June 2005.

[62] M. Strandberg, “Augmenting RRT-planners with local trees,” in Proceedings of the 2004 IEEE International Conference on Robotics and Automation, New Orleans, Louisiana, USA, April 2004, pp. 3258–3262.

[63] S. Thrun et al., “Stanley: The robot that won the DARPA Grand Challenge,” Journal of Field Robotics, vol. 23, no. 9, pp. 661–692, September 2006.

[64] N. A. Wedge and M. S. Branicky, “On heavy-tailed runtimes and restarts in rapidly-exploring random trees,” in Proceedings of the First Symposium on Search Techniques in Artificial Intelligence and Robotics, Chicago, Illinois, USA, July 2008, pp. 127–133.

[65] N. A. Wedge and M. S. Branicky, “Using path-length localized RRT-like search to solve challenging planning problems,” in Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China, May 2011, to be published.

[66] B. Yamrom, “Alpha puzzle / flange,” Texas A&M University Algorithms and Applications Group: Motion Planning Puzzles, October 2000. [Online]. Available: http://parasol-www.cs.tamu.edu/dsmft/benchmarks/

[67] A. Yershova, L. Jaillet, T. Siméon, and S. M. LaValle, “Dynamic-domain RRTs: Efficient exploration by controlling the sampling domain,” in Proceedings of the 2005 IEEE International Conference on Robotics and Automation, Barcelona, Spain, April 2005, pp. 3856–3861.
