Sequential Quadratic Programming for Task Plan Optimization
Total Page:16
File Type:pdf, Size:1020Kb
Sequential Quadratic Programming for Task Plan Optimization Dylan Hadfield-Menell2, Christopher Lin1, Rohan Chitnis1, Stuart Russell2, and Pieter Abbeel2 Abstract— We consider the problem of refining an abstract task plan into a motion trajectory. Task and motion planning is a hard problem that is essential to long-horizon mobile manipulation. Many approaches divide the problem into two steps: a search for a task plan and task plan refinement to find a feasible trajectory. We apply sequential quadratic programming to jointly optimize over the parameters in a task plan (e.g., trajectories, grasps, put down locations). We provide two modifications that make our formulation more suitable to task and motion planning. We show how to use movement primitives to reuse previous solutions (and so save optimization effort) without trapping the algorithm in a poor basin of attraction. We also derive an early convergence criterion that lets us (a) Straight line initialization. (b) Backtracking solution. quickly detect unsatisfiable constraints so we can re-initialize their variables. We present experiments in a navigation amongst movable objects domain and show substantial improvement in cost over a backtracking refinement algorithm. I. INTRODUCTION Long-horizon mobile manipulation planning is a funda- mental problem in robotics. Viewed as trajectory optimiza- tion, these problems are wildly non-convex and direct motion planning is usually infeasible. Viewed as a classical planning problem, there is no good way to represent the geometry of the problem efficiently in a STRIPS or PDDL representation. (c) Intermediate Solution. (d) Final Solution. The robotics and planning communities have studied the Fig. 1: The robot, shown in red, moves a green can to the goal problem of task and motion planning (TAMP) as a way to location. The backtracking solution samples and fixes a trajectory overcome these challenges. TAMP integrates classical task waypoint. This leads to an unnecessarily long path. (c) and (d) show planning methods, that can handle long horizons, with mo- an intermediate and final trajectory computed by running sequential tion planning approaches, that can handle complex geometry. quadratic programming on the task plan. Recent years have seen a variety of approaches to finding feasible task and motion plans [1], [2], [3]. solve a sequence of independent motion planning problems. The approach to TAMP in [1] relies on three components: We propose an approach that jointly optimizes over all of a black box classical planner that ignores geometry to find the parameters and trajectories in a given abstract plan. This an abstract task plan, a black box motion planner that can leads to final solutions with substantially lower cost, when determine motion plans for a given abstract action, and an compared with approaches that compute motion plans for interface that shares information between the two different each high level action independently. Figure 1 shows an planners. Task plans consist of bound object references example that compares the result from joint optimization with the result from a backtracking search. (e.g., can1) and unbound pose references (e.g., pose1). Pose references are continuous parameters that are characterized The optimization problems we consider are highly non- by a set of constraints. For example, a task plan may require convex. We rely on randomized restarts to find solutions: if that pose1 be a grasping pose for can1. we fail to converge, we determine variables associated with The process of motion planning for an abstract plan infeasible constraints and sample new initial values. After a is called plan refinement. If plan refinement for a given fixed budget of restarts, we return to the task planning layer task plan fails, the interface updates the task planner with and generate a new task plan. We contribute two algorithmic information that lets it plan around the failure. In this work, modifications that facilitate efficient randomized restarts. we contribute a novel method for the task plan refinement The first modification uses a minimum velocity projec- component of this system. Our approach has applications to tion [4] of the previous solution to re-initialize trajectories. systems that use a similar decomposition and as a trajectory This preserves the overall global structure of the trajecto- smoother for general TAMP algorithms. ries without trapping new solutions in the same basin of Current approaches to task plan refinement rely on a attraction. The second modification is an early convergence backtracking search over the parameters of the plan and criterion that checks to see if a constraint is likely to be unsatisfiable. This allows us to restart more frequently and • a configuration space of robot poses reduces solution time. • a set of obstacles O Our contributions are as follows: 1) we apply sequential • an initial and goal configuration. convex programming to jointly optimize over the trajectories We define configuration spaces by a set of feasible robot and parameters in a plan refinement; 2) we show how to poses X and a dynamics constraint. The dynamics constraint reuse previous solutions without trapping the optimization is a Boolean function f : X × X ! f0; 1g. It takes as input in a bad basin of attraction; and 3) we show how to do early a pair of poses p1; p2 and is 1 iff p2 is directly reachable convergence detection to avoid wasted effort on infeasible from p1. plans. We present experiments that compare our approach Figure 1 shows a 2D motion planning problem that will to a backtracking refinement. Our approach leads a 2-4x serve as the starting point for a running example. The reduction in the total path cost of solutions at the cost of a pose of the robot is represented by a pair (x,y). We let 1.5-3x increase in running time. We verify that our proposed X be a bounding box so x 2 [0; 7] and y 2 [−2; 7]: modifications led to reductions in refinement time. The dynamics function ensures that the distance between subsequent states of the trajectory is always less than a fixed II. RELATED WORK constant: f(p1; p2) = (p1 − p2 < dmax): Related work largely comes from plan-skeleton ap- There are three main approaches to motion planning proaches to task and motion planning. These are approaches that are used in practice: discretized configuration space that search over a purely discrete representation of the search [8], randomized motion planners [9], [10], and problem and then attempt to refine the task plans they obtain. trajectory optimization [7], [11]. In this work, we build on Toussaint [3] also considers joint trajectory optimization trajectory optimization approaches. to refine an abstract plan. In his formulation, the symbolic The downside of trajectory optimization approaches is that state from a task plan defines constraints on a trajectory they are usually locally optimal and incomplete, while the optimization. The system optimizes jointly over all plan other approaches have completeness or global optimality parameters and uses an initialization scheme similar to ours. guarantees. The upside of trajectory optimization is that it The problems they consider are difficult because the inter- scales well to high dimensions and converges quickly. The mediate states are complicated structures that must satisfy second property is useful in a task and motion planning stability constraints. In contrast, the problems we consider context because it quickly rules out infeasible task plans. are difficult because motion planning problems are hard to Trajectory optimization generates a motion plan by solving solve. This leads us to focus on trajectory re-use and early the following constrained optimization problem. convergence detection. Lozano-Perez´ and Kaelbling [5] consider a similar ap- 2 proach. They enumerate plans that could possibly achieve min jjτjj (1) τt2X a goal. For each such abstract plan, they discretize the subject to f(τ ; τ ) = 1 parameters in the plan and formulate a discrete constraint t t+1 satisfaction problem. They use an off-the-shelf CSP solver SD(τt; o) ≥ dsafe 8o 2 O to find a trajectory consistent with the constraints imposed τ0 = p0; τT = pT by the abstract plan. Our approach to refinement draws on We optimize over a fixed number of waypoints τ , with this perspective, but we do not discretize the plan parameters; t t = 0;:::;T . The objective jjτjj2 is a regularizer that pro- instead, we use continuous optimization to set them. duces smooth trajectories. A standard choice is the minimum Lagriffoul and Andres [6] define the fluents in their task velocity regularizer planning formulation in a similar way to ours. They use these 2 X 2 constraint definitions to solve a linear program over the plan jjτjj = jjτt − τt+1jj : parameters. They then use this LP to reduce the effort of a t backtracking search for plan refinement. This is similar to The first constraint is the dynamics constraint that ensures the first initialization step that we and [3] use, in that it only that the pose at time t + 1 is reachable from the pose at time considers the intermediate states. t. The second constraint is a collision avoidance constraint. It 1 III. TRAJECTORY OPTIMIZATION WITH requires that the distance from any robot pose to an object SEQUENTIAL QUADRATIC PROGRAMMING be larger than a fixed safety margin. The final constraint ensures that the trajectory begins (resp. ends) at the initial Our approach uses sequential quadratic programming to (resp. final) pose. do task plan refinement. In this section, we describe the motion planning algorithm from [7], which applies sequential Sequential Quadratic Programming quadratic programming to motion planning. [7] applied sequential quadratic programing (SQP) to Motion Planning as Constrained Trajectory Optimization trajectory optimization. Practically, this treats a robot tra- A core problem in robotics is motion planning: finding jectory as variables in a mathematical program and applies a collision-free path between fixed start and goal poses.