ALTRO: A Fast Solver for Constrained Trajectory Optimization

Taylor A. Howell1∗, Brian E. Jackson1∗, and Zachary Manchester2

1 Department of Mechanical Engineering, Stanford University, USA {thowell,bjack205}@stanford.edu
2 Department of Aeronautics and Astronautics, Stanford University, USA [email protected]
* These authors contributed equally to this work

Abstract— Trajectory optimization is a widely used tool for robot motion planning and control. Existing solvers for these problems either rely on off-the-shelf solvers that are numerically robust and capable of handling arbitrary constraints, but tend to be slow because they are general purpose; or they use custom numerical methods that take advantage of the problem structure to be fast, but often lack robustness and have limited or no ability to reason about constraints. This paper presents ALTRO (Augmented Lagrangian TRajectory Optimizer), a solver for constrained trajectory optimization problems that handles general nonlinear state and input constraints and offers fast convergence and numerical robustness thanks to careful exploitation of problem structure. We demonstrate its performance on a set of benchmark motion-planning problems and offer comparisons to the standard direct collocation method with large-scale sequential and interior-point solvers.

I. INTRODUCTION

Trajectory optimization is a powerful tool for motion planning, enabling the synthesis of dynamic motion for complex underactuated robotic systems. This general framework can be applied to robots with nonlinear dynamics and constraints where other motion planning paradigms, such as sample-based planning, inverse dynamics, or differential flatness, might be ineffective or impractical.

Numerical trajectory optimization algorithms solve variations of the following problem,

\begin{aligned}
\underset{x_{0:N},\, u_{0:N-1},\, \Delta t}{\text{minimize}} \quad & \ell_N(x_N) + \sum_{k=0}^{N-1} \ell_k(x_k, u_k, \Delta t) \\
\text{subject to} \quad & x_{k+1} = f(x_k, u_k, \Delta t), \\
& g_k(x_k, u_k) \leq 0, \\
& h_k(x_k, u_k) = 0,
\end{aligned}    (1)

where k is the time-step index, ℓ_N and ℓ_k are the terminal and stage costs, x_k and u_k are the states and control inputs, Δt is the duration of a time step, f(x_k, u_k, Δt) is the discrete dynamics, and g_k(x_k, u_k) and h_k(x_k, u_k) are inequality and equality constraints.

To solve such problems, "direct" methods treat both states and controls as decision variables and use general-purpose nonlinear programming (NLP) solvers, such as SNOPT [1] and IPOPT [2], that tend to be versatile and robust. It is straightforward to provide an initial state trajectory to the solver with such methods, even if the trajectory is dynamically infeasible. Direct transcription (DIRTRAN) [3] and direct collocation (DIRCOL) [4] are common direct methods.

Alternatively, "indirect" methods exploit the Markov structure of (1) and only treat the control inputs as decision variables, with the dynamics constraints implicitly enforced by simulating the system's dynamics. Differential Dynamic Programming (DDP) [5] and iterative LQR (iLQR) [6] are closely related indirect methods that solve (1) by breaking it into a sequence of smaller sub-problems. Because of their strict enforcement of dynamic feasibility, it is often difficult to find a control sequence that produces a reasonable initialization for DDP methods. While they are fast and have a low memory footprint, making them amenable to embedded implementation, DDP methods have historically been considered less numerically robust and less well-suited to handling nonlinear state and input constraints.

Several efforts have been made to incorporate constraints into DDP methods: box constraints on controls [7] and stage-wise inequality constraints on the states [8], [9] have been handled by solving a constrained quadratic program (QP) at each sub-problem step. A projection method was also devised that satisfies linearized terminal state and stage state-control constraints [10]. Augmented Lagrangian (AL) methods have been proposed [11], including hybrid approaches that also solve constrained QPs for stage state-control constraints [9], [12]. Mixed state-control constraints have also been handled in [13]. Unfortunately, all of these methods either have limitations on the types of constraints they can handle, or suffer from numerical instability and conditioning issues.
This paper presents ALTRO (Augmented Lagrangian TRajectory Optimizer), a solver for constrained trajectory optimization problems of the form (1), that combines the best characteristics of direct and DDP methods, namely: speed, numerical robustness, handling of general state and input constraints, and initialization with infeasible state trajectories. ALTRO combines iLQR with an augmented Lagrangian method to handle general state and input constraints and an active-set projection method for final "solution polishing" to achieve fast and robust overall convergence. Our novel contributions are: 1) a numerically robust square-root formulation of the DDP algorithm, 2) a method for initializing DDP with an infeasible state trajectory, 3) a strategy for solving minimum-time problems with DDP, and 4) an objective-weighted Newton projection method for solution polishing.

The remainder of the paper is organized as follows: Section II reviews the iLQR algorithm with augmented Lagrangian constraint handling, which we denote AL-iLQR, along with some results on matrix square roots. Section III presents the ALTRO algorithm in detail. Comparisons between ALTRO and DIRCOL are then provided for several motion-planning problems in Section IV. Finally, we summarize our findings in Section V.

II. BACKGROUND

Notation: For a function ℓ(x, u): R^n × R^m → R, we define ℓ_x ≡ ∂ℓ/∂x ∈ R^n, ℓ_xx ≡ ∂²ℓ/∂x² ∈ R^{n×n}, and ℓ_xu ≡ ∂²ℓ/∂x∂u ∈ R^{n×m}. We also use the notation (A, B, C) to indicate the vertical concatenation of matrices, [A^T B^T C^T]^T.

A. Augmented Lagrangian iLQR

We present a succinct description of AL-iLQR comprising two parts: the modified iLQR iteration, and the Lagrange multiplier and penalty updates used between successive iLQR steps.
1) Modified iLQR: From (1), the augmented Lagrangian is:

\begin{aligned}
L_A(X, U, \lambda, \mu) ={}& \ell_N(x_N) + \Big(\lambda_N + \tfrac{1}{2} I_{\mu_N} c_N(x_N)\Big)^T c_N(x_N) \\
&+ \sum_{k=0}^{N-1} \Big[ \ell_k(x_k, u_k, \Delta t) + \Big(\lambda_k + \tfrac{1}{2} I_{\mu_k} c_k(x_k, u_k)\Big)^T c_k(x_k, u_k) \Big] \\
={}& L_N(x_N, \lambda_N, \mu_N) + \sum_{k=0}^{N-1} L_k(x_k, u_k, \lambda_k, \mu_k),
\end{aligned}    (2)

where λ_k ∈ R^{p_k} is a Lagrange multiplier, µ_k ∈ R^{p_k} is a penalty weight, and c_k = (g_k, h_k) ∈ R^{p_k} is the concatenated set of inequality and equality constraints with index sets I_k and E_k, respectively. I_{µ_k} is a diagonal matrix defined as,

I_{\mu_k, ii} = \begin{cases} 0 & \text{if } c_{k_i}(x_k, u_k) < 0 \,\wedge\, \lambda_{k_i} = 0,\; i \in \mathcal{I}_k \\ \mu_{k_i} & \text{otherwise,} \end{cases}    (3)

where k_i indicates the ith constraint at time step k.

The dynamics constraints are handled implicitly using an initial state x_0 and nominal control trajectory, U = {u_0, ..., u_{N−1}}, to simulate forward the state trajectory X = {x_0, ..., x_N}.

The backward pass is derived by defining the optimal cost-to-go for fixed multipliers and penalty terms, V|_{λ,µ}, and the recurrence relationship,

\begin{aligned}
V_N(x_N)|_{\lambda,\mu} &= L_N(x_N, \lambda_N, \mu_N) \\
V_k(x_k)|_{\lambda,\mu} &= \min_{u_k} \big\{ L_k(x_k, u_k, \lambda_k, \mu_k) + V_{k+1}(f(x_k, u_k))|_{\lambda,\mu} \big\} = \min_{u_k} Q(x_k, u_k)|_{\lambda,\mu},
\end{aligned}

where Q_k = Q(x_k, u_k)|_{λ,µ} is the action-value function. To make the step tractable, we take a second-order Taylor expansion of V_k with respect to the state variable and fixed λ and µ,

\delta V_k \approx p_k^T \delta x_k + \tfrac{1}{2} \delta x_k^T P_k \delta x_k,    (4)

resulting in the optimal terminal cost-to-go second-order expansion,

p_N = (\ell_N)_x + (c_N)_x^T (\lambda + I_{\mu_N} c_N)    (5)
P_N = (\ell_N)_{xx} + (c_N)_x^T I_{\mu_N} (c_N)_x.    (6)

The relationship between δV_k and δV_{k+1} is derived by taking the second-order Taylor expansion of Q_k with respect to the state and control,

\delta Q_k = \frac{1}{2} \begin{bmatrix} \delta x_k \\ \delta u_k \end{bmatrix}^T \begin{bmatrix} Q_{xx} & Q_{xu} \\ Q_{ux} & Q_{uu} \end{bmatrix} \begin{bmatrix} \delta x_k \\ \delta u_k \end{bmatrix} + \begin{bmatrix} Q_x \\ Q_u \end{bmatrix}^T \begin{bmatrix} \delta x_k \\ \delta u_k \end{bmatrix}.    (7)

Dropping the time-step indices for notational clarity, the block matrices are,

Q_{xx} = \ell_{xx} + A^T P' A + c_x^T I_\mu c_x    (8)
Q_{uu} = \ell_{uu} + B^T P' B + c_u^T I_\mu c_u    (9)
Q_{ux} = \ell_{ux} + B^T P' A + c_u^T I_\mu c_x    (10)
Q_x = \ell_x + A^T p' + c_x^T (\lambda + I_\mu c)    (11)
Q_u = \ell_u + B^T p' + c_u^T (\lambda + I_\mu c),    (12)

where A = ∂f/∂x|_{x_k,u_k}, B = ∂f/∂u|_{x_k,u_k}, and ' indicates variables at the k + 1 time step. Consistent with the use of linearized dynamics in iLQR, linearized constraints are also used in the expansion.

Minimizing (7) with respect to δu_k gives a correction to the control trajectory. The result is a feedforward term d_k and a linear feedback term K_k δx_k. Regularization is added to ensure the invertibility of Q_uu:

\delta u_k^* = -(Q_{uu} + \rho I)^{-1} (Q_{ux} \delta x_k + Q_u) \equiv K_k \delta x_k + d_k.    (13)

Substituting δu_k^* back into (7), a closed-form expression for p_k, P_k, and the expected change in cost, ΔV_k, is found:

P_k = Q_{xx} + K_k^T Q_{uu} K_k + K_k^T Q_{ux} + Q_{xu} K_k    (14)
p_k = Q_x + K_k^T Q_{uu} d_k + K_k^T Q_u + Q_{xu} d_k    (15)
\Delta V_k = d_k^T Q_u + \tfrac{1}{2} d_k^T Q_{uu} d_k.    (16)

A forward pass then simulates the system using the correction to the nominal control trajectory, and a line search is performed on the feedforward term d_k to ensure a reduction in cost.

2) Augmented Lagrangian Updates: After an iLQR step with λ and µ held constant, the dual variables are updated according to,

\lambda_{k_i}^+ = \begin{cases} \lambda_{k_i} + \mu_{k_i} c_{k_i}(x_k^*, u_k^*) & i \in \mathcal{E}_k \\ \max\big(0,\ \lambda_{k_i} + \mu_{k_i} c_{k_i}(x_k^*, u_k^*)\big) & i \in \mathcal{I}_k, \end{cases}    (17)

and the penalty is increased monotonically according to the schedule,

\mu_{k_i}^+ = \phi \mu_{k_i},    (18)

where φ > 1 is a scaling factor. The solve-update cycle is then repeated until the desired optimality and constraint tolerances are achieved. DDP variants have previously been used to solve the inner minimization of the AL method with good results [9], [11].

Algorithm 1 AL-iLQR
1: function AL-ILQR(x_0, U, tol.)
2:   Initialize λ, µ, φ
3:   while max(c) > tol. do
4:     minimize L_A(X, U, λ, µ) using iLQR
5:     Update λ using (17), update µ using (18)
6:   end while
7:   return X, U, λ
8: end function
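To make the update rules above concrete, the following is a minimal Julia sketch (not code from the ALTRO implementation) that applies the dual update (17) and the penalty schedule (18) to a toy scalar problem, minimize x² subject to 1 − x ≤ 0. The inner iLQR solve of Algorithm 1 is replaced by the closed-form minimizer of L_A for this one-dimensional case; the function names, tolerances, and penalty scaling are illustrative assumptions only.

```julia
# Toy illustration of the augmented Lagrangian updates (17)-(18) on
#   minimize x^2  subject to  c(x) = 1 - x <= 0,
# whose solution is x* = 1 with multiplier 2. The dual update, penalty
# schedule, and stopping test mirror the outer loop of Algorithm 1.
c(x) = 1 - x                          # inequality constraint, c(x) <= 0

function al_solve(; ϕ = 10.0, μ0 = 1.0, tol = 1e-8, max_outer = 50)
    λ, μ, x = 0.0, μ0, 0.0
    for iter in 1:max_outer
        # Inner minimization of L_A(x) = x^2 + (λ + μ/2 * c(x)) * c(x).
        # The penalty is active whenever c(x) >= 0 or λ > 0, cf. (3).
        active = c(x) >= 0 || λ > 0
        x = active ? (λ + μ) / (2 + μ) : 0.0
        if max(c(x), 0.0) < tol
            return x, λ, iter         # constraint satisfied
        end
        λ = max(0.0, λ + μ * c(x))    # dual update (17), inequality case
        μ *= ϕ                        # monotone penalty schedule (18)
    end
    return x, λ, max_outer
end

x, λ, iters = al_solve()
println("x ≈ $x (true 1.0), λ ≈ $λ (true 2.0), outer iterations = $iters")
```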
B. Matrix Square Roots

The QR decomposition F = QR returns an upper-triangular matrix R and orthogonal matrix Q. The Cholesky decomposition G = U^T U of a positive-definite matrix G returns an upper-triangular "square-root" matrix U, which we denote U = √G. For A, B ∈ R^{n×n}, we use these factorizations to calculate √(A + B) from √A and √B. Note that A + B = [√A^T √B^T](√A, √B) = F^T F, where F ∈ R^{2n×n}. Using the QR decomposition F = QR, A + B = R^T Q^T Q R = R^T R, which gives us a Cholesky factor R = √(A + B). We define √(A + B) ← QR((√A, √B)). Similarly, √(A − B) can be computed by performing successive rank-one downdates on √A using the rows of √B. We define √(A − B) ← DD(√A, √B).

III. ALTRO

ALTRO (Algorithm 4) comprises two stages: The first stage solves (1) rapidly to a coarse tolerance using a version of AL-iLQR with several novel refinements. The second stage uses the coarse solution from the first stage to warm start an active-set projection method that achieves high-precision constraint satisfaction. We now present our extensions to AL-iLQR, as well as the details of the projection method.

A. Square-Root Backward Pass

For AL methods to achieve fast convergence, the penalty terms must be increased to large values, which can result in severe numerical ill-conditioning [14]. To help mitigate this issue and make ALTRO more numerically robust, especially on embedded processors with limited numerical precision, we introduce a square-root backward pass inspired by the square-root Kalman filter [15].

We now derive recursive expressions for the following upper-triangular Cholesky factors: S ≡ √P, Z_xx ≡ √Q_xx, and Z_uu ≡ √Q_uu, using the methods from Section II-B. The square root of the terminal cost-to-go Hessian (6) is,

S_N \leftarrow \mathrm{QR}\big(\big(\sqrt{(\ell_N)_{xx}},\ \sqrt{I_{\mu_N}}\,(c_N)_x\big)\big),    (19)

and the action-value expansion factorizations, (8) and (9), are,

Z_{xx} \leftarrow \mathrm{QR}\big(\big(\sqrt{\ell_{xx}},\ S' A,\ \sqrt{I_\mu}\,c_x\big)\big)    (20)
Z_{uu} \leftarrow \mathrm{QR}\big(\big(\sqrt{\ell_{uu}},\ S' B,\ \sqrt{I_\mu}\,c_u,\ \sqrt{\rho}\, I\big)\big).    (21)

The gains K and d from (13) are expressed in square-root form as,

K = -Z_{uu}^{-1} Z_{uu}^{-T} Q_{ux}    (22)
d = -Z_{uu}^{-1} Z_{uu}^{-T} Q_u,    (23)

and the gradient (15) and expected change (16) of the cost-to-go are,

p = Q_x + (Z_{uu} K)^T (Z_{uu} d) + K^T Q_u + Q_{xu} d    (24)
\Delta V = d^T Q_u + \tfrac{1}{2} (Z_{uu} d)^T (Z_{uu} d).    (25)

The square root of the cost-to-go Hessian (14), which frequently exhibits the worst numerical conditioning, is derived by assuming the following upper-triangular Cholesky factorization,

P = \begin{bmatrix} I \\ K \end{bmatrix}^T \begin{bmatrix} Z_{xx}^T & 0 \\ C^T & D^T \end{bmatrix} \begin{bmatrix} Z_{xx} & C \\ 0 & D \end{bmatrix} \begin{bmatrix} I \\ K \end{bmatrix} = \begin{bmatrix} Z_{xx} + CK \\ DK \end{bmatrix}^T \begin{bmatrix} Z_{xx} + CK \\ DK \end{bmatrix},    (26)

where,

C = Z_{xx}^{-T} Q_{xu}    (27)
D = \sqrt{Z_{uu}^T Z_{uu} - Q_{ux} Z_{xx}^{-1} Z_{xx}^{-T} Q_{xu}}.    (28)

S can then be computed with a QR decomposition:

S \leftarrow \mathrm{QR}\left(\begin{pmatrix} Z_{xx} + Z_{xx}^{-T} Q_{xu} K \\ \mathrm{DD}(Z_{uu},\, Z_{xx}^{-T} Q_{xu})\, K \end{pmatrix}\right).    (29)
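As a sanity check on the square-root machinery, the short Julia snippet below verifies the QR-based update from Section II-B that underlies (19)-(21): the R factor of the QR decomposition of two stacked upper-triangular Cholesky factors is itself a Cholesky factor of the sum. The random matrices are stand-ins, not quantities from the solver.

```julia
using LinearAlgebra

# Check the QR-based square-root update: if √A and √B are upper-triangular
# Cholesky factors, the R factor of qr([√A; √B]) satisfies R'R = A + B.
n = 5
A = let M = randn(n, n); Symmetric(M' * M + n * I) end   # random positive-definite stand-ins
B = let M = randn(n, n); Symmetric(M' * M + n * I) end

sqrtA = cholesky(A).U            # upper-triangular factor, A = sqrtA' * sqrtA
sqrtB = cholesky(B).U

F = [sqrtA; sqrtB]               # stacked factors: F' * F = A + B
R = qr(F).R                      # "√(A+B) ← QR((√A, √B))" from Section II-B

@show norm(R' * R - (A + B))     # ≈ 0 up to round-off
```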
B. Infeasible State Trajectory Initialization

Desired state trajectories can often be identified (e.g., from sampling-based planners or expert knowledge), whereas finding a control trajectory that will produce this result is usually challenging. Dynamically infeasible state trajectory initialization is enabled by introducing additional inputs to the dynamics with slack controls s ∈ R^n,

x_{k+1} = f(x_k, u_k) + s_k,    (30)

to make the system artificially fully actuated.

Given initial state and control trajectories, X̃ and U, the initial infeasible controls s_{0:N−1} are computed as the difference between the desired state trajectory and the dynamics at each time step:

s_k = \tilde{x}_{k+1} - f(\tilde{x}_k, u_k).    (31)

Problem (1) is modified by replacing the dynamics with (30). An additional cost term,

\sum_{k=0}^{N-1} \tfrac{1}{2} s_k^T R_s s_k,    (32)

and constraints s_k = 0, k = 0, ..., N−1, are also added to the problem. Since s_k = 0 at convergence, a dynamically feasible solution to (1) is still obtained.

C. Time-Penalized Problems

Minimum-time, or more general time-penalized, problems can be solved by considering τ_k = √(Δt_k) ∈ R as an input at each time step. To ensure the solver does not exploit discretization errors in the system dynamics, equality constraints must be added between time step durations. The optimization problem (1) is modified to use the dynamics,

\begin{bmatrix} x_{k+1} \\ \delta_{k+1} \end{bmatrix} = \begin{bmatrix} f(x_k, u_k, \tau_k) \\ \tau_k \end{bmatrix},    (33)

with an additional cost term,

\sum_{k=0}^{N-1} R_\tau \tau_k^2,    (34)

subject to δ_k = τ_k, k = 1, ..., N−1, constraints, and upper and lower bounds on τ_k.

D. Active-Set Projection Method

The coarse primal and dual trajectories, Y ← (X, U, λ), returned from the AL-iLQR stage of ALTRO are used to warm start an active-set projection method (Algorithm 2). This hybrid approach avoids the numerical ill-conditioning and slow tail convergence exhibited by AL methods when the penalty weights are made large.

To ensure strict satisfaction of the dynamics and constraints, the current solution is projected onto the manifold associated with the active constraints d. The norm used to compute distances from this manifold is weighted by the Hessian of the objective H. Algorithm 2 takes successive Newton steps δY, only updating the constraint Jacobian D when the convergence rate r drops below a threshold, allowing re-use of the same matrix factorization S for inexpensive linear system solutions. Further, this algorithm can be implemented in a sequential manner [16] that does not require building large matrices, making it amenable to embedded systems.

Algorithm 2 Projection
1: function PROJECTION(Y, tol.)
2:   H^{-1} ← invert Hessian of objective
3:   while ||d||_∞ > tol. do
4:     d, D ← linearize active constraints
5:     S ← √(D H^{-1} D^T)
6:     v ← ||d||_∞
7:     r ← ∞
8:     while v > tol. and r > conv. rate tol. do
9:       Y, v^+ ← LINESEARCH(Y, S, H^{-1}, D, d, v)
10:      r ← log v^+ / log v
11:    end while
12:  end while
13:  return Y
14: end function

Algorithm 3 Projection Line Search
1: function LINESEARCH(Y, S, H^{-1}, D, d, v_0)
2:   Initialize α, γ
3:   while v > v_0 do
4:     δY_p ← H^{-1} D^T (S^{-1} S^{-T} d)
5:     Ȳ_p ← Y + α δY_p
6:     d ← UPDATECONSTRAINTS(Ȳ_p)
7:     v ← ||d||_∞
8:     α ← γα
9:   end while
10:  return Ȳ_p, v
11: end function

Algorithm 4 ALTRO
1: procedure ALTRO
2:   Initialize x_0, U, tolerances, X̃
3:   if Infeasible Start then
4:     X ← X̃, s_{0:N−1} ← from (31)
5:   else
6:     X ← Simulate from x_0 using U
7:   end if
8:   X, U, λ ← AL-ILQR(X, U, tol.)
9:   (X, U, λ) ← PROJECTION((X, U, λ), tol.)
10:  return X, U
11: end procedure
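The core computation in Algorithms 2 and 3 is the weighted Newton step built from the active-constraint Jacobian D, the violation d, and the objective Hessian H. The Julia sketch below, with random stand-in data and up to the sign convention used in Algorithm 3, shows how the Cholesky factor S of D H^{-1} D^T is formed once and reused through cheap triangular solves to produce a correction that removes the linearized constraint violation.

```julia
using LinearAlgebra

# Weighted Newton step at the heart of Algorithms 2-3: find the minimum
# H-weighted-norm correction δY that removes the linearized violation of
# the active constraints, D*δY = -d. H, D, and d are random stand-ins.
n, m = 12, 4                                  # primal dimension, number of active constraints
H = let M = randn(n, n); Symmetric(M' * M + n * I) end   # objective Hessian (positive definite)
D = randn(m, n)                               # active-constraint Jacobian
d = randn(m)                                  # active-constraint violation

Hinv = inv(H)                                 # fine for a sketch; factorize H in practice
S = cholesky(Symmetric(D * Hinv * D')).U      # S'S = D H⁻¹ D', as in Algorithm 2, line 5
δY = -Hinv * D' * (S \ (S' \ d))              # δY = -H⁻¹ D' (D H⁻¹ D')⁻¹ d, cf. Algorithm 3, line 4

@show norm(D * δY + d)                        # linearized violation is eliminated (≈ 0)
```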

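For completeness, here is a small Julia sketch of the infeasible-start initialization used in Algorithm 4, line 4: the slack controls of (30) are computed from (31) so that the augmented dynamics reproduce the desired state trajectory exactly on the first rollout. The discrete pendulum model, the straight-line state guess, and the step size are illustrative assumptions, not part of ALTRO.

```julia
using LinearAlgebra

# Infeasible-start initialization, eq. (31): choose slack controls so that
# x_{k+1} = f(x_k, u_k) + s_k reproduces the desired states on the first rollout.
f(x, u; h = 0.05) = x + h * [x[2], u - sin(x[1])]        # toy discrete pendulum dynamics

N    = 51
Xdes = [[π * (k - 1) / (N - 1), 0.0] for k in 1:N]       # desired (dynamically infeasible) states
U    = zeros(N - 1)                                      # nominal controls

slacks = [Xdes[k+1] - f(Xdes[k], U[k]) for k in 1:N-1]   # s_k = x̃_{k+1} - f(x̃_k, u_k)

# Rolling out the augmented dynamics from Xdes[1] recovers Xdes; the cost term
# (32) and the constraints s_k = 0 then drive the slacks to zero during the solve.
X = accumulate((x, k) -> f(x, U[k]) + slacks[k], 1:N-1; init = Xdes[1])
@show maximum(norm.(X .- Xdes[2:end]))                   # ≈ 0
```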
IV. SIMULATION RESULTS

ALTRO's timing and constraint-satisfaction performance is compared to DIRCOL [4] on a number of benchmark motion-planning problems. Each problem uses a quadratic objective, has initial and terminal state constraints, is solved to constraint satisfaction c_max = 1e-8 (c_max = 1e-6 for time-penalized problems), and is performed on a desktop computer with an AMD Ryzen Threadripper 2950x processor and 16GB RAM. ALTRO is implemented purely in the Julia programming language, while DIRCOL uses the Ipopt and SNOPT NLP solvers through a Julia interface. DIRCOL is provided the dynamically feasible state trajectory computed during the initial forward simulation from ALTRO. Further details regarding the problems and parameters used, along with full source code, are available on GitHub.

A. Problem Descriptions

1) 1D Block Move: Double integrator with two states and one input. The system is tasked to move one unit length subject to control limits.

2) Pendulum: The system must swing from the downward equilibrium to the upward equilibrium point subject to control limits.

3) Acrobot: Double pendulum system limited to actuation at the "elbow" joint. Tasked with swinging upright from the downward position.

4) Cartpole: The system must perform a swing-up maneuver while obeying control limits.

5) Reeds-Shepp Car: A differential-drive system with three states and two controls that can move forward or backward [17] performs a few different tasks. Parallel Park: Move one unit in the direction perpendicular to the direction of motion (i.e., "sideways") and end in the same orientation while obeying state and control constraints; also performed with time penalization. Obstacles: Move to the goal while avoiding 3 circular obstacles (Figure 2). Escape: Shown in Figure 3, the car must move from one "room" to another while obeying control limits. For this task, solvers are initialized with a state trajectory interpolated from 6 2D positions.

6) Robot Arm: A serial manipulator with 14 states and 7 inputs, modeled after the Kuka iiwa, is tasked with moving from an initial to final configuration through an environment with obstacles and subject to torque limits (Figure 4). The control trajectory is initialized to compensate for gravity.

7) Quadrotor: The system with 13 states (quaternion angular representation) and 4 inputs is tasked with navigating through a maze with floor and ceiling constraints (Figure 5). Inputs are bounded and initialized to perform hovering. For the maze task, solvers are initialized with a state trajectory interpolated from 7 3D positions.


Fig. 1. Maximum constraint violation comparison for the cartpole (maximum constraint violation vs. time (s) for the initial guess, SNOPT, Ipopt, AL-iLQR, ALTRO, and the projected Newton step). After solving to a coarse tolerance, ALTRO performs a single iteration of Algorithm 2 and converges to the specified constraint tolerance. AL-iLQR fails to converge to the specified tolerance without the projection step.

Fig. 3. 2D position trajectories for the car escape problem. ALTRO, Ipopt, and SNOPT converge to the same solution, but AL-iLQR fails to find a collision-free trajectory. Yellow and red markers indicate start and goal positions, respectively.


Fig. 2. Trajectories for the Reeds-Shepp car with obstacles. ALTRO and Ipopt converge to similar trajectories, while SNOPT finds a different solution that exploits discretization error. Yellow and red markers are start and goal positions, respectively.

Fig. 4. Front (left) and side (right) views of the Kuka iiwa arm moving its end effector from an initial position (red) to a desired position (green) while avoiding obstacles.

Timing results for the benchmark problems are included in Table I. ALTRO is 2-5 times faster than both SNOPT and Ipopt for most problems, except for time-penalized problems, where DIRCOL performs considerably better than ALTRO. Constraint satisfaction versus solver time is shown for the cartpole problem in Figure 1. Infeasible state trajectory initialization is demonstrated on the Car Escape and Quadrotor Maze problems, where AL-iLQR fails to find collision-free trajectories. DIRCOL was unable to find feasible solutions for the Robot Arm and Quadrotor Maze obstacle avoidance problems.

V. DISCUSSION

ALTRO performs competitively in terms of both computation time and constraint satisfaction when compared to DIRCOL on a variety of benchmark problems. ALTRO's rapid convergence on constraint satisfaction demonstrated in Figure 1 is typical of all the benchmark problems. By using an AL method to make rapid initial progress, then switching to an active-set method once the active inequality constraints are known, fast convergence can be achieved throughout the entire solve process.

Fig. 5. Quadrotor navigating maze environment.

TABLE I
RUNTIME PERFORMANCE: ALTRO VS DIRCOL

System                   ALTRO     Ipopt     SNOPT
Block Move               12 ms     26 ms     31 ms
Pendulum                 65 ms     213 ms    646 ms
Pendulum Time Pen.       848 ms    134 ms    225 ms
Acrobot                  1.2 s     11.9 s    6.1 s
Cartpole                 661 ms    1.1 s     3.8 s
Parallel Park            63 ms     138 ms    385 ms
Parallel Park Time Pen.  5.0 s     175 ms    119 ms
Car w/ 3 Obs.            136 ms    935 ms    2.4 s
Car Escape               1.2 s     9.2 s     37.2 s
Kuka w/ Obs.             9.2 s     -         -
Quadrotor Line           2.3 s     4.9 s     7.9 s
Quadrotor Maze           33.0 s    -         -

For time-penalized problems, DIRCOL finds lower total trajectory times and solves faster and more reliably compared to ALTRO. This may be due to inherent sensitivity to initial time steps in shooting methods like iLQR.

One of ALTRO's most significant advantages over AL-iLQR and other DDP methods is the ability to be initialized with infeasible state trajectories. Even though the dimension of the input vector is increased, ALTRO is able to solve these problems reliably and faster than DIRCOL.

DIRCOL was unable to find feasible solutions for two of the obstacle avoidance problems.
This is likely due to the way constraints were represented in simulation, as well as the differences in the way constraints are handled by the penalty, interior-point, and active-set methods used by ALTRO, Ipopt, and SNOPT, respectively.

Future improvements to ALTRO include parallelization of constraint and Jacobian evaluations, cost expansions, and dynamics simulations, while more significant parallelization may be achieved using Alternating Direction Method of Multipliers techniques, a natural extension to the AL method used in ALTRO. The sequential version of Algorithm 2 will make ALTRO more amenable to implementation on embedded hardware. Future work will investigate using ALTRO to solve bi-level optimization problems, including contact-implicit trajectory optimization and Stackelberg competitions.

In conclusion, we have presented a new solver for constrained trajectory optimization problems, ALTRO, that combines the advantages of direct and DDP methods. Compared to a standard tool for robot motion planning, DIRCOL, ALTRO exhibits comparable capabilities and typically faster solve times on a variety of benchmark problems. Our implementation of ALTRO is available at https://github.com/RoboticExplorationLab/TrajectoryOptimization.jl.

ACKNOWLEDGEMENTS

This work was supported by a NASA Early Career Faculty Award (Grant Number 80NSSC18K1503). This material is based upon work supported by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-1656518. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

REFERENCES

[1] P. E. Gill, W. Murray, and M. A. Saunders, "SNOPT: An SQP Algorithm for Large-scale Constrained Optimization," SIAM Review, vol. 47, no. 1, pp. 99-131, 2005.
[2] A. Wächter and L. T. Biegler, "On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming," Mathematical Programming, vol. 106, no. 1, pp. 25-57, Mar. 2006.
[3] D. Pardo, L. Möller, M. Neunert, A. W. Winkler, and J. Buchli, "Evaluating direct transcription and nonlinear optimization methods for robot motion planning," pp. 1-9, Apr. 2015.
[4] C. R. Hargraves and S. W. Paris, "Direct Trajectory Optimization Using Nonlinear Programming and Collocation," J. Guidance, vol. 10, no. 4, pp. 338-342, 1987.
[5] D. Q. Mayne, "A second-order gradient method of optimizing non-linear discrete time systems," Int. J. Control, vol. 3, pp. 85-95, 1966.
[6] W. Li and E. Todorov, "Iterative Linear Quadratic Regulator Design for Nonlinear Biological Movement Systems," in Proceedings of the 1st International Conference on Informatics in Control, Automation and Robotics, Setubal, Portugal, 2004.
[7] Y. Tassa, T. Erez, and E. Todorov, "Control-Limited Differential Dynamic Programming," in Proceedings of the International Conference on Robotics and Automation (ICRA), May 2014.
[8] Z. Xie, C. K. Liu, and K. Hauser, "Differential dynamic programming with nonlinear constraints," in 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, May 2017, pp. 695-702.
[9] T. C. Lin and J. S. Arora, "Differential dynamic programming technique for constrained optimal control," Computational Mechanics, vol. 9, no. 1, pp. 27-40, Jan. 1991.
[10] M. Giftthaler and J. Buchli, "A Projection Approach to Equality Constrained Iterative Linear Quadratic Optimal Control," in 2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids), pp. 61-66, Nov. 2017.
[11] B. Plancher, Z. Manchester, and S. Kuindersma, "Constrained Unscented Dynamic Programming," in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, 2017.
[12] G. Lantoine and R. P. Russell, "A Hybrid Differential Dynamic Programming Algorithm for Constrained Optimal Control Problems. Part 1: Theory," Journal of Optimization Theory and Applications, vol. 154, no. 2, pp. 382-417, Aug. 2012.
[13] F. Farshidian, M. Neunert, A. W. Winkler, G. Rey, and J. Buchli, "An efficient optimal planning and control framework for quadrupedal locomotion," in 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, May 2017, pp. 93-100.
[14] D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods. Athena Scientific, 1996.
[15] J. Bellantoni and K. Dodge, "A square root formulation of the Kalman-Schmidt filter," AIAA Journal, vol. 5, no. 7, pp. 1309-1314, 1967.
[16] C. V. Rao, S. J. Wright, and J. B. Rawlings, "Application of Interior-Point Methods to Model Predictive Control," Journal of Optimization Theory and Applications, vol. 99, no. 3, pp. 723-757, Dec. 1998.
[17] J. Reeds and L. Shepp, "Optimal paths for a car that goes both forwards and backwards," Pacific Journal of Mathematics, vol. 145, no. 2, pp. 367-393, Oct. 1990.