<<

Preprint

Convex Optimization for Trajectory Generation A Tutorial On Generating Dynamically Feasible Trajectories Reliably And Efficiently

Danylo Malyuta a,? , Taylor P. Reynolds a, Michael Szmuk a, Thomas Lew b, Riccardo Bonalli b, Marco Pavone b, and Behc¸et Ac¸ıkmes¸e a

a William E. Boeing Department of Aeronautics and Astronautics, University of Washington, Seattle, WA 98195, USA b Department of Aeronautics and Astronautics, , Stanford, CA 94305, USA

Reliable and efficient trajectory generation methods are a fundamental need for autonomous dynamical systems of tomorrow. The goal of this article is to provide a comprehensive tutorial of three major con- vex optimization-based trajectory generation methods: lossless convexification (LCvx), and two sequential convex programming algorithms known as SCvx and GuSTO. In this article, trajectory generation is the computation of a dynamically feasible state and control signal that satisfies a set of constraints while opti- mizing key mission objectives. The trajectory generation problem is almost always nonconvex, which typ- ically means that it is not readily amenable to efficient and reliable solution onboard an autonomous ve- hicle. The three algorithms that we discuss use problem reformulation and a systematic algorithmic strat- egy to nonetheless solve nonconvex trajectory generation tasks through the use of a convex optimizer. The theoretical guarantees and computational speed offered by have made the algo- rithms popular in both research and industry circles. To date, the list of applications include rocket land- ing, hypersonic reentry, spacecraft rendezvous and docking, aerial motion planning for fixed- wing and quadrotor vehicles, robot motion planning, and more. Among these applications are high-profile rocket flights conducted by organizations like NASA, , SpaceX, and . This article aims to give the reader the tools and understanding necessary to work with each algorithm, and to know what each method can and cannot do. A publicly available source code repository supports the numerical examples provided at the end of this article. By the end of the article, the reader should be ready to use each method, to extend them, and to contribute to their many exciting modern applications.

utonomous vehicles and robots promise many ual vehicles must be endowed with their own high qual- exciting new applications that will transform ity decision making capability. Failure to generate a safe our society. For example, autonomous aerial trajectory can result in losing the vehicle, the payload, vehicles (AAVs) operating in urban envi- and even human life. Reliable methods for trajectory ronments could deliver commercial goods, generation are a fundamental need if we are to maintain Aemergency medical supplies, monitor traffic, and provide public trust and the high safety standard that we have threat alerts for national security [1]. At the same come to expect from autonomous or automatic systems time, these applications present significant engineering [5]. challenges for performance, trustworthiness, and safety. Computational resource requirements are the second For instance, AAVs can be a catastrophic safety hazard major consideration for onboard trajectory generation. arXiv:2106.09125v1 [math.OC] 16 Jun 2021 should they lose control or situational awareness over a Historically, this has been the driving factor for much populated area. Space missions, self-driving cars, and of practical algorithm development [6,7]. Although applications of autonomous underwater vehicles (AUVs) the modern consumer desktop is an incredibly powerful share similar concerns. machine, industrial central processing units (CPUs) can Generating a trajectory autonomously onboard the ve- still be relatively modest [8]. This is especially true hicle is not only desirable for many of these applications for spaceflight, where the harsh radiation environment but also indeed a necessity when considering the deploy- of outer space prevents the rapid adoption of new ment of autonomous systems either in remote areas with computing technologies. For example, NASA’s flagship little to no communication, or at scale in a dynamic and Mars rover “” landed in 2021 using a uncertain world. For example, it is not possible to re- BAE RAD750 PowerPC flight computer, which is a motely control a spacecraft during a Mars landing sce- 20 year-old technology [9, 10]. When we factor in the nario [2,3,4], nor is it practical to coordinate the motion energy requirements of powerful CPUs and the fact that of tens of thousands of delivery drones from a centralized trajectory generation is a small part of all the tasks location [1]. In these and many other scenarios, individ- that an autonomous vehicle must perform, it becomes ? Corresponding author. clear that modern trajectory generation is still confined E-mail address: [email protected] to a small computational footprint. Consequently, real- 1 Preprint time onboard trajectory generation algorithms must be The cost (1a) encodes the mission goal, the system computationally efficient. dynamics are modeled by the differential equation con- We define trajectory generation to be the computation straint (1b), the state and control specifications are en- of a multi-dimensional temporal state and control signal forced through (1c), and the boundary conditions are that satisfies a set of specifications, while optimizing key fixed by (1d). Note that Problem1 makes a distinction mission objectives. This article is concerned exclusively between a control vector u(t), which is a temporal signal, with dynamically feasible trajectories, which are those and a so-called parameter vector p, which is a static vari- that respect the equations of motion of the vehicle under able that encodes other decision variables like temporal consideration. Although it is commonplace to track dy- scaling. namically infeasible trajectories using feedback control, In most cases, the solution of such an infinite-dimen- at the end of the day a system can only evolve along sional optimization problem is neither available in closed dynamically feasible paths, whether those are computed form nor is it computationally tractable to numerically upfront during trajectory generation or are the result compute (especially in real-time). Instead, different so- of feedback tracking. Performing dynamically feasible lution methods have been proposed that typically share trajectory generation carries two important advantages. the following three main components: First, it provides a method to systematically satisfy con- straints (i.e., specifications) that are hard, if not impos- Formulation: specification of how the functions J and f, and the sets , 0, and f are expressed mathe- sible, to meet through feedback control. This includes, C X X for example, translation-attitude coupled sensor pointing matically; constraints. Secondly, dynamically feasible trajectories Discretization: approximation of the infinite-dimen- leave much less tracking error for feedback controllers to sional state and control signal by a finite-dimensional “clean up”, which usually means that tracking perfor- set of basis functions; mance can be vastly improved. These two advantages shall become more apparent throughout the article. Numerical optimization: iterative computation of an Numerical optimization provides a systematic mathe- optimal solution of the discretized problem. matical framework to specify mission objectives as costs or rewards to be optimized, and to enforce state and Choosing the most suitable combination of these three control specifications as well as the equations of motion components is truly a mathematical art form that is via constraints. As a result, we can express trajectory highly problem dependent, and not an established plug- generation problems as problems, which and-play process like least-squares regression [26]. No are infinite-dimensional optimization problems over func- single recipe or method is the best, and methods that tion spaces [11, 12]. Since the early 1960s, optimal con- work well for some problems can fare much worse for trol theory has proven to be extremely powerful [6, 13]. others. Trajectory algorithm design is replete with prob- Early developments were driven by aerospace applica- lem dependent tradeoffs in performance, optimality, and tions, where every gram of mass matters and trajectories robustness, among others. Still, optimization literature are typically sought to minimize fuel or some other mass- (to which this article belongs) attempts to provide some reducing metric (such as aerodynamic load and thereby formal guidance and intuition about the process. For structural mass). This led to work on trajectory algo- the rest of this article, we will be interested in meth- rithms for climbing aircraft, ascending and landing rock- ods that solve the exact form of Problem1, subject ets, and spacecraft orbit transfer, to name just a few only to the initial approximation made by discretizing [14, 15, 16, 17, 18, 19]. Following these early applica- the continuous-time dynamics. tions, trajectory optimization problems have now been Many excellent references discuss the discretization formulated in many practical areas, including aerial, un- component [13, 27, 28]. Once the problem is discretized, derwater, and space vehicles, as well as for chemical pro- we can employ numerical optimization methods to obtain cesses [20, 21, 22], building climate control [23, 24, 25], a solution. This is where the real technical challenges robotics, and medicine [13], to mention a few. for reliable trajectory generation arise. Depending on At its core, optimization-based trajectory generation the problem formulation and the discretization method, requires solving an optimal control problem of the fol- the finite-dimensional optimization problem that must lowing form (at this point, we keep the problem very be solved can end up being a nonlinear (i.e., noncon- general): vex) optimization problem (NLP). However, NLP opti- mization has high computational complexity, and there

min J(x, u, p, tf ) (1a) are no general guarantees of either obtaining a solution or u,p,tf even certifying that a solution does not exist [26, 29, 30]. s.t.x ˙(t) = f(x, u, p, t), (1b) Hence, general NLP methods may not be appropriate for control applications, since we need deterministic guar- x(t), u(t), p, tf (t), t [0, tf ], (1c) ∈ C ∀ ∈ antees for reliable trajectory generation in autonomous x(0), p 0, x(tf ), p f . (1d) ∈ X  ∈ X control.   2 Preprint

In contrast, if the discretized problem is convex, then principle [11] can be used to show that solving the new it can be solved reliably and with an efficiency that ex- problem recovers a globally optimal solution of the origi- ceeds all other areas of numerical optimization except nal problem. This gives the approach the name lossless for and least squares [26, 29, 31, 32, convexification (LCvx), and the resulting problem can 33, 34]. This is the key motivation behind this article’s often be solved with a single call to a convex solver. As focus on a convex optimization-based problem formula- one can imagine, however, LCvx tends to apply only to tions for trajectory generation. By building methods on very specific problems. Fortunately, this includes some top of convex optimization, we are able to leverage fast it- important and practically useful forms of rocket landing erative algorithms with polynomial-time complexity [35, and other trajectory generation problems for spacecraft Theorem 4.7]. In plain language, given any desired so- and AAV vehicles [6, 45, 46]. When LCvx cannot be lution accuracy, a convex optimization problem can be used, convex optimization can be applied via sequen- solved to within this accuracy in a predetermined number tial convex programming (SCP). This natural ex- of arithmetic operations that is a polynomial function of tension linearizes all nonconvex elements of Problem1, the problem size. Hence, there is a deterministic bound and solves the convex problem in a local neighborhood on how much computation is needed to solve a given con- where the linearization is accurate. Roughly speaking, vex optimization problem, and the number of iterations the problem is then re-linearized at the new solution, cannot grow indefinitely like for NLP. These special prop- and the whole process is repeated until a stopping crite- erties, together with a mature theoretical understand- rion is met. In the overall classification of optimization ing and an ever-growing list of successful real-world use- algorithms, SCP is a trust region method [29, 47, 48]. cases, leave little doubt that convex optimization-based While SCP is a whole class of algorithms, our primary solution methods are uniquely well-suited to be used in focus is on two particular and closely related methods the solution of optimal control problems. called SCvx (also known as successive convexification) Among the successful real-world use-cases of convex and GuSTO [49, 50]. optimization-based trajectory generation are several Let us go through the numerous applications where the inspiring examples from the aerospace domain. These LCvx, SCvx and GuSTO methods have been used. The include autonomous drones, spacecraft rendezvous and LCvx method was originally developed for rocket landing docking, and most notably planetary landing. The [45, 51]. This was the method at the center of the afore- latter came into high profile as an enabling technology mentioned NASA JPL multi-year flight test campaign for for reusability of the SpaceX Falcon 9 and Heavy rockets Mars pinpoint landing [37]. We hypothesize that LCvx [36]. Even earlier, the NASA Jet Propulsion Laboratory is also a close relative of the SpaceX Falcon 9 termi- (JPL) demonstrated the use of a similar method for nal landing algorithm [36]. The method also appears in Mars pinpoint landing aboard the Masten Xombie applications for fixed-wing and quadrotor AAV trajec- sounding rocket [37, 38, 39]. Today, these methods are tory generation [46, 52], spacecraft hypersonic reentry being studied and adopted for several Mars, Moon, and [53, 54, 55], and spacecraft rendezvous and docking [56, Earth landing applications [8, 36]. Although each appli- 57]. The SCvx method, which applies to far more gen- cation has its own set of unique challenges, they all share eral problems albeit with fewer runtime guarantees, has the need to use the full spacecraft motion envelope with been used extensively for the general rocket landing prob- limited sensing, actuation, and fuel/power [8]. These lem [58, 59, 60, 61, 62, 63], quadrotor trajectory genera- considerations are not unique to space applications, and tion [52, 64, 65], spacecraft rendezvous and docking [66], can be found in almost all autonomous vehicles such as and cubesat attitude control [67]. Recently, as part of cars [40, 41], walking robots [42, 43], and quadrotors the NASA SPLICE project to develop a next-generation [44]. planetary landing computer [8], the SCvx algorithm is Having motivated the use of convex optimization, we being tested as an experimental payload aboard the Blue note that many trajectory generation problems have Origin rocket [68]. The GuSTO method common sources of nonconvexity, among which are: has been applied to free-flyer robots such as those used nonconvex control constraints, nonconvex or coupled aboard the international space station (ISS) [50, 69, 70, state-control constraints, and nonlinear dynamics. 71], car motion planning [72, 73], aircraft motion plan- The goal of a convex optimization-based trajectory ning [50], and robot manipulator arms [69]. generation algorithm is to provide a systematic way This article provides a first-ever comprehensive tuto- of handling these nonconvexities, and to generate a rial of the LCvx, SCvx and GuSTO algorithms. Plac- trajectory using a convex solver at its core. ing these related methods under the umbrella of a single Two methods stand out to achieve this goal. In special article allows to provide a unified description that high- cases, it is possible to reformulate the problem into a con- lights common ideas and helps the practitioner to know vex one through a variable substitution and a “lifting” how, where, and when to deploy each method. Previous (i.e., augmentation) of the control input into a higher- tutorials on LCvx and SCvx provide a complementary dimensional space. In this case, Pontryagin’s maximum technical discussion [74, 75]. 3 Preprint

problems in one shot. Part II discusses sequential convex programming (SCP) which can handle very general and highly nonconvex trajectory generation tasks by itera- tively solving a number of convex optimization problems. In particular, Part II provides a detailed tutorial on two modern SCP methods called SCvx and GuSTO [49, 50]. Lastly, Part III applies LCvx, SCvx, and GuSTO to three complex trajectory generation problems: a rocket- FIGURE 1 A typical control architecture consists of trajectory genera- powered planetary lander, a quadrotor, and a micrograv- tion and feedback control elements. This article discusses algorithms ity free-flying robotic assistant. Some important naming for trajectory generation, which traditionally provides reference and feedforward control signals. By repeatedly generating new trajecto- conventions and notation that we use throughout the ar- ries, a feedback action is created that can itself be used for control. ticle are defined in the “Abbreviations” and “Notation”. Repeated trajectory generation for feedback control underlies the the- To complement the tutorial style of this article, the ory of model predictive control. numerical examples in Part III are accompanied by open- source implementation code linked in Figure2. We use the Julia programming language because it is simple to There are two reasons for focusing on LCvx, SCvx, read like Python, yet it can be as fast as C/C++ [80]. and GuSTO specifically. First, the authors are the de- By downloading and running the code, the reader can velopers of the three algorithms. Hence, we feel best recreate the exact plots seen in this article. positioned to provide a thorough description for these particular methods, given our experience with their im- Convex Optimization Background plementation. Second, these three methods have a sig- Convex optimization seeks to minimize a convex objec- nificant history of real-world application. This should tive function while satisfying a set of convex constraints. provide confidence that the methods withstood the test The technique is expressive enough to capture many tra- of time, and have proven themselves to be useful when jectory generation and control applications, and is ap- the stakes were high. By the end of the article, our hope pealing due to the availability of solution algorithms with is to have provided the understanding and the tools nec- the following properties [26, 29]: essary in order to adapt each method to the reader’s particular engineering application. A globally optimal solution is found if a feasible solu- Although our discussion for SCP is restricted to SCvx tion exists; and GuSTO, both methods are closely related to other A certificate of infeasibility is provided when a feasible existing SCP algorithms. We hope that after reading solution does not exist; this tutorial, the reader will be well-positioned to under- stand most if not all other SCP methods for trajectory The runtime complexity is polynomial in the problem generation. Applications of these SCP alternatives are size; discussed in the recent survey paper [6]. Finally, we note that this article is focused on solv- The algorithms can self-initialize, eliminating the ing a trajectory optimization problem like Problem1 need for an expert initial guess. once in real-time. As illustrated in Figure1, this re- The above properties are fairly general and apply sults in a single optimal trajectory that can be robustly to most, if not all, trajectory generation and control tracked by a downstream control system. The ability to solve for the trajectory in real-time, however, can allow for updating the trajectory as the mission evolves and more information is revealed to the autonomous vehicle. Repetitive trajectory generation provides a feedback ac- tion that can itself be used for control purposes. This approach is the driving force behind model predictive control (MPC), which has been applied to many appli- cation domains over the last three decades [76, 77, 78]. This article does not cover MPC, and we refer the inter- ested reader to existing literature [78, 79]. github.com/dmalyuta/scp traj opt/tree/csm The rest of this article is organized as follows. We begin with a short section on convex optimization, with the primary objective of highlighting why it is so useful FIGURE 2 The complete implementation source code for the numeri- cal examples at the end of this article can be found in the csm branch for trajectory generation. The article is then split into of our open-source GitHub repository. The master branch provides three main parts. Part I surveys the major results of loss- even more algorithms and examples that are not covered in this arti- less convexification (LCvx) to solve nonconvex trajectory cle. 4 Preprint

Abbreviations Notation LCvx Lossless convexification M Matrices are capitalized letters LTV Linear time-varying z A scalar or vector variable SCP Sequential convex programming Sets are calligraphic capitalized let- S SQP Sequential ters LP Linear program (a, b, c) Concatenation of column or row vec- QP Quadratic program tors SOCP Second-order cone program ei Standard i-th standard basis vector n n In R × Identity matrix SDP Semidefinite program ∈ MILP Mixed-integer linear program v × Skew symmetric matrix representa- tion of a cross product NLP Nonlinear (nonconvex) program diag(a, B, ...) (Block) diagonal matrix from scalars IPM Interior point method and/or matrices MPC Model predictive control dom f Domain of a function ISS International Space Station x f The gradient vector or Jacobian matrix KKT Karush-Kuhn-Tucker ∇ of f with respect to x ODE Ordinary differential equation f [t] Shorthand for a function evaluated at ZOH Zeroth-order hold a particular time (e.g., f (t, x(t), u(t))) FOH First-order hold DoF Degree of freedom Programming Optimization

applications. This makes convex programming safer than other optimization methods for autonomous applications. To appreciate what makes an optimization problem convex, we introduce some basic definitions here and de- fer to [26, 81] for further details. Two fundamental ob- (a) A convex set contains all line segments connecting its points. jects must be considered: a convex function and a convex set. For reference, Figure3 illustrates a notional convex set and function. By definition, Rn is a convex set if and only if it contains the line segmentC ⊆ connecting any two of its points:

x, y [x, y]θ (2) ∈ C ⇒ ∈ C for all θ [0, 1], where [x, y]θ , θx + (1 θ)y. An im- portant property∈ is that convexity is preserved− under set intersection. This allows us to build complicated convex sets by intersecting simpler sets. By replacing the word “sets” with “constraints”, we can readily appreciate how (b) A convex function lies below all line segments connecting its points. this property plays into modeling trajectory generation problems using convex optimization. FIGURE 3 Illustration of a notional convex set (a) and convex function A function f : n is convex if and only if dom f is (b). In both cases, the variable θ [0, 1] generates a line segment R R ∈ n a convex set and f lies→ below the line segment connecting between two points. The epigraph epi f R R is the set of points which lie above the function, and itself defines⊆ × a convex set. any two of its points:

x, y dom f f([x, y]θ) [f(x), f(y)]θ (3) ∈ ⇒ ≤ gi(x) = 0, i = 1, ... , neq, (4c) for all θ [0, 1]. A convex optimization problem is sim- n n ∈ where f0 : R R is a convex cost function, fi : R ply the minimization of a convex function subject to a → n → R are convex inequality constraints, and gi : R R number of convex constraints that act to restrict the are affine equality constraints. The problem contains→ search space: nineq inequality and neq equality constraints. We stress that the equality constraints must be affine, which means min f0(x) (4a) x n ∈R that each function gi is a linear expression in x plus a s.t. fi(x) 0, i = 1, ... , nineq, (4b) constant offset. The equations of motion are equality ≤ 5 Preprint constraints, therefore basic convex optimization restricts method’s name. To date, the method has been extended the dynamics to be affine (i.e., linear time-varying at as far as relaxing certain classes of nonconvex control most). Handling nonlinear dynamics will be a major constraints, such as an input norm lower bound (see topic of discussion throughout this article. “Convex Relaxation of an Input Lower Bound”) and a Each constraint defines a convex set so that, together, nonconvex pointing constraint (see “Convex Relaxation (4b) and (4c) form a convex feasible set of values that of an Input Pointing Constraint”). the decision variable x may take. To explicitly connect The LCvx method has been shown to work for a large this discussion back to the generic convex set introduced class of state-constrained optimal control problems, in (2), we can write the feasible set as: however a working assumption is that state constraints are convex. Lossless relaxation of nonconvex state n = x R : fi(x) 0, i = 1, ... , nineq, C { ∈ ≤ constraints remains under active research, and some gi(x) = 0, i = 1, ... , neq . (5) related results are available [46] (which we will cover in } this section). A fundamental consequence of convexity is that any As the reader goes through Part I, it is suggested to local minimum of a convex function is a global minimum keep in mind that the main concerns of lossless convexi- [26, Section 4.2.2]. More generally, convex functions fication are: come with a plethora of properties that allow algorithm designers to obtain global information about function be- To find a convex lifting of the feasible input set; havior from local measurements. For example, a differ- To show that the optimal input of the lifted problem entiable convex function is globally lower bounded by its projects back to a feasible input of the original non- local first-order approximation [26]. Thus, we may look lifted problem. at convexity as a highly beneficial assumption on func- tion behavior that enables efficient algorithm design. In- Figure4 chronicles the development history of LCvx. deed, a landmark discovery of the twentieth century was The aim of this part of the article is to provide a tu- that it is convexity, not linearity, that separates “hard” torial overview of the key results, so theoretical proofs and “easy” problems [30]. are omitted in favor of a more practical and action-ori- For practitioners, the utility of convex optimization ented description. Ultimately, our aim is for the reader stems not so much from the ability to find the global to come away with a clear understanding of how LCvx minimum, but rather from the ability to find it (or in- can be applied to their own problems. deed any other feasible solution) quickly. The field of In the following sections, we begin by introducing numerical convex optimization was invigorated by the LCvx for the input norm lower bound and pointing interior-point method (IPM) family of optimization al- nonconvexities using a baseline problem with no state gorithms, first introduced in 1984 by Karmarkar [82]. constraints. Then, the method is extended to handle Today, convex optimization problems can be solved by affine and quadratic state constraints, followed by primal-dual IPMs in a few tens of iterations [26, 31]. general convex state constraints. We then describe how Roughly speaking, we can say that substantially large the LCvx method can also handle a class of dynamical trajectory generation problems can usually be solved in systems with nonlinear dynamics. For more general ap- under one second [9, 37, 63]. In technical parlance, IPMs plications, embedding LCvx into nonlinear optimization have a favorable polynomial problem complexity: the algorithms is also discussed. At the very end of this part number of iterations required to solve the problem to of the article, we cover some of the newest LCvx results a given tolerance grows polynomially in the number of from the past year, and provide a toy example that constraints nineq + neq. With some further assumptions, gives a first taste of how LCvx can be used in practice. it is even possible to provide an upper bound on the number of iterations [35, 83]. We defer to [29, Chap- No State Constraints ters 14 and 19] for futher details on convex optimization We begin by stating perhaps the simplest optimal con- algorithms. Throughout this article, our goal will be to trol problem for which an LCvx result is available. Its leverage existing convex problem solvers to create higher- salient features are a distinct absence of state constraints level frameworks for the solution of trajectory generation (except for the boundary conditions), and its only source problems. of nonconvexity is a lower-bound on the input given by (6c). A detailed description of LCvx for this problem PART I: LOSSLESS CONVEXIFICATION may be found in [45, 84, 86]. Lossless convexification (LCvx) is a modeling tech- nique that solves nonconvex optimal control problems tf through a convex relaxation. In this method, Pontrya- min m(tf , x(tf )) + ζ `(g1(u(t))) dt (6a) u,tf gin’s maximum principle [11] is used to show that a con- Z0 vex relaxation of a nonconvex problem finds the glob- s.t.x ˙(t) = A(t)x(t) + B(t)u(t) + E(t)w(t), (6b) ally optimal solution to the original problem, hence the ρmin g1(u(t)), g0(u(t)) ρmax, (6c) ≤ ≤ 6 Preprint

optimization algorithm, or solve a simpler problem that Input lower is convex. The mantra of LCvx is to take the latter bound [45, 84] 2005 approach by “relaxing” the problem until it is convex. Minimum landing In the case of Problem6, let us propose the following error [85] relaxation, which introduces a new variable σ( ) R 2007 Generalized input · ∈ nonconvexity [86] called the slack input. Active state constraints

at discrete times [86] tf 2010 Input pointing min m(tf , x(tf )) + ζ `(σ(t)) dt (7a) σ,u,tf 2011 constraint [87] Z0 2012 Fusion with s.t.x ˙(t) = A(t)x(t) + B(t)u(t) + E(t)w(t), (7b) mixed-integer ρmin σ(t), g0(u(t)) ρmax, (7c) 2013 programming [46] ≤ ≤ g1(u(t)) σ(t), (7d) 2014 Persistently active ≤ quadratic state x(0) = x0, b(tf , x(tf )) = 0. (7e) constraints [88] Persistently active The relaxation of the nonconvex input constraint (6c) linear state to the convex constraints (7c)-(7d) is illustrated in “Con- constraints [88, 89] vex Relaxation of an Input Lower Bound” for the case of 2019 Semi-continuous a throttleable rocket engine. Note that if (7d) is replaced input norms [90] 2020 with equality, then Problem7 is equivalent to Problem6. Disconnected sets [91] prove 2021 Indeed, the entire goal of LCvx is to that (7d) holds Fixed-final time [92] with equality at the globally optimal solution of Prob- Mixed constraints [93] lem7. Because of the clear importance of constraint (7d) to the LCvx method, we shall call it the LCvx equality constraint. This special constraint will be highlighted FIGURE 4 Chronology of lossless convexification theory development. Note the progression from state-unconstrained problems to those with in red in all subsequent LCvx optimization problems. progressively more general state constraints and, finally, to problems Consider now the following set of conditions, which that contain integer variables. arise naturally when using the maximum principle to prove lossless convexification. The theoretical details are provided in [86, Theorem 2]. x(0) = x0, b(tf , x(tf )) = 0. (6d)

n In Problem6, tf > 0 is the terminal time, x( ) R Condition 1 · ∈ is the state trajectory, u( ) Rm is the input trajectory, The pair A( ), B( ) must be totally controllable. This · ∈ { · · } w( ) Rp is an exogenous additive disturbance, m : R means that any initial state can be transferred to any · ∈ × Rn R is a convex terminal cost, ` : R R is a convex final state by a bounded control trajectory in any finite → → and non-decreasing running cost modifier, ζ 0, 1 is a time interval [0, tf ][46, 94]. If the system is time in- fixed user-chosen parameter to toggle the running∈ { } cost, variant, this is equivalent to A, B being controllable, m { } g0, g1 : R R+ are convex functions, ρmin > 0 and and can be verified by checking that the controllability → n ρmax > ρmin are user-chosen bounds, and b : R R matrix is full rank or, more robustly, via the PBH test × → Rnb is an affine terminal constraint function. Note that [95].  the dynamics in Problem6 define a linear time varying (LTV) system. For all the LCvx discussion that follows, Condition 2 we will also make the following two assumptions on the Define the quantities: problem data. xm[tf ] n+1 mLCvx = ∇ R , (8a) tm[tf ] + ζ`(σ(tf )) ∈ Assumption 1 ∇  T If any part of the terminal state is constrained then xb[tf ] (n+1) nb nb n BLCvx = ∇ T R × . (8b) the Jacobian xb[tf ] R × is full row rank, i.e., tb[tf ] ∈ ∇ ∈ ∇  rank xb[tf ] = nb. This implies that the terminal state ∇ is not over-constrained.  The vector mLCvx and the columns of BLCvx must be linearly independent.  Assumption 2 We can now state Theorem1, which is the main loss- The running cost is positive definite, in other words less convexification result for Problem6. The practical `(z) > 0 for all z = 0. implication of Theorem1 is that the solution of Prob- 6  When faced with a nonconvex problem like Problem6, lem6 can be found in polynomial time by solving Prob- an engineer has two choices. Either devise a nonlinear lem7 instead. 7 Preprint

Convex Relaxation of an Input Lower Bound becomes:

ρmin σ, u 2 min(σ, ρmax). (S2) ≤ k k ≤ 2 Figure S1 shows how for u R , going from (S1) to (S2) ∈ lifts a nonconvex annulus to a convex volume , V

3 (u, σ) R : ρmin σ, V , { ∈ ≤ u 2 min(σ, ρmax) . (S3) k k ≤ } The drawback is that inputs that were not feasible for FIGURE S1 Relaxation of the nonconvex input constraint set (S1) to the convex set (S2) by using a slack input σ. If the optimal in- (S1) become feasible for (S2). In particular, any u for which (S4) put (u∗, σ∗) ∂LCvx , then u∗ is feasible for the nonconvex u 2 < ρmin is now feasible. However, for the special case constraint (S1∈). V k k of (u, σ) ∂LCvx , where: ∈ V

3 typical rocket engine’s thrust is upper bounded by the ∂LCvx (u, σ) R : ρmin σ, V , { ∈ ≤ A engine’s performance and lower bounded by combus- σ ρ (S4) u 2 = min( , max) , tion instability issues for small thrusts. Thus, (6c) can be k k } written as the input u is feasible. The goal of lossless convexification is to show that this occurs at the global optimum of Prob- ρmin u 2 ρmax, (S1) ≤ k k ≤ lem7, in other words (u∗(t), σ∗(t)) ∂LCvx . Theorem1 ∈ V so that g0(u) = g1(u) u 2. The relaxation (7c)-(7d) then guarantees this. , k k

T Theorem 1 nˆ u(t) σ(t)ˆng, (10e) u ≥ The solution of Problem7 is globally optimal for Prob- x(0) = x0, b(tf , x(tf )) = 0. (10f) lem6 if Conditions1 and2 hold.  There is a partial generalization of Theorem1. Just like in Problem7, we introduced a slack input First, restrict Problem6 to the choices ζ = 1 and σ( ) R to strategically remove nonconvexity. Note once · ∈ g0 = g1. Next, introduce a new pointing-like input again the appearance of the LCvx equality constraint m constraint (9d). The quantitiesn ˆu R andn ˆg R (10d). Meanwhile, the relaxation of (9d) to (10e) cor- are user-chosen parameters. The new∈ problem takes∈ the responds to a halfspace input constraint in the (u, σ) ∈ following form: Rm+1 space. A geometric intuition about the relaxation is illustrated in “Convex Relaxation of an Input Pointing tf min m(tf , x(tf )) + `(g0(u(t))) dt (9a) Constraint” for a typical vehicle tilt constraint. Lossless u,tf Z0 convexification can again be shown under an extra Con- s.t.x ˙(t) = A(t)x(t) + B(t)u(t) + E(t)w(t), (9b) dition3, yielding Theorem2. Theoretical details are provided in [87, 96]. ρmin g0(u(t)) ρmax, (9c) ≤ ≤ T nˆ u(t) nˆgg0(u(t)), (9d) u ≥ Condition 3 m (m 1) x(0) = x0, b(tf , x(tf )) = 0. (9e) Let N R × − be a matrix whose columns span the nullspace∈ ofn ˆ in (9d). The pair A( ), B( )N must be Using (9d), one can, for example, constrain an air- u totally controllable. { · · } borne vehicle’s tilt angle. This constraint, however, is  nonconvex forn ˆ < 0. We take care of this noncon- g Theorem 2 vexity, along with the typical nonconvexity of the lower The solution of Problem 10 is globally optimal for Prob- bound in (9c), by solving the following relaxation of the lem9 if Conditions1,2, and3 hold. original problem: 

tf Affine State Constraints min m(tf , x(tf )) + `(σ(t)) dt (10a) The logical next step after Theorems1 and2 is to ask σ,u,tf 0 Z whether Problem6 can incorporate state constraints. It s.t.x ˙(t) = A(t)x(t) + B(t)u(t) + E(t)w(t), (10b) turns out that this is possible under a fairly mild set of ρmin σ(t) ρmax, (10c) ≤ ≤ extra conditions. The results presented in this section g0(u(t)) σ(t), (10d) originate from [88, 89, 97]. ≤ 8 Preprint

Convex Relaxation of an Input Pointing Constraint

from a nominal pointing direction ˆnu. The relaxed constraint (10e) becomes

T ˆn u σ cos(θmax), (S6) u ≥ which is a halfspace constraint and is thus convex even for

θmax > π/2, when (S5) becomes nonconvex. 2 Figure S2 shows how for u R , relaxing (S5) to (S6) ∈ lifts an obtuse “pie slice” to a convex halfspace , V FIGURE S2 Relaxation of the nonconvex input set (S5) to a convex 3 T , (u, σ) R : ˆnu u σ cos(θmax), u 2 σ . (S7) set (S6) via intersection with a halfspace in the lifted (u, σ) space. V { ∈ ≥ k k ≤ } If the optimal input (u∗, σ∗) ∂LCvx then u∗ is feasible for the nonconvex constraint (S5). ∈ V The drawback is that inputs that were not feasible for (S5) become feasible for (S6). In particular, low magnitude in- T ractical mechanical systems, such as Segways and puts for which ˆnu u < u 2 cos(θmax) become feasible. How- k k rockets, may have a constraint on their tilt angle away ever, if (u, σ) ∂LCvx , P ∈ V from an upright orientation. To model such cases, the in- 3 T ∂LCvx (u, σ) R : ˆnu u σ cos(θmax), u 2 = σ , equality (9d) can be specialized to an input pointing con- V , { ∈ ≥ k k } (S8) straint. By choosing ˆng = cos(θmax) and g0(u) = u 2, the k k constraint becomes then the input u satisfies the original constraint (S5). The goal of lossless convexification is to show that this occurs T ˆnu u u 2 cos(θmax), (S5) ≥ k k at the global optimum, in other words (u∗, σ∗) ∂LCvx . ∈ V which constrains u to be no more than θmax radians away Theorem2 guarantees this.

Affine inequality constraints are the simplest class of straint set, as illustrated in “Using Halfspaces to Further state constraints that can be handled in LCvx. The non- Constrain the Input Set”. convex statement of the original problem is a close rela- Let us propose the following convex relaxation of Prob- tive of Problem6: lem 11, which takes the familiar form of Problem7:

tf tf min m(x(tf )) + ζ `(σ(t)) dt (12a) σ,u,tf min m(x(tf )) + ζ `(g1(u(t))) dt (11a) 0 u,tf Z Z0 s.t.x ˙(t) = Ax(t) + Bu(t) + Ew, (12b) s.t.x ˙(t) = Ax(t) + Bu(t) + Ew, (11b) ρmin σ(t), g0(u(t)) ρmax, (12c) ρmin g1(u(t)), g0(u(t)) ρmax, (11c) ≤ ≤ ≤ ≤ g1(u(t)) σ(t), (12d) Cu(t) c, (11d) ≤ ≤ Cu(t) c, (12e) Hx(t) h, (11e) ≤ ≤ Hx(t) h, (12f) x(0) = x0, b(x(tf )) = 0. (11f) ≤ x(0) = x0, b(x(tf )) = 0. (12g) First, we note that Problem 11 is autonomous, in To guarantee lossless convexification for this convex other words the terminal cost in (11a), the dynamics relaxation, Condition1 can be modified to handle the (11b), and the boundary constraint (11f) are all indepen- new state and input constraints (11d) and (11e). To this dent of time. Note that a limited form of time variance end, we use the following notion of cyclic coordinates can still be included through an additional time integra- from mechanics [98]. tor state whose dynamics arez ˙(t) = 1. The limitation here is that time variance must not introduce noncon- Definition 1 vexity in the cost, in the dynamics, and in the terminal For a dynamical systemx ˙ = f(x), with state x n, nh n R constraint (11f). The matrix of facet normals H R × ∈ ∈ any components of x that do not appear explicitly in and the vector of facet offsets h Rnh define a new poly- ∈ f( ) are said to be cyclic coordinates. Without loss of topic (affine) state constraint set. A practical use-case generality,· we can decompose the state as follows: for this constraint is described in “Landing Glideslope as nc m an Affine State Constraint”. Similarly, C R × and xc nc ∈ x = , (13) c R define a new polytopic subset of the input con- xnc ∈   9 Preprint

Landing Glideslope as an Affine State Constraint requirement can be expressed as a convex second-order cone constraint:

T ˆe x x 2 cos(γgs), (S9) 3 ≥ k k

where γgs [0, π/2] is the glideslope angle, i.e. the maxi- ∈ mum angle that the spacecraft position vector is allowed to make with the local vertical. As illustrated in Figure S3, we can approximate (S9) as an intersection of four halfspaces with the following outward normal vectors:     cos(γgs) 0     ˆn1, 0 , ˆn2 , cos(γgs) , (S10a) sin(γgs) sin(γgs) − −     FIGURE S3 A spacecraft planetary landing glideslope cone can be cos(γgs) 0 approximated as an affine state constraint (11e). By adding more −    ˆn3, 0 , ˆn4, cos(γgs) . (S10b) facets, a cone with circular cross-sections can be approximated to   −  sin(γgs) sin(γgs) arbitrary precision. − − Thus, (S9) can be written in the form (11e) by setting: typical planetary landing problem may include a  T A glideslope constraint to ensure sufficient elevation dur- ˆn1 ing approach and to prevent the spacecraft from colliding ˆn T H =  2  , h = 0 4. (S11) 3  T R with nearby terrain [45]. Letting ˆe3 R represent the lo- ˆn3 ∈ ∈   T cal vertical unit vector at the landing site, the glideslope ˆn4

nc nnc where xc R are the cyclic and xnc R are the i 1, ... , nh . If hi = 0, then there must exist a cyclic ∈ ∈ ∈ { } 6 non-cyclic coordinates, such that nc + nnc = n. We can transformation Ψc such that then writex ˙ = f(xnc).  T Many mechanical systems have cyclic coordinates. For Hi Ψc(x) = 0. (15) example, quadrotor drone and fixed-wing aircraft dy-  namics do not depend on the position or yaw angle [99]. To visualize the implication of Condition4, simply Satellite dynamics in a circular low Earth orbit, fre- consider the case of the landing glideslope constraint quently approximated with the Clohessy-Wiltshire-Hill from “Landing Glideslope as an Affine State Constraint”. equations [100], do not depend on the true anomaly an- Because the position of a spacecraft in a constant grav- gle which locates the spacecraft along the orbit. ity field is a cyclic coordinate, Condition4 confirms our With Definition1 in hand, we call a cyclic trans- n n intuitive understanding that we can impose landing at a formation Ψc : R R any mapping from the state → coordinate other than the origin without compromising space to itself which translates the state vector along the lossless convexification. Figure5 provides an illustration. cyclic coordinates. In other words, assuming without loss From linear systems theory [89, 101, 102], it turns out of generality that the state is given by (13), we can write: that x(t) i can hold for a non-zero time interval (i.e. the state∈ “sticks” F to a facet) if and only if there exists xc + ∆xc m n m nv Ψc(x) = (14) a triplet of matrices Fi R × , Gi R × , Hi xnc m p { ∈ ∈ ∈   R × such that: } n for some translation ∆xc R . Let us now consider u(t) = Fix(t) + Giv(t) + Hiw. (16) the polytopic state constraint∈ (12f) and, in particular, n T T nv let i = x R : Hi x = hi be its i-th facet (Hi is the The “new” control input v(t) R effectively gets fil- F { ∈ } i-th row of H). The following condition must then hold. tered through (16) to produce a∈ control that maintains x( ) on the hyperplane i. The situation is illustrated · F Condition 4 in Figure6 in a familiar block diagram form. While the n T Let i = x R : Hi x = hi denote the i-th matrix triplet is not unique, a valid triplet may be com- facetF of the{ polytopic∈ state constraint} (12f), for any puted via standard linear algebra operations. The reader 10 Preprint

Using Halfspaces to Further Constrain the Input Set components, or on combinations thereof. To enhance the descriptiveness of the input constraint set, halfspace con- straints in (11d) may be used. This is illustrated in Figure S4

where g0( ) = 1, g1( ) = 2, and · k · k · k · k " # " # ˆnT c 1 1 (S12) C = T , c = . ˆn2 0

For example, u1 and u2 may describe the concentrations of two chemical reactants – hydrochloric acid (HCl) and

ammonia (NH3) – to produce ammonium chloride (NH4Cl). FIGURE S4 Halfspace constraints from (11d) can be used to se- Then, the first affine constraint may describe the maximum lect only portions of the nonconvex input set defined by (11c). The concentration of NH3 while the second affine constraint may remaining feasible input set is filled in red. describe an inter-dependency in the concentrations of both reactants. Meanwhile, the by-now familiar constraint (11c) he input constraint set defined by (11c) generally lacks describes the joint reactant concentration upper and lower T the ability to describe constraints on individual input bounds.

may consult these operations directly in the source code of the LCvx examples provided at the end of this article. The proof of LCvx for Problem 11 was originally de- veloped in [88, 89]. The theory behind equation (16) is among the most abstract in all of LCvx, and we shall not attempt a proof here. The ultimate outcome of the proof is that the following condition must hold.

Condition 5 n For each facet i R of the polytopic state constraint (12f), the followingF ⊆ “dual” linear system has no trans- mission zeros:

T T λ˙ (t) = (A + BFi) λ(t) (CFi) µ(t), − − T T FIGURE 5 Illustration of a landing glideslope constraint (red, sideview y(t) = (BGi) λ(t) + (CGi) µ(t). of Figure S3) that undergoes a cyclic shift in position along the positive x2 axis, to arrive at a new landing location (blue). Thanks to Condi-  tion4, the LCvx guarantee continues to hold for the new glideslope Transmission zeros are defined in [103, Section 4.5.1]. constraint, even though the new constraint facets are not subspaces. Roughly speaking, if there are no transmission zeros then there cannot exist an initial condition λ(0) Rn and an ∈ input trajectory µ Rnc such that y(t) = 0 for a non- zero time interval. ∈ We can now state when lossless convexification holds for Problem 12. Note that the statement is very similar to Theorem1. Indeed, the primary contribution of [88, 89] was to introduce Condition5 and to show that LCvx holds by using a version of the maximum principle that includes state constraints [104, 105]. The actual lossless convexification procedure, meanwhile, does not change.

Theorem 3 The solution of Problem 12 is globally optimal for Prob- FIGURE 6 Given x(0) i, the dynamical system (11b) evolves on ∈ F lem 11 if Conditions2 and5 hold.  i if and only if u(t) is of the form (16). F Most recently, a similar LCvx result was proved for problems with affine equality state constraints that can 11 Preprint furthermore depend on the input [93]. These so-called mixed constraints are of the following form: tf min m(x(tf )) + ζ `(σ(t)) dt (19a) σ,u,tf y(t) = C(t)x(t) + D(t)u(t), (17) Z0 s.t.x ˙ 1(t) = Ax2(t), (19b) where y, C, and D are problem data. x˙ 2(t) = Bu(t) + w, (19c)

ρmin σ(t), g0(u(t)) ρmax, (19d) Quadratic State Constraints ≤ ≤ g1(u(t)) σ(t), (19e) In the last section, we showed an LCvx result for state ≤ T constraints that can be represented by the affine descrip- x2(t) Hx2(t) 1, (19f) ≤ tion (11e). The natural next question is whether LCvx x1(0) = x1,0, x2(0) = x2,0, b(x(tf )) = 0. (19g) extends to more complicated state constraints. It turns out that a generalization of LCvx exists for quadratic Thanks to the structure of the dynamics (19b)-(19c), it state constraints, if one can accept a slight restriction can be shown that Condition1 is automatically satisfied. to the system dynamics. This result was originally pre- On the other hand, Condition2 must be modified to sented in [97, 102]. The nonconvex problem statement is account for the quadratic state constraint. as follows: Condition 6 2n If ζ = 0, the vector xm[tf ] R and the columns of tf the following matrix∇ must be∈ linearly independent: min m(x(tf )) + ζ `(g1(u(t))) dt (18a) u,tf 0 Z b[t ]T 0 ˜ x1 f 2n (nb+1) s.t.x ˙ 1(t) = Ax2(t), (18b) BLCvx = ∇ T R × . (20) x2 b[tf ] 2Hx2(tf ) ∈ x˙ 2(t) = Bu(t) + w, (18c) ∇ 

ρmin g1(u(t)), g0(u(t)) ρmax, (18d)  ≤ ≤ T Note that Condition6 carries a subtle but important x2(t) Hx2(t) 1, (18e) T ≤ implication. Recall that due to Assumption1, xb[tf ] x1(0) = x1,0, x2(0) = x2,0, b(x(tf )) = 0, (18f) must be full column rank. Hence, if ζ = 0 then∇ the 2n T vector 0, 2Hx2(tf ) R and the columns of xb[tf ] where (x , x ) n n is the state that has been ∈ ∇ 1 2 R R must be linearly dependent. Otherwise, B˜ is full partitioned into∈ two distinct× parts, u n is an input  LCvx R column rank and Condition6 cannot be satisfied. With of the same dimension, and the exogenous∈ disturbance this in mind, the following LCvx result was proved in w n is some fixed constant. The matrix H n n is R R × [102, Theorem 2]. symmetric∈ positive definite such that (18e) maintains∈ the state in an ellipsoid. An example of such a constraint is Theorem 4 illustrated in “Maximum Velocity as a Quadratic State The solution of Problem 19 is globally optimal for Prob- Constraint”. lem 18 if Condition6 holds.  Although the dynamics (18b)-(18c) are less general than (11b), they can still accommodate problems related General Convex State Constraints to vehicle trajectory generation. In such problems, the The preceding two sections discussed problem classes vehicle is usually closely related to a double integrator where an LCvx guarantee is available even in the pres- system, for which A = In and B = In such that x1 is ence of affine or quadratic state constraints. For obvious the position and x2 is the velocity of the vehicle. The reasons, an engineer may want to impose more exotic control u in this case is the acceleration. constraints than afforded by Problems 11 and 18. Luck- The following assumption further restricts the problem ily, an LCvx guarantee is available for general convex setup, and is a consequence of the lossless convexification state constraints. proof [88]. As may be expected, generality comes at the price of a somewhat weaker result. In the preceding sections, the Assumption 3 LCvx guarantee was independent from the way in which The matrices A, B, and H in Problem 11 are invertible. the affine and quadratic state constraints get activated: 1 The functions g0( ) and g1( ) satisfy ρmin g1( B− w) instantaneously, for periods of time, or even for the entire 1 · · ≤ − and g0( B− w) ρmax. optimal trajectory duration. In contrast, for the case − ≤  Assumption3 has the direct interpretation of requiring of general convex state constraints, an LCvx guarantee that the disturbance w can be counteracted by an input will only hold as long as the state constraints are active that is feasible with respect to (18d). The relaxed prob- pointwise in time. In other words, they get activated lem once again simply convexifies the nonconvex input at isolated time instances and never persistently over a lower bound by introducing a slack input: time interval. This result was originally provided in [86]. 12 Preprint

Maximum Velocity as a Quadratic State Constraint

f we set A = In and B = In in Problem 18 then the state I x2 can be interpreted as a vehicle’s velocity. A maximum velocity constraint x2 2 vmax can then be equivalently k k ≤ written in the form of (18e) as

T 2 x (v − In)x2 1. (S13) 2 max ≤ Figure S5 illustrates the maximum velocity constraint.

FIGURE S5 Illustration of the boundary of a maximum velocity con- straint, which can be expressed as (18e).

The nonconvex problem statement is:

tf min m(x(tf )) + ζ `(g1(u(t))) dt (21a) u,tf Z0 s.t.x ˙(t) = Ax(t) + Bu(t) + Ew, (21b)

ρmin g1(u(t)), g0(u(t)) ρmax, (21c) ≤ ≤ x(t) , (21d) ∈ X x(0) = x0, b(x(tf )) = 0, (21e) where Rn is a convex set that defines the state X ⊆ constraints. Without the state constraint, Problem 21 is FIGURE 7 The dashed blue curve represents any segment of the nothing but the autonomous version of Problem6. As for optimal state trajectory for Problem 21 that evolves in the interior of Problem 11, time variance can be introduced in a limited the state constraint set (21d). Because the optimal control prob- lem is autonomous, any such segment is the solution to the state- way by using a time integrator state, as long as this does unconstrained Problem6. When ζ = 1 and in the limit as a , not introduce nonconvexity. The relaxed problem uses LCvx applies to the entire (open) segment inside int [86]. → ∞ X the by-now familiar slack variable relaxation technique for (21c): Consider an interior trajectory segment, as illustrated in Figure7. The optimal trajectory for the dashed por- tf tion in Figure7 is the solution of the following fixed final min m(x(tf )) + ζ `(σ(t)) dt (22a) σ,u,tf 0 state, free final time problem: Z

s.t.x ˙(t) = Ax(t) + Bu(t) + Ew, (22b) te ρmin σ(t), g0(u(t)) ρmax, (22c) min m(z(te)) + ζ `(g1(u(t))) dt (23a) u,tf ≤ ≤ ts g1(u(t)) σ(t), (22d) Z ≤ s.t.z ˙(t) = Az(t) + Bu(t) + Ew, (23b) x(t) , (22e) ∈ X ρmin g1(u(t)), g0(u(t)) ρmax, (23c) x(0) = x0, b(x(tf )) = 0. (22f) ≤ ≤ z(ts) = x(ts), z(te) = x(te). (23d) The LCvx proof is provided in [86, Corollary 3], and relies on recognizing two key facts: We recognize that Problem 23 is an instance of Prob- lem6 and, as long as ζ = 1 (in order for Condition2 to 1. When x(t) int for any time interval t [t1, t2], hold), Theorem1 applies. Because a > 0 can be arbi- ∈ X ∈ the state of the optimal control problem is uncon- trarily large in Figure7, lossless convexification applies strained along that time interval; over the open time interval (t1, t2). Thus, the solution segments of the relaxed problem that lie in the interior of 2. For autonomous problems (recall the description the state constraint set are feasible and globally optimal after Problem 11), every segment of the trajectory for the original Problem 21. is itself optimal [102]. The same cannot be said when x(t) ∂ . During As a result, whenever x(t) int , the solution of these segments, the solution can become∈ X infeasible Problem 22 is equivalent to the∈ solutionX of Problem7. for Problem 21. However, as long as x(t) ∂ at ∈ X 13 Preprint

Permissible State Constraint Activation for General Convex State Constraints

lope constraint is activated only once at time t1 prior to land-

ing. Hence, = t [0, tf ]: x(t) ∂ = t1, tf is a T { ∈ ∈ X} { } discrete set and Theorem5 holds. Note that the constraint

may be activated at other times, e.g. = t1, t2, tf , as T { } long as these times are all isolated points. In Figure S6(b) there is an interval of time for which the glideslope constraint is active. This results in a non-discrete

set of state constraint activation times = [t1, t2] tf . For T ∪{ } t [t1, t2], there is no LCvx guarantee and the solution of ∈ Problem 22 may be infeasible for Problem 21 over that in- terval. FIGURE S6 Illustration of when lossless convexification with gen- For historical context, LCvx theory was initially devel- eral convex state constraints may fail. The two figures on the right oped specifically for planetary rocket landing. For this ap- show an in-plane projection of the 3D figure on the left. In (a) the glideslope constraint is only activated once prior to landing, hence plication, glideslope constraint activation behaves like Fig- the solution is lossless. In (b) the constraint is activated for a non- ure S6(a), so LCvx holds for that application. Indeed, the trivial duration and the solution may be infeasible over that interval. constraint (S9) was part of NASA Jet Propulsion Labo- ratory’s optimization-based rocket landing algorithm flight et us go back to the landing glideslope constraint for a tests [37, 38, 39, S1, S2]. L spacecraft, which was previously motivated in “Landing Glideslope as an Affine State Constraint”. Because Prob- REFERENCES lem 21 allows us to use any convex state constraint, we can [S1] Behc¸et Ac¸ıkmes¸e et al. “Flight Testing of Trajectories Computed by G- FOLD: Fuel Optimal Large Divert Guidance Algorithm for Planetary Land- directly use the second-order cone constraint (S9). ing”. In: 23rd AAS/AIAA Space Flight Mechanics Meeting, Kauai, HI, 2013. As illustrated in Figure S6, lossless convexification in this AAS/AIAA, Feb. 2013. case will only hold if the spacecraft touches the “landing [S2] Daniel P. Scharf et al. “ADAPT demonstrations of onboard large-divert Guidance with a VTVL rocket”. In: 2014 IEEE Aerospace Conference. IEEE, cone” a finite number of times. In Figure S6(a), the glides- Mar. 2014. DOI: 10.1109/aero.2014.6836462.

isolated time instances, LCvx can be guaranteed to Algorithm 1 Solution algorithm for Problem 21. When hold. This idea is further illustrated in “Permissible ζ = 0, a two-step procedure is used where an auxiliary State Constraint Activation for General Convex State problem with ζ = 1 searches over the optimal solutions Constraints”. to the original problem.

When ζ = 0, the situation becomes more complicated 1: Solve Problem 22 to obtain x∗(tf∗ ) because Condition2 does not hold for Problem 23. This 2: if ζ = 0 then is clear from the fact that the terms defined in Condi- 3: Solve Problem 22 again, with the modifications: tf tion2 become: Use the cost 0 `(σ(t))dt zm[te] In m = , B = , Set b(x(tf )) =R x(tf ) x∗(tf∗ ) LCvx ∇ 0 LCvx 0 −     which are clearly not linearly independent since BLCvx is full column rank. Thus, even for interior segments crete example, Problem 22 may be searching for a mini- the solution may be infeasible for Problem 21. To rem- mum miss distance solution for a planetary rocket land- edy this, [86, Corollary 4] suggests Algorithm1. At ing trajectory [85]. The ancillary problem in Algorithm1 its core, the algorithm relies on the following simple can search for a minimum fuel solution that achieves the idea. By solving Problem 22 with the suggested modifi- same miss distance. Clearly, other running cost choices cations on line3 of Algorithm1, every interior segment are possible. Thus, the ancillary problem’s running cost once again becomes an instance of Problem6 for which becomes an extra tuning parameter. Theorem1 holds. Furthermore, due to the constraint We are now able to summarize the lossless convex- x(tf ) = x∗(tf∗ ), any solution to the modified problem ification result for problems with general convex state will be optimal for the original formulation where ζ = 0 constraints. (since m(x(tf )) = m(x∗(tf∗ ))). This modification can be viewed as a search for an Theorem 5 equivalent solution for which LCvx holds. As a con- Algorithm1 returns the globally optimal solution of 14 Preprint

Problem 21 if the state constraint (21d) is activated at Using the above condition, we can state the following isolated time instances, and Conditions1 and2 hold.  quite general LCvx guarantee for problems that fit the Problem 24 template. Nonlinear Dynamics A unifying theme of the previous sections is the assump- Theorem 6 tion that the system dynamics are linear. In fact, across The solution of Problem 25 is globally optimal for Prob- all LCvx results that we have mentioned so far, the dy- lem 24 if Conditions2 and7 hold.  namics did not vary much from the first formulation in Alas, Condition7 is generally quite difficult to check. (6b). Many engineering applications, however, involve Nevertheless, two general classes of systems have been non-negligible nonlinearities. A natural question is then shown to automatically satisfy this condition thanks to whether the theory of lossless convexification can be ex- the structure of their dynamics [46]. These classes ac- tended to systems with general nonlinear dynamics. commodate vehicle trajectory generation problems with An LCvx result is available for a class of nonlinear dy- double integrator dynamics and nonlinearities like mass namical systems. The groundwork for this extension was depletion, aerodynamic drag, and nonlinear gravity. The presented in [46]. The goal here is to show that the stan- following discussion of these system classes can appear dard input set relaxation based on the LCvx equality hard to parse at first sight. For this reason, we provide constraint is also lossless when the dynamics are non- two practical examples of systems that belong to each linear. Importantly, note that the dynamics themselves class in “Examples of Losslessly Convexifiable Nonlinear are not convexified, so the relaxed optimization problem Systems”. is still nonlinear, and it is up to the user to solve the The first corollary of Theorem6 introduces the first problem to global optimality. This is possible in special class of systems. A key insight is that the nullspace con- cases, for example if the nonlinearities are approximated ditions of the corollary require that 2m n, in other by piecewise affine functions. This yields a mixed-integer words there are at least twice as many control≥ variables convex problem whose globally optimal solution can be as there are state variables. This is satisfied by some ve- found via mixed-integer programming [106]. hicle trajectory generation problems where 2m = n, for With this introduction, let us begin by introducing example when the state consists of position and velocity the generalization of Problem6 that we shall solve using while the control is an acceleration that acts on all the LCvx: velocity states. This is a common approximation for fly-

tf ing vehicles. We shall see an example for rocket landing min m(tf , x(tf )) + ζ `(g(u(t))) dt (24a) in Part III of the article. u,tf Z0 s.t.x ˙(t) = f(t, x(t), u(t), g(u(t))), (24b) Corollary 1

ρmin g(u(t)) ρmax, (24c) Suppose that the dynamics (24b) are of the form: ≤ ≤ x(0) = x0, b(tf , x(tf )) = 0, (24d) x f (t, x) x = 1 , f(t, x, u, g(u)) = 1 , (26) x2 f2(t, x, u) where f : R Rn Rm R Rn defines the nonlinear     × × × → dynamics. Just as for Problem9, it is required that where null( uf2) = 0 and null( x f1) = 0 . Then ∇ { } ∇ 2 { } g0 = g1 , g. Consider the following convex relaxation of Theorem6 applies if Condition2 holds.  the input constraint by using a slack input: The next corollary to Theorem6 introduces the second

tf class of systems, for which 2m < n is allowed. This min m(tf , x(tf )) + ζ `(σ(t)) dt (25a) class is once again useful for vehicle trajectory generation σ,u,tf Z0 problems where the dynamics are given by (27) and g(u) s.t.x ˙(t) = f(t, x(t), u(t), σ(t)), (25b) is a function that measures control effort. A practical

ρmin σ(t) ρmax, (25c) example is when the state x2 is mass, which is depleted ≤ ≤ g(u(t)) σ(t), (25d) as a function of the control effort (such as thrust for a ≤ rocket). x(0) = x0, b(tf , x(tf )) = 0. (25e) Corollary 2 Note that the slack input σ makes a new appearance Suppose that the dynamics (24b) are of the form: in the dynamics (25b). The more complicated dynamics require an updated version of Condition1 in order to x f (t, x, u) x = 1 , f(t, x, u, g(u)) = 1 . (27) guarantee that LCvx holds. x2 f2(t, g(u))     Condition 7 Define the matrix:

The pair xf[t], uf[t] must be totally controllable T {∇ ∇ } ( uf1) on [0, tf ] for all feasible sequences of x( ) and u( ) for M d( f )T ∇ . (28) · · , u 1 T T Problem 25[46, 94]. ∇ ( uf) ( xf1)  " dt − ∇ ∇ # 15 Preprint

Examples of Losslessly Convexifiable Nonlinear Systems for landing either on small celestial bodies with a weak grav- itational pull, or on planets with a thick atmosphere (such as our very own Earth). In this case, the lander dynamics can be written as:

c T (t) ¨r(t) = g(r(t)) ˙r(t) 2 ˙r(t) + , (S14a) − m(t) k k m(t)

˙m(t) = α T (t) 2, (S14b) − k k

where r denotes position, m is mass, T is thrust, c is the drag coefficient, α is inversely proportional to the rocket en- 3 3 gine’s specific impulse, and g : R R is the nonlinear → gravity model. An illustration is given in Figure S7a. (a) Landing a rocket in the pres- (b) Autonomous aircraft trajectory We can rewrite the above dynamics using the template ence of nonlinear gravity and at- generation with a stall constraint 7 mospheric drag. and aerodynamic drag. of (24b) by defining the state x , (r, ˙r, m) R , the input 3 ∈ u T R , and the input penalty function g(u) u 2. , ∈ , k k The equations of motion can then be written as (omitting FIGURE S7 Illustrations of two nonlinear system for which the loss- less convexification result of Theorem6 can be applied. the time argument for concision):

  t first sight, the discussion around Corollaries1 and2 fr (x, u) may be hard to parse into something useful. However,   A f (t, x, u, g(u)) =  f˙r (x)  , we will now show two concrete and very practical examples fm(g(u)) of dynamical systems that satisfy these corollaries. Both   ˙r examples originate from [46]. 1 1 = g(r) cm− ˙r 2 ˙r + T m−  .  − k k  α T 2 Nonlinear Rocket Landing − k k First, LCvx can be used for rocket landing with nonlinear gravity and aerodynamic drag. Both effects are important (continued...)

Furthermore, suppose that the terminal constraint Embedded Lossless Convexification function b is affine and x2(tf ) is unconstrained, such that The reader will notice that the LCvx theory of the pre-

x2 b = 0. Then Theorem6 applies if null( M) = 0 vious sections deals with special cases of problems whose ∇ { } and Condition2 holds.  nonconvexity is “just right” for an LCvx guarantee to be provable using the maximum principle. Although such problems have found their practical use in prob- lems like spaceflight [37] and quadrotor path planning It must be emphasized that Problem 25 is still a non- [52], it leaves out many trajectory generation applica- linear program and that for Theorem6 to hold, a globally tions that do not fit the tight mold of original problems optimal solution of Problem 25 must be found. Although and conditions of the previous sections. this cannot be done for general , Despite this apparent limitation, LCvx is still highly if the dynamics f are piecewise affine then the problem relevant for problems that simply do not conform to one can be solved to global optimality via mixed-integer pro- of the forms given in the previous sections. For such gramming [106, 107, 108]. In this case, convexification problems, we assume that the reader is facing the chal- of the nonconvex input lower bound reduces the num- lenge of solving a nonconvex optimal control problem ber of disjunctions in the branch-and-bound tree, and that fits the mold of Problem 39 (the subject of Part II hence lowers the problem complexity [46]. Several ex- of the article), and is considering whether LCvx can help. amples of nonlinear systems that can be modeled in this There is evidence that the answer is affirmative, by using way, and which comply with Corollaries1 and2, are LCvx theory only on the constraints that are losslessly illustrated in “Approximating Nonlinear Systems with convexifiable. We call this embedded LCvx, because Piecewise Affine Functions”. it is used to convexify only part of the problem, while 16 Preprint

(continued...) known as the freestream velocity). An illustration is given in This system belongs to the class in Corollary2. In partic- Figure S7b. ular, let f1 = (fr , f˙r ) and f2 = fm. Working out the algebra in The key transformation in (S16) is to make the desired air- (28) , we have: speed vd the control variable, and to include a velocity con- " 1 # trol inner-loop via a proportional controller with gain κ. Be- 0 m− I (S15) M = 1 , cause the velocity dynamics are of first-order, v converges m− IM22 to vd along a straight path in the velocity phase plane, hence  T  T c ˙r˙r α TT where M22 = m2 ˙r 2I + ˙r + m2 T . Thanks to the the stall speed constraint is closely approximated for small − k k k k2 k k2 off-diagonal terms, null(M) = 0 unconditionally, so the control errors vd v 2. { } k − k rocket lander dynamics satisfy the requirements of Corol- We can rewrite the dynamics using the template of (24b) lary2. 6 by defining the state x (r, v) R and the input u vd , ∈ , ∈ 3. The equations of motions can then be written as: AAV Trajectory Generation Without Stalling R An autonomous aerial vehicle (AAV) can lose control and " # f (t, x) fall out of the sky if its airspeed drops below a certain value. f (t, x, u, g(u)) = r , f (x, u) This occurs because the wings fail to generate enough lift, v " # resulting in an aerodynamic stall. It was shown in [46] that v + v = ∞ . a stall speed constraint of the form vmin v 2 can be han- κv c v 2v + κvd ≤ k k − − k k dled by LCvx via the following dynamics: This system belongs to the class in Corollary1. In partic- ˙r(t) = v(t) + v (t), (S16a) ∞  ular, let f1 = fr and f2 = fv . We then have uf2 = κI and ˙v(t) = κ vd (t) v(t) c v(t) 2v(t), (S16b) ∇ − − − k k x f1 = I . Therefore, null( uf2) = 0 and null( x f1) = ∇ 2 ∇ { } ∇ 2 where v is the airspeed, ˙r is the ground velocity, and v 0 , so the aircraft dynamics satisfy the requirements of ∞ { } is the velocity of the air mass relative to the ground (also Corollary1.

the rest is handled by another nonconvex optimization stall speed constraint of the form: method such as presented in Part II of this article. Be- 0 < vmin vcmd(t) 2 vmax, (29) cause LCvx reduces the amount of nonconvexity present ≤ k k ≤ in the problem, it can significantly improve the conver- where the input v ( ) 3 is the commanded veloc- gence properties and reduce the computational cost to cmd R ity, while v and v · are∈ lower and upper bounds that solve the resulting problem. An example of this approach min max guarantee a stable flight envelope. The same constraint for quadrotor trajectory generation is demonstrated in is also considered in [46]. In [53], the authors develop Part III. a highly nonlinear planetary entry trajectory optimiza- The basic procedure for applying embedded LCvx is tion problem where the control input is the bank an- illustrated in Figure8. As shown in Figure 8b, we reiter- gle β R, parameterized via two inputs u1 , cos(β) ∈ 2 ate that LCvx is not a computation scheme, but rather and u2 , sin(β). The associated constraint u 2 = 1 is convexified to u 2 1, and equality at thek optimalk it is a convex relaxation with an accompanying proof of k k2 ≤ equivalence to the original problem. As such, it happens solution is shown in an LCvx-like fashion (the authors prior to the solution and simply changes the problem de- call it “assurance of active control constraint”). Similar scription seen by the subsequent numerical optimization methods are used in [55] in the context of rocket landing algorithm. with aerodynamic controls. A survey of related meth- ods is available in [110]. Finally, we will mention [52, There are a number of examples of embedded LCvx 58] where embedded LCvx is used to convexify an in- that we can mention. First, the previous section on non- put lower bound and an attitude pointing constraint for linear dynamics can be intepreted as embedded LCvx. rocket landing and for agile quadrotor flight. Sequential For example, [46] solves a rocket landing problem where convex programming from Part II is then using to solve only the nonconvex input constraint (24c) is convexified. the remaining nonlinear optimal control problems. The This leaves behind a nonconvex problem due to nonlin- quadrotor application in particular is demonstrated as a ear dynamics, and mixed-integer programming is used numerical example in Part III. to solve it. Another example is [109], where LCvx is As a result of the success of these applications, we fore- embedded in a mixed-integer autonomous aerial vehicle see there being further opportunities to use LCvx as a trajectory generation problem in order to convexify a strategy to derive simpler problem formulations. The re- 17 Preprint

Approximating Nonlinear Systems with Piecewise Affine Functions

i n m iecewise affine functions can be used for arbitrarily ac- angular region R R defined by the following in- R ⊂ × P curate approximation of any nonlinear function. This is equalities: the technique used by [46] to write the nonlinear dynamics i i i i (25b) in piecewise affine form. Doing so enables solving ¯x + bL,x x ¯x + bU,x , (S20a) ≤ ≤ Problem 25 to global optimality via mixed-integer program- ¯ui + bi u ¯ui + bi , (S20b) L,u ≤ ≤ U,u ming, which is imperative for the LCvx guarantee of Theo- i i rem6. We now show how a piecewise affine approximation where bL,x and bU,x represent the upper and lower bounds i i can be obtained in the general case, and concretely in the on the state, while bL,u and bU,u relate to the input. The case of the dynamics from “Examples of Losslessly Con- index i now takes on a clear meaning: it represents the i- vexifiable Nonlinear Systems”. th “validity” region. Without loss of generality, we assume i j = if i = j. In general, i = 1, ... , N, which means R ∩ R ∅ 6 General Case that f is approximated by affine functions over N regions. We begin by finding a first-order approximation of the non- The piecewise affine approximation of f can then be written n m n linear equation f : R R R R R from (25b) about as: × i ×i × →n m an operating point (t, ¯x , ¯u ) R R R , where i is an ∈ × × i i integer index that shall become clear in a moment. Without fpwa fa + f , for i such that (x, u) . (S21) , na ∈ R loss of generality, assume that f is decomposable into an affine and a non-affine part: The big-M formulation can be used to write (S21) in a form that is readily used in a mixed-integer program [108]. f = fa + fna. (S17) To this end, let M > 0 be a sufficiently large fixed scalar pa- rameter, and let z i 0, 1 be a binary variable indicating The first-order Taylor expansion of f is given by: ∈ { } that (x, u) i if and only if z i = 1. Then, the piecewise ∈ R i f fa + f , affine dynamics in mixed-integer programming form are en- ≈ na i   coded by the following set of constraints: f ¯fna + x¯fna (x ¯x) + ˜ u¯fna (u ¯u), (S18) na , ∇ − ∇ − PN i i where we have used the shorthand: ˙x = fa(t, x, u, g(u)) + i=1 z fna(t, x, u, g(u)), (S22a) x ¯x i + bi M(1 z i ), (S22b) i i ≥ L,x − − fna = fna(t, x, u, g(u)), (S19a) x ¯x i + bi + M(1 z i ), (S22c) ¯ i i i ≤ U,x − fna = fna(t, ¯x , ¯u , g(¯u )), (S19b) i i i u ¯u + bL,u M(1 z ), (S22d) T ˜ ufna = ufna + ( g fna)( g) . (S19c) ≥ − − ∇ ∇ ∇ ∇ u ¯ui + bi + M(1 z i ), (S22e) ≤ U,u − N i In (S19c), we understand g fna as the Jacobian of fna with 1 = P z . (S22f) ∇ i=1 respect to its fourth argument. Suppose that the lineariza- tion in (S18) is sufficiently accurate only over the hyperrect- (continued...)

sult would be a speedup in computation for optimization- that expand the method to new and interesting problem based trajectory generation. types. This section briefly surveys these new results.

The Future of Lossless Convexification Fixed-final Time Problems Lossless convexification is a method that solves noncon- vex trajectory generation problems with one or a small The first new LCvx result applies to a fixed-final time number of calls to a convex solver. This places it among and fixed-final state version of Problem6 with no state the most reliable and robust methods for nonconvex tra- constraints. To begin, recognize that the classical LCvx jectory generation. The future of LCvx therefore has an result from Theorem1 does not apply when both tf and obvious motivation: to expand the class of problems that x(tf ) are fixed. In this case, BLCvx = In+1 in (8b) and can be losslessly convexified. The most recent result dis- therefore its columns, which span all of Rn+1, cannot cussed in the previous sections is for problems with affine be linearly independent from mLCvx. Thus, tradition- state constraints [89], and this dates back to 2014. In the ally one could not fix the final time and the final state past two years, LCvx research has been rejuvenated by simultaneously. Very recently, Kunhippurayil et al. [92] several fundamental discoveries and practical methods showed that Condition2 is in fact not necessary for the 18 Preprint

4 2 (continued...) R and the input u = vd R . The non-affine part of the ∈ dynamics (S17) then takes the particular form: Nonlinear Rocket Landing " # Consider the rocket dynamics in (S14a) and, for simplic- 0 (S25) ity, suppose that the mass is constant. Define the state fna = 1 . c− v 2v 6 3 − k k x (r, ˙r) R and the input u T R . The non-affine , ∈ , ∈ part of the dynamics (S17) then takes the particular form: Let us focus on the aerodynamic drag term, fd = " # 0 c v 2v. The first order Taylor expansion about a refer- fna = 1 . (S23) − k k i g(r) cm− ˙r 2 ˙r ence airspeed ¯v is: − k k For simplicity, let us consider only the nonlinear gravity i 1 i i 2 i i T i fd = c− ¯v 2[I + ¯v 2− ¯v ¯v ](v ¯v ). (S26) term, g(r). We will deal with atmospheric drag for AAV tra- − k k k k − jectory generation below. Suppose that g(r) = (0, 0, gz (rz )),

which means that we only need to consider the vertical Using (S21) and (S26), Figure S9 draws fd,pwa. 2 component gz : R R. A typical profile is gz (rz ) = µ/rz , → − where µ is the gravitational parameter. Given a reference i point ¯rz , we have:

i i i g = gz (¯r ) + g 0(¯r )(rz ¯rz ). (S24) z z z z − Using (S21) and (S24), Figure S8 draws gz,pwa.

gz (a) Surface of the continuous function fd,x . The blue dots display op- gz,pwa − bi bi erating points at which the gradient is evaluated. U,rz − L,rz

rz i i ¯rz R

FIGURE S8 Illustration of a piecewise affine approximation of gz , (b) Surface of the discontinuous piecewise affine approximation (S24) fd ,pwa. The blue dashed rectangle in the airspeed space shows the the z-component of the nonlinear gravity force, using . − x boundary of the approximation validity region i . R AAV Trajectory Generation Without Stalling (S16) Consider the AAV dynamics in and, for simplicity, as- FIGURE S9 Illustration of a piecewise affine approximation of fdx , sume constant altitude flight. Define the state x (r, v) the x-component of the aerodynamic drag force, using (S26). , ∈

following version of Problem6: ρmin σ(t) ρmax, (31c) ≤ ≤ g(u(t)) σ(t), (31d) tf ≤ min `(g(u(t))) dt (30a) x(0) = x0, x(tf ) = xf . (31e) u,tf Z0 s.t.x ˙(t) = Ax(t) + Bu(t), (30b) The following result is then proved in [92]. By dropping Condition2, the result generalizes Theorem1 ρmin g(u(t)) ρmax, (30c) ≤ ≤ and significantly expands the reach of LCvx to problems x(0) = x0, x(tf ) = xf , (30d) without state constraints.

n where tf is fixed and xf R specifies the final state. ∈ Theorem 7 The lossless relaxation is the usual one, and is just a The solution of Problem 31 is globally optimal for Prob- specialization of Problem7 for Problem 30: lem 30 if Condition1 holds and tf is between the min- imum feasible time and the time that minimizes (30a). tf min `(σ(t)) dt (31a) For longer trajectory durations, there exists a solution σ,u,tf Z0 to Problem 31 that is globally optimal for Problem 30. s.t.x ˙(t) = Ax(t) + Bu(t)w(t), (31b)  19 Preprint

method, however, has poor (combinatorial) worst-case complexity. Historically, this made it very difficult to put optimization with discrete logic onboard computation- ally constrained and safety-critical systems throughout aerospace, automotive, and even state-of-the-art robotics [117]. (a) The solution process for a nonconvex optimal control problem, without using LCvx. Two recent results showed that LCvx can be applied to certain classes of hybrid optimal control problems that are useful for trajectory generation [90, 91]. While the results are more general, the following basic problem will help ground our discussion:

tf M min u (t) dt (32a) (b) The solution process for a nonconvex optimal control problem, where i 2 u,γ,tf k k 0 i=1 LCvx is embedded to convexify part of the original problem. Z X M

s.t.x ˙(t) = Ax(t) + B ui(t), (32b) FIGURE 8 Illustration of how embedded LCvx can be used to solve i=1 an optimal control problem that does not fit into any of the templates X presented in Part I of this article. γi(t)ρmin,i ui(t) 2 γi(t)ρmax,i, (32c) ≤ k k ≤ γi(t) 0, 1 , (32d) ∈ { } Ciui(t) 0, (32e) Perhaps the most important part of Theorem7, and a ≤ significant future direction for LCvx, is in its final sen- x(0) = x0, b tf , x(tf ) = xf , (32f) tence. Although a lossless solution “exists”, how does where M is the number of individual input vectors and algorithm one find it? An is provided in [92] to find the binary variables γi are used to model the on-off na- the lossless solution, that is, one solution among many ture of each input. Compared to the traditional Prob- others which may not be lossless. This is similar to The- lem6, this new problem can be seen as a system con- orem5 and Algorithm1: we know that slackness in (31d) trolled by M actuators that can be either “off” or “on” may occur, so we devise an algorithm that works around and norm-bounded in the [ρmin,i, ρmax,i] interval. The the issue and is able to recover an input for which (31d) affine input constraint (32e) represents an affine cone, holds with equality. Most traditional LCvx results place and is a specialized version of the earlier constraint (11d). further restrictions on the original problem in order to Figure9 illustrates the kind of input set that can be “avoid” slackness, but this by definition limits the appli- modeled. Imitating the previous results, the convex re- cability of LCvx. By instead providing algorithms which laxation uses a slack input for each control vector: recover lossless inputs from problems that do not ad- M mit LCvx naturally, we can tackle lossless convexification tf min `(σi(t)) dt (33a) “head on” and expand the class of losslessly convexifiable σ,u,γ,tf 0 i=1 problems. A similar approach is used for spacecraft ren- Z X M dezvous in [56], where an iterative algorithm modifies the s.t.x ˙(t) = Ax(t) + B u (t), (33b) dynamics in order to extract bang-bang controls from a i i=1 solution that exhibits slackness. X γi(t)ρmin,i σi(t) γi(t)ρmax,i, (33c) ≤ ≤ ui(t) 2 σi(t), (33d) Hybrid System Problems k k ≤ Many physical systems contain on-off elements such as 0 γi(t) 1, (33e) ≤ ≤ valves, relays, and switches [46, 108, 111, 112, 113]. Dis- Ciui(t) 0, (33f) crete behavior can also appear through interactions be- ≤ x(0) = x0, b tf , x(tf ) = xf , (33g) tween the autonomous agent and its environment, such as through foot contact for walking robots [114, 115]. where the only real novelty is that the γi variables have Modeling discrete behavior is the province of hybrid sys- also been relaxed to the continuous [0, 1] interval. tems theory, and the resulting trajectory problems typi- Taking Problem 33 as an example, the works of [90, cally combine continuous variables and discrete logic el- 91] prove lossless convexification from slightly different ements (i.e., “and” and “or” gates) [111, 113, 116]. Be- angles. In [91] it is recognized that Problem 33 will be cause these is no concept like local perturbation for val- a lossless convexification of Problem 32 if the dynami- ues that, for example, can only be equal to zero or one, cal system is “normal” due to the so-called bang-bang problems with discrete logic are fundamentally more dif- principle [12, 118]. Normality is related to, but much ficult. Traditional solution methods use mixed-integer stronger than, the notion of controllability from Con- programming [106]. The underlying branch-and-bound dition1. Nevertheless, it is shown that the dynamical 20 Preprint

Toy Example The following example provides a simple illustration of how LCvx can be used to solve a nonconvex problem. This example is meant to be a “preview” of the practi- cal application of LCvx. More challenging and realistic examples are given in Part III. The problem that we will solve is minimum-effort con- trol of a double integrator system (such as a car) with a constant “friction” term g. This can be written as a linear time-invariant instance of Problem6:

tf min u(t)2 dt (34a) u Z0 s.t.x ˙ 1(t) = x2(t), (34b) FIGURE 9 Example of a feasible input set that can be modeled in Problem 32. It is a nonconvex disconnected set composed of the ori- x˙ 2(t) = u(t) g, (34c) − gin, a point, an arc, and a nonconvex set with an interior. For example, 1 u(t) 2, (34d) this setup can represent a satellite equipped with thrusters and drag ≤ | | ≤ plates, or a rocket with a thrust-gimbal coupled engine [90, 91]. x1(0) = x2(0) = 0, (34e)

x1(tf ) = s, x2(tf ) = 0, tf = 10. (34f)

The input u( ) R is the acceleration of the car. The · ∈ system can be perturbed by an arbitrarily small amount constraint (34d) is a nonconvex one-dimensional version to induce normality. This phenomenon was previously of the constraint (S1). Assuming that the car has unit observed in a practical context for rocket landing LCvx mass, the integrand in (34a) has units of Watts. The with a pointing constraint, which we discussed for Prob- objective of Problem 34 is therefore to move a car by a lem9[87]. Practical examples are shown for spacecraft distance s in tf = 10 seconds while minimizing the aver- orbit reshaping, minimum-energy transfer, and CubeSat age power. Following the relaxation template provided differential drag and thrust maneuvering. It is noted by Problem7, we propose the following convex relaxation that while mixed-integer programming fails to solve the to solve Problem 34: latter problem, the convex relaxation is solved in < 0.1 tf seconds. min σ(t)2 dt (35a) σ,u Z0 The results in [57, 90] also prove lossless convexifi- s.t.x ˙ 1(t) = x2(t), (35b) cation for Problem 33. However, instead of leveraging x˙ 2(t) = u(t) g, (35c) normality and perturbing the dynamics, the nonsmooth − 1 σ(t) 2, (35d) maximum principle [104, 119, 120] is used directly to de- ≤ ≤ velop a set of conditions for which LCvx holds. These u(t) σ(t), (35e) | | ≤ conditions are an interesting mix of problem geometry x1(0) = x2(0) = 0, (35f) (i.e., the shapes and orientations of the constraint cones x1(tf ) = s, x2(tf ) = 0, tf = 10. (35g) (33f)) and Conditions1 and2. Notably, they are more general than normality, so they can be satisfied by sys- To guarantee that LCvx holds, in other words that tems that are not normal. Practical examples are given Problem 35 finds the globally optimal solution of Prob- for spacecraft rendezvous and rocket landing with a cou- lem 34, let us first attempt to verify the conditions of pled thrust-gimbal constraint. The solution is observed Theorem1. In particular, we need to show that Con- to take on the order of a few seconds and to be more ditions1 and2 hold. First, from (34b)-(34c), we can than 100 times faster than mixed-integer programming. extract the following state-space matrices: We see the works [90, 91] as complementary: [90] 0 1 0 A = , B = . (36) shows that for some systems, the perturbation proposed 0 0 1 by [91] is not necessary. On the other hand, [91] pro-     vides a method to recover LCvx when the conditions of We can verify that Condition1 holds by either showing [90] fail. Altogether, the fact that an arbitrarily small that the controllability matrix is full rank, or by using perturbation of the dynamics can recover LCvx suggests the PBH test [95]. Next, from (35a) and (35f)-(35g), a deeper underlying theory for how and why problems we can extract the following terminal cost and terminal can be losslessly convexified. We feel that the search constraint functions: for this theory will be a running theme of future LCvx tf 10 − research, and its eventual discovery will lead to more m(tf , x(tf )) = 0, b(tf , x(tf )) = x1(tf ) s . (37)  −  general lossless convexification algorithms. x2(tf ) 21   Preprint

We can now substitute (37) into (8) to obtain: 50 Maximum principle 0 40 LCvx [m]

mLCvx = 0 , BLCvx = I3. (38) 1 x 30  2 σ(tf ) 20  

Thus, BLCvx is full column rank and its columns can- Position 10 not be linearly independent from mLCvx. We conclude 0 that Condition2 does not hold, so Theorem1 cannot be 10 Maximum principle applied. In fact, Problem 34 has both a fixed final time 8 LCvx and a fixed final state. This is exactly the edge case for [m/s]

2 6 which traditional LCvx does not apply, as was mentioned x in the previous section on future LCvx. Instead, we fall 4 2 back on Theorem7 which says that Condition2 is not Velocity 0 needed as long as tf is between the minimum and opti-

] 2 mal times for Problem 34. It turns out that this holds 2

for the problem parameters used in Figure 10. The min- [m/s 1 imum time is just slightly below 10 s and the optimal u Feasible input set 0 Maximum principle time is 13.8 s for Figure 10a and 13.3 s for Fig- LCvx ≈ ≈ 1 ure 10b. Most interestingly, lossless convexification fails −

(i.e., (35e) does not hold with equality) for tf values al- Acceleration 2 most exactly past the optimal time for Figure 10a, and − 0 2 4 6 8 10 just slightly past it for Figure 10b. Time [s] Although Problem 35 is convex, it has an infinite num- (a) Solution of Problem 35 for g = 0.1 m/s2 and s = 47 m. ber of solution variables because time is continuous. To 30 Maximum principle be able to find an approximation of the optimal solution LCvx [m]

using a numerical convex optimization algorithm, the 1 20 x problem must be temporally discretized. To this end, we apply a first-order hold (FOH) discretization with 10 N = 50 temporal nodes, as explained in “Discretizing Position Continuous-time Optimal Control Problems”. 0 6 Looking at the solutions in Figure 10 for two values of Maximum principle LCvx the friction parameter g, we can see that the nonconvex 4 [m/s] 2 constraint (34d) holds in both cases. We emphasize that x 2 this is despite the trajectories in Figure 10 coming from the solution of Problem 35, where u(t) < 1 is feasible. 0 | | Velocity The fact that this does not occur is the salient feature 2 −

] 2 of LCvx theory, and for this problem it is guaranteed by 2 Theorem7. Finally, we note that Figure 10 also plots

[m/s 1 the analytical globally optimal solution obtained via the u Feasible input set 0 Maximum principle maximum principle, where no relaxation nor discretiza- LCvx tion is made. The close match between this solution and 1 −

the numerical LCvx solution further confirms the theory, Acceleration 2 as well as the accuracy of the FOH discretization method. − 0 2 4 6 8 10 Note that the mismatch at t = 0 in the acceleration plot Time [s] in Figure 10b is a benign single-time-step discretization (b) Solution of Problem 35 for g = 0.6 m/s2 and s = 30 m. artifact that is commonly observed in LCvx numerical solutions. FIGURE 10 LCvx solutions of Problem 35 for two scenarios. The close match of the analytic solution using the maximum principle (drawn as PART II: SEQUENTIAL CONVEX PROGRAMMING a continuous line) and the discretized solution using LCvx (drawn as We now move on to a different kind of convex optimiza- discrete dots) confirms that LCvx finds the globally optimal solution of the problem. tion-based trajectory generation algorithm, known as se- quential convex programming (SCP). The reader will see that this opens up a whole world of possibilities beyond the restricted capabilities of lossless convexification. One A wealth of industrial and research applications, in- could say that if LCvx is a surgical knife to remove acute cluding high-profile experiments, support this statement. nonconvexity, then SCP is a catch-all sledgehammer for Examples can be found in many engineering domains, nonconvex trajectory design [6]. ranging from aerospace [59, 121, 122, 123] and mechan- 22 Preprint ical design [124, 125] to power grid technology [126], update the trust region (and possibly other parameters) chemical processes [127], and computer vision [128, 129]. that are internal to the SCP algorithm. The solution Just last year, the Tipping Point Partnership between then becomes the new reference trajectory for the next NASA and Blue Origin started testing an SCP algo- iteration of the algorithm. rithm (that we will discuss in this section) aboard the The SCP approach offers two main advantages. First, New Shepard rocket [8, 68]. Another application of SCP a wide range of algorithms exist to reliably solve each methods is for the SpaceX Starship landing flip maneu- convex subproblem at 2 . Because SCP is agnostic to the ver [130]. Although SpaceX’s methods are undiscolsed, particular choice of subproblem optimizer, well-tested we know that convex optimization is used by the Falcon algorithms can be used. This makes SCP very attrac- 9 rocket and that SCP algorithms are highly capable of tive for safety-critical applications, which are ubiquitous solving such challenging trajectories [36, 131]. throughout disciplines like aerospace and automotive en- Further afield, examples of SCP can be found in gineering. Second, one can derive meaningful theoreti- medicine [132, 133], economics [134, 135], biology [136, cal guarantees on algorithm performance and computa- 137], and fisheries [138]. Of course, in any of these tional complexity, as opposed to general NLP optimiza- applications, SCP is not the only methodology that can tion where the convergence guarantees are much weaker. be used to obtain good solutions. Others might include Taken together, these advantages have led to the devel- interior point methods [13, 139], opment of very efficient SCP algorithms with runtimes [140], augmented Lagrangian techniques [141], genetic low enough to enable real-time deployment for some ap- or evolutionary algorithms [142], and machine learning plications [63]. and neural networks [143], to name only a few. However, A fundamental dilemma of NLP optimization is that it is our view that SCP methods are fast, flexible and one can either compute locally optimal solutions quickly, efficient local optimization algorithms for trajectory or globally optimal solutions slowly. SCP techniques generation. They are a powerful tool to have in a are not immune to this trade-off, despite the fact that trajectory engineer’s toolbox, and they will be the focus certain subclasses of convex optimization can be viewed of this part of the article. as “easy” from a computational perspective due to the As the name suggests, at the core of SCP is the idea availability of interior point methods. Some of the afore- of iterative convex approximation. Most, if not all, SCP mentioned applications may favor the ability to com- algorithms for trajectory generation can be cast in the pute solutions quickly (i.e., in near real-time), such as form illustrated by Figure 11. Strictly speaking, SCP aerospace and power grid technologies. Others, such as methods are nonlinear local optimization algorithms. In economics and structural truss design, may favor global particular, the reader will recognize that SCP algorithms optimality and put less emphasis on solution time (al- are specialized trust region methods for continuous-time though early trade studies may still benefit from a fast optimal control problems [29, 47, 48]. optimization method). Given the motivation from the beginning of this article, our focus is on the former class All SCP methods solve a sequence of convex ap- of algorithms that provide locally optimal solutions in proximations, called subproblems, to the original near real-time. nonconvex problem and update the approximation as new solutions are obtained. Going around the loop This part of the article will provide an overview of the of Figure 11, all algorithms start with a user-supplied algorithmic design choices and assumptions that lead to initial guess, which can be very coarse (more on this effective SCP implementations. The core tradeoffs in- later). At 1 , the SCP algorithm has available a so- clude how the convex approximations are formulated, called reference trajectory, which may be infeasible with what structure is devised for updating the solutions, how respect to the problem dynamics and constraints. The progress towards a solution is measured, and how all nonconvexities of the problem are removed by a local of the above enables theoretical convergence and per- linearization around the reference trajectory, while con- formance guarantees. vex elements are kept unchanged. Well-designed SCP We hope that the reader comes away with the following algorithms add extra features to the problem in order view of SCP: it is an effective and flexible way to do to maintain subproblem feasibility after linearization. trajectory optimization, and inherits some but not all of The resulting convex continuous-time subproblem is the theoretical properties of convex optimization. SCP then temporally discretized to yield a finite-dimensional works well for complex problems, but it is definitely not convex optimization problem. The optimal solution to a panacea for all of nonconvex optimization. SCP can the discretized subproblem is computed at 2 , where the fail to find a solution, but usually a slight change to the SCP algorithm makes a call to any appropriate convex parameters recovers convergence. This part of the article optimization solver. The solution is tested at 3 against provides the reader with all the necessary insights to get stopping criteria. If the test passes, we say that the started with SCP. The numerical examples in Part III algorithm has converged, and the most recent solution provide a practical and open-source implementation of from 2 is returned. Otherwise, the solution is used to the algorithms herein. 23 Preprint

Algorithm start Convex optimizer

Convex subproblem creation Initial 2 trajectory T Handle artificial Solve 1 Linearize Temporally guess infeasibility and convex nonconvexities discretize F unboundedness subproblem Starting First iteration?

Update Fail Test trust region 3 Iteration Stopping Pass

Algorithm stop (converged)

FIGURE 11 Block diagram illustration of a typical SCP algorithm. Every SCP-based trajectory generation method is comprised of three major components: a way to guess the initial trajectory (Starting), an iteration scheme which refines the trajectory until it is feasible and locally optimal (Iteration), and an exit criterion to stop once the trajectory has been computed (Stopping). In a well-designed SCP scheme, the test (convergence) criterion is guaranteed to trigger, but the solution may be infeasible for the original problem.

Historical Development of SCP mentation of this idea, and it has been applied, among Tracing the origins of what we refer to as sequential con- other places, in the field of machine learning to support vex programming is not a simple task. Since the field vector machines and principal component analysis [156]. of nonlinear programming gained traction as a popular Perhaps the earliest and simplest class of SCP meth- discipline in the 1960s and 70s, many researchers have ods whose structure resembles that shown in Figure 11 explored the solution of nonconvex optimization prob- is, unsurprisingly, sequential linear programming (SLP). lems via convex approximations. This section attempts These algorithms linearize all nonlinear functions about to catalogue some of the key developments, with a focus a current reference solution so that each subproblem is on providing insight into how the field moved toward the a linear program. These linear programs are then solved present day version of sequential convex programming with a trust region to obtain a new reference, and the for trajectory generation. process is repeated. Early developments came from the The idea to solve a general (nonconvex) optimization petroleum industry and were intended to solve large- problem by iteratively approximating it as a convex scale problems [157, 158]. From a computational per- program was perhaps first developed using branch- spective, SLP was initially attractive due to the maturity and-bound techniques [144, 145, 146, 147]. Early of the simplex algorithm. Over time, however, solvers results were of mainly academic interest, and compu- for more general classes of convex optimization problems tationally tractable methods remained elusive. One have advanced to the point that restricting oneself to of the most important ideas that emerged from these linear programs to save computational resources at 2 in early investigations appears to be that of McCormick Figure 11 has become unnecessary, except perhaps for relaxations [148]. These are a set of atomic rules for very large-scale problems. constructing convex/concave relaxations of a specific Another important class of SCP methods is that of class of functions that everywhere under-/over-estimate sequential quadratic programming (SQP). The works the original functions. These rules result in a class of of Han [159, 160], Powell [161, 162, 163], Boggs and SCP methods, and algorithms based on McCormick Tolle [164, 165, 166], and Fukushima [167] appear to relaxations continue to be developed with increasing have exerted significant influence on the early devel- computational capabilities [149, 150, 151, 152]. opments of SQP-type algorithms, and their impact Difference-of-convex programming is a related class of remains evident today. An excellent survey was written SCP methods [153, 154]. These types of algorithms rely by Boggs and Tolle [168] and an exhaustive monograph on the formulation of nonconvex constraints as the dif- is available by Conn, Gould, and Toint [47]. SQP ference between two convex functions, say f = f1 f2, methods approximate a nonconvex program with a − where both f1 and f2 are convex functions. The ad- quadratic program using some reference solution, and vantage of this decomposition is that only the function then use the solution to this quadratic program to f2 needs to be linearized in order to approximate the update the approximation. Byrd, Schnabel, and Schultz nonconvex function f. The convex-concave procedure provide a general theory for inexact SQP methods presented in [155] is one example of a successful imple- [169]. The proliferation of SQP-type algorithms can 24 Preprint

Because SCP is agnostic to the choice of subproblem optimizer, well-tested algorithms can be used, which is attractive for safety-critical applications.

be attributed to three main aspects: 1) their similarity eralize the idea of SQP in order to deal with exactly this with the familiar class of Newton methods, 2) the fact limitation. Semidefinite programs are the most general that the initial reference need not be feasible, and 3) class of convex optimization problems for which efficient the existence of algorithms to quickly and reliably solve off-the-shelf solvers are available. Fares et al. introduced quadratic programs. In fact, the iterates obtained by sequential semidefinite programming [175], which uses SQP algorithms can be interpreted either as solutions matrix variables that are subject to definiteness con- to quadratic programs or as the application of Newton’s straints. Such algorithms find application most com- method to the optimality conditions of the original monly in robust control, where problems are formulated problem [29]. SQP algorithms are arguably the most as (nonconvex) semidefinite programs with linear and mature class of SCP methods [170], and modern de- bilinear matrix inequalities [176]. Recent examples have velopments continue to address both theoretical and appeared for robust planetary rocket landing [177, 178]. applied aspects [171, 172]. We can view sequential semidefinite programming as the Their long history of successful deployment in NLP furthest possible generalization of SLP to the idea of ex- solvers notwithstanding [170], SQP methods do come ploiting existing convexity in the subproblems. with several drawbacks. Gill and Wong nicely sum- This article focuses on the class of SCP methods that marize the difficulties that can arise when using SQP solve a general convex program at each iteration, with- methods [173], and we will only outline the basic out a priori restriction to one of the previously mentioned ideas here. Most importantly (and this goes for any classes of convex programs (e.g., LPs, QPs, SOCPs, and “second-order” method), it is difficult to accurately and SDPs). This class of SCP methods has been developed reliably estimate the Hessian of the nonconvex program’s largely over the last decade, and represents the most Lagrangian. Even if this is done, say, analytically, there active area of current development, with successful ap- is no guarantee that it will be positive semidefinite, and plications in robot and spacecraft trajectory optimiza- an indefinite Hessian results in an NP-hard nonconvex tion [6, 53, 59, 62, 63, 121, 122, 179, 180, 181, 182]. We quadratic program. Hessian approximation techniques focus, in particular, on two specific algorithms within must therefore be used, such as keeping only the positive this class of SCP methods: SCvx and GuSTO. These semidefinite part or using the BFGS update [174]. In two algorithms are complementary in a number of ways, the latter case, additional conditions must be met to and enjoy favorable theoretical guarantees. On the one ensure that the Hessian remains positive-definite. These hand, the theoretical analysis of SCvx works with the impose both theoretical and computational challenges temporally discretized problem and provides guarantees which, if unaddressed, can both impede convergence in terms of the Karush-Kuhn-Tucker (KKT) optimality and curtail the real-time applicability of an SQP-type conditions [49, 64, 183]. On the other hand, GuSTO is algorithm. Fortunately, a great deal of effort has gone analyzed for the continuous-time problem and provides into making SQP algorithms highly practical, resulting theoretical guarantees in terms of the Pontryagin maxi- in mature algorithm packages like SNOPT [170]. mum principle [11, 12, 50, 69]. The numerical examples One truly insurmountable drawback of SQP methods in Part III of this article are solved using both SCvx and for trajectory generation in particular is that quadratic GuSTO exactly as they are presented here. These ex- programs require all constraints to be affine in the solu- amples illustrate that the methods are, to some degree, tion variable. Alas, many motion planning problems are interchangeable. naturally subject to non-affine convex constraints. We Typically, although not necessarily, the convex solver have already seen an example of a second-order cone con- used at 2 in Figure 11 is based on an interior point straint that arises from a spacecraft glideslope require- method [31, 139]. This leads to a nice interpretation of ment in “Landing Glideslope as an Affine State Con- SCP as the “next layer up” in a hierarchy of optimiza- straint”, shown in Figure5. For problems with non- tion algorithms described in [26, Chapter 11], and which affine convex constraints, the use of an SQP algorithm we illustrate in Figure 12. In the bottommost layer, we may require more iterations to converge compared to a have the unconstrained Newton’s method, which solves more general SCP algorithm, leading to a reduction in a sequence of unconstrained QPs. The next layer solves computational efficiency. Moreover, each SQP iterate is linear equality constrained convex problems. This again not guaranteed to be feasible with respect to the original uses Newton’s method, but with a more complicated step convex constraints, whereas the SCP iterates will be. computation. The third layer is the IPM family of meth- There are several classes of SCP algorithms that gen- ods, which solve a convex problem with linear equality 25 Preprint

where x( ) Rn is the state trajectory, u( ) Rm is the · ∈ · ∈ control trajectory, and p Rd is a vector of parameters. 4 SCP n∈ m d n The function f : R R R R R represents the (nonlinear) dynamics,× × which× are assumed→ to be at 3 IPM least once continuously differentiable. Initial and termi- Algorithm complexity nal boundary conditions are enforced by using the con- n d nic tinuously differentiable functions gic : Equality constrained R R R 2 n d ntc × → Newton and gtc : R R R . We separate convex and non- convex path× (i.e., state→ and control) constraints by using 1 Unconstrained Newton the convex sets (t) and (t) to represent convex path constraints, andX the continuouslyU differentiable function s : R Rn Rm Rd Rns to represent nonconvex path constraints.× × It× is assumed→ that the sets (t) and (t) FIGURE 12 Sequential convex programming can be placed atop a are compact (i.e., closed and bounded). ThisX amountsU to hierarchy of classical optimization algorithms. In this illustration, the saying that the vehicle cannot escape to infinity or apply “width” of each layer is representative of the corresponding algorithm’s implementation and runtime complexity (to be used only as an intuitive infinite control action, which is obviously reasonable for guide). Each layer embeds within itself the algorithms from the layers all practical applications. Finally, note that Problem 39 below it. is defined on the [0, 1] time interval, and the constraints (39b)-(39e) have to hold at each time instant. We highlight that the parameter vector p can be used, and convex inequality constraints as a sequence of linear among other things, to capture free initial and/or free fi- equality constrained problems. Thus, we may think of nal time problems by making t0 and tf elements of p. In IPMs as iteratively calling the algorithm in layer 2 of particular, an appropriate scaling of time can transform Figure 12. Analogously, SCP solves a nonconvex prob- the [0, 1] time interval in Problem 39 into a [t0, tf ] inter- lem as a sequence of convex problems with linear equality val. This allows us to restrict the problem statement to and convex inequality constraints. Thus, SCP iteratively the [0, 1] time interval without loss of generality [27]. We calls an IPM algorithm from layer 3 . Numerical expe- make use of this transformation in the numerical exam- rience has shown that for most problems, IPMs require ples at the end of the article. on the order of tens of iterations (i.e., calls to layer 2 ) Hybrid systems like bouncing balls, colliding objects, to converge [26]. Similarly, our experience has been that and bipedal robots require integer variables in their op- SCP requires on the order of tens of iterations (i.e., calls timization models. The integer variable type, however, to layer 3 ) to converge. is missing from Problem 39. Nevertheless, methods exist The rest of this part of the article is organized as fol- to embed integer variables into the continuous-variable lows. We first state the general continuous-time optimal formulation. Among these methods are state-triggered control problem that we wish to solve, and discuss the constraints [59, 61, 62, 65, 66, 112, 184], and homotopy common algorithmic underpinnings of the SCP frame- techniques such as the relaxed autonomously switched work. We then describe the SCvx and GuSTO algo- hybrid system and composite smooth control [6, 185, rithms in full detail. At the end of Part II, we compare 186]. We shall therefore move forward using Problem 39 SCvx and GuSTO and give some advice on using SCP “without loss of generality”, keeping in mind that there in the real world. Part III will present two numerical are methods to embed integer solution variables exactly experiments that provide practical insight and highlight or as an arbitrarily accurate approximation [6]. the capabilities of each algorithm. We also take this opportunity to note that Problem 39 is not the most general optimal control problem that Problem Formulation SCP methods can solve. However, it is general enough The goal of SCP methods is to solve continuous-time for the introductory purpose of this article, and can al- optimal control problems of the following form: ready cover the vast majority of trajectory optimization problems [6]. The numerical implementation attached to min J(x, u, p) (39a) this article (see Figure2) was applied to solve problems u,p ranging from quadrotor trajectory generation to space- s.t.x ˙(t) = f t, x(t), u(t), p , (39b) craft rendezvous and docking [66, 112]. x(t), p (t), (39c) The cost function in (39a) is assumed to be of the ∈ X Bolza form [187]: u(t), p (t), (39d)  ∈ U s t, x(t), u(t), p 0, (39e) 1  ≤ J(x, u, p) = φ(x(1), p) + Γ(x(t), u(t), p) dt, (40) gic x(0), p = 0, (39f) Z0 g x(1), p = 0, (39g) n d tc  where the terminal cost φ : R R R is assumed × →  26 Preprint

n to be a convex function and the running cost Γ : R We begin by fixing the initial and final states xic and m d × R R R can be in general a nonconvex function. xtc that represent either single-point boundary condi- Note× that→ convexity assumptions on φ are without loss of tions or points in a desired initial and terminal set de- generality. For example, a nonconvex φ can be replaced fined by (39f) and (39g). The state trajectory is then by a linear terminal cost τf (where τf becomes an element defined as a linear interpolation between the two end- of p), and a nonconvex terminal boundary condition is points: added to the definition of gtc in (39g): x¯(t) = (1 t)xic + txtc, for t [0, 1]. (42) − ∈ φ(x(1), p) = τf . (41) If a component of the state is a non-additive quantity, SCP Algorithm Foundations such as a unit quaternion, then linear interpolation is All SCP methods work by solving a sequence of local not the most astute choice. In such cases, we opt for convex approximations to Problem 39, which we call sub- the simplest alternative to linear interpolation. For unit problems. As shown in Figure 11, this requires having quaternions, this would be spherical linear interpolation access to an existing reference trajectory at location 1 [188]. of the figure. We will call this a reference solution, Whenever possible, we select the initial input trajec- with the understanding that this trajectory need not be tory based on insight from the physics of the problem. a feasible solution to the problem (neither for the dynam- For example, for an aerial vehicle we would choose an ics nor for the constraints). SCP methods update this input that opposes the pull of gravity. In the case of reference solution after each passage around the loop of a rocket, the choice can be uic = mwetg and utc = − I Figure 11, with the solution obtained at 2 becoming the mdryg , where mwet and mdry are the initial and esti- mated− finalI masses of the vehicle, and g is the inertial reference for the next iteration. This begs the question: I where does the reference solution for the first iteration gravity vector. If the problem structure does not readily come from? admit a physics-based choice of control input, our go-to approach is to set the input to the smallest feasible value Initial Trajectory Guess that is compliant with (39d). The intuition is that small A user-supplied initial trajectory guess is responsible for inputs are often associated with a small cost (39a). In providing the first SCP iteration with a reference solu- any case, the initial control solution is interpolated using 1 a similar expression to (42): tion. Henceforth, the notation x¯(t),u ¯(t),p ¯ 0 shall de- note a reference trajectory on the{ time interval} [0, 1]. u¯(t) = (1 t)uic + tutc, for t [0, 1]. (43) We will see in the following sections that the SCP al- − ∈ gorithms that we discuss, SCvx and GuSTO, are guar- The initial guess forp ¯ can have a significant impact anteed to converge almost regardless of the initial tra- on the number of SCP iterations required to obtain a jectory guess. In particular, this guess can be grossly solution. For example, ifp ¯ represents the final time of a infeasible with respect to both the dynamics (39b) and free final time problem that evolves on the [0, tf ] inter- the nonconvex constraints (39e)-(39g). However, the al- val, then it is best to guess a time dilation value that is gorithms do require the guess to be feasible with respect reasonable for the expected trajectory. Since parameters to the convex path constraints (39c)-(39d). Assuring are inherently problem specific, however, it is unlikely this is almost always an easy task, either by manually that any generic rule of thumb akin to (42) and (43) will constructing a simplistic solution that respects the con- prove reliable. Fortunately, since SCP runtime is usually vex constraints, or by projecting an infeasible guess onto on the order of a few seconds or less, the user can ex- the (t) and (t) sets. For reference, both strategies are X U periment with different initial guesses forp ¯ and come up implemented in our open-source code linked in Figure2. with a good initialization strategy relatively quickly. Numerical experience has shown that both SCvx and For all but the simplest problems, the initial guess GuSTO are extremely adept at morphing coarse initial 1 x¯(t),u ¯(t),p ¯ 0 constructed above is going to be (highly) guesses into feasible and locally optimal trajectories. {infeasible with} respect to the dynamics and constraints This represents a significant algorithmic benefit, since of Problem 39. Nevertheless, SCP methods like SCvx most traditional methods, like SQP and NLP, require and GuSTO can, and often do, converge to usable tra- good (or even feasible) initial guesses, which can be very jectories using such a coarse initial guess. However, this hard to come by [27, 28]. does not relieve the user entirely from choosing an initial To give the reader a sense for what kind of initial guess guess that exploits the salient features of their particu- can be provided, we present an initialization method lar problem. A well-chosen initial guess will (likely) have called straight-line interpolation. We have observed the following three benefits for the solution process: that this technique works well for a wide variety of prob- lems, and we use it in the numerical examples at the end It will reduce the number of iterations and the time of this article. However, we stress that this is merely a required to converge. This is almost always a driving rule of thumb and not a rigorously derived technique. objective in the design of a trajectory optimization 27 Preprint

algorithm since fast convergence is not only a wel- r0(t) , s(t,x ¯(t),u ¯(t),p ¯) Cx¯(t) Du¯(t) Gp¯, (44h) come feature but also a hard requirement for onboard − − − H0 , xgic(¯x(0),p ¯), (44i) implementation in an autonomous system; ∇ K0 pgic(¯x(0),p ¯), (44j) , ∇ It will encourage the converged solution to be feasi- `0 gic(¯x(0),p ¯) H0x¯(0) K0p¯, (44k) ble for Problem 39. As mentioned, SCP methods like , − − Hf xgtc(¯x(1),p ¯), (44l) SCvx and GuSTO will always converge to a trajec- , ∇ tory, but without a guarantee that the trajectory will Kf , pgtc(¯x(1),p ¯), (44m) be feasible for the original problem. The fact that ∇ `f gtc(¯x(1),p ¯) Hf x¯(1) Kf p¯. (44n) the solution often is feasible with respect to Prob- , − − lem 39 is a remarkable “observation” that researchers and engineers have made, and it is a driving reason These matrices can be used to write down the first- for the modern interest in SCP methods. Neverthe- order Taylor series approximations for each of f, s, gic, less, an observation is not a proof, and there are limits and gtc. Note that we will not linearize the cost function to how bad an initial guess can be. The only rule of (40) at this point, since SCvx and GuSTO make different thumb that is always valid is that one should embed assumptions about its particular form. Convexification as much problem knowledge as possible in the initial of the cost function will be tackled separately in later guess; sections on SCvx and GuSTO. Using the terms in (44), we obtain the following ap- A better initial guess may also improve the converged proximation of Problem 39 about the reference trajec- trajectory’s optimality. However, the level of optimal- tory: ity is usually difficult to measure, because a globally optimal solution is rarely available for the kinds of dif- min J(x, u, p) (45a) u,p ficult trajectory problems that we are concerned with using SCP. Nevertheless, some attempts to character- s.t.x ˙(t) = A(t)x(t) + B(t)u(t) + F (t)p + r(t), (45b) ize the optimality level have been made in recent years x(t), p (t), (45c) ∈ X [63, 189]. u(t), p (t), (45d)  ∈ U C(t)x(t) + D(t)u(t) + G(t)p + r0(t) 0, (45e) Linearization  ≤ Let us now place ourselves at location 1 in Figure 11, H0x(0) + K0p + `0 = 0, (45f) and imagine that the algorithm is at some iteration dur- Hf x(1) + Kf p + `f = 0. (45g) ing the SCP solution process. The first task in the way of constructing a convex subproblem is to remove the Problem 45 is convex in the constraints and poten- nonconvexities of Problem 39. For this purpose, recall tially nonconvex in the cost. Note that the convex path that the algorithm has access to the reference trajec- constraints in (39c)-(39d) are kept without approxima- 1 tory x¯(t),u ¯(t),p ¯ 0. If we replace every nonconvexity by tion. This is a key advantage of SCP over methods like { } its first-order approximation around the reference trajec- SLP and SQP, as was discussed in the previous section tory, then we are guaranteed to generate convex subprob- on SCP history. lems. Furthermore, these are computationally inexpen- Because the control trajectory u( ) belongs to an infi- sive to compute relative to second-order approximations nite-dimensional vector space of continuous-time· func- (i.e., those involving Hessian matrices). As we alluded in tions, Problem 45 cannot be implemented and solved the previous section on SCP history, linearization of all numerically on a digital computer. To do so, we must nonconvex elements is not the only choice for SCP – it is consider a finite-dimensional representation of the con- simply a very common one, and we take it for this article. trol function u(t), which can be obtained via temporal To formulate the linearized nonconvex terms, the follow- discretization or direct collocation [27, 190]. These repre- ing Jacobians must be computed (the time argument is sentations turn the original infinite-dimensional optimal omitted where necessary to keep the notation short): control problem into a finite-dimensional parameter op- timization problem that can be solved on a digital com- A(t) xf(t,x ¯(t),u ¯(t),p ¯), (44a) , ∇ puter. B(t) , uf(t,x ¯(t),u ¯(t),p ¯), (44b) In general, and rather unsurprisingly, solutions to ∇ discretized problems are only approximately optimal F (t) pf(t,x ¯(t),u ¯(t),p ¯), (44c) , ∇ and feasible with respect to the original problem. r(t) f(t,x ¯(t),u ¯(t),p ¯) Ax¯(t) Bu¯(t) F p¯, (44d) , − − − In particular, a discrete-time control signal has fewer degrees of freedom than its continuous-time counterpart. C(t) , xs(t,x ¯(t),u ¯(t),p ¯), (44e) ∇ Therefore, it may lack the flexibility required to exactly D(t) us(t,x ¯(t),u ¯(t),p ¯), (44f) , ∇ match the true continuous-time optimal control signal. G(t) ps(t,x ¯(t),u ¯(t),p ¯), (44g) By adding more temporal nodes, the approximation can , ∇ 28 Preprint become arbitrarily accurate, albeit at the expense of whether to shrink, grow, or maintain the trust region problem size and computation time. radius. SCP methods can be differentiated by how they Another problem with discretization is that the path update the trust region, and so the trust region update constraints are usually enforced only at the discrete tem- will be discussed separately for SCvx and GuSTO in the poral nodes, and not over the entire time horizon. This following sections. can lead to (typically mild) constraint violation between Figure 13 shows a two-dimensional toy problem that the discrete-time nodes, although some techniques exist exemplifies a single iteration of an SCP convergence pro- to remedy this artifact [191, 192]. cess. In this example, the “original problem” consists The bad news notwithstanding, there are well estab- of one parabolic (nonconvex) equality constraint (blue), lished discretization methods that ensure exact satisfac- a convex equality constraint (green), and a convex half- tion of the original continuous-time nonlinear dynam- space inequality constraint (feasible to the left of the ics (39b). Thus, the discretized solution can still pro- vertical red dashed line). The original problem is ap- duce strictly dynamically feasible continuous-time tra- proximated about the reference solutionz ¯, resulting in jectories. We refer the reader to [6, 59, 62, 190] for the blue dash-dot equality constraint and the same con- detailed explanations of discretization methods that en- vex equality and inequality constraints. The trust region sure exact satisfaction of the continuous-time nonlinear is shown as the red circle, and represents the region in dynamics. An introduction to the technique that we will which the SCP algorithm has deemed the convex approx- use for the numerical examples at the end of this article is imation to be valid. Evidently, if the new solution z de- given in “Discretizing Continuous-time Optimal Control viates too much fromz ¯, the linear approximation of the Problems”. parabola becomes poor. Moreover, had the green equal- Our systematic linearization of all nonconvex elements ity constraint been removed, removal of the trust region has ensured that Problem 45 is convex in the constraints, would lead to artificial unboundedness, as the cost could which is good news. However, linearization unsurpris- be decreased indefinitely. ingly has a price. We have inadvertently introduced two Clearly, there is another problem with the lineariza- artifacts that must be addressed: artificial unbounded- tion in Figure 13 – the resulting subproblem is infeasi- ness and artificial infeasibility. ble. This is because the green and blue dash-dot equality constraints do not intersect inside the set defined by the Artificial Unboundedness trust region and the convex inequality constraint halfs- Linear approximations are only accurate in a neigh- pace. This issue is known as artificial infeasibility. 1 borhood around the reference solution x¯(t),u ¯(t),p ¯ 0. Thus, for each t [0, 1], the subproblem{ solution must} Artificial Infeasibility be kept “sufficiently∈ close” to the linearization point Linearization can make the resulting subproblem infea- defined by the reference solution. Another reason to sible. Two independent cases can arise wherein the con- not deviate too far from the reference is that, in certain straints imposed in Problem 45 are inconsistent (i.e., no malicious cases, linearization can render the solution un- feasible solution exists), even though the original con- bounded below (i.e., the convex cost (45a) can be driven straints admit a non-empty feasible set: to negative infinity). We refer to this phenomenon as In the first case, the intersection of the convexified artificial unboundedness. To mitigate this problem path constraints may be empty. This occurs in the and to quantify the meaning of “sufficiently close”, we example of Figure 13, where no feasible solution ex- add the following trust region constraint: ists because the linearized constraints (green and blue δx(t) = x(t) x¯(t), dash-dot) do not intersect to the left of the red in- − equality constraint; δu(t) = u(t) u¯(t), − δp = p p¯, In the second case, the trust region may be so small − that it restricts the solution variables to a part of the αx δx(t) q + αu δu(t) q+ k k k k solution space that is outside of the feasible set. In αp δp q η, for t [0, 1]. (46) k k ≤ ∈ other words, the intersection of the trust region with the (non-empty) feasible region of the convexified con- for some choice of q 1, 2, 2+, and constants ∈ { ∞}+ straints may itself be empty. This would have been αx, αu, αp 0, 1 . We use q = 2 to denote the ∈ { } the case in Figure 13 if the green and blue dash-dot ability to impose the trust region as the quadratic lines were to intersect outside of the red trust region two-norm squared. The trust region radius η is a fixed circle, but to the left of the halfspace inequality (for scalar that is updated between SCP iterations (i.e., example, ifz ¯ was slightly further to the right). passages around the loop in Figure 11). The update rule associated with the trust region measures how well The occurence of either case would prevent SCP from the linearization approximates the original nonconvex finding a new reference solution at 2 in Figure 11. Thus, elements at each iterate. This informs the algorithm the solution loop cannot continue, and the algorithm will 29 Preprint

Discretizing Continuous-time Optimal Control Problems uppose that we have a continuous-time linear time- trajectories using a basis of so-called Lagrange interpolat- S varying dynamical system governed by the ordinary ing polynomials, over a non-uniform temporal grid defined differential equation (45b). To temporally discretize this sys- by the roots of an orthogonal polynomial [28, S5, S6, S7]. tem, we choose a set of N temporal nodes: In contrast, interpolating polynomial methods only approx- imate the control signal. This enables the exact solution 0 = t1 < t2 < < tN = 1, (S27) ··· of the LTV system (45b) over each time interval, yielding which do not need to be evenly spaced. The objective of an “exact” discrete-time representation [59]. Here, exact any discretization method is to represent the state trajec- means that the continuous- and discrete-time states match n tory x( ) : [0, 1] R at each of these temporal nodes as at the temporal nodes. A key attribute of interpolating poly- · → a function of the state, control, and parameters at the tem- nomial discretization is that upon convergence, an SCP so- poral nodes. Mathematically, this transforms the ODE (45b) lution satisfying the discretized dynamics (S32a) will exactly into a linear time-varying difference equation. satisfy the original nonlinear differential equation (39b). There are countless ways to achieve this objective, As an example, we shall now present a discretization and [190] provides a comparative study of several such approach from the class of interpolating polynomial meth- methods for motion planning problems. When choosing a ods. This method, called first-order hold (FOH) interpola- method, it is important to remember that not all discretiza- tion, provides a suitable balance between optimality, feasi- tions perform equally well. Coarse discretization techniques bility, and computation time. We have found it to work well can render a convex subproblem infeasible even if the orig- for many problems [37, 59, 190]. inal optimal control problem is feasible. Moreover, different m In FOH, the control signal u( ) : [0, 1] R is con- · → discretization methods will change the sparsity properties strained to be a continuous and piecewise affine function of of the resulting convex subproblem, impacting the compu- N time. By defining the inputs at the temporal nodes, uk , { }k=1 tational requirements for solving each subproblem at stage the continuous-time signal is obtained by linear interpolation 2 in Figure 11. inside each time interval [tk , tk+1]: In practice, two methods appear to be the most appro- priate for SCP-based trajectory generation: pseudospectral tk+1 t t tk u(t) = − uk + − uk+1, tk+1 tk tk+1 tk and interpolating polynomial methods. Pseudospectral dis- − − cretization has gained popularity after its original introduc- , σ−(t)uk + σ+(t)uk+1. (S28) tion to the optimal control community by Vlassenbroeck [S3, S4]. This method approximates both the state and control (continued...)

fail. As a result, even if the original problem admits a The ultimate result of both strategies is that subproblem feasible solution (shown as the black square in Figure 13), is guaranteed to be feasible. either of the two aforementioned scenarios would prevent Because the virtual control is a new term that we add SCP from finding it. We refer to this phenomenon as to the problem, it follows that the converged solution artificial infeasibility. must have a zero virtual control in order to be feasible Artificial infeasibility in sequential convex program- and physically meaningful with respect to the original ming was recognized early in the development of SCP problem. If, instead, the second strategy is used and algorithms [158, 161]. Two equivalent strategies exist constraint violations are penalized in the cost, the con- to counteract this issue. One approach adds an uncon- verged solution must not violate the constraints (i.e., the strained, but penalized, slack variable to each linearized penalty terms should be zero). Intuitively, if the con- constraint. This variable is sometimes called a virtual verged solution uses a non-zero virtual control or has a control when applied to the dynamics constraint (39b), non-zero constraint violation penalty, then it is not a and a virtual buffer when applied to other constraints solution of the original optimal control problem. [49]. To keep our language succinct, we will use virtual The fact that trajectories can converge to an infeasible control in both cases. solution is one of the salient limitations of SCP. However, The second approach penalizes constraint violations it is not unlike the drawback of any NLP optimization by augmenting the original cost function with soft method, which may fail to find a solution entirely even penalty terms (45a). When the same functions and if one exists. When SCP converges to an infeasible so- weights are used as to penalize the virtual control terms, lution, we call it a “soft” failure, since usually only a this strategy results in the same optimality conditions. few virtual control terms are non-zero. A soft failure 30 Preprint

Z tk+1 (continued...) + 1 Bk = Ak Φ(τ, tk )− B(τ)σ+(τ) dτ, (S32d) The dynamics (45b) can then be expressed for t [tk , tk+1] tk ∈ Z tk+1 as: 1 Fk = Ak Φ(τ, tk )− F (τ) dτ, (S32e) tk t ˙x(t) = A(t)x(t) + B(t)σ (t)uk + Z k+1 − 1 rk = Ak Φ(τ, tk )− r(τ) dτ. (S32f) B(t)σ+(t)uk+1 + F (t)p + r(t), (S29) tk

which is again an LTV system, except for the fact that the In implementation, the state transition matrix, the inte- control is no longer a continuous-time function u( ), but in- grals (S32c)-(S32f), and the reference trajectory’s state ¯x( ) · m · stead it is fully determined by the two vectors uk , uk+1 R . ∈ are computed simultaneously over each interval [tk , tk+1]. A standard result from linear systems theory is that the This procedure is explained in more detail in [190], and unique solution of (S29) is given by [95]: pseudocode can be found in [63].

Z t Ultimately, the update equation (S32a) is used to write  x(t) = Φ(t, tk )x(tk ) + Φ(t, τ) B(τ)σ−(τ)uk N 1 affine equality constraints for k = 1, ... , N 1, which tk − −  represent the discretized dynamic feasibility constraint on + B(τ)σ+(τ)u + F (τ)p + r(τ) dτ, (S30) k+1 the trajectory. Equation (55b) provides an example embed-

where Φ(t, tk ) is the state transition matrix that satisfies the ding of such constraints into an optimization problem. following matrix differential equation: REFERENCES Φ˙ (t, tk ) = A(t)Φ(t, tk ), Φ(tk , tk ) = In. (S31) [S3] Jacques Vlassenbroeck. “A chebyshev polynomial method for optimal control with state constraints”. In: Automatica 24.4 (1988), pp. 499–506. DOI: 10.1016/0005-1098(88)90094-5. By setting t = tk+1, (S30) allows us to write a linear [S4] Jacques Vlassenbroeck and R. Van Dooren. “A Chebyshev technique time-varying difference equation that updates the discrete- for solving nonlinear optimal control problems”. In: IEEE Transactions on Au- time state xk = x(tk ) to the next discrete-time state xk+1 = tomatic Control 33.4 (1988), pp. 333–340. DOI: 10.1109/9.192187. [S5] Carl de Boor and Blair Swartz. “Collocation at Gaussian Points”. In: x(tk+1): SIAM Journal on Numerical Analysis 10.4 (Sept. 1973), pp. 582–606. DOI: 10.1137/0710052. x = A x + B −u + B +u + F p + r , (S32a) k+1 k k k k k k+1 k k [S6] Anil V. Rao. “A Survey of Numerical Methods for Optimal Control”. In: Advances in the Astronautical Sciences 135 (2010), pp. 497–528. Ak = Φ(tk+1, tk ), (S32b) [S7] I. Michael Ross and Mark Karpenko. “A review of pseudospectral opti- Z tk+1 − 1 mal control: From theory to flight”. In: Annual Reviews in Control 36.2 (Dec. B = Ak Φ(τ, tk )− B(τ)σ−(τ) dτ, (S32c) k 2012), pp. 182–197. DOI: 10.1016/j.arcontrol.2012.09.002. tk

carries important information, since the temporal nodes The terminal and running costs in (40) are both as- and constraints with non-zero virtual control hint at how sumed to be convex functions. We already mentioned and where the solution is infeasible. Usually, relatively that this is without loss of generality for the terminal mild tuning of the algorithm parameters or the problem cost, and the same reasoning applies for the running definition will recover convergence to a feasible solution. cost. Any nonconvex term in the cost can be offloaded In relation to the optimization literature at large, the into the constraints, and an example was given in (41); soft failure exhibited by SCP is related to one-norm reg- ularization, lasso regression, and basis pursuit, used to To handle artificial unboundedness, SCvx enforces find sparse approximate solutions [26, Section 11.4.1]. (46) as a hard constraint. While several choices are The specific algorithmic choices made to address arti- possible [49, 183], this article uses αx = αu = αp = 1. ficial unboundedness (e.g., selection of the trust region The trust region radius η is adjusted at each iteration radius update rule) and artificial infeasibility (e.g., vir- by an update rule, which we discuss below; tual control versus constraint penalization) lead to SCP To handle artificial infeasibility, SCvx uses virtual algorithms with different characteristics. The next two control terms. sections review two such methods, SCvx and GuSTO, and highlight their design choices and algorithmic prop- Let us begin by describing how SCvx uses virtual con- erties. To facilitate a concise presentation, we henceforth trol to handle artificial infeasibility. This is done by aug- suppress the time argument t whenever possible. menting the linear approximations (45b) and (45e)-(45g) as follows: The SCvx Algorithm In light of the above discussion, the SCvx algorithm x˙ = Ax + Bu + F p + r + Eν, (47a) makes the following algorithmic choices: νs Cx + Du + Gp + r0, (47b) ≥ 31 Preprint

where y and z are placeholder arguments. The cost func- z2 tion (45a) is augmented using the penalty function as f (z) z z2 = 0 follows: 1 , 2 − 1 1

Jλ(x, u, p,ν ˆ) φλ(x(1), p, νic, νtc)+ f2(z) , z2 + 0.1z1 = 0.6 z z¯ 2 , k − k ≤ 1 0.5 η Γλ(x, u, p, Eν, νs) dt, (49) z¯ Z0 z 0.85 φλ(x(1), p, νic, νtc) = φ(x(1), p) + λP (0, νic) + λP (0, νtc), 1 ≤ Γλ(x, u, p, Eν, νs) = Γ(x, u, p) + λP Eν, νs . (50) z1 1 0.5 0.5 1  − − The positive weight λ R++ is selected by the user, and must be sufficiently large.∈ We shall make this state- 0.5 ment more precise later in the section on SCvx conver- − Decreasing gence guarantees. For now, we note that in practice it f (z¯) + f (z¯)T z z¯ = 0 Cost 1 ∇ 1 − is quite easy to find an appropriately large λ value by  selecting a power of ten. In general, λ can be a function 1 − of time, but we rarely do this in practice. We also point to an important notational feature of FIGURE 13 A two-dimensional nonconvex toy problem that exempli- (50), where we used Eν in place of ν for the argument list fies a convex subproblem obtained during an SCP iteration. In this of Γλ. This will help later on to highlight that the con- case, the cost function L(z) = z2 and level curves of the cost are shown as gray dashed lines. The− blue curve represents a nonconvex tinuous-time matrix E is substituted with its discrete- equality constraint, and its linearization is shown as the blue dash-dot time version after temporal discretization (see ahead in line. Another convex equality constraint is shown in green, and a con- (53)). vex inequality constraint is shown as the vertical red dashed line. The The continuous-time convex subproblem that is solved trust region is the red circle centered at the linearization point z¯, and has radius η. The optimal solution of the original (non-approximated) at each iteration of the SCvx algorithm can then be problem is shown as the black square. The convex subproblem is ar- stated formally as: tificially infeasible. Without the trust region and the green constraint, it would also be artifically unbounded. min Jλ(x, u, p,ν ˆ) (51a) u,p,ˆν s.t.x ˙ = Ax + Bu + F p + r + Eν, (51b) 0 = H0x(0) + K0p + `0 + νic, (47c) (x, p) ,(u, p) , (51c) 0 = Hf x(1) + Kf p + `f + νtc. (47d) ∈ X ∈ U Cx + Du + Gp + r0 νs, (51d) nν ns nic ≤ where ν( ) R , νs( ) R , νic R , and νtc · ∈ · ∈ ∈ ∈ H0x(0) + K0p + `0 + νic = 0, (51e) ntc are the virtual control terms. To keep notation R H x(1) + K p + ` + ν = 0, (51f) manageable, we will use the symbolν ˆ as a shorthand for f f f tc δx q + δu q + δp q η. (51g) the argument list (ν, νs, νic, νtc). k k k k k k ≤ The virtual control ν in (47a) can be viewed simply as another control variable that can be used to influence It was mentioned in the previous section that the state trajectory. Like the other virtual control terms, Problem 51 is not readily implementable on a com- we would like ν to be zero for any converged trajectory, puter because it is a continuous-time, and hence because it is a synthetic input that cannot be used in infinite-dimensional, optimization problem. To solve reality. Note that it is required that the pair (A, E) in the problem numerically, a temporal discretization is (47a) is controllable, which is easy to verify using any applied, such as the one discussed in “Discretizing of several available controllability tests [95]. A common Continuous-time Optimal Control Problems”. In par- ticular, we select a set of temporal nodes tk [0, 1] choice is E = In, in which case (A, E) is unconditionally ∈ controllable. for k = 1, ... , N and recast the subproblem as a parameter optimization problem in the (overloaded) The use of non-zero virtual control in the subprob- N N N 1 variables x = xk k=1, u = uk k=1, p, ν = νk k=1− , lem solution is discouraged by augmenting the cost func- N { } { } { } νs = νs,k , νic, and νtc. tion with a virtual control penalty term. Intuitively, this { }k=1 means that virtual control is used only when it is neces- Depending on the details of the discretization scheme, sary to avoid subproblem infeasibility. To this end, we there may be fewer decision variables than there are tem- n p poral nodes. In particular, for simplicity we will use a define a positive definite penalty function P : R R × → zeroth-order hold (ZOH) assumption (i.e., a piecewise R+ where p is any appropriate integer. The following choice is typical in practice: constant function) to discretize the dynamics virtual con- trol ν( ). This means that the virtual control takes the · P (y, z) = y 1 + z 1, (48) value ν(t) = νk inside each time interval [tk, tk+1), and k k k k 32 Preprint the discretization process works like any other interpolat- The most general interpolating polynomial discretiza- ing polynomial method from “Discretizing Continuous- tion fits into the following discrete-time dynamics con- time Optimal Control Problems”. Because this leaves straint:

νN undefined, we take νN = 0 for notational convenience N whenever it appears in future equations. j xk+1 = Akxk + Bkuj + Fkp + rk + Ekνk, The continuous-time cost function (49) can be dis- j=1 cretized using any appropriate method. Pseudospectral X methods, for example, specify a numerical quadrature where the j superscript indexes different input coefficient that must be used to discretize the integral [28]. For matrices. Other discretization methods lead to yet other simplicity, we shall assume that the time grid is uniform affine equality constraints, all of which simply replace (i.e., tk+1 tk = ∆t for all k = 1, ... , N 1) and that (55b)[190]. With this in mind, we will continue to write trapezoidal− numerical integration is used.− This allows us (55b) for simplicity. Furthermore, it is implicitly under- to write the discrete-time version of (49) as: stood that the constraints (55b)-(55d) and (55g) hold at each temporal node. N λ(x, u, p,ν ˆ) = φλ x(1), p, νic, νtc + trapz(Γλ ), (52) Because Problem 55 is a finite-dimensional convex op- L timization problem, it can be solved to global optimal- ΓN = Γ (x , u , p, E ν , ν ). (53) λ,k λ k k k k s,k ity using an off-the-shelf convex optimization solver [26]. N We shall denote the optimal solution by x∗ = xk∗ k=1, where trapezoidal integration is implemented by the N N 1 {N } N u∗ = u∗ , p∗, ν∗ = ν∗ − , ν∗ = ν∗ , ν∗ , function trapz : R R, defined as follows: { k}k=1 { k }k=1 s { s,k}k=1 ic → and νtc∗ . N 1 ∆t − trapz(z) zk + zk+1. (54) SCvx Update Rule , 2 k=1 At this point, we know how to take a nonconvex opti- X mal control problem like Problem 39 and: 1) convexify We call (52) the linear augmented cost function. it to Problem 45, 2) add a trust region (46) to avoid ar- This name is a slight misnomer, because (52) is in fact a tificial unboundedness, 3) add virtual control terms (47) general nonlinear convex function. However, we use the to avoid artificial infeasibility, 4) penalize virtual control “linear” qualifier to emphasize that the cost relates to usage in the cost (49), and 5) apply discretization to ob- the convex subproblem for which all nonconvexities have tain a finite-dimensional convex Problem 55. In fact, this been linearized. In particular, the virtual control terms gives us all of the necessary ingredients to go around the can be viewed as linear measurements of dynamic and loop in Figure 11, except for one thing: how to update nonconvex path and boundary constraint infeasibility. the trust region radius η in (46). In general, the trust Lastly, the constraints (51c), (51d), and (51g) are en- region changes after each pass around the loop. In this forced at the discrete temporal nodes tk for each k = section, we discuss this missing ingredient. 1, ... , N. In summary, the following discrete-time con- To begin, we define a linearization accuracy metric vex subproblem is solved at each SCvx iteration (i.e., called the defect: location 2 in Figure 11): δk xk+1 ψ(tk, tk+1, xk, u, p) (56) , − min λ(x, u, p,ν ˆ) (55a) x,u,p,ˆν L for k = 1, ... , N 1. The function ψ is called the flow map and its role− is to “propagate” the control input u s.t. xk+1 = Akxk + Bkuk + Fkp + rk + Ekνk, (55b) through the continuous-time nonlinear dynamics (39b), (xk, p) k,(uk, p) k, (55c) ∈ X ∈ U starting at state xk at time tk and evolving until the next Ckxk + Dkuk + Gkp + r0 νs,k, (55d) k ≤ temporal grid node tk+1 [193]. It is important that the H0x1 + K0p + `0 + νic = 0, (55e) flow map is implemented in a way that is consistent with the chosen discretization scheme, as defined below. Hf xN + Kf p + `f + νtc = 0, (55f)

δxk q + δuk q + δp q η. (55g) k k k k k k ≤ Definition 2 The flow map ψ in (56) is consistent with the discretiza- We want to clarify that (55b) is written as a shorthand tion used for Problem 55, if the following equation holds convenience for discretized dynamics, and is not rep- for all k = 1, ... , N 1: resentative of every possible discretization choice. For − example, (55b) is correct if ZOH discretization is used. ψ(tk, tk+1,x ¯k,u ¯,p ¯) = Akx¯k + Bku¯k + Fkp¯ + rk. (57) However, as specified in (S32a), FOH discretization would lead to the following constraint that replaces  (55b): There is an intuitive way to think about the consis- tency property of ψ. The reader may follow along us- − + xk+1 = Akxk + Bk uk + Bk uk+1 + Fkp + rk + Ekνk. ing the illustration in Figure 14. On the one hand, 33 Preprint

x Rn x Rn ∈ ∈ (t3, t, x3, u, p) x˜k+1 ψ 5 (tk, t, x¯k, u¯, p¯) 2 x5 xδ6 xˆ 3 ψ k+1 4 x x 1 δ 3 δ 4 δ x¯k+1 δ x2 x¯k x t 1 t

tk tk+1 0 = t1 t2 t3 t4 t5 t6 = tN = 1

FIGURE 14 Illustration of the flow map consistency property in Def- FIGURE 15 Illustration of defect calculation according to (56). The inition2. When the flow map is consistent with the discretization flow map computation restarts at each discrete-time node. At each scheme, the state x˜k+1 propagated through the flow map (yellow temporal node, the defect is computed as the difference between the circle) and the state xˆk+1 propagated through the discrete-time lin- discrete solution output by Problem 55 and the corresponding flow earized update equation (dashed green circle) match. When the ref- map value. erence trajectory is not dynamically feasible, it generally deviates from the flow map trajectory, hence x˜k+1 =x ¯k+1. 6 and the state obtained by using the flow map starting at time t . The defect has the following interpretation: a ψ(t , t ,x ¯ ,u ¯,p ¯) maps an initial statex ¯ through the k k k+1 k k non-zero defect indicates that the solution to the sub- continuous-time nonlinear dynamics (39b) to a new state problem is dynamically infeasible with respect to the x˜ . On the other hand, the right-hand side of (57) k+1 original nonconvex dynamics. For a dynamically fea- does the same, except that it uses the linearized and sible subproblem solution, the flow map trajectories in discretized dynamics and outputs a new statex ˆ . Be- k+1 Figure 15 coincide with the discrete-time states at the cause the linearization is being evaluated at the refer- discrete-time nodes. This is a direct consequence of the ence trajectory (i.e., at the linearization point), the lin- consistency property from Definition2. We shall see this earized continuous-time dynamics will yield the exact happen for the converged solutions of the numerical ex- same trajectory. Thus, the only difference between the amples presented in Part III of this article (e.g., see Fig- left- and right-hand sides of (57) is that the right-hand ures 23 and 31). side works in discrete-time. Consistency, then, simply We now know how to compute a consistent flow map means that propagating the continuous-time dynamics and how to use it to calculate defects using (56). We will yields the same point as performing the discrete-time up- now leverage defects to update the trust region radius in date (i.e.,x ˜ =x ˆ ). For every discretization method k+1 k+1 SCvx. First, define a nonlinear version of (52) as follows: that is used to construct Problem 55, there exists a con- sistent flow map. λ(x, u, p,ν ˆ) = φλ x(1), p, gic(x(0), p), gtc(x(1), p) + The reader is likely already familiar with flow maps, J N even if the term sounds new. Consider the following con- trapz(Γλ ), (60) crete examples. When using forward Euler discretiza- N Γλ,k = Γλ xk, uk, p, δk, tion, the corresponding consistent flow map is simply: + s(tk, xk, uk, p) . (61) ψ(tk, tk+1, xk, u, p) = xk + ∆tf(tk, xk, uk, p). (58) where the positive-part function [ ]+ returns  zero when When using an interpolating polynomial discretiza- its argument is negative, and otherwise· just returns the tion scheme like the one described in “Discretizing argument. A salient feature of (60) is that it evaluates Continuous-time Optimal Control Problems”, the the penalty function (48) based on how well the actual corresponding consistent flow map is the solution to the nonconvex constraints are satisfied. To do so, when com- dynamics (39b) obtained through numerical integration. pared to (52), Ekνk is replaced with the defect δk mea- In other words, the flow map satisfies the following suring dynamic infeasibility, νs,k is replaced with the ac- conditions at each time instant t [tk, tk+1]: tual nonconvex path constraints (39e), while ν and ν ∈ ic tc are replaced by the actual boundary conditions (39f) and ψ(tk, tk, xk, u, p) = xk, (59a) (39g). Because evaluation of the defect and the noncon-

ψ˙(tk, t, xk, u, p) = f t, ψ(tk, t, xk, u, p), u, p . (59b) vex path and boundary constraints is a nonlinear opera- tion, we call (60) the nonlinear augmented cost function. As illustrated in Figure 15, the defect (56) captures the The SCvx algorithm uses the linear augmented cost discrepancy between the next discrete-time state xk+1 (52) and the nonlinear augmented cost (60) to, roughly 34 Preprint

Case 1: < Case 2: [ , ) Case 3: [ , ) Case 4: 0 ∈ 0 1 ∈ 1 2 ≥ 2 max( , / ) max( , / ) min( , gr ) ← ρ 0ρ sh ← ρ 0 ρ ρsh ← ρ ρ ρ ← ρ 1 ρ x¯ x¯ x¯ x x¯ x x¯ x ← ← ∗ ← ∗ ← ∗ ηu¯ u¯ η η β ηu¯ u η η β ηu¯ ηu ηu¯ u η β η ← ← ∗ ← ∗ ← ∗ p¯ p¯ p¯ p p¯ p p¯ p ← ← ∗ ← ∗ ← ∗ R 0 1 2

ρ ρ ρ FIGURE 16 The SCvx trust region update rule. The accuracy metric ρ is defined in (62) and provides a measure of how accurately the convex subproblem given by Problem 55 describes the original Problem 39. Note that Case 1 actually rejects the solution to Problem 55 and shrinks the trust region before proceeding to the next iteration. In this case, the convex approximation is deemed so poor that it is unusable. speaking, measure the accuracy of the convex approxi- In fact, the denominator can be loosely interpreted as mation of Problem 39 by Problem 45. Using the refer- a lower bound prediction: the algorithm “thinks” that ence solution and the optimal solution to Problem 55, the actual cost improves by at least that much. We can SCvx defines the following scalar metric to measure con- thus adopt the following intuition based on seeing Prob- vexification accuracy: lem 55 as a local model of Problem 39. A small ρ value indicates that the model is inaccurate because the actual λ(¯x,u ¯,p ¯) λ(x∗, u∗, p∗) ρ , J − J . (62) cost decrease is much smaller than predicted. If ρ is close λ(¯x,u ¯,p ¯) λ(x , u , p , ν , νs ) J − L ∗ ∗ ∗ ∗ ∗ to unity, the model is accurate because the actual cost Let us carefully analyze the elements of (62). First of decrease is similar to the prediction. If the ρ value is all, the denominator is always nonnegative because the greater than unity, the model is “conservative” because following relation holds [49, Theorem 3.10]: it underestimates the cost reduction. As a result, large ρ values incentivize growing the trust region because the λ(¯x,u ¯,p ¯) λ(x∗, u∗, p∗, ν∗, νs∗). (63) model is trustworthy and we want to utilize more of it. J ≥ L On the other hand, small ρ values incentivize shrinking The proof of (63) works by constructing a feasible sub- the trust region in order to not “overstep” an inaccurate problem solution that matches the cost λ(¯x,u ¯,p ¯). Be- model [49]. gin by setting the state, input, and parameterJ values to 1 The SCvx update rule for the trust region radius η the reference solution x¯(t),u ¯(t),p ¯ 0. We now have to formalizes the above intuition. Using three user-defined choose the virtual controls{ that make} this solution feasi- scalars ρ0, ρ1, ρ2 (0, 1) that split the real number line ble for Problem 55 and that yield a matching cost (55a). into four parts, the∈ trust region radius and reference The consistency property from Definition2 ensures that trajectory are updated at the end of each SCvx itera- this is always possible to do. In particular, choose the tion according to Figure 16. The user-defined constants dynamics virtual control to match the defects, and the βsh, βgr > 1 are the trust region shrink and growth rates, other virtual controls to match the constraint values at respectively. Practical implemenations of SCvx also let the reference solution. This represents a feasible solu- the user define a minimum and a maximum trust region tion of Problem 55 whose cost equals λ(¯x,u ¯,p ¯). The radius by using η0, η1 > 0. optimal cost for the subproblem cannotJ be worse, so the The choice of the user-defined scalars ρ0, ρ1, and ρ2 inequality (63) follows. greatly influences the algorithm runtime. Indeed, as If the denominator of (62) is zero, it follows from the shown in Figure 16, when ρ < ρ0 the algorithm will ac- above discussion that the reference trajectory is an opti- tually outright reject the subproblem solution, and will mal solution of the convex subproblem. This signals that resolve the same problem with a smaller trust region. it is appropriate to terminate SCvx and to exit the loop This begs the question: can rejection go on indefinitely? in Figure 11. Hence, we can use the denominator of (62) If this occurs, the algorithm will be stuck in an infinite as part of a stopping criterion that avoids a division by loop. The answer is no, the metric (62) must eventually zero. The stopping criterion is discussed further in the rise above ρ0 [49, Lemma 3.11]. next section. Looking at (62) holistically, it is essentially a ratio SCvx Stopping Criterion between the actual cost improvement (the numerator) The previous sections provide a complete description of and the predicted cost improvement (the denominator), the Starting and Iteration regions in Figure 11. A cru- achieved during one SCvx iteration [48]: cial remaining element is how and when to exit from actual improvement the “SCP loop” of the Iteration region. This is defined ρ = . (64) predicted improvement by a stopping criterion (also called an exit criterion), 35 Preprint

which is implemented at location 3 in Figure 11. When SCvx Convergence Guarantee the stopping criterion triggers, we say that the algo- The SCvx trust region update rule in Figure 16 is de- rithm has converged, and the final iteration’s trajectory signed to permit a rigorous convergence analysis of the 1 x∗(t), u∗(t), p∗ is output. algorithm. A detailed discussion is given in [49]. To { }0 The basic idea of the stopping criterion is to mea- arrive at the convergence result, the proof requires the 1 sure how different the new trajectory x∗(t), u∗(t), p∗ following (mild) technical condition that is common to { }0 is from the reference trajectory x¯(t),u ¯(t),p ¯ 1. Intu- most if not all optimization algorithms. We really do { }0 itively, if the difference is small, then the algorithm con- mean that the condition is “mild” because we almost siders the trajectory not worthy of further improvement never check it in practice and SCvx just works. and it is appropriate to exit the SCP loop. The formal SCvx exit criterion uses the denominator of (62) (i.e, Condition 8 the predicted cost improvement) as the stopping crite- The gradients of the equality and active inequality con- rion [49, 64, 183, 192]: straints of Problem 39 must be linearly independent for the final converged trajectory output by SCvx. This is λ(¯x,u ¯,p ¯) λ(x∗, u∗, p∗, ν∗, νs∗) ε, (65) J − L ≤ known as a linear independence constraint qualification (LICQ) [29, Chapter 12].  where ε R++ is a user-defined (small) threshold. For notational∈ simplicity, we will write (65) as: It is also required that the weight λ in (49) is suf- ficiently large. This ensures that the integral penalty ¯ λ λ∗ ε. term in (49) is a so-called exact penalty function. A J − L ≤ precise condition for “large enough” is provided in [49, Numerical experience has shown a control-dependent Theorem 3.9], and a possible strategy is outlined follow- stopping criterion to sometimes lead to an unnecessarily ing the theorem in that paper. However, in practice, we conservative definition of convergence [62]. For example, simply iterate over a few powers of ten until SCvx con- in an application like rocket landing, common vehicle verges with zero virtual control. An approximate range characteristics (e.g., inertia and engine specific impulse) that works for most problems is 102 to 104. Once a mag- imply that relatively large changes in control can have nitude is identified, it usually works well across a wide little effect on the state trajectory and optimality. When range of parameter variations for the problem. the cost is control dependent (which it often is), (65) The SCvx convergence guarantee is stated below, and may be a conservative choice that will result in more it- is proved in [49, Theorem 3.21]. Beside Condition8, the erations without providing a more optimal solution. We theorem requires a few more mild assumptions on the may thus opt for the following simpler and less conser- inequality constraints in Problem 39 and on the penal- vative stopping criterion: ized cost (52). We do not state them here due to their technical nature, and refer the reader to the paper. It is p∗ p¯ qˆ + max xk∗ x¯k qˆ ε, (66) k − k k 1,...,N k − k ≤ enough to say that, like Condition8, these assumptions ∈{ } are mild enough that we rarely check them in practice. whereq ˆ 1, 2, 2+, defines a norm similarly to (46). ∈ { ∞} Importantly, the following correspondence holds between Theorem 8 (65) and (66). For any ε choice in (66), there is a (gen- Suppose that Condition8 holds and the weight λ in (49) erally different) ε choice in (65) such that: if (65) holds, is large enough. Regardless of the initial reference tra- then (66) holds. We call this an “implication correspon- jectory provided in Figure 11, the SCvx algorithm will dence” between stopping criteria. always converge at a superlinear rate by iteratively solv- SCvx guarantees that there will be an iteration for ing Problem 55. Furthermore, if the virtual controls are which (65) holds. By implication correspondence, this zero for the converged solution, then it is a stationary guarantee extends to (66). In general, the user can de- point of Problem 39 in the sense of having satisfied the fine yet another stopping criterion that is tailored to the KKT conditions.  specific trajectory problem, as long as implication corre- Theorem8 confirms that SCvx solves the KKT con- spondence holds. In fact, we do this for the numerical ditions of the original Problem 39. Because these are examples at the end of the article, where we use the fol- first-order necessary conditions of optimality, they are lowing stopping criterion that combines (65) and (66): also satisfied by local maxima and saddle points. Nev- ¯ ¯ ertheless, because each convex subproblem is minimized (66) holds or λ λ∗ εr λ , (67) J − L ≤ J by SCvx, the event of converging to a stationary point where εr R++ is a user-defined (small) threshold on that is not a local minimum is very small. the relative∈ cost improvement. The second term in (67) Theorem8 also states that the convergence rate is su- allows the algorithm to terminate when relatively little perlinear [29], which is to say that the distance away progress is being made in decreasing the cost, a signal from the converged solution decreases superlinearly [49, that it has achieved local optimality. Theorem 4.7]. This is better than general NLP methods, 36 Preprint and on par with SQP methods that usually also attain i.e., s(t, x, u, p) = s(t, x, p). Altogether, these assump- at most superlinear convergence [168]. tions specialize Problem 39 to problems that are convex In conclusion, note that Theorem8 is quite intuitive in the control variable. and confirms our basic intuition. If we are “lucky” to get At its core, GuSTO is an SCP trust region method a solution with zero virtual control, then it is a local min- just like SCvx. Thus, it has to deal with the same issues imum of the original nonconvex Problem 39. The reader of artificial infeasibility and artificial unboundedness. To will be surprised, however, at just how easy it is to be this end, the GuSTO algorithm makes the following al- “lucky”. In most cases, it really is a simple matter of en- gorithmic choices: suring that the penalty weight λ in (49) is large enough. To handle artificial unboundedness, GuSTO aug- If the problem is feasible but SCvx is not converging or is ments the cost function with a soft penalty on the converging with non-zero virtual control, the first thing violation of (46). Because the original Problem 39 to try is to increase λ. In more difficult cases, some is convex in the control variables, there is no need nonconvex constraints may need to be reformulated as for a trust region with respect to the control and we equivalent versions that play nicer with the linearization use α = 0. In its standard form, GuSTO works process (the free-flyer example in Part III discusses this u with default values α = α = 1. However, one can in detail). However, let us be clear that there is no a x p choose different values α , α > 0 without affecting priori hard guarantee that the converged solution will x p the convergence guarantees; satisfyν ˆ = 0. In fact, since SCvx always converges, even an infeasible optimal control problem can be solved, and To handle artificial infeasibility, GuSTO augments the it will return a solution for whichν ˆ is non-zero. cost function with a soft penalty on nonconvex path constraint violation. As the algorithm progresses, the The GuSTO Algorithm penalty weight increases. The GuSTO algorithm is another SCP method that can From these choices, we can already deduce that a be used for trajectory generation. The reader will find salient feature of the GuSTO algorithm is its use of soft that GuSTO has a familiar feel to that of SCvx. Nev- penalties to enforce nonconvex constraints. Recall that ertheless, the algorithm is subtly different from both in SCvx, the trust region (46) is imposed exactly, and computational and theoretical standpoints. For exam- the linearized constraints are (equivalently) relaxed by ple, while SCvx works directly with the temporally dis- using virtual control (47). By penalizing constraints, cretized Problem 55 and derives its convergence guaran- GuSTO can be analyzed in continuous-time via the tees from the KKT conditions, GuSTO performs all of classical Pontryagin maximum principle [11, 12, 50, its analysis in continuous-time using Pontryagin’s maxi- 69]. An additional benefit of having moved the state mum principle [11, 12, 49, 50]. Temporal discretization constraints into the cost is that the virtual control term is only introduced at the very end to enable numerical employed in the dynamics (47a) can be safely removed, solutions to the problem. GuSTO applies to versions of since linearized dynamics are almost always controllable Problem 39 where the running cost in (40) is quadratic [72]. in the control variable: Let us begin by formulating the augmented cost func-

T T tion used by the GuSTO convex subproblem at 2 in Fig- Γ(x, u, p) = u S(p)u + u `(x, p) + g(x, p), (68) ure 11. This involves three elements: the original cost where the parameter p, as before, can be used to cap- (40), soft penalties on path constraints that involve the ture free final time problems. The functions S, `, and g state, and a soft penalty on trust region violation. We must all be continuously differentiable. Furthermore, the will now dicuss the latter two elements. function S must be positive semidefinite. The dynamics To formulate the soft penalties, consider a continu- must be affine in the control variable: ously differentiable, convex, and nondecreasing penalty function hλ : R R+ that depends on a scalar weight m → λ 1 [50, 72]. The goal of hλ is to penalize any positive f(t, x, u, p) = f0(t, x, p) + uifi(t, x, p), (69) value≥ and to be agnostic to nonpositive values. Thus, a i=1 X simple example is a quadratic rectifier: n d n 2 where the fi : are nonlinear func- + R R R R hλ(z) = λ [z] , (70) tions representing× a decomposition× → of the dynamics into terms that are control-independent and terms that lin- where higher λ values indicate higher penalization. An- early depend on each of the control variables ui. Note other example is the softplus function [195]: that any Lagrangian mechanical system can be expressed 1 σz h (z) = λσ− log 1 + e , (71) in the control affine form, and so (69) is applicable to λ the vast majority of real-world vehicle trajectory gen- where σ R++ is a sharpness parameter.  As σ grows, eration applications [194]. Finally, the nonconvex path the softplus∈ function becomes an increasingly accurate constraints (39e) are independent of the control terms, approximation of λ max 0, z . { } 37 Preprint

We use hλ to enforce soft penalties for violating the Using (44e), (44g), and (75), we can write the convex constraints (39c) and (39e) that involve the state and approximation of (74c): parameter. To this end, let w : Rn Rd Rnw be a con- × → 1 vex continuously differentiable indicator function that is Γ Γ Γ L˜λ(x, u, p) = Γ(¯x,u ¯,p ¯) + A δx + B δu + F δp+ nonpositive if and only if the convex state constraints in Z0 (39c) are satisfied: ns hλ si(t,x ¯,p ¯) + Ciδx + Giδp dt, (76) w(x, p) 0 (x, p) . i=1 ≤ ⇔ ∈ X X  Note that because is a convex set, such a function where Ci and Gi are the i-th rows of the Jacobians (44e) X ˜ always exists. Using w, s from (39e), and hλ, we can and (44g), respectively. Note that, strictly speaking, Lλ ˜ lump all of the state constraints into a single soft penalty is not a linearized version of Jλ because the second term function: in (74c) is only linearized inside the hλ( ) function. · Replacing J˜λ in (74a) with L˜λ, we obtain a convexified nw ns augmented cost function: gsp(t, x, p) , hλ wi(t, x) + hλ si(t, x, p) . (72) i=1 i=1 ˘ ˜ X  X  Lλ(x, u, p) = Jλ(x, p) + Lλ(x, u, p). (77) Another term is added to the augmented cost function to handle artificial unboundedness. This term penalizes To summarize the above discussion, the continuous- violations of the trust region constraint, and is defined time convex subproblem that is solved at each iteration in a similar fashion as (72): of the GuSTO algorithm can be stated formally as:

min Lλ(x, u, p) (78a) gtr(x, p) hλ δx q + δp q η , (73) u,p , k k k k − although we note that hard enforced versions of the trust s.t.x ˙ = Ax + Bu + F p + r, (78b) region constraint can also be used [72]. (u, p) , (78c) ∈ U The overall augmented cost function is obtained by H0x(0) + K0p + `0 = 0, (78d) combining the original cost (39a) with (72) and (73). Hf x(1) + Kf p + `f = 0. (78e) Because the resulting function is generally nonconvex, we take this opportunity to decompose it into its convex Problem 78 can be compared to the SCvx continuous- and nonconvex parts: time convex subproblem, given by Problem 51. Broadly speaking, the problems are quite similar. Their main dif- ˘ ˜ Jλ(x, u, p) , Jλ(x, p) + Jλ(x, u, p), (74a) ferences stem from how the SCvx and GuSTO algorithms 1 handle artificial infeasibility and artificial unbounded- J˘ (x, p) = φ(x(1), p) + g (x, p)+ λ tr ness. In the case of SCvx, virtual control terms are intro- Z0 nw duced and a hard trust region (51g) is imposed. In the hλ(wi(t, x)) dt, (74b) case of GuSTO, everything is handled via soft penalties i=1 in the augmented cost. The result is that penalty terms X 1 ns in the cost (78a) replace the constraints (51c), (51d), and ˜ Jλ(x, u, p) = Γ(x, u, p) + hλ(si(t, x, p)) dt. (51g). 0 i=1 Z X There is another subtle difference between Problem 78 (74c) and Problem 51, which is that GuSTO does not use virtual control terms for the linearized boundary con- The terms g (x, p) and h (w (x)) in (74b) are convex tr λ i ditions (78d) and (78e). In SCvx, these virtual con- functions, since they are formed by composing a convex trol terms maintain subproblem feasibility when the lin- nondecreasing function h with a convex function [26]. λ earized boundary conditions define hyperplanes that do Thus, (74b) is the convex part of the cost, while (74c) is not intersect with the hard-enforced convex state path the nonconvex part. constraints in (51c). This is not a problem in GuSTO, Our ultimate goal is to construct a convex subprob- since the convex state path constraints are only penal- lem that can be solved at location 2 in Figure 11. Thus, ized in the cost (72), and violating them for the sake the nonconvex part of the cost (74c) must be convexified of retaining feasibility is allowed. Furthermore, the lin- around the reference trajectory x¯(t),u ¯(t),p ¯ 1. This re- 0 earized dynamics (78b) are theoretically guaranteed to quires the following Jacobians in{ addition to} the initial be almost always controllable [72]. This implies that the list (44): dynamics (78b) can always be steered between the lin- Γ A xΓ(¯x,u ¯,p ¯), (75a) earized boundary conditions (78d) and (78e)[95]. , ∇ Γ Similar to how we treated Problem 51, a temporal dis- B uΓ(¯x,u ¯,p ¯), (75b) , ∇ cretization scheme must be applied in order to numer- Γ F pΓ(¯x,u ¯,p ¯). (75c) ically solve the subproblem. Discretization proceeds in , ∇ 38 Preprint

the same way as for SCvx: we select a set of temporal weight λ by a user-defined factor γfail > 1. Otherwise, if points tk [0, 1] for k = 1, ... , N, and recast Problem 78 (46) holds, the algorithm proceeds by computing the fol- as a parameter∈ optimization problem in the (overloaded) lowing convexification accuracy metric that is analogous N N variables x = xk k=1, u = uk k=1, and p. The same to (62): discretization{ choices} are available{ } as for SCvx, such as described in “Discretizing Continuous-time Optimal Jλ(x∗, u∗, p∗) Lλ(x∗, u∗, p∗) + Θ∗ ρ , − 1 , (83) Control Problems”. Lλ(x∗, u∗, p∗) + 0 x˙ ∗ 2 dt The integrals of the continuous-time cost function k k R (78a) also need to be discretized. This is done in a where, recalling (39b) and ( 78b), we have defined: similar fashion to the way (52) was obtained from (49) x˙ ∗ , Ax∗ + Bu∗ + F p∗ + r, (84a) for SCvx. For simplicity, we will again assume that 1 the time grid is uniform with step size ∆t and that Θ∗ f(t, x∗, u∗, p∗) x˙ ∗ 2 dt. (84b) , k − k trapezoidal numerical integration (54) is used. For Z0 notational convenience, by combining the integral terms Equation (84a) integrates to yield the continuous-time in (74b) and (76), we can write (77) compactly as: state solution trajectory. This is done in accordance with 1 the temporal discretization scheme, such as the one de- Γ Lλ(x, u, p) = φ(x(1), p) + Lλ(x, u, p) dt, (79) scribed in “Discretizing Continuous-time Optimal Con- 0 Z trol Problems”. As a result, the value of Θ∗ is nothing Γ but a measure of the total accumulated error that results where the convex function Lλ is formed by summing the integrands of (74b) and (76). We can then compute the from linearizing the dynamics along the subproblem’s op- discrete-time version of (79), which is the GuSTO equiv- timal solution. In a way, Θ∗ is the counterpart of SCvx alent of its SCvx counterpart (52): defects defined in (56), with the subtle difference that while defects measure the discrepancy between the lin- Γ,N λ(x, u, p) = φ(x(1), p) + trapz(L ), (80) earized and nonlinear state trajectories, Θ∗ measures the L λ Γ,N Γ discrepancy between the linearized and nonlinear state Lλ,k = Lλ(xk, uk, p). (81) dynamics. Lastly, like in SCvx, the constraint (78c) is enforced The continuous-time integrals in (83) can, in principle, only at the discrete temporal nodes. In summary, the be evaluated exactly (i.e., to within numerical precision) following discrete-time convex subproblem is solved at based on the discretization scheme used. In practice, we approximate them by directly numerically integrat- each GuSTO iteration (i.e., location 2 in Figure 11): ing the solution of Problem 82 using (for example) trape-

min λ(x, u, p) (82a) zoidal integration (54). This means that we evaluate the x,u,p L following approximation of (83): s.t. xk+1 = Akxk + Bkuk + Fkp + rk, (82b) λ(x∗, u∗, p∗) λ(x∗, u∗, p∗) + Θ∗ (uk, p) k, (82c) ρ J − L , (85) ∈ U ≈ λ(x∗, u∗, p∗) + trapz(x∗ ) H0x1 + K0p + `0 = 0, (82d) L Θ∗ trapz (∆f ∗), Hf xN + Kf p + `f = 0, (82e) ≈ ∆fk∗ = f(tk, xk∗, uk∗, p∗) x˙ k∗ 2, xk∗ = x˙ k∗ 2. where it is implicitly understood that the constraints k − k k k (82b) and (82c) hold at each temporal node. The nonlinear augmented cost λ in the numerator of (85) is a temporally discretizedJ version of (74a). In GuSTO Update Rule particular, combine the integrals of (74b) and (74c) to We are now at the same point as we were in the SCvx express (74a) as: section: by using Problem 82 as the discrete-time convex 1 Γ subproblem, all of the necessary elements are available to Jλ(x, u, p) = φ(x(1), p) + Jλ (x, u, p) dt, (86) go around the loop in Figure 11, except for how to update Z0 the trust region radius η and the penalty weight λ. Both which allows us to compute λ as follows: J values are generally different at each GuSTO iteration, Γ,N λ(x, u, p) = φ(x(1), p) + trapz(J ), (87) and we shall now describe the method by which they are J λ updated. The reader will find the concept to be similar Γ,N Γ Jλ,k = Jλ (xk, uk, p). (88) to how things worked for SCvx. First, recall that GuSTO imposes the trust region (46) Looking at (83) holistically, it can be seen as a nor- as a soft constraint via the penalty function (73). This malized error that results from linearizing the cost and means that the trust region constraint can possible be the dynamics: violated. If the solution to Problem 82 violates (46), cost error + dynamics error ρ = . (89) GuSTO rejects the solution and increases the penalty normalization term 39 Preprint

Note that as long as the solution of Problem 82 is non- is the first SCP iteration where shrinking is applied. The trivial (i.e., x∗(t) = 0 for all t [0, 1] does not hold), the user sets both µ and k , and in this manner can regulate normalization term is guaranteed∈ to be strictly positive. how fast the trust region∗ shrinks to zero. We view low Thus, there is no danger of dividing by zero. µ values and high k values as setting the algorithm up ∗ For comparison, SCvx evaluates convexification accu- for a longer “exploration” phase prior to tightening the racy through (64), which measures accuracy as a relative trust region. error in the cost improvement prediction. This predic- tion is a “higher order” effect: linearization error indi- GuSTO Stopping Criterion rectly influences cost prediction error through the opti- To complete the description of the GuSTO algorithm, mization of Problem 55. GuSTO takes a more direct it remains to define the stopping criterion used at loca- route with (89), and measures convexification accuracy tion 3 of Figure 11. The formal GuSTO exit criterion directly as a normalized error that results from lineariz- directly checks if the control and parameters are all al- ing both the cost and the dynamics. most unchanged [50, 71]: Looking at (89), we can adopt the following intuition about the size of ρ. As for SCvx, let us view Problem 82 p¯ p∗ qˆ ε and k − k ≤ as a local model of Problem 39. A large ρ value indi-  1 cates an inaccurate model, since the linearization error u∗(t) u¯(t) qˆ ε or λ > λmax, (91) is large. A small ρ value indicates an accurate model, 0 k − k ≤ Z  since the linearization error is relatively small compared + where ε R++ andq ˆ 2, 2 , have the same mean- to the normalization term. Hence, large ρ values incen- ∈ ∈ { ∞} ing as before in (66), and the new parameter λmax R++ tivize shrinking the trust region in order to not overstep is a (large) maximum penalty weight. ∈ the model, while small ρ values incentivize growing it When (91) triggers due to λmax, GuSTO exits with in order to exploit a larger region of an accurate model. an unconverged trajectory that violates the state and/or Note that the opposite intuition holds for SCvx, where trust region constraints. This is equivalent to SCvx ex- small ρ values are associated with shrinking the trust iting with non-zero virtual control, and indicates that region. the algorithm failed to solve the problem due to inherent The GuSTO update rule formalizes the above intu- infeasibility or numerical issues. ition. Using two user-defined constants ρ0, ρ1 (0, 1) ∈ Computing (91) can be computationally expensive due that split the real number line into three parts, the trust to numerical integration of the control deviation. We region radius η and the penalty weight λ are updated at can simplify the calculation by directly using the dis- the end of each GuSTO iteration according to Figure 17. crete-time solution. This leads to the following stopping Just like in SCvx, the user-defined constants βsh, βgr > 1 criterion: are the trust region shrink and growth rates, respectively.

The sizes of the trust region and the penalty weight are p∗ p¯ qˆ + trapz(∆u∗) ε or λ > λmax, (92) restricted by user-defined constants: η0 is the minimum k − k ≤ ∆uk∗ = uk∗ u¯k qˆ. trust region radius, η1 is the maximum trust region ra- k − k dius, and λ0 1 is the minimum penalty weight. Im- ≥ Just as discussed for SCvx, an implication correspon- portantly, for cases 1 and 2 in Figure 17, whenever the dence holds between the stopping criteria (91) and (92). solution of Problem 82 is accepted, the penalty weight In practice, we follow the example of (67) and add the λ is increased by a factor γfail if any of the nonconvex option for exiting when relatively little progress is being state constraints in (39e) are violated. This incentivizes made in decreasing the cost, which can often signal local subsequent iterates to become feasible with respect to optimality sooner than (92) is satisfied: (39e). ¯ ¯ In addition, GuSTO requires that η eventually shrinks (92) holds or λ λ∗ εr λ , (93) to zero as the SCP iterations progress (we cover this in J − J ≤ J

where, as before, εr R++ is a user-defined (small) more detail in a later section on GuSTO convergence) ∈ [72]. However, if the trajectory is converging, then threshold on the relative cost improvement. The numer- δx, δu, δp 0 and the linearizations in Problem 78 ical examples at the end of the article implement (93). are bound to→ become more accurate. Hence, we expect near-zero ρ values when the trajectory is converging, GuSTO Convergence Guarantee which means that Figure 17 will grow the trust region Each iteration of the GuSTO numerical implementation instead of shrinking it. A simple remedy is to apply the can be seen as composed of three stages. Looking at following exponential shrink factor following the update Figure 11, first the convex Problem 82 is constructed in Figure 17: and solved with a convex optimizer at location 2 . Using + [1+k k∗] the solution, the stopping criterion (93) is checked at η µ − η, (90) ← 3 . If the test fails, the third and final stage updates where µ (0, 1) is an exponential shrink rate and k 1 the trust region radius and soft penalty weight according ∈ ∗ ≥ 40 Preprint

0, if s(x∗, p∗) 0, x∗ , if x x¯ > update( ) , ≤ ∈ X k k∗ − kkq ( fail , else. for some k 1, . . . , N λ ∈ { η } Case 1: λ< Case 2: [ , ) Case 3: Case 4: 0 γ λ ∈ 0 1 ≥ 1 min( , gr ) max( , / ) ← ρ 1 ρ ← ρ ρ ρ ← ρ 0ρ sh ← x¯ x x¯ x x¯ x¯ x¯ x¯ ← ∗ ← ∗ ← ← ηu¯ u η β η ηu¯ ηu ηu¯ u¯ η η β ηu¯ ηu¯ ← ∗ ← ∗ ← ← p¯ p p¯ p p¯ p¯ p¯ p¯ ← ∗ ← ∗ ← ← update( ) update( ) ← ← ← ← fail R λ λ 0 λ λ 1 λ λ λ γ λ

ρ ρ FIGURE 17 The GuSTO trust region update rule. The accuracy metric ρ is defined in (83) and provides a measure of how accurately the convex subproblem given by Problem 82 describes the original Problem 39. Note that Cases 3 and 4 reject the solution to Problem 82. In Case 3, this is due to the convex approximation being deemed so inaccurate that it is unusable, and the trust region is shrunk accordingly. In Case 4, this is due to the trust region constraint (46) being violated. to Figure 17 and (90). In this context, a convergence SCP algorithm that is intimately linked to its trust re- guarantee ensures that the stopping criterion at 3 in gion update rule. This is not the case for GuSTO, whose Figure 11 eventually triggers. convergence proof does not rely on the update rule defini- GuSTO is an SCP algorithm that was designed tion at all. The only thing assumed by Theorem9 is that and analyzed in continuous-time using the Pontryagin the trust region radius η eventually shrinks to zero [72]. maximum principle [50, 72]. Thus, the first conver- Ensuring this is the reason for (90). Thus, GuSTO is an gence guarantee that we are able to provide assumes SCP algorithm that accepts any update rule that eventu- that the GuSTO algorithm solves the continuous-time ally shrinks to zero. Clearly, some update rules may work subproblem (i.e., Problem 78) at each iteration [50, better than others, and the one presented in this article Corollary III.1], [72, Theorem 3.2]. is simply a choice that we have implemented in practice. Another simple update rule is given in [72]. Overall, the Theorem 9 GuSTO update rule can be viewed as a mechanism by Regardless of the initial reference trajectory provided in which to accept or reject solutions based on their con- Figure 11, the GuSTO algorithm will always converge by vexification accuracy, and not as a function designed to facilitate formal convergence proofs. iteratively solving Problem 51. Furthermore, if λ λmax (in particular, the state constraints are exactly satisfied),≤ The reason why it is necessary to shrink the trust then the solution is a stationary point of Problem 39 region to zero is the second nuanced detail of The- in the sense of having satisfied the necessary optimality orem9. In nonlinear optimization literature, most conditions of the Pontryagin maximum principle.  claims of convergence fall into the so-called weak We will emphasize once again the following duality convergence category [29, 196]. This is a technical between the GuSTO and SCvx convergence guarantees. term which means that a subsequence among the Both algorithms can converge to infeasible points of the i i i 1 trajectory iterates x (t), u (t), p 0 converges. For original Problem 39. For SCvx, this corresponds to a example, the subsequence{ may be} i = 1, 3, 5, 7, ... , non-zero virtual control (recall Theorem8). For GuSTO, while the iterates i = 2, 4, 6, ... behave differently. Both this corresponds to λ > λmax. The situations are com- SCvx and GuSTO provide, at the baseline, the weak pletely equivalent, and correspond simply to the differ- convergence guarantee. The next step is to provide a ent choices made by the algorithms to impose nonconvex strong convergence guarantee, which ensures that the constraints either with virtual control (as in SCvx) or as full sequence of iterates converges. For SCvx, this is soft cost penalties (as in GuSTO). possible by leveraging properties of the update rule in Theorem9 should be viewed as the GuSTO counter- Figure 16. As we mentioned in the above paragraph, part of Theorem8 for SCvx. Despite the similarities GuSTO is more flexible in its choice of update rule. between the two statements, there are three important To get a converging iterate sequence, the additional nuances about Theorem9 that we now discuss. element (90) is introduced to force all subsequences to The first difference concerns how the GuSTO update converge to the same value. Because our analysis for rule in Figure 17 and (90) play into the convergence SCvx has shown the strong convergence guarantee to be proof. In SCvx, the update rule from Figure 16 plays tied exclusively to the choice of update rule, GuSTO can a critical role in proving convergence. Thus, SCvx is an likely achieve a similar guarantee with an appropriately 41 Preprint designed update rule. on the type of discretization. One example was given The third and final nuanced detail of Theorem9 is in detail for FOH, which is an interpolating polynomial that it assumes continuous-time subproblems (i.e., Prob- method, in “Discretizing Continuous-time Optimal Con- lem 78) are solved. In reality, the discretized Problem 82 trol Problems”. This example encapsulates a core issue is implemented on the computer. This difference between with many other discretization schemes, so we will work proof and reality should ring a familiar tone to Part I on with it for concreteness. LCvx, where proofs are also given using the continuous- In FOH discretization, the integrals in (S32) require 1 time Pontryagin’s maximum principle, even though the the continuous-time reference trajectory x¯(t),u ¯(t),p ¯ 0 implementation is in discrete-time. If the discretization in order to evaluate the corresponding Jacobians{ in (44}). error is small, the numerically implemented GuSTO al- However, the convex subproblem solution only yields a gorithm will remain in the vicinity of a convergent se- discrete-time trajectory. To obtain the continuous-time quence of continuous-time subproblem solutions. Thus, reference, we use the continuous-time input obtained di- given an accurate temporal discretization, Theorem9 rectly from (S28) and integrate (59) in tandem with the can be reasonably assumed to apply for the numerical integrals in (S32). This operation is implemented over GuSTO algorithm [72]. a time interval [tk, tk+1] as one big integration of a con- Our default choice in research has been to use an inter- catenated time derivative composed of (59) and all the polating polynomial method such as described in “Dis- integrands in (S32)[63]. This saves computational re- cretizing Continuous-time Optimal Control Problems”. sources by not repeating multiple integrals, and has the This approach has three advantages [190]: 1) the contin- numerical advantage of letting an adaptive step integra- uous-time dynamics (78b) are satisfied exactly, 2) there tor automatically regulate the temporal resolution of the is a cheap way to convert the discrete-time numerical continuous-time reference trajectory. Because the inte- solution into a continuous-time control signal, and 3) gration is reset at the end of each time interval, the con- the discretized dynamics (82b) result in a more sparse tinuous-time reference state trajectory is discontinuous, problem than alternative formulations (e.g., pseudospec- as illustrated in Figure 15. Further details are provided tral), which benefits real-time solution. With this choice, in [63, 190] and the source code of our numerical exam- discretization introduces only two artifacts: the control ples, which is linked in Figure2. signal has fewer degrees of freedom, and the objective In theory, discretization occurs in the forward path of function (82a) is off from (78a) by a discretization error. the SCP loop, just before subproblem solution, as shown Thus, by using an interpolating polynomial discretiza- in Figure 11. However, by integrating (59) as described tion we can rigorously say that Problem 82 finds a “local” above, we can actually obtain the SCvx defects that are optimum for a problem that is “almost” Problem 78. We needed at stage 3 in Figure 11. Integrating the flow say “local” because the control function has fewer DoFs, map twice, once for discretization and another time for and “almost” due to discretization error in the objective the defects, is clearly wasteful. Hence, in practice, we im- function. At convergence, this means that the GuSTO plement discretization at stage 3 in Figure 11, and store solution satisfies the Pontryagin maximum principle for a the result in memory to be used at the next iteration. slightly different problem than Problem 39. In practice, Note that there is no computational benefit when the this technical discrepancy makes little to no difference. flow map does not need to be integrated, such as in (58). This is the case for many discretization schemes like for- Implementation Details ward Euler, Runge-Kutta, and pseudospectral methods. We have now seen a complete description of the SCvx The unifying theme of these methods is that the next dis- and GuSTO algorithms. In particular, we have all the crete-time state is obtained algebraically as a function of elements that are needed to go around, and to eventu- (a subset of) the other discrete-time states, rather than ally exit, the SCP loop in Figure 11. In this section, we through a numerical integration process. discuss two implementation details that significantly im- prove the performance of both algorithms. The first de- Variable Scaling tail concerns the temporal discretization procedure, and The convex SCP subproblems consist of the following op- the second detail is about variable scaling. timization variables: the states, the inputs, the param- eter vector, and possibly a number of virtual controls. Temporal Discretization The variable scaling operation uses an invertible func- The core task of discretization is to convert a continuous- tion to transform the optimization variables into a set of time LTV dynamics constraint into a discrete-time con- new “scaled” variables. The resulting subproblems are straint. For SCvx, this means converting (51b) to (55b). completely equivalent, but the magnitudes of the opti- For GuSTO, this means converting (78b) to (82b). In all mization variables are different [26, 29]. cases, our approach is to find the equivalent discrete-time SCvx and GuSTO are not scale-invariant algorithms. representation of the consistent flow map ψ from Defi- This means that good variable scaling can significantly nition2. The details of this conversion depend entirely impact not only how quickly a locally optimal solution 42 Preprint is found, but also the solution quality (i.e., the level of vector are roughly bounded by a unit hypercube: x optimality it achieves). Hence, it is imperative that the [0, 1]n, u [0, 1]m, and p [0, 1]d. Another advantage to∈ reader applies good variable scaling when using SCP to using a [0,∈ 1] interval is that∈ box constraint lower bounds solve trajectory problems. We will now present a stan- on the variables can be enforced “for free” if the low-level dard variable scaling technique that we have used suc- convex solver operates on nonnegative variables [63]. cessfully in practice. To give a concrete example of (94), suppose that To motivate our scaling approach, let us review the the state is composed of two quantities: a position two major effects of variable magnitude on SCP algo- that takes values in [100, 1000] m, and a velocity that 1 rithms. The first effect is common throughout scien- takes values in [ 10, 10] m s− . We would then choose − 2 2 2 tific computing and arises from the finite precision arith- Sx = diag(900, 20) R × and cx = (100, 10) R . metic used by modern computers [13, 29, 32, 197, 198]. Most trajectory problems∈ have enough problem-specific− ∈ When variables have very different magnitudes, a lot of information to find an appropriate scaling. When exact numerical error can accumulate over the iterations of a bounds on possible variable values are unknown, an numerical optimization algorithm [29]. Managing vari- engineering approximation is sufficient. ables of very different magnitudes is a common require- ment for numerical optimal control, where optimization Discussion problems describe physical processes. For example, the The previous two sections make it clear that SCvx and state may include energy (measured in Joules) and an- GuSTO are two instances of SCP algorithms that share gle (measured in radians). The former might be on the 6 0 many practical and theoretical properties. The algo- order of 10 while the latter is on the order of 10 [199]. rithms even share similar parameters sets, which are Most algorithms will struggle to navigate the resulting listed in Table1. decision space, which is extremely elongated along the From a practical point of view, SCvx can be shown energy axis. to have superlinear convergence rates under some mild The second effect of different variable magnitudes oc- additional assumptions. On the other hand, the theory curs in the formulation of the trust region constraint of GuSTO allows one to leverage additional information (46). This constraint mixes all of the states, inputs, and from dual variables to accelerate the algorithm, provid- parameters into a single sum on its left-hand side. We ing quadratic convergence rates. Finally, both methods have long been taught not to compare apples and or- can be readily extended to account for manifold-type anges, and yet (without scaling), this is exactly what we constraints, which can be thought of as implicit, nonlin- are doing in (46). Reusing the previous example, a trust ear equality constraints, and stochastic problem settings, region radius η = 1 means different things for an an- e.g., [50, 69, 71, 73]. gle state than for an energy state. It would effectively Although theoretical convergence proofs of SCvx and bias progress to the angle state, while allowing almost GuSTO do rely on a set of assumptions, these assump- no progress in the energy state. However, if we scale tions are not strictly necessary for the algorithms to work the variables to nondimensionalize their values, then the well in practice. Much of our numerical experience sug- components in the left-hand side of (46) become compa- gests that SCP methods can be deployed to solve diverse rable and the sum is valid. Thus, variable scaling plays and challenging problem formulations that do not nec- an important role in ensuring that the trust region radius essarily satisfy the theory of the past two sections. It is η is “meaningful” across states, inputs, and parameters. often the case that what works best in practice, we can- Without variable scaling, SCP algorithms usually have a not (yet) prove rigorously in theory [184]. As Renegar very hard time converging to feasible solutions. writes regarding convex optimization [200, p. 51], “It is We now have an understanding that variable scaling one of the ironies of the IPM literature that algorithms should seek to nondimensionalize the state, input, and which are more efficient in practice often have somewhat parameter vector in order to make them comparable. worse complexity bounds.” The same applies for SCP Furthermore, it should bound the values to a region methods, where algorithms that work better in practice where finite precision arithmetic is accurate. To this may admit weaker theoretical guarantees in the general end, we have used the following affine transformation case [201]. with success across a broad spectrum of our trajectory optimization research: The reader should thus feel free to modify and adapt the methods based on the requirements of their partic- x = Sxxˆ + cx, (94a) ular problem. In general, we suggest adopting a mod-

u = Suuˆ + cu, (94b) ify-first-prove-later approach. For example, when using SCvx, we are often able to significantly speed up conver- p = Sppˆ + cp, (94c) gence by entirely replacing the trust region update step wherex ˆ Rn,u ˆ Rm, andp ˆ Rd are the new scaled with a soft trust region penalty in the cost function [59, variables.∈ The user-defined∈ matrices∈ and offset vectors in 62, 63]. (94) are chosen so that the state, input, and parameter To conclude Part II, we note that SCvx, GuSTO, 43 Preprint

TABLE 1 A summary of the user-selected parame- ters for the SCvx and GuSTO algorithms.

Parameter dependence:

SCvx N q ˆqP 0 1 0 1 2 sh gr r GuSTO N q ˆqP h 0 max 0 1 0 1 sh gr fail k r ∗ λ η η η ρ ρ ρ β β ε ε Parameter key: λ λ λ η η η ρ ρ β β γ µ ε ε N N density of temporal discretization q 1, 2, 2+, ∞ trust region norm { } ˆq 1, 2, 2+, ∞ stopping criterion norm n{ p } P R R R+ positive definite penalty function × → h R R+ convex nondecreasing soft penalty function → R++ penalty weight λ, 1 min (initial) and max penalty weight 0 max ≥ λ R++ initial trust region radius λ0, λ1 R++ min and max trust region radii η0, 1, 2 (0, 1) trust region update parameters ηshη, gr > 1 trust region shrink and growth factors ρfailρ ρ R++ penalty weight growth factor β β (0, 1) trust region exponential shrink rate γk N first iteration when shrinking is applied ∗ µ R++ absolute trajectory change tolerance r R++ relative cost improvement tolerance ε ε and related SCP methods have been used to solve a plethora of trajectory generation problems, including: reusable launch vehicles [202, 203, 204], robotic ma- FIGURE 18 The Masten Xombie rocket near the end of a 750 m divert nipulators [179, 205, 206], robot motion planning [207, maneuver. Figure reproduced with permission from [37, Fig. 1]. 208, 209], and other examples mentioned in the article’s introduction and at the beginning of Part II. All of these site without consuming more propellant than what is SCP variants and applications should inspire the reader available. Using optimization to compute this sequence to come up with their own SCP method that works best can greatly increase the range of feasible landing sites for their particular application. and the precision of the final landing location [211, 212, PART III: APPLICATION EXAMPLES 213]. Parts I and II of this article provided the theoretical Lossless convexification for PDG was first introduced background necessary to start solving nonconvex tra- in [45, 84], where minimizing fuel usage was the objec- jectory generation problems using LCvx, SCvx, and tive. It was the first time that convex optimization was GuSTO. This final part of the article is dedicated to shown to be applicable to PDG. This discovery unlocked doing just that: providing examples of how trajectory a polynomial time algorithm with guaranteed conver- generation problems can be solved using the three gence properties for generating optimal landing trajec- algorithms. We cover rocket landing first, followed by tories. The work was later expanded to handle state quadrotor flight with obtacle avoidance, and finally six constraints [86, 88, 102], to solve a mininimum landing- degree-of-freedom flight of a robot inside a space station. error problem [85], to include nonconvex pointing con- The implementation code that produces the exact plots straints [87, S1], and to handle nonlinear terms in the seen in this part of the article is entirely available at the dynamics such as aerodynamic drag and nonlinear grav- link in Figure2. We provide a CVX-like parser interface ity [46]. to SCvx and GuSTO, so the reader can leverage the Today, we know that SpaceX uses convex optimization code to begin solving their own problems. for the Falcon 9 rocket landing algorithm [36]. Flight tests have also been performed using lossless convexi- LCvx: 3-DoF Fuel-Optimal Rocket Landing fication in a collaboration between NASA and Masten Rocket-powered planetary landing guidance, also known Space Systems [37], as shown in Figure 18. as powered descent guidance (PDG), was the orig- We will now present a lossless convexification PDG inal motivation for the development of lossless convex- example based on a mixture of original ideas from [45, ification. It makes use of several LCvx results that we 87]. Note that LCvx considers the 3-degree-of-freedom presented, making it a fitting first example. (DoF) PDG problem, where the vehicle is modeled as a Work on PDG began in the late 1960s [19], and has point mass. This model is accurate as long as attitude since been extensively studied [210]. The objective is to can be controlled in an inner loop faster than the outer find a sequence of thrust commands that transfer the translation control loop. This is a valid assumption for vehicle from a specified initial state to a desired landing many vehicles, including rockets and aircraft. The thrust 44 Preprint vector is taken as the control input, where its direction consequence, the familiar LCvx equality constraint ap- serves as a proxy for the vehicle attitude. We begin by pears in (97f). Following “Convex Relaxation of an Input stating the raw minimum fuel 3-DoF PDG problem. Pointing Constraint”, this also replaces Tc(t) 2 in (95f) with σ(t) in (97g). k k

tf

min Tc(t) 2dt (95a) tf Tc,tf k k Z0 min σ(t)dt (97a) σ,Tc,tf s.t.r ˙(t) = v(t), (95b) Z0 s.t.r ˙(t) = v(t), (97b) Tc(t) v˙(t) = g + ω×ω×r(t) 2ω×v(t), (95c) m(t) − − Tc(t) v˙(t) = g + ω×ω×r(t) 2ω×v(t), (97c) m(t) − − m˙ (t) = α Tc(t) 2, (95d) − k k m˙ (t) = ασ(t), (97d) ρmin Tc(t) 2 ρmax, (95e) ≤ k k ≤ − T ρmin σ(t) ρmax, (97e) Tc(t) eˆz Tc(t) 2 cos(γp), (95f) ≥ k k ≤ ≤ Tc(t) 2 σ(t), (97f) Hgsr(t) hgs, (95g) k k ≤ ≤ T Tc(t) eˆz σ(t) cos(γp), (97g) v(t) 2 vmax, (95h) k k ≤ ≥ Hgsr(t) hgs, (97h) mdry m(tf ), (95i) ≤ ≤ v(t) 2 vmax, (97i) r(0) = r0, v(0) = v0, m(0) = mwet, (95j) k k ≤ mdry m(tf ), (97j) r(tf ) = v(tf ) = 0. (95k) ≤ r(0) = r0, v(0) = v0, m(0) = mwet, (97k) The vehicle translational dynamics correspond to a r(tf ) = v(tf ) = 0. (97l) double integrator with variable mass, moving in a con- stant gravitational field, viewed in the planet’s rotating Next, we approximate the nonlinearity Tc/m in (97c). 3 3 frame. In particular, r R is the position, v R To this end, [45] showed that the following change of 3 ∈ ∈ is the velocity, g R is the gravitational acceleration, variables can be made: ∈ and ω R3 is the planet’s constant angular velocity. The ∈ σ Tc notation ω× denotes the skew-symmetric matrix repre- ξ , , u , , z , ln(m). (98) sentation of the cross product ω ( ). The mass m R m m × · ∈ is depleted by the rocket engine according to (95d) with Using the new variables, note that the fuel consumption rate m˙ (t) t 1 = αξ(t) z(t) = z(0) α ξ(τ)dτ. (99) α , , (96) m(t) − ⇒ − 0 Ispge Z Since the cost in (97a) maximizes m(tf ), and since 2 where ge 9.807 m s− is the Earth’s standard grav- tf ≈ α > 0, an equivalent cost is to minimize 0 ξ(t)dt. It itational acceleration and Isp s is the rocket engine’s 3 turns out that the new variables linearize all constraints specific impulse. As illustrated in Figure 19, Tc R R ∈ except for the upper bound part of (97e). In the new is the rocket engine thrust vector, which is upper and variables, (97e) becomes: lower bounded via (95e). The lower bound was moti- z(t) z(t) vated in “Convex Relaxation of an Input Lower Bound”. ρmine− ξ(t) ρmaxe− . (100) The thrust vector, and therefore the vehicle attitude, also ≤ ≤ has a tilt angle constraint (95f) which prevents the vehi- To keep the optimization problem an SOCP, it is de- cle from deviating by more than angle γp away from the sirable to also do something about the lower bound in vertical. Following the discussion in “Landing Glideslope (100), which is a convex exponential cone. In [45], it as an Affine State Constraint”, we also impose an affine was shown that the following Taylor series approxima- glideslope constraint via (95g) with a maximum glides- tion is accurate to within a few percent, and is therefore lope angle γgs. The velocity is constrained to a maximum acceptable: magnitude vmax by (95h). The final mass must be greater 1 2 than mdry (95i), which ensures that no more fuel is con- µmin(t) 1 δz(t) + δz(t) ξ(t), (101a) − 2 ≤ sumed than what is available. Constraints (95j) and   (95k) impose fixed boundary conditions on the rocket’s µmax(t) 1 δz(t) ξ(t), (101b) − ≥ state. In particular, the starting mass is mwet > mdry. To begin the lossless convexification process, the stan- where:   dard input slack variable σ R from “Convex Relax- z0(t) ation of an Input Lower Bound∈ ” is introduced in order µmin(t) = ρmine− , (102a) z0(t) to remove the nonconvex lower bound in (95e). As a µmax(t) = ρmaxe− , (102b) 45 Preprint

FIGURE 19 Illustration of the 3-DoF PDG problem, showing some of the relevant constraints on the rocket-powered lander’s trajectory. The thrust direction Tc(t)/ Tc(t) 2 servers as a proxy for vehicle attitude. k k

δz(t) = z(t) z0(t), (102c) Hgsr(t) hgs, (104i) − ≤ z0(t) = ln(mwet αρmaxt). (102d) v(t) 2 vmax, (104j) − k k ≤ ln(mdry) z(tf ), (104k) The reference profile z0( ) corresponds to the maxi- ≤ · z0(t) z(t) ln(mwet αρmint), (104l) mum fuel rate, thus z0(t) lower bounds z(t). To ensure ≤ ≤ − that physical bounds on z(t) are not violated, the follow- r(0) = r0, v(0) = v0, z(0) = ln(mwet), (104m) ing extra constraint is imposed: r(tf ) = v(tf ) = 0. (104n)

z0(t) z(t) ln(mwet αρmint). (103) Several conditions must now be checked to ascertain ≤ ≤ − lossless convexification, i.e., that the solution of Prob- The constraints (101) and (103) together approximate lem 104 is globally optimal for Problem 95. To begin, let (100) and have the important property of being conserva- us view z in (104d) as a “fictitious” state that is used for tive with respect to the original constraint [45, Lemma 2]. concise notation. In practice, we know explicitly that Thus, using this approximation will not generate solu- t z(t) = ln(mwet) α 0 ξ(t)dt and thus we can replace tions that are infeasible for the original problem. We can every instance of−z(t) in (104e)-(104l) with this expres- now write the finalized convex relaxation of Problem 95. sion. Thus, we do notR consider z as part of the state and, furthermore, (104e), (104f), (104k), and (104l) are input tf and not state constraints. Defining the state as 6 min ξ(t)dt (104a) x = (r, v) R , the state-space matrices are ξ,u,t ∈ f Z0 0 I 0 s.t.r ˙(t) = v(t), (104b) A = 3 , B = , E = B. (105) ω×ω× 2ω× I3 v˙(t) = g + u(t) ω×ω×r(t) 2ω×v(t), (104c) − −    − − z˙(t) = αξ(t), (104d) The pair A, B is unconditionally controllable and − { } 1 2 therefore Condition1 is satisfied. We also need to verify µmin(t) 1 δz(t) + δz(t) ξ(t), (104e) − 2 ≤ Condition3 due to the presence of the pointing con-   straint (104h). In this casen ˆu =e ˆz and N = eˆx eˆy . µmax(t) 1 δz(t) ξ(t), (104f) − ≥ Condition3 holds as long as ω×eˆz = 0, which means that u(t) 2 ξ(t), (104g) 6   k k ≤  the planet does not rotate about the local vertical of the T u(t) eˆz ξ(t) cos(γp), (104h) landing frame. ≥ 46 Preprint

of Problem 104 is guaranteed to be globally optimal for Problem 95 as long as (104j) is never persistently active. We now present a trajectory result obtained by solv- ing Problem 104 using a ZOH discretized input signal with a ∆t = 1 s time step (see “Discretizing Continuous- time Optimal Control Problems”). We use the following numerical parameters: 2 g = 3.71ˆez m s− , mdry = 1505 kg, (108a) − mwet = 1905 kg, Isp = 225 s, (108b) 3 1 ω = (3.5ˆex + 2ˆez) 10− ° s− , (108c) · ρmin = 4.971 kN, ρmax = 13.258 kN, (108d) FIGURE 20 A force balance in the normal direction nˆgs can be used 1 to guarantee that the glideslope constraint can only be activated in- γgs = 86 °, γp = 40 °, vmax = 500 km h− , (108e) stantaneously. r0 = 2ˆex + 1.5ˆez km, (108f) 1 v0 = 288ˆex + 108ˆey 270ˆez km h− . (108g) The glideslope constraint (104i) can be treated either − via Condition5 or by checking that it can only be in- Golden search [48] is used to find the optimal time of stantaneously active. In this case, we can prove the flight tf . This is a valid choice because the cost function latter by considering a force balance in the glideslope is unimodal with respect to tf [85]. For the problem plane’s normal directionn ˆgs, as illustrated in Figure 20. parameters in (108), an optimal rocket landing trajectory We also invoke the fact that the optimal thrust pro- is found with a minimum fuel time of flight tf∗ = 75 s. file for the rocket landing problem is bang-bang [45]. Figure 21 visualizes the computed optimal landing tra- This allows to distill the verification down to a conserva- jectory. From Figure 21(a), we can clearly see that the tive condition that the following inequalities hold for all glide slope constraint is not only satisfied, it is also ac- π π tivated only twice (once mid-flight and another time at θ [ 2 γp γgs, 2 + γp γgs]: ∈ − − − touchdown), as required by the LCvx guarantee. From ρmin cos(θ) < mdry g 2 sin(γgs), (106a) Figure 21(c), we can see that the velocity constraint is k k never activated. Hence, in this case, it does not pose ρmax cos(θ) > mwet g 2 sin(γgs). (106b) k k a threat to lossless convexification. Several other in- It turns out that for the problem parameters (108) teresting views of the optimal trajectory are plotted in of this example, the above inequalities hold. Therefore Figure 21(d)-(g). In all cases, the state and input con- (104i) can only be instantaneously active and does not straints of Problem 95 hold. The fact that Problem 104 is pose a threat to lossless convexification. a lossless convexification of Problem 95 is most evident We also need to check Condition2, which relates to the during the minimum-thrust segment from about 40 to transversality condition of the maximum principle [11, 70 seconds in Figure 21(f). Here, it is important realize that Tc(t) 2 < ρmin is feasible for the relaxed problem. 12, 187]. In the case of Problem 104, we have m[tf ] = 0 k k and b[tf ] = x(tf ), hence The fact that this never occurs is a consequence of the lossless convexification guarantee that (104g) holds with 6 0 R I6 equality. mLCvx = ∈ , BLCvx = . (107) ξ(tf ) 0 In conclusion, we emphasize that the rocket landing     trajectory presented in Figure 21 is not just a feasible so- Hence, as long as ξ(tf ) > 0, Condition2 holds. Since lution, nor even a locally optimal one, but it is a globally ξ(tf ) > 0 is guaranteed by the fact that the lower optimal solution to this rocket landing problem. This bound in (101) is greater than the exact lower bound means that one can do no better given these parameters z(tf ) ρmine− > 0 [45, Lemma 2], Condition2 holds. and problem description. The remaining constraint to be checked is the max- imum velocity bound (104j). Although it can be writ- SCP: Quadrotor Obstacle Avoidance 2 T ten as the quadratic constraint vmax− v(t) v(t) 1, the We now move on to trajectory generation for problems dynamics (104b)-(104c) do not match the form≤ (18b)- which cannot be handled by LCvx. The objective of this (18c) required by the LCvx result for quadratic state first example is to compute a trajectory for a quadrotor constraints. Thus, we must resort to the more restricted that flies from one point to another through an obstacle- statement for general state constraints in Theorem5. filled flight space. This example is sourced primarily According to Theorem5, we conclude that lossless con- from [52], and a practical demonstration is shown in Fig- vexification will hold as long as the maximum velocity ure 22. bound (104j) is activated at most a discrete number of The quadrotor is modeled as a point mass, which is a times. In summary, we can conclude that the solution reasonable approximation for small and agile quadrotors 47 Preprint

the “up” direction, and a(t) R3 is the commanded ac- celeration, which is the control∈ input. The time t spans the interval [0, tf ]. Because the previous section on SCP Quadrotor used a normalized time t [0, 1], we call t the absolute time and use the special∈ font to denote it. We allow the trajectory duration to be optimized, and impose the following constraint to keep the final time bounded:

tf,min tf tf,max, (110) ≤ ≤

where tf,min R and tf,max R are user-defined param- eters. Boundary∈ conditions on∈ the position and velocity FIGURE 22 A quadrotor at the University of Washington’s Au- are imposed to ensure that the vehicle begins and ends tonomous Controls Laboratory, executing a collision-free trajectory computed by SCvx [52, 214, 215]. at the desired states:

r(0) = r0,r ˙(0) = v0, (111a) whose rotational states evolve over much shorter time r(tf ) = rf ,r ˙(tf ) = vf . (111b) scales than the translational states [45]. We express the equations of motion in an East-North-Up (ENU) inertial The magnitude of the commanded acceleration is lim- coordinate system, and take the state vector to be the ited from above and below by the electric motor and position and velocity. Using a simple double-integrator propeller configuration, and its direction is constrained model, the continuous-time equations of motion are ex- to model a tilt angle constraint on the quadrotor. In pressed in the ENU frame as: effect, the acceleration direction is used as a proxy for r¨(t) = a(t) gnˆ, (109) the vehicle attitude. This is an accurate approximation − for a “flat” quadrotor configuration where the propellers where r R3 denotes the position, g R is the (con- are not canted with respect to the plane of the quadro- ∈ ∈ stant) acceleration due to gravity,n ˆ = (0, 0, 1) R3 is tor body. Specifically, we enforce the following control ∈

400 1.75 1.75 600 Tc,k Tc,k r(t) r(t) 1.50 1.50 350 rk rk 500 Hgs rk = hgs Hgs rk = hgs 300 1.25 1.25 400

250 [km/h]

[km/h] 1.00 1.00 2 k 2 ) k

t 300 v

200 (

k 0.75 0.75 v k

150 Altitude [km] Altitude [km] 0.50 0.50 200 Spped

100 Speed 0.25 0.25 100 50 v(t) k k2 v 0.00 0.00 k k k2 0 0.0 0.5 1.0 1.5 2.0 2.5 3.0 0.5 0.0 0.5 0 20 40 60 75 − Downrange [km] Crossrange [km] Time [s] (a) Side view of the optimal landing trajectory. The glideslope constraint (b) Front-on view of the optimal landing (c) Velocity history. Constraint (103j) is (103i) is activated at a single time instance mid-flight. trajectory. never activated.

50 16 16 1.9 14 14 40 12 12 1.8 [kN]

10 z 10

30 ˆe 1.7 8 8 Mass [t] Mass [t] 20

Thrust [kN] 6 6 1.6

4 Thrust along 4 10 T Tc (t) 2 ˆez gmk (hover) T k k T m(t) arccos(ˆe Tc (t)/ Tc (t) 2) 2 T 2 ˆe Tc (t) 1.5 z k k k c,k k2 z m arccos(ˆeTT / T ) σ ˆeTT k z c,k k c,k k2 k z c,k 0 0 0 0 20 40 60 75 0 20 40 60 75 0 20 40 60 75 0 20 40 60 75 Time [s] Time [s] Time [s] Time [s] (d) Mass history. The vehicle uses a (e) Pointing angle history. Constraint (f) Thrust magnitude history. The LCvx (g) Vertical thrust projection. The feasible amount of fuel. (103h) gets activated. equality constraint (96f) is tight. rocket never hovers.

FIGURE 21 Various views of the globally optimal rocket landing trajectory obtained via lossless convexification for Problem 95. Circular markers show the discrete-time trajectory directly returned from the convex solver. Lines show the continuous-time trajectory, which is obtained by numerically integrating the ZOH discretized control signal through the dynamics (95b)-(95d). The exact match at every discrete-time node demonstrates that the solution is dynamically feasible for the actual continuous-time vehicle.

48 Preprint

constraints: TABLE 2 Algorithm parameters for the quadrotor obstacle avoidance example. amin a(t) 2 amax, (112a) ≤ k k ≤ T a(t) 2 cos θmax nˆ a(t), (112b) Problem parameters: k k ≤ tf ,min 0.0 s min final time where 0 < amin < amax are the acceleration bounds and tf ,max 2.5 s max final time 2 θmax (0°, 180°] is the maximum angle by which the g 9.81 m s− gravity ∈ 2 acceleration vector is allowed to deviate from the “up” amin 0.6 m s− min acceleration 2 direction. amax 23.2 m s− max acceleration Finally, obstacles are modeled as three-dimensional el- max 60 ° max tilt angle c ( ) m center of obstacle 1 lipsoidal keep-out zones, described by the following non- 1 1, 2, 0 θc2 (2, 5, 0) m center of obstacle 2 convex constraints: 1 H1 diag(2, 2, 0) m− shape of obstacle 1 1 H2 diag(1.5, 1.5, 0) m− shape of obstacle 2 Hj r(t) cj 2 1, j = 1, ... , nobs, (113) k − k ≥ r0 (0, 0, 0) m initial position vector 1 v0 (0, 0, 0) m s− initial velocity vector where c 3 denotes the center, and H 3 defines j R j S++ rf (2.5, 6, 0) m final position vector ∈ ∈ 1 the size and shape of the j-th obstacle. Note that the vf (0, 0, 0) m s− final velocity vector formulation allows the obstacles to intersect. Using the above equations, we wish to solve the fol- Algorithm parameters: lowing free final time optimal control problem that min- SCvx GuSTO imizes control energy: N 30 30 q ∞ ∞ tf 2 ˆq ∞ ∞ min a(t) 2 dt (114a) P (48) (48) u,tf 0 k k Z h (71) s.t. (109)-(113). (114b) 30 λ 4 9 0, max 10 , 10 Due to the presence of nonconvex state constraints λ 1 10 3 3 (113), only embedded LCvx applies to the above prob- λ0, λ1 10− , 10 10− , 10 lem. In particular, LCvx can be used to handle the non- η , , 0, 0.1, 0.7 0.1, 0.9, 0 1 2 − convex input lower bound in (112a) along with the tilt ηsh,η gr 2, 2 2, 2 constraint in (112b). This removes some of the noncon- ρfailρ ρ 5 vexity. However, it is only a partial convexification which β β 0.8 γk 6 still leaves behind a nonconvex optimal control problem. ∗ µ 0 0 Thus, we must resort to SCP techniques for the solution. r 0 0 ε SCvx Formulation ε We begin by demonstrating how SCvx can be used to solve Problem 114. To this end, we describe how the input vectors: problem can be cast into the template of Problem 39. Once this is done, the rest of the solution process is r 6 a 4 x = R , u = R . (116) completely automated by the mechanics of the SCvx al- r˙ ∈ σ ∈ gorithm as described in the previous section. To make     the notation lighter, we will omit the argument of time Next, we have to deal with the fact that Problem 39 whenever possible. uses normalized time t [0, 1], while Problem 114 uses ∈ Let us begin by defining the state and input vectors. absolute time t [0, tf ]. To reconcile the two quanti- ∈ For the input vector, note that the nonconvex input con- ties, we use a one-dimensional parameter vector p R. ∈ straints in (112) can be convexified via embedded LCvx. The parameter defines a time dilation such that the In particular, the relaxation used for Problem 10 can following relation holds: losslessly convexify both input constraints by introduc- t = pt, (117) ing a slack input σ R and rewriting (112) as: ∈ from which it follows that p tf . In absolute time, the amin σ amax, (115a) ≤ ≤ dynamics are given directly≡ by writing (109) in terms a 2 σ, (115b) k k ≤ of (116), which gives a set of time-invariant first-order T σ cos θmax nˆ a, (115c) ≤ ordinary differential equations: where (115b) is the familiar LCvx equality constraint. r˙ f(x, u) = . (118) Thus, we can define the following state and “augmented” a gnˆ  −  49 Preprint

For Problem 39, we need to convert the dynamics to We now have a complete definition of Problem 39 for normalized time. We do so by applying the chain rule: the quadrotor obstacle avoidance problem. The only re- maining task is to choose the SCvx algorithm parame- dx dx dt = = pf x, u , (119) ters listed in Table1. The rest of the solution process dt dt dt is completely automated by the general SCvx algorithm which yields the dynamics (39b) in normalized time: description in Part II.

f(x, u, p) = pf x, u . (120) GuSTO Formulation The GuSTO algorithm can also be used to solve Prob- The convex path constraints (39c)-(39d) are easy to lem 114. The formulation is almost identical to SCvx, write. Although there are no convex state constraints, which underscores the fact that the two algorithms can there are convex final time bounds (110). These can be be used interchangeably to solve many of the same prob- included as a convex state path constraint (39c), which lems. The quadratic running cost (68) encodes (125) as is mixed in the state and parameter. Using (117), we follows: define the convex state path constraints as: 2 6 S(p) = diag 0, 0, 0, g− , (126a) = (x, p) R R : tf,min p tf,max . (121) X ∈ × ≤ ≤ `(x, p) = 0,  (126b)  On the other hand, the convex input constraint set g(x, p) = 0. (126c) is given by all the input vectors that satisfy (115). U The nonconvex path constraints (39e) are given by the We can also cast the dynamics (118) into the control vector function s : R3 Rnobs , whose elements encode affine form (69): the obstacle avoidance→ constraints: r˙ f0(x) = , (127a) sj(r) = 1 Hj(r cj) 2, j = 1, ... , nobs. (122) gnˆ − k − k −  We will briefly mention how to evaluate the Jacobian 0 fi(x) = , i = 1, 2, 3, (127b) (44e) for (122). Suppose that a reference position trajec- ei 1   tory r¯(t) 0 is available, and consider the j-th obstacle f4(x) = 0, (127c) constraint{ } in (122). The following gradient then allows 3 to evaluate (44e): where ei R is the i-th standard basis vector. The reader may∈ be surprised that these are the only changes T Hj Hj r cj required to convert the SCvx formulation from the previ- sj(r) = − . (123) ∇ − ous section into a form that can be ingested by GuSTO. Hj r cj 2 − The only remaining task is to choose the GuSTO algo-  The boundary conditions (39f) and ( 39g) are obtained rithm parameters listed in Table1. Just like for SCvx, from (111): the rest of the solution process is completely automated by the general GuSTO algorithm description in Part II. r(0) r0 gic x(0), p = − , (124a) r˙(0) v0  −  Initial Trajectory Guess  r(1) rf The initial state reference trajectory is obtained by a gtc x(1), p = − . (124b) r˙(1) vf simple straight-line interpolation as provided by (42):  −   Lastly, we have to convert the cost (114a) into the r r x(t) = (1 t) 0 + t f , for t [0, 1]. (128) Bolza form (40). There is no terminal cost, hence φ 0. − v0 vf ∈ On the other hand, the direct transcription of the run-≡     ning cost would be Γ(x, u, p) = pσ2. However, we will The initial parameter vector, which is just the time simplify this by omitting the time dilation parameter. dilation, is chosen to be in the middle of the allowed This simplification makes the problem easier by remov- trajectory durations: ing nonconvexity from the running cost, and numerical t + t p = f,min f,max . (129) results still show good resulting trajectories. Further- 2 more, since SCvx augments the cost with penalty terms in (49), we will normalize the running cost by its nomi- The initial input trajectory is guessed based on insight nal value in order to make it roughly unity for a “mild” about the quadrotor problem. Ignoring any particular trajectory. This greatly facilitates the selection of the trajectory task, we know that the quadrotor will gener- penalty weight λ. By taking the nominal value of σ as ally have to support its own weight under the influence the hover condition, we define the following running cost: of gravity. Thus, we choose a constant initial input guess that would make a static quadrotor hover: σ 2 Γ(x, u, p) = . (125) a(t) = gnˆ, σ(t) = g, for t [0, 1]. (130) g ∈   50 Preprint

TABLE 3 Breakdown of subproblem size for the Initial r r 4.0 quadrotor obstacle avoidance example. 6 Converged r 6 a (scaled) Obstacle Obstacle 3.5 SCvx GuSTO 5 5 3.0 Variables 640 542 [m] [m] 2 2

r 4 r 4 Affine equalities 186 186 2.5 [m/s] 2 k

Affine inequalities 240 360 ˙r

3 3 k One-norm cones 29 0 2.0 Infinity-norm cones 61 31 2 2 1.5 Second-order cones 30 30 Velocity North position North position 1.0 1 1 0.5 0 0 0.0 0.5 0.5 1.5 2.5 3.5 0.5 0.5 1.5 2.5 3.5 This initial guess is infeasible with respect to both − − the dynamics and the obstacle constraints, and it is ex- East position r1 [m] East position r1 [m] tremely cheap to compute. The fact that it works well (a) SCvx. in practice highlights two facts: SCP methods are rela- Initial r r 4.0 6 6 tively easy to initialize, and they readily accept infeasible Converged r a (scaled) Obstacle Obstacle 3.5 initial guesses. Note that in the particular case of this 5 5 problem, an initial trajectory could also be computed 3.0 [m] [m] 2 2

r 4 r 4 using a convex version of the problem obtained by re- [m/s] 2.5 2 k

moving the obstacle constraints (113) and applying the ˙r

3 3 k LCvx relaxation (115). 2.0 2 2 1.5 Velocity Numerical Results North position North position 1.0 1 1 We now have a specialized instance of Problem 39 and an 0.5 initialization strategy for the quadrotor obstacle avoid- 0 0 ance problem. The solution is obtained via SCP ac- 0.0 0.5 0.5 1.5 2.5 3.5 0.5 0.5 1.5 2.5 3.5 − − cording to the general algorithm descriptions for SCvx East position r1 [m] East position r1 [m] and GuSTO. For temporal discretization, we use the (b) GuSTO. FOH interpolating polynomial method from “Discretiz- ing Continuous-time Optimal Control Problems”. The algorithm parameters are provided in Table2, and the FIGURE 23 The position trajectory evolution (left) and the final con- full implementation is available in the code repository verged trajectory (right) for the quadrotor obstacle avoidance prob- lem. The continuous-time trajectory in the right plots is obtained by linked in Figure2. ECOS is used as the numerical con- numerically integrating the dynamics (131). The fact that this trajec- vex optimizer at 2 in Figure 11[216]. The timing results tory passes through the discrete-time subproblem solution confirms correspond to a Dell XPS 13 9260 laptop powered by an dynamic feasibility. The red lines in the right plots show the accelera- Intel Core i5-7200U CPU clocked at 2.5 GHz. The com- tion vector as seen from above. puter has 8 GiB LPDD3 RAM and 128 KiB L1, 512 KiB L2, and 3 MiB L3 cache. The convergence process for both algorithms is crease the initial guess (129) until the maximum allowed illustrated in Figure 24. We have set the convergence flight time tf,max. Note that this is optimal, since a quadrotor minimizing control energy (114a) will opt for tolerances ε = εr = 0 and we terminate both algorithms after 15 iterations. At each iteration, the algorithms a slow trajectory with the lowest acceleration. have to solve subproblems of roughly equal sizes, as The SCvx and GuSTO solutions are practically iden- shown in Table3. Differences in the sizes arise from tical. The fact that they were produced by two different the slightly different subproblem formulations of each algorithms can only be spotted from the different con- algorithm. Among the primary contributors are how vergence histories on the left side in Figure 23. It is not artificial infeasibility and unboundedness are treated, always intuitive how the initial guess morphs into the and differences in the cost penalty terms. Notably, final trajectory. It is remarkable that the infeasible tra- GuSTO has no one-norm cones because it does not have jectories of the early iterations morph into a smooth and a dynamics virtual control term like SCvx (compare feasible trajectory. Yet, this is guaranteed by the SCP (55b) and (82b)). Despite their differences, Figure 24 convergence theory in Part II. For this and many other shows that GuSTO and SCvx are equally fast. trajectory problems, we have observed time and again The trajectory solutions for SCvx and GuSTO are how SCP is adept at morphing rough initial guesses into shown in Figures 23 and 25. Recall that this is a free fine-tuned feasible and locally optimal trajectories. final time problem, and both algorithms are able to in- Finally, we will make a minor note of that temporal 51 Preprint 2 2 k k 2 2 ∗ ∗ k k X X ∗ ∗ − − X X i i k k

X X 1 k k 10− 2 10−

3 10−

4 10− 5 10−

Distance from solution, 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Distance from solution, 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Iteration number Iteration number

Solve 1.0 Solve 1.0 0.06 Discretize 0.06 Discretize Formulate Formulate 0.8 0.8 0.04 0.04 0.6 0.6

0.4 0.4 0.02 0.02 Cumulative time [s] Cumulative time [s]

Time per iteration [s] 0.2 Time per iteration [s] 0.2

0.00 0.00 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Iteration number Iteration number (a) SCvx. (b) GuSTO.

FIGURE 24 Convergence and runtime performance for the quadrotor obstacle avoidance problem. Both algorithms take a similar amount of time and number of iterations to converge to a given tolerance. The runtime subplots in the bottom row show statistics on algorithm performance over 50 executions. The solution times per subproblem are roughly equal, which shows that the subproblem difficulty stays constant over the iterations. Formulate measures the time taken to parse the subproblem into the input format of the convex optimizer; Discretize measures the time taken to temporally discretize the linearized dynamics (45b); and Solve measures the time taken by the core convex numerical optimizer. Each bar shows the median time. The blue trace across the diagonal shows the cumulative time obtained by summing the runtimes of all preceding iterations. Its markers are placed at the median time, and the error bars show the 10% (bottom) and 90% (top) quantiles. Because small runtime differences accumulate over iterations, the error bars grow with iteration count. discretization results in some clipping of the obstacle Internal Ball Camera (Int-Ball) are two recent successful keep-out zones in Figure 23. This is the direct result deployments of such robots. Their goals include filming of imposing constraints only at the discrete-time nodes. the ISS and assisting with maintenance tasks [221]. The Various strategies exist to mitigate the clipping effect, particulars of this SCP example are taken primarily such as increasing the radius of the keep-out zones, in- from [50]. creasing the number of discretization points, imposing a The quadrotor in the previous section was modeled sufficiently low maximum velocity constraint, or numeri- as a point mass whose attitude is approximated by the cally minimizing a function related to the state transition direction of the acceleration vector. The free-flyer, on matrix [191, 192]. the other hand, is a more complex vehicle that must generally perform coupled translational and rotational SCP: 6-DoF Free-flyer motion using a multi-thruster assembly. Maneuvers may Having demonstrated the use of sequential convex pro- require to point a camera at a fixed target, or to emu- gramming on a relatively simple quadrotor trajectory, we late nonholonomic behavior for the sake of predictability now present a substantially more challenging example in- and operator comfort [222, 223]. This calls for whole- volving nonlinear 6-DoF dynamics and a more complex body motion planning, for which we model the free-flyer set of obstacle avoidance constraints. as a full 6-DoF rigid body with both translational and The objective is to compute a trajectory for a 6-DoF rotational dynamics. free-flying robotic vehicle that must navigate through To describe the equations of motion, we need to in- an environment akin to the International Space Station troduce two reference frames. First, let be an iner- (ISS). Free-flyers are robots that provide assistance to tial reference frame with a conveniently positioned,FI but human operators in micro-gravity environments [219, otherwise arbitrary, origin. Second, let be a rotating 220]. As shown in Figure 26, Astrobee and the JAXA reference frame affixed to the robot’s centerFB of mass, and 52 Preprint

] 25 2 a k k2 σ

[m/s 20 2 k

a 15 k

10

5

Acceleration 0 0.0 0.5 1.0 1.5 2.0 2.5 Time [s] ] ◦

)[ 60 1 − 2

k FIGURE 26 Two examples of free-flyer robots at the International a k

a 40 Space Station, the JAXA Int-Ball (left) and the Naval Postgraduate T

ˆn School/NASA Astrobee (right) [217, 218]. These robots provide a helping hand in space station maintenance tasks. 20

Tilt arccos( 3 0 M R . The physical parameters of the free-flyer, the 0.0 0.5 1.0 1.5 2.0 2.5 massB ∈m > 0 and the principal moment of inertia ma- Time [s] 3 3 trix J R × , are fixed. The dynamics are written in ∈ absolute time t that spans the interval [0, tf ]. We allow FIGURE 25 The acceleration norm and tilt angle time histories for the final time tf to be optimized, and bound it using the the converged trajectory of the quadrotor obstacle avoidance prob- previous constraint (110). lem. These are visually identical for SCvx and GuSTO, so we only show a single plot. The continuous-time acceleration norm is obtained The initial and final conditions for each state are spec- from the FOH assumption (S28), while the continuous-time tilt angle is ified in this example to be fixed constants: obtained by integrating the solution through the nonlinear dynamics. Similar to Figure 21, the acceleration time history plot confirms that r (0) = r0, r (tf ) = rf , (132a) lossless convexification holds (i.e., the constraint (115b) holds with I I v (0) = v0, v (tf ) = vf , (132b) equality). I I qB←I (0) = q0, qB←I (tf ) = qf , (132c)

ω (0) = 0, ω (tf ) = 0. (132d) whose unit vectors are aligned with the robot’s principal B B axes of inertia. We call the inertial frame and the The free-flyer robot implements a 6-DoF holonomic ac- body frame. Correspondingly,FI vectors expressedF inB FI tuation system based on a centrifugal impeller that pres- are inertial vectors and carry an subscript (e.g., x ), surizes air, which can then be vented from a set of noz- while those expressed in are bodyI vectors and carryI FB zles distributed around the body [224, 225]. Holonomic a subscript (e.g., x ). For the purpose of trajectory actuation means that the thrust and torque vectors are generation,B we encodeB the orientation of with respect FB independent from the vehicle’s attitude [226]. The ca- to using a vector representation of a unit quaternion, pability of this system can be modeled by the following FI 4 qB←I R . control input constraints: Our∈ convention is to represent the translational dy- namics in and the attitude dynamics in . This T (t) 2 Tmax, M (t) 2 Mmax, (133) yields the followingFI Newton-Euler equations thatFB govern k I k ≤ k B k ≤ the free-flyer’s motion [59]: where Tmax > 0 and Mmax > 0 are user-defined constants representing the maximum thrust and torque. r˙ (t) = v (t), (131a) This problem involves both convex and nonconvex I I 1 state constraints. Convex constraints are used to bound v˙ (t) = m− T (t), (131b) I I the velocity and angular velocity magnitudes to user- 1 q˙B←I (t) = qB←I (t) ω (t), (131c) defined constants: 2 ⊗ B 1 ω˙ (t) = J − M (t) ω (t)×Jω (t) . (131d) v (t) 2 vmax, ω (t) 2 ωmax. (134) B B − B B k I k ≤ k B k ≤ The state variables in the above equations are the in- Nonconvex state constraints are used to model the (fic- ertial position r R3, the inertial velocity v R3, tional) ISS flight space and to avoid floating obstacles. I ∈ I ∈ 4 the aforementioned unit quaternion attitude qB←I R , The latter are modeled exactly as in the quadrotor ex- ∈ and the body angular velocity ω R3. The latter vari- ample using the constraints in (113). The flight space, able represents the rate at whichB ∈ rotates with respect on the other hand, is represented by a union of rectan- to . The free-flyer’s motion isF controlledB by an iner- gular rooms. This is a difficult nonconvex constraint and FI tial thrust vector T R3 and a body torque vector its efficient modeling requires some explanation. I ∈ 53 Preprint

dss (exact) d˜ss for σ = 1 d˜ss for σ = 10 d˜ss for σ = 50 1

6 0

[m] 4 ,2

I 1 r − 2

2 SDF value

Position 0 −

3 2 − − 6 8 10 12 6 8 10 12 6 8 10 12 6 8 10 12 Position r ,1 [m] Position r ,1 [m] Position r ,1 [m] Position r ,1 [m] I I I I (a) (b) (c) (d)

FIGURE 27 A heatmap visualization of the exact SDF (143) and the approximate SDF (146) for several values of the sharpness parameter σ. Each plot also shows the SDF zero-level set boundary as a dashed line. This boundary encloses the feasible flight space, which corresponds to nonnegative values of the SDF.

ss At the conceptual level, the space station flight space where vector division is entrywise, and the new terms ci 3 ss is represented by a function dss : R R that maps and si are the room’s centroid and diagonal: → inertial position to a scalar number. This is commonly ss ss ss ss ss ui + li ss ui li referred to as a signed distance function (SDF) [227]. Let ci = , si = − . (140) us denote the set of positions that are within the flight 2 2 3 space by ss R . A valid SDF is given by any function To find the SDF for the overall flight space, we begin O ⊂ that satisfies the following property: with the simpler task of writing an SDF for a single room. This is straightforward using (139): r ss dss(r ) 0. (135) I I ss ∈ O ⇔ ≥ r ci dss,i(r ) , 1 I − , (141) I − sss If we can formulate a continuously differentiable dss, i ∞ then we can model the flight space using (135) as the which is a concave function that satisfies a similar prop- following nonconvex path constraint (39e): erty to (135): d (r ) 0. (136) ss r i dss,i(r ) 0. (142) I ≥ I ∈ O ⇔ I ≥ Open-source libraries such as Bullet [228] are available Because dss,i is concave, the constraint on the right to compute the SDF for obstacles of arbitrary shape. In side of (142) is convex. This means that constraining this work, we will use a simpler custom implementation. the robot to be inside room i is a convex operation, O To begin, let us model the space station as an assembly which makes sense since i is a convex set. of several rooms. This is expressed as a set union: As the free-flyer traversesO the flight space, one can imagine the room SDFs to evolve based on the robot’s nss position. When the robot enters the i-th room, dss,i be- ss i, (137) O , O comes positive and grows up to a maximum value of one i=1 [ as the robot approaches the room center. As the robot where each room i is taken to be a rectangular box: exits the room, dss,i becomes negative and decreases in O value as the robot flies further away. To keep the robot 3 ss ss i , r R : li r ui . (138) inside the space station flight space, the room SDFs have O I ∈ ≤ I ≤ to evolve such that there is always at least one nonnega-  ss ss The coordinates li and ui represent the “rear bottom tive room SDF. This requirement precisely describes the right” and the “ahead top left” corners of the i-th room, overall SDF, which can be encoded mathematically as a when looking along the positive axis directions. Equiv- maximization: alently, but more advantageously for implementing the dss(r ) , max dss,i(r ). (143) SDF, the set (138) can be represented as: I i=1,...,nss I

ss 3 r ci We now have an SDF definition which satisfies the i , r R : I −ss 1 , (139) required property (135). Figure 27(a) shows an example O I ∈ si ≤ n ∞ o 54

Preprint

scalar field generated by (143) for a typical space station by replacing the max operator with Lσ. We readily ob- layout. Visually, when restricted to a plane with a fixed serve that dss is indeed highly nonconvex, and that the altitude r ,3, the individual room SDFs form four-sided approximation quickly converges to the exact SDF as σ “pyramids”I above their corresponding room. increases. We now generalize this one-dimensional exam- The room SDFs dss,i are concave, however the maxi- ple and replace (143) with the following approximation: mization in (143) generates a nonconvex function. Two possible strategies to encode (143) are by introducing in- ˜ dss(r ) , Lσ δss(r ) , (146a) teger variables or by a smooth approximation [46]. The I I former strategy generates a mixed-integer convex sub- δss,i(r ) dss,i(r ), i = 1, ... , nss, (146b) problem, which is possible to solve but does not fit the I ≤ I SCP algorithm mold of this article. As mentioned before where the new functions δss,i as so-called “slack” room for Problem 39, our subproblems do not involve integer SDFs. The model (146) admits several favorable prop- variables. We thus pursue the smooth approximation erties. First, (146a) is smooth in the new slack SDFs strategy, which yields an arbitrarily accurate approxima- and can be included directly in (39e) as the following tion of the feasible flight space ss and carries the sig- nonconvex path constraint: nificant computational benefit ofO avoiding mixed-integer d˜ss(r ) 0. (147) programming. I ≥ The crux of our strategy is to replace the maximiza- Second, the constraints in (146b) are convex and can tion in (143) with the softmax function. Given a general be included directly in (39c). Overall, the approximate vector v Rn, this function is defined by: ∈ SDF (146a) satisfies the following property: n 1 σvi ˜ Lσ(v) = σ− log e , (144) r ss δss(r ) such that (147) holds. (148) I ∈ O ⇔ ∃ I i=1 X ˜ 3 where ss R is an approximation of ss that becomes O ⊂ O where σ > 0 is a sharpness parameter such that Lσ upper arbitrarily more accurate as the sharpness parameter σ bounds the exact max with an additive error of at most increases. The geometry of this convergence process is log(n)/σ. To develop intuition, consider the SDF (143) illustrated in Figure 27(b)-(c) for a typical space station for two adjacent rooms and restricted along the j-th axis layout. Crucially, the fact that Lσ is an upper bound of the inertial frame. We can then write the SDF simply of the max means that the approximate SDF d˜ss is non- as: negative at the interfaces of adjacent rooms. In other ss ss words, the passage between adjacent rooms is not artifi- r ,j c1j r ,j c2j dss(r ,j) = max 1 I − , 1 I − . cially blocked by our approximation. I − sss − sss  1j 2j  Summarizing the above discussion, we wish to solve (145) the following free final time optimal control problem that Figure 28 illustrates the relationship between the ex- minimizes control energy: act SDF (145) and its approximation, which is obtained tf 2 2 min T (t) 2 + M (t) 2 dt (149a) TI ,MB,tf k I k k B k Z0 s.t. (131)-(134), (147), (110), (113). (149b)

SCvx Formulation d˜ss Like for the quadrotor example, we begin by demonstrat-

SDF value ing how to cast Problem 149 into the standard template of Problem 39. While this process is mostly similar to r ,j I that of the quadrotor, the particularities of the flight lss 1 uss lss 2 uss 1j O 1j 2j O 2j space constraint (147) will reveal a salient feature of effi- cient modeling for SCP. Once the modeling step is done and an initial guess trajectory is defined, the solution dss process is completely automated by the general SCvx algorithm description in Part II. To keep the notation dss,2 dss,1 light, we omit the argument of time where possible. Looking at the dynamics (131), we define the following FIGURE 28 Illustration of the correspondence between between the state and control vectors: exact SDF (143) and its approximation d˜ss using the softmax function (144). As the sharpness parameter σ increases, the approximation 13 x = r , v , qB←I , ω R , (150a) quickly converges to the exact SDF. This figure illustrates a sweep for I I B ∈ 6 σ [1, 5], where lower values are associated with darker colors. u = T , M R . (150b) ∈ I B ∈ 55  Preprint

TABLE 4 Algorithm parameters for the 6-DoF free- by: flyer example. αt 1+Nnss p = R . (152) ∆ss ∈ Problem parameters:   In absolute time, the dynamics (39b) are given directly tf ,min 60 s min final time tf ,max 200 s max final time by (131). As for the quadrotor, this forms a set of time- m 7.2 kg free-flyer mass invariant first-order ordinary differential equations: J 0.1083 I kg m2 free-flyer inertia matrix · 3 Tmax 20 mN max two-norm of thrust v Mmax 100 µN m max two-norm of torque I 1 1 vmax 0.4 m s− max free-flyer speed m− T 1 f(x, u) = I . (153) max 1 ° s− max free-flyer angular velocity  1  2 qB←I ω c1 (8.5, 0.15, 5.0) m center of obstacle 1 1 ⊗ B − J M ω×Jω ωc2 (11.2, 1.84, 5.0) m center of obstacle 2  −   B − B B  c3 (11.3, 3.8, 4.8) m center of obstacle 3   1 H1, H2, H3 3.33 I3 m− shape of obstacles 1, 2, and 3 The boundary conditions (39f) and (39g) are obtained ss · li see code m room rear bottom right corner ss from (132): ui see code m room ahead top left corner r0 (6.5, 0.2, 5) m initial position vector − 1 v0 (0.035, 0.035, 0) m s− initial velocity vector r (0) r0 q 40(0, 1, 1) initial attitude quaternion I − 0 − v (0) v0 rf (11.3, 6, 4.5) m final position vector gic x(0), p =  I −  , (154a) 1 qB←I (0) q0 vf (0, 0, 0) m s− final velocity vector − qf 0(0, 0, 1) final attitude quaternion   ω (0)  50 softmax sharpness parameter  B  4   ss 10− slack room SDF weight r (1) rf I − σ v (1) vf Algorithmε parameters: gtc x(1), p =  I −  . (154b) qB←I (1) qf SCvx GuSTO −   ω (1)  N 50 30  B  q ∞ ∞   ˆq ∞ ∞ The dynamics are converted to normalized time in the P (48) (48) same way as (120): h (71) 103 λ 4 9 0, max 10 , 10 f(x, u, p) = αtf(x, u). (155) λ 1 1 3 3 λ0, λ1 10− , 10 10− , 10 The convex state and input path constraints (39c)- η , , 0, 0.1, 0.7 0.1, 0.5, 0 1 2 − (39d) are straightforward. As for the quadrotor example, ηsh,η gr 2, 2 2, 2 ρfailρ ρ 5 we leverage the mixed state-parameter nature of (39c) to β β 0.8 include all of the convex state and parameter constraints. γk 6 ∗ µ 0 0 In particular, these are (110), (134), and (146b). Using r 0 0 the definition of time dilation, we translate (110) into ε the constraint: ε

tf,min α tf,max. (156) ≤ t ≤ Next, we define the parameter vector to serve two pur- Using the definition of the concatenated slack SDF poses. First, as for the quadrotor, we define a time dila- vector (151), we translate (146b) into the following con- tion αt such that (117) holds, yielding t = αtt. Second, straints: we take advantage of the fact that (39c) is mixed in the state and parameter in order to place the slack room ∆ss,ik dss,i r (tk) , i = 1, ... , nss, (157) ≤ I SDFs in (146b) into the parameter vector. In particu- k = 1, ... , N. lar, we recognize that according to (55c), the constraint  (146b) is imposed only at the discrete-time grid nodes. Consequently, the convex path constraint set in X Thus, for a grid of N nodes, there are only Nnss instances (39c) is given by: of (146b) to be included. We can therefore define the fol- 13 1+Nn lowing vector: = (x, p) R R ss :(134), (156), X ∈ ×  and (157) hold . (158) 1 N Nnss ∆ss , (δss, ... , δss ) R , (151) ∈ The convex input constraint set in (39d ) is given simply by all the input vectors thatU satisfy (133). k where δss δss r (tk) and ∆ss,i+(k 1)nss denotes the The nonconvex path constraints (39e) for the free-flyer ≡ I − slack SDF value for the i-th room at time tk. To keep problem involve the ellipsoidal floating obstacles (113)  the notation concise, we will use the shorthand ∆ss,ik and the approximate flight space constraint (147). The ≡ ∆ss,i+(k 1)nss . The overall parameter vector is then given floating obstacle constraints are modeled exactly like for − 56 Preprint the quadrotor using (122). For the flight space con- k straint, we leverage the concatenated slack SDF vector ∂L ( ss) k ∂ ss,i (151) and the eventual temporal discretization of the σ δ1.0 problem in order to impose (147) at each temporal grid δ

node as follows: 50 0.8

=

k σ Lσ δss 0, k = 1, ... , N. (159) ≥ 0.6  Vanishing gradient Hence, the nonconvex path constraint function in (39e) = 1 3 Nnss nobs+1 threshold, 0.4 can be written as s : R R R . The first k × → ss,j=i 1 n components are given by (122) and the last compo- 6 , nss− obs dss,i r (tk ) σ I 0.2 k  δ  ss,i nent is given by the negative left-hand side of (159).  dss,i r (tk ) I It remains to define the running cost of the Bolza cost 0.0 δ 3 2 1 0 1  function (40). Like for the quadrotor, we scale the inte- − − − grand in (149a) to be mindful of the penalty terms which the SCvx algorithm will add. Furthermore, we simplify FIGURE 29 Visualization of the effect of slackness in the SDF lower- the cost by neglecting time dilation and directly associ- bound constraint (146b) on the gradient of the approximate SDF ating the absolute-time integral of (149a) with the nor- (146a). This plot is obtained by setting nss = 6 and δss,j = 1 for all − malized-time integral of (40). This yields the following j = i. The individual curves are obtained by gradually reducing δss,i 6 convex running cost definition: from its maximum value of dss,i. As the sharpness parameter σ in- creases, a “cutoff” value appears, below which the approximate SDF 2 2 becomes insensitive to changes in δss,i. T 2 M 2 Γ(x, u, p) = k I k + k Bk . (160) T M  max   max  At this point, it may seem as though we are finished has a gradient ∂dss/∂dss,i = 1. Since we want the SCP with formulating Problem 39 for SCvx. However, the subproblem to be an accurate local approximation of the seemingly innocuous flight space constraint (159) actu- nonconvex problem, we expect the same behavior for the ally hides an important difficulty that we will now ad- approximate SDF (146a) for high values of σ. However, dress. The importance of the following discussion can- this may not be the case. not be overstated, as it can mean the difference between Figure 29 illustrates what happens to the approximate successful trajectory generation, and convergence to an SDF gradient (161) as the slackness in (157) increases. infeasible trajectory (i.e., one with non-zero virtual con- First, we note that when there is no slackness, increas- trol). In the case of the 6-DoF free-flyer, omission of the ing the σ parameter does indeed make the approximate following discussion incurs a 40% optimality penalty for gradient converge to the exact value of one. However, the value of (149a). as slackness grows, there is a distinct cutoff value below We begin investigating (159) by writing down its Jaco- which (161) becomes zero. This is known as a vanish- bians, which SCvx will use for linearizing the constraint. ing gradient problem, and has been studied extensively k nss Note that (159) is a function of only δss R , which is for machine learning [229]. The core issue is that SCP part of the concatenated slack SDF (151∈), and thus re- relies heavily on gradient information in order to deter- sides in the parameter vector. Hence, only the Jacobian mine how to improve the feasibility and optimality of (44g) is non-zero. Using the general softmax definition the subproblem solution. As an analogy, the gradient k (144), the i-th element of Lσ δss is given by: acts like a torchlight that illuminates the local surround- ∇ ings in a dark room and allows one to take a step closer k nss  1 ∂Lσ δss σδk − σδk to a light switch. When the gradient vanishes, so does = e ss,j e ss,i . (161) ∂δk the torchlight, and SCP no longer has information about ss,i  j=1  X  which direction is best to take. Unless the solution is al- When the slack SDF satisfies the lower-bound (157) ready locally optimal, a vanished gradient most often with equality, the Jacobian (161) is an accurate repre- either blocks SCP from finding more optimal solutions, sentation of how the overall SDF (146a) changes due to or forces it to use non-zero virtual control. The result is small perturbations in the robot’s position. The prob- that the converged trajectory is either (heavily) subop- lematic case occurs when this bound is loose. To illus- timal or even infeasible. trate the idea, suppose that the robot is located near the Looking at Figure 29, one may ask, in order to recover center of i, such that dss,i r (tk) = 0.8. We assume gradient information, why does SCP not simply increase O I k that the rooms do not overlap and that the slack SDF val- δss,i above the vanishing threshold? But remember, it is k  k ues of the other rooms satisfy δss,j = dss,j r (tk) = 1 the gradient that indicates that increasing δss,i is a good I − for all j = i. Since the robot is uniquely inside i, the approach in the first place. The situation is much like 6 O exact SDF (143) is locally a linear function of dss,i and focusing your eyes on the flat region of the σ = 50 curve 57 Preprint

Favorable gradient behavior is instrumental for good performance.

on the very left in Figure 29. If you only saw the part 0 k 1 of the curve for δ /dss,i r (tk) [ 3, 2], you would m− ei ss,i fi(x, p) =   , i = 1, 2, 3, (164b) also not know whether it isI best to∈ increase− − or decrease 0  δk .  0  ss,i   Fortunately, the remedy is quite simple. Because (135)  0  is a necessary and sufficient condition, we know that 0 f (x, p) = , i = 4, 5, 6, (164c) slackness in (157) cannot be used to make the trajec- i  0  1 tory more optimal. In other words, a trajectory with J − ei non-zero slackness will not achieve a lower cost (149a).   3   where ei R is the i-th standard basis vector. Just like Hence, we simply need some way to incentivize the con- ∈ vex subproblem optimizer to make (157) hold with equal- for the quadrotor, we are “done” at this point and the ity. Our approach is to introduce a terminal cost that rest of the optimization model is exactly the same as for maximizes the concatenated slack SDF: SCvx in the last section.

N nss Initial Trajectory Guess φ(∆ss) = εss ∆ss,ik, (162) − The initial trajectory guess is based on some simple in- k=1 i=1 X X tuition about what a feasible free-flyer trajectory might look like. Although this guess is more complicated than where ε is any user-chosen positive number. ss R++ the straight-line initialization used for the quadrotor, it To make sure∈ that (162) does not interfere with the ex- is based on purely kinematic considerations. This makes tra penalty terms introduced by SCvx, we set ε to a ss the guess quick to compute but also means that it does very small but numerically tolerable value as shown in not satisfy the dynamics and obstacle constraints. The Table4. fact that SCP readily morphs this coarse guess into a In summary, our investigation into (159) allowed us feasible and locally optimal trajectory corroborates the to identify a vanishing gradient issue. This resulted in a effectiveness of SCP methods for high-dimensional non- simple yet effective remedy in the form of a terminal cost convex trajectory generation tasks. (162). The discussion hopefully highlights three salient To begin, the time dilation α is obtained by averaging features of good modeling for SCP-based trajectory gen- t the final time bounds as in (129). An “L-shape” path is eration. First, SCP does not have equal performance for then used for the position trajectory guess. In particular, mathematically equivalent problem formulations (such recall that according to (132a), the free-flyer has to go as the free-flyer problem with and without (162)). Sec- from r to r . We define a constant velocity trajectory ond, favorable gradient behavior is instrumental for good 0 f which closes the gap between r and r along the first performance. Third, remedies to recover good perfor- 0 f axis, then the second, and finally the third, in a total mance for difficult problems are often surprisingly sim- time of α seconds. The trajectory thus consists of three ple. t straight legs with sharp 90 degree turns at the transition points, which is akin to the Manhattan or taxicab geome- GuSTO Formulation try of the one-norm [230]. The velocity is readily derived Like for the quadrotor example, the GuSTO formulation from the position trajectory, and is a constant-norm vec- is very similar to SCvx. We can express (160) as the tor whose direction changes twice to align with the ap- quadratic running cost (68) as follows: propriate axis in each trajectory leg. Furthermore, we initialize the concatenated slack SDF parameter vector S(p) = diag T 2 I , M 2 I , (163a) ( max− 3 max− 3) (151) by evaluating (141) along the position trajectory `(x, p) = 0, (163b) guess for each room and discrete-time grid node. g(x, p) = 0. (163c) The attitude trajectory guess is only slightly more in- volved, and it is a general procedure that we can recom- The dynamics (153) are also cast into the control affine mend for attitude trajectories. According to (132c), the form (69): free-flyer has to rotate between the attitudes encoded by q0 and qf . Since quaternions are not additive and must v maintain a unit norm to represent rotation, straight-line I 0 interpolation from q0 to qf is not an option. Instead, we f0(x, p) =  1  , (164a) 2 qB←I ω use spherical linear interpolation (SLERP) [188, 231]. 1 ⊗ B  J − ω×Jω  This operation performs a continuous rotation from q0 − B B     58 Preprint

to qf at a constant angular velocity around a fixed axis. the initial and final quaternion vectors are expressed in To define the operation, we introduce the exponential degrees using the angle-axis representation of (165a). and logarithmic maps for unit quaternions: ECOS is used as the numerical convex optimizer, and the full implementation is available in the code reposi- 3 Exp(q) αu, where α R, u R , (165a) tory linked in Figure2. , ∈ ∈ u sin(α/2) Log αu . (165b) The convergence processes for SCvx and GuSTO are , cos(α/2) shown in Figure 30. We have again set ε = ε = 0 so   r  that we can observe the convergence process for exactly The exponential map converts a unit quaternion to 15 iterations. At each iteration, the algorithms solve its equivalent angle-axis representation. The logarithmic a convex subproblem whose size is documented in Ta- map converts an angle-axis rotation back to a quater- ble5. Note that the subproblems of both algorithms nion, which we write here in the vectorized form used are substantially larger than for the quadrotor example, to implement qB←I in (131c). SLERP for the attitude and represent a formidable increase in dimensionality for quaternion qB←I can then be defined by leveraging (165): the numerical problem. However, modern IPMs easily handle problems of this size and we will see that the in- creased variable and constraint count is of little concern. qe = q∗ qf , (166a) 0 ⊗ We further note that the larger number of variables and qB←I (t) = q0 Exp t Log(qe) , (166b) affine inequalities for GuSTO is due to how our imple- ⊗ mentation uses extra slack variables to encode the soft  where qe is the error quaternion between qf and q0, penalty function (70). Because GuSTO does not use a and t [0, 1] is an interpolation parameter such that ∈ dynamics virtual control, it has no one-norm cones, while qB←I (0) = q0 and qB←I (1) = qf . The angular velocity SCvx has several such cones to model the virtual con- trajectory guess is simple to derive, since SLERP per- trol penalty (48). Due to its larger subproblem size and forms a constant velocity rotation around a fixed axis. slightly more complicated code for including constraints Using (166a): as soft penalties, the “solve” and “formulate” times per

1 subproblem are slightly larger for GuSTO in this exam- ω (t) = αt− Log(qe). (167) B ple. Nevertheless, both algorithms have roughly equal runtimes, and GuSTO has the advantage of converging The free-flyer is a very low thrust vehicle to begin with. to a given tolerance in slightly fewer iterations. By using (149a), we are in some sense searching for the lowest of low thrust trajectories. Hence, we expect the The converged trajectories are plotted in Figures 31 control inputs T and M to be small. Without any and 32. The left subplots in Figures 31a and 31b show further insight, itI is hardB to guess what the thrust and a remarkably similar evolution of the initial guess into torque would look like for a 6-DoF vehicle in a micro- the converged trajectory. The final trajectories are vi- gravity environment. Hence, we simply set the initial sually identical, and both algorithms discover that the control guess to zero. maximum allowed flight time of tf,max is control energy- optimal, as expected. Numerical Results Lastly, Figure 33 plots the evolution of the nonconvex We now have a specialized instance of Problem 39 and flight space and obstacle avoidance inequalities (147) and an initialization strategy for the 6-DoF free-flyer prob- (113). Our first observation is that the constraints hold lem. The trajectory solution is generated using SCvx at the discrete-time nodes, and that the free-flyer ap- and GuSTO with temporal discretization performed us- proaches the ellipsoidal obstacles quite closely. This is ing the FOH interpolating polynomial method in “Dis- similar to how the quadrotor brushes against the obsta- cretizing Continuous-time Optimal Control Problems”. cles in Figure 23, and is a common feature of time- or The algorithm parameters are provided in Table4, where energy-optimal trajectories in a cluttered environment. Our second observation concerns the sawtooth-like non- smooth nature of the SDF time history. Around t = 25 s TABLE 5 Breakdown of subproblem size for the 6- and t = 125 s, the approximate SDF comes close to zero DoF free-flyer example. even though the position trajectory in Figure 31 is not near a wall at those times. This is a consequence of our SCvx GuSTO modeling, since the SDF is near-zero at the room in- Variables 2267 3352 Affine equalities 663 663 terfaces (see Figures 27 and 28), even though these are Affine inequalities 350 1650 not physical “walls”. However, around t = 100 s, the One-norm cones 49 0 flight space constraint (147) is actually activated as the Infinity-norm cones 401 351 free-flyer rounds a corner. Roughly speaking, this is in- Second-order cones 200 200 tuitively the optimal thing to do. Like a Formula One driver rounding a corner by following the racing line, the 59 Preprint 2 2 k k 2 2 ∗ ∗ k k X X ∗ ∗ − − X X i i k k 3 X 2 X

k 10− k 10−

6 10− 4 10−

9 10−

6 10− 12 10−

Distance from solution, 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Distance from solution, 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Iteration number Iteration number

Solve 15 Solve Discretize 1.0 Discretize 15 0.8 Formulate Formulate 0.8 10 0.6 10 0.6 0.4 0.4 5 5

0.2 Cumulative time [s] Cumulative time [s]

Time per iteration [s] Time per iteration [s] 0.2

0.0 0.0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 1 2 3 4 5 6 7 8 9 10 11 12 13 14 Iteration number Iteration number (a) SCvx. (b) GuSTO.

FIGURE 30 Convergence and runtime performance for the 6-DoF free-flyer problem. Both algorithms take a similar amount of time to converge. The runtime subplots in the bottom row show statistics on algorithm performance over 50 executions. GuSTO converges slightly faster for this 14 example and, although both algorithms reach numerical precision for practical purposes, GuSTO converges all the way down to a 10− tolerance. The plots are generated according to the same description as for Figure 24. free-flyer spends less control effort by following the short- off-the-shelf codes [216, 232]. This makes the injection est path. An unfortunate consequence is that this results of convex optimization into an autonomous system eas- in minor clipping of the continuous-time flight space con- ier than ever before, provided that the right high-level straint. The issue can be mitigated by the same strate- algorithms exist to leverage it. gies as proposed in the last section for the quadrotor To leverage convex optimization for the difficult task of example [191, 192]. nonconvex trajectory generation, this article provides an expansive tutorial of three algorithms. First, the loss- CONCLUSIONS less convexification (LCvx) algorithm is introduced to Modern vehicle engineering is moving in the direction of remove acute nonconvexities in the control input con- increased autonomy. This includes aerospace, automo- straints. This provides an optimal -backed tive, and marine transport, as well as robots on land, in way to transform certain families of nonconvex trajectory the air, and in our homes. No matter the application, a generation tasks into ones that can be solved in one shot common feature across autonomous systems is the basic by a convex optimizer. A variable-mass rocket landing requirement to generate trajectories. In a general sense, example at the end of the article illustrates a real-world trajectories serve like plans to be executed in order for application of the LCvx method. the system to complete its task. Due to the large scale of Not stopping there, the article then motivates an en- deployment and/or the safety-critical nature of the sys- tire family of optimization methods called sequential con- tem, reliable real-time onboard trajectory generation has vex programming (SCP). These methods use a linearize- never been more important. solve loop whereby a convex optimizer is called several This article takes the stance that convex optimization times until a locally optimal trajectory is obtained. SCP is a prime contender for the job, thanks to 40 years of op- strikes a compelling middleground between “what is pos- timization research having produced a remarkable suite sible” and “what is acceptable” for safety-critical real- of numerical methods for quickly and reliably solving time trajectory generation. In particular, SCP inher- convex problems [26, 29, 34]. Many of these methods its much from trust region methods in numerical opti- are now packaged as either commercial or open-source mization, and its performance is amenable to theoretical 60 Preprint

Initial r r Initial r r I I I I 0.06 Converged r T (scaled) 0.06 Converged r T (scaled) I I I I 6 dss(r ) = 0 6 dss(r ) = 0 6 dss(r ) = 0 6 dss(r ) = 0 I I I I i i O Oi O Oi 0.05 Obstacle Obstacle 0.05 Obstacle Obstacle 4 4 4 4 [m/s] [m/s] [m] [m] 0.04 [m] [m] 0.04 2 2 ,2 ,2 ,2 ,2 k k I I I I I I r r r r v v 2 2 0.03 k 2 2 0.03 k

Position Position Position Position 0.02 0 0 0.02 0 0 Velocity Velocity

0.01 0.01 2 2 2 2 − − − − 0.00 6 8 10 12 6 8 10 12 6 8 10 12 6 8 10 12 Position r ,1 [m] Position r ,1 [m] Position r ,1 [m] Position r ,1 [m] I I I I (a) SCvx. (b) GuSTO.

FIGURE 31 The position trajectory evolution (left) and the final converged trajectory (right) for the 6-DoF free-flyer problem. In the right plots for each algorithm, the continuous-time trajectory is obtained by numerically integrating the dynamics (131). The fact that this trajectory passes through the discrete-time subproblem solution confirms dynamic feasibility.

20 T 2 M 2 Obstacle j = 1 k I k 100 k I k T ,1 M ,1 20 Obstacle j = 2 I I 0.6 T ,2 M ,2 2 Obstacle j = 3 I m] I k ) · 80

T ,3 M ,3 ) I j

I N I 10 r 15 c ( [mN] µ

[ 0.4 ss I

60 − ˜ d I T I r

M 10 ( 0 40 0.2 j SDF H

k 5 Thrust 20 Torque 0.0 10 − 0 0 0 50 100 150 200 0 50 100 150 200 0 50 100 150 200 0 50 100 150 200 Time [s] Time [s] Time [s] Time [s]

ϕ ω 2 1.0 k Bk 20 θ ω ,1 B FIGURE 33 Signed distance function and obstacle avoidance time his-

ψ /s] ω ,2 ] ◦ B

[ tories for the converged trajectory of the 6-DoF free-flyer problem. Like ◦ 0.8 ω ,3 10 B B

as ZYX for Figure 32, these are visually identical for SCvx and GuSTO, so we ω 0 0.6 only show a single plot. Note the highly nonlinear nature of the SDF, B←I

q whose time history exhibits sharp corners as the robot traverses the 10 0.4 − feasible flight space. Although the SDF constraint (159) is satisfied at Euler angles [ the discrete-time nodes, minor inter-sample constraint clipping occurs 20 0.2 Angular rate Attitude − around 100 seconds as the robot rounds a turn in the middle of its 30 0.0 trajectory (see Figure 31). − 0 50 100 150 200 0 50 100 150 200 Time [s] Time [s] The theory behind LCvx, SCvx, and GuSTO is rel- FIGURE 32 State and control time histories for the converged trajec- atively new and under active research, with the oldest tory of the 6-DoF free-flyer problem. These are visually identical for method in this article (i.e., classical LCvx [84]) being SCvx and GuSTO, so we only show a single plot. Euler angles using just 15 years old. We firmly believe that neither one the intrinsic Tait-Bryan convention are shown in place of the quater- nion attitude. Like in Figure 25, the dots represent the discrete-time of the methods has attained the limits of its capabilities, solution while the continuous lines are obtained by propagating the and this presents the reader with an exciting opportunity solution through the actual continuous-time dynamics (39b). to contribute to the effort. It is clear to us that convex optimization has a role to play in the present and future of advanced trajectory generation. With the help of this analysis using standard tools of analysis, constrained op- article and the associated source code, we hope that the timization, and optimal control theory [11, 29, 47]. This reader now has the knowledge and tools to join in the articles provides a detailed overview of two specific and adventure. closely related SCP algorithms called SCvx and GuSTO [49, 50]. To corroborate their effectiveness for difficult ACKNOWLEDGEMENTS trajectory generation tasks, two numerical examples are This work was supported in part by the National Sci- presented based on a quadrotor and a space-station free- ence Foundation, Cyber-Physical Systems (CPS) pro- flyer maintenance robot. gram (award 1931815), by the Office of Naval Research, 61 Preprint

ONR YIP Program (contract N00014-17-1-2433), and from Sorbonne Universit´e, France in by the King Abdulaziz City for Science and Technol- 2018 in collaboration with ONERA – The French ogy (KACST). The authors would like to extend their Aerospace Lab. He is recipient of the ONERA DTIS gratitude to Yuanqi Mao for his invaluable inputs on Best Ph.D. Student Award 2018. He is now a post- sequential convex programming algorithms, to Abhinav doctoral researcher at the Department of Aeronautics Kamath for his meticulous review of every detail, and to and Astronautics at Stanford University. His main Jonathan P. How for the initial encouragement to write research interests concern theoretical and numerical this article. robust optimal control with applications in aerospace systems and robotics. AUTHOR INFORMATION Marco Pavone is an Associate Professor of Aeronau- Danylo Malyuta received the B.Sc. degree in Me- tics and Astronautics at Stanford University, where he chanical Engineering from EPFL and the M.Sc. degree is the Director of the Autonomous Systems Laboratory. in Robotics, Systems and Control from ETH Z¨urich. Before joining Stanford, he was a Research Technologist He is currently a Ph.D. candidate in the Autonomous within the Robotics Section at the NASA Jet Propulsion Controls Lab at the Department of Aeronautics and Laboratory. He received the Ph.D. degree in Aeronau- Astronautics of the University of Washington. His tics and Astronautics from the Massachusetts Institute research is primarily focused on computationally effi- of Technology in 2010. His main research interests are cient optimization-based control of dynamical systems. in the development of methodologies for the analysis, Danylo has held internship positions at the NASA Jet design, and control of autonomous systems, with an em- Propulsion Laboratory, NASA Johnson Space Center, phasis on self-driving cars, autonomous aerospace vehi- and Amazon Prime Air. cles, and future mobility systems. He is a recipient of Taylor P. Reynolds received the B.S. degree in a number of awards, including a Presidential Early Ca- Mathematics & Engineering from Queen’s University in reer Award for Scientists and Engineers, an ONR YIP 2016. He received the Ph.D. degree from the Depart- Award, an NSF CAREER Award, and a NASA Early ment of Aeronautics & Astronautics at the University Career Faculty Award. He was identified by the Ameri- of Washington in 2020 under the supervision of Mehran can Society for Engineering Education (ASEE) as one of Mesbahi. During his Ph.D., Taylor worked with NASA America’s 20 most highly promising investigators under JSC and Draper Laboratories to develop advanced the age of 40. He is currently serving as an Associate guidance algorithms for planetary landing on the Editor for the IEEE Control Systems Magazine. Beh¸cetA¸cıkme¸se SPLICE project, and also co-founded the Aeronautics is a Professor at the University & Astronautics CubeSat Team at the University of of Washington, Seattle. He received the Ph.D. degree Washington. He now works as a Research Scientist at in Aerospace Engineering from Purdue University. He Amazon Prime Air. was a senior technologist at the NASA Jet Propulsion Laboratory (JPL) and a lecturer at the California In- Michael Szmuk received the B.S. and M.S. degrees stitute of Technology. At JPL, he developed control in Aerospace Engineering from the University of Texas algorithms for planetary landing, spacecraft formation at Austin. In 2019, he received the Ph.D. degree while flying, and asteroid and comet sample return missions. working in the Autonomous Controls Lab at the De- He developed the “flyaway” control algorithms used suc- partment of Aeronautics and Astronautics of the Uni- cessfully in NASA’s (MSL) and versity of Washington, under the supervision of Beh¸cet Mars 2020 missions during the landings of and A¸cıkme¸se.During his academic career, he completed in- Perseverance rovers on Mars. He is a recipient of the ternships at NASA, AFRL, Emergent Space, Blue Ori- NSF CAREER Award, the IEEE Award for Technical gin, and Amazon Prime Air. He now works as a Research Excellence in Aerospace Control, and numerous NASA Scientist at Amazon Prime Air, specializing in the design Achievement awards for his contributions to NASA mis- of flight control algorithms for autonomous air delivery sions and technology development. His research interests vehicles. include optimization-based control, nonlinear and robust Thomas Lew is a Ph.D. candidate in Aeronautics control, and stochastic control. and Astronautics at Stanford University. He received the ´ B.Sc. degree in Microengineering from Ecole Polytech- REFERENCES nique F´ed´eralede Lausanne in 2017 and the M.Sc. de- [1] Raffaello D’Andrea. “Guest Editorial Can Drones Deliver?” In: gree in Robotics from ETH Z¨urich in 2019. His research IEEE Transactions on Automation Science and Engineering 11.3 focuses on the intersection between optimal control and (July 2014), pp. 647–648. doi: 10.1109/tase.2014.2326952. [2] A. Miguel San Martin and Edward C. Lee Steven W. Wong. machine learning techniques for robotics and aerospace “The development of the MSL guidance, navigation, and control applications. system for entry, descent, and landing”. In: 23rd Space Flight Me- Riccardo Bonalli obtained the M.Sc. degree in chanics Meeting. 2013. Mathematical Engineering from Politecnico di Mi- [3] Adam D. Steltzner et al. “Mars Science Laboratory Entry, De- scent, and Landing System Development Challenges”. In: Journal lano, Italy in 2014 and the Ph.D. degree in applied 62 Preprint

of Spacecraft and Rockets 51.4 (July 2014), pp. 994–1003. doi: [26] Stephen Boyd and Lieven Vandenberghe. Convex Optimiza- 10.2514/1.a32866. tion. Cambridge, UK: Cambridge University Press, 2004. isbn: [4] David W. Way et al. “Mars Science Laboratory: Entry, Descent, 9780521833783. and Landing System Performance”. In: 2007 IEEE Aerospace Con- [27] John T. Betts. “Survey of numerical methods for trajectory ference. IEEE, 2007. doi: 10.1109/aero.2007.352821. optimization”. In: Journal of Guidance, Control, and Dynamics [5] Gunter Stein. “Respect the unstable”. In: IEEE Control Sys- 21.2 (1998), pp. 193–207. doi: 10.2514/2.4231. tems 23.4 (Aug. 2003), pp. 12–25. doi: 10.1109/mcs.2003.1213600. [28] Matthew Kelly. “An Introduction to Trajectory Optimization: [6] Danylo Malyuta et al. “Advances in Trajectory Optimization How to Do Your Own Direct Collocation”. In: SIAM Review 59.4 for Space Vehicle Control”. In: Annual Reviews in Control (2021). (Jan. 2017), pp. 849–904. doi: 10.1137/16m1062569. under review. [29] J. Nocedal and S. Wright. Numerical Optimization. Springer [7] David A. Mindell. Digital Apollo. MIT Press, 2008. isbn: New York, 1999. isbn: 9780387303031. doi: 10.1007/978-0-387- 9780262134972. 40065-5. [8] John M. Carson III et al. “The SPLICE Project: Continuing [30] R. Tyrrell Rockafellar. “Lagrange Multipliers and Optimal- NASA Development of GN&C Technologies for Safe and Precise ity”. In: SIAM Review 35.2 (June 1993), pp. 183–238. doi: 10. Landing”. In: AIAA Scitech 2019 Forum. American Institute of 1137/1035044. Aeronautics and Astronautics, Jan. 2019. doi: 10.2514/6.2019- [31] Stephen J. Wright. Primal-Dual Interior-Point Methods. So- 0660. ciety for Industrial and , Jan. 1997. doi: 10. [9] Daniel Dueri et al. “Customized Real-Time Interior-Point 1137/1.9781611971453. Methods for Onboard Powered-Descent Guidance”. In: Journal of [32] Philip E. Gill, Walter Murray, and Margaret H. Wright. Practi- Guidance, Control, and Dynamics 40.2 (Feb. 2017), pp. 197–212. cal Optimization. Society for Industrial and Applied Mathematics, doi: 10.2514/1.g001480. Jan. 2019. doi: 10.1137/1.9781611975604. [10] NASA. The Mars 2020 Rover’s “Brains”. https : / / mars . [33] C. Roos, T. Terlaky, and J.-Ph. Vial. Theory and Algorithms .gov/mars2020/spacecraft/rover/brains/. 2020. for Linear Optimization: An Interior Point Approach. Wiley, Oct. [11] Lev S. Pontryagin et al. The Mathematical Theory of Optimal 2000. isbn: 9780470865934. Processes. Montreux: Gordon and Breach Science Publishers, 1986. [34] Margaret H. Wright. “The interior-point revolution in opti- doi: 10.1201/9780203749319. mization: history, recent developments, and lasting consequences”. [12] Leonard D. Berkovitz. Optimal Control Theory. Springer New In: BULL. AMER. MATH. SOC. (N.S) 42 (2005), pp. 39–56. York, 1974. doi: 10.1007/978-1-4757-6097-2. [35] Jiming Peng, Cornelis Roos, and Tam´asTerlaky. “Primal- [13] John T. Betts. Practical Methods for Optimal Control and Dual Interior-Point Methods for Second-Order Conic Op- Estimation Using Nonlinear Programming. SIAM, 2010. isbn: timization Based on Self-Regular Proximities”. In: SIAM 9780898716887. doi: 10.1137/1.9780898718577. Journal on Optimization 13.1 (Jan. 2002), pp. 179–203. doi: [14] D.F. Lawden. Optimal Trajectories for Space Navigation. Lon- 10.1137/s1052623401383236. don: Butterworths, 1963. [36] Lars Blackmore. “Autonomous Precision Landing of Space [15] J-P. Marec. Optimal Space Trajectories. Amsterdam: Elsevier Rockets”. In: The Bridge 4.46 (Dec. 2016), pp. 15–20. Scientific Publishing Company, 1979. isbn: 0444418121. doi: 10. [37] Daniel P. Scharf et al. “Implementation and Experimental 1002/oca.4660040210. Demonstration of Onboard Powered-Descent Guidance”. In: Jour- [16] Arthur Earl Bryson Jr. and Yu-Chi Ho. Applied Optimal Con- nal of Guidance, Control, and Dynamics 40.2 (Feb. 2017), pp. 213– trol. Washington, D.C.: Hemisphere Publishing Corporation, 1975. 229. doi: 10.2514/1.g000399. isbn: 0470114819. [38] JPL and Masten Space Systems. 500 meter divert Xombie [17] Donald E. Kirk. Optimal Control Theory: An Introduction. test flight for G-FOLD, Guidance for Fuel Optimal Large Divert, Englewood Cliffs, NJ: Prentice-Hall, 1970. isbn: 9780486434841. validation. http://www.youtube.com/watch?v=1GRwimo1AwY. Aug. [18] James M Longuski, Jos´eJ. Guzm´an,and John E. Prussing. 2012. Optimal Control with Aerospace Applications. Springer New York, [39] JPL and Masten Space Systems. 750 meter divert Xombie 2014. doi: 10.1007/978-1-4614-8945-0. test flight for G-FOLD, Guidance for Fuel Optimal Large Divert, [19] J. Meditch. “On the problem of optimal thrust programming validation. http://www.youtube.com/watch?v=jl6pw2oossU. July for a lunar soft landing”. In: IEEE Transactions on Automatic 2012. Control 9.4 (Oct. 1964), pp. 477–484. doi: 10.1109/tac.1964. [40] Brian Paden et al. “A Survey of Motion Planning and Con- 1105758. trol Techniques for Self-Driving Urban Vehicles”. In: IEEE Trans- [20] and Jay H Lee. “Model predictive control: actions on Intelligent Vehicles 1.1 (Mar. 2016), pp. 33–55. doi: past, present and future”. In: Computers & Chemical Engineering 10.1109/tiv.2016.2578706. 23.4-5 (1999), pp. 667–682. [41] Martin Buehler, Karl Iagnemma, and Sanjiv Singh, eds. The [21] John W Eaton and James B Rawlings. “Model-predictive con- DARPA Urban Challenge: Autonomous Vehicles in City Traffic. trol of chemical processes”. In: Chemical Engineering Science 47.4 Springer Berlin Heidelberg, 2009. doi: 10.1007/978-3-642-03991- (1992), pp. 705–720. 1. [22] Peter J Campo and Manfred Morari. “Robust model predic- [42] Russell Buchanan et al. “Perceptive whole-body planning tive control”. In: 1987 American control conference. IEEE. 1987, for multilegged robots in confined spaces”. In: Journal of Field pp. 1021–1026. Robotics 38.1 (June 2020), pp. 68–84. doi: 10.1002/rob.21974. [23] Frauke Oldewurtel et al. “Use of model predictive control and [43] Jean-Pierre Sleiman et al. “A Unified MPC Framework for weather forecasts for energy efficient building climate control”. In: Whole-Body Dynamic Locomotion and Manipulation”. In: IEEE Energy and Buildings 45 (2012), pp. 15–27. Robotics and Automation Letters 6.3 (July 2021), pp. 4688–4695. doi: 10.1109/lra.2021.3068908. [24] Yudong Ma et al. “Model predictive control for the operation of building cooling systems”. In: IEEE Transactions on control [44] Elia Kaufmann et al. “Deep Drone Racing: Learning Agile systems technology 20.3 (2011), pp. 796–803. Flight in Dynamic Environments”. In: Proceedings of The 2nd Conference on Robot Learning. Ed. by Aude Billard et al. Vol. 87. [25] Sorin C Bengea et al. “Implementation of model predictive Proceedings of Machine Learning Research. PMLR, Oct. 2018, control for an HVAC system in a mid-size commercial building”. pp. 133–145. In: HVAC&R Research 20.1 (2014), pp. 121–135. [45] Beh¸cet A¸cıkme¸seand Scott R. Ploen. “Convex Programming Approach to Powered Descent Guidance for Mars Landing”. In:

63 Preprint

Journal of Guidance, Control, and Dynamics 30.5 (Sept. 2007), [62] Taylor P. Reynolds et al. “Dual Quaternion-Based Powered pp. 1353–1366. doi: 10.2514/1.27553. Descent Guidance with State-Triggered Constraints”. In: Journal [46] Lars Blackmore, Beh¸cetA¸cıkme¸se,and John M. Carson III. of Guidance, Control, and Dynamics 43.9 (Sept. 2020), pp. 1584– “Lossless convexification of control constraints for a class of nonlin- 1599. doi: 10.2514/1.g004536. ear optimal control problems”. In: Systems & Control Letters 61.8 [63] Taylor P. Reynolds et al. “A Real-Time Algorithm for Non- (Aug. 2012), pp. 863–870. doi: 10.1016/j.sysconle.2012.04.010. Convex Powered Descent Guidance”. In: AIAA SciTech Forum. [47] Andrew R. Conn, Nicholas I. M. Gould, and Philippe L. Toint. AIAA, 2020. doi: 10.2514/6.2020-0844. Trust Region Methods. SIAM, Philadelphia, PA, 2000. doi: 10 . [64] Yuanqi Mao et al. “Successive Convexification of Non- 1137/1.9780898719857. Convex Optimal Control Problems with State Constraints”. [48] Mykel J. Kochenderfer and Tim A. Wheeler. Algorithms for In: IFAC-PapersOnLine 50.1 (2017), pp. 4063–4069. doi: Optimization. Cambridge, Massachusetts: The MIT Press, 2019. 10.1016/j.ifacol.2017.08.789. isbn: 9780262039420. [65] Michael Szmuk et al. “Real-Time Quad-Rotor Path Planning [49] Yuanqi Mao, Michael Szmuk, and Behcet Acikmese. “Succes- Using Convex Optimization and Compound State-Triggered Con- sive Convexification: A Superlinearly Convergent Algorithm for straints”. In: 2019 IEEE/RSJ International Conference on Intelli- Non-convex Optimal Control Problems”. In: arXiv e-prints (2018). gent Robots and Systems (IROS). IEEE, Nov. 2019. doi: 10.1109/ arXiv: 1804.06539. iros40897.2019.8967706. [50] R. Bonalli et al. “GuSTO: Guaranteed Sequential Trajectory [66] Danylo Malyuta et al. “Fast Trajectory Optimization via Optimization via Sequential Convex Programming”. In: IEEE Successive Convexification for Spacecraft Rendezvous with International Conference on Robotics and Automation. 2019, Integer Constraints”. In: AIAA Scitech 2020 Forum. American pp. 6741–6747. doi: 10.1109/ICRA.2019.8794205. Institute of Aeronautics and Astronautics, Jan. 2020. doi: [51] Matthew W. Harris and Beh¸cetA¸cıkme¸se.“Maximum Divert 10.2514/6.2020-0616. for Planetary Landing Using Convex Optimization”. In: Journal of [67] Taylor Reynolds et al. “SOC-i: A CubeSat Demonstration of Optimization Theory and Applications 162.3 (Dec. 2013), pp. 975– Optimization-Based Real-Time Constrained Attitude Control”. In: 995. doi: 10.1007/s10957-013-0501-7. 2021 IEEE Aerospace Conference. IEEE, Mar. 2021. [52] M. Szmuk et al. “Convexification and real-time on-board [68] NASA. NASA Tipping Point Partnership with Blue Ori- optimization for agile quad-rotor maneuvering and obstacle avoid- gin to Test Precision Lunar Landing Technologies. https : ance”. In: IEEE/RSJ International Conference on Intelligent //www.nasa.gov/directorates/spacetech/NASA_Tipping_Point_ Robots and Systems. Vancouver, Canada, 2017, pp. 4862–4868. Partnership_to_Test_Precision_Lunar_Landing_Tech/. Sept. doi: 10.1109/IROS.2017.8206363. 2020. [53] Xinfu Liu, Zuojun Shen, and Ping Lu. “Entry Trajectory Op- [69] R. Bonalli et al. “Trajectory Optimization on Manifolds: A timization by Second-Order Cone Programming”. In: Journal of Theoretically-Guaranteed Embedded Sequential Convex Program- Guidance, Control, and Dynamics 39.2 (Feb. 2016), pp. 227–241. ming Approach”. In: Robotics: Science and Systems XV. Robotics: doi: 10.2514/1.g001210. Science and Systems Foundation, 2019. doi: 10.15607/rss.2019. [54] Zhenbo Wang and Michael J. Grant. “Constrained Trajectory xv.078. Optimization for Planetary Entry via Sequential Convex Program- [70] S. Banerjee et al. “Learning-based Warm-Starting for Fast Se- ming”. In: Journal of Guidance, Control, and Dynamics 40.10 quential Convex Programming and Trajectory Optimization”. In: (Oct. 2017), pp. 2603–2615. doi: 10.2514/1.g002150. IEEE Aerospace Conference. Big Sky, Montana, Mar. 2020. [55] Xinfu Liu. “Fuel-Optimal Rocket Landing with Aerodynamic [71] Thomas Lew, Riccardo Bonalli, and Marco Pavone. “Chance- Controls”. In: Journal of Guidance, Control, and Dynamics 42.1 Constrained Sequential Convex Programming for Robust (Jan. 2019), pp. 65–77. doi: 10.2514/1.g003537. Trajectory Optimization”. In: 2020 European Control Conference [56] Matthew W. Harris and Beh¸cet A¸cıkme¸se.“Minimum Time (ECC). IEEE, May 2020. doi: 10.23919/ecc51009.2020.9143595. Rendezvous of Multiple Spacecraft Using Differential Drag”. In: [72] R. Bonalli, T. Lew, and M. Pavone. “Analysis of Theoreti- Journal of Guidance, Control, and Dynamics 37.2 (Mar. 2014), cal and Numerical Properties of Sequential Convex Programming pp. 365–373. doi: 10.2514/1.61505. for Continuous-Time Optimal Control”. In: IEEE Transactions on [57] Danylo Malyuta and Beh¸cetA¸cıkme¸se.“Lossless convexifica- Automatic Control (2021). Submitted. tion of non-convex optimal control problems with disjoint semi- [73] R. Bonalli, T. Lew, and M. Pavone. “Sequential Convex Pro- continuous inputs”. In: Preprint (Nov. 2019), arXiv:1902.02726. gramming For Non-Linear Stochastic Optimal Control”. In: (2021). [58] Michael Szmuk, Behcet Acikmese, and Andrew W. Berning. Submitted. “Successive Convexification for Fuel-Optimal Powered Landing [74] Yuanqi Mao, Michael Szmuk, and Beh¸cetA¸cıkme¸se.“A Tuto- with Aerodynamic Drag and Non-Convex Constraints”. In: rial on Real-time Convex Optimization Based Guidance and Con- AIAA Guidance, Navigation, and Control Conference. American trol for Aerospace Applications”. In: 2018 Annual American Con- Institute of Aeronautics and Astronautics, Jan. 2016. doi: trol Conference (ACC). IEEE, June 2018. doi: 10 . 23919 / acc . 10.2514/6.2016-0378. 2018.8430984. [59] Michael Szmuk, Taylor P Reynolds, and Beh¸cet A¸cıkme¸se. [75] Yuanqi Mao et al. “Convexification and Real-Time Optimiza- “Successive Convexification for Real-Time 6-DoF Powered Descent tion for MPC with Aerospace Applications”. In: Handbook of Model Guidance with State-Triggered Constraints”. In: Journal of Guid- Predictive Control. Springer International Publishing, Sept. 2018, ance, Control, and Dynamics 48.8 (2020), pp. 1399–1413. doi: pp. 335–358. doi: 10.1007/978-3-319-77489-3_15. 10.2514/1.G004549. [76] David Q. Mayne. “Model predictive control: Recent develop- [60] Michael Szmuk and Beh¸cet A¸cıkme¸se.“Successive Convexifi- ments and future promise”. In: Automatica 50.12 (Dec. 2014), cation for 6-DoF Mars Rocket Powered Landing with Free-Final- pp. 2967–2986. doi: 10.1016/j.automatica.2014.10.128. Time”. In: 2018 AIAA Guidance, Navigation, and Control Con- [77] Carlos E. Garcia, David M. Prett, and Manfred Morari. ference. American Institute of Aeronautics and Astronautics, Jan. “Model predictive control: Theory and practice—A survey”. In: 2018. doi: 10.2514/6.2018-0617. Automatica 25.3 (May 1989), pp. 335–348. doi: 10.1016/0005- [61] Michael Szmuk et al. “Successive Convexification for 6-DoF 1098(89)90002-2. Powered Descent Guidance with Compound State-Triggered Con- [78] Utku Eren et al. “Model Predictive Control in Aerospace Sys- straints”. In: AIAA Scitech 2019 Forum. American Institute of tems: Current State and Opportunities”. In: Journal of Guidance, Aeronautics and Astronautics, Jan. 2019. doi: 10.2514/6.2019- Control, and Dynamics 40.7 (July 2017), pp. 1541–1566. doi: 10. 0926. 2514/1.g002507.

64 Preprint

[79] James B. Rawlings, David Q. Mayne, and Moritz M. [98] L. D. Landau and E. M. Lifshitz. Mechanics. 2nd ed. Bristol, Diehl. Model Predictive Control: Theory, Computation and UK: Pergamon Press, 1969. Design. 2nd ed. Madison, WI: Nob Hill Publishing, 2017. isbn: [99] Warren F. Phillips. Mechanics of Flight. Hoboken, NJ: Wiley, 9780975937730. 2010. isbn: 9780470539750. [80] Jeff Bezanson et al. “Julia: A fresh approach to numerical com- [100] Anton H. J. de Ruiter, Christopher J. Damaren, and James R. puting”. In: SIAM review 59.1 (2017), pp. 65–98. Forbes. Spacecraft Dynamics and Control: An Introduction. Wiley, [81] Ralph T. Rockafellar. Convex Analysis. Princeton, NJ: Prince- 2013. isbn: 9781118342367. ton University Press, 1970. isbn: 9780691015866. doi: 10.2307/j. [101] H. L. Trentelman, A. A. Stoorvogel, and M. Hautus. Control ctvcm4g0s. Theory for Linear Systems. Springer, 2001. [82] N. Karmarkar. “A new polynomial-time algorithm for linear [102] Matthew W. Harris and Beh¸cetA¸cıkme¸se.“Lossless convexi- programming”. In: Combinatorica 4.4 (Dec. 1984), pp. 373–395. fication for a class of optimal control problems with quadratic state doi: 10.1007/bf02579150. constraints”. In: 2013 American Control Conference. IEEE, June [83] Jiming Peng, Cornelis Roos, and Tam´as Terlaky. Self- 2013. doi: 10.1109/acc.2013.6580359. Regularity: A New Paradigm for Primal-Dual Interior-Point Al- [103] Sigurd Skogestad and Ian Postlethwaite. Multivariable Feed- gorithms. Princeton University Press, 2002. isbn: 9780691091938. back Control: Analysis and Design. 2nd ed. Wiley, 2005. isbn: [84] Beh¸cetA¸cıkme¸seand Scott Ploen. “A Powered Descent Guid- 9780470011676. ance Algorithm for Mars Pinpoint Landing”. In: AIAA Guidance, [104] Richard F. Hartl, Suresh P. Sethi, and Raymond G. Vickson. Navigation, and Control Conference and Exhibit. American Insti- “A Survey of the Maximum Principles for Optimal Control Prob- tute of Aeronautics and Astronautics, Aug. 2005. doi: 10.2514/6. lems with State Constraints”. In: SIAM Review 37.2 (June 1995), 2005-6288. pp. 181–218. doi: 10.1137/1037043. [85] Lars Blackmore, Beh¸cet A¸cikme¸se, and Daniel P. Scharf. [105] A. Milyutin and N. Osmolovskii. and “Minimum-Landing-Error Powered-Descent Guidance for Mars Optimal Control. American Mathematical Society, 1998. Landing Using Convex Optimization”. In: Journal of Guidance, [106] Tobias Achterberg and Roland Wunderling. “Mixed Integer Control, and Dynamics 33.4 (July 2010), pp. 1161–1171. doi: Programming: Analyzing 12 Years of Progress”. In: Facets of Com- 10.2514/1.47202. binatorial Optimization. Springer Berlin Heidelberg, 2013, pp. 449– [86] Beh¸cetA¸cıkme¸seand Lars Blackmore. “Lossless convexifica- 481. doi: 10.1007/978-3-642-38189-8_18. tion of a class of optimal control problems with non-convex control [107] Tobias Achterberg. “Constrained Integer Programming”. constraints”. In: Automatica 47.2 (Feb. 2011), pp. 341–347. doi: PhD thesis. Berlin: Technische Universit¨atBerlin, 2007. 10.1016/j.automatica.2010.10.037. [108] Tom Schouwenaars et al. “Mixed integer programming for [87] John M. Carson III, Beh¸cetA¸cıkme¸se,and Lars Blackmore. multi-vehicle path planning”. In: 2001 European Control Confer- “Lossless convexification of Powered-Descent Guidance with non- ence (ECC). IEEE, Sept. 2001. doi: 10.23919/ecc.2001.7076321. convex thrust bound and pointing constraints”. In: Proceedings of [109] Zhe Zhang, Jun Wang, and Jianxun Li. “Lossless convexifi- the 2011 American Control Conference. IEEE, June 2011. doi: cation of nonconvex MINLP on the UAV path-planning problem”. 10.1109/acc.2011.5990959. In: Optimal Control Applications and Methods 39.2 (Dec. 2017), [88] Matthew W. Harris and Beh¸cetA¸cıkme¸se.“Lossless convex- pp. 845–859. doi: 10.1002/oca.2380. ification for a class of optimal control problems with linear state [110] Xinfu Liu, Ping Lu, and Binfeng Pan. “Survey of convex opti- constraints”. In: 52nd IEEE Conference on Decision and Control. mization for aerospace applications”. In: Astrodynamics 1.1 (Sept. IEEE, Dec. 2013. : . doi 10.1109/cdc.2013.6761017 2017), pp. 23–40. doi: 10.1007/s42064-017-0003-8. [89] Matthew W. Harris and Beh¸cetA¸cıkme¸se.“Lossless convexifi- [111] Alberto Bemporad and Manfred Morari. “Control of systems cation of non-convex optimal control problems for state constrained integrating logic, dynamics, and constraints”. In: Automatica 35.3 linear systems”. In: Automatica 50.9 (Sept. 2014), pp. 2304–2311. (Mar. 1999), pp. 407–427. doi: 10.1016/s0005-1098(98)00178-2. doi: 10.1016/j.automatica.2014.06.008. [112] Danylo Malyuta and Beh¸cet A¸cıkme¸se. “Fast Continuous [90] Danylo Malyuta and Beh¸cetA¸cikme¸se.“Lossless Convexifica- Embedding for Spacecraft Rendezvous Trajectory Optimization tion of Optimal Control Problems with Semi-continuous Inputs”. with Discrete Logic”. In: Journal of Guidance, Control, and In: IFAC-PapersOnLine 53.2 (2020), pp. 6843–6850. doi: 10.1016/ Dynamics (submitted) (2021). j.ifacol.2020.12.341. [113] Tom Schouwenaars. “Safe trajectory planning of autonomous [91] Matthew W. Harris. “Optimal Control on Disconnected Sets vehicles”. Dissertation (Ph.D.) Massachusetts Institute of Technol- using Extreme Point Relaxations and Normality Approximations”. ogy, 2006. In: IEEE Transactions on Automatic Control (2021), pp. 1–1. doi: [114] Michael Posa, Cecilia Cantu, and Russ Tedrake. “A direct 10.1109/tac.2021.3059682. method for trajectory optimization of rigid bodies through con- [92] Sheril Kunhippurayil, Matthew W. Harris, and Olli Jansson. tact”. In: The International Journal of Robotics Research 33.1 “Lossless Convexification of Optimal Control Problems with An- (Oct. 2013), pp. 69–81. doi: 10.1177/0278364913506757. nular Control Constraints”. In: Automatica (2021). [115] Jemin Hwangbo, Joonho Lee, and Marco Hutter. “Per- [93] Sheril Kunhippurayil and Matthew W. Harris. “Strong Ob- Contact Iteration Method for Solving Contact Dynamics”. servability as a Sufficient Condition for Non-singularity in Optimal In: IEEE Robotics and Automation Letters 3.2 (Apr. 2018), Control with Mixed Constraints”. In: Automatica (2021). pp. 895–902. doi: 10.1109/lra.2018.2792536. [94] Henry D’Angelo. Linear Time-Varying Systems: Analysis and [116] Rafal Goebel, Ricardo G. Sanfelice, and Andrew R. Teel. Synthesis. Allyn and Bacon, 1970. “Hybrid dynamical systems”. In: IEEE Control Systems 29.2 (Apr. [95] Panos J. Antsaklis and Anthony N. Michel. Linear Systems. 2009), pp. 28–93. doi: 10.1109/mcs.2008.931718. Birkhauser Basel, 2006. isbn: 9780817644345. doi: 10 . 1007 / 0 - [117] Russ Tedrake. “MIT’s Entry Into the DARPA Robotics Chal- 8176-4435-0. lenge”. In: Brains, Minds and Machines Summer Course. MIT [96] Beh¸cet A¸cıkme¸se,John M. Carson III, and Lars Blackmore. OpenCourseWare. Cambridge MA, 2015. “Lossless Convexification of Nonconvex Control Bound and Point- [118] Henry Hermes and Joseph P. LaSalle. Functional Analysis ing Constraints of the Soft Landing Optimal Control Problem”. and Time Optimal Control. Elsevier, 1969. isbn: 9780080955650. In: IEEE Transactions on Control Systems Technology 21.6 (Nov. [119] Richard Vinter. Optimal Control. Birkhauser, 2000. isbn: 2013), pp. 2104–2113. doi: 10.1109/tcst.2012.2237346. 0817640754. [97] Matthew Wade Harris. “Lossless Convexification of Optimal [120] Francis Clarke. “The Pontryagin maximum principle and a Control Problems”. PhD thesis. Austin: The University of Texas unified theory of dynamic optimization”. In: Proceedings of the at Austin, 2014.

65 Preprint

Steklov Institute of Mathematics 268.1 (Apr. 2010), pp. 58–69. for Industrial and Applied Mathematics, Jan. 1994. doi: doi: 10.1134/s0081543810010062. 10.1137/1.9781611970791. [121] Unsik Lee and Mehran Mesbahi. “Constrained Autonomous [140] Dimitri Bertsekas. Dynamic Programming and Optimal Precision Landing via Dual Quaternions and Model Predictive Control. 4th ed. Nashua, NH: Athena Scientific, 2017. isbn: Control”. In: Journal of Guidance, Control, and Dynamics 40.2 9781886529434. (2017), pp. 292–308. doi: 10.2514/1.G001879. [141] Dimitri Bertsekas. Convex Optimization Algorithms. Nashua, [122] Xinfu Liu and Ping Lu. “Solving Nonconvex Optimal Con- NH: Athena Scientific, 2015. isbn: 9781886529281. trol Problems by Convex Optimization”. In: Journal of Guidance, [142] Melanie Mitchell. An Introduction to Genetic Algo- Control, and Dynamics 37.3 (2014), pp. 750–765. doi: 10.2514/1. rithms. Cambridge, Massachusetts: The MIT Press, 1996. isbn: 62110. 9780262631853. [123] R. Bonalli, B. H´eriss´e,and E. Tr´elat. “Optimal Control of [143] Richard S. Sutton and Andrew G. Barto. Reinforcement Endo-Atmospheric Launch Vehicle Systems: Geometric and Com- Learning. 2nd ed. Cambridge, Massachusetts: The MIT Press, putational Issues”. In: IEEE Transactions on Automatic Control 2018. isbn: 9780262039246. 65.6 (2020), pp. 2418–2433. doi: 10.1109/TAC.2019.2929099. [144] James E. Falk and Richard M. Soland. “An Algorithm for [124] M. Koˇcvara. “On the modeling and solving of the truss de- Separable Nonconvex Programming Problems”. In: Management sign problem with global stability constraints”. In: Structural and Science 15.9 (1969), pp. 550–569. doi: 10.1287/mnsc.15.9.550. Multidisciplinary Optimization 23 (2002), pp. 189–203. doi: 10. [145] Richard M. Soland. “An Algorithm for Separable Nonconvex 1007/s00158-002-0177-3. Programming Problems II: Nonconvex Constraints”. In: Manage- [125] Amir Beck, Aharon Ben-Tal, and Luba Tetruashvili. “A se- ment Science 17.11 (1971), pp. 759–773. doi: 10.1287/mnsc.17. quential parametric convex approximation method with applica- 11.759. tions to nonconvex truss topology design problems”. In: Journal [146] Reiner Horst. “An algorithm for nonconvex programming of Global Optimization 47 (2010), pp. 29–51. doi: 10.1007/s10898- problems”. In: Mathematical Programming 10 (1976), pp. 312–321. 009-9456-5. doi: 10.1007/BF01580678. [126] Wei Wei et al. “Optimal Power Flow of Radial Networks and [147] R. Horst. “On the convexification of nonlinear programming Its Variations: A Sequential Convex Optimization Approach”. In: problems: An applications-oriented survey”. In: European Journal IEEE Transactions on Smart Grid 8.6 (2017), pp. 2974–2987. doi: of Operational Research 15.3 (1984), pp. 382–392. doi: 10.1016/ 10.1109/TSG.2017.2684183. 0377-2217(84)90107-3. [127] Lorenz T. Biegler. “Recent advances in chemical process op- [148] Garth P. McCormick. “Computability of global solutions to timization”. In: Chemie Ingenieur Technik 86.7 (2014), pp. 943– factorable nonconvex programs: Part I - Convex underestimating 952. issn: 15222640. doi: 10.1002/cite.201400033. problems”. In: Mathematical Programming 10 (1976), pp. 147–175. [128] Hao Jiang, Mark S. Drew, and Ze-Nian Li. “Matching by lin- doi: 10.1007/BF01580665. ear programming and successive convexification”. In: IEEE Trans- [149] Alexander Mitsos, Benoˆıt Chachuat, and Paul I. Barton. actions on Pattern Analysis and Machine Intelligence 29.6 (2007), “McCormick-Based Relaxations of Algorithms”. In: SIAM pp. 959–975. doi: 10.1109/TPAMI.2007.1048. Journal on Optimization 20.2 (2009), pp. 573–601. doi: [129] Tony F Chan, Selim Esedo¯glu,and Mila Nikolova. “Algo- 10.1137/080717341. rithms for finding global minimizers of image segmentation and de- [150] A. Tsoukalas and Alexander Mitsos. “Multivariate Mc- noising models”. In: SIAM Journal on Applied Mathematics 66.5 Cormick relaxations”. In: Journal of Global Optimization 59 (2006), pp. 1632–1648. (2014), pp. 633–662. doi: 10.1007/s10898-014-0176-0. [130] Space Exploration Technologies Corp. Starship — SN10 — [151] Adam B. Singer and Paul I. Barton. “Global solution of opti- High-Altitude Flight Recap. https://www.youtube.com/watch?v= mization problems with parameter-embedded linear dynamic sys- gA6ppby3JC8. Mar. 2021. tems”. In: Journal of Optimization Theory and Applications 121 [131] Danylo Malyuta et al. Starship landing SCP example. https: (2004), pp. 613–646. doi: 10.1023/B:JOTA.0000037606.79050.a7. //github.com/dmalyuta/scp_traj_opt. Apr. 2021. [152] Adam B. Singer and Paul I. Barton. “Bounding the Solutions [132] D. Rocha, Cristiana J. Silva, and Delfim F. M. Torres. “Sta- of Parameter Dependent Nonlinear Ordinary Differential Equa- bility and optimal control of a delayed HIV model”. In: Mathemat- tions”. In: SIAM Journal on Scientific Computing 27.6 (2006), ical Methods in the Applied Sciences 41.6 (2018), pp. 2251–2260. pp. 2167–2182. doi: 10.1137/040604388. doi: 10.1002/mma.4207. [153] Reiner Horst and N. V. Thoai. “DC programming: overview”. [133] Cristiana J. Silva, Helmut Maurer, and Delfim F. M. Tor- In: Journal of Optimization Theory and Applications 103.1 (1999), res. “Optimal control of a Tuberculosis model with state and con- pp. 1–43. issn: 00223239. doi: 10.1023/a:1021765131316. trol delays”. In: Mathematical Biosciences and Engineering 14.1 [154] Thomas Lipp and Stephen Boyd. “Variations and extension (2017), pp. 321–337. doi: 10.3934/mbe.2017021. of the convex–concave procedure”. In: Optimization and Engi- [134] Robert Dorfman. “An economic interpretation of optimal neering 17.2 (2016), pp. 263–287. issn: 15732924. doi: 10.1007/ control theory”. In: American Economic Review 59.5 (1969), s11081-015-9294-x. pp. 817–831. doi: https://www.jstor.org/stable/1810679. [155] A. L. Yuille and Anand Rangarajan. “The Concave-Convex [135] M.R. Caputo. Foundations of Dynamic Economic Procedure”. In: Neural Computation 15.4 (2003), pp. 915–936. doi: Analysis. Cambridge University Press, 2005. doi: 10 . 1017 / 10.1162/08997660360581958. CBO9780511806827. [156] Gert R. Lanckriet and Bharath K. Sriperumbudur. “On the [136] Julio R Banga. “Optimization in computational systems bi- Convergence of the Concave-Convex Procedure”. In: Advances in ology”. In: BMC Systems Biology 2.47 (2008). doi: 10.1186/1752- Neural Information Processing Systems 22. Curran Associates, 0509-2-47. Inc., 2009, pp. 1759–1767. [137] Anne Goelzer, Vincent Fromion, and G´erardScorletti. “ [157] F. Palacios-Gomez, L. Lasdon, and M. Engquist. “Nonlinear design in bacteria as a convex optimization problem”. In: Auto- Optimization by Successive Linear Programming”. In: Manage- matica 47.6 (2011), pp. 1210–1218. doi: 10.1016/j.automatica. ment Science 28.10 (1982), pp. 1106–1120. doi: 10.1287/mnsc.28. 2011.02.038. 10.1106. [138] Matti Liski, Peter M Kort, and Andreas Novak. “Increasing [158] Richard H. Byrd et al. “An algorithm for nonlinear optimiza- returns and cycles in fishing”. In: Resource and Energy Economics tion using linear programming and equality constrained subprob- 23.3 (2001), pp. 241–258. doi: 10.1016/S0928-7655(01)00038-0. lems”. In: Mathematical Programing, Series B 100 (2003), pp. 27– [139] Yurii Nesterov and Arkadii Nemirovskii. Interior-Point 48. doi: 10.1007/s10107-003-0485-4. Polynomial Algorithms in Convex Programming. Society

66 Preprint

[159] S. P. Han. “A Globally Convergent Method for Nonlinear [178] Taylor P. Reynolds. “Computation Guidance and Control for Programming”. In: Journal of Optimization Theory and Applica- Aerospace Systems”. PhD thesis. Seattle, WA: University of Wash- tions 22.3 (1977), pp. 297–309. doi: 10.1007/BF00932858. ington, 2021. [160] S. P. Han and Olvi L. Mangasarian. “Exact penalty func- [179] J. Schulman et al. “Motion planning with sequential convex tions in nonlinear programming”. In: Mathematical Programming optimization and convex collision checking”. In: The International 17 (1979), pp. 251–269. doi: 10.1007/BF01588250. Journal of Robotics Research 33.9 (2014), pp. 1251–1270. doi: 10. [161] Michael J. D. Powell. “A fast algorithm for nonlinearly con- 1177/0278364914528132. strained optimization calculations”. In: Numerical Analysis. Ed. by [180] Xinfu Liu, Zuojun Shen, and Ping Lu. “Solving the G. A. Watson. Springer, Berlin, Heidelberg, 1978. Chap. Lecture maximum-crossrange problem via successive second-order cone Notes in Mathematics, pp. 144–157. doi: 10.1007/BFb0067703. programming with a line search”. In: Aerospace Science and Tech- [162] Michael J. D. Powell. “Algorithms for nonlinear constraints nology 47 (2015), pp. 10–20. doi: 10.1016/j.ast.2015.09.008. that use lagrangian functions”. In: Mathematical Programming 14 [181] Marco Sagliano. “Pseudospectral Convex Optimization for (1978), pp. 224–248. doi: 10.1007/BF01588967. Powered Descent and Landing”. In: Journal of Guidance, Control, [163] Michael J. D. Powell and Y. Yuan. “A recursive quadratic and Dynamics 41.2 (2017), pp. 320–334. doi: 10.2514/1.G002818. programming algorithm that uses differentiable exact penalty func- [182] Pedro Simpl´ıcio,Andr´esMarcos, and Samir Bennani. “Guid- tions”. In: Mathematical Programming 35 (1986), pp. 265–278. doi: ance of Reusable Launchers: Improving Descent and Landing Per- 10.1007/BF01580880. formance”. In: Journal of Guidance, Control, and Dynamics 42.10 [164] Paul T. Boggs, Jon W. Tolle, and Pyng Wang. “On the Local (2019), pp. 2206–2219. doi: 10.2514/1.G004155. Convergence of Quasi-Newton Methods for Constrained Optimiza- [183] Y. Mao, M. Szmuk, and B. A¸cıkme¸se.“Successive convexifi- tion”. In: SIAM Journal of Control and Optimization 20.2 (1982), cation of non-convex optimal control problems and its convergence pp. 161–171. doi: 10.1137/0320014. properties”. In: Conference on Decision and Control. IEEE, 2016, [165] Paul T. Boggs and Jon W. Tolle. “A Family of Descent Func- pp. 3636–3641. doi: 10.1109/CDC.2016.7798816. tions for Constrained Optimization”. In: SIAM Journal on Numer- [184] Michael Szmuk. “Successive Convexification & High Perfor- ical Analysis 21.6 (1984), pp. 1146–1161. doi: 10.1137/0721071. mance Feedback Control for Agile Flight”. PhD thesis. Seattle, [166] Paul T. Boggs and W. Jon Tolle. “A Strategy for Global WA: University of Washington, 2019. Convergence in a Sequential Quadratic Programming Algorithm”. [185] Harish Saranathan and Michael J. Grant. “Relaxed Au- In: SIAM Journal of Numerical Analysis 26.3 (1989), pp. 600–623. tonomously Switched Hybrid System Approach to Indirect doi: 10.1137/0726036. Multiphase Aerospace Trajectory Optimization”. In: Journal [167] Masao Fukushima. “A Successive Quadratic Programming of Spacecraft and Rockets 55.3 (May 2018), pp. 611–621. doi: Algorithm with Global and Superlinear Convergence Properties”. 10.2514/1.a34012. In: Mathematical Programming 35 (1986), pp. 253–264. doi: 10. [186] Ehsan Taheri et al. “A novel approach for optimal trajectory 1007/BF01580879. design with multiple operation modes of propulsion system, part [168] Paul T. Boggs and W. Jon Tolle. “Sequential Quadratic Pro- 1”. In: Acta Astronautica 172 (July 2020), pp. 151–165. doi: 10. gramming”. In: Acta Numerica 4 (1995), pp. 1–52. doi: 10.1017/ 1016/j.actaastro.2020.02.042. S0962492900002518. [187] Daniel Liberzon. Calculus of variations and optimal control [169] Richard H. Byrd, Robert B. Schnabel, and Gerald A. Shultz. theory : a concise introduction. Princeton, N.J: Princeton Univer- “Approximate solution of the trust region problem by minimization sity Press, 2012. isbn: 9781400842643. over two-dimensional subspaces”. In: Mathematical Programming [188] Ken Shoemake. “Animating rotation with quaternion 40-40.1-3 (Jan. 1988), pp. 247–263. doi: 10.1007/bf01580735. curves”. In: Proceedings of the 12th annual conference on Com- [170] Philip E. Gill, Walter Murray, and Michael A. Saunders. puter graphics and interactive techniques - SIGGRAPH 1985. “SNOPT: An SQP algorithm for large-scale constrained optimiza- ACM Press, 1985. doi: 10.1145/325334.325242. tion”. In: SIAM Review 47.1 (2005), pp. 99–131. doi: 10.1137/ [189] Taylor P. Reynolds and Mehran Mesbahi. “Optimal Planar s0036144504446096. Powered Descent with Independent Thrust and Torque”. In: [171] John T. Betts and William P. Huffman. “Path-constrained Journal of Guidance, Control, and Dynamics 43.7 (July 2020), trajectory optimization using sparse sequential quadratic pro- pp. 1225–1231. doi: 10.2514/1.g004701. gramming”. In: Journal of Guidance, Control, and Dynamics [190] Danylo Malyuta et al. “Discretization Performance and Ac- 16.1 (1993), pp. 59–68. doi: 10.2514/3.11428. curacy Analysis for the Powered Descent Guidance Problem”. In: [172] Craig T. Lawrence and Andr´eL. Tits. “A Computationally AIAA SciTech Forum. 2019. doi: 10.2514/6.2019-0925. Efficient Feasible Sequential Quadratic Programming Algorithm”. [191] Beh¸cetA¸cıkme¸seet al. “Enhancements on the Convex Pro- In: SIAM Journal on Optimization 11.4 (2001), pp. 1092–1118. gramming Based Powered Descent Guidance Algorithm for Mars doi: 10.1137/s1052623498344562. Landing”. In: AIAA/AAS Astrodynamics Specialist Conference [173] Philip E. Gill and Elizabeth Wong. “Sequential quadratic and Exhibit. American Institute of Aeronautics and Astronautics, programming methods”. In: Mixed Integer Nonlinear Program- Aug. 2008. doi: 10.2514/6.2008-6426. ming. Ed. by J. Lee and S. Leyffer. Springer New York, 2012, [192] Daniel Dueri et al. “Trajectory optimization with inter- pp. 147–224. doi: 10.1007/978-1-4614-1927-3_6. sample obstacle avoidance via successive convexification”. In: 2017 [174] Roger Fletcher. Practical Methods of Optimization. John Wi- IEEE 56th Annual Conference on Decision and Control (CDC). ley & Sons, 2013. doi: 10.1002/9781118723203. IEEE, Dec. 2017. doi: 10.1109/cdc.2017.8263811. [175] B. Fares, D. Noll, and P. Apkarian. “Robust control via se- [193] Morris W. Hirsch, Stephen Smale, and Robert L. Devaney. quential semidefinite programming”. In: SIAM Journal of Con- Differential Equations, Dynamical Systems, and an Introduction trol and Optimization 40.6 (2002), pp. 1791–1820. doi: 10.1137/ to Chaos. Elsevier, 2013. doi: 10.1016/c2009-0-61160-0. S0363012900373483. [194] Francesco Bullo and Andrew D. Lewis. Geometric Control of [176] Stephen Boyd et al. Linear Matrix Inequalities in System and Mechanical Systems. Springer New York, 2005. doi: 10.1007/978- Control Theory. SIAM, Philadelphia, PA, 1994. doi: 10.1137/1. 1-4899-7276-7. 9781611970777. [195] Charles Dugas et al. “Incorporating Second-Order Functional [177] Taylor Reynolds et al. “Funnel Synthesis for the 6-DOF Pow- Knowledge for Better Option Pricing”. In: Proceedings of the 13th ered Descent Guidance Problem”. In: AIAA Scitech 2021 Forum. International Conference on Neural Information Processing Sys- American Institute of Aeronautics and Astronautics, Jan. 2021. tems. NIPS’00. Denver, CO: MIT Press, 2000, pp. 451–457. doi: 10.2514/6.2021-0504. [196] P. A. Absil, R. Mahony, and B. Andrews. “Convergence of the Iterates of Descent Methods for Analytic Cost Functions”. In:

67 Preprint

SIAM Journal on Optimization 16.2 (Jan. 2005), pp. 531–547. doi: [215] Autonomous Controls Laboratory. Scenario 5: Quadrotor 10.1137/040605266. Vehicle Tracking Performance. https://www.youtube.com/watch? [197] Norbert Schorghofer. Lessons in scientific computing: nu- v=Ou7ZfyiXeyw. Oct. 2020. merical mathematics, computer technology, and scientific discov- [216] Alexander Domahidi, Eric Chu, and Stephen Boyd. “ECOS: ery. CRC Press, 2018. isbn: 9781315108353. An SOCP solver for embedded systems”. In: 2013 European Con- [198] I Michael Ross et al. “Scaling and Balancing for High- trol Conference (ECC). IEEE, July 2013, pp. 3071–3076. doi: 10. Performance Computation of Optimal Controls”. In: Journal of 23919/ecc.2013.6669541. Guidance, Control, and Dynamics 41.10 (2018), pp. 2086–2097. [217] National Aeronautics and Space Administration. What is As- doi: 10.2514/1.G003382. trobee? https://www.nasa.gov/astrobee. Nov. 2020. [199] Danylo Malyuta. “An Optimal Endurance Power Limiter for [218] Japan Aerospace Exploration Agency. First disclosure of im- an Electric Race Car Developed for the AMZ Racing Team”. IDSC- ages taken by the Kibo’s internal drone “Int-Ball”. https://iss. CO-PE-22. Semester project. Zurich, Switzerland: ETH Zurich, In- jaxa.jp/en/kiboexp/news/170714_int_ball_en.html. July 2017. stitute for Dynamic Systems and Control, Sept. 2016. [219] M. A. Estrada et al. “Free-flyer acquisition of spinning objects [200] James Renegar. A Mathematical View of Interior-Point with gecko-inspired adhesives”. In: IEEE International Conference Methods in Convex Optimization. Society for Industrial and on Robotics and Automation. Stockholm, Sweden, 2016, pp. 4907– Applied Mathematics, Jan. 2001. doi: 10.1137/1.9780898718812. 4913. doi: 10.1109/ICRA.2016.7487696. [201] Taylor P Reynolds and Mehran Mesbahi. “The Crawling Phe- [220] M. Mote et al. “Collision-Inclusive Trajectory Optimization nomenon in Sequential Convex Programming”. In: American Con- for Free-Flying Spacecraft”. In: Journal of Guidance, Control, and trol Conference. Denver, CO: IEEE, 2020, pp. 3613–3618. Dynamics 43.7 (2020), pp. 1247–1258. doi: 10.2514/1.G004788. [202] Robert H. Goddard. A method of reaching extreme altitudes. [221] T. Smith et al. “Astrobee: a new platform for free-flying 1920. doi: 10.1038/105809a0. robotics on the international space station”. In: International [203] P. Lu. “Entry guidance and trajectory control for reusable Symposium on Artificial Intelligence, Robotics and Automation launch vehicle”. In: Journal of Guidance, Control, and Dynamics in Space. Beijing, China, 2016. 20.1 (1997), pp. 143–149. doi: 10.2514/2.4008. [222] J. Barlow et al. “Astrobee: A New Platform for Free-Flying [204] B. Bonnard et al. “Optimal control with state constraints Robotics on the International Space Station”. In: International and the space shuttle re-entry problem”. In: Journal of Dynamical Symposium on Artificial Intelligence, Robotics, and Automation and Control Systems 9 (2003), pp. 155–199. doi: 10 . 1023 / A : in Space (i-SAIRAS). 2016. 1023289721398. [223] Daniel Szafir, Bilge Mutlu, and Terry Fong. “Communicat- [205] N. Ratliff et al. “CHOMP: Gradient optimization techniques ing Directionality in Flying Robots”. In: Proceedings of the Tenth for efficient motion planning”. In: IEEE International Conference Annual ACM/IEEE International Conference on Human-Robot on Robotics and Automation. IEEE, 2009, pp. 489–494. doi: 10. Interaction. ACM, Mar. 2015. doi: 10.1145/2696454.2696475. 1109/ROBOT.2009.5152817. [224] Maria Bualat et al. “Astrobee: Developing a Free-flying [206] M. Kalakrishnan et al. “STOMP: Stochastic trajectory op- Robot for the International Space Station”. In: AIAA SPACE 2015 timization for motion planning”. In: IEEE International Confer- Conference and Exposition. American Institute of Aeronautics ence on Robotics and Automation. IEEE, 2011, pp. 4569–4574. and Astronautics, Aug. 2015. doi: 10.2514/6.2015-4643. doi: 10.1109/ICRA.2011.5980280. [225] JAXA. Ultra-compact triaxial attitude control module- [207] L. E. Kavraki et al. “Probabilistic roadmaps for path plan- Application to “Kibo” inboard drone (Int-ball). https : ning in high-dimensional configuration spaces”. In: IEEE Transac- //youtu.be/ZtIARUS7Lqc. July 2017. tions on Robotics and Automation 12.4 (1996), pp. 566–580. doi: [226] Pedro Roque and Rodrigo Ventura. “Space CoBot: Modu- 10.1109/70.508439. lar design of an holonomic aerial robot for indoor microgravity [208] Steven M. LaValle and James J. Kuffner Jr. “Ran- environments”. In: 2016 IEEE/RSJ International Conference on domized kinodynamic planning”. In: The International Intelligent Robots and Systems (IROS). IEEE, Oct. 2016. doi: Journal of Robotics Research 20.5 (2001), pp. 378–400. doi: 10.1109/IROS.2016.7759645. 10.1177/02783640122067453. [227] Tomas Akenine-M¨olleret al. Real-Time Rendering. 4th ed. [209] A. Majumdar and R. Tedrake. “Funnel libraries for real-time CRC Press, July 2018. doi: 10.1201/b22086. robust feedback motion planning”. In: The International Journal [228] Erwin Coumans and Yunfei Bai. PyBullet, a Python module of Robotics Research 36.8 (2017), pp. 947–982. doi: 10 . 1177 / for physics simulation for games, robotics and machine learning. 0278364917712421. http://pybullet.org. 2020. [210] Scott Ploen, Bechet Acikmese, and Aron Wolf. “A Compar- [229] Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep ison of Powered Descent Guidance Laws for Mars Pinpoint Land- Learning. Cambridge, Massachusetts: The MIT Press, 2016. isbn: ing”. In: AIAA/AAS Astrodynamics Specialist Conference and Ex- 9780262035613. hibit. American Institute of Aeronautics and Astronautics, Aug. [230] Steven Michael LaValle. Planning Algorithms. Cambridge, 2006. doi: 10.2514/6.2006-6676. UK: Cambridge University Press, 2006. isbn: 9780521862059. [211] John M. Carson et al. “Capabilities of convex Powered- [231] Joan Sola. “Quaternion kinematics for the error-state Descent Guidance algorithms for pinpoint and precision Kalman filter”. In: CoRR (2017). landing”. In: 2011 Aerospace Conference. IEEE, Mar. 2011. doi: [232] A. Zanelli et al. “FORCES NLP: an efficient implementa- 10.1109/aero.2011.5747244. tion of interior-point methods for multistage nonlinear nonconvex [212] A.A. Wolf et al. “Performance Trades for Mars Pinpoint programs”. In: International Journal of Control 93.1 (May 2017), Landing”. In: 2006 IEEE Aerospace Conference. IEEE, 2006. doi: pp. 13–29. doi: 10.1080/00207179.2017.1316017. 10.1109/aero.2006.1655793. [213] A. A. Wolf et al. “Improving the landing precision of an MSL- class vehicle”. In: 2012 IEEE Aerospace Conference. IEEE, Mar. 2012. doi: 10.1109/aero.2012.6187005. [214] Skye Mceowen et al. “Visual Autonomy Interface for Optimization-Based Real-Time Motion Planning and Control”. In: IEEE International Conference on Robotics and Automation. Under review. IEEE, Oct. 2020.

68