
2005 American Control Conference, June 8-10, 2005, Portland, OR, USA. Paper ThA18.2

An Efficient Sequential Linear Quadratic Algorithm for Solving Nonlinear Optimal Control Problems

Athanasios Sideris and James E. Bobrow Department of Mechanical and Aerospace Engineering University of California, Irvine Irvine, CA 92697 {asideris, jebobrow}@uci.edu

Abstract— We develop a numerically efficient algorithm for computing controls for nonlinear systems that minimize a quadratic performance measure. We formulate the optimal control problem in discrete-time, but many continuous-time problems can also be solved after discretization. Our approach is similar to sequential quadratic programming for finite-dimensional optimization problems in that we solve the nonlinear optimal control problem using a sequence of linear quadratic subproblems. Each subproblem is solved efficiently using the Riccati difference equation. We show that each iteration produces a descent direction for the performance measure, and that the sequence of controls converges to a solution that satisfies the well-known necessary conditions for the optimal control.

I. INTRODUCTION

As the complexity of nonlinear systems such as robots increases, it is becoming more important to find controls for them that minimize a performance measure such as power consumption. Although the Maximum Principle [1] provides the optimality conditions for minimizing a given cost function, it does not provide a method for its numerical computation. Because of the importance of solving these problems, many numerical algorithms and commercial software packages have been developed since the 1960's [2]. The various approaches taken can be classified as either indirect or direct methods. Indirect methods explicitly solve the optimality conditions stated in terms of the maximum principle, the adjoint equations, and the transversality (boundary) conditions. Direct methods, such as collocation [3] techniques or direct transcription, replace the ODE's with algebraic constraints using a large set of unknowns. These collocation methods are powerful for solving trajectory optimization problems [4], [5] and problems with inequality constraints. However, due to the large-scale nature of these problems, convergence can be slow. Furthermore, optimal control problems can be difficult to solve numerically for a variety of reasons. For instance, the nonlinear system dynamics may create an unfavorable eigenvalue structure of the Hessian of the ensuing optimization problem, and gradient-based descent methods will be inefficient.

In this paper, we focus on the case of finding optimal controls for nonlinear dynamic systems with general linear quadratic performance indexes (including tracking and regulation). The discrete-time case is considered, but continuous-time optimal control problems can be handled after a discretization. We are interested primarily in systems for which the derivatives of the dynamics with respect to the state and the control are available, but whose second derivatives with respect to the states and the controls are too complex to compute analytically. Our algorithm is based on linearizing the system dynamics about an input/state trajectory and solving a corresponding linear quadratic optimal control problem. From the solution of the latter problem, we obtain a search direction along which we minimize the performance index by direct simulation of the system dynamics. Given the structure of the proposed algorithm, we refer to it in the sequel as the Sequential Linear Quadratic (SLQ) algorithm. We prove that search directions produced in this manner are descent directions, and that this algorithm converges to a control that locally minimizes the cost function. Solution of the linear quadratic problem is well known, and can be reliably obtained by solving a Riccati difference equation.

Algorithms similar in spirit are reported in [6], [7], [8], [9]. Such algorithms implement a Newton search, or an asymptotically Newton search, but require that 2nd-order system derivatives be available. Newton algorithms can achieve quadratic local convergence under favorable circumstances, such as the existence of continuous 3rd-order or Lipschitz continuous 2nd-order derivatives, but cannot guarantee global convergence (that is, convergence from any starting point to a local minimum) unless properly modified. Many systems are too complex for 2nd-order derivatives to be available, or do not even satisfy such strong continuity assumptions (e.g. see Section IV for examples). In such cases, Newton's method cannot be applied, or the quadratic convergence rate of Newton's method does not materialize. We have found that our approach efficiently solves optimal control problems that are difficult to solve with other popular algorithms such as collocation methods (see the example in Section IV). More specifically, we have observed that our algorithm exhibits near-quadratic convergence in many of the problems that we have tested. Indeed, it is shown in the full version of the paper [10] that the proposed algorithm can be interpreted as a Gauss-Newton method, thus explaining the excellent rate of convergence observed in simulations. Thus, although many of the alternative methods ([2], [12], [11], [8]) can be applied to a broader class of problems, our SLQ algorithm provides a fast and reliable alternative to such algorithms for the important class of optimal control problems with quadratic cost under general nonlinear dynamics, while relying only on first derivative information.

II. PROBLEM FORMULATION AND BACKGROUND RESULTS

We consider the following general formulation of discrete-time optimal control problems:

    \min_{u(n),\, x(n)} \; J = \phi(x(N)) + \sum_{n=0}^{N-1} L(x(n), u(n), n)    (1)

    subject to \; x(n+1) = f(x(n), u(n)); \quad x(0) = x_0.    (2)

In the formulation above we assume a quadratic performance index, namely:

    L(x(n), u(n), n) = \frac{1}{2} \{ [x(n) - x^o(n)]^T Q(n) [x(n) - x^o(n)] + [u(n) - u^o(n)]^T R(n) [u(n) - u^o(n)] \}    (3)

and

    \phi(x) = \frac{1}{2} [x - x^o(N)]^T Q(N) [x - x^o(N)].    (4)

In (3) and (4), $u^o(n)$, $x^o(n)$, $n = 1, \ldots, N$, are given control input and state target sequences. In standard optimal regulator control problem formulations, $u^o(n)$ and $x^o(n)$ are usually taken to be zero, with the exception perhaps of $x^o(N)$, the desired final value for the state. The formulation considered here addresses the more general optimal tracking control problem and is required for the linear quadratic step in the proposed algorithm presented in Section III.

A. First Order Optimality Conditions

We next briefly review the first order optimality conditions for the optimal control problem of (1) and (2), in a manner that brings out certain important interpretations of the adjoint dynamical equations encountered in a control theoretic approach and the Lagrange multipliers found in a pure optimization theory approach.

Let us consider the cost-to-go:

    J(n) \equiv \sum_{k=n}^{N-1} L(x(k), u(k), k) + \phi(x(N))    (5)

with $L$ and $\phi$ as defined in (3) and (4) respectively. We remark that $J(n)$ is a function of $x(n)$ and $u(k)$, $k = n, \ldots, N-1$, and introduce the sensitivity of the cost-to-go with respect to the current state:

    \lambda^T(n) = \frac{\partial J(n)}{\partial x(n)}.    (6)

From

    J(n) = L(x(n), u(n), n) + J(n+1),    (7)

and since

    \frac{\partial J(n+1)}{\partial x(n)} = \frac{\partial J(n+1)}{\partial x(n+1)} \cdot \frac{\partial x(n+1)}{\partial x(n)} = \lambda^T(n+1) f_x(x(n), u(n)),

we have the recursion:

    \lambda^T(n) = L_x(x(n), u(n), n) + \lambda^T(n+1) f_x(x(n), u(n))
                 = [x(n) - x^o(n)]^T Q(n) + \lambda^T(n+1) f_x(x(n), u(n))    (8)

by using (3), and where $L_x$ and $f_x$ denote the partials of $L$ and $f$ respectively with respect to the state variables. The previous recursion can be solved backward in time ($n = N-1, \ldots, 0$) given the control and state trajectories, and it can be started with the final value

    \lambda^T(N) = \frac{\partial \phi}{\partial x}(N) = [x(N) - x^o(N)]^T Q(N)    (9)

derived from (4). In a similar manner, we compute the sensitivity of $J(n)$ with respect to the current control $u(n)$. Clearly, from (7),

    \frac{\partial J(n)}{\partial u(n)} = L_u(x(n), u(n), n) + \lambda^T(n+1) f_u(x(n), u(n))
                                        = [u(n) - u^o(n)]^T R(n) + \lambda^T(n+1) f_u(x(n), u(n)).    (10)

In (10), $L_u$ and $f_u$ denote the partials of $L$ and $f$ respectively with respect to the control variables, and (3) is used. Next note that $\partial J / \partial u(n) = \partial J(n) / \partial u(n)$, since the first $n$ terms in $J$ do not depend on $u(n)$. We have then obtained the gradient of the cost with respect to the control variables, namely:

    \nabla_u J = \left[ \frac{\partial J(0)}{\partial u(0)} \;\; \frac{\partial J(1)}{\partial u(1)} \;\; \cdots \;\; \frac{\partial J(N-1)}{\partial u(N-1)} \right].    (11)

Assuming $u$ is unconstrained, the first order optimality conditions require that

    \nabla_u J = 0.    (12)

We remark that by considering the Hamiltonian

    H(x, u, \lambda, n) \equiv L(x, u, n) + \lambda^T f(x, u),    (13)

we have that $H_u(x(n), u(n), \lambda(n+1), n) \equiv \partial J / \partial u(n)$, i.e. we uncover the generally known but frequently overlooked fact that the partial of the Hamiltonian with respect to the control variables $u$ is the gradient of the cost function with respect to $u$.

We emphasize here that in our approach for solving the optimal control problem, we take the viewpoint of the control variables $u(n)$ being the independent variables of the problem, since the dynamical equations express (recursively) the state variables in terms of the controls, and the states can thus be eliminated from the cost function. Thus, in taking the partials of $J$ with respect to $u$, $J$ is considered as a function of $u(n)$, $n = 0, \ldots, N-1$, alone, assuming that $x(0)$ is given. With this perspective, the problem becomes one of unconstrained minimization, and having computed $\nabla_u J$, Steepest Descent, Quasi-Newton, and other first derivative methods can be brought to bear to solve it. However, due to the large-scale character of the problem, only methods that take advantage of the special structure of the problem become viable. The Linear Quadratic Regulator algorithm is such an approach in the case of linear dynamics. We briefly review it next.
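The recursion (8)-(10) maps directly onto a single backward pass over a simulated trajectory. A minimal sketch under the same assumed conventions as above, with hypothetical Jacobian callbacks `fx(x, u)` and `fu(x, u)` returning $f_x$ and $f_u$:

```python
def gradient(X, U, fx, fu, Q, R, x_tgt, u_tgt):
    """Blocks dJ/du(n) of the gradient (11), computed via (8)-(10).

    At the top of each loop pass, lam holds lambda(n+1) as a column vector.
    """
    N = len(U)
    lam = Q[-1] @ (X[-1] - x_tgt[-1])                # final value (9)
    g = [None] * N
    for n in reversed(range(N)):
        g[n] = R[n] @ (U[n] - u_tgt[n]) + fu(X[n], U[n]).T @ lam   # (10)
        lam = Q[n] @ (X[n] - x_tgt[n]) + fx(X[n], U[n]).T @ lam    # (8)
    return g
```

One backward sweep thus yields the entire gradient for roughly the cost of one simulation, which is what makes first derivative methods viable at this scale.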
B. Linear Quadratic Tracking Problem

We next consider the case of linear dynamics in the optimal control problem of (1) and (2). In the following, we distinguish all variables corresponding to the linear optimal control problem, which may have different values in the nonlinear optimal control problem, by using an over-bar. When the cost is quadratic as in (3), we have the well-known Linear Quadratic Tracking problem. The control theoretic approach to this problem is based on solving the first order necessary optimality conditions (also sufficient in this case) in an efficient manner by introducing the Riccati equation. We briefly elaborate on this derivation next, for completeness and also since most references assume that the target sequences $x^o(n)$ and $u^o(n)$ are zero. First, we summarize the first order necessary optimality conditions for this problem:

    \bar{x}(n+1) = A(n)\bar{x}(n) + B(n)\bar{u}(n)    (14)

    \bar{\lambda}^T(n) = [\bar{x}(n) - \bar{x}^o(n)]^T Q(n) + \bar{\lambda}^T(n+1) A(n)    (15)

    \partial \bar{J}(n) / \partial \bar{u}(n) = [\bar{u}(n) - \bar{u}^o(n)]^T R(n) + \bar{\lambda}^T(n+1) B(n) = 0.    (16)

Note that the system dynamical equations (14) run forward in time ($n = 0, 1, \ldots, N-1$) with the initial condition $\bar{x}(0) = x_0$ given, while the adjoint dynamical equations (15) run backward in time ($n = N-1, \ldots, 0$) with the final condition $\bar{\lambda}^T(N) = [\bar{x}(N) - \bar{x}^o(N)]^T Q(N)$. From (16), we obtain

    \bar{u}(n) = \bar{u}^o(n) - R^{-1}(n) B^T(n) \bar{\lambda}(n+1)    (17)

and by substituting in (14) and (15), we obtain the classical two-point boundary value system, but with additional forcing terms due to the $\bar{x}^o(n)$ and $\bar{u}^o(n)$ sequences:

    \bar{x}(n+1) = A(n)\bar{x}(n) - B(n) R^{-1}(n) B^T(n) \bar{\lambda}(n+1) + B(n)\bar{u}^o(n)    (18)

    \bar{\lambda}(n) = Q(n)\bar{x}(n) + A^T(n)\bar{\lambda}(n+1) - Q(n)\bar{x}^o(n).    (19)

The system of (18) and (19) can be solved by the sweep method [1], based on the postulated relation

    \bar{\lambda}(n) = P(n)\bar{x}(n) + s(n)    (20)

where $P(n)$ and $s(n)$ are appropriate matrices that can be found as follows. For $n = N$, (20) holds with

    P(N) = Q(N), \quad s(N) = -Q(N)\bar{x}^o(N).    (21)

We now substitute (20) in (18), and after some algebra we obtain

    \bar{x}(n+1) = M(n) A(n) \bar{x}(n) + v(n)    (22)

where we defined

    M(n) = [I + B(n) R^{-1}(n) B^T(n) P(n+1)]^{-1}    (23)

    v(n) = M(n) B(n) [\bar{u}^o(n) - R^{-1}(n) B^T(n) s(n+1)].    (24)

By replacing $\bar{\lambda}(n)$ and $\bar{\lambda}(n+1)$ in (19) in terms of $\bar{x}(n)$ and $\bar{x}(n+1)$ from (20), we obtain

    P(n)\bar{x}(n) + s(n) = Q(n)\bar{x}(n) + A^T(n) [P(n+1)\bar{x}(n+1) + s(n+1)] - Q(n)\bar{x}^o(n),

and by expressing $\bar{x}(n+1)$ from (22) and (24) above, we get

    P(n)\bar{x}(n) + s(n) = Q(n)\bar{x}(n) + A^T(n) P(n+1) M(n) A(n) \bar{x}(n)
        - A^T(n) P(n+1) M(n) B(n) R^{-1}(n) B^T(n) s(n+1)
        + A^T(n) P(n+1) M(n) B(n) \bar{u}^o(n) + A^T(n) s(n+1) - Q(n)\bar{x}^o(n).

The above equation is satisfied by taking

    P(n) = Q(n) + A^T(n) P(n+1) M(n) A(n)    (25)

    s(n) = A^T(n) [I - P(n+1) M(n) B(n) R^{-1}(n) B^T(n)] s(n+1) + A^T(n) P(n+1) M(n) B(n) \bar{u}^o(n) - Q(n)\bar{x}^o(n).    (26)

Equation (25) is the well-known Riccati difference equation; together with the auxiliary equation (26), which is unnecessary if $\bar{x}^o(n)$ and $\bar{u}^o(n)$ are zero, it is solved backward in time ($n = N-1, \ldots, 1$), with final values given by (21) and together with (23) and (24). The resulting values $P(n)$ and $s(n)$ are stored and used to solve (22) and (17) forward in time for the optimal control and state trajectories.
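In the same illustrative style, the sweep method consists of the backward recursions (21), (23)-(26) followed by the forward pass (22), (20), (17). The sketch below is one possible arrangement under the conventions assumed earlier, not the authors' implementation:

```python
import numpy as np

def lq_tracking(A, B, Q, R, x_tgt, u_tgt, x0):
    """Solve the LQ tracking problem of Section II.B by the sweep method.

    A, B are lists of the stage matrices A(n), B(n) for n = 0..N-1;
    backward pass: (21), (23)-(26); forward pass: (22), (20), (17).
    """
    N = len(A)
    n_x = len(x0)
    P, s, M = [None] * (N + 1), [None] * (N + 1), [None] * N
    P[N] = Q[N]                                   # (21)
    s[N] = -Q[N] @ x_tgt[N]
    for n in reversed(range(N)):
        RiBT = np.linalg.solve(R[n], B[n].T)      # R^{-1}(n) B^T(n)
        M[n] = np.linalg.inv(np.eye(n_x) + B[n] @ RiBT @ P[n + 1])      # (23)
        P[n] = Q[n] + A[n].T @ P[n + 1] @ M[n] @ A[n]                   # (25)
        s[n] = (A[n].T @ (np.eye(n_x) - P[n + 1] @ M[n] @ B[n] @ RiBT) @ s[n + 1]
                + A[n].T @ P[n + 1] @ M[n] @ B[n] @ u_tgt[n]
                - Q[n] @ x_tgt[n])                                      # (26)
    x = np.asarray(x0, dtype=float)
    X, U = [x], []
    for n in range(N):
        v = M[n] @ B[n] @ (u_tgt[n] - np.linalg.solve(R[n], B[n].T @ s[n + 1]))  # (24)
        x = M[n] @ A[n] @ x + v                                         # (22)
        lam = P[n + 1] @ x + s[n + 1]                                   # (20)
        U.append(u_tgt[n] - np.linalg.solve(R[n], B[n].T @ lam))        # (17)
        X.append(x)
    return X, U
```

Only matrices of state and control dimension are ever inverted, so the work grows linearly with the horizon $N$ rather than cubically with the total number of decision variables.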
III. MAIN RESULTS

A. Formulation of the SLQ Algorithm

In the proposed SLQ algorithm, the control at stage $k+1$ is found by performing a one-dimensional search from the control at stage $k$ along a search direction that is found by solving a Linear Quadratic (LQ) optimal control problem. Specifically, let $U_k = [u^T(0) \; u^T(1) \ldots u^T(N-1)]^T$ be the optimal solution candidate at step $k$, and $X_k = [x^T(1) \; x^T(2) \ldots x^T(N)]^T$ the corresponding state trajectory obtained by solving the dynamical equations (2) using $U_k$ and with the initial condition $x(0)$. We next linearize the state equations (2) about the nominal trajectory of $U_k$ and $X_k$. The linearized equations are

    \bar{x}(n+1) = f_x(x(n), u(n)) \bar{x}(n) + f_u(x(n), u(n)) \bar{u}(n)    (27)

with initial condition $\bar{x}(0) = 0$. We then minimize the cost index (1) with respect to $\bar{u}(n)$. The solution of this LQ problem gives $\bar{U}_k = [\bar{u}^T(0) \; \bar{u}^T(1) \ldots \bar{u}^T(N-1)]^T$, the proposed search direction. Thus, the control variables at stage $k+1$ of the algorithm are obtained from

    U_{k+1} = U_k + \alpha_k \cdot \bar{U}_k    (28)

where $\alpha_k \in \mathbb{R}$ is an appropriate stepsize, the selection of which is discussed later in the paper. Note again our perspective of considering the optimal control problem as an unconstrained finite-dimensional minimization problem in $U$.

We emphasize that $\bar{U}_k$ as computed above is not the steepest descent direction. It is the solution to a linear quadratic tracking problem for a nonlinear system that has been linearized about $U_k$. Note that the objective function is not linearized for this solution. Our algorithm is different from standard Quasilinearization [1] and Neighboring Extremal [13] methods, where the adjoint equations are also linearized and two-point boundary value problems are solved. As was mentioned earlier, our algorithm corresponds to the Gauss-Newton method for solving the optimal control problem of (1) and (2).

B. Properties of the SLQ Algorithm

In this section, we prove two important properties of the proposed algorithm. First, we show that the search direction $\bar{U}$ is a descent direction.

Theorem 1: Consider the discrete-time nonlinear optimal control problem of (1) and (2), and assume a quadratic cost function as in (3) and (4) with $R(n) = R^T(n) > 0$, $Q(n) = Q^T(n) \ge 0$, $n = 0, 1, \ldots, N-1$, $Q(0) = 0$, and $Q(N) = Q^T(N) \ge 0$. Also consider a control sequence $U \equiv [u^T(0) \ldots u^T(N-1)]^T$ and the corresponding state trajectory $X \equiv [x^T(1) \ldots x^T(N)]^T$. Next, linearize system (2) about $U$ and $X$ and solve the following linear quadratic problem:

    \min_{\bar{u}(n),\, \bar{x}(n)} \; \bar{J} = \frac{1}{2} [\bar{x}(N) - \bar{x}^o(N)]^T Q(N) [\bar{x}(N) - \bar{x}^o(N)]
        + \frac{1}{2} \sum_{n=0}^{N-1} \{ [\bar{x}(n) - \bar{x}^o(n)]^T Q(n) [\bar{x}(n) - \bar{x}^o(n)] + [\bar{u}(n) - \bar{u}^o(n)]^T R(n) [\bar{u}(n) - \bar{u}^o(n)] \}    (29)

subject to

    \bar{x}(n+1) = f_x(x(n), u(n)) \bar{x}(n) + f_u(x(n), u(n)) \bar{u}(n); \quad \bar{x}(0) = 0,    (30)

where $\bar{x}^o(n) \equiv x^o(n) - x(n)$ and $\bar{u}^o(n) \equiv u^o(n) - u(n)$. Then, if $\bar{U} \equiv [\bar{u}^T(0) \ldots \bar{u}^T(N-1)]^T$ is not zero, it is a descent direction for the cost function (1), i.e. $J(U + \alpha \cdot \bar{U}) < J(U)$ for all sufficiently small $\alpha > 0$.

Proof: We establish that $\bar{U}$ is a descent direction by showing that

    \nabla_u J \cdot \bar{U} = \sum_{n=0}^{N-1} \frac{\partial J(n)}{\partial u(n)} \bar{u}(n) < 0,    (31)

since $\nabla_u J$ in (11) is the gradient of the cost function with respect to the control variables. Now, the components of $\nabla_u J$ are expressed in (10) in terms of the adjoint variables $\lambda(n)$ that satisfy recursion (8) with final values given by (9). On the other hand, $\bar{x}(n)$ and $\bar{u}(n)$, together with the adjoint variables $\bar{\lambda}(n)$, satisfy the first order optimality conditions for the linear quadratic problem given in (14), (15) and (16), where $A(n) = f_x(x(n), u(n))$ and $B(n) = f_u(x(n), u(n))$. Let us define

    \tilde{\lambda}(n) = \bar{\lambda}(n) - \lambda(n)    (32)

and note from (8) and (15) that

    \tilde{\lambda}^T(n) = \bar{x}^T(n) Q(n) + \tilde{\lambda}^T(n+1) A(n), \quad \tilde{\lambda}(N) = Q(N)\bar{x}(N).    (33)

Next, through the indicated algebra, we can establish the following relation:

    \frac{\partial J(n)}{\partial u(n)} \cdot \bar{u}(n)
      = \{ [u(n) - u^o(n)]^T R(n) + \lambda^T(n+1) B(n) \} \bar{u}(n)    (using (10))
      = -\tilde{\lambda}^T(n+1) B(n) \bar{u}(n) - \bar{u}^T(n) R(n) \bar{u}(n)    (using (16))
      = -\tilde{\lambda}^T(n+1) \bar{x}(n+1) + \tilde{\lambda}^T(n+1) A(n) \bar{x}(n) - \bar{u}^T(n) R(n) \bar{u}(n)    (using (30))
      = -\tilde{\lambda}^T(n+1) \bar{x}(n+1) + \tilde{\lambda}^T(n) \bar{x}(n) - \bar{x}^T(n) Q(n) \bar{x}(n) - \bar{u}^T(n) R(n) \bar{u}(n).    (using (33))

Finally, summing up the above equation from $n = 0$ to $n = N-1$, and noting that $\bar{x}(0) = 0$ and, from (33), that $\tilde{\lambda}(N) = Q(N)\bar{x}(N)$, gives

    \nabla_u J \cdot \bar{U} = \sum_{n=0}^{N-1} \frac{\partial J(n)}{\partial u(n)} \bar{u}(n)
      = -\sum_{n=0}^{N-1} [\bar{x}^T(n) Q(n) \bar{x}(n) + \bar{u}^T(n) R(n) \bar{u}(n)] - \bar{x}^T(N) Q(N) \bar{x}(N) < 0    (34)

and the proof of the theorem is complete.

We remark that the search direction $\bar{U}$ can be found by the LQ algorithm of Section II.B with $A(n) \equiv f_x(x(n), u(n))$ and $B(n) \equiv f_u(x(n), u(n))$.

Next, we show that the proposed SLQ algorithm with appropriately chosen stepsizes $\alpha_k > 0$ does in fact converge to a control locally minimizing the cost function (1). We denote by $J[U]$ the cost associated with the control $U = [u^T(0) \ldots u^T(N-1)]^T$ (and the given initial condition $x(0)$).

Theorem 2: Starting with an arbitrary control sequence $U_0$, compute recursively new controls from (28), where the direction $\bar{U}_k$ is obtained as in Theorem 1 by solving the LQ problem of (29) and the linearized system (30) about the current solution candidate $U_k$ and corresponding state trajectory $X_k$; also, $\alpha_k$ is obtained by minimizing $J[U_k + \alpha \bar{U}_k]$ over $\alpha > 0$. Then every limit point of $U_k$ gives a control that satisfies the first order optimality conditions for the cost function (1) subject to the system equations (2).
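In computational terms, Theorem 2 prescribes the complete SLQ iteration: simulate with the current controls, linearize, solve the LQ tracking subproblem of Theorem 1 for the search direction, and step along it as in (28). The sketch below assembles the hypothetical helpers from the previous sections; a simple backtracking rule stands in for the exact minimization over $\alpha$, in the spirit of the inexact line searches discussed in the remark following the proof:

```python
import numpy as np

def slq(f, fx, fu, Q, R, x_tgt, u_tgt, x0, U, iters=50, tol=1e-8):
    """One rendering of the SLQ iteration (28), built on the rollout, cost,
    and lq_tracking sketches above (all hypothetical helper names)."""
    N = len(U)
    for _ in range(iters):
        X = rollout(f, x0, U)
        J = cost(X, U, Q, R, x_tgt, u_tgt)
        # Linearize (2) about (X, U) as in (30), with the shifted targets
        # xbar^o(n) = x^o(n) - x(n) and ubar^o(n) = u^o(n) - u(n) of Theorem 1.
        A = [fx(X[n], U[n]) for n in range(N)]
        B = [fu(X[n], U[n]) for n in range(N)]
        xbar_tgt = [x_tgt[n] - X[n] for n in range(N + 1)]
        ubar_tgt = [u_tgt[n] - U[n] for n in range(N)]
        _, Ubar = lq_tracking(A, B, Q, R, xbar_tgt, ubar_tgt, np.zeros_like(X[0]))
        if max(np.linalg.norm(d) for d in Ubar) < tol:   # direction ~ 0: done
            break
        alpha = 1.0
        while alpha > 1e-12:           # backtracking search along Ubar
            U_try = [U[n] + alpha * Ubar[n] for n in range(N)]
            if cost(rollout(f, x0, U_try), U_try, Q, R, x_tgt, u_tgt) < J:
                U = U_try
                break
            alpha *= 0.5
    return U
```

By Theorem 1 the direction is a descent direction whenever it is nonzero, so the inner loop terminates for a sufficiently small $\alpha$.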
Proof: Let us define $D_k \equiv \nabla_U J[U_k] \cdot \bar{U}_k < 0$, the derivative of $J[U_k + \alpha \bar{U}_k]$ with respect to $\alpha$ at $\alpha = 0$. We first show that, under the exact line search assumption, $D_k$ has a subsequence that converges to zero as $k \to \infty$. Notice that for any fixed $\varepsilon$ such that $0 < \varepsilon < 1$ and $\alpha$ sufficiently small, it holds that

    J[U_k + \alpha \bar{U}_k] \le J[U_k] + \varepsilon \alpha D_k.    (35)

Let $\bar{\alpha}_k$ be such that (35) is satisfied with equality (such $\bar{\alpha}_k < \infty$ exists, since otherwise $J[U_k + \alpha \bar{U}_k] \to -\infty$ as $\alpha \to \infty$, contradicting the fact that $J$ is bounded below by zero). From the mean value theorem, there exists $\beta_k$ with $0 \le \beta_k \le \bar{\alpha}_k$ such that

    \nabla_U J[U_k + \beta_k \bar{U}_k] \cdot \bar{U}_k = \varepsilon D_k.    (36)

Then, it holds that

    J[U_k] - J[U_{k+1}] \ge J[U_k] - J[U_k + \bar{\alpha}_k \bar{U}_k] = -\varepsilon \bar{\alpha}_k D_k.    (37)

Since $J[U_k]$ is monotonically decreasing and bounded below (by zero), it converges to some value, and (37) implies that $\bar{\alpha}_k D_k \to 0$. Let us assume that $-D_k \ge \eta > 0$ for all $k$. Then it must be that $\bar{\alpha}_k \to 0$, and thus $\beta_k \to 0$. Now divide both sides of (36) by $D_k$ and note that, since $\bar{D}_k \equiv \nabla_U J[U_k + \beta_k \bar{U}_k] \cdot \bar{U}_k / D_k$ belongs in the closed unit ball of $\mathbb{R}$, a compact set, it has at least one limit point $\bar{D}^*$ that satisfies, from (36), $\bar{D}^* = \varepsilon \bar{D}^*$, or $\bar{D}^* = 0$, which is a contradiction. Therefore, we conclude that there is a subsequence of $D_k$ that converges to zero.

Next, from (34), we obtain:

    D_k = \nabla_U J[U_k] \cdot \bar{U}_k \le -\rho \|\bar{U}_k\|^2 \le 0,    (38)

where $\rho > 0$ is a uniform lower bound on the minimum eigenvalue of $R(n)$, $n = 0, 1, \ldots, N-1$. Then (38) implies that there is a corresponding subsequence of $\bar{U}_k$ that converges to zero. For notational convenience, we identify this subsequence of $\bar{U}_k$ (and corresponding subsequences of other sequences) with the whole sequence. But $\bar{U}_k \to 0$ implies from (14) that $\bar{X}_k \to 0$ (note that $\bar{x}(0) = 0$ and that $A(n)$ and $B(n)$ can be assumed to be uniformly bounded since $U_{k+1} - U_k = \alpha_k \bar{U}_k \to 0$). Consequently, (33) implies that $\bar{\lambda}(n) \to \lambda(n)$ for $n = 0, \ldots, N-1$, and that the right-hand side of (10) converges to the right-hand side of (16), which is zero by the optimality of $\bar{U}_k$ for the LQ problem. This shows that the first order optimality conditions (10) for the nonlinear control problem are asymptotically satisfied, i.e. $\nabla_U J[U_k] \to 0$ as $k \to \infty$. We remark that the last conclusion follows for a subsequence of $U_k$, but since the cost $J[U_k]$ is monotonically decreasing and clearly converges to a locally minimum value, any limit point of $U_k$ must be a stationary point.

We remark that the exact line search in Theorem 2 can be replaced with an inexact line search that satisfies some standard conditions, such as the Armijo, Goldstein, or Wolfe stepsize selection rules [14], [15]. Similar arguments as in the previous proof can be used to show that $D_k \to 0$ still follows, and the proof is completed as above.

IV. NUMERICAL EXAMPLE: A GAS ACTUATED HOPPING MACHINE

An interesting optimal control problem is that of creating motions for an autonomous hopping machine. A simple model for a gas actuated one-dimensional hopping system is shown on the left-hand side of Figure 1. This system is driven by a pneumatic actuator, with the location of the piston relative to the mass under no external loading defined as $y_p$. After contact occurs with the ground, with $y \le y_p$, the upward force on the mass from the actuator can be approximated by a linear spring with $F = k(y_p - y)$, where $k$ is the spring constant. The position $y_p$ can be viewed as the unstretched spring length, and it can be easily changed by pumping air into or out of either side of the cylinder. The equations of motion for the mass are $m\ddot{y} = F(y, y_p) - mg$, where $mg$ is the force due to gravity and

    F(y, y_p) = \begin{cases} 0, & y > y_p \\ k(y_p - y), & \text{otherwise.} \end{cases}

Note that in this case $F(y, y_p)$ is not differentiable at $y = y_p$, and the proposed algorithm cannot be used on this problem. However, the discontinuity in the derivative can easily be smoothed. For instance, let the spring compression be $e = y_p - y$ and choose an $\alpha > 0$; then

    F(e) = \begin{cases} 0, & e < 0 \\ \frac{k}{2\alpha} e^2, & 0 \le e < \alpha \\ ke - \frac{k\alpha}{2}, & \text{otherwise} \end{cases}

is $C^1$. The final equation of motion for this system relates the air flow into the cylinder, which is the control $u(t)$, to the equilibrium position $y_p$ of the piston. Assume for the following that the equation $\dot{y}_p = u$ approximates this relationship.

When the hopping machine begins its operation, we are interested in starting from rest and reaching a desired hop height $y_N^o$ at time $t_f$. If we minimize

    J(u) = \frac{1}{2} [ q_{fin} (y(N) - y_N^o)^2 + \dot{y}(N)^2 ] + \frac{t_f}{2N} \sum_{n=0}^{N-1} [ q\, y_p(n)^2 + r\, u(n)^2 ],    (39)

the terms outside the summation reflect the desire to reach the height at time $t_f$ with zero velocity, and the terms inside the summation reflect the desire to minimize the gas used to achieve this. The weighting on $y_p$ is used to keep the piston motion within its bounds. We conducted numerical experiments on this system with the following parameters: $k/m = 500$, $g = 386.4$, $\alpha = 0.1$. We assumed that all states were initially zero, and that the initial control sequence was zero. The cost function parameters were selected as: $y_N^o = 20$, $t_f = 1$, $q_{fin} = q = 1000$, and $r = 1.0$. A simple Euler approximation was used to discretize the equations, with $N = 100$. Note that the algorithm produced an alternating sequence of stance phases and flight phases for the hopping system, and it naturally identified the times to switch between these phases.
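For concreteness, the smoothed contact force and the Euler-discretized hopper dynamics of this example can be written out as below, using the stated values $k/m = 500$, $g = 386.4$, $\alpha = 0.1$, $t_f = 1$, and $N = 100$; the state ordering $[y, \dot{y}, y_p]$ and the helper names are our own choices. In the notation of Section II, the cost (39) corresponds to $Q(N) = \mathrm{diag}(q_{fin}, 1, 0)$ with target $(y_N^o, 0, 0)$, $Q(n) = (t_f/N)\,\mathrm{diag}(0, 0, q)$, and $R(n) = (t_f/N)\, r$.

```python
import numpy as np

K_OVER_M, G, ALPHA = 500.0, 386.4, 0.1    # parameter values from the text
TF, N = 1.0, 100
DT = TF / N                               # Euler discretization step

def spring_accel(e):
    """C^1 smoothed contact force per unit mass, F(e)/m, with e = yp - y."""
    if e < 0.0:
        return 0.0
    if e < ALPHA:
        return K_OVER_M / (2.0 * ALPHA) * e ** 2
    return K_OVER_M * e - K_OVER_M * ALPHA / 2.0

def f(x, u):
    """Euler step of m*yddot = F(y, yp) - m*g and yp_dot = u."""
    y, yd, yp = x
    ydd = spring_accel(yp - y) - G
    return np.array([y + DT * yd, yd + DT * ydd, yp + DT * float(u)])
```

Feeding this `f` (with hand-coded or finite-difference Jacobians for `fx` and `fu`) and the weights above to the `slq` sketch would set up the experiment described here, starting from the zero control sequence.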

[Figure 1 here. Top panel: mass and piston positions $y$ and $y_p$ (inches) versus time, with the alternating stance and flight phases labeled. Bottom panel: minimum fuel control $u$ versus time (seconds).]

Fig. 1. Maximum height hopping motion and minimum fuel control.

If one were to use collocation methods to solve this problem with explicit consideration of the different dynamics in the different phases, one would have to guess the number of switches between phases and would need to treat the times at which the switches occur as variables in the optimization. In the full version of the paper [10] (omitted here due to space limitations), we show that the proposed SLQ algorithm can be interpreted as a Gauss-Newton algorithm, and thus explain its excellent rate of convergence observed in simulations. Consistent with this interpretation, our algorithm converges much faster when the weighting on the control $r$ is increased; also, the number of iterations required for convergence in this problem increases for larger $y_N^o$, ranging from 3 for $y_N^o = 1$ to 166 for $y_N^o = 50$. In addition, the algorithm failed to converge for $\alpha < 1 \times 10^{-5}$, which demonstrates the need for the dynamics to be continuously differentiable.

V. CONCLUSION

We developed an algorithm for solving nonlinear optimal control problems with quadratic performance measures and unconstrained controls. Each subproblem in the course of the algorithm is a linear quadratic optimal control problem that can be efficiently solved by Riccati difference equations. We show that each search direction generated in the linear quadratic subproblem is a descent direction, and that the algorithm is convergent. Computational experience has demonstrated that the algorithm converges quickly to the optimal solution. The SLQ algorithm is proposed as a powerful alternative to Newton methods in situations where second order derivative information on the system dynamics is not available or is expensive to obtain, and a near real-time solver is required.

REFERENCES

[1] A. E. Bryson and Y. C. Ho, Applied Optimal Control, Wiley, New York, 1995.
[2] J. T. Betts, "Survey of Numerical Methods for Trajectory Optimization," Journal of Guidance, Control, and Dynamics, Vol. 21, No. 2, pp. 193-207, 1998.
[3] O. von Stryk, "Numerical solution of optimal control problems by direct collocation," in Optimal Control, R. Bulirsch, A. Miele, J. Stoer, and K. H. Well (eds.), International Series of Numerical Mathematics, Vol. 111, Birkhäuser Verlag, Basel, 1993, pp. 129-143.
[4] C. R. Hargraves and S. W. Paris, "Direct trajectory optimization using nonlinear programming and collocation," Journal of Guidance, Control, and Dynamics, Vol. 10, pp. 338-342, 1987.
[5] P. J. Enright and B. A. Conway, "Optimal finite-thrust spacecraft trajectories using collocation and nonlinear programming," Journal of Guidance, Control, and Dynamics, Vol. 14, pp. 981-985, 1991.
[6] J. F. Pantoja, "Differential Dynamic Programming and Newton's Method," International Journal of Control, Vol. 47, No. 5, pp. 1539-1553, 1988.
[7] J. C. Dunn and D. P. Bertsekas, "Efficient Dynamic Programming Implementations of Newton's Method for Unconstrained Optimal Control Problems," Journal of Optimization Theory and Applications, Vol. 63, No. 1, pp. 23-38, 1989.
[8] J. F. Pantoja and D. Q. Mayne, "Sequential Quadratic Programming Algorithm for Discrete Optimal Control Problems with Control Inequality Constraints," International Journal of Control, Vol. 53, No. 4, pp. 823-836, 1991.
[9] L. Z. Liao and C. A. Shoemaker, "Convergence in Unconstrained Discrete-Time Differential Dynamic Programming," IEEE Transactions on Automatic Control, Vol. 36, No. 6, pp. 692-706, 1991.
[10] A. Sideris and J. E. Bobrow, "An Efficient Sequential Linear Quadratic Algorithm for Solving Nonlinear Optimal Control Problems," submitted to IEEE Transactions on Automatic Control, June 2004.
[11] S. J. Wright, "Interior-Point Methods for Optimal Control of Discrete-Time Systems," Journal of Optimization Theory and Applications, Vol. 77, No. 1, pp. 161-187, 1993.
[12] C. R. Dohrmann and R. D. Robinett, "Dynamic Programming Method for Constrained Discrete-Time Optimal Control," Journal of Optimization Theory and Applications, Vol. 101, No. 2, pp. 259-283, 1999.
[13] T. Veeraklaew and S. K. Agrawal, "Neighboring optimal feedback law for higher-order dynamic systems," ASME Journal of Dynamic Systems, Measurement, and Control, Vol. 124, No. 3, pp. 492-497, 2002.
[14] R. Fletcher, Practical Methods of Optimization, 2nd edition, Wiley, 1987.
[15] P. E. Gill, W. Murray, and M. H. Wright, Practical Optimization, Harcourt Brace, 1981.