JOURNAL OF GUIDANCE,CONTROL, AND DYNAMICS Vol. 31, No. 4, July–August 2008

Pseudospectral Methods for Infinite-Horizon Problems

Fariba Fahroo∗ and I. Michael Ross† Naval Postgraduate School, Monterey, California 93943 DOI: 10.2514/1.33117 A central computational issue in solving infinite-horizon nonlinear optimal control problems is the treatment of the horizon. In this paper, we directly address this issue by a domain transformation technique that maps the infinite horizon to a finite horizon. The transformed finite horizon serves as the computational domain for an application of pseudospectral methods. Although any pseudospectral method may be used, we focus on the Legendre pseudospectral method. It is shown that the proper class of Legendre pseudospectral methods to solve infinite- horizon problems are the Radau-based methods with weighted interpolants. This is in sharp contrast to the unweighted pseudospectral techniques for optimal control. The Legendre–Gauss–Radau pseudospectral method is thus developed to solve nonlinear constrained optimal control problems. An application of the covector mapping principle for the Legendre–Gauss–Radau pseudospectral method generates a covector mapping theorem that provides an efficient approach for the verification and validation of the extremality of the computed solution. Several example problems are solved to illustrate the ideas.

I. Introduction infinite horizon. Second, a pseudospectral (PS) method is applied to VER the last few years, it has become possible to routinely discretize the problem. The discretized problem generates a sequence fi of approximate problems parameterized by the size of the grid. In the O solve many practical nite-horizon optimal control problems. fi Practical problems are typically nonlinear and constrained in both the third and nal step, each problem in this sequence is solved by a state and control variables. A recent example of generating routine globally convergent sequential quadratic programming (SQP) solutions to practical problems is the flight implementation of technique [8]. Note that global convergence does not mean global pseudospectral solutions to an optimal “zero-propellant maneuver” optimality [9]. Among many, this is one reason why it is important to of the International Space Station [1,2]. Furthermore, a fairly large select a correct sequence of problems together with the SQP class of finite-horizon problems can also be solved in real time. A subsequence to design a proper algorithm for solving optimal control number of issues related to this advancement are discussed in [3]. problems [10]. Such an algorithm for PS methods is called a spectral Real-time computation of optimal controls implies optimal feedback algorithm [11], which forms the core of PS methods for control. For guidance; hence, a number of control problems that are not readily the purpose of brevity, we do not discuss all aspects of our proposed addressable by traditional methods can now be solved in a unified approaches because many of these concepts have been extensively discussed elsewhere. For example, a complete description of the manner [4]. Despite such advancements, obtaining feedback controls – – for stabilizing a nonlinear system poses significant challenges [5]. Legendre Gauss Lobatto PS method is described in [12,13]; the Among many, one of the reasons for the continuing challenge is the spectral algorithm is described in [11]; and a convergence theorem is treatment of the infinite horizon in a nonlinear optimal control developed in [10]. Consequently, in this paper, we limit the scope of the problem to a complete discussion of only the open-loop issues problem [6]. A well-known strategy to address this issue is the fi receding-horizon technique, which forms the basis of model- pertaining to the treatment of the in nite horizon. Stability of the predictive methods for control. Even though finite-horizon problems closed-loop system is discussed in [14]. An overview of an fi implementation of the closed-loop control is discussed in [15]. In this are solved in real time, in this approach, the nite horizon is receded fi indefinitely as a means to approximate the infinite horizon. As a context, this paper provides the foundations for solving in nite- result of the mismatch between the computational horizon and the horizon optimal control problems. Part of this foundation is the infinite horizon, a large number of new problems emerge. Conse- development of a Radau-based PS method. It turns out that the well- fl known Lobatto family of PS methods is inappropriate for the quently, different avors of model-predictive control techniques fi have been proposed to address these problems. These issues are treatment of the in nite horizon as a result of singularities in the surveyed in [6] and the references contained therein. domain transformation technique. A detailed discussion of these In this paper, we propose an altogether different approach for issues is included in this paper. solving infinite-horizon, nonlinear, constrained optimal control As a matter of context, we note that our proposed approach is not problems. Our approach follows [7], in which we presented some an application of PS discretization applied to receding-horizon techniques. This approach has been discussed elsewhere [16–20] and initial ideas; here, we further the concepts along several different fi directions. First, the infinite horizon is mapped to a finite horizon by a offers well-established bene ts of PS methods; however, as alluded domain transformation technique that preserves the notion of the to earlier, the fundamental problems and issues [6] of receding- horizon techniques arising from the finiteness of solving the finite- horizon optimal control problem do not disappear by choosing Received 28 June 2007; revision received 4 September 2007; accepted for alternative discretizations. Thus, the proposed approach in this paper publication 12 September 2007. This material is declared a work of the U.S. provides a foundation for a new approach to solving infinite-horizon Government and is not subject to copyright protection in the United States. optimal control problems. Copies of this paper may be made for personal or internal use, on condition Two example problems are solved to illustrate various elements of that the copier pay the $10.00 per-copy fee to the Copyright Clearance Center, the main points of the proposed ideas: one is a standard linear- Inc., 222 Rosewood Drive, Danvers, MA 01923; include the code 0731-5090/ 08 $10.00 in correspondence with the CCC. quadratic regulator (LQR) problem in which the solutions from our ∗Professor, Department of Applied Mathematics, Code MA/Ff; method are compared favorably against the solutions from the [email protected]. Associate Fellow AIAA. Riccati approach. The second example is a nonlinear control problem †Professor, Department of Mechanical and Astronautical Engineering, of stabilizing NPSAT1, an experimental spacecraft designed and Code ME/Ro; [email protected]. Associate Fellow AIAA. built at the Naval Postgraduate School and scheduled to be launched 927 928 FAHROO AND ROSS in fall 2009. Additional example problems have been solved [7,14] 1 t 1 t t : , (3) but are not discussed in this paper for the purpose of brevity. 1 t 1 where 21; 1. Because we assumed that quantities exist in the II. Problem Formulation limit as t !1, it is clear that Eq. (3) maps the infinite domain to a We consider the following infinite-horizon, constrained, nonlinear finite domain. Using optimal control problem: dt 2 8 R : r (4) > 1 1 2 <> Minimize Jx; u 0 Fxt; utdt d _ f 1 Subject to xt xt; ut B> 0 0 we reformulate problem 1B on the finite interval 1; 1 as :> x x h 0 8 R xt; ut > 1 > Minimize Jx; u 1 rFxt; utd < dxt N N rfxt; ut where 0; 1 3 t7 !fx 2 R x ; u 2 R u g is the state-control function Subject to d B> 1 0 pair; F: RNx RNu ! R, f: RNx RNu ! RNx , and h: RNx :> xt x 0 h 0 RNu ! RNh are continuously differentiable functions; and x is a xt; ut given value of the state at the initial time t 0. The symbol N 1 denotes the vector dimension (i.e., number of variables) of the Evaluations at must be performed in the sense of the limit subscripted quantities. Thus, it is clear that we are dealing with a consistent with t !1. In applying the Pontryagin minimum fairly general optimal control problem for which the problem principle for problem B, we generate the following problem: formulation is motivated by stabilizing nonlinear control systems 8 > Find x; u; ; that frequently describe high-performance aerospace vehicles. We > dxt 1 > Such that rfxt; ut assume that a solution exists for problem B in some appropriate > d > xt1 x0 Sobolev space [21] and that the limits > <> hxt; ut 0 lim lim @H xtx1 and utu1 B 0 t!1 t!1 > @u > tT hxt; ut 0 > exist. Then, under appropriate conditions [22], the Pontryagin > t 0 1 > minimum principle holds for problem B. That is, if fx; ug is a > _t r @H 1 :> @x solution to problem B, then there exist multipliers (duals or t1 0 covectors) that satisfy the adjoint equation, the Hamiltonian minimization condition, and the transversality conditions. This where evaluations at t1 are also to be understood in the sense of a dualization may be cast in terms of a boundary-value problem, limit. 1 fi denoted as problem B , and is de ned as The equivalence between problems 1B and B require a set of 8 technical assumptions that we implicitly assume. For the purposes of > Find x; u; ; > brevity, we do not list all these assumptions, but we note one of these > Such that x_tfxt; ut > 0 to illustrate the point. > x0x 1 > In problem B it is necessary that the function t7 !Fxt; ut be <> hxt; ut 0 1 0 in L ; 1; R; however, this does not necessarily guarantee that 1B @Ht 0 7 !rFxt; ut will be in L11; 1; R. Consequently, > @u > tT hxt; ut 0 we implicitly assume that the transformed cost function is locally > > t0 integrable. > > _t@Ht :> @x 1 0 IV. New Perspective on Pseudospectral Methods Although recognizing that the domain transformation technique is Nh Nx Nx Nu where H: R R R R ! R is the D form [23] of the quite independent of PS methods for solving optimal control – Hamiltonian Lagrangian problem, the rationale for transforming the domain from 0; 1 to 1; 1 and not, say, [0, 1] is indeed motivated by pseudospectral H ; ; x; u : H; x; uT hx; u (1) methods. Two of the most widely used PS methods are the Legendre and Chebyshev pseudospectral methods. These methods rely on H: RNx RNx RNu ! R is the Pontryagin Hamiltonian interpolation of the approximating polynomials for the states and H; x; u : Fx; uT fx; u (2) costates on Gaussian quadrature points (such as Legendre- or Chebyshev–Gauss–Lobatto points). These points lie in the interval 1 1 and H t is a shorthand notation [21] for H t; t; xt; ut.As ; , but by using a time-domain transformation, any arbitrary fi in the case of the states and controls, we use the notation nite interval can be mapped to this interval. Typically, a linear domain transformation is used; however, proper nonlinear domain 1 limt transformation techniques can be used quite effectively to create an t!1 adaptive grid [25]. A simple extension of this idea is to map the semi- infinite domain to the finite time domain 1; 1 and then use the for the limit and assume that it exists. appropriate quadrature nodes for the interpolation polynomials. Equation (3) is in fact the rational map commonly used [26,27] in III. Domain Transformation Technique pseudospectral methods for domain transformation. Other appropriate domain transformation techniques may be used as well, As noted earlier, the semi-infinite domain 0; 1 poses a number and as will be apparent in the sections to follow, our methods carry of problems. That is, it is necessary to properly account for the over quite trivially. infiniteness of the time domain. In [24], we proposed a Laguerre PS method as a means to address this problem head-on. Motivated by domain transformation techniques prevalent in PS methods, and our A. Overview previous success in designing an adaptive PS method for control In PS methods, the functions are approximated by Lagrange [25], we let interpolating polynomials of order N in which interpolation occurs at FAHROO AND ROSS 929

Z 1 1 1 Gaussian quadrature points that lie in the interval ; . There are ^ : three different families of Gauss quadrature points known as Gauss, w j jd (7) 1 j Gauss–Radau, and Gauss–Lobatto [27,28]. What distinguishes these ^ nodes is the choice of zeros of orthogonal polynomials such as Jacobi It is straightforward to show that wj is independent of the choice of polynomials, in general, or Chebyshev or Legendre polynomials in and is given explicitly by [31] most applications. Gauss quadrature uses zeros that are interior to the 8 1 1 – < 2 interval ; . The Gauss Radau zeros include one of the endpoints 2 j 0 1 – ^ N1 of the interval, usually the left-endpoint at . The Gauss w j 1 1j (8) : 2 2 j ≠ 0 Lobatto points include both endpoints of the interval at 1, and N1 LN j 1. Proper choice of these nodes depends on the boundary-value problem that is being solved. In a general optimal control problem The derivative of y is obtained by differentiating Eq. (5): over a finite-horizon, explicit formulations of initial and final XN conditions are often required. This is why Lobatto points, which are dy dyN d y (9) fixed at both of the boundary points, are the most natural choice. On j j d d 0 d j the other hand, for the infinite-horizon optimal control problems, one j generally has explicit boundary conditions at the initial point, and the Evaluating the derivatives at the LGR points leads to the notion of a fi implicit assumption is that solutions at in nity tend to zero. differentiation matrix: Therefore, for these problems, it suffices to use nodes that are fixed at the left-hand node. Moreover, in our technique of mapping the semi- ^ d infinite horizon to a finite interval, the derivative of the function r D kj j (10) d j has a singularity at 1, which corresponds to the right-hand k fi endpoint at in nity. Therefore, for these problems, only the left-hand For 1, we denote D^ as simply D^ ; this is given explicitly 1 kj kj boundary point is required; hence, the Radau points, which are by [31] fixed at the left-hand endpoint 1, are more appropriate than 8 Lobatto points. It should also be noted that in an implementation of > NN2 0 <> 4 j k ; PS methods, the Gauss points are not suitable because their use poses 1 ^ : LN k j 1 ≠ 1 D kj 1 j k; j; k N (11) subtle but serious issues in applying the boundary conditions [29]. > LN j k kj Consequently, we choose Legendre–Gauss–Radau (LGR) points for : 1 1 j k N 21k discretization. These points j (j 0; ;N) are defined as fixed at the initial point 0 1 and the rest of the nodes are defined as zeros ^ ^ For 1 , we denote Dkj as Dkj. It is straightforward to of LN LN1, where LN is the Legendre polynomial of degree N. show that the entries of this differentiation matrix are given by For these points, which are distributed over 1; 1, evaluation at the 8 right-hand point (which, for the mapped domain, corresponds to 1) > N121 0 1 < 4 j k ; is at N , where the size of depends inversely on N; that is, ^ : LN k 1 ≠ 1 ! 0 as N !1. Because evaluations at 1 are to be understood in D kj L j k; j; k N; (12) > N j k j : 1 1 the sense of the limit, it is clear that the last Radau node provides 21 j k N precisely this condition. k

B. Approximation of Functions, Derivatives, and Integrals C. Transformations to the Physical Domain To introduce the basic ideas, we consider a generic function The physical domain is 0; 1, and the computational domain is 7 !y2R. This allows us to ignore much of the bookkeeping 1; 1. The closure of the computational domain is exactly the same associated with vector-valued functions. Pseudospectral methods are as the domain of the transformed problems B and B. Consider now a based on approximating y by way of weighted interpolants of the function 0; 1 3 t7 !xt2R, which need not be a state variable. form [27,30] Following the approximation theory described in Sec. IV.B, the Nth- order approximation of the function, xNt, is given by XN N yy y (5) N j j X j0 j N x t jxj (13) j0 j N where is a positive weight function, fjgj0 are Gaussian points, N and fjgj0 satisfy jkjk, where jk is the Kronecker delta. Thus, Hence, we have : N N xj x tj x tj (14) yN y k k where 1 For the LGR method, we choose for approximating the t : t 20; 1 states and 1 for approximating the costates. The j j motivation for this choice of weight functions is based on generating are called the shifted LGR points. It is clear that t ≠ 1 for any a consistent set of new approximations and is further elaborated in j j 2f0; 1; ...;Ng, but that tN !1as N !1. From Eqs. (4) and Sec. V. (6), it follows that We approximate Z Z Z 1 1 XN 1 yd xtdt xtrd wjxj (15) 0 1 0 1 j by the LGR quadrature formula: where Z Z : ^ 1 1 XN wj rjwj (16) N ^ yd y d wjyj (6) 1 1 j0 are the quadrature weights in the physical domain. Similarly, we can derive an expression for the derivative of xNt at the shifted LGR ^ N fi where fwjgj0 are the LGR weights de ned as points, tk, using the chain rule: 930 FAHROO AND ROSS

N convergence dxNt dx t d (17) λ λ N dt d dt Problem B Problem B ttk k approximation (indirect method) Covector Let M be a diagonal matrix with entries Mapping Theorem d 1 Mkk (18) dt r N λ k k Problem B Then, following Eq. (10), we can write

XN dualization _ N x tk Dkjxj (19)

j0 (optimality conditions) conditions) (optimality dualization where convergence N : ^ Problem B Problem B Dkj MkkDkj (20) approximation (direct method) Fig. 2 Illustrating the covector mapping principle as first promulgated D. Radau Advantage by Ross and Fahroo [9,37,44]. As noted earlier, one of the reasons we used Radau points for the Legendre PS method is the singularity in r at 1. By avoiding this point, we avoid the singularity. The choice of Radau points is approximations is different from those of Polak [36] and hence must merely a pseudospectral approach for handling this singularity as be considered as additional requirements in generating convergent opposed to ignoring it. Nonetheless, Radau points offer an advantage approximations [33]. The totality of these new concepts is encapsulated in the covector mapping principle [12,33,34,37–39] for PS methods for control that are not offered by Lobatto points. A fi fi distribution of the Lobatto and Radau points over the physical (CMP), as depicted in Fig. 2. In this gure, the pre x 1 may be used 0 15 to denote the problems discussed in Sec. II. Note also that domain ; 1 is shown in Fig. 1 for N . Although both the Lobatto and Radau points cluster at the initial time, the Radau points problem B , introduced in Sec. III, is represented in Fig. 2 as a “ ” are further spread out over the time axis at instances well beyond the Pontryagin lift of problem B. Because problems B and B are 1 1 initial time, toward t !1. This implies that although both the equivalent to problems B and B , respectively, we simply need to fi Lobatto and Radau points offer good anti-aliasing properties [32] at apply the CMP to the transformed nite-horizon problems. That is, fi the initial time, the Radau points provide a more accurate we rst need to generate a CMP-consistent set of approximations for representation of the infinite horizon by distributing more points problems B and B . toward t !1(see Fig. 1). This implies a better representation of controls for feedback in terms of generating Carathéodory--type feedback solutions [14,15]. A. Generating Problems BN and BN An application of the CMP for PS methods requires that we use the V. Solving Problem 1B following approximations for the states and costates: As discussed in detail in [12,29,33], the general theory of pseudospectral methods for optimal control relies on generating a XN 1 N consistent approximation for problem B, the main problem x t jxj (21) described in Sec. II. That is, the approximation scheme for j0 problem 1B must be consistent with problem 1B. This notion of consistent approximations is not limited to PS methods alone; rather, this type of consistency is required for convergence of approximations for Runge–Kutta methods [34,35], including the XN 1 popular Hermite–Simpson method. This notion of consistent N t 1 jj (22) j0 j

In other words, to generate a CMP-consistent set of approximations, we need to choose different weights for interpolants that approximate the states and costates; that is, we require that 1 for the states and 1 for the costates. Additional discussions of this notion specific to pseudospectral methods are discussed in [29]. For LGR an introduction to the general principles, see [33]; for advanced topics, see [38]. As a simple illustration of the principle behind Eq. (22), observe that a substitution of 1 guarantees the transversality condition t1 0 in problem B, but 1 is not N fi part of the set of LGR points fjgj0 for any nite N. Because 1 N ! as N !1, it is clear that LGL

limNt!0 as N !1 t!1

0 50 100 150 200 in exactly the manner stipulated for problem 1B in Sec. II. Thus, time applying the results of Sec. IV to problems B, we generate Fig. 1 Sample distribution of shifted Radau and Lobatto points. problem BN: FAHROO AND ROSS 931

: x x ... x : u u ... u C. Developing Problem BN xk 0; 1; ; N ; uk 0; 1; ; N; 8 P This problem is obtained by applying the multiplier theory for N N > Minimize J x ; u 0 Fx ; u w N N > P k k k k k k problem B . It is straightforward to show that by using fw g 0 to <> N f 0 j j Subject to j0 Dkjxj xk; uk construct the discrete Lagrangian, we generate the following N 0 0 B > x0 x problem: > h 0 :> xk; uk 8 k 0; ...;N > ~ ~ ~ > Find Pxk; uk; k; k; 0 > N f 0 > Such that j0 Dkjxj xk; uk 1 > 0 0 where Dkj is Dkj with . Similarly, using the notation Dkj > x0 x 1 > h 0 for Dkj with , an approximation to problem B is > xk; uk N > ~ ~ given by problem B , as given next: > @Hk;k;xk;uk 0 > @u 8 < k ~ T hx ; u 0 > BN k k k > Find Pxk; uk; k; k; 0 > ~ 0 > N f 0 > k > Such that j0 Dkjxj xk; uk > 0 ... > 0 > for Pk ; ;N > x0 x 0 > ~ ~ > > N ~ @Hk;k;xk;k 0 > > 0 D j > h 0 > j kj @xk > xk; uk > 1 ... < @H ; ;x ;u > for k ; ;N k k k k 0 > P ~ ~ N @u > N ~ @H0;0;x0;0 B k > and 0 D0 c0 > T h 0 :> j j j @x0 > k xk; uk ~ > 0 0 ~0 w0c0 > k > P > N @Hk;k;xk;k 0 > j0 Dkjj c Nx > @xk where 0 2 R is arbitrary. > 0 0 :> k 0; ...;N D. Covector Mapping Theorem When the multiplier theory is applied to problem BN, the resulting Karush–Kuhn–Tucker (KKT) conditions are not automatically in the B. Remarks on the Transversality Conditions form of problem BN. The equations need to be rearranged or N Problem 1B (and, consequently, problem B) is written with a transformed to render them similar to problem B . This terminal transversality condition (cf. Secs. II and III). An initial transformation of the equations for PS methods is quite transversality condition of the form straightforward once the correct primal-dual interpolants are identified, as in Eqs. (21) and (22). A somewhat similar concept is – t0 (23) necessary in generating discrete Runge Kutta approximations, as observed by Hager [35]. As a matter of fact, if it does not satisfy ’ – may also be included in the problem definition, but it is clearly Hager s conditions, a Runge Kutta method that is convergent as an redundant, because the initial transversality condition provides no integrator for a differential equation will be divergent as a collocation method for optimal control. In any event, it is quite clear, by mere new information toward constructing a candidate solution for N inspection, that any solution to problem B is also a solution to problem B . In fact, as argued by Pontryagin et al. [22], under N appropriate conditions, problem B contains the right number of problem B ; in this case, we have boundary conditions to define a locally unique solution. An initial c 0 0 (25) transversality condition is useful when some of the initial conditions are unknown. In this case, they provide the missing initial conditions That is, Eq. (25) is a matching condition that matches problem BN to (in dual space) to generate extremals. The discretized version of this problem BN. Matching conditions are part of the totality of closure problem (namely, problem BN) is written with an initial fi conditions required to complete the circuit (arrows) indicated in transversality condition, but not the nal transversality condition. N N fi Fig. 2. Note that problems B and B are generated from problems The nal transversality condition is automatically met by our choice of the interpolant for t7 !t [cf. Eq. (22)]. Consequently, defining B and B , respectively, without introducing any additional continuous-time primal conditions to carry over to the discrete-time problem BN with a final transversality condition is redundant. In this problems. This notion is implicit in Fig. 2. context, our PS method has a Galerkin flavor. An initial In completing the steps suggested in Fig. 2 toward the transversality is introduced in problem BN because, unlike its development of a covector mapping theorem, we identify the continuous-time counterpart, we regard x0 as an unknown quantity. 0 following multiplier sets analogous to those introduced in [40]: Let If we were to treat x0 to be given by x , then the discretized form of N :fx ; u g and :f0; ; g. We denote by M the initial transversality condition k k k k the multiplier set corresponding to ,

0 0 (24) M N :f: satisfies the conditions of problem BNg is unnecessary in the problem formulation (i.e., in problem BN). In (26) this case, with x0 being regarded as a known quantity, the index k for fi ~ : ~ ~ ~ N 1 ... Similarly, we de ne f0; k; kg and M , the the unknown variables xk would now run from k ; ;N, whereas the corresponding index for the remaining variables would multiplier set: run from k 0 to N. We prefer to use a uniform indexing scheme, in M N :f:~ ~ satisfies the conditions of problem BNg which case, we regard x0 as an “artificial” unknown variable 0 constrained by x0 x 0. This preference for a uniform indexing (27) scheme, consistent with treating x0 as an unknown variable, implies that the continuous-time initial transversality condition (23) must Clearly, MNMN. We now define a new multiplier set: now be introduced as Eq. (24) in its discrete-time form, consistent ^ N : ~ N ~ c 0 with the notion of regarding xt0 as an unknown variable. Within the M f 2 M : satisfies 0 g (28) context of PS methods, this technique of “adding” equations to meet ^ N boundary conditions is known as the tau method [27]. Thus, it is clear Thus, M MN. That is, under a matching (closure) that a PS method for optimal control has flavors of both the Galerkin condition, every solution of problem BN is also a solution to and tau methods. problem BN. 932 FAHROO AND ROSS

Theorem 1 (covector mapping theorem): Let MN ≠ ; and 10 ^ ^ ^ ^ N ^ N N f0; k; kg 2 M , then the bijection M M is given by 8

N ^ N ^ ^ tkk; tkk; 0 0 (29) 6 u* Remark 1.1: Although the statement of Theorem 1 is identical to that of [40] for the Legendre–Gauss–Lobatto (LGL) method, note that the 4 problems BN and BN for the Radau methods are defined differently through the construction of weighted interpolation. The weighted 2 x* interpolants lead to a consistent pair of differentiation matrices D and 2 D that are dual to each other. In this context, the LGL case turns out 0 to be the special situation in which the formal adjoint of D is the same as D. * N ~ −2 x1 Remark 1.2: Note that Theorem 1 neither states tkk, N ~ ~ N ~ N tkk,or0 0, nor does it imply tkk, tk ~ ,or0 ~0, as is sometimes erroneously interpreted. −4 k 02468101214 Furthermore, note that Eq. (29) is an exact relationship for both the Time (sec) LGR and LGL PS methods. Fig. 3 Sample LQR results for the states and control using LGR PS (stars) and Riccati techniques (solid line). E. Remarks on the Covector Mapping Principle The covector mapping theorem was generated by an application of the covector mapping principle (see Fig. 2). Since the introduction of not to be construed as illustrating a replacement of Riccati Pontryagin’s principle, it has been known that the minimum techniques. To this end, consider the following LQR problem from principle may fail in the discrete-time domain if it is applied in Kirk [42]. The problem is to find the optimal state-control function exactly the same manner as in the continuous-time domain. For the pair that minimize the quadratic cost functional minimum principle to hold exactly in the discrete-time domain, Z 1 2 1 2 1 2 additional assumptions of convexity are required, whereas no such Jx;u x1t x2t u t dt (30) assumptions are necessary for the continuous-time versions. This is 0 2 4 because continuous time has a convexity property hidden through its continuity. Thus, when a continuous-time optimal control problem is subject to the second-order dynamics discretized, the time convexity is not carried over to the discrete-time x_ 1 x2t (31) domain, resulting in a loss of information. If this information loss is not restored, the discrete-time solution may be spurious, not converge to the correct solution, or may even provide completely x_ 2 2x1tx2tut (32) false results. A historical account of these issues, along with a simple counter example, is described in [37]. Further details are provided in and initial condition x10;x20 4; 4. The standard approach the references contained in [37]. A thorough discussion of these for solving this problem is to numerically solve the algebraic Riccati issues and their relationship to advance concepts in optimal control equation for constructing a gain matrix and to use the resulting theory has been developed by Mordukhovich [38,39]. optimal feedback control law to integrate the state equations forward The closure conditions introduced by Ross and Fahroo [12] are a for a sufficiently long horizon to obtain the optimal states. The form of matching conditions that are similar in spirit to optimal control computed from the matrix Riccati solution via the Mordukhovich’s [38,39] matching conditions for Euler approx- gain matrix K is given by [42] imations. These conditions can be in primal space alone [35,38] or in primal–dual space [33]. Note, however, that the primal-space x1 u 2K1 K2 (33) conditions [35,38] are obtained through dual-space considerations. x2 These new ideas reveal that dual-space issues cannot be ignored even in the so-called direct methods for optimal control. where K1 and K2 are the components of the matrix Riccati solution. The resulting states and controls from both the Riccati approach and the Radau PS technique are shown in Fig. 3. The solid line represents F. Software the Riccati-based solution, and the stars denote the Radau PS A proper implementation of a PS method requires addressing all solutions for N 12. Evidently, our method can accurately solve the the numerical stability and accuracy issues, and a case-by-case LQR problem for a small number of nodes. approach to problem-solving using PS approximations is inadvisable The plots in Figs. 4 and 5 show the costates obtained from the fi because it is tantamount to using rst principles in every situation. Riccati solution (solid lines): Consequently, it is preferable for all the intricacies of a PS method to be implemented just once for a general problem in a form of a 1t x1t K (34) reusable software package. DIDO [41] is such a general-purpose 2t x2t application package for solving optimal control problems within the MATLAB® environment. It is a minimalist’s approach to solving The stars in these plots are the costates obtained from Theorem 1. optimal control problems, in that only the problem formulation is From these plots, it is obvious that the costates obtained from the required in a manner akin to writing it on a piece of paper. DIDO covector mapping theorem are at least as accurate at the Riccati automatically completes the “circuit” shown in Fig. 2. The Radau approach. approach is implemented in an version of the software package. This software was used in the following sections that illustrate the VII. Example 2: Attitude Hold for NPSAT1 principles and practice. NPSAT1 (see Fig. 6) is an experimental spacecraft that was conceived, designed, and built at the Naval Postgraduate School VI. Example 1: Linear-Quadratic Regulator (NPS) and is scheduled to be launched in fall 2009. This spacecraft We first consider an LQR problem to illustrate and validate the has formed a testbed at NPS for illustrating various new ideas in LGR approach vis-a-vis the Riccati technique. Thus, this example is constrained nonlinear control and estimation [15,43]. To illustrate FAHROO AND ROSS 933

the key ideas of the LGR method, we use a standard rigid-body 0 dynamic model for NPSAT1: 1 _ q 1t2 !xtq4t!ytq3t!ztq2t 1 _ −5 q2t2 !xtq3t!ytq4t!ztq1t 1 1

λ _ q3t2 !xtq2t!ytq1t!ztq4t 1 q_4t ! tq1t! tq2t! tq3t −10 2 x y z _ I2 I3 !xt !yt!ztu1 I1 _ I3 I1 !yt !xt!ztu2 −15 I2 02468101214 Time (sec) _ I1 I2 !zt !xt!ytu3 t I3 Fig. 4 Comparison of the results for 1 using LGR PS (stars) and Riccati techniques (solid line). 1 where qi (i , 2, 3, 4) are the quaternion parameters that satisfy the state constraints

2 2 2 2 q1 q2 q3 q4 1 (35) 1 q4 0 (36) 1 0 ui (i , 2, 3) are the control torques that satisfy the control constraints 0 01 1 2 3 −1 juji : N mi ; ; (37)

and I1 5, I2 5:1, and I3 2 are the inertias of NPSAT1. The

2 −2 initial condition corresponding to a rest position at 30 deg off each λ axis is given by

−3 q1;q2;q3;q40:91856; 0:17678; 0:30619; 0:17678 (38)

0 0 0 −4 !x;!y;!z ; ; (39) The goal is to regulate NPSAT1 to its equilibrium position at −5 0 0 0 1 0 0 0 02468101214 q1;q2;q3;q4 ; ; ; ; !x;!y;!z ; ; Time (sec) t To solve this problem, we construct a quadratic cost function: Fig. 5 Comparison of the results for 2 using LGR PS (stars) and Riccati techniques (solid line). 2 2 2 Fe; !; ukek 10k!k 100kuk

where e :e1;e2;e3;e4 is the error quaternion

e1;e2;e3;e4q1;q2;q3;q4 1 : : and ! !x;!y;!z and u u1;u2;u3. The LGR PS control solution for N 60 is shown in Fig. 7. Note that t7 !u1 and t7 !u3 are saturated early on. The corresponding state trajectories are shown in Figs. 8 and 9. When the initial condition [see Eqs. (38) and (39)] is propagated through the dynamics with the interpolated PS controls, the resulting trajectories are indeed feasible, as illustrated in Figs. 8 and 9. Having established the feasibility of the PS controls, we now test their optimality. The control Hamiltonian is given by 2 2 2 1 2 10 2 2 2 H; q; !; uq1 q2 q3 q4 !x !y !z 2 2 2 100 u1 u2 u3 5u1 6u2 7u3 H0

where H0 denotes terms in the Hamiltonian function that are not dependent upon u. An application of the KKT conditions for the Hamiltonian minimization condition, Minimize H; q; !; u u (40) 0 01 1 2 3 Fig. 6 Artist’s rendition of NPSAT1 in orbit. Subject to kuik : ;i ; ; 934 FAHROO AND ROSS

0.015 requires that u 1 8 u < 0 01 200 0 01 0.01 2 : if i4= > : u 0 01 200 0 01 3 ui : : if i4= < : (41) 200 i4= otherwise 0.005 The costates obtained from an application of Theorem 1 are plotted 0 alongside the controls in Figs. 10–12. It is quite obvious from these figures that Eq. (41) is satisfied at all points, and hence we conclude

−0.005 0.05

u 1 −0.01 0.04 − λ /200 0.03 5 −0.015 0 102030405060708090100 0.02 time (s) 0.01 Fig. 7 Candidate LGR/PS-optimal control for stabilizing NPSAT1. 0 −0.01 −0.02 1.2 −0.03 −0.04 1 −0.05 020406080100 0.8 t u Fig. 10 Control trajectory 7! 1 and its switching function. q 1 0.6 q 2 0.01 q 3 u q 0.008 2 0.4 4 − λ /200 0.006 6 0.2 0.004

0 0.002 0 −0.2 0 102030405060708090100 −0.002 time (s) −0.004 Fig. 8 Quaternions for the NPSAT1 stability problem; stars represents the discrete solution with 60 nodes and the solid lines are propagated −0.006 trajectories. −0.008 −0.01 0 20406080100 t u Fig. 11 Control trajectory 7! 2 and its switching function. 0.02

0 0.02 0 −0.02 −0.02 −0.04 −0.04 −0.06 −0.06

−0.08 −0.08 ω −0.1 −0.1 x ω −0.12 y −0.12 ω −0.14 z u 3 −0.14 −0.16 λ 020406080100 − 7 /200 Fig. 9 Angular velocities for the NPSAT1 stability problem; stars −0.18 represents the discrete solution with 60 nodes and the solid lines are the 0 20406080100 t u propagated trajectories. Fig. 12 Control trajectory 7! 3 and its switching function. FAHROO AND ROSS 935

0.5 one to generate dual maps without resorting to solving difficult Hamiltonian 0.4 boundary-value problems. Numerical experiments with the LGR PS method indicate spectral convergence. In this regard, this paper 0.3 provides the computational foundations for a new approach to stabilizing constrained nonlinear control systems. Consequently, it 0.2 has opened up a large number of new avenues for research in 0.1 nonlinear control. These topics are at the intersection of control, optimization, approximation theory, and computational mathe- 0 matics. Given that our proposed method appears to be quite powerful, the new avenues of research it promises appear to be quite −0.1 exciting. −0.2 Acknowledgments −0.3 We gratefully acknowledge partial funding for this research −0.4 provided to one of the authors (Ross) by the Secretary of the U.S. Air Force and the U.S. Air Force Office of Scientific Research under −0.5 020406080100grant F1ATA0-60-6-2G002. We also express our deep gratitude to Fig. 13 Evolution of the minimized Hamiltonian. Qi Gong for providing us with the results for the NPSAT1 attitude stabilization problem. that the LGR PS controls satisfy the Hamiltonian minimization condition. References 0 The Hamiltonian value condition requires that Htf , whereas [1] Kang, W., and Bedrossian, N., “Pseudospectral Optimal Control the Hamiltonian evolution equation implies that t7 !h is a constant. A Theory Makes Debut Flight—Saves NASA $1M in Under 3 Hours,” construction of the Hamiltonian based on Theorem 1 is plotted in SIAM News, Vol. 40, No. 7, Sept. 2007, p. 1. Fig. 13. From this plot, it is clear that both the Hamiltonian value [2] Bedrossian, N., Bhatt, S., Lammers M., Nguyen, L., and Zhang, Y., condition and the Hamiltonian evolution equation are satisfied “First Ever Flight Demonstration of Zero Propellant Maneuver Attitude within numerical precision. Hence, it can be argued that the LGR PS Control Concept,” AIAA, 2007-6734, 2007. controls are at least Pontryagin extremals. [3] Ross, I. M., and Fahroo, F., “Issues in the Real-Time Computation of Optimal Control,” Mathematical and Computer Modelling, Vol. 43, Nos. 9–10, May 2006, pp. 1172–1188. VIII. Additional Examples, Computational Speed, Etc. doi:10.1016/j.mcm.2005.05.021 As noted in Sec. I, many more example problems have been solved [4] Ross, I. M., and Fahroo, F., “A Unified Computational Framework for in [7,14]. In particular, the inverted pendulum problem, with all its Real-Time Optimal Control,” Proceedings of the IEEE Conference on nonlinearities and saturation constraints, has been considered in Decision and Control, Vol. 3, Inst. of Electrical and Electronics Engineers, Piscataway, NJ, 9–12 Dec. 2003, pp. 2210–2215. [7,14]. In nearly all these problems, solutions were obtained without doi:10.1109/CDC.2003.1272946 implementing any of the computational speed enhancements [5] Clarke, F., “Lyapunov Functions and Feedback in Nonlinear Control,” discussed elsewhere [3]. Despite this rudimentary implementation, Optimal Control, Stabilization and Nonsmooth Analysis, edited by M. the run times to solve these problems on a Pentium 4 processor range S. de Queiroz, M. Malisoff, and P. Wolenski, Vol. 301, Lecture Notes in from fractions of a second to a few seconds. For instance, in [7], the Control and Information Sciences, Springer–Verlag, New York, 2004, nonlinear inverted pendulum problem was solved in about 2 s. When pp. 267–282. some (not all) of the computational enhancements suggested in [6] Mayne, D. Q., Rawlings, J. B., Rao, C. V., and Scokaert, P. O. M., [3,32] were implemented [14], an average run time of 0.08 s was “Constrained Model Predictive Control: Stability and Optimality,” – obtained. The average was based on over 100 runs with randomly Automatica, Vol. 36, No. 6, 2000, pp. 789 814. doi:10.1016/S0005-1098(99)00214-9 chosen initial conditions. This exercise not only illustrates the [7] Fahroo F., and Ross, I. M., “Pseudospectral Methods for Infinite- possible enhancements in computational speed by algorithmic Horizon Nonlinear Optimal Control Problems,” AIAA Guidance, means alone, but also the robustness of the approach. In [14], we have Navigation, and Control Conference, AIAA, Reston, VA, Aug. 2005, also demonstrated that this run time is more than adequate to generate pp. 15–18. feedback solutions. For linear-quadratic problems, Yan et al. [20] [8] Gill, P. E., Murray, W., and Saunders, M. A., SNOPT: An SQP showed that it is possible to generate a closed-form solution using PS Algorithm for Large-Scale Constrained Optimization, SIAM Review, methods and that this solution can be generated by an order of Vol. 47, No. 1, 2005, pp. 99–131. magnitude faster than the solution obtained by the Riccati equation. doi:10.1137/S0036144504446096 “ A full discussion on using PS methods for feedback stabilization is [9] Ross, I. M., and Fahroo, F., A Perspective on Methods for ,” AIAA/AAS Astrodynamics Conference, Monterey, beyond the scope of this paper; hence, we simply refer to [14] for CA, AIAA Paper 2002-4727, Aug. 2002. details that include a design and proof of stabilization by feedback [10] Gong, Q., Ross, I. M., Kang, W., and Fahroo, F., “Connections control, with the PS method constituting the computational Between the Covector Mapping Theorem and Convergence of technique. We also refer the reader to [15] for further details that Pseudospectral Methods for Optimal Control,” Computational include an experimental demonstration. In any event, what has Optimization and Applications (to be published). emerged in recent years is that true real-time solutions are fully doi:10.1007/s10589-007-9102-4 realizable with modern techniques without any expectation of further [11] Ross, I. M., Fahroo, F., and Gong, Q., “A Spectral Algorithm for improvements in computer hardware. Pseudospectral Methods in Optimal Control,” Journal of Guidance, Control, and Dynamics (to be published); also Proceedings of the 10th International Conference on Cybernetics and Information Technolo- IX. Conclusions gies, Systems and Applications (CITSA), International Inst. of Receding-horizon techniques for solving infinite-horizon optimal Informatics and Cybernetics, Orlando, FL, 21–25 July 2004, pp. 104– control problems introduce many new problems arising chiefly due 109. “ to the mismatch between the computational horizon and the infinite [12] Ross, I. M., and Fahroo, F., Legendre Pseudospectral Approximations – – of Optimal Control Problems,” New Trends in Nonlinear Dynamics and horizon. A nonreceding Legendre Gauss Radau (LGR) pseudo- Control, and Their Applications, Lecture Notes in Control and spectral (PS) method is developed and proposed as a viable method Information Sciences, Vol. 295, Springer–Verlag, New York, 2003, to solve infinite-horizon, nonlinear, constrained optimal control pp. 327–342. problems. An application of the covector mapping principle for the [13] Elnagar, J., Kazemi, M. A., and Razzaghi, M., “The Pseudospectral LGR PS methods produces a covector mapping theorem that allows Legendre Method for Discretizing Optimal Control Problems,” IEEE 936 FAHROO AND ROSS

Transactions on Automatic Control, Vol. 40, No. 10, 1995, pp. 1793– for Pseudospectral Methods,” AIAA/AAS Astrodynamics Conference, 1796. Keystone, CO, AIAA Paper 2006-6304, Aug. 2006 doi:10.1109/9.467672 [30] Weideman, J. A. C., and Reddy, S. C., “A Matlab Differentiation Matrix [14] Ross, I. M., Gong, Q., Fahroo, F., and Kang, W., “Practical Stabilization Suite,” ACM Transactions on Mathematical Software, Vol. 26, No. 3, Through Real-Time Optimal Control,” Proceedings of the 2006 2000, pp. 465–519. American Control Conference, Inst. of Electrical and Electronics doi:10.1145/365723.365727 Engineers, Piscataway, NJ, June 2006, pp. 14–16. [31] Karniadakis, G., and Sherwin, S., Spectral/hp Element Methods doi:10.1109/ACC.2006.1655372 for Computational Fluid Dynamics, Oxford Univ. Press, Oxford, [15] Ross, I. M., Sekhavat P., Fleming, A., and Gong, Q., “Optimal 2005. Feedback Control: Foundations, Examples and Experimental Results [32] Ross, I. M., Gong, Q., and Sekhavat P., “Low-Thrust, High-Accuracy for a New Approach,” Journal of Guidance, Control, and Dynamics (to Trajectory Optimization,” Journal of Guidance, Control, and be published). Dynamics, Vol. 30, No. 4, 2007, pp. 921–933. [16] Yan, H., Fahroo, F., and Ross, I. M., “Optimal Feedback Control Laws doi:10.2514/1.23181 by Legendre Pseudospectral Approximations,” Proceedings of the [33] Ross, I. M., “A Roadmap for Optimal Control: The Right Way to 2001 American Control Conference, Inst. of Electrical and Electronics Commute,” Annals of the New York Academy of Sciences, Vol. 1065, Engineers, Piscataway, NJ, 2001, pp. 2388–2393. Dec. 2005, pp. 210–231. [17] Yan, H., Fahroo, F., and Ross, I. M., “Real-Time Computation of doi:10.1196/annals.1370.015 Neighbouring Optimal Control Laws,” AIAA Paper 2002-4657, [34] Hager, W. W., “Numerical Analysis in Optimal Control,” International Aug. 2002. Series of Numerical Mathematics, edited by K.-H. Hoffmann, I. [18] Williams, P., Blanksby C., Trivailo, P., and Fuji, H. A., “Receding Lasiecka, G. Leugering, J. Sprekels, and F. Troeltzsch, Vol. 139, Horizon Control of a Tether System Using Quasilinearization and Birkhäuser, Basel, Switzerland, 2001, pp. 83–93. Chebyshev Pseudospectral Approximations,” AAS/AIAA Astrody- [35] Hager, W. W., “Runge–Kutta Methods in Optimal Control and the namics Specialists Conference, Big Sky, MT, American Astronautical Transformed Adjoint System,” Numerische Mathematik, Vol. 87, Society Paper 03-535, Aug. 2003. No. 2, 2000, pp. 247–282. [19] Williams, P., “Application of Pseudospectral Methods for Receding doi:10.1007/s002110000178 Horizon Control,” Journal of Guidance, Control, and Dynamics, [36] Polak, E., Optimization: Algorithms and Consistent Approximations, Vol. 27, No. 2, 2004, pp. 310–314. Springer–Verlag, Heidelberg, Germany, 1997. [20] Yan, H., Ross, I. M., and Alfriend, K. T., “Pseudospectral Feedback [37] Ross, I. M., “A Historical Introduction to the Covector Mapping Control for Three-Axis Magnetic Attitude Stabilization in Elliptic Principle,” Advances in the Astronautical Sciences, Vol. 123, Univelt, Orbits,” Journal of Guidance, Control, and Dynamics, Vol. 30, No. 4, San Diego, CA, 2006, pp. 1257–1278. July–Aug. 2007, pp. 1107–1115. [38] Mordukhovich, B. S., Variational Analysis and Generalized doi:10.2514/1.26591 Differentiation, 1: Basic Theory, Grundlehren der Mathematischen [21] Vinter, R. B., Optimal Control, Birkhäuser, Boston, 2000. Wissenschaften Series, Vol. 330, Springer, Berlin, 2005. [22] Pontryagin, L.S., Boltyanskii, V. G., Gamkrelidze, and Mishchenko, E. [39] Mordukhovich, B. S., Variational Analysis and Generalized F., The Mathematical Theory of Optimal Processes, Wiley, New York, Differentiation, 2: Applications, Grundlehren der Mathematischen 1962. Wissenschaften Series, Vol. 331, Springer, Berlin, 2005. [23] Hartl, R. F., Sethi, S. P., and Vickson, R. G., “ A Survey of the [40] Ross, I. M., and Fahroo, F., “Discrete Verification of Necessary Maximum Principles for Optimal Control Problems with State Conditions for Switched Nonlinear Optimal Control Systems,” Constraints,” SIAM Review, Vol. 37, No. 2, 1995, pp. 181–218. Proceedings of the 2004 American Control Conference, Vol. 2, Inst. of doi:10.1137/1037043 Electrical and Electronics Engineers, Piscataway, NJ, 30 June– [24] Fahroo, F., and Ross, I. M., “Legendre and Laguerre Pseudospectral 2 July 2004, pp. 1610–1615. Methods for Infinite-Horizon Optimal Control,” Sixth SIAM Confer- [41] Ross, I. M., A Beginner’s Guide to DIDO: A MATLAB Application ence on Control and its Applications, Society for Industrial and Applied Package for Solving Optimal Control Problems, Elissar, Monterey, Mathematics, Philadelphia, July 2005, pp. 11–14. CA, 2007. [25] Ross, I. M., Fahroo, F., and Strizzi, J., “Adaptive Grids for Trajectory [42] Kirk, D., Optimal Control Theory: An Introduction, Prentice–Hall, Optimization by Pseudospectral Methods,” AAS/AIAA Spaceflight Englewood Cliffs, NJ, 1970, pp. 216–217. Mechanics Conference, Ponce, Puerto Rico, American Astronautical [43] Sekhavat P., Gong, Q., and Ross, I. M., “NPSAT1 Parameter Society Paper 03-142, February 9–13 2003. Estimation Using Unscented Kalman Filtering,” Proceedings of the [26] Boyd, J., Chebyshev and Fourier Spectral Methods, Dover, New York, 2007 American Control Conference, Inst. of Electrical and Electronics 2001. Engineers, Piscataway, NJ, 9–13 July 2007, pp. 4445–4451. [27] Canuto, C., Hussaini, M. Y., Quarteroni, A., and Zang, T. A., Spectral doi:10.1109/ACC.2007.4283031 Methods in Fluid Dynamics, Springer–Verlag, New York, 1988. [44] Ross, I. M., and Fahroo, F., “A Pseudospectral Transformation of the [28] Davis, P. J., and Rabinowitz, P., Methods of Numerical Integration, Covectors of Optimal Control Systems,” Proceedings of the First IFAC Academic Press, New York, 1977. Symposium on System Structure and Control, Elsevier Science, [29] Fahroo, F., and Ross, I. M., “On Discrete-Time Optimality Conditions Frankfurt, Germany, 29–31 Aug. 2001, pp. 543–548.