JOURNAL OF GUIDANCE,CONTROL, AND DYNAMICS Vol. 31, No. 4, July–August 2008
Pseudospectral Methods for Infinite-Horizon Optimal Control Problems
Fariba Fahroo∗ and I. Michael Ross† Naval Postgraduate School, Monterey, California 93943 DOI: 10.2514/1.33117 A central computational issue in solving infinite-horizon nonlinear optimal control problems is the treatment of the horizon. In this paper, we directly address this issue by a domain transformation technique that maps the infinite horizon to a finite horizon. The transformed finite horizon serves as the computational domain for an application of pseudospectral methods. Although any pseudospectral method may be used, we focus on the Legendre pseudospectral method. It is shown that the proper class of Legendre pseudospectral methods to solve infinite- horizon problems are the Radau-based methods with weighted interpolants. This is in sharp contrast to the unweighted pseudospectral techniques for optimal control. The Legendre–Gauss–Radau pseudospectral method is thus developed to solve nonlinear constrained optimal control problems. An application of the covector mapping principle for the Legendre–Gauss–Radau pseudospectral method generates a covector mapping theorem that provides an efficient approach for the verification and validation of the extremality of the computed solution. Several example problems are solved to illustrate the ideas.
I. Introduction infinite horizon. Second, a pseudospectral (PS) method is applied to VER the last few years, it has become possible to routinely discretize the problem. The discretized problem generates a sequence fi of approximate problems parameterized by the size of the grid. In the O solve many practical nite-horizon optimal control problems. fi Practical problems are typically nonlinear and constrained in both the third and nal step, each problem in this sequence is solved by a state and control variables. A recent example of generating routine globally convergent sequential quadratic programming (SQP) solutions to practical problems is the flight implementation of technique [8]. Note that global convergence does not mean global pseudospectral solutions to an optimal “zero-propellant maneuver” optimality [9]. Among many, this is one reason why it is important to of the International Space Station [1,2]. Furthermore, a fairly large select a correct sequence of problems together with the SQP class of finite-horizon problems can also be solved in real time. A subsequence to design a proper algorithm for solving optimal control number of issues related to this advancement are discussed in [3]. problems [10]. Such an algorithm for PS methods is called a spectral Real-time computation of optimal controls implies optimal feedback algorithm [11], which forms the core of PS methods for control. For guidance; hence, a number of control problems that are not readily the purpose of brevity, we do not discuss all aspects of our proposed addressable by traditional methods can now be solved in a unified approaches because many of these concepts have been extensively discussed elsewhere. For example, a complete description of the manner [4]. Despite such advancements, obtaining feedback controls – – for stabilizing a nonlinear system poses significant challenges [5]. Legendre Gauss Lobatto PS method is described in [12,13]; the Among many, one of the reasons for the continuing challenge is the spectral algorithm is described in [11]; and a convergence theorem is treatment of the infinite horizon in a nonlinear optimal control developed in [10]. Consequently, in this paper, we limit the scope of the problem to a complete discussion of only the open-loop issues problem [6]. A well-known strategy to address this issue is the fi receding-horizon technique, which forms the basis of model- pertaining to the treatment of the in nite horizon. Stability of the predictive methods for control. Even though finite-horizon problems closed-loop system is discussed in [14]. An overview of an fi implementation of the closed-loop control is discussed in [15]. In this are solved in real time, in this approach, the nite horizon is receded fi indefinitely as a means to approximate the infinite horizon. As a context, this paper provides the foundations for solving in nite- result of the mismatch between the computational horizon and the horizon optimal control problems. Part of this foundation is the infinite horizon, a large number of new problems emerge. Conse- development of a Radau-based PS method. It turns out that the well- fl known Lobatto family of PS methods is inappropriate for the quently, different avors of model-predictive control techniques fi have been proposed to address these problems. These issues are treatment of the in nite horizon as a result of singularities in the surveyed in [6] and the references contained therein. domain transformation technique. A detailed discussion of these In this paper, we propose an altogether different approach for issues is included in this paper. solving infinite-horizon, nonlinear, constrained optimal control As a matter of context, we note that our proposed approach is not problems. Our approach follows [7], in which we presented some an application of PS discretization applied to receding-horizon techniques. This approach has been discussed elsewhere [16–20] and initial ideas; here, we further the concepts along several different fi directions. First, the infinite horizon is mapped to a finite horizon by a offers well-established bene ts of PS methods; however, as alluded domain transformation technique that preserves the notion of the to earlier, the fundamental problems and issues [6] of receding- horizon techniques arising from the finiteness of solving the finite- horizon optimal control problem do not disappear by choosing Received 28 June 2007; revision received 4 September 2007; accepted for alternative discretizations. Thus, the proposed approach in this paper publication 12 September 2007. This material is declared a work of the U.S. provides a foundation for a new approach to solving infinite-horizon Government and is not subject to copyright protection in the United States. optimal control problems. Copies of this paper may be made for personal or internal use, on condition Two example problems are solved to illustrate various elements of that the copier pay the $10.00 per-copy fee to the Copyright Clearance Center, the main points of the proposed ideas: one is a standard linear- Inc., 222 Rosewood Drive, Danvers, MA 01923; include the code 0731-5090/ 08 $10.00 in correspondence with the CCC. quadratic regulator (LQR) problem in which the solutions from our ∗Professor, Department of Applied Mathematics, Code MA/Ff; method are compared favorably against the solutions from the [email protected]. Associate Fellow AIAA. Riccati approach. The second example is a nonlinear control problem †Professor, Department of Mechanical and Astronautical Engineering, of stabilizing NPSAT1, an experimental spacecraft designed and Code ME/Ro; [email protected]. Associate Fellow AIAA. built at the Naval Postgraduate School and scheduled to be launched 927 928 FAHROO AND ROSS in fall 2009. Additional example problems have been solved [7,14] 1 t 1 t t : , (3) but are not discussed in this paper for the purpose of brevity. 1 t 1 where 2 1; 1 . Because we assumed that quantities exist in the II. Problem Formulation limit as t !1, it is clear that Eq. (3) maps the infinite domain to a We consider the following infinite-horizon, constrained, nonlinear finite domain. Using optimal control problem: dt 2 8 R : r (4) > 1 1 2 <> Minimize J x ; u 0 F x t ; u t dt d _ f 1 Subject to x t x t ; u t B > 0 0 we reformulate problem 1B on the finite interval 1; 1 as :> x x h 0 8 R x t ; u t > 1 > Minimize J x ; u 1 r F x t ; u t d < dx t N N r f x t ; u t where 0; 1 3 t7 !fx 2 R x ; u 2 R u g is the state-control function Subject to d B > 1 0 pair; F: RNx RNu ! R, f: RNx RNu ! RNx , and h: RNx :> x t x 0 h 0 RNu ! RNh are continuously differentiable functions; and x is a x t ; u t given value of the state at the initial time t 0. The symbol N 1 denotes the vector dimension (i.e., number of variables) of the Evaluations at must be performed in the sense of the limit subscripted quantities. Thus, it is clear that we are dealing with a consistent with t !1. In applying the Pontryagin minimum fairly general optimal control problem for which the problem principle for problem B, we generate the following problem: formulation is motivated by stabilizing nonlinear control systems 8 > Find x ; u ; ; that frequently describe high-performance aerospace vehicles. We > dx t 1 > Such that r f x t ; u t assume that a solution exists for problem B in some appropriate > d > x t 1 x0 Sobolev space [21] and that the limits > <> h x t ; u t 0 lim lim @H x t x 1 and u t u 1 B 0 t!1 t!1 > @u > t T h x t ; u t 0 > exist. Then, under appropriate conditions [22], the Pontryagin > t 0 1 > minimum principle holds for problem B. That is, if fx ; u g is a > _ t r @H 1 :> @x solution to problem B, then there exist multipliers (duals or t 1 0 covectors) that satisfy the adjoint equation, the Hamiltonian minimization condition, and the transversality conditions. This where evaluations at t 1 are also to be understood in the sense of a dualization may be cast in terms of a boundary-value problem, limit. 1 fi denoted as problem B , and is de ned as The equivalence between problems 1B and B require a set of 8 technical assumptions that we implicitly assume. For the purposes of > Find x ; u ; ; > brevity, we do not list all these assumptions, but we note one of these > Such that x_ t f x t ; u t > 0 to illustrate the point. > x 0 x 1 > In problem B it is necessary that the function t7 !F x t ; u t be <> h x t ; u t 0 1 0 in L ; 1 ; R ; however, this does not necessarily guarantee that 1B @H t 0 7 !r F x t ; u t will be in L1 1; 1 ; R . Consequently, > @u > t T h x t ; u t 0 we implicitly assume that the transformed cost function is locally > > t 0 integrable. > > _ t @H t :> @x 1 0 IV. New Perspective on Pseudospectral Methods Although recognizing that the domain transformation technique is Nh Nx Nx Nu where H: R R R R ! R is the D form [23] of the quite independent of PS methods for solving optimal control – Hamiltonian Lagrangian problem, the rationale for transforming the domain from 0; 1 to 1; 1 and not, say, [0, 1] is indeed motivated by pseudospectral H ; ; x; u : H ; x; u T h x; u (1) methods. Two of the most widely used PS methods are the Legendre and Chebyshev pseudospectral methods. These methods rely on H: RNx RNx RNu ! R is the Pontryagin Hamiltonian interpolation of the approximating polynomials for the states and H ; x; u : F x; u T f x; u (2) costates on Gaussian quadrature points (such as Legendre- or Chebyshev–Gauss–Lobatto points). These points lie in the interval 1 1 and H t is a shorthand notation [21] for H t ; t ; x t ; u t .As ; , but by using a time-domain transformation, any arbitrary fi in the case of the states and controls, we use the notation nite interval can be mapped to this interval. Typically, a linear domain transformation is used; however, proper nonlinear domain 1 lim t transformation techniques can be used quite effectively to create an t!1 adaptive grid [25]. A simple extension of this idea is to map the semi- infinite domain to the finite time domain 1; 1 and then use the for the limit and assume that it exists. appropriate quadrature nodes for the interpolation polynomials. Equation (3) is in fact the rational map commonly used [26,27] in III. Domain Transformation Technique pseudospectral methods for domain transformation. Other appropriate domain transformation techniques may be used as well, As noted earlier, the semi-infinite domain 0; 1 poses a number and as will be apparent in the sections to follow, our methods carry of problems. That is, it is necessary to properly account for the over quite trivially. infiniteness of the time domain. In [24], we proposed a Laguerre PS method as a means to address this problem head-on. Motivated by domain transformation techniques prevalent in PS methods, and our A. Overview previous success in designing an adaptive PS method for control In PS methods, the functions are approximated by Lagrange [25], we let interpolating polynomials of order N in which interpolation occurs at FAHROO AND ROSS 929
Z 1 1 1 Gaussian quadrature points that lie in the interval ; . There are ^ : three different families of Gauss quadrature points known as Gauss, w j j d (7) 1 j Gauss–Radau, and Gauss–Lobatto [27,28]. What distinguishes these ^ nodes is the choice of zeros of orthogonal polynomials such as Jacobi It is straightforward to show that wj is independent of the choice of polynomials, in general, or Chebyshev or Legendre polynomials in and is given explicitly by [31] most applications. Gauss quadrature uses zeros that are interior to the 8 1 1 – < 2 interval ; . The Gauss Radau zeros include one of the endpoints 2 j 0 1 – ^ N 1 of the interval, usually the left-endpoint at . The Gauss w j 1 1 j (8) : 2 2 j ≠ 0 Lobatto points include both endpoints of the interval at 1, and N 1 LN j 1. Proper choice of these nodes depends on the boundary-value problem that is being solved. In a general optimal control problem The derivative of y is obtained by differentiating Eq. (5): over a finite-horizon, explicit formulations of initial and final XN conditions are often required. This is why Lobatto points, which are dy dyN d y (9) fixed at both of the boundary points, are the most natural choice. On j j d d 0 d j the other hand, for the infinite-horizon optimal control problems, one j generally has explicit boundary conditions at the initial point, and the Evaluating the derivatives at the LGR points leads to the notion of a fi implicit assumption is that solutions at in nity tend to zero. differentiation matrix: Therefore, for these problems, it suffices to use nodes that are fixed at the left-hand node. Moreover, in our technique of mapping the semi- ^ d infinite horizon to a finite interval, the derivative of the function r D kj j (10) d j has a singularity at 1, which corresponds to the right-hand k fi endpoint at in nity. Therefore, for these problems, only the left-hand For 1, we denote D^ as simply D^ ; this is given explicitly 1 kj kj boundary point is required; hence, the Radau points, which are by [31] fixed at the left-hand endpoint 1, are more appropriate than 8 Lobatto points. It should also be noted that in an implementation of > N N 2 0 <> 4 j k ; PS methods, the Gauss points are not suitable because their use poses 1 ^ : LN k j 1 ≠ 1 D kj 1 j k; j; k N (11) subtle but serious issues in applying the boundary conditions [29]. > LN j k k j Consequently, we choose Legendre–Gauss–Radau (LGR) points for : 1 1 j k N 2 1 k discretization. These points j (j 0; ;N) are defined as fixed at the initial point 0 1 and the rest of the nodes are defined as zeros ^ ^ For 1 , we denote Dkj as Dkj. It is straightforward to of LN LN 1, where LN is the Legendre polynomial of degree N. show that the entries of this differentiation matrix are given by For these points, which are distributed over 1; 1 , evaluation at the 8 right-hand point (which, for the mapped domain, corresponds to 1) > N 1 2 1 0 1 < 4 j k ; is at N , where the size of depends inversely on N; that is, ^ : LN k 1 ≠ 1 ! 0 as N !1. Because evaluations at 1 are to be understood in D kj L j k; j; k N; (12) > N j k j : 1 1 the sense of the limit, it is clear that the last Radau node provides 2 1 j k N precisely this condition. k
B. Approximation of Functions, Derivatives, and Integrals C. Transformations to the Physical Domain To introduce the basic ideas, we consider a generic function The physical domain is 0; 1 , and the computational domain is 7 !y 2R. This allows us to ignore much of the bookkeeping 1; 1 . The closure of the computational domain is exactly the same associated with vector-valued functions. Pseudospectral methods are as the domain of the transformed problems B and B . Consider now a based on approximating y by way of weighted interpolants of the function 0; 1 3 t7 !x t 2R, which need not be a state variable. form [27,30] Following the approximation theory described in Sec. IV.B, the Nth- order approximation of the function, xN t , is given by XN N y y y (5) N j j X j 0 j N x t j xj (13) j 0 j N where is a positive weight function, f jgj 0 are Gaussian points, N and f j gj 0 satisfy j k jk, where jk is the Kronecker delta. Thus, Hence, we have : N N xj x t j x tj (14) yN y k k where 1 For the LGR method, we choose for approximating the t : t 2 0; 1 states and 1 for approximating the costates. The j j motivation for this choice of weight functions is based on generating are called the shifted LGR points. It is clear that t ≠ 1 for any a consistent set of new approximations and is further elaborated in j j 2f0; 1; ...;Ng, but that tN !1as N !1. From Eqs. (4) and Sec. V. (6), it follows that We approximate Z Z Z 1 1 XN 1 y d x t dt x t r d wjxj (15) 0 1 0 1 j by the LGR quadrature formula: where Z Z : ^ 1 1 XN wj r j wj (16) N ^ y d y d wjyj (6) 1 1 j 0 are the quadrature weights in the physical domain. Similarly, we can derive an expression for the derivative of xN t at the shifted LGR ^ N fi where fwjgj 0 are the LGR weights de ned as points, tk, using the chain rule: 930 FAHROO AND ROSS