Acta Astronautica 122 (2016) 114–136
Acta Astronautica
journal homepage: www.elsevier.com/locate/actaastro
Suboptimal LQR-based spacecraft full motion control: Theory and experimentation
Leone Guarnaccia b,1, Riccardo Bevilacqua a,*, Stefano P. Pastorelli b,2
a Department of Mechanical and Aerospace Engineering, University of Florida, 308 MAE-A building, P.O. Box 116250, Gainesville, FL 32611-6250, United States
b Department of Mechanical and Aerospace Engineering, Politecnico di Torino, Corso Duca degli Abruzzi 24, Torino 10129, Italy
Article history: Received 19 January 2015; received in revised form 18 November 2015; accepted 18 January 2016; available online 2 February 2016.
Keywords: Spacecraft; Optimal control; Linear quadratic regulator; Six-degree-of-freedom; Real-time; Experimental test
Abstract: This work introduces a real-time suboptimal control algorithm for six-degree-of-freedom spacecraft maneuvering based on a State-Dependent-Algebraic-Riccati-Equation (SDARE) approach and real-time linearization of the equations of motion. The control strategy is sub-optimal since the gains of the linear quadratic regulator (LQR) are re-computed at each sample time. The cost function of the proposed controller has been compared with the one obtained via a general purpose optimal control software, showing, on average, an increase in control effort of approximately 15%, compensated by real-time implementability. Lastly, the paper presents experimental tests on a hardware-in-the-loop six-degree-of-freedom spacecraft simulator, designed for testing new guidance, navigation, and control algorithms for nano-satellites in a one-g laboratory environment. The tests show the real-time feasibility of the proposed approach.
© 2016 The Authors. Published by Elsevier Ltd. on behalf of IAA. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Abbreviations: ADAMUS, Advanced Autonomous Multiple Spacecraft; APF, artificial potential function; AS, attitude stage; BP, balancing platform; CPU, central processing unit; DARPA, Defense Advanced Research Projects Agency; DCM, direction cosine matrix; EKF, extended Kalman filter; GDC, guidance dynamics corporation; IBPS, intelligent battery and power system; IPOPT, interior point optimizer; KKT, Karush–Kuhn–Tucker; LGR, Legendre Gauss Radau; LKF, linear Kalman filter; LQE, linear quadratic estimator; LQR, linear quadratic regulator; LVLH, local vertical local horizontal; MIMO, multi-input multi-output; MP, moving platform; NLP, nonlinear programming; ONR, Office of Naval Research; PWM, pulse width modulator; RTAI, Real-Time Application Interface; S/C, spacecraft; SNOPT, Sparse Nonlinear OPTimizer; TS, translational stage
* Corresponding author. Tel.: +1 352 392 6230.
E-mail addresses: [email protected] (L. Guarnaccia), bevilr@ufl.edu (R. Bevilacqua), [email protected] (S.P. Pastorelli).
1 Present address: 48 Via Salvatore Quasimodo, Acicatena 95022, Italy.
2 Politecnico di Torino, Department of Mechanical and Aerospace Engineering, Room 6939.
http://dx.doi.org/10.1016/j.actaastro.2016.01.016
0094-5765/© 2016 The Authors. Published by Elsevier Ltd. on behalf of IAA.

Nomenclature
$0_{m \times n}$: zero matrix with dimensions m, n
$A$: state matrix (Jacobian)
$A_{rot}$: state matrix rotational contribution
$A_{transl}$: state matrix translational contribution
$B$: input matrix (Jacobian)
$B_{transl}$: input matrix translational contribution
$B_{rot}$: input matrix rotational contribution
$C$: output matrix
$D$: feedforward matrix
$d$: thrusters moment arm with respect to the center of rotation of the AS [$d = 0.32$ m]
${}^{E}DCM_{B}$: direction cosine matrix from the body (B) to the inertial reference frame (E) [23]
$F$: propellant cost (–)
$\mathbf{F}$: generalized force vector $(F; M)$
$F_t$: nominal thrust ($F_t = 0.3$ N)
$F_{thrust}$: force generated by the onboard thrusters (N)
$F_{LQR}$: optimal generalized force, output of the LQR Simulink block
$G$: universal gravitational constant (m$^3$ kg$^{-1}$ s$^{-2}$)
$H$: thruster distribution matrix or mapping matrix
$H_f$: force term of the thruster distribution matrix $H$
$H_m$: torque term of the thruster distribution matrix $H$ (m)
$I_{m \times n}$: identity matrix with dimensions m, n
$J$: inertia matrix (kg m$^2$)
$J$: global maneuver cost (–)
$K$: Kalman gain
$K_{pos}$, $K_{rot}$: additional dimensionless gains characterizing the adaptive tuning, applied to the state weighting matrix position and angular terms only
$M$: moment (torque) vector (N m)
$M$: Earth mass (kg)
$\mu$: Earth gravitational parameter, $\mu = GM$ (m$^3$ s$^{-2}$)
$m$: spacecraft simulator mass (kg)
$n = \sqrt{\mu / R_0^3}$: generic orbital rate (rad/s)
$\omega$: angular velocity vector (rad/s)
$\omega_x, \omega_y, \omega_z$: angular velocity vector components (rad/s)
$P$: solution of the Riccati differential equation
$P$: position cost (–)
$p$: thrusters inclination with respect to the imaginary square circumscribed to the attitude stage basis ($p = \cos(45°) = \sin(45°)$)
$Q$: state weighting matrix (mixed dimensions to generate dimensionless cost)
$Q_{transl}$: state weighting matrix translational contribution
$Q_{rot}$: state weighting matrix rotational contribution
$R$: input weighting matrix (mixed dimensions to generate dimensionless cost)
$R_0$: generic orbit radius (m)
${}^{\omega}kin_{\dot{\theta}}$: kinematics matrix relating the time derivative of the Euler angles with the angular velocity [23]
$\rho$: additional gain influencing the input weighting matrix (–)
$\theta$: Euler angles vector (rad)
$\theta_x, \theta_y, \theta_z$: Euler angles (rad)
$U^*$: generic LQR optimal solution
$U_{cont}$: normalized continuous thrust vector (N)
$U_{cost}$: net propellant expenditure (–)
$u_{10}$: normalized binary thrust vector (–)
$u_a$: dimensionless parameter with magnitude correspondent to the nominal thrust
$v_x, v_y, v_z$: linear velocity vector components (m/s)
$X$: generalized state vector $(x; \dot{x}; \theta; \omega)$
$X_{des}$: desired generalized state
$X_{err}$: actual error vector
$x$: position vector (m)
$\dot{x}$: linear velocity vector (m/s)
$x, y, z$: position vector components (m)
$Y$: output vector

1. Introduction
As spacecraft technology has evolved, robust and efficient automated control has become an essential mission capability. Over the last fifty years, in fact, worldwide aerospace research environments have addressed their work to the optimization of guidance, navigation and control performances, also paying attention to the propellant consumption and time-to-launch costs.
It is clear, then, how high efficiency controls, ensuring both position accuracy and propellant optimization, have become a critical issue in the control systems scope and specifically in the aerospace sector.
The spacecraft six degrees of freedom optimal control problem has been widely analyzed in the literature, lately focusing on spacecraft relative motion, usually requiring numerical methods [1]. Spacecraft formation and the optimization of relative maneuvers are becoming increasingly important topics of investigation. This is due to the benefits in cost, responsiveness, and flexibility of a multi-spacecraft system versus the classical monolithic satellite.
Particularly new is the use of continuous on–off engines, appearing on small spacecraft. This control constraint adds new complexity to finding the optimal solution. In fact, most of the literature on spacecraft optimal control assumes that the thrust can be finely modulated, partially to mitigate the aforementioned problem. Unfortunately, this is not the case with real engines, which are usually limited to some sustained value for thrust. As a partial response to this problem, a new methodology has been presented in [2] with the aim to control spacecraft rendezvous maneuvers assuming multi-level continuous thrusters and impulsive thrusters on the same vehicle. Furthermore, very recently, the interests of the Department of Defense have been focusing on time/propellant optimal rendezvous and capture maneuvers of a non-cooperative target satellite [3–5]. This research has been pushing the envelope with regards to fast computation of practical optimal/sub-optimal trajectories.
The recent numerical approaches to the optimal control problem can be broadly classified in two main categories:
1. indirect methods, which employ the calculus of variations to obtain the first-order optimality conditions [6], where the resulting boundary-value problem is sometimes impossible or time-consuming to solve;
2. direct methods, which approximate the trajectory via parameterization, and transform the cost functional into a cost function [7–9].
These second methods appear to be a viable tool for real-time spacecraft optimal control. However, the major issues with direct methods relate to the difficulties of defining parameters to represent feasible trajectories. The chosen parameterization may lead, for example, to unsatisfactory final conditions, or it may satisfy the final conditions but violate some of the constraints on the controls. Adding penalties to the cost function may help, but it provides no guarantee of improvement as in [5,9]. Also, the trajectory computation must be as fast as possible, since it is iterated by an outer optimizer looping on the parameters. The rendezvous problem to a tumbling object analyzed in [5] is a very good example, where a direct method may not converge to feasible solutions or may converge in an unacceptable amount of CPU time for onboard implementation.
Addressing the linear quadratic control problem, LQR-based algorithms were widely used to control space systems, specifically satellites and spacecraft assemblies. Yang proved the effectiveness of a quaternion-based LQR method for the design of nonlinear spacecraft control systems in [10], demonstrating how the designed controller globally stabilized the nonlinear spacecraft system, whereas it locally optimized the spacecraft performance. Beatty, in [11], successfully demonstrated the superiority of a linear quadratic regulator with respect to a proportional-derivative (PD) controller. His work consisted of a comparison of spacecraft attitude control methodologies that use reaction wheels for torque actuation and star trackers to infer spacecraft orientation and angular rate. Another proof of LQR reliability and effectiveness was given by Walker and Spencer [12]: this work focused on a system for relative navigation and automated proximity operations for a microsatellite using continuous thrust propulsion and low-cost visible and infrared imagers. In this case the state error was employed, together with the thrust vector, to define the cost function to be minimized by the LQR optimal gains; moreover the parametric weighting matrices were defined as functions of the actual distance to the goal and of the maximum available thrust, while Pulse-Width-Modulation was used to determine thrust control.
Bevilacqua et al. worked on multiple spacecraft control by developing an autonomous distributed control algorithm for close proximity operations of multiple spacecraft systems, including rendezvous and docking scenarios. In order to validate the proposed control approach, both theoretical simulations [13] and experimental tests [14–16] were carried out, choosing the LQR as the system controller. The LQR control effort served as the attractive force toward goal positions, while APF repulsive functions provided collision avoidance for both fixed and moving obstacles. Previous experiments, by the above authors, assumed that each spacecraft is equipped with an attitude control system, thus focusing only on the translational control. Regarding the control strategy, which represents the main subject of the present paper, as in [12], the weighting matrices were defined introducing a parametric structure depending on the maximum allowable values of the states and control effort. Simulations proved the LQR/APF to be both effective and efficient in conducting simultaneous spacecraft missions and in disturbance rejection. The previous results obtained in [13] were then implemented and verified in [14], where both theoretical developments and experimental validation of the hybrid LQR/APF were presented. In this case propellant consumption was sub-optimized in real-time through re-computation of the LQR at each sample time, while the APF performed collision avoidance and a high level decisional logic.
There is a gap that has not been addressed directly in the past literature: real-time techniques for optimal control of both attitude and position. This is partly due to the fact that attitude and position are commonly dealt with independently; satellites primarily control their attitude to communicate to ground stations, to point to targets, to collect measurements or absorb energy from the Sun, whereas translational control is activated infrequently as it is only required during specific maneuvers such as orbit insertion and station-keeping. Nevertheless, current spacecraft rendezvous and docking maneuvers require the cooperation of multiple spacecraft deliberately designed to work together; hence both attitude and position need to be controlled constantly and simultaneously to maintain the formation and avoid collisions.
The present paper, in keeping with this context, illustrates a control algorithm taking into account both position and attitude, thus allowing a global control of a full six-degree-of-freedom spacecraft with a unique controller, in view of an implementation on multiple spacecraft assembly for the execution of rendezvous and docking maneuvers. The adoption of a single controller would simplify the system's architecture, reducing the computational burden, without sacrificing required performances. Specifically, the control strategy is based on an LQR, whose gains are re-computed at each time sample, hence generating a sub-optimized solution of the six-degree-of-freedom problem. A comparison with an optimal control problem solver, based on a Radau pseudospectral Gaussian quadrature method, constitutes the theoretical validation of the proposed algorithm, evaluating the real optimization level. Finally, the presented LQR-based control system was also experimentally verified through a series of tests carried out on the test bed at the Advanced Autonomous Multiple Spacecraft (ADAMUS) laboratory [17,18]. This work particularly showcased the feasibility of a real-time approach.
The main contribution of this research to the state-of-the-art on spacecraft optimal relative motion control consists in the application of an SDARE approach to obtain a real-time control algorithm for six-degree-of-freedom spacecraft. SDARE approaches have been successfully applied in the past to several engineering problems [19–22] but, to the authors' knowledge, the complete spacecraft control problem has not received the attention it deserves. The paper completely characterizes the methodology via a comparison of its performances versus those of a general purpose software for solving nonlinear optimal control problems: GPOPS-II. GPOPS-II is based on an adaptive Radau pseudospectral Gaussian quadrature method. Furthermore, this work presents experimental validation of the proposed optimizer, through hardware-in-the-loop experimentation on a full six-degree-of-freedom test bed, showing real-time feasibility.
This paper is organized as follows. Section 2 illustrates the equations of the six-degree-of-freedom spacecraft motion, the dynamics linearization and deals with the control strategy by means of a linear quadratic regulator. Section 3 introduces the comparison with the optimal control software GPOPS-II, paying specific attention to the transformation of the results in order to have a meaningful matching with those from the LQR controller, thus ensuring an effective and truthful comparison. Section 4 illustrates the robotic spacecraft simulator employed for the experimental verification. Section 5 presents the results of the experimental tests. Section 6 concludes the paper.

2. Theory
The complete dynamics of spacecraft relative motion, as it will be expanded in Sections 2.1 and 2.2, consists of the Euler equations and the Clohessy–Wiltshire equations, describing the spacecraft attitude and translation respectively. When dealing with the spacecraft simulator, used for the experiments, the same equations have been used, simplifying the Clohessy–Wiltshire equations to a double integrator, thus obtaining a direct correspondence between an orbiting spacecraft's dynamics and the robot dynamics itself. The resulting dynamics is represented by linear equations for the translation and nonlinear Euler's equations for the rotation. In the next subsections we will simplify the dynamics to represent the spacecraft simulator and we will use the terms spacecraft (S/C) simulator or robot interchangeably.
Turning to detail, the position and attitude motion is represented by the following twelve component vector:
• three relative position coordinates: $x$, $y$, $z$;
• the three relative linear velocity components: $v_x$, $v_y$, $v_z$ or equivalently $\dot{x}$, $\dot{y}$, $\dot{z}$;
• a set of three Euler angles: $\theta_x$, $\theta_y$, $\theta_z$;
• three angular velocity components associated with the rotation rate about the three axes constituting the body reference frame: $\omega_x$, $\omega_y$, $\omega_z$.
Fig. 1 summarizes the conventional reference frames (body B, Local Vertical/Local Horizontal LVLH, inertial E) commonly employed to describe the relative motion of a spacecraft with respect to a generic target and expresses the sets of translational and rotational coordinates mentioned above.
The most common solution employed to control spacecraft entails a set of body-mounted thrusters providing unidirectional forces. This requires thruster mapping strategies, converting the tri-axial forces and torques, obtained from the equations of motion, into a series of unidirectional forces correspondent to each employed thruster.

2.1. Translational dynamics
The translational equations of motion of the spacecraft simulator have been derived from the more general Clohessy–Wiltshire equations describing the relative motion of a chase vehicle with respect to a target vehicle lying on a circular orbit of radius $R_0$ and orbital rate $n = \sqrt{\mu / R_0^3}$. Starting from Newton's second law,

$$ m\ddot{\mathbf{R}} = \mathbf{F} = m\mathbf{g} + \mathbf{F}_{thrust} \tag{1} $$

where $m$ is the mass of the vehicle, that is, in the present work, the spacecraft simulator, $\mathbf{g}$ is the gravitational acceleration and $\mathbf{F}_{thrust}$ is the control force, Ref. [23] illustrates how to derive the linear equations of relative motion, obtaining the well known expressions:

$$ \begin{aligned} \ddot{x} - 2n\dot{y} - 3n^2 x &= \frac{F_{thrust_x}}{m} \\ \ddot{y} + 2n\dot{x} &= \frac{F_{thrust_y}}{m} \\ \ddot{z} + n^2 z &= \frac{F_{thrust_z}}{m} \end{aligned} \tag{2} $$

The spacecraft simulator is not orbiting, thus implying the coincidence of the local reference frame with the inertial one in the laboratory: hence the terms related to the angular rate $n$, clearly null, vanish, finally obtaining:

$$ \ddot{x} = \frac{F_{thrust_x}}{m} \tag{3} $$

$$ \ddot{y} = \frac{F_{thrust_y}}{m} \tag{4} $$

$$ \ddot{z} = \frac{F_{thrust_z}}{m} \tag{5} $$

In the sequel, the forces will be assumed to be expressed in the coordinate system fixed to the spacecraft, requiring a rotation matrix to obtain them in the inertial reference frame. This choice is convenient to simplify the procedure of thruster mapping.

2.2. Attitude dynamics
When the reference frame is the laboratory's floor, the spacecraft simulator attitude dynamics can be modeled employing Euler's rotational equation of absolute motion in vector/dyadic form,

$$ \mathbf{M} = J \cdot \dot{\boldsymbol{\omega}} + \boldsymbol{\omega} \times J \cdot \boldsymbol{\omega} \tag{6} $$

Since the principal axes are also a central triad for the robot, that is the reference frame is barycentric and the inertia tensor is diagonal, expanding along the three axial directions all the products of inertia cancel out and Eq. (6) becomes

$$ \begin{cases} J_x \dot{\omega}_x + (J_z - J_y)\,\omega_z \omega_y = M_x \\ J_y \dot{\omega}_y + (J_x - J_z)\,\omega_x \omega_z = M_y \\ J_z \dot{\omega}_z + (J_y - J_x)\,\omega_y \omega_x = M_z \end{cases} \tag{7} $$

The kinematics equation, using the Euler angles parametrization with chosen rotation sequence yxz, is [23]

$$ {}^{E}\boldsymbol{\omega}^{B}_{yxz} = \dot{\theta}_y \mathbf{e}_2 + \dot{\theta}_x \mathbf{e}_1 + \dot{\theta}_z \mathbf{e}_3, \qquad \mathbf{e}_3 = \mathbf{b}_3 \tag{8} $$

where $\mathbf{e}_i^{j}$ are the basis vectors of different reference frames ($\mathbf{b}$ body frame), the subscript of which defines the rotation axis, whereas the superscript refers to the partial configuration obtained by each rotation. Expanding along the three axial components x, y, z, and then introducing the matrix notation, the previous equation assumes the form

$$ \boldsymbol{\omega} = {}^{\omega}kin_{\dot{\theta}}\, \dot{\boldsymbol{\theta}} \tag{9} $$

with

$$ {}^{\omega}kin_{\dot{\theta}\,(yxz)} = \begin{bmatrix} c_z & c_x s_z & 0 \\ -s_z & c_x c_z & 0 \\ 0 & -s_x & 1 \end{bmatrix} \tag{10} $$
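As an illustration of Eqs. (7)–(10), the following Python sketch builds the yxz kinematics matrix, its closed-form inverse, and the angular acceleration given by Euler's equations. It is not the authors' code: the function names are hypothetical, and the signs of the matrix entries are those obtained by deriving the yxz sequence with passive elementary rotations, which is an assumption about the convention adopted in [23].

```python
import math


def kin_matrix(theta_x, theta_z):
    """Matrix of Eq. (10): body rates = kin_matrix * Euler-angle rates (yxz).

    The matrix depends only on theta_x and theta_z, not on theta_y.
    Signs assume passive elementary rotations (an assumption, see above).
    """
    cx, sx = math.cos(theta_x), math.sin(theta_x)
    cz, sz = math.cos(theta_z), math.sin(theta_z)
    return [[cz, cx * sz, 0.0],
            [-sz, cx * cz, 0.0],
            [0.0, -sx, 1.0]]


def kin_matrix_inverse(theta_x, theta_z):
    """Closed-form inverse of kin_matrix; singular where cos(theta_x) = 0."""
    cx, sx = math.cos(theta_x), math.sin(theta_x)
    cz, sz = math.cos(theta_z), math.sin(theta_z)
    tx = sx / cx
    return [[cz, -sz, 0.0],
            [sz / cx, cz / cx, 0.0],
            [sz * tx, cz * tx, 1.0]]


def mat_vec(matrix, vector):
    """Plain 3x3 matrix times 3-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def euler_rates(theta, omega):
    """Euler-angle rates (x, y, z order) from the body rates omega."""
    theta_x, _, theta_z = theta
    return mat_vec(kin_matrix_inverse(theta_x, theta_z), omega)


def omega_dot(omega, torque, inertia):
    """Euler's equations, Eq. (7), solved for the angular acceleration.

    inertia holds the principal moments (Jx, Jy, Jz).
    """
    wx, wy, wz = omega
    mx, my, mz = torque
    jx, jy, jz = inertia
    return [(mx - (jz - jy) * wz * wy) / jx,
            (my - (jx - jz) * wx * wz) / jy,
            (mz - (jy - jx) * wy * wx) / jz]
```

At zero Euler angles the kinematics matrix reduces to the identity, so the Euler-angle rates equal the body rates. The inverse is singular at $\theta_x = \pm 90°$, where $\cos\theta_x$ vanishes.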
In Eq. (10), $c_i$ and $s_i$ denote the shortened forms of $\cos(\theta_i)$ and $\sin(\theta_i)$ respectively.

Fig. 1. Vectorial representation of the chaser position with respect to the target, including the inertial frame E, the local vertical local horizontal frame LVLH and the body frame B; attitude representation through the Euler angles parametrization, according to the sequence yxz (box).

2.3. The linearized dynamics
A linearization process has been applied to the nonlinear rotational dynamics of the robot to obtain the matrices representing the required inputs of the LQR to determine the optimal Kalman gain. The following derivations address decoupled translational and rotational motions, and the LQR solver generates independently the sub-optimal force and torque 3-component vectors to be mapped to the body mounted thrusters. Despite the ability of the LQR solver to generate directly the 12-component vector of the forces to be generated by the thrusters, thus addressing the coupled rototranslational dynamics, the authors chose to work with decoupled dynamics to retain the ability of a comparison with GPOPS-II. In fact, GPOPS-II has proven to have several convergence difficulties with the coupled dynamics, and to be very sensitive to the desired initial and final conditions. Given the twelve component state vector $X$ expressed by

$$ X = \begin{bmatrix} \mathbf{x} \\ \dot{\mathbf{x}} \\ \boldsymbol{\theta} \\ \boldsymbol{\omega} \end{bmatrix} \tag{11} $$

where $\mathbf{x} = \{x, y, z\}^T$, $\dot{\mathbf{x}} = \{\dot{x}, \dot{y}, \dot{z}\}^T$, $\boldsymbol{\theta} = \{\theta_x, \theta_y, \theta_z\}^T$ and $\boldsymbol{\omega} = \{\omega_x, \omega_y, \omega_z\}^T$, the decoupled rotational and translational equations of motion can be summarized with the following vectorial expression:

$$ \begin{aligned} \ddot{\mathbf{x}} &= \mathbf{F}/m \\ \dot{\boldsymbol{\theta}} &= {}^{\dot{\theta}}kin_{\omega}\, \boldsymbol{\omega} = \left({}^{\omega}kin_{\dot{\theta}}\right)^{-1} \boldsymbol{\omega} \\ \dot{\boldsymbol{\omega}} &= J^{-1}\left(-\boldsymbol{\omega} \times J\boldsymbol{\omega} + \mathbf{M}\right) \end{aligned} \tag{12} $$

or equivalently

$$ \dot{X}(t) = G(X(t)) + B\,\mathbf{F}(t) \tag{13} $$

where $B$ is a constant matrix. This represents a multi-input multi-output (MIMO) nonlinear system, a function both of the 12-dimensional state vector $X$ and the 6-dimensional input $\mathbf{F}$, including forces $F$ and torques $M$. The system is then linearized at the desired state, at every time step. Proceeding in this way for every time step, the whole trajectory will finally consist of a sequence of linearized sections, and will approach the non-linear one as the time sample tends to zero. In light of these considerations the linearized equations are expressed as

$$ \dot{X}(t) = \dot{X}_{des} + A(X_{des})\,[X(t) - X_{des}] + B\,\mathbf{F}(t) \tag{14} $$

where the subscript des specifies the desired state terms and the matrix $A$ is the state Jacobian matrix computed at the linearization point:

$$ A(t) = \left. \frac{\partial G}{\partial X} \right|_{X_{des}(t)} \tag{15} $$

Introducing the actual error vector as

$$ X_{err}(t) = X(t) - X_{des}(t) \tag{16} $$

the complete state space representation of the spacecraft simulator dynamics is then defined by Eq. (17):

$$ \dot{X}_{err}(t) = A(X_{des})\,X_{err}(t) + B\,\mathbf{F}(t) \tag{17} $$

Thus, given the rotation sequence yxz, a symbolic calculation tool was used to derive the parametric expression of the matrix $A$.
In particular the state matrix $A$ is composed of a translational and a rotational term, then gathered generating the global $12 \times 12$ state matrix:

$$ A = \begin{bmatrix} A_{transl} & 0_{6 \times 6} \\ 0_{6 \times 6} & A_{rot} \end{bmatrix} \tag{18} $$

where

$$ A_{transl} = \begin{bmatrix} 0_{3 \times 3} & I_{3 \times 3} \\ 0_{3 \times 3} & 0_{3 \times 3} \end{bmatrix} \tag{19} $$

$$ A_{rot} = \begin{bmatrix} A_{rot11} & A_{rot12} \\ 0_{3 \times 3} & A_{rot22} \end{bmatrix} \tag{20} $$

The expression of the state matrix rotational term has been further partitioned due to the complexity of its elements, allowing us to focus on those terms depending on the yxz convention, that is on the Euler angles:

$$ A_{rot11} = \begin{bmatrix} A_{rot1} & A_{rot2} & A_{rot3} \end{bmatrix} \tag{21} $$

$$ A_{rot1} = \begin{bmatrix} 0 \\ \dfrac{\sin(\theta_x)\{\omega_y[2\sin(\theta_z/2)^2 - 1] - \omega_x \sin(\theta_z)\}}{\sin(\theta_x)^2 - 1} \\ \dfrac{\omega_y \cos(\theta_z) + \omega_x \sin(\theta_z)}{\cos(\theta_x)^2} \end{bmatrix} \tag{22} $$

$$ A_{rot2} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \tag{23} $$

$$ A_{rot3} = \begin{bmatrix} -\omega_y \cos(\theta_z) - \omega_x \sin(\theta_z) \\ \dfrac{\omega_x \cos(\theta_z) - \omega_y \sin(\theta_z)}{\cos(\theta_x)} \\ \dfrac{\sin(\theta_x)[\omega_x \cos(\theta_z) - \omega_y \sin(\theta_z)]}{\cos(\theta_x)} \end{bmatrix} \tag{24} $$

$$ A_{rot12} = \begin{bmatrix} \cos(\theta_z) & -\sin(\theta_z) & 0 \\ \dfrac{\sin(\theta_z)}{\cos(\theta_x)} & \dfrac{\cos(\theta_z)}{\cos(\theta_x)} & 0 \\ \sin(\theta_z)\tan(\theta_x) & \cos(\theta_z)\tan(\theta_x) & 1 \end{bmatrix} \tag{25} $$

$$ A_{rot22} = \begin{bmatrix} 0 & \dfrac{\omega_z (J_y - J_z)}{J_x} & \dfrac{\omega_y (J_y - J_z)}{J_x} \\ -\dfrac{\omega_z (J_x - J_z)}{J_y} & 0 & -\dfrac{\omega_x (J_x - J_z)}{J_y} \\ \dfrac{\omega_y (J_x - J_y)}{J_z} & \dfrac{\omega_x (J_x - J_y)}{J_z} & 0 \end{bmatrix} \tag{26} $$

The $12 \times 6$ input matrix $B$ (where 12 represents the number of the states, while 6 represents the number of the force and torque inputs) remains unvaried, and is given by

$$ B = \begin{bmatrix} B_{transl} & 0_{6 \times 3} \\ 0_{6 \times 3} & B_{rot} \end{bmatrix} \tag{27} $$

where

$$ B_{transl} = \begin{bmatrix} 0_{3 \times 3} \\ I_{3 \times 3}/m \end{bmatrix} \tag{28} $$

$$ B_{rot} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ \dfrac{1}{J_x} & 0 & 0 \\ 0 & \dfrac{1}{J_y} & 0 \\ 0 & 0 & \dfrac{1}{J_z} \end{bmatrix} \tag{29} $$

Moreover, it was assumed that all the states are observed outputs. This model provides the inputs of the LQR, which, given the problem dimensions (12 states and 6 inputs), provides an optimal Kalman gain as a $6 \times 12$ matrix to be multiplied by the $12 \times 1$ state error $X_{err}$, finally obtaining the optimal solution $F_{LQR}$. This represents the optimal input generalized force, including both forces and torques, necessary to reach the desired state $X_{des}$, while minimizing the cost functional. However the spacecraft simulator is moved by twelve cold gas thrusters, placed on the upper part of the simulator; hence the generalized optimal force has been converted into a set of twelve unidirectional thruster commands by means of a mapping matrix $H$, as proposed in [24]. According to Fig. 2, thrusters are placed on the edge of the four arms symmetrically connected to the basis of the attitude stage and extended upwards and downwards respectively: specifically eight thrusters are placed with an inclination of 45° with respect to the sides of the imaginary square (side $2d = 0.64$ m) circumscribed to the basis, while the remaining four (thrusters 3, 6, 9, and 12) provide pure vertical motion when the basis is placed horizontally.

Fig. 2. Axonometric view of the attitude stage (AS) and geometric representation of the thrusters configuration: dotted lines represent the inertial principal axes, black thin lines refer to the AS with the relative upper/lower arms, thick arrows illustrate the thrust direction, light thin lines symbolize the imaginary square circumscribed to the AS base.
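The rotational block $A_{rot22}$ of Eq. (26) can be cross-checked numerically. The sketch below is an illustration, not the authors' symbolic tool: it differentiates Euler's equations, Eq. (7), by central finite differences and compares the result with the analytic entries. The function names and the inertia, rate, and torque values are hypothetical, and the signs in `a_rot22` are the ones implied by Eq. (7).

```python
def omega_dot(omega, torque, inertia):
    """Euler's equations, Eq. (7), solved for the angular acceleration."""
    wx, wy, wz = omega
    mx, my, mz = torque
    jx, jy, jz = inertia
    return [(mx - (jz - jy) * wz * wy) / jx,
            (my - (jx - jz) * wx * wz) / jy,
            (mz - (jy - jx) * wy * wx) / jz]


def a_rot22(omega, inertia):
    """Analytic Jacobian of omega_dot with respect to omega, Eq. (26)."""
    wx, wy, wz = omega
    jx, jy, jz = inertia
    return [[0.0, wz * (jy - jz) / jx, wy * (jy - jz) / jx],
            [-wz * (jx - jz) / jy, 0.0, -wx * (jx - jz) / jy],
            [wy * (jx - jy) / jz, wx * (jx - jy) / jz, 0.0]]


def numerical_jacobian(func, point, step=1e-6):
    """Central-difference Jacobian of func at point, returned as rows."""
    columns = []
    for j in range(len(point)):
        plus, minus = list(point), list(point)
        plus[j] += step
        minus[j] -= step
        f_plus, f_minus = func(plus), func(minus)
        columns.append([(a - b) / (2.0 * step)
                        for a, b in zip(f_plus, f_minus)])
    # columns[j][i] holds d f_i / d x_j; transpose into row-major form.
    return [[columns[j][i] for j in range(len(point))]
            for i in range(len(columns[0]))]


# Hypothetical values, chosen only to exercise the check.
inertia = (1.2, 0.9, 1.5)      # principal moments of inertia, kg m^2
omega_des = [0.05, -0.02, 0.1]  # linearization point, rad/s
torque = [0.01, 0.0, -0.02]     # applied torque, N m

analytic = a_rot22(omega_des, inertia)
numeric = numerical_jacobian(
    lambda w: omega_dot(w, torque, inertia), omega_des)
max_error = max(abs(analytic[i][j] - numeric[i][j])
                for i in range(3) for j in range(3))
```

Because Euler's equations are quadratic in the angular rates, the central difference is exact up to rounding, so `max_error` should sit near machine precision. The same routine could be pointed at the full 12-state function $G$ of Eq. (13) to check the other blocks of $A$.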
The thrust distribution matrix is then given by

$$ H = \begin{bmatrix} {}^{E}DCM_{B}(X_{des})\, H_f \\ H_m \end{bmatrix} \tag{30} $$

where $H_f$ and $H_m$ represent the force and torque terms respectively.

$$ H_f = \begin{bmatrix} \pm p & \pm p & 0 & \pm p & \pm p & 0 & \pm p & \pm p & 0 & \pm p & \pm p & 0 \\ 0 & 0 & \pm 1 & 0 & 0 & \pm 1 & 0 & 0 & \pm 1 & 0 & 0 & \pm 1 \\ \pm p & \pm p & 0 & \pm p & \pm p & 0 & \pm p & \pm p & 0 & \pm p & \pm p & 0 \end{bmatrix} \tag{31} $$

$$ H_m = \begin{bmatrix} \pm p & \pm p & \pm 1 & \pm p & \pm p & 0 & \pm p & \pm p & 0 & \pm p & \pm p & \pm 1 \\ \pm p & \pm p & 0 & \pm p & \pm p & 0 & \pm p & \pm p & 0 & \pm p & \pm p & 0 \\ \pm p & \pm p & 0 & \pm p & \pm p & \pm 1 & \pm p & \pm p & \pm 1 & \pm p & \pm p & 0 \end{bmatrix} d \tag{32} $$

with $p = \cos(45°) = \sin(45°)$ and $d = 0.32$ m. The signs of the nonzero entries of Eqs. (31) and (32) are not legible in this copy; each sign follows the firing direction of the corresponding thruster in Fig. 2. The twelve resulting normalized continuous inputs, $U_{cont}$, have been then computed according to the following expression:

$$ U_{cont} = \mathrm{pinv}(H)\, \frac{F_{LQR}}{u_a} \tag{33} $$

$\mathrm{pinv}(\cdot)$ is the pseudoinverse operator, whereas $u_a$ is a dimensionless parameter whose magnitude corresponds to the nominal thrust force $F_t = 0.3$ N, introduced to normalize the values. After the pseudoinverse is computed a pulse-width-modulation strategy is used to finally derive a set of twelve dimensionless binary inputs, $u_{10}$, to be sent to the onboard on/off thrusters. It is worth underlining that the pseudoinverse solution is applied to a constant matrix, posing no computational or solution existence issues.

2.4. Linear quadratic control
The control strategy is sub-optimized since the LQR problem is solved at each time step, using dynamically sized weighting matrices $Q$ and $R$, adapting gains with respect to the actual position reached by the robot. Thus the negative effects deriving from the linearization will be minimized since the whole trajectory will consist of a sequence of linearized sections and will approach the non-linear one as the sample time tends to zero.
The definition of the control law requires the introduction of a state feedback gain whose design is a trade-off between the transient response and the control effort. The optimal control approach [25] to this design trade-off is to define and minimize a performance index; furthermore, since generally the robot has to reach non-zero target position and attitude, a non-zero set point optimal control [26] has been considered, shifting the actual state $X$ by the desired quantity $X_{des}$, obtaining the error, defined in Eq. (34):

$$ X_{err}(t) = X(t) - X_{des}(t) \tag{34} $$

Consequently, the shifted optimal regulation problem becomes

$$ (\mathrm{LQR}) \quad \begin{cases} \text{Minimize } J = \dfrac{1}{2} \displaystyle\int_{t_0}^{\infty} \left[ X_{err}^T(t)\, Q\, X_{err}(t) + U^T(t)\, R\, U(t) \right] dt \\ \text{Subject to } \dot{X}_{err}(t) = A\, X_{err}(t) + B\, U(t) \\ Y(t) = X_{err}(t) \end{cases} \tag{35} $$

with $A$ and $B$ from Eqs. (18) and (27) respectively, and where the state weighting matrix $Q$ is assumed symmetric and positive semi-definite and the input weighting matrix $R$ symmetric and positive definite.
In this case, the LQR generates the optimal solution $U^*(t)$ (all the following functions are listed as functions of time, as they depend on $X_{des}$ which depends on time)

$$ U^*(t) = -K(t)\, X_{err}(t) = -R^{-1}(t)\, B^T P(t)\, X_{err}(t) \tag{36} $$

where $K(t)$ is often referred to as Kalman gain and $P(t)$ is the solution of the Riccati differential equation,

$$ P(t) A(t) + A(t)^T P(t) - P(t) B R^{-1}(t) B^T P(t) + Q(t) = \dot{P}(t) = 0 \tag{37} $$

obtaining the following linear state feedback control:

$$ \begin{aligned} \dot{X}_{err}(t) &= (A(t) - B K(t))\, X_{err}(t) \\ U(t) &= -K(t)\, X_{err}(t) \end{aligned} \tag{38} $$

The presented control strategy has been simulated employing a Simulink integrated MATLAB function, which is based on the original code employed in the 'lqr' MATLAB command [27]. It requires the following input matrices: $A$ (dynamic matrix), $B$ (control matrix), $C$ (state-output mapping matrix), $D$ (control-output mapping matrix or feedforward matrix), and the $Q$ and $R$ weighting matrices. Lastly, it provides the optimal gain matrix $K$, the solution of the Riccati equation $S$ and the closed-loop eigenvalues of the matrix $(A - BK)$, allowing us to verify whether or not the system has been stabilized, that is, whether all eigenvalues have negative real part. The presented function executes the routine in both Linux and win32/64 environments, where the correspondent C code is generated for compilation under RTAI Linux.
Concerning the weighting matrices, the tuning technique is based on the use of constant elements $q_i$ and $r_i$ for both the matrices $Q$ and $R$, as part of a trial and error iterative process where each weight and the factor $\rho$ are gradually adjusted with the aim of obtaining the desired performances, that is the reaching of the target position. Moreover, two additional gains $K_{pos}$ and $K_{rot}$ are introduced in $Q$, and the weighting factor $\rho$ becomes variable throughout the simulations as well as the experimental tests, depending on these two new gains, thus increasing the control sensitivity to the system evolution. The deriving matrices have the following structure:

$$ R = \rho\, \mathrm{diag}\left(r_{F_x},\, r_{F_y},\, r_{F_z},\, r_{M_x},\, r_{M_y},\, r_{M_z}\right) \tag{39} $$

$$ Q = \begin{bmatrix} Q_{transl} & 0_{6 \times 6} \\ 0_{6 \times 6} & Q_{rot} \end{bmatrix} \tag{40} $$

where

$$ Q_{transl} = \mathrm{diag}\left(K_{pos} q_x,\, K_{pos} q_y,\, K_{pos} q_z,\, q_{v_x},\, q_{v_y},\, q_{v_z}\right) \tag{41} $$

$$ Q_{rot} = \mathrm{diag}\left(K_{rot} q_{\theta_x},\, K_{rot} q_{\theta_y},\, K_{rot} q_{\theta_z},\, q_{\omega_x},\, q_{\omega_y},\, q_{\omega_z}\right) \tag{42} $$

Specifically, two controls, over the spacecraft position and attitude respectively, allow these new gains to switch between two calibrated values, on the basis of the actual position and attitude:
• once the i-th component of the position or attitude gets into an expressly chosen band, the boundaries of which have been defined as percentages of the relative target component (i.e. ±20% of the target position), a counter variable starts to run;
• when the counter reaches a predefined value the relative K gain ($K_{pos}$ or $K_{rot}$) switches to a second value appropriately calibrated for the steady state phase (robot near the target position);
• these new K gains are applied to the position and attitude terms of $Q$ (Eqs. (41) and (42)), thus forcing a change of the weights as the state approaches the target;
• when both the position and the attitude gains $K_{pos}$ and $K_{rot}$ have changed, the factor $\rho$ is also allowed to change, hence switching to a new value which is more suitable for the steady state phase.

3. Comparison with GPOPS-II
A validation of the proposed sub-optimal controller, implemented in Matlab and Simulink, is carried out by comparing it with an optimizer to highlight how well the proposed controller approximates the best solution. At the time of writing, several open-source, freeware and commercial optimal control and nonlinear programming interfaces are available. However, results accuracy, advancements in mesh refinement and generality of problem formulation represent some of the reasons behind the choice of GPOPS-II as the basis for comparison with the Simulink model. GPOPS-II, available at [28], has been developed at the University of Florida in cooperation with the U.S. [...] and to be extremely flexible, allowing a user to formulate a wide variety of applications including engineering, economics, and medicine.
The optimal control problem is then stated as an NLP problem (refer to Eqs. (12) and (44)), whose differential algebraic equations are collocated using nodes obtained from a Gaussian quadrature, whereas the state and the control are parameterized using Legendre polynomials and their linear combinations. As stated in [29], the combination of Legendre polynomials with Gaussian quadrature ensures exponential convergence for smooth solution problems. More specifically, in order to find the optimal solution, that is a combination of state and input vector solutions minimizing the objective function in the Bolza form [25], the general purpose optimal control software GPOPS-II uses a set of Legendre Gauss Radau (LGR) collocation points: this set is defined on the domain $[-1, 1]$, but contains only one of the endpoints. According to [29,30], the whole iterative procedure employed by GPOPS-II to determine the optimal solution can be divided into five steps representing partial results of the whole method:
1. identification of the first-order optimality conditions of the continuous Bolza problem, on the basis of the Pontryagin Minimum Principle [25];
2. Radau pseudospectral discretization of the continuous-time first-order optimality conditions of the continuous Bolza problem;
3. Radau pseudospectral discretization of the continuous-time optimal control problem, resulting in a discrete NLP;
4. statement of the Karush–Kuhn–Tucker (KKT) conditions related to the NLP;
5. costate estimation obtained from the results of steps 3 and 4.
A new model of the spacecraft simulator has then been developed, introducing the same rotational and translational equations of motion (12), but following, this time, according to the GPOPS-II logic design [31], a different problem formulation. Two phases have been considered, in order to analyze both the transient response and the steady state one, introducing the appropriate linkage constraints in the endpoint function to ensure continuity of the solution and set time and event criteria, defining the transit from the previous phase to the next one. In this way, during the mesh iterations, when an optimal solution is found, the mesh is analyzed in each of these phases, verifying if the mesh error tolerance is satisfied; if not, the solver automatically increases the number of collocation points, updating the mesh refinement and generating a non-uniform adaptive grid designed to reduce the error.
Finally, the following aspects have been considered to
Office of obtain a correct comparison between the approach pro- Naval Research (ONR) and the U.S. Defense Advanced posed herein and GPOPS one. Research Projects Agency (DARPA). Turning to detail, it employs an adaptive Radau pseudospectral Gaussian 1. The comparison of the LQR method and GPOPS has been quadrature method and a sparse finite-differencing is used focused on the global cost required by the controller. to estimate all first and second derivatives required by the Specifically the global dimensionless cost J, that is the nonlinear programming (NLP) solver. Moreover, it has been cost or objective functional to be minimized both by the designed to work with the NLP solvers Sparse Nonlinear LQR and the optimal solver, consists of two terms P and OPTimizer (SNOPT) and Interior Point OPTimizer (IPOPT) F, each related to the generalized position and velocity 122 L. Guarnaccia et al. / Acta Astronautica 122 (2016) 114–136
Table 1
Costs comparison.

                                                            Simulink                 GPOPS-II
Test  State              Initial state    Final state       F (–)   P (–)   J (–)    F (–)   P (–)   J (–)    ΔJ% (–)
1     x, y, z (m)        (1, 1.4, −1)     (−1, 1.85, 2)     63.3    173.3   236.6    53.3    160.4   213.7    9.7
      θx, θy, θz (deg)   (50, 0, 50)      (−30, 120, 30)
2     x, y, z (m)        (1, 1.4, 0)      (2, 1.85, 1)      15      39.5    54.5     11.1    33.9    45       17.4
      θx, θy, θz (deg)   (0, 30, −20)     (10, 120, 10)
3     x, y, z (m)        (1, 1.4, 0)      (1, 1.4, 0)       3.9     11.3    15.2     2.8     9       11.8     22.4
      θx, θy, θz (deg)   (20, 40, 10)     (0, 150, 10)
4     x, y, z (m)        (1, 1.4, 0)      (1.5, 1.9, 1)     9.3     18.7    28       5.8     17.4    23.2     17.1
      θx, θy, θz (deg)   (20, 40, −5)     (20, 40, −5)
5     x, y, z (m)        (2, 1.4, 0)      (0, 1.4, 1)       30.5    76.8    107.3    22.5    68      90.5     15.7
      θx, θy, θz (deg)   (20, 160, 0)     (0, 70, 30)
6     x, y, z (m)        (2, 1.4, 0)      (0, 1.4, 1)       34.1    88.7    122.8    25.4    77.7    103.1    16.0
      θx, θy, θz (deg)   (20, 200, 0)     (0, 0, 30)
7     x, y, z (m)        (2, 1.9, 1)      (1, 1.4, 1)       9.6     27      36.6     7.7     23.8    31.5     13.9
      θx, θy, θz (deg)   (10, 300, 10)    (−20, 45, 0)
8     x, y, z (m)        (−1, 1.9, 2)     (0, 1.4, 0)       19.2    75.4    94.6     21.4    64.7    86.1     9.0
      θx, θy, θz (deg)   (30, 60, −20)    (0, 0, 0)
accuracy, and the control effort respectively. Since the Simulink model employs a fixed time step of 0.02 s and a discrete solver, the cost function has been computed using the following discrete expression, to match the same type of cost function generated by GPOPS. This summation approximates the integral cost function.

$$J = \sum_{k=1}^{\infty} \left[ P(k) + F(k) \right] = \frac{1}{2} \sum_{k=1}^{\infty} \left[ X_{err}^{T}(k)\, Q\, X_{err}(k) + U^{T}(k)\, R\, U(k) \right] \tag{43}$$

In this analysis more importance has been attributed to the propellant cost, setting the tuning parameters in order to save as much propellant as possible.
2. Since neither the optimal solver GPOPS nor the LQR solver can handle binary variables, the propellant term has been computed using the continuous control vector $U_{cont}$; hence, during these simulations, the block used in the Simulink model to transform the continuous control into a binary series of on/off commands has been bypassed.
3. For the sake of simplicity the constant tuning technique (refer to Section 2.4) has been applied in both models, also using the same numeric values, so as to observe the results with equal weighting terms.

One last expedient has been devised to make the comparison even more faithful: the post-processing phase transforms the results obtained with GPOPS, interpolating the data and then extracting values on the basis of the equally spaced time vector used in Simulink. This technique allows a point by point comparison, hence a more detailed analysis of the results and their discrepancy.

The most important data affecting the outcome of this comparison, hence the assessment of the Simulink model performance, were the simulation costs. A set of eight simulations has then been executed using both the Simulink and the GPOPS software, generating Table 1, which summarizes the cost values as well as the initial and desired conditions for each of these eight tests. It is worth recalling that the translational dynamics is linear, while the rotational dynamics is nonlinear. The initial and final values were randomly chosen to cover a wide range of maneuvers.

From a statistical analysis it can be inferred that on average the sub-optimal controller requires 15% additional cost. However, although the GPOPS controller seems to be the best alternative, some observations have to be taken into account before expressing the final evaluation:

• Both Simulink and GPOPS simulations have been run using a laptop with standard performance at the time of writing (Processor Intel Core i3, RAM 2 GB). Given a fixed simulated duration of 200 s in either case, the optimal solver definitely incurred longer computation time; specifically, setting a mesh tolerance between 1e-5 and 1e-3 (suggested values 1e-6, default value 1e-3), each simulation required on average 600–700 s, about three times the simulated time, while, using Simulink, simulations required 200–500 s. Even though the LQR appears slightly faster in simulation, this comparison alone does not show a substantial gap between the two techniques. There is, in fact, an additional fundamental observation that needs to be made: while GPOPS needs to compute an entire trajectory before returning a solution, the proposed approach can solve the LQR problem in real time, i.e., it decides how to actuate for the immediate next time step.
• The LQR solution constitutes a linear state feedback control, generating a closed loop system, while the multi-purpose optimal control software GPOPS-II provides a guidance, that is, the determination of an entire desired trajectory, which implies an inability to execute in real time. In fact, a 600 s convergence time for a 200 s simulation clearly rules out real-time implementation.

To summarize, the Simulink theoretical controller represents the best alternative: in particular, the feedback nature and the higher level of flexibility of the controller allow the use of the sub-optimal solution for real-time implementation; furthermore, the previous cost analysis ensures that the same sub-optimal controller maintains a high optimization level, since the cost discrepancy is 15%, compensated by implementability with a real spacecraft.
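As a quick consistency check on the cost comparison, the following Python sketch evaluates a discrete cost of the form of Eq. (43) and recomputes the ΔJ% column from the J values of Table 1. The function name `discrete_cost`, the even split of the 1/2 factor between P and F, and the use of NumPy are assumptions made for illustration only. The normalization of ΔJ% by the Simulink cost is inferred from the tabulated numbers; it is not stated in the text.

```python
import numpy as np


def discrete_cost(x_err, u, Q, R):
    """Discrete cost in the form of Eq. (43), over a finite number of samples.

    x_err : (N, n) array of state-error samples X_err(k)
    u     : (N, m) array of control samples U(k)
    Assumption: the 1/2 factor is applied to both terms, so that
    P = 1/2 * sum_k X_err^T Q X_err and F = 1/2 * sum_k U^T R U.
    """
    P = 0.5 * np.einsum("ki,ij,kj->", x_err, Q, x_err)  # state accuracy term
    F = 0.5 * np.einsum("ki,ij,kj->", u, R, u)          # control effort term
    return P, F, P + F


# Total costs J copied from Table 1 (tests 1 to 8).
J_simulink = np.array([236.6, 54.5, 15.2, 28.0, 107.3, 122.8, 36.6, 94.6])
J_gpops = np.array([213.7, 45.0, 11.8, 23.2, 90.5, 103.1, 31.5, 86.1])

# Assumed definition: cost difference normalized by the Simulink cost.
delta_pct = 100.0 * (J_simulink - J_gpops) / J_simulink

print(np.round(delta_pct, 1))
print(round(float(delta_pct.mean()), 1))
```

With this normalization the recomputed percentages agree with the tabulated ΔJ% values to the printed precision, and their mean is about 15%, in line with the average discrepancy quoted above.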
4. The six-degree-of-freedom spacecraft simulator testbed
Once the reliability and the effective optimization level of the Simulink model were proven, a series of experimental tests was conducted on the ADAMUS test bed [32,33]. This test bed consists of a six-degree-of-freedom spacecraft simulator based on air bearing technology that allows it to move as it would in space, i.e., torque and force free. The so-called moving platform (MP) is controlled by twelve cold gas thrusters, appropriately placed on the upper part of the simulator. The MP is powered by two lithium-ion batteries, which are connected to an on-board power management system. Moreover, a 13 × 15 ft (3.96 × 4.57 m) flat epoxy floor (Fig. 3) is used as the main base upon which the robot moves. The overall system represented in Fig. 4 consists of two main stages: