Model-based FDI for Agile Spacecraft with Multiple Actuators Working Simultaneously E. Lopez i de la Encarnacion Technische Universiteit Delft

Model-based FDI for Agile Spacecraft with Multiple Actuators Working Simultaneously

by E. Lopez i de la Encarnacion

in partial fulfillment of the requirements to obtain the degree of Master of Science in Aerospace Engineering at the Delft University of Technology, to be defended publicly on 12th June 2019 at 9.00am.

Student number: 4617266
Project duration: July 16, 2018 – June 12, 2019
Thesis committee:
Dr.ir. R. Fónod, TU Delft, supervisor
Dr. A. Cervone, TU Delft, committee chair
Dr.ir. E. van Kampen, TU Delft, external examiner

An electronic version of this thesis is available at http://repository.tudelft.nl/.
Cover image credits: European Space Agency

Preface

This report presents the MSc thesis project Model-based FDI for Agile Spacecraft with Multiple Actuators Working Simultaneously, the basis of which is a novel fault detection and isolation strategy applied to agile spacecraft that use multiple actuators together and tested using Monte Carlo campaigns. It has been written to partially fulfill the requirements to obtain the degree of Master of Science in Aerospace Engineering at the Delft University of Technology. The research project and the writing of this thesis were carried out from July 2018 to May 2019.

This research project was carried out at Airbus Defence and Space GmbH, in Friedrichshafen, Germany, within the Attitude and Orbit Control System and Guidance, Navigation, and Control (AOCS/GNC) department. The thesis goals and scope were defined in agreement with my company supervisor, Patrick Bergner, and my thesis supervisor, Róbert Fónod. The thesis presented some difficulties: the fault detection and isolation field is very broad, and many strategies can be used to achieve the defined goals. However, a thorough literature study and the help and advice of both my supervisors, who were always available and willing to respond to my inquiries, allowed me to complete this thesis with satisfaction. In addition, this project allowed me to submit, together with both supervisors, a conference paper to the Automatic Control in Aerospace 2019 (ACA2019) conference, which has been accepted for publication.

Therefore, I would like to sincerely thank both my supervisors for their support, guidance, and supervision during all these months. I also want to thank all the Airbus colleagues, employees, interns, and thesis students who helped me along the way and gave their advice when I needed it. To my family and friends: I would like to thank you for supporting me and keeping me motivated when I was discouraged. Special thanks go to my parents, who have always been there for me and encouraged me to pursue my dreams.

I hope you find it interesting and enjoy reading it.

E. Lopez i de la Encarnacion Sabadell, May 2019


Abstract

Current and future space missions require agile and reliable spacecraft capable of tracking and maintaining the required attitude. Most agile spacecraft missions are near-Earth based, but some are placed far away from Earth and its influence. One example of such missions is the Athena mission, which requires the spacecraft to perform fast and large-angle attitude slew manoeuvres. Such manoeuvres often imply simultaneous use of multiple actuators such as thrusters and reaction wheels (RWs). A fault in any of these actuators might lead to partial or full damage of sensitive spacecraft instruments. In this research project, a novel model-based Fault Detection and Isolation (FDI) strategy is proposed, which is able to detect and isolate various actuator faults, such as stuck-open/closed thruster, thruster leakage, loss of effectiveness of all thrusters, and change of RW friction torque due to change of Coulomb and/or viscosity factor. Moreover, the proposed FDI strategy is also able to detect and isolate faults affecting the RW tachometers. The design of the FDI algorithm is based on a multiplicative extended Kalman filter, a generalised likelihood ratio thresholding of the residual signals, and a logic algorithm which unequivocally links the faults to the symptoms. The performance and robustness of the proposed FDI strategy are evaluated using Monte Carlo simulations and carefully defined FDI performance indices. In addition, the influence of faults’ magnitudes, times of fault occurrence, and uncertainties’ magnitudes on the FDI system performance is evaluated. Preliminary results suggest promising performance in terms of detection/isolation times, miss-detection/isolation rates, and false alarm rates. Also, uncertainties on the spacecraft inertia seem to have a negative impact on the FDI performance.

In order to fully understand the research project presented here, graduate-level knowledge of rigid body dynamics and kinematics, control theory, and filters applied to estimation might be required. Readers unfamiliar with any of these areas are advised to consult the associated literature referenced in the bibliography.


Contents

List of Figures
List of Tables
Nomenclature
Notation
1 Introduction
  1.1 Background
  1.2 Motivation
  1.3 The Scope of this Research
  1.4 Research Objectives, Framework, and Questions
    1.4.1 Research Objectives
    1.4.2 Research Framework
    1.4.3 Research Questions
2 Theoretical Background
  2.1 Athena Mission Description
    2.1.1 Space Environment
    2.1.2 Spacecraft Characteristics
    2.1.3 AOCS Equipment
  2.2 Fault Detection and Isolation
    2.2.1 FDI Architectures
    2.2.2 Model-Based FDI
  2.3 Spacecraft Attitude Representation
3 Study Case Description and Methodology
  3.1 Definition of the Study Case
    3.1.1 Environment
    3.1.2 AOCS Equipment: Sensors and Actuators
    3.1.3 Spacecraft’s Dynamics and Kinematics
    3.1.4 Faults
    3.1.5 Uncertainties
  3.2 Methodology for Evaluation of the FDI System
    3.2.1 Test Campaigns
    3.2.2 Evaluation Criteria
    3.2.3 Post-Processing
4 Proposed FDI Strategy
  4.1 FDI Strategy
    4.1.1 Kalman Filter
    4.1.2 Extended Kalman Filter
    4.1.3 Multiplicative Extended Kalman Filter
    4.1.4 State Estimation
    4.1.5 Residual Generation
    4.1.6 Fault Detection Algorithm
    4.1.7 Fault Isolation Algorithm
5 Simulation Results
  5.1 Simulator
    5.1.1 Workflow
    5.1.2 Structure and Functionality
    5.1.3 Missing Features in the Simulator
  5.2 Study Case Parameters Definition
  5.3 Sample Runs Analysis
    5.3.1 Fault-Free
    5.3.2 Leakage Fault
    5.3.3 Loss of Effectiveness Fault
    5.3.4 Stuck Close Fault
    5.3.5 Stuck Open Fault
    5.3.6 Reaction Wheel Friction Fault
    5.3.7 Reaction Wheel Tachometer Fault
  5.4 Monte Carlo Analysis
    5.4.1 Test Campaign Without Uncertainties
    5.4.2 Test Campaign With Uncertainties
  5.5 Discussion on Simulation Results
6 Conclusions and Recommendations
  6.1 Research Sub-Questions
    6.1.1 Sub-Question: Study Case Model
    6.1.2 Sub-Question: Methodology
    6.1.3 Sub-Question: AOCS FDI System
    6.1.4 Sub-Question: FDI System Performance
  6.2 Research Question
  6.3 Recommendations and Future Work
A Appendix: Derivations
  A.1 Process Noise Matrix Q
  A.2 Power Spectral Density and Variance
B Appendix: Validation and Verification
  B.1 Thruster Reaction Control System
    B.1.1 Individual Test
    B.1.2 Multiple Test
    B.1.3 Monte Carlo Test
  B.2 Kalman Filter
C Appendix: Additional Simulation Figures
  C.1 Sample Runs of Interest
    C.1.1 Stuck Open Fault With Highest Time To Detection
    C.1.2 Stuck Close Fault With Highest Time To Detection
  C.2 Additional Simulation Results Figures
  C.3 Uncertainties Correlations
Bibliography

List of Figures

1.1 Research framework.

2.1 L2 Earth-Sun Halo Orbit.
2.2 ESA design: Athena spacecraft in deployed configuration.
2.3 Hybrid expert based architecture.
2.4 Simplified model-based FDI architecture.
2.5 General model-based FDI architecture.

3.1 ESA design: Athena spacecraft in deployed configuration with fixed-body frame.

5.1 Top-level architecture of the GAFE Simulator. Source: GAFE Users’ Manual.
5.2 Comparison between real scenario PWM signal and simulation scenario PWM signal of single thruster during a single cycle.
5.3 Comparison between real scenario PWM signal and simulation scenario PWM signal of single thruster during a single cycle.
5.4 Maximum, minimum, and limiting generated torque per axis.
5.5 RCS torque envelope.
5.6 Evolution of the spacecraft attitude.
5.7 Fault-free case.
5.8 Force [N] per 푖-th thruster over time in fault-free case.
5.9 Faulty thruster force in Leakage fault case.
5.10 Leakage fault case.
5.11 Spacecraft’s angular rates in fixed-body frame for leakage fault case.
5.12 Force [N] per thruster over time in LOE fault.
5.13 LOE fault case.
5.14 Faulty thruster force in stuck close fault case.
5.15 Stuck close fault case.
5.16 Faulty thruster force in stuck open fault case.
5.17 Stuck open fault case.
5.18 RW friction fault case.
5.19 RW tachometer fault case.
5.20 Leakage fault: simulation parameters histograms.
5.21 Leakage fault: influences w.r.t. detection performance.
5.22 Leakage fault: influences w.r.t. isolation performance.
5.23 LOE fault: simulation parameters histograms.
5.24 LOE fault: influences w.r.t. detection performance.
5.25 LOE fault: time of fault occurrence vs fault’s magnitude.
5.26 Stuck Close fault: simulation parameters histograms.
5.27 Stuck Close fault: influences w.r.t. detection performance.
5.28 Stuck Open fault: simulation parameters histograms.
5.29 Stuck Open fault: influences w.r.t. detection performance.
5.30 RW friction fault: simulation parameters histograms.
5.31 RW friction fault: influences w.r.t. detection performance.
5.32 RW tachometer fault: simulation parameters histograms.
5.33 RW tachometer fault: influences w.r.t. detection performance.
5.34 Uncertainties histograms.
5.35 Uncertainties correlation coefficients for global results.
5.36 Fault-free simulations’ inertia uncertainties magnitudes.
5.37 Inertia uncertainty magnitudes vs false alarm.
5.38 Inertia uncertainty probability function distribution for no-false/false alarm cases.

B.1 Single RCS cycle delivered forces per thruster.
B.2 Commanded versus delivered forces and torques for a single RCS cycle.
B.3 Multiple RCS cycles delivered forces per thruster.
B.4 Commanded versus delivered forces/torques for multiple RCS cycles.
B.5 Commanded versus delivered forces/torques and relative errors of 50 runs.
B.6 Mean and maximum relative errors.
B.7 Spacecraft angular rate true error consistency check.
B.8 RWs angular rates true error consistency check.
B.9 RWs friction torques true error consistency check.
B.10 Spacecraft angular rates measurement error consistency check.
B.11 RWs angular rates measurement error consistency check.

C.1 Stuck open fault case with highest time to detection.
C.2 Stuck close fault case with highest time to detection.
C.3 Linear scale representation.
C.4 Uncertainties correlation coefficients.

List of Tables

4.1 Fault signatures.

5.1 Spacecraft and FDI related parameters.
5.2 GLR means and variances and decision taking algorithms design parameters.
5.3 MC related parameters.
5.4 Campaign without uncertainties results.
5.5 Campaign with uncertainties results.
5.6 Percentage differences between results from no-uncertainty and uncertainty campaigns.

B.1 RCS validation and verification parameters.
B.2 True errors consistency check results.
B.3 Measurement errors consistency check results.


Nomenclature

Acronyms

AEOS Agile Earth Observing Satellites
AOCS Attitude and Orbit Control System
CI Confidence Interval
CMG Control Moment Gyro
CoM Center of Mass
CUSUM Cumulative Sum
DEPFET Depleted P-channel Field Effect Transistor
EKF Extended Kalman Filter
ESA European Space Agency
FDI Fault Detection and Isolation
FDIR Fault Detection, Isolation, and Recovery
GLR Generalized Likelihood Ratio
GNC Guidance, Navigation, and Control
KF Kalman Filter
LOE Loss Of Effectiveness
MC Monte Carlo
RCS Reaction Control System
RMU Rate Measurement Unit
RW Reaction Wheel
STR Star Tracker
UKF Unscented Kalman Filter

Attitude Representation

퐀 Euler angles
퐠 Gibbs vector
퐪 Quaternion
퐑 Rotation matrix
퐯 Vector in body frame
퐯 Vector in reference frame

AOCS Equipment

휒 Function that models different faults in a thruster
휖 Misalignment angle
휙 Scalar variable that models if a thruster is faulty
휓 Scalar variable that models reaction wheel measurement fault
흎 Reaction wheel angular rate
흎 Spacecraft angular rate
휽(⋅, ⋅) Rotation function of a vector
흋(⋅) Quaternion rotation function for noise in Euler angles
흑(⋅) Quaternion rotation function for misalignment angle
퐛 Directional force of 푖-th thruster
퐛 Directional torque of 푖-th thruster
퐝 Direction of 푖-th thruster
퐡 Angular momentum vector associated with reaction wheels
퐉 Reaction wheel inertia
퐉 Spacecraft inertia matrix
퐫 Vector position of 푖-th thruster w.r.t. CoM
퐓 Torque
퐮 Inputs
휉 Scalar variable that models reaction wheel viscous factor fault
휉 Scalar variable that models reaction wheel Coulomb factor fault
휁 Coulomb friction torque constant
휁 Viscous friction torque constant
푐 Total torque of 푖-th reaction wheel
푓 Coulomb friction
퐹 Thruster force
푓 Viscous friction
푚 Fault magnitude with 푖 ∈ {푙푒푎푘, 푙표푒, 푚푒푎푠, 푣, 푐}
푚 Directional vector of 푖-th reaction wheel
푁 Number of elements

Algorithms

훿퐠 Error Gibbs vector
훿퐪 Error quaternion
훾 Fixed threshold accounting for the RW’s friction characteristics
Γ Decision taking threshold for 푖-th GLR test signal
휆 Decision test signal for 푖-th GLR test signal
퐁 Geometric matrix of thruster noise influence on body frame
퐅 Jacobian matrix of f(x,u)
퐠 GLR test signals
퐇 Sensitivity matrix
퐡 Observation model
퐊 Kalman gain matrix
퐌 Fault signature matrix
퐌 Misalignment-free matrix mapping the estimated RWs’ torque contributions into the spacecraft body-fixed frame
퐏 Covariance matrix of the states
퐐 Discrete process noise matrix
퐫 Residual signals
퐑 Measurement noise matrix
퐮 Inputs
퐱 States
퐳 Sensors measurements
휀 Very small constant
퐿 Moving time window

Monte Carlo Campaigns

Δ Half the confidence interval
푗̂ Fraction of success
푣̂ Fraction of failure

푁 Number of Monte Carlo runs
푡 Fault time of occurrence
푧/ Upper tail of the normal distribution set by the confidence level

Statistics

훿 Dirac delta function
휅 Probability function scalar parameter
ℋ Hypothesis
휇 Mean value
휎 Standard deviation
휎² Variance
휼(푡) Continuous-time zero-mean Gaussian white noise signal function

휼 Discrete-time zero-mean Gaussian white noise vector
퐥 Vector of independent random variables
퐐(푡) Process noise PSD matrix
퐐 Discrete process noise matrix
퐑 Auto-correlation matrix of noise
퐫 Auto-correlation function of noise
퐒 Power spectral density function of noise
퐾 Unknown change time
푆 Power spectral density constant value for white noise

Notation

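• Vectors and matrices are written in bold upright type, with dimensions as subscripts if necessary, e.g., $\mathbf{m}$, $\mathbf{M}$, and $\mathbf{M}_{m \times n}$.
• $x$ and $\mathbf{x}$ stand for a variable and a vector variable, respectively.
• $\hat{x}$ is the estimate of $x$.
• $\dot{x} = \frac{\mathrm{d}x}{\mathrm{d}t}$ is the time derivative of $x$.
• $\mathbb{R}^{m \times n}$ defines a set of real numbers with $m \times n$ dimensions.
• The norm of a vector $\mathbf{x}$ is denoted $\|\mathbf{x}\|$.
• Let $\mathbf{M} \in \mathbb{R}^{m \times n}$ be a matrix with real values; $\mathbf{M}^{T} \in \mathbb{R}^{n \times m}$ is its transpose.
• Let $\mathbf{M} \in \mathbb{R}^{n \times n}$ be an invertible matrix with real values; $\mathbf{M}^{-1} \in \mathbb{R}^{n \times n}$ is its inverse.
• $\mathrm{diag}(x_1, x_2, \dots, x_n)$ is a diagonal matrix with $x_i$, $\forall i \in \{1, 2, \dots, n\}$, as main diagonal elements and zeros as off-diagonal elements.
• The geometric mean of a vector $\mathbf{x}$ is defined as $\mu = \sqrt[n]{x_1 \cdot x_2 \cdot \ldots \cdot x_n}$.
• $\mathrm{Var}[\mathbf{x}]$ is the variance of $\mathbf{x}$.
• $\langle \cdot \rangle_{abc}$ pulls out the element associated with $abc$ from the enclosed vector.
• $\mathbb{E}(x)$ stands for the expected value of variable $x$.
• $\otimes$ is the quaternion multiplication.
• $\triangleq$ means "is defined to be equal to".
• $|\cdot|$ is the absolute value operator.
• $x \sim \mathcal{N}(\mu, \sigma^2)$ means that $x$ follows a normal distribution with mean $\mu$ and variance $\sigma^2$.
• The operator $\mathrm{sign}(\cdot)$ extracts the sign (positive or negative) of a variable.
• $\mathbf{I}_{m \times m}$ defines an identity matrix of $m \times m$ dimensions.
• $\mathbf{0}_{m \times n}$ defines a matrix of $m \times n$ dimensions with all its entries being zero.
• $\times$ denotes the cross-product operation.
• $\mathbf{P}(i,j)$ is the element of matrix $\mathbf{P}$ in the position $(i, j)$.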


1 Introduction

1.1. Background

Spacecraft agility, meaning the capability of the spacecraft to change its attitude by performing fast (large-angle) slew maneuvers or to follow a given attitude profile with high precision, is becoming more and more important for recent and future missions. These requirements demand that the spacecraft be equipped with actuators, such as thrusters, reaction wheels (RWs), control moment gyros (CMGs), and/or magnetorquers, capable of generating high reaction torques and performing attitude maneuvers with high angular rates. The control of the spacecraft’s attitude is achieved by the Attitude and Orbit Control System (AOCS), which includes sensors and actuators. Sensors measure input from the physical environment, and actuators move and control mechanisms of the spacecraft in order to keep/achieve the required attitude and orbit. These sensors and actuators are not exempt from faults. A fault is defined as an “unpermitted deviation of at least one characteristic property or parameter of the system from the acceptable/usual/standard condition”, see [33]. Some faults are trivial, like equipment that is inactive when it is required to be active, but others are less so. Examples of the latter are an increase of the wheel’s friction in the case of RWs and a loss of effectiveness in the case of thrusters. Incorrect AOCS fault management may cause severe degradation of the spacecraft performance and/or cause damage to sensitive spacecraft instruments (e.g., by direct sun exposure).

A system which is in charge of detecting the presence of a fault and identifying its location is often referred to as the Fault Detection and Isolation (FDI) system. A quick and reliable FDI system is crucial for a successful fault recovery action (e.g., switching to redundant hardware and/or employing a new controller). Model-based FDI techniques, in general, have gained a great deal of attention in the past decades and, in particular, they show great potential for aerospace applications, see [40] for a recent survey. Model-based FDI systems can be designed to deal with AOCS sensor faults, with AOCS actuator faults, or both.

The majority of published work on AOCS fault detection and isolation focuses on faults occurring in only one type of actuator or sensor. For instance, sole thruster faults are studied in [22, 45, 47], whereas sole RW faults are investigated in [41, 55]. Sole gyroscope sensor faults were studied, for instance, in [59, 66]. Only very few examples of model-based FDI systems exist which deal with a combination of one type of actuator and one type of sensor faults. Patton et al. [44] considered gyroscope and thruster faults, whereas Hou et al. [32] focused on gyroscope and RW faults. However, the case of single faults occurring in any of the actuators or sensors while multiple actuators work simultaneously, as in some agile spacecraft, has not been extensively investigated.

Most of the current missions that require spacecraft agility make use of Agile Earth Observation Satellites (AEOS) like Pleiades¹ and KOMPSAT-3². For these missions, the spacecraft is placed within the influence of Earth’s gravitational and magnetic fields, which allows the use of actuators such as magnetorquers, sensors like magnetometers and Earth sensors, and passive control techniques such as gravity gradient. However, other types of missions, which are placed far away from Earth’s influence, might also require agile spacecraft, but none of the aforementioned sensors, actuators, and passive control techniques used by AEOS can be used.

An example of an agile spacecraft that makes use of multiple actuators working simultaneously and that will be placed away from Earth’s influence is the Athena (Advanced Telescope for High-ENergy Astrophysics) spacecraft, an L-class mission of the European Space Agency (ESA), which aims at addressing the Hot and Energetic Universe science theme [2]: mainly the mapping of large-scale hot gas structures in the universe, the survey of supermassive black holes, and the exploration of high-energy astrophysical events such as supernova explosions and energetic stellar flares. Such investigation requires the examination of X-rays with space-based observations. The selected orbit for the Athena mission is a large-amplitude Halo orbit around the second Lagrange point (L2) of the Sun-Earth system, providing a stable thermal environment, high observing efficiency, and good instantaneous sky visibility [4].

Athena’s payload, necessary to achieve the mission’s objectives, is composed of an X-ray spectrometer for high-spectral resolution imaging and a silicon depleted p-channel field effect transistor (DEPFET) active pixel sensor camera [3]. The DEPFET instrument must operate at very low temperatures, so a correct attitude of the Athena spacecraft is essential to keep the correct temperature by pointing the radiators in the right direction. An increase in temperature would decrease the performance of the DEPFET instrument. The DEPFET and some other instruments are also very sensitive to radiation, and the maneuvers of Athena must comply with a very strict constraint: the Sun shall never be placed in the instruments’ field of view under any circumstance.

The observation plan of celestial targets will consist mostly of pointed observations, around 300 per year, with an average observation pointing time of about 28 hours and an error of 1 arcsec. However, this routine will be interrupted by targets of opportunity: γ-ray bursts and other transient events. Athena must therefore be agile enough to react rapidly to target of opportunity alerts, at a predicted rate of twice per month, and re-point its instruments quickly while protecting its sensitive instruments from direct sunlight. The attitude re-pointing may imply the realisation of fast large-angle attitude slews. To perform such slews, the Athena spacecraft is equipped with a set of thrusters and RWs and a suite of sensors to provide accurate control torque.

¹ Herbert J. Kramer, “Pleiades - Satellite Missions”, eoPortal Directory. https://earth.esa.int/web/eoportal/satellite-missions/p/pleiades (accessed January 18, 2019).
² Herbert J. Kramer, “KOMPSAT-3 - Satellite Missions”, eoPortal Directory. https://directory.eoportal.org/web/eoportal/satellite-missions/k/kompsat-3 (accessed January 18, 2019).

The design of an FDI system can be approached in diverse ways. As presented in [11], the model-based approach is a very good way to achieve FDI in dynamic systems, since it overcomes the main drawbacks of other approaches. For example, it overcomes the main drawbacks of the signal monitoring approaches, which are the generation of false alarms due to disturbances and uncertainties and the difficulty of isolating the fault. It also overcomes the main drawback of the hardware redundancy approaches, which is the increase in weight and cost. This is achieved through the use of analytic redundancies, which are analytic relationships between various measured variables of the monitored process. In model-based approaches, the difference resulting from the consistency check of different variables is called a residual signal. The residuals should be zero-valued when the system behaves normally and should diverge from zero when a fault occurs in the system. Usually, the residuals are computed from the difference between a measured signal and the estimate of that signal computed by the model of the system. The main advantages of the model-based approach are that no additional hardware is required, that the measurements and algorithms necessary to control the system are, in many cases, also sufficient for the FDI algorithms, and that the FDI system can be made robust against disturbances and uncertainties using different techniques.
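The residual concept described above can be illustrated with a minimal sketch. It is not the filter-based residual generator developed in this thesis: it assumes a hypothetical scalar first-order plant with made-up parameters `a` and `b`, a model that runs in open loop in parallel with the plant, and a loss-of-effectiveness fault injected at a chosen step. All names and values are illustrative assumptions.

```python
import random


def simulate_residuals(n_steps=200, fault_step=100, effectiveness=0.5,
                       noise_std=0.01, seed=0):
    """Return residuals r[k] = z[k] - z_hat[k] for a toy first-order plant."""
    rng = random.Random(seed)
    a, b = 0.95, 0.1              # hypothetical plant parameters (assumption)
    x_true, x_model = 0.0, 0.0    # true state and model-predicted state
    residuals = []
    for k in range(n_steps):
        u = 1.0                   # constant commanded input
        # The real actuator loses effectiveness from fault_step onwards;
        # the model is unaware of the fault and keeps the nominal gain.
        gain = effectiveness if k >= fault_step else 1.0
        x_true = a * x_true + b * gain * u
        x_model = a * x_model + b * u
        z = x_true + rng.gauss(0.0, noise_std)   # noisy measurement
        residuals.append(z - x_model)            # measured minus estimated
    return residuals


def mean_abs(values):
    """Average absolute value of a sequence."""
    return sum(abs(v) for v in values) / len(values)


residuals = simulate_residuals()
before = mean_abs(residuals[:100])    # fault-free part: residual is noise only
after = mean_abs(residuals[150:])     # well after the fault: residual has diverged
print(f"mean |r| before fault: {before:.4f}, after fault: {after:.4f}")
```

In the fault-free part the residual contains only measurement noise, while after the fault it settles far from zero, which is the property a detection threshold can exploit. A practical design would replace the open-loop model with a state estimator so that modelling errors do not accumulate in the residual.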

In past and current space missions, the design of the FDI of the AOCS is usually done during the final phases of the spacecraft’s design. That implies that, sometimes, FDI requires additional hardware in order to achieve detection and/or full isolation, which increases the weight, cost, and engineering effort. Recently, Airbus Defence and Space in Friedrichshafen, Germany, has been working on the GAFE study, led by ESA, with the objective of developing a methodology to design the Fault Detection, Isolation, and Recovery (FDIR) system of the AOCS of a spacecraft during its early design phases. The study is composed of three different parts: the methodology to design the FDIR, a tool that uses structural analysis techniques to find the minimum set of actuators and sensors needed to achieve detection and isolation, and a spacecraft simulator focused on FDIR.

1.2. Motivation

Agile spacecraft that are not near Earth’s influence can only use actuators that do not depend on that influence, like thrusters, RWs, and CMGs, and only sensors that do not depend on it, such as star trackers (STRs), rate measurement units (RMUs), and gyroscopes. In general, RWs are equipped with an internal tachometer that provides wheel rate measurements.

The requirements of agile spacecraft missions imply important challenges and innovations for the FDI system of the spacecraft’s AOCS. First of all, the fast (large-angle) slews performed by the spacecraft, together with the attitude constraints, entail that the FDI system must be accurate, so that it neither misses the detection or isolation of a fault nor generates false alarms, and that it must react as fast as possible to an occurring fault.

Second, the spacecraft will probably be subject to external disturbances like the solar pressure, to internal disturbances like propellant slosh and vibrations, and to uncertainties like the exact position of the centre of mass, the inertia of the rigid body, and the positions and alignments of the sensors and actuators with respect to the spacecraft’s body frame. In addition, AOCS sensors and actuators are generally affected by noise, biases, quantization effects, etc. Therefore, the FDI system must be robust against disturbances, uncertainties, and noise, meaning that it should be as insensitive to them as possible.

Third, the AOCS configuration allows multiple actuators to be active simultaneously, e.g., RWs and thrusters. This makes the proper detection and isolation of faults occurring in the actuators very challenging. The output magnitude of one actuator could be some orders of magnitude higher than that of the others. Furthermore, it is possible that the output magnitude of one actuator is lower than the noise magnitude of the rest. This makes the detection of a fault with multiple actuators working simultaneously very difficult.

The isolation of a fault in the actuators can be very complex and arduous, because a fault in an actuator could have an effect not only on the spacecraft’s dynamics and kinematics but also on other actuators, which could lead to a wrong isolation of the fault by the FDI system. Also, a fault could remain undetected because its effect is counteracted by the control system, but it could eventually have a negative impact on the AOCS and the spacecraft. Moreover, some missions might require, in addition to isolating the actuator element in which the fault occurs, isolating the type of fault that has been detected. For example, a fault in a reaction wheel could be caused by an increase of the friction or by a wrong measurement of its rate by the internal tachometer.

The development of the FDI approach for an agile spacecraft’s AOCS does not have a unique solution. Different combinations of techniques, schemes, and strategies can be used to achieve the correct detection and isolation of the considered actuator and sensor faults, but each solution would perform differently regarding, for example, times of detection and isolation or false alarm rates.

Airbus Defence and Space in Friedrichshafen, Germany, proposed an MSc thesis, in agreement with TU Delft and the author of this research, to develop and assess a specific FDI strategy for agile spacecraft that make use of multiple actuators simultaneously, so that it can possibly be applied to future research in the FDI field or even implemented on real missions.

Therefore, the challenges presented by the design of an FDI system for the AOCS of an agile spacecraft in an early design phase with similar characteristics to the Athena mission, using the GAFE simulator, have motivated the development of this research project.

1.3. The Scope of this Research

The aim of this project is to develop, within the GAFE simulator framework, a novel model-based FDI system capable of detecting and isolating faults occurring in the AOCS sensors and actuators. This FDI system is then evaluated based on multiple Monte Carlo (MC) campaigns and defined criteria.

The research presented here only considers thrusters and RWs as actuators and STRs and RMUs as sensors. The considered thruster faults are stuck open, stuck close, and leakage of an individual thruster, and loss of effectiveness of all thrusters. Regarding RW faults, only an increase of the rotating friction and a wrong tachometer measurement are considered. No STR or RMU faults are contemplated.

Finally, the assessment of the FDI performance is focused on evaluating how the strategy performs according to the defined criteria and how the faults’ magnitudes, the times of fault occurrence, and the faulty RW or thruster influence the FDI performance.

1.4. Research Objectives, Framework, and Questions

In order to formulate the research objectives, the procedure described in [60] is followed. Given the introduction, motivation, and scope of the thesis project presented in sections 1.1, 1.2, and 1.3, it is clear that the research is design-oriented. The context can be summarised as:

The agility capabilities demanded of agile spacecraft require a very demanding performance of the spacecraft’s AOCS (e.g., multiple actuators acting simultaneously). In order to provide enough safety, availability, and reliability, the AOCS must be equipped with a fast, robust, and accurate FDI system.

Based on the exploration of the context, it is possible to recognise that the design and implementation of an FDI system for an agile spacecraft’s AOCS is required.

1.4.1. Research Objectives

The main research objective of this project is to contribute to the development of agile spacecraft missions by designing an FDI system, implementing it, and testing it on the AOCS of a study case spacecraft similar to the Athena mission, in the framework of the GAFE simulator, and by evaluating its performance following a methodology and using specific criteria.

This main objective can be divided into sub-objectives:

1. Fully understand the set-up scenario of the Athena mission and spacecraft. This includes elements such as the parameters of the spacecraft’s orbit, the spacecraft’s characteristics (e.g., mass and inertia), operations, and AOCS equipment.
2. Define a study case spacecraft and mission, similar to the Athena mission, to which the FDI system developed here is applied. This includes the spacecraft’s characteristics, the AOCS sensors and actuators, and the fault models.
3. Define the methodology and criteria used to evaluate the performance of the FDI system.
4. Design and develop the FDI strategy.
5. Adapt the GAFE simulator to the study case mission’s environment, the spacecraft’s kinematic and dynamic characteristics, and the AOCS equipment for FDI purposes, and implement all the missing elements required for this thesis project.
6. Apply the methodology and evaluation criteria to assess the performance of the FDI system.

1.4.2. Research Framework

The research is embedded within the Airbus Defence and Space AOCS/GNC department in Friedrichshafen and the GAFE framework. GAFE stands for “Generic AOCS/GNC (Guidance Navigation Control) Techniques & Design Framework for Failure Detection, Isolation and Recovery” and is the name of the framework developed and implemented within the corresponding ESA-GSTP³ study. Its objectives were to develop an engineering approach and prototype tools to support AOCS/GNC FDIR design and validation and verification (V&V) in early project phases.

It is motivated by the fact that FDIR engineering for space systems lacks a systematic approach and engineering transparency. Consequently, the design of FDIR systems experiences significant growth in complexity and cost late in the development cycle, causing launch delays or delayed completion of the FDIR capabilities after launch. The result of this study is the GAFE framework, which is composed of three parts:

• GAFE methodology: a step-by-step guideline to obtain a verified and validated FDIR concept.
• GAFE structural analysis: a semi-analytical method that performs a detectability and isolability analysis to deduce a redundant equipment set for fault detection and recovery.
• GAFE simulator: a time-domain simulator, focussed on FDIR in early phases, that uses the nominal AOCS/GNC design to verify and validate the FDIR concept.

Of the different GAFE parts, only the simulator is of interest for the scope of this research project.

The research framework is summarised graphically in Figure 1.1. A study of the functional requirements of the mission is carried out to define the assessment criteria and to adapt the GAFE simulator to the needs of the study case mission. The theory on FDI and a literature review lead to the design of the FDI system, whose techniques and schemes are implemented and tested on the GAFE simulator. Then, the results from the MC campaigns run on the GAFE simulator are evaluated using the assessment criteria.

1.4.3. Research Questions

Research questions are used to address the main topics of the research. They are elaborated as a set of questions that need to be answered during the research: a central question is unravelled into sub-questions, and once all the lowest-level questions are answered, the central question can be considered answered as well.

³ General Support Technology Programme (GSTP), ESA: http://www.esa.int/Our_Activities/Space_Engineering_Technology/Shaping_the_Future/About_the_General_Support_Technology_Programme_GSTP (accessed April 23, 2019).

Figure 1.1: Research framework

The research questions can be derived from the research aim, objectives and framework. The main research question is:

Is it possible to develop a model-based FDI system for an agile spacecraft’s AOCS capable of detecting and isolating faults occurring in both sensors and actuators, when they work individually or simultaneously, and to assess its performance with respect to defined criteria?

1. How can the study case mission be modelled for FDI purposes?
   (a) How are the spacecraft dynamics and kinematics modelled?
   (b) How are AOCS sensors, actuators, and faults modelled?
   (c) Are there any uncertain and/or unmodelled elements?
2. What methodology can be used to evaluate the performance of the model-based FDI system?
   (a) Which are the suitable performance criteria?
   (b) How can different scenarios of the study case mission be simulated?
   (c) How can the results be evaluated and analysed?
3. Which FDI system can be proposed as a solution for a spacecraft’s AOCS?
   (a) Which techniques can be used for the residual generation?
   (b) Which residual evaluation techniques can be used for fault detection?
   (c) Which residual evaluation techniques can be employed to achieve fault isolation?
4. How does the proposed FDI system perform with respect to the defined criteria in the study case mission?

2 Theoretical Background

This research project aims to develop and assess a model-based FDI strategy for an agile spacecraft’s AOCS based on the Athena mission and spacecraft. Therefore, this chapter presents an overview of the Athena mission and its relevant characteristics in Section 2.1, and an overview of fault detection and isolation, covering different approaches with a focus on the model-based one, in Section 2.2. Finally, the theoretical background of the different spacecraft attitude representations is presented in Section 2.3.

2.1. Athena Mission Description

As mentioned in the introduction, the study case used in this research project is based on the Athena mission. Its main characteristics, such as the space environment, the spacecraft characteristics, and the AOCS equipment, are presented in this section.

2.1.1. Space Environment

The orbit of the spacecraft is a large-amplitude Halo orbit around the second Lagrange point (L2) of the Sun-Earth system (see an example in Figure 2.1), with a semi-major axis amplitude of about 700,000 km and a period of approximately 180 days. It provides a stable thermal environment, high observing efficiency, and good instantaneous sky visibility [4]. In such an orbit, the Earth’s magnetic field and atmosphere have no effect on the spacecraft. However, the spacecraft will be subject to the ambient plasma and ionising radiation environments due to both the solar wind and the geomagnetic tail; to the full spectrum of electromagnetic energy produced by the Sun, which can be approximated by the output of a blackbody at 5777 K with an integrated power of 1367 W/m² at 1 astronomical unit; and to bombardment by meteoroids. Artificial space debris, however, should not pose a collision hazard for many years [17].
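As a rough cross-check of the irradiance figure quoted above, the following minimal sketch applies the Stefan-Boltzmann law to a 5777 K blackbody and dilutes the surface flux with the inverse-square law out to 1 astronomical unit. The solar radius is an assumed nominal value that does not appear in the text, and the function name is purely illustrative.

```python
# Stefan-Boltzmann constant in W m^-2 K^-4
SIGMA = 5.670374419e-8
# Blackbody temperature quoted in the text, in K
T_SUN = 5777.0
# Nominal solar radius in m (assumed value, not given in the text)
R_SUN = 6.957e8
# One astronomical unit in m
AU = 1.495978707e11


def flux_at_distance(temperature, radius, distance):
    """Blackbody surface flux diluted by the inverse-square law."""
    surface_flux = SIGMA * temperature ** 4
    return surface_flux * (radius / distance) ** 2


flux_1au = flux_at_distance(T_SUN, R_SUN, AU)
print(f"Irradiance at 1 AU: {flux_1au:.0f} W/m^2")
```

Under these assumptions the result lands within a few W/m² of the quoted 1367 W/m², which supports reading the figure as a power per unit area.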

2.1.2. Spacecraft Characteristics

The design of the Athena spacecraft aims to boost the scientific performance while reducing costs and minimising development risk. The principal design objectives are derived from the performance requirements of the telescope and the scientific instruments: more precisely, accommodating the longest possible length without the complication of deployable structures, and the need to illuminate two focal plane instruments via two separate telescopes. The configuration of the spacecraft allows it to be launched on the Ariane 5 rocket.

The considered mass is about 4150 kg (wet mass with launch vehicle adaptor), with a maximum height of 14 m. Figure 2.2 shows a conceptual design of the Athena spacecraft. The attitude is 3-axis stabilised by reaction wheels, and thruster actuation is used for off-loading wheel momentum.


Figure 2.1: L2 Earth-Sun Halo Orbit. Source: NASA/WMAP Science Team.

Figure 2.2: ESA design: Athena spacecraft in deployed configuration. Source: [3].

2.1.3. AOCS Equipment

The preliminary Athena AOCS concept has an actuator suite of 4 large-capacity reaction wheels (68 Nms) and a set of 20 N thrusters. The propellant mass is of the order of 400 kg. The wheels provide the fast re-pointing capability (up to 1 deg/min around the X and Y axes) and counteract the torque induced by solar pressure. The 20 N thrusters are used for wheel off-loading, orbit maintenance, safe-mode recovery, and launcher dispersion correction. The AOCS sensors include a fully redundant, high-precision, single-optical-head star tracker, a redundant coarse star tracker, and a redundant rate sensor [3].

A reaction wheel consists primarily of a flywheel (a rotating mass), normally supported by ball bearings and driven by a DC motor. The operating principle of an RW on a spacecraft is the conservation of angular momentum. Angular momentum is delivered to the spacecraft by maintaining the flywheel at a given rotational speed, and a torque is generated by varying that speed. This reaction torque provides manoeuvring and/or pointing capability to the spacecraft.
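The momentum-exchange principle can be illustrated with a single-axis sketch: the torque felt by the spacecraft body is the opposite of the rate of change of the flywheel’s angular momentum. The inertia and speed values below are hypothetical and are not Athena parameters.

```python
def wheel_reaction_torque(inertia, speed_before, speed_after, dt):
    """Mean torque on the spacecraft body about the wheel axis, in N m.

    By conservation of angular momentum, the body receives the opposite
    of the change in flywheel momentum (inertia * speed) over dt.
    """
    wheel_momentum_change = inertia * (speed_after - speed_before)
    return -wheel_momentum_change / dt


# Hypothetical flywheel inertia in kg m^2 (illustrative only)
J_WHEEL = 0.1

# Momentum stored at a constant speed of 300 rad/s, in N m s
stored_momentum = J_WHEEL * 300.0

# Accelerating the wheel from 300 to 310 rad/s over 5 s torques the body
# in the opposite direction
torque_on_body = wheel_reaction_torque(J_WHEEL, 300.0, 310.0, 5.0)
print(stored_momentum, torque_on_body)
```

At constant speed the function returns zero torque, which matches the distinction made above between storing momentum and generating torque.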

The thrusters provide attitude and orbit control to the spacecraft by producing forces that generate directional and rotational manoeuvring capabilities. The distribution and orientation of the different thrusters, together with the thrust magnitude, determine the aforementioned capabilities. Thrusters can be of various types, such as cold gas thrusters, rocket engines (chemical reaction using propellants), ion thrusters, etc. The type of thruster is driven by the required capabilities.

A star tracker is used to determine the attitude of the spacecraft. It is composed of a pixel detector or camera that measures the positions of stars. Because the positions of many stars have already been determined with high accuracy, a star tracker can determine the orientation of the spacecraft relative to the stars. This is done by measuring the apparent positions of the stars in the reference frame of the spacecraft, so that the positions of the identified stars can be compared with their known absolute positions from a star catalogue.

A rate sensor is used to measure the angular speed of the spacecraft without the need for integration in conditioning electronics. Various technologies are used for rate sensors, such as electro-mechanical vibrating structure gyroscopes, ring laser gyroscopes, fibre optic gyroscopes, etc.

Both the sensors and the actuators are subject to noise. For actuators, the noise can be considered a disturbance that causes the actuator to produce an effect different from the desired one. For example, in thrusters, noise can be produced by fluctuations in the thrust due to variations in the gas pressure, and for RWs, the delivered torque can vary due to actuation noise such as rotor imbalance and ball bearing vibrations. Sensor noise, on the other hand, is in general an unknown, unwanted, and random modification of the sensor signal. An ideal sensor would not have any type of noise but, as shown in [30], any measurement device must add noise to the measurement process in order to avoid violating the uncertainty principle. Many different types of noise (see [58, Chapter 2]) can affect a sensor, such as additive noise (white noise, pink noise, black noise, etc.), multiplicative noise, thermal noise, photon noise, etc.

2.2. Fault Detection and Isolation

In Section 1.1, the importance of the FDI system for an agile spacecraft’s AOCS was discussed with respect to the safety of the instruments and the correct performance of the mission. Fault detection can be defined as the determination of the presence of a fault and its time of occurrence, and fault isolation, which follows detection, as the determination of the type and location of the fault.

When FDI is performed on the ground, its performance can be significantly degraded for three reasons: the increasing amount of telemetry that ground control must handle, a response to faults that might not be fast enough for agile spacecraft, and the time delay between telemetry and commands due to the distance between the Earth and the spacecraft in deep-space or interplanetary missions.

The current computing power of onboard spacecraft computers (from 25000 instructions per second in the Viking Orbiter, 1976, to more than 266 million instructions per second in the Morpheus Lander, 2012) allows the FDI functionality to be transferred from the ground segment to the spacecraft. This increases the spacecraft’s autonomy, availability, safety, and reliability by reducing the reaction time and the need for human operator intervention. The onboard FDI system must not only be capable of detecting and isolating faults, but must also meet other requirements such as a low false alarm probability and fast fault detection and isolation.

The FDI system design can be approached in different ways. Three of the most widely used approaches are expert systems, data-driven techniques, and model-based reasoning [42].

Figure 2.3: Hybrid expert based architecture. Source: [16]

2.2.1. FDI Architectures

Expert System Architecture

The expert system architecture [10, 27] uses knowledge about the system’s behaviour in a form that can be used for reasoning; therefore, a deep understanding of the physical properties of the system is not required. Two examples are the case-based and the rule-based architectures. The first uses the knowledge obtained in the past from other specific problems (cases) in order to solve current problem situations. This method is an incremental and sustained learning approach, since each newly solved problem is stored as a new case, making it available for future problems. Its main disadvantages are the need for many experiences from previously diagnosed cases and the inability to diagnose new cases. The second example, the rule-based architecture, stores the knowledge of the system’s behaviour in if-then rules without storing past cases. A set of rules is established that tries to describe all the symptoms of all possible faults. The advantages are that a deep understanding of the physical properties of the system is not required and that FDI can be achieved with simple rules. However, the disadvantages are that it must be developed from scratch for each case, and that diagnosis is not possible if the fault or its effect has not been foreseen. One example of an expert-based architecture that includes both the case-based and the rule-based architectures to perform FDI is presented in Figure 2.3.

Model-Based Architecture

Model-based architectures [26, 29, 64] make use of a model of the system’s behaviour and configuration. That model can be based on the physical characteristics of the system, or it can use other representations such as hierarchical declarative models. The approach is based on the analytical redundancy that exists between the model and the real system, and on the generation of residuals. Figure 2.4 presents a simplified architecture of model-based FDI. The advantages of model-based architectures are that residual signals are easy to obtain by comparing the behaviour and outputs of the real system with those of the model; that cost is saved by replacing hardware redundancy with analytical redundancy; that the diagnosis performance can be very high when precise models of the system and certain techniques, such as structured residual isolation, are used; that some methods are robust against disturbances and uncertainties; and that model libraries (mainly from control theory) can be re-used. The disadvantages are that models of the system must be created or adapted from existing ones, and that the achievable precision and accuracy depend on the detail and correctness of the model.

[Block diagram: the inputs and outputs of the system feed the residual generation; the residuals feed the evaluation of the residuals, which outputs the fault information.]

Figure 2.4: Simplified model-based FDI architecture.

Data-Driven Architecture

Data-driven architectures [12, 49] assume that abnormal events in the system are indicated by the statistical characteristics of the observed data. These approaches work directly with the data rather than with the system’s model. This is an advantage in terms of complexity and when the system’s model is not known, but it means that not all available information is used when the model is known. There are two main steps: a learning step, where data is fed to the learning algorithm, and a diagnosis step, where the trained algorithm is used for diagnosis. These types of approaches are regarded as a classical approach to fault detection. Although they are presented here as a group separate from the model-based approaches, these algorithms can also be applied to the raw residual data from model-based approaches, instead of a simple threshold, to enhance the diagnosis performance. Some examples are neural networks [19], ensemble learning [46], and the Generalized Likelihood Ratio (GLR) test [65].

The advantages of data-driven architectures are that a deep understanding of the physical properties of the system is not required, only data from the operation of the system, and that they can be applied to any application and any data set. The disadvantages are that the results are very sensitive to the data used, meaning high probabilities of false alarms due to disturbances and uncertainties, and that a huge amount of data is required from various diagnosis applications.

As presented in [11], the model-based approach has shown very good FDI performance for dynamic systems. It overcomes the principal disadvantages of the other approaches. For example, data-driven approaches suffer from false alarms due to disturbances and uncertainties and from the difficulty of isolating the fault, both of which the model-based approach overcomes by using methods that provide robustness and isolation strategies. It also avoids the main drawback of hardware redundancy approaches, namely the increase in weight and cost, by using analytical redundancy. As already mentioned, current spacecraft onboard computers are powerful enough to cope with the high computational effort required. In addition, very accurate and precise models of the spacecraft, its equipment, the external environment, and more are already available for these methods.

2.2.2. Model-Based FDI

Model-based FDI architectures can be diverse and use different approaches. The FDI of AOCS actuator and sensor faults has been of great interest in recent decades, and most of the recent work on the topic follows a similar methodology. The typical approach combines residual generation and residual evaluation in order to decide whether faults have occurred and to identify their location and type. The residual generation feeds a model of the system with the inputs to the actuators and the outputs measured by the sensors in order to predict the behaviour of the system (or a part of it) and compare the actual behaviour with the predicted one. This produces the residuals, which can be considered quantitative indicators of the presence of faults. Ideally, the residuals are zero when the system behaves normally and diverge from zero when a fault occurs. In practice, however, residuals are never exactly zero due to noise, disturbances, and uncertainties. Therefore, the residuals must be evaluated to decide whether they are small or not, usually using thresholds or statistical hypothesis tests. A decision logic then uses the resulting decision functions to achieve fault isolation.

The main advantages of this approach are that no additional hardware is required, that the measurements and algorithms necessary to control the system are, in many cases, also sufficient for the FDI algorithms, and that the FDI system can be made robust against disturbances and uncertainties using different techniques. The residuals are evaluated with different strategies and techniques to perform fault detection and isolation.

Figure 2.5 shows a general structure of the model-based architecture. The system is affected by faults and external disturbances, and the sensor output signals are affected by noise. The model of the dynamic system is fed with the inputs to the actuators and the outputs of the sensors. The residuals are then generated, which can be achieved using different methods. After that, the change detection block identifies any change in the generated residuals (information about the fault-free system behaviour can optionally be used). A decision-making algorithm decides whether the change is large enough and, if so, declares a fault. Finally, a decision logic uses the information from the residuals and the decision making to isolate the fault.

Residuals Generation

The generation of residuals can be approached in different ways; see some examples in [23], [40], and [43]. Three of the most common techniques are based on observers/filters, on parity relations, and on parameter estimation.

An observer is an algorithm that estimates the system’s states based on the system’s model, inputs, and outputs, while a filter additionally handles noise properly, which makes it preferable to a plain observer-based estimator. The estimation error, computed as the difference between the estimated value and the measured value, is then used as the residual. When a fault occurs, the measured state usually diverges from the estimated one very quickly and, with a proper decision function, the time between the occurrence of the fault and its detection is usually very short. In addition, the time to detection tends to be roughly inversely proportional to the fault’s magnitude. Models of AOCS sensors and actuators are very precise and reliable. Moreover, applying filters to sensor signals, as well as to actuators, allows signal noise and other uncertainties to be handled.

The advantages of filter-based approaches are: a fast reaction to occurring faults; suitability for sensor and actuator FDI; a systematic and simple design procedure; simple implementation of the algorithms; the ability to handle noise with known or unknown statistical properties; non-linear filters that are direct and accurate; and robustness that can be achieved by many mature techniques, see for instance [11, 54]. On the other hand, the disadvantages are that non-linear filters are only applicable to particular cases of non-linearities and are more complex, and that a reasonable and accurate model of the system is required a priori.

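[Block diagram: the inputs pass through the actuators, the system, and the sensors to produce the outputs, with faults, external disturbances, and noises acting on them; the model of the system feeds the generation of the residuals, followed by change detection (optionally using the fault-free behaviour), decision making, and logic decision, which outputs the fault information.]

Figure 2.5: General model-based FDI architecture.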

The estimation of the states can be done using many different techniques. Some of the most promising ones are Kalman Filters (KF), unknown input observers, sliding mode filters, diagnostics with direct eigenstructure assignment, and $H_\infty$-based filters.
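The filter-based residual generation described in this subsection can be sketched with a scalar Kalman filter whose innovation (measurement minus prediction) serves as the residual. This is a minimal single-axis illustration under assumed values: the inertia, torque, noise levels, the 50 % loss-of-effectiveness fault, and the 5-sigma threshold are all hypothetical and are not taken from the thesis.

```python
import random


def run_kalman_residuals(n_steps=200, fault_step=100, seed=0):
    """Return normalised innovations of a scalar Kalman filter.

    The true plant loses half of its actuator effectiveness from
    fault_step onwards, while the filter keeps using the nominal model.
    """
    rng = random.Random(seed)
    dt, inertia, torque_cmd = 0.1, 10.0, 1.0   # hypothetical values
    sigma_meas = 1e-3                          # rate sensor noise, rad/s
    q_proc, r_meas = 1e-8, sigma_meas ** 2     # filter tuning (assumed)
    rate_true = 0.0
    rate_est, p_est = 0.0, 1e-6
    normalised = []
    for k in range(n_steps):
        # True plant: the fault scales the torque actually delivered
        effectiveness = 0.5 if k >= fault_step else 1.0
        rate_true += dt * effectiveness * torque_cmd / inertia
        measurement = rate_true + rng.gauss(0.0, sigma_meas)
        # Prediction with the nominal (fault-free) model
        rate_pred = rate_est + dt * torque_cmd / inertia
        p_pred = p_est + q_proc
        # Residual = innovation, normalised by its predicted std deviation
        residual = measurement - rate_pred
        s_innov = p_pred + r_meas
        normalised.append(residual / s_innov ** 0.5)
        # Measurement update
        gain = p_pred / s_innov
        rate_est = rate_pred + gain * residual
        p_est = (1.0 - gain) * p_pred
    return normalised


THRESHOLD = 5.0  # hypothetical 5-sigma detection threshold
residuals = run_kalman_residuals()
first_alarm = next(
    (k for k, r in enumerate(residuals) if abs(r) > THRESHOLD), None
)
print("first alarm at step", first_alarm)
```

Before the fault, the normalised residual behaves like unit-variance noise; after it, the mismatch between the nominal model and the faulty plant accumulates and the residual crosses the threshold within a few steps.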

In parity relations, the residual is generated from the difference between the output of the real system and the output of a model of the system that runs in parallel. In fault-free operation, this difference (the residual) is approximately zero. If a fault occurs in the real system, its output no longer matches that of the model, and the residual therefore deviates from zero. Parity equations represent the residual and can be derived from the input-output model equations or from the state-space model matrices.

The advantages of parity relation approaches are a fast reaction to occurring faults, suitability for sensor and actuator FDI (as with the other model-based approaches), a methodical and straightforward design procedure, simple implementation of the algorithms, and the possibility of adding a filter to the residual to reduce noise. However, if the model of the system is not precise and includes many uncertainties and unknown parameters, the performance of the approach is severely degraded. In addition, non-linear models must be linearised, which can be very difficult for highly non-linear models, and noise statistics are not easy to incorporate into the design.
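A parity equation derived from an input-output model can be sketched for a first-order discrete-time system. The model coefficients and the additive sensor bias used as the fault are hypothetical, chosen only so that the residual pattern is easy to follow.

```python
# Hypothetical first-order model: y[k] = A * y[k-1] + B * u[k-1]
A, B = 0.9, 0.1


def parity_residuals(inputs, outputs):
    """Parity residual r[k] = y[k] - A*y[k-1] - B*u[k-1] for k >= 1."""
    return [
        y1 - A * y0 - B * u0
        for u0, y0, y1 in zip(inputs, outputs, outputs[1:])
    ]


def simulate(n_steps, bias_from=None, bias=0.5):
    """Simulate the plant; optionally add a sensor bias from bias_from on."""
    inputs = [1.0] * n_steps
    state, outputs = 0.0, []
    for k in range(n_steps):
        faulty = bias_from is not None and k >= bias_from
        outputs.append(state + (bias if faulty else 0.0))
        state = A * state + B * inputs[k]
    return inputs, outputs


healthy = parity_residuals(*simulate(40))
faulty = parity_residuals(*simulate(40, bias_from=20))
# List index j holds the residual of sample k = j + 1
print(max(abs(r) for r in healthy), faulty[19], faulty[30])
```

In the fault-free run the residual stays at rounding-error level. With the bias, it jumps by the full bias when the fault appears and then settles at bias * (1 - A), so the fault remains visible in the residual.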

Parameter estimation approaches are based on the assumption that occurring faults affect the physical parameters of the system (e.g., friction, mass, inductance, capacitance, etc.). Parameter estimation methods are used online to estimate the system’s parameters, which are compared with reference parameters (obtained under fault-free conditions) to generate residuals. Their advantages are very simple fault detection and easy handling of noise. On the other hand, fault isolation is not straightforward; physical parameters do not respond only to models; the reaction to occurring faults is slow; they are complicated to apply to sensor and actuator FDI, because the faults must have an impact on the monitored parameters, which might not happen; they require a large amount of computation; and their robustness, adaptability, and self-learning depend on the parameter estimation method used.
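As a sketch of parameter-estimation-based residual generation, the example below estimates a viscous friction coefficient of a wheel by least squares from speed samples and compares it with a fault-free reference value. The wheel model J dω/dt = τ − bω, all numerical values, and the function names are assumptions made for illustration; noise is omitted to keep the example short.

```python
DT, INERTIA, TORQUE = 0.01, 0.05, 0.02   # hypothetical wheel values
B_NOMINAL = 1e-3                          # fault-free friction reference


def simulate_wheel(friction, n_steps=500):
    """Euler simulation of J * dw/dt = torque - friction * w."""
    speeds = [10.0]
    for _ in range(n_steps):
        accel = (TORQUE - friction * speeds[-1]) / INERTIA
        speeds.append(speeds[-1] + DT * accel)
    return speeds


def estimate_friction(speeds):
    """Least-squares fit of b in (torque - J * dw/dt) = b * w."""
    num = den = 0.0
    for w0, w1 in zip(speeds, speeds[1:]):
        friction_torque = TORQUE - INERTIA * (w1 - w0) / DT
        num += w0 * friction_torque
        den += w0 * w0
    return num / den


b_healthy = estimate_friction(simulate_wheel(B_NOMINAL))
b_faulty = estimate_friction(simulate_wheel(3.0 * B_NOMINAL))
# Residual: deviation of the estimated parameter from its reference
residual_healthy = b_healthy - B_NOMINAL
residual_faulty = b_faulty - B_NOMINAL
print(residual_healthy, residual_faulty)
```

The residual is close to zero for the nominal wheel and equals the friction increase for the faulty one. In practice the estimate needs many samples to become reliable in the presence of noise, which is consistent with the slow reaction noted above.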

Regarding the purpose of this research project, AOCS actuator and sensor FDI, it is clear that parameter-estimation-based residuals are not suitable. Given that noise will be present in the system and that an agile spacecraft is considered, which implies strong non-linearities, filter-based techniques are chosen over parity relation techniques. Among the different filter-based techniques used to estimate the system’s states, the Kalman filter and its derivatives, such as the Extended Kalman Filter (EKF), appear to be strong candidates for residual generation.

Once the residuals are obtained, they must be evaluated in order to detect and isolate the faults.

Fault Detection

The evaluation of residual signals that are affected by noise and uncertainties should not be done with deterministic approaches such as the 2-norm function [48], but with stochastic approaches such as the Generalized Likelihood Ratio (GLR) algorithm or the cumulative sum (CUSUM) algorithm [6]. The advantage of statistical methods is that the probabilities of false alarm and of non-detection can be evaluated. Moreover, the behaviour of a fault does not need to be known in order to detect it, so new faults not considered a priori can also be detected. The drawbacks of these methods are that the false alarm rate increases as the non-detection rate and the detection delay are reduced, that a fault cannot be detected if no residual is sensitive to it, and that estimating quantities such as the standard deviation or the mean demands a large amount of data.
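A common textbook form of the CUSUM test can be sketched as follows. It is a two-sided recursion with a drift (allowance) term and an alarm threshold; both tuning values and the synthetic residual sequence are hypothetical, and this is not necessarily the exact variant used in [6] or in this thesis.

```python
def cusum_alarm(residuals, drift, threshold):
    """Two-sided CUSUM: return the first alarm index, or None.

    g_pos accumulates positive deviations and g_neg negative ones;
    the drift term keeps both at zero while the residual stays small.
    """
    g_pos = g_neg = 0.0
    for k, r in enumerate(residuals):
        g_pos = max(0.0, g_pos + r - drift)
        g_neg = max(0.0, g_neg - r - drift)
        if g_pos > threshold or g_neg > threshold:
            return k
    return None


# Synthetic residual: zero mean, then a mean shift of 1.0 from sample 50
shifted = [0.0] * 50 + [1.0] * 50
alarm_index = cusum_alarm(shifted, drift=0.5, threshold=4.0)
no_alarm = cusum_alarm([0.2, -0.2] * 50, drift=0.5, threshold=4.0)
print(alarm_index, no_alarm)
```

With these values the decision function grows by 0.5 per sample after the shift, so the alarm is raised at sample 58, the ninth shifted sample. Lowering the threshold or the drift shortens this delay but makes alarms on healthy data more likely, which is the trade-off described above.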

Fault Isolation

It is assumed that only one fault occurs at a time. After a fault is detected, it is usually required to identify its location and sometimes its type, so fault isolation is the step that follows fault detection. The methods used to isolate faults are strongly related to the residual generation method used, and sometimes to the fault detection method. Fault isolation can be achieved if the generated residuals are fault-selective in addition to fault-sensitive. The residual generator must produce a complete set of residuals, instead of just one, that respond uniquely to each plausible fault, so that the residuals can be used for both fault detection and fault isolation. This is possible if the residuals are manipulated in some enhanced way. The most common schemes are structured residuals and fixed directional residuals.

For structured residuals, only certain elements of the residual vector become non-zero when a certain fault occurs. In other words, each residual is sensitive to certain faults but insensitive to the rest. The main advantage of this type of residual generator is that the fault isolation reasoning is based only on which residuals are non-zero. The fault detection techniques can be applied independently to each of the residuals, and the resulting set of non-zero residuals is mapped to a fault case. The fault signatures, defined as “unique sets of non-zero residuals for every single fault”, can be obtained, for example, from structural analysis techniques, see [20]. Such fault signatures are normally arranged in a matrix whose columns and rows are fault codes and residuals, respectively.

Two main approaches, among others, are possible, see [13]. In the dedicated residual set scheme, each residual is designed to be a function of just one fault; this is very difficult to achieve in practice and, even when it is possible, there is normally not enough design freedom left to meet robustness against modelling errors. In the generalised residual set scheme, each residual is made sensitive to all faults but one. The latter is more commonly used and is generally better than the dedicated one. It works by following a simple logic: if the $i$-th residual is below its threshold and the rest are above theirs, the $i$-th fault has occurred (recall that the $i$-th residual is sensitive to all faults except the $i$-th).
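The isolation logic of the scheme in which each residual is sensitive to all faults but one can be sketched in a few lines. A single common threshold and the example residual values are assumptions made for illustration.

```python
def isolate_fault(residuals, threshold):
    """Return the index of the isolated fault, or None.

    Residual i is assumed insensitive to fault i only, so fault i is
    declared when residual i is the single one that stays below the
    threshold while all the others exceed it.
    """
    quiet = [i for i, r in enumerate(residuals) if abs(r) <= threshold]
    if len(quiet) == 1 and len(residuals) > 1:
        return quiet[0]
    # All quiet (no fault) or an ambiguous pattern: no isolation decision
    return None


# Hypothetical residual magnitudes for a three-fault signature set
isolated = isolate_fault([0.9, 0.05, 1.2], threshold=0.3)
nothing_fired = isolate_fault([0.01, 0.02, 0.0], threshold=0.3)
ambiguous = isolate_fault([0.9, 0.9, 0.9], threshold=0.3)
print(isolated, nothing_fired, ambiguous)
```

The first pattern isolates the second fault (index 1). The last pattern, in which every residual fires, matches no single-fault signature and is left undecided, which reflects the single-fault assumption stated at the start of this subsection.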

The fixed directional residuals, see [25], are generated within a geometric framework. In the residual space, the residual vector is designed so that, in response to a particular fault, it lies along a fixed direction specific to that fault. A fault can then be isolated by determining which fault signature direction is closest to the generated residual vector. For reliable fault isolation, each fault signature must be unique and related to just one of the faults. Compared to the structured schemes, this approach is easy to implement and provides better isolation performance under ideal conditions, but it is very hard to make it robust against uncertainties and system disturbances.
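Isolation by closest signature direction can be sketched using the cosine of the angle between the residual vector and each fault signature. The fault names and signature directions below are hypothetical.

```python
import math

# Hypothetical fault signature directions in a 3-D residual space
SIGNATURES = {
    "wheel_1": (1.0, 0.0, 0.0),
    "wheel_2": (0.0, 1.0, 0.0),
    "thruster_A": (1.0, 1.0, 1.0),
}


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def isolate_by_direction(residual):
    """Pick the fault whose signature direction is closest in angle."""
    return max(SIGNATURES, key=lambda name: cosine(residual, SIGNATURES[name]))


first = isolate_by_direction((0.9, 0.1, -0.05))
second = isolate_by_direction((2.0, 2.2, 1.9))
print(first, second)
```

Only the direction of the residual matters here, not its magnitude, which is why disturbances that rotate the residual vector make this scheme hard to robustify.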

2.3. Spacecraft Attitude Representation

The attitude of an object can be represented using different parametrizations. Parametrizations currently in use include vectors (of three or four elements), as well as matrices ranging from 2 × 2 to 4 × 4 in dimension. The survey of attitude representations in [51] presents up to twelve different representations, each with different advantages for certain applications. For spacecraft attitude representation, some of the most commonly used ones are the direction cosine matrix, Euler angles, quaternions, and the Gibbs vector.

Direction Cosine Matrix

The direction cosine matrix, or rotation matrix, is a 3 × 3 matrix that relates the representation of a vector in a reference frame ($\mathbf{v}_r$) to the representation of the same vector in the spacecraft body frame ($\mathbf{v}_b$) as $\mathbf{v}_b = \mathbf{R}\mathbf{v}_r$, where $\mathbf{R}$ is the rotation matrix.

The rotation matrix is orthogonal, meaning that it preserves lengths and angles. Since only three of its nine elements are independent, it has six redundant components. It is the most fundamental parametrization of an attitude and is a unique representation of the relation between two coordinate frames, but it is very inefficient (six redundant parameters) and the six orthogonality constraints are cumbersome to maintain.
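The properties above can be checked numerically with a small sketch. It builds the direction cosine matrix for a body frame rotated about the z-axis, maps a reference-frame vector into the body frame, and verifies orthogonality. The frame-rotation (passive) sign convention and the helper names are assumptions made for this illustration.

```python
import math


def rotation_about_z(angle):
    """DCM mapping reference-frame components to a body frame that is
    rotated by `angle` about the common z-axis (assumed convention)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    return [list(col) for col in zip(*m)]


R = rotation_about_z(math.radians(90.0))
v_ref = [1.0, 0.0, 0.0]
v_body = mat_vec(R, v_ref)
# Orthogonality: R * R^T must equal the identity matrix
gram = mat_mul(R, transpose(R))
print(v_body, gram)
```

Because R Rᵀ is symmetric, the condition R Rᵀ = I provides six independent equations on the nine elements, which is where the six constraints and the three remaining degrees of freedom come from.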

Euler Angles

Any rotation in space can be represented by three sequential rotations, for example rotations about the axes of a coordinate system. The three rotation angles can be arranged in a vector called the Euler angle vector ($\mathbf{A}$). There are two conventions for rotation sequences: extrinsic rotations, which are performed about the axes of the original coordinate frame, which remains motionless, and intrinsic rotations, which are performed about the axes of the rotating coordinate system, whose orientation changes after each rotation. Regardless of the rotation sequence convention (extrinsic or intrinsic), the Euler angles can be divided into the classic Euler angles and the Tait–Bryan angles.

The classic Euler angles are geometrically defined by the line of nodes, which is the intersection between the original X-Y plane and the final rotated X-Y plane. The three rotation angles of $\mathbf{A}$ are then defined as the angle between the original x-axis and the line of nodes, the angle between the original z-axis and the rotated z-axis, and the angle between the rotated x-axis and the line of nodes. They can also be defined by intrinsic or extrinsic rotations. There are six options to select the rotation axes for the classic Euler angles, and in all of them the first and third rotation axes are the same.

The Tait–Bryan angles are usually used in aerospace applications because the horizontal attitude is set by defining one of the angles (the elevation angle) equal to zero. Their definition is very similar to that of the classic Euler angles (geometrically, extrinsically, and intrinsically), with the difference that the Tait–Bryan angles represent rotations about three different axes, which modifies the definition of the line of nodes.

The Euler angles are commonly used because they have a clear physical interpretation. However, their kinematics and dynamics contain trigonometric functions, and they are not a unique representation unless the ranges of the Euler angles are defined, because at least two different sets of Euler angles can be defined between two coordinate frames. For example, a rotation of 90 degrees around one axis is equal to a rotation of -270 degrees around the same axis. Certain orientations can be represented by an infinite number of Euler angle sets, which causes a singularity.

Quaternion

The quaternion is a four-parameter representation that expresses the attitude matrix as a homogeneous quadratic function of the elements of the quaternion, requiring no trigonometric or other transcendental function evaluations. It has four elements (three considered the vector part and one considered the scalar part) and one constraint, the norm constraint, which makes it more efficient than the direction cosine matrix, which has nine elements and six constraints due to orthogonality. It presents no singularities or discontinuities. However, an attitude or rotation in three-dimensional Euclidean space requires only three parameters for full determination. Therefore, the quaternion, which is defined using four elements, has one redundant degree of freedom. It is defined as

$$\mathbf{q} = [q_1 \; q_2 \; q_3 \; q_4]^T \tag{2.1}$$

with $\|\mathbf{q}\| = 1$.
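As an illustration of the homogeneous quadratic relation mentioned above, the following Python sketch builds an attitude matrix from a unit quaternion. It is only a sketch under stated assumptions: the scalar part is taken to be the fourth element, one common sign convention for the attitude matrix is used, and the function name `attitude_matrix` is hypothetical. The convention should be checked against the one actually used in a given simulator.

```python
import math


def attitude_matrix(q):
    """Attitude matrix as a homogeneous quadratic function of q.

    Assumption: q = [q1, q2, q3, q4] with q4 the scalar part, and the
    convention A = (q4^2 - |qv|^2) I + 2 qv qv^T - 2 q4 [qv x].
    """
    q1, q2, q3, q4 = q
    return [
        [q1 * q1 - q2 * q2 - q3 * q3 + q4 * q4,
         2.0 * (q1 * q2 + q3 * q4),
         2.0 * (q1 * q3 - q2 * q4)],
        [2.0 * (q1 * q2 - q3 * q4),
         -q1 * q1 + q2 * q2 - q3 * q3 + q4 * q4,
         2.0 * (q2 * q3 + q1 * q4)],
        [2.0 * (q1 * q3 + q2 * q4),
         2.0 * (q2 * q3 - q1 * q4),
         -q1 * q1 - q2 * q2 + q3 * q3 + q4 * q4],
    ]


# Example: a 90 degree rotation about the z-axis (illustrative values).
half = math.radians(90.0) / 2.0
A_z90 = attitude_matrix([0.0, 0.0, math.sin(half), math.cos(half)])
```

Every entry is a sum of products of two quaternion elements, so no trigonometric evaluation is needed once the quaternion is known, and $\mathbf{q}$ and $-\mathbf{q}$ produce the same matrix.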

Gibbs Vector

The Gibbs vector representation of a rotation in three-dimensional space is a three-dimensional vector that is parallel to the axis of rotation and whose three components transform covariantly on change of coordinates. As shown in [39], $\mathbf{q}$ and $-\mathbf{q}$ map to the same Gibbs vector, so it provides a 1:1 mapping of rotations. In exchange, the Gibbs vector is infinite for a 180° rotation. Thus, it is not recommended for global attitude representation, but it provides an excellent representation of small rotations. The relation between the quaternion and the Gibbs vector is

$$\mathbf{g} = \frac{\mathbf{q}_{1:3}}{q_4} \tag{2.2}$$

and its inverse is

$$\mathbf{q} = \frac{1}{\sqrt{1 + \|\mathbf{g}\|^2}} \begin{bmatrix} \mathbf{g} \\ 1 \end{bmatrix}. \tag{2.3}$$

3 Study Case Description and Methodology

The model-based FDI strategy proposed here is designed for an agile spacecraft. To design it, the models of an agile spacecraft, its AOCS equipment, and the considered faults are first defined in Section 3.1. Then, the methodology followed to evaluate the performance of the FDI system is described in Section 3.2.

3.1. Definition of the Study Case

The Athena mission is a very good example of an agile spacecraft that uses multiple actuators simultaneously to perform large-angle attitude slews while respecting rigorous attitude constraints, such as keeping the instruments away from direct exposure to the Sun. Therefore, it is used as a reference to define the study case of this project. The relevant design parameters for the purposes of this research project depend on the mission environment, the spacecraft characteristics, the equipment (sensors and actuators), the faults considered, and the uncertainties that affect the system.

Therefore, the study case presented here is a generic one for an agile spacecraft that operates in an environment similar to that of the Athena mission, that is equipped with RWs, thrusters, star trackers, and rate measurement units, and that is affected by defined faults and uncertainties. The generality (or flexibility) of the design therefore applies only to the spacecraft characteristics and the AOCS configuration.

3.1.1. Environment

As in the Athena mission, the study case spacecraft is placed in a halo orbit around L2, the second Lagrange point of the Sun-Earth system. In this orbit, the main disturbance torque ($\mathbf{T}_d$) that affects the spacecraft is the solar radiation pressure, which is assumed to be constant and to act only on the spacecraft y-axis (this might not be realistic, but these were the premises given by the attitude controller developer, see Section 5.1.3). The simulated scenario comprises four (shortened) inertially-fixed observation phases connected by three attitude slews, see Figure 5.6, with a duration of approximately 9000 seconds.

The inertial reference frame of the system is defined assuming that, during an entire simulation, the position of the spacecraft with respect to the Sun and to the Earth does not change. Therefore, the origin of the fixed reference frame is placed at the centre of mass of the spacecraft, the x-axis points away from the Sun, the z-axis is perpendicular to the Sun-Earth-Spacecraft plane with the same direction as the z-axis of the Earth-centred inertial coordinate frame [31], and the y-axis is contained in the aforementioned plane, completing an orthonormal frame.


3.1.2. AOCS Equipment: Sensors and Actuators

In the frame of this project, it is assumed that the study case spacecraft is equipped with a sensor suite composed of two high-precision star trackers and one angular rate measurement unit. The actuator suite is composed of a set of thrusters and a set of RWs. Each RW is equipped with an internal rate measurement sensor (tachometer).

As explained in Section 2.1.3, all sensors and actuators are affected by noise. Of all the existing types of noise, only Gaussian zero-mean white noise, represented by $\boldsymbol{\eta}$, is considered here. A random vector is said to be a white noise vector if its components are statistically independent, with zero mean and finite variance. Statistical independence means that the covariance between components is zero, so the covariance matrix of a white noise vector of $N$ elements is a diagonal matrix of $N \times N$ dimensions whose $i$th diagonal element is the variance of the $i$th component of the vector. The correlation matrix is the $N \times N$ identity matrix. In addition, being Gaussian means that the white noise vector has a normal distribution with zero mean and the same variance $\sigma^2$.

White noise is defined in [58] as a random uncorrelated noise process with equal power at all frequencies. White noise is a theoretical concept, given that if the power is equal for all frequencies, the total power, obtained by integrating over all frequencies, would be infinite. Therefore, the concept of a band-limited white noise process, with a flat spectrum covering the frequency range of a band-limited system, is used because, from the point of view of the system, it is a white noise process. Like any system, the white noise affecting the equipment can be represented in discrete time or in continuous time. For example, sensors usually provide information in discrete time, while the actuator noise might be defined in continuous time if the system model is also formulated in continuous time.

The discrete-time zero-mean white noise vector ($\boldsymbol{\eta}_k$) is a random vector of $N$ elements that are independent and identically distributed, with zero mean $\boldsymbol{\mu} = \mathbf{0}$, variance $\boldsymbol{\sigma}^2$, and no serial correlation (i.e., $\mathrm{Corr}(\boldsymbol{\eta}_i, \boldsymbol{\eta}_j) = \mathbf{0}, \; \forall i \neq j$). The discrete process noise matrix ($\mathbf{Q}_k$) of a zero-mean Gaussian white noise vector is obtained as

$$\mathbf{Q}_k = \mathrm{E}[\boldsymbol{\eta}_k \boldsymbol{\eta}_k^T]. \tag{3.1}$$

The continuous-time zero-mean white noise ($\eta(t)$) is a continuous-time random signal whose value at any time $t$ is a random variable that is statistically independent of its entire history before $t$. A precise definition is not easy to obtain, but most authors define $\eta(t)$ indirectly through the following properties over an interval $[t, t + \tau]$:

$$\mathrm{E}[\eta(t)] = 0, \tag{3.2}$$

$$\mathrm{E}[|\eta(t)|] = 0, \tag{3.3}$$

$$r(\tau) = \mathrm{E}[\eta(t)\eta(t + \tau)] = \sigma^2 \delta(\tau). \tag{3.4}$$

Continuous-time white noise is described by its constant Power Spectral Density (PSD) value, which is related to the discrete-time variance (see Appendix A.2 for the derivation) as

$$S = \Delta t \, \sigma^2. \tag{3.5}$$

The relation between the process noise PSD matrix $\mathbf{Q}$ and the discrete process noise matrix $\mathbf{Q}_k$ can be derived using different methods, some of which are described in detail in [28], but an approximation can be made if $\tau \to 0$, so that

$$\mathbf{Q}_k \approx \mathbf{Q} \Delta t. \tag{3.6}$$
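The two conversions in (3.5) and (3.6) can be sketched in a few lines of Python. The function names and the numerical values below are hypothetical and serve only as an illustration; matrices are represented as nested lists to keep the sketch free of external dependencies.

```python
def psd_from_discrete_variance(sigma2, dt):
    """PSD of a noise source from its discrete-time variance, following (3.5)."""
    return dt * sigma2


def discrete_process_noise(q_psd, dt):
    """First-order approximation (3.6): scale the PSD matrix by the time step."""
    return [[element * dt for element in row] for row in q_psd]


# Example with illustrative values: a sensor sampled every 0.1 s with variance 4.0.
S = psd_from_discrete_variance(4.0, 0.1)

# Example with illustrative values: a 2 x 2 diagonal PSD matrix and a 0.5 s step.
Q_k = discrete_process_noise([[2.0, 0.0], [0.0, 4.0]], 0.5)
```

Note that (3.6) is only a small-time-step approximation; the more accurate methods cited in [28] are not reproduced here.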

Thruster Model

It is assumed that the spacecraft is equipped with a set of $N_{thr}$ thrusters. The total torque produced by the thrusters depends on the force, direction, and distance from the spacecraft center of mass (CoM) of each thruster. Defining $\mathcal{S}_{thr} \triangleq \{1, 2, \dots, N_{thr}\}$ as the set of all thruster indices, $\mathbf{d}_i \in \mathbb{R}^{3 \times 1}$ as the fixed direction of the $i$th thruster, $\mathbf{r}_i \in \mathbb{R}^{3 \times 1}$ as the position vector of the $i$th thruster in the body-fixed reference frame, and $F_i$ as the maximum thrust force of the $i$th thruster, the maximum directional torque of the $i$th thruster becomes

$$\mathbf{b}_{T,i} = \mathbf{r}_i \times \mathbf{b}_{F,i}, \quad i \in \mathcal{S}_{thr}, \tag{3.7}$$

where '×' denotes the cross product of two vectors and $\mathbf{b}_{F,i}$ is the directional force of the $i$th thruster, defined as

$$\mathbf{b}_{F,i} = -\boldsymbol{\theta}(\mathbf{d}_i, \epsilon_{thr,i})(F_i + \eta_{thr,i}), \tag{3.8}$$

where $\boldsymbol{\theta}(\cdot, \cdot)$ is a function that rotates the $i$th thruster directional vector ($\mathbf{d}_i$) for a given misalignment angle $\epsilon_{thr,i}$, and $\eta_{thr,i}$ is a scalar zero-mean Gaussian white noise that models variations in the effective thruster force.

Finally, the total torque about the CoM of the spacecraft generated by the thrusters is given by

$$\mathbf{T}_{thr} = \sum_{i \in \mathcal{S}_{thr}} \mathbf{b}_{T,i} \, u_i, \tag{3.9}$$

where $u_i$ is the commanded opening of the $i$th thruster.
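A minimal Python sketch of (3.7) to (3.9) is given below. It assumes perfectly aligned thrusters and no force noise, so the misalignment rotation in (3.8) reduces to the nominal direction vector; the function names and the numerical values are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-element vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def thruster_torque(positions, directions, max_forces, openings):
    """Total thruster torque about the CoM.

    Assumption: zero misalignment and zero force noise, so the directional
    force of each thruster is simply -d_i * F_i.
    """
    total = [0.0, 0.0, 0.0]
    for r, d, force, u in zip(positions, directions, max_forces, openings):
        b_force = [-force * component for component in d]   # (3.8), nominal case
        b_torque = cross(r, b_force)                         # (3.7)
        total = [t + u * b for t, b in zip(total, b_torque)] # (3.9)
    return total


# Example with illustrative values: one thruster at 1 m along x, pointing along y.
T_thr = thruster_torque([[1.0, 0.0, 0.0]], [[0.0, 1.0, 0.0]], [2.0], [0.5])
```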

Reaction Wheel Model

It is assumed that the spacecraft is equipped with a set of $N_{rw}$ reaction wheels. The total torque produced by the $i$th RW can be modelled as

$$c_i = u_{rw,i} + T_{f,i} + \eta_{rw,i}, \quad i \in \mathcal{S}_{rw}, \tag{3.10}$$

where $\mathcal{S}_{rw} \triangleq \{1, 2, \dots, N_{rw}\}$ is the set of all RW indices, $u_{rw,i}$ is the commanded control torque, $\eta_{rw,i}$ is a zero-mean Gaussian white noise introduced to model other torque effects caused, for instance, by variations of motor voltage frequency and DC coil resistance, and $T_{f,i}$ is the friction torque in the ball bearings of the wheel. Here, the friction torque is modelled as

$$T_{f,i} = -\zeta_{c,i} \tanh(\omega_{w,i}) - \zeta_{v,i} \, \mathrm{sign}(\omega_{w,i}) \, |\omega_{w,i}|, \tag{3.11}$$

where $\zeta_{c,i} > 0$ and $\zeta_{v,i} > 0$ are appropriate constants. In (3.11), the term associated with $\zeta_{c,i}$ models the Coulomb torque and the term associated with $\zeta_{v,i}$ models the viscous friction torque of the $i$th RW. The rotational dynamics of the $i$th RW satisfy

$$\dot{\omega}_{w,i} = J_{w,i}^{-1} c_i, \tag{3.12}$$

where $\omega_{w,i}$ is the angular speed and $J_{w,i}$ is the constant inertia of the $i$th RW.

Finally, the total torque about the CoM of the spacecraft generated by the RWs is given by

$$\mathbf{T}_{rw} = \sum_{i \in \mathcal{S}_{rw}} \boldsymbol{\theta}(\mathbf{m}_i, \epsilon_{rw,i}) \, c_i, \tag{3.13}$$

where $\boldsymbol{\theta}(\cdot, \cdot)$ is the same function as defined in (3.8), now rotating the $i$th RW directional vector $\mathbf{m}_i \in \mathbb{R}^{3 \times 1}$ for a given RW misalignment angle $\epsilon_{rw,i}$.

Sensor Model

For FDI purposes, two STRs, one RMU, and dedicated tachometers for each RW are considered. Both STRs and the RMU are assumed to be fault-free, because an effective and quick FDI is assumed to be in place for them.

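The sensor models for the $i$th STR, the RMU, and the RW tachometers are, respectively, defined as follows

$$\mathbf{q}_{m,i} = (\mathbf{q} \otimes \boldsymbol{\vartheta}(\epsilon_{str,i})) \otimes \boldsymbol{\varphi}(\boldsymbol{\eta}_{str,i}), \tag{3.14}$$

$$\boldsymbol{\omega}_m = \boldsymbol{\theta}(\boldsymbol{\omega}, \epsilon_{rmu}) + \boldsymbol{\eta}_{rmu}, \tag{3.15}$$

$$\boldsymbol{\omega}_{w,m} = \boldsymbol{\omega}_w + \boldsymbol{\eta}_{tach}, \tag{3.16}$$

where $\boldsymbol{\omega}_w \triangleq [\omega_{w,1} \; \dots \; \omega_{w,N_{rw}}]^T$ is a vector composed of all RW angular speeds; $\mathbf{q}_{m,i}$, $\boldsymbol{\omega}_m$, and $\boldsymbol{\omega}_{w,m}$ are the measurements; and $\boldsymbol{\eta}_{str,i}$, $\boldsymbol{\eta}_{rmu}$, and $\boldsymbol{\eta}_{tach}$ are the independent zero-mean Gaussian white-noise sequences affecting the measurements. In (3.14), $\otimes$ denotes quaternion multiplication, $\boldsymbol{\vartheta}(\cdot)$ is a function of the misalignment angle $\epsilon_{str,i}$, and $\boldsymbol{\varphi}(\cdot)$ is a function of the noise $\boldsymbol{\eta}_{str,i}$ defined in Euler angles. These functions are used to manipulate (rotate) the true quaternion $\mathbf{q}$ in order to mimic the STR misalignment and noise, respectively, while preserving the unit norm of the quaternion. In (3.15), $\boldsymbol{\theta}(\cdot, \cdot)$ rotates the measured spacecraft angular rate vector ($\boldsymbol{\omega}$) for a given misalignment angle $\epsilon_{rmu}$.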

3.1.3. Spacecraft’s Dynamics and Kinematics

The study case spacecraft is very similar to the concept developed for the Athena mission shown in Figure 3.1. The body-fixed frame is set such that the z-axis points in the same direction as the instruments, the y-axis is parallel to the solar panels, and the x-axis is perpendicular to both, completing an orthonormal reference frame. Recalling the definition of the inertial reference frame presented in Section 3.1.1, the body-fixed frame and the inertial reference frame are related by a rotation matrix defined using the Euler angles ($\mathbf{A}$).

Figure 3.1: ESA design: Athena spacecraft in deployed configuration with fixed-body frame. Source: [3].

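The spacecraft is treated as a rigid body. Its rotational dynamics about the CoM satisfy

$$\dot{\boldsymbol{\omega}} = \mathbf{J}^{-1} \left( \mathbf{T}_{thr} - \mathbf{T}_{rw} + \mathbf{T}_d - \boldsymbol{\omega} \times (\mathbf{J}\boldsymbol{\omega} - \mathbf{h}) \right), \tag{3.17}$$

where $\mathbf{T}_d \in \mathbb{R}^{3 \times 1}$ is the external disturbance torque, $\mathbf{J} \in \mathbb{R}^{3 \times 3}$ is the spacecraft inertia matrix, $\boldsymbol{\omega} \in \mathbb{R}^{3 \times 1}$ is the angular velocity of the spacecraft, and $\mathbf{h} \in \mathbb{R}^{3 \times 1}$ is the angular momentum vector associated with the RWs, i.e.,

$$\mathbf{h} = \sum_{i \in \mathcal{S}_{rw}} \boldsymbol{\theta}(\mathbf{m}_i, \epsilon_{rw,i}) \, J_{w,i} \, \omega_{w,i}. \tag{3.18}$$

The spacecraft rotational kinematics are parametrized using a unit quaternion, which represents the spacecraft attitude with respect to an inertial frame of reference. The kinematic expression is given by

$$\dot{\mathbf{q}} = \frac{1}{2} \mathbf{W}(\boldsymbol{\omega}) \mathbf{q}, \tag{3.19}$$

where $\mathbf{W}$ is defined as

$$\mathbf{W}(\boldsymbol{\omega}) = \begin{bmatrix} 0 & \omega_z & -\omega_y & \omega_x \\ -\omega_z & 0 & \omega_x & \omega_y \\ \omega_y & -\omega_x & 0 & \omega_z \\ -\omega_x & -\omega_y & -\omega_z & 0 \end{bmatrix}, \tag{3.20}$$

where $\omega_x$, $\omega_y$, and $\omega_z$ are the components of $\boldsymbol{\omega}$ and represent the spacecraft rotational rates around its body-fixed $X$, $Y$, and $Z$ axes, respectively.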
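The quaternion kinematics in (3.19) and (3.20) can be integrated numerically as sketched below in Python. The fourth-order Runge-Kutta scheme, the renormalization after each step, and the function names are choices made for this illustration only; they are not necessarily those of the simulator used in this project. The quaternion is assumed to be ordered with the scalar part last.

```python
import math


def w_matrix(omega):
    """4 x 4 matrix W(omega) of (3.20) for body rates (wx, wy, wz)."""
    wx, wy, wz = omega
    return [[0.0, wz, -wy, wx],
            [-wz, 0.0, wx, wy],
            [wy, -wx, 0.0, wz],
            [-wx, -wy, -wz, 0.0]]


def q_dot(q, omega):
    """Right-hand side of (3.19): 0.5 * W(omega) * q."""
    w = w_matrix(omega)
    return [0.5 * sum(w[i][j] * q[j] for j in range(4)) for i in range(4)]


def propagate(q, omega, dt, steps):
    """Integrate (3.19) for a constant rate with RK4, renormalizing each step."""
    for _ in range(steps):
        k1 = q_dot(q, omega)
        k2 = q_dot([a + 0.5 * dt * b for a, b in zip(q, k1)], omega)
        k3 = q_dot([a + 0.5 * dt * b for a, b in zip(q, k2)], omega)
        k4 = q_dot([a + dt * b for a, b in zip(q, k3)], omega)
        q = [a + dt / 6.0 * (b + 2.0 * c + 2.0 * d + e)
             for a, b, c, d, e in zip(q, k1, k2, k3, k4)]
        norm = math.sqrt(sum(a * a for a in q))
        q = [a / norm for a in q]
    return q


# Example with illustrative values: 0.1 rad/s about the body z-axis for 10 s.
q_end = propagate([0.0, 0.0, 0.0, 1.0], (0.0, 0.0, 0.1), 0.1, 100)
```

For a constant rate about a single axis, the result can be compared with the closed-form solution, in which the scalar part equals the cosine of half the accumulated rotation angle.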

3.1.4. Faults The faults considered here for thrusters are: leakage, stuck-open, and stuck-close of a sin- gle thruster, and loss of effectiveness (LOE) of all thrusters simultaneously. For RWs, an increment of the internal friction and an increment of the measured rate by the tachometer sensor with respect to the real rate are considered.

Faults can be modelled in a multiplicative or additive way [24]. In this thesis, both thruster and RW faults are modelled in a multiplicative manner.

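Thruster faults are modelled as [21]

$$\mathbf{u}_f = (\mathbf{I}_{N_{thr} \times N_{thr}} - \boldsymbol{\Phi}) \mathbf{u}, \tag{3.21}$$

where $\boldsymbol{\Phi} \triangleq \mathrm{diag}(\phi_1, \phi_2, \dots, \phi_{N_{thr}})$, $\mathbf{u} \triangleq [u_1 \; u_2 \; \dots \; u_{N_{thr}}]^T$, and the index $f$ denotes the faulty case. The scalar variable $\phi_i$ models the fault for the $i$th thruster, i.e.,

$$\phi_i = \begin{cases} 0, & \text{if fault free} \\ 1 - \chi_i / u_i, & \text{if faulty} \end{cases}$$

The function $\chi_i$ allows different thruster faults to be modelled as follows

$$\chi_i(t) = \begin{cases} 1, & \text{stuck-open} \\ 0, & \text{stuck-closed} \\ \max\{m_{leak,i}, u_i\}, & \text{propellant leakage} \end{cases}$$

where $m_{leak,i}$ is the $i$th thruster leakage magnitude.

The loss of effectiveness fault represents a decrease in the propellant supply pressure that feeds all the thrusters; thus, a LOE fault affects all thrusters simultaneously, i.e.,

$$\phi_i \equiv m_{loe}, \quad \forall i \in \mathcal{S}_{thr},$$

where $m_{loe}$ is the LOE magnitude.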
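The effect of the thruster fault model on a single commanded opening can be sketched in Python as follows. Since (3.21) gives an effective opening of $(1 - \phi)u$, which for the stuck and leakage cases equals $\chi$, the sketch returns $\chi$ directly and thereby avoids the division by the commanded opening when it is zero. The function name and the fault labels are hypothetical.

```python
def faulty_opening(u, fault=None, magnitude=0.0):
    """Effective opening of one thruster for a commanded opening u in [0, 1].

    Assumption: `fault` is one of None, "stuck_open", "stuck_closed",
    "leakage" or "loe"; `magnitude` is the leakage or LOE magnitude.
    """
    if fault is None:
        return u                       # phi = 0
    if fault == "stuck_open":
        return 1.0                     # chi = 1
    if fault == "stuck_closed":
        return 0.0                     # chi = 0
    if fault == "leakage":
        return max(magnitude, u)       # chi = max{m_leak, u}
    if fault == "loe":
        return (1.0 - magnitude) * u   # phi = m_loe
    raise ValueError("unknown fault type: " + str(fault))


# Example with illustrative values: a leaking thruster commanded to 5 % opening.
u_leak = faulty_opening(0.05, fault="leakage", magnitude=0.2)
```

With a leakage magnitude of 0.2, the thruster never closes below a 20 % opening, while commands above that value pass through unchanged.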

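Two types of RW faults are considered. The first is an increase of the measurement delivered by the tachometer sensor, i.e.,

$$\boldsymbol{\omega}_{w,m} = \boldsymbol{\Psi} \boldsymbol{\omega}_w + \boldsymbol{\eta}_{tach}, \tag{3.22}$$

where $\boldsymbol{\Psi} \triangleq \mathrm{diag}(\psi_1, \dots, \psi_{N_{rw}})$ and

$$\psi_i = \begin{cases} 1, & \text{if fault free} \\ m_{meas,i}, & \text{if faulty} \end{cases}$$

A constant value of $m_{meas,i} > 1$ represents the $i$th tachometer scale increment.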

The second RW fault considered is an increase of the RW friction. An increase of the $i$th RW friction torque $T_{f,i}$ causes a variation in $c_i$. To model this increase, $T_{f,i}$ is split into two types of friction, viscous friction ($f_{v,i}$) and Coulomb friction ($f_{c,i}$). Both $f_{v,i}$ and $f_{c,i}$ can vary differently in a faulty situation. Thus, the following fault model for the $i$th RW is employed

$$T_{f,i} = \xi_{v,i} f_{v,i} + \xi_{c,i} f_{c,i}, \quad i \in \mathcal{S}_{rw}, \tag{3.23}$$

where

$$\xi_{v,i} = \begin{cases} 1, & \text{if fault free} \\ m_v, & \text{if faulty} \end{cases}, \qquad \xi_{c,i} = \begin{cases} 1, & \text{if fault free} \\ m_c, & \text{if faulty} \end{cases}$$

Here, $m_v > 1$ and $m_c > 1$ are the magnitudes of the viscous and Coulomb friction factors, respectively.
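The two RW fault models, (3.22) and (3.23), can be illustrated for a single wheel with the following Python sketch. It assumes a Coulomb term of the tanh form given in (3.11) and, as a simplification made here, a viscous term that is linear in the wheel speed. The function names and all numerical values are hypothetical.

```python
import math


def friction_torque(omega, zeta_c, zeta_v, xi_c=1.0, xi_v=1.0):
    """Friction torque of one RW with multiplicative fault factors, as in (3.23).

    Assumption: Coulomb term -zeta_c * tanh(omega) and a viscous term that is
    linear in omega; xi_c = xi_v = 1 corresponds to the fault-free case.
    """
    f_c = -zeta_c * math.tanh(omega)
    f_v = -zeta_v * omega
    return xi_v * f_v + xi_c * f_c


def tachometer_reading(omega, psi=1.0, noise=0.0):
    """Tachometer measurement of one RW with scale factor psi, as in (3.22)."""
    return psi * omega + noise


# Example with illustrative values: nominal and faulty friction at 100 rad/s.
nominal = friction_torque(100.0, zeta_c=0.002, zeta_v=0.00005)
faulty = friction_torque(100.0, zeta_c=0.002, zeta_v=0.00005, xi_c=1.5, xi_v=2.0)
```

In both cases the friction torque opposes the wheel rotation, and the faulty factors only increase its magnitude.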

3.1.5. Uncertainties

In any real system, there are parameters whose values are unknown or not fully determined. Such parameters are said to have uncertainty. Uncertainties might severely affect the performance of the system, given that it might be impossible to describe the exact existing states and future outcomes. For estimators, like Kalman filters, uncertainties might lead to wrong behaviour. Therefore, the estimators should be robust against uncertainties.

The parameters that are assumed to be uncertain in the study case spacecraft are the misalignment angles of the actuators and sensors, $\epsilon$, and the spacecraft principal axis of inertia, $\mathbf{J}_{(i,i)}$, $\forall i \in \{1, 2, 3\}$.

3.2. Methodology for Evaluation of the FDI System

In this section, the methodology used to evaluate the performance of the proposed FDI strategy is described. First, the types of tests to be carried out are presented, followed by the performance criteria, and finally the post-processing step.

3.2.1. Test Campaigns

The design of the tests used to evaluate the FDI strategy must take into account the scope of this project (see Section 1.2). The focus of this project is not just to evaluate the performance of the FDI system according to certain criteria, but also to obtain information about how the performance depends on the magnitude of the faults defined in Section 3.1.4 and on the magnitude of the uncertainties defined in Section 3.1.5 that affect the study case mission setup. The effect of different slew manoeuvres would also be very interesting to study, but due to time and memory constraints the tests are restricted to the guidance described in Section 3.1.1.

The FDI performance is evaluated using two MC simulation campaigns. The first campaign excludes uncertainties, while the second one includes them. Both campaigns assume measurement noise and consist of $N$ runs per fault type (including the fault-free case). In each run, the time of fault occurrence, $t_f$, and the fault magnitudes, $m_i$, $i \in \{leak, loe, meas, v, c\}$, vary uniformly in the defined intervals, see Table 5.3.

The first campaign aims at demonstrating the FDI performance without considering any model uncertainties. The main focus is on the fault magnitudes and their times of occurrence. Nominal spacecraft parameters and no sensor misalignments are considered.

The second campaign aims at demonstrating the effect of uncertainty on the obtained FDI performance results. Such uncertainties are modelled to follow a normal distribution $\mathcal{N}(\mu, \sigma^2)$, where $\mu$ corresponds to the nominal value and $\sigma$ is the associated uncertainty. The parameters considered to have uncertainty, and their variances, are the spacecraft principal axis of inertia ($\sigma_J^2$) and the individual misalignment angles, $\epsilon$, of the STRs ($\sigma_{str}^2$), the RMU ($\sigma_{rmu}^2$), the individual thrusters ($\sigma_{thr}^2$), and the individual RWs ($\sigma_{rw}^2$).

The FDI strategy must be tested, for each type of fault, in as many different situations (times of injection, fault magnitudes, and uncertainty magnitudes) as possible in order to better evaluate its performance. The number of runs for each type of test depends on different factors.

Because of the complexity of the model used here, it is almost impossible to compute the exact number of runs that an MC campaign needs in order to obtain a desired confidence and precision in the results. To give an indicative value, one method from the literature is used.

The method by Wald (see [61] and [15]), based on the Confidence Interval (CI) for a population proportion, proposes that the minimum number of runs can be estimated using

$$N = \frac{z_{\alpha/2}^2 \, \hat{j} \, \hat{v}}{\Delta^2}, \tag{3.24}$$

where $z_{\alpha/2}$ is the value of the normal distribution that represents the upper tail of the distribution set by the confidence level, $\hat{j}$ is the fraction of successes, $\hat{v} = 1 - \hat{j}$, and $\Delta$ is half the length of the CI. A typical confidence level is 95%, which leads to $z_{\alpha/2} = 1.96$. With a success fraction of 50%, $\hat{j} = 0.5$, which maximises (3.24), and a CI of 1% ($\Delta = 0.005$), (3.24) gives $N = 38416$ runs. If the CI is increased to 10% ($\Delta = 0.05$), the result is $N = 385$.
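The two run counts quoted above can be reproduced with the short Python sketch below. The function name is hypothetical, and exact rational arithmetic is used here only to avoid floating-point rounding just before the result is rounded up to the next integer.

```python
import math
from fractions import Fraction


def wald_min_runs(z, success_fraction, half_width):
    """Minimum number of runs from (3.24), rounded up to the next integer."""
    z, j, delta = (Fraction(str(x)) for x in (z, success_fraction, half_width))
    return math.ceil(z ** 2 * j * (1 - j) / delta ** 2)


# 95 % confidence level with a 50 % success fraction.
runs_ci_1_percent = wald_min_runs(1.96, 0.5, 0.005)
runs_ci_10_percent = wald_min_runs(1.96, 0.5, 0.05)
```

Because the required number of runs scales with the inverse square of the CI half-length, widening the CI by a factor of ten reduces the campaign size by roughly a factor of one hundred.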

Given that time and memory are limited in this project, the number and type of different tests depend strongly on the time that a simulation takes and the memory that it occupies. For a single test with the attitude profile described in Section 3.1.1, the GAFE simulator, which is set to run with a time step of $\Delta t = 0.1$ seconds, generates 443 data points, which in total occupy about 200 MB, and takes an average simulation time of about 3 minutes. Using $N = 385$ runs per type of fault and for the fault-free case, the total number of MC runs for the two campaigns would be $N_{total} = 385 \times 7 \times 2 = 5390$, so the computation time would be about 270 hours and the required memory about 1053 GB. Given the amount of time and memory required for $N = 385$, it was decided that the minimum number of tests should be $N = 150$ per type of fault and for the fault-free case, leading to a total of 2100 runs. This means a computation time of 105 hours and 412 GB of memory.

3.2.2. Evaluation Criteria

In order to evaluate the performance of the FDI strategy, criteria must be defined. Such criteria are the data that need to be extracted and evaluated from the different test runs. They must be meaningful, provide value to future agile spacecraft missions and to the FDI community, and be obtainable within the project developed here. Currently, there are no threshold values against which to compare the results obtained here, because such thresholds depend on the mission requirements. The performance of the proposed FDI strategy is evaluated in terms of the following indices, some of them extracted from [5] and others defined for the purposes of this project:

• Correct detection: a fault is correctly detected if any residual diverges from zero after a fault has occurred.

• False alarm: a false alarm occurs when a fault is detected (any residual diverging from zero) in a fault-free case or before the fault has occurred.

• Miss-detection: a fault is miss-detected if a detectable¹ fault occurs and no residual diverges from zero during the entire simulation.

• Correct isolation: the correct isolation of a fault is defined at two different levels:

- Equipment level: if the fault's location is identified to be at the faulty thruster system or at the faulty RW.

- Fault type level: if the fault is correctly located at the faulty tachometer sensor or at the faulty wheel friction torque.

• Miss-isolation: the miss-isolation of a fault is defined using the same concept as the correct isolation index. A fault can be miss-isolated at two different levels, the equipment level and the fault type level (only applicable to RWs).

• Time to detection (RW, thruster leakage, and stuck open): it is defined as the time between the occurrence of the fault and its detection.

• Time to detection (LOE and stuck closed): it is defined as the time between the first activation of a faulty thruster after the occurrence of the fault and the time of detection.

¹ A fault is detectable if it has an actual effect on the spacecraft.
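The detection-related indices defined above can be expressed as a small decision rule. The following Python sketch is only an illustration: the function names, the use of `None` to mark the absence of a fault or of a detection, and the label "no alarm" for runs with neither are assumptions made here, not part of the post-processing tool used in this project.

```python
def classify_detection(fault_time, detection_time, detectable=True):
    """Classify one simulation run according to the detection indices.

    fault_time: time of fault occurrence, or None for a fault-free run.
    detection_time: time of the first alarm, or None if no residual diverged.
    detectable: whether the fault has an actual effect on the spacecraft.
    """
    if detection_time is None:
        if fault_time is not None and detectable:
            return "miss-detection"
        return "no alarm"
    if fault_time is None or detection_time < fault_time:
        return "false alarm"
    return "correct detection"


def time_to_detection(reference_time, detection_time):
    """Delay between the reference time and the detection.

    The reference is the fault occurrence (RW, leakage, stuck open) or the
    first activation of a faulty thruster after the fault (LOE, stuck closed).
    """
    return detection_time - reference_time


# Example with illustrative values: fault injected at 1000 s, alarm at 1012.5 s.
label = classify_detection(1000.0, 1012.5)
delay = time_to_detection(1000.0, 1012.5)
```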

3.2.3. Post-Processing

After all the test runs are carried out, a post-processing step is necessary in order to extract from the simulation results all the information required to evaluate the performance of the FDI strategy. The post-processing can be done over single test runs or over entire MC campaigns.

The single test runs are used to obtain insight into the evolution of the states and residuals, the GLR test behaviour, the decision making for detection and isolation of the fault, etc.

The post-processing of entire MC campaigns provides the performance indices as mean values and variances for the times to detection and as percentage ratios for the rest of the indices. The ratios are computed as the number of tests that score positive on each index divided by the total number of tests that are not fault-free. For example, for the thruster stuck-open fault tests, if the fault is correctly detected in 145 out of 150 tests, the correct detection ratio of the stuck-open fault is 96.67%. In addition, the post-processing of the MC campaign without uncertainties provides data about how the magnitude of the faults and the time of fault occurrence affect the FDI performance, by plotting the correlations of such magnitudes and times of injection with the performance indices. The post-processing of the MC campaign that includes uncertainties is used to show the correlation of the uncertainty magnitudes with the FDI performance.

4 Proposed FDI Strategy

The strategy and the different algorithms that have been developed to achieve fault detection and isolation for an agile spacecraft are presented in Section 4.1.

4.1. FDI Strategy

The model-based FDI strategy proposed here can be divided into three main steps: the estimation of the states of interest, the generation of residuals, and the evaluation of such residuals to achieve detection and isolation of faults.

The proposed FDI strategy is based on a model-based (filter-based) residual signal generation strategy. A detection algorithm monitors each residual signal using a GLR test and compares it with a fixed threshold to decide whether a fault has occurred or not. The isolation logic tests whether the fault occurred in any of the thrusters or in a particular RW. Additionally, if a RW is faulty, the isolation algorithm is able to identify the type of fault.

Before the proposed FDI strategy is presented, a few concepts need to be introduced. From Section 2.2.2, the filter-based approach is the most suitable one for residual generation. Among the different techniques presented, the Kalman filter is a very good option because it has relatively low complexity and provides a measure of the quality of the estimate (e.g., the variance).

4.1.1. Kalman Filter

The Kalman filter is considered the optimal solution, under certain conditions, to prediction problems based on models and data, see [63]. Its derivation can be approached as the minimization of the mean squared error or, alternatively, through the relation of the filter to maximum likelihood statistics. Its principal purpose is filtering, that is, extracting the required information from a signal while ignoring everything else, and it can be designed to estimate the state vector of a linear model. If the model is not linear and Gaussian, the filter might not provide accurate results. Since the spacecraft performs large attitude slews, the models used for estimation are highly non-linear, and a linearization is usually applied to derive the filtering equations. Using a first-order Taylor approximation of the system and of the observation function, the resulting filter is the Extended Kalman Filter (EKF), see [34]. The Unscented Kalman Filter (UKF), see [62], was later developed as an improvement to the EKF that addresses its approximation issues. Here, the EKF is preferred over the UKF because it is a fairly simple and effective filter that has proven itself over many years in many important real-time applications, see [14]. In the EKF, the state and observation models are not required to be linear functions of the state; however, they must be differentiable functions.


4.1.2. Extended Kalman Filter

The EKF can be formulated in both continuous and discrete time, see [9]. Compared to the continuous-time filter, the discrete-time filter has the advantage that the prediction and update steps are decoupled. Nevertheless, most physical systems are modelled with a continuous-time representation, while the sensors used for estimation usually provide measurements and observations in discrete time. For these reasons, a continuous-time extended Kalman filter with discrete-time observations (also called continuous-discrete EKF) is preferred. The filter process can be divided into three main steps: the initialisation, the prediction step (continuous-time), and the update step (discrete-time).

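The non-linear model of the system is

$$\begin{cases} \dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}(t), \mathbf{u}(t)) + \boldsymbol{\eta}(t) \\ \mathbf{z}_k = \mathbf{h}(\mathbf{x}_k) + \boldsymbol{\eta}_k \end{cases} \tag{4.1}$$

where $\mathbf{f}(\cdot, \cdot)$ is a vector function of the states $\mathbf{x}(t)$ and the inputs to the system $\mathbf{u}(t)$, $\mathbf{h}(\cdot)$ is a vector function of the states $\mathbf{x}(t)$, and $\boldsymbol{\eta}(t) \in \mathbb{R}^p$ and $\boldsymbol{\eta}_k \in \mathbb{R}^q$ are zero-mean Gaussian white noise sequences with $1 \leq p, q \leq n$.

The initialisation of the filter is very important in an EKF, given that it is not the optimal estimator and it could diverge quickly. The initial estimated states are

$$\hat{\mathbf{x}}(t_0) = \mathrm{E}[\mathbf{x}_0] \tag{4.2}$$

and the initial estimated covariances are

$$\hat{\mathbf{P}}(t_0) = \mathrm{E}[\mathbf{P}_0] = \mathrm{Var}[\mathbf{x}_0]. \tag{4.3}$$

The propagation step is done in continuous time. The estimated state ($\hat{\mathbf{x}}_{k-1}^+$) and covariance ($\hat{\mathbf{P}}_{k-1}^+$) of the previous time step are propagated to the current time step ($\hat{\mathbf{x}}_k^-$ and $\hat{\mathbf{P}}_k^-$), assuming a constant control input ($\mathbf{u}$), by integrating the following system of equations

$$\begin{cases} \dot{\hat{\mathbf{x}}} = \mathbf{f}(\hat{\mathbf{x}}, \mathbf{u}) \\ \dot{\hat{\mathbf{P}}} = \mathbf{F}\hat{\mathbf{P}} + \hat{\mathbf{P}}\mathbf{F}^T + \mathbf{Q} \end{cases} \tag{4.4}$$

where $\mathbf{F} = \left. \frac{\partial \mathbf{f}(\mathbf{x}, \mathbf{u})}{\partial \mathbf{x}} \right|_{\hat{\mathbf{x}}, \mathbf{u}}$ is the Jacobian matrix of the state and $\mathbf{Q}$ is the process noise covariance matrix. The process noise represents the fact that the state of the system changes over time but the exact details of when and how those changes occur are not known, so they need to be modelled as a random process.

Finally, the update step makes use of the observations, once they are available, to update $\hat{\mathbf{x}}_k^-$ and $\hat{\mathbf{P}}_k^-$ into $\hat{\mathbf{x}}_k^+$ and $\hat{\mathbf{P}}_k^+$, respectively. The equations that govern the update are

$$\hat{\mathbf{x}}_k^+ = \hat{\mathbf{x}}_k^- + \mathbf{K}_k \left( \mathbf{z}_k - \mathbf{h}(\hat{\mathbf{x}}_k^-) \right) \tag{4.5}$$

$$\hat{\mathbf{P}}_k^+ = (\mathbf{I} - \mathbf{K}_k \mathbf{H}_k) \hat{\mathbf{P}}_k^- \tag{4.6}$$

where

$$\mathbf{K}_k = \hat{\mathbf{P}}_k^- \mathbf{H}_k^T \left( \mathbf{H}_k \hat{\mathbf{P}}_k^- \mathbf{H}_k^T + \mathbf{R} \right)^{-1}, \qquad \mathbf{H}_k = \left. \frac{\partial \mathbf{h}(\mathbf{x})}{\partial \mathbf{x}} \right|_{\hat{\mathbf{x}}_k^-}$$

and $\mathbf{R}$ is the measurement noise covariance matrix.

Then, for the next time step this formulation is iterated, and $\hat{\mathbf{x}}_k^+$ and $\hat{\mathbf{P}}_k^+$ become $\hat{\mathbf{x}}_{k-1}^+$ and $\hat{\mathbf{P}}_{k-1}^+$.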
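The predict and update cycle described by (4.4) to (4.6) can be sketched for a scalar state in Python as follows. This is a minimal illustration under stated assumptions: a single state, forward Euler integration of (4.4) with a fixed number of sub-steps, and hypothetical function names and example models. It is not the filter implementation used in this project.

```python
def ekf_predict(x, p, u, f, jac_f, q, dt, substeps=10):
    """Propagate the scalar state and covariance over one sample interval.

    Assumption: forward Euler integration of (4.4); for a scalar state,
    F P + P F^T + Q reduces to 2 F P + Q.
    """
    step = dt / substeps
    for _ in range(substeps):
        big_f = jac_f(x, u)
        x, p = x + step * f(x, u), p + step * (2.0 * big_f * p + q)
    return x, p


def ekf_update(x, p, z, h, jac_h, r):
    """Scalar form of the update equations (4.5) and (4.6)."""
    big_h = jac_h(x)
    gain = p * big_h / (big_h * p * big_h + r)
    return x + gain * (z - h(x)), (1.0 - gain * big_h) * p


# Example with hypothetical models: dx/dt = -x^3 + u, measurement z = x^2.
x_pred, p_pred = ekf_predict(
    1.0, 0.5, 0.0,
    f=lambda x, u: -x ** 3 + u,
    jac_f=lambda x, u: -3.0 * x ** 2,
    q=0.01, dt=0.1)
x_upd, p_upd = ekf_update(
    x_pred, p_pred, z=0.8,
    h=lambda x: x ** 2, jac_h=lambda x: 2.0 * x, r=0.05)
```

The update always reduces the covariance when the measurement Jacobian is non-zero, while the prediction step increases or decreases it depending on the sign of the state Jacobian and the size of the process noise.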

The real-time attitude estimation of a spacecraft is generally done using an EKF. Most EKFs use lower-dimensional parametrizations of the special orthogonal group SO(3) of rotation matrices, like a minimal three-dimensional representation [18]. However, all three-dimensional representations present singularities or discontinuities, which can be avoided using a four-dimensional representation of the spacecraft attitude, like the quaternion [53]. The quaternion is the lowest-dimensional representation of SO(3) that has no singularities or discontinuities and that provides a global representation of the spacecraft attitude. However, the quaternion has a superfluous degree of freedom, so there is a dilemma between using a representation that is singular and one that is redundant.

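The parameterization of the quaternion must comply with a unit normalization constraint, which might be an issue for attitude estimation using a standard EKF. This issue appears during the update stage of the filter in (4.5):

$$\hat{\mathbf{q}}_k^+ = \hat{\mathbf{q}}_k^- + \left\langle \mathbf{K}_k \left( \mathbf{z}_k - \mathbf{h}(\hat{\mathbf{x}}_k^-) \right) \right\rangle \tag{4.7}$$

From this expression it is clear that, unless a certain relation between $\hat{\mathbf{q}}_k^-$ and $\langle \mathbf{K}_k (\mathbf{z}_k - \mathbf{h}(\hat{\mathbf{x}}_k^-)) \rangle$ exists, $\hat{\mathbf{q}}_k^-$ and $\hat{\mathbf{q}}_k^+$ cannot both be normalized to unity.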

Two of the solutions proposed in the literature are the Multiplicative EKF (MEKF) and the Additive EKF (AEKF). The first one uses a non-singular quaternion representation as a reference attitude and a three-component representation for the errors from this reference [37]. The second one treats all four components of the quaternion as independent parameters [1]. The comparison between the MEKF and the AEKF performed in [38] shows the pros and cons of each method. The AEKF is a simpler concept and gives good attitude and covariance estimates, but it requires a higher computational effort and some numerical issues could arise from the unobservable quaternion degree of freedom. It is based on less secure foundations: the unity of the quaternion must be kept using "brute force", the attitude matrix is not exactly orthogonal, and the covariance matrix $\mathbf{P}_{4 \times 4}$ might become singular and produce instability. On the other hand, although the MEKF is a more complex concept with a more intricate implementation, it requires less computational effort, and it is a more satisfying concept given that it keeps the dimensionality of the rotation group and its attitude estimate is a unit quaternion by definition. In addition, the covariance matrix $\mathbf{P}_{3 \times 3}$ has lower dimensionality and a clearer physical interpretation. For these reasons, the MEKF is preferred.

4.1.3. Multiplicative Extended Kalman Filter

The MEKF uses the quaternion $\mathbf{q}$ as a global attitude representation of the spacecraft and a three-component state vector to represent the local attitude error. The attitude error, like the attitude itself, can be represented using different parameterizations, as shown in Section 2.3. Here, the Gibbs vector parameterization is used for the attitude error ($\delta\mathbf{g}$).

Usually, for estimation purposes, the true state is represented as the estimate plus an error. Here, instead, the true quaternion, $\mathbf{q}$, is written as the product of an error quaternion, $\delta\mathbf{q}(\cdot)$ (here parameterized by the three-component attitude error in Gibbs vector form, $\delta\mathbf{g}$), and the estimated quaternion, $\hat{\mathbf{q}}$, as shown in (4.8). The choice of the three-component attitude error representation is not critical, as long as it is used consistently throughout the filter, because for small angles all of them are equivalent to first order (see [39]).

$$\mathbf{q} = \delta\mathbf{q}(\delta\mathbf{g}) \otimes \hat{\mathbf{q}} \qquad (4.8)$$

In this representation, $\mathbf{q}$, $\delta\mathbf{q}(\cdot)$, and $\hat{\mathbf{q}}$ are all correctly normalized to unity. In this MEKF, the normalized four-element quaternion estimate $\hat{\mathbf{q}}$ is not part of the filter state. Instead, a reset step moves the information from the update into this global attitude variable and resets the three-element attitude error $\delta\mathbf{g}$ to zero, which keeps it small and avoids any singularity.

Having $\delta\mathbf{g}$, and not $\hat{\mathbf{q}}$, as part of the estimator state brings some advantages. For example, the covariance matrix has one dimension less, which reduces the computational cost, and the covariance of the attitude error angles has a transparent physical interpretation.

The state to be estimated is $\delta\hat{\mathbf{g}}$. Following the derivations presented in [39], its time derivative is

$$\delta\dot{\hat{\mathbf{g}}} = -\hat{\boldsymbol{\omega}} \times \delta\hat{\mathbf{g}} \qquad (4.9)$$

where $\hat{\boldsymbol{\omega}}$ is the estimated angular rate of the spacecraft.

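The initialisation step is done as in Equations 4.2 and 4.3. Then, the propagation step is carried out using (4.4) with $\mathbf{f}(\hat{\mathbf{x}}, \mathbf{u}) \triangleq -\hat{\boldsymbol{\omega}} \times \delta\hat{\mathbf{g}}$. Note that if $\delta\hat{\mathbf{g}}$ is zero at the beginning of a propagation step, it remains zero throughout the propagation, which means that $\delta\mathbf{q}(\delta\hat{\mathbf{g}})$ equals the identity quaternion during this step and that $\delta\hat{\mathbf{g}}_k^{-} = \delta\hat{\mathbf{g}}_{k-1}^{+}$. In parallel to the time propagation of (4.4), the estimated global attitude ($\hat{\mathbf{q}}$) also needs to be propagated from $\hat{\mathbf{q}}_{k-1}^{+}$ to $\hat{\mathbf{q}}_k^{-}$ by integrating

$$\dot{\hat{\mathbf{q}}} = \frac{1}{2}\mathbf{W}(\hat{\boldsymbol{\omega}})\hat{\mathbf{q}}, \qquad (4.10)$$

where $\mathbf{W}$ is defined as

$$\mathbf{W}(\hat{\boldsymbol{\omega}}) = \begin{bmatrix} 0 & \hat{\omega}_z & -\hat{\omega}_y & \hat{\omega}_x \\ -\hat{\omega}_z & 0 & \hat{\omega}_x & \hat{\omega}_y \\ \hat{\omega}_y & -\hat{\omega}_x & 0 & \hat{\omega}_z \\ -\hat{\omega}_x & -\hat{\omega}_y & -\hat{\omega}_z & 0 \end{bmatrix}, \qquad (4.11)$$

where $\hat{\omega}_x$, $\hat{\omega}_y$, and $\hat{\omega}_z$ are the components of $\hat{\boldsymbol{\omega}}$ about the body-fixed $X$, $Y$, and $Z$ axes, respectively.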
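As an illustration of the attitude propagation in (4.10) and (4.11), the following minimal Python sketch integrates the quaternion kinematics over one time step with a fourth-order Runge-Kutta scheme. It assumes a scalar-last quaternion and a constant angular rate over the step. The function names and the renormalisation after each step are assumptions made for this example and are not taken from the thesis implementation.

```python
import math


def w_matrix(omega):
    """Build W(omega) of (4.11) for a body rate (wx, wy, wz), scalar-last quaternion."""
    wx, wy, wz = omega
    return [
        [0.0, wz, -wy, wx],
        [-wz, 0.0, wx, wy],
        [wy, -wx, 0.0, wz],
        [-wx, -wy, -wz, 0.0],
    ]


def q_dot(q, omega):
    """Quaternion kinematics (4.10): q_dot = 0.5 * W(omega) * q."""
    w = w_matrix(omega)
    return [0.5 * sum(w[i][j] * q[j] for j in range(4)) for i in range(4)]


def rk4_step(q, omega, dt):
    """One fixed-step RK4 integration of the kinematics, with omega held constant."""
    k1 = q_dot(q, omega)
    k2 = q_dot([q[i] + 0.5 * dt * k1[i] for i in range(4)], omega)
    k3 = q_dot([q[i] + 0.5 * dt * k2[i] for i in range(4)], omega)
    k4 = q_dot([q[i] + dt * k3[i] for i in range(4)], omega)
    q_new = [q[i] + dt / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i])
             for i in range(4)]
    # Renormalisation is an assumption of this sketch; it removes integration drift.
    norm = math.sqrt(sum(c * c for c in q_new))
    return [c / norm for c in q_new]
```

For a constant rate of 0.1 rad/s about the $Z$ axis, 100 steps of 0.1 s starting from the identity quaternion should give approximately $[0, 0, \sin(0.5), \cos(0.5)]$.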

The update step is performed as in the standard EKF with Equations 4.5 and 4.6. However, the spacecraft attitude is usually observed with star trackers, which output quaternions together with the associated error covariances. This attitude observation is transformed into the MEKF model using the relation between quaternions and the three-component attitude error representation (see Section 2.3). The transformation must be consistent with the representation selected for the filter, in this case the Gibbs vector:

$$\delta\mathbf{g}_{m} = \frac{\langle \mathbf{q}_{m} \otimes (\hat{\mathbf{q}}_k^{-})^{-1} \rangle_{v}}{\langle \mathbf{q}_{m} \otimes (\hat{\mathbf{q}}_k^{-})^{-1} \rangle_{s}} \qquad (4.12)$$

where $(\cdot)_{m}$ denotes a quantity measured by the star tracker, $\langle\cdot\rangle_{v}$ extracts the vector part and $\langle\cdot\rangle_{s}$ the scalar part of the resulting quaternion. Note that the inverse of a quaternion is given by $\mathbf{q}^{-1} = \mathbf{q}^{\star}/\|\mathbf{q}\|^{2}$, where $\mathbf{q}^{\star}$ is the conjugate of $\mathbf{q}$.
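The transformation in (4.12) can be sketched in a few lines of Python. The sketch assumes scalar-last quaternions and the product convention in which attitude quaternions compose in the same order as attitude matrices; both are assumptions of this example, as are the function names.

```python
import math


def quat_mul(q, p):
    """Quaternion product q (x) p, scalar-last, assumed composition convention."""
    qv, q4 = q[:3], q[3]
    pv, p4 = p[:3], p[3]
    cross = [qv[1] * pv[2] - qv[2] * pv[1],
             qv[2] * pv[0] - qv[0] * pv[2],
             qv[0] * pv[1] - qv[1] * pv[0]]
    vec = [q4 * pv[i] + p4 * qv[i] - cross[i] for i in range(3)]
    return vec + [q4 * p4 - sum(qv[i] * pv[i] for i in range(3))]


def quat_inv(q):
    """Inverse quaternion: conjugate divided by the squared norm."""
    n2 = sum(c * c for c in q)
    return [-q[0] / n2, -q[1] / n2, -q[2] / n2, q[3] / n2]


def gibbs_error(q_meas, q_est):
    """Gibbs vector of the error quaternion q_meas (x) q_est^-1, as in (4.12)."""
    dq = quat_mul(q_meas, quat_inv(q_est))
    return [dq[i] / dq[3] for i in range(3)]  # vector part over scalar part
```

For small errors the Gibbs vector is approximately half the rotation error angle, which is consistent with dividing the star tracker noise variances by 4 later in the filter.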

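Then, the part of the sensitivity matrix that concerns the attitude errors is $\langle\mathbf{H}\rangle_{\delta\mathbf{g}} = \mathbf{I}_{3\times3}$, and the corresponding measurement noise covariance matrix $\langle\mathbf{R}\rangle_{\delta\mathbf{g}} \in \mathbb{R}^{3\times3}$ is the covariance of the attitude measurement error angles.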

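The update step gives post-update values to $\delta\hat{\mathbf{g}}$, but the global attitude still retains the pre-update value $\hat{\mathbf{q}}_k^{-}$. Therefore, in contrast to the standard EKF, a reset step is included to transfer the update information to a post-update global attitude $\hat{\mathbf{q}}_k^{+}$ while resetting $\delta\hat{\mathbf{g}}$ to zero. This reset is implicit in the standard EKF, but here it must be done explicitly, using the reset parameterization for the Gibbs vector, which preserves the unit-norm quaternion constraint:

$$\hat{\mathbf{q}}_k^{+} = \delta\mathbf{q}(\delta\hat{\mathbf{g}}_k^{+}) \otimes \hat{\mathbf{q}}_k^{-} = \frac{1}{\sqrt{1 + \|\delta\hat{\mathbf{g}}_k^{+}\|^{2}}} \begin{bmatrix} \delta\hat{\mathbf{g}}_k^{+} \\ 1 \end{bmatrix} \otimes \hat{\mathbf{q}}_k^{-} \qquad (4.13)$$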

After each measurement update, $\delta\hat{\mathbf{g}}$ must be explicitly reset to zero, i.e., $\delta\hat{\mathbf{g}}_k^{+} = \mathbf{0}_{3\times1}$.
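The reset step in (4.13) can be illustrated with the following short Python sketch. It assumes scalar-last quaternions and the same product convention as the attitude matrix composition; the function names are hypothetical and only serve this example.

```python
import math


def quat_mul(q, p):
    """Quaternion product q (x) p, scalar-last, assumed composition convention."""
    qv, q4 = q[:3], q[3]
    pv, p4 = p[:3], p[3]
    cross = [qv[1] * pv[2] - qv[2] * pv[1],
             qv[2] * pv[0] - qv[0] * pv[2],
             qv[0] * pv[1] - qv[1] * pv[0]]
    vec = [q4 * pv[i] + p4 * qv[i] - cross[i] for i in range(3)]
    return vec + [q4 * p4 - sum(qv[i] * pv[i] for i in range(3))]


def mekf_reset(delta_g, q_pre):
    """Move the updated Gibbs error into the global quaternion and zero the error."""
    scale = 1.0 / math.sqrt(1.0 + sum(g * g for g in delta_g))
    dq = [scale * g for g in delta_g] + [scale]   # unit error quaternion from (4.13)
    q_post = quat_mul(dq, q_pre)                  # post-update global attitude
    return q_post, [0.0, 0.0, 0.0]                # error state reset to zero
```

Because the error quaternion built from the Gibbs vector has unit norm by construction, the post-update quaternion keeps unit norm whenever the pre-update one has it, with no separate normalisation step.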

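4.1.4. State Estimation

Before defining the residual signals, we first design a state estimator (used for residual generation) to estimate the following state vector:

$$\mathbf{x} = [\mathbf{q}^{T}\ \boldsymbol{\omega}^{T}\ \boldsymbol{\omega}_{w}^{T}\ \mathbf{T}_{f}^{T}]^{T}. \qquad (4.14)$$

The filter used here to estimate $\mathbf{x}$ combines an EKF and an MEKF. The EKF is used to estimate the spacecraft angular rate $\boldsymbol{\omega}$, the RW angular rates $\boldsymbol{\omega}_{w}$, and the RW friction torques $\mathbf{T}_{f}$, whereas the MEKF is used to estimate $\mathbf{q}$. Its validation and verification are carried out in Appendix B.2.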

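To proceed, two new vectors are defined: the total control input vector $\mathbf{u}$, which stacks the thruster and RW commands,

$$\mathbf{u} \triangleq [\mathbf{u}_{thr}^{T}\ \mathbf{u}_{w}^{T}]^{T} \qquad (4.15)$$

and an intermediate state vector estimate

$$\hat{\mathbf{x}} \triangleq [\delta\hat{\mathbf{g}}^{T}\ \hat{\boldsymbol{\omega}}^{T}\ \hat{\boldsymbol{\omega}}_{w}^{T}\ \hat{\mathbf{T}}_{f}^{T}]^{T}, \qquad (4.16)$$

where $\hat{\mathbf{T}}_{f} = [\hat{T}_{f,1}\ \ldots\ \hat{T}_{f,N_w}]^{T}$.

The time derivative of $\hat{\mathbf{x}}$ satisfies

$$\dot{\hat{\mathbf{x}}} = \mathbf{f}(\hat{\mathbf{x}}, \mathbf{u}), \qquad (4.17)$$

where $\mathbf{f}(\hat{\mathbf{x}}, \mathbf{u})$ is a vector function defined as

$$\mathbf{f}(\hat{\mathbf{x}}, \mathbf{u}) \triangleq \begin{bmatrix} -\hat{\boldsymbol{\omega}} \times \delta\hat{\mathbf{g}} \\ \mathbf{J}^{-1}\left(\hat{\mathbf{T}}_{thr} - \hat{\mathbf{T}}_{w} - \hat{\boldsymbol{\omega}} \times (\mathbf{J}\hat{\boldsymbol{\omega}} - \hat{\mathbf{h}}_{w})\right) \\ \mathbf{J}_{w}^{-1}(\mathbf{u}_{w} + \hat{\mathbf{T}}_{f}) \\ \mathbf{0} \end{bmatrix},$$

where $\mathbf{J}_{w} \triangleq \mathrm{diag}(J_{w,1}\ \ldots\ J_{w,N_w})$ and $\hat{\mathbf{h}}_{w}$ is the estimated angular momentum of the RWs, computed as $\hat{\mathbf{h}}_{w} = \mathbf{M}\mathbf{J}_{w}\hat{\boldsymbol{\omega}}_{w}$, with $\mathbf{M} \triangleq [\mathbf{m}_{1}\ \ldots\ \mathbf{m}_{N_w}]$ being the misalignment-free matrix mapping the estimated RWs' torque contributions into the spacecraft body-fixed frame. In (4.17), $\hat{\mathbf{T}}_{thr}$ is defined as

$$\hat{\mathbf{T}}_{thr} = -F \sum_{i=1}^{N_{thr}} (\mathbf{r}_{i} \times \mathbf{d}_{i})\, u_{thr,i},$$

$\hat{\mathbf{T}}_{w}$ is modelled as

$$\hat{\mathbf{T}}_{w} = \sum_{i=1}^{N_w} \mathbf{m}_{i} (u_{w,i} + \hat{T}_{f,i}),$$

and $\hat{\mathbf{T}}_{f}$ is modelled as a random walk driven by a zero-mean Gaussian white noise ($\boldsymbol{\eta}_{f}$), i.e., $\dot{\hat{\mathbf{T}}}_{f} = \boldsymbol{\eta}_{f}$.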

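The filter's time propagation step is carried out in continuous time. The estimated state ($\hat{\mathbf{x}}_{k-1}^{+}$) and covariance ($\mathbf{P}_{k-1}^{+}$) of the previous time step are propagated to the current time step ($\hat{\mathbf{x}}_{k}^{-}$ and $\mathbf{P}_{k}^{-}$), assuming a constant control input ($\mathbf{u}_{k-1}$), by integrating the following system of equations

$$\begin{cases} \dot{\hat{\mathbf{x}}} = \mathbf{f}(\hat{\mathbf{x}}, \mathbf{u}) \\ \dot{\mathbf{P}} = \mathbf{F}\mathbf{P} + \mathbf{P}\mathbf{F}^{T} + \mathbf{Q} \end{cases} \qquad (4.18)$$

where

$$\mathbf{F} = \left.\frac{\partial \mathbf{f}(\mathbf{x}, \mathbf{u})}{\partial \mathbf{x}}\right|_{\mathbf{x}=\hat{\mathbf{x}},\ \mathbf{u}=\mathbf{u}_{k-1}}$$

is the Jacobian matrix of the state and $\mathbf{Q} \in \mathbb{R}^{(6+2N_w)\times(6+2N_w)}$ is the PSD process noise matrix defined as

$$\mathbf{Q} = \begin{bmatrix} \varepsilon\mathbf{I}_{3\times3} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{J}^{-1}(\mathbf{B}\mathbf{S}_{thr}\mathbf{B}^{T} - \mathbf{M}\mathbf{S}_{w}\mathbf{M}^{T})\mathbf{J}^{-T} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{J}_{w}^{-1}\mathbf{S}_{w}\mathbf{J}_{w}^{-T} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{S}_{f} \end{bmatrix} \qquad (4.19)$$

where $\varepsilon$ is a small constant, and $\mathbf{S}_{thr} \in \mathbb{R}^{N_{thr}\times N_{thr}}$, $\mathbf{S}_{w} \in \mathbb{R}^{N_w\times N_w}$, and $\mathbf{S}_{f} \in \mathbb{R}^{N_w\times N_w}$ are constant matrices of the double-sided power spectral densities of $\boldsymbol{\eta}_{thr}$, $\boldsymbol{\eta}_{w}$, and $\boldsymbol{\eta}_{f}$, respectively. Constant PSD values are used because the model is formulated in continuous time; they are related to the discrete-time noise variance $\sigma^{2}$ through the time step $\Delta t$, see Appendix A.2 for the derivation. The derivation of matrix $\mathbf{Q}$ can be found in Appendix A.1.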

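In parallel to the time propagation of (4.18), the estimated full attitude ($\hat{\mathbf{q}}$) also needs to be propagated from $\hat{\mathbf{q}}_{k-1}^{+}$ to $\hat{\mathbf{q}}_{k}^{-}$ by integrating

$$\dot{\hat{\mathbf{q}}} = \frac{1}{2}\mathbf{W}(\hat{\boldsymbol{\omega}})\hat{\mathbf{q}}, \qquad (4.20)$$

where $\mathbf{W}(\cdot)$ was defined in (4.11).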

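All sensor measurements, whose models are defined in (3.14)-(3.16), are exploited for estimation purposes. These measurements are only available in discrete time and are, for convenience, lumped into the following measurement vector

$$\mathbf{z}_{k} = [(\delta\mathbf{g}_{m,1})^{T}\ (\delta\mathbf{g}_{m,2})^{T}\ (\boldsymbol{\omega}_{m})^{T}\ (\boldsymbol{\omega}_{w,m})^{T}]^{T} \qquad (4.21)$$

where the $i$-th attitude error measurement, $\delta\mathbf{g}_{m,i}$, is expressed as a Gibbs vector

$$\delta\mathbf{g}_{m,i} = \frac{\langle \mathbf{q}_{m,i} \otimes (\hat{\mathbf{q}}_{k}^{-})^{-1} \rangle_{v}}{\langle \mathbf{q}_{m,i} \otimes (\hat{\mathbf{q}}_{k}^{-})^{-1} \rangle_{s}}. \qquad (4.22)$$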

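Once $\mathbf{z}_{k}$ becomes available, the state and the covariance matrix are updated as follows

$$\hat{\mathbf{x}}_{k}^{+} = \hat{\mathbf{x}}_{k}^{-} + \mathbf{K}_{k}(\mathbf{z}_{k} - \mathbf{h}(\hat{\mathbf{x}}_{k}^{-})) \qquad (4.23)$$

$$\mathbf{P}_{k}^{+} = (\mathbf{I}_{(6+2N_w)\times(6+2N_w)} - \mathbf{K}_{k}\mathbf{H})\mathbf{P}_{k}^{-} \qquad (4.24)$$

where

$$\mathbf{h}(\hat{\mathbf{x}}_{k}^{-}) = [\mathbf{0}_{1\times3}\ \mathbf{0}_{1\times3}\ (\hat{\boldsymbol{\omega}}_{k}^{-})^{T}\ (\hat{\boldsymbol{\omega}}_{w,k}^{-})^{T}]^{T}$$

$$\mathbf{H} = \begin{bmatrix} \mathbf{I}_{3\times3} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{I}_{3\times3} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{I}_{3\times3} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{I}_{N_w\times N_w} & \mathbf{0} \end{bmatrix}$$

$$\mathbf{K}_{k} = \mathbf{P}_{k}^{-}\mathbf{H}^{T}(\mathbf{H}\mathbf{P}_{k}^{-}\mathbf{H}^{T} + \mathbf{R})^{-1}$$

$$\mathbf{R} = \begin{bmatrix} \frac{\boldsymbol{\sigma}_{STR,1}^{2}}{4}\mathbf{I}_{3\times3} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \frac{\boldsymbol{\sigma}_{STR,2}^{2}}{4}\mathbf{I}_{3\times3} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \boldsymbol{\sigma}_{\omega}^{2}\mathbf{I}_{3\times3} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \boldsymbol{\sigma}_{\omega_w}^{2}\mathbf{I}_{N_w\times N_w} \end{bmatrix}$$

and $\boldsymbol{\sigma}_{STR,1}$, $\boldsymbol{\sigma}_{STR,2}$, $\boldsymbol{\sigma}_{\omega}$, and $\boldsymbol{\sigma}_{\omega_w}$ are the standard deviations of the measurement noises $\boldsymbol{\eta}_{STR,1}$, $\boldsymbol{\eta}_{STR,2}$, $\boldsymbol{\eta}_{\omega}$, and $\boldsymbol{\eta}_{\omega_w}$, respectively. Note that in the measurement noise covariance matrix $\mathbf{R}$, the measurement noise variances of the STRs are divided by 4 to account for the Gibbs vector transformation.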

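Note that $\delta\hat{\mathbf{g}}$ is an error estimate; however, as will be shown in the next section, we are interested in the global satellite attitude. Therefore, the Gibbs vector $\delta\hat{\mathbf{g}}_{k}^{+}$ is transformed into the global attitude representation, while preserving the unit-norm quaternion constraint, using

$$\hat{\mathbf{q}}_{k}^{+} = \frac{1}{\sqrt{1 + \|\delta\hat{\mathbf{g}}_{k}^{+}\|^{2}}} \begin{bmatrix} \delta\hat{\mathbf{g}}_{k}^{+} \\ 1 \end{bmatrix} \otimes \hat{\mathbf{q}}_{k}^{-}. \qquad (4.25)$$

After each measurement update, $\delta\hat{\mathbf{g}}$ must be explicitly reset to zero, i.e.,

$$\delta\hat{\mathbf{g}}_{k}^{+} = \mathbf{0}_{3\times1}. \qquad (4.26)$$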

In this project, the standard measurement update equation of the Kalman filter is modified to properly handle the friction torque estimate $\hat{T}_{f,i}$: the $i$-th friction torque estimate is updated as follows

$$\hat{T}_{f,i;k}^{+} = \hat{T}_{f,i;k}^{\mathrm{sgn}}\, \hat{T}_{f,i;k}^{\mathrm{mag}}, \qquad (4.27)$$

where the sign of the friction torque estimate is determined by

$$\hat{T}_{f,i;k}^{\mathrm{sgn}} = \begin{cases} -\mathrm{sign}(\hat{c}_{i;k}) & \text{if } |J_{w,i}\,\hat{\omega}_{w,i;k}| < \gamma \\ -\mathrm{sign}(\hat{\omega}_{w,i;k}) & \text{otherwise} \end{cases} \qquad (4.28)$$

where $\gamma > 0$ is a fixed threshold accounting for the RW's friction characteristics. Finally, the magnitude of the friction torque is computed as

$$\hat{T}_{f,i;k}^{\mathrm{mag}} = |\langle \hat{\mathbf{x}}_{k}^{-} \rangle_{T_{f,i}}| + \hat{T}_{f,i;k}^{\mathrm{sgn}} \langle \mathbf{K}_{k}(\mathbf{z}_{k} - \mathbf{h}(\hat{\mathbf{x}}_{k}^{-})) \rangle_{T_{f,i}} \qquad (4.29)$$

where $\langle\cdot\rangle_{T_{f,i}}$ pulls out the element associated with $T_{f,i}$ from the enclosed vector. The sign of the estimated friction torque, see (4.28), is assumed to be the opposite of the sign of the estimated angular rate. However, when the physical angular momentum is close to zero, the sign of the physical friction torque is not clear. Therefore, when the estimated angular momentum is close to zero, the estimated torque is normally very high (in order to reduce the zero-crossing time) and the opposite sign of the previous estimated friction torque is considered.

Once the filter is designed, it must be initialised with the most accurate values available for the estimated states and the covariance matrix, using all the available information. For the states that are measured directly, the information provided by the sensors must be used. Therefore, their initial estimates are taken as the first valid sensor measurements, and their initial covariance is defined as a diagonal matrix whose main diagonal contains the sensors' noise variances, as shown in (4.30). That is the case for the spacecraft angular rates and the reaction wheel angular rates. For the attitude errors, the initial values are assumed to be zero and the covariance is the measurement error of the star trackers transformed into Gibbs vector form. For the reaction wheel friction torques, the initial state can be assumed to be zero, and the covariance must be guessed. In general, it is taken large enough to cover any uncertainty in the real initial value.

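$$\hat{\mathbf{x}}(t_{0}) = \mathrm{E}[\mathbf{x}] = \mathbf{y}_{0}, \qquad \hat{\mathbf{P}}(t_{0}) = \mathrm{E}[\mathbf{P}] = \mathrm{diag}(\boldsymbol{\sigma}_{0}) \qquad (4.30)$$

where

$$\boldsymbol{\sigma}_{0} = \left[\frac{\boldsymbol{\sigma}_{STR,1}^{2}}{4}\ \ \frac{\boldsymbol{\sigma}_{STR,2}^{2}}{4}\ \ \boldsymbol{\sigma}_{\omega}^{2}\ \ \boldsymbol{\sigma}_{\omega_w}^{2}\ \ \mathbf{P}_{f}\right], \qquad \mathbf{y}_{0} = [\mathbf{0}\ \ \mathbf{0}\ \ \boldsymbol{\omega}_{m}(t_{0})\ \ \boldsymbol{\omega}_{w,m}(t_{0})\ \ \mathbf{0}]$$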

4.1.5. Residual Generation

Once the full state vector ($\mathbf{x}$) is estimated, the next step is to generate the necessary residuals. The most straightforward way to generate a residual from the states is to compute the difference between an estimated state and its measurement. The residual vector of length $N_r$ is first defined as

$$\mathbf{r} = [r_{1}\ r_{2}\ \ldots\ r_{N_r}]^{T}. \qquad (4.31)$$

From the estimated states $\hat{\mathbf{x}}$, residuals can be generated directly as the difference between the estimate and its measurement only for the spacecraft angular rate ($\boldsymbol{\omega}$) and the RW angular rates ($\boldsymbol{\omega}_{w}$). The attitude estimate $\hat{\mathbf{q}}$ is defined in quaternion representation, which is very useful for computation but does not have a clear and transparent physical meaning. Therefore, the attitude residual is expressed in Euler angles ($\mathbf{A}$) in the inertial reference frame. It is obtained by first transforming the estimated and measured attitude quaternions ($\hat{\mathbf{q}}$ and $\mathbf{q}_{m}$) into estimated and measured Euler angles ($\hat{\mathbf{A}}$ and $\mathbf{A}_{m}$) and then computing the difference between $\hat{\mathbf{A}}$ and $\mathbf{A}_{m}$. Finally, for the RW friction torque, obtaining the residual is more complex because there is no direct measurement of it. In order to have a "measured" friction torque ($\mathbf{T}_{fm}$) in (4.35), it is necessary to define a model of the friction torque that depends only on a measured quantity. $\mathbf{T}_{fm} = [T_{fm,1}\ \ldots\ T_{fm,N_w}]^{T}$ stands for the "pseudo-measured" friction torque vector, where $T_{fm,i}$ is calculated using the RW friction torque model (3.11), which depends on the estimated angular rate of the RW, i.e.,

$$T_{fm,i} = -\zeta_{1}\tanh(\hat{\omega}_{w,i}) - \zeta_{2}\,\mathrm{sign}(\hat{\omega}_{w,i})\,|\hat{\omega}_{w,i}|^{p}, \qquad (4.32)$$

where the coefficients $\zeta_{1}$, $\zeta_{2}$ and the exponent $p$ take the values of the friction model (3.11). Now, the estimated state vector for residual generation is defined as

$$\hat{\mathbf{x}}_{r} = [\hat{\mathbf{A}}^{T}\ \hat{\mathbf{A}}^{T}\ \hat{\boldsymbol{\omega}}^{T}\ \hat{\boldsymbol{\omega}}_{w}^{T}\ \hat{\mathbf{T}}_{f}^{T}]^{T} \qquad (4.33)$$

with duplicated spacecraft attitude estimates ($\hat{\mathbf{A}}$) because the attitude is measured by two star trackers.

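The measurement vector for residual generation is

$$\mathbf{z}_{r} = [\mathbf{A}_{m,1}^{T}\ \mathbf{A}_{m,2}^{T}\ \boldsymbol{\omega}_{m}^{T}\ \boldsymbol{\omega}_{w,m}^{T}\ \mathbf{T}_{fm}^{T}]^{T}, \qquad (4.34)$$

where $\mathbf{A}_{m,1}$ and $\mathbf{A}_{m,2}$ stand for the attitude in Euler angles measured by star tracker 1 and star tracker 2, respectively. The residual vector is then

$$\mathbf{r} \triangleq \hat{\mathbf{x}}_{r} - \mathbf{z}_{r} = \begin{bmatrix} \hat{\mathbf{A}} \\ \hat{\mathbf{A}} \\ \hat{\boldsymbol{\omega}} \\ \hat{\boldsymbol{\omega}}_{w} \\ \hat{\mathbf{T}}_{f} \end{bmatrix} - \begin{bmatrix} \mathbf{A}_{m,1} \\ \mathbf{A}_{m,2} \\ \boldsymbol{\omega}_{m} \\ \boldsymbol{\omega}_{w,m} \\ \mathbf{T}_{fm} \end{bmatrix}. \qquad (4.35)$$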

4.1.6. Fault Detection Algorithm

The first step of the FDI system is to detect that a fault has occurred. To do so, the residuals ($\mathbf{r}$) are evaluated online to check whether any residual diverges from zero. Ideally, every residual is exactly zero in the fault-free case and diverges from zero only in the presence of a fault. In practice, a residual is never exactly zero because of noise, uncertainties, and other effects, although its value is usually small. When a fault occurs, the value of the residual changes abruptly and diverges from zero.

The residual signal defined in (4.35) has in total $N_r = 9 + 2N_w$ components. To detect the presence of a fault, we employ the well-known GLR test to detect changes in the mean value of each residual component $r_{i}$, $i \in \mathcal{S} \triangleq \{1, 2, \ldots, N_r\}$.

Generalized Likelihood Ratio Test

According to [8], the evaluation of residuals can be reduced, under suitable hypotheses, to the problem of identifying a change in the mean of a normally distributed random sequence. This is possible using sequential change detection algorithms. Each residual is treated individually as a sequence of independent random variables with a probability density function depending on a scalar parameter $\kappa$. Often, in residual evaluation, $\kappa$ is the mean of a Gaussian distribution ($\mu$). Consider a sequence of independent random variables $\mathbf{l}(i)$, $i = 1, 2, \ldots$, with probability density function $p_{\kappa}(z)$ depending on one scalar parameter $\kappa$. Before an unknown change time $K$, $\kappa$ is equal to $\kappa_{0}$. At time $K$, it changes to $\kappa = \kappa_{1} \neq \kappa_{0}$.

For the purpose of this algorithm, only the detection of a change in the mean of the evaluated signals is required. Therefore, it is only necessary to determine whether the condition is normal, $\kappa = \kappa_{0}$, or the parameter $\kappa$ has changed to $\kappa_{1}$. This is done by distinguishing between two hypotheses: $\mathcal{H}_{0}$, the nominal case, and $\mathcal{H}_{1}$, a change has taken place. The conditions under $\mathcal{H}_{0}$ are assumed to be known, so that the parameter $\kappa_{0}$ is known.

If the magnitude of $\kappa$ after the change is known, the cumulative sum (CUSUM) algorithm can be used for change detection. If it is not known, the Generalized Likelihood Ratio (GLR) algorithm must be used. The latter is more general and is therefore preferred over the CUSUM algorithm, especially because the changes in the magnitude of $\kappa$ due to faults are usually not known.

The GLR algorithm evaluates the log-likelihood ratio between two hypotheses, $\mathcal{H}_{0}$ (fault-free case) and $\mathcal{H}_{1}$ (faulty case). It works at discrete time instances $k$ with a moving time window of length $L$. If the $i$-th residual signal sequence can be assumed independent and Gaussian, the evaluation function for the $i$-th residual signal, derived in [7], is given by

$$g_{i}(k) = \frac{1}{2\sigma_{i}^{2} L}\left[\sum_{j=k-L+1}^{k} (r_{i}(j) - \mu_{i})\right]^{2}, \quad \forall i \in \mathcal{S} \qquad (4.36)$$

where $\mu_{i}$ and $\sigma_{i}$ are the mean and the standard deviation of the $i$-th residual signal in the fault-free case, respectively.

Finally, the decision test for the $i$-th residual signal is defined as follows

$$\lambda_{i}(k) = \begin{cases} 1, & \text{if } g_{i}(k) \geq \Upsilon_{i} \\ 0, & \text{if } g_{i}(k) < \Upsilon_{i} \end{cases} \qquad (4.37)$$

where $\Upsilon_{i} > 0$ is a fixed threshold selected by the designer. Selecting the thresholds is challenging: a compromise must be found between avoiding false alarms caused by thresholds that are too low and avoiding missed detections caused by thresholds that are too high.
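The evaluation function (4.36) and the decision test (4.37) can be sketched for a single residual component as follows. This is a minimal Python illustration, not the thesis implementation; the function names are hypothetical, and the fault-free mean and standard deviation are assumed to be known in advance.

```python
def glr_statistic(residuals, mu0, sigma0, window):
    """Windowed GLR statistic g(k) of (4.36) for one residual component.

    residuals: samples up to the current instant k (most recent last)
    mu0, sigma0: fault-free mean and standard deviation of the residual
    window: number of samples L in the moving window
    """
    if len(residuals) < window:
        raise ValueError("need at least `window` samples")
    recent = residuals[-window:]                      # samples j = k-L+1 ... k
    deviation_sum = sum(r - mu0 for r in recent)
    return deviation_sum ** 2 / (2.0 * sigma0 ** 2 * window)


def glr_decision(residuals, mu0, sigma0, window, threshold):
    """Decision test (4.37): 1 if the statistic reaches the threshold, else 0."""
    return 1 if glr_statistic(residuals, mu0, sigma0, window) >= threshold else 0
```

With a unit standard deviation and a window of 10 samples, a sustained offset of 1 in the residual gives $g = 10^{2}/(2 \cdot 10) = 5$, whereas a residual that stays at its fault-free mean gives $g = 0$.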

4.1.7. Fault Isolation Algorithm

Once a fault is detected, the FDI system must identify in which actuator or sensor the fault has occurred and, if required, the type of fault. As shown in Section 2.2.2, there are some standard techniques to achieve fault isolation. Here, however, the isolation of the considered faults is achieved by comparing a decision vector

$$\boldsymbol{\lambda} \triangleq [\lambda_{1}\ \lambda_{2}\ \ldots\ \lambda_{N_r}]^{T} \qquad (4.38)$$

with the columns of a pre-defined fault signature matrix $\mathbf{M} \in \mathbb{R}^{N_r \times (1+2N_w)}$ represented in Table 4.1, where an X represents 1. The columns of this table are the fault signatures, which unequivocally link the faults to the symptoms detected during system monitoring. For a given fault $f$, the corresponding correct decision vector is denoted $\boldsymbol{\lambda}_{f}$.

Table 4.1: Fault signatures.

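| Residual | Thruster fault | RW sensor fault 1 | … | RW sensor fault $N_w$ | RW friction fault 1 | … | RW friction fault $N_w$ |
|---|---|---|---|---|---|---|---|
| $\mathbf{A}_{1}$ | - | - | - | - | - | - | - |
| $\mathbf{A}_{2}$ | - | - | - | - | - | - | - |
| $\boldsymbol{\omega}$ | X | - | - | - | - | - | - |
| $\omega_{w,1}$ | - | X | - | - | - | - | - |
| ⋮ | | | ⋱ | | | | |
| $\omega_{w,N_w}$ | - | - | - | X | - | - | - |
| $T_{f,1}$ | - | X | - | - | X | - | - |
| ⋮ | | | ⋱ | | | ⋱ | |
| $T_{f,N_w}$ | - | - | - | X | - | - | X |

It is clear from Table 4.1 that the isolation of specific thruster faults was not considered. The task of thruster fault isolation has been extensively tackled in the literature, see for instance [22, 45] and references therein.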
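The comparison between the decision vector and the fault signatures of Table 4.1 can be sketched as follows. The Python example works at the level of the table rows (one flag per residual group) and requires an exact match with a single column. Both choices, as well as the fault labels and function names, are assumptions of this sketch rather than details stated in the thesis.

```python
def build_signatures(n_rw):
    """Return the row labels and the fault signature columns of Table 4.1."""
    rows = (["A_1", "A_2", "omega"]
            + [f"omega_w_{i}" for i in range(1, n_rw + 1)]
            + [f"T_f_{i}" for i in range(1, n_rw + 1)])
    signatures = {"thruster": [1 if r == "omega" else 0 for r in rows]}
    for i in range(1, n_rw + 1):
        # A tachometer fault shows up in the RW rate and friction torque residuals.
        signatures[f"rw_sensor_{i}"] = [
            1 if r in (f"omega_w_{i}", f"T_f_{i}") else 0 for r in rows]
        # A friction fault shows up only in the friction torque residual.
        signatures[f"rw_friction_{i}"] = [
            1 if r == f"T_f_{i}" else 0 for r in rows]
    return rows, signatures


def isolate(decision, signatures):
    """Return the fault whose signature equals the decision vector, or None."""
    matches = [name for name, column in signatures.items()
               if column == list(decision)]
    return matches[0] if len(matches) == 1 else None
```

For example, with two RWs a decision vector that flags only the rate and friction torque residuals of the second wheel is mapped to the label `rw_sensor_2`, while an all-zero decision vector matches no signature.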

Table 4.1 was defined using the logic described below, which was tested by running several simulations and analysing the effect of the faults on the residuals and on the signals computed by the GLR test in (4.36).

Thruster faults only had a significant effect on the spacecraft's attitude and angular rates. A sudden increase or decrease of the torque produced by the RCS thrusters first and directly affects the dynamics and kinematics of the spacecraft, which then differ from the expected ones. However, the many tests performed to validate the isolation algorithm showed that the residuals and GLR signals for the attitude were not affected by thruster faults as expected. The probable cause is that the controller is able to counteract the effect of the fault, so the attitude, which responds more slowly than the angular rates, is not affected immediately. In that situation, the attitude controller increases the RWs' angular rates, which delivers a torque to the spacecraft and increases the friction torque, to compensate for the extra or missing torque. However, this increase of the RW rates is expected, so the corresponding residuals do not vary significantly (perhaps slightly at the beginning, due to the delay between the AOCS and the equipment).

A fault in the RW tachometer suddenly changes the measured RW rate, so it drastically affects the RW rate residual. In addition, the estimate of the friction torque of the faulty RW also jumps, whereas the "pseudo-measured" friction torque, which only depends on the estimated RW rate (which does not change due to the fault), does not. Therefore, the RW friction torque residual also varies due to a tachometer fault.

For the RW friction torque fault, the real friction torque of the faulty RW increases, and so does the one estimated by the filter. In general, and in the study case considered here, the torque commanded to the $i$-th RW ($u_{w,i}$) includes the friction torque estimated by the commanding algorithm, in order to compensate for the friction and deliver a torque equal to the desired one. This means that, when the friction increases due to a fault, the total torque acting on the $i$-th RW ($u_{w,i} + T_{f,i}$) does not vary, so the RW rate is not affected by the friction torque fault. Because the $i$-th "pseudo-measured" friction torque only depends on $\hat{\omega}_{w,i}$, it is not affected by the friction torque fault either, so the difference between the estimated friction torque, which jumps, and the "pseudo-measured" one, which does not, diverges from zero. The estimated RW rate residual is not affected.

From Table 4.1 it is also clear that the attitude residuals are not used for isolation purposes. The logic presented here does not require attitude residuals to perform the desired fault isolation. However, these attitude residuals can be used for other purposes, e.g., STR fault detection and isolation. STR faults are outside the scope of this thesis and could be addressed in future work.

5 Simulation Results

The proposed FDI strategy must be evaluated in a simulator in order to assess its performance with respect to the defined criteria. The simulator used is the GAFE simulator; its main characteristics, together with the algorithms that were missing and had to be developed for this project, are described in Section 5.1. Then, Section 5.2 specifies all the design parameters of the spacecraft and AOCS equipment that had been defined as generic. Section 5.3 shows a single simulation run for each type of fault and for the fault-free case, with their most characteristic features. Finally, two different Monte Carlo campaigns and their results are presented in Section 5.4.

5.1. Simulator

The GAFE Simulator is a purely numerical, time-domain, computer-based simulator that supports FDIR design and validation by simulating the mission scenario of the study case, including the FDI strategy, injecting the desired faults, and observing the performance of the FDI system. It has a fully data-driven configuration, which includes the FDIR system, the AOCS algorithm configuration, and the equipment instantiation (number of units and their individual parametrisation). It provides libraries of AOCS algorithmic components and equipment (actuators, sensors, and non-AOCS devices) and a programmer's interface for new AOCS algorithmic components and new equipment. It also offers full observability and logging of all relevant information, preconditioning and visualisation of logged data, and Monte Carlo test capability.

It is a Matlab/Simulink-based simulator that runs on Matlab R2016b, and a computer with 16 GB of RAM is recommended. GAFE is public and available at the European Space Software Repository (ESSR)¹. The time step of the simulator can be set by the user; here it is set to $\Delta t = 0.1$ seconds. The solver used is the fixed-step ODE4 Runge-Kutta solver.

5.1.1. Workflow

The use of the GAFE simulator can be divided into several steps. It begins with the start-up of the GAFE library, followed by the creation of a new project or the opening of an existing one. Then, if necessary, custom equipment and/or AOCS algorithm models are created and added to the default library models. After that, the simulator modules are parameterized by modifying the parameter files of each module. If desired, tests can be defined in a separate folder, which allows test-specific parameters that differ from the default parameterization of the scenario. Then, either the default scenario or one of the defined tests is run. Finally, the simulation data are post-processed by visualising the results.

¹ https://essr.esa.int/


5.1.2. Structure and Functionality

The top-level architecture of the simulator, with all the implemented modules, is shown in Figure 5.1. The modules, their main tasks in the frame of FDIR, and their functionality are described in the GAFE User's Manual². A short description of each module is nevertheless presented here.

Figure 5.1: Top-level architecture of the GAFE Simulator. Source: GAFE Users' Manual.

The environment module comprises all the elements required to compute the spacecraft motion and to provide environment data to the equipment models. It includes a time and ephemeris parameterization, to set the initial date of the simulation using different time scales and formats; a physical environment parameterization of the gravity, atmosphere, magnetic field, and eclipse models; and a spacecraft dynamics parameterization, which allows the initial translational and rotational state to be given in different formats and defines the mass, the satellite geometry and inertia, and the frame properties.

The equipment module includes different equipment models. For sensors, it includes an accelerometer, camera, Earth sensor, global navigation satellite system receiver, light detection and ranging (LiDAR) sensor, magnetometer, rate measurement unit, star tracker, and Sun sensor. For actuators, it contains a magnetorquer, a Reaction Control System (RCS) using thrusters, and reaction wheels.

The AOCS algorithms module accommodates all AOCS algorithms, including AOCS-related FDIR algorithms such as residual generators and analytical models.

The FDIR operation states contain all the essential FDIR mechanisms used to monitor the parameters.

The System module simulates all the on-board computer actions and functions that have an impact on the AOCS but are not part of the AOCS itself. It consists of the system configuration manager, which provides basic system-level information; the equipment manager, which activates and deactivates AOCS units and manages reconfigurations based on FDIR decisions; and the AOCS mode manager, which deals with AOCS mode transitions.

² http://gafe.estec.esa.int/files/GAFE_User_Manual.pdf

The Command module is in charge of sending telecommands to the System module.

The Fault Injection module provides information about occurring faults to the other modules. Faults can have an effect on the equipment, the AOCS algorithms, and the System module.

5.1.3. Missing Features in the Simulator

The GAFE simulator was missing a few AOCS algorithms and models that are required to fully simulate the proposed FDI strategy for the considered spacecraft and mission. These are the attitude guidance algorithm, the attitude controller algorithm, the thrusters RCS model and commanding algorithms, and the non-trivial faults considered.

Guidance Algorithm

The attitude guidance algorithm generates the spacecraft guidance, in terms of attitude and angular rates, from the observation targets and the initialisation time of the slew manoeuvre. Because this thesis focuses on FDI, the algorithm was not developed from scratch; it was taken from another project, adapted, and implemented in the GAFE simulator. It has not been formally validated and verified, because whether the desired attitude is achieved or not does not influence the FDI purpose of this project.

The algorithm computes the angular acceleration reference profile in the body-fixed frame only once, at the beginning of the simulation. The profile is generated using shortest-path trajectories with a Bang-Bang manoeuvre (see [52]). In the Bang-Bang manoeuvre, the spacecraft is rotated around the axes of the body-fixed frame in three sub-manoeuvres. First, it is rotated around the Y-axis until the X-axis points to the Sun (this allows rotation around the X-axis without the risk of any instrument facing the Sun). Then, it is rotated around the X-axis as far as required. Finally, it is rotated again around the Y-axis to point the Z-axis towards the desired observation target. To calculate the angular acceleration reference profile of the Bang-Bang manoeuvre, the required rotation angles about the body-fixed axes are first computed from the observation points. Then, using the maximum allowed torque generated by the actuators and the spacecraft inertia, the angular acceleration time profile of each sub-manoeuvre is computed. Finally, the angular acceleration time profiles of all sub-manoeuvres are combined into one.
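The acceleration profile of a single sub-manoeuvre can be illustrated with the following Python sketch. It assumes a rest-to-rest rotation about one principal axis, a scalar inertia about that axis, and no limit on the angular rate, so that the profile has only an acceleration phase and an equal deceleration phase. The function name and these simplifications are assumptions of the example and are not taken from the adapted guidance algorithm.

```python
import math


def bang_bang_profile(angle, max_torque, inertia):
    """Return (acceleration(t), duration) for a rest-to-rest single-axis slew.

    angle: rotation angle of the sub-manoeuvre [rad], sign gives the direction
    max_torque: maximum torque available about the axis [Nm]
    inertia: spacecraft inertia about the axis [kg m^2]
    """
    alpha = max_torque / inertia                  # maximum angular acceleration
    t_switch = math.sqrt(abs(angle) / alpha)      # accelerate, then brake equally
    direction = math.copysign(1.0, angle)

    def acceleration(t):
        if 0.0 <= t < t_switch:
            return direction * alpha              # acceleration phase
        if t_switch <= t < 2.0 * t_switch:
            return -direction * alpha             # deceleration phase
        return 0.0                                # before or after the manoeuvre

    return acceleration, 2.0 * t_switch
```

Each phase covers half of the angle, since $\tfrac{1}{2}\alpha t_{switch}^{2} = |\text{angle}|/2$. The profiles of the three sub-manoeuvres would then be placed one after the other in time.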

Once the angular acceleration reference profile is computed, the algorithm runs online until the end of the simulation ($t_{end}$). At each time step $k$, the algorithm takes the previous reference attitude $\mathbf{q}_{k-1}$ and angular rates $\boldsymbol{\omega}_{k-1}$ of the spacecraft and calculates the current reference attitude $\mathbf{q}_{k}$ and angular rates $\boldsymbol{\omega}_{k}$ using a fixed-step fourth-order Runge-Kutta integrator. The algorithm is based only on the dynamic and kinematic equations. Therefore, errors accumulate during the simulation and, at the end, the obtained reference attitudes might diverge from the commanded observation targets. This is not important for the FDI purposes of this project, so the algorithm was used despite this issue. The algorithm is summarised in Algorithm 1.

Controller Algorithm

As with the guidance algorithm, the controller has been taken from another project, adapted, and implemented in the GAFE simulator. It is an $H_{\infty}$-based controller that computes the quaternion error between the current measured attitude and the reference attitude and transforms it into Euler error angles in the body-fixed frame. These errors are then controlled, for each axis individually, with $H_{\infty}$-based controllers. The generation of the controllers' parameters remains undisclosed for confidentiality reasons.

Algorithm 1: Guidance Algorithm
Result: Reference attitude and angular rate
initialization of parameters;
while t ≤ t_end do
    if t = Δt then
        compute angular acceleration reference profile;
        integrate attitude and angular rate (t − Δt → t);
    else
        integrate attitude and angular rate (t − Δt → t);
    end
end

Thrusters Reaction Control System

Prior to this project, the GAFE simulator implemented a very simple thrusters RCS model. It was simply an algorithm that took the torque commanded to the RCS in the body frame, $\mathbf{T}_{c}$, and added a noise, $\boldsymbol{\eta}$, to generate the delivered torque, $\mathbf{T}_{d}$, as

$$\mathbf{T}_{d} = \mathbf{T}_{c} + \boldsymbol{\eta} \qquad (5.1)$$

This model was not realistic and did not have the precision required for FDI purposes. Therefore, a more realistic and detailed model has been designed, implemented, and tested. Its validation and verification are carried out in Appendix B.1. It is composed of two parts: the commanding algorithm and the equipment model.

The thrusters RCS commanding algorithm is based on three-second cycles, which means that the command to the RCS equipment model is calculated only once, at the beginning of each cycle. This reduces the computational effort, given that computing the command to each thruster is expensive and increases the total computation time of the AOCS algorithms. At the beginning of the cycle, the algorithm calculates the opening time that each thruster (fully open) needs during the next three seconds, as a Pulse Width Modulated (PWM) signal, to achieve a force and torque equivalent to the required ones. The opening time is allocated among the different thrusters using an optimisation algorithm that minimises the total opening time. To make the model more realistic, the quantization effect is included in the opening time of the thrusters. A very simple example is shown in Figure 5.2. It shows a constant one-dimensional torque of 0.3 Nm required during the three-second cycle and the equivalent opening time of a 1 N thruster located exactly one metre from the centre of mass, whose thrust direction is perpendicular to the direction of the required torque (so the valve opening equals the torque in magnitude). The valve is fully open for 0.9 seconds, so the integrals of both signals over the three-second cycle are identical.
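The single-thruster example can be reproduced with a few lines of Python. The sketch only equates the torque impulse over one cycle and does not implement the optimisation-based allocation among several thrusters. The function name, the optional `resolution` parameter, and the rounding to the nearest multiple are assumptions made for illustration.

```python
def thruster_on_time(torque, thrust, arm, cycle=3.0, resolution=None):
    """Opening time [s] of one fully open thruster matching a constant torque.

    torque: required torque over the cycle [Nm]
    thrust: nominal thrust of the thruster [N]
    arm: lever arm of the thrust about the centre of mass [m]
    cycle: length of the command cycle [s]
    resolution: optional quantization step of the opening time [s]
    """
    on_time = torque * cycle / (thrust * arm)     # equal torque impulse over the cycle
    if resolution:
        on_time = round(on_time / resolution) * resolution
    return min(max(on_time, 0.0), cycle)          # a valve cannot open < 0 or > cycle
```

For the values of the example, `thruster_on_time(0.3, 1.0, 1.0)` gives 0.9 seconds up to floating-point rounding.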

In a real scenario, where the computer clock runs at a higher frequency, the realised opening time of the thruster valves can be almost identical to the computed one. In the simulation, which runs with larger time steps, a modification must be included to achieve the desired effect of the on-time on the realised forces and torques. This is done by opening the thruster valves only partially during their last opening time step (which is not possible in the real equipment). This is illustrated in Figure 5.3 for an example in which an on-time of 2.242 seconds is required and the simulation time step is 0.1 seconds.
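The partial opening during the last time step can be sketched as follows. The Python function returns the valve opening, between 0 and 1, for each simulation step of a cycle; its name and interface are hypothetical.

```python
def valve_profile(on_time, dt=0.1, cycle=3.0):
    """Valve opening in [0, 1] for each simulation step of one command cycle.

    The valve is fully open while the remaining on-time covers the whole step,
    partially open in the last opening step, and closed afterwards.
    """
    n_steps = round(cycle / dt)
    profile = []
    for k in range(n_steps):
        remaining = on_time - k * dt              # on-time left at the start of step k
        profile.append(min(1.0, max(0.0, remaining / dt)))
    return profile
```

For an on-time of 2.242 seconds and a step of 0.1 seconds, the valve is fully open during the first 22 steps, about 42 % open during the 23rd step, and closed for the rest of the cycle, so the time integral of the opening equals the requested on-time.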

The second part, the equipment model, is responsible for simulating the real equipment. To do so, it manipulates the thrust of each thruster ($F_{i}$) in order to apply noise ($\eta_{i}$) and, if necessary, simulate the occurring faults (see Section 3.1.4).

[Figure 5.2 plot: requested torque [Nm] and valve opening [0-1] versus time [s].]

Figure 5.2: Comparison between real scenario PWM signal and simulation scenario PWM signal of single thruster during a single cycle.

[Figure 5.3 plot: real PWM and simulated PWM valve opening [%] versus time [s].]

Figure 5.3: Comparison between real scenario PWM signal and simulation scenario PWM signal of single thruster during a single cycle.

Noise is added by taking the inputs to each thruster and perturbing the magnitude of the thrust before computing the forces and torques delivered to the spacecraft in the body frame. Manipulating the thrust magnitudes correctly is very important. The noise must not vary equally in all thrusters because, depending on the thruster configuration, a common variation could have a magnifying effect (most thrusters pointing in a similar direction) or no effect at all (symmetric geometry). The thrust magnitude noises must therefore be uncorrelated,

$$\mathrm{E}[\eta_i \eta_j] = 0, \quad \forall i, j \in S, \; i \neq j.$$
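The effect of correlated noise in a symmetric geometry can be illustrated with two opposing thrusters. The sketch below assumes a multiplicative (relative) noise model and a hypothetical noise level `sigma`; neither is taken from the thesis, and the function name `net_force` is illustrative only.

```python
import random


def net_force(nominal_n, noise_pair):
    """Net force along one axis for two opposing thrusters of equal nominal thrust.

    Assumed noise model: each thrust is scaled by (1 + noise).
    """
    f_plus = nominal_n * (1.0 + noise_pair[0])
    f_minus = nominal_n * (1.0 + noise_pair[1])
    return f_plus - f_minus


rng = random.Random(0)
sigma = 0.05  # hypothetical relative noise level

# Fully correlated noise: the same sample perturbs both thrusters and cancels.
common = rng.gauss(0.0, sigma)
correlated = net_force(1.0, (common, common))

# Uncorrelated noise: one independent sample per thruster.
independent = net_force(1.0, (rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)))

print(f"correlated noise  -> net force {correlated:+.4f} N")
print(f"independent noise -> net force {independent:+.4f} N")
```

With a common noise sample the perturbation vanishes from the net force of the symmetric pair, which is why independent samples per thruster are needed for the noise to have a realistic effect.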

The complete validation of the RCS thruster model implementation in the GAFE simulator, including the validation of the AOCS command algorithm and the equipment model, can be found in Appendix B.1.

Faults

Because the RCS thruster model has been implemented from scratch in the GAFE simulator, all the faults concerning this actuator were missing and had to be implemented. These faults are the individual thruster stuck close, stuck open, and leakage faults, and the loss of effectiveness fault affecting all the thrusters simultaneously, as presented in Section 3.1.4.

5.2. Study Case Parameters Definition

In this research project, all the assumptions with respect to the spacecraft parameters used for the simulations have been chosen for a class of spacecraft similar to Athena, with a total mass of 6752 kg and a mass distribution described by the inertia matrix

$$\mathbf{J}_{\mathbf{S}} = \begin{bmatrix} 190854 & -86 & -357 \\ -86 & 173312 & -5 \\ -357 & -5 & 28603 \end{bmatrix}.$$

It is assumed that the study case spacecraft is equipped with a set of 푁 = 12 thrusters. The RCS thruster configuration is such that torques can be generated in all three degrees of freedom. The limiting RCS torques that the AOCS can request when torque is required about one single axis only are shown in Figure 5.4. In addition, the envelopes of the RCS torques in the x-y, x-z, and z-y planes are shown in Figure 5.5, which considers both the limiting torque per isolated axis and the limiting torque when torques in all axes are required.

A set of 푁 = 4 identical (퐽_푖 = 퐽, ∀푖 ∈ 풮) RWs placed in a classical pyramidal configuration with a tilt angle 훼 is considered; see [50] for the derivation of the configuration matrix.

[Figure 5.4 plot: RCS maximum positive, maximum negative, and limiting torque [Nm] for axes X, Y, and Z.]

Figure 5.4: Maximum, minimum, and limiting generated torque per axis.

[Figure 5.5 plots: RCS torque envelopes [Nm] for the isolated-axis and all-axes cases.]

(a) X-Y plane. (b) X-Z plane. (c) Z-Y plane.

Figure 5.5: RCS torque envelope.

Thus, the nominal misalignment-free RW configuration matrix is given by

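$$\mathbf{M} = \begin{bmatrix} \cos(\alpha) & 0 & -\cos(\alpha) & 0 \\ 0 & \cos(\alpha) & 0 & -\cos(\alpha) \\ \sin(\alpha) & \sin(\alpha) & \sin(\alpha) & \sin(\alpha) \end{bmatrix}.$$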
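As a quick numerical illustration, the configuration matrix can be built for the tilt angle used in the study case (훼 = 15°, see Table 5.1) and applied to a vector of wheel torques. The function names `rw_configuration_matrix` and `body_torque` are hypothetical, and the unit wheel torques are chosen only for the example.

```python
import math


def rw_configuration_matrix(alpha_deg):
    """Nominal 3x4 configuration matrix of four RWs in a pyramidal layout."""
    c = math.cos(math.radians(alpha_deg))
    s = math.sin(math.radians(alpha_deg))
    return [
        [c, 0.0, -c, 0.0],
        [0.0, c, 0.0, -c],
        [s, s, s, s],
    ]


def body_torque(matrix, wheel_torques):
    """Map the four wheel-axis torques to a body-frame torque (matrix product)."""
    return [sum(m * t for m, t in zip(row, wheel_torques)) for row in matrix]


M = rw_configuration_matrix(15.0)
# The same torque on all four wheels cancels in x and y and adds up in z.
torque = body_torque(M, [1.0, 1.0, 1.0, 1.0])
print([round(component, 4) for component in torque])
```

The example shows the symmetry of the pyramid: opposite wheels cancel in the x-y plane, while all four wheels contribute sin(훼) to the z-axis.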

In the scenario considered here, the torque commanded by the AOCS algorithms to the actuators is allocated as follows: the torque computed by the guidance algorithm (see Section 5.1.3) is fed forward to the RCS thruster commanding algorithm, while the control torque computed by the controller algorithm (see Section 5.1.3) is fed into the RW commanding algorithm. The desaturation of the RWs with the RCS thrusters is not considered.

Also in this scenario, the spacecraft is placed in a halo orbit around L2, the second Lagrange point of the Sun-Earth system. In this orbit, the main disturbance torque (푇 ≜ ‖퐓‖) that affects the spacecraft is the solar radiation pressure, which is assumed to be constant and to affect only the y-axis of the spacecraft body-fixed frame (a requirement from the controller presented in Section 5.1.3). The simulated scenario comprises four (shortened) inertially-fixed observation phases connected by three attitude slews, see Figure 5.6. The total duration of the scenario is approximately 9000 s.

[Figure 5.6 plot: attitude Euler angles [°] about X, Y, and Z versus time [s].]

Figure 5.6: Evolution of the spacecraft attitude.

The study case spacecraft and FDI system related parameters are summarised in Tables 5.1 and 5.2. More specifically, Table 5.2 shows the constants required by the GLR test algorithm (흈 and 흁) and the thresholds used by the decision-taking algorithm (횼). The values of 흈 and 흁 are selected by running several fault-free cases, evaluating the statistical properties (mean and variance) of each GLR signal per test, and taking the average values over all the tests run.
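The calibration procedure described above (per-run statistics averaged over several fault-free runs) can be sketched as follows. This is a simplified illustration for a single signal: the function name `calibrate_glr_constants` and the toy data are hypothetical, and the population variance is used as an assumption since the estimator is not specified in the text.

```python
from statistics import fmean, pvariance


def calibrate_glr_constants(fault_free_runs):
    """Average the per-run mean and variance of one signal over several runs.

    fault_free_runs: list of runs, each run being a list of samples of the
    same signal recorded in a fault-free simulation.
    """
    run_means = [fmean(run) for run in fault_free_runs]
    run_variances = [pvariance(run) for run in fault_free_runs]
    return fmean(run_means), fmean(run_variances)


# Toy data standing in for two fault-free runs of one signal.
runs = [[0.0, 2.0], [1.0, 3.0]]
mu, variance = calibrate_glr_constants(runs)
print(f"mean: {mu}, variance: {variance}")
```

In practice the same computation would be repeated for each signal listed in Table 5.2.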

Table 5.1: Spacecraft and FDI related parameters.

Param.   Value                                              Unit
퐅        [ 1.2 1 1.2 1.3 1.2 1.2 1 1.2 1.2 0.9 1.2 1.1 ]    N
훼        15                                                 ∘
퐽        0.108                                              kgm
푇        1.6 ⋅ 10                                           Nm
휁        0.005                                              N/A
휁        10                                                 N/A
휀        10                                                 N/A
훾        10                                                 kgms
L        10                                                 s
휎        5 ⋅ 10                                             N
휎        0.003                                              Nm
휎        4 ⋅ 10                                             Nm
휎        [ 3.4 3.4 9.2 ] 10                                 rad
휎        0.21                                               rad/s
휎        [ 5.7 5.7 5.7 ] 10                                 rad/s

Table 5.2: GLR means and variances and decision taking algorithms design parameters.

Residual        흁              흈              횼
퐀    X          −1.191 · 10    1.233 · 10     50
     Y          −1.513 · 10    4.233 · 10     50
     Z          −1.138 · 10    1.018 · 10     50
퐀    X          −1.946 · 10    1.226 · 10     50
     Y          1.529 · 10     4.234 · 10     50
     Z          1.116 · 10     2.266 · 10     50
흎    X          1.111 · 10     3.144 · 10     40
     Y          1.365 · 10     3.029 · 10     30
     Z          −1.214 · 10    1.018 · 10     25
흎    1          1.328 · 10     3.666          100
     2          1.148 · 10     3.652          100
     3          −7.091 · 10    3.639          100
     4          −9.107 · 10    3.664          100
퐓    1          −3.500 · 10    3.515 · 10     50
     2          3.794 · 10     2.880 · 10     50
     3          −1.775 · 10    2.917 · 10     50
     4          −4.175 · 10    2.900 · 10     50

Once the models and algorithms are defined, the performance indices defined in Section 3.2.2 can be redefined as follows:

• Correct detection: a fault is correctly detected if any 휆 = 1 and any 휆 = 1, ∀푖 ∈ 풮.

• False alarm: a false alarm occurs if any 휆 = 1, ∀푖 ∈ 풮 and 휆 = 0×.

• Miss-detection: a fault is miss-detected if a detectable³ fault occurs (휆 ≠ 0×) and 휆 = 0, ∀푖 ∈ 풮 during the entire simulation.

• Correct isolation: the correct isolation of a fault is defined at two different levels:

- Equipment level: if the fault’s location is identified to be at the faulty thruster system or at the faulty RW.

- Fault type level: if the fault is correctly located at the faulty tachometer sensor or at the faulty wheel friction torque.

• Miss-isolation: the miss-isolation of a fault is defined using the same concept as the correct isolation index. A fault can be miss-isolated at two different levels, the equipment level and the fault type level (only applicable to RWs).

• Time to detection (RW, thruster leakage, and stuck open): it is defined as the time between the occurrence of the fault and its detection.

• Time to detection (LOE and stuck close): it is defined as the time between the first time that a faulty thruster is activated after the occurrence of the fault and the time of detection.

³ A fault is detectable if it has an actual effect on the spacecraft.
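The two definitions of the time to detection differ only in the instant from which the time is counted. A minimal sketch of that difference is given below; the function name `time_to_detection` and the argument `activation_times` (the instants at which the faulty thruster is commanded open) are hypothetical names introduced for the example.

```python
def time_to_detection(t_fault, t_detect, activation_times=None):
    """Time to detection [s] under the two definitions listed above.

    Without activation_times (RW, leakage, and stuck open faults), the time
    is counted from the fault occurrence. With activation_times (LOE and
    stuck close faults), it is counted from the first activation of a faulty
    thruster at or after the fault occurrence.
    """
    start = t_fault
    if activation_times is not None:
        later = [t for t in activation_times if t >= t_fault]
        if not later:
            return None  # the faulty thruster is never used after the fault
        start = min(later)
    return t_detect - start


# Hypothetical numbers: fault at 100 s, detection at 130 s.
print(time_to_detection(100.0, 130.0))                       # counted from the fault
print(time_to_detection(100.0, 130.0, [50.0, 120.0, 200.0])) # counted from 120 s
```

Returning `None` when the faulty thruster is never activated mirrors the case of a fault that has no actual effect on the spacecraft.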

5.3. Sample Runs Analysis

Each fault has a different effect on the spacecraft and the AOCS equipment. In order to see those effects clearly, a fault-free sample run is presented first, showing the residual and GLR signals along the simulation, which are expected to remain around zero. In addition, because there are neither states nor direct measurements of the RCS thruster actuation, a plot of the actuated force per thruster along the simulation is shown. These signals and this plot are then used as a reference to assess the effect of each considered fault on the residual and GLR signals and on the RCS thrusters. Each type of fault is represented by a sample run.

5.3.1. Fault-Free

The sample simulation of a fault-free case shows how the FDI system responds when no faults occur in the simulator. It is expected that all residuals stay around zero at all times and that the GLR test signals do not cross any threshold, so that no fault is detected; this can be seen in Figures 5.7a and 5.7b. The thresholds used for the decision making (횲) are plotted as horizontal lines in Figure 5.7b together with the GLR test functions. The colour of each threshold line corresponds to the colour of the GLR signal with which it is associated. Where only one line appears, all threshold values are equal (see Table 5.2).

(a) Residuals function r(t). (b) GLR function g(t).

Figure 5.7: Fault-free case.

For the RWs, the effect of the faults can be seen directly in the tachometer sensor signal or in the friction torque estimator. For the RCS thrusters, however, the only way to see a fault is indirectly, by looking at the dynamics and kinematics of the spacecraft. For that reason, the actuated force per thruster is taken from the simulation (see Figure 5.8, where the y-label indicates the thruster number) in order to better understand the effect of the thruster faults.

[Figure 5.8 plots: force [N] of thrusters 1 to 12 versus time [s].]

Figure 5.8: Force [N] per thruster over time in fault-free case.

5.3.2. Leakage Fault

The leakage fault represents the partial opening of a single thruster’s valve, which produces a certain force when the thruster should be closed but has no effect when the thruster is open. In the sample run presented here, the fault occurs on thruster number 6 with a magnitude of 푚_leak = 0.216 at time 푡 = 3071.6 s, as shown in Figure 5.9.

[Figure 5.9 plot: force [N] of thruster 6 versus time [s].]

Figure 5.9: Faulty thruster force in leakage fault case.

Figures 5.10a and 5.10b present the residual and GLR signals of all the states for this sample run, from which it is clear that the only GLR functions that jump over the thresholds are those of the spacecraft’s angular rates. The GLR functions of the RW friction torques also jump over the threshold, but about 1000 seconds later.

(a) Residuals function r(t). (b) GLR function g(t).

Figure 5.10: Leakage fault case.

A closer look at 흎퐒 is given in Figure 5.11 where, for each component of the spacecraft’s angular rate, the estimated (흎̂), real (흎), and measured states are displayed together. In addition, the residual signal 푟(푡) and the GLR function signal 푔(푡) of each angular rate component are shown; the latter includes a black vertical line to indicate the fault occurrence time and a red horizontal line representing the threshold.

[Figure 5.11 plots: for each angular rate component (X, Y, Z), the estimated, measured, and true values [°/sec], the residual [°/sec], and the GLR function with its threshold and the fault occurrence time, versus time [s].]

Figure 5.11: Spacecraft’s angular rates in body-fixed frame for leakage fault case.

In the first subplot of each panel of Figure 5.11 there is no clear difference between the variables presented, because the measurement follows the true state, as does the estimate, which might jump at the beginning but quickly converges to the real state. For the residual and GLR functions, on the other hand, the jump in the signals after the fault occurrence is clear.

5.3.3. Loss of Effectiveness Fault

The loss of effectiveness fault emulates a decrease in the feeding gas pressure that affects all thrusters simultaneously and decreases the force of each thruster by a certain percentage. In the sample run presented here, the magnitude of the fault is 푚_loe = 0.105 and its time of occurrence is 푡 = 1468 s, as displayed in Figure 5.12, where the y-label indicates the thruster number. From the residual and GLR signals presented in Figures 5.13a and 5.13b, respectively, it is clear that only the spacecraft’s angular rates are affected.

[Figure 5.12 plots: force [N] of thrusters 1 to 12 versus time [s].]

Figure 5.12: Force [N] per thruster over time in LOE fault.

(a) Residuals function r(t). (b) GLR function g(t).

Figure 5.13: LOE fault case.

5.3.4. Stuck Close Fault

The single-thruster stuck close fault simulates a malfunction of the thruster valve, which suddenly closes and cannot be opened again. In the sample run presented here, it affects thruster number 4 at time 푡 = 1868 s, as shown in Figure 5.14.

[Figure 5.14 plot: force [N] of thruster 4 versus time [s].]

Figure 5.14: Faulty thruster force in stuck close fault case.

This fault has an appreciable impact on the spacecraft’s angular rates, so the residuals and GLR functions are expected to jump only for those states. This is clearly shown in Figure 5.15. The friction torque signals also cross their thresholds, but later.

(a) Residuals function r(t). (b) GLR function g(t).

Figure 5.15: Stuck close fault case.

5.3.5. Stuck Open Fault

The stuck open fault has the same effect as the leakage fault, except that the valve is fully open, so the effect is more intense than for a leakage. The time evolution of the force of the faulty thruster is shown in Figure 5.16 for a sample run in which the fault occurs at 푡 = 1460 seconds on thruster number 3.

[Figure 5.16 plot: force [N] of thruster 3 versus time [s].]

Figure 5.16: Faulty thruster force in stuck open fault case.

As can be seen in Figures 5.17a and 5.17b, the residuals and GLR functions of the spacecraft’s angular rates and of the RW friction torques seem to jump simultaneously, which would imply a malfunction of the FDI isolation strategy. Zooming in on the time scale around the fault occurrence (see Figures 5.17c and 5.17d), it becomes clear that the GLR functions of 흎 jump first and that the friction torque GLR functions cross their thresholds about 1.5 seconds later.
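The role played by the order of the threshold crossings can be illustrated with a small sketch. It is not the isolation logic of the thesis: the function names, the GLR samples, and the crossing times are hypothetical, and only the idea of comparing first-crossing times is taken from the discussion above.

```python
def first_crossing(times, glr, threshold):
    """Return the first time at which a GLR signal exceeds its threshold."""
    for t, g in zip(times, glr):
        if g > threshold:
            return t
    return None


def earliest_group(crossings):
    """Name of the signal group that crossed its threshold first, if any."""
    valid = {name: t for name, t in crossings.items() if t is not None}
    return min(valid, key=valid.get) if valid else None


# Hypothetical GLR samples around a fault, sampled every 0.5 s.
times = [1399.5, 1400.0, 1400.5, 1401.0, 1401.5, 1402.0]
glr_rate = [5.0, 12.0, 60.0, 150.0, 200.0, 220.0]
glr_friction = [1.0, 2.0, 4.0, 20.0, 45.0, 80.0]

crossings = {
    "spacecraft angular rate": first_crossing(times, glr_rate, 40.0),
    "RW friction torque": first_crossing(times, glr_friction, 50.0),
}
delay = crossings["RW friction torque"] - crossings["spacecraft angular rate"]
print(earliest_group(crossings), f"(friction torque crosses {delay} s later)")
```

On the full time scale of the simulation both crossings would look simultaneous, whereas the comparison of first-crossing times separates them.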

(a) Residuals function r(t). (b) GLR function g(t). (c) Residuals function r(t) with zoom. (d) GLR function g(t) with zoom.

Figure 5.17: Stuck open fault case.

5.3.6. Reaction Wheel Friction Fault

The friction modelled in this system has two components, both of which vary due to the friction fault, increasing the friction torque in a certain RW. The sample run used here as an example includes a friction fault in RW 3 at 푡 = 2091 seconds with magnitudes 푚 = 2.95 and 푚 = 14.6. The residuals and GLR signals for all state variables are shown in Figures 5.18a and 5.18b, from which it can be seen that only the GLR signal of the friction torque of RW 3 jumps over its threshold. A closer look at the RW 3 friction torque is presented in Figure 5.18c, which shows the difference between the pseudo-measured friction torque (which does not vary because the RW angular rate does not vary) and the estimated and real ones, which clearly jump after the fault occurrence.

(a) Residuals function r(t). (b) GLR function g(t).

[Figure 5.18c plots: estimated, measured, and true RW 3 friction torque [mNm], the residual [mNm], and the GLR function with its threshold and the fault occurrence time, versus time [s].]

(c) Friction torque variable.

Figure 5.18: RW friction fault case.

5.3.7. Reaction Wheel Tachometer Fault

The RW tachometer fault simulates a malfunction of the tachometer sensor placed inside each RW. An example of this fault, occurring in RW 1 at time 푡 = 1447 s with a fault magnitude 푚 = 2.743, is shown here. The residuals and GLR signals are shown in Figures 5.19a and 5.19b, and zoomed in Figures 5.19c and 5.19d to show more clearly when the GLR signals jump over their thresholds. Both the RW angular rate and the friction torque GLR signals cross their threshold lines simultaneously, which identifies the fault as a tachometer fault in RW 1. A closer analysis of the RW 1 angular rate and friction torque is given in Figures 5.19e and 5.19f, respectively, where the estimated, measured, and real states are represented. Clearly, the true value does not vary, while the measured and estimated ones do. The wheel rate estimation error looks like a spike because the estimator quickly follows the measurement after the fault, but that spike is enough for the FDI system to correctly detect and isolate the fault.

(a) Residuals function r(t). (b) GLR function g(t).

(c) Residuals function r(t) with zoom. (d) GLR function g(t) with zoom.

[Figure 5.19e-f plots: estimated, measured, and true values, residual, and GLR function with its threshold and the fault occurrence time, for the RW 1 angular rate [rpm] and the RW 1 friction torque [mNm], versus time [s].]

(e) RW angular rate variable. (f) Friction torque variable.

Figure 5.19: RW tachometer fault case.

5.4. Monte Carlo Analysis

As described in Section 3.2.1, the MC analysis consists of two simulation campaigns: one that excludes any model uncertainties and another that includes them. Both campaigns consist of 150 runs per fault type (including the fault-free case). Note that each MC campaign was simulated in two stages, the first with 50 tests and the second with 100 tests. This was done due to time constraints: the first 50 tests were used to validate the correct performance of the FDI strategy and to get first impressions of the results, while the extra 100 runs had the objective of increasing the amount of data. The parameters that vary for each run are summarised in Table 5.3. In order to verify that the parameters presented in Table 5.3 vary as desired, histograms of each of these variables are presented for each test campaign in addition to the analysis of the results.

Table 5.3: MC related parameters.

Parameter   Value       Unit      Parameter   Value      Unit
푡           (0, 7000]   s         푚           (0, 0.5]   N/A
푚           (0, 0.5]    N/A       푚           (1, 3]     N/A
푚           (1, 20]     N/A       푚           (1, 9]     N/A
휎           0.1         ∘         휎           5          %
휎           0.001       ∘         휎           0.01       ∘
휎           0.5         ∘
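The variation of the campaign parameters and the histogram check mentioned above can be sketched as follows. The sketch assumes uniform sampling over the half-open intervals of Table 5.3 and uses one random generator with consecutive draws for the two parameters; the seed, the dictionary keys, and the function names are hypothetical and not taken from the simulator.

```python
import random


def sample_half_open(rng, low, high):
    """Uniform sample in (low, high], since rng.random() lies in [0, 1)."""
    return high - rng.random() * (high - low)


def histogram(values, low, high, bins):
    """Count the values falling in each of `bins` equal-width bins."""
    counts = [0] * bins
    width = (high - low) / bins
    for v in values:
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    return counts


rng = random.Random(2019)  # hypothetical seed
runs = [
    {
        "t_fault": sample_half_open(rng, 0.0, 7000.0),   # time of occurrence [s]
        "magnitude": sample_half_open(rng, 0.0, 0.5),    # fault magnitude
    }
    for _ in range(150)
]

print(histogram([r["t_fault"] for r in runs], 0.0, 7000.0, 7))
print(histogram([r["magnitude"] for r in runs], 0.0, 0.5, 5))
```

Inspecting such bin counts before launching a campaign is a cheap way to confirm that every parameter covers its intended range.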

5.4.1. Test Campaign Without Uncertainties

As described in the methodology, no uncertainties are included in this campaign. Therefore,

휎 = 휎 = 휎 = 휎 = 휎 = 0.

First of all, the general results of this campaign are summarised in Table 5.4, where the performance indices are presented for each fault type as a percentage of the total number of tests performed per type of fault (with the exception of the time to detection, which is given as average time and variance). The time to detection is only computed for the correctly detected cases.

Table 5.4: Campaign without uncertainties results.

                               Thruster faults                                    RW faults                   Average
                               Leakage       LOE           Close       Open          Friction     Measurement
Detection [%]                  99.33         87.33         100         100           100          100            97.78
Detection time / 흈 [s]         451.06/1444   16.38/47.43   1.63/5.05   7.93/51.63    3.59/21.01   0.155/0.0489   80.1/261.82
Equipment isolation [%]        71.33         87.33         100         100           100          100            90.89
Fault isolation [%]            N/A           N/A           N/A         N/A           100          96.67          98.34
Miss-detection [%]             0.67          12.67         0           0             0            0              2.22
Equipment miss-isolation [%]   28.66         0             0           0             0            0              4.78
Fault miss-isolation [%]       N/A           N/A           N/A         N/A           0            3.33           0.17
False Alarm [%]                0             0             0           0             0            0              0

Note that in the Average column of Table 5.4 the percentage of faults correctly isolated at fault type level is higher than the percentage of faults correctly isolated at equipment level. At first sight this should not be possible (the type of a fault cannot be correctly isolated if its equipment is not), but the fault type isolation only applies to the RWs. Looking only at the RW faults, the average of correctly isolated faults at equipment level is 100%, so the results are consistent.

With the results shown in Table 5.4, the performance of the FDI strategy for the study case with no uncertainties affecting the models can be considered almost excellent. From a total of 900 tests run, the fault is correctly detected in 97.78% of the cases, with an average time to detection of 80.1 seconds but with a very high variance of 9.5 ⋅ 10 seconds, meaning that the detection times vary widely. The fault isolation performance is also very good: in 90.89% of the cases the fault is correctly isolated at equipment level, while in only 4.78% of the cases it is miss-isolated after being detected, which happens only for the leakage fault. Regarding the isolation at fault type level, the results are even better: a fault is not correctly isolated in only 3.33% of the RW tachometer fault cases. Miss-detection happens in only 2.22% of the cases and only for leakage and LOE faults, precisely the thruster faults that have a fault magnitude. No false alarm is raised during the whole test campaign.
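As a small consistency check, the Average column of Table 5.4 for the detection and miss-detection indices can be recomputed from the per-fault percentages, assuming (as the equal number of runs per fault type suggests) that the average is a plain mean over the six fault types. The dictionary names are hypothetical.

```python
# Per-fault percentages copied from Table 5.4.
detection = {
    "leakage": 99.33, "LOE": 87.33, "stuck close": 100.0,
    "stuck open": 100.0, "RW friction": 100.0, "RW tachometer": 100.0,
}
miss_detection = {
    "leakage": 0.67, "LOE": 12.67, "stuck close": 0.0,
    "stuck open": 0.0, "RW friction": 0.0, "RW tachometer": 0.0,
}

avg_detection = sum(detection.values()) / len(detection)
avg_miss_detection = sum(miss_detection.values()) / len(miss_detection)

print(f"average detection:      {avg_detection:.2f} %")
print(f"average miss-detection: {avg_miss_detection:.2f} %")
```

Since no false alarms occur, the two averages add up to 100%.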

In order to expand on the results and identify the reasons why some cases are not fully detected and isolated, a deeper analysis of the results for each simulated fault type is carried out below. This analysis studies the influence on the FDI system’s performance of different aspects, such as the fault magnitude, the affected RW or thruster, and the current attitude manoeuvre (slew or observing period).

Thruster Leakage Fault

Figure 5.20 shows a proper distribution of the thruster leakage fault in terms of time of occurrence and affected thruster. However, regarding the distribution of the fault magnitude, there is a clear decrease in the density of cases for 0.4 < 푚_leak < 0.5. The reason is that in the first simulation stage (50 runs) the fault magnitude was defined as 푚_leak ∈ (0, 0.5], while, due to a typing error, in the second stage (100 runs) it was defined as 푚_leak ∈ (0, 0.4]. This error does not affect the results because, as will be seen later, for large leakage magnitudes the fault is always detectable and isolable, and the critical cases are those with low values of 푚_leak.

Figure 5.20: Leakage fault: simulation parameters histograms.

Table 5.4 shows a very high percentage of correctly detected cases (99.33%) and only 0.67% of miss-detection. However, the results for time to detection and equipment isolation do not seem very promising. The time to detection presents a very high mean (451.06 seconds) and an enormous variance. Therefore, the reasons why a fault is miss-detected and the influence of the fault magnitude and time of occurrence on the time to detection of correctly detected cases are studied first. Then, the reasons why in 28% of the correctly detected cases the fault is not correctly isolated (0.67% of the miss-isolated cases are due to miss-detection) and the influence of the fault magnitude and time of occurrence on the time to detection of correctly isolated cases are analysed.

Starting with the analysis of the correctly detected cases, Figure 5.21a shows a clear dependence of the time to detection on the magnitude of the leakage. Moreover, it shows that the only case not correctly detected (miss-detected) is caused by a very small leakage (푚_leak = 0.0085%). Looking at Figure 5.21b, it can be seen that there is no correlation with the time of fault occurrence. The single miss-detected case affects thruster number 7 during an attitude slew about the x-axis, but this is not relevant given that it is only one case and its cause is clearly the magnitude of the leakage.

[Figure 5.21 plots: time to detection [s] (logarithmic scale) of detected and non-detected cases; the horizontal axis of panel (a) is m_leak.]

(a) Fault magnitude. (b) Time of fault occurrence.

Figure 5.21: Leakage fault: influences w.r.t. detection performance.

Focusing now on the isolation performance, Figures 5.22a and 5.22b are shown with the time to detection in logarithmic scale to better show all the results (see Appendix C.2 for a linear scale representation). A semi-logarithmic regression line is added in Figure 5.22a to show that the time to detection and the fault magnitude have a semi-logarithmic correlation: the lower the leakage, the worse the isolation performance. It is clear that isolation is not achieved for the smaller leak magnitudes. Figure 5.22b shows no correlation with the fault’s time of occurrence. In addition to the fault magnitude and time of occurrence, other factors can influence the FDI performance, such as the affected thruster. Therefore, Figure 5.22c shows the number of times the leakage fault is miss-isolated w.r.t. the faulty thruster. While thrusters number 3, 9, 10, 11, and 12 present only one or two cases of miss-isolation, the rest present a higher number, up to 7 for thruster number 8. Figure 5.22d shows the mean time to detection and its variance per affected thruster. Again, thrusters number 3, 9, 10, 11, and 12 present a lower mean time to detection and variance. The isolation performance seems to depend on the faulty thruster, probably driven by the thruster configuration design, the spacecraft inertia, and the thrusters used during the attitude slews.
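A semi-logarithmic regression of this kind fits a straight line to the logarithm of the time to detection as a function of the fault magnitude. The sketch below uses an ordinary least-squares fit on synthetic data generated only for illustration; the function name `semilog_fit` and the numbers are hypothetical and are not the thesis results.

```python
import math


def semilog_fit(magnitudes, times):
    """Least-squares fit of log10(time) = intercept + slope * magnitude."""
    ys = [math.log10(t) for t in times]
    n = len(magnitudes)
    mean_x = sum(magnitudes) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in magnitudes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(magnitudes, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic data: time to detection drops by a factor 10 per 0.2 of magnitude.
mags = [0.1, 0.2, 0.3, 0.4]
times = [10 ** (3.0 - 5.0 * m) for m in mags]
intercept, slope = semilog_fit(mags, times)
print(f"log10(time) = {intercept:.2f} + ({slope:.2f}) * magnitude")
```

A negative slope corresponds to the trend described above: the smaller the leakage, the longer the time to detection.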

Thruster Loss of Effectiveness Fault

Given that the LOE affects all the thrusters simultaneously, only the time of injection and the fault magnitude are relevant for this type of fault. The histograms of the fault simulation parameters presented in Figure 5.23 show that they are distributed more or less uniformly, with slightly fewer cases simulated between 2000 and 4000 seconds, probably due to the low number of sample cases (only 150).

Looking at the results in Table 5.4, the fault is miss-detected in 12.67% of the cases, while the correct detection ratio is 87.33%, meaning that all the cases that are not correctly detected are miss-detections. Note also that the correct isolation ratio is 87.33% as well, which means that all correctly detected cases are also correctly isolated. Therefore, studying the correct detection is enough to analyse the fault isolation too, saving space and time.

[Figure 5.22 plots: (a), (b) time to detection [s] (logarithmic scale) of isolated and non-isolated cases, with a semi-logarithmic regression line in (a), whose horizontal axis is m_leak; (c) number of cases not isolated per thruster number; (d) mean time to detection [s] and variance per thruster.]

(a) Fault magnitude. (b) Time of fault occurrence. (c) Non-isolated number of cases per thruster. (d) Mean time to detection and variance per thruster.

Figure 5.22: Leakage fault: influences w.r.t. isolation performance.

Figure 5.23: LOE fault: simulation parameters histograms.

Starting with the relation between the fault magnitude and the time to detection and correct detection, there is a clear correlation between them, as shown in Figure 5.24a, where the non-detected (miss-detected) cases are plotted with a detection time equal to one second for plotting purposes and the time to detection is in logarithmic scale to better show all the results (see Appendix C.2 for a linear scale representation). Clearly, the miss-detected cases are those whose fault magnitude is lower than 8.5% of pressure loss, or equivalently 푚_loe < 0.085. The time to detection decreases as the fault magnitude increases up to 푚_loe = 0.25, after which there is a pattern: for 0.25 < 푚_loe < 0.35 the time to detection varies seemingly randomly, for 0.35 < 푚_loe < 0.42 it is uniform, and for 0.42 < 푚_loe < 0.5 it again varies seemingly randomly. This behaviour for 0.25 < 푚_loe < 0.50 does not seem to be random, so something else must be influencing it. Inspecting now the time of injection and its influence on the FDI performance, shown in Figure 5.24b, the pattern is identical to the one shown in Figure 5.24a for the fault magnitude. This should not be the case unless there is a direct correlation between the fault magnitude and the time of injection. For that reason, Figure 5.25 plots the time of fault occurrence versus the fault magnitude, showing two clear correlations in the form of straight lines. There are two lines because the MC campaigns were simulated in two stages: the seeds used in the simulator were different in each stage, so the correlation followed two different patterns.
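An unintended coupling between two campaign parameters, such as the one found here, can be detected before running the simulations by computing their correlation coefficient. The sketch below shows one hypothetical way in which such a straight-line pattern can arise, namely when both parameters are derived from the same underlying random sample; the exact mechanism in the simulator is not stated in the text, and all names and numbers are illustrative.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)  # hypothetical seed
shared = [rng.random() for _ in range(50)]

# Coupled case: both parameters are scaled versions of the same sample.
t_coupled = [7000.0 * u for u in shared]
m_coupled = [0.5 * u for u in shared]

# Decoupled case: the magnitude uses its own, separately seeded generator.
rng_m = random.Random(2)
m_independent = [0.5 * rng_m.random() for _ in range(50)]

r_coupled = pearson(t_coupled, m_coupled)
r_independent = pearson(t_coupled, m_independent)
print(f"coupled: r = {r_coupled:.3f}, independent: r = {r_independent:.3f}")
```

A coefficient close to one in magnitude indicates that the two parameters lie on a straight line, which makes it impossible to separate their individual influence on the FDI performance.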

[Figure 5.24 plots: time to detection [s] (logarithmic scale) of detected and non-detected cases; the horizontal axis of panel (a) is m_loe.]

(a) Fault magnitude. (b) Time of fault occurrence.

Figure 5.24: LOE fault: influences w.r.t. detection performance.

Now that the reason why Figures 5.24a and 5.24b show the same pattern is clear, the effect of the fault’s time of occurrence on the time to detection and on the correct detection must be analysed. There seems to be no effect on the miss-detected cases: these are clearly driven by the fault magnitude, which also drives the behaviour of the time to detection in the period between 600 and 3000 seconds. However, the times to detection between 3000 and 5000 seconds and between 5900 and 7000 seconds, precisely when an attitude slew is performed, behave more randomly and are in general lower than the times to detection between 2300 and 3000 seconds and between 5000 and 5900 seconds, when an observing period is going on.

This makes sense: during attitude slews the spacecraft is rotating, so a sudden loss of effectiveness in the thrusters is noticed immediately and, depending on the attitude manoeuvre being performed, the effect is detected faster or slower (with differences in the order of 1 second). If, instead, the fault occurs during an observing period, it has no effect until the spacecraft starts rotating (active thrusters), and the effect is then sensed later in time because the spacecraft rotates slowly at the beginning of the slew.

[Figure 5.25, scatter plot. Vertical axis: m_loe from 0 to 0.5; horizontal axis: Time of injection [s] from 0 to 7000. Legend: first 50 tests, last 100 tests.]

Figure 5.25: LOE fault: time of fault occurrence vs fault's magnitude.
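The unintended coupling between fault magnitude and injection time discussed for the LOE campaign can be caught before a campaign is analysed by checking the sample correlation between the drawn parameters. The following is a minimal sketch in Python, not the simulator used in this thesis: the function names, the sampling ranges, and the way the coupled draw is constructed are assumptions made for illustration only.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def draw_coupled(n_runs, seed):
    """Hypothetical faulty draw: one uniform sample drives both parameters."""
    rng = random.Random(seed)
    magnitudes, times = [], []
    for _ in range(n_runs):
        u = rng.random()
        magnitudes.append(0.5 * u)      # fault magnitude in [0, 0.5]
        times.append(7000.0 * u)        # injection time in [0, 7000] s
    return magnitudes, times


def draw_independent(n_runs, seed):
    """Independent draw: each parameter uses its own random samples."""
    rng = random.Random(seed)
    magnitudes = [rng.uniform(0.0, 0.5) for _ in range(n_runs)]
    times = [rng.uniform(0.0, 7000.0) for _ in range(n_runs)]
    return magnitudes, times


coupled_r = pearson(*draw_coupled(150, seed=1))
independent_r = pearson(*draw_independent(150, seed=1))
```

A coefficient close to one, as obtained for the coupled draw, would indicate that the influence of the two parameters on the time to detection cannot be separated from the campaign data alone.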

Thruster Stuck Close Fault

The simulation parameters of the stuck close fault are represented in Figure 5.26, which shows a behaviour close to a uniform distribution for the fault's time of occurrence and the affected thruster. The FDI performance for the stuck close fault is the most promising one. All the test runs were correctly detected and isolated, with a mean time to detection of 25.50 seconds and very low variance. However, thruster number 7 is not active at any time during the simulation. Therefore, a stuck close fault on thruster number 7 is not detectable, and it counts as correctly detected according to the definition of correct detection in Section 3.2.2. Figure 5.27a depicts a certain influence of the time of fault occurrence on the time to detection, which is in general lower if the fault occurs during an attitude slew. The time to detection is represented in logarithmic scale to better show all the results (see Appendix C.2 for the linear-scale representation).

Figure 5.26: Stuck Close fault: simulation parameters histograms.

The influence of the faulty thruster on the FDI system is described in Figure 5.27b, where the mean time to detection and the variance per thruster are shown. First of all, thruster 7 has no results, for the reasons already mentioned. Then, with the exception of thrusters 1 and 8, the mean times to detection are below one second. Thrusters 1 and 8 could have a higher mean and variance because their requested opening time (recall the thruster model described in Section 5.1.3) is very small; a stuck close fault in those thrusters would therefore have a small effect and would take longer to be detected (see Appendix C.1 for examples). Finally, thrusters 3, 5, 9, 10, and 12 have very small and similar mean times to detection (either 0.4 or 0.5 seconds) and zero variance. This is related to the definition of the time to detection for the stuck close fault (see Section 3.2.2). If a fault occurs during an observing period, the time to detection is only counted after the faulty thruster is open and, similarly to the LOE fault presented previously, when the spacecraft is not rotating the effect of the thrusters is more pronounced than when it is rotating. Given that similar effects occur for the same faulty thruster, it is not surprising that the times to detection for a given thruster are identical and therefore have zero variance.
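The per-thruster statistics discussed above can be obtained from a campaign log by grouping the times to detection by faulty thruster. The sketch below is illustrative only: the record layout and the sample values are hypothetical, and the population variance is used as an assumption, since the estimator used in the thesis is not stated here.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def stats_per_thruster(runs):
    """Group times to detection by thruster and return {thruster: (mean, variance)}.

    `runs` is an iterable of (thruster_id, time_to_detection_s) pairs.
    """
    grouped = defaultdict(list)
    for thruster_id, time_to_detection in runs:
        grouped[thruster_id].append(time_to_detection)
    return {
        thruster_id: (fmean(times), pvariance(times))
        for thruster_id, times in sorted(grouped.items())
    }


# Hypothetical campaign excerpt (values are made up for illustration).
example_runs = [
    (3, 0.4), (3, 0.4), (3, 0.4),   # identical times give zero variance
    (1, 0.5), (1, 2.5),             # spread in times gives non-zero variance
]
thruster_stats = stats_per_thruster(example_runs)
```

In this made-up excerpt, thruster 3 reproduces the zero-variance situation described in the text, because every run yields the same time to detection.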

[Figure 5.27, panels: (a) Time of fault occurrence; (b) Mean time to detection and variance per thruster. Panel (b): logarithmic vertical axis; horizontal axis: Thruster 1 to 12. Legend: Mean Effective Time to Detection [s], Variance.]

Figure 5.27: Stuck Close fault: influences w.r.t. detection performance.

Thruster Stuck Open Fault

The simulation parameters shown in Figure 5.28 are, as expected, more or less uniformly distributed. The results from the stuck open fault simulations are very similar to the stuck close fault ones: all cases are correctly detected and isolated. However, for the stuck open fault the mean time to detection is relatively small (7.93 seconds) but the variance is enormous. Figure 5.29a shows that for most of the cases the time to detection is below one second; the time to detection is plotted in logarithmic scale to show all the results despite their huge variance (see Appendix C.2 for the linear-scale representation). However, for some cases, all of them during an attitude slew, the time to detection is a bit higher (two or three seconds), and for two cases it increases up to 400 seconds. The reasons behind this could be similar to the ones for the stuck close fault, but in the opposite sense: if the requested opening time of a thruster is very high, a stuck open fault that occurs while the thruster is active would have very little impact on the spacecraft dynamics (see Appendix C.1 for examples). Considering now the results shown in Figure 5.29b, all the thrusters have mean times to detection around one second and very small variances, except for thruster number 6, which has a high mean (around 100 seconds) and a very large variance. Therefore, it seems that the two cases mentioned from Figure 5.29a are caused by thruster number 6, for the reasons already mentioned.

Figure 5.28: Stuck Open fault: simulation parameters histograms.

[Figure 5.29, panels: (a) Time of fault occurrence; (b) Mean time to detection and variance per thruster. Panel (b): logarithmic vertical axis; horizontal axis: Thruster 1 to 12. Legend: Mean Detection Time [s], Variance.]

Figure 5.29: Stuck Open fault: influences w.r.t. detection performance.

Reaction Wheel Friction Fault

Figure 5.30 depicts more or less uniformly distributed fault simulation parameters. The results for the RW friction fault are excellent. All the cases are correctly detected and isolated at both levels, equipment and fault type. The mean time to detection is very low (3.59 seconds) but the variance is quite high. Figures 5.31a and 5.31c show that there is one single case where the time to detection is very high (about 250 seconds), which increases the average time and the variance. This single case occurs on RW number one (in Figure 5.31f, RW one has the highest mean time and variance), with a faulty Coulomb friction factor almost identical to the nominal case (m_c = 1.014) and a relatively high faulty viscous friction factor (m_v = 9.8). However, the influence of the faulty friction factors on the time to detection cannot be appreciated in Figures 5.31a and 5.31c. For that reason, Figures 5.31b and 5.31d omit the single case with a very high time to detection in order to better show the effect of the fault magnitudes. The friction factor that has the most influence when faulty is clearly the Coulomb one. Regarding the influence of the fault's time of occurrence, the times to detection are slightly higher during the attitude slews, as seen in Figure 5.31e, where the time to detection is represented in logarithmic scale to better show all the results (see Appendix C.2 for the linear-scale representation).

Figure 5.30: RW friction fault: simulation parameters histograms.

[Figure 5.31, panels: (a) and (b) Fault Coulomb factor magnitude, Time to Detection [s] vs m_c; (c) and (d) Fault viscous factor magnitude, Time to Detection [s] vs m_v; (e) Time of fault occurrence; (f) Mean time to detection and variance per RW (RW 1 to 4, logarithmic vertical axis). Legend of (a) to (d): Fault Level Isolated.]

Figure 5.31: RW friction fault: influences w.r.t. detection performance.

Reaction Wheel Measurement Fault

In Figure 5.32 the fault simulation parameters are shown to be more or less uniformly distributed. The results for the RW tachometer fault are almost excellent. All the cases are correctly detected and isolated at equipment level, with very low times to detection and variances (see Figure 5.33d). Nevertheless, in 3.33% of the cases the fault is not isolated at fault type level. The mean time to detection is very low (0.155 seconds), as is the variance. To find out why some cases are not fully isolated, the influences of the fault magnitude and of the fault's time of occurrence are presented. First, Figure 5.33a shows that the fault magnitude has no influence on the time to detection or on the isolability of the fault. On the other hand, Figure 5.33b shows that all the cases not isolated at fault level occur during an attitude slew about the x-axis of the body-fixed frame, but there is no correlation between the time of fault occurrence and the time to detection. Figure 5.33c shows that the cases where the fault was not fully isolated affect only RW number one and three, which, in addition, have a higher time-to-detection variance. Moreover, considering the position of the RWs with respect to the spacecraft body frame (see Section 5.2), with RW 1 and 3 aligned in the x-z plane, the RWs whose torque affects the x-axis of the spacecraft present more isolability issues than the ones that have no effect on the x-axis (RW 2 and 4).

Figure 5.32: RW tachometer fault: simulation parameters histograms.

[Figure 5.33, panels: (a) Fault magnitude, Time to Detection [s] vs m_meas from 1 to 3; (b) Time of fault occurrence; (c) Number of times the fault type level was not correctly isolated per RW; (d) Mean time to detection and variance per RW (logarithmic vertical axis). Legend of (a): fault level isolated, fault level not isolated.]

Figure 5.33: RW tachometer fault: influences w.r.t. detection performance.

5.4.2. Test Campaign With Uncertainties

First of all, as in the previous section, the proper simulation of the uncertainty magnitudes during the MC campaign must be verified, which means checking that they follow a normal distribution. The histograms of each uncertainty magnitude are displayed in Figure 5.34, which confirms that they do follow a normal distribution around the nominal values. Note that the RW and thruster misalignments were plotted all together in order to avoid having 16 plots. The histogram of each individual actuator can be found in Appendix C.3.

The results of the performance run using MC tests including uncertainties are summarised in Table 5.5. It should be noted that the FDI strategy was optimised for the uncertainty-free case. Therefore, the campaign that includes uncertainties is expected to show a worse performance of the FDI strategy.

Figure 5.34: Uncertainties histograms.

Comparing Tables 5.4 and 5.5, there is a clear mismatch in the false alarm ratios. The introduction of uncertainties clearly increases the number of false alarm cases and therefore decreases the correct detection and isolation ratios. As expected, the introduction of uncertainties decreases the performance of the FDI strategy. To show the effect of the uncertainties on the global results, Table 5.6 presents the percentage difference between Tables 5.4 and 5.5. If a value increases from zero to a certain value, the percentage is shown as "-".
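The convention used in Table 5.6 can be written as a short function. The sketch below assumes that each entry is the relative change with respect to the uncertainty-free value; the function name is hypothetical. The stuck close fault serves as a check: all its runs were correctly detected without uncertainties (100%), no false alarms occurred in that campaign, and Table 5.5 reports a 76.47% detection ratio and a 23.53% false alarm ratio with uncertainties.

```python
def percentage_difference(baseline, with_uncertainties):
    """Relative change in percent; "-" when the baseline is zero."""
    if baseline == 0:
        return "-"
    return 100.0 * (with_uncertainties - baseline) / baseline


# Stuck close fault, correct detection ratio: 100% without uncertainties
# and 76.47% with uncertainties (Table 5.5).
close_detection = percentage_difference(100.0, 76.47)

# Stuck close fault, false alarm ratio: 0% without uncertainties
# and 23.53% with uncertainties, so no percentage can be defined.
close_false_alarm = percentage_difference(0.0, 23.53)
```

Under this assumption the detection entry evaluates to about -23.53, which agrees with the stuck close column of Table 5.6.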

Table 5.6 clearly shows how the inclusion of the uncertainties in the models affects the FDI performance. As already mentioned, the false alarm ratio increases, while the correct detection and correct isolation ratios (at both levels) decrease by around 20%. The mean time to detection increases in general, except for the stuck open fault and the RW friction fault, while all the variances increase. The miss-detection only increases for the faults that already had miss-detections in the campaign without uncertainties. As was done for the MC campaign without uncertainties, a deeper analysis of the tests is carried out to gain insight into the results and to understand how uncertainties affect the FDI strategy proposed here.

Table 5.5: Campaign with uncertainties results.

| Metric | Fault free | Thruster: Leakage | Thruster: LOE | Thruster: Close | Thruster: Open | RW: Friction | RW: Measurement | Average |
|---|---|---|---|---|---|---|---|---|
| Detection [%] | N/A | 66 | 67.33 | 76.47 | 79.33 | 80 | 89.26 | 76.40 |
| Detection time / σ [s] | N/A | 685.83 / 2.37e3 | 26.06 / 484.34 | 1.93 / 17.89 | 4.50 / 29.93 | 1.41 / 0.73 | 0.16 / 0.054 | 119.98 / 358.13 |
| Equipment isolation [%] | N/A | 44.67 | 67.33 | 76.47 | 79.33 | 80 | 84.56 | 72.06 |
| Fault isolation [%] | N/A | N/A | N/A | N/A | N/A | 76.67 | 82.55 | 79.61 |
| Miss-detection [%] | N/A | 5.33 | 12.67 | 0 | 0 | 0 | 0 | 3 |
| Equipment miss-isolation [%] | N/A | 21.33 | 0 | 0 | 0 | 0 | 4.69 | 4.34 |
| Fault miss-isolation [%] | N/A | N/A | N/A | N/A | N/A | 3.33 | 2.01 | 2.77 |
| False Alarm [%] | 25.3 | 28.67 | 20 | 23.53 | 20.67 | 20 | 10.74 | 17.66 |

Table 5.6: Percentage differences between results from no-uncertainty and uncertainty campaigns.

| Metric | Thruster: Leakage | Thruster: LOE | Thruster: Close | Thruster: Open | RW: Friction | RW: Measurement | Average |
|---|---|---|---|---|---|---|---|
| Detection [%] | -33.55 | -22.9 | -23.53 | -20.67 | -20 | -10.74 | -21.87 |
| Detection time / σ [s] | +52.04 / +64.06 | +59.10 / +77.67 | +18.40 / +254.3 | -43.25 / -42.03 | -60.72 / -96.53 | +3.23 / +9.2 | +49.73 / +36.8 |
| Equip isolation [%] | -37.36 | -22.90 | -23.53 | -20.67 | -20 | -15.44 | -20.72 |
| Fault isolation [%] | N/A | N/A | N/A | N/A | -23.33 | -14.60 | -19.04 |
| Miss-detection [%] | +695.52 | 0 | 0 | 0 | 0 | 0 | +35.14 |
| Eq miss-isolation [%] | -34.36 | 0 | 0 | 0 | 0 | - | -9.21 |
| Fault miss-isolation [%] | N/A | N/A | N/A | N/A | - | -39.64 | 1529.41 |
| False Alarm [%] | - | - | - | - | - | - | - |

First of all, the correlation coefficients between the different performance indices and the uncertainty magnitudes for the global results are presented in Figure 5.35. It is important to understand how those values are obtained. The correlation coefficients for the time to detection are obtained using the Matlab function corrcoef, which returns the matrix of correlation coefficients of a matrix whose columns represent random variables and whose rows represent observations. In that matrix, the first column contains the times to detection of each simulation and the remaining columns contain the values of the uncertainties in each test run. The rest of the correlation values are obtained using the point-biserial correlation (see [56]), which represents the correlation between a dichotomous (0 or 1) discrete variable and a continuous variable. This is used because the other performance indices are discrete dichotomous variables (e.g., correctly detected, 1, or not, 0) and the uncertainty variables are continuous. Some new uncertainties are also defined: the mean values of the principal moments of inertia, of the wheel misalignments, and of the thruster misalignments, which are calculated using the geometric mean (μ).

The results are depicted in Figure 5.35, which shows that the most important uncertainty with respect to correct detection/isolation and false alarm is the inertia of the spacecraft, with the x-axis being more influential than the y-axis, which in turn is more influential than the z-axis. For miss-detection, miss-isolation, and time to detection, there is no clear correlation with any uncertainty parameter. The same analysis displayed in Figure 5.35 is done for each type of fault (see Appendix C.3). It is interesting to observe how each type of fault is affected differently by each uncertainty. For example, for faults that imply an extra torque (leakage and stuck open) the inertia uncertainty has no correlation with the time to detection, while for faults that imply a lack of torque (LOE, stuck close, and RW friction) the inertia uncertainty does have a great impact. Misalignments seem to have a more random behaviour. For example, the misalignment of the wheels has a smaller or similar impact than the misalignment of the thrusters on the correct detection and isolation of the RW friction fault. Likewise, for the stuck close fault, the misalignment of thruster 7 has a certain influence on the correct detection of the fault, even though thruster number 7 is never required to be open and so cannot have any impact on this type of fault. The reason behind this could be the small number of MC samples (only 150 per type of fault), whereas 1050 were used for the global results.
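The two correlation measures described above can be illustrated with a short sketch. It is written in Python rather than Matlab, it is not the post-processing code of this thesis, and the data values and function names are hypothetical. It relies on a known property: the point-biserial coefficient is numerically identical to the Pearson coefficient computed with the dichotomous variable coded as 0 and 1.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def point_biserial(flags, values):
    """Point-biserial correlation between a 0/1 flag and a continuous variable."""
    group1 = [v for f, v in zip(flags, values) if f == 1]
    group0 = [v for f, v in zip(flags, values) if f == 0]
    n1, n0, n = len(group1), len(group0), len(values)
    mean1 = sum(group1) / n1
    mean0 = sum(group0) / n0
    mean_all = sum(values) / n
    # Population standard deviation of the continuous variable.
    std_all = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    return (mean1 - mean0) / std_all * math.sqrt(n1 * n0 / n ** 2)


# Hypothetical data: one uncertainty magnitude [%] and a false alarm flag per run.
uncertainty_magnitude = [8.0, 3.0, 1.0, 0.5, 2.0, 4.0, 9.0, 11.0]
false_alarm = [1, 0, 0, 0, 0, 0, 1, 1]

r_pb = point_biserial(false_alarm, uncertainty_magnitude)
r_pearson = pearson(false_alarm, uncertainty_magnitude)
```

If NumPy is used for the matrix form, note that `numpy.corrcoef` treats rows as variables by default, so `rowvar=False` is needed to match the column-per-variable layout described in the text.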

It is of interest to study the reasons behind the false alarm cases, which account for more than 20% of the cases. As already mentioned, the spacecraft's inertia uncertainties are the major cause of false alarms. Therefore, a further study of the relation between the spacecraft's inertia uncertainties and false alarms is done below, but only for the fault-free cases, because the presence of faults would not clearly reveal all the runs that could have a false alarm.

[Figure 5.35, bar charts of correlation coefficients (vertical axes from 0 to 0.2), one row per performance index: Det, Eq Iso, Fault Iso, False Alarm, Miss Det, Eq no-Iso, Fault no-Iso, Det. Time. Horizontal axis: uncertainty parameters InertiaX, InertiaZ, InertiaY, AlignStr1 and 2, AlignRw1 to 4, AlignRmu, AlignRcs1 to 12, InertiaMean, AlignRwMean, AlignRcsMean.]

Figure 5.35: Uncertainties correlation coefficients for global results.

First of all, Figure 5.36a plots the inertia uncertainties in groups of three (x, y, and z axes) for the fault-free simulations without false alarm only. For those cases, not a single inertia uncertainty is beyond ±9%. On the other hand, Figure 5.36b displays the same quantities for the fault-free simulations with false alarm only, where most of the cases have at least one inertia uncertainty magnitude beyond ±9%. This ±9% inertia uncertainty magnitude is not a fixed threshold that clearly defines the maximum inertia uncertainty that produces false alarm. Moreover, it depends on the axis, so a study of the inertia uncertainty per axis is done. In Figures 5.37a, 5.37b, and 5.37c the inertia uncertainties of the different axes for all runs are plotted against each other, together with the presence or absence of false alarm. The ranges of inertia uncertainty that do not produce false alarm are more or less clear for the x and y axes: [−7%, 7%] for the x-axis and [−10%, 7%] for the y-axis. For the z-axis the range is not so well defined, but it is still within [−7%, 10%].

[Figure 5.36, panels: (a) No false alarm; (b) False alarm. Vertical axis: Uncertainty [%]; horizontal axis: Cases. Legend: Uncertainty Inertia X, Uncertainty Inertia Y, Uncertainty Inertia Z.]

Figure 5.36: Fault-free simulations' inertia uncertainties magnitudes.

Another way to see the correlation of the false alarm cases with the spacecraft inertia uncertainties is shown in Figures 5.38a, 5.38b, and 5.38c, which show the probability density function of the uncertainty magnitudes for the cases without false alarm (in green) and for the false alarm cases (in red). For the cases without false alarm the uncertainty magnitudes have a mean close to zero and a standard deviation of around 4%, while for the false alarm cases they have a mean further from zero and a greater standard deviation, close to 7%. These results support the previous outcomes: the greater the spacecraft inertia uncertainties, the higher the probability of false alarm.
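A comparison of this kind can be sketched by splitting the uncertainty samples according to the false alarm flag and fitting a normal distribution to each group. This is only an illustration: whether the densities in Figure 5.38 are normal fits is an assumption, and the sample values and function names below are hypothetical.

```python
import math
from statistics import fmean, pstdev


def fit_normal(samples):
    """Return (mean, standard deviation) of a normal fit to the samples."""
    return fmean(samples), pstdev(samples)


def normal_pdf(x, mean, std):
    """Probability density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def split_by_false_alarm(uncertainties, false_alarm_flags):
    """Split uncertainty samples into (no false alarm, false alarm) groups."""
    quiet = [u for u, flag in zip(uncertainties, false_alarm_flags) if not flag]
    alarm = [u for u, flag in zip(uncertainties, false_alarm_flags) if flag]
    return quiet, alarm


# Hypothetical x-axis inertia uncertainties [%] and false alarm flags.
inertia_x = [-4.0, -2.0, 0.0, 2.0, 4.0, -10.0, 8.0, 11.0]
flags = [False, False, False, False, False, True, True, True]

quiet, alarm = split_by_false_alarm(inertia_x, flags)
quiet_mean, quiet_std = fit_normal(quiet)
alarm_mean, alarm_std = fit_normal(alarm)
```

With these made-up samples the false alarm group shows a mean further from zero and a larger standard deviation, which is the qualitative pattern reported in the text.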

[Figure 5.37, scatter plots, panels: (a) Inertia X-Y; (b) Inertia Z-Y; (c) Inertia X-Z. Axes: Uncertainty X, Y, and Z [%] from −15 to 15. Legend: Not false alarm, False alarm.]

Figure 5.37: Inertia uncertainty magnitudes vs false alarm.

[Figure 5.38, panels: (a) X-axis; (b) Y-axis; (c) Z-axis.]

Figure 5.38: Inertia uncertainty probability function distribution for no-false/false alarm cases.

5.5. Discussion on Simulation Results

In this chapter, the simulation results of each specific test are discussed right after they are presented, because it is easier and clearer to examine the outcomes of the simulation tests together with the plots that present them. The cases with unexpected results, and the reasons that led to them, are also evaluated there. A general discussion of the simulation results and their limitations is given here.

From the sample test runs, the effect that each fault has on the considered system, as well as the response of the FDI system, is clear. For all the presented cases, the expected GLR signals cross the thresholds after the fault has occurred. Nevertheless, in the sample test runs of the leakage, stuck close, and stuck open faults, the RW friction torque GLR signals also cross the thresholds, but later than the angular rate GLR signals. This is because, when the RW is saturated, the pseudo-measured friction torque no longer varies (maximum angular rate) but the estimate of the friction torque keeps increasing, so the residual diverges from zero. Despite that, all the simulated faults are correctly detected and isolated at all levels. However, a single sample run does not represent all the situations in which a certain fault can occur, which is why the Monte Carlo campaigns are carried out.

The performance of the FDI strategy against the considered faults is characterised by the results of the two MC campaigns. The MC campaign that does not include uncertainties presents very promising results, with a very high percentage of correct fault detection and isolation, low times to detection, very low miss-detection, and no false alarms. For the types of fault where the percentages did not reach 100%, the reasons are also explained. Regarding the MC campaign that includes uncertainties, it can be concluded that the uncertainties do have a great impact on the FDI performance, especially the uncertainties on the inertia of the spacecraft. They decrease the correct detection and isolation ratios, increase the false alarm ratio, and change the times to detection.

The results from the MC campaigns are very valuable, and the number of tests performed gives reliability and quality to the results obtained. Nevertheless, that reliability and quality are restricted by the limitations of the MC tests. The first limitation is that only one guidance profile is used, meaning that the same path, in terms of attitude and angular rates, is followed in all the tests. This implies that the obtained results are only valid for this particular attitude profile. The second limitation concerns the mission environment of the study case, where the position of the spacecraft with respect to the Sun and the Earth does not vary during the simulation, and the disturbance torque produced by the solar pressure is constant and only affects the y-axis of the body-fixed frame. This results in a less realistic environment than that of a real mission. The third limitation concerns the sensors and actuators, for which only zero-mean Gaussian white noise is considered. Other imperfections that affect sensors and actuators in real spacecraft include, for example, biases, other types of noise, and thermal dependence.

6 Conclusions and Recommendations

An innovative model-based FDI system to detect and isolate faults of AOCS thrusters and reaction wheels for an agile spacecraft, developed at Airbus Defence and Space GmbH, Friedrichshafen, Germany, during the MSc thesis stage, is presented. The strategy differs from usual schemes in being able to handle multiple actuators working simultaneously and to distinguish different types of faults of the same equipment (e.g., a RW actuation/friction fault from a RW tachometer fault). The approach followed in this research project tests the developed theoretical method in a simulation scenario to evaluate its performance. It includes a literature study to gather all the relevant information regarding different aspects, such as the Athena mission characteristics and requirements, followed by a definition of the study case and of the methodology to evaluate the FDI system's performance. Then, an FDI system is proposed. Finally, the proposed FDI system is simulated on the defined study case following a clear methodology, and the simulation results are evaluated using the defined criteria in order to correctly assess its performance.

The main goal of this research project is to answer the research question presented in Section 1.4.3 and its derived sub-questions. Each research sub-question is answered in Section 6.1, which then allows the main research question to be answered in Section 6.2. The FDI system presented here, and its results, have opened an interesting line of investigation for future research; the main recommendations and future work are presented in Section 6.3.

6.1. Research Sub-Questions

Four research sub-questions were defined in Section 1.4.3. After the research project elaborated here, they can be fully answered.

6.1.1. Sub-Question: Study Case Model

This sub-question was formulated as

How can the study case mission be modelled for FDI purposes?

This sub-question intends to model an agile spacecraft that makes use of multiple actuators simultaneously, its AOCS sensors, actuators, and faults, and the model uncertainties.

The spacecraft here considered is a rigid body and only rotational dynamics and kinematics are considered. The reason behind it is that this project is focused only on the spacecraft’s attitude and not on its orbit. The body dynamics are affected by the actuators torques and angular momentum and by external torques. Finally, the kinematics of the spacecraft are modelled using quaternion parametrization.
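The rigid-body model summarised above can be sketched with the textbook form of Euler's rotational equation and the quaternion kinematics. The exact equations, sign conventions, and integrator of the thesis are not reproduced in this section, so the following Python sketch rests on stated assumptions: a diagonal inertia matrix, a scalar-last Hamilton quaternion, wheel angular momentum expressed in the body frame, and a simple explicit Euler step. All names are hypothetical.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def attitude_step(q, omega, inertia, torque, h_wheels, dt):
    """One explicit Euler step of rotational dynamics and kinematics.

    q        : attitude quaternion [x, y, z, w] (scalar-last, an assumption)
    omega    : body angular rate [rad/s]
    inertia  : principal moments of inertia [kg m^2] (diagonal inertia assumed)
    torque   : total torque on the body (actuators plus external) [N m]
    h_wheels : angular momentum stored in the wheels, body frame [N m s]
    """
    # Euler's equation: J * omega_dot = torque - omega x (J * omega + h_wheels)
    h_total = [inertia[i] * omega[i] + h_wheels[i] for i in range(3)]
    gyro = cross(omega, h_total)
    omega_dot = [(torque[i] - gyro[i]) / inertia[i] for i in range(3)]

    # Quaternion kinematics: q_dot = 0.5 * q (x) [omega, 0]
    x, y, z, w = q
    wx, wy, wz = omega
    q_dot = [
        0.5 * (w * wx + y * wz - z * wy),
        0.5 * (w * wy + z * wx - x * wz),
        0.5 * (w * wz + x * wy - y * wx),
        -0.5 * (x * wx + y * wy + z * wz),
    ]

    omega_next = [omega[i] + dt * omega_dot[i] for i in range(3)]
    q_next = [q[i] + dt * q_dot[i] for i in range(4)]
    # Re-normalise so that the quaternion keeps unit length.
    norm = math.sqrt(sum(c * c for c in q_next))
    return [c / norm for c in q_next], omega_next
```

For example, a torque-free spin about a principal axis leaves the angular rate unchanged while the quaternion advances about that axis, which is a quick sanity check for the signs used in the sketch.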

Both AOCS sensors and actuators are affected by Gaussian zero-mean white noises and misalignments.


The thruster torque contribution is defined by the position of each thruster with respect to the spacecraft's CoM and by its direction, both of which are affected by misalignments. The RW models consider three types of torque: the commanded torque, the friction torque, and the actuation noise torque. The RW dynamics are those of a flat rotating disk, and the torque contribution of each RW to the spacecraft depends on the direction of its rotation axis with respect to the spacecraft body, which is also affected by misalignments.

The sensor models are very simple: noise and misalignment effects are added to the real values in order to simulate the measured ones.
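A sensor model of this kind can be sketched as follows. The sketch is an assumption-based illustration, not the thesis model: the misalignment is represented by a small-angle rotation, the noise is zero-mean Gaussian and independent per axis, and the function name and numerical values are hypothetical.

```python
import random


def measure_vector(true_value, misalignment_rad, noise_std, rng):
    """Simulate a 3-axis measurement with small misalignment and white noise.

    The misalignment is modelled as a small-angle rotation (I - [theta x]),
    which is an assumption of this sketch.
    """
    tx, ty, tz = misalignment_rad
    vx, vy, vz = true_value
    # (I - [theta x]) v = v - theta x v
    rotated = [
        vx - (ty * vz - tz * vy),
        vy - (tz * vx - tx * vz),
        vz - (tx * vy - ty * vx),
    ]
    # Add zero-mean Gaussian white noise to each axis.
    return [component + rng.gauss(0.0, noise_std) for component in rotated]


rng = random.Random(0)
true_rate = [0.01, -0.02, 0.005]  # rad/s, hypothetical values
ideal = measure_vector(true_rate, [0.0, 0.0, 0.0], 0.0, rng)
noisy = measure_vector(true_rate, [1e-3, 0.0, -1e-3], 1e-5, rng)
```

With zero misalignment and zero noise the function returns the true value, so the two effects can be switched on separately when studying their influence.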

The considered thruster faults are leakage, stuck open, and stuck close of a single thruster, and loss of effectiveness (LOE) of all thrusters simultaneously. For the RWs, an increase of the internal friction and an increase of the rate measured by the tachometer with respect to the real rate are considered.

The considered uncertainties affect the spacecraft's principal moments of inertia and the sensor and actuator misalignments.

6.1.2. Sub-Question: Methodology

This sub-question was defined as

What methodology can be used to evaluate the performance of the model-based FDI system?

This sub-question intended to develop a clear methodology to evaluate the performance of the FDI system by defining assessment criteria, different study case mission scenarios, and evaluation process of the results.

The assessment of the FDI system performance is done by testing it on a simulator. The selected methodology comprises a single test run analysis for each type of fault and for the fault-free case, and two Monte Carlo campaigns. The single test run analysis has the objective of understanding the effect of the fault on the system and how the FDI system responds. The first MC campaign, which excludes any uncertainty, aims to evaluate the FDI performance with respect to the fault magnitudes and times of occurrence, which vary in each run of the tests. The second MC campaign does include uncertainties, which also vary in each run, and has the objective of assessing the FDI performance with respect to the uncertainty magnitudes. The number of runs per test has been evaluated and, taking into account the memory and computation time required for each single test run, it has been decided to perform 150 runs per test.

The FDI performance assessment criteria are focused on correct detection, correct isolation at equipment and fault type level, false alarms, miss-detection, miss-isolation, and time to detection (detection and isolation are done simultaneously).

The evaluation of the results can be done in a post-processing step, which includes the single test run analysis and the MC campaign analysis. For single test runs, the performance of the FDI strategy can be evaluated by analysing, for example, the time evolution of the residuals and GLR test functions, and the differences between the estimated, measured, and real values provided by the simulator. Regarding the post-processing of the MC campaigns, the mean and variance of the times to detection per type of fault can be obtained, for example, as well as the ratios of correct detection or false alarm.
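The MC post-processing described above can be sketched as a small summary function. The outcome labels and the record layout are hypothetical, and the formal definitions of each outcome (Section 3.2.2) are not reproduced here; the sketch only shows how ratios and time-to-detection statistics could be derived once every run has been classified.

```python
from collections import Counter
from statistics import fmean, pvariance


def campaign_summary(runs):
    """Summarise a Monte Carlo campaign.

    `runs` is a list of (outcome, time_to_detection) pairs, where outcome is a
    label such as "detected", "miss_detection", or "false_alarm" and the time
    is None when no valid detection took place.
    """
    counts = Counter(outcome for outcome, _ in runs)
    ratios = {outcome: 100.0 * n / len(runs) for outcome, n in counts.items()}
    times = [t for outcome, t in runs if outcome == "detected" and t is not None]
    mean_time = fmean(times) if times else None
    variance = pvariance(times) if times else None
    return ratios, mean_time, variance


# Hypothetical four-run campaign used only to exercise the function.
example = [
    ("detected", 1.0),
    ("detected", 3.0),
    ("miss_detection", None),
    ("false_alarm", None),
]
ratios, mean_time, variance = campaign_summary(example)
```

Only the correctly detected runs contribute to the time statistics in this sketch, which is a design choice made here for illustration.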

6.1.3. Sub-Question: AOCS FDI System

This sub-question was formulated as

Which FDI system can be proposed as a solution to a spacecraft's AOCS?

This sub-question aims to design an FDI system capable of detecting and isolating AOCS sensor and actuator faults for agile spacecraft that make use of multiple actuators simultaneously, by designing residual generation methods and residual evaluation algorithms.

The FDI system designed here proposes the use of a mix of an EKF and a MEKF to estimate the spacecraft angular rates and attitude, and the RW angular rates and friction torques. The residuals are then generated by computing the difference between the estimates and the values measured by the sensors (including a pseudo-measured RW friction torque).

The evaluation of the residuals to achieve fault detection is done by a detection algorithm which makes use of a GLR test and a decision making algorithm that compares the GLR test signals with fixed thresholds. Finally, the fault isolation is accomplished by comparing the decision vector obtained in the fault detection algorithm with a pre-defined fault signature matrix which unequivocally links the faults to the symptoms detected.
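The detection and isolation chain described above can be illustrated with a compact sketch. It uses the textbook GLR statistic for an unknown mean jump in Gaussian white residuals with known standard deviation, evaluated over a sliding window; the exact GLR formulation, thresholds, and fault signature matrix of this thesis are not reproduced here, so the two-residual signature table, the threshold values, and all names below are hypothetical.

```python
def glr_mean_change(residuals, sigma, window):
    """GLR statistic for an unknown mean jump in Gaussian white residuals.

    For each candidate change time j inside the sliding window, the statistic is
    (sum of residuals from j to k)^2 / (2 * sigma^2 * number of samples), and
    the maximum over j is returned for the latest sample k.
    """
    recent = residuals[-window:]
    best = 0.0
    running_sum = 0.0
    # Walk backwards so that running_sum covers samples j..k.
    for count, value in enumerate(reversed(recent), start=1):
        running_sum += value
        best = max(best, running_sum ** 2 / (2.0 * sigma ** 2 * count))
    return best


def isolate(decision_vector, signature_matrix):
    """Return the faults whose signature equals the decision vector."""
    return [name for name, signature in signature_matrix.items()
            if list(signature) == list(decision_vector)]


# Hypothetical two-residual example (rate residual, RW friction residual).
SIGNATURES = {
    "thruster fault": (1, 0),
    "RW friction fault": (0, 1),
}
THRESHOLDS = (20.0, 20.0)

rate_residual = [0.0] * 20 + [0.5] * 20   # mean jump after sample 20
friction_residual = [0.0] * 40            # stays at zero
glr_values = (
    glr_mean_change(rate_residual, sigma=0.1, window=30),
    glr_mean_change(friction_residual, sigma=0.1, window=30),
)
decision = tuple(int(g > t) for g, t in zip(glr_values, THRESHOLDS))
isolated = isolate(decision, SIGNATURES)
```

Because each fault in the table has a distinct signature, a decision vector matches at most one entry; a vector that matches no entry would be returned as an empty list and would need separate handling.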

6.1.4. Sub-Question: FDI System Performance

This sub-question was formulated as

How does the proposed FDI system perform with respect to the defined criteria in the study case mission?

This final sub-question aims to determine how the FDI system proposed here performs in a given study case according to the assessment criteria.

The evaluation shows very good performance of the FDI strategy in terms of time to detection and of correct detection and isolation rates. In addition, it reveals a clear dependence of the correct detection and isolation, and of the time to detection, on the magnitude of the faults, as well as a certain dependence on the fault's time of occurrence. Moreover, it also shows a correlation between the magnitude of the uncertainties (especially the inertia of the spacecraft) and the triggering of false alarms, and therefore incorrect FDI performance.

6.2. Research Question

The main research question was

Is it possible to develop a model-based FDI system for agile spacecraft's AOCS capable of detecting and isolating faults occurring in both sensors and actuators, when they work individually or simultaneously, and to assess its performance with respect to defined criteria?

By answering all the sub-questions, as done in Section 6.1, the main research question can also be answered with satisfaction.

The FDI strategy proposed here has proved to be capable of detecting and isolating AOCS sensor and actuator faults in agile spacecraft that use multiple actuators, and its performance has been evaluated following a methodology and defined criteria.

The simulation results are very valuable and provide very good insight given the number of tests performed. However, its quality and reliability are confined by the limitations of the models and MC campaigns. Those limitations include a single attitude profile, a not very re- alistic space environment (disturbances), and that the considered noise on the sensors and actuators is only zero-mean Gaussian white noise.

Despite the limitations of the project, the results provide a starting point to future work that will pretend to implement a functional FDI system for agile spacecrafts. The implication of the results obtained in the scientific community of the fault detection and isolation field is that is it possible to detect and isolate faults that occur on both AOCS sensors and actuators 68 6. Conclusions and Recommendations using a model-based FDI strategy. Moreover, it shows that identifying the actuator and the type of fault on reaction wheels is feasible. And that the performance of the FDI strategy can be evaluated using defined criteria, and therefore, can be quantitatively compared with other strategies and approaches.

6.3. Recommendations and Future Work

Over the course of this research project, some simplifications were introduced to make the project feasible, and these imply some limitations. Therefore, as recommendations on this topic, some of these simplifications can be removed to bring the simulation and models closer to reality.

A recommended first step is to include more sensor deviations, such as biases, other types of noise, and temperature dependence. Biases are constant offsets that affect the measured values. Other types of noise can make the measurement simulation more realistic. Temperature dependence would simulate the effect that a change in temperature has on a sensor. First, the effect of such sensor deviations could be studied to determine whether, and by how much, the FDI system performance decreases. Then, if the FDI performance is not good enough, an improvement of the models and estimation algorithm would be recommended to overcome these drawbacks. For example, the bias of a sensor can be estimated as a new state of the filter so that its effect can be taken into consideration. An improvement of the models to include temperature dependence and other types of noise could also be of interest.
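As a minimal sketch of the idea of estimating a sensor bias as a filter state, the following scalar Kalman filter treats a constant bias as its only state. It assumes, as a simplification, that the true signal is known; in the EKF of this thesis the bias would instead be appended to the existing state vector. All names and numerical values are illustrative assumptions.

```python
import random


def estimate_bias(measurements, references, meas_var, p0=1.0, q=0.0):
    """Scalar Kalman filter whose only state is a constant sensor bias.

    measurements: sensor readings; references: assumed-known true values
    (a simplification); meas_var: measurement noise variance; p0: initial
    bias variance; q: process noise that would allow a slowly drifting bias.
    """
    b_hat, p = 0.0, p0
    for z, ref in zip(measurements, references):
        p += q                                # predict: bias modelled as constant
        gain = p / (p + meas_var)             # Kalman gain
        b_hat += gain * ((z - ref) - b_hat)   # correct with the innovation
        p *= 1.0 - gain                       # updated bias variance
    return b_hat, p


rng = random.Random(0)
truth = [0.01 * k for k in range(2000)]
readings = [x + 0.05 + rng.gauss(0.0, 0.01) for x in truth]  # true bias: 0.05
bias, variance = estimate_bias(readings, truth, meas_var=0.01 ** 2)
print(f"estimated bias: {bias:.4f}")
```

With `q = 0` the estimate converges towards the sample mean of the measurement offsets; a small positive `q` would let the filter follow a slowly varying bias at the cost of a noisier estimate.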

Next, the environment of the study case mission could be modelled more realistically. For instance, the disturbance could be modelled as a non-constant variable that depends on the attitude of the spacecraft with respect to the Sun, and the orbital period of the spacecraft could also be taken into account. With this change in the disturbance model, the attitude controller considered in this project might no longer provide sufficiently accurate control, so the controller might need to be retuned to deal with the new environmental disturbance. Given that solar pressure is the main disturbance in the considered space environment, considering other types of disturbances would not be necessary.

Some of the results allow no clear conclusions given the low number of runs per test. Therefore, increasing the number of runs in the MC campaigns could give more insight into the FDI system performance. Regarding future work that is not related to the limitations and simplifications of the project, different lines of work could be of interest. For example, an improvement of the FDI strategy performance could be considered. Also, other types of actuators, like CMGs, could replace the RWs, making the spacecraft more agile and improving its attitude performance.

The improvement of the FDI system performance can be approached in different ways. Based on the results obtained in this research project, one improvement could focus on the effect of uncertainties on the FDI system performance. Given that the uncertainty with the largest effect on FDI performance is the spacecraft inertia, including an estimator that estimates the spacecraft inertia online would reduce this uncertainty and therefore improve the FDI performance. Some work has been done on this topic; see, for instance, [35], [36], and [57]. Also, a more refined tuning of the different algorithms used in the FDI system could be carried out to improve its performance. For example, the decision-making algorithm uses fixed thresholds to determine the presence of a fault. These fixed thresholds could be replaced by adaptive thresholds, which, if well defined, could improve fault detection performance. In addition, the filter used for estimation could be tuned more finely to obtain a better estimate of the states.
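One possible form of adaptive threshold, shown here only as a hedged sketch, derives the threshold from the recent fault-free history of the test signal (mean plus a multiple of the standard deviation over a sliding window). This particular rule, the class name, and all parameter values are assumptions made for illustration; the thesis itself uses fixed thresholds and does not specify an adaptive scheme.

```python
from collections import deque
import statistics


class AdaptiveThreshold:
    """Sliding-window threshold: max(floor, mean + k * stdev) of past values."""

    def __init__(self, window: int, k: float, floor: float):
        self.history = deque(maxlen=window)  # recent values that raised no alarm
        self.k = k
        self.floor = floor                   # lower bound on the threshold

    def threshold(self) -> float:
        if len(self.history) < 2:
            return self.floor
        spread = statistics.stdev(self.history)
        return max(self.floor, statistics.mean(self.history) + self.k * spread)

    def update(self, value: float) -> bool:
        """Return True on alarm; otherwise add the value to the history."""
        alarm = value > self.threshold()
        if not alarm:
            self.history.append(value)
        return alarm


detector = AdaptiveThreshold(window=50, k=5.0, floor=1.0)
quiet = [detector.update(v) for v in [0.1, 0.2] * 25]
jump = detector.update(10.0)
print(any(quiet), jump)
```

Values that trigger an alarm are deliberately kept out of the history so that a persistent fault does not raise the threshold and mask itself.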

Another proposal for future work is the replacement of the RWs by CMGs. CMGs, like RWs, are spinning flywheels, but the difference is that their rotation axis can itself be rotated by an additional motor on the gimbal axis. Different types of CMG exist, but as a starting point single-gimbal CMGs could be considered, whose wheel rotation axis can rotate in one plane. The position of the wheel (and hence the torque direction) must therefore be known at every moment, which requires additional sensors. Controlling the spacecraft attitude and allocating the torque among the different CMGs are complex and arduous tasks, but the advantages in terms of agility and torque delivery make CMGs an interesting line of investigation.

Finally, the last recommendation for future work is hardware-in-the-loop simulation, with the objective of testing the FDI system in a complex real-time embedded system. The FDI system would be tested on a real system rather than on the simulated system used so far, which would give a much more realistic picture of the FDI performance. The FDI system would be fed with the actuator inputs provided by the AOCS control algorithm and with the output signals of the real sensors in the hardware set-up. One example of a hardware-in-the-loop infrastructure where this could be done is Intrepid, an AOCS test bed for agile spacecraft developed in the frame of the HOREOS agile project at Airbus Defence and Space in Friedrichshafen, Germany, in collaboration with the Institute of Flight Mechanics and Control of the University of Stuttgart and the Space Agency of the German Aerospace Center. The Intrepid test bed consists of an air-bearing table with three rotational degrees of freedom. The relevant equipment modules located on the air-bearing test platform are the CMG actuators (which can be fixed to act as RWs), the on-board processing unit, and the attitude and rate sensors. To achieve a high similarity with in-orbit spacecraft dynamics, the centre of mass is aligned with the centre of pivot by means of a dedicated mass balancing system.

A Appendix: Derivations

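A.1. Process Noise Matrix Q

The process noise covariance matrix $\mathbf{Q}$ is derived individually for each of the estimated states. Because the process noise is defined as a random process, $\mathbf{Q}$ can be computed as the covariance matrix of the effect of the noise on the system models ($\delta\boldsymbol{\eta}$) using Equation 3.1. Therefore, the contribution of a noise $\boldsymbol{\eta}$ to the process noise matrix is obtained as

$$\mathbf{Q}_{\eta} = \mathrm{E}[\delta\boldsymbol{\eta}\,\delta\boldsymbol{\eta}^{T}] \tag{A.1}$$

The attitude error process noise is theoretically zero (there are no discrepancies between the model and reality), but in practice there are always numerical errors due to computational limitations, so very small values, $\varepsilon$, are taken:

$$\mathbf{Q}_{\delta g} = \mathbf{I}_{3\times 3}\,\varepsilon \tag{A.2}$$

For the spacecraft angular rates, the derivation begins from the dynamic equation of the spacecraft presented in Equation 3.17, from which the effects of the noises, defined as $\delta\boldsymbol{\eta}$, are isolated from the effects of the commanded torques. For clarity, the effect of the thruster actuation noise, $\delta\boldsymbol{\eta}_{T}$, is derived first, and then the effect of the RW actuation noise, $\delta\boldsymbol{\eta}_{R}$.

Start by isolating the effect of the noise $\eta_{T,i}$ in Equation 3.8, considering that no uncertainties are applied ($\epsilon_i = 0$):

$$\mathbf{b}_{F,i} = -\mathbf{d}_i F_i - \mathbf{d}_i \eta_{T,i} \tag{A.3}$$

Then, using Equation 3.7,

$$\mathbf{b}_{T,i} = -\mathbf{r}_i \times (\mathbf{d}_i F_i + \mathbf{d}_i \eta_{T,i}) = -\mathbf{r}_i \times \mathbf{d}_i F_i - \mathbf{r}_i \times \mathbf{d}_i \eta_{T,i} \tag{A.4}$$

and defining the matrix $\mathbf{B}_T = [\mathbf{b}_{T,1}\ \mathbf{b}_{T,2}\ \dots\ \mathbf{b}_{T,N}]^{T}$,

$$\mathbf{B}_T = \begin{bmatrix} -\mathbf{r}_1 \times \mathbf{d}_1 F_1 - \mathbf{r}_1 \times \mathbf{d}_1 \eta_{T,1} \\ -\mathbf{r}_2 \times \mathbf{d}_2 F_2 - \mathbf{r}_2 \times \mathbf{d}_2 \eta_{T,2} \\ \vdots \\ -\mathbf{r}_N \times \mathbf{d}_N F_N - \mathbf{r}_N \times \mathbf{d}_N \eta_{T,N} \end{bmatrix} \tag{A.5}$$

Equation 3.9 can be rewritten in matrix form, isolating the effect of the thruster noise, as

$$\mathbf{T}_T = \begin{bmatrix} u_1(-\mathbf{r}_1 \times \mathbf{d}_1 F_1) \\ u_2(-\mathbf{r}_2 \times \mathbf{d}_2 F_2) \\ \vdots \\ u_N(-\mathbf{r}_N \times \mathbf{d}_N F_N) \end{bmatrix} + \begin{bmatrix} u_1(-\mathbf{r}_1 \times \mathbf{d}_1 \eta_{T,1}) \\ u_2(-\mathbf{r}_2 \times \mathbf{d}_2 \eta_{T,2}) \\ \vdots \\ u_N(-\mathbf{r}_N \times \mathbf{d}_N \eta_{T,N}) \end{bmatrix} \tag{A.6}$$

Therefore,

$$\delta\boldsymbol{\eta}_T = \begin{bmatrix} u_1(-\mathbf{r}_1 \times \mathbf{d}_1 \eta_{T,1}) \\ u_2(-\mathbf{r}_2 \times \mathbf{d}_2 \eta_{T,2}) \\ \vdots \\ u_N(-\mathbf{r}_N \times \mathbf{d}_N \eta_{T,N}) \end{bmatrix} \tag{A.7}$$

Note that the magnitude of $\delta\boldsymbol{\eta}_T$ depends on the commanded thruster openings $\mathbf{u}_T = [u_1\ u_2\ \dots\ u_N]^{T}$, whereas the $\mathbf{Q}$ matrix is defined as a constant in the filter. In order not to underestimate the effect of the noise on the system, the worst-case scenario, in which the effect of the noise is maximised, is taken. To do so, from all the possible combinations of open/closed thrusters defined by $\mathbf{u}_T$, the one that produces the highest angular acceleration of the spacecraft (considering the inertia) is selected and named $\bar{\mathbf{u}}_T = [\bar{u}_1\ \bar{u}_2\ \dots\ \bar{u}_N]^{T}$. Isolating the noises from the geometric part, the noise effect becomes

$$\delta\boldsymbol{\eta}_T = \boldsymbol{\eta}_T \bar{\mathbf{B}}_T \tag{A.8}$$

where

$$\boldsymbol{\eta}_T = \begin{bmatrix} \eta_{T,1} & 0 & \dots & 0 \\ 0 & \eta_{T,2} & \dots & 0 \\ \vdots & \vdots & \ddots & 0 \\ 0 & 0 & \dots & \eta_{T,N} \end{bmatrix}, \qquad \bar{\mathbf{B}}_T = \begin{bmatrix} \bar{u}_1(-\mathbf{r}_1 \times \mathbf{d}_1) \\ \bar{u}_2(-\mathbf{r}_2 \times \mathbf{d}_2) \\ \vdots \\ \bar{u}_N(-\mathbf{r}_N \times \mathbf{d}_N) \end{bmatrix}$$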
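The worst-case selection of thruster openings described above can be sketched as an exhaustive search over all open/closed patterns. This is an illustrative sketch, not the implementation used in the thesis: it assumes a diagonal inertia matrix and purely binary openings, and the function names and example geometry are hypothetical.

```python
from itertools import product
from math import sqrt


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def worst_case_openings(positions, directions, forces, inertia_diag):
    """Exhaustively search the open/closed pattern that gives the largest
    angular acceleration magnitude.

    Assumes a diagonal inertia matrix (inertia_diag) for simplicity; a full
    inertia matrix would need a 3x3 linear solve instead of the division.
    """
    # Torque of each thruster when open: -(r_i x d_i) * F_i
    torques = [tuple(-f * c for c in cross(r, d))
               for r, d, f in zip(positions, directions, forces)]
    best_u, best_acc = None, -1.0
    for u in product((0, 1), repeat=len(torques)):
        total = [sum(u_i * t[k] for u_i, t in zip(u, torques)) for k in range(3)]
        acc = sqrt(sum((total[k] / inertia_diag[k]) ** 2 for k in range(3)))
        if acc > best_acc:
            best_u, best_acc = u, acc
    return best_u, best_acc


# Hypothetical three-thruster geometry: thrusters 1 and 2 cancel each other.
u_bar, acc_max = worst_case_openings(
    positions=[(1, 0, 0), (-1, 0, 0), (0, 1, 0)],
    directions=[(0, 1, 0), (0, 1, 0), (0, 0, 1)],
    forces=[1.0, 1.0, 1.0],
    inertia_diag=(1.0, 1.0, 1.0),
)
print(u_bar, acc_max)
```

The search visits every one of the 2^N patterns, which remains cheap for a dozen thrusters (4096 combinations) and only has to be done once, since the resulting matrix is constant in the filter.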

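The derivation of the effect of the RW actuation noise, $\delta\boldsymbol{\eta}_R$, starts by substituting Equation 3.10 into Equation 3.13, considering that no uncertainties are applied ($\epsilon_i = 0\ \forall i$):

$$\mathbf{T}_R = \sum_i \mathbf{m}_i (u_{R,i} + T_{Rf,i} + \eta_{R,i}) \tag{A.9}$$

Then, the effects of the noise can be isolated as

$$\mathbf{T}_R = \sum_i \left[ \mathbf{m}_i (u_{R,i} + T_{Rf,i}) + \mathbf{m}_i\,\eta_{R,i} \right] \tag{A.10}$$

After that, the total torque produced by the RWs can be rewritten in matrix form using the nominal RW configuration matrix given in Equation 5.2 as

$$\mathbf{T}_R = \mathbf{M}(\mathbf{u}_R + \mathbf{T}_{Rf}) + \mathbf{M}\boldsymbol{\eta}_R \tag{A.11}$$

with $\mathbf{u}_R = [u_{R,1}\ u_{R,2}\ \dots\ u_{R,N}]^{T}$, $\mathbf{T}_{Rf} = [T_{Rf,1}\ T_{Rf,2}\ \dots\ T_{Rf,N}]^{T}$, and $\boldsymbol{\eta}_R = [\eta_{R,1}\ \eta_{R,2}\ \dots\ \eta_{R,N}]^{T}$.

Therefore,

$$\delta\boldsymbol{\eta}_R = \mathbf{M}\boldsymbol{\eta}_R \tag{A.12}$$

Now, substituting Equations A.6 and A.11 into the dynamic equation of the spacecraft defined in Equation 3.17, and replacing the noise effects with Equations A.8 and A.12,

$$\dot{\boldsymbol{\omega}} = \mathbf{J}^{-1} \left( \mathbf{u}_T^{T} \begin{bmatrix} -\mathbf{r}_1 \times \mathbf{d}_1 F_1 \\ -\mathbf{r}_2 \times \mathbf{d}_2 F_2 \\ \vdots \\ -\mathbf{r}_N \times \mathbf{d}_N F_N \end{bmatrix} + \delta\boldsymbol{\eta}_T - \mathbf{M}(\mathbf{u}_R + \mathbf{T}_{Rf}) - \delta\boldsymbol{\eta}_R + \mathbf{T}_d - \boldsymbol{\omega} \times (\mathbf{J}\boldsymbol{\omega} - \mathbf{h}) \right) \tag{A.13}$$

From Equation A.13, the effect of the actuator noises on the angular rate of the spacecraft, $\delta\boldsymbol{\eta}_{\omega}$, can be isolated as

$$\delta\boldsymbol{\eta}_{\omega} = \mathbf{J}^{-1}(\delta\boldsymbol{\eta}_T - \delta\boldsymbol{\eta}_R) \tag{A.14}$$

Once the effects of the noises on the different state variables are clear, the matrix $\mathbf{Q}$ can be derived. The different contributions to $\mathbf{Q}$ are computed using Equation A.1. For the RW actuation noise,

$$\mathbf{Q}_R = \mathrm{E}[\delta\boldsymbol{\eta}_R\,\delta\boldsymbol{\eta}_R^{T}] = \mathrm{E}[\mathbf{M}\boldsymbol{\eta}_R\boldsymbol{\eta}_R^{T}\mathbf{M}^{T}] = \mathbf{M}\,\mathrm{E}[\boldsymbol{\eta}_R\boldsymbol{\eta}_R^{T}]\,\mathbf{M}^{T} = \mathbf{M}\mathbf{S}_R\mathbf{M}^{T}$$

where $\mathbf{S}_R = \mathrm{diag}(S_{R,1}, S_{R,2}, \dots, S_{R,N})$. For the spacecraft angular rate contribution to $\mathbf{Q}$, the effects of the noises are treated individually using the linearity property of the expectation operator ($\mathrm{E}[X + Y] = \mathrm{E}[X] + \mathrm{E}[Y]$). Also note that $\mathbf{J}$ is symmetric, therefore $(\mathbf{J}^{-1})^{T} = \mathbf{J}^{-1}$. With the thruster noise effect stacked by rows as in Equation A.8,

$$\begin{aligned}
\mathbf{Q}_{\omega} &= \mathrm{E}[\delta\boldsymbol{\eta}_{\omega}\,\delta\boldsymbol{\eta}_{\omega}^{T}] \\
&= \mathrm{E}[\mathbf{J}^{-1}\delta\boldsymbol{\eta}_T^{T}\,\delta\boldsymbol{\eta}_T(\mathbf{J}^{-1})^{T}] - \mathrm{E}[\mathbf{J}^{-1}\delta\boldsymbol{\eta}_R\,\delta\boldsymbol{\eta}_R^{T}(\mathbf{J}^{-1})^{T}] \\
&= \mathbf{J}^{-1}\mathrm{E}[\delta\boldsymbol{\eta}_T^{T}\,\delta\boldsymbol{\eta}_T]\mathbf{J}^{-1} - \mathbf{J}^{-1}\mathrm{E}[\delta\boldsymbol{\eta}_R\,\delta\boldsymbol{\eta}_R^{T}]\mathbf{J}^{-1} \\
&= \mathbf{J}^{-1}\mathrm{E}[\bar{\mathbf{B}}_T^{T}\boldsymbol{\eta}_T^{T}\boldsymbol{\eta}_T\bar{\mathbf{B}}_T]\mathbf{J}^{-1} - \mathbf{J}^{-1}(\mathbf{M}\mathbf{S}_R\mathbf{M}^{T})\mathbf{J}^{-1} \\
&= \mathbf{J}^{-1}(\bar{\mathbf{B}}_T^{T}\,\mathrm{E}[\boldsymbol{\eta}_T^{T}\boldsymbol{\eta}_T]\,\bar{\mathbf{B}}_T)\mathbf{J}^{-1} - \mathbf{J}^{-1}(\mathbf{M}\mathbf{S}_R\mathbf{M}^{T})\mathbf{J}^{-1} \\
&= \mathbf{J}^{-1}(\bar{\mathbf{B}}_T^{T}\mathbf{S}_T\bar{\mathbf{B}}_T - \mathbf{M}\mathbf{S}_R\mathbf{M}^{T})\mathbf{J}^{-1}
\end{aligned}$$

where $\mathbf{S}_T = \mathrm{diag}(S_{T,1}, S_{T,2}, \dots, S_{T,N})$ and $\bar{\mathbf{B}}_T$ is the worst-case thruster geometry matrix of Equation A.8.

Concerning the process noise of the RW dynamic system, its derivation starts by substituting Equation 3.10 into Equation 3.12:

$$\dot{\omega}_{R,i} = J_{R,i}^{-1}(u_{R,i} + T_{Rf,i} + \eta_{R,i}) \tag{A.15}$$

Then, isolating the actuation noise,

$$\dot{\omega}_{R,i} = J_{R,i}^{-1}(u_{R,i} + T_{Rf,i}) + J_{R,i}^{-1}\eta_{R,i} \tag{A.16}$$

$$\delta\eta_{\omega R,i} = J_{R,i}^{-1}\eta_{R,i} \tag{A.17}$$

and defining $\delta\boldsymbol{\eta}_{\omega R} = [\delta\eta_{\omega R,1}\ \delta\eta_{\omega R,2}\ \dots\ \delta\eta_{\omega R,N}]^{T}$ and $\mathbf{J}_R = \mathrm{diag}(J_{R,1}, J_{R,2}, \dots, J_{R,N})$, the effect of the noises on all the RWs is

$$\delta\boldsymbol{\eta}_{\omega R} = \mathbf{J}_R^{-1}\boldsymbol{\eta}_R$$

The contribution to the process noise matrix is then computed as

$$\mathbf{Q}_{\omega R} = \mathrm{E}[\delta\boldsymbol{\eta}_{\omega R}\,\delta\boldsymbol{\eta}_{\omega R}^{T}] = \mathrm{E}[\mathbf{J}_R^{-1}\boldsymbol{\eta}_R\boldsymbol{\eta}_R^{T}(\mathbf{J}_R^{-1})^{T}] = \mathbf{J}_R^{-1}\mathrm{E}[\boldsymbol{\eta}_R\boldsymbol{\eta}_R^{T}]\mathbf{J}_R^{-1} = \mathbf{J}_R^{-1}\mathbf{S}_R\mathbf{J}_R^{-1}$$

The process noise of the RW friction torques is defined following the same procedure. However, no physical model is used to define it; it is modelled as a random walk driven by a zero-mean Gaussian white noise ($\boldsymbol{\eta}_{Rf}$). Therefore, in the filter (see Equation 4.17) its time derivative is zero and its process noise contribution is

$$\mathbf{Q}_{Rf} = \mathrm{E}[\delta\boldsymbol{\eta}_{Rf}\,\delta\boldsymbol{\eta}_{Rf}^{T}] = \mathbf{S}_{Rf}$$

where $\mathbf{S}_{Rf} = \mathrm{diag}(S_{Rf,1}, S_{Rf,2}, \dots, S_{Rf,N})$.

Finally, the process noise matrix is

$$\mathbf{Q} = \mathrm{diag}\left(\varepsilon\mathbf{I}_{3\times 3},\ \mathbf{J}^{-1}(\bar{\mathbf{B}}_T^{T}\mathbf{S}_T\bar{\mathbf{B}}_T - \mathbf{M}\mathbf{S}_R\mathbf{M}^{T})\mathbf{J}^{-1},\ \mathbf{J}_R^{-1}\mathbf{S}_R\mathbf{J}_R^{-1},\ \mathbf{S}_{Rf}\right) \tag{A.18}$$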
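The block-diagonal assembly of the process noise matrix can be sketched as follows. The sketch assumes four RWs, takes the spacecraft angular rate block as a precomputed 3x3 input rather than deriving it, and uses made-up numerical values; the helper names are hypothetical.

```python
def diag(values):
    """Square matrix (list of lists) with the given values on its diagonal."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def block_diag(*blocks):
    """Place square blocks along the diagonal of a larger zero matrix."""
    size = sum(len(b) for b in blocks)
    out = [[0.0] * size for _ in range(size)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, value in enumerate(row):
                out[offset + i][offset + j] = value
        offset += len(block)
    return out


def rw_rate_block(rw_inertias, rw_noise_psd):
    """RW rate block: with diagonal inertia and noise matrices, the product
    inverse-inertia * noise * inverse-inertia reduces to S_i / J_i**2."""
    return diag([s / (j * j) for j, s in zip(rw_inertias, rw_noise_psd)])


eps = 1e-12                                  # small attitude-error noise
q_omega = diag([1e-10, 2e-10, 3e-10])        # placeholder 3x3 angular rate block
Q = block_diag(
    diag([eps] * 3),                         # attitude error
    q_omega,                                 # spacecraft angular rates
    rw_rate_block([0.01] * 4, [1e-8] * 4),   # RW angular rates
    diag([1e-9] * 4),                        # RW friction torques
)
print(len(Q), "x", len(Q[0]))
```

Keeping each contribution as a separate block makes it straightforward to retune or replace one of them without touching the others.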

A.2. Power Spectral Density and Variance

The PSD function, $S(f)$, where $f$ is the signal frequency, is defined as the measure of a signal's power content versus frequency. Therefore, the total power of a signal is equal to the integral of the PSD function over all frequencies:

$$P = \frac{1}{2\pi}\int_{-\infty}^{\infty} S(\omega)\,d\omega \tag{A.19}$$

As described in Section 3.1.2, a Gaussian zero-mean white noise has a constant power at all frequencies ($S$), so its total power would be infinite. Therefore, band-limited white noise is used, where the limiting frequencies are at least the Nyquist frequency (half of the sampling frequency in a discrete signal processing system, $\frac{1}{2}\Delta t^{-1}$). Transforming Equation A.19 into the frequency domain and integrating from $f = -\frac{1}{2}\Delta t^{-1}$ to $f = +\frac{1}{2}\Delta t^{-1}$, the total power is

$$P = \int_{-\frac{1}{2}\Delta t^{-1}}^{+\frac{1}{2}\Delta t^{-1}} S(f)\,df = \Delta t^{-1} S \tag{A.20}$$
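The band-limited power relation can be checked numerically with a short sketch. For zero-mean noise the total power equals the variance, so samples drawn with standard deviation sqrt(S / Δt) should have a sample variance close to S / Δt. The PSD level and sampling time below are arbitrary illustrative values.

```python
import math
import random


def band_limited_power(psd: float, dt: float) -> float:
    """Total power of white noise with constant two-sided PSD `psd`, limited
    to |f| <= 1/(2*dt): a constant integrated over a band of width 1/dt."""
    return psd / dt


def sample_noise(psd: float, dt: float, n: int, rng: random.Random):
    """Draw n discrete-time samples of zero-mean Gaussian white noise."""
    sigma = math.sqrt(band_limited_power(psd, dt))
    return [rng.gauss(0.0, sigma) for _ in range(n)]


rng = random.Random(1)
samples = sample_noise(psd=1e-6, dt=0.1, n=20000, rng=rng)
sample_var = sum(s * s for s in samples) / len(samples)
print(f"expected {band_limited_power(1e-6, 0.1):.3e}, sampled {sample_var:.3e}")
```

The practical consequence is that halving the sampling time doubles the variance that a discrete-time simulation must use to represent the same PSD level.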

Given that the variance is the average power of the noise,

$$\sigma^{2} = \Delta t^{-1} S \tag{A.21}$$

B Appendix: Validation and Verification

B.1. Thruster Reaction Control System

The thruster RCS algorithm and model must be validated to ensure that they deliver the commanded forces and torques to the spacecraft, and verified to ensure that they meet a series of design specifications, such as accuracy and precision.

The validation and verification (V&V) is carried out with three types of tests. The first is an individual test of a single thruster RCS cycle with commanded forces, $\mathbf{u}_{\mathrm{force}}$, and commanded torques, $\mathbf{u}_{\mathrm{torque}}$. The second is done in the same way as the first but for $N_{\mathrm{cycles}}$ thruster RCS cycles with random commanded forces and torques. Finally, a Monte Carlo simulation with $N_{\mathrm{runs}}$ runs compares random commanded forces and torques with those actually delivered by the RCS, using a success criterion threshold, $\Upsilon$, defined as a relative error. The random commanded forces and torques have a minimum magnitude of $\mathbf{u}_{\min}$ to avoid commands too close to zero, which might not be achievable with the designed RCS.

The model parameters are the same as in Section 5.2, and the V&V parameters are defined in Table B.1.

Table B.1: RCS validation and verification parameters

Param.                                               Value            Unit
$\mathbf{u}_{\mathrm{force}}$ (commanded forces)     [0 0 0]          N
$\mathbf{u}_{\mathrm{torque}}$ (commanded torques)   [1 1.1 -0.8]     Nm
$N_{\mathrm{cycles}}$ (RCS cycles)                   10               N/A
$N_{\mathrm{runs}}$ (Monte Carlo runs)               $10^{4}$         N/A
$\Upsilon$ (success threshold)                       1                %
$\mathbf{u}_{\min}$ (minimum command)                0.1/0.1          N/Nm

B.1.1. Individual Test

The individual test makes it possible to observe the force delivered by each thruster at each AOCS time step of a single RCS cycle. In Figure B.1 the delivered force of each individual thruster is plotted together with the maximum force of each thruster as a horizontal red line. The test also allows the commanded forces and torques per axis during a single RCS cycle to be compared against the actual delivered forces and torques (see Figure B.2). The difference between the commanded and delivered force is of the order of $10^{-4}$ N, and the relative error is not computable because the commanded force is zero. For the torque, the difference is too small to be visible in the figure, and so is the relative error.


[Plot: twelve panels, one per thruster (Thr 1 to Thr 12); vertical axis: delivered force [N], 0 to 1; horizontal axis: time steps, 0 to 30.]

Figure B.1: Single RCS cycle delivered forces per thruster.

[Plot: two panels with values per axis (X, Y, Z). Top: actuated versus commanded torque [Nm]. Bottom: actuated versus commanded force [N], vertical axis scaled by 10^-4.]

Figure B.2: Commanded versus delivered forces and torques for a single RCS cycle.

B.1.2. Multiple Test

This test is only used to see how the model behaves when more than one three-second RCS cycle is applied with random commanded forces and torques. As in the individual test, Figure B.3 plots the delivered force of each individual thruster together with the maximum force of each thruster as a horizontal red line. The comparison between the commanded and actuated forces and torques during the 10 simulated RCS cycles is shown in Figures B.4a and B.4b.

B.1.3. Monte Carlo Test

This test is the most meaningful one regarding V&V. Here, ten thousand runs with random commanded forces and torques are performed. The success criterion states the maximum relative error allowed between the commanded and the actually delivered force and torque for a run to be counted as successful. The test is successful in 99.97% of the runs. To illustrate the results, Figures B.5a and B.5b show 50 of the tested runs, comparing the commanded and actual forces and torques, respectively, together with their relative errors. Figure B.6a shows the mean and maximum relative errors of the forces, and Figure B.6b those of the torques.
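The structure of such a Monte Carlo campaign can be sketched as follows. The RCS model is replaced here by a hypothetical stand-in (`toy_rcs`) that scales the command by a fixed factor, and commands are scalars rather than per-axis force and torque vectors, so the sketch only illustrates how the success ratio is counted against the relative error threshold.

```python
import random


def relative_error_pct(commanded: float, delivered: float) -> float:
    """Relative error between command and delivery, in percent."""
    return 100.0 * abs(delivered - commanded) / abs(commanded)


def success_ratio(deliver, n_runs, threshold_pct, min_cmd, max_cmd, rng):
    """Fraction of random commands delivered within the allowed error.

    deliver: function mapping a commanded value to the delivered value.
    Command magnitudes are drawn from [min_cmd, max_cmd] with random sign,
    which avoids commands too close to zero.
    """
    successes = 0
    for _ in range(n_runs):
        magnitude = rng.uniform(min_cmd, max_cmd)
        commanded = magnitude if rng.random() < 0.5 else -magnitude
        if relative_error_pct(commanded, deliver(commanded)) <= threshold_pct:
            successes += 1
    return successes / n_runs


def toy_rcs(commanded: float) -> float:
    """Hypothetical stand-in for the RCS model: 0.1% scale error."""
    return commanded * 1.001


ratio = success_ratio(toy_rcs, n_runs=10_000, threshold_pct=1.0,
                      min_cmd=0.1, max_cmd=1.0, rng=random.Random(42))
print(f"successful runs: {100.0 * ratio:.2f}%")
```

The lower bound on the command magnitude matters because the relative error is undefined for a zero command and grows without bound as the command approaches zero.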

[Plot: twelve panels, one per thruster (Thr 1 to Thr 12); vertical axis: delivered force [N], 0 to 1; horizontal axis: time steps, 0 to 300.]

Figure B.3: Multiple RCS cycles delivered forces per thruster.

[Plot: commanded versus actuated values over RCS cycles 1 to 10 for the X, Y, and Z axes. (a) Forces [N]. (b) Torques [Nm].]

Figure B.4: Commanded versus delivered forces/torques for multiple RCS cycles.

[Plot: commanded and delivered values with relative error [%] over cases 0 to 50 for the X, Y, and Z axes. (a) Forces [N]. (b) Torques [Nm].]

Figure B.5: Commanded versus delivered forces/torques and relative errors of 50 runs.

Based on the results shown in this section, the designed and implemented thruster RCS algorithm and model work as intended, delivering the commanded forces and torques with high accuracy and precision.

[Plot: mean and maximum relative error [%] per axis (X, Y, Z), vertical axes scaled by 10^-3. (a) Commanded forces. (b) Commanded torques.]

Figure B.6: Mean and maximum relative errors.

B.2. Kalman Filter

The correct implementation of the EKF and its proper performance for the cases used must be verified. To do so, two consistency checks are carried out: one on the true error and one on the measured error.

The consistency check of the true error uses the true state values ($\mathbf{x}$) provided by the simulator, the estimated state values ($\hat{\mathbf{x}}$), and the covariance matrix ($\mathbf{P}$). In this test, the $i$-th true error ($e_i$) of the true error vector, defined as

$$e_i = x_i - \hat{x}_i,$$

is plotted against its $\pm 3\sigma_i$ bounds, where $\sigma_i$ is defined as

$$\sigma_i = \sqrt{\mathbf{P}(i,i)}$$

Ideally, in a linear Kalman filter designed for a linear time-invariant system with Gaussian white noises, $e_i$ should stay within the limits defined by $\pm 3\sigma_i$ 99.7% of the time. However, in an EKF that is not achievable because of the linearization. In addition, small errors and imprecisions in the model make this very difficult to achieve. For example, as explained in Section A.1, the matrix $\mathbf{Q}$ is not ideally defined because the noise contributions depend on the active thrusters. Therefore, fractions of the simulation time within the limits that are close to 99.7% are considered acceptable. For the Kalman filter proposed in this project, the test is not performed for the attitude errors ($\delta\mathbf{g}$), given that true attitude errors are meaningless. The results of the true error consistency check are presented in Figures B.7, B.8, and B.9. The percentages of time that the true errors are within the limits are summarised in Table B.2.

The second check concerns the measured error. It uses the measured state values ($\mathbf{z}$), the estimated state values ($\hat{\mathbf{x}}$), and the innovation or measurement covariance matrix ($\mathbf{S}$). In this test, the $i$-th measurement error ($e_{z,i}$) of the measurement error vector, defined as

$$e_{z,i} = z_i - \hat{x}_i,$$

is plotted against its $\pm 3\sigma_{z,i}$ bounds, where $\sigma_{z,i}$ and $\mathbf{S}$ are defined as

$$\sigma_{z,i} = \sqrt{\mathbf{S}(i,i)}, \qquad \mathbf{S} = \mathbf{H}\mathbf{P}\mathbf{H}^{T} + \mathbf{R}$$
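The percentage of time within the 3-sigma limits used in both consistency checks can be computed as in the following sketch. It operates on a single state component, taking the error history and the corresponding diagonal covariance entries as inputs; the synthetic data are only there to exercise the function.

```python
import math
import random


def fraction_within_3sigma(errors, variances):
    """Share of samples whose error magnitude is within 3 standard deviations.

    errors: error history of one state component (true or measurement error).
    variances: matching diagonal covariance entries, one per time step.
    """
    inside = sum(1 for e, var in zip(errors, variances)
                 if abs(e) <= 3.0 * math.sqrt(var))
    return inside / len(errors)


# Synthetic example: Gaussian errors whose variance matches the covariance.
rng = random.Random(7)
variances = [4.0] * 20000
errors = [rng.gauss(0.0, 2.0) for _ in variances]
share = fraction_within_3sigma(errors, variances)
print(f"time within limits: {100.0 * share:.2f}%")
```

A share well below 99.7% would suggest that the filter is overconfident (covariance too small), while a share of essentially 100% over long runs would suggest that it is too conservative.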

[Plot: three panels (X, Y, Z); spacecraft angular rate error [rad/s] with its +3σ and -3σ bounds, vertical axis scaled by 10^-7; horizontal axis: time [s], 0 to 9000.]

Figure B.7: Spacecraft angular rate true error consistency check.

Figure B.8: RWs angular rates true error consistency check.

Figure B.9: RWs friction torques true error consistency check.

The objective is the same as in the previous check. Ideally, the measurement error should stay within the limits defined by its ±3σ bounds 99.7% of the time. In this case, the friction torque is not included, given that there are no direct measurements of its value. The graphical results of the test are shown in Figures B.10 and B.11, and the numerical results are summarised in Table B.3.

Table B.2: True errors consistency check results.

Variable   Time within limits   Variable   Time within limits
ω          99.79%               ω          99.82%
ω          99.92%               ω          99.79%
ω          99.83%               ω          99.91%
ω          99.79%               T          99.92%
T          99.94%               T          99.91%
T          99.90%

[Plot: three panels (X, Y, Z); spacecraft angular rate measurement error [rad/s] with its +3σ and -3σ bounds, vertical axis scaled by 10^-7; horizontal axis: time [s], 0 to 9000.]

Figure B.10: Spacecraft angular rates measurement error consistency check.

Figure B.11: RWs angular rates measurement error consistency check.

Table B.3: Measurement errors consistency check results.

Variable   Time within limits   Variable   Time within limits
ω          99.76%               ω          99.77%
ω          99.93%               ω          99.82%
ω          99.80%               ω          99.82%
ω          99.83%

C Appendix: Additional Simulation Figures

C.1. Sample Runs of Interest

C.1.1. Stuck Open Fault With Highest Time To Detection

Thruster actuation for the fault free and stuck open fault cases is shown in Figure C.1a, where it can be seen that at the moment the fault occurs (around 3476 seconds) the faulty thruster has a very large opening time (about 99.3%). A stuck open fault implies that the thruster is continuously open (100% in terms of opening time). Therefore, the difference between the thruster opening time before and after the fault is very small (about 0.7% of opening time), which produces a very small difference between the commanded and the realised torque. Figure C.1b shows the GLR signals and thresholds for the spacecraft angular rates in the faulty case, as well as the time of fault occurrence, to verify that the effect on the angular rates of the spacecraft is not noticed until the faulty thruster should be closed (around 3950 seconds).

[Plot: (a) Force [N] of the faulty thruster versus time [s], 0 to 9000, in the stuck open fault and fault free cases. (b) GLR signals and thresholds for the X, Y, and Z spacecraft angular rates versus time [s], 0 to 9000, with the fault occurrence marked.]

Figure C.1: Stuck open fault case with highest time to detection.


C.1.2. Stuck Close Fault With Highest Time To Detection

Thruster actuation for the fault free and stuck close fault cases is shown in Figure C.2a, where it can be seen that at the moment the fault occurs (around 4312 seconds) the faulty thruster has a very small opening time (about 3.33%). Figure C.2b zooms in on the thruster actuation to verify that, at the time of fault occurrence, the required opening time of the thruster is very small. Therefore, a stuck close fault, which implies that the thruster is continuously closed (0% in terms of opening time), produces a very small difference between the commanded and the realised torque. Figures C.2c and C.2d show the GLR signals and thresholds for the spacecraft angular rates in the faulty case, as well as the time of fault occurrence, to verify that the effect on the angular rates of the spacecraft is barely noticed until the required opening time of the thruster becomes larger.

[Plot: (a) Force [N] of the faulty thruster versus time [s], 0 to 9000, in the stuck close fault and fault free cases. (b) Same as (a), zoomed to 4300 to 4460 s. (c) GLR signals and thresholds for the X, Y, and Z spacecraft angular rates versus time [s], 0 to 9000, with the fault occurrence marked. (d) Same as (c), zoomed to 4300 to 4650 s.]

Figure C.2: Stuck close fault case with highest time to detection.

C.2. Additional Simulation Results Figures

[Plot: time to detection [s] for isolated/detected and non-isolated/non-detected cases. (a) Leakage magnitude vs. time to detection. (b) LOE magnitude vs. time to detection. (c) Stuck close time of occurrence vs. time to detection. (d) Stuck open time of occurrence vs. time to detection. (e) RW friction time of occurrence vs. time to detection.]

Figure C.3: Linear scale representation.

C.3. Uncertainties Correlations

[Plot: correlation coefficients (vertical axes 0 to 0.2) between each uncertainty and each performance metric. Metrics (rows): Det, Eq Iso, Fault Iso (panels e and f only), False Alarm, Miss Det, Eq no-Iso, Fault no-Iso (panels e and f only), Det. Time. Uncertainties (horizontal axis): InertiaX, InertiaY, InertiaZ, AlignStr1 to AlignStr2, AlignRw1 to AlignRw4, AlignRmu, AlignRcs1 to AlignRcs12, InertiaMean, AlignRwMean, AlignRcsMean. (a) Leakage fault results. (b) LOE fault results. (c) Stuck close fault results. (d) Stuck open fault results. (e) RW friction fault results. (f) RW tachometer fault results.]

Figure C.4: Uncertainties correlation coefficients.

Bibliography

[1] I. Bar-Itzhack and Y. Oshman. Attitude determination from vector observations: Quaternion estimation. IEEE Transactions on Aerospace and Electronic Systems, (1):128–136, 1985.

[2] X. Barcons. Athena: the X-ray observatory to study the hot and energetic Universe. Journal of Physics: Conference Series, 610(1), 2015.

[3] X. Barcons, D. Barret, A. Decourchelle, J-W. Den Herder, T. Dotani, AC. Fabian, R. Fraga-Encinas, H. Kunieda, D. Lumb, G. Matt, et al. Athena (Advanced Telescope for High ENergy Astrophysics) Assessment Study Report for ESA Cosmic Vision 2015-2025. arXiv preprint arXiv:1207.2745, 2012.

[4] X. Barcons, K. Nandra, D. Barret, J-W. Den Herder, AC. Fabian, L. Piro, MG. Watson, et al. Athena: the X-ray Observatory to Study the Hot and Energetic Universe. Journal of Physics: Conference Series, 610(1), 2015.

[5] M. Bartys, R. Patton, M. Syfert, S. De las Heras, and J. Quevedo. Introduction to the DAMADICS Actuator FDI Benchmark Study. Control Engineering Practice, 14(6):577 – 596, 2006.

[6] M. Basseville, I. Nikiforov, et al. Detection of Abrupt Changes: Theory and Application, volume 104. Prentice Hall Englewood Cliffs, 1993.

[7] M. Blanke, M. Kinnaert, J. Lunze, and M. Staroswiecki. Fault Diagnosis of Continuous-Variable Systems, pages 189–298. Springer Berlin Heidelberg, Berlin, Heidelberg, 2006.

[8] M. Blanke, M. Kinnaert, J. Lunze, and M. Staroswiecki. Fault Diagnosis of Stochastic Systems, pages 275–342. Springer Berlin Heidelberg, Berlin, Heidelberg, 2016.

[9] R. Brown, P. Hwang, et al. Introduction to Random Signals and Applied Kalman Filtering, volume 3. Wiley New York, 1992.

[10] B. Buchanan. Rule Based Expert Systems. The MYCIN Experiments of the Stanford Heuristic Programming Project, 1984.

[11] J. Chen and R. Patton. Model-Based Fault Diagnosis. In Robust Model-Based Fault Diagnosis for Dynamic Systems, volume 3, pages 3–7. Springer Science & Business Media, 2012.

[12] V. Cherkassky and F. Mulier. Learning from Data: Concepts, Theory, and Methods. John Wiley & Sons, 2007.

[13] C. Commault, J-M. Dion, O. Sename, and R. Motyeian. Observer-Based Fault Detection and Isolation for Structured Systems. IEEE Transactions on Automatic Control, 47(12): 2074–2079, 2002.

[14] J. Crassidis, F. Markley, and Y. Cheng. Survey of Nonlinear Attitude Estimation Methods. Journal of Guidance, Control, and Dynamics, 30(1):12–28, 2007.

[15] J. Devore. Probability and Statistics for Engineering and the Sciences. Cengage Learning, 2011.

[16] N. Dlodlo, L. Hunter, C. Cele, R. Metelerkamp, and A. Botha. A Hybrid Expert Systems Architecture for Yarn Fault Diagnosis. volume 15, pages 43–49, 04 2007.


[17] S. Evans et al. Natural Environment Near the Sun/Earth-Moon L2 Libration Point. Next Generation Program, NASA Marshall Space Flight Center, MSFC, AL, 2002.

[18] J. Farrell. Attitude Determination by Kalman Filtering. Automatica, 6(3):419–430, 1970.

[19] H. Fernando and B. Surgenor. An Unsupervised Artificial Neural Network Versus a Rule-Based Approach for Fault Detection and Identification in an Automated Assembly Machine. Robotics and Computer-Integrated Manufacturing, 43:79–88, 2017.

[20] V. Flaugergues, V. Cocquempot, M. Bayart, and M. Pengov. Structural Analysis for FDI: a Modified, Invertibility-Based Canonical Decomposition. In Proceedings of the 20th International Workshop on Principles of Diagnosis, pages 59–66. Citeseer, 2009.

[21] R. Fonod, D. Henry, C. Charbonnel, and E. Bornschlegl. Position and Attitude Model-Based Thruster Fault Diagnosis: A Comparison Study. Journal of Guidance, Control, and Dynamics, 38(6):1012–1026, 2015.

[22] R. Fonod, D. Henry, C. Charbonnel, E. Bornschlegl, D. Losa, and S. Bennani. Robust FDI for Fault-Tolerant Thrust Allocation with Application to Spacecraft Rendezvous. Control Engineering Practice, 42:12–27, 2015.

[23] P. Frank. Fault Diagnosis in Dynamic Systems Using Analytical and Knowledge-Based Redundancy: a Survey and Some New Results. Automatica, 26(3):459–474, 1990.

[24] J. Gertler. Analytical Redundancy Methods in Fault Detection and Isolation. IFAC Fault Detection, Supervision and Safety for Tech. Process., 6, 01 1992.

[25] J. Gertler. Fault Detection and Diagnosis in Engineering Systems. Marcel Dekker Inc., New York, 1997.

[26] J. Gertler. Fault Detection and Diagnosis. Springer, 2013.

[27] J. Giarratano and G. Riley. Expert Systems: Principles and Programming (Fourth Edition). Canada: Thomson, 2005.

[28] P. Groves. Principles of GNSS, Inertial, and Multisensor Integrated Navigation Systems. Artech House, 2013.

[29] W. Hamscher, L. Console, and J. De Kleer, editors. Readings in Model-based Diagnosis. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1992.

[30] H. Heffner. The Fundamental Noise Limit of Linear Amplifiers. Proceedings of the IRE, 50(7):1604–1608, July 1962.

[31] LK. Herman. The History, Definition and Peculiarities of the Earth Centered Inertial (ECI) Coordinate Frame and the Scales that Measure Time. In 1995 IEEE Aerospace Applications Conference. Proceedings, volume 2, pages 233–263. IEEE, 1995.

[32] Q. Hou, Y. Cheng, N. Lu, and B. Jiang. Study on FDD and FTC of Satellite Attitude Control System Based on the Effectiveness Factor. In Systems and Control in Aerospace and Astronautics, 2008. ISSCAA 2008. 2nd International Symposium on, pages 1–6. IEEE, 2008.

[33] R. Isermann and P. Ballé. Trends in the Application of Model-Based Fault Detection and Diagnosis of Technical Processes. Control Engineering Practice, 5(5):709–719, 1997. ISSN 0967-0661.

[34] S. Julier and J. Uhlmann. New Extension of the Kalman Filter to Nonlinear Systems. In Signal Processing, Sensor Fusion, and Target Recognition VI, volume 3068, pages 182–194. International Society for Optics and Photonics, 1997.

[35] J. Keim, A. Acikmese, and J. Shields. Spacecraft inertia estimation via constrained least squares. In 2006 IEEE Aerospace Conference, pages 6–12. IEEE, 2006.

[36] A. Y Lee and J. Wertz. In-flight Estimation of the Cassini Spacecraft’s Inertia Tensor. Journal of Spacecraft and Rockets, 39(1):153–155, 2002.

[37] F. Markley. Attitude Error Representations for Kalman Filtering. Journal of Guidance, Control, and Dynamics, 26(2):311–317, 2003.

[38] F. Markley. Multiplicative vs. Additive Filtering for Spacecraft Attitude Determination. Dynamics and Control of Systems and Structures in Space, (467-474), 2004.

[39] F. Markley and J. Crassidis. Filtering for Attitude Estimation and Calibration, pages 235–285. Springer New York, New York, NY, 2014.

[40] J. Marzat, H. Piet-Lahanier, F. Damongeot, and E. Walter. Model-based fault diagnosis for aerospace systems: a survey. Proceedings of the Institution of Mechanical Engineers, Part G: Journal of Aerospace Engineering, 226(10):1329–1360, 2012.

[41] N. Meskin and K. Khorasani. Fault Detection and Isolation in a Redundant Reaction Wheels Configuration of a Satellite. In IEEE International Conference on Systems, Man and Cybernetics, pages 3153–3158. IEEE, 2007.

[42] S. Narasimhan et al. Automated Diagnosis of Physical Systems. Proceedings of ICALEPCS07, pages 701–705, 2007.

[43] R. Patton, P. Frank, and R. Clarke. Fault Diagnosis in Dynamic Systems: Theory and Application. Prentice-Hall, Inc., 1989.

[44] R. Patton, FJ. Uppal, S. Simani, and B. Polle. Reliable Fault Diagnosis Scheme for a Spacecraft Attitude Control System. Proceedings of the Institution of Mechanical Engineers, Part O: Journal of Risk and Reliability, 222(2):139–152, 2008.

[45] C. Pittet, A. Falcoz, and D. Henry. A Model-based Diagnosis Method for Transient and Multiple Faults of AOCS Thrusters. IFAC-PapersOnLine, 49(17):82–87, 2016.

[46] R. Polikar. Ensemble Based Systems in Decision Making. IEEE Circuits and Systems Magazine, 6(3):21–45, 2006.

[47] A. Posch, A. Schwientek, J. Sommer, and W. Fichter. Model-Based On-board Realtime Thruster Fault Monitoring. IFAC Proc. Volumes, 46(19):553–558, 2013.

[48] V. Puig, S. Montes de Oca, and J. Blesa. Adaptive Threshold Generation in Robust Fault Detection Using Interval Models: Time-domain and Frequency-domain Approaches. International Journal of Adaptive Control and Signal Processing, 27(10):873–901, 2013.

[49] J. Quinlan. C4.5: Programs for Machine Learning. Elsevier, 2014.

[50] A. Shirazi and M. Mirshams. Pyramidal Reaction Wheel Arrangement Optimization of Satellite Attitude Control Subsystem for Minimizing Power Consumption. International Journal of Aeronautical and Space Sciences, 15(2):190–198, 2014.

[51] M. Shuster. A Survey of Attitude Representations. Navigation, 8(9):439–517, 1993.

[52] G. Singh, PT. Kabamba, and NH. McClamroch. Bang-bang Control of Flexible Spacecraft Slewing Maneuvers: Guaranteed Terminal Pointing Accuracy. Journal of Guidance, Control, and Dynamics, 13(2):376–379, 1990.

[53] J. Stuelpnagel. On the Parametrization of the Three-dimensional Rotation Group. SIAM Review, 6(4):422–430, 1964.

[54] H. A. Talebi and K. Khorasani. A Robust Fault Detection and Isolation Scheme with Application to Magnetorquer Type Actuators for Satellites. In 2007 IEEE International Conference on Systems, Man and Cybernetics, pages 3165–3170, October 2007.

[55] H. A. Talebi, R. V. Patel, and K. Khorasani. Fault Detection and Isolation for Uncertain Nonlinear Systems with Application to a Satellite Reaction Wheel Actuator. In 2007 IEEE International Conference on Systems, Man and Cybernetics, pages 3140–3145, Oct 2007.

[56] R. Tate. Correlation Between a Discrete and a Continuous Variable. Point-Biserial Correlation. The Annals of Mathematical Statistics, 25(3):603–607, 1954.

[57] J. Thienel, R. Luquette, and R. Sanner. Estimation of Spacecraft Inertia Parameters. In AIAA Guidance, Navigation and Control Conference and Exhibit, page 6454, 2008.

[58] S. Vaseghi. Advanced Digital Signal Processing and Noise Reduction. John Wiley & Sons, 2008.

[59] N. Venkateswaran, MS. Siva, and PS. Goel. Analytical Redundancy Based Fault Detection of Gyroscopes in Spacecraft Applications. Acta Astronautica, 50(9):535–545, 2002.

[60] P. Verschuren, H. Doorewaard, and MJ. Mellion. Designing a Research Project, volume 2. Eleven International Publishing House, The Hague, 2010.

[61] A. Wald. Tests of Statistical Hypotheses Concerning Several Parameters When the Number of Observations is Large. Transactions of the American Mathematical Society, 54(3):426–482, 1943.

[62] E. Wan and R. Van Der Merwe. The Unscented Kalman Filter for Nonlinear Estimation. In Proceedings of the IEEE 2000 Adaptive Systems for Signal Processing, Communications, and Control Symposium, pages 153–158. IEEE, 2000.

[63] G. Welch, G. Bishop, et al. An Introduction to the Kalman Filter. 1995.

[64] B. Williams and P. Nayak. A Model-Based Approach to Reactive Self-Configuring Systems. In Proceedings of the National Conference on Artificial Intelligence, pages 971–978, 1996.

[65] A. Willsky and H. Jones. A Generalized Likelihood Ratio Approach to the Detection and Estimation of Jumps in Linear Systems. IEEE Transactions on Automatic Control, 21(1):108–112, 1976.

[66] J. Zhang, AK. Swain, and SK. Nguang. Robust Sensor Fault Estimation Scheme for Satellite Attitude Control Systems. Journal of the Franklin Institute, 350(9):2581–2604, 2013.