
A New Paradigm in Optimal Guidance

Item Type text; Electronic Dissertation

Authors Morgan, Robert W.

Publisher The University of Arizona.

Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.


Link to Item http://hdl.handle.net/10150/194121

A New Paradigm in Optimal Missile Guidance

by Robert W. Morgan

A Dissertation Submitted to the Faculty of the
Department of Electrical & Computer Engineering
In Partial Fulfillment of the Requirements
For the Degree of
Doctor of Philosophy
In the Graduate College
The University of Arizona

2007

The University of Arizona Graduate College

As members of the Dissertation Committee, we certify that we have read the dissertation prepared by Robert W. Morgan entitled A New Paradigm in Optimal Missile Guidance and recommend that it be accepted as fulfilling the dissertation requirement for the

Degree of Doctor of Philosophy

Date: 04/05/2007 Dr. Hal Tharp

Date: 04/05/2007 Dr. Jeffrey J. Rodriguez

Date: 04/05/2007 Dr. Jerzy W. Rozenblit

Date: 04/05/2007 Dr. Thomas L. Vincent

Final approval and acceptance of this dissertation is contingent upon the candidate’s submission of the final copies of the dissertation to the Graduate College.

I hereby certify that I have read this dissertation prepared under my direction and recommend that it be accepted as fulfilling the dissertation requirement.

Date: 04/05/2007 Dissertation Director: Dr. Hal Tharp

Statement by Author

This dissertation has been submitted in partial fulfillment of requirements for an advanced degree at The University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the Library.

Brief quotations from this dissertation are allowable without special permission, provided that accurate acknowledgment of source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the head of the major department or the Dean of the Graduate College when in his or her judgment the proposed use of the material is in the interests of scholarship. In all other instances, however, permission must be obtained from the author.

Signed: Robert W. Morgan

Acknowledgments

The author wishes to express his appreciation to Professors J.J. Rodriguez, J.W. Rozenblit, H.S. Tharp, and T.L. Vincent for their service on his doctoral committee. Special acknowledgement is due the committee chairman, Professor H.S. Tharp, for his advice and efforts to review and scrutinize the content and quality of this dissertation. The author would also like to express his appreciation for the inspiration and encouragement given him by Professor T.L. Vincent throughout the author’s academic career.

This work was made possible by the generous fellowship awarded to the author by Raytheon Missile Systems. Several people at Raytheon have been instrumental to the author’s ability to participate in the program. Ron Reid, the director of the program, has been a source of encouragement from the very beginning. The author’s acceptance into and desire to participate in the program is, with great thanks, due to Martin Ulehla. Special acknowledgement is also due Chris Poage, who has been an exemplar professional and a friend to the author while participating in the program.

The author would also like to thank his wife, Angela, and his children for their support and encouragement these past four years.

Table of Contents

List of Figures ...... 11

List of Tables ...... 14

Abstract ...... 15

Chapter 1. Introduction ...... 16
1.1. Background and Scope ...... 16
1.2. History and State of the Art ...... 18
1.3. Organization ...... 25
1.4. Notation ...... 30

Chapter 2. Estimation ...... 32
2.1. General Concepts in Estimation ...... 32
2.1.1. Estimation Defined ...... 32
2.1.2. Markov Model ...... 33
2.1.3. Optimal Estimation and Risk ...... 34
2.2. Bayesian Estimation ...... 38
2.3. Multiple Model ...... 40
2.3.1. Interactive Multiple Model ...... 43
2.4. Linear Estimation ...... 43
2.4.1. Unbiased Estimation ...... 44
2.4.2. BLUE Estimation ...... 45
2.4.3. Least Square Estimation ...... 49
2.4.4. Maximum Likelihood Estimation ...... 50
2.4.5. Full Rank Models ...... 51
2.4.6. A Priori Estimates and Rank Deficient Models ...... 52

Chapter 3. Estimation in Linear Sampled Data Systems ...... 55
3.1. Bayes’ Estimator ...... 57
3.1.1. First Measurement ...... 58
3.1.2. kth Measurement and Final Result by Induction ...... 64
3.2. Estimates and Confidence Regions (Error Ellipsoids) for the Bayes’ Estimator ...... 67
3.3. The White Noise Assumption and Bayesian Estimation ...... 68
3.4. The Deterministic Input Assumption and Bayesian Estimation ...... 70
3.5. Bayesian Estimation Between Measurements ...... 71
3.6. No A Priori Information and Bayesian Estimation ...... 74


3.7. The Kalman Filter ...... 77
3.7.1. Covariance Simulations ...... 79
3.7.2. Scalar System Estimation Example ...... 79
3.8. Multiple Model ...... 83

Chapter 4. Stochastic Motion Models ...... 91
4.1. Markov Models ...... 91
4.1.1. Principle of Inertia ...... 92
4.2. Process Noise Models ...... 93
4.3. Random Walk ...... 95
4.3.1. Continuous Time Random Walk ...... 96
4.4. White Acceleration ...... 98
4.4.1. Discrete Equivalent ...... 99
4.5. Correlated Acceleration ...... 101
4.5.1. Discrete Equivalent ...... 101

Chapter 5. Optimization and Control Theory ...... 104
5.1. Basic Control Theory Concepts ...... 104
5.2. Parametric Optimization ...... 105
5.2.1. Constraints ...... 106
5.2.2. Necessary Conditions for a Local Minimum ...... 107
5.3. Lyapunov Control Theory ...... 111
5.3.1. Quickest Descent Control ...... 113
5.3.2. Quickest Descent with Minimum Incremental Cost ...... 115
5.4. Optimal Control Theory ...... 117
5.4.1. Optimal Return Function ...... 118
5.4.2. The Augmented State Vector and the Augmented State Space ...... 119
5.4.3. Temporal Boundary Conditions ...... 121
5.4.4. The Optimal Control H Function ...... 123
5.4.5. State Independent Control Constraints ...... 124
5.4.6. The Optimal Control Minimum Principle ...... 126
5.4.7. Linear Quadratic Regulator (LQR) ...... 127
5.4.8. Linear Systems with Process Noise ...... 133
5.4.9. Linear Systems with Process Noise and Measurement Noise ...... 134
5.5. Differential Game Theory ...... 140
5.5.1. System Definition ...... 140
5.5.2. Control Constraints ...... 141
5.5.3. Terminal Set ...... 141
5.5.4. The Payoff ...... 141
5.5.5. Games of Kind and Games of Degree ...... 142


5.5.6. Min-Max Principle ...... 143
5.5.7. Min-Max Necessary Conditions ...... 144

Chapter 6. Airframe and Autopilot Modeling ...... 145
6.1. Introduction ...... 145
6.2. Airframe Modeling of Tail Controlled Missiles ...... 151
6.2.1. Aerodynamic Forces ...... 151
6.2.2. Airframe Dynamics ...... 156
6.2.3. Airframe Transfer Function ...... 158
6.3. Three-Loop Autopilot ...... 162
6.3.1. High Bandwidth Actuator ...... 165
6.3.2. 3-Loop Summary ...... 166
6.3.3. 3-Loop Parameters ...... 166
6.3.4. 3-Loop Performance ...... 167
6.4. First Order (Pole) Approximation of Flight Control System ...... 168
6.5. Pole-Zero Approximation of Flight Control System ...... 170
6.6. Binomial Approximation of Flight Control System ...... 173

Chapter 7. Optimal Guidance ...... 176
7.1. Engagement Geometry and Dynamics ...... 176
7.2. Miss ...... 178
7.2.1. Acceleration Dynamics ...... 179
7.3. Zero Effort Miss (ZEM) ...... 181
7.3.1. Acceleration Dynamics ...... 182
7.4. Heading Error ...... 183
7.5. Basic Model ...... 185
7.5.1. Augmented Proportional Navigation Guidance ...... 188
7.5.2. Proportional Navigation Guidance ...... 188
7.6. Acceleration Dynamics ...... 189
7.6.1. Decoupled Dynamics ...... 192
7.7. Single Pole Flight Control System Model ...... 193
7.8. Optimal Evasion ...... 198
7.9. Magnitude Constraints (Saturation) ...... 203
7.9.1. Single Pole Flight Control System ...... 205
7.10. Directional Constraints ...... 207
7.10.1. Simulation ...... 212
7.10.2. Engagement Configuration ...... 215
7.10.3. Trajectory Shaping ...... 216
7.10.4. Navigation Ratio ...... 219
7.11. Endgame Geometry and Final Time tf ...... 223


7.11.1. Approximate Final Time Estimate ...... 224
7.11.2. Unconstrained Missile Acceleration ...... 225

Chapter 8. Estimating the Zero Effort Miss ...... 227
8.1. Target Modeling ...... 228
8.1.1. Step Change in Target Acceleration ...... 228
8.1.2. Uncorrelated Target Acceleration ...... 229
8.1.3. Correlated Target Acceleration ...... 230
8.1.4. Optimal Evasion ...... 232
8.1.5. Summary ...... 232
8.2. Flight Control System Modeling ...... 233
8.2.1. Single Pole Flight Control System ...... 234
8.2.2. Fast Flight Control System ...... 235
8.2.3. Summary ...... 236
8.3. Estimation ...... 236
8.3.1. Kinematic Estimator Equations ...... 237
8.3.2. Range and Range-Rate Filter ...... 242
8.3.3. Azimuth Angle Filter ...... 244
8.3.4. Elevation Angle Filter ...... 245
8.3.5. Estimating the ZEM Using Decoupled Filter Estimates ...... 245
8.4. Estimation without Range Information ...... 248
8.5. Lack of A Priori Information ...... 253

Chapter 9. Guidance Strategies ...... 255
9.1. Guidance Strategies and Information Constraints ...... 255
9.1.1. Zero Effort Miss Guidance Strategy ...... 256
9.1.2. Parallel Navigation Strategy ...... 257
9.1.3. Direct Pursuit ...... 261
9.2. Control Strategies and Control Constraints ...... 262
9.3. Guidance Strategy Paradigm ...... 263

Chapter 10. Applications of the Strategy Paradigm and Conclusion ...... 266
10.1. Lyapunov Guidance ...... 266
10.1.1. Direct Pursuit Guidance ...... 267
10.1.2. Zero-Effort Miss Guidance ...... 269
10.2. Extending PPN for Maneuvering Targets ...... 273
10.2.1. Pure Augmented Proportional Navigation ...... 275
10.2.2. Numerical Results ...... 277
10.3. Conclusions and Recommendations for Future Research ...... 284


10.3.1. Summary of Dissertation ...... 285
10.3.2. Contributions ...... 286
10.3.3. Future Research ...... 290

Appendix A. Probability and Stochastic Processes ...... 292
A.1. Concepts in Probability ...... 292
A.2. Random Variables ...... 294
A.3. Moment Generating Functions ...... 295
A.3.1. Computation of Moments ...... 296
A.3.2. Functions of Independent Random Variables ...... 297
A.3.3. Marginal Distributions ...... 298
A.4. Normal (Gaussian) Distribution ...... 299
A.4.1. Confidence Intervals ...... 301
A.4.2. Sum and Difference of Independent Normally Distributed Random Variables ...... 302
A.4.3. Relationship to Chi-Square Distribution ...... 303
A.5. Gamma Distribution ...... 304
A.5.1. Sum of Independently Distributed Gamma Random Variables ...... 308
A.6. Chi-Square Distribution ...... 309
A.6.1. Sum of Independently Distributed Chi-Square Random Variables ...... 312
A.7. Multivariate Normal Distribution ...... 312
A.7.1. Confidence Regions for Multivariate Normal Random Variables ...... 321
A.8. Random Processes and Random Sequences ...... 334
A.8.1. White Processes and Sequences ...... 335
A.8.2. Markov Random Sequences ...... 336
A.9. Innovations Sequence ...... 337

Appendix B. Concepts from Systems Theory ...... 340
B.1. Common Signals ...... 340
B.1.1. Deterministic Signals ...... 340
B.1.2. Stochastic Signals ...... 342
B.2. Convolution and Impulse Response ...... 345
B.2.1. Principle of Superposition ...... 345
B.2.2. Homogeneous System ...... 346
B.2.3. The Particular Solution ...... 348
B.2.4. Time-Invariant Systems ...... 350
B.3. Linear Systems with Stochastic Inputs ...... 351
B.4. Covariance and State Propagation ...... 355
B.5. Shaping Filters ...... 357
B.5.1. Continuous-Time Shaping Filters for Deterministic Signals ...... 357


B.5.2. Continuous-Time Shaping Filters for Stochastic Signals ...... 361
B.5.3. Discrete-Time Shaping Filters for Stochastic Signals ...... 366
B.6. Coordinate Frames ...... 368
B.6.1. Direction Cosine Matrix (DCM) ...... 368
B.6.2. Euler Angles ...... 370
B.7. Dynamics (Kinematics) ...... 372
B.7.1. Time-Derivative of a Constant Magnitude Vector ...... 373
B.7.2. Time-Derivative of an Arbitrary Vector ...... 374

References ...... 378

List of Figures

Figure 1.1. Typical tactical missile trajectory [35] ...... 17
Figure 1.2. Major missile subsystems ...... 26
Figure 2.1. Block diagram of multiple-model system [7, p. 127] ...... 40
Figure 3.1. Kalman Filter Performance for a 1st Order Gauss-Markov Process ...... 83
Figure 3.2. Estimation Error (blue) and ±√Pk (red) for 1st Order Gauss-Markov Process ...... 84
Figure 3.3. Kalman Gain for 1st Order Gauss-Markov Process ...... 85
Figure 5.1. Hatched region indicating intersection of a spherical ball and the control constraint set ...... 106
Figure 5.2. Borrowed with permission from [92, p. 127]. Cost and cost gradient geometry for minimizing G(u). (a) At a minimizing point u∗. (b) At a nonminimizing point u ...... 108
Figure 5.3. Geometry for steepest descent control us (green) and quickest descent control uq (red) ...... 112
Figure 5.4. Quickest descent trajectories (black) and controllable set boundary for Zermelo’s problem ...... 116
Figure 5.5. A Σ surface in augmented state space ...... 120
Figure 5.6. Plot of control constraint function and its derivative ...... 125
Figure 5.7. Optimal control gains as a function of time ...... 131
Figure 5.8. Controllable set and LQR domain of attraction for system with bounded control ...... 132
Figure 6.1. Aerodynamic force, which is resolved into lift and drag components, is generated by creating an angle of attack, α, with respect to the direction of airflow against the airfoil ...... 147
Figure 6.2. A standard measure of aerodynamic stability is static margin, the distance between the center of gravity and the center of pressure ...... 148
Figure 6.3. Canard control requires actuators to be located near the nose of the missile, typically in the same area as the seeker ...... 149
Figure 6.4. Movement of tail control surfaces will not disturb the airflow across the wings or missile body ...... 150
Figure 6.5. A qualitative comparison of canard and tail control ...... 151
Figure 6.6. Forces on a tail-controlled missile ...... 153
Figure 6.7. Missile with normal force FN at CP and moment M about CG ...... 157
Figure 6.8. Block diagram of linearized airframe ...... 158
Figure 6.9. Standard Three-Loop Autopilot Topology [101, p. 508] ...... 162
Figure 6.10. Bode plot (frequency response) of flight control system ...... 168


Figure 6.11. Flight control system step response for different flight control system time constants ...... 169
Figure 6.12. Step response comparison for 1st order (pole) approximation of 3-loop autopilot ...... 170
Figure 6.13. Frequency response comparison of 3-loop autopilot and pole model ...... 171
Figure 6.14. Step response comparison for pole-zero approximation of 3-loop autopilot ...... 174
Figure 6.15. Frequency response comparison of 3-loop autopilot and pole-zero model ...... 174
Figure 6.16. Binomial approximations of flight control system with TA = 0.3 seconds ...... 175
Figure 7.1. Typical engagement geometry ...... 177
Figure 7.2. Typical engagement geometry for spiraling target (green), constant acceleration target (blue), and constant velocity target (red) ...... 214
Figure 7.3. Typical pursuer trajectories for ks = 0.01 (red) and ks = 0.05 (green) ...... 218
Figure 7.4. Heading error as a function of time-to-go for various trajectory shaping parameter values ...... 218
Figure 7.5. Trajectories A (green) and B (red) generated from parameters listed in Table 7.2 ...... 221
Figure 7.6. Heading error curves A (green) and B (red) as a function of time-to-go ...... 221
Figure 7.7. Navigation gain as a function of heading error ...... 222
Figure 7.8. Navigation gain vs. heading error for different values of φ ...... 222
Figure 8.1. Engagement relative to seeker boresight ...... 238
Figure 10.1. Comparison between three forms of proportional navigation ...... 274
Figure 10.2. Plane (2D) engagement with accelerating target ...... 279
Figure 10.3. X-axis ZEM trajectories ...... 280
Figure 10.4. Y-axis ZEM trajectories ...... 281
Figure 10.5. Z-axis ZEM trajectories ...... 282
Figure 10.6. Asymptotically decreasing ZEM trajectories and asymptotically increasing cost trajectories for guidance laws ...... 283
Figure 10.7. Pursuer acceleration as a function of time ...... 284

Figure A.1. Relationship between the ellipses traced out by δx and δy ...... 326
Figure A.2. Equiprobability ellipsoids for random variables X ∼ N(µX, VX), Y ∼ N(µY, VY) and Z ∼ N(0, I) ...... 326
Figure A.3. Regions RX formed by rotating region RY such that P(X ∈ RX) = P(Y ∈ RY) for the orthonormal transformation Y = RX ...... 328


Figure A.4. Rectangular confidence region RX(M) enclosed by box BX(M) ...... 329
Figure B.1. Rectangular pulses with unity area ...... 341
Figure B.2. Direction angles ...... 369

List of Tables

Table 6.1. Nominal values for flight control system parameters [60], [61] ...... 167
Table 7.1. Simulation parameters ...... 214
Table 7.2. Simulation parameters for navigation gain analysis ...... 220
Table 8.1. Target model assumptions and impact on ZEM equation ...... 233
Table 8.2. Flight control system model assumptions and impact on ZEM equation ...... 236
Table 10.1. Simulation parameters ...... 278
Table A.1. Probabilities (×100) for different dimensions and different regions ...... 333
Table A.2. Required region size normalized to number of standard deviations along principal axes ...... 334

Abstract

This dissertation investigates advanced concepts in terminal missile guidance. The terminal phase of missile guidance usually lasts less than ten seconds and calls for very accurate maneuvering to ensure intercept. Technological advancements have produced increasingly sophisticated threats that greatly reduce the effectiveness of traditional approaches to missile guidance. Because of this, terminal missile guidance is, and will remain, an important and active area of research. The complexity of the problem and the desire for an optimal solution have led researchers to focus on simplistic, usually linear, models. These endeavors have produced some of the world’s most advanced weapons systems. Even so, the resulting guidance schemes cannot possibly counter evolving threats that push the system outside the linear envelope for which it was designed. The research in this dissertation greatly extends previous work in the area of optimal missile guidance. Herein it is shown that optimal missile guidance is fundamentally a pairing of an optimal guidance strategy and an optimal control strategy. The optimal guidance strategy is determined by a missile’s information constraints, which are themselves largely determined by the missile’s sensors. The optimal control strategy is determined by the missile’s control constraints, and works to achieve a specified guidance strategy. This dichotomy of missile guidance is demonstrated by showing that missiles having different control constraints utilize the same guidance strategy so long as the information constraints are the same. This concept has hitherto gone unrecognized because of the difficulty in developing an optimal control for the nonlinear set of equations that results from control constraints. Once this difficulty was overcome by indirect means, evidence of the guidance strategy paradigm emerged. The guidance strategy paradigm is used to develop two advanced guidance laws. The new guidance laws are compared qualitatively and quantitatively with existing guidance laws.

Chapter 1 Introduction

This introductory chapter is organized into four sections. The first section, titled "Background and Scope," introduces some of the issues in guided weapons and defines the focus of this dissertation: terminal missile guidance. The second section, titled "History and State of the Art," provides a brief history of weaponry, with particular attention to missile guidance and its historical development. The third section, titled "Organization," lists the chapters of this dissertation along with an explanation of their content and their relation to one another. The final section, titled "Notation," discusses conventions used elsewhere in the dissertation that may not be familiar to the reader.

1.1 Background and Scope

A missile is an object (weapon) that is fired, thrown, dropped, or otherwise projected at a target [1]. The three primary factors affecting a missile’s ability to successfully intercept a target are: (1) incorrect direction at takeoff (heading error), (2) environmental influences, and (3) unpredictable target maneuvers. One way of countering these factors is to guide the weapon to the target; another way is to use a (larger) warhead [49]. Guided weapons are generally classified as either the command type or the homing type. In command systems, guidance commands are transmitted to the missile by a data link (e.g., wire cables or fiber optics). Homing systems are characterized by the ability of the missile to detect, acquire, and track a target. The essential difference between the two systems is the location of the target tracking device [31].

Figure 1.1. Typical tactical missile trajectory [35].

Among guided weapons, a further distinction needs to be made concerning the mission objective. Will the engagement be surface-to-surface (SSGW), surface-to-air (SAGW), air-to-surface (ASGW), air-to-air (AAGW), or something else? What is the expected engagement range: short, medium, or long? Surely, the guidance scheme will differ depending on the mission objective. A typical missile trajectory is shown in Figure 1.1. There are essentially five phases to a typical missile engagement: launch, midcourse guidance, detection, acquisition, and terminal guidance.1 During the launch phase, the missile is launched from some platform. Depending on the length of the engagement, midcourse guidance is initiated directly after launch. A typical midcourse strategy might be to maintain launch heading and a constant altitude. "Detection is the process whereby the seeker senses a certain amount of energy (in some region of the spectrum) above that normally expected from background or internal seeker noise. Acquisition is the process whereby the seeker, after experiencing one or more incidents of detection, decides (according to some pre-established criteria or algorithm) that a valid target has been located. Tracking is the process whereby the seeker continually specifies the angular location of the target relative to some fixed coordinate system" [35, 1979, p. 3-3]. Once the missile is tracking the target, a terminal guidance algorithm is initiated. It is this final phase of the missile engagement (terminal guidance) that this dissertation addresses.

1 Some missiles have a discrimination phase that occurs between acquisition and terminal guidance.

1.2 History and State of the Art

This section of the introduction will briefly discuss the history and state of the art in weapon systems. For the sake of brevity, the scope and thoroughness of the discussion must be sufficiently bounded. The discussion that follows first gives a very brief summary of the history of weapons, followed by more specific information on guided weapons, and in particular on the process of guidance itself, as it is central to this dissertation. Clearly, the general concept of weapons dates to antiquity, when they were used by early humans as a method of obtaining food, as well as possibly settling disputes. First there was likely the simple throwing of a stone, but doubtless it was contemporary with the club; these may be called the early "sticks and stones" class of weapons. As time passed, spears were fashioned, perhaps out of only wood at first but eventually tipped with a sharp piece of rock. The earliest surviving examples of these types of weapons are flint knives and flint (or flintstone) tipped spears, which can be found on display in museums. It is uncertain whether the bow and arrow or the sling (distinct from the slingshot) appeared first. The earliest textual account of the sling is found in the Bible in the First Book of Samuel, wherein "a shepherd David, unarmored and equipped only with a sling, defeats the warrior champion Goliath with a well aimed shot" [96, encyclopedia entry "sling"]. However, archeological evidence indicates that the sling was used as early as 20,000 B.C. The Bronze Age brought about bronze swords as early as 2000 B.C., and the Iron Age brought about iron swords as early as 800 B.C.

As history progressed, weaponry technology advanced at a fairly slow rate until the Medieval period. During this period castles became common, which necessitated a new class of weapons to defeat them: siege weapons. There are many categories and variations of siege weapons, but the main ones were the catapult, the trebuchet, and the ballista [40]. The Renaissance period (14th-16th century) marked, among other things, the beginning of the implementation of combustion-based devices in warfare. The most long-lasting effect of this was the introduction of cannon and firearms to the battlefield, where they are still at the core of modern weaponry [96, encyclopedia entry "weapon"]. Rapid-shooting weapons, such as the Gatling gun (circa 1860), began to appear in the 19th century. The early 20th century brought about the tank and the airplane, which itself was used by the Japanese as a guided weapon in the World War II kamikaze attacks. During World War II, the first nuclear weapons were developed in the United States (an international effort that involved both European and American scientists) under the top secret Manhattan Project and were eventually unleashed against the Japanese cities of Hiroshima and Nagasaki in August 1945. The 20th century also gave birth to and matured the guided weapon, which is discussed in the following paragraph.

In [31, p. 1], Garnell defines a guided weapon as a "weapon system in which the warhead is delivered by an unmanned guided vehicle." This definition may be too narrow. The term "guide" means to lead or show the way, and a weapon is an item or object used to injure, kill, disarm or incapacitate an opponent [96]. Perhaps a better definition of a guided weapon is simply a weapon that makes course corrections subsequent to initiation based upon post-initiation data.

Note then that an arrow, which by design corrects for environmental disturbances, is not a guided weapon, because disturbance rejection is not considered a course correction. Similarly, a ballistic missile (one that falls under the force of gravity) can be launched in a manner that will cause it to fall within close proximity to its intended target. However, the ballistic missile is not a guided weapon because no post-launch data is used to alter its course subsequent to launch. An example that satisfies the author’s definition of a guided weapon but fails Garnell’s definition is the Japanese kamikaze attack. Clearly the kamikaze pilots were guiding their planes in a manner so as to cause damage to their enemy. The fact that a person, as opposed to a computer, was guiding the plane does not change the fact that the weapon (the plane) was guided to a target with an intent to cause damage. Having accepted the definition of a guided weapon as a weapon that uses closed-loop control to engage an enemy (target, evader, etc.), we proceed to look at some of the history of guided weaponry.

In 1870, a man by the name of Werner Siemen submitted a proposal for "the destruction of enemy vessels by guided torpedoes" to the Prussian ministry of war. The following account is given in [78, p. 12].

It consisted of a torpedo mounted beneath a sailing boat, controlled by pneumatic pulses transmitted through rubber tubes. The commands were to be transmitted from a control post on land or on a marine vessel, the position of the guided boat being marked by a shielded lamp. By the time the system had finally been developed and deployed by the German navy (in 1916), the boats were propelled by advanced internal-combustion engines, could achieve speeds exceeding 30 knots (45 feet/sec), and were guided from airborne command posts via radio and 50km-long electrical cables. In October 1917, the first operational success was attained when a British ship was hit and sunk.

In Siemen’s original proposal, the torpedo was to be guided to the target by directing its course (via pneumatic pulses) such that the torpedo traversed the line- of-sight between the controlling station and the target. This is a very primitive guidance scheme which directs the torpedo to where the target is, as opposed to where it will be at the time of intercept. Nevertheless, it has been shown to be effective for speed disadvantaged targets. The German military also developed and 21 employed at least two radio-guided air-to-sea bombs during World War II, both of which were guided to the target in a similar manner to the previously described sea torpedo based on Siemen’s design [78, p. 13]. The next major advance of guided weapons came about with the development of the "proportional navigation" (PN) guidance law, which was first formulated in the United States in 1943. The PN guidance law, which was invented by C. L. Yuan at the RCA Laboratories in the USA in 1943, was declassified and published in 1948 in the Journal of Applied Physics [100]. Qualitatively speaking, the PN guidance law seeks to stabilize the angular motion of the line-of-sight (LOS) between the missile and the target. As long as the distance along the LOS, referred to as the range, is decreasing, intercept is assured if the rotation rate of the LOS is minimal. The "horribly effective kamikaze attacks" against US ships in World War II was sufficient motivation to promote rapid development of a PN guided missile. The first successful intercept made by a missile against a (pilotless) aircraft was in December 1950, by a Hughes-developed2 Lark missile [30]. Although the mathematical development of PN and it’s eventual weapon system implementation is relatively modern, it’s use by humans and animals has evidently occurred for eons. As evidence of this, consider a football player attempting a tackle or a lion chasing its prey. 
These will often run nearly parallel with their targets while slowly decreasing the range between them until intercept occurs; this is in contrast to, and superior to, the direct pursuit strategy where the pursuer turns all his efforts directly at the target. An interesting account of navigation principles appearing in nature can be found in [2]. Many extensions of and variations of PN developed in the decade or so after its original development. In fact, there is still active research in this area and this dissertation presents some entirely new and promising results that are related to the PN guidance scheme. The mathematical theory of optimal control began in the 1950’s with Bellman’s dynamic programming [8]. A short time later (circa 1960) a blind Russian math-

2 Hughes Aircraft Company was purchased by Raytheon in 1997. 22 ematician by the name of Pontryagin developed an alternate approach to optimal control theory in what has been termed the "Maximum Principle" [67]. An Amer- ican scientist by the name of Rufus Isaacs initiated the study of differential game theory3 in 1954 [45]. The work of Isaacs and others that soon followed led to the seminal work of Ho and Bryson [43], wherein they showed that the PN guidance law is optimal under reasonable conditions (which will be discussed in Chapter 7). The aforementioned reference assumed the missile’s flight control system could be well approximated by a unity gain. That is, the model used by Ho and Bryson presumed that the missile could instantaneously achieve a desired acceleration. In 1968, a few years after the work of Ho and Bryson, Willems developed a guidance law that was optimal when the missile flight control system could be adequately modeled by a single lag [98], [99]. In 1971, Ronald Cottrell, of Hughes Aircraft Company, in- dependently developed and published the guidance law for a single lag flight control system in collaboration with University of Arizona Professor Dr. Thomas Vincent, who is also an advisor for the author of this dissertation [23]. Although Willems or others may have arrived at the result prior to Cottrell, most references cite Cottrell’s work as the original source of this important guidance law — this is likely due to the accessibility, clarity and conciseness of the work. Various papers on missile guidance continued to be developed through the 1980’s; for a literature survey see Pastrick [64]. In 1980 Guelman and Shinar made significant progress in optimal control of aero- dynamic missiles that cannot effect axial acceleration [37], [38]. 
However, Guelman and Shinar’s work did not lead to a closed-form feedback controller, which limits its potential application to practical weapon systems — a shortcoming that is overcome in the work contained in this dissertation. In 1991, Rusnak and Meir considered the intercept problem for a general flight control system model with inequality constraints

on acceleration magnitude [70]. In their paper, Rusnak and Meir showed that the effect of the magnitude constraint on control results in a saturating controller when the flight control system is minimum phase. In 1995, Aggarwall used the approach taken by Rusnak and Meir to develop an "implementable" guidance law that is applicable for a tail-controlled missile exhibiting what is known as the "wrong way effect" [4]. The distinction of Aggarwall’s approach is that the resulting guidance law makes use of states that are directly measurable, thus making implementation feasible. In 2002, Zarchan and Alpert also developed an optimal controller for a tail-controlled missile by numerically solving the Riccati equation to obtain the controller gains as a function of time-to-go [102]. These gains could theoretically be used in an engagement by simple table look-up. The most noteworthy of Zarchan and Alpert’s results is that the more sophisticated guidance law fails to show remarkable improvement over that developed by Cottrell [23], which models the flight control system as a single lag. In fact, the only case where a significant performance increase was observed seems to be at a set of flight control system parameters which are likely never to be observed in a physical missile (a zero at five radians per second and a pole at 20 radians per second).

Other control techniques have been applied to the problem of missile guidance in the past few decades. What distinguishes these techniques from much of the pre-1990s literature is the focus on nonlinearities and other difficulties. This said, it is very difficult to make an assessment of the quality of the resulting guidance laws because an exhaustive study would necessitate a designed experiment [56] involving possibly hundreds of parameters in a high-fidelity, six-degree-of-freedom (6-DOF) simulation.

3 Differential game theory is a branch of the more general field of game theory developed by von Neumann in 1944 [62].
As evidence of this, the reader is referred to Nesline and Zarchan’s classic paper [59], wherein PN is compared with Cottrell’s single pole model [23]. In this reference, Nesline and Zarchan show that there are real-world situations, such as large radome slope and poor time-to-go estimation, where PN is superior to the more advanced optimal control law — that is, PN is more robust. In fact, the reason PN is still used in modern missiles is because it has been shown to be an extremely robust guidance law. Despite the difficulty in determining the performance and robustness of complex guidance schemes, new guidance schemes continue to be developed, often accompanied by paltry numerical results that compare them (usually favorably) to PN or some variant. This point is made here because several new guidance schemes are presented in this dissertation. While numerical comparisons are necessarily made, they are not relied on to imply superiority of certain guidance schemes and should be viewed with the same skepticism as one would view any other paper showing similar results. This is not because these situations were selectively picked, but rather because it is nearly impossible to exhaustively compare complex guidance schemes. Fortunately, the guidance laws developed in this dissertation are amenable to qualitative study and also to direct comparison with other well known guidance schemes.

As the previous paragraph states, several other approaches have been used to develop complex guidance schemes that are applicable in challenging engagement situations. Lyapunov methods have been applied by Vincent and Morgan in 2002 [95] and by Lechevin and Rabbath in 2004 [50]. Geometric approaches to missile guidance have been examined by Duflos et al. in 1995 [25] and by Chiou and Kuo [22] in 1998. Feedback linearization was used by Bezick et al. in 1995 [11].
The universal approximation ability of fuzzy models and neural network models has also been used in addressing the missile guidance problem. For the problem of missile guidance, the author’s preference is fuzzy models simply because the resulting models are interpretable, whereas the counterpart neural networks are "black box" systems4. Application of these "soft" techniques to missile guidance has produced some promising results and seems to be where the momentum is heading in the area of missile guidance. There are many new papers in this area, three of which are [33],

4 In a missile that costs upwards of a million dollars, risk is an important factor. Accordingly, an interpretable system, from which intuitive explanations can be drawn, will be preferred to a black box system, even if the black box system performs slightly better.

[51] and [52].

1.3 Organization

This section explains the organization and content of this dissertation. While the main focus of this dissertation is on missile guidance, the dissertation also considers closely related topics. To better explain this, a functional diagram of a missile’s major subsystems and their inter-relationships is shown in Figure 1.2. The major subsystems shown in Figure 1.2 are explained in the sentences that follow. The seeker is an instrument used to detect, acquire, and track a target by sensing some band(s) of the electromagnetic spectrum. The seeker’s measurements are then passed to an appropriate filter (or estimator) to obtain an approximation of pertinent engagement parameters needed for guidance. These parameters (possibly along with other sensor measurements) are used by the guidance law to determine the desired missile maneuver. The autopilot generates appropriate actuator commands which will most efficiently produce the desired missile maneuver, while maintaining missile stability. An actuator is used to alter the external geometry of the missile by means of fin deflection, tail deflection, canard deflection, thrust control, or some combination of these commands. Modern missiles may also employ divert thrusters (e.g. Raytheon’s Exoatmospheric Kill Vehicle - EKV [68, EKV product sheet]) to achieve the desired missile maneuver. “The airframe serves two purposes. First, it is the container for all the other subsystems (including the payload). Secondly, by proper design and in partnership with the propulsion, it can be used to effectively produce the required lift and drag forces for accomplishing the mission objectives” [35, 1979, pp. 3-4]. The kinematic blocks represent the governing physics for the engagement. Having introduced and briefly explained a functional diagram of the missile, the reader may now be able to better appreciate the organization of this dissertation.

Figure 1.2. Major missile subsystems.

This dissertation contains ten chapters and two appendices. The two appendices, one on probability and stochastic processes and the other on certain topics in systems theory, provide information and derivations that are used elsewhere in the dissertation, but are of secondary importance. The first chapter is an introductory chapter that provides background information and defines the scope of the dissertation. It also contains a section on the dissertation’s organization. The second, third and fourth chapters of this dissertation deal with estimation theory and stochastic motion modeling. These topics are of direct importance to missile guidance because an estimate of pertinent engagement parameters is necessary to guide the missile to its target. Chapter 2 of the dissertation presents fundamental concepts in estimation theory and lays the groundwork for Chapter 3 on estimation in linear sampled data systems. Chapter 3 is included because the equations describing the engagement are fundamentally linear and the measurements come at discrete time intervals. In practice, there are often deviations from the linearity assumption, but these are often accounted for by modifying the basic equations developed for linear systems — e.g. the Extended Kalman Filter (EKF) and many other variants. Chapter 4, which deals with stochastic motion modeling, is included because of the inherent uncertainty involved in tracking a target. Application of an appropriate stochastic motion model allows the missile tracking algorithm to place bounds on the uncertainty and often test out different motion model hypotheses (multiple model estimation) during the tracking process.

Chapter 5 develops fundamental concepts in optimization and advanced control theory. Geometrical arguments are used to develop the theory of constrained parametric optimization. Lyapunov control theory ([41], [94], and [92, ch. 5]) is developed and an example of a quickest descent Lyapunov controller is given.
Optimal control theory is developed using geometrical arguments based on semipermeable surfaces in n-dimensional space. This technique, which is covered in more detail in Vincent’s book on optimal control [92], avoids the use of functionals common to many other variational approaches to optimal control theory (e.g. [20] and [47]). The linear quadratic regulator is developed and an example is given. A moderately rigorous derivation of the separation theorem for linear sampled data systems is then provided. Chapter 5 concludes with a lengthy statement of the min-max principle for continuous-time systems — the results of which should be directly evident after the development of optimal control theory.

Chapter 6 develops the equations describing the missile airframe and associated controller (autopilot), which together are referred to as the flight control system. An aerodynamic missile cannot instantaneously achieve a desired level of acceleration. A tail-controlled missile is analyzed and the equations governing the commanded-to-achieved acceleration are developed. This configuration results in a non-minimum phase, third-order system. In order for a missile to accelerate in a desired direction, the tail fins must first incline in the opposite direction so as to pitch the body (the main lifting surface) in the desired direction. The initial lift that is created from the tail fins causes the missile to accelerate in the direction that is opposite to the desired direction, a phenomenon known as the "wrong-way" effect that is characteristic of non-minimum phase systems. The complexity of the non-minimum phase, third-order system motivates the guidance engineer to seek an appropriate modeling approximation. In this vein, a single pole and a pole-zero model are developed and compared to the full flight control system model.

Chapter 7 develops the theory of optimal missile guidance.
The equations governing the miss are developed and then used to define the zero effort miss (ZEM), which is the miss that occurs when the pursuer exerts zero effort. Guidance laws are developed for a missile with a very generic flight control system, where the only requirement is that it (the flight control system) can be represented by a linear, time-invariant ordinary differential equation (or equivalently a transfer function). The resulting guidance law is related to the previously defined zero effort miss. An explicit closed loop guidance law for the single pole flight control system model is given. Optimal evasion (differential game theory) is also considered and shown to result in a guidance law that is simply an amplified (increased gain) version of the guidance law for zero target acceleration. The nonlinear equations describing a missile without axial acceleration capabilities are considered and new results are presented. Chapter 7 concludes with a discussion of time-to-go estimation, wherein a new time-to-go algorithm is presented.

Chapter 8 takes up the important topic of estimating the variables required by all previously developed guidance laws, more specifically with estimating the zero effort miss. Stochastic motion modeling, as discussed in Chapter 4, is used to specify the equations describing the motion of the target. The missile is modeled by the flight control system discussed in Chapter 6. The engagement equations are given in terms of variables that are relevant for a sensor mounted on a gimbaled platform, namely equations relating the measured boresight angles and other sensor data. The resulting equations are shown to be amenable to estimation by the Kalman filter that was developed in Chapter 3. In Chapter 8, numerical estimation results are not of importance and therefore not pursued, but the interested reader can find them elsewhere, for example in [84].
Chapter 9 of this dissertation ties in Chapters 7 and 8 by invoking the separation theorem. This results in the author’s observation that there are optimal guidance strategies that are themselves dictated by the pursuer’s (missile’s) information constraints (i.e. level of knowledge about the state of the system). Similarly, there are also optimal control laws to achieve a given guidance strategy. Hitherto, the guidance strategy and control strategy have been relatively inseparable and collectively referred to as guidance laws. The main exception to this is the work by Schneydor [78], where it is evident from the organization of his book that he too thought such a dichotomy is inherent to missile guidance. Until now any definitive sweeping statement would have been difficult to support, even if it were suspected — the main reason is that optimal guidance laws have only been devised for linear systems. Any nonlinearity makes solving the differential equations by analytical means impossible (at least using present day solution techniques that do not involve power series), and therefore limits the practical applicability of optimal control theory to nonlinear missile guidance5. However, the new results obtained in Chapter 7 solve the optimal control problem for a nonlinear engagement model by indirect means. The solution indicates that the guidance strategies that are optimal for linear engagement problems are also optimal for nonlinear engagements. The nonlinearities only affect how the control law will achieve an optimal strategy.

Chapter 10 demonstrates the usefulness of the strategy and control dichotomy paradigm. The paradigm is first used to show that Lyapunov control theory can be used to obtain a guidance law for a missile by choosing the Lyapunov function to be associated with the applicable guidance strategy. Next, the paradigm is used to separate existing guidance laws into strategies and control laws.
Once this has been done, a guidance law can be extended to higher-level strategies. This concept is used

to extend the well known pure proportional navigation guidance law to a higher-level strategy that should be much more effective against maneuvering targets. The last section of Chapter 10 provides a conclusion for the dissertation.

5 There are exceptions to this, for example see [95].

1.4 Notation

This dissertation covers several areas of engineering, each of which has a relatively standard notation. Whenever possible, the commonly accepted standards were adhered to. For example, vectors are usually depicted as bold faced, lower-case letters and matrices are usually depicted as bold faced, capital letters. Most of the other use of symbols and accents should be clear from the context in which it occurs. However, the author feels the need to briefly discuss the notation used for random variables.

This dissertation makes frequent use of random variables. It is standard practice to distinguish the name of a random variable from a particular value the random variable takes. This is typically done by either using a bold face font or capitalization. This convention presents a difficulty in this dissertation because bold face is often reserved to distinguish vector quantities from scalar quantities and capitalization is often used to distinguish matrices from vectors. This conflict becomes most apparent when working with functions of random variables, where the previously mentioned notation is used to distinguish scalars, vectors and matrices. Without any other satisfactory solution to this dilemma, the author chose to capitalize the name of a random variable only when it would have caused confusion not to do so. For example, the standard syntax for the expectation of a scalar random variable X is

E{X} = ∫_{−∞}^{∞} x f_X(x) dx .   (1.1)

However, this dissertation often uses the syntax

E{x} = ∫_{−∞}^{∞} x f_X(x) dx ,   (1.2)

because it is clear that x in the expectation operator refers to the random variable itself rather than any value the random variable might take. A case when such an abuse in notation would not have been acceptable is when stating the probability that the random variable X takes on a value less than x

P(X ≤ x) = F_X(x) = ∫_{−∞}^{x} f_X(ξ) dξ .   (1.3)

In this case, it is necessary to explicitly distinguish the name of the random variable from the argument x.

Chapter 2 Estimation

This chapter presents fundamental concepts in estimation theory and lays the groundwork for Chapter 3 on estimation in linear sampled data systems. The first section of this chapter provides a definition of estimation and a discussion of optimal estimation. This section is followed by a section that develops the fundamental equations for Bayesian estimation. The next section extends the Bayesian estimator to the case when there are multiple model descriptions of the system, with each model having a specified probability of being the correct model. The final section of this chapter provides a discussion of linear estimation.

2.1 General Concepts in Estimation

Estimation has been the topic of many papers as well as textbooks. Perhaps one of the most popular books on the subject is [32], which is in its 16th printing since its original publication in 1974. The book has probably enjoyed so much success because of its practical approach to the subject and its many examples. Another excellent reference is [86], which develops the topic succinctly from a least-squares perspective. A moderately rigorous approach to the subject is given in [3]. Reference [19] is useful for those unfamiliar with both random processes and estimation; this book also provides Matlab exercises and solutions. Bryson and Ho’s celebrated optimal control book provides a nice introduction to the subject for those familiar with control theory [43].

2.1.1 Estimation Defined

The following definitions come from reference [1]. An estimation (noun) is an approximate calculation (or evaluation/opinion/judgement) of the amount, magnitude, extent, position or value of something. An estimator is an algorithm or tool used to compute an estimation. To estimate (verb) something means to generate an estimation. Alternately, the noun form of the word estimate has the same meaning as the noun form of the word estimation. For this reason, the terms estimate (noun) and estimation (noun) are used interchangeably. However, the term estimator should never be used interchangeably with the terms estimate or estimation.

The remainder of this section develops basic concepts in estimation theory. The Markov model is defined and related to the problem of object tracking. The concepts of optimal estimation and risk are developed and related to maximum likelihood estimation and minimum mean square error (MMSE) estimation.

2.1.2 Markov Model

Consider the vector system [6]

x_k = f_k(x_{k−1}, v_{k−1}) ,   (2.1)

where x_k is the system state vector; f_k : ℝ^{n_x} × ℝ^{n_v} → ℝ^{n_x} is a possibly nonlinear function; v_k is an i.i.d. (independently and identically distributed) process noise sequence; and n_x and n_v are the dimensions of the state and noise vectors. The system produces a Markov sequence because v_k is i.i.d. and the function f_k only depends on x_{k−1} (and no prior values of x). A Markov sequence has properties, some of which are derived in Section A.8.2, that make it amenable to state estimation. These properties will be exploited in the sections that follow. A measurement equation is defined by

z_k = h_k(x_k, n_k) ,   (2.2)

where z_k is the measurement vector; h_k : ℝ^{n_x} × ℝ^{n_n} → ℝ^{n_z} is a possibly nonlinear function; n_k is an i.i.d. measurement noise sequence; and n_z and n_n are the dimensions

of the measurement and measurement noise vectors. The notation z_{1:k} will be used to denote the set of z_i measurements with i ∈ [1, k]. The model given by Eq. 2.1 and Eq. 2.2 is a hidden Markov model (HMM) because the data available to the observer are not the evolution of the states but a second stochastic process that is a probabilistic function of the states [85, Sec. 9.5].

In the context of object tracking, the state x_k usually consists of the position, velocity, acceleration, and other features of the object. The observation z_k is usually a video frame at the current time instance [89], [83, p. 744].
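The hidden Markov model of Eqs. 2.1 and 2.2 can be made concrete with a minimal simulation. The sketch below is an illustrative assumption rather than a model taken from this dissertation: it uses a scalar random-walk state, f_k(x, v) = x + v, observed through h_k(x, n) = x + n, with i.i.d. Gaussian noise sequences; the function name and noise levels are invented for the example.

```python
import random

def simulate_hmm(steps, q=0.1, r=0.5, seed=0):
    """Simulate Eqs. 2.1-2.2 for a scalar random-walk state:
    x_k = x_{k-1} + v_{k-1} (state equation) and z_k = x_k + n_k
    (measurement equation), with i.i.d. Gaussian noise sequences."""
    rng = random.Random(seed)
    x = 0.0
    states, measurements = [], []
    for _ in range(steps):
        x = x + rng.gauss(0.0, q)                   # Eq. 2.1: hidden state transition
        states.append(x)
        measurements.append(x + rng.gauss(0.0, r))  # Eq. 2.2: observed sequence
    return states, measurements

xs, zs = simulate_hmm(100)
```

The observer sees only the sequence zs; the estimation problem developed in the following sections is to recover the density of the hidden states from that sequence.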

2.1.3 Optimal Estimation and Risk

Reference [32] has the following to say about optimal estimation: “An optimal estimator is a computational algorithm that processes measurements to deduce a minimum error (in accordance with some stated criterion of optimality) estimate of the state of a system by utilizing: knowledge of system and measurement dynamics, assumed statistics of system noises and measurement errors, and initial condition information.” Any time the term optimal is used, one should immediately look for the associated index or cost. That is, a quantity that is optimal with respect to one index is not necessarily optimal with respect to another index.

Suppose that k measurements and an initial estimate x̂_0 of the state x_0 are available. Using this information, the goal is to form an optimal estimate x̂_k of the state x_k. Let a scalar function L(x̂_k, x_k) represent the loss that occurs as a result of choosing x̂_k as the estimation when the true state is x_k. The estimate can be a function of all measurements and the initial estimate

x̂_k = g(z_{1:k}, x̂_0) .   (2.3)

To simplify notation, we note that the initial estimate x̂_0 can also be thought of as a measurement z_0, and so the estimator takes the form

x̂_k = g(z_{0:k}) .   (2.4)

The risk R of using the function g(z_{0:k}) to compute an estimate x̂_k of the state x_k is defined to be the expected value of the loss conditioned on the measurement data z_{0:k}:

R(g(·)) ≜ E{L(x̂_k, x_k) | z_{0:k}}
        = ∫_{−∞}^{∞} L(x̂_k, x_k) f(x_k | z_{0:k}) dx_k
        = ∫_{−∞}^{∞} L(g(z_{0:k}), x_k) f(x_k | z_{0:k}) dx_k .   (2.5)

Minimum Mean Square Error. Suppose that the loss is a quadratic function of the estimation error:

L(x̂_k, x_k) = (x̂_k − x_k)^T S (x̂_k − x_k) .   (2.6)

Then, the risk is given by

R(g(·)) = E{L(x̂_k, x_k) | z_{0:k}}
        = E{(x̂_k − x_k)^T S (x̂_k − x_k) | z_{0:k}}
        = x̂_k^T S x̂_k − x̂_k^T S E{x_k | z_{0:k}} − E{x_k^T | z_{0:k}} S x̂_k + E{x_k^T S x_k | z_{0:k}} ,   (2.7)

where x̂_k can be removed from the expectation because it depends only on the conditioning variables z_{0:k}. Further simplification of the risk gives

R(g(·)) = x̂_k^T S x̂_k − x̂_k^T S E{x_k | z_{0:k}} − E{x_k^T | z_{0:k}} S x̂_k + E{x_k^T S x_k | z_{0:k}}
        = (x̂_k − E{x_k | z_{0:k}})^T S (x̂_k − E{x_k | z_{0:k}})
          − E{x_k^T | z_{0:k}} S E{x_k | z_{0:k}} + E{x_k^T S x_k | z_{0:k}} .   (2.8)

Only the first term depends on x̂_k, and it can be made zero if

x̂_k = E{x_k | z_{0:k}} .   (2.9)

Therefore, the estimate provided by Eq. 2.9 minimizes the risk of the quadratic loss function defined in Eq. 2.6. In this sense, the conditional expectation defined in Eq.

2.6 is optimal, and is often referred to as the minimum mean square error (MMSE) estimate. The conditional mean is an unbiased estimate, which can be shown by taking its expectation:

E{x̂_k} = ∫_{−∞}^{∞} E{x_k | z_{0:k}} f(z_{0:k}) dz_{0:k}
        = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x_k f(x_k | z_{0:k}) f(z_{0:k}) dx_k dz_{0:k}
        = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x_k f(x_k, z_{0:k}) dx_k dz_{0:k}
        = ∫_{−∞}^{∞} x_k f(x_k) dx_k
        = E{x_k} .   (2.10)

It has now been established that the MMSE estimate is the conditional mean given in Eq. 2.9. However, Eq. 2.9 is an equation describing an estimate; it is not an estimator. Suppose one had knowledge of the conditional density function f(x_k | z_{0:k}). Then one could compute the MMSE estimation by taking the expectation with respect to the conditional density function:

x̂_k = E{x_k | z_{0:k}} = ∫_{−∞}^{∞} x_k f(x_k | z_{0:k}) dx_k .   (2.11)

Maximum Likelihood Estimation. Suppose that the loss is defined to be 1 if the absolute value of the error is greater than ε and zero otherwise:

L(x̂_k, x_k) = 1 if |x̂_k − x_k| > ε, 0 otherwise .   (2.12)

The loss could be modified slightly; for example, a quadratic form could be used:

L(x̂_k, x_k) = 1 if (x̂_k − x_k)^T S (x̂_k − x_k) > ε, 0 otherwise .   (2.13)

The main point is to choose a loss that places zero penalty on estimates that are very accurate and places a large, but equal, penalty on all other estimates. The risk is given by

R(x̂_k) = ∫_{|x̂_k − x_k| > ε} f(x_k | z_{0:k}) dx_k
        = 1 − ∫_{|x̂_k − x_k| < ε} f(x_k | z_{0:k}) dx_k .   (2.14)

If ε is very small,

R(x̂_k) − 1 ∝ −f(x̂_k | z_{0:k}) .   (2.15)

Minimizing R(x̂_k) over all x̂_k is equivalent to minimizing R(x̂_k) − 1 over all x̂_k. The minimum value occurs when f(x̂_k | z_{0:k}) is maximum. That is, the optimal estimate corresponds to the value of x_k that maximizes the conditional density:

x̂_k = arg max_{x_k} f(x_k | z_{0:k}) .   (2.16)

A density function has a maximum at a point where its derivative is zero. Any point in a density function with a zero derivative is called a mode of the density function. The reason why modes are important is because they represent a locally most likely value:

P(x ≤ x_k ≤ x + dx | z_{0:k}) = ∫_x^{x+dx} f(x_k | z_{0:k}) dx_k = f(x | z_{0:k}) dx + higher order terms .   (2.17)

For small dx, this probability is a maximum when the conditional density is a maximum. The estimate x̂ which corresponds to the peak in the conditional density function f(x | z_{0:k}) is more likely than any other value of x. For this reason, the estimate provided by Eq. 2.16 is called a maximum likelihood estimate (MLE).
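The difference between the MMSE estimate (Eq. 2.9) and the MLE (Eq. 2.16) is easily seen by evaluating both on the same conditional density. The following sketch is a hypothetical example, with the density tabulated on a grid and the integral of Eq. 2.11 and the maximization of Eq. 2.16 replaced by their discrete analogues; the Gamma-shaped posterior and all names are assumptions made for illustration.

```python
import math

def mmse_and_mle(grid, density):
    """Approximate the conditional mean (Eq. 2.11) and the mode (Eq. 2.16)
    of a conditional density f(x_k | z_{0:k}) tabulated on a uniform grid."""
    dx = grid[1] - grid[0]
    norm = sum(density) * dx                      # normalize the tabulated density
    mean = sum(x * f for x, f in zip(grid, density)) * dx / norm
    mode = grid[max(range(len(grid)), key=lambda i: density[i])]
    return mean, mode

# Skewed posterior with a Gamma(2,1) shape: mode at 1, mean near 2.
grid = [i * 0.01 for i in range(1001)]            # x in [0, 10]
density = [x * math.exp(-x) for x in grid]
mmse, mle = mmse_and_mle(grid, density)
```

For a symmetric, unimodal posterior the two estimates coincide; it is the skew of this example that separates the conditional mean from the mode.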

General Loss Functions. If the loss were not of the types already mentioned, one could still make use of the conditional density to compute the risk of choosing a particular x̂_k:

R(x̂_k) ≜ E{L(x̂_k, x_k) | z_{0:k}} = ∫_{−∞}^{∞} L(x̂_k, x_k) f(x_k | z_{0:k}) dx_k .   (2.18)

Then, one could select the value of x̂_k that produces the minimum risk.

It seems that knowledge of the conditional density f(x_k | z_{0:k}) is the key to optimal estimation, whether one is concerned with a MMSE estimate, a maximum likelihood estimate or some other criterion. Bayes’ formula provides a means by which the conditional density f(x_k | z_{0:k}) may be obtained given an a priori density function f(x_k | z_{1:k−1}) and a new measurement z_k.

2.2 Bayesian Estimation

This section develops the general equations for Bayesian estimation. Assume that an initial density function for the state vector is given as f(x_0) and define

f(x_0 | z_0) ≜ f(x_0) .   (2.19)

The goal is to use f(x_{k−1} | z_{0:k−1}) to form a new estimate f(x_k | z_{0:k}) when a new measurement z_k becomes available. The a priori density function f(x_k | z_{0:k−1}) can be found from integration of the density function f(x_k, x_{k−1} | z_{0:k−1}):

f(x_k | z_{0:k−1}) = ∫_{−∞}^{∞} f(x_k, x_{k−1} | z_{0:k−1}) dx_{k−1}
                  = ∫_{−∞}^{∞} f(x_k | x_{k−1}, z_{0:k−1}) f(x_{k−1} | z_{0:k−1}) dx_{k−1}
                  = ∫_{−∞}^{∞} f(x_k | x_{k−1}) f(x_{k−1} | z_{0:k−1}) dx_{k−1} ,   (2.20)

where the last equality follows because the state x_k described by Eq. 2.1 is a Markov sequence. The probabilistic model of the state evolution f(x_k | x_{k−1}) is known from Eq. 2.1 and the known statistics of v_{k−1}. The density function f(x_{k−1} | z_{0:k−1}) is known from a previous iteration. Another consequence that is readily recognizable from the Markov model given by Eq. 2.1 and the measurement equation given by Eq. 2.2 is

f(z_k | x_k, z_{0:k−1}) = f(z_k | x_k) .   (2.21)

The a posteriori estimate f(x_k | z_{0:k}) can be found using Bayes’ formula:

f(x_k | z_{0:k}) = f(z_{0:k}, x_k) / f(z_{0:k})
                 = f(z_k | z_{0:k−1}, x_k) f(z_{0:k−1}, x_k) / [f(z_k | z_{0:k−1}) f(z_{0:k−1})]
                 = f(z_k | z_{0:k−1}, x_k) f(x_k | z_{0:k−1}) f(z_{0:k−1}) / [f(z_k | z_{0:k−1}) f(z_{0:k−1})]
                 = f(z_k | z_{0:k−1}, x_k) f(x_k | z_{0:k−1}) / f(z_k | z_{0:k−1})
                 = f(z_k | x_k) f(x_k | z_{0:k−1}) / f(z_k | z_{0:k−1}) .   (2.22)

The density f(z_k | x_k) in Eq. 2.22 is known from Eq. 2.2 and the known statistics of n_k. The term f(x_k | z_{0:k−1}) is known from Eq. 2.20. The term f(z_k | z_{0:k−1}) is obtained by rearranging Eq. 2.22 and integrating:

f(z_k | z_{0:k−1}) = ∫_{−∞}^{∞} f(z_k | x_k) f(x_k | z_{0:k−1}) dx_k .   (2.23)

Eqs. 2.20-2.23 are the necessary equations for updating the conditional density function f(x_{k−1} | z_{0:k−1}) → f(x_k | z_{0:k}). The algorithm would work as follows. Process the measurement by evaluating the density f(z_k | x_k) at the measurement value z_k. Next compute Eq. 2.20. Using Eq. 2.20 and f(z_k | x_k), compute Eq. 2.23. Finally, compute the a posteriori density function given by Eq. 2.22. With the a posteriori estimation known, one could compute an MMSE estimate (Eq. 2.11), an MLE estimate (Eq. 2.16) or any other minimum risk estimate. The following section extends the Bayesian equations to a multiple model context, in which the underlying process is described by a set of models, each with an associated probability of being the true process model.
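For low-dimensional states, the recursion of Eqs. 2.20-2.23 can be carried out numerically by tabulating the densities on a grid and replacing the integrals with sums. The sketch below is an illustrative assumption rather than a method prescribed by this dissertation: it uses a scalar random-walk state with Gaussian transition and measurement densities (Chapter 3 instead specializes these equations to linear sampled data systems), and all names and noise levels are invented for the example.

```python
import math

def gauss(x, mu, sigma):
    """Gaussian density, used here for both f(x_k | x_{k-1}) and f(z_k | x_k)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def bayes_filter_step(grid, prior, z, q=0.5, r=0.5):
    """One cycle of Eqs. 2.20-2.23 on a grid, for x_k = x_{k-1} + v and
    z_k = x_k + n. `prior` tabulates f(x_{k-1} | z_{0:k-1}) on `grid`."""
    dx = grid[1] - grid[0]
    # Eq. 2.20: a priori density f(x_k | z_{0:k-1}) via the transition density.
    predicted = [sum(gauss(x, xp, q) * fp for xp, fp in zip(grid, prior)) * dx
                 for x in grid]
    # f(z_k | x_k) evaluated at the measurement value z (first step of the algorithm).
    likelihood = [gauss(z, x, r) for x in grid]
    # Eq. 2.23: normalizing term f(z_k | z_{0:k-1}).
    evidence = sum(l * p for l, p in zip(likelihood, predicted)) * dx
    # Eq. 2.22: a posteriori density f(x_k | z_{0:k}).
    return [l * p / evidence for l, p in zip(likelihood, predicted)]

grid = [i * 0.1 - 10.0 for i in range(201)]        # x in [-10, 10], dx = 0.1
posterior = [gauss(x, 0.0, 2.0) for x in grid]     # initial density f(x_0)
for z in [1.0, 1.5, 1.2]:                          # process the measurements in turn
    posterior = bayes_filter_step(grid, posterior, z)
mmse = sum(x * f for x, f in zip(grid, posterior)) * 0.1   # Eq. 2.11 on the grid
```

Each call performs one prediction (Eq. 2.20) and one measurement update (Eqs. 2.22-2.23); any of the minimum risk estimates of Section 2.1.3 can then be read off the tabulated posterior.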

Figure 2.1. Block diagram of multiple-model system [7, p. 127].

2.3 Multiple Model

This section extends the Bayesian estimator to the case when there are multiple model descriptions of the system, with each model having a specified probability of being the correct model. The basic idea behind the multiple model approach is shown in Figure 2.1. Each model can have a different state (process) model and/or a different measurement model. The model probabilities are computed according to Bayes’ rule, as discussed in Section 2.2 with all quantities conditioned on a particular model. Let

M_j be the event that model j is correct with prior probability

P(M_j) = μ_j(0) ,   (2.24)

and corresponding density

f_M(m) = Σ_{j=1}^{r} μ_j(0) δ(m − j) .   (2.25)

Assume that an initial density function for the state vector is given as f(x_0 | m) and define

f(x_0 | z_0, m) ≜ f(x_0 | m) .   (2.26)

Assume that k measurements have been made and are represented by z_{1:k}. The density of the current state x_k conditioned on the measurements z_{1:k−1} and the model choice m can be computed using Eq. 2.20:

f(x_k | z_{1:k−1}, m) = ∫_{−∞}^{∞} f(x_k | x_{k−1}, m) f(x_{k−1} | z_{1:k−1}, m) dx_{k−1} ,   (2.27)

where the density f(x_k | x_{k−1}, m) is implicitly given from model m and f(x_{k−1} | z_{1:k−1}, m) is known from a previous iteration. The density of a new measurement z_k conditioned on the measurements z_{1:k−1} and the model choice m can be computed using Eq. 2.23:

f(z_k | z_{1:k−1}, m) = ∫_{−∞}^{∞} f(z_k | x_k, m) f(x_k | z_{1:k−1}, m) dx_k ,   (2.28)

where the density f(z_k | x_k, m) is implicitly given from model m and f(x_k | z_{1:k−1}, m) is given by Eq. 2.27. The density of x_k conditioned on the measurements z_{1:k} and the model choice m can be computed using Eq. 2.22:

f(x_k | z_{1:k}, m) = f(z_k | x_k, m) f(x_k | z_{1:k−1}, m) / f(z_k | z_{1:k−1}, m) ,   (2.29)

where the density f(z_k | x_k, m) is implicitly given from model m; the density f(x_k | z_{1:k−1}, m) is given by Eq. 2.27; and f(z_k | z_{1:k−1}, m) is given by Eq. 2.28. The density of z_{1:k} conditioned on the chosen model m is given by Bayes’ rule:

f(z_{1:k} | m) = f(z_k, z_{1:k−1} | m) = f(z_k | z_{1:k−1}, m) f(z_{1:k−1} | m) ,   (2.30)

where f(z_k | z_{1:k−1}, m) is given by Eq. 2.28 and f(z_{1:k−1} | m) is known from a previous iteration. The a posteriori model density, which is the density of m conditioned on the measurements z_{1:k}, is given by Bayes’ rule:

f(m | z_{1:k}) = f(z_{1:k} | m) f_M(m) / f(z_{1:k})
              = f(z_{1:k} | m) f_M(m) / ∫_{−∞}^{∞} f(z_{1:k} | m) f_M(m) dm ,   (2.31)

where f(z_{1:k} | m) is given by Eq. 2.30 and f_M(m) is given by Eq. 2.25. The expectation of the state conditioned on the measurements z_{1:k} and the model choice m is given by

E{x_k | z_{1:k}, m} = ∫_{−∞}^{∞} x_k f(x_k | z_{1:k}, m) dx_k ,   (2.32)

where f(x_k | z_{1:k}, m) is given by Eq. 2.29. The expectation of the state given the measurements z_{1:k} is given by

∞ ∞ E xk z1:k = xkf (xk,mz1:k) dxkdm { | } | Z−∞ Z−∞ ∞ ∞ = xkf (xk, mz1:k) f (m z1:k) dxkdm | | Z−∞ Z−∞ ∞ = E xk z1:k,m f (m z1:k) dm , (2.33) { | } | Z−∞ where E xk z1:k,m is given by Eq. 2.32 and f (m z1:k) is given by Eq. 2.31. Differ- { | } | ent estimates will be used depending on the loss function chosen (see Section 2.1.3). If the loss is the mean-square estimation error then the conditional expectation, given by Eq. 2.33, will be used. If the loss is such that a maximum-likelihood estimator results, then the value of x will be chosen such that Eq. 2.29 attains a maximum. The results in this section can be greatly simplified if the models are linear, and the process and measurement noise are linear [7, Ch. 4]. These are the same simplifica- tions that result in a Kalman filter being more than the best linear unbiased estimator (BLUE), but also the minimum mean square error (MMSE) estimator. An excellent derivation and example of the multiple model for Gaussian noise is given in [19, pp. 353-361]. 43
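The recursion in Eqs. 2.26-2.33 is easiest to see when the model variable m ranges over a finite set, so the integrals over m become sums. The following Python sketch is an illustration added here, not part of the original derivation; the prior, likelihood, and per-model estimate values are hypothetical numbers.

```python
import numpy as np

def update_model_posterior(prior, likelihoods):
    # Eq. 2.31 with the integral over m replaced by a sum over a finite
    # model set: f(m | z_1:k) is proportional to f(z_1:k | m) f_M(m).
    post = prior * likelihoods
    return post / post.sum()

def mixture_estimate(model_posterior, model_estimates):
    # Eq. 2.33: E{x_k | z_1:k} = sum over m of E{x_k | z_1:k, m} f(m | z_1:k).
    return model_posterior @ model_estimates

# Two hypothetical motion models, equally probable a priori.
prior = np.array([0.5, 0.5])
# Likelihoods f(z_k | z_1:k-1, m) reported by each model's filter (Eq. 2.28).
lik = np.array([0.8, 0.2])
post = update_model_posterior(prior, lik)
# Per-model conditional mean estimates (Eq. 2.32), scalar for simplicity.
xhat = mixture_estimate(post, np.array([1.0, 3.0]))
```

In the mean-square-loss case the mixture estimate is used; for a maximum-likelihood style loss one would instead take the estimate of the most probable model.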

2.3.1 Interactive Multiple Model

The latest research in the area of multiple models is the interacting multiple model (IMM). The basic idea is that each of the models is considered a state in a Markov chain. A model transition matrix is then constructed that contains the probabilities that the object will transition from one motion model to another at a given instant. The interacting multiple model was first proposed by Blom [12, p. 221], [15] and [16]. Sections 2.2 and 2.3 have developed the framework for Bayesian estimation. Practical application of the equations developed therein is application specific. It is often the case that a sub-optimal solution is settled on, with the most common choices being particle filters or linear estimators. The theory of particle filters is beyond the scope of this dissertation, but the interested reader can consult the literature [6], [27]. The other common approach to estimation is the use of linear filters, which is the topic of Section 2.4.
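Before moving on, the Markov-chain idea behind the IMM can be sketched. The fragment below illustrates only the model-probability propagation step, not a full IMM implementation, and the transition probabilities are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical 2-model Markov chain: rows are "from", columns are "to".
# Each model tends to persist (0.95) but can switch (0.05) at any instant.
PI = np.array([[0.95, 0.05],
               [0.05, 0.95]])

def predict_model_probabilities(p, transition=PI):
    # Propagate the model probabilities one step through the Markov
    # chain, as done before the mixing step of an IMM cycle.
    return transition.T @ p

p = np.array([1.0, 0.0])      # start certain of model 1
for _ in range(3):
    p = predict_model_probabilities(p)
```

Over time the transition matrix diffuses the probability mass, so the filter never becomes permanently locked onto a single motion model.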

2.4 Linear Estimation

An MMSE estimator provides the minimum mean square error estimate among all estimators that are a (possibly nonlinear) function of the measurements zk. Except for the special, but important, case when the system equation and measurement equation are linear and have Gaussian noise, it may be difficult to derive an MMSE estimate. For this reason, it is helpful to restrict the estimator to the class of functions that are linear. In this section, no restrictions are imposed on the state model (i.e. it may be nonlinear), but the measurement equation is assumed to have the form

z_k = H_k x_k + v_k .    (2.34)

For the moment, assume no a priori knowledge about x_k is available, either in the form of an initial estimate \hat{x}_{k/k-1} or in the form of measurements z_{0:k-1}. A linear estimator has the form

\hat{x}_k = L z_k ,    (2.35)

where L is an n \times m matrix. Let e_k denote the error in the estimate

e_k = \hat{x}_k - x_k .    (2.36)

2.4.1 Unbiased Estimation

Recall from Eq. 2.10 that an MMSE estimate is unbiased. Requiring the same of the linear MMSE estimator requires the expected value of e_k to be zero

E\{e_k\} = E\{\hat{x}_k - x_k\}
         = E\{L z_k - x_k\}
         = E\{L (H_k x_k + v_k) - x_k\}
         = (L H_k - I_n) E\{x_k\} .    (2.37)

Since the state x_k is not (in general) zero mean, the expected value of e_k is zero if and only if

L H_k = I_n .    (2.38)

At this point, a very important implication of unbiased estimation is made clear. Namely, the matrix H_k must have full column rank for unbiased estimation to be possible. If the matrix H_k is less than full column rank, then it is impossible for any matrix L to satisfy Eq. 2.38. Suppose that the loss function is the mean square error

L(\hat{x}_k, x_k) = e_k^T e_k .    (2.39)

The associated risk is the mean square error. The error covariance is defined by

P_k = E\{e_k e_k^T\}
    = E\{(\hat{x}_k - x_k)(\hat{x}_k - x_k)^T\}
    = E\{(L z_k - x_k)(L z_k - x_k)^T\}
    = E\{(L (H_k x_k + v_k) - x_k)(L (H_k x_k + v_k) - x_k)^T\}
    = E\{(L v_k)(L v_k)^T\}
    = L E\{v_k v_k^T\} L^T
    = L R_k L^T .    (2.40)

The trace of the error covariance is equal to the risk of the loss function

Tr(P_k) = E\{(\hat{x}_k - x_k)^T (\hat{x}_k - x_k)\}
        = E\{L(\hat{x}_k, x_k)\}
        = \mathcal{R}(\hat{x}_k) .    (2.41)

If one can find a matrix L that minimizes the trace of the error covariance P_k, then the mean square error will also be minimized. The estimator that satisfies this criterion is often referred to as the best linear unbiased estimator (BLUE).

2.4.2 BLUE Estimation

The best linear unbiased estimator (BLUE) provides an estimate \hat{x}_k that is both unbiased and has the minimum variance among all linear estimators. The BLUE can be obtained by minimizing the trace of the covariance P_k, given by Eq. 2.40, subject to the constraint imposed on all unbiased estimators, given by Eq. 2.38

\min_{L : L H_k = I_n} Tr(L R_k L^T) .    (2.42)

Theorem 2.1. The BLUE estimator satisfies the equation

H_k^T R_k^{-1} H_k \hat{x}_k = H_k^T R_k^{-1} z_k .    (2.43)

Proof. Determining the matrix L that satisfies Eq. 2.42 is a constrained optimization problem and can be solved using Lagrange multipliers. A slight modification is necessary because the constraint is a matrix constraint. To accomplish this, we will first find the BLUE of a linear combination of the elements of the state vector

y = \gamma^T x_k ,    (2.44)

where \gamma is an n \times 1 vector. A linear unbiased estimator of y, denoted by \hat{y}, will be of the form

\hat{y} = \theta^T z_k ,    (2.45)

where \theta is an m \times 1 vector. If the estimator is to be unbiased we must have \gamma = H_k^T \theta, as can be shown by

E\{y - \hat{y}\} = E\{y\} - E\{\hat{y}\}
                = \gamma^T E\{x_k\} - \theta^T E\{z_k\}
                = \gamma^T E\{x_k\} - \theta^T H_k E\{x_k\}
                = (\gamma^T - \theta^T H_k) E\{x_k\}
                = 0 ,    (2.46)

or

\gamma = H_k^T \theta .    (2.47)

The corresponding covariance is given by

P_{\hat{y}} = E\{(\hat{y} - y)^2\}
            = E\{(\theta^T z_k - \gamma^T x_k)^2\}
            = E\{(\theta^T (H_k x_k + v_k) - \gamma^T x_k)^2\}
            = E\{(\theta^T H_k x_k - \gamma^T x_k + \theta^T v_k)^2\}
            = E\{(\theta^T H_k x_k - \gamma^T x_k)^2\} + E\{(\theta^T v_k)^2\}
            = (\theta^T H_k x_k - \gamma^T x_k)^2 + \theta^T R_k \theta
            = (\theta^T H_k x_k - \theta^T H_k x_k)^2 + \theta^T R_k \theta
            = \theta^T R_k \theta .    (2.48)

Thus, we seek to minimize \theta^T R_k \theta subject to \gamma = H_k^T \theta

\mathcal{L} = \theta^T R_k \theta + \lambda^T (\gamma - H_k^T \theta) ,    (2.49)

where \lambda is an n \times 1 vector of Lagrange multipliers. Taking the derivative with respect to \theta and setting the result equal to zero gives

2 \theta^T R_k - \lambda^T H_k^T = 0 ,    (2.50)

or

\theta = \frac{1}{2} R_k^{-1} H_k \lambda .    (2.51)

Substituting this result into the constraint equation gives

\gamma = \frac{1}{2} H_k^T R_k^{-1} H_k \lambda .    (2.52)

Before proceeding further, assume that the above optimization process is performed for \gamma = \gamma_i, a unit vector with the i-th element equal to one. Then, y_i is the i-th element of the state vector x_k and the equations of importance are (from the preceding analysis) given below.

y_i = \gamma_i^T x_k    (2.53a)
\hat{y}_i = \theta_i^T z_k    (2.53b)
\theta_i = \frac{1}{2} R_k^{-1} H_k \lambda_i    (2.53c)
\gamma_i = \frac{1}{2} H_k^T R_k^{-1} H_k \lambda_i    (2.53d)

Define the following matrices.

\Lambda = [ \lambda_1 \ \lambda_2 \ \cdots \ \lambda_n ]    (2.54a)
I_n = [ \gamma_1 \ \gamma_2 \ \cdots \ \gamma_n ]    (2.54b)
\Theta = [ \theta_1 \ \theta_2 \ \cdots \ \theta_n ]    (2.54c)

Then, in matrix form, the collected equations are given below.

y = x_k    (2.55a)
\hat{y} = \Theta^T z_k    (2.55b)
\Theta = \frac{1}{2} R_k^{-1} H_k \Lambda    (2.55c)
I_n = \frac{1}{2} H_k^T R_k^{-1} H_k \Lambda    (2.55d)

From these, our estimate of x_k becomes

\hat{x}_k = \hat{y} = \Theta^T z_k .    (2.56)

Pre-multiplying by H_k^T R_k^{-1} H_k gives

H_k^T R_k^{-1} H_k \hat{x}_k = H_k^T R_k^{-1} H_k \Theta^T z_k .    (2.57)

Substituting Eq. 2.55c gives

H_k^T R_k^{-1} H_k \hat{x}_k = H_k^T R_k^{-1} H_k \Theta^T z_k
                             = H_k^T R_k^{-1} H_k \left( \frac{1}{2} R_k^{-1} H_k \Lambda \right)^T z_k
                             = \frac{1}{2} H_k^T R_k^{-1} H_k \Lambda^T H_k^T R_k^{-1} z_k .    (2.58)

To simplify this result, we must show that the matrix \Lambda is symmetric. To do this, we make use of Eq. 2.55d and note that S = \frac{1}{2} H_k^T R_k^{-1} H_k is symmetric

S \Lambda = I_n .    (2.59)

As has already been stated, the matrix H_k must be full rank for unbiased estimation to be possible. This result and the previous equation indicate that \Lambda is the inverse of S. Since S is symmetric, \Lambda must also be symmetric

I_n = (S \Lambda)^T = \Lambda^T S .    (2.60)

That is, both \Lambda and \Lambda^T are inverses of the matrix S and therefore

\Lambda = \Lambda^T .    (2.61)

Substituting this result into Eq. 2.58 gives

H_k^T R_k^{-1} H_k \hat{x}_k = \frac{1}{2} H_k^T R_k^{-1} H_k \Lambda^T H_k^T R_k^{-1} z_k
                             = \frac{1}{2} H_k^T R_k^{-1} H_k \Lambda H_k^T R_k^{-1} z_k
                             = H_k^T R_k^{-1} z_k ,    (2.62)

where the last step follows from Eq. 2.55d. Thus, the theorem is proved

H_k^T R_k^{-1} H_k \hat{x}_k = H_k^T R_k^{-1} z_k .    (2.63)

2.4.3 Least Square Estimation

A least square estimator is one that minimizes

\mathcal{L} = (z_k - H_k \hat{x}_k)^T R_k^{-1} (z_k - H_k \hat{x}_k) .    (2.64)

As will now be shown, the least square estimator is also the BLUE estimator. This is an unconstrained optimization problem. Taking the partial derivative with respect to \hat{x}_k and setting the result equal to zero gives

-2 (z_k - H_k \hat{x}_k)^T R_k^{-1} H_k = 0 ,    (2.65)

or

H_k^T R_k^{-1} H_k \hat{x}_k = H_k^T R_k^{-1} z_k .    (2.66)

2.4.4 Maximum Likelihood Estimation

Another way of arriving at the BLUE estimator is to assume that the noise is normally distributed, v_k \sim N(0, R_k), and use a technique known as maximum likelihood estimation. When the noise is Gaussian distributed, the measurement z_k is distributed as N(H_k x_k, R_k) with the corresponding joint PDF

f_Z(z_k | H_k, x_k, R_k) = \frac{1}{(2\pi)^{m/2} |R_k|^{1/2}} \exp\left\{ -\frac{1}{2} (z_k - H_k x_k)^T R_k^{-1} (z_k - H_k x_k) \right\} .    (2.67)

The idea behind maximum likelihood estimation is to choose the parameter values, in this case x_k, such that the observed values of z_k are the most likely ones to have occurred. Since the value of x_k is to be selected so as to maximize the likelihood function, we replace it with the estimate \hat{x}_k to distinguish it from the true value of the state x_k

f_Z(z_k | H_k, \hat{x}_k, R_k) = \frac{1}{(2\pi)^{m/2} |R_k|^{1/2}} \exp\left\{ -\frac{1}{2} (z_k - H_k \hat{x}_k)^T R_k^{-1} (z_k - H_k \hat{x}_k) \right\} .    (2.68)

Now, the likelihood function, which is always non-negative, is maximized when its logarithm is maximized, which is more convenient

\log(L) = \log\left( \frac{1}{(2\pi)^{m/2} |R_k|^{1/2}} \right) - \frac{1}{2} (z_k - H_k \hat{x}_k)^T R_k^{-1} (z_k - H_k \hat{x}_k) .    (2.69)

Taking the partial derivative with respect to \hat{x}_k gives

\frac{\partial (\log(L))}{\partial \hat{x}_k} = \frac{\partial}{\partial \hat{x}_k} \left[ -\frac{1}{2} (z_k - H_k \hat{x}_k)^T R_k^{-1} (z_k - H_k \hat{x}_k) \right]
                                              = (z_k - H_k \hat{x}_k)^T R_k^{-1} H_k .    (2.70)

Setting this result equal to zero gives

\hat{x}_k^T H_k^T R_k^{-1} H_k = z_k^T R_k^{-1} H_k ,    (2.71)

or

H_k^T R_k^{-1} H_k \hat{x}_k = H_k^T R_k^{-1} z_k .    (2.72)

2.4.5 Full Rank Models

For the purposes of this discussion, full rank models are those models for which the m \times n dimensional matrix H_k has full rank; i.e., its rank is n. All three methods of estimation (least squares, maximum likelihood, and BLUE) result in an estimation equation given by

H_k^T R_k^{-1} H_k \hat{x}_k = H_k^T R_k^{-1} z_k .    (2.73)

If the matrix H_k is full rank, the inverse of H_k^T R_k^{-1} H_k exists and \hat{x}_k can be solved for

\hat{x}_k = (H_k^T R_k^{-1} H_k)^{-1} H_k^T R_k^{-1} z_k .    (2.74)

That is, the estimate is of the form given below.

\hat{x}_k = L z_k    (2.75a)
L = (H_k^T R_k^{-1} H_k)^{-1} H_k^T R_k^{-1}    (2.75b)

Since this is a linear unbiased estimator, the covariance of the estimate is given by Eq. 2.40

P_k = L R_k L^T
    = [(H_k^T R_k^{-1} H_k)^{-1} H_k^T R_k^{-1}] R_k [(H_k^T R_k^{-1} H_k)^{-1} H_k^T R_k^{-1}]^T
    = [(H_k^T R_k^{-1} H_k)^{-1} H_k^T R_k^{-1} H_k] (H_k^T R_k^{-1} H_k)^{-1}
    = (H_k^T R_k^{-1} H_k)^{-1} .    (2.76)

The matrix L can be expressed using the updated covariance matrix

L = P_k H_k^T R_k^{-1} .    (2.77)
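The full-rank results of Eqs. 2.74-2.77 can be checked numerically. The sketch below uses a small hypothetical measurement model (the matrices are invented for illustration only) and verifies that the resulting estimator is unbiased (L H_k = I_n) and recovers the state from noiseless data.

```python
import numpy as np

# Hypothetical full-rank measurement model: three scalar measurements
# of a two-element state (m = 3, n = 2), so H has full column rank.
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
R = np.diag([0.1, 0.2, 0.4])          # measurement noise covariance
Rinv = np.linalg.inv(R)

# Error covariance and estimator gain from Eqs. 2.76 and 2.77.
P = np.linalg.inv(H.T @ Rinv @ H)     # Eq. 2.76
L = P @ H.T @ Rinv                    # Eq. 2.77 (equals Eq. 2.75b)

x_true = np.array([2.0, -1.0])
z = H @ x_true                        # noiseless measurement for illustration
x_hat = L @ z                         # Eq. 2.75a
```

With noisy measurements the estimate would instead scatter about x_true with covariance P.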

2.4.6 A Priori Estimates and Rank Deficient Models

It has been found that unbiased estimation is only possible if the matrix H_k is of full column rank. It turns out that this condition is no longer necessary if an a priori estimate is available.

Theorem 2.2. Suppose that an a priori estimate \hat{x}_{k/k-1} is available with a covariance P_{k/k-1}. The BLUE (or least square or maximum likelihood) estimator is given by \hat{x}_{k/k} = \hat{x}_{k/k-1} + K_k (z - H_k \hat{x}_{k/k-1}) and the associated error covariance is given by P_{k/k} = [P_{k/k-1}^{-1} + H_k^T R_k^{-1} H_k]^{-1}, where K_k = P_{k/k} H_k^T R_k^{-1}.

Proof. Combining the a priori information with the measurement equation results in the system of equations given below.

[ \hat{x}_{k/k-1} ; z ] = [ I_n ; H_k ] x_k + m    (2.78a)

E\{m m^T\} = [ P_{k/k-1} \ 0 ; 0 \ R_k ]    (2.78b)

Note that the combined system is in the form of a new measurement equation. To obtain an a posteriori unbiased estimate of x_k, we require

rank( [ I_n ; H_k ] ) = n ,    (2.79)

which is ensured because of the I_n present in the matrix. The BLUE estimator of this system is

\hat{x}_{k/k} = L [ \hat{x}_{k/k-1} ; z ]
             = P_{k/k} [ I_n ; H_k ]^T [ P_{k/k-1} \ 0 ; 0 \ R_k ]^{-1} [ \hat{x}_{k/k-1} ; z ]
             = P_{k/k} [ I_n ; H_k ]^T [ P_{k/k-1}^{-1} \hat{x}_{k/k-1} ; R_k^{-1} z ]
             = P_{k/k} \left[ P_{k/k-1}^{-1} \hat{x}_{k/k-1} + H_k^T R_k^{-1} z \right] ,    (2.80)

where the a posteriori covariance matrix P_{k/k} is given by

P_{k/k} = \left( [ I_n ; H_k ]^T [ P_{k/k-1} \ 0 ; 0 \ R_k ]^{-1} [ I_n ; H_k ] \right)^{-1} .    (2.81)

Taking the inverse of the a posteriori covariance gives

P_{k/k}^{-1} = [ I_n ; H_k ]^T [ P_{k/k-1}^{-1} \ 0 ; 0 \ R_k^{-1} ] [ I_n ; H_k ]
             = P_{k/k-1}^{-1} + H_k^T R_k^{-1} H_k .    (2.82)

Solving this equation for P_{k/k-1}^{-1} and substituting into Eq. 2.80 gives

\hat{x}_{k/k} = P_{k/k} \left[ P_{k/k-1}^{-1} \hat{x}_{k/k-1} + H_k^T R_k^{-1} z \right]
             = P_{k/k} \left[ \left( P_{k/k}^{-1} - H_k^T R_k^{-1} H_k \right) \hat{x}_{k/k-1} + H_k^T R_k^{-1} z \right]
             = \hat{x}_{k/k-1} + P_{k/k} H_k^T R_k^{-1} \left( z - H_k \hat{x}_{k/k-1} \right)
             = \hat{x}_{k/k-1} + K_k \left( z - H_k \hat{x}_{k/k-1} \right) ,    (2.83)

where

K_k = P_{k/k} H_k^T R_k^{-1} .    (2.84)

Therefore, the a posteriori covariance and state estimate are given below.

P_{k/k} = \left( P_{k/k-1}^{-1} + H_k^T R_k^{-1} H_k \right)^{-1}    (2.85)
K_k = P_{k/k} H_k^T R_k^{-1}    (2.86)
\hat{x}_{k/k} = \hat{x}_{k/k-1} + K_k \left( z - H_k \hat{x}_{k/k-1} \right)    (2.87)

The use of this estimator only requires that the rows of H_k be linearly independent, a very reasonable assumption.
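Theorem 2.2 can be illustrated with a deliberately rank-deficient H_k: a single scalar measurement of a two-element state. The numbers below are hypothetical; the point is that the prior makes the update well posed and leaves the unobserved component untouched.

```python
import numpy as np

# Rank-deficient measurement: one scalar observation of the first state
# element only, so H has rank 1 < n = 2 and a prior is required.
H = np.array([[1.0, 0.0]])
R = np.array([[0.25]])

x_prior = np.array([0.0, 0.0])        # a priori estimate
P_prior = np.eye(2)                   # a priori covariance

# A posteriori covariance and gain from Eqs. 2.85 and 2.86.
P_post = np.linalg.inv(np.linalg.inv(P_prior) + H.T @ np.linalg.inv(R) @ H)
K = P_post @ H.T @ np.linalg.inv(R)

z = np.array([1.0])
x_post = x_prior + K @ (z - H @ x_prior)   # Eq. 2.87
```

The measured component moves toward the data and its variance shrinks, while the second (unmeasured) component keeps its prior mean and variance.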

This chapter has introduced the subject of estimation. Optimal estimation was defined in the context of risk. Bayesian estimation was discussed and fundamental equations were derived. Linear estimation was discussed in detail, as a possibly sub-optimal alternative to Bayesian estimation. Chapter 3 will develop optimal estimators for linear systems, with the final result being that linear estimators are optimal for linear systems with linear measurement equations.

Chapter 3 Estimation in Linear Sampled Data Systems

In the present day, when computers are inexpensive and operate at speeds measured in gigahertz, linear sampled data systems are extremely common, and in fact are quickly replacing older (purely) analog systems. This is certainly true of missiles, where traditional analog control systems are now being replaced by more advanced digital control systems. This chapter is concerned with developing estimators for a linear sampled data system. The linear system has the general form discussed in Appendix B and repeated here

\dot{x} = F(t) x(t) + B_S(t) u_S(t) + B_D(t) u_D(t) ,    (3.1)

where u_S is a white stochastic input with autocorrelation given by

E\{u_S(t) u_S^T(t + \tau)\} = A \delta(\tau) ,    (3.2)

and u_D is a deterministic input. The system response may be expressed in terms of the state transition matrix \Phi_x, as given by Eq. B.45 and repeated here

x(t) = \Phi_x(t, t_0) x(t_0) + \int_{t_0}^{t} \Phi_x(t, \tau) [ B_S(\tau) u_S(\tau) + B_D(\tau) u_D(\tau) ] \, d\tau .    (3.3)

Every T seconds, a linear measurement z_k of the system state x(t_k) is made

z_k = H_k x_k + v_k ,    (3.4)

where v_k is a white random sequence. Letting t_0 = t_k and t = t_{k+1} in Eq. 3.3 gives

x(t_{k+1}) = \Phi_x(t_{k+1}, t_k) x(t_k) + \int_{t_k}^{t_{k+1}} \Phi_x(t_{k+1}, \tau) [ B_S(\tau) u_S(\tau) + B_D(\tau) u_D(\tau) ] \, d\tau
           = \Phi_x(t_{k+1}, t_k) x(t_k) + w(t_k) + d(t_k) ,    (3.5)

where

w(t_k) = \int_{t_k}^{t_{k+1}} \Phi_x(t_{k+1}, \tau) B_S(\tau) u_S(\tau) \, d\tau ,    (3.6)

and

d(t_k) = \int_{t_k}^{t_{k+1}} \Phi_x(t_{k+1}, \tau) B_D(\tau) u_D(\tau) \, d\tau .    (3.7)

As discussed in Appendix B, the random variable w(t_k) is a Gaussian white random sequence. For notational convenience, Eq. 3.5 can be written as

x_{k+1} = \Phi_k x_k + w_k + d_k ,    (3.8)

where

x_k = (n \times 1) process state vector at time t_k
\Phi_k = (n \times n) matrix relating x_k to x_{k+1} in the absence of a forcing function
w_k = (n \times 1) white sequence with known covariance structure
d_k = (n \times 1) deterministic input
z_k = (m \times 1) vector measurement at time t_k
H_k = (m \times n) matrix giving the ideal (noiseless) connection between the measurement and the state vector at time t_k
v_k = (m \times 1) measurement error; a white sequence with known covariance structure and having zero cross correlation with the w_k sequence    (3.9)

The autocorrelations of the process noise and measurement noise are denoted by

E\{v_k v_{k+m}^T\} = R_k \delta(m)    (3.10a)
E\{w_k w_{k+m}^T\} = Q_k \delta(m) ,    (3.10b)

where Q_k can be computed as shown in Appendix B

Q_k = \int_{t_k}^{t_{k+1}} \Phi_x(t_{k+1}, \tau) B_S(\tau) A B_S^T(\tau) \Phi_x^T(t_{k+1}, \tau) \, d\tau .    (3.11)

The requirement that the process noise and measurement noise be uncorrelated is not necessary for developing an optimal estimator of the system. Rather, this assumption is merely used to simplify the ensuing derivation of the optimal estimator. Appropriate extensions are available in the literature should the reader have need to make use of them [3], [54].
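For a scalar system \dot{x} = -a x + b u_S, the transition matrix is \Phi_x(t_{k+1}, \tau) = e^{-a(t_{k+1}-\tau)} and Eq. 3.11 integrates in closed form to Q_k = (A b^2 / 2a)(1 - e^{-2aT}). The following sketch (parameter values are arbitrary, chosen only for illustration) cross-checks that closed form against a numerical evaluation of the integral.

```python
import numpy as np

# Hypothetical scalar system: x' = -a x + b u_S, white noise strength A,
# sample period T. The discrete transition is Phi = exp(-a T).
a, b, A, T = 2.0, 1.0, 1.0, 0.1
Phi = np.exp(-a * T)

# Closed-form evaluation of Eq. 3.11 for this scalar case.
Q_analytic = A * b**2 / (2 * a) * (1 - np.exp(-2 * a * T))

# Direct numerical quadrature of Eq. 3.11 (trapezoidal rule).
tau = np.linspace(0.0, T, 10_001)
integrand = np.exp(-2 * a * (T - tau)) * b**2 * A
Q_numeric = np.sum((integrand[1:] + integrand[:-1]) / 2 * np.diff(tau))
```

The same quadrature idea extends to the matrix case, where the integrand is built from the matrix exponential instead of a scalar one.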

3.1 Bayes’ Estimator

The Bayes' approach to optimal estimation comes from the equations developed in Section 2.2. Estimation begins with an initial estimate \hat{x}_{0/0}, which is related to the true state x_0 by

x_0 \sim N(\hat{x}_{0/0}, P_{0/0}) .    (3.12)

That is,

f(x_0) = f(x_0 | z_0) = \frac{ e^{ -\frac{1}{2} (x_0 - \hat{x}_{0/0})^T P_{0/0}^{-1} (x_0 - \hat{x}_{0/0}) } }{ (2\pi)^{n/2} |P_{0/0}|^{1/2} } ,    (3.13)

where the conditioning variable z_0 represents the initial estimate \hat{x}_{0/0} and covariance P_{0/0}. The random variable x_k | x_{k-1} will always be Gaussian because the process noise is Gaussian. Since the random variable x_k | x_{k-1} is Gaussian, it is completely characterized by its mean and covariance, which are readily computed using Eq. 3.8. The conditional mean is given by

E\{x_k | x_{k-1}\} = \Phi_{k-1} x_{k-1} + d_{k-1} .    (3.14)

The conditional covariance is

cov\{x_k | x_{k-1}\} = E\{ (x_k - E\{x_k | x_{k-1}\})(x_k - E\{x_k | x_{k-1}\})^T | x_{k-1} \}
                     = E\{ w_{k-1} w_{k-1}^T | x_{k-1} \}
                     = Q_{k-1} .    (3.15)

Since w_{k-1} is Gaussian, the conditioned random variable is also Gaussian

x_k | x_{k-1} \sim N(\Phi_{k-1} x_{k-1} + d_{k-1}, Q_{k-1}) .    (3.16)

Therefore,

f(x_k | x_{k-1}) = \frac{ e^{ -\frac{1}{2} (x_k - \Phi_{k-1} x_{k-1} - d_{k-1})^T Q_{k-1}^{-1} (x_k - \Phi_{k-1} x_{k-1} - d_{k-1}) } }{ (2\pi)^{n/2} |Q_{k-1}|^{1/2} } .    (3.17)

The random variable z_k | x_k will always be Gaussian because the measurement noise v_k is Gaussian. In the same manner in which f(x_k | x_{k-1}) was determined, it is easy to show that the conditional measurement density is given by

f(z_k | x_k) = \frac{ e^{ -\frac{1}{2} (z_k - H_k x_k)^T R_k^{-1} (z_k - H_k x_k) } }{ (2\pi)^{m/2} |R_k|^{1/2} } .    (3.18)

3.1.1 First Measurement

After T seconds a measurement z_1 is processed. The density f(x_0 | z_0) can be updated using Eq. 2.20

f(x_1 | z_0) = \int_{-\infty}^{\infty} f(x_1 | x_0) f(x_0 | z_0) \, dx_0 ,    (3.19)

where f(x_1 | x_0) is given by Eq. 3.17 and f(x_0 | z_0) is given by Eq. 3.13. However, it is not necessary to directly evaluate the integral in Eq. 3.19. The random variable x_1 is a linear combination of the Gaussian distributed random variables x_0 and w_0 and is therefore Gaussian distributed as well. Furthermore, the conditioning random variable z_0 is also Gaussian and so must be x_1 | z_0. Since all Gaussian random variables are uniquely characterized by their mean and covariance, the distribution is most easily obtained by finding the mean and covariance of the random variable x_1 | z_0. The mean of x_1 | z_0 is obtained from Eq. 3.8

E\{x_1 | z_0\} = E\{\Phi_0 x_0 + w_0 + d_0 | z_0\}
               = \Phi_0 E\{x_0 | z_0\} + d_0
               = \Phi_0 \hat{x}_{0/0} + d_0 .    (3.20)

The covariance of x_1 | z_0 is given by

cov\{x_1 | z_0\} = E\{ (x_1 - E\{x_1 | z_0\})(x_1 - E\{x_1 | z_0\})^T | z_0 \}
                 = E\{ (\Phi_0 x_0 + w_0 + d_0 - E\{x_1 | z_0\})(\Phi_0 x_0 + w_0 + d_0 - E\{x_1 | z_0\})^T | z_0 \}
                 = E\{ (\Phi_0 (x_0 - \hat{x}_{0/0}) + w_0)(\Phi_0 (x_0 - \hat{x}_{0/0}) + w_0)^T | z_0 \}
                 = E\{ (\Phi_0 (x_0 - \hat{x}_{0/0}))(\Phi_0 (x_0 - \hat{x}_{0/0}))^T | z_0 \} + E\{ w_0 w_0^T \} .    (3.21)

The last result follows because w_0 is independent of z_0, x_0, and \hat{x}_{0/0}. Continuing with the computation gives

cov\{x_1 | z_0\} = \Phi_0 E\{ (x_0 - \hat{x}_{0/0})(x_0 - \hat{x}_{0/0})^T | z_0 \} \Phi_0^T + Q_0
                 = \Phi_0 P_{0/0} \Phi_0^T + Q_0 .    (3.22)

Thus, the random variable x_1 is distributed as

x_1 | z_0 \sim N(\hat{x}_{1/0}, P_{1/0}) ,    (3.23)

where

\hat{x}_{1/0} = \Phi_0 \hat{x}_{0/0} + d_0    (3.24)
P_{1/0} = \Phi_0 P_{0/0} \Phi_0^T + Q_0 .    (3.25)

That is, the density of x_1 is given by

f(x_1 | z_0) = \frac{ e^{ -\frac{1}{2} (x_1 - \hat{x}_{1/0})^T P_{1/0}^{-1} (x_1 - \hat{x}_{1/0}) } }{ (2\pi)^{n/2} |P_{1/0}|^{1/2} } .    (3.26)

Now, we must evaluate f(z_1 | z_0). One option is to use Eq. 2.23

f(z_1 | z_0) = \int_{-\infty}^{\infty} f(z_1 | x_1) f(x_1 | z_0) \, dx_1 .    (3.27)

Alternately, we can use the fact that the variables z_1 and z_0 are Gaussian. Therefore, the conditional density f(z_1 | z_0) will also be Gaussian. Since a Gaussian RV is completely determined by its mean and covariance, we proceed by finding the mean and covariance of Eq. 3.4

E\{z_1 | z_0\} = E\{H_1 x_1 + v_1 | z_0\}
               = H_1 E\{x_1 | z_0\}
               = H_1 \hat{x}_{1/0} ,    (3.28)

where Eq. 3.23 has been used. The covariance is given by

cov\{z_1 | z_0\} = E\{ (z_1 - E\{z_1 | z_0\})(z_1 - E\{z_1 | z_0\})^T | z_0 \}
                 = E\{ (H_1 x_1 + v_1 - H_1 \hat{x}_{1/0})(H_1 x_1 + v_1 - H_1 \hat{x}_{1/0})^T | z_0 \}
                 = E\{ (H_1 (x_1 - \hat{x}_{1/0}) + v_1)(H_1 (x_1 - \hat{x}_{1/0}) + v_1)^T | z_0 \}
                 = E\{ H_1 (x_1 - \hat{x}_{1/0})(x_1 - \hat{x}_{1/0})^T H_1^T | z_0 \} + E\{ v_1 v_1^T \}
                 = H_1 E\{ (x_1 - \hat{x}_{1/0})(x_1 - \hat{x}_{1/0})^T | z_0 \} H_1^T + R_1 ,    (3.29)

where we have used the fact that v_1 is independent of z_0, x_1, and \hat{x}_{1/0}, and

R_1 = E\{v_1 v_1^T\} .    (3.30)

From Eq. 3.26 we know that

E\{x_1 | z_0\} = \hat{x}_{1/0} .    (3.31)

Substituting this result into Eq. 3.29 gives

cov\{z_1 | z_0\} = H_1 E\{ (x_1 - \hat{x}_{1/0})(x_1 - \hat{x}_{1/0})^T | z_0 \} H_1^T + R_1
                 = H_1 E\{ (x_1 - E\{x_1 | z_0\})(x_1 - E\{x_1 | z_0\})^T | z_0 \} H_1^T + R_1
                 = H_1 cov(x_1 | z_0) H_1^T + R_1 .    (3.32)

Substituting Eq. 3.22 and Eq. 3.25 into the previous result gives

cov\{z_1 | z_0\} = H_1 cov(x_1 | z_0) H_1^T + R_1
                 = H_1 P_{1/0} H_1^T + R_1 .    (3.33)

The conditional density of z_1 given z_0 can be found using Eq. 3.28 and Eq. 3.33

f(z_1 | z_0) = \frac{ e^{ -\frac{1}{2} (z_1 - H_1 \hat{x}_{1/0})^T (H_1 P_{1/0} H_1^T + R_1)^{-1} (z_1 - H_1 \hat{x}_{1/0}) } }{ (2\pi)^{m/2} |H_1 P_{1/0} H_1^T + R_1|^{1/2} } .    (3.34)

We now have enough information to compute f(x_1 | z_{0:1}). The random variable x_1 | z_0 is Gaussian (see Eq. 3.26) and the random variable z_1 is also Gaussian because it is a linear combination of two independent Gaussian random variables x_1 and v_1. Therefore, the random variable x_1 | z_0 conditioned on z_1, which is written as x_1 | z_0 z_1 = x_1 | z_{0:1}, must also be Gaussian. Because the random variable x_1 | z_{0:1} is Gaussian, it will have the form

f(x_1 | z_{0:1}) = \frac{ e^{ -\frac{1}{2} (x_1 - E\{x_1 | z_{0:1}\})^T [cov(x_1 | z_{0:1})]^{-1} (x_1 - E\{x_1 | z_{0:1}\}) } }{ (2\pi)^{n/2} |cov(x_1 | z_{0:1})|^{1/2} }
                 = \frac{ e^{ -\frac{1}{2} Q_{1/1} } }{ (2\pi)^{n/2} |cov(x_1 | z_{0:1})|^{1/2} } ,    (3.35)

where

Q_{1/1} = (x_1 - E\{x_1 | z_{0:1}\})^T [cov(x_1 | z_{0:1})]^{-1} (x_1 - E\{x_1 | z_{0:1}\})
        = x_1^T [cov(x_1 | z_{0:1})]^{-1} x_1 - 2 x_1^T [cov(x_1 | z_{0:1})]^{-1} E\{x_1 | z_{0:1}\}
          + E\{x_1 | z_{0:1}\}^T [cov(x_1 | z_{0:1})]^{-1} E\{x_1 | z_{0:1}\} .    (3.36)

By themselves, Eqs. 3.35-3.36 are not of much use because we don't know the mean and covariance of x_1 | z_{0:1}. However, the density of x_1 conditioned on z_{0:1} can be found using Eq. 2.22

f(x_1 | z_{0:1}) = \frac{ f(z_1 | x_1) f(x_1 | z_0) }{ f(z_1 | z_0) } .    (3.37)

Substituting Eq. 3.18 (with k = 1), Eq. 3.26, and Eq. 3.34 into Eq. 3.37 gives

f(x_1 | z_{0:1}) = \frac{ e^{ -\frac{1}{2} (z_1 - H_1 x_1)^T R_1^{-1} (z_1 - H_1 x_1) } }{ (2\pi)^{m/2} |R_1|^{1/2} }
                   \times \frac{ e^{ -\frac{1}{2} (x_1 - \hat{x}_{1/0})^T P_{1/0}^{-1} (x_1 - \hat{x}_{1/0}) } }{ (2\pi)^{n/2} |P_{1/0}|^{1/2} }
                   \times \left( \frac{ e^{ -\frac{1}{2} (z_1 - H_1 \hat{x}_{1/0})^T (H_1 P_{1/0} H_1^T + R_1)^{-1} (z_1 - H_1 \hat{x}_{1/0}) } }{ (2\pi)^{m/2} |H_1 P_{1/0} H_1^T + R_1|^{1/2} } \right)^{-1} .    (3.38)

As it stands, the density f(x_1 | z_{0:1}) appears very complicated. It has already been reasoned that the random variable x_1 | z_{0:1} must be Gaussian. Further evidence of this is that the density f(x_1 | z_{0:1}) is an exponential quadratic in x_1

f(x_1 | z_{0:1}) = \frac{ |H_1 P_{1/0} H_1^T + R_1|^{1/2} }{ (2\pi)^{n/2} |P_{1/0}|^{1/2} |R_1|^{1/2} } e^{ -\frac{1}{2} Q_{1/1} } ,    (3.39)

where

Q_{1/1} = (z_1 - H_1 x_1)^T R_1^{-1} (z_1 - H_1 x_1) + (x_1 - \hat{x}_{1/0})^T P_{1/0}^{-1} (x_1 - \hat{x}_{1/0})
        - (z_1 - H_1 \hat{x}_{1/0})^T (H_1 P_{1/0} H_1^T + R_1)^{-1} (z_1 - H_1 \hat{x}_{1/0}) .    (3.40)

The quadratic can be expanded as follows

Q_{1/1} = x_1^T \left( P_{1/0}^{-1} + H_1^T R_1^{-1} H_1 \right) x_1 - 2 x_1^T \left( H_1^T R_1^{-1} z_1 + P_{1/0}^{-1} \hat{x}_{1/0} \right)
          + z_1^T R_1^{-1} z_1 + \hat{x}_{1/0}^T P_{1/0}^{-1} \hat{x}_{1/0}
          - (z_1 - H_1 \hat{x}_{1/0})^T (H_1 P_{1/0} H_1^T + R_1)^{-1} (z_1 - H_1 \hat{x}_{1/0}) .    (3.41)

Comparing the quadratic term in Eq. 3.41 with Eq. 3.36, one can conclude that the covariance is given by

cov(x_1 | z_{0:1}) = \left( P_{1/0}^{-1} + H_1^T R_1^{-1} H_1 \right)^{-1} .    (3.42)

This inverse can be expressed using the easily checked formula

\left( P_{1/0}^{-1} + H_1^T R_1^{-1} H_1 \right)^{-1} = (I - K_1 H_1) P_{1/0} ,    (3.43)

where

K_1 = P_{1/0} H_1^T \left( H_1 P_{1/0} H_1^T + R_1 \right)^{-1} .    (3.44)

Using this definition, the covariance may be expressed as

P_{1/1} = cov(x_1 | z_{0:1}) = (I - K_1 H_1) P_{1/0} .    (3.45)

Similarly, comparing the bilinear term in Eq. 3.41 with Eq. 3.36, one can conclude that the mean is given by

E\{x_1 | z_{0:1}\} = [cov(x_1 | z_{0:1})] \left( H_1^T R_1^{-1} z_1 + P_{1/0}^{-1} \hat{x}_{1/0} \right) .    (3.46)

Substituting Eq. 3.45 into Eq. 3.46 gives

E\{x_1 | z_{0:1}\} = [cov(x_1 | z_{0:1})] \left( H_1^T R_1^{-1} z_1 + P_{1/0}^{-1} \hat{x}_{1/0} \right)
                   = \left( P_{1/0}^{-1} + H_1^T R_1^{-1} H_1 \right)^{-1} \left( H_1^T R_1^{-1} z_1 + P_{1/0}^{-1} \hat{x}_{1/0} \right)
                   = (I - K_1 H_1) P_{1/0} H_1^T R_1^{-1} z_1 + (I - K_1 H_1) \hat{x}_{1/0}
                   = \left( P_{1/0} H_1^T R_1^{-1} - K_1 H_1 P_{1/0} H_1^T R_1^{-1} \right) z_1 + (I - K_1 H_1) \hat{x}_{1/0}
                   = \left( P_{1/0} H_1^T R_1^{-1} - K_1 H_1 P_{1/0} H_1^T R_1^{-1} - K_1 \right) z_1 + K_1 z_1 + (I - K_1 H_1) \hat{x}_{1/0}
                   = \left[ P_{1/0} H_1^T R_1^{-1} - K_1 \left( I + H_1 P_{1/0} H_1^T R_1^{-1} \right) \right] z_1 + K_1 z_1 + (I - K_1 H_1) \hat{x}_{1/0}
                   = \left[ P_{1/0} H_1^T R_1^{-1} - K_1 \left( R_1 + H_1 P_{1/0} H_1^T \right) R_1^{-1} \right] z_1 + K_1 z_1 + (I - K_1 H_1) \hat{x}_{1/0}
                   = \left[ P_{1/0} H_1^T R_1^{-1} - P_{1/0} H_1^T R_1^{-1} \right] z_1 + K_1 z_1 + (I - K_1 H_1) \hat{x}_{1/0}
                   = K_1 z_1 + (I - K_1 H_1) \hat{x}_{1/0}
                   = \hat{x}_{1/0} + K_1 \left( z_1 - H_1 \hat{x}_{1/0} \right) .    (3.47)

We then define \hat{x}_{1/1} as

\hat{x}_{1/1} = E\{x_1 | z_{0:1}\} = \hat{x}_{1/0} + K_1 \left( z_1 - H_1 \hat{x}_{1/0} \right) .    (3.48)

The conditional density can then be written as

f(x_1 | z_{0:1}) = \frac{ e^{ -\frac{1}{2} (x_1 - \hat{x}_{1/1})^T P_{1/1}^{-1} (x_1 - \hat{x}_{1/1}) } }{ (2\pi)^{n/2} |P_{1/1}|^{1/2} } .    (3.49)

3.1.2 kth Measurement and Final Result by Induction

A general proof by induction involves showing that a result holds initially and then showing that it also holds after k+1 steps, assuming it held after k steps. For the problem at hand, the result is a recursive formula for the conditional density of a random variable x. The random variable x has an initial distribution given by Eq. 3.12

x_0 \sim N(\hat{x}_{0/0}, P_{0/0}) ,    (3.50)

where \hat{x}_{0/0} denotes the conditional mean of x_0 after processing 0 measurements. More generally, a subscript i/j indicates that the subscripted quantity is valid at time step i and is conditioned on j measurements, collectively denoted as z_{1:j}. A recursive formula for updating the conditional density of x is now stated. With the state model given by Eq. 3.8, the a priori conditional density of x_k is given by

f(x_k | z_{0:k-1}) = N(\hat{x}_{k/k-1}, P_{k/k-1}) ,    (3.51)

where

\hat{x}_{k/k-1} = \Phi_{k-1} \hat{x}_{k-1/k-1} + d_{k-1}    (3.52a)
P_{k/k-1} = \Phi_{k-1} P_{k-1/k-1} \Phi_{k-1}^T + Q_{k-1} .    (3.52b)

Given a new measurement z_k, the a priori density is updated according to

f(x_k | z_{0:k}) = N(\hat{x}_{k/k}, P_{k/k}) ,    (3.53)

where

\hat{x}_{k/k} = \hat{x}_{k/k-1} + K_k \left( z_k - H_k \hat{x}_{k/k-1} \right)    (3.54a)
P_{k/k} = (I - K_k H_k) P_{k/k-1}    (3.54b)
K_k = P_{k/k-1} H_k^T \left( H_k P_{k/k-1} H_k^T + R_k \right)^{-1} .    (3.54c)

The detailed derivation contained in Section 3.1.1 proved the validity of this result for the initial measurement z_1 and the initial density f(x_0) given by Eq. 3.50. Proceeding with a proof by induction, we assume that this result holds for k-1 measurements and demonstrate that the same result holds for k measurements. By assumption, the a posteriori density of x_{k-1} is

f(x_{k-1} | z_{0:k-1}) = N(\hat{x}_{k-1/k-1}, P_{k-1/k-1}) .    (3.55)

The random variable x_k | z_{0:k-1} is Gaussian because x_k and z_{0:k-1} are both Gaussian. The mean and covariance of x_k | z_{0:k-1} are readily obtained in the same manner as was done for x_1 | z_0 and are given by

E\{x_k | z_{0:k-1}\} = \Phi_{k-1} \hat{x}_{k-1/k-1} + d_{k-1} = \hat{x}_{k/k-1}    (3.56a)
cov(x_k | z_{0:k-1}) = \Phi_{k-1} P_{k-1/k-1} \Phi_{k-1}^T + Q_{k-1} = P_{k/k-1} .    (3.56b)

Thus, the density of x_k | z_{0:k-1} is consistent with the result we are trying to prove and is given by Eq. 3.51. The next step in the estimation process is to obtain the conditional density z_k | z_{0:k-1}. Since the random variables z_k are Gaussian, the conditional density f(z_k | z_{0:k-1}) will also be Gaussian, and therefore completely determined by its mean and covariance. We proceed by finding the mean and covariance of Eq. 3.4

E\{z_k | z_{0:k-1}\} = E\{H_k x_k + v_k | z_{0:k-1}\}
                     = H_k E\{x_k | z_{0:k-1}\}
                     = H_k \hat{x}_{k/k-1} ,    (3.57)

where Eq. 3.56a has been used. The covariance is given by

cov\{z_k | z_{0:k-1}\} = E\{ (z_k - E\{z_k | z_{0:k-1}\})(z_k - E\{z_k | z_{0:k-1}\})^T | z_{0:k-1} \}
                       = E\{ (H_k (x_k - \hat{x}_{k/k-1}) + v_k)(H_k (x_k - \hat{x}_{k/k-1}) + v_k)^T | z_{0:k-1} \}
                       = H_k E\{ (x_k - \hat{x}_{k/k-1})(x_k - \hat{x}_{k/k-1})^T | z_{0:k-1} \} H_k^T + E\{ v_k v_k^T \}
                       = H_k P_{k/k-1} H_k^T + R_k ,    (3.58)

where Eq. 3.56b has been used. The fact that the random variable v_k is independent white noise, and therefore independent of x_k, \hat{x}_{k/k-1}, and z_{0:k-1}, was also used in obtaining Eq. 3.58. Using Eq. 3.57 and Eq. 3.58, the density of z_k | z_{0:k-1} is given by

f(z_k | z_{0:k-1}) = \frac{ e^{ -\frac{1}{2} (z_k - H_k \hat{x}_{k/k-1})^T (H_k P_{k/k-1} H_k^T + R_k)^{-1} (z_k - H_k \hat{x}_{k/k-1}) } }{ (2\pi)^{m/2} |H_k P_{k/k-1} H_k^T + R_k|^{1/2} } .    (3.59)

We now have enough information to compute f(x_k | z_{0:k}). The random variables x_k and z_{0:k} are all Gaussian. Therefore, the conditional random variable x_k | z_{0:k} is also Gaussian. The density of x_k | z_{0:k} can be found using Eq. 2.22

f(x_k | z_{0:k}) = \frac{ f(z_k | x_k) f(x_k | z_{0:k-1}) }{ f(z_k | z_{0:k-1}) } .    (3.60)

Substituting Eq. 3.18, Eq. 3.51, and Eq. 3.59 into Eq. 3.60 gives

f(x_k | z_{0:k}) = \frac{ e^{ -\frac{1}{2} (z_k - H_k x_k)^T R_k^{-1} (z_k - H_k x_k) } }{ (2\pi)^{m/2} |R_k|^{1/2} }
                   \times \frac{ e^{ -\frac{1}{2} (x_k - \hat{x}_{k/k-1})^T P_{k/k-1}^{-1} (x_k - \hat{x}_{k/k-1}) } }{ (2\pi)^{n/2} |P_{k/k-1}|^{1/2} }
                   \times \left( \frac{ e^{ -\frac{1}{2} (z_k - H_k \hat{x}_{k/k-1})^T (H_k P_{k/k-1} H_k^T + R_k)^{-1} (z_k - H_k \hat{x}_{k/k-1}) } }{ (2\pi)^{m/2} |H_k P_{k/k-1} H_k^T + R_k|^{1/2} } \right)^{-1} .    (3.61)

The density given by Eq. 3.61 is directly comparable to the density given by Eq. 3.38. Therefore, one can proceed to the final result by appropriate substitutions into Eq. 3.49. Doing so results in the following PDF

f(x_k | z_{0:k}) = \frac{ e^{ -\frac{1}{2} (x_k - \hat{x}_{k/k})^T P_{k/k}^{-1} (x_k - \hat{x}_{k/k}) } }{ (2\pi)^{n/2} |P_{k/k}|^{1/2} } ,    (3.62)

where \hat{x}_{k/k} and P_{k/k} are given by Eq. 3.54a and Eq. 3.54b, respectively. Since Eq. 3.62 is identical to Eq. 3.53, the proof by induction is complete.
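The recursion of Eqs. 3.52 and 3.54 is, of course, the discrete-time Kalman filter. A minimal Python sketch is given below; the scalar random-walk model and the measurement values are hypothetical, chosen only to exercise the predict/update cycle.

```python
import numpy as np

def kalman_step(x_prev, P_prev, z, Phi, Q, H, R, d):
    """One predict/update cycle of Eqs. 3.52 and 3.54."""
    # Time update (Eq. 3.52).
    x_pred = Phi @ x_prev + d
    P_pred = Phi @ P_prev @ Phi.T + Q
    # Measurement update (Eq. 3.54).
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
    x_post = x_pred + K @ (z - H @ x_pred)
    P_post = (np.eye(len(x_prev)) - K @ H) @ P_pred
    return x_post, P_post

# Hypothetical scalar random-walk example.
Phi = np.array([[1.0]]); Q = np.array([[0.01]])
H = np.array([[1.0]]);   R = np.array([[1.0]])
d = np.array([0.0])
x, P = np.array([0.0]), np.array([[10.0]])
for z in [1.2, 0.8, 1.1]:
    x, P = kalman_step(x, P, np.array([z]), Phi, Q, H, R, d)
```

In this example the covariance shrinks rapidly from its large initial value as measurements are processed, while the estimate moves toward the measured level.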

3.2 Estimates and Confidence Regions (Error Ellipsoids) for the Bayes’ Estimator

A central aspect of the Bayesian estimator given by Eqs. 3.51-3.54 is that the conditional densities are Gaussian. This is true for both the a priori conditional density (Eq. 3.51)

f(x_k | z_{0:k−1}) = N(x̂_{k/k−1}, P_{k/k−1}) ,  (3.63)

and the a posteriori density (Eq. 3.53)

f(x_k | z_{0:k}) = N(x̂_{k/k}, P_{k/k}) .  (3.64)

This is convenient since one may wish to form an estimate of the state (a priori or a posteriori) and an associated confidence region for that estimate. The MMSE estimate is the conditional mean. The conditional a priori mean estimate of x_k and the associated covariance are computed by Eq. 3.52a. Similarly, the conditional a posteriori mean estimate of x_k and the associated covariance are computed by Eq. 3.54. As discussed in Appendix A, the eigenvalues and eigenvectors of the covariance matrix are used to specify an ellipsoidal confidence region about the estimate x̂_k. The exponential in the Gaussian PDF is a quadratic function defining an ellipsoid S in n-dimensional space

S_{X_k}^{(α)} = { x_k : (x_k − x̂_k)^T P_k^{−1} (x_k − x̂_k) = α } .  (3.65)

The ellipsoid defined by S_{X_k}^{(α)} has principal axes aligned with the eigenvectors of the covariance matrix P_k. The eigenvalues λ_i of the covariance matrix P_k are related to the variance σ_i² along each of the principal axes of the ellipsoid

λ_i = σ_i² .  (3.66)

The ellipsoid defined by S_{X_k}^{(α)} intersects its principal axes at a distance of √α √λ_i = √α σ_i from the center of the ellipsoid. The probability that the true state x_k belongs to the region R_{X_k}^{(α)} enclosed by the ellipsoid S_{X_k}^{(α)} can be found by Eq. A.228

P( X_k ∈ R_{X_k}^{(r²)} ) = F_{R²}(r² | n) ,  (3.67)

where the cumulative distribution F_{R²}(r² | n) refers to the chi-square distribution with n degrees of freedom. More generally, we would like to determine the value of α = r² required for a given level of probability. Table A.2 lists the required value of r = √α for various values of n at probability levels of 0.5, 0.90, and 0.95. For example, with n = 6, the 95% confidence (0.95 probability) ellipsoid is centered on the estimate x̂_k and intersects its principal axes (which are the eigenvectors of P_k) at a distance of 3.55 σ_i from the ellipsoid center.
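The construction above can be sketched numerically. The covariance and estimate below are illustrative assumptions; the value 5.99 is the chi-square 0.95 quantile for n = 2 degrees of freedom, so r = √α ≈ 2.45 (the n = 2 counterpart of the Table A.2 entries).

```python
import numpy as np

# Sketch: 95% confidence ellipsoid about an estimate, built from the
# eigendecomposition of the covariance (Eqs. 3.65-3.67). P and xhat are
# illustrative assumptions, not values from the text.
P = np.array([[4.0, 1.0],
              [1.0, 2.0]])            # covariance of the estimate
xhat = np.array([1.0, -2.0])          # the estimate (ellipsoid center)

lam, V = np.linalg.eigh(P)            # eigenvalues lam_i = sigma_i^2 (Eq. 3.66)
alpha = 5.99                          # chi-square 0.95 quantile, n = 2
r = np.sqrt(alpha)                    # r = sqrt(alpha) ~ 2.45

# Semi-axis lengths r*sigma_i along the eigenvector directions of P
semi_axes = r * np.sqrt(lam)

def inside(x):
    """Membership test for the 95% confidence ellipsoid (Eq. 3.65)."""
    d = x - xhat
    return d @ np.linalg.solve(P, d) <= alpha
```

The eigenvectors in the columns of V give the principal-axis directions; a point is inside the region exactly when its Mahalanobis distance squared is below the chi-square quantile.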

3.3 The White Noise Assumption and Bayesian Estimation

The derivation of the Bayesian estimator made explicit use of the independent nature of the process and measurement noise. The Bayesian estimator given by Eqs. 3.51-3.54 is only valid if the noise sources are white. Unfortunately, many noise sources are far from white and exhibit some type of correlation. If the noise autocorrelation can be adequately described by a decaying exponential, which is very common, then a shaping filter can be used to obtain a Bayesian estimator. The noise sources of the Bayes estimator must be white and therefore have an autocorrelation given by Eq. 3.10a. The following continuous-time shaping filter was considered in Appendix B

ẋ = −a x + b u .  (3.68)

The input u is white noise described by the autocorrelation

R_U(τ) = E{ u(t) u(t + τ) } = A δ(τ) ,  (3.69)

where δ(τ) is a Dirac delta function. The output autocorrelation is given by

R_X(τ) = (A b² / 2a) e^{−a|τ|} .  (3.70)

The shaping filter described by Eq. 3.68 can be augmented to the continuous-time state equations used to form the discrete-time sampled data state equations given by Eq. 3.8. The input of the shaping filter is white, as required for the Bayesian estimator being discussed, and the output is correlated according to Eq. 3.70. The output of the shaping filter could then be added to the state equations, the measurement equation, or both. This is a very effective way to model colored noise. The same idea can be used directly with the discrete-time state equations. The following discrete-time shaping filter was considered in Appendix B.5.3

x_k = α x_{k−1} + β w_{k−1} .  (3.71)

The input w_k is white noise described by the autocorrelation

R_W(m) = E{ w_k w_{k+m} } = A δ(m) ,  (3.72)

where δ(m) is the unit impulse function (with a value of one when m = 0 and zero otherwise). The output autocorrelation is given by

R_X(m) = (A β² / (1 − α²)) α^{|m|} .  (3.73)

The shaping filter described by Eq. 3.71 can be directly augmented to the discrete-time sampled data state equations given by Eq. 3.8. The augmented state representing the output of the shaping filter could then be added to the other state equations, the measurement equation, or both. It is only a matter of preference whether to use a discrete-time shaping filter directly, or to begin with a continuous-time shaping filter and discretize it along with the other state equations when forming the discrete-time sampled data equations. The use of shaping filters is, of course, not restricted to scalar inputs, but easily extends to vector inputs. In the vector input case, a system of linear, time-invariant differential (or difference) equations can be used to create a noise signal that has an exponentially decreasing autocorrelation and, if so desired, a non-trivial cross-correlation.
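The discrete shaping filter of Eq. 3.71 is easy to exercise numerically. The sketch below, with illustrative parameter values, synthesizes a colored sequence from white noise and compares its sample autocorrelation at lags 0 and 1 with Eq. 3.73.

```python
import numpy as np

# Sketch: colored-noise synthesis with the discrete shaping filter
# x_k = alpha*x_{k-1} + beta*w_{k-1} (Eq. 3.71). Parameter values are
# illustrative assumptions.
rng = np.random.default_rng(0)
alpha, beta, A = 0.9, 1.0, 1.0        # |alpha| < 1 for stationarity
N = 200_000
w = rng.normal(0.0, np.sqrt(A), N)    # white input sequence, variance A

x = np.empty(N)
x[0] = 0.0
for k in range(1, N):
    x[k] = alpha * x[k - 1] + beta * w[k - 1]

# Theoretical output autocorrelation (Eq. 3.73): R_X(m) = A*beta^2/(1-alpha^2)*alpha^|m|
R0_theory = A * beta**2 / (1.0 - alpha**2)
R0_sample = np.var(x)                  # sample R_X(0)
R1_sample = np.mean(x[:-1] * x[1:])    # sample R_X(1)
```

The sample values should agree with R0_theory and alpha*R0_theory to within Monte Carlo error, illustrating the exponentially decaying autocorrelation.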

3.4 The Deterministic Input Assumption and Bayesian Estimation

The linear state equation is given by Eq. 3.8 and repeated here

x_{k+1} = Φ_k x_k + w_k + d_k .  (3.74)

In the derivation of the Bayes' estimator, it was assumed that the function d_k is deterministic. In many situations, the function d_k is a function of the estimate x̂_{k/k}

d_k = g( x̂_{k/k} ) .  (3.75)

Is the Bayes' estimator still valid when d_k is not deterministic, but rather given by Eq. 3.75? The answer to this question is yes, as will soon become clear. In the case when d_k is deterministic, the estimate x̂_{k/k} satisfies the recursive formula given by Eq. 3.54a and repeated here

x̂_{k/k} = x̂_{k/k−1} + K_k ( z_k − H_k x̂_{k/k−1} ) .  (3.76)

Because of this, x̂_{k/k} depends only on the initial estimate x̂_{0/0} and the measurement sequence z_{1:k}, which together are represented by the sequence z_{0:k}. The only place d_k is used in the derivation of the Bayes' estimator is in the computation of the a priori density (see Eq. 3.51) f(x_{k+1} | z_{0:k}), which is given by

f(x_{k+1} | z_{0:k}) = f( Φ_k x_k + w_k + d_k | z_{0:k} ) = f( Φ_k x_k + w_k + g(x̂_{k/k}) | z_{0:k} ) .  (3.77)

However, this density is conditioned on z_{0:k}, the very information used to construct x̂_{k/k}. Therefore, the resulting Bayes' estimator will be the same whether the input d_k is deterministic or a function of x̂_{k/k}. A very interesting observation is now made regarding the fact that g(·) can be a nonlinear function. The development of the Bayes' estimator for d_k being deterministic made much use of the fact that the variables x_k and z_k were Gaussian. Clearly, if g(·) is a nonlinear function, then neither x_k nor z_k will (in general) be Gaussian. However, the conditional densities of x_k | z_{0:k−1} and z_k | z_{0:k−1} are Gaussian, because conditioning on z_{0:k−1} effectively also conditions on d_{0:k−1}, since the inputs d_{0:k−1} are functions of the conditioning variables z_{0:k−1}.
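To make the argument concrete, the sketch below runs a scalar Kalman filter whose input d_k = g(x̂_{k/k}) is a nonlinear function of the current estimate, as in Eq. 3.75. The model parameters and the feedback function g(·) are illustrative assumptions; the point is that the gain and covariance recursions are unchanged, and only the state prediction uses d_k.

```python
import numpy as np

# Sketch: scalar Kalman filter with an estimate-dependent input (Eq. 3.75).
# phi, Q, H, R and g() are illustrative assumptions.
rng = np.random.default_rng(1)
phi, Q, H, R = 0.95, 0.1, 1.0, 0.5

def g(xhat):                       # hypothetical nonlinear estimate feedback
    return -0.5 * np.tanh(xhat)

x, xhat, P = 0.0, 0.0, 1.0
for k in range(200):
    d = g(xhat)                    # input computed from the current estimate
    x = phi * x + rng.normal(0, np.sqrt(Q)) + d     # true state (Eq. 3.74)
    xhat_pred = phi * xhat + d                      # a priori estimate
    P_pred = phi * P * phi + Q                      # a priori covariance
    z = H * x + rng.normal(0, np.sqrt(R))           # measurement
    K = P_pred * H / (H * P_pred * H + R)           # Kalman gain
    xhat = xhat_pred + K * (z - H * xhat_pred)      # a posteriori estimate
    P = (1 - K * H) * P_pred                        # a posteriori covariance
```

Note that P and K evolve exactly as they would for a deterministic d_k; the input only shifts the predicted mean, which is the content of the argument above.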

3.5 Bayesian Estimation Between Measurements

As developed so far, the Bayesian estimator produces the conditional density at measurement instances separated by intervals of time T. It is also possible to generate the conditional density at instants of time that occur in between measurements. An example of how this is accomplished is the a priori prediction, which provides the conditional density of x_k given the measurements z_{0:k−1}. Recall that x_k is simply a convenient notation for the random variable x(t_k). In the same manner as the a priori density is generated, it is also possible to provide a conditional density of x(t_k + ΔT) given the measurements z_{0:k}. Although not necessary, assume that Δ is any number such that

0 ≤ Δ ≤ 1 .  (3.78)

For notational convenience, let x_{k+Δ} denote the random variable x(t_k + ΔT). The state x(t_k + ΔT) can be expressed in terms of the state x(t_k) by using Eq. 3.5

x_{k+Δ} = x(t_{k+Δ})
        = Φ_x(t_{k+Δ}, t_k) x(t_k) + ∫_{t_k}^{t_{k+Δ}} Φ_x(t_{k+Δ}, τ) [ B_S(τ) u_S(τ) + B_D(τ) u_D(τ) ] dτ
        = Φ_{k,Δ} x_k + w_{k,Δ} + d_{k,Δ} ,  (3.79)

where for notational convenience, the following substitutions have been made

Φ_{k,Δ} = Φ_x(t_{k+Δ}, t_k)  (3.80a)
w_{k,Δ} = ∫_{t_k}^{t_{k+Δ}} Φ_x(t_{k+Δ}, τ) B_S(τ) u_S(τ) dτ  (3.80b)
d_{k,Δ} = ∫_{t_k}^{t_{k+Δ}} Φ_x(t_{k+Δ}, τ) B_D(τ) u_D(τ) dτ .  (3.80c)

Because the process noise u_S is white, the random variable w_{k,Δ} is a white Gaussian random variable with density

f(w_{k,Δ}) = N(0, Q_{k,Δ}) ,  (3.81)

where Q_{k,Δ} can be computed as shown in Appendix B

Q_{k,Δ} = E{ w_{k,Δ} w_{k,Δ}^T } = ∫_{t_k}^{t_{k+Δ}} Φ_x(t_{k+Δ}, τ) B_S(τ) A B_S^T(τ) Φ_x^T(t_{k+Δ}, τ) dτ .  (3.82)

The conditional density of x_k given z_{0:k} is given by Eq. 3.53

f(x_k | z_{0:k}) = N(x̂_{k/k}, P_{k/k}) .  (3.83)

Because the random variables x_k and w_{k,Δ} are Gaussian, the random variable x_{k+Δ}, which is a linear combination of x_k and w_{k,Δ}, will also be Gaussian. Furthermore, because x_{k+Δ} and z_j are both Gaussian, the conditional density of x_{k+Δ} given z_{0:k} is also Gaussian and therefore completely characterized by the conditional mean and conditional covariance

f(x_{k+Δ} | z_{0:k}) = N(x̂_{k+Δ/k}, P_{k+Δ/k}) .  (3.84)

The conditional mean is found by taking the expectation of Eq. 3.79

x̂_{k+Δ/k} = E{ x_{k+Δ} | z_{0:k} }
          = E{ Φ_{k,Δ} x_k + w_{k,Δ} + d_{k,Δ} | z_{0:k} }
          = Φ_{k,Δ} E{ x_k | z_{0:k} } + d_{k,Δ}
          = Φ_{k,Δ} x̂_{k/k} + d_{k,Δ} .  (3.85)

Similarly, the conditional covariance is given by

P_{k+Δ/k} = E{ ( x_{k+Δ} − x̂_{k+Δ/k} )( x_{k+Δ} − x̂_{k+Δ/k} )^T | z_{0:k} }
          = E{ ( Φ_{k,Δ}(x_k − x̂_{k/k}) + w_{k,Δ} )( Φ_{k,Δ}(x_k − x̂_{k/k}) + w_{k,Δ} )^T | z_{0:k} }
          = E{ ( Φ_{k,Δ}(x_k − x̂_{k/k}) )( Φ_{k,Δ}(x_k − x̂_{k/k}) )^T | z_{0:k} } + E{ w_{k,Δ} w_{k,Δ}^T | z_{0:k} }
          = Φ_{k,Δ} E{ (x_k − x̂_{k/k})(x_k − x̂_{k/k})^T | z_{0:k} } Φ_{k,Δ}^T + E{ w_{k,Δ} w_{k,Δ}^T }
          = Φ_{k,Δ} P_{k/k} Φ_{k,Δ}^T + Q_{k,Δ} .  (3.86)

It is clear that between measurements, the state and covariance satisfy the differential equations developed in Appendix B, specifically Eq. B.83 and Eq. B.87. When a measurement z_j is processed, the state estimate and covariance are instantaneously adjusted according to Eq. 3.54. These results show that the estimate x̂(t) satisfies the differential equation

dx̂/dt = F(t) x̂ + B_D u_D + Σ_j δ(t − t_j) K_j ( z_j − H_j x̂(t) ) .  (3.87)

The covariance satisfies the differential equation

Ṗ(t) = F(t) P(t) + P(t) F^T(t) + B_S(t) A B_S^T(t) − Σ_j δ(t − t_j) K_j H_j P(t) .  (3.88)

The sum of all knowledge available at time t, referred to as the information state, is denoted by D(t). At time t_0 the information state consists of the a priori density of x(t_0) given by Eq. 3.12

D(t_0) = { x̂_{0/0}, P_{0/0} } .  (3.89)

The distribution of the state at time t = t_0 can be expressed as

f( x(t_0) | D(t_0) ) ∼ N( x̂_{0/0}, P_{0/0} ) .  (3.90)

At time instances denoted by t_k, measurements z_k become available and are added to the information state

D(t) = { x̂_{0/0}, P_{0/0}, z_{1:k} } ,   t_k ≤ t < t_{k+1} .  (3.91)

The conditional density of the state at any time t is then

f( x(t) | D(t) ) = N( x̂(t), P(t) ) ,  (3.92)

where x̂(t) and P(t) are obtained from Eq. 3.87 and Eq. 3.88, respectively.
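Eqs. 3.85-3.86 can be exercised with any Φ_{k,Δ} and Q_{k,Δ}. The sketch below uses the standard closed forms for a constant-velocity (double integrator) model driven by white acceleration noise of PSD q; all numerical values are illustrative assumptions.

```python
import numpy as np

# Sketch: propagating the a posteriori estimate and covariance to a point
# between measurements (Eqs. 3.85-3.86). Phi(dt) and Q(dt) below are the
# closed forms for a double integrator with white acceleration noise of
# PSD q; the numbers are illustrative assumptions.
def predict(xhat, P, dt, q):
    Phi = np.array([[1.0, dt],
                    [0.0, 1.0]])
    Q = q * np.array([[dt**3 / 3, dt**2 / 2],
                      [dt**2 / 2, dt]])
    return Phi @ xhat, Phi @ P @ Phi.T + Q

xhat_kk = np.array([1.0, 0.5])        # a posteriori estimate at t_k
P_kk = np.diag([0.04, 0.01])          # a posteriori covariance at t_k
T, Delta, q = 1.0, 0.5, 0.1

# Estimate and covariance at t_k + Delta*T, conditioned on z_{0:k} only
xhat_mid, P_mid = predict(xhat_kk, P_kk, Delta * T, q)
```

Between measurements the covariance can only grow in this sense: P_{k+Δ/k} exceeds P_{k/k} by the mapped prior uncertainty plus the accumulated process noise Q_{k,Δ}.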

3.6 No A Priori Information and Bayesian Estimation

The case of no a priori information can be adequately handled by using the following conditional distribution for x_1 [3, p. 99]

f(x_1 | z_0) ∼ lim_{ρ→∞} N(0, ρ I_n) .  (3.93)

That is,

x̂_{1/0} = 0  (3.94a)
P_{1/0} = lim_{ρ→∞} ρ I_n .  (3.94b)

This situation could, for example, occur when a filter is just initialized and absolutely nothing is known about the state x_1, and yet the filter is now expected to process a measurement z_1. Note that the lack of a priori information could have been assigned to the state x_0. However, if that were done, it would do no good to use the state equations to update the density and covariance (Eq. 3.51), since the covariance itself is infinite. The a posteriori density is given by Eq. 3.53

f(x_1 | z_{0:1}) = N( x̂_{1/1}, P_{1/1} ) ,  (3.95)

where the a posteriori estimate x̂_{1/1} and a posteriori covariance P_{1/1} are to be determined from Eqs. 3.54. The Kalman gain given by Eq. 3.54c can be written as

K_1 = P_{1/0} H_1^T ( H_1 P_{1/0} H_1^T + R_1 )^{−1}
    = lim_{ρ→∞} ρ H_1^T ( ρ H_1 H_1^T + R_1 )^{−1}
    = lim_{ρ→∞} H_1^T ( H_1 H_1^T + (1/ρ) R_1 )^{−1} .  (3.96)

The a posteriori estimate x̂_{1/1} can be determined from Eq. 3.54a

x̂_{1/1} = x̂_{1/0} + K_1 ( z_1 − H_1 x̂_{1/0} )
        = K_1 z_1
        = lim_{ρ→∞} H_1^T ( H_1 H_1^T + (1/ρ) R_1 )^{−1} z_1
        = H_1^T ( H_1 H_1^T )^{−1} z_1 .  (3.97)

It is not possible to determine the updated covariance by use of Eq. 3.54b. In fact, unless H_1 has full column rank (not likely), it is tedious (but possible) to obtain a closed-form expression for the a posteriori covariance matrix. To understand why this is so, one must reexamine Eq. 3.46 with x̂_{1/0} = 0

E{ x_1 | z_{0:1} } = [cov(x_1 | z_{0:1})] H_1^T R_1^{−1} z_1
                   = ( P_{1/0}^{−1} + H_1^T R_1^{−1} H_1 )^{−1} H_1^T R_1^{−1} z_1
                   = lim_{ρ→∞} ( (1/ρ) I + H_1^T R_1^{−1} H_1 )^{−1} H_1^T R_1^{−1} z_1
                   = lim_{ρ→∞} ( I + ρ H_1^T R_1^{−1} H_1 )^{−1} ρ H_1^T R_1^{−1} z_1 .  (3.98)

Before proceeding, we note the following matrix inversion rule [77, p. 151]

(I + AB)^{−1} A = A (I + BA)^{−1} .  (3.99)

With A = H_1^T R_1^{−1} and B = ρ H_1,

E{ x_1 | z_{0:1} } = lim_{ρ→∞} ( I + H_1^T R_1^{−1} ρ H_1 )^{−1} H_1^T R_1^{−1} ρ z_1
                   = lim_{ρ→∞} H_1^T R_1^{−1} ( I + ρ H_1 H_1^T R_1^{−1} )^{−1} ρ z_1
                   = lim_{ρ→∞} H_1^T R_1^{−1} ( (1/ρ) I + H_1 H_1^T R_1^{−1} )^{−1} z_1
                   = H_1^T R_1^{−1} ( H_1 H_1^T R_1^{−1} )^{−1} z_1
                   = H_1^T ( H_1 H_1^T )^{−1} z_1 .  (3.100)

Note that this is the same result given by Eq. 3.97. Comparing this result with Eq. 3.46 gives the following relationship

[cov(x_1 | z_{0:1})] H_1^T R_1^{−1} z_1 = H_1^T ( H_1 H_1^T )^{−1} z_1 ,  (3.101)

or

( [cov(x_1 | z_{0:1})] H_1^T R_1^{−1} − H_1^T ( H_1 H_1^T )^{−1} ) z_1 = 0 .  (3.102)

Since z_1 is arbitrary, the previous equation requires

[cov(x_1 | z_{0:1})] H_1^T R_1^{−1} = H_1^T ( H_1 H_1^T )^{−1} ,  (3.103)

or

[cov(x_1 | z_{0:1})] H_1^T = H_1^T ( H_1 H_1^T )^{−1} R_1 .  (3.104)

When the matrix H_1 has full column rank, the covariance is given by

[cov(x_1 | z_{0:1})] = H_1^T ( H_1 H_1^T )^{−1} R_1 H_1 ( H_1^T H_1 )^{−1} .  (3.105)

The matrix H_1 is not, in general, of full rank. However, it is possible to use a generalized inverse. For example, suppose that H_1 is such that the measurement is only influenced by a subset of the states

H_1 = [ M  0 ] ,  (3.106)

where for the sake of simplicity in the illustration we assume that M is invertible. Then,

[cov(x_1 | z_{0:1})] H_1^T = [cov(x_1 | z_{0:1})] [ M^T ; 0 ]
                           = H_1^T ( H_1 H_1^T )^{−1} R_1
                           = [ M^T ; 0 ] ( M M^T )^{−1} R_1
                           = [ M^T ( M M^T )^{−1} R_1 ; 0 ]
                           = [ M^{−1} R_1 ; 0 ] ,  (3.107)

or

[cov(x_1 | z_{0:1})] [ I ; 0 ] = [ M^{−1} R_1 (M^{−1})^T ; 0 ] .  (3.108)

The measurement z_1 provided information about the subset of the states that were measured. The resulting covariance calculation is only valid for the portion of the states that influenced the measurement. Since no information is provided on the other set of states (those that did not influence the measurement), the uncertainty in the estimate of these states should not be affected, and we can therefore write

[cov(x_1 | z_{0:1})] = lim_{ρ→∞} [ M^{−1} R_1 (M^{−1})^T , 0 ; 0 , ρ I ] .  (3.109)

Certainly other special cases exist where an explicit expression of the a posteriori covariance has a reasonable analytical form. The purpose here is not to discuss such special cases, but rather to point out that the covariance does exist (in the limit) and that the estimate provided by Eq. 3.97 is indeed an MMSE estimate.
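The limiting estimate of Eq. 3.97 can be checked numerically against an ordinary Kalman update with a very large prior covariance ρI. The H, R, and z values below are illustrative assumptions, with H deliberately chosen without full column rank.

```python
import numpy as np

# Sketch: no-prior-information update. The limit of the Kalman estimate as
# rho -> infinity is xhat_1 = H^T (H H^T)^{-1} z (Eq. 3.97). Values are
# illustrative assumptions.
H = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0]])     # 2 measurements, 3 states
R = np.diag([0.1, 0.2])
z = np.array([1.0, -0.5])

# Limiting estimate (Eq. 3.97)
xhat_limit = H.T @ np.linalg.solve(H @ H.T, z)

# Kalman update with P_{1/0} = rho*I for a large rho (Eqs. 3.54)
rho = 1e12
P10 = rho * np.eye(3)
K1 = P10 @ H.T @ np.linalg.inv(H @ P10 @ H.T + R)
xhat_kf = K1 @ z
```

Note that xhat_limit reproduces the measurement exactly (H xhat = z) while making no claim about the unobserved direction of the state space, which is the content of Eqs. 3.106-3.109.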

3.7 The Kalman Filter

Early approaches to MMSE estimation were developed by Wiener in the 1940s [97]. However, the Wiener solution does not lend itself well to the more complicated time-varying, multiple-input, multiple-output (MIMO) systems, precisely the type described by Eqs. 3.4-3.8 [19, Chs. 4-5]. In 1960, R. E. Kalman provided an alternative way of formulating the MMSE estimation problem using state-space methods [46]. The resulting estimator is usually referred to as the Kalman filter, with other variations of the name reflecting slight modifications to the original estimator proposed by Kalman. The Kalman filter is an MMSE estimator for the system described by Eqs. 3.4-3.8. As a result of the Markov form of the model and measurement equations, the Kalman filter is a recursive filter, making it a very attractive approach to real-time estimation problems. There are many approaches to developing the infamous Kalman filter equations. Some are designed to arrive at the result in the quickest fashion and often lack the insight of a more detailed derivation. The simplified approach usually begins by showing the Kalman filter to be the best linear unbiased estimator that recursively processes measurements. This is precisely the approach taken in Section 2.4. However, contrary to this discussion, the results of Section 2.4 are not limited to linear state equations of the type given by Eq. 3.8. The two most common approaches to rigorously deriving the Kalman filter as an MMSE estimator are: (1) the innovations approach [3, Ch. 5], and (2) the Bayesian approach. While both approaches are insightful and arrive at the same result, the Bayesian approach is used here because the Bayes' estimator has already been discussed in Section 3.1. In fact, having developed the Bayes' estimator, the Kalman filter equations immediately follow.

The Kalman filter process is initialized with an initial estimate x̂_{0/0} of the true state x_0 and an initial covariance P_{0/0}. It is assumed that measurements z_k are available at each index k. From the derivation of the Bayes' estimator, it is known that the MMSE estimates are updated recursively. Assume that the Kalman filter has been in operation through measurement k − 1. Then, the a posteriori estimate of x_{k−1} is denoted by x̂_{k−1/k−1} and the associated a posteriori covariance is denoted by P_{k−1/k−1}. The a posteriori state estimate and covariance for step k − 1 are then projected ahead to step k using Eq. 3.52

x̂_{k/k−1} = Φ_{k−1} x̂_{k−1/k−1} + d_{k−1}  (3.110a)
P_{k/k−1} = Φ_{k−1} P_{k−1/k−1} Φ_{k−1}^T + Q_{k−1} .  (3.110b)

The estimate x̂_{k/k−1} and covariance P_{k/k−1} are referred to as the a priori estimate and a priori covariance because the measurement at step k has not been used to obtain them. That is, x̂_{k/k−1} is the MMSE estimate of x_k conditioned on x̂_{k−1/k−1}. The MMSE estimate of x_k and the associated covariance conditioned on x̂_{k−1/k−1} and z_k, which are referred to as the a posteriori estimate and a posteriori covariance for step k, are obtained using Eq. 3.54

x̂_{k/k} = x̂_{k/k−1} + K_k ( z_k − H_k x̂_{k/k−1} )  (3.111a)
P_{k/k} = ( I − K_k H_k ) P_{k/k−1}  (3.111b)
K_k = P_{k/k−1} H_k^T ( H_k P_{k/k−1} H_k^T + R_k )^{−1} .  (3.111c)

3.7.1 Covariance Simulations

One can see from the Kalman filter equations that it is not necessary to update the state at all. In application, the state estimate is probably more useful than the covariance. However, in analysis, the covariance is often the only quantity of interest. While the state is just a sample from the random process, the covariance represents the statistical properties of all such samples. As such, the covariance is much more representative of the system behavior. A covariance simulation only requires one to compute the a priori covariance given by Eq. 3.110b

P_{k/k−1} = Φ_{k−1} P_{k−1/k−1} Φ_{k−1}^T + Q_{k−1} ,  (3.112)

together with the a posteriori covariance. The latter can be computed using Eq. 3.111b, which also requires computing the Kalman gain given by Eq. 3.111c. It is possible to compute the a posteriori covariance without computing the Kalman gain using the following equation [19, p. 247]

P_{k/k} = ( P_{k/k−1}^{−1} + H_k^T R_k^{−1} H_k )^{−1} .  (3.113)

The covariance can be used to determine if the quality of the state estimate is sufficient for a particular application.
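A covariance-only simulation is short enough to sketch directly. The parameters below are those of the scalar example that follows in Section 3.7.2 (φ = 0.9802, Q = 0.03921, R = 1); no states or measurements are generated, only the covariance recursion runs.

```python
# Sketch: covariance-only "simulation" of a scalar Kalman filter, using the
# a priori recursion and the gain-free (information form) a posteriori update
# of Eq. 3.113. Parameters match the scalar Gauss-Markov example.
phi, Q, H, R = 0.9802, 0.03921, 1.0, 1.0
P = 1.0                                    # P_{0/0}
history = []
for k in range(50):
    P_prior = phi * P * phi + Q            # a priori covariance
    P = 1.0 / (1.0 / P_prior + H * H / R)  # Eq. 3.113, no Kalman gain needed
    history.append(P)
```

The recursion settles near P ≈ 0.165, so the achievable steady-state estimation accuracy is known before any measurement is ever processed.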

3.7.2 Scalar System Estimation Example

Consider the system described by the first-order differential equation

ẋ = −β x + √(2β) σ u(t) ,  (3.114)

where u(t) is a white noise process

R_U(τ) = δ(τ)  (3.115a)
P_U(Ω) = 1 .  (3.115b)

This system was analyzed in Appendix B where the autocorrelation was found to be

R_X(τ) = σ² e^{−β|τ|} .  (3.116)

Using the results of Appendix B, the sampled system and measurement equation are given by

x_{k+1} = φ_k x_k + w_k  (3.117a)
z_k = H_k x_k + v_k .  (3.117b)

The state transition matrix (a scalar in this example) for the differential equation is given by

φ_k = e^{−βΔt} .  (3.118)

The variance of w_k is given by

Q_k = E[ w_k² ]
    = E[ ∫_0^{Δt} √(2β) σ e^{−βξ} u(ξ) dξ ∫_0^{Δt} √(2β) σ e^{−βη} u(η) dη ]
    = 2βσ² E[ ∫_0^{Δt} e^{−βξ} u(ξ) dξ ∫_0^{Δt} e^{−βη} u(η) dη ]
    = 2βσ² ∫_0^{Δt} ∫_0^{Δt} e^{−βξ} e^{−βη} E[ u(ξ) u(η) ] dξ dη
    = 2βσ² ∫_0^{Δt} ∫_0^{Δt} e^{−βξ} e^{−βη} δ(ξ − η) dξ dη
    = 2βσ² ∫_0^{Δt} e^{−2βξ} dξ
    = σ² ( 1 − e^{−2βΔt} ) = Q .  (3.119)

This results in the autocorrelation function

R_W[k] = δ(k) Q
       = σ² ( 1 − e^{−2βΔt} ) δ(k) .  (3.120)

The same result could have been obtained using Eq. 4.45. The mean value of x(t) is zero [19, pp. 83-84]

lim_{τ→∞} R_X(τ) = 0  ⇒  E[X(t)] = 0 .  (3.121)

The mean square value for the process (also the variance, since the process has a zero mean) at t = 0 is found by

E[ X²(t) ] = R_X(0) = σ² .  (3.122)

If a Kalman filter is to be used, the initial estimate of x(t) is given by

x̂_{0/0} = E[X(0)] = 0 ,  (3.123)

and the error covariance is given by

P_{0/0} = R_X(0) = σ² .  (3.124)

Numerical Results [19, pp. 223-225]

If the process noise w_k is drawn from a Gaussian density function with variance σ², then the sequence x_k is said to be a first-order Gauss-Markov process. It is first-order because the equation is a first-order difference equation; it is Gauss because the density of the process noise w_k is Gaussian; it is Markov because (due to the state transition matrix) the state update depends only on the previous state. Let the autocorrelation parameters be

β = 1  (3.125a)
σ = 1 .  (3.125b)

Then, the autocorrelation is given by

R_X(τ) = e^{−|τ|} .  (3.126)

Suppose we have a sequence of noisy measurements of this process taken 0.02 seconds apart beginning at t = 0 with autocorrelation given by

R_k = E{ v_k² } = 1 .  (3.127)

The measurement relationship to x is

H_k = 1 .  (3.128)

A Kalman filter is to be used to obtain an optimal estimate of x(t). The state transition matrix is given by

φ_k = e^{−βΔt} = e^{−0.02} ≈ 0.9802 .  (3.129)

The discrete-time process noise has a variance given by

Q = σ² ( 1 − e^{−2βΔt} ) = 1 − e^{−2(0.02)} ≈ 0.03921 .  (3.130)

The process has a zero mean and unity variance. In summary, the process parameters are given below.

φ_k = 0.98020   Q_k = 0.03921   R_k = 1.00000   P_{0/0} = 1.00000   x̂_{0/0} = 0  (3.131)

Figures 3.1-3.3 show the Kalman filter's performance for this example. The estimation error is shown in Figure 3.2, along with the square root of the covariance P_k. The estimation error is Gaussian and therefore it should lie within (plus or minus) one standard deviation about 68% of the time. Of the twenty-four samples shown, eight fall outside of the 1-σ error bound. Therefore, the experimental results show that the state estimate is within the 1-σ boundary 67% of the time.

[Figure 3.1. Kalman Filter Performance for a 1st Order Gauss-Markov Process. The figure plots the true process, the measurements, and the filter estimates versus time in seconds.]

This is in excellent agreement with the theoretical value of 68%. The Kalman gain for the problem quickly reaches a steady-state value. For this reason, it is more likely that one would simply implement the steady-state gains rather than the time-varying gains. The full utility of the Kalman filter is typically only realized when the process or measurement noise matrices are time-varying (i.e., when R_k or Q_k are functions of k). The use of steady-state gains is discussed more in [7, pp. 89-100].
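A run of this example can be sketched in a few lines. The sketch below uses the parameters of Eq. 3.131; the random draws are of course not those behind the figures, but the gain sequence is deterministic and reproduces the quick convergence to steady state noted above.

```python
import numpy as np

# Sketch: Monte Carlo run of the scalar Kalman filter for the 1st-order
# Gauss-Markov example, parameters from Eq. 3.131.
rng = np.random.default_rng(3)
phi, Q, H, R = 0.98020, 0.03921, 1.0, 1.0
x = rng.normal(0.0, 1.0)       # X(0) ~ N(0, sigma^2) with sigma = 1
xhat, P = 0.0, 1.0             # Eqs. 3.123-3.124
gains = []
for k in range(1, 1001):
    x = phi * x + rng.normal(0.0, np.sqrt(Q))       # Eq. 3.117a
    z = H * x + rng.normal(0.0, np.sqrt(R))         # Eq. 3.117b
    xhat_pred = phi * xhat                          # Eq. 3.110a (d_k = 0)
    P_pred = phi * P * phi + Q                      # Eq. 3.110b
    K = P_pred * H / (H * P_pred * H + R)           # Eq. 3.111c
    xhat = xhat_pred + K * (z - H * xhat_pred)      # Eq. 3.111a
    P = (1.0 - K * H) * P_pred                      # Eq. 3.111b
    gains.append(K)
```

The gain starts at 0.5 on the first measurement and settles near 0.165, consistent with the steady-state behavior shown in Figure 3.3.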

3.8 Multiple Model

The equations governing optimal estimation in a general multiple model context have been developed in Section 2.1. These general multiple model equations will now be applied to the linear sampled data system. As in the general multiple model context,

[Figure 3.2. Estimation Error (blue) and ±√P_k (red) for 1st Order Gauss-Markov Process, plotted against filter cycle number k.]

[Figure 3.3. Kalman Gain for 1st Order Gauss-Markov Process, plotted against filter cycle number k.]

the model probabilities are given by

P(M_j) = μ_0^{(j)} ,  (3.132)

with corresponding density

f_M(m) = Σ_{j=1}^{r} μ_0^{(j)} δ(m − j) .  (3.133)

Each model m has a linear measurement equation of the form given by Eq. 3.4

z_k^{(m)} = H_k^{(m)} x_k^{(m)} + v_k^{(m)} ,  (3.134)

where the superscript (m) indicates that the measurement equation is valid for model m. The measurement noise is white with covariance

E{ v_k^{(m)} (v_k^{(m)})^T } = R_k^{(m)} .  (3.135)

Similarly, the process dynamics are of the type specified by Eq. 3.8

x_{k+1}^{(m)} = Φ_k^{(m)} x_k^{(m)} + w_k^{(m)} + d_k^{(m)} .  (3.136)

The process noise is white with covariance

E{ w_k^{(m)} (w_k^{(m)})^T } = Q_k^{(m)} .  (3.137)

For each model m, an initial PDF f(x_0 | m) is given, and as usual we include z_0 as a placeholder for measurements

f(x_0 | z_0, m) ≜ f(x_0 | m) .  (3.138)

For the linear sampled data system, the density is given by

f(x_0 | z_0, m) = N( x̂_{0/0}^{(m)}, P_{0/0}^{(m)} ) ,  (3.139)

where

x̂_{0/0}^{(m)} = E{ x_0 | M = m }  (3.140a)
P_{0/0}^{(m)} = cov( x_0 | M = m ) .  (3.140b)

A Bayes’ estimator for model m is given by Eqs. 3.51-3.54c with all estimates in- cluding the superscript (m). With the state model given by Eq. 3.136, the a priori conditional density of xk is given by

(m) (m) f (xk z0:k 1,m)=N xˆk/k 1, Pk/k 1 , (3.141) | − − − ³ ´ where

(m) (m) (m) (m) xˆk/k 1 = Φk 1xˆk 1/k 1 + dk 1 (3.142) − − − − − T (m) (m) (m) (m) (m) Pk/k 1 = Φk 1Pk 1/k 1 Φk 1 + Qk 1 . (3.143) − − − − − − ³ ´ Given a new measurement zk, the a priori density is updated according to

(m) (m) f (xk z0:k,m)=N xˆ , P , (3.144) | k/k k/k ³ ´ where

(m) (m) (m) (m) (m) (m) xˆk/k = xˆk/k 1 + Kk zk Hk xˆk/k 1 (3.145) − − − (m) (m) (m)³ (m) ´ Pk/k = I Kk Hk Pk/k 1 (3.146) − − ³ T´ T 1 (m) (m) (m) (m) (m) (m) − Kk = Pk/k 1 Hk Hk Pk/k 1 Hk + Rk . (3.147) − − ³ ´ µ ³ ´ ¶ The a posteriori model density, which is the model density m conditioned on the measurements z0:k is given by Bayes’ rule

The a posteriori model density, which is the density of the model m conditioned on the measurements z_{0:k}, is given by Bayes' rule

f(m | z_{0:k}) = f(m, z_{0:k}) / f(z_{0:k})
              = f(m, z_k | z_{0:k−1}) f(z_{0:k−1}) / [ f(z_k | z_{0:k−1}) f(z_{0:k−1}) ]
              = f(m, z_k | z_{0:k−1}) / f(z_k | z_{0:k−1})
              = f(z_k | z_{0:k−1}, m) f(m | z_{0:k−1}) / f(z_k | z_{0:k−1}) .  (3.148)

That is, the density f(m | z_{0:k}) can be recursively computed. The value of f(z_k | z_{0:k−1}, m) can be obtained using Eq. 3.59

f(z_k | z_{0:k−1}, m) = exp( −(1/2) ( z_k − H_k^{(m)} x̂_{k/k−1}^{(m)} )^T ( H_k^{(m)} P_{k/k−1}^{(m)} (H_k^{(m)})^T + R_k^{(m)} )^{−1} ( z_k − H_k^{(m)} x̂_{k/k−1}^{(m)} ) ) / [ (2π)^{p/2} | H_k^{(m)} P_{k/k−1}^{(m)} (H_k^{(m)})^T + R_k^{(m)} |^{1/2} ] ,  (3.149)

where p is the dimension of the measurement z_k. The conditional density f(m | z_{0:k−1}) is known from a previous iteration and is initialized by

f(m | z_{0:0}) = f_M(m) .  (3.150)

For the first measurement z_1, Eq. 3.148 is computed by

f(m | z_{0:1}) = f(z_1 | z_0, m) f(m | z_0) / f(z_1 | z_0)
              = f(z_1 | z_0, m) f_M(m) / f(z_1 | z_0)
              = [ f(z_1 | z_0, m) / f(z_1 | z_0) ] Σ_{j=1}^{r} μ_0^{(j)} δ(m − j) .  (3.151)

Alternately, using probability mass functions for M, we have

P(m | z_{0:1}) = [ f(z_1 | z_0, m) / f(z_1 | z_0) ] μ_0^{(m)} = [ f(z_1 | z_0, m) / f(z_1 | z_0) ] P(m | z_0) .  (3.152)

Using the PMF representation, Eq. 3.148 gives the following recursion formula

P(m | z_{0:k}) = [ f(z_k | z_{0:k−1}, m) / f(z_k | z_{0:k−1}) ] P(m | z_{0:k−1}) .  (3.153)

The value of f(z_k | z_{0:k−1}) is computed by

f(z_k | z_{0:k−1}) = ∫ f(z_k, m | z_{0:k−1}) dm
                   = ∫ f(z_k | z_{0:k−1}, m) f(m | z_{0:k−1}) dm
                   = Σ_{j=1}^{r} f(z_k | z_{0:k−1}, j) P(j | z_{0:k−1}) .  (3.154)

Substituting this result into Eq. 3.153 gives

P(m | z_{0:k}) = f(z_k | z_{0:k−1}, m) P(m | z_{0:k−1}) / Σ_{j=1}^{r} f(z_k | z_{0:k−1}, j) P(j | z_{0:k−1}) .  (3.155)

The expectation of the state conditioned on the measurements z_{0:k} and the model choice m is given by Eq. 3.145

E{ x_k | z_{0:k}, m } = x̂_{k/k}^{(m)} .  (3.156)

The MMSE estimate is the conditional mean and is given by

x̂_{k/k} = E{ x_k | z_{0:k} }
        = ∫ x_k f(x_k | z_{0:k}) dx_k
        = ∫∫ x_k f(x_k, m | z_{0:k}) dx_k dm
        = ∫∫ x_k f(x_k | m, z_{0:k}) f(m | z_{0:k}) dx_k dm
        = Σ_{m=1}^{r} [ ∫ x_k f(x_k | m, z_{0:k}) dx_k ] P(m | z_{0:k})
        = Σ_{m=1}^{r} x̂_{k/k}^{(m)} P(m | z_{0:k}) .  (3.157)

The density f(x_k | z_{0:k}) can be computed as follows

f(x_k | z_{0:k}) = ∫ f(x_k, m | z_{0:k}) dm
                 = ∫ f(x_k | m, z_{0:k}) f(m | z_{0:k}) dm
                 = Σ_{m=1}^{r} f(x_k | m, z_{0:k}) P(m | z_{0:k}) .  (3.158)

Since each of the densities f(x_k | m, z_{0:k}) is Gaussian, the conditional density f(x_k | z_{0:k}) is a weighted sum of Gaussian PDFs. This of course means that the conditional density f(x_k | z_{0:k}) is not Gaussian, making statistical inference about the estimation error (i.e., confidence regions) difficult.

Kalman Filter. The multiple model Kalman filter equations follow directly from previous results. Each of the r Kalman filters is initialized with an initial estimate x̂_{0/0}^{(m)}, an initial covariance P_{0/0}^{(m)}, and an initial model probability μ_0^{(m)}. At each step k, the standard Kalman filter equations are computed for each model m

x̂_{k/k−1}^{(m)} = Φ_{k−1}^{(m)} x̂_{k−1/k−1}^{(m)} + d_{k−1}^{(m)}  (3.159)
P_{k/k−1}^{(m)} = Φ_{k−1}^{(m)} P_{k−1/k−1}^{(m)} (Φ_{k−1}^{(m)})^T + Q_{k−1}^{(m)}  (3.160)
K_k^{(m)} = P_{k/k−1}^{(m)} (H_k^{(m)})^T ( H_k^{(m)} P_{k/k−1}^{(m)} (H_k^{(m)})^T + R_k^{(m)} )^{−1}  (3.161)
x̂_{k/k}^{(m)} = x̂_{k/k−1}^{(m)} + K_k^{(m)} ( z_k − H_k^{(m)} x̂_{k/k−1}^{(m)} )  (3.162)
P_{k/k}^{(m)} = ( I − K_k^{(m)} H_k^{(m)} ) P_{k/k−1}^{(m)} .  (3.163)

The conditional density f(z_k | z_{0:k−1}, m) is evaluated at the current measurement z_k

f(z_k | z_{0:k−1}, m) = exp( −(1/2) ( z_k − H_k^{(m)} x̂_{k/k−1}^{(m)} )^T ( H_k^{(m)} P_{k/k−1}^{(m)} (H_k^{(m)})^T + R_k^{(m)} )^{−1} ( z_k − H_k^{(m)} x̂_{k/k−1}^{(m)} ) ) / [ (2π)^{p/2} | H_k^{(m)} P_{k/k−1}^{(m)} (H_k^{(m)})^T + R_k^{(m)} |^{1/2} ] .  (3.164)

The conditional model probability P(m | z_{0:k}) is updated

P(m | z_{0:k}) = f(z_k | z_{0:k−1}, m) P(m | z_{0:k−1}) / Σ_{j=1}^{r} f(z_k | z_{0:k−1}, j) P(j | z_{0:k−1}) .  (3.165)

The MMSE estimate is formed

x̂_{k/k} = Σ_{m=1}^{r} x̂_{k/k}^{(m)} P(m | z_{0:k}) .  (3.166)
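The whole scheme can be sketched with a two-model bank of scalar filters. Everything below is an illustrative assumption: two candidate state transitions, with the simulated truth following the first. The probability update is that of Eq. 3.165 and the mixed estimate that of Eq. 3.166.

```python
import numpy as np

# Sketch: two-model bank of scalar Kalman filters with model-probability
# update (Eq. 3.165) and mixed MMSE estimate (Eq. 3.166). All parameter
# values are illustrative assumptions; the truth follows model 0.
rng = np.random.default_rng(4)
phis = [0.99, 0.50]                  # candidate state transitions
Q, H, R = 0.05, 1.0, 0.1
mu = np.array([0.5, 0.5])            # initial model probabilities mu_0^{(m)}
xhat = np.zeros(2)                   # per-model estimates
P = np.ones(2)                       # per-model covariances

x = 1.0                              # true state, governed by phis[0]
for k in range(300):
    x = phis[0] * x + rng.normal(0, np.sqrt(Q))
    z = H * x + rng.normal(0, np.sqrt(R))
    like = np.empty(2)
    for m in range(2):
        xp = phis[m] * xhat[m]                       # Eq. 3.159
        Pp = phis[m] * P[m] * phis[m] + Q            # Eq. 3.160
        S = H * Pp * H + R                           # innovation variance
        K = Pp * H / S                               # Eq. 3.161
        like[m] = np.exp(-0.5 * (z - H * xp) ** 2 / S) / np.sqrt(2 * np.pi * S)
        xhat[m] = xp + K * (z - H * xp)              # Eq. 3.162
        P[m] = (1 - K * H) * Pp                      # Eq. 3.163
    mu = like * mu / np.sum(like * mu)               # Eq. 3.165
xhat_mmse = np.sum(mu * xhat)                        # Eq. 3.166
```

Over the run the likelihood ratio concentrates the probability mass on the model that actually generated the data, and the mixed estimate is dominated by that model's filter.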

Chapter 4 Stochastic Motion Models

As with any other chapter of this dissertation, an entire book could be written on the subject of stochastic motion modeling. An excellent introduction to the subject is given in [12, Ch. 4], under the title of "Modeling and Tracking Dynamic Targets". Without question, domain knowledge can greatly assist in motion modeling. For example, one would use a different motion model for tracking a boost-stage missile [12, p. 242] than one would for tracking people (deformable objects) in a cluttered environment [14, p. 267], [83, p. 744]. It may be that one would like to use a tracker to guide a missile to a target. If such is the case, then the motion model will likely consist of the states that are required by the guidance law. That is, the system model will be influenced by how the tracker fits into a larger design, whether it be a missile or an automated visual-to-audio sign language interpretation system. This chapter will introduce some of the more common stochastic motion models. The chapter will begin with a discussion of Markov models and their relation to motion modeling. Next, process noise modeling will be discussed. This will be followed by motion models of increasing complexity.

4.1 Markov Models

Regardless of the motion model, there is a distinguishing feature which makes the problem manageable: motion, by its very nature, is Markov (see Section A.8.2). Thus, we can expect models to be of the form

discrete:   x_k = f_k( x_{k−1}, u_{k−1}, v_{k−1} )
continuous: ẋ = f( x, u, v ) ,  (4.1)

where x is the system state vector and typically consists of some type of position and velocity information; u is a known control or disturbance input; v is an unknown white forcing function usually referred to as process noise. The fact that v is white is not limiting, as a correlated process can be formed by filtering a white process and incorporating the filter states into the process model [32, pp. 78-84], [19, pp. 226-228] and [3, Ch. 11]. It can also be argued that the Markov property is a direct result of state space modeling. By definition, the state of a system is a set of quantities which allow one to uniquely determine the status of the system for all future times if all inputs are specified [18, p. 76]. That is, only the current state, not past values of it, influences the future state of the system. The generality of Markov processes is briefly discussed in [20, p. 317].

4.1.1 Principle of Inertia

Most of the objects one can conceive of tracking possess inertia. That is, they are resistant to changes in their current state of motion. The time derivative of position is velocity and the time derivative of velocity is acceleration.¹ Let r represent a vector locating an object of mass M. Newton's law states that a rigid object of mass M will undergo an acceleration r̈ if acted upon by a force F according to the relationship

r̈ = (1/M) F .  (4.2)

Let x be a state vector with the first three states position and the next three states velocity (additional states would be needed if one were interested in the rotational motion of an object)

ẋ = d/dt [ r ; ṙ ] = [ 0_{3×3} , I_{3×3} ; 0_{3×3} , 0_{3×3} ] x + [ 0_{3×3} ; (1/M) I_{3×3} ] F .  (4.3)

¹ The time derivative of acceleration is jerk. The author hesitates to say how a jerk changes with time!

Physically realizable forces are finite in magnitude. For such systems, the process model is given by Eq. 4.3. The force F can consist of a known, deterministic component and an unknown stochastic component. If the stochastic component has a non-uniform PSD, then additional states must be added to shape a white process to the desired form. Process noise models are discussed in the next section. The principle of inertia, which represents domain knowledge for the tracking problem, gives much of the structure to stochastic motion models. While the model given by Eq. 4.3 is only applicable to Cartesian coordinate systems, the principle of inertia is more general. In other, more complex coordinate systems (e.g. spherical) with interacting states, the principle of inertia will still apply, but the resulting model will often be nonlinear which will likely complicate state estimation. Generalized coordinate systems and the principle of inertia are discussed in [36, Ch. 6]. Since the position states in a Cartesian system are decoupled, the state model need only be developed for one spatial direction. The remaining two spatial directions have state models that are equivalent to the one developed.
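The Newtonian model of Eq. 4.3 can be built directly. The mass and force values below are illustrative assumptions; the system matrix is nilpotent, so its exact state transition matrix over a step dt is simply I + F·dt, and positions advance by velocity times dt while the spatial directions remain decoupled.

```python
import numpy as np

# Sketch: the Newtonian state model of Eq. 4.3 in three Cartesian
# dimensions. M and the applied force are illustrative assumptions.
M = 2.0
Z = np.zeros((3, 3))
I = np.eye(3)
F_mat = np.block([[Z, I], [Z, Z]])          # system matrix of Eq. 4.3
B = np.vstack([Z, I / M])                   # force input matrix of Eq. 4.3

# F_mat is nilpotent (F_mat @ F_mat = 0), so expm(F_mat*dt) = I + F_mat*dt exactly.
dt = 0.1
Phi = np.eye(6) + F_mat * dt                # exact state transition matrix

x0 = np.array([0.0, 0.0, 0.0, 1.0, 2.0, 3.0])   # position 0, velocity (1, 2, 3)
x1 = Phi @ x0                                    # position advances by v*dt

# State derivative under a constant force (Eq. 4.3): the velocity states
# accelerate by F/M while the position states see only the velocity.
F_vec = np.array([0.0, 0.0, -19.6])
xdot0 = F_mat @ x0 + B @ F_vec
```

Because the position and velocity blocks are diagonal, each spatial direction evolves independently, which is the decoupling property noted in the text.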

4.2 Process Noise Models

The most common state estimation schemes are only feasible for processes that have white noise inputs. White noise is discussed in Section A.8.1. Even if the process noise is not white, a white noise model is appropriate when the noise is nearly uniform over frequencies that define the system (model) bandwidth. However, it should be no surprise that white noise does not reflect reality in many situations. More often the noise is colored, which means that the power spectral density (PSD) of the noise is not uniform across all frequencies, as it is in the case of a white noise process. Fortunately, many colored noise processes can be adequately modeled as the output of a linear time-invariant (LTI) filter that is driven by a white noise process. Similarly, a colored noise sequence can usually be adequately modeled as the output of a discrete-time 94 linear shift-invariant filter that is driven by a white noise sequence. The following scalar system is to be used as a shaping filter

\dot{x} = -ax + bu ,   (4.4)

where a > 0 and u is a white noise process with autocorrelation and PSD (Section A.8.1)

R_U(\tau) = A\,\delta(\tau)   (4.5a)

P_U(\Omega) = A .   (4.5b)

The output autocorrelation and PSD are given by the following equations.

R_X(\tau) = \frac{A b^2}{2a}\, e^{-a|\tau|}

P_X(\Omega) = \frac{A b^2}{a^2 + \Omega^2}   (4.6)

Thus, the simple scalar system given by Eq. 4.4 has been used to create a process with an exponential autocorrelation function. The parameter a determines the correlation time of the process. Needless to say, a continuous-time white process is not possible to synthesize; however, a white sequence is. Consider the system that results from sampling the continuous-time system

x_{k+1} = \phi_k x_k + w_k .   (4.7)

The state transition function φk and autocorrelation RW are given by the following equations.

\phi_k = e^{-aT}   (4.8a)

R_W = \delta(n-k)\, Q   (4.8b)

Q = \frac{A b^2}{2a} \left( 1 - e^{-2aT} \right)   (4.8c)

The sampled system can be simulated by applying a white noise sequence with variance Q to the discrete equivalent system. Although filtering a white sequence with a shaping filter produces the desired autocorrelation, it does have a shortcoming: the probability density of the output signal x_k cannot be specified. This is explained in more detail in Appendix B.5.3.
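The shaping filter can be checked numerically. The following sketch (not from the text; all parameter values are illustrative) simulates the sampled filter of Eqs. 4.7-4.8 and verifies that the output sequence has the lag-one correlation e^{-aT} and the stationary variance Ab^2/(2a) predicted by Eq. 4.6.

```python
import math
import random

# Sketch (illustrative parameters, not from the dissertation): simulate the
# discrete shaping filter x[k+1] = exp(-a*T)*x[k] + w[k] of Eqs. 4.7-4.8.
random.seed(1)

a, b, A, T = 2.0, 1.0, 1.0, 0.05
phi = math.exp(-a * T)                                   # Eq. 4.8a
Q = (A * b ** 2 / (2 * a)) * (1 - math.exp(-2 * a * T))  # Eq. 4.8c

N = 100_000
x = [0.0] * N
for k in range(N - 1):
    x[k + 1] = phi * x[k] + random.gauss(0.0, math.sqrt(Q))

# Empirical variance and lag-one autocorrelation of the output sequence.
mean = sum(x) / N
var = sum((v - mean) ** 2 for v in x) / N
lag1 = sum((x[k] - mean) * (x[k + 1] - mean) for k in range(N - 1)) / (N - 1)
rho1 = lag1 / var
# rho1 should be near exp(-a*T), and var near A*b^2/(2a) as in Eq. 4.6.
```

The empirical statistics converge to the theoretical values as the sequence length grows, although, as noted above, the density of x_k cannot be controlled by this construction.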

4.3 Random Walk

One can hardly discuss motion models without mention of the infamous random walk. The term random walk derives its name from the example of a man who takes fixed-length steps in arbitrary directions [32, p. 79]. It is most common to see a random walk model in discrete form

x(k+1) = x(k) + w(k) .   (4.9)

After N steps we have

x(N) = x(N-1) + w(N-1)
     = x(N-2) + w(N-2) + w(N-1)
     = x(0) + \sum_{n=0}^{N-1} w(n) .   (4.10)

Taking the expectation with x(0) = 0 gives the mean

E\{x(N)\} = E\left\{ x(0) + \sum_{n=0}^{N-1} w(n) \right\}
          = E\{x(0)\} + \sum_{n=0}^{N-1} E\{w(n)\}
          = E\{x(0)\}
          = 0 .   (4.11)

The autocorrelation is

For unit-variance steps, E\{w(r)\, w(k)\} = \delta(r-k), and the autocorrelation is

R_X(m,n) = E\{x(m)\, x(n)\}
         = E\left\{ \left( x(0) + \sum_{r=0}^{m-1} w(r) \right) \left( x(0) + \sum_{k=0}^{n-1} w(k) \right) \right\}
         = E\{x^2(0)\} + \sum_{r=0}^{m-1} E\{x(0)\, w(r)\} + \sum_{k=0}^{n-1} E\{x(0)\, w(k)\} + \sum_{r=0}^{m-1} \sum_{k=0}^{n-1} E\{w(r)\, w(k)\}   (4.12)
         = \sum_{r=0}^{m-1} \sum_{k=0}^{n-1} E\{w(r)\, w(k)\}
         = \min(m, n)
         = m\, S(n-m) + n\, S(m-n) .   (4.13)

Thus, the process is non-stationary.
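A short Monte-Carlo sketch (illustrative, not from the text) confirms the result: with unit-variance steps, R_X(N, N) = E\{x^2(N)\} = N, so the variance grows linearly with the number of steps and the process has no stationary distribution.

```python
import random

# Sketch: Monte-Carlo check that the random walk of Eq. 4.9 with unit-variance
# steps has E{x(N)^2} = N, i.e., a variance that grows without bound.
random.seed(0)

num_walks, num_steps = 4000, 100
final_sq = 0.0
half_sq = 0.0
for _ in range(num_walks):
    x = 0.0
    for n in range(num_steps):
        x += random.gauss(0.0, 1.0)       # w(n), unit variance
        if n == num_steps // 2 - 1:
            half_sq += x * x              # record E{x^2} midway
    final_sq += x * x

var_half = half_sq / num_walks            # expected: num_steps / 2
var_final = final_sq / num_walks          # expected: num_steps
```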

4.3.1 Continuous Time Random Walk

The continuous-time version of random walk is [19, pp. 100-102]

\dot{x} = u(t) ,   (4.14)

where u(t) is white noise

E\{u(t+\tau)\, u(t)\} = A\,\delta(\tau) .   (4.15)

Integrating with x(t=0) = 0 gives

x(t) = \int_0^t u(\gamma)\, d\gamma .   (4.16)

Taking the expectation gives

E\{x(t)\} = \int_0^t E\{u(\gamma)\}\, d\gamma = 0 .   (4.17)

The autocorrelation is

E\{x(t_1)\, x(t_2)\} = E\left\{ \int_0^{t_1} u(\gamma)\, d\gamma \int_0^{t_2} u(\beta)\, d\beta \right\}
= \int_0^{t_2} \int_0^{t_1} E\{u(\gamma)\, u(\beta)\}\, d\gamma\, d\beta
= \int_0^{t_2} \int_0^{t_1} A\,\delta(\gamma - \beta)\, d\gamma\, d\beta
= A \min(t_1, t_2)
= A \left[ t_1 S(t_2 - t_1) + t_2 S(t_1 - t_2) \right] .   (4.18)

Sampled Continuous-Time Random Walk. The continuous-time system state transition matrix is equal to one (this can be seen by setting u(t) = 0 and solving the homogeneous equation \dot{x} = 0)

\phi(t) = 1 ,   (4.19)

and the equivalent noise is given by

Q = E\{w_k^2\} = \int_{t_k}^{t_{k+1}} A\, d\eta = AT .   (4.20)

In summary, the sampled continuous-time random walk is given by

x(k+1) = x(k) + w(k)   (4.21a)

R_{WW}[k] = Q\,\delta(k) .   (4.21b)

Therefore, the discrete equivalent is given by Eq. 4.9 with E\{w_{m+k}\, w_m\} = R_{WW}[k] = Q\,\delta(k) = AT\,\delta(k).

Relation to Inertia. The concept of a random walk is something that one can easily visualize. One can picture a person taking random steps that are uncorrelated. It seems that the definition itself is sufficient to justify the credibility of a random walk.

However, one would still like a justification for the continuous-time random walk model and its relationship to Newton's law of motion. The system model given by Eq. 4.3 can be generalized such that the force F has components that are due to a restoring force, viscous friction and noise. The restoring force is essentially a spring; it acts in the direction opposite to r. Viscous friction is a dissipative force that can be modeled as a constant times the velocity of the object

M\ddot{r} = F = u - kr - c\dot{r} .   (4.22)

Rearranging gives

M\ddot{r} + c\dot{r} + kr = u .   (4.23)

This model has a wide range of application in engineering. It can be used to represent a mass-spring-damper system or an inductor-capacitor-resistor system. It can also be used to describe the movement of a particle in a liquid, subjected to collisions and other forces, resulting in a motion termed Brownian motion [63, pp. 447-449]. If the restoring constant k and the mass M are small in comparison to the viscous damping, then an approximate model is given by

\dot{r} = \frac{1}{c} u .   (4.24)

This random process is a random walk and is often called the Wiener process.

4.4 White Acceleration

Consider one direction in a Cartesian coordinate system. This may represent the row or column coordinate on an FPA. Alternatively, it may represent one linear direction in three-dimensional space. The second time-derivative of this coordinate is

\ddot{r} = u ,   (4.25)

where u is a white noise process

R_U(\tau) = A\,\delta(\tau) .   (4.26)

This model corresponds to a random walk velocity

\ddot{r} = \frac{d}{dt} \dot{r} = u .   (4.27)

Let the state be composed of the coordinate and its first time-derivative

x = \begin{bmatrix} r & \dot{r} \end{bmatrix}^T .   (4.28)

The state model is given by

\dot{x} = Fx + Gu ,   (4.29)

where

F = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}   (4.30a)

G = \begin{bmatrix} 0 \\ 1 \end{bmatrix} .   (4.30b)

4.4.1 Discrete Equivalent

The homogeneous system has u = 0

x˙ = Fx . (4.31)

Taking the Laplace transform gives

s\, x(s) - x_0 = F\, x(s) .   (4.32)

Solving for x gives

x(s) = (sI - F)^{-1} x_0 = \Phi(s)\, x_0 ,   (4.33)

where \Phi(s) is the Laplace transform of the state transition matrix

\Phi(s) = (sI - F)^{-1} = \begin{bmatrix} s & -1 \\ 0 & s \end{bmatrix}^{-1} = \frac{1}{s^2} \begin{bmatrix} s & 1 \\ 0 & s \end{bmatrix} .   (4.34)

Taking the inverse Laplace transform gives

\Phi(t) = \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} S(t) ,   (4.35)

where S(t) is the unit step function. The complete solution is given by

x(t_{k+1}) = \Phi(t_{k+1} - t_k)\, x(t_k) + w_k = \begin{bmatrix} 1 & T \\ 0 & 1 \end{bmatrix} x(t_k) + w_k ,   (4.36)

where T is the sample period (T^{-1} is the sample rate) and w_k is white noise with an autocorrelation that can be found by Eq. B.70

R_W[k, n] = \delta(n-k)\, Q ,   (4.37)

where

Q = \int_0^T \Phi(\nu)\, G A G^T \Phi^T(\nu)\, d\nu
  = A \int_0^T \begin{bmatrix} \nu \\ 1 \end{bmatrix} \begin{bmatrix} \nu & 1 \end{bmatrix} d\nu
  = A \int_0^T \begin{bmatrix} \nu^2 & \nu \\ \nu & 1 \end{bmatrix} d\nu
  = A \begin{bmatrix} \frac{T^3}{3} & \frac{T^2}{2} \\ \frac{T^2}{2} & T \end{bmatrix} .   (4.38)

A stochastic motion model for a white noise acceleration process is completely defined by Eqs. 4.36-4.38.
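The closed form of Eq. 4.38 can be verified numerically. The sketch below (A and T are illustrative values) integrates the matrix integrand with a midpoint rule and compares the result to the analytic expression.

```python
# Sketch: verify Eq. 4.38 by numerically integrating
# Q = integral_0^T Phi(v) G A G^T Phi^T(v) dv, with Phi(v) = [[1, v], [0, 1]]
# and G = [0, 1]^T, so that Phi(v) G = [v, 1]^T and the integrand is
# A * [[v^2, v], [v, 1]].
A, T = 2.0, 0.1
n = 20_000
dv = T / n
Q_num = [[0.0, 0.0], [0.0, 0.0]]
for i in range(n):
    v = (i + 0.5) * dv                 # midpoint rule
    Q_num[0][0] += A * v * v * dv
    Q_num[0][1] += A * v * dv
    Q_num[1][1] += A * dv
Q_num[1][0] = Q_num[0][1]              # symmetry

Q_exact = [[A * T ** 3 / 3, A * T ** 2 / 2],
           [A * T ** 2 / 2, A * T]]
```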

4.5 Correlated Acceleration

A correlated acceleration model can be obtained by appending the process noise model given by Eq. 4.4 to Newton's equations given by Eq. 4.3. The equations are uncoupled, so only one dimension needs to be analyzed. The state is the position, velocity and acceleration

x = \begin{bmatrix} r & \dot{r} & \ddot{r} \end{bmatrix}^T .   (4.39)

The process model is given by

\dot{x} = Fx + Gu ,   (4.40)

where

F = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & -\alpha \end{bmatrix}   (4.41a)

G = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}^T .   (4.41b)

4.5.1 Discrete Equivalent

The Laplace transform of the state transition matrix is

\Phi(s) = (sI - F)^{-1}
= \begin{bmatrix} s & -1 & 0 \\ 0 & s & -1 \\ 0 & 0 & s+\alpha \end{bmatrix}^{-1}
= \frac{1}{s^2 (s+\alpha)} \begin{bmatrix} s(s+\alpha) & s+\alpha & 1 \\ 0 & s(s+\alpha) & s \\ 0 & 0 & s^2 \end{bmatrix}
= \begin{bmatrix} \frac{1}{s} & \frac{1}{s^2} & \frac{1}{s^2(s+\alpha)} \\ 0 & \frac{1}{s} & \frac{1}{s(s+\alpha)} \\ 0 & 0 & \frac{1}{s+\alpha} \end{bmatrix} .   (4.42)

Taking the inverse Laplace transform gives

\Phi(t) = \begin{bmatrix} 1 & t & \frac{1}{\alpha^2}\left( -1 + \alpha t + e^{-\alpha t} \right) \\ 0 & 1 & \frac{1}{\alpha}\left( 1 - e^{-\alpha t} \right) \\ 0 & 0 & e^{-\alpha t} \end{bmatrix} .   (4.43)

The discrete-time process noise has an autocorrelation given by

R_U[k, n] = \delta(n-k)\, Q ,   (4.44)

where Q can be determined from Eq. B.70

Q = \int_0^T \Phi(t)\, G A G^T \Phi^T(t)\, dt
  = A \int_0^T \begin{bmatrix} \frac{1}{\alpha^2}\left( -1 + \alpha t + e^{-\alpha t} \right) \\ \frac{1}{\alpha}\left( 1 - e^{-\alpha t} \right) \\ e^{-\alpha t} \end{bmatrix} \begin{bmatrix} \frac{1}{\alpha^2}\left( -1 + \alpha t + e^{-\alpha t} \right) \\ \frac{1}{\alpha}\left( 1 - e^{-\alpha t} \right) \\ e^{-\alpha t} \end{bmatrix}^T dt .   (4.45)

This equation would be tedious to evaluate, but certainly possible (the result is given

in [80]). Expand e^{-\alpha t} in a Taylor series about \alpha t = 0:

e^{-\alpha t} = 1 - \alpha t + \frac{(\alpha t)^2}{2} + O\left( (\alpha t)^3 \right) .   (4.46)

It is often the case that the sample time T is short relative to the correlation time 1/\alpha, so that

e^{-\alpha t} \simeq 1 - \alpha t + \frac{1}{2} (\alpha t)^2 .   (4.47)

Using this approximation, the state transition matrix is approximated by

\Phi(t) = \begin{bmatrix} 1 & t & \frac{1}{\alpha^2}\left( -1 + \alpha t + e^{-\alpha t} \right) \\ 0 & 1 & \frac{1}{\alpha}\left( 1 - e^{-\alpha t} \right) \\ 0 & 0 & e^{-\alpha t} \end{bmatrix}
\simeq \begin{bmatrix} 1 & t & \frac{1}{2} t^2 \\ 0 & 1 & t - \frac{1}{2} \alpha t^2 \\ 0 & 0 & 1 - \alpha t + \frac{1}{2} (\alpha t)^2 \end{bmatrix}
\simeq \begin{bmatrix} 1 & t & \frac{1}{2} t^2 \\ 0 & 1 & t \\ 0 & 0 & 1 - \alpha t \end{bmatrix} .   (4.48)

Similarly, the matrix Q can be approximated by

Q = \int_0^T \Phi(t)\, G A G^T \Phi^T(t)\, dt
\simeq A \int_0^T \begin{bmatrix} \frac{1}{2} t^2 \\ t \\ 1 - \alpha t \end{bmatrix} \begin{bmatrix} \frac{1}{2} t^2 & t & 1 - \alpha t \end{bmatrix} dt
= A \int_0^T \begin{bmatrix} \frac{1}{4} t^4 & \frac{1}{2} t^3 & \frac{1}{2} t^2 (1 - \alpha t) \\ \frac{1}{2} t^3 & t^2 & t (1 - \alpha t) \\ \frac{1}{2} t^2 (1 - \alpha t) & t (1 - \alpha t) & 1 - 2\alpha t + \alpha^2 t^2 \end{bmatrix} dt
\simeq A \begin{bmatrix} \frac{1}{20} T^5 & \frac{1}{8} T^4 & \frac{1}{6} T^3 \\ \frac{1}{8} T^4 & \frac{1}{3} T^3 & \frac{1}{2} T^2 \\ \frac{1}{6} T^3 & \frac{1}{2} T^2 & T (1 - \alpha T) \end{bmatrix} .   (4.49)

In summary, the discrete equivalent system is given by

x(t_{k+1}) = \Phi(t_{k+1} - t_k)\, x(t_k) + w_k = \begin{bmatrix} 1 & T & \frac{1}{2} T^2 \\ 0 & 1 & T \\ 0 & 0 & 1 - \alpha T \end{bmatrix} x(t_k) + w_k ,   (4.50)

where the process noise has the autocorrelation function

R_U[k, n] = \delta(n-k)\, Q   (4.51)

Q = A \begin{bmatrix} \frac{1}{20} T^5 & \frac{1}{8} T^4 & \frac{1}{6} T^3 \\ \frac{1}{8} T^4 & \frac{1}{3} T^3 & \frac{1}{2} T^2 \\ \frac{1}{6} T^3 & \frac{1}{2} T^2 & T (1 - \alpha T) \end{bmatrix} .   (4.52)

This model can be further simplified by letting \alpha = 0. The resulting motion model has a random walk acceleration.
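The quality of the Taylor truncation can be checked numerically. The following sketch (with an illustrative value giving \alpha T = 0.01) compares the exact state transition matrix of Eq. 4.43 with the approximation of Eq. 4.48.

```python
import math

# Sketch (illustrative values, not from the text): for small alpha*T, compare
# the exact state transition matrix of Eq. 4.43 with the truncated
# approximation of Eq. 4.48 used in the discrete model of Eq. 4.50.
alpha, T = 0.5, 0.02                      # alpha * T = 0.01

exact = [
    [1.0, T, (-1.0 + alpha * T + math.exp(-alpha * T)) / alpha ** 2],
    [0.0, 1.0, (1.0 - math.exp(-alpha * T)) / alpha],
    [0.0, 0.0, math.exp(-alpha * T)],
]
approx = [
    [1.0, T, T ** 2 / 2],
    [0.0, 1.0, T],
    [0.0, 0.0, 1.0 - alpha * T],
]

max_err = max(abs(exact[i][j] - approx[i][j]) for i in range(3) for j in range(3))
# max_err is dominated by the dropped second-order terms, of size O((alpha*T)^2).
```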

Chapter 5 Optimization and Control Theory

This chapter reviews important material related to optimal control theory. Section 5.1 introduces terms and concepts from control theory, such as the controllable set. Section 5.2 addresses the topic of constrained optimization and the use of Lagrange multipliers. Section 5.3 shows how parametric optimization can be used to obtain suboptimal controllers for nonlinear systems. Section 5.4 provides a derivation of the minimum principle using a geometrical approach, as opposed to the (perhaps more common) calculus of variations approach. Section 5.5, the last section of this chapter, contains a statement of the Min-Max Principle for continuous-time systems.

5.1 Basic Control Theory Concepts

The discussion that follows is provided to introduce the reader to the notation that is used throughout this chapter, as well as to concepts that the reader may not be aware of, such as that of the controllable set [92]. It is not the purpose of this section to discuss the more common aspects of control theory, such as the definition of a state space or a discussion specifically on linear systems. A rather general class of nonlinear control systems can be described by equations of the form

\dot{x} = f(x, u) ,   (5.1)

where x is an n_x-dimensional state vector, u is an n_u-dimensional control vector and f(\cdot) is assumed to be continuous and continuously differentiable in all of its arguments. There are typically constraints imposed on the control vector u. These are generally in the form of n_h inequalities

u \in \mathcal{U} = \{ u \mid h(x, u) \geq 0 \} .   (5.2a)

A controller u that satisfies Eq. 5.2a and is piecewise differentiable is said to be an admissible control. Although not necessary for all problems, control design is much easier if the control constraints are state independent

u \in \mathcal{U} = \{ u \mid h(u) \geq 0 \} .   (5.2b)

The goal of control theory is to find an admissible controller that, among other things, transfers the system state to a target set that can be described by a set of equalities

\mathcal{X} = \{ x \mid g(x) = 0 \} .   (5.3)

Given a control system (Eq. 5.1) with specified control constraints (Eq. 5.2a), an initial state is said to be controllable to the target (Eq. 5.3) if there exists an admissible control u(x) such that the solution to Eq. 5.1 transfers the initial state to the target in a finite time [92, p. 90]. The set of all controllable initial states is said to be the controllable set. For a given admissible control law u(x), the set of initial points that actually get transferred to the target in finite time is called the domain of effectiveness for u(x) and is a subset of the controllable set.

5.2 Parametric Optimization

This section presents the basic elements of nonlinear parametric optimization. The basic topic of nonlinear optimization is covered in [92, Ch. 3] and in extensive detail in [93]. Suppose we wish to optimize a scalar valued function G of the following form

minimize G(u) ,   (5.4)

subject to a set of inequality constraints of the form

u \in \mathcal{U} = \{ u \mid h(u) \geq 0 \} .   (5.5)

The parameter vector u has dimension n_u and the vector-valued function h(u) has dimension n_h.


Figure 5.1. Hatched region indicating intersection of a spherical ball and the control constraint set.

A ball is a special type of open neighborhood, shown in Figure 5.1, and defined by

\mathcal{B} = \{ u^* + \Delta u \mid \|\Delta u\| < \epsilon \} \quad \text{for any finite } \epsilon > 0 .   (5.6)

The function G (u) has a global minimum at u∗ iff

G(u^*) \leq G(u) \quad \forall\, u \in \mathcal{U} .   (5.7)

The function G (u) has a local minimum at u∗ iff

G(u^*) \leq G(u^* + \Delta u) \quad \text{for all } u^* + \Delta u \in \mathcal{B} \cap \mathcal{U} .   (5.8)

From the definitions of the local and global minima, it follows that if all local minima of G(u) can be found, then the global minimum is among them.

5.2.1 Constraints

The inequality constraint h(u) \geq 0 can also be used to model equality constraints of the form

\psi(u) = 0 .   (5.9)

The equality constraint is implemented as two sets of inequality constraints as follows

\psi(u) \geq 0   (5.10a)

-\psi(u) \geq 0 .   (5.10b)

Satisfying both inequality constraints requires that the equality constraint also be satisfied.

5.2.2 Necessary Conditions for a Local Minimum

The function G(u) forms an (n_u - 1)-dimensional surface in the n_u-dimensional space. As with any such surface, an extremum exists when the gradient to the surface is equal to zero. Whether this extremum is a minimum or a maximum depends on the curvature around the extremum, and therefore on second-order derivatives. If there are no constraints, then a necessary condition for a minimum is a null gradient vector. However, when the parameter vector u is constrained, alternate conditions are necessary. These alternate conditions are best discussed by illustration. Consider the condition shown in Figure 5.2. Figure 5.2a shows a minimizing point u^* that lies on the control constraint set. The distinguishing feature of this point is that the gradient of the cost function is between the two constraint function gradients. As such, the constraint function gradients can be used as a basis for the cost function gradient

\frac{\partial G}{\partial u} = \gamma_1 \frac{\partial h_1}{\partial u} + \gamma_2 \frac{\partial h_2}{\partial u} ,   (5.11)

with \gamma_1 and \gamma_2 greater than zero. Conversely, Figure 5.2b shows a non-minimizing point u that lies on the control constraint set. The distinguishing feature of this point is that the gradient of the cost function is not between the two constraint function gradients. While the constraint function gradients can still be used as a basis for the cost gradient, this will require that some of the \gamma_i be less than zero. The previous discussion involving Figure 5.2 provides a geometrical perspective for the minimization conditions about to be presented. For further discussion on this topic, see [92, Ch. 3], [93, Ch. 1] and [20, Ch. 1].

Figure 5.2. Borrowed with permission from [92, p. 127]. Cost and cost-gradient geometry for minimizing G(u). (a) At a minimizing point u^*. (b) At a non-minimizing point u.

Condition 5.1 (Minimization Necessary Conditions). If u^* is a regular local minimizing point for G(u) subject to the constraints given by Eq. 5.5, then there exists a vector \gamma = \begin{bmatrix} \gamma_1 & \gamma_2 & \cdots & \gamma_{n_h} \end{bmatrix}^T such that

0^T = \left. \frac{\partial L(u, \gamma)}{\partial u} \right|_{u = u^*}   (5.12a)

0 \leq h(u^*)   (5.12b)

0 \leq \gamma   (5.12c)

0 = \gamma^T h(u^*) ,   (5.12d)

where \gamma is called a Lagrange multiplier vector and the Lagrangian function L(\cdot) in Eq. 5.12a is defined as

L(u, \gamma) \triangleq G(u) - \gamma^T h(u) .   (5.13)

The first minimization condition, given by Eq. 5.12a, ensures that the gradient of the cost function G(u) is a linear combination of the constraint gradients. The second minimization condition, given by Eq. 5.12b, ensures that all constraint equations are satisfied. The third and fourth minimization conditions, given by Eqs. 5.12c-5.12d, jointly ensure that only the active constraints play a role and, furthermore, that the gradient of the cost function G(u) lies in the constraint cone, as shown in Figure 5.2.

Example 5.1 (Borrowed from [92, p. 128]). In two dimensions with u = \begin{bmatrix} u_1 & u_2 \end{bmatrix}^T, consider the problem of minimizing

G(u) = \frac{1}{2} \left( u_1^2 + u_2^2 \right) = \frac{1}{2} u^T u ,   (5.14)

subject to the constraint

0 \leq h(u) = u_1 + u_2 - 1 = \mathbf{1}_2^T u - 1 ,   (5.15)

where 1N is an N-dimensional vector with all elements equal to one. The Lagrangian function, given by Eq. 5.13, is

L(u, \gamma) = G(u) - \gamma^T h(u) = \frac{1}{2} u^T u - \gamma \left( \mathbf{1}_2^T u - 1 \right) .   (5.16)

The necessary conditions, given by Eq. 5.12, are

0^T = \frac{\partial L}{\partial u} = u^T - \gamma \mathbf{1}_2^T   (5.17a)

0 \leq \mathbf{1}_2^T u - 1   (5.17b)

0 \leq \gamma   (5.17c)

0 = \gamma \left( \mathbf{1}_2^T u - 1 \right) .   (5.17d)

Suppose that the constraint is inactive; then Eq. 5.17d requires that \gamma = 0. This condition and Eq. 5.17a then require that u = 0. However, u = 0 violates Eq. 5.17b. Therefore, the constraint must be active. Having established that the constraint is active, we still need to solve for three unknowns: \gamma and the two elements of u. There are precisely three equations to enable this; namely, with \gamma > 0, Eqs. 5.17a and 5.17d give

u - \gamma \mathbf{1}_2 = 0   (5.18a)

\mathbf{1}_2^T u - 1 = 0 .   (5.18b)

Using matrix notation,

\begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ \gamma \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} .   (5.19)

Since the determinant is not equal to zero, we are assured a unique solution that can be found by matrix inversion

\begin{bmatrix} u_1 \\ u_2 \\ \gamma \end{bmatrix} = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \\ 1 & 1 & 0 \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 1 & -1 & 1 \\ -1 & 1 & 1 \\ -1 & -1 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1/2 \\ 1/2 \\ 1/2 \end{bmatrix} .   (5.20)
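The solution of Example 5.1 can be checked numerically. The sketch below (not from the text) verifies that u = (1/2, 1/2) with multiplier \gamma = 1/2 satisfies the conditions of Eq. 5.17 and that no randomly sampled feasible point achieves a lower cost.

```python
import random

# Sketch: numerical check of the KKT point of Example 5.1, namely
# u = (1/2, 1/2) and gamma = 1/2 for min (u1^2 + u2^2)/2 s.t. u1 + u2 - 1 >= 0.
random.seed(3)

u1 = u2 = gamma = 0.5

# Necessary conditions, Eq. 5.17a-5.17d.
assert abs(u1 - gamma) < 1e-12 and abs(u2 - gamma) < 1e-12  # dL/du = u - gamma*1 = 0
assert u1 + u2 - 1 >= -1e-12                                # h(u) >= 0
assert gamma >= 0                                           # gamma >= 0
assert abs(gamma * (u1 + u2 - 1)) < 1e-12                   # complementary slackness

G_star = 0.5 * (u1 ** 2 + u2 ** 2)                          # cost at the KKT point
for _ in range(10_000):
    v1 = random.uniform(-2, 3)
    v2 = random.uniform(-2, 3)
    if v1 + v2 - 1 >= 0:                                    # feasible sample
        assert 0.5 * (v1 ** 2 + v2 ** 2) >= G_star - 1e-12
```

Since (u_1^2 + u_2^2) \geq (u_1 + u_2)^2 / 2 \geq 1/2 on the feasible set, the sampled check agrees with the analytic minimum G(u^*) = 1/4.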

5.3 Lyapunov Control Theory

One of the many uses of parametric optimization is in the design of Lyapunov control systems. A closely related subject is Lyapunov stability theory, which is discussed in detail in [92, Ch. 4]. However, this section will focus only on Lyapunov control systems; the main reason for this is that it is undesirable, or at least unnecessary, to design a missile so that the resulting engagement is Lyapunov stable (simply hitting or even coming close to the target will do). To begin the discussion of Lyapunov control system design, we introduce the concept of a descent function. A function W(x) is a descent function for the target set \mathcal{X} if the following conditions hold at points that are controllable to the target [92, p. 276]:

1. W (x) is continuous and continuously differentiable outside the target set and on the boundary of the target set.

2. The regions W(x) \leq c are nested; that is, for c_1 < c_2,

\{ x \mid W(x) \leq c_1 \} \subset \{ x \mid W(x) \leq c_2 \} .   (5.21)

3. The target is contained in one of the W(x) \leq c regions.

4. The regions W(x) \leq c are bounded.

The last restriction is only needed if the target set \mathcal{X} is bounded. The descent function W(x) is a measure, albeit somewhat arbitrary, of distance to the target. For example, the radial distance to the target is an acceptable descent function

W(x) = \|x\| = \sqrt{x^T x} .   (5.22)

However, the following descent function will accomplish the same thing and is easier to work with:

W(x) = \frac{1}{2} x^T x .   (5.23)


Figure 5.3. Geometry for steepest descent control us (green) and quickest descent control uq (red).

Since the regions of W(x) are nested according to Eq. 5.21 and the target must be contained in one of the W(x) \leq c regions, the descent function indicates a preferred direction of motion. Specifically, one method of arriving at the target is to seek a controller that will cause W(x) to continuously decrease until the target is reached. There are at least two methods of achieving this goal: steepest descent control and quickest descent control. Both are illustrated in Figure 5.3. The (blue) curve labeled f(x, \mathcal{U}) in Figure 5.3 represents the set of all possible state derivatives that result from all possible admissible controls. The steepest descent controller, labeled u_s in Figure 5.3, is such that the angle between the state derivative and the descent function gradient is maximized. The quickest descent controller, labeled u_q in Figure 5.3, is such that the projection of f(x, \mathcal{U}) onto the descent function gradient is maximized (in the direction of decreasing W). Steepest descent control, since it makes use of the magnitude of f(x, \mathcal{U}), is more difficult to realize than quickest descent control. The interested reader is referred to [92, Ch. 5] for further discussion of steepest descent control.

5.3.1 Quickest Descent Control

A quickest descent controller attempts to decrease a descent function as quickly as possible. More specifically, a feedback control law uq (x) is a Lyapunov optimizing control for a specified descent function W (x) if

\dot{W}(x, u_q) \leq \dot{W}(x, u) \quad \text{for all } u \in \mathcal{U} ,   (5.24)

where

\dot{W}(x, u) = \frac{\partial W}{\partial x} f(x, u) .   (5.25)

Example 5.2 (Zermelo’s Problem). Zermelo’s problem is well known, but this par- ticular example has been adapted from [92, p. 283]. In Zermelo’s problem, there is an object (namely a person swimming in a river) that has the following dynamics

\dot{x}_1 = -2 + \cos u   (5.26a)

\dot{x}_2 = \sin u .   (5.26b)

Note the effective way in which these dynamics limit the control authority of the object (swimmer), without directly imposing control constraints. Were this not done, we would need to model the system as

\dot{x}_1 = -2 + u_1   (5.27a)

\dot{x}_2 = u_2 ,   (5.27b)

subject to the constraint

u_1^2 + u_2^2 = 1 .   (5.28)

Although this model has linear state equations, they come at the cost of a nonlinear control constraint. For this problem, we define the target set \mathcal{X} as

\mathcal{X} = \left\{ x \mid x^T x - 1 = 0 \right\} ,   (5.29)

which is a circle about the origin. The form of the terminal set suggests a Lyapunov function of the form

W(x) = \frac{1}{2} x^T x .   (5.30)

The derivative of the descent function is

\dot{W}(x, u) = \frac{\partial W}{\partial x} f(x, u) = x^T f(x, u) = x_1 \left( -2 + \cos u \right) + x_2 \sin u .   (5.31)

Since there are no restrictions on u, the necessary conditions for a minimum are obtained by simply setting the derivative of \dot{W} with respect to u equal to zero

0 = \frac{\partial \dot{W}}{\partial u} = -x_1 \sin u + x_2 \cos u ,   (5.32)

or

\tan u = \frac{x_2}{x_1} .   (5.33)

Any controller of the form

u = n\pi + \arctan\left( \frac{x_2}{x_1} \right) \quad \text{for integer } n ,   (5.34)

will satisfy the necessary conditions for a minimum. However, if the control u is to result in a minimum of \dot{W}, then it is also necessary that the second derivative be greater than zero

-x_1 \cos u - x_2 \sin u \geq 0 .   (5.35)

Substituting in the necessary condition for a minimum gives

-x_1 \cos u - x_2 \sin u = -\frac{x_1^2}{x_2} \sin u - x_2 \sin u = -\left( x_1^2 + x_2^2 \right) \frac{\sin u}{x_2} .   (5.36)

For this to be greater than zero, the control u must have a vertical component that is opposite in direction to the vertical position component x_2. Thus, we see that n must be odd, resulting in the quickest descent control

u = \pi + \arctan\left( \frac{x_2}{x_1} \right) .   (5.37)

Several trajectories for this problem are shown in Figure 5.4. The (blue) circle in Figure 5.4 represents the target set for Zermelo's problem. The (red) straight lines emanating from the circle show the boundary of the controllable set for Zermelo's problem. The series of (black) trajectories interior to the controllable set boundaries in Figure 5.4 are a result of using the quickest descent controller from starting points inside this controller's domain of effectiveness. The outermost such trajectory is on the boundary of the domain of effectiveness for the quickest descent controller. As can be seen from Figure 5.4, the domain of effectiveness for the quickest descent controller is a subset of the controllable set.¹
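The quickest descent law of Eq. 5.37 can be exercised numerically. The sketch below (the initial state and step size are illustrative choices, the initial state assumed to lie in the controller's domain of effectiveness) integrates the dynamics of Eq. 5.26 with a forward-Euler step and confirms that the trajectory reaches the target circle of Eq. 5.29.

```python
import math

# Sketch: simulate Zermelo's problem (Eq. 5.26) under the quickest descent
# control u = pi + arctan(x2/x1) of Eq. 5.37.  atan2 is used in place of
# arctan(x2/x1) so that x1 = 0 is handled safely (equivalent for x1 > 0).
x1, x2 = 1.5, 0.3                          # illustrative initial state
dt = 1e-3
reached = False
for _ in range(5000):
    u = math.pi + math.atan2(x2, x1)
    x1 += (-2.0 + math.cos(u)) * dt        # Eq. 5.26a
    x2 += math.sin(u) * dt                 # Eq. 5.26b
    if x1 * x1 + x2 * x2 <= 1.0:           # target set of Eq. 5.29 reached
        reached = True
        break
```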

5.3.2 Quickest Descent with Minimum Incremental Cost

Suppose that the controller is penalized according to an integral cost function of the form

J(t) = \int_0^t f_0(x, u)\, dt .   (5.38)

This means that the function f_0(\cdot), most likely positive definite, is integrated over time and the resulting value is assigned as a figure of merit for the controller, with a lower value indicating better performance. If the control u is chosen such that the function f_0(\cdot) is minimized at each time step, then the integral cost will likely be close to a minimum. However, we cannot be assured of this because this method has no foresight. That is, a short-term gain may be chosen that results in a long-term loss. This is contrary to optimal control theory, which will (correctly) prefer a

¹Incidentally, the domain of effectiveness for a steepest descent controller applied to this problem equals the controllable set!


Figure 5.4. Quickest descent trajectories (black) and controllable set boundary for Zermelo's problem.

short-term loss if it merits a long-term gain. Nevertheless, we will proceed, largely because the design of Lyapunov controllers is often far more tractable than the design of optimal controllers. The inclusion of the integral cost function is simple enough in a Lyapunov design. We simply append the integrand of the cost function to the descent function that is to be minimized. To this end, we define x_0 as

\dot{x}_0 = f_0(x, u) ,   (5.39)

such that

x_0(t) = J(t) = \int_0^t f_0(x, u)\, dt .   (5.40)

Then, the augmented descent function becomes

W_0(x_0, x) = x_0 + W(x) .   (5.41)

Taking the derivative gives

\dot{W}_0(x_0, x) = \dot{x}_0 + \dot{W}(x) = f_0(x, u) + \frac{\partial W}{\partial x} f(x, u) .   (5.42)

A control u(x) \in \mathcal{U} is a minimum-cost descent control if, at each state x in the controllable set \mathcal{C} to the target set \mathcal{X}, the control minimizes \dot{W}_0.

5.4 Optimal Control Theory

This section develops the basic equations of optimal control theory. The idea of a cost function is central to optimal control theory. In fact, no claim about the optimality of a controller can be made without also stating the cost function for which the controller is optimal. The goal of optimal control theory is to find a control law u (t) to minimize a performance index given by

J = \psi(x(t_f), t_f) + \int_t^{t_f} f_0(x, u)\, dt ,   (5.43)

subject to terminal conditions given by Eq. 5.3 and the control constraints given by Eq. 5.2a. There are two basic techniques that can be used to develop the optimal control equations known as the optimal control minimum principle. The first of these two techniques makes use of a mathematical theory known as the calculus of variations [47]. The calculus of variations is essentially a technique for working with functions of functions. In this technique the cost function, given by Eq. 5.43, is a function of the control law u(x), and the calculus of variations seeks to determine a function u(x) that will minimize the cost function. It should be clear that minimizing over a field of functions is distinctly different from the parametric optimization problem discussed in Section 5.2, where the minimization takes place over the field of real numbers. The other approach to optimal control rests on simple geometrical arguments that make it possible to avoid the calculus of variations altogether and make use of only parametric optimization theory. It is the latter approach that will be presented here because it is more intuitive and does not require a background in the calculus of variations. The reader interested in the calculus of variations approach can consult either [20] or [47].

5.4.1 Optimal Return Function

The optimal return function is an important concept in optimal control theory. For a given value of tf , optimal or otherwise, the total cost incurred as a result of an admissible control u (x) is given by

V(x(0), u(x)) = \psi(x(t_f), t_f) + \int_0^{t_f} f_0(x, u)\, dt .   (5.44)

An admissible control u∗ (x) is an optimal control for V at x (t) if and only if

V(x(0), u^*(x)) \leq V(x(0), u(x)) .   (5.45)

The optimal control law u^*(x) must be optimal for all possible initial conditions x(0). Otherwise, an initial condition could be selected such that another u(x) would yield a smaller value of the cost function than u^*(x), which is contrary to the definition of u^*(x). This concept, which is known as the principle of optimality, essentially states that past decisions (i.e., those that brought the system to the state x(0)) should not affect the current decision, because optimal control theory is (appropriately) concerned with what can be gained and not with what might have already been lost. It follows from this fact that the optimal return function can be thought of as a function of the state x only

V^*(x) = \psi(x(t_f)) + \int_0^{t_f} f_0(x, u)\, dt .   (5.46)

The function V^*(x) is referred to as the optimal return function.

5.4.2 The Augmented State Vector and the Augmented State Space

To begin the development of the minimum principle, we form the augmented state vector x X = 0 , (5.47) x ∙ ¸ with dynamics given by X˙ = F (X, u) , (5.48) where x˙ f (x, u) F (X, u)= 0 = 0 . (5.49) x˙ f (x, u) ∙ ¸ ∙ ¸ Now,supposeanoptimalcontrolisappliedfromaninitialstatexi as shown in Figure

5.5. The resulting augmented trajectory starts at Xi

X_i = \begin{bmatrix} 0 \\ x_i \end{bmatrix} ,   (5.50)

and eventually reaches the point X(t) shown in Figure 5.5. At the point denoted by X(t), a sub-optimal control would result in the trajectory departing from the optimal


Figure 5.5. A \Sigma surface in augmented state space.

trajectory in the direction of increasing x_0. No control exists that would result in a trajectory that penetrates below the optimal trajectory in the direction of decreasing x_0. If this were not true, then the trajectory shown would not be optimal, because a future state x(t + \Delta) could be reached that has a smaller value of x_0 than does the optimal trajectory shown at time t + \Delta. The curve labeled \mathcal{S} in Figure 5.5 passes through the set of states that all have the same minimum cost-to-go to the terminal set \mathcal{X}. The surface that results from all trajectories in \mathcal{S} is denoted by \Sigma in Figure 5.5. The surface \Sigma is a semipermeable surface and no trajectory in the augmented state space can penetrate \Sigma from above (higher cost) to below (lower cost). Let N denote the outward normal of the \Sigma surface. Then the \Sigma surface has the property that, for any X \in \Sigma,

0 = N^T F(X, u^*) \leq N^T F(X, u) ,   (5.51)

for any admissible control u. The goal now is to describe the outward normal by a set of equations. To this end, we define the augmented adjoint vector

\Lambda = \begin{bmatrix} \lambda_0 \\ \lambda \end{bmatrix} ,   (5.52)

such that

\Lambda^T F = 0 .   (5.53)

Taking the time derivative gives

\dot{\Lambda}^T F + \Lambda^T \dot{F} = \left( \dot{\Lambda}^T + \Lambda^T \frac{dF}{dX} \right) F = 0 .   (5.54)

Solving for \dot{\Lambda} gives

\dot{\Lambda}^T = -\Lambda^T \frac{dF}{dX} .   (5.55)

The function F is not a function of x_0, so this result may be simplified:

\dot{\lambda}_0 = 0 ,   (5.56)

and

\dot{\lambda}^T = -\Lambda^T \frac{dF}{dx} = -\Lambda^T \left( \frac{\partial F}{\partial x} + \frac{\partial F}{\partial u} \frac{\partial u}{\partial x} \right) .   (5.57)

5.4.3 Temporal Boundary Conditions

Now that the equations describing the dynamics of an outward normal to Σ have been found, we seek a set of temporal boundary conditions for Λ. To that end, we note that the Σ surface can be described by the equation

x_0 + V^*(x) = \text{final cost} .   (5.58)

This equation must be satisfied at any point X = \begin{bmatrix} x_0 & x \end{bmatrix}^T on the \Sigma surface. Because this function is equal to a constant, the surface is an equipotential surface, and the outward normal can easily be found. The gradient of Eq. 5.58 gives the outward normal to the \Sigma surface

\text{grad}\left[ x_0 + V^*(x) \right] = \begin{bmatrix} 1 & \frac{\partial V^*}{\partial x_1} & \cdots & \frac{\partial V^*}{\partial x_{n_x}} \end{bmatrix}^T .   (5.59)

Since the vector Λ is also normal to this surface

\Lambda = k\, \text{grad}\left[ x_0 + V^*(x) \right] .   (5.60)

In general, we choose k = 1 and the adjoint equations are given by

\lambda_0 = 1 ,   (5.61)

and

\lambda^T = \frac{\partial V^*}{\partial x} .   (5.62)

This equation allows us to determine the final-time values of \lambda

\lambda^T(t_f) = \left. \frac{\partial V^*}{\partial x} \right|_{t_f} = \left. \frac{\partial \psi}{\partial x} \right|_{t_f} ,   (5.63)

where \psi is the terminal penalty given by Eqs. 5.43 and 5.46. However, this does not address the possibility of a terminal constraint of the type given by Eq. 5.3. To handle this type of constraint, we make use of Lagrange multipliers \rho and rewrite Eq. 5.58 as

x_0 + V^*(x) + \rho^T g(x) = \text{final cost} ,   (5.64)

subject to

\rho^T g(x) = 0 .   (5.65)
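As a concrete illustration (a hypothetical example, not taken from the text), suppose the terminal penalty is a quadratic miss distance. Then Eq. 5.63 gives the terminal adjoint directly:

```latex
% Hypothetical example: quadratic terminal penalty, no terminal constraint.
\psi\bigl(x(t_f), t_f\bigr) = \tfrac{1}{2}\, x^T(t_f)\, P\, x(t_f),
  \qquad P = P^T \ \text{(assumed symmetric weighting matrix)} .
% Applying Eq. 5.63:
\lambda^T(t_f) = \left.\frac{\partial \psi}{\partial x}\right|_{t_f}
  = x^T(t_f)\, P .
```

The adjoint is then integrated backward in time from \lambda(t_f) = P\, x(t_f) using Eq. 5.57.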

Because of this, we still have

\Lambda = k\, \text{grad}\left[ x_0 + V^*(x) + \rho^T g(x) \right] .   (5.66)

With k = 1, these equations simplify to

\lambda_0 = 1 ,   (5.67)

and

\lambda^T = \frac{\partial V^*}{\partial x} + \rho^T \frac{\partial g(x)}{\partial x} .   (5.68)

Note that this results in n_g more unknowns because of the vector \rho. However, on the target set we also have the n_g equations obtained directly from the target set requirement

g(x(t_f)) = 0 .   (5.69)

Thus, the final-time value of λ is given by

\lambda^T(t_f) = \left. \left( \frac{\partial V^*}{\partial x} + \rho^T \frac{\partial g}{\partial x} \right) \right|_{t_f} = \left. \left( \frac{\partial \psi}{\partial x} + \rho^T \frac{\partial g}{\partial x} \right) \right|_{t_f} .   (5.70)

5.4.4 The Optimal Control H Function

To facilitate further development, we define the optimal control H function as follows

H(x, u, \lambda, \lambda_0) \triangleq \Lambda^T F .  (5.71)

We know from Section 5.4.2 that an optimal control u^* will result in a value of H = 0. Furthermore, because Σ is a semipermeable surface, any sub-optimal control will result in a vector F that has a larger component in the x_0 direction than one obtained with an optimal control. For this reason, the minimum value of H is the one obtained only by an optimal control:

\min_{u \in \mathcal{U}} H(x, u, \lambda, \lambda_0) = H(x, u^*, \lambda, \lambda_0) = 0 .  (5.72)

In terms of the H function, the adjoint dynamics of Eq. 5.57 become

\dot{\lambda}^T = -\Lambda^T \frac{dF}{dx} = -\frac{\partial H}{\partial x} - \frac{\partial H}{\partial u} \frac{\partial u}{\partial x} .  (5.73)

The necessary conditions for minimization, given in Section 5.2, can be used to find the control u^* that will minimize the H function according to Eq. 5.72. The necessary conditions for u^* to be a minimum are given by the following equations:

0^T = \left. \frac{\partial L(u, \gamma)}{\partial u} \right|_{u = u^*}  (5.74a)
0 \le h(u^*)  (5.74b)
0 \le \gamma  (5.74c)
0 = \gamma^T h(u^*) ,  (5.74d)

where γ is a Lagrange multiplier vector and the Lagrangian function L(\cdot) in Eq. 5.12a is defined as

L(x, u, \lambda, \lambda_0, \gamma) \triangleq H(x, u, \lambda, \lambda_0) - \gamma^T h(u) .  (5.75)
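As a concrete numerical illustration (not from the original text), the conditions of Eqs. 5.74a through 5.74d can be verified on a scalar problem. The quadratic H function and the constraint bound below are hypothetical choices made for this sketch:

```python
import numpy as np

# Hypothetical scalar example: H(u) = 0.5*u^2 + 2*u, constraint h(u) = 1 - u^2 >= 0.
def H(u):  return 0.5 * u**2 + 2.0 * u
def dH(u): return u + 2.0
def h(u):  return 1.0 - u**2
def dh(u): return -2.0 * u

# The unconstrained minimizer u = -2 violates h(u) >= 0,
# so the constrained minimizer sits on the boundary u* = -1.
u_star = -1.0

# Recover the multiplier from stationarity of L = H - gamma*h (Eq. 5.74a):
#   dH/du - gamma * dh/du = 0  =>  gamma = (dH/du) / (dh/du)
gamma = dH(u_star) / dh(u_star)

print(gamma)              # 0.5  (gamma >= 0, Eq. 5.74c)
print(h(u_star))          # 0.0  (constraint active, Eq. 5.74b)
print(gamma * h(u_star))  # 0.0  (complementary slackness, Eq. 5.74d)
```

A grid search over the admissible interval confirms that u* = −1 is indeed the constrained minimizer of this H.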

5.4.5 State Independent Control Constraints

The goal of this section is to show that Eq. 5.57 can be simplified if the control constraints are of the form specified by Eq. 5.2b. We begin this discussion with the requirement given by Eq. 5.74d

\gamma^T h(u^*) = 0 .  (5.76)

Figure 5.6 shows a plot of a control constraint in two dimensions, but the same concept carries over to higher dimensions. When the control constraint function is zero for a non-zero interval, then so is its derivative. The only time that γ can be greater than zero is when the control constraint is zero. If a control constraint h_i is instantaneously zero and then increases again, the only time that γ_i is possibly non-zero is the instant when h_i was zero, and so

\gamma_i \frac{dh_i}{dx} = 0 .  (5.77)

Similarly, if a control constraint h_i is zero for a non-zero time interval, then γ_i may be greater than zero, but still the derivative of h_i is equal to zero and so we still have

\gamma_i \frac{dh_i}{dx} = 0 .  (5.78)

Figure 5.6. Plot of control constraint function and its derivative. (The figure shows h(x) and dh/dx versus x, with γ = 0 in the regions where h > 0 and γ ≥ 0 in the region where h = 0.)

Therefore, in all cases we have

\gamma^T \frac{dh}{dx} = \gamma^T \left[ \frac{\partial h}{\partial x} + \frac{\partial h}{\partial u} \frac{\partial u}{\partial x} \right] = 0 .  (5.79)

If h is not a function of the state x, then this condition reduces to

\gamma^T \frac{\partial h}{\partial u} \frac{\partial u}{\partial x} = 0 .  (5.80)

By Eq. 5.74a we have

\frac{\partial L(u, \gamma)}{\partial u} = \frac{\partial H}{\partial u} - \gamma^T \frac{\partial h}{\partial u} = 0^T .  (5.81)

Multiplying by ∂u/∂x gives

\frac{\partial H}{\partial u} \frac{\partial u}{\partial x} - \gamma^T \frac{\partial h}{\partial u} \frac{\partial u}{\partial x} = 0^T .  (5.82)

Substituting in Eq. 5.80,

\frac{\partial H}{\partial u} \frac{\partial u}{\partial x} = 0^T .  (5.83)

Substituting this result into Eq. 5.73 gives

\dot{\lambda}^T = -\frac{\partial H}{\partial x} .  (5.84)

5.4.6 The Optimal Control Minimum Principle

The preceding sections have developed the theory used in the optimal control minimum principle.

Theorem 5.2. Given a set of differential equations

\dot{x} = f(x, u) ,  (5.85)

and control constraints

\mathcal{U} = \{ u \mid h(u) \ge 0 \} .  (5.86)

If u^*(x) \in \mathcal{U} is an optimal control for the cost function

J = \psi(x(t_f)) + \int_t^{t_f} f_0(x, u) \, dt ,  (5.87)

then there exists a continuous and piecewise differentiable vector function λ(t) and a constant (Lagrange) multiplier vector ρ such that λ(t) satisfies the adjoint equations

\dot{\lambda}^T = -\frac{\partial H}{\partial x} ,  (5.88)

such that, at the terminal set defined by

g (x)=0 , (5.89)

λ (tf ) satisfies the transversality conditions

\lambda(t_f) = \left. \frac{\partial \theta}{\partial x} \right|_{t_f} ,  (5.90)

and such that H takes on a global minimum value with respect to u \in \mathcal{U} at every point x along the trajectory x(t) generated by u^*(x). Furthermore, the minimum value of H at every such point is zero:

\min_{u \in \mathcal{U}} H(x, u, \lambda, \lambda_0) = H(x, u^*, \lambda, \lambda_0) = 0 ,  (5.91)

where

H(x, u, \lambda, \lambda_0) = \lambda_0 f_0(x, u) + \lambda^T f(x, u) ,  (5.92)

and

\theta = \psi(x) + \rho^T g(x) .  (5.93)

The proof of this theorem is simply a collection of the results from the previous sections on optimal control theory.

5.4.7 Linear Quadratic Regulator (LQR)

The linear quadratic regulator (LQR) is an important topic in control theory. In the LQR problem, the state equations are described by a linear system of the form

x˙ = Fx + Bu . (5.94)

The cost index makes use of quadratic functions:

J = \left. \tfrac{1}{2} x^T S_f x \right|_{t_f} + \int_0^{t_f} \tfrac{1}{2} \left( x^T Q x + u^T R u \right) dt ,  (5.95)

where without loss of generality we can assume that the matrices S_f, Q, and R are symmetric. This problem can be solved by application of the optimal control minimum principle, which is summarized in Theorem 5.2. The H function is given by

H = \tfrac{1}{2} x^T Q x + \tfrac{1}{2} u^T R u + \lambda^T (Fx + Bu) .  (5.96)

The time derivative of the adjoint vector is given by

\dot{\lambda}^T = -\frac{\partial H}{\partial x} = -x^T Q - \lambda^T F .  (5.97)

Since there are no control constraints, the necessary condition for minimizing the H function is obtained by simply setting its derivative equal to zero:

\frac{\partial H}{\partial u} = u^T R + \lambda^T B = 0 .  (5.98)

Solving for the control gives

u = -R^{-1} B^T \lambda .  (5.99)

With

\theta = \tfrac{1}{2} x^T S_f x ,  (5.100)

the terminal conditions for λ are

\lambda(t_f) = \left[ \left. \frac{\partial \theta}{\partial x} \right|_{t_f} \right]^T = S_f \, x(t_f) .  (5.101)

The system of equations in λ and x is linear and homogeneous:

\begin{bmatrix} \dot{x} \\ \dot{\lambda} \end{bmatrix} = \begin{bmatrix} F & -B R^{-1} B^T \\ -Q & -F^T \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix} .  (5.102)

The final-time values for both x and λ are

\begin{bmatrix} x(t_f) \\ \lambda(t_f) \end{bmatrix} = \begin{bmatrix} x(t_f) \\ S_f \, x(t_f) \end{bmatrix} = \begin{bmatrix} I_{n_x} \\ S_f \end{bmatrix} x(t_f) .  (5.103)

Since the overall system is homogeneous, the final-time and initial-time values are related by

\begin{bmatrix} x(t) \\ \lambda(t) \end{bmatrix} = \Phi(t - t_f) \begin{bmatrix} x(t_f) \\ \lambda(t_f) \end{bmatrix} = \Phi(t - t_f) \begin{bmatrix} I_{n_x} \\ S_f \end{bmatrix} x(t_f) = \begin{bmatrix} \Phi_x(t - t_f) \\ \Phi_\lambda(t - t_f) \end{bmatrix} \begin{bmatrix} I_{n_x} \\ S_f \end{bmatrix} x(t_f) ,  (5.104)

where Φ is the state transition matrix of the linear, time-invariant system. We can see a relationship between x(t) and λ(t):

\lambda(t) = \Phi_\lambda(t - t_f) \begin{bmatrix} I_{n_x} \\ S_f \end{bmatrix} x(t_f)
= \Phi_\lambda(t - t_f) \begin{bmatrix} I_{n_x} \\ S_f \end{bmatrix} \left( \Phi_x(t - t_f) \begin{bmatrix} I_{n_x} \\ S_f \end{bmatrix} \right)^{-1} \Phi_x(t - t_f) \begin{bmatrix} I_{n_x} \\ S_f \end{bmatrix} x(t_f)
= \Phi_\lambda(t - t_f) \begin{bmatrix} I_{n_x} \\ S_f \end{bmatrix} \left( \Phi_x(t - t_f) \begin{bmatrix} I_{n_x} \\ S_f \end{bmatrix} \right)^{-1} x(t)
= S^T(t) \, x(t) .  (5.105)

This does not require that the inverse of Φ_x exist, because a generalized inverse would also have worked if Φ_x did not have full rank. The point is that there exists a matrix S(t) such that

\lambda(t) = S^T(t) \, x(t) .  (5.106)

Taking the time derivative of this equation gives

\dot{\lambda} = \dot{S}^T x + S^T \dot{x}
= \dot{S}^T x + S^T \left[ Fx - B R^{-1} B^T \lambda \right]
= \dot{S}^T x + S^T \left[ Fx - B R^{-1} B^T S^T x \right] .  (5.107)

Equating this to Eq. 5.97 gives

T T 1 T T T S˙ x + S Fx BR− B S x = Qx F λ − − − £ ¤ = Qx FT Sx . (5.108) − − Rearranging gives

\left[ \dot{S}^T + S^T F - S^T B R^{-1} B^T S^T + Q + F^T S^T \right] x = 0 .  (5.109)

Since this must be true for all x we must have

\dot{S}^T + S^T F - S^T B R^{-1} B^T S^T + Q + F^T S^T = 0 .  (5.110)

This is the matrix Riccati equation

\dot{S}^T = S^T B R^{-1} B^T S^T - Q - F^T S^T - S^T F ,  (5.111)

which can be solved numerically with the boundary condition

S (tf )=Sf . (5.112)
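As a numerical aside (not part of the original text), the transition-matrix construction of Eqs. 5.102 through 5.105 can be checked directly: build the Hamiltonian matrix, partition its matrix exponential, and form S(t). The system matrices below are the ones used in Example 5.3 later in this section; SciPy's `solve_continuous_are` supplies the steady-state solution for comparison, which the finite-horizon S(0) should approach over a long horizon.

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_are

# System matrices from Example 5.3 (illustrative values).
F = np.array([[0.0, 1.0], [-2.0, 3.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
Sf = np.zeros((2, 2))
tf = 5.0

# Hamiltonian matrix of Eq. 5.102.
M = np.block([[F, -B @ np.linalg.inv(R) @ B.T],
              [-Q, -F.T]])

# Transition matrix Phi(t - tf) evaluated at t = 0.
Phi = expm(M * (0.0 - tf))
top = Phi[:2, :] @ np.vstack([np.eye(2), Sf])   # Phi_x [I; Sf]
bot = Phi[2:, :] @ np.vstack([np.eye(2), Sf])   # Phi_lambda [I; Sf]

# Eq. 5.105: S^T(t) = (Phi_lambda [I; Sf]) (Phi_x [I; Sf])^{-1}.
S0 = (bot @ np.linalg.inv(top)).T

# Over a long horizon this approaches the algebraic-Riccati solution.
S_inf = solve_continuous_are(F, B, Q, R)
print(np.abs(S0 - S_inf).max())   # small: tf = 5 is effectively a long horizon
```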

Since the solution of the Riccati equation is symmetric, we may also write it as

\dot{S} = S B R^{-1} B^T S - Q - F^T S - S F .  (5.113)

Alternately, the steady-state solution may be obtained by setting \dot{S} = 0, yielding the algebraic Riccati equation

S F - S B R^{-1} B^T S + Q + F^T S = 0 .  (5.114)

Example 5.3. For example [92, Ex. 2.3-3, p. 106], if

F = \begin{bmatrix} 0 & 1 \\ -2 & 3 \end{bmatrix} , \quad B = \begin{bmatrix} 0 \\ 1 \end{bmatrix} ,  (5.115)

and

J = \int_0^T \left( \tfrac{1}{2} x_1^2 + \tfrac{1}{2} x_2^2 + \tfrac{1}{2} u^2 \right) dt ,  (5.116a)

Q = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} , \quad R = \begin{bmatrix} 1 \end{bmatrix} .  (5.116b)

The eigenvalues of F are (1, 2). MATLAB's lqr command can be used to solve the algebraic Riccati equation:

[K, S, E]=lqr(F, B, Q, R) (5.117a)

K = \begin{bmatrix} 0.2361 & 6.2361 \end{bmatrix}  (5.117b)

S = \begin{bmatrix} 13.2361 & 0.2361 \\ 0.2361 & 6.2361 \end{bmatrix}  (5.117c)

E = \begin{bmatrix} -1 & -2.2361 \end{bmatrix}^T ,  (5.117d)

where S is the solution to the Riccati equation and E contains the eigenvalues of the closed-loop system. The time-varying gains were computed by solving the Riccati equation

\dot{S}(t) = -Q + S(t) B R^{-1} B^T S(t) - S(t) F - F^T S(t)
= -\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + S(t) \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} S(t) - S(t) \begin{bmatrix} 0 & 1 \\ -2 & 3 \end{bmatrix} - \begin{bmatrix} 0 & -2 \\ 1 & 3 \end{bmatrix} S(t) ,  (5.118)

with

S(t_f) = S_f = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} .  (5.119)

[Plot: controller gains K_1 and K_2 versus time (seconds).]

Figure 5.7. Optimal control gains as a function of time.

The system of equations was easily solved using MATLAB. Once S is obtained, the closed-loop control is given by

u = -K(t) \, x ,  (5.120)

K(t) = R^{-1} B^T S(t) .  (5.121)
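The steady-state values quoted above can also be reproduced outside MATLAB; this sketch (not from the original) uses SciPy's `solve_continuous_are` in place of the lqr command:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

F = np.array([[0.0, 1.0], [-2.0, 3.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

# Solve the algebraic Riccati equation F'S + SF - S B R^{-1} B' S + Q = 0.
S = solve_continuous_are(F, B, Q, R)
K = np.linalg.inv(R) @ B.T @ S        # steady-state feedback gain
E = np.linalg.eigvals(F - B @ K)      # closed-loop eigenvalues

print(np.round(K, 4))                 # [[0.2361 6.2361]]
print(np.round(np.sort(E.real), 4))
```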

The results are shown in Figure 5.7. The fact that t_f was chosen equal to five seconds was arbitrary for this problem. The steady-state gains are shown at t = 0 to be consistent with those found by solving the algebraic Riccati equation. Let's assume the actual system's control is bounded such that u^2 \le 1. It is desired to obtain the controllable set for the system. If a point is on the boundary of the controllable set, it will either remain on the boundary or leave the controllable set. The state of the system can only remain on the boundary if an optimal control is used. On the boundary, it is obvious that the control will be saturated. Thus, to sketch the boundary of the controllable set, one must simply start from some point on the boundary and then use retro-time integration with u = \pm 1. The starting points are the equilibrium


Figure 5.8. Controllable set and LQR domain of attraction for system with bounded control.

points on the boundary, which are found by letting \dot{x} = 0:

(\dot{x} = 0, \; u = 1) \;\Rightarrow\; x = \begin{bmatrix} \tfrac{1}{2} \\ 0 \end{bmatrix}  (5.122)

(\dot{x} = 0, \; u = -1) \;\Rightarrow\; x = \begin{bmatrix} -\tfrac{1}{2} \\ 0 \end{bmatrix} .  (5.123)

The boundary of the controllable set was computed and is shown in Figure 5.8. The domain of attraction is the set of all states that asymptotically approach the origin. It may be obtained in a similar manner to the controllable set. Essentially, any trajectory that is in the domain of attraction for the system must have been in the domain for all time up to the current time. If one integrates the system of equations backward in time from the origin using the LQR controller, the trajectory will approach the boundary of the domain of attraction, as shown in Figure 5.8.
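The retro-time construction described above can be sketched numerically. The script below is illustrative only and not from the original; in particular, starting at one saturated equilibrium and retro-integrating with the opposite control extreme is this sketch's reading of the procedure:

```python
import numpy as np
from scipy.integrate import solve_ivp

F = np.array([[0.0, 1.0], [-2.0, 3.0]])
B = np.array([[0.0], [1.0]])

def retro(t, x, u):
    # Retro-time integration: run x' = Fx + Bu backward in time.
    return -(F @ x + B.ravel() * u)

# Equilibrium points of Eqs. 5.122 and 5.123: x = [u/2, 0] for u = +/-1.
x_plus  = np.array([ 0.5, 0.0])
x_minus = np.array([-0.5, 0.0])
assert np.allclose(F @ x_plus + B.ravel() * 1.0, 0.0)   # verify equilibrium

# Trace one boundary arc: start near the u = +1 equilibrium and
# retro-integrate with the opposite saturated control u = -1.
x0 = x_plus + np.array([0.0, 1e-4])
sol = solve_ivp(retro, (0.0, 10.0), x0, args=(-1.0,), max_step=0.01)
arc = sol.y.T

print(np.abs(arc).max())          # the arc stays bounded
print(arc[-1])                    # and ends near the other equilibrium
```

In retro time the unstable open-loop eigenvalues (1, 2) become stable, so the arc settles at the opposite equilibrium, consistent with both saturated equilibria lying on the boundary in Figure 5.8.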

5.4.8 Linear Systems with Process Noise

A linear system with process noise can be described by Eq. 5.94, with the signal u split into a control signal u_D and a stochastic input u_S:

B u = B_D u_D + B_S u_S ,  (5.124)

where the dimensions of u_D and u_S are D \times 1 and S \times 1, respectively. Substituting Eq. 5.124 into Eq. 5.94 gives

x˙ = Fx + BDuD + BSuS . (5.125)

It is assumed that the process noise is white and therefore has the autocorrelation function

R_{u_S}(\tau) = E\left\{ u_S(t) \, u_S^T(t + \tau) \right\} = A \, \delta(\tau) ,  (5.126)

where A is a symmetric matrix of dimension S \times S. A white process is not possible to generate because of its infinite variance. Because of this, it is unlikely that white noise will adequately model an uncertain input. However, when processed by a linear, time-invariant (LTI) shaping filter, white noise becomes exponentially correlated, as discussed in Appendix B. Exponentially correlated noise adequately models many random processes and is a very common model used in practical engineering applications. The LTI shaping filter can be included with the state equations, and the overall model will have the form given by Eq. 5.125, with the system matrices F,

B_D, and B_S including the shaping filter equations. The response of the system can be expressed using the state transition matrix Φ, as shown in Appendix B:

x(t) = \Phi_x(t - t_0) \, x(t_0) + \int_{t_0}^{t} \Phi_x(t - \tau) \left[ B_D u_D + B_S u_S \right] d\tau
= x_D(t) + \int_{t_0}^{t} \Phi_x(t - \tau) B_S u_S \, d\tau ,  (5.127)

where x_D(t) is the state that results when u_S = 0:

x_D(t) = \Phi_x(t - t_0) \, x(t_0) + \int_{t_0}^{t} \Phi_x(t - \tau) B_D u_D \, d\tau .  (5.128)
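The shaping-filter remark above can be demonstrated with a short simulation (code not from the original; the filter bandwidth and sample step are arbitrary choices for this sketch). Driving a first-order LTI filter with discrete white noise yields an output whose autocorrelation decays exponentially:

```python
import numpy as np

rng = np.random.default_rng(0)
dt, tau, n = 0.01, 0.5, 200_000     # sample step, correlation time, sample count

# Discretized first-order shaping filter x_{k+1} = a x_k + w_k, driven by white noise.
a = np.exp(-dt / tau)
w = rng.standard_normal(n)
x = np.empty(n)
x[0] = 0.0
for k in range(n - 1):
    x[k + 1] = a * x[k] + w[k]

# The sample autocorrelation at lag 1 should be close to a = exp(-dt/tau),
# i.e. the output is exponentially correlated rather than white.
r1 = np.corrcoef(x[:-1], x[1:])[0, 1]
print(r1, a)
```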

Due to the stochastic input uS the appropriate cost function JE is the expectation of Eq. 5.95

J_E = E\{J\}
= E\left\{ \left. \tfrac{1}{2} x^T S_f x \right|_{t_f} + \int_0^{t_f} \tfrac{1}{2} \left( x^T Q x + u_D^T R u_D \right) dt \right\}
= \tfrac{1}{2} \left. E\left\{ x^T S_f x \right\} \right|_{t_f} + \int_0^{t_f} \tfrac{1}{2} \left( E\{ x^T Q x \} + E\{ u_D^T R u_D \} \right) dt .  (5.129)

It is impossible for the control u_D(t) to counteract or even correlate with the current or any future values of u_S(t). In fact, the best estimate of the disturbance for the remainder of the trajectory is the unconditional mean of the disturbance, which is zero. For this reason, an LQR controller is optimal for the system with white process noise. The dynamics of x_D are

\dot{x}_D = F x_D + B_D u_D .  (5.130)

An LQR design will produce a feedback gain matrix K (t) that is optimal even in the presence of white process noise

u_D = -K(t) \, x(t) .  (5.131)
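A quick Euler-Maruyama simulation supports this claim (illustrative only; the system, noise intensity, and horizon below are this sketch's choices): the gain from the deterministic LQR design keeps the noisy system of Eq. 5.125 regulated.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

rng = np.random.default_rng(1)
F   = np.array([[0.0, 1.0], [-2.0, 3.0]])   # unstable open loop
B_D = np.array([[0.0], [1.0]])
B_S = np.array([[0.0], [1.0]])

# LQR gain from the noise-free design (R = 1, Q = I).
S = solve_continuous_are(F, B_D, np.eye(2), np.array([[1.0]]))
K = (B_D.T @ S).ravel()

dt, steps = 0.001, 20_000
x = np.array([0.3, 0.0])
peak = 0.0
for _ in range(steps):
    u = -K @ x                                 # u_D = -K x (Eq. 5.131)
    dw = rng.standard_normal() * np.sqrt(dt)   # white-noise increment
    x = x + (F @ x + B_D.ravel() * u) * dt + B_S.ravel() * 0.1 * dw
    peak = max(peak, np.abs(x).max())

print(peak)   # stays small; the uncontrolled system would diverge
```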

5.4.9 Linear Systems with Process Noise and Measurement Noise

This section assumes that linear measurements of the system state become available at discrete instants in time, denoted by tk

z_k = H_k \, x(t_k) + v_k ,  (5.132)

where v_k is a white sequence with covariance R_k

E\left\{ v_k v_k^T \right\} = R_k .  (5.133)

The sum of all knowledge available to the controller at time t is referred to as the information state and is denoted by \mathcal{D}(t). The expected value of the state x(t) conditioned on the information state \mathcal{D}(t) can be expressed using Eq. 5.127:

\mu_{x|\mathcal{D}}(t) = E\{ x(t) \mid \mathcal{D}(t) \}
= E\left\{ \Phi_x(t - t_0) x(t_0) + \int_{t_0}^{t} \Phi_x(t - \tau) B_D u_D \, d\tau + \int_{t_0}^{t} \Phi_x(t - \tau) B_S u_S \, d\tau \;\middle|\; \mathcal{D}(t) \right\} .  (5.134)

In practical application, the control u_D(t) will only be a function of the information state \mathcal{D}(t). That is, the control u_D(t) cannot be a function of \mathcal{D}(t_f), because the control is applied at time t before the information state \mathcal{D}(t_f) is available (i.e. before \mathcal{D}(t_f) has been reached). It would be inappropriate to ask even an optimal control to make use of information not yet available. The control u_D can be brought outside of the expectation because it is a function of the conditioning variable \mathcal{D}(t):

\mu_{x|\mathcal{D}}(t) = E\left\{ \Phi_x(t - t_0) x(t_0) + \int_{t_0}^{t} \Phi_x(t - \tau) B_D u_D \, d\tau + \int_{t_0}^{t} \Phi_x(t - \tau) B_S u_S \, d\tau \;\middle|\; \mathcal{D}(t) \right\}
= E\left\{ \Phi_x(t - t_0) x(t_0) + \int_{t_0}^{t} \Phi_x(t - \tau) B_S u_S \, d\tau \;\middle|\; \mathcal{D}(t) \right\} + \int_{t_0}^{t} \Phi_x(t - \tau) B_D u_D \, d\tau .  (5.135)

The covariance of x(t) conditioned on the information state \mathcal{D}(t) is denoted by

P_{x|\mathcal{D}}(t) = E\left\{ \left( x(t) - \mu_{x|\mathcal{D}}(t) \right) \left( x(t) - \mu_{x|\mathcal{D}}(t) \right)^T \;\middle|\; \mathcal{D}(t) \right\} .  (5.136)

If Eq. 5.127 and Eq. 5.135 are substituted into Eq. 5.136, the result will show that the covariance P_{x|\mathcal{D}}(t) does not depend on the control u_D(t), a fact that will be used shortly. Due to the stochastic input u_S and measurement noise v_k, the appropriate cost function J_E is the expectation of Eq. 5.95:

J_E = E\{J\}
= E\left\{ \left. \tfrac{1}{2} x^T S_f x \right|_{t_f} + \int_0^{t_f} \tfrac{1}{2} \left( x^T Q x + u_D^T R u_D \right) dt \right\}
= \tfrac{1}{2} \left. E\left\{ x^T S_f x \right\} \right|_{t_f} + \int_0^{t_f} \tfrac{1}{2} \left( E\{ x^T Q x \} + E\{ u_D^T R u_D \} \right) dt .  (5.137)

It is useful to condition certain elements of the expectation on the information set \mathcal{D}:

J_E = \tfrac{1}{2} \left. E\left\{ x^T S_f x \right\} \right|_{t_f} + \int_0^{t_f} \tfrac{1}{2} \left( E\{ x^T Q x \} + E\{ u_D^T R u_D \} \right) dt
= \tfrac{1}{2} \left. E\left\{ E\left\{ x^T S_f x \mid \mathcal{D}(t_f) \right\} \right\} \right. + \int_0^{t_f} \tfrac{1}{2} \left( E\left\{ E\left\{ x^T Q x \mid \mathcal{D}(t) \right\} \right\} + E\{ u_D^T R u_D \} \right) dt ,  (5.138)

where the outer expectation is complete, i.e. it removes any conditioning variables. Since the trace of a scalar is the scalar itself and the trace operator is cyclically commutative, the cost may be written as

J_E = \tfrac{1}{2} E\left\{ E\left\{ x^T S_f x \mid \mathcal{D}(t_f) \right\} \right\} + \int_0^{t_f} \tfrac{1}{2} \left( E\left\{ E\left\{ x^T Q x \mid \mathcal{D}(t) \right\} \right\} + E\{ u_D^T R u_D \} \right) dt
= \tfrac{1}{2} E\left\{ E\left\{ \mathrm{Tr}\left( x^T S_f x \right) \mid \mathcal{D}(t_f) \right\} \right\} + \int_0^{t_f} \tfrac{1}{2} \left( E\left\{ E\left\{ \mathrm{Tr}\left( x^T Q x \right) \mid \mathcal{D}(t) \right\} \right\} + E\{ u_D^T R u_D \} \right) dt
= \tfrac{1}{2} E\left\{ \mathrm{Tr}\left( S_f \, E\left\{ x x^T \mid \mathcal{D}(t_f) \right\} \right) \right\} + \int_0^{t_f} \tfrac{1}{2} \left( E\left\{ \mathrm{Tr}\left( Q \, E\left\{ x x^T \mid \mathcal{D}(t) \right\} \right) \right\} + E\{ u_D^T R u_D \} \right) dt .  (5.139)

The conditional expectation of x x^T can be written as

E\left\{ x x^T \mid \mathcal{D} \right\} = E\left\{ \left( x - \mu_{x|\mathcal{D}} + \mu_{x|\mathcal{D}} \right) \left( x - \mu_{x|\mathcal{D}} + \mu_{x|\mathcal{D}} \right)^T \;\middle|\; \mathcal{D} \right\}
= E\left\{ \left( x - \mu_{x|\mathcal{D}} \right) \left( x - \mu_{x|\mathcal{D}} \right)^T + \mu_{x|\mathcal{D}} \mu_{x|\mathcal{D}}^T \;\middle|\; \mathcal{D} \right\}
= P_{x|\mathcal{D}} + \mu_{x|\mathcal{D}} \mu_{x|\mathcal{D}}^T .  (5.140)

Substituting Eq. 5.140 into Eq. 5.139 gives

J_E = J_{CE} + J_S ,  (5.141)

where J_{CE} is the certainty-equivalent stochastic cost function

J_{CE} = \tfrac{1}{2} E\left\{ \mathrm{Tr}\left( S_f \, \mu_{x|\mathcal{D}} \mu_{x|\mathcal{D}}^T \right) \right\}_{t_f} + \int_0^{t_f} \tfrac{1}{2} \left( E\left\{ \mathrm{Tr}\left( Q \, \mu_{x|\mathcal{D}} \mu_{x|\mathcal{D}}^T \right) \right\} + E\{ u_D^T R u_D \} \right) dt
= \tfrac{1}{2} E\left\{ \mu_{x|\mathcal{D}}^T S_f \mu_{x|\mathcal{D}} \right\}_{t_f} + \int_0^{t_f} \tfrac{1}{2} \left( E\left\{ \mu_{x|\mathcal{D}}^T Q \mu_{x|\mathcal{D}} \right\} + E\{ u_D^T R u_D \} \right) dt
= E\left\{ \left. \tfrac{1}{2} \mu_{x|\mathcal{D}}^T S_f \mu_{x|\mathcal{D}} \right|_{t_f} + \int_0^{t_f} \tfrac{1}{2} \left( \mu_{x|\mathcal{D}}^T Q \mu_{x|\mathcal{D}} + u_D^T R u_D \right) dt \right\} ,  (5.142)

and J_S is the cost function increment due to estimation error

J_S = \tfrac{1}{2} E\left\{ \mathrm{Tr}\left( S_f P_{x|\mathcal{D}} \right) \right\}_{t_f} + \int_0^{t_f} \tfrac{1}{2} E\left\{ \mathrm{Tr}\left( Q P_{x|\mathcal{D}} \right) \right\} dt
= \tfrac{1}{2} E\left\{ \mathrm{Tr}\left( S_f P_{x|\mathcal{D}}(t_f) + \int_0^{t_f} Q P_{x|\mathcal{D}} \, dt \right) \right\} .  (5.143)

An optimal control u_D^* will have the minimum cost J_E^* among all choices for the control input u_D:

J_E^* = \min_{u_D \in \mathcal{U}} J_E = \min_{u_D \in \mathcal{U}} \left( J_{CE} + J_S \right) .  (5.144)

The cost J_S is only a function of the covariance P_{x|\mathcal{D}}, which (as has already been shown) is independent of the control u_D. Therefore, the cost may be written as

J_E^* = \min_{u_D \in \mathcal{U}} \left( J_{CE} + J_S \right) = \min_{u_D \in \mathcal{U}} J_{CE} + J_S = J_{CE}^* + J_S ,  (5.145)

where J_{CE}^* is the minimum value of the certainty-equivalent cost function given by Eq. 5.142:

J_{CE}^* = \min_{u_D \in \mathcal{U}} J_{CE} .  (5.146)

The conditional mean \mu_{x|\mathcal{D}} in Eq. 5.142 is the estimate \hat{x}(t) produced by the Kalman filter equations:

\mu_{x|\mathcal{D}}(t) = \hat{x}(t) .  (5.147)

The time derivative of \hat{x}(t) is given by Eq. 3.87 and repeated here:

\frac{d\hat{x}}{dt} = F \hat{x} + B_D u_D + \sum_{j \ge 1} \delta(t - t_j) \, K_j \left( z_j - H_j \hat{x}(t_j) \right) ,  (5.148)

where Kj is the Kalman filter gain for the measurement zj. The solution to this differential equation can be found by use of the state transition matrix Φ for the system matrix F

\hat{x}(t) = \Phi(t - t_0) \, \hat{x}(t_0) + \int_{t_0}^{t} \Phi(t - \tau) B_D u_D \, d\tau + \int_{t_0}^{t} \Phi(t - \tau) \sum_{j \ge 1} \delta(\tau - t_j) K_j \left( z_j - H_j \hat{x}(t_j) \right) d\tau  (5.149)
= \Phi(t - t_0) \, \hat{x}(t_0) + \int_{t_0}^{t} \Phi(t - \tau) B_D u_D \, d\tau + \sum_{j \ge 1, \; t_j \le t} \Phi(t - t_j) K_j \left( z_j - H_j \hat{x}(t_j) \right)
= \hat{x}_D(t) + \sum_{j \ge 1, \; t_j \le t} \Phi(t - t_j) K_j \left( z_j - H_j \hat{x}(t_j) \right) ,  (5.150)

where \hat{x}_D(t) is given by

\hat{x}_D(t) = \Phi(t - t_0) \, \hat{x}(t_0) + \int_{t_0}^{t} \Phi(t - \tau) B_D u_D \, d\tau ,  (5.151)

and therefore satisfies the dynamics

\frac{d\hat{x}_D}{dt} = F \hat{x}_D + B_D u_D .  (5.152)

The conditional covariance of x is also obtained by the Kalman filter equations. Specifically, the conditional covariance satisfies the differential equation given by Eq. 3.88 and repeated here:

\dot{P}_{x|\mathcal{D}}(t) = F P_{x|\mathcal{D}}(t) + P_{x|\mathcal{D}}(t) F^T + B_S A B_S^T - \sum_{j \ge 1} \delta(t - t_j) K_j H_j P_{x|\mathcal{D}}(t) .  (5.153)
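Eq. 5.153 can be stepped numerically, as in the illustrative sketch below (the matrices are arbitrary choices for this example, not from the text): between measurements the covariance grows by the Lyapunov term, and each impulsive update K_j H_j P removes uncertainty. Note that u_D appears nowhere in the recursion.

```python
import numpy as np

F  = np.array([[0.0, 1.0], [0.0, 0.0]])   # example double-integrator dynamics
Bs = np.array([[0.0], [1.0]])
A  = np.array([[1.0]])                    # white-noise intensity
H  = np.array([[1.0, 0.0]])               # position measurement
Rm = np.array([[0.1]])

dt = 0.01
P = np.eye(2)
for step in range(1, 501):
    # Propagation part of Eq. 5.153: P' = F P + P F' + Bs A Bs'
    P = P + (F @ P + P @ F.T + Bs @ A @ Bs.T) * dt
    if step % 100 == 0:                   # a measurement once per second
        Kk = P @ H.T @ np.linalg.inv(H @ P @ H.T + Rm)
        trace_before = np.trace(P)
        P = P - Kk @ H @ P                # impulsive update: -K_j H_j P
        assert np.trace(P) < trace_before # each measurement reduces uncertainty

print(np.trace(P))
```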

It has already been shown that P_{x|\mathcal{D}}(t) is independent of u_D, and Eq. 5.153 is further evidence of this. In any case, the reason that the conditional covariance is independent of the control u_D(t) is that the system is linear and the control is restricted to be a function of the conditioning variable, namely the information state \mathcal{D}(t). In Eq. 5.150, the term H_j \hat{x}(t_j) is the conditional expected value of z_j given z_{1:j-1}:

E\{ z_j \mid z_{1:j-1} \} = E\{ H_j x_j + v_j \mid z_{1:j-1} \} = H_j E\{ x_j \mid z_{1:j-1} \} = H_j \hat{x}(t_j) .  (5.154)

Therefore, the sequence of random variables z_j - H_j \hat{x}(t_j) is an innovations sequence. As discussed in Appendix A.9, the elements of an innovations sequence are mutually uncorrelated:

E\left\{ \left[ z_j - H_j \hat{x}(t_j) \right] \left[ z_m - H_m \hat{x}(t_m) \right]^T \right\} = \delta[j - m] \, \mathrm{cov}\left( z_j - H_j \hat{x}(t_j) \right)
= \delta[j - m] \, \mathrm{cov}\left( H_j x(t_j) + v_j - H_j \hat{x}(t_j) \right)
= \delta[j - m] \, \mathrm{cov}\left( H_j \left( x(t_j) - \hat{x}(t_j) \right) + v_j \right)
= \delta[j - m] \left( H_j P_{x|\mathcal{D}}(t_j) H_j^T + R_j \right) ,  (5.155)

where δ[j − m] is the unit impulse function, which has a value of one when its argument is zero and a value of zero otherwise. At this point, the problem of finding the optimal control becomes very similar to finding the optimal control for the system with perfect measurements and white process noise that was considered in Section 5.4.8. The reason for this is that the optimal control u_D^* must minimize J_{CE} given by Eq. 5.142:

J_{CE} = E\left\{ \left. \tfrac{1}{2} \mu_{x|\mathcal{D}}^T S_f \mu_{x|\mathcal{D}} \right|_{t_f} + \int_0^{t_f} \tfrac{1}{2} \left( \mu_{x|\mathcal{D}}^T Q \mu_{x|\mathcal{D}} + u_D^T R u_D \right) dt \right\}
= E\left\{ \left. \tfrac{1}{2} \hat{x}^T S_f \hat{x} \right|_{t_f} + \int_0^{t_f} \tfrac{1}{2} \left( \hat{x}^T Q \hat{x} + u_D^T R u_D \right) dt \right\} ,  (5.156)

where \hat{x}(t) satisfies the differential equation given by Eq. 5.148. The stochastic quantity z_j - H_j \hat{x}(t_j) in Eq. 5.148 is a white sequence that parallels the white process noise considered in Section 5.4.8. Therefore, the optimal controller is an LQR controller for the cost function

J = \left. \tfrac{1}{2} \hat{x}_D^T S_f \hat{x}_D \right|_{t_f} + \int_0^{t_f} \tfrac{1}{2} \left( \hat{x}^T Q \hat{x} + u_D^T R u_D \right) dt ,  (5.157)

subject to the dynamics

\frac{d\hat{x}_D}{dt} = F \hat{x}_D + B_D u_D .  (5.158)

The LQR design will yield a matrix K(t) that is also optimal for the cost function

J_E = E\left\{ \left. \tfrac{1}{2} x^T S_f x \right|_{t_f} + \int_0^{t_f} \tfrac{1}{2} \left( x^T Q x + u_D^T R u_D \right) dt \right\} ,  (5.159)

z_k = H_k \, x(t_k) + v_k ,  (5.160)

and dynamics

\dot{x} = F x + B_D u_D + B_S u_S .  (5.161)

That is, the control is given by

u_D = -K(t) \, \hat{x}(t) ,  (5.162)

where \hat{x}(t) is the Kalman filter estimate of the state x(t).
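The separation structure just described can be sketched in discrete time (an illustrative scalar example, not from the original; plant, noise, and cost values are this sketch's choices): a steady-state Kalman filter supplies the estimate, and the LQR gain designed for the noise-free plant acts on it.

```python
import numpy as np

rng = np.random.default_rng(2)
a, b = 1.1, 1.0          # unstable scalar plant x_{k+1} = a x_k + b u_k + w_k
q_w, r_v = 0.01, 0.01    # process and measurement noise variances
q_c, r_c = 1.0, 1.0      # cost weights

# Steady-state discrete Riccati iterations for the control gain ...
s = 1.0
for _ in range(200):
    s = q_c + a * a * s - (a * b * s) ** 2 / (r_c + b * b * s)
k_lqr = a * b * s / (r_c + b * b * s)

# ... and for the Kalman filter gain.
p = 1.0
for _ in range(200):
    p_pred = a * a * p + q_w
    p = p_pred - p_pred ** 2 / (p_pred + r_v)
k_f = p_pred / (p_pred + r_v)

x, x_hat = 1.0, 0.0
peak = 0.0
for _ in range(500):
    u = -k_lqr * x_hat                               # u_D = -K x_hat (Eq. 5.162)
    x = a * x + b * u + rng.normal(0.0, np.sqrt(q_w))
    z = x + rng.normal(0.0, np.sqrt(r_v))            # noisy measurement
    x_pred = a * x_hat + b * u                       # Kalman predict
    x_hat = x_pred + k_f * (z - x_pred)              # Kalman update
    peak = max(peak, abs(x))

print(peak)   # regulated even though x is never measured directly
```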

5.5 Differential Game Theory

This section contains a statement of the Min-Max Principle for continuous-time systems. A concise definition of each aspect of the minimum principle is stated; however, no proofs are given. For further information, the reader is referred to [92] or [45].

5.5.1 System Definition

The system is described by a set of differential equations of the form:

x˙ = f(x, u, v) , (5.163)

where

x = n_x-dimensional state vector
u = n_u-dimensional control vector (Player 1)
v = n_v-dimensional control vector (Player 2).

The vector function f(x, u, v) is assumed to be continuous and have continuous partial derivatives with respect to its arguments. The control functions u(x) and v(x) may be discontinuous. Allowable trajectories for x(t) consist of those for which u[x(t)], v[x(t)], ∂u[x(t)]/∂x and ∂v[x(t)]/∂x are at least piecewise continuous.

5.5.2 Control Constraints

The control variables u and v are subject to control constraints of the form:

\mathcal{U} = \{ u \in \mathbb{R}^{n_u} \mid h(u) \ge 0 \}  (5.164a)
\mathcal{V} = \{ v \in \mathbb{R}^{n_v} \mid \bar{h}(v) \ge 0 \} .  (5.164b)

The vector functions h(u) and \bar{h}(v) are assumed to be continuous and have continuous partial derivatives with respect to their arguments. The gradient vectors ∂h_i(u)/∂u, i = 1, ..., n_h, are assumed to be linearly independent at any "active" inequalities h_i = 0 (the same assumption holds for the gradient vectors of \bar{h}).

5.5.3 Terminal Set

It is desired to drive the system defined by (5.163) to a terminal set which can be defined by a set of equality conditions of the form:

\mathcal{X} = \{ x \in \mathbb{R}^{n_x} \mid g(x) = 0 \} .  (5.165)

The vector function g(x) is assumed to be continuous and have continuous partial derivatives with respect to its arguments. The gradient vectors ∂g_i(x)/∂x, i = 1, ..., n_g, are assumed to be linearly independent.

5.5.4 The Payoff

It is desired to drive the system defined by (5.163) to the terminal set defined by (5.165) subject to the control constraints defined by (5.164a) while minimizing a “cost” functional of the form:

J[u(\cdot), v(\cdot)] = \phi(x_f) + \int_0^{t_f} f_o(x, u, v) \, dt .  (5.166)

The first term in (5.166) is known as the “terminal payoff”, and the second term is known as the “integral payoff.” The scalar valued function φ(x) is assumed to be smooth on the terminal set defined by (5.165). The scalar valued function fo(x, u, v) is assumed to be continuous and to have continuous partial derivatives with respect to its arguments.

5.5.5 Games of Kind and Games of Degree

Before the minimum principle is stated, some more definitions and terminology must be introduced. There are two general and related aspects in game theory. The first aspect of game theory is that of a "qualitative game" [13], also called a "game of kind" [45]. The qualitative game is a problem of determining which initial states can be transferred to the terminal set defined by (5.165). The collection of all such initial states is known as the "playable set" [13]. Although it is no longer considered a game if there is only one player, for a single controller this collection is known as the "controllable set" [92]. In a qualitative game, a specific controller is not sought, but rather a description of the space for which there exists at least one (possibly more) controller for player 1 (u) which will guarantee transfer to the terminal set, regardless of any effort by player 2 (v). The second aspect of game theory is that of a "quantitative game" [13], also called a "game of degree" [45]. The quantitative game is a problem of determining a specific controller which will guarantee transfer to the terminal set from any point in the playable set, while optimizing a specified "cost" functional. Although it is no longer considered a game if there is only one player, for a single player the related problem is known as an optimal control problem.

Admissible and Playable Controls

For the controls u(x) and v(x) to be considered "admissible", they must satisfy the following:

1. The controls must be piecewise continuous and piecewise differentiable.

2. The controls must yield a solution x(t) to (5.163) from any initial state-space point of interest.

3. The controls must satisfy (5.164a) along each point of the trajectory x(t).

A control u(x) is defined as playable if it is admissible, and it transfers any point in the playable set to the target defined by (5.165) for any admissible control v(x).

The goal of a quantitative game is to find an admissible control v^*(x) that maximizes (5.166) and a playable control u^*(x) that minimizes (5.166). This pair of controls u^*(x) and v^*(x) is known as the "min-max controls."

5.5.6 Min-Max Principle

Given the constraint sets (5.164), if u^*(x) \in \mathcal{U} and v^*(x) \in \mathcal{V} are min-max controls for the differential game (5.163), (5.165), and (5.166), then there must exist a continuous and piecewise differentiable vector function \lambda(t) = [\lambda_1 \, \ldots \, \lambda_{n_x}]^T, a constant \lambda_o \ge 0 with (\lambda_o, \lambda) \ne 0, and a constant multiplier vector \rho = [\rho_1 \, \ldots \, \rho_{n_g}]^T such that \lambda(t) satisfies the adjoint equations:

\dot{\lambda}^T = -\frac{\partial H}{\partial x} .  (5.167)

At the terminal set, λ(tf ) satisfies the following equation:

\lambda_f^T = \left. \frac{\partial \psi}{\partial x} \right|_{x_f} ,  (5.168)

where the function ψ is defined as:

\psi \triangleq \phi(x) + \rho^T g(x) .  (5.169)

The min-max H function is defined as:

H(x, u, v, \lambda, \lambda_o) \triangleq \lambda_o f_o(x, u, v) + \lambda^T f(x, u, v) .  (5.170)

The min-max function H takes on a global min-max value with respect to u \in \mathcal{U} and v \in \mathcal{V} at every point x along the trajectory x(t) generated by u^*(x) and v^*(x). Furthermore, the min-max value of H at every such point is zero:

0 = H\left( x(t), u^*[x(t)], v^*[x(t)], \lambda(t), \lambda_o \right) = \min_{u \in \mathcal{U}} \max_{v \in \mathcal{V}} H\left( x(t), u, v, \lambda(t), \lambda_o \right) .  (5.171)
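A toy saddle-point check (illustrative only; the H function below is a hypothetical choice, not from the text): for H(u, v) = u² − v² + uv, the analytic min-max value is 0 at (u*, v*) = (0, 0), which a simple grid search recovers.

```python
import numpy as np

# Hypothetical min-max H function with a saddle at the origin.
def H(u, v):
    return u**2 - v**2 + u * v

us = np.linspace(-1.0, 1.0, 201)
vs = np.linspace(-1.0, 1.0, 201)
U, V = np.meshgrid(us, vs, indexing="ij")

inner_max = H(U, V).max(axis=1)   # max over v for each u
minmax = inner_max.min()          # then min over u
u_star = us[inner_max.argmin()]

print(round(minmax, 6), round(u_star, 6))   # 0.0 0.0 (matches the analytic saddle)
```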

5.5.7 Min-Max Necessary Conditions

Let u^* be a regular point of \mathcal{U} and v^* be a regular point of \mathcal{V}. If (5.166) takes on a min-max with respect to u at u^* and v at v^*, then Lagrange multipliers \gamma = [\gamma_1 \, \cdots \, \gamma_{n_h}]^T and \bar{\gamma} = [\bar{\gamma}_1 \, \cdots \, \bar{\gamma}_{n_{\bar{h}}}]^T exist such that:

\left. \frac{\partial L}{\partial u} \right|_{u = u^*} = 0  (5.172a)
\left. \frac{\partial L}{\partial v} \right|_{v = v^*} = 0  (5.172b)
h(u^*) \ge 0  (5.172c)
\bar{h}(v^*) \ge 0  (5.172d)
\gamma^T h(u^*) = 0  (5.172e)
\bar{\gamma}^T \bar{h}(v^*) = 0  (5.172f)
\gamma \ge 0  (5.172g)
\bar{\gamma} \le 0 ,  (5.172h)

where

L(x, u, v, \lambda, \lambda_o, \gamma, \bar{\gamma}) \triangleq H(x, u, v, \lambda, \lambda_o) - \gamma^T h(u) - \bar{\gamma}^T \bar{h}(v) .  (5.173)

Chapter 6 Airframe and Autopilot Modeling

This chapter develops the equations governing a missile's airframe and autopilot, which are collectively referred to as the flight control system. The job of the flight control system is to provide a desired level of acceleration while maintaining stability of the missile body. For the purposes of this dissertation, the goal of flight control system modeling is to develop an equation which relates the achieved level of acceleration to the commanded level of acceleration. The first section of this chapter provides an introduction to flight control system concepts such as lift and angle of attack. This introductory section is followed by a section that develops the equations governing a tail controlled missile. The next section presents the conventional three-loop autopilot topology, and develops the equations that govern the flight control system. The final two sections of this chapter provide two lower-fidelity models of the flight control system.

6.1 Introduction

This dissertation is primarily concerned with missiles that sustain and direct flight by making use of aerodynamic forces — that is, endo-atmospheric missiles. The principle by which missiles, as well as other aircraft, fly is generically referred to as lift.

Definition 6.1. Lift [42, p. 2-6]. The basis of flight, whether by airplanes, missiles, or birds, is lift, the component of aerodynamic force perpendicular to the direction of travel. Lift is generated by airflow on wings, canards, control surfaces and the missile body. If lift is equal to or greater than the device’s weight, it will fly.

A useful concept related to lift is the angle of attack. 146

Definition 6.2. Angle of Attack [42, p. 2-10]. Angle of attack, sometimes called angle of incidence, is the angle between the velocity vector and the missile longitudinal axis.

There are two primary mechanisms to produce lift: (1) Bernoulli’s principle and (2) redirection of air flow due to an angle of attack. According to Bernoulli’s principle, if the speed of air parallel to a surface is increased, the pressure on the surface is decreased. Bernoulli’s principle is the primary mechanism of lift for small angles of attack and low to moderate air speeds. Suffice it to say that wings are designed to force the airflow over them to increase in speed across the top surface of the wing, thus producing lift. For large accelerations, lift is created by redirection of the air flow due to a large angle of attack. Since air possesses inertia it takes a force to change its velocity. On a relative scale, the air mass can be considered nearly stationary (i.e. zero velocity). As the inclined airfoil passes through a stationary air mass, the airfoil creates a force against the air mass in the only direction in which the airfoil can sustain one — normal to the airfoil surface. The air mass in turn "pushes" back against the airfoil as it is displaced, creating an aerodynamic force that resolves into drag and lift, as shown in Figure 6.1. Having discussed the force that acts on the missile and how it is created, we now turn our attention to the dynamics associated with lift. We begin with three definitions taken from [42].

Definition 6.3. Center of Gravity (CG). The center of gravity of an object is the point at which all of the object's weight is effectively positioned. It is the location at which a single point of support would balance the object.

Definition 6.4. Center of Pressure (CP). The distributed aerodynamic forces acting over a missile body and control surfaces may be resolved into one effective force applied at a point, known as the center of pressure (CP). 147

Figure 6.1. Aerodynamic force, which is resolved into lift and drag components, is generated by creating an angle of attack, α, with respect to the direction of airflow against the airfoil.

Definition 6.5. Static Margin. Static margin, the physical distance between the center of pressure and the center of gravity with the control undeflected, is a measure of missile stability.

The CG, CP and static margin are shown for a typical missile in Figure 6.2. When the center of pressure is aft of the center of gravity, as shown in Figure 6.2, the missile is statically stable, meaning any disturbance tends to make the missile rotate opposite in sense to the rotation caused by the disturbance, back toward a condition of static equilibrium. This situation is easily verified by examination of Figure 6.2. Suppose that the statically stable missile is perturbed in the clockwise direction. In such a case, the resulting angle of attack will create an upward normal force through the CP. The upward force at the CP will create a counter-clockwise moment about the CG, which will cause the missile to rotate in the counter-clockwise direction, so as to reduce the angle of attack. Static margin affects how quickly a missile will react to commands in pitch or yaw. As static margin increases, the airframe response becomes sluggish; if static margin is too short, the airframe becomes excessively responsive, to the point of being difficult to control [42]. Nevertheless, many advanced designs are statically unstable configurations because of the much quicker response when


Figure 6.2. A standard measure of aerodynamic stability is static margin, the distance between the center of gravity and the center of pressure.

commanding maneuvers. It is important to note that the CG changes as a missile's fuel is expended. Similarly, the CP changes as a function of missile speed and angle of attack. While it is possible to have a missile's control surfaces near the CG of the missile, this reduces the moment arm of the aerodynamic force acting on the control surface. For this reason control surfaces are usually located at the far ends of the missile. When the control surfaces are at the nose of the missile, they are referred to as canards. Figure 6.3 shows a picture of a forward controlled missile with the pitch canards inclined at an angle δ relative to the missile body. Note that the missile shown in Figure 6.3 is statically stable. If it is desired to maintain the level of lift shown, then the angle of attack must be maintained, which requires that the net moment about the CG be zero. This can only be accomplished if the additional force created by the canard due to the angle δ creates a moment that counters the moment created by the force acting at the CP of the nominal system (i.e. with δ = 0). The most attractive feature of canards is that the force acting on the canard acts to increase lift in the desired direction. The force acting on the canard is small compared to that which will be


Figure 6.3. Canard control requires actuators to be located near the nose of the missile, typically in the same area as the seeker.

generated by the body as the body rotates to the equilibrium angle of attack. Even so, it contributes to the net lift available to a canard controlled missile. Despite the favorable lift configuration of the canards, there are at least two disadvantages to forward control: (1) the downwash created by air passing over the canard surfaces, and (2) the extra load of the canards and associated controls at the front end of the missile (along with the rest of the guidance package), which results in a drastically varying CG as the missile's fuel is expended. Figure 6.4 shows a tail controlled missile with the pitch controls inclined at an angle δ relative to the missile body. If it is desired to maintain the level of lift shown, then the angle of attack must be maintained, which requires that the net moment about the CG be zero. This can only be accomplished if the additional force created by the tail surfaces, due to the angle δ, creates a moment that counters the moment created by the force acting at the CP of the nominal (i.e., with δ = 0) system. Unlike the canard lifting force, the aerodynamic force at the tail acts in the direction opposite the lifting force. This has two implications: (1) increasing δ will initially (during the transient response) decrease the lift, until the missile's angle of attack α increases such that the aerodynamic force on the body and wings exceeds the aerodynamic


Figure 6.4. Movement of tail control surfaces will not disturb the airflow across the wings or missile body.

force on the tail surfaces; and (2) in equilibrium (steady state) the aerodynamic force acting on the tail subtracts from the aerodynamic forces acting elsewhere on the missile. A qualitative comparison of the tail and canard methods of missile control is shown in Figure 6.5. In addition to the pitch (and yaw) control discussed above, a missile will have roll control of some kind. Some missiles, particularly the type that relay a video feed, must be roll stabilized, meaning that they must maintain a stable roll angle. Other missiles are designed to roll continuously to retain stability: if a missile is rolling, then any destabilizing moment is approximately averaged out over each revolution. An effective method of inducing roll is by use of the missile's control surfaces [49, p. 91]. The incidence on one control surface is increased while on another, diametrically opposed to the first, the incidence is reduced. The first surface develops a positive lift increment and the second a negative lift increment. The net result is a pure rolling moment. Because of downwash, rolling missiles are usually tail controlled.


Figure 6.5. A qualitative comparison of canard and tail control.

6.2 Airframe Modeling of Tail Controlled Missiles

This section presents a model of a tail controlled missile’s airframe. It also discusses a common autopilot configuration used to improve the step response.

6.2.1 Aerodynamic Forces

The force acting on the missile is a function of the air density, the missile velocity and the angle of attack.

Dynamic Pressure Consider the kinetic energy in a mass, m, of air moving at a speed V_M:

$$E = \frac{1}{2} m V_M^2 . \qquad (6.1)$$

Assuming a constant velocity V_M, the time derivative of the energy (i.e., the power) is given by

$$\dot{E} = \frac{1}{2} \dot{m} V_M^2 . \qquad (6.2)$$

Should this power be harvested, it would equal the resulting force times the velocity, another well-known equation for power:

$$\dot{E} = F V_M . \qquad (6.3)$$

Equating the two expressions for power gives

$$F = \frac{1}{2} \dot{m} V_M . \qquad (6.4)$$

During an interval of time dt, the differential amount of air flowing through an area A with a velocity V_M is given by

$$dm = \rho A \, (V_M \, dt) . \qquad (6.5)$$

Dividing by dt gives the mass flow rate

$$\dot{m} = \rho A V_M . \qquad (6.6)$$

Substituting Eq. 6.6 into Eq. 6.4 gives

$$F = \frac{1}{2} \rho A V_M^2 . \qquad (6.7)$$

Dividing through by the area A gives the dynamic pressure Q associated with the air:

$$Q = \frac{1}{2} \rho V_M^2 . \qquad (6.8)$$

The dynamic pressure is the amount of pressure that would develop if the air were brought into contact with a surface that stopped the macroscopic motion of the air. Of course, the kinetic energy of the air (i.e. the energy due to the macroscopic motion) would have been converted to unorganized microscopic (internal) energy. This must occur because of the principle of energy conservation. If the air barrier is immobile, then no work is done by the force and so the kinetic energy of the air must have been converted into internal energy. Dynamic pressure is a fundamental quantity in fluid dynamics and aerodynamics.
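As a quick numerical illustration of Eq. 6.8, the sketch below computes the dynamic pressure; the sea-level density and the Mach-2 speed used in the comment are illustrative assumptions, not values taken from the text.

```python
def dynamic_pressure(rho, v_m):
    """Q = 0.5 * rho * V_M^2 (Eq. 6.8); rho in kg/m^3, v_m in m/s, Q in Pa."""
    return 0.5 * rho * v_m**2

# Assumed values: sea-level density 1.225 kg/m^3, speed 680 m/s (about Mach 2)
Q = dynamic_pressure(1.225, 680.0)
```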

Normal Force Equations Having discussed dynamic pressure, we now show how it can be used to create lift. In the case of a missile, it is not really the air that is being brought to a stop. Rather, as a mobile missile comes into contact with the air, the


Figure 6.6. Forces on a tail-controlled missile.

air is being pushed out of the way; that is, the missile exerts a force on the air so as to give the air a velocity. If all the air that comes into contact with the missile were given a velocity equal to that of the missile, then the normal force acting at the CP would be

$$F_N = Q S_M \sin\alpha , \qquad (6.9)$$

where S_M is the cross-sectional area of the missile and α is the angle of attack. Of course, this is only an approximate equation because not all of the air is given a velocity equal to that of the missile. The actual velocity imparted to the air depends on the aerodynamic shape of the surface that is brought into contact with the air. The normal force created by a typical tail controlled missile is due to the aerodynamic force acting on each of the missile's surfaces, as shown in Figure 6.6. The total normal force acting at the CP of the missile is the superposition of the individual forces shown in Figure 6.6:

$$F_N = F_{NOSE} + F_{BODY} + F_{WING} + F_{TAIL} . \qquad (6.10)$$

The force acting on the nose of the missile is approximated by [101, p. 462]

$$F_{NOSE} = Q \frac{\pi D^2}{4} 2\alpha , \qquad (6.11)$$

where D is the diameter of the missile. The force acting on the missile's body is approximated by [101, p. 462]

$$F_{BODY} = Q L D \, 1.5 \alpha^2 = Q S_B \, 1.5 \alpha^2 , \qquad (6.12)$$

where L is the length of the missile and S_B = LD. The aerodynamic force acting on the missile's wings is given by [101, p. 462]

$$F_{WING} = Q S_W \frac{8 \alpha}{\beta} , \qquad (6.13)$$

where S_W is the wing area and \beta = \sqrt{Mach^2 - 1}. In the case of the wing, the normal force depends on the Mach number, which is simply the ratio of the missile's speed to the local speed of sound. The aerodynamic force acting on the missile's tail is given by [101, p. 462]

$$F_{TAIL} = Q S_T \frac{8 (\alpha + \delta)}{\beta} , \qquad (6.14)$$

where S_T is the tail area. Since the wing and tail are roughly approximated as trapezoids, the areas are given by

$$S_W = \frac{1}{2} h_W \left( C_{T,W} + C_{R,W} \right) \qquad (6.15a)$$

$$S_T = \frac{1}{2} h_T \left( C_{T,T} + C_{R,T} \right) , \qquad (6.15b)$$

where h represents the height of the trapezoid, C_{R,\cdot} represents the root chord length, and C_{T,\cdot} represents the tip chord length. It is common to express the normal force acting at the missile's CP as

$$F_N = Q S_{ref} C_N , \qquad (6.16)$$

where S_{ref} = \pi D^2 / 4 and C_N is the normal force coefficient

$$C_N = \underbrace{2\alpha}_{Nose} + \underbrace{\frac{1.5 S_B \alpha^2}{S_{ref}}}_{Body} + \underbrace{\frac{8 S_W \alpha}{\beta S_{ref}}}_{Wing} + \underbrace{\frac{8 S_T (\alpha + \delta)}{\beta S_{ref}}}_{Tail} . \qquad (6.17)$$
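The four contributions in Eq. 6.17 are simple enough to evaluate directly. The sketch below sums them for a given geometry; any numbers supplied at the call site are assumptions of the caller, not values from the text.

```python
def normal_force_coeff(alpha, delta, s_b, s_w, s_t, s_ref, beta):
    """C_N of Eq. 6.17: nose + body + wing + tail terms (angles in radians)."""
    nose = 2.0 * alpha
    body = 1.5 * s_b * alpha**2 / s_ref
    wing = 8.0 * s_w * alpha / (beta * s_ref)
    tail = 8.0 * s_t * (alpha + delta) / (beta * s_ref)
    return nose + body + wing + tail
```

Note that only the tail term depends on the fin deflection δ, which is what makes the linearization C_N = C_Nα α + C_Nδ δ possible later in the section.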

The similarity of Eq. 6.9 and Eq. 6.16 should be obvious. The moment about the missile's center of gravity (CG) is given by

$$M = (X_{CPN} - X_{CG}) F_{NOSE} + (X_{CPB} - X_{CG}) F_{BODY} + (X_{CPW} - X_{CG}) F_{WING} + (X_{CPT} - X_{CG}) F_{TAIL} . \qquad (6.18)$$

It is common practice to express the moment using the moment coefficient C_M:

$$M = Q S_{ref} D C_M , \qquad (6.19)$$

where

$$C_M = \frac{2\alpha}{D} (X_{CG} - X_{CPN}) + \frac{1.5 S_{PLAN} \alpha^2}{S_{ref} D} (X_{CG} - X_{CPB}) + \frac{8 S_W \alpha}{\beta S_{ref} D} (X_{CG} - X_{CPW}) + \frac{8 S_T (\alpha + \delta)}{\beta S_{ref} D} (X_{CG} - X_{HL}) , \qquad (6.20)$$

and X_{CG} is the distance from the nose of the missile to the center of gravity, while X_{CPN}, X_{CPB}, and X_{CPW} are the distances from the nose to the centers of pressure of the nose, body, and wing, respectively. The term X_{HL} is the distance from the nose of the missile to the tail hinge line.

Linearization of the Airframe The normal force and moment coefficients are nonlinear functions of the angle of attack α. This is unfortunate because we would like to use linear control theory to develop a controller for the airframe. The problem is handled by use of a trim angle of attack, which is the steady-state value of α for a given δ. At trim, the moment coefficient is zero, so for a given value of δ the trim angle α_{TR} is obtained by setting Eq. 6.20 equal to zero:

$$0 = \frac{2\alpha_{TR}}{D} (X_{CG} - X_{CPN}) + \frac{1.5 S_{PLAN} \alpha_{TR}^2}{S_{ref} D} (X_{CG} - X_{CPB}) + \frac{8 S_W \alpha_{TR}}{\beta S_{ref} D} (X_{CG} - X_{CPW}) + \frac{8 S_T (\alpha_{TR} + \delta)}{\beta S_{ref} D} (X_{CG} - X_{HL}) . \qquad (6.21)$$

This is a simple quadratic equation and its roots are easily found. Having obtained the trim angle, the normal force coefficient can be expressed as a linear function of α and δ:

$$C_N = C_{N\alpha} \alpha + C_{N\delta} \delta , \qquad (6.22)$$

where

$$C_{N\alpha} = 2 + \frac{1.5 S_B \alpha_{TR}}{S_{ref}} + \frac{8 S_W}{\beta S_{ref}} + \frac{8 S_T}{\beta S_{ref}} \qquad (6.23a)$$

$$C_{N\delta} = \frac{8 S_T}{\beta S_{ref}} . \qquad (6.23b)$$

Similarly for the moment equation,

$$C_M = C_{M\alpha} \alpha + C_{M\delta} \delta , \qquad (6.24)$$

where

$$C_{M\alpha} = \frac{2}{D} (X_{CG} - X_{CPN}) + \frac{1.5 S_B \alpha_{TR}}{S_{ref} D} (X_{CG} - X_{CPB}) + \frac{8 S_W}{\beta S_{ref} D} (X_{CG} - X_{CPW}) + \frac{8 S_T}{\beta S_{ref} D} (X_{CG} - X_{HL}) \qquad (6.25)$$

$$C_{M\delta} = \frac{8 S_T}{\beta S_{ref} D} (X_{CG} - X_{HL}) . \qquad (6.26)$$
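The trim condition of Eq. 6.21 is a quadratic in α_TR and can be solved numerically. The sketch below does so with NumPy; all geometry and flow values passed in are assumptions supplied by the caller, not data from the text.

```python
import numpy as np

def trim_alpha(delta, D, Sref, Splan, Sw, St, beta,
               Xcg, Xcpn, Xcpb, Xcpw, Xhl):
    """Solve Eq. 6.21 for the trim angle of attack (radians)."""
    a = 1.5 * Splan / (Sref * D) * (Xcg - Xcpb)             # alpha^2 term
    b = (2.0 / D) * (Xcg - Xcpn) \
        + 8.0 * Sw / (beta * Sref * D) * (Xcg - Xcpw) \
        + 8.0 * St / (beta * Sref * D) * (Xcg - Xhl)        # alpha term
    c = 8.0 * St * delta / (beta * Sref * D) * (Xcg - Xhl)  # constant term
    roots = np.roots([a, b, c])
    real = roots[np.isreal(roots)].real
    return real[np.argmin(np.abs(real))]   # keep the small, physical root
```

Choosing the smaller-magnitude real root reflects the small-angle regime that the subsequent linearization assumes.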

6.2.2 Airframe Dynamics

A missile's equations of motion are governed by the normal force equation (Eqs. 6.16 and 6.22) and the moment equation (Eqs. 6.19 and 6.24). A diagram of the missile with the force and moment is shown in Figure 6.7. Let I_{yy} denote the rotational inertia of the missile about the axis normal to this page. An often used approximation of the moment of inertia is that of a long slender rod:

$$I_{yy} = \frac{1}{12} m L^2 , \qquad (6.27)$$

where m and L are the missile's mass and length, respectively. From the laws of dynamics, the angular acceleration of the missile body due to the moment is given by

$$\ddot{\theta} = \frac{M}{I_{yy}} = M_\alpha \alpha + M_\delta \delta , \qquad (6.28)$$


Figure 6.7. Missile with normal force F_N at the CP and moment M about the CG.

where

$$M_\alpha = \frac{Q S_{ref} D}{I_{yy}} C_{M\alpha} \qquad (6.29a)$$

$$M_\delta = \frac{Q S_{ref} D}{I_{yy}} C_{M\delta} . \qquad (6.29b)$$

The acceleration n_L of the missile's CG is given by

$$n_L = \frac{F_N}{m} = \frac{Q S_{ref} C_N}{m} . \qquad (6.30)$$

Since the acceleration n_L is normal to the missile body, for small angles of attack it serves to rotate the missile velocity vector:

$$n_L \simeq \dot{\gamma} V_M , \qquad (6.31)$$

where γ is the angle of the missile velocity vector, often called the flight path angle. Rearranging this equation and substituting Eq. 6.16 gives an equation for the angular rate of the missile velocity vector:

$$\dot{\gamma} = -Z_\alpha \alpha - Z_\delta \delta , \qquad (6.32)$$


Figure 6.8. Block diagram of the linearized airframe.

where

$$Z_\alpha = -\frac{Q S_{ref}}{m V_M} C_{N\alpha} \qquad (6.33a)$$

$$Z_\delta = -\frac{Q S_{ref}}{m V_M} C_{N\delta} . \qquad (6.33b)$$

6.2.3 Airframe Transfer Function

Figure 6.8 shows a block diagram of Eqs. 6.28 and 6.32.

Relationship of Fin Angle and Missile Acceleration It is desired to obtain a relationship between the fin angle δ and the missile acceleration n_L. From Figure 6.8, \dot{\alpha} = \dot{\theta} - \dot{\gamma} = Z_\delta \delta + \ddot{\theta}/s + Z_\alpha \alpha, so the following relationship is obtained:

$$
\begin{aligned}
n_L &= V_M \dot{\gamma} \\
&= -V_M \left( Z_\delta \delta + Z_\alpha \alpha \right) \\
&= -V_M \left( Z_\delta \delta + \frac{Z_\alpha}{s} \dot{\alpha} \right) \\
&= -V_M \left( Z_\delta \delta + \frac{Z_\alpha}{s} \left( Z_\delta \delta + \frac{1}{s} \left( M_\alpha \alpha + M_\delta \delta \right) + Z_\alpha \alpha \right) \right) \\
&= -V_M \left( \delta \left( Z_\delta + \frac{Z_\alpha Z_\delta}{s} + \frac{Z_\alpha M_\delta}{s^2} \right) + \frac{Z_\alpha \alpha}{s} \left( Z_\alpha + \frac{M_\alpha}{s} \right) \right) .
\end{aligned}
$$

Substituting Z_\alpha \alpha = -n_L/V_M - Z_\delta \delta and collecting terms gives

$$n_L = -V_M \, \delta \left( Z_\delta + \frac{Z_\alpha M_\delta - Z_\delta M_\alpha}{s^2} \right) + \frac{n_L}{s} \left( Z_\alpha + \frac{M_\alpha}{s} \right) . \qquad (6.34)$$

Grouping terms yields

$$\frac{n_L}{\delta} = \frac{-V_M \left( Z_\delta s^2 + Z_\alpha M_\delta - Z_\delta M_\alpha \right)}{s^2 - Z_\alpha s - M_\alpha} . \qquad (6.35)$$

The airframe transfer function is given by

$$\frac{n_L}{\delta} = G_1(s) \qquad (6.36a)$$

$$G_1(s) = \frac{K_1 \left( 1 - \frac{s^2}{\Omega_z^2} \right)}{1 + \frac{2\zeta_{AF}}{\Omega_{AF}} s + \frac{s^2}{\Omega_{AF}^2}} , \qquad (6.36b)$$

where

$$\Omega_{AF} = \sqrt{-M_\alpha} \qquad (6.36c)$$

$$\zeta_{AF} = \frac{Z_\alpha \Omega_{AF}}{2 M_\alpha} \qquad (6.36d)$$

$$\Omega_z^2 = \frac{Z_\delta M_\alpha - Z_\alpha M_\delta}{Z_\delta} \qquad (6.36e)$$

$$K_1 = -\frac{V_M \left( Z_\delta M_\alpha - Z_\alpha M_\delta \right)}{M_\alpha} = -\frac{V_M Z_\delta \Omega_z^2}{M_\alpha} . \qquad (6.36f)$$
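Equations 6.36c-f map the dimensional stability derivatives into the transfer function parameters. A minimal sketch follows; the derivative values used in the check below are illustrative assumptions, and a statically stable airframe (M_α < 0) is required for the square root.

```python
import math

def airframe_params(Za, Zd, Ma, Md, Vm):
    """Omega_AF, zeta_AF, Omega_z^2, and K1 of Eqs. 6.36c-f."""
    w_af = math.sqrt(-Ma)                  # Omega_AF (needs Ma < 0)
    zeta_af = Za * w_af / (2.0 * Ma)       # zeta_AF
    wz2 = (Zd * Ma - Za * Md) / Zd         # Omega_z^2
    K1 = -Vm * (Zd * Ma - Za * Md) / Ma    # K1 = -Vm*Zd*Omega_z^2/Ma
    return w_af, zeta_af, wz2, K1
```

Rebuilding G_1(s) from these four parameters reproduces the rational form of Eq. 6.35 exactly, which makes a convenient sanity check on any hand-derived values.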

Relationship of Body Rate and Missile Acceleration The transfer function relating the missile body rate \dot{\theta} to the missile acceleration n_L is found as follows. Starting from Eq. 6.28,

$$\dot{\theta} = \frac{1}{s} \left( M_\alpha \alpha + M_\delta \delta \right) ,$$

and substituting \delta = -\left( n_L/V_M + Z_\alpha \alpha \right)/Z_\delta from Eq. 6.32 gives

$$\dot{\theta} = \frac{1}{s} \left( \left( M_\alpha - \frac{M_\delta Z_\alpha}{Z_\delta} \right) \alpha - \frac{M_\delta}{Z_\delta} \frac{n_L}{V_M} \right) . \qquad (6.37)$$

Using \alpha = \left( \dot{\theta} - n_L/V_M \right)/s and rearranging terms gives

$$\dot{\theta} \left( 1 - \frac{M_\alpha Z_\delta - M_\delta Z_\alpha}{Z_\delta s^2} \right) = -\frac{n_L}{V_M} \left( \frac{M_\alpha Z_\delta - M_\delta Z_\alpha}{Z_\delta s^2} + \frac{M_\delta}{Z_\delta s} \right) . \qquad (6.38)$$

Introducing a new variable T_\alpha gives the final result

$$\frac{\dot{\theta}}{n_L} = \frac{1}{V_M} \frac{1 + T_\alpha s}{1 - \frac{s^2}{\Omega_z^2}} \qquad (6.39a)$$

$$T_\alpha = \frac{M_\delta}{\Omega_z^2 Z_\delta} . \qquad (6.39b)$$
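Given numerical stability derivatives, T_α follows directly from Eqs. 6.36e and 6.39b. A small sketch; the derivative values in the check below are assumed, illustrative numbers.

```python
def turning_rate_time_constant(Md, Zd, Ma, Za):
    """T_alpha of Eq. 6.39b, with Omega_z^2 taken from Eq. 6.36e.
    Algebraically this reduces to Md / (Ma*Zd - Md*Za)."""
    wz2 = (Zd * Ma - Za * Md) / Zd   # Omega_z^2
    return Md / (wz2 * Zd)
```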

Relationship of Body Rate and Fin Angle The transfer function relating the missile body rate \dot{\theta} to the tail angle δ is given by

$$\frac{\dot{\theta}}{\delta} = \frac{n_L}{\delta} \frac{\dot{\theta}}{n_L} = G_2(s) = \frac{K_1 \left( 1 - \frac{s^2}{\Omega_z^2} \right)}{1 + \frac{2\zeta_{AF}}{\Omega_{AF}} s + \frac{s^2}{\Omega_{AF}^2}} \cdot \frac{1}{V_M} \frac{1 + T_\alpha s}{1 - \frac{s^2}{\Omega_z^2}} = \frac{K_3 \left( 1 + T_\alpha s \right)}{1 + \frac{2\zeta_{AF}}{\Omega_{AF}} s + \frac{s^2}{\Omega_{AF}^2}} \qquad (6.40a)$$

$$K_3 = \frac{K_1}{V_M} . \qquad (6.40b)$$


Figure 6.9. Standard Three-Loop Autopilot Topology [101, p. 508].

6.3 Three-Loop Autopilot

The purpose of the autopilot is to achieve a desired level of acceleration while maintaining stability of the missile body. The autopilot is a control system that issues commands to the fin position motors in order to achieve the desired acceleration. The appropriate error signal is therefore the difference between the commanded acceleration n_C and the achieved acceleration n_L. To maintain stability, we must have a derivative feedback term for the body rate, and to reduce steady-state error we must also include an integrator. The three-loop autopilot is the preferred control system with these properties. The standard three-loop autopilot has the feedback control scheme shown in Figure 6.9. It should be clear from Figure 6.9 that the outer control loop is a proportional controller acting on the acceleration error; it commands the missile body orientation. The inner control loops form a proportional plus derivative (PD) controller for stabilizing the missile body; they command the motor that orients the tail fin to a desired position. Although not shown, the fin motors have a control system as well. The outer-loop equations are given by

$$e = n_C K_{DC} - n_L = n_C K_{DC} - G(s) G_1(s) e . \qquad (6.41)$$

Rearranging yields

$$n_C K_{DC} = \frac{1}{G(s) G_1(s)} \left( 1 + G(s) G_1(s) \right) n_L \qquad (6.42a)$$

$$\frac{n_L}{n_C} = K_{DC} \frac{G(s) G_1(s)}{1 + G(s) G_1(s)} . \qquad (6.42b)$$

The transfer function G(s) is found from

$$G(s) = \frac{\delta}{e} , \qquad (6.43)$$

where, from Figure 6.9,

$$\delta = G_M(s) \, \delta_C = G_M(s) \left( K_R \left( \frac{\omega_I}{s} \left( G_2(s) \delta - e K_A \right) + G_2(s) \delta \right) \right) . \qquad (6.44)$$

Rearranging gives

$$\delta \left( 1 - K_R G_M(s) G_2(s) \left( \frac{\omega_I}{s} + 1 \right) \right) = -G_M(s) K_R \frac{\omega_I}{s} K_A \, e , \qquad (6.45)$$

or

$$G(s) = \frac{G_M(s) K_R \omega_I K_A \frac{1}{s}}{K_R G_M(s) G_2(s) \left( \frac{\omega_I}{s} + 1 \right) - 1} . \qquad (6.46)$$

The relationship between the achieved and commanded acceleration is given by

$$\frac{n_L}{n_C} = K_{DC} \frac{G(s) G_1(s)}{1 + G(s) G_1(s)} = K_{DC} \frac{G_M(s) K_R \omega_I K_A K_1 \left( 1 - \frac{s^2}{\Omega_z^2} \right)}{s K_R G_M(s) K_3 \left( 1 + T_\alpha s \right) \left( \frac{\omega_I}{s} + 1 \right) - s \left( 1 + \frac{2\zeta_{AF}}{\Omega_{AF}} s + \frac{s^2}{\Omega_{AF}^2} \right) + G_M(s) K_R \omega_I K_A K_1 \left( 1 - \frac{s^2}{\Omega_z^2} \right)} , \qquad (6.47)$$

where G_1(s) and G_2(s) have been substituted from Eqs. 6.36 and 6.40, and the numerator and denominator have been multiplied by s \left( 1 + \frac{2\zeta_{AF}}{\Omega_{AF}} s + \frac{s^2}{\Omega_{AF}^2} \right). Define the variables K and K_0 as follows:

$$K = K_R \omega_I K_A \qquad (6.48a)$$

$$K_0 = K_R \omega_I K_A \left[ \left( K_3 / K_A \right) + K_1 \right] = K \left[ \left( K_3 / K_A \right) + K_1 \right] . \qquad (6.48b)$$

Using these definitions and expanding the denominator of Eq. 6.47, the relationship between the achieved and commanded acceleration is given by

$$\frac{n_L}{n_C} = K_{DC} \frac{G_M(s) K K_1 \left( 1 - \frac{s^2}{\Omega_z^2} \right)}{G_M(s) K_0 + \left[ K_R G_M(s) K_3 \left( 1 + \omega_I T_\alpha \right) - 1 \right] s + \left[ G_M(s) \left( K_R K_3 T_\alpha - \frac{K K_1}{\Omega_z^2} \right) - \frac{2\zeta_{AF}}{\Omega_{AF}} \right] s^2 - \frac{s^3}{\Omega_{AF}^2}} . \qquad (6.49)$$

6.3.1 High Bandwidth Actuator

For guidance system analysis, it is common practice to assume that the motor dynamics are very fast relative to the system response:

$$G_M(s) \simeq 1 . \qquad (6.50)$$

This approximation results in the following relationship between the achieved and commanded acceleration:

$$\frac{n_L}{n_C} = K_{DC} \frac{K K_1 \left( 1 - \frac{s^2}{\Omega_z^2} \right)}{K_0 + \left[ K_R K_3 \left( 1 + \omega_I T_\alpha \right) - 1 \right] s + \left[ K_R K_3 T_\alpha - \frac{K K_1}{\Omega_z^2} - \frac{2\zeta_{AF}}{\Omega_{AF}} \right] s^2 - \frac{s^3}{\Omega_{AF}^2}} \qquad (6.51)$$

$$= K_{DC} \frac{\frac{K K_1}{K_0} \left( 1 - \frac{s^2}{\Omega_z^2} \right)}{1 + \left[ \frac{2\zeta_0}{\Omega_0} - \frac{1}{K_0} \right] s + \left[ \frac{1}{\Omega_0^2} - \frac{2\zeta_{AF}}{K_0 \Omega_{AF}} \right] s^2 - \frac{s^3}{K_0 \Omega_{AF}^2}} , \qquad (6.52)$$

where

$$\frac{2\zeta_0}{\Omega_0} = \frac{K_R K_3 \left( 1 + \omega_I T_\alpha \right)}{K_0} = \frac{K_3 \left( 1 + \omega_I T_\alpha \right)}{\omega_I \left[ K_3 + K_A K_1 \right]} \qquad (6.53)$$

$$\frac{1}{\Omega_0^2} = \left( K_R K_3 T_\alpha - \frac{K K_1}{\Omega_z^2} \right) \frac{1}{K_0} = \frac{K_3 T_\alpha - \omega_I K_A K_1 / \Omega_z^2}{\omega_I \left[ K_3 + K_A K_1 \right]} . \qquad (6.54)$$

For zero steady-state error to a step input we require

$$K_{DC} \frac{K K_1}{K_0} = 1 , \qquad (6.55)$$

and so

$$\frac{n_L}{n_C} = \frac{1 - \frac{s^2}{\Omega_z^2}}{1 + \left[ \frac{2\zeta_0}{\Omega_0} - \frac{1}{K_0} \right] s + \left[ \frac{1}{\Omega_0^2} - \frac{2\zeta_{AF}}{K_0 \Omega_{AF}} \right] s^2 - \frac{s^3}{K_0 \Omega_{AF}^2}} . \qquad (6.56)$$

6.3.2 3-Loop Summary

The closed-loop transfer function given by Eq. 6.56 is third order: one pole is real and the other two are a complex-conjugate pair. Factoring the denominator results in the transfer function

$$\frac{n_L}{n_C} = \frac{1 - \frac{s^2}{\Omega_z^2}}{\left( 1 + T_A s \right) \left( 1 + \frac{2\zeta_A}{\Omega_A} s + \frac{s^2}{\Omega_A^2} \right)} , \qquad (6.57)$$

where

$$\Omega_A^2 = -K_0 \Omega_{AF}^2 T_A \qquad (6.58a)$$

$$\zeta_A = \left( \frac{2\zeta_0}{\Omega_0} - \frac{1}{K_0} - T_A \right) \frac{\Omega_A}{2} . \qquad (6.58b)$$

It is also desirable to know the relationship between n_L and \dot{\theta}. This relationship is readily obtained from Figure 6.9:

$$\frac{\dot{\theta}}{n_L} = \frac{1 + T_\alpha s}{V_M \left( 1 - \frac{s^2}{\Omega_z^2} \right)} . \qquad (6.59)$$

6.3.3 3-Loop Parameters

The three-loop autopilot is described by Eq. 6.57 and Eq. 6.59. Five parameters appear in these two equations. The names of these parameters and their typical values [60] are shown in Table 6.1. The parameters ζ_A, Ω_A, and Ω_z are known as the flight control system damping, natural frequency, and zero, respectively. The variable T_α, known as the turning rate time constant, gets its name from its relationship to the angle of attack and the flight path angle. Consider the ratio of α

Symbol | Definition | Value
T_α | Aerodynamic turning rate time constant | 2 sec
Ω_z | Flight control system zero | 30 rad/sec
Ω_A | Flight control system natural frequency | 20 rad/sec
ζ_A | Flight control system damping | 0.7
T_A | Flight control system time constant | 0.5 sec

Table 6.1. Nominal values for flight control system parameters [60], [61].

to \dot{\gamma}:

$$\frac{\alpha}{\dot{\gamma}} = \frac{\theta - \gamma}{\dot{\gamma}} = \frac{\theta}{\dot{\gamma}} - \frac{1}{s} = V_M \frac{\theta}{n_L} - \frac{1}{s} = \frac{V_M}{s} \frac{\dot{\theta}}{n_L} - \frac{1}{s} = \frac{1}{s} \frac{1 + T_\alpha s}{1 - \frac{s^2}{\Omega_z^2}} - \frac{1}{s} . \qquad (6.60)$$

For the typical values of T_\alpha and \Omega_z given in Table 6.1, we can neglect the term s^2/\Omega_z^2, which gives

$$T_\alpha \simeq \alpha / \dot{\gamma} . \qquad (6.61)$$

In words, "the turning rate time constant is the amount of time it takes to turn the missile flight path angle γ through an equivalent angle of attack α" [101, p. 397].
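The factored closed-loop form of Eq. 6.57, populated with the Table 6.1 values, can be examined numerically. The sketch below builds the numerator and denominator polynomials and checks two properties this chapter relies on: unity DC gain and a negative initial step-response slope (the wrong-way effect) caused by the right-half-plane zero.

```python
import numpy as np

# Table 6.1 values: T_alpha, Omega_z, Omega_A, zeta_A, T_A
Ta, wz, wa, za, TA = 2.0, 30.0, 20.0, 0.7, 0.5

num = np.array([-1.0 / wz**2, 0.0, 1.0])             # 1 - s^2/wz^2
den = np.polymul([TA, 1.0],                          # (1 + TA*s) times
                 [1.0 / wa**2, 2.0 * za / wa, 1.0])  # the quadratic factor

dc_gain = num[-1] / den[-1]        # H(0): should be exactly 1

# Relative degree is 1, so by the initial-value theorem the step
# response starts at 0 with slope lim_{s->inf} s*H(s) = num[0]/den[0]
initial_slope = num[0] / den[0]    # negative: the wrong-way effect
```

This slope, -Ω_A²/(Ω_z² T_A) with these parameters, is what produces the initial dip visible in the step-response figures that follow.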

6.3.4 3-Loop Performance

A thorough analysis of the 3-loop autopilot performance is beyond the scope of this dissertation. However, it is useful to examine the 3-loop autopilot's frequency response and step response. The frequency response of Eq. 6.57 is shown in Figure 6.10 for three different values of the flight control system time constant. The step response of


Figure 6.10. Bode plot (frequency response) of flight control system.

Eq. 6.57 is shown in Figure 6.11 for three different values of the flight control system time constant.

6.4 First Order (Pole) Approximation of Flight Control System

The most common approximation of the flight control system (like many other systems) is a first order lag

$$\frac{n_L}{n_C} = \frac{1}{1 + s \tau_p} , \qquad (6.62)$$

where \tau_p is the overall time constant of the system. The Laplace-domain step response of the pole model is given by

$$\frac{1}{s} \frac{1}{1 + s \tau_p} = \frac{A}{s} + \frac{B}{1 + s \tau_p} = \frac{1}{s} - \frac{\tau_p}{1 + s \tau_p} . \qquad (6.63)$$

The equivalent time-domain step response is

$$S(t) = \left[ 1 - e^{-t/\tau_p} \right] U(t) . \qquad (6.64)$$


Figure 6.11. Flight control system step response for different flight control system time constants.

When t = \tau_p, the output is given by

$$S(\tau_p) = 1 - e^{-1} . \qquad (6.65)$$

In order for the pole model to approximate the 3-loop autopilot, the time constant \tau_p is set equal to the time constant of the 3-loop autopilot given by Eq. 6.57:

$$\tau_p = T_A + \frac{2 \zeta_A}{\Omega_A} . \qquad (6.66)$$

Figure 6.12 shows the quality of the first order approximation for three different values of the flight control system time constant. It is evident that the first order model does an excellent job except for the initial wrong-way effect, which is a result of the right-half-plane (non-minimum phase) zero. The frequency response of the pole model is shown in Figure 6.13. The pole model is a very good match at low frequencies. At high frequencies, the pole model magnitude response decreases at 20 dB/decade, the same as the 3-loop autopilot magnitude response. However, at high frequencies the pole model phase lag is only 90 degrees, while the
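The pole model of Eqs. 6.62-6.66 is trivial to evaluate; the sketch below forms τ_p from the Table 6.1 values and reproduces the 1 - 1/e point of Eq. 6.65.

```python
import math

def pole_model_step(t, TA, zeta_a, wa):
    """First-order step response with tau_p = TA + 2*zeta_A/Omega_A
    (Eqs. 6.64 and 6.66)."""
    tau_p = TA + 2.0 * zeta_a / wa
    return 1.0 - math.exp(-t / tau_p)

# With the Table 6.1 values, tau_p = 0.5 + 2*0.7/20 = 0.57 s, and the
# response reaches 1 - 1/e at t = tau_p, as Eq. 6.65 states.
```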


Figure 6.12. Step response comparison for 1st order (pole) approximation of 3-loop autopilot.

high frequency phase lag of the 3-loop autopilot is 270 degrees. The difference in phase lag is due to the right-half-plane zero in the 3-loop autopilot. The flight control system takes a commanded acceleration and produces an achieved acceleration. The commanded acceleration will be in the form of a guidance command that is a function of the system states. Notice the low-pass filter character of both the 3-loop autopilot model and the pole model. Because the pole model exhibits this filtering effect, it will likely result in a better filter design than the pole-zero model that is discussed in Section 6.5.

6.5 Pole-Zero Approximation of Flight Control System

With the excellent results obtained from the pole approximation of the flight control system, one is tempted to use a pole-zero model to also get the wrong-way effect exhibited in the 3rd order model. That is, the following model is proposed

$$\frac{n_L}{n_C} = \frac{1 + s \tau_z}{1 + s \tau_p} . \qquad (6.67)$$


Figure 6.13. Frequency response comparison of 3-loop autopilot and pole model.

This pole-zero model was explored in [84, Ch. 4]. The Laplace-domain step response of the pole-zero model is given by

$$\frac{1}{s} \frac{1 + s \tau_z}{1 + s \tau_p} = \frac{\tau_z}{\tau_p} \frac{1}{s} + \left( 1 - \frac{\tau_z}{\tau_p} \right) \left[ \frac{1}{s} - \frac{\tau_p}{1 + s \tau_p} \right] . \qquad (6.68)$$

The equivalent time-domain step response is given by

$$S(t) = \frac{\tau_z}{\tau_p} U(t) + \left( 1 - \frac{\tau_z}{\tau_p} \right) \left[ 1 - e^{-t/\tau_p} \right] U(t) = \left[ 1 - \left( 1 - \frac{\tau_z}{\tau_p} \right) e^{-t/\tau_p} \right] U(t) . \qquad (6.69)$$

This model has two parameters that can be used to help it approximate the three-loop autopilot. Let τ denote the time constant of the 3-loop autopilot:

$$\tau = T_A + \frac{2 \zeta_A}{\Omega_A} . \qquad (6.70)$$

At t = τ, the pole-zero model's step response should equal 1 - e^{-1}:

$$1 - e^{-1} = 1 - \left( 1 - \frac{\tau_z}{\tau_p} \right) e^{-\tau/\tau_p} . \qquad (6.71)$$

Solving for \tau_z gives

$$\tau_z = \tau_p \left( 1 - e^{-(1 - \tau/\tau_p)} \right) . \qquad (6.72)$$

Substituting \tau_z into the step response gives

$$S(t) = \left[ 1 - \left( 1 - \frac{\tau_z}{\tau_p} \right) e^{-t/\tau_p} \right] U(t) = \left[ 1 - e^{-(1 - \tau/\tau_p)} e^{-t/\tau_p} \right] U(t) = \left[ 1 - e^{-(1 - \tau/\tau_p + t/\tau_p)} \right] U(t) . \qquad (6.73)$$

There is still a single degree of freedom remaining: the value of \tau_p. Suppose it is chosen such that the crossover time of the 3-loop autopilot and the pole-zero model are the same. Let this time be denoted by t = \tau_m. Then

$$S(\tau_m) = 0 = 1 - e^{-(1 - \tau/\tau_p + \tau_m/\tau_p)} . \qquad (6.74)$$

Taking the logarithm gives

$$1 - \tau/\tau_p + \tau_m/\tau_p = 0 . \qquad (6.75)$$

Solving for \tau_p gives

$$\tau_p = \tau - \tau_m . \qquad (6.76)$$

We can now solve for \tau_z:

$$\tau_z = \tau_p \left( 1 - e^{-(1 - \tau/\tau_p)} \right) = \left( \tau - \tau_m \right) \left( 1 - e^{-(1 - \tau/(\tau - \tau_m))} \right) . \qquad (6.77)$$

Substituting \tau_p into the step response gives

$$S(t) = \left[ 1 - e^{-(1 - \tau/\tau_p + t/\tau_p)} \right] U(t) = \left[ 1 - e^{-1 + (\tau - t)/\tau_p} \right] U(t) = \left[ 1 - e^{-1 + (\tau - t)/(\tau - \tau_m)} \right] U(t) . \qquad (6.78)$$

It is clearly seen that

$$S(\tau) = 1 - e^{-1} \qquad (6.79)$$

and

$$S(\tau_m) = 0 , \qquad (6.80)$$

as required. A plot of the pole-zero model's step response is shown in Figure 6.14. Note that the pole-zero model's step response (Eq. 6.78) is similar in form to the pole model's step response (Eq. 6.64). The main difference is that the pole-zero model's step response is initially negative (τ > τ_m, so the exponential is greater than one):

$$S(0) = 1 - e^{-1 + \tau/(\tau - \tau_m)} = 1 - e^{1/(\tau/\tau_m - 1)} . \qquad (6.81)$$

The frequency response of the pole-zero model is shown in Figure 6.15. The pole-zero model is a very good match at low frequencies; it certainly matches the low frequency region better than the pole model of Figure 6.13. At high frequencies, however, the pole-zero model magnitude response is flat, contrary to the 3-loop autopilot magnitude response, which decreases at 20 dB/decade. Also, at high frequencies the pole-zero model phase lag is 180 degrees, while the high frequency phase lag of the 3-loop autopilot is 270 degrees. The right-half-plane zero of the pole-zero model accounts for the additional 90 degrees of phase lag (relative to the pole model) in the system. Furthermore, the right-half-plane zero makes the system non-minimum phase and produces a wrong-way effect, as is evident from Figure 6.14. The wrong-way effect present in the pole-zero model may make it a more attractive model from a guidance system design perspective. However, the lack of high-frequency magnitude attenuation makes the pole model preferable from a filtering and estimation perspective.
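The fit of Eqs. 6.76-6.78 is easy to check numerically. The sketch below implements the parameter selection and step response; the particular τ and τ_m values in the test are illustrative assumptions.

```python
import math

def pole_zero_params(tau, tau_m):
    """tau_p from the crossover-time condition (Eq. 6.76) and tau_z from
    matching the 1 - 1/e point (Eq. 6.77). tau_z comes out negative,
    which is the right-half-plane zero behind the wrong-way effect."""
    tau_p = tau - tau_m
    tau_z = tau_p * (1.0 - math.exp(-(1.0 - tau / tau_p)))
    return tau_p, tau_z

def pole_zero_step(t, tau, tau_m):
    """Step response of Eq. 6.78."""
    return 1.0 - math.exp(-1.0 + (tau - t) / (tau - tau_m))
```

With, say, τ = 0.57 s and τ_m = 0.05 s, the response satisfies S(τ) = 1 - 1/e, S(τ_m) = 0, and S(0) < 0, reproducing Eqs. 6.79-6.81.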

6.6 Binomial Approximation of Flight Control System

The pole and pole-zero models are not the only ones used to represent the flight control system of a missile. Another approach is to use a binomial of the form [101,


Figure 6.14. Step response comparison for pole-zero approximation of 3-loop autopilot.


Figure 6.15. Frequency response comparison of 3-loop autopilot and pole-zero model.


Figure 6.16. Binomial approximations of flight control system with T_A = 0.3 seconds.

Ch. 5]:

$$\frac{n_L}{n_C} = \frac{1}{\left( 1 + \frac{s \tau}{n} \right)^n} . \qquad (6.82)$$

A comparison of this approximation with the three-loop autopilot is shown in Figure 6.16. As can be seen, not much is gained by using a higher-order binomial approximation; certainly nothing is gained beyond n = 2 once the added complexity is weighed against the modest improvement in quality. The binomial approximation is often used in preliminary design to represent both the flight control system and any other time delays present, such as the time lag due to filtering the sensor measurements. In the preliminary design phase the individual time constants are not generally known, so a single average time constant is used to represent each of them, thereby reducing the number of parameters describing the system.
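The binomial model of Eq. 6.82 is an n-fold repeated pole at -n/τ, so its step response has the closed form of an Erlang (gamma) distribution function, 1 - e^{-nt/τ} Σ_{k=0}^{n-1} (nt/τ)^k / k!. A sketch:

```python
import math

def binomial_step(t, tau, n):
    """Step response of 1/(1 + s*tau/n)^n (Eq. 6.82), via the Erlang CDF."""
    x = n * t / tau
    return 1.0 - math.exp(-x) * sum(x**k / math.factorial(k) for k in range(n))
```

For n = 1 this reduces to the pole model's 1 - e^{-t/τ}; as n grows, the response approaches a pure delay of τ, which is consistent with the observation above that little is gained beyond n = 2.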

Chapter 7 Optimal Guidance

This chapter develops several guidance laws using optimal control theory. Section 7.1 presents a kinematic engagement model and the fundamental equations that describe the engagement. Section 7.1 is followed by three sections that introduce the miss, the zero effort miss (ZEM), and the heading error. These quantities are all of prime importance in optimal guidance and the equations describing them will facilitate the subsequent development of optimal guidance laws. The next several sections in this chapter rigorously develop optimal guidance laws under various control constraints (dynamic, magnitude and directional). This chapter concludes with Section 7.11 on estimating the time at which the engagement ends (the time at which a missile and target are closest), which is also a fundamentally important guidance parameter.

7.1 Engagement Geometry and Dynamics

This section presents a kinematic engagement model and the fundamental equations that describe the engagement. A typical engagement geometry is illustrated in Figure 7.1, in which a pursuing object and an evading object are separated by a vector distance r, known as the line-of-sight (LOS). The pursuer's velocity and acceleration vectors are denoted by Vp and Ap, respectively. Similarly, the evader's velocity and acceleration vectors are denoted by Ve and Ae, respectively. From the perspective of guidance, a model describing the relative distance of the evader to the pursuer is necessary — that is, a model of the dynamics of the LOS, r, is required. Let rp and re denote vectors extending from the origin of some inertial coordinate system to the


Figure 7.1. Typical engagement geometry.

pursuer and evader, respectively. Then the LOS r is expressed as

$$r = r_e - r_p . \qquad (7.1)$$

Taking the first time derivative gives

$$\dot{r} = V_e - V_p . \qquad (7.2)$$

Taking the time derivatives of V_e and V_p gives the following two equations:

$$\dot{V}_e = A_e \qquad (7.3a)$$

$$\dot{V}_p = A_p . \qquad (7.3b)$$

Therefore, the basic equations governing a missile engagement are given by the following three equations:

$$\dot{r} = V_e - V_p \qquad (7.4a)$$

$$\dot{V}_e = A_e \qquad (7.4b)$$

$$\dot{V}_p = A_p . \qquad (7.4c)$$

It may be the case that the acceleration vectors A_e and A_p also have dynamics. For example, the pursuer may issue a commanded acceleration a_p that is related to the achieved acceleration A_p by the LTI differential equation

$$\frac{d m_p}{dt} = F_p m_p + G_p a_p \qquad (7.5a)$$

$$A_p = C_p m_p . \qquad (7.5b)$$

In practice, the most common model of the acceleration dynamics is the single pole model discussed in Section 6.4. Let I_3 denote the 3 × 3 identity matrix:

$$I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} . \qquad (7.6)$$

Then the single pole model is described by the following matrices:

$$F_p = -\frac{1}{\tau_p} I_3 \qquad (7.7a)$$

$$G_p = \frac{1}{\tau_p} I_3 \qquad (7.7b)$$

$$C_p = I_3 . \qquad (7.7c)$$
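The single pole model of Eq. 7.7 is straightforward to assemble; τ_p is whatever actuator time constant is assumed for the pursuer.

```python
import numpy as np

def single_pole_model(tau_p):
    """Fp, Gp, Cp of Eq. 7.7: a first-order lag applied per axis."""
    I3 = np.eye(3)
    return -I3 / tau_p, I3 / tau_p, I3
```

For a constant command a_p, the steady state of dm_p/dt = F_p m_p + G_p a_p is m_p = a_p, so the achieved acceleration settles to the commanded one, matching the pole model of Section 6.4.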

In certain cases the evader dynamics may also be adequately modeled by an LTI differential equation of the form

$$\frac{d m_e}{dt} = F_e m_e + G_e a_e \qquad (7.8a)$$

$$A_e = C_e m_e . \qquad (7.8b)$$

7.2 Miss

This section provides a definition of the miss and derives an equation for it in terms of the kinematic state variables introduced in Section 7.1. The primary goal of guidance is to achieve an acceptable miss. Accordingly, it is no surprise that the equations governing the miss will be important to the development of optimal guidance laws.

The miss is defined to be the LOS vector at a specified final time t_f, and can be obtained by direct integration of the state equations. Integrating the pursuer and evader state equations yields

tf Vp (tf ) Vp (t)= Ap (γ) dγ , (7.9) − Zt and tf Ve (tf ) Ve (t)= Ae (γ) dγ. (7.10) − Zt The equation for r˙ is integrated to yield

r (tf )=r (t)+(tf t)[Ve (tf ) Vp (tf )] (7.11) tf tf− − [Ae (γ) Ap (γ)] dγdβ . (7.12) − − Zt Zβ

Solving Eq. 7.9 for Vp (tf ) and Eq. 7.10 for Ve (tf ) and substituting the results into thepreviousequationgives

tf r (tf )=r (t)+(tf t) r˙ (t)+(tf t) [Ae (γ) Ap (γ)] dγ − − − Zt tf tf (Ae (γ) Ap (γ)) dγdβ .(7.13) − − Zt Zβ Using integration by parts, one may show that

tf tf tf tf (tf γ) a (γ) dγ =(tf t) a (γ) dγ a (γ) dγdβ . (7.14) − − − Zt Zt Zt Zβ Usingthisresultthemissdistanceisgivenby

τ τ r (tf )=r (t)+τr˙ (t)+ ηAe (tf η) dη ηAp (tf η) dη , (7.15) − − − Z0 Z0 where τ = tf t is the time to go. −
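The integration-by-parts identity of Eq. 7.14, which collapses the double integral into a single weighted integral, can be checked numerically. The profile $a(\gamma)$ and the grid below are arbitrary choices for the check, not from the dissertation:

```python
import numpy as np

def trap(y, x):
    """Composite trapezoidal rule (avoids NumPy version differences)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

t, tf, n = 0.0, 2.0, 2001
g = np.linspace(t, tf, n)
a = np.sin(3.0 * g) + 0.5 * g                      # arbitrary test profile

# Left side of Eq. 7.14: single integral weighted by (tf - gamma).
lhs = trap((tf - g) * a, g)

# Right side: (tf - t) * int a  minus the double integral over beta.
inner = np.array([trap(a[i:], g[i:]) for i in range(n)])
rhs = (tf - t) * trap(a, g) - trap(inner, g)

print(abs(lhs - rhs))   # agrees to quadrature accuracy
```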

7.2.1 Acceleration Dynamics

The matrix Fp is time invariant and so the state transition matrix is given by

$$\Phi_p(t) = e^{F_p t} \,. \qquad (7.16)$$

The pursuer acceleration equation can be expressed as

$$m_p(t_f) = \Phi_p(t_f - t)\, m_p(t) + \int_t^{t_f} \Phi_p(t_f - \gamma)\, G_p a_p(\gamma)\, d\gamma = e^{F_p(t_f - t)} m_p(t) + \int_0^{t_f - t} e^{F_p\beta} G_p a_p(t_f - \beta)\, d\beta \,. \qquad (7.17)$$

Solving for $m_p(t)$ gives

$$m_p(t) = e^{-F_p(t_f - t)}\left[ m_p(t_f) - \int_0^{t_f - t} e^{F_p\beta} G_p a_p(t_f - \beta)\, d\beta \right]. \qquad (7.18)$$

An integral of $m_p$ is needed in Eq. 7.15:

$$\int_0^\tau \eta\, m_p(t_f - \eta)\, d\eta = \int_0^\tau \eta\, e^{-F_p\eta}\left[ m_p(t_f) - \int_0^\eta e^{F_p\beta} G_p a_p(t_f - \beta)\, d\beta \right] d\eta \,. \qquad (7.19)$$

Substituting Eq. 7.17 into this result gives

$$\int_0^\tau \eta\, m_p(t_f - \eta)\, d\eta = \left[\int_0^\tau \eta\, e^{-F_p\eta}\, d\eta\right] e^{F_p\tau} m_p(t) + \int_0^\tau\!\!\int_\eta^\tau \eta\, e^{F_p(\beta - \eta)} G_p a_p(t_f - \beta)\, d\beta\, d\eta \,. \qquad (7.20)$$

Substituting Eq. 7.20 into Eq. 7.15 gives

$$r(t_f) = r(t) + \tau\,\dot{r}(t) + \int_0^\tau \eta\, A_e(t_f - \eta)\, d\eta - C_p\left[\int_0^\tau \eta\, e^{-F_p\eta}\, d\eta\right] e^{F_p\tau} m_p(t) - C_p\int_0^\tau\!\!\int_\eta^\tau \eta\, e^{F_p(\beta - \eta)} G_p a_p(t_f - \beta)\, d\beta\, d\eta \,. \qquad (7.21)$$

If the evader also has dynamics, then a similar form of Eq. 7.20 holds:

$$\int_0^\tau \eta\, m_e(t_f - \eta)\, d\eta = \left[\int_0^\tau \eta\, e^{-F_e\eta}\, d\eta\right] e^{F_e\tau} m_e(t) + \int_0^\tau\!\!\int_\eta^\tau \eta\, e^{F_e(\beta - \eta)} G_e a_e(t_f - \beta)\, d\beta\, d\eta \,. \qquad (7.22)$$

If both the pursuer and evader have acceleration dynamics, the miss is found by substituting Eq. 7.22 into Eq. 7.21

$$r(t_f) = r(t) + \tau\,\dot{r}(t) + C_e\left[\int_0^\tau \eta\, e^{-F_e\eta}\, d\eta\right] e^{F_e\tau} m_e(t) - C_p\left[\int_0^\tau \eta\, e^{-F_p\eta}\, d\eta\right] e^{F_p\tau} m_p(t)$$
$$\qquad + C_e\int_0^\tau\!\!\int_\eta^\tau \eta\, e^{F_e(\beta - \eta)} G_e a_e(t_f - \beta)\, d\beta\, d\eta - C_p\int_0^\tau\!\!\int_\eta^\tau \eta\, e^{F_p(\beta - \eta)} G_p a_p(t_f - \beta)\, d\beta\, d\eta \,. \qquad (7.23)$$

7.3 Zero Effort Miss (ZEM)

This section provides a definition of the zero effort miss (ZEM) and derives an equation for it in terms of the kinematic state variables introduced in Section 7.1. The zero effort miss is the miss distance that would occur if the pursuer exerted no acceleration ($A_p = 0$). The ZEM can be obtained by letting $A_p = 0$ in Eq. 7.15:

$$\overrightarrow{\mathrm{ZEM}} = r(t) + \tau\,\dot{r}(t) + \int_0^\tau \eta\, A_e(t_f - \eta)\, d\eta \,. \qquad (7.24)$$

The miss distance can be conveniently expressed using the $\overrightarrow{\mathrm{ZEM}}$:

$$r(t_f) = \overrightarrow{\mathrm{ZEM}} - \int_0^\tau \eta\, A_p(t_f - \eta)\, d\eta \,. \qquad (7.25)$$

The component of the angular line-of-sight (LOS) rate $\Omega$ that is perpendicular to the LOS is denoted by $\Omega_\perp$ and can be computed by

$$\Omega_\perp = \frac{r \times \dot{r}}{r^T r} \,. \qquad (7.26)$$
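For a non-maneuvering evader ($A_e = 0$), Eq. 7.24 and Eq. 7.26 are a few lines of vector arithmetic. A minimal sketch, with illustrative numbers not taken from the text:

```python
import numpy as np

# ZEM = r + tau*rdot (Eq. 7.24 with A_e = 0) and
# Omega_perp = (r x rdot) / (r.r) (Eq. 7.26).
def zem(r, rdot, tau):
    return r + tau * rdot

def omega_perp(r, rdot):
    return np.cross(r, rdot) / np.dot(r, r)

r = np.array([10000.0, 2000.0, 0.0])     # LOS vector [m], illustrative
rdot = np.array([-400.0, -80.0, 0.0])    # closing velocity [m/s]
tau = 25.0                               # time to go [s]

# Here rdot is exactly -r/25, i.e. a collision course: the ZEM vanishes
# and so does the perpendicular LOS rate.
print(zem(r, rdot, tau), omega_perp(r, rdot))
```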

In general, the closing velocity $\dot{r}(t)$ has a component parallel ($\dot{r}_\parallel$) and perpendicular ($\dot{r}_\perp$) to the LOS:

$$\dot{r} = \dot{r}_\parallel + \dot{r}_\perp \qquad (7.27a)$$
$$\quad\;= \dot{r}_\parallel + \Omega \times r \qquad (7.27b)$$
$$\quad\;= \dot{r}_\parallel + \Omega_\perp \times r \,. \qquad (7.27c)$$

Using this equation, the $\overrightarrow{\mathrm{ZEM}}$ can be expressed as

$$\overrightarrow{\mathrm{ZEM}} = r(t) + \tau\,\dot{r}_\parallel(t) + \tau\,\Omega_\perp(t) \times r(t) + \int_0^\tau \eta\, A_e(t_f - \eta)\, d\eta \,. \qquad (7.28)$$

The following near collision-course approximation [43] is often used:

$$r(t) + \tau\,\dot{r}_\parallel(t) \simeq 0 \,. \qquad (7.29)$$

Substitution of Eq. 7.29 into Eq. 7.28 gives a ZEM approximation that uses the angular velocity of the LOS vector:

$$\overrightarrow{\mathrm{ZEM}} \simeq \tau\,\Omega_\perp(t) \times r(t) + \int_0^\tau \eta\, A_e(t_f - \eta)\, d\eta \,. \qquad (7.30)$$

7.3.1 Acceleration Dynamics

If the pursuer has acceleration dynamics, the ZEM is found by setting ap = 0 in Eq. 7.21

$$\overrightarrow{\mathrm{ZEM}} = r(t) + \tau\,\dot{r}(t) + \int_0^\tau \eta\, A_e(t_f - \eta)\, d\eta - C_p\left[\int_0^\tau \eta\, e^{-F_p\eta}\, d\eta\right] e^{F_p\tau} m_p(t) \,. \qquad (7.31)$$

In all cases of practical interest, the matrix $F_p$ will be of full rank and therefore have an inverse. The following integral is useful:

$$\int_0^\tau e^{F_p\gamma}\, d\gamma = \int_0^\tau \sum_{k=0}^\infty \frac{(F_p\gamma)^k}{k!}\, d\gamma = \sum_{k=0}^\infty \frac{F_p^k\,\tau^{k+1}}{(k+1)!} = F_p^{-1}\sum_{k=0}^\infty \frac{F_p^{k+1}\,\tau^{k+1}}{(k+1)!} = F_p^{-1}\sum_{k=1}^\infty \frac{(F_p\tau)^k}{k!} = F_p^{-1}\left[e^{F_p\tau} - I\right]. \qquad (7.32)$$

Another useful integral that can be solved using integration by parts is

$$\left[\int_0^\tau \eta\, e^{-F_p\eta}\, d\eta\right] e^{F_p\tau} = \int_0^\tau \eta\, e^{F_p(\tau - \eta)}\, d\eta = \int_0^\tau (\tau - \gamma)\, e^{F_p\gamma}\, d\gamma \,. \qquad (7.33)$$

Define the following integration by parts variables:

$$u = \tau - \gamma \qquad (7.34a)$$
$$du = -d\gamma \qquad (7.34b)$$
$$dv = e^{F_p\gamma}\, d\gamma \qquad (7.34c)$$
$$v = F_p^{-1}\left[e^{F_p\gamma} - I\right]. \qquad (7.34d)$$

Thus, the integral is evaluated:

$$\int_0^\tau (\tau - \gamma)\, e^{F_p\gamma}\, d\gamma = (\tau - \gamma)\, F_p^{-1}\left[e^{F_p\gamma} - I\right]\Big|_0^\tau + \int_0^\tau F_p^{-1}\left[e^{F_p\gamma} - I\right] d\gamma = F_p^{-1}\left[F_p^{-1}\left[e^{F_p\tau} - I\right] - \tau I\right] = F_p^{-2}\left[e^{F_p\tau} - I - \tau F_p\right]. \qquad (7.35)$$

Using this result, the ZEM is given by

$$\overrightarrow{\mathrm{ZEM}} = r(t) + \tau\,\dot{r}(t) + \int_0^\tau \eta\, A_e(t_f - \eta)\, d\eta - C_p F_p^{-2}\left[e^{F_p\tau} - I - \tau F_p\right] m_p(t) \,. \qquad (7.36)$$

The miss distance can be conveniently expressed using the $\overrightarrow{\mathrm{ZEM}}$:

$$r(t_f) = \overrightarrow{\mathrm{ZEM}} - \int_0^\tau\!\!\int_\eta^\tau \eta\, C_p e^{F_p(\beta - \eta)} G_p a_p(t_f - \beta)\, d\beta\, d\eta \,. \qquad (7.37)$$

7.4 Heading Error

This section provides a definition of the heading error and derives an equation for it in terms of the kinematic state variables introduced in Section 7.1. The heading error is defined to be the angular misalignment of the pursuer's velocity vector from the orientation that would result in a null zero effort miss. A pursuer with a zero heading error can achieve $r(t_f) = 0$ without accelerating (i.e., $\dot{V}_p = 0$). Let $\hat{u}_p$ denote a unit vector along the pursuer velocity vector:

$$V_p = V_p\,\hat{u}_p \,. \qquad (7.38)$$

A unit vector $\hat{u}_p'$ is sought that will null the zero effort miss:

$$\overrightarrow{\mathrm{ZEM}}' = r + \left(V_e - V_p\hat{u}_p'\right)\tau + \int_0^\tau \eta\, A_e\, d\eta = 0 \,. \qquad (7.39)$$

This relationship gives three equations and four unknowns; three for the components of $\hat{u}_p'$ and one for $\tau$. Another equation is needed and can be obtained by noting that the magnitude of $\hat{u}_p'$ must be unity:

$$\hat{u}_p'^{\,T}\hat{u}_p' = 1 \,. \qquad (7.40)$$

Solving Eq. 7.39 for $\hat{u}_p'$ gives the orientation required for a collision course:

$$\hat{u}_p' = \frac{1}{\tau V_p}\left(\int_0^\tau \eta\, A_e\, d\eta + \tau V_e + r\right) \qquad (7.41)$$
$$\quad\;\,= \frac{1}{\tau V_p}\left(\overrightarrow{\mathrm{ZEM}} + \tau V_p\right). \qquad (7.42)$$

Any deviation in this value will result in a heading error:

$$H = \arccos\left(\hat{u}_p' \cdot \hat{u}_p\right) \qquad (7.43a)$$
$$\quad= \arccos\left(\frac{1}{\tau V_p}\left(\overrightarrow{\mathrm{ZEM}} + \tau V_p\right) \cdot \frac{V_p}{V_p}\right) \qquad (7.43b)$$
$$\quad= \arccos\left(\frac{1}{\tau V_p^2}\left(\overrightarrow{\mathrm{ZEM}} + \tau V_p\right)^T V_p\right) \qquad (7.43c)$$
$$\quad= \arccos\left(\frac{1}{\tau V_p^2}\,\overrightarrow{\mathrm{ZEM}}^T V_p + 1\right). \qquad (7.43d)$$

The magnitude constraint is given by

$$\left(\overrightarrow{\mathrm{ZEM}} + \tau V_p\right)^T\left(\overrightarrow{\mathrm{ZEM}} + \tau V_p\right) = \tau^2 V_p^2 \,. \qquad (7.44)$$

For a constant velocity evader, the magnitude constraint is given by

$$\left(V_e + \frac{1}{\tau}\, r\right)^T\left(V_e + \frac{1}{\tau}\, r\right) = V_p^2 \,. \qquad (7.45)$$

This is a quadratic equation in $\tau$ with roots

$$\tau = \frac{V_e^T r \pm \sqrt{\left(V_e^T r\right)^2 + \left(V_p^2 - V_e^2\right) r^T r}}{V_p^2 - V_e^2} \,. \qquad (7.46)$$

For $V_p > V_e$ and $\tau > 0$, the only viable root is

$$\tau = \frac{V_e^T r + \sqrt{\left(V_e^T r\right)^2 + \left(V_p^2 - V_e^2\right) r^T r}}{V_p^2 - V_e^2} \,. \qquad (7.47)$$

7.5 Basic Model

The first guidance law to be developed will be for a rather basic engagement model. The basic model assumes a pursuing object (e.g. a missile) and an evading object (e.g. a target) can be modeled as particles in space that have unconstrained and instantaneous control over their acceleration vectors. Under such conditions, the equations of motion are given by

$$\dot{r} = V_e - V_p \qquad (7.48a)$$
$$\dot{V}_e = A_e \qquad (7.48b)$$
$$\dot{V}_p = A_p \,, \qquad (7.48c)$$

where $r$ is the vector distance from the pursuer to the evader. It is desired that the pursuer minimize the separation between itself and the evader in a minimal time while using a minimal amount of control effort. The following cost functional is parameterized to balance the final time, the final miss distance and the total control effort:

$$J = k_t t_f + \frac{k_r}{2}\, r(t_f)^T r(t_f) + \frac{1}{2}\int_0^{t_f} A_p^T A_p\, dt \,. \qquad (7.49)$$

If the following equation is added to the system,

$$\dot{c} = k_t + \frac{1}{2}\, A_p^T A_p \,, \qquad (7.50)$$

the complete state vector is given by¹

$$x = \begin{bmatrix} r \\ V_p \\ c \end{bmatrix}. \qquad (7.51)$$

Under this formulation the cost functional becomes

$$J = \frac{k_r}{2}\, r^T r + c \,. \qquad (7.52)$$

The Hamiltonian is given by

$$H = \left(k_t + \frac{1}{2}\, A_p^T A_p\right)\lambda_c + \lambda_r^T\left(V_e - V_p\right) + \lambda_p^T A_p \,. \qquad (7.53)$$

An optimal solution requires

$$\min_{A_p} H = 0 \,. \qquad (7.54)$$

The adjoint dynamics are given by

$$\dot{\lambda}^T = -\frac{\partial H}{\partial x} \qquad (7.55a)$$
$$\dot{\lambda}_c = 0 \qquad (7.55b)$$
$$\dot{\lambda}_r = 0 \qquad (7.55c)$$
$$\dot{\lambda}_p = \lambda_r \,. \qquad (7.55d)$$

The final-time values for the adjoint variables are given by

$$\lambda_f^T = \frac{\partial J}{\partial x}\bigg|_{t=t_f} \qquad (7.56a)$$
$$\lambda_c(t_f) = 1 \qquad (7.56b)$$
$$\lambda_r(t_f) = k_r\, r(t_f) \qquad (7.56c)$$
$$\lambda_p(t_f) = 0 \,. \qquad (7.56d)$$

¹For purposes of solving the optimal control problem, the evader's dynamics are not important. However, since they enter the problem through the velocity vector, they are taken into account.

The adjoint variable $\lambda_c$ is given by

$$\lambda_c(t) = 1 \,. \qquad (7.57)$$

The adjoint variable $\lambda_r$ is given by

$$\lambda_r(t) = k_r\, r(t_f) \,. \qquad (7.58)$$

The adjoint variable $\lambda_p$ is given by

$$\lambda_p(t) = -k_r\left(t_f - t\right) r(t_f) \,. \qquad (7.59)$$

Optimality requires

$$\frac{\partial H}{\partial A_p} = 0_{3\times1} = A_p^T\lambda_c + \lambda_p^T \,. \qquad (7.60)$$

Rearranging and substituting Eq. 7.59 gives

$$A_p = -\frac{1}{\lambda_c}\,\lambda_p = k_r\left(t_f - t\right) r(t_f) \,. \qquad (7.61)$$

Substituting Eq. 7.61 into Eq. 7.25

$$r(t_f) = \overrightarrow{\mathrm{ZEM}} - k_r\, r(t_f)\int_0^\tau \eta^2\, d\eta \,, \qquad (7.62)$$

and solving for $r(t_f)$ gives

$$r(t_f) = \frac{1}{1 + \frac{k_r}{3}\tau^3}\,\overrightarrow{\mathrm{ZEM}} \,. \qquad (7.63)$$

Substituting Eq. 7.63 into Eq. 7.61 results in the zero effort miss (ZEM) guidance law (GL)

$$A_p = k_r\tau\, r(t_f) \qquad (7.64a)$$
$$\quad\;= \frac{N(\tau)}{\tau^2}\,\overrightarrow{\mathrm{ZEM}} \,, \qquad (7.64b)$$

where

$$N(\tau) = \frac{k_r\tau^3}{1 + \frac{k_r}{3}\tau^3} \qquad (7.65)$$
$$\quad\;\simeq 3 \qquad (7.66)$$

is referred to as the navigation gain. The guidance law given by Eq. 7.64b is optimal for the performance index given by Eq. 7.49 and the state equations given by Eq. 7.4. The ZEM guidance law given by Eq. 7.64b is closely related to two other well known guidance laws. These guidance laws are developed in the next two subsections.
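The ZEM guidance law of Eq. 7.64b with the gain of Eq. 7.65 can be sketched as follows for a non-maneuvering evader; the function names and sample numbers are illustrative, not from the dissertation:

```python
import numpy as np

# A_p = N(tau)/tau^2 * ZEM, with N(tau) = kr*tau^3 / (1 + (kr/3)*tau^3)
# (Eqs. 7.64-7.65) and ZEM = r + tau*rdot (Eq. 7.24, A_e = 0).
def nav_gain(kr, tau):
    return kr * tau**3 / (1.0 + kr * tau**3 / 3.0)

def zem_guidance(r, rdot, tau, kr):
    zem = r + tau * rdot
    return nav_gain(kr, tau) / tau**2 * zem

r = np.array([8000.0, 500.0, 0.0])
rdot = np.array([-300.0, 10.0, 0.0])
tau, kr = 20.0, 1.0

a_p = zem_guidance(r, rdot, tau, kr)
print(a_p, nav_gain(kr, tau))   # gain is close to its large-kr limit of 3
```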

7.5.1 Augmented Proportional Navigation Guidance

Substituting Eq. 7.30 into Eq. 7.64b results in the true augmented proportional navigation (TAPN) GL

$$A_p = \frac{k_r\tau}{1 + \frac{k_r}{3}\tau^3}\left(\tau\,\Omega_\perp \times r + \int_0^\tau \eta\, A_e\, d\eta\right). \qquad (7.67)$$

The guidance law given by Eq. 7.67 has traditionally been known as augmented proportional navigation (APN). Several forms of APN are discussed in this dissertation and it is necessary to distinguish between them. The reason for calling this form of APN "true" will become clear in Section 7.5.2.

7.5.2 Proportional Navigation Guidance

Neglecting target acceleration results in the true proportional navigation (TPN) GL

$$A_p = \frac{k_r\tau^2}{1 + \frac{k_r}{3}\tau^3}\,\Omega_\perp \times r \,. \qquad (7.68)$$

The TPN guidance law given by Eq. 7.68 is often called PN. However, it is not the only form of PN, and when it is necessary to distinguish Eq. 7.68 from other forms of PN, it is referred to as TPN [53]. For this reason, the ZEM and APN guidance laws given by Eq. 7.64b and Eq. 7.67 are referred to as TZEM and TAPN.
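A point-mass engagement under a TPN-style command can be sketched in a few lines. This is an illustrative simulation under assumed geometry and a simple Euler integrator, not a result from the dissertation; $A_p = (3/\tau)\,\Omega_\perp \times r$ is the large-$k_r$ form of Eq. 7.68:

```python
import numpy as np

dt = 0.01
rp = np.array([0.0, 0.0, 0.0]);       vp = np.array([250.0, 150.0, 0.0])
re = np.array([5000.0, 1000.0, 0.0]); ve = np.array([-150.0, 0.0, 0.0])

miss = np.inf
for _ in range(20000):
    r = re - rp                                  # LOS vector, Eq. 7.1
    rdot = ve - vp                               # Eq. 7.2
    miss = min(miss, np.linalg.norm(r))
    vc = -np.dot(r, rdot) / np.linalg.norm(r)    # closing speed
    if vc <= 0 or np.linalg.norm(r) < 5.0:
        break
    tau = np.linalg.norm(r) / vc                 # time-to-go estimate
    omega = np.cross(r, rdot) / np.dot(r, r)     # Eq. 7.26
    ap = 3.0 / tau * np.cross(omega, r)          # TPN command, N ~ 3
    vp += ap * dt
    rp += vp * dt
    re += ve * dt

print(miss)   # small closest-approach distance for this geometry
```

The gain 3 and the $|r|/v_c$ time-to-go estimate are the usual engineering choices; a full implementation would also model the autopilot lag treated in Section 7.6.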

7.6 Acceleration Dynamics

This section considers a more complicated model than the basic model that was used in Section 7.5. In this model, the pursuer and evader are unable to instantaneously command their acceleration. Rather, the pursuer (evader) commands an acceleration $a_p$ ($a_e$) and produces an achieved acceleration $A_p$ ($A_e$). The commanded and achieved acceleration are described by a generic linear time-invariant set of differential equations. The engagement model is given by the following equations.

$$\dot{r} = V_e - V_p \qquad (7.69a)$$
$$\dot{V}_e = A_e = C_e m_e \qquad (7.69b)$$
$$\dot{V}_p = A_p = C_p m_p \qquad (7.69c)$$
$$\dot{m}_p = F_p m_p + G_p a_p \qquad (7.69d)$$
$$\dot{m}_e = F_e m_e + G_e a_e \qquad (7.69e)$$

It is desired that the pursuer minimize the separation between itself and the evader in a minimal time while using a minimal amount of control effort. The following cost functional is parameterized to balance the final time, the final miss distance and the total control effort:

$$J = k_t t_f + \frac{k_r}{2}\, r(t_f)^T r(t_f) + \frac{1}{2}\int_0^{t_f} a_p^T a_p\, dt \,. \qquad (7.70)$$

If the following equation is added to the system,

$$\dot{c} = k_t + \frac{1}{2}\, a_p^T a_p \,, \qquad (7.71)$$

the complete state vector is given by²

$$x = \begin{bmatrix} r \\ V_p \\ m_p \\ c \end{bmatrix}. \qquad (7.72)$$

²For purposes of solving the optimal control problem, the evader's dynamics are not important. However, since they enter the problem through the velocity vector, they are taken into account.

Under this formulation the cost functional becomes

$$J = \frac{k_r}{2}\, r^T r + c \,. \qquad (7.73)$$

The Hamiltonian is given by

$$H = \left(k_t + \frac{1}{2}\, a_p^T a_p\right)\lambda_c + \lambda_r^T\left(V_e - V_p\right) + \lambda_v^T C_p m_p + \lambda_a^T\left(F_p m_p + G_p a_p\right). \qquad (7.74)$$

An optimal solution requires

$$\min_{a_p} H = 0 \,. \qquad (7.75)$$

The adjoint dynamics are given by

$$\dot{\lambda}^T = -\frac{\partial H}{\partial x}$$
$$\dot{\lambda}_r = 0 \qquad (7.76a)$$
$$\dot{\lambda}_v = \lambda_r \qquad (7.76b)$$
$$\dot{\lambda}_a = -C_p^T\lambda_v - F_p^T\lambda_a \qquad (7.76c)$$
$$\dot{\lambda}_c = 0 \,. \qquad (7.76d)$$

The final-time values for the adjoint variables are given by

$$\lambda_f^T = \frac{\partial J}{\partial x}\bigg|_{t=t_f} \qquad (7.77)$$
$$\lambda_r(t_f) = k_r\, r(t_f) \qquad (7.78)$$
$$\lambda_v(t_f) = 0 \qquad (7.79)$$
$$\lambda_a(t_f) = 0 \qquad (7.80)$$
$$\lambda_c(t_f) = 1 \,. \qquad (7.81)$$

The adjoint variable $\lambda_c$ is given by

$$\lambda_c(t) = 1 \,. \qquad (7.82)$$

The adjoint variable $\lambda_r$ is given by

$$\lambda_r(t) = k_r\, r(t_f) \,. \qquad (7.83)$$

The adjoint variable $\lambda_v$ is given by

$$\lambda_v(t) = -k_r\left(t_f - t\right) r(t_f) \,. \qquad (7.84)$$

Let $\Phi_p$ denote the state transition matrix corresponding to $F_p$:

$$\Phi_p(t) = \sum_{k=0}^\infty \frac{(F_p t)^k}{k!} = e^{F_p t} \,. \qquad (7.85)$$

Then, the state transition matrix for $-F_p^T$ is

$$e^{-F_p^T t} = \sum_{k=0}^\infty \frac{\left(-F_p^T t\right)^k}{k!} = \left[\sum_{k=0}^\infty \frac{(-F_p t)^k}{k!}\right]^T = \Phi_p^T(-t) = \left[e^{-F_p t}\right]^T. \qquad (7.86)$$

The adjoint variable $\lambda_a$ is given by

$$\lambda_a(t_f) = e^{-F_p^T(t_f - t)}\lambda_a(t) - \int_t^{t_f} e^{-F_p^T(t_f - \gamma)} C_p^T\lambda_v(\gamma)\, d\gamma = e^{-F_p^T(t_f - t)}\lambda_a(t) + k_r\left[\int_t^{t_f} (t_f - \gamma)\, e^{-F_p^T(t_f - \gamma)}\, d\gamma\right] C_p^T r(t_f) = e^{-F_p^T\tau}\lambda_a(t) + k_r\left[\int_0^\tau \beta\, e^{-F_p^T\beta}\, d\beta\right] C_p^T r(t_f) \,, \qquad (7.87)$$

where $\tau = t_f - t$. Solving for $\lambda_a(t)$ gives

$$\lambda_a(t) = -k_r\left(\int_0^\tau \beta\, e^{F_p^T(\tau - \beta)}\, d\beta\right) C_p^T r(t_f) \,. \qquad (7.88)$$

tf T T Fp (tf t) Fp (tf γ) T λa (tf )=e− − λa (t) e− − Cp λv (γ) dγ − t Z tf T T Fp (tf t) Fp (tf γ) T = e− − λa (t)+kr (tf γ) e− − dγ Cp r (tf ) t − τ∙Z ¸ T T Fp τ Fp β T = e− λa (t)+kr βe− dβ Cp r (tf ) , (7.87) ∙Z0 ¸ where τ = tf t.Solvingforλa (t) gives − τ T Fp (τ β) T λa (t)= kr βe − dβ C r (tf ) . (7.88) − p µZ0 ¶ Optimality requires

$$\frac{\partial H}{\partial a_p} = 0_{3\times1} = a_p^T\lambda_c + \lambda_a^T G_p \,. \qquad (7.89)$$

Rearranging and substituting Eq. 7.88 gives

$$a_p = -\frac{1}{\lambda_c}\, G_p^T\lambda_a = G_p^T k_r\left(\int_0^\tau \beta\, e^{F_p^T(\tau - \beta)}\, d\beta\right) C_p^T r(t_f) = \theta_N(\tau)\, k_r\tau\, r(t_f) \,, \qquad (7.90)$$

where θN (τ) is given by the following equation.

$$\theta_N(\tau) = \frac{1}{\tau}\, G_p^T\left(\int_0^\tau \beta\, e^{F_p^T(\tau - \beta)}\, d\beta\right) C_p^T = \frac{1}{\tau}\left[C_p\left(\int_0^\tau \beta\, e^{F_p(\tau - \beta)}\, d\beta\right) G_p\right]^T = \frac{1}{\tau}\left[C_p F_p^{-2}\left(e^{F_p\tau} - I - \tau F_p\right) G_p\right]^T. \qquad (7.91)$$

Substituting Eq. 7.90 into Eq. 7.37,

$$r(t_f) = \overrightarrow{\mathrm{ZEM}} - C_p\int_0^\tau\!\!\int_\eta^\tau \eta\, e^{F_p(\gamma - \eta)} G_p\, a_p(t_f - \gamma)\, d\gamma\, d\eta$$
$$\qquad= \overrightarrow{\mathrm{ZEM}} - k_r\left[\int_0^\tau\!\!\int_\eta^\tau\!\!\int_0^\gamma \eta\beta\left(C_p e^{F_p(\gamma - \eta)} G_p\right)\left(C_p e^{F_p(\gamma - \beta)} G_p\right)^T d\beta\, d\gamma\, d\eta\right] r(t_f)$$
$$\qquad= \overrightarrow{\mathrm{ZEM}} - k_r\,\frac{\tau^3}{3}\,\theta_D(\tau)\, r(t_f) \,, \qquad (7.92)$$

where

$$\theta_D(\tau) = \frac{3}{\tau^3}\left[\int_0^\tau\!\!\int_\eta^\tau\!\!\int_0^\gamma \eta\beta\left(C_p e^{F_p(\gamma - \eta)} G_p\right)\left(C_p e^{F_p(\gamma - \beta)} G_p\right)^T d\beta\, d\gamma\, d\eta\right]. \qquad (7.93)$$

Solving for $r(t_f)$ in Eq. 7.92 gives

$$r(t_f) = \left[I + k_r\,\frac{\tau^3}{3}\,\theta_D(\tau)\right]^{-1}\overrightarrow{\mathrm{ZEM}} \,. \qquad (7.94)$$

Substituting Eq. 7.94 into Eq. 7.90 gives

$$a_p = \theta_N(\tau)\, k_r\tau\left[I + k_r\,\frac{\tau^3}{3}\,\theta_D(\tau)\right]^{-1}\overrightarrow{\mathrm{ZEM}} \,. \qquad (7.95)$$

7.6.1 Decoupled Dynamics

In Eq. 7.95 the pursuer's commanded acceleration is formed by rotating and scaling the ZEM vector. Let $(x, y, z)$ denote the axes of a non-rotating coordinate system attached to the pursuer. The control dynamics are decoupled if the component of the pursuer's achieved acceleration along an axis of the coordinate system is only affected by the component of the commanded acceleration along that same axis. When the dynamics are decoupled, the component of the pursuer's commanded acceleration along an axis of the coordinate system is only a function of the ZEM projected onto the corresponding axis. Furthermore, if the acceleration dynamics are the same along each of the coordinate system axes, then the pursuer commanded acceleration is simply the $\overrightarrow{\mathrm{ZEM}}$ multiplied by a scalar valued function.

7.7 Single Pole Flight Control System Model

This section considers a special case of the general guidance law developed in Section 7.6. In this section, the pursuer's commanded and achieved acceleration are related by a single pole (time constant) model. The single pole flight control system model described in Section 6.4 is really only valid for the axes that are perpendicular to the missile velocity vector. Nevertheless, by convenient choice of the coordinate system, the acceleration dynamics can be approximated by the single pole model in all coordinate axes. When this is done, the matrices describing the pursuer dynamics in Eq. 7.5 are given below.

$$F_p = -\frac{1}{\tau_p}\, I_3 \qquad (7.96a)$$
$$G_p = \frac{1}{\tau_p}\, I_3 \qquad (7.96b)$$
$$C_p = I_3 \qquad (7.96c)$$

The state transition matrix is given by

$$e^{F_p t} = e^{-t/\tau_p}\, I_3 \,. \qquad (7.97)$$

The ZEM can be obtained by substituting Eq. 7.96a and Eq. 7.97 into Eq. 7.36:

$$\overrightarrow{\mathrm{ZEM}} = r(t) + \tau\,\dot{r}(t) + \int_0^\tau \eta\, A_e(t_f - \eta)\, d\eta - C_p F_p^{-2}\left[e^{F_p\tau} - I - \tau F_p\right] m_p(t)$$
$$\qquad\;\;= r(t) + \tau\,\dot{r}(t) + \int_0^\tau \eta\, A_e(t_f - \eta)\, d\eta - \tau_p^2\left[e^{-\tau/\tau_p} - 1 + \frac{\tau}{\tau_p}\right] m_p(t) \,. \qquad (7.98)$$
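The lag-compensated ZEM of Eq. 7.98 adds a correction proportional to the currently achieved acceleration $m_p$. A minimal sketch, with illustrative numbers and the evader assumed non-maneuvering:

```python
import numpy as np

# ZEM = r + tau*rdot - tau_p^2 (e^{-tau/tau_p} - 1 + tau/tau_p) m_p  (Eq. 7.98)
def zem_single_pole(r, rdot, m_p, tau, tau_p):
    lag = tau_p**2 * (np.exp(-tau / tau_p) - 1.0 + tau / tau_p)
    return r + tau * rdot - lag * m_p

r = np.array([6000.0, 800.0, 0.0])
rdot = np.array([-350.0, -40.0, 0.0])
m_p = np.array([0.0, 30.0, 0.0])        # achieved acceleration [m/s^2]
tau, tau_p = 15.0, 0.5

print(zem_single_pole(r, rdot, m_p, tau, tau_p))
```

For an instantaneous flight control system ($\tau_p \to 0$) the correction term vanishes and the expression reduces to the basic-model ZEM of Eq. 7.24.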

The term θN can be evaluated by substituting Eq. 7.96 and Eq. 7.97 into Eq. 7.91

$$\theta_N(\tau) = \frac{1}{\tau}\left[C_p F_p^{-2}\left(e^{F_p\tau} - I - \tau F_p\right) G_p\right]^T = \frac{\tau_p}{\tau}\left(e^{-\tau/\tau_p} - 1 + \frac{\tau}{\tau_p}\right) I_3 = \frac{1}{h}\left(e^{-h} - 1 + h\right) I_3 \,, \qquad (7.99)$$

where $h = \tau/\tau_p$. The term $\theta_D$ can be evaluated by substituting Eq. 7.96 and Eq. 7.97 into Eq. 7.93:

$$\theta_D(\tau) = \frac{3}{\tau^3}\frac{1}{\tau_p^2}\left[\int_0^\tau\!\!\int_\eta^\tau\!\!\int_0^\gamma \eta\beta\, e^{-(2\gamma - \eta - \beta)/\tau_p}\, d\beta\, d\gamma\, d\eta\right] I_3 \,. \qquad (7.100)$$

The following integral will be used several times in the evaluation of the above triple integral:

$$\int_0^\gamma \beta\, e^{\beta/\alpha}\, d\beta = \beta\alpha\, e^{\beta/\alpha}\Big|_0^\gamma - \int_0^\gamma \alpha\, e^{\beta/\alpha}\, d\beta = \gamma\alpha\, e^{\gamma/\alpha} - \alpha^2\left(e^{\gamma/\alpha} - 1\right) = \alpha^2\left[\left(\frac{\gamma}{\alpha} - 1\right)e^{\gamma/\alpha} + 1\right]. \qquad (7.101)$$

With appropriate substitutions for $\alpha$, the term $\theta_D$ is solved for:

$$\theta_D(\tau) = \frac{3}{\tau^3}\frac{1}{\tau_p^2}\int_0^\tau \eta\, e^{\eta/\tau_p}\int_\eta^\tau e^{-2\gamma/\tau_p}\left[\int_0^\gamma \beta\, e^{\beta/\tau_p}\, d\beta\right] d\gamma\, d\eta\; I_3$$
$$\qquad\;= \frac{3}{\tau^3}\int_0^\tau \eta\, e^{\eta/\tau_p}\int_\eta^\tau \left[\frac{\gamma}{\tau_p}\, e^{-\gamma/\tau_p} - e^{-\gamma/\tau_p} + e^{-2\gamma/\tau_p}\right] d\gamma\, d\eta\; I_3$$
$$\qquad\;= \frac{3}{\tau^3}\int_0^\tau \left[-\tau_p\left(\frac{\tau}{\tau_p}\, e^{-\tau/\tau_p} + \frac{1}{2}\, e^{-2\tau/\tau_p}\right)\eta\, e^{\eta/\tau_p} + \eta^2 + \frac{\tau_p}{2}\,\eta\, e^{-\eta/\tau_p}\right] d\eta\; I_3$$
$$\qquad\;= \frac{3}{\tau^3}\left(I_1 + I_2\right) I_3 \,, \qquad (7.102)$$

where $I_1$ and $I_2$ are the integrals

$$I_1 = -\tau_p\left(\frac{\tau}{\tau_p}\, e^{-\tau/\tau_p} + \frac{1}{2}\, e^{-2\tau/\tau_p}\right)\int_0^\tau \eta\, e^{\eta/\tau_p}\, d\eta \qquad (7.103a)$$
$$I_2 = \int_0^\tau \left(\eta^2 + \frac{\tau_p}{2}\,\eta\, e^{-\eta/\tau_p}\right) d\eta \,. \qquad (7.103b)$$

The integral I1 has the solution

$$I_1 = -\tau_p\left(\frac{\tau}{\tau_p}\, e^{-\tau/\tau_p} + \frac{1}{2}\, e^{-2\tau/\tau_p}\right)\tau_p^2\left[\left(\frac{\tau}{\tau_p} - 1\right)e^{\tau/\tau_p} + 1\right] = -\tau_p^3\left[h(h - 1) + h\, e^{-h} + \frac{1}{2}(h - 1)\, e^{-h} + \frac{1}{2}\, e^{-2h}\right]. \qquad (7.104)$$

The integral $I_2$ has the solution

$$I_2 = \frac{\tau^3}{3} + \frac{\tau_p^3}{2}\left[1 - (h + 1)\, e^{-h}\right] = \tau_p^3\left[\frac{1}{3}\, h^3 + \frac{1}{2} - \frac{1}{2}(h + 1)\, e^{-h}\right]. \qquad (7.105)$$

Substituting Eq. 7.104 and Eq. 7.105 into Eq. 7.102 gives

$$\theta_D(\tau) = \frac{3}{\tau^3}\left(I_1 + I_2\right) I_3 = \frac{1}{2h^3}\left[2h^3 - 6h^2 + 6h + 3 - 12h\, e^{-h} - 3\, e^{-2h}\right] I_3 \,. \qquad (7.106)$$

Substituting Eq. 7.99 and Eq. 7.106 into Eq. 7.95 gives

$$a_p = \theta_N(\tau)\, k_r\tau\left[I + k_r\,\frac{\tau^3}{3}\,\theta_D(\tau)\right]^{-1}\overrightarrow{\mathrm{ZEM}} = \frac{6h^2\left(e^{-h} - 1 + h\right)}{\frac{6}{k_r\tau_p^3} + 2h^3 - 6h^2 + 6h + 3 - 12h\, e^{-h} - 3\, e^{-2h}}\,\frac{1}{\tau^2}\,\overrightarrow{\mathrm{ZEM}} = N(\tau_p, k_r, h)\,\frac{1}{\tau^2}\,\overrightarrow{\mathrm{ZEM}} \,, \qquad (7.107)$$

3 1 τ − ap = θN (τ) krτ I+kr θD (τ) −ZEM−−→ 3 ∙ 1 h ¸ e− 1+h krτ = h − −ZEM−−→ 6+k τ 3 1 [ 6h2 +6h 12he h 3e 2h +2h3 +3] r 3 2h3 ¡ −¢ − − 1 h − − 6 e− 1+h τ = h ZEM 3 6 2 − h 2h 3 −−−→ τ p 3 6h +6h 12he− 3e− +2h +3 krτ p − ¡− −¢ h 6 e− 1+h = ZEM 2 6 2 − h 2h 3 −−−→ τ p 3 6h +6h 12he− 3e− +2h +3 krτ p − ¡− −¢ 2 h 6h e− 1+h 1 = ZEM 6 2 − h 2h 3 2 −−−→ 3 6h +6h 12he− 3e− +2h +3τ krτ p − −¡ − ¢ 1 = N (τ p,kr,h) −ZEM−−→ , (7.107) τ 2 where N is given by

$$N(\tau_p, k_r, h) = \frac{6h^2\left(e^{-h} - 1 + h\right)}{\frac{6}{k_r\tau_p^3} + 2h^3 - 6h^2 + 6h + 3 - 12h\, e^{-h} - 3\, e^{-2h}} \,. \qquad (7.108)$$
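The gain of Eq. 7.108 is straightforward to evaluate numerically; the sketch below (with illustrative values) also checks that for $\tau \gg \tau_p$ it approaches the lag-free gain of Eq. 7.65:

```python
import numpy as np

# N of Eq. 7.108, with h = tau/tau_p.
def nav_gain(tau, tau_p, kr):
    h = tau / tau_p
    num = 6.0 * h**2 * (np.exp(-h) - 1.0 + h)
    den = (6.0 / (kr * tau_p**3) + 2.0 * h**3 - 6.0 * h**2 + 6.0 * h
           + 3.0 - 12.0 * h * np.exp(-h) - 3.0 * np.exp(-2.0 * h))
    return num / den

tau, kr = 20.0, 1.0
# With a very fast flight control system (tau_p << tau) the gain is close
# to the lag-free value kr*tau^3 / (1 + kr*tau^3/3) of Eq. 7.65.
print(nav_gain(tau, 0.01, kr))
```

Evaluating the same function at $h$ near 1 with large $k_r$ shows the gain well above 3, the amplification the lag compensation introduces.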

In the limit as kr becomes very large, we have

$$\lim_{k_r\to\infty} N = \frac{6h^2\left(e^{-h} - 1 + h\right)}{2h^3 - 6h^2 + 6h + 3 - 12h\, e^{-h} - 3\, e^{-2h}} \,. \qquad (7.109)$$

In the limit as $\tau_p$ becomes very small, the "navigation gain" $N$ becomes equivalent to that given by Eq. 7.65:

$$\lim_{\tau_p\to 0} N = \lim_{h\to\infty} \frac{6h^2\left(e^{-h} - 1 + h\right)}{\frac{6}{k_r\tau_p^3} + 2h^3 - 6h^2 + 6h + 3 - 12h\, e^{-h} - 3\, e^{-2h}} = \frac{k_r\tau^3}{1 + \frac{k_r}{3}\tau^3} \,. \qquad (7.110)$$

7.8 Optimal Evasion

This section develops an optimal guidance law for the pursuer when the evader is expected to perform an optimally evasive maneuver. It is assumed that the evader has complete control over its acceleration vector and issues a commanded acceleration according to

$$\dot{V}_e = A_e + w \,, \qquad (7.111)$$

where $A_e$ is a known function of time and $w$ is the portion of the control that is to be optimally evasive. The engagement dynamics are given by

$$\dot{r} = V_e - V_p \qquad (7.112a)$$
$$\dot{V}_p = C_p m_p \qquad (7.112b)$$
$$\dot{m}_p = F_p m_p + G_p a_p \qquad (7.112c)$$
$$\dot{V}_e = A_e + w \,. \qquad (7.112d)$$

The cost functional is given by

$$J = k_t t_f + \frac{k_r}{2}\, r(t_f)^T r(t_f) + \frac{1}{2}\int_0^{t_f}\left(a_p^T a_p - \gamma^2 w^T w\right) dt \,. \qquad (7.113)$$

The parameter $\gamma$ appearing in the cost functional is a design parameter that represents the relative maneuverability between the pursuer and the evader. With $\gamma > 1$, the cost functional implies that the evader is less maneuverable than the pursuer [10, p. 90]. If the following equation is added to the system,

$$\dot{c} = k_t + \frac{1}{2}\left(a_p^T a_p - \gamma^2 w^T w\right), \qquad (7.114)$$

then the complete state vector is given by

$$x = \begin{bmatrix} r \\ V_p \\ m_p \\ V_e \\ c \end{bmatrix}. \qquad (7.115)$$

The adjoint vector has components denoted by

$$\begin{bmatrix} \lambda \\ \lambda_c \end{bmatrix} = \begin{bmatrix} \lambda_r \\ \lambda_{v,p} \\ \lambda_a \\ \lambda_{v,e} \\ \lambda_c \end{bmatrix}. \qquad (7.116)$$

The Hamiltonian is given by

$$H = \left(k_t + \frac{1}{2}\, a_p^T a_p - \frac{1}{2}\,\gamma^2 w^T w\right)\lambda_c + \lambda_r^T\left(V_e - V_p\right) + \lambda_{v,p}^T C_p m_p + \lambda_{v,e}^T\left(A_e + w\right) + \lambda_a^T\left(F_p m_p + G_p a_p\right). \qquad (7.117)$$

An optimal solution requires

$$\min_{a_p\in\mathbb{R}^3}\,\max_{w\in\mathbb{R}^3}\, H\left(x(t), a_p, w, \lambda(t), \lambda_c\right) = 0 \;\Rightarrow\; \frac{\partial H}{\partial a_p} = 0 \;\text{ and }\; \frac{\partial H}{\partial w} = 0 \,. \qquad (7.118)$$

The partial derivative of the Hamiltonian with respect to the pursuer control is

$$\frac{\partial H}{\partial a_p} = a_p^T\lambda_c + \lambda_a^T G_p = 0 \,. \qquad (7.119)$$

Solving for $a_p$ gives

$$a_p = -\frac{1}{\lambda_c}\, G_p^T\lambda_a \,. \qquad (7.120)$$

The partial derivative of the Hamiltonian with respect to the evader control is

$$\frac{\partial H}{\partial w} = -\gamma^2 w^T\lambda_c + \lambda_{v,e}^T = 0 \,. \qquad (7.121)$$

Solving for $w$ gives

$$w = \frac{1}{\gamma^2\lambda_c}\,\lambda_{v,e} \,. \qquad (7.122)$$

The adjoint dynamics are given by

$$\dot{\lambda}^T = -\frac{\partial H}{\partial x}$$
$$\dot{\lambda}_r = 0 \qquad (7.123a)$$
$$\dot{\lambda}_{v,p} = \lambda_r \qquad (7.123b)$$
$$\dot{\lambda}_a = -C_p^T\lambda_{v,p} - F_p^T\lambda_a \qquad (7.123c)$$
$$\dot{\lambda}_{v,e} = -\lambda_r \qquad (7.123d)$$
$$\dot{\lambda}_c = 0 \,. \qquad (7.123e)$$

The final-time values for the adjoint variables are given by

$$\lambda_f^T = \frac{\partial J}{\partial x}\bigg|_{t=t_f} \qquad (7.124a)$$
$$\lambda_r(t_f) = k_r\, r(t_f) \qquad (7.124b)$$
$$\lambda_{v,p}(t_f) = 0 \qquad (7.124c)$$
$$\lambda_a(t_f) = 0 \qquad (7.124d)$$
$$\lambda_{v,e}(t_f) = 0 \qquad (7.124e)$$
$$\lambda_c(t_f) = 1 \,. \qquad (7.124f)$$

The adjoint variable $\lambda_c$ is given by

$$\lambda_c(t) = 1 \,. \qquad (7.125)$$

The adjoint variable $\lambda_r$ is given by

$$\lambda_r(t) = k_r\, r(t_f) \,. \qquad (7.126)$$

The adjoint variable $\lambda_{v,p}$ is given by

$$\lambda_{v,p}(t) = -k_r\left(t_f - t\right) r(t_f) \,. \qquad (7.127)$$

The adjoint variable $\lambda_{v,e}$ is given by

$$\lambda_{v,e}(t) = k_r\left(t_f - t\right) r(t_f) \,. \qquad (7.128)$$

The adjoint variable $\lambda_a$ has a solution given by Eq. 7.88:

$$\lambda_a(t) = -k_r\left(\int_0^\tau \beta\, e^{F_p^T(\tau - \beta)}\, d\beta\right) C_p^T r(t_f) \,. \qquad (7.129)$$

The pursuer control is identical to that given by Eq. 7.90:

$$a_p = G_p^T k_r\left(\int_0^\tau \beta\, e^{F_p^T(\tau - \beta)}\, d\beta\right) C_p^T r(t_f) = \theta_N(\tau)\, k_r\tau\, r(t_f) \,, \qquad (7.130)$$

where

$$\theta_N(\tau) = \frac{1}{\tau}\left[C_p F_p^{-2}\left(e^{F_p\tau} - I - \tau F_p\right) G_p\right]^T. \qquad (7.131)$$

The evader control is obtained by substituting Eq. 7.128 into Eq. 7.122:

$$w = \frac{k_r\tau}{\gamma^2}\, r(t_f) \,. \qquad (7.132)$$

In the context of game theory, we must define the zero effort miss as the miss that would occur if both $a_p$ and $w$ are zero; that is, the evader's known maneuver $A_e$ is included in the zero effort miss. Using this definition, the miss is expressed as

$$r(t_f) = \overrightarrow{\mathrm{ZEM}} + \int_0^\tau \eta\, w(t_f - \eta)\, d\eta - \int_0^\tau\!\!\int_\eta^\tau \eta\, C_p e^{F_p(\beta - \eta)} G_p\, a_p(t_f - \beta)\, d\beta\, d\eta \,, \qquad (7.133a)$$

where

$$\overrightarrow{\mathrm{ZEM}} = r(t) + \tau\,\dot{r}(t) + \int_0^\tau \eta\, A_e(t_f - \eta)\, d\eta - C_p F_p^{-2}\left[e^{F_p\tau} - I - \tau F_p\right] m_p(t) \,. \qquad (7.134)$$

Substituting Eq. 7.130 and Eq. 7.132 into Eq. 7.133a gives

$$r(t_f) = \overrightarrow{\mathrm{ZEM}} + \frac{k_r}{\gamma^2}\int_0^\tau \eta^2\, d\eta\; r(t_f) - k_r\,\frac{\tau^3}{3}\,\theta_D(\tau)\, r(t_f) = \overrightarrow{\mathrm{ZEM}} + \frac{k_r\tau^3}{3}\left[\frac{1}{\gamma^2}\, I_3 - \theta_D(\tau)\right] r(t_f) \,, \qquad (7.135)$$

where $\theta_D$ is given by Eq. 7.93:

$$\theta_D(\tau) = \frac{3}{\tau^3}\left[\int_0^\tau\!\!\int_\eta^\tau\!\!\int_0^\gamma \eta\beta\left(C_p e^{F_p(\gamma - \eta)} G_p\right)\left(C_p e^{F_p(\gamma - \beta)} G_p\right)^T d\beta\, d\gamma\, d\eta\right]. \qquad (7.136)$$

Solving for $r(t_f)$ in Eq. 7.135 gives

$$r(t_f) = \left[I_3 + k_r\,\frac{\tau^3}{3}\left(\theta_D(\tau) - \frac{1}{\gamma^2}\, I_3\right)\right]^{-1}\overrightarrow{\mathrm{ZEM}} \,. \qquad (7.137)$$

Substituting Eq. 7.137 into Eq. 7.130 gives

$$a_p = \theta_N(\tau)\, k_r\tau\left[I_3 + k_r\,\frac{\tau^3}{3}\left(\theta_D(\tau) - \frac{1}{\gamma^2}\, I_3\right)\right]^{-1}\overrightarrow{\mathrm{ZEM}} \,. \qquad (7.138)$$

This result shows that the presence of the optimally evasive maneuver results in an amplified acceleration. This fact is made very clear in the very common case of decoupled pursuer dynamics. For example, when the flight control system is modeled by a single pole, the pursuer acceleration is identical to Eq. 7.107,

$$a_p = N\left(\tau_p, k_r, h, \gamma^2\right)\frac{1}{\tau^2}\,\overrightarrow{\mathrm{ZEM}} \,, \qquad (7.139)$$

but with $N$ containing an additional term due to $\gamma^2$:

$$N\left(\tau_p, k_r, h, \gamma^2\right) = \frac{6h^2\left(e^{-h} - 1 + h\right)}{\frac{6}{k_r\tau_p^3} + 2h^3\left(1 - \gamma^{-2}\right) - 6h^2 + 6h + 3 - 12h\, e^{-h} - 3\, e^{-2h}} \,. \qquad (7.140)$$

This result is easy to arrive at because $\theta_D(\tau)$ in Eq. 7.138 is identical to the $\theta_D(\tau)$ given by Eq. 7.106. If $k_r$ is set to infinity, a singularity (i.e., $N$ will be infinite for some value of $h$) will occur unless $\gamma^2$ is also infinite. This result implies that a finite miss distance (finite value of $k_r$) is necessary in the presence of an optimally evasive maneuver [10, p. 98]. In the limit that $\gamma^2$ goes to infinity, the acceleration given by Eq. 7.138 becomes identical to the case when there is not an optimally evasive maneuver (Eq. 7.95):

$$\lim_{\gamma^2\to\infty} a_p = \theta_N(\tau)\, k_r\tau\left[I_3 + k_r\,\frac{\tau^3}{3}\,\theta_D(\tau)\right]^{-1}\overrightarrow{\mathrm{ZEM}} \,. \qquad (7.141)$$

7.9 Magnitude Constraints (Saturation)

The cost functional given by Eq. 7.70 effectively soft bounds the control $a_p$ by placing a penalty on the integral of the acceleration magnitude squared. However, under even moderately extreme conditions, it is possible that the commanded acceleration $a_p$ will be in excess of what the airframe can physically produce. One way to handle this is to use a hard bound on the commanded acceleration, which can be achieved by a constraint of the form

$$h(a_p) = a_{\max}^2 - a_p^T a_p \geq 0 \,. \qquad (7.142)$$

This results in a control constraint set denoted by

$$\mathcal{A}_p = \left\{a_p : h(a_p) \geq 0\right\}. \qquad (7.143)$$

The cost functional and dynamics considered are the general ones analyzed in Section 7.6. The cost is given by Eq. 7.70 and repeated here:

$$J = k_t t_f + \frac{k_r}{2}\, r(t_f)^T r(t_f) + \frac{1}{2}\int_0^{t_f} a_p^T a_p\, dt \,. \qquad (7.144)$$

The optimal control law has been derived in [70] and [71]. The Hamiltonian is given by Eq. 7.74 and repeated here:

$$H = \left(k_t + \frac{1}{2}\, a_p^T a_p\right)\lambda_c + \lambda_r^T\left(V_e - V_p\right) + \lambda_v^T C_p m_p + \lambda_a^T\left(F_p m_p + G_p a_p\right). \qquad (7.145)$$

An optimal solution requires

$$\min_{a_p\in\mathcal{A}_p} H = 0 \,. \qquad (7.146)$$

Minimizing $H$ is a constrained optimization problem of the type discussed in Section 5.2. The Lagrangian function $L$ is given by

$$L\left(x, a_p, \lambda, \lambda_c, \gamma\right) \triangleq H\left(x, a_p, \lambda, \lambda_c\right) - \gamma\, h(a_p) \,, \qquad (7.147)$$

where $\gamma$ is a Lagrange multiplier. A necessary condition for $H$ to be minimized is that the partial derivative of $L$ with respect to $a_p$ be equal to zero:

$$\frac{\partial L}{\partial a_p} = \frac{\partial H}{\partial a_p} - \gamma\,\frac{\partial h}{\partial a_p} = \lambda_c a_p^T + \lambda_a^T G_p + 2\gamma a_p^T = 0 \,. \qquad (7.148)$$

Solving for $a_p$ gives

$$a_p = -\frac{1}{\lambda_c + 2\gamma}\, G_p^T\lambda_a \,. \qquad (7.149)$$

Substituting Eq. 7.88 and Eq. 7.82 into this result gives

$$a_p = \frac{k_r}{1 + 2\gamma}\, G_p^T\left(\int_0^\tau \beta\, e^{F_p^T(\tau - \beta)}\, d\beta\right) C_p^T r(t_f) = \frac{k_r}{1 + 2\gamma}\left[C_p\left(\int_0^\tau \beta\, e^{F_p(\tau - \beta)}\, d\beta\right) G_p\right]^T r(t_f) \,. \qquad (7.150)$$

Assume that the control constraint is not active. Then, by Eq. 5.12d, we must have $\gamma = 0$ and

$$a_p = k_r\left[C_p\left(\int_0^\tau \beta\, e^{F_p(\tau - \beta)}\, d\beta\right) G_p\right]^T r(t_f) \,. \qquad (7.151)$$

When the control constraint is active, we have $a_p^T a_p = a_{\max}^2$ and $\gamma \geq 0$:

$$a_p = \frac{k_r}{1 + 2\gamma}\left[C_p\left(\int_0^\tau \beta\, e^{F_p(\tau - \beta)}\, d\beta\right) G_p\right]^T r(t_f) \quad\text{and}\quad a_p^T a_p = a_{\max}^2 \,. \qquad (7.152)$$

Since $\gamma$ is a scalar, it has the effect of scaling the vector control such that $a_p^T a_p = a_{\max}^2$. In so doing, the direction of the control is unaltered by the value of $\gamma$. Such a control can be expressed using the saturation function

$$a_p = \mathrm{Sat}\left\{k_r\left[C_p\left(\int_0^\tau \beta\, e^{F_p(\tau - \beta)}\, d\beta\right) G_p\right]^T r(t_f)\right\}, \qquad (7.153)$$

where

$$\mathrm{Sat}(x) = \begin{cases} x & \text{if } |x| \leq x_{\max} \\ x_{\max}\,\dfrac{x}{|x|} & \text{if } |x| > x_{\max} \end{cases} \,. \qquad (7.154)$$

The term in parentheses in Eq. 7.153 represents the ramp response of the flight control system,

$$C_p\int_0^\tau \beta\, e^{F_p(\tau - \beta)}\, d\beta \,. \qquad (7.155)$$

For minimum phase systems, the ramp response is a monotonically increasing function of $\tau$. However, the ramp response is not monotonically increasing for non-minimum phase systems. The more difficult case of non-minimum phase systems is discussed in [72]. Consider the value of the control at the final time $t_f$. From Eq. 7.88 we have

$$\lambda_a(t_f) = 0 \,. \qquad (7.156)$$

Substituting this result into Eq. 7.149 gives

$$a_p(t_f) = -\frac{1}{1 + 2\gamma}\, G_p^T\lambda_a(t_f) = 0 \,. \qquad (7.157)$$

For minimum phase systems, the value of $a_p$ will be zero at $t = t_f$ and monotonically increase with $\tau$ until saturation occurs. This means that the pursuer control is given by Eq. 7.95, but saturates when the control constraint is active:

$$a_p = \mathrm{Sat}\left\{\theta_N(\tau)\, k_r\tau\left[I + k_r\,\frac{\tau^3}{3}\,\theta_D(\tau)\right]^{-1}\overrightarrow{\mathrm{ZEM}}\right\}. \qquad (7.158)$$

7.9.1 Single Pole Flight Control System

The ramp response of the single pole flight control system (a minimum phase system) can be obtained by use of Eq. 7.35 with the system matrices given by Eq. 7.96 and the state transition matrix given by Eq. 7.97

$$C_p\int_0^\tau \beta\, e^{F_p(\tau - \beta)}\, d\beta = C_p F_p^{-2}\left[e^{F_p\tau} - I - \tau F_p\right] = \tau_p^2\left[e^{-\tau/\tau_p} - 1 + \frac{\tau}{\tau_p}\right] I_3 \,. \qquad (7.159)$$

Substituting this result into Eq. 7.153 gives

$$a_p = \mathrm{Sat}\left\{k_r\tau_p\left(e^{-\tau/\tau_p} - 1 + \frac{\tau}{\tau_p}\right) r(t_f)\right\}. \qquad (7.160)$$

Assume at time $t$ the control is not saturated. Then, because the argument to the saturation function is a monotonically decreasing function of $t$ (alternately a monotonically increasing function of $\tau$), the control will not saturate for the remainder of the trajectory. That is, if the control is not initially saturated, then the control for the remainder of the engagement is given by

$$a_p = k_r\tau_p\left(e^{-\tau/\tau_p} - 1 + \frac{\tau}{\tau_p}\right) r(t_f) \,. \qquad (7.161)$$

The closed-loop control can then be obtained in the same manner as was used to obtain Eq. 7.107:

$$a_p = N(\tau_p, k_r, h)\,\frac{1}{\tau^2}\,\overrightarrow{\mathrm{ZEM}} \,, \qquad (7.162)$$

where $N$ is given by

$$N(\tau_p, k_r, h) = \frac{6h^2\left(e^{-h} - 1 + h\right)}{\frac{6}{k_r\tau_p^3} + 2h^3 - 6h^2 + 6h + 3 - 12h\, e^{-h} - 3\, e^{-2h}} \,. \qquad (7.163)$$

Once the control given by Eq. 7.153 has been computed, one can verify that the constraint given by Eq. 7.142 is indeed satisfied. Now, suppose that it is found that the control constraint has been violated. Then, the control at time $t$ is given by

$$a_p(t) = \frac{a_{\max}}{|r(t_f)|}\, r(t_f) \,. \qquad (7.164)$$

We know that the control will unsaturate at some time before $t_f$ and then remain unsaturated for the remainder of the engagement. For example, assume that the control is saturated from $\tau$ to $\tau_s$. Then, the complete control is given by

$$a_p(\tau) = \beta(\tau)\, r(t_f) \,, \qquad (7.165)$$

where

$$\beta(\tau) = \begin{cases} \dfrac{a_{\max}}{|r(t_f)|} & \text{if } \tau \geq \tau_s \\[2mm] k_r\tau_p\left(e^{-\tau/\tau_p} - 1 + \dfrac{\tau}{\tau_p}\right) & \text{if } \tau < \tau_s \end{cases} \,. \qquad (7.166)$$

Substituting Eq. 7.165 into Eq. 7.37 gives

$$r(t_f) = \overrightarrow{\mathrm{ZEM}} - \frac{1}{\tau_p}\int_0^\tau\!\!\int_\eta^\tau \eta\, e^{F_p(\phi - \eta)}\beta(\phi)\, r(t_f)\, d\phi\, d\eta = \overrightarrow{\mathrm{ZEM}} - \psi(\tau)\, r(t_f) \,, \qquad (7.167)$$

where

$$\psi(\tau) = \frac{1}{\tau_p}\int_0^\tau\!\!\int_\eta^\tau \eta\, e^{F_p(\phi - \eta)}\beta(\phi)\, d\phi\, d\eta \,. \qquad (7.168)$$

Solving for $r(t_f)$ gives

$$r(t_f) = \frac{1}{1 + \psi(\tau)}\,\overrightarrow{\mathrm{ZEM}} \,. \qquad (7.169)$$

Substituting this result into Eq. 7.165 gives

$$a_p(\tau) = \frac{\beta(\tau)}{1 + \psi(\tau)}\,\overrightarrow{\mathrm{ZEM}} \,. \qquad (7.170)$$

During the initial portion of the engagement, when the control is saturated, the control will be in the direction of the ZEM vector and have a magnitude equal to $a_{\max}$. At some time $\tau_s$, the control given by Eq. 7.162 will cease to violate the control constraint. By Bellman's principle of optimality we know that an optimal policy does not depend on previous actions. Therefore, the optimal control for the unsaturated portion of the trajectory is exactly that given by Eq. 7.162. Since both the saturated and unsaturated portions of the trajectory issue a control in the direction of the ZEM, the control for the entire trajectory can be expressed as

$$a_p = \mathrm{Sat}\left\{N(\tau_p, k_r, h)\,\frac{1}{\tau^2}\,\overrightarrow{\mathrm{ZEM}}\right\}, \qquad (7.171)$$

where $N$ is given by Eq. 7.163.
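The saturated command of Eq. 7.171 amounts to a direction-preserving clip of the unconstrained command. A minimal sketch, with the navigation gain passed in as a precomputed scalar and all numbers illustrative:

```python
import numpy as np

# Sat{} of Eq. 7.154: clip the magnitude to a_max, keep the direction.
def sat(x, x_max):
    n = np.linalg.norm(x)
    return x if n <= x_max else x * (x_max / n)

# a_p = Sat{ N / tau^2 * ZEM }  (Eq. 7.171)
def saturated_command(zem, tau, gain, a_max):
    return sat(gain / tau**2 * zem, a_max)

zem = np.array([400.0, -300.0, 0.0])
tau, gain, a_max = 2.0, 3.0, 100.0       # ~10 g limit, illustrative

a_p = saturated_command(zem, tau, gain, a_max)
print(a_p, np.linalg.norm(a_p))          # clipped to magnitude 100
```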

7.10 Directional Constraints

In this section, optimal control theory will be used in an attempt to obtain an optimal controller for a constant speed pursuer. This section is longer than other sections of this chapter because it contains developments that are new to the field of missile 208 guidance. This problem was first addressed in two dimensions by Guelman in 1984 [37] and later addressed in three dimensions by Guelman in 1995 [38]. In [37] and [38], it is shown that optimal control is theoretically possible, but not practical. In fact, the optimal control law suggested requires the solution of an online constrained optimization problem involving four nonlinear algebraic equations and non-analytic functions. However, this section will readdress the problem and show that it is possible to specify the form of the optimal guidance law to within a scale factor, which itself is a function of engagement variables that can be estimated. A constant speed pursuer has only lateral thrusting capabilities, that can be mod- eled by the equation

\[ A_p = \omega \times V_p , \tag{7.172} \]
where ω is the angular velocity of the pursuer's velocity vector. Using this model for the pursuer's dynamics, the equations of motion are given by

\[ \dot{r} = V_e - V_p \tag{7.173a} \]
\[ \dot{V}_e = A_e \tag{7.173b} \]
\[ \dot{V}_p = \omega \times V_p . \tag{7.173c} \]
As before, a parameterized cost functional is used to balance the final time, the final miss distance, and the total control effort:

\[ J = k_t t_f + \frac{k_r}{2}\, r(t_f)^T r(t_f) + \int_0^{t_f} \frac{1}{2}\, \omega^T \omega\, dt . \tag{7.174} \]
If the following equation is added to the system

\[ \dot{c} = k_t + \frac{1}{2}\, \omega^T \omega , \tag{7.175} \]
then the complete state vector is given by

\[ x = \begin{bmatrix} r \\ V_p \\ c \end{bmatrix} . \tag{7.176} \]

Under this formulation the cost functional becomes
\[ J(t) = \frac{k_r}{2}\, r^T r + c . \tag{7.177} \]
The Hamiltonian is given by

\[ H = \left( k_t + \frac{1}{2}\, \omega^T \omega \right) \lambda_c + \lambda_r^T (V_e - V_p) + \lambda_p^T (\omega \times V_p) . \tag{7.178} \]
The adjoint dynamics are given by

\[ \dot{\lambda} = -\left( \frac{\partial H}{\partial x} \right)^T \tag{7.179a} \]
\[ \dot{\lambda}_c = 0 \tag{7.179b} \]
\[ \dot{\lambda}_r = 0 \tag{7.179c} \]
\[ \dot{\lambda}_p = \lambda_r + \omega \times \lambda_p . \tag{7.179d} \]
The final-time values for the adjoint variables are given by
\[ \lambda_f^T = \left. \frac{\partial J}{\partial x} \right|_{t=t_f} \tag{7.180a} \]
\[ \lambda_c(t_f) = 1 \tag{7.180b} \]
\[ \lambda_r(t_f) = k_r r(t_f) \tag{7.180c} \]
\[ \lambda_p(t_f) = 0 . \tag{7.180d} \]

The adjoint variable λ_c is given by
\[ \lambda_c(t) = 1 . \tag{7.181} \]
The adjoint variable λ_r is given by
\[ \lambda_r(t) = k_r r(t_f) . \tag{7.182} \]
Optimality requires
\[ \frac{\partial H}{\partial \omega} = 0_{3\times 1} \tag{7.183a} \]
\[ \phantom{\frac{\partial H}{\partial \omega}} = \lambda_c \omega^T - \lambda_p^T \tilde{V}_p , \tag{7.183b} \]
where the tilde accent represents the cross product operation

\[ a \times b = \tilde{a} b = \begin{bmatrix} 0 & -a_3 & a_2 \\ a_3 & 0 & -a_1 \\ -a_2 & a_1 & 0 \end{bmatrix} b . \tag{7.184} \]
Rearranging Eq. 7.183 gives

\[ \omega = -V_p \times \lambda_p . \tag{7.185} \]
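The tilde operator of Eq. 7.184, which underlies this rearrangement, is easy to state in code. This small Python sketch (the function name is an illustrative choice) builds the skew-symmetric matrix and can be checked against a direct cross product:

```python
import numpy as np

def tilde(a):
    # Skew-symmetric matrix of Eq. 7.184: tilde(a) @ b == a x b.
    return np.array([[0.0,  -a[2],  a[1]],
                     [a[2],  0.0,  -a[0]],
                     [-a[1], a[0],  0.0]])
```

Note that tilde(a) is antisymmetric, which is what makes identities such as a × b = −b × a immediate in matrix form.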

\[ \omega(t_f) = 0 . \tag{7.186} \]

Combining Eq. 7.179d and Eq. 7.182

\[ \dot{\lambda}_p = k_r r(t_f) + \omega \times \lambda_p . \tag{7.187} \]

Differentiating the control given in Eq. 7.185 and simplifying gives

\[ \dot{\omega} = \lambda_p \times \dot{V}_p - V_p \times \dot{\lambda}_p \]
\[ = \lambda_p \times (\omega \times V_p) - V_p \times \left( k_r r(t_f) + \omega \times \lambda_p \right) \]
\[ = \lambda_p \times (\omega \times V_p) - V_p \times (\omega \times \lambda_p) - k_r V_p \times r(t_f) \]
\[ = -\tilde{\lambda}_p \tilde{V}_p \omega + \tilde{V}_p \tilde{\lambda}_p \omega - k_r V_p \times r(t_f) \]
\[ = \left( \tilde{V}_p \tilde{\lambda}_p - \tilde{\lambda}_p \tilde{V}_p \right) \omega - k_r V_p \times r(t_f) \]
\[ = (V_p \times \lambda_p) \times \omega - k_r V_p \times r(t_f) \]
\[ = -\omega \times \omega - k_r V_p \times r(t_f) \]
\[ = -k_r V_p \times r(t_f) . \tag{7.188} \]

Consider the equations governing the pursuer’s dynamics

\[ \dot{V}_p = \omega \times V_p \tag{7.189a} \]
\[ \dot{\omega} = k_r r(t_f) \times V_p . \tag{7.189b} \]

These equations are subject to the boundary conditions

\[ V_p(t_0) = \text{given} \tag{7.190a} \]
\[ \omega(t_f) = 0 . \tag{7.190b} \]

The pursuer’s dynamics and boundary conditions imply that the control vector ω (t) is perpendicular to the final miss r (tf ) and the interceptor velocity vector

\[ \omega(t) \cdot r(t_f) = 0 \tag{7.191a} \]
\[ \omega(t) \cdot V_p(t) = 0 . \tag{7.191b} \]
Therefore, the engagement occurs in a plane and the direction of the control vector ω is constant. Define θ to be the angle between r(t_f) and V_p. Then \dot{\theta} = \|\omega\| and, according to Eq. 7.189b,
\[ \ddot{\theta} = k_s \sin\theta , \tag{7.192} \]
where k_s = k_r r_f V_p. Thus, the equation governing the pursuer's motion is the well-known nonlinear equation describing the motion of a zero-friction pendulum under the influence of gravity. Solution of this equation requires the use of elliptic integrals [37]. For a constant speed missile, the optimal control vector ω has a fixed direction in space. For this reason, it can be written as

\[ \omega = f(\tau)\, \hat{\omega} , \tag{7.193} \]
where ω̂ is a unit vector along ω and f(τ) is an unknown function of the time-to-go τ. Substituting Eq. 7.193 into Eq. 7.25 gives
\[ r(t_f) = \overrightarrow{ZEM} - \int_0^{\tau} \eta\, \omega \times V_p\, d\eta \tag{7.194a} \]
\[ \phantom{r(t_f)} = \overrightarrow{ZEM} - \hat{\omega} \times \int_0^{\tau} f(\eta)\, \eta\, V_p\, d\eta . \tag{7.194b} \]
Taking the dot product of this result with ω and using Eq. 7.191a gives

\[ \omega \cdot \overrightarrow{ZEM} = 0 . \tag{7.195} \]

Thus, the optimal control vector ω is perpendicular to the \overrightarrow{ZEM}. This fact and Eq. 7.191b indicate the optimal control is given by

\[ \omega = \frac{N'(\tau)}{V_p^2 \tau^2}\, V_p \times \overrightarrow{ZEM} . \tag{7.196} \]
The choice of writing the time-varying scaling function in this manner is simply a matter of convenience and will become clear in what follows. However, this guidance law is not implementable because of the unknown function of time N'(τ). Substituting Eq. 7.196 into Eq. 7.172 gives the equation for the pursuer's acceleration

\[ \dot{V}_p = \omega \times V_p = \frac{N'(\tau)}{V_p^2 \tau^2} \left( V_p \times \overrightarrow{ZEM} \right) \times V_p = \frac{N'(\tau)}{V_p^2 \tau^2} \left[ V_p^2\, \overrightarrow{ZEM} - \left( \overrightarrow{ZEM} \cdot V_p \right) V_p \right] . \tag{7.197} \]
Thus, this method applies a control that is the projection of the linear system's optimal control, given by Eq. 7.64b, onto the plane perpendicular to the pursuer's velocity vector. Suppose that the pursuer velocity vector is perpendicular to the ZEM, in which case the control is given by

\[ \dot{V}_p = \frac{N'(\tau)}{\tau^2}\, \overrightarrow{ZEM} . \tag{7.198} \]

However, under these conditions, the pursuer's acceleration command should be given by Eq. 7.64b. This implies that the nominal value of N'(τ) is 3. The rest of this section is devoted to the behavior of the navigation gain N'(τ) under various off-nominal conditions.

7.10.1 Simulation

A simulation analysis was conducted to study off-nominal behavior of the navigation gain appearing in Eq. 7.196. Optimal control trajectories (extremals) were produced by integrating the equations of motion from a specified terminal point backwards in time by a specified time-to-go using the control law given by Eq. 7.189b, thus producing an initial state that leads to the target in an optimal fashion. Specifically, the set of equations used is given by

\[ \dot{r} = V_e - V_p \tag{7.199a} \]
\[ \dot{V}_p = \omega \times V_p \tag{7.199b} \]
\[ \dot{\omega} = k_r r(t_f) \times V_p . \tag{7.199c} \]
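The retro-time propagation just described can be sketched directly from Eqs. 7.199. The explicit-Euler integrator, the function name, and all parameter values below are illustrative assumptions, not the settings used for the figures:

```python
import numpy as np

def retro_trajectory(r_f, vp_f, ve, kr, t_f, dt=1e-3):
    # Integrate Eqs. 7.199 backwards from the terminal state.
    # omega(t_f) = 0 by Eq. 7.190b; in retro-time, d/dtau = -d/dt.
    r, vp, w = np.array(r_f, float), np.array(vp_f, float), np.zeros(3)
    for _ in range(int(round(t_f / dt))):
        r_new  = r  - dt * (ve - vp)               # Eq. 7.199a
        vp_new = vp - dt * np.cross(w, vp)         # Eq. 7.199b
        w_new  = w  - dt * kr * np.cross(r_f, vp)  # Eq. 7.199c
        r, vp, w = r_new, vp_new, w_new
    return r, vp, w   # state at time-to-go t_f, i.e. the engagement start
```

Two properties of the optimal extremal survive the discretization well: the pursuer speed ‖V_p‖ is nearly constant, since V̇_p = ω × V_p is always perpendicular to V_p, and ω stays perpendicular to r(t_f), consistent with Eq. 7.191a.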

To make the results more readable, the entire engagement was then transformed to a new coordinate system in which the optimal trajectory begins at the origin with the pursuer (or evader) heading in a desired direction. The control at each point along the trajectories was computed during the simulation and then used to compute the navigation gain N'(τ) in Eq. 7.196. A numerical solution necessitates the selection of all system parameters. The results that follow indicate that, when appropriate metrics are used (e.g., heading error), actual parameter values are irrelevant. Nevertheless, to ensure that the following results are repeatable, the parameter values used are shown in Table 7.1. The initial and terminal states are irrelevant, as long as the final value of the Hamiltonian, given by Eq. 7.178, is zero for the optimal trajectory. This forces the final value of r to be perpendicular to the final value of \dot{r}

\[ r(t_f)^T \dot{r}(t_f) = 0 . \tag{7.200} \]

The values listed in Table 7.1 result in trajectories of the type shown in Figure 7.2. The evader speed is not constant in all trajectories shown in Figure 7.2, but its final value is equal to one. Regardless of the maneuver performed by the evader, the pursuer's trajectory is the same. That is, the optimal guidance law will guide the pursuer to where the target will be at the time of intercept and not waste effort in matching any intermediate evasive maneuvers.

Parameter            Value    Reference
r_f = ||r(t_f)||     10^-3    Eq. 7.189b
k_r                  10       Eq. 7.49
k_t                  0        Eq. 7.49
V_p = ||V_p||        1        Eq. 7.173
V_e = ||V_e||        1        Eq. 7.173
t_f                  10       Time duration of optimal trajectory

Table 7.1. Simulation parameters.


Figure 7.2. Typical engagement geometry for spiraling target (green), constant acceleration target (blue), and constant velocity target (red).

7.10.2 Engagement Configuration

It is convenient to set up the engagement such that the viewing plane (the plane of this page) has its horizontal axis (x-axis) along \dot{r}(t_f) and its vertical axis (y-axis) along r(t_f); this is always possible because the terminal requirement imposed by Eq. 7.200 ensures that r(t_f) and \dot{r}(t_f) are perpendicular. For the purpose of setting up the engagement, assume that all final-time data are known. These data are only required to set up the desired engagement geometry, and not to assist in any guidance schemes. The final value of r is given by

\[ r(t_f) = r_f \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}^T . \tag{7.201} \]

\[ V_p = \begin{bmatrix} V_{px} & V_{py} & V_{pz} \end{bmatrix}^T . \tag{7.202} \]

\[ V_e = \begin{bmatrix} V_{ex} & V_{ey} & V_{ez} \end{bmatrix}^T . \tag{7.203} \]
At the final time, the relative velocity vector must point in some direction that is perpendicular to r(t_f). One can maintain complete generality if the direction of the relative velocity \dot{r}(t_f) is along the x-axis. This requires the final-time values of the pursuer and evader velocity vectors to be related by

\[ V_{ey}(t_f) = V_{py}(t_f) \tag{7.204} \]
\[ V_{ez}(t_f) = V_{pz}(t_f) . \]

Since Ve (tf ) is known, Vex (tf ) can be solved for

\[ V_{ex}(t_f) = \pm\sqrt{ V_e^2(t_f) - V_{ey}^2(t_f) - V_{ez}^2(t_f) } = \pm\sqrt{ V_e^2(t_f) - V_{py}^2(t_f) - V_{pz}^2(t_f) } . \tag{7.205} \]

Thus, the evader velocity vector is given by

\[ V_e(t_f) = \begin{bmatrix} \pm\sqrt{ V_e^2(t_f) - V_{py}^2(t_f) - V_{pz}^2(t_f) } \\ V_{py}(t_f) \\ V_{pz}(t_f) \end{bmatrix} . \tag{7.206} \]
The pursuer's velocity vector can be described using the spherical coordinates θ (azimuth) and φ (elevation)

\[ V_p(t_f) = V_p \begin{bmatrix} \cos\phi \cos\theta \\ \sin\phi \\ \cos\phi \sin\theta \end{bmatrix} . \tag{7.207} \]
By inspection, one can see that holding φ fixed and varying θ from 0 to 360 degrees causes the pursuer velocity vector to trace a cone about the y-axis of the coordinate system. Alternately, holding the angle θ fixed and varying φ has the effect of varying the angle between the pursuer velocity vector and the final value of the line-of-sight, as is evident from the following relationship

\[ r(t_f)^T V_p(t_f) = r_f V_p \cos(90^\circ - \phi) . \tag{7.208} \]
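The construction of Eqs. 7.201 through 7.207 is easy to verify numerically. The sketch below (illustrative function name; the '+' root of Eq. 7.205 is chosen) builds the two terminal velocity vectors and lets one confirm that the relative velocity \dot{r}(t_f) = V_e(t_f) − V_p(t_f) ends up along the x-axis, perpendicular to r(t_f):

```python
import numpy as np

def terminal_velocities(Vp, Ve, phi, theta):
    # Vp(tf) from spherical angles (Eq. 7.207); Ve(tf) from Eqs. 7.204-7.206.
    vp = Vp * np.array([np.cos(phi) * np.cos(theta),
                        np.sin(phi),
                        np.cos(phi) * np.sin(theta)])
    vex = np.sqrt(Ve**2 - vp[1]**2 - vp[2]**2)   # '+' root of Eq. 7.205
    ve = np.array([vex, vp[1], vp[2]])
    return vp, ve
```

By construction the y and z components of the relative velocity cancel, so \dot{r}(t_f) is along the x-axis and orthogonal to r(t_f) = r_f [0 1 0]^T, matching the terminal requirement of Eq. 7.200.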

Substituting Eq. 7.207 into Eq. 7.206 gives

\[ V_e(t_f) = V_p \begin{bmatrix} \pm\sqrt{ \left( \frac{V_e(t_f)}{V_p} \right)^2 - \sin^2\phi - \cos^2\phi \sin^2\theta } \\ \sin\phi \\ \cos\phi \sin\theta \end{bmatrix} . \tag{7.209} \]

7.10.3 Trajectory Shaping

An optimal trajectory can be generated by integration of Eq. 7.189. Let the pursuer velocity vector and the final miss vector be written as unit vectors

\[ V_p = V_p \hat{u}_p \tag{7.210a} \]
\[ r(t_f) = r_f \hat{u}_r(t_f) , \tag{7.210b} \]

where û_p is a unit vector along the missile velocity vector and û_r is a unit vector along the relative position vector. Substituting Eq. 7.210 into Eq. 7.189 gives
\[ \frac{d}{dt} \hat{u}_p = \omega \times \hat{u}_p \tag{7.211a} \]

\[ \dot{\omega} = k_r r_f V_p\, \hat{u}_r(t_f) \times \hat{u}_p \tag{7.211b} \]
\[ \phantom{\dot{\omega}} = k_s\, \hat{u}_r(t_f) \times \hat{u}_p , \tag{7.211c} \]
where the trajectory shaping parameter k_s is defined by

\[ k_s \triangleq k_r r_f V_p . \tag{7.212} \]

Increasing the value of k_s results in a more curved optimal trajectory. In effect, the value of k_s determines the initial heading error that would need to be removed in a forward-time simulation (engagement). Optimal trajectories were generated by retro-time solution of Eq. 7.199. The terminal conditions were such that the pursuer intercepted the evader in a "head-to-head" configuration. Typical trajectories are shown in Figure 7.3. The trajectories shown in Figure 7.3 have been rotated such that, in each case, the pursuer's velocity vector is initially aligned with the horizontal axis of the coordinate system. Clearly, the larger value of k_s results in a more curved trajectory and a much more challenging intercept than the smaller value of k_s. Several more simulations were conducted, and at each point along the trajectories the heading error was computed. Figure 7.4 is a plot of the heading error as a function of flight-time, or time-to-go. These results show that a time-to-go of 10 seconds and a value of k_s = 0.01 produce a heading error of approximately 20 degrees. These values are typical in guidance analysis [101, Ch. 3]. A closed form solution exists for the acceleration command of a PN guided pursuer with an initial heading error of γ_0 [78, p. 113]
\[ A_p = \frac{3 V_p (t_f - t)}{t_f^2}\, \gamma_0 . \tag{7.213} \]
The acceleration command is a maximum at t = 0 and is given by

\[ A_p = \frac{3 V_p}{t_f}\, \gamma_0 . \tag{7.214} \]


Figure 7.3. Typical pursuer trajectories for ks =0.01 (red) and ks =0.05 (green).

[Plot omitted: heading error (degrees) versus time-to-go (seconds), one curve per k_s = 0.01, 0.02, 0.03, 0.04.]

Figure 7.4. Heading error as a function of time-to-go for various trajectory shaping parameter values. 219

Any point along a trajectory in Figure 7.4 can be used as a starting point for a forward-time simulation. As such, any time-to-go in Figure 7.4 can be regarded as a final time for a PN guided missile with a corresponding heading error of γ_0. In particular, for a pursuer with a velocity of 3000 feet per second, the previously discussed scenario of a time-to-go of 10 seconds with a heading error of 20 degrees results in a PN acceleration command of about 10 G, where G is the acceleration due to gravity. This also is a typical value considered in guidance analyses.
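The 10 G figure quoted above follows directly from Eq. 7.214; a two-line check (the function name is illustrative, and G is the usual ft/s² approximation):

```python
import math

G = 32.2  # ft/s^2, acceleration due to gravity

def pn_peak_accel(Vp, tf, gamma0):
    # Peak PN acceleration command for initial heading error gamma0 (Eq. 7.214).
    return 3.0 * Vp / tf * gamma0

a = pn_peak_accel(3000.0, 10.0, math.radians(20.0))  # ~314 ft/s^2, about 9.8 G
```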

7.10.4 Navigation Ratio

It is desirable to determine how the navigation gain N' varies as a function of pertinent engagement parameters. The navigation gain N' is, in general, a nonlinear function of the time-to-go, the intercept conditions, and the trajectory shaping parameter k_s. However, when the navigation gain N' is plotted against the heading error, the resulting function is independent of the time-to-go and the trajectory shaping parameter! That is, the navigation gain is only a function of the terminal conditions, which can be described by a single (scalar) variable, and the heading error. This greatly simplifies the analysis and interpretation of the navigation gain.

Two Dimensions For convenience, the planar (2-D) engagement was confined to the (x, y) plane of the (x, y, z) rectangular coordinate system. This requires that θ = 0 in Eq. 7.207
\[ V_p(t_f) = V_p \begin{bmatrix} \cos\phi \\ \sin\phi \\ 0 \end{bmatrix} . \tag{7.215} \]
Similarly, setting θ = 0 in Eq. 7.209 gives

\[ V_e(t_f) = V_p \begin{bmatrix} \pm\sqrt{ \left( \frac{V_e(t_f)}{V_p} \right)^2 - \sin^2\phi } \\ \sin\phi \\ 0 \end{bmatrix} . \tag{7.216} \]

Parameter        A        B
r_f              10^-8    10^-12
k_s              0.03     0.06
k_t              0        0
V_p = ||V_p||    1        1.8
V_e = ||V_e||    1        1.5
t_f              10       6
φ                30       30
θ                0        0

Table 7.2. Simulation parameters for navigation gain analysis.

In general V_p will be larger than V_e. However, to examine the guidance law for θ near 0° and 180°, the analysis should be carried out with V_e = V_p. This explains the reason for setting V_e(t_f) equal to V_p in Table 7.1. Clearly, the intent of the simulations is to determine how the parameters influence the navigation gain N'. In the planar engagement, the parameters are k_r, r_f, V_p, V_e, and φ. As with any regression, it is desirable that the explanation of N' be as compact as possible. A very extensive set of simulations was conducted in which the navigation gain was plotted against various functions of the engagement parameters. It was observed that the navigation gain N' can best be explained by the heading error given by Eq. 7.43. However, the true time-to-go should be used rather than the time-to-go given by Eq. 7.47. For a given value of φ, the navigation gain N' is completely described by the heading error γ. That is, the navigation gain is independent of the particular values of k_r, r_f, V_p, and V_e except in how they collectively influence the heading error. As evidence of this, consider the two sets of parameters listed in Table 7.2. Figure 7.5 shows the trajectories that result from the values listed in Table 7.2. Figure 7.6 shows the heading error as a function of time-to-go for the values listed in Table 7.2. Figure 7.7 shows the navigation gain as a function of heading error for both sets of engagement parameters. Figure 7.8 shows the navigation gain as a function of heading error for several values of the elevation angle φ.


Figure 7.5. Trajectories A (green) and B (red) generated from parameters listed in Table 7.2.


Figure 7.6. Heading error curves A (green) and B (red) as a function of time-to-go.


Figure 7.7. Navigation gain as a function of heading error.

[Plot omitted: navigation gain versus heading error (degrees), one curve per φ = 0°, ±20°, ±40°, ±60°, ±80°.]

Figure 7.8. Navigation gain vs. heading error for different values of φ.

Three Dimensions The experiments performed in two dimensions were repeated in three dimensions by varying the azimuth parameter θ in Eq. 7.207. It should be no surprise that, due to symmetry, the simulation results showed no dependence on the value of θ. Therefore, the navigation gain N' is only a function of the heading error γ and the elevation angle φ, as shown in Figure 7.8.

7.11 Endgame Geometry and Final Time tf

The value of t_f is needed by all guidance laws developed in this chapter. The issue of determining t_f was probably first addressed (in a simplistic manner) by Bryson and Ho in [20] and [43]. However, a more rigorous approach was taken by Riggs in 1979 [69]. Other important contributions were developed by Lee in 1985 [48], Ben-Asher and Yaesh in 1997 [9], and more recently by Thak et al. in 2002 [90]. The approach taken in this section is probably most similar to that used by Riggs [69]. The most notable contribution is a closed form solution for the final time when the evader has a constant velocity (Eq. 7.226). To the best of the author's knowledge, this solution has not been arrived at prior to this dissertation. In all of the engagement models discussed in this chapter, the final value of the pursuer control vector is zero. The value of the Hamiltonian H is zero for optimal control, which results in the following equation for all models discussed in this chapter

\[ 0 = k_t + k_r r(t_f)^T \dot{r}(t_f) . \tag{7.217} \]

This condition is consistent with the geometry of the problem. If the value of k_t is zero, then the final separation vector is perpendicular to the velocity difference; the value of r that satisfies such a condition is known as the point of closest approach (PCA). The final value of r is obtained from Eq. 7.15

\[ r(t_f) = r(t) + \tau \dot{r}(t) + \int_0^{\tau} \eta\, \ddot{r}(t_f - \eta)\, d\eta . \tag{7.218} \]

The final value for r˙ can be found from Eq. 7.9 and Eq. 7.10

\[ \dot{r}(t_f) = V_e(t_f) - V_p(t_f) = \dot{r}(t) + \int_0^{\tau} \ddot{r}(t_f - \beta)\, d\beta . \tag{7.219} \]
Substituting Eq. 7.218 and Eq. 7.219 into Eq. 7.217 gives

\[ \left[ r(t) + \tau \dot{r}(t) + \int_0^{\tau} \eta\, \ddot{r}(t_f - \eta)\, d\eta \right]^T \left[ \dot{r}(t) + \int_0^{\tau} \ddot{r}(t_f - \beta)\, d\beta \right] = 0 . \tag{7.220} \]
If the relative separation r(t) and its first two time-derivatives are known, then this equation can be used to solve for the final time t_f.

7.11.1 Approximate Final Time Estimate

An approximate value of the final time, or equivalently the time-to-go τ, can be obtained if the relative acceleration terms are set to zero in Eq. 7.220
\[ \left[ r(t) + \tau \dot{r}(t) \right]^T \dot{r}(t) = 0 . \tag{7.221} \]

Solving for τ gives
\[ \tau = -\frac{r^T r}{r^T \dot{r}} . \tag{7.222} \]
This approximation gives the time required to reach the point of closest approach if neither the pursuer nor the evader accelerates; as such, it is a first-order approximation of the true time-to-go. This estimate can be the starting point for more advanced algorithms that estimate the time-to-go. The time-to-go estimate provided by Eq. 7.222 was developed by Bryson and Ho in 1965 [43] and, surprisingly, still finds use in some modern weapons systems.
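A minimal sketch of the Eq. 7.222 estimate (the function name is an illustrative choice):

```python
import numpy as np

def tgo_first_order(r, rdot):
    # Eq. 7.222: tau = -(r.r)/(r.rdot), i.e. range over closing speed.
    return -float(r @ r) / float(r @ rdot)
```

For a head-on geometry the estimate reduces to range divided by closing speed, which is the form commonly implemented in fielded systems.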

7.11.2 Unconstrained Missile Acceleration

In the case of the ideal pursuer, the final value of r˙ is given by

\[ \dot{r}(t_f) = \dot{r} + \int_t^{t_f} \ddot{r}(\beta)\, d\beta \]
\[ = \dot{r} + \int_t^{t_f} A_e(\beta)\, d\beta - \int_t^{t_f} A_p(\beta)\, d\beta \]
\[ = \dot{r} + \int_t^{t_f} A_e(\beta)\, d\beta - \int_t^{t_f} k_r (t_f - \beta)\, r(t_f)\, d\beta \]
\[ = \dot{r} + \int_t^{t_f} A_e(\beta)\, d\beta - k_r \frac{\tau^2}{2}\, r(t_f) \]
\[ = \dot{r} + \int_t^{t_f} A_e(\beta)\, d\beta - \frac{k_r \tau^2 / 2}{1 + k_r \tau^3 / 3}\, \overrightarrow{ZEM} \]
\[ \simeq \dot{r} + \int_t^{t_f} A_e(\beta)\, d\beta - \frac{3}{2\tau}\, \overrightarrow{ZEM} . \tag{7.223} \]
Substituting Eq. 7.63 and Eq. 7.223 into Eq. 7.217 gives

\[ \overrightarrow{ZEM}^T \left[ \dot{r} + \int_0^{\tau} A_e\, d\gamma - \frac{3}{2\tau}\, \overrightarrow{ZEM} \right] = 0 . \tag{7.224} \]

For a known target acceleration profile, the exact value of τ = t_f − t can be found with a nonlinear zero-crossing detector, such as Newton iteration. Suppose that the evader has a constant velocity

\[ 0 = \overrightarrow{ZEM}^T \left( \dot{r} - \frac{3}{2\tau}\, \overrightarrow{ZEM} \right) \]
\[ = \overrightarrow{ZEM}^T \left( 2 \dot{r} \tau - 3\, \overrightarrow{ZEM} \right) \]
\[ = (r + \tau \dot{r})^T (3 r + \tau \dot{r}) \]
\[ = 3\, r^T r + 4\, r^T \dot{r}\, \tau + \dot{r}^T \dot{r}\, \tau^2 . \tag{7.225} \]

The roots of this equation are

\[ \tau = \frac{ -4\, r^T \dot{r} \pm \sqrt{ 16 \left( r^T \dot{r} \right)^2 - 12 \left( r^T r \right)\left( \dot{r}^T \dot{r} \right) } }{ 2\, \dot{r}^T \dot{r} } . \tag{7.226} \]

Define r and r˙ to be the magnitudes of r and r˙, respectively. The angle between r and r˙ is denoted by β

\[ \tau = \frac{ -4\, r^T \dot{r} \pm \sqrt{ 16 \left( r^T \dot{r} \right)^2 - 12 \left( r^T r \right)\left( \dot{r}^T \dot{r} \right) } }{ 2\, \dot{r}^T \dot{r} } \]
\[ = \frac{r}{\dot{r}} \left[ -2\cos\beta \pm \frac{1}{2}\sqrt{ 16\cos^2\beta - 12 } \right] \]
\[ = \frac{r}{\dot{r}} \left[ -2\cos\beta \pm \frac{1}{2}\sqrt{ 16\left( 1 - \sin^2\beta \right) - 12 } \right] \]
\[ = \frac{r}{\dot{r}} \left[ -2\cos\beta \pm \frac{1}{2}\sqrt{ 4 - 16\sin^2\beta } \right] \]
\[ = \frac{r}{\dot{r}} \left[ -2\cos\beta \pm \sqrt{ 1 - 4\sin^2\beta } \right] . \tag{7.227} \]
Having developed an accurate solution for the time-to-go, as given by either Eq. 7.226 or Eq. 7.227, it should be preferred to the rougher approximation provided by Eq. 7.222.
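A quick numerical check of Eqs. 7.226/7.227 (the code and sample geometry are illustrative): the closed-form roots should satisfy the quadratic of Eq. 7.225.

```python
import numpy as np

def tgo_constant_velocity(r, rdot):
    # Eq. 7.227: tau = (r/rdot) [-2 cos(beta) +/- sqrt(1 - 4 sin^2(beta))].
    rm, vm = np.linalg.norm(r), np.linalg.norm(rdot)
    cosb = float(r @ rdot) / (rm * vm)
    disc = 1.0 - 4.0 * (1.0 - cosb**2)   # 1 - 4 sin^2(beta)
    if disc < 0.0:
        return None                       # Eq. 7.225 has no real solution
    s = np.sqrt(disc)
    return (rm / vm) * (-2.0 * cosb - s), (rm / vm) * (-2.0 * cosb + s)
```

Real roots require sin²β ≤ 1/4, and a positive time-to-go requires a closing geometry (cos β < 0), which is consistent with the intercept interpretation of Eq. 7.225.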

Chapter 8 Estimating the Zero Effort Miss

All guidance laws developed in Chapter 7 make use of the ZEM. A rather general form of the ZEM is given by Eq. 7.36 and repeated here

\[ \overrightarrow{ZEM} = r(t) + \tau \dot{r}(t) + \int_0^{\tau} \eta\, A_e(t_f - \eta)\, d\eta - C_p F_p^{-2} \left[ e^{F_p \tau} - I - \tau F_p \right] m_p(t) . \tag{8.1} \]
A pursuer (missile) will not be able to directly measure all variables that are necessary to compute the ZEM. Even if the pursuer were able to directly measure these variables, the measurements would be contaminated by noise. Therefore, in any practical situation the variables making up the ZEM must be estimated from knowledge available to the pursuer. As discussed in Chapter 3, the sum of all knowledge available at time t is referred to as the information state and is denoted by D(t). At time t_0 the information state typically consists of an estimate x̂_{0/0} of the state x(t_0) and an associated covariance P_{0/0}

\[ D(0) = \left\{ \hat{x}_{0/0},\ P_{0/0} \right\} . \tag{8.2} \]
At time instances denoted by t_k, measurements z_k become available and are added to the information state

\[ D(t) = \left\{ \hat{x}_{0/0},\ P_{0/0},\ z_k : t_0 \le t_k \le t \right\} . \tag{8.3} \]
From the information state, estimates are formed of the quantities required to compute the ZEM,

which are the LOS r, the time derivative of the LOS r˙, the pursuer acceleration Ap, and the evader acceleration Ae

\[ \hat{r} = E\left\{ r \mid D(t) \right\} \tag{8.4} \]
\[ \hat{\dot{r}} = E\left\{ \dot{r} \mid D(t) \right\} \tag{8.5} \]
\[ \hat{m}_p = E\left\{ m_p \mid D(t) \right\} \tag{8.6} \]
\[ \hat{A}_e = E\left\{ A_e \mid D(t) \right\} . \tag{8.7} \]

Any deterministic guidance law which makes use of the \overrightarrow{ZEM}, given by Eq. 8.1, will have a complementary stochastic guidance law which makes use of the ZEM with all quantities replaced by their estimates (separation theorem). That is, the estimate of the ZEM is then given by¹

\[ \widehat{\overrightarrow{ZEM}} = \hat{r}(t) + \tau \hat{\dot{r}}(t) + \int_0^{\tau} \eta\, \hat{A}_e(t_f - \eta)\, d\eta - C_p F_p^{-2} \left[ e^{F_p \tau} - I - \tau F_p \right] \hat{m}_p(t) \]
\[ = \hat{r}(t) + \tau \hat{\dot{r}}(t) + \int_t^{t_f} (t_f - \beta)\, \hat{A}_e(\beta)\, d\beta - C_p F_p^{-2} \left[ e^{F_p \tau} - I - \tau F_p \right] \hat{m}_p(t) . \tag{8.8} \]

Any information about the target can be used to assist in specifying Â_e(t), and can be considered as part of the information state. It would be impossible to discuss all types of information that can be used to assist in estimating the target's acceleration profile. However, there are some very common models that form the basis of many other more complex modeling schemes.

8.1.1 Step Change in Target Acceleration

It is possible that an evader will perform a step change in target acceleration. However, the popularity of this model more likely arises out of its simplicity than its applicability. When one is unsure of the application time or magnitude of the step maneuver, a multiple-model Kalman filter can be used to improve accuracy. This discussion will only show the form of the ZEM that results by assuming that the target acceleration begins at the current time t and continues for the remainder of the engagement. It is most common to assume that the direction of the evader's acceleration is normal to the LOS. However, this does not completely specify the direction of the acceleration in three-dimensional space. Assume that the direction of the acceleration is either known or estimated and is denoted by u_⊥. Then the evader's acceleration is given by

\[ \hat{A}_e(t) = S(t)\, u_{\perp} , \tag{8.9} \]
where S(t) is the step function. Substituting this result into Eq. 8.8 gives

¹Strictly speaking, the time-to-go τ will also have to be replaced by its estimate τ̂ = t̂_f − t.

\[ \widehat{\overrightarrow{ZEM}} = \hat{r}(t) + \tau \hat{\dot{r}}(t) + \frac{\tau^2}{2}\, u_{\perp} - C_p F_p^{-2} \left[ e^{F_p \tau} - I - \tau F_p \right] \hat{m}_p(t) . \tag{8.10} \]
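The τ²/2 weight on u_⊥ in Eq. 8.10 is simply the integral ∫_0^τ η dη from Eq. 8.8 evaluated for a unit-step acceleration; a minimal numerical check (the function name is an illustrative choice):

```python
def step_zem_coeff(tau, n=100000):
    # Midpoint-rule quadrature of integral_0^tau eta d(eta),
    # the u_perp weight in Eq. 8.10; should equal tau^2 / 2.
    d = tau / n
    return d * sum((i + 0.5) * d for i in range(n))
```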

It is often the case that very little is known about the target, and one must resort to stochastic motion modeling. When absolutely nothing is known about the target's acceleration profile, it must be modeled as zero mean. Anything else would indicate a bias, which contradicts the assumption of no knowledge. The "no knowledge" assumption also implies that future target motion is completely independent of the current information set and the conditional expectation is equivalent to the unconditional expectation

\[ E\left\{ A_e(\beta) \mid D(t),\ \beta > t \right\} = E\left\{ A_e(\beta) \right\} = 0 . \tag{8.11} \]

Substituting this result into Eq. 8.8 gives

\[ \widehat{\overrightarrow{ZEM}} = \hat{r}(t) + \tau \hat{\dot{r}}(t) - C_p F_p^{-2} \left[ e^{F_p \tau} - I - \tau F_p \right] \hat{m}_p(t) . \tag{8.12} \]

8.1.3 Correlated Target Acceleration

In some situations, it may be appropriate to assume that the target's acceleration is correlated in time. A correlated acceleration can be achieved by a single pole (first order) model. That is, the single pole model will shape a white noise input a_e to form a correlated target acceleration A_e. The matrices describing the evader's dynamics in Eq. 7.8 are given by

\[ F_e = -\frac{1}{\tau_e} I_3 \tag{8.13a} \]
\[ G_e = \frac{1}{\tau_e} I_3 \tag{8.13b} \]
\[ C_e = I_3 . \tag{8.13c} \]

The state transition matrix is given by

\[ e^{F_e t} = e^{-t/\tau_e} I_3 . \tag{8.14} \]

The state transition matrix can be used to express the evader acceleration for time β ≥ t
\[ A_e(\beta) = C_e e^{F_e(\beta - t)} A_e(t) + C_e \int_t^{\beta} e^{F_e(\beta - \gamma)} G_e\, a_e(\gamma)\, d\gamma \]
\[ = e^{F_e(\beta - t)} A_e(t) + \frac{1}{\tau_e} \int_t^{\beta} e^{F_e(\beta - \gamma)} a_e(\gamma)\, d\gamma \]
\[ = e^{-(\beta - t)/\tau_e} A_e(t) + \frac{1}{\tau_e} \int_t^{\beta} e^{-(\beta - \gamma)/\tau_e} a_e(\gamma)\, d\gamma . \tag{8.15} \]

Since the input a_e(γ) is white, it is uncorrelated with the information set D(t) for all γ > t

\[ E\left\{ a_e(\gamma) \mid D(t),\ \gamma > t \right\} = E\left\{ a_e(\gamma) \right\} = 0 . \tag{8.16} \]

Taking the conditional expectation of Eq. 8.15

\[ \hat{A}_e(\beta) = E\left\{ A_e(\beta) \mid D(t) \right\} = e^{-(\beta - t)/\tau_e} E\left\{ A_e(t) \mid D(t) \right\} + \frac{1}{\tau_e} \int_t^{\beta} e^{-(\beta - \gamma)/\tau_e} E\left\{ a_e(\gamma) \mid D(t) \right\} d\gamma . \tag{8.17} \]

Substituting Eq. 8.16 into this result gives

\[ \hat{A}_e(\beta) = e^{-(\beta - t)/\tau_e} E\left\{ A_e(t) \mid D(t) \right\} = e^{-(\beta - t)/\tau_e} \hat{A}_e(t) . \tag{8.18} \]

The integral term involving the target acceleration in Eq. 8.8 can be evaluated using Eq. 8.18

\[ \int_t^{t_f} (t_f - \beta)\, \hat{A}_e(\beta)\, d\beta = \left( \int_t^{t_f} (t_f - \beta)\, e^{-(\beta - t)/\tau_e}\, d\beta \right) \hat{A}_e(t) \]
\[ = \left( \int_0^{\tau} \gamma\, e^{-(\tau - \gamma)/\tau_e}\, d\gamma \right) \hat{A}_e(t) \]
\[ = e^{-\tau/\tau_e} \left( \int_0^{\tau} \gamma\, e^{\gamma/\tau_e}\, d\gamma \right) \hat{A}_e(t) \]
\[ = e^{-\tau/\tau_e}\, \tau_e^2 \left[ \left( \frac{\tau}{\tau_e} - 1 \right) e^{\tau/\tau_e} + 1 \right] \hat{A}_e(t) \]
\[ = \tau_e^2 \left( \bar{h} - 1 + e^{-\bar{h}} \right) \hat{A}_e(t) , \tag{8.19} \]
where
\[ \bar{h} = \frac{\tau}{\tau_e} . \tag{8.20} \]
Substituting Eq. 8.19 into Eq. 8.8 gives

\[ \widehat{\overrightarrow{ZEM}} = \hat{r}(t) + \tau \hat{\dot{r}}(t) + \int_t^{t_f} (t_f - \beta)\, \hat{A}_e(\beta)\, d\beta - C_p F_p^{-2} \left[ e^{F_p \tau} - I - \tau F_p \right] \hat{m}_p(t) \]
\[ = \hat{r}(t) + \tau \hat{\dot{r}}(t) + \tau_e^2 \left( \bar{h} - 1 + e^{-\bar{h}} \right) \hat{A}_e(t) - C_p F_p^{-2} \left[ e^{F_p \tau} - I - \tau F_p \right] \hat{m}_p(t) . \tag{8.21} \]
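The closed-form weight in Eq. 8.19 can be checked against a direct numerical quadrature of the integral it evaluates; the midpoint rule and the sample values below are illustrative choices:

```python
import math

def zem_weight(tau, tau_e):
    # Eq. 8.19/8.20: tau_e^2 (h - 1 + e^{-h}), with h = tau/tau_e.
    h = tau / tau_e
    return tau_e**2 * (h - 1.0 + math.exp(-h))

def zem_weight_quadrature(tau, tau_e, n=100000):
    # Midpoint rule for integral_0^tau gamma * exp(-(tau - gamma)/tau_e) dgamma.
    dg = tau / n
    return dg * sum((i + 0.5) * dg * math.exp(-(tau - (i + 0.5) * dg) / tau_e)
                    for i in range(n))
```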

8.1.4 Optimal Evasion

Optimal evasion was discussed in Section 7.8. When the pursuer’s controls are decoupled (i.e. when the commanded acceleration along a given axis only affects the achieved acceleration along that same axis) then the resulting guidance command is given by Eq. 7.139 and repeated here

\[ a_p = N\!\left( \tau_p, k_r, h, \gamma \right) \frac{1}{\tau^2}\, \overrightarrow{ZEM} , \tag{8.22} \]
where the ZEM estimate is given by Eq. 8.1 with A_e = 0. The corresponding estimate of the ZEM is given by Eq. 8.21 with Â_e = 0

\[ \widehat{\overrightarrow{ZEM}} = \hat{r}(t) + \tau \hat{\dot{r}}(t) - C_p F_p^{-2} \left[ e^{F_p \tau} - I - \tau F_p \right] \hat{m}_p(t) . \tag{8.23} \]
When the evader is modeled as "optimal" it is not necessary to estimate the evader's acceleration (as it is already known by assumption). The effect of the optimal evader assumption is to adjust the magnitude of the acceleration command; the direction is still collinear with the ZEM. Should the evader choose a suboptimal acceleration, then the game theoretic cost will decrease (which is good from a pursuit point of view). However, the corresponding optimal control cost will increase because an incorrect model of the target's acceleration was assumed. For example, assume the evader performs a step change in acceleration (see Eq. 8.9). The cost given by Eq. 7.70 will be larger for the game theoretic control law than for the optimal control law with the ZEM given by Eq. 8.10.

8.1.5 Summary

The different target model assumptions and their impact on the ZEM are summarized in Table 8.1.

   Target Model               Contribution to ZEM
1  General Model              ∫_t^{t_f} (t_f − β) Â_e(β) dβ
2  Step in Direction u_⊥      (τ²/2) u_⊥
3  Uncorrelated Noise         0
4  Correlated White Noise     τ_e² (h̄ − 1 + e^{−h̄}) Â_e(t)
5  Optimal²                   0

Table 8.1. Target model assumptions and impact on ZEM equation.

8.2 Flight Control System Modeling

There is generally much more knowledge about the missile's state than the target's state. In fact, most missiles have an on board accelerometer or inertial measurement unit (IMU) that directly measures incremental changes in the missile's linear velocity and angular orientation. These quantities can be used to form a very accurate estimate of the state of the missile, including its acceleration. Depending on the

flight control system model, the pursuer's dynamic state mp (t) may consist of only the current acceleration, but increased accuracy is achieved if other "higher order" terms are also included in mp (t). From a guidance perspective, it is desirable to keep mp (t) as simple as possible without a serious compromise in performance. A simple model will result in a more compact and well understood guidance scheme. The three-loop autopilot developed in Chapter 6 is very closely approximated by the transfer function given by Eq. 6.56. However, even this model is quite complex. This complexity not only makes guidance synthesis extremely difficult, but the resulting model will have several parameters that can only be approximated in an actual implementation. This creates a very serious analysis problem because the designer must be sure the resulting guidance scheme is robust to uncertainties in these parameters. Furthermore, from an implementation point of view, the state model chosen becomes an important consideration. For example, Zarchan [102] developed and analyzed a guidance law for the transfer function given by Eq. 6.56.

However, the states used by Zarchan's guidance law are not directly measurable. On the contrary, the approach taken by Aggerwall [4] was to select the state variables directly from the three-loop autopilot topology shown in Figure 6.9. The advantage of Aggerwall's approach is that it leads to a state space representation in which all variables are directly measurable (this is necessary because these are the states used by the autopilot). Further, but somewhat dated, information on this topic can be found in Stallard's Ph.D. dissertation [84]. Regardless of the flight control system model, the pursuer's dynamic state is estimated separately from the target's state. The accuracy with which the pursuer's state can be measured is so much better than that of the target's state that it has little impact on guidance system performance³. Estimating the state of the missile from IMU measurements or other measurements is a topic of many textbooks [17].

8.2.1 Single Pole Flight Control System

The single pole flight control system was discussed in Section 6.4 and Section 7.7. The single pole flight control system is applicable when the flight control system is adequately modeled by a first-order time lag. The system matrices describing the single pole flight control system are given by Eq. 7.96. The ZEM that results from the single pole model is given by Eq. 7.98 and repeated here with all states replaced by their estimates

\[ \widehat{\overrightarrow{ZEM}} = \hat{r}(t) + \tau \hat{\dot{r}}(t) + \int_t^{t_f} (t_f - \beta)\, \hat{A}_e(\beta)\, d\beta - C_p F_p^{-2} \left[ e^{F_p \tau} - I - \tau F_p \right] \hat{m}_p(t) \]
\[ = \hat{r}(t) + \tau \hat{\dot{r}}(t) + \int_t^{t_f} (t_f - \beta)\, \hat{A}_e(\beta)\, d\beta - \tau_p^2 \left( h - 1 + e^{-h} \right) \hat{m}_p(t) , \tag{8.24} \]
where
\[ h = \frac{\tau}{\tau_p} . \]

³Prior to acquiring the target (and therefore prior to terminal guidance), the accuracy of the missile's state estimate is very important. That is, without accurate knowledge of its own state, it would be difficult for a missile to orient itself such that acquisition is possible.

In this case, the pursuer’s acceleration state mp coresponds to the pursuer’s instaneous acceleration Ap

\[ m_p = A_p , \tag{8.25} \]
resulting in the following equation for the ZEM estimate

\[ \widehat{\overrightarrow{ZEM}} = \hat{r}(t) + \tau \hat{\dot{r}}(t) + \int_t^{t_f} (t_f - \beta)\, \hat{A}_e(\beta)\, d\beta - \tau_p^2 \left( h - 1 + e^{-h} \right) \hat{A}_p(t) . \]
If the evader is characterized by the correlated acceleration model, then the estimate of the ZEM is found by substituting Eq. 8.19 into Eq. 8.24

\[ \widehat{\overrightarrow{ZEM}} = \hat{r}(t) + \tau \hat{\dot{r}}(t) + \tau_e^2 \left( \bar{h} - 1 + e^{-\bar{h}} \right) \hat{A}_e(t) - \tau_p^2 \left( h - 1 + e^{-h} \right) \hat{A}_p(t) . \tag{8.26} \]

\[
\widehat{\overrightarrow{ZEM}} = \hat{r}(t) + \tau\,\dot{\hat{r}}(t) - \tau_p^2\left( h - 1 + e^{-h} \right) \hat{A}_p(t)\,. \qquad (8.27)
\]

8.2.2 Fast Flight Control System

If the flight control system is very fast then τ_p will be very small, and therefore h = τ/τ_p will be very large. In this limit the flight control contribution τ_p²(h − 1 + e^{−h}) ≈ τ τ_p vanishes, and Eq. 8.24 reduces to
\[
\widehat{\overrightarrow{ZEM}} = \hat{r}(t) + \tau\,\dot{\hat{r}}(t) + \int_t^{t_f} (t_f-\beta)\,\hat{A}_e(\beta)\,d\beta\,. \qquad (8.28)
\]
The same result could have been obtained without taking the limit by use of Eq. 7.24. Another interpretation of the ZEM provided by Eq. 8.28 is that the pursuer has no knowledge about its own acceleration m̂_p(t), in which case it would be zero mean. Such a case might arise if the pursuer does not have the capability to estimate m̂_p(t) due to a lack of instruments such as an inertial measurement unit (IMU) or Global Positioning System (GPS) receiver.

Flight Control System Model      Contribution to ZEM
1  Generic                       C_p F_p^{-2} [ e^{F_p τ} − I − τ F_p ] m̂_p(t)
2  Single Pole                   τ_p² ( h − 1 + e^{−h} ) Â_p(t)
3  Fast                          0

Table 8.2. Flight control system model assumptions and impact on ZEM equation

8.2.3 Summary

The different flight control system models and their impact on the ZEM are summarized in Table 8.2.

8.3 Estimation

It would be difficult to provide a comprehensive discussion of the many estimators that have been developed for estimating the variables required to evaluate the ZEM.

It has already been mentioned that the missile's dynamic state, denoted by m_p(t), should be estimated separately from the other components of the ZEM. This is possible because (among other factors) of the high accuracy with which the missile's dynamic state can be estimated. Thus, the remaining quantities to be estimated are the current values of the LOS vector r and its time derivative ṙ, as well as the target's acceleration A_e. This problem often falls under the category of target state estimation, or under even more specific terms such as tracking. A good starting point for research in this area is the text by Blackman and Popoli [12] or the somewhat dated, but well respected, text by Bar-Shalom and Fortmann [7]. For research more specific to the problem at hand, the reader is referred to the excellent Ph.D. dissertation by Pearson [65], [66] and the important paper by Daum and Fitzgerald [26]. The goal of this section is to present a simple filter that can be used to efficiently and accurately track a target, i.e. provide estimates of r, ṙ, and A_e. A decoupled Kalman filter is used for three reasons [26]: (1) reduction in computational requirements, (2) reduction of ill-conditioning, and (3) mitigation of the ill effects of certain nonlinearities. The term "decoupled" means that there is a separate filter for each of the three spatial axes, which are chosen to minimize any potential correlation between them. That is, the axes are chosen such that they are along the principal axes of the error covariance ellipsoid. To further illustrate, suppose that it is unnecessary to estimate the target acceleration, a situation which might result from modeling the target acceleration as white noise.
In this case, the covariance matrix is a 6 × 6 matrix representing estimation errors in position and velocity in three Cartesian coordinate directions, and it can be expected that the only significant off-diagonal elements will be those which represent cross correlations between position and velocity errors in the same axis. If the other (small) elements are neglected, the resulting matrix can be represented as three distinct uncoupled 2 × 2 matrices [26, p. 275]. The first sub-section that follows is a derivation of the kinematic equations that form the backbone of the decoupled target state estimator. The next sub-sections make use of these kinematic equations to form three decoupled estimators, one for each spatial degree of freedom.
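To illustrate the decoupling argument numerically, the sketch below builds a hypothetical 6 × 6 position/velocity covariance whose only significant off-diagonal terms couple position and velocity on the same axis, and shows that keeping just the three 2 × 2 diagonal blocks loses nothing in this idealized case. All numbers are invented for illustration, not taken from the dissertation:

```python
import numpy as np

# Hypothetical 6x6 covariance, states ordered (x, vx, y, vy, z, vz).
# Only the position/velocity pair on the SAME axis is correlated.
P = np.diag([100.0, 25.0, 100.0, 25.0, 100.0, 25.0])
for i in (0, 2, 4):
    P[i, i + 1] = P[i + 1, i] = 30.0   # pos/vel cross correlation, one axis

# Decoupled representation: three independent 2x2 covariance matrices.
blocks = [P[i:i + 2, i:i + 2] for i in (0, 2, 4)]

# Reassembling the block-diagonal matrix reproduces P exactly here,
# because the neglected cross-axis elements are zero in this example.
P_block = np.zeros_like(P)
for block, i in zip(blocks, (0, 2, 4)):
    P_block[i:i + 2, i:i + 2] = block

print(np.allclose(P, P_block))
```

In a real engagement the cross-axis terms are small but not exactly zero, so the three 2 × 2 filters are an approximation traded for the computational and conditioning benefits listed above.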

8.3.1 Kinematic Estimator Equations

Figure 8.1 shows a diagram of the target's location relative to the missile. The missile's radar sensor provides measurements of the target relative to the seeker boresight axis, shown in green in Figure 8.1. Three coordinate frames are depicted in Figure 8.1. The primary coordinate frame is the antenna frame, which has its first axis along the antenna's electrical boresight. The other two axes of the antenna frame are the axes about which the two components of the angle tracking error are measured by the two channels of the angle error receiver. An inertial coordinate system is also shown in Figure 8.1. At the instant depicted in Figure 8.1, the inertial coordinate system is aligned with the antenna frame. This is simply a matter of convenience

[Figure 8.1 here: engagement geometry relative to the seeker boresight, showing the pursuer, the evader, the line of sight, the boresight angles λ_az and λ_el, the seeker axes (x, y, z), and the inertial coordinate frame (X, Y, Z).]

Figure 8.1. Engagement relative to seeker boresight.

and has nothing to do with small angle approximations. The inertial coordinate system has axes denoted by (X, Y, Z). The unit vectors (Î, Ĵ, K̂) are defined to be along the first, second and third axes, respectively, of the inertial coordinate system. The third coordinate frame shown, which will be referred to as the line-of-sight (or

LS) frame, is defined by rotating the antenna frame through the boresight angles λ_az and λ_el. The unit vectors (ı̂, ĵ, k̂) are defined to be along the first, second and third axes, respectively, of the LS coordinate frame. The transformation from the line-of-sight (LS) frame to the antenna (A) frame is given by

\[
r^{(A)} = \begin{bmatrix} \cos\lambda_{az} & -\sin\lambda_{az} & 0 \\ \sin\lambda_{az} & \cos\lambda_{az} & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \cos\lambda_{el} & 0 & -\sin\lambda_{el} \\ 0 & 1 & 0 \\ \sin\lambda_{el} & 0 & \cos\lambda_{el} \end{bmatrix} r^{(LS)}
= \begin{bmatrix} \cos\lambda_{az}\cos\lambda_{el} & -\sin\lambda_{az} & -\cos\lambda_{az}\sin\lambda_{el} \\ \sin\lambda_{az}\cos\lambda_{el} & \cos\lambda_{az} & -\sin\lambda_{az}\sin\lambda_{el} \\ \sin\lambda_{el} & 0 & \cos\lambda_{el} \end{bmatrix} r^{(LS)} \simeq R_{LS}^{A}\, r^{(LS)}\,, \qquad (8.29)
\]
where
\[
R_{LS}^{A} = \begin{bmatrix} 1 & -\lambda_{az} & -\lambda_{el} \\ \lambda_{az} & 1 & 0 \\ \lambda_{el} & 0 & 1 \end{bmatrix}\,. \qquad (8.30)
\]
Alternately, the reverse transformation is

\[
r^{(LS)} = \left( R_{LS}^{A} \right)^T r^{(A)}\,. \qquad (8.31)
\]
For the purposes of developing a target state estimator, it is necessary to develop a series of relationships between the kinematic state variables. The following development makes direct use of the equations developed in Appendix B. In the development that follows, the LOS is referred to, as it is elsewhere in the dissertation, by r. The time derivative of the LOS vector relative to the LS coordinate system is denoted by \( \dot{r}_T|_{LS} \). The meaning of and reason for this notation is clearly explained in Appendix B; suffice it to say here that the derivative of the LOS vector with respect to one coordinate system (e.g. the antenna frame) will be different than the derivative with respect to another coordinate system with a common origin but with a different angular velocity (e.g. the LS frame). This convention leads to the following set of equations for the range.

\[
r = r\,\hat{\imath} \qquad (8.32a)
\]
\[
\dot{r}_T|_{LS} = \dot{r}\,\hat{\imath} \qquad (8.32b)
\]
\[
\ddot{r}_T|_{LS} = \ddot{r}\,\hat{\imath} \qquad (8.32c)
\]
The angular velocity Ω and angular acceleration Ω̇ of the LOS frame relative to the inertial frame are given by the following equations.
\[
\Omega = \Omega_x\hat{\imath} + \Omega_y\hat{\jmath} + \Omega_z\hat{k} \qquad (8.33)
\]
\[
\dot{\Omega} = \dot{\Omega}_x\hat{\imath} + \dot{\Omega}_y\hat{\jmath} + \dot{\Omega}_z\hat{k} \qquad (8.34)
\]

The target's inertial acceleration \( \ddot{r}_T|_I \) is given by Eq. B.163:

\[
\ddot{r}_T|_I = \ddot{r}_M|_I + \ddot{r}_T|_{LS} + \dot{\Omega}\times r + \Omega\times(\Omega\times r) + 2\,\Omega\times\dot{r}_T|_{LS}
= \ddot{r}_M|_I + \ddot{r}\,\hat{\imath} + \dot{\Omega}\times r\hat{\imath} + \Omega\times(\Omega\times r\hat{\imath}) + 2\,\Omega\times\dot{r}\,\hat{\imath}\,. \qquad (8.35)
\]

To evaluate this expression, the second order cross product is needed:
\[
\Omega\times(\Omega\times r\hat{\imath}) = \left( \Omega_x\hat{\imath} + \Omega_y\hat{\jmath} + \Omega_z\hat{k} \right)\times\left( -\Omega_y r\hat{k} + \Omega_z r\hat{\jmath} \right)
\]
\[
= -\Omega_y r\left( \Omega_x\hat{\imath} + \Omega_y\hat{\jmath} \right)\times\hat{k} + \Omega_z r\left( \Omega_x\hat{\imath} + \Omega_z\hat{k} \right)\times\hat{\jmath}
= -\Omega_y r\left( -\Omega_x\hat{\jmath} + \Omega_y\hat{\imath} \right) + \Omega_z r\left( \Omega_x\hat{k} - \Omega_z\hat{\imath} \right)
\]
\[
= -r\left( \Omega_z^2 + \Omega_y^2 \right)\hat{\imath} + \Omega_y\Omega_x r\,\hat{\jmath} + \Omega_z\Omega_x r\,\hat{k}\,. \qquad (8.36)
\]
Substituting this result into Eq. 8.35 gives the following equation.
\[
\ddot{r}_T|_I = \ddot{r}_M|_I + \ddot{r}\,\hat{\imath} + \left( \dot{\Omega}_y\hat{\jmath} + \dot{\Omega}_z\hat{k} \right)\times r\hat{\imath} - r\left( \Omega_z^2 + \Omega_y^2 \right)\hat{\imath} + \Omega_y\Omega_x r\,\hat{\jmath} + \Omega_z\Omega_x r\,\hat{k} + 2\left( \Omega_y\hat{\jmath} + \Omega_z\hat{k} \right)\times\dot{r}\,\hat{\imath}
\]
\[
= \ddot{r}_M|_I + \ddot{r}\,\hat{\imath} - r\dot{\Omega}_y\hat{k} + r\dot{\Omega}_z\hat{\jmath} - r\left( \Omega_z^2 + \Omega_y^2 \right)\hat{\imath} + \Omega_y\Omega_x r\,\hat{\jmath} + \Omega_z\Omega_x r\,\hat{k} - 2\dot{r}\Omega_y\hat{k} + 2\dot{r}\Omega_z\hat{\jmath}\,. \qquad (8.37)
\]

Let \( \ddot{r}_T|_I^{(LS)} \), which denotes the target's acceleration relative to the inertial frame and expressed in the LS frame, have components denoted by the following equation.
\[
\ddot{r}_T|_I^{(LS)} = \begin{bmatrix} \ddot{r}_{Tx} \\ \ddot{r}_{Ty} \\ \ddot{r}_{Tz} \end{bmatrix} \qquad (8.38)
\]
Similarly, let \( \ddot{r}_M|_I^{(LS)} \), which denotes the missile's acceleration relative to the inertial frame and expressed in the LS frame, have components denoted by the following equation.
\[
\ddot{r}_M|_I^{(LS)} = \begin{bmatrix} \ddot{r}_{Mx} \\ \ddot{r}_{My} \\ \ddot{r}_{Mz} \end{bmatrix} \qquad (8.39)
\]

Thus, in the line-of-sight (LS) frame, the target’s acceleration (relative to the inertial frame) is given by the following equation.

\[
\ddot{r}_T|_I^{(LS)} = \ddot{r}_M|_I^{(LS)} + \begin{bmatrix} \ddot{r} - r\left( \Omega_y^2 + \Omega_z^2 \right) \\ r\dot{\Omega}_z + 2\dot{r}\Omega_z + \Omega_y\Omega_x r \\ -r\dot{\Omega}_y - 2\dot{r}\Omega_y + \Omega_z\Omega_x r \end{bmatrix} \qquad (8.40)
\]
Solving these equations for \( \ddot{r} \), \( \dot{\Omega}_y \), and \( \dot{\Omega}_z \) gives the following equations.
\[
\ddot{r} = \left( \ddot{r}_{Tx} - \ddot{r}_{Mx} \right) + r\left( \Omega_y^2 + \Omega_z^2 \right) \qquad (8.41a)
\]
\[
\dot{\Omega}_z = \frac{\ddot{r}_{Ty} - \ddot{r}_{My}}{r} - \frac{2\dot{r}\Omega_z}{r} - \Omega_y\Omega_x \qquad (8.41b)
\]
\[
\dot{\Omega}_y = -\frac{\ddot{r}_{Tz} - \ddot{r}_{Mz}}{r} - \frac{2\dot{r}\Omega_y}{r} - \Omega_z\Omega_x \qquad (8.41c)
\]
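Equations 8.41a through 8.41c can be evaluated directly once the accelerations and LOS rates are available. The sketch below is a minimal transcription; the numbers in the example call are hypothetical, chosen only to represent a plausible engagement snapshot:

```python
def los_kinematics(r, r_dot, Om_x, Om_y, Om_z, aT, aM):
    """Evaluate Eq. 8.41: range acceleration and LOS angular accelerations.

    aT, aM -- (x, y, z) target and missile inertial accelerations,
    expressed in the LS frame."""
    r_ddot = (aT[0] - aM[0]) + r * (Om_y**2 + Om_z**2)                      # Eq. 8.41a
    Om_z_dot = (aT[1] - aM[1]) / r - 2.0 * r_dot * Om_z / r - Om_y * Om_x   # Eq. 8.41b
    Om_y_dot = -(aT[2] - aM[2]) / r - 2.0 * r_dot * Om_y / r - Om_z * Om_x  # Eq. 8.41c
    return r_ddot, Om_z_dot, Om_y_dot

# Hypothetical snapshot: 5 km range, 600 m/s closing speed, small LOS rates.
print(los_kinematics(5000.0, -600.0, 0.0, 0.001, 0.002,
                     aT=(0.0, 30.0, 0.0), aM=(0.0, 20.0, 0.0)))
```

Note how the negative range rate (closing geometry) makes the −2ṙΩ/r terms amplify any existing LOS rate, which is the familiar reason LOS rates grow near intercept.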

The equations for Ω˙ z and Ω˙ y can be integrated to yield equations for Ωz and Ωy.

Some explanation is in order for the term Ω_x. The goal is to determine the angular velocity of the LOS, which is a vector. The vector will not have any intrinsic roll (a component of the angular velocity along the vector itself); it is not possible to define one. Should Ω_x be zero then? Definitely not, because the angular velocity Ω will be used by the Kalman filter to predict future boresight angles. This prediction is only possible if the LOS coordinate system is rolled along with the antenna, so that the measured boresight angles lie along the same axes as the angular velocity vector Ω. Therefore, the component of the antenna roll rate along the LOS is the quantity Ω_x. Let Ḋ⁴ denote the angular velocity of the antenna coordinate system:
\[
\dot{D}^{(A)} = \begin{bmatrix} \dot{D}_X \\ \dot{D}_Y \\ \dot{D}_Z \end{bmatrix}\,. \qquad (8.42)
\]
Using Eq. 8.29, the quantity Ω_x is given by
\[
\Omega_x = \dot{D}_X + \lambda_{az}\dot{D}_Y + \lambda_{el}\dot{D}_Z\,. \qquad (8.43)
\]

⁴ The use of the letter D can be found elsewhere in the literature and is likely used to represent the seeker "D"ish.

The second two components are often neglected because the boresight angles are kept quite small:
\[
\Omega_x \simeq \dot{D}_X\,. \qquad (8.44)
\]
The range rate and angular velocity of the LOS vector are described by Eq. 8.41. However, an equation describing the dynamics of the boresight angles is also needed. The boresight angle rates are found by subtracting the antenna motion from the LOS motion. The equation for the azimuth channel is given by the following equation.

\[
\dot{\lambda}_{az} = \Omega^{(LS)}\cdot\hat{K} - \dot{D}_Z
= \left[ \left( R_{LS}^{A} \right)^T \Omega^{(A)} \right]\cdot\hat{K} - \dot{D}_Z
= -\lambda_{el}\Omega_x + \Omega_z - \dot{D}_Z
= -\lambda_{el}\dot{D}_X + \Omega_z - \dot{D}_Z \qquad (8.45)
\]
The equation for the elevation channel is given by the following equation.
\[
\dot{\lambda}_{el} = \Omega_y - \dot{D}^{(LS)}\cdot\hat{\jmath}
= \Omega_y - \left[ R_{LS}^{A}\,\dot{D}^{(A)} \right]\cdot\hat{\jmath}
= \Omega_y - \lambda_{az}\dot{D}_X - \dot{D}_Y \qquad (8.46)
\]
The important equations forming the decoupled estimator have all been developed. The sub-sections that follow will make use of these equations to develop the decoupled target state estimators.
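As a numerical sanity check on the small-angle transformation used in the derivations above, the sketch below compares the exact rotation product of Eq. 8.29 with the first-order approximation of Eq. 8.30; for boresight angles of a few hundredths of a radian the discrepancy is second order in the angles. The specific angles are hypothetical:

```python
import numpy as np

def R_ls_to_a(lam_az, lam_el):
    """Exact LS-to-antenna rotation, the matrix product of Eq. 8.29."""
    R_az = np.array([[np.cos(lam_az), -np.sin(lam_az), 0.0],
                     [np.sin(lam_az),  np.cos(lam_az), 0.0],
                     [0.0,             0.0,            1.0]])
    R_el = np.array([[np.cos(lam_el), 0.0, -np.sin(lam_el)],
                     [0.0,            1.0,  0.0],
                     [np.sin(lam_el), 0.0,  np.cos(lam_el)]])
    return R_az @ R_el

def R_small(lam_az, lam_el):
    """First-order (small boresight angle) approximation, Eq. 8.30."""
    return np.array([[1.0,    -lam_az, -lam_el],
                     [lam_az,  1.0,     0.0],
                     [lam_el,  0.0,     1.0]])

az, el = 0.02, -0.01              # radians; small tracked boresight errors
err = np.abs(R_ls_to_a(az, el) - R_small(az, el)).max()
print(err)                         # second-order small for these angles
```

Since the angle tracking loop keeps the boresight errors small, the approximation error stays far below the sensor noise floor in practice.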

8.3.2 Range and Range-Rate Filter

This sub-section is concerned with developing an estimator for the range r, range rate ṙ, and the component of the target acceleration along the line-of-sight r̈_Tx. The primary equation for this estimator is given by Eq. 8.41a and repeated below.

\[
\ddot{r} = \left( \ddot{r}_{Tx} - \ddot{r}_{Mx} \right) + r\left( \Omega_y^2 + \Omega_z^2 \right) \qquad (8.47)
\]

The inertial missile acceleration along the line-of-sight, r̈_Mx, is known from the missile's inertial navigation hardware and software. The LOS angular rate variables Ω_y and Ω_z are known from the decoupled LOS angle filters. The inertial target acceleration along the line-of-sight, r̈_Tx, can be treated as white noise as discussed in Section 8.1.2. However, it is also possible to use some type of stochastic motion model. For example, the standard correlated noise model considered in Section 8.1.3 can be used. This results in the following equation.

\[
\frac{d\ddot{r}_{Tx}}{dt} = \frac{1}{\tau_e}\left( -\ddot{r}_{Tx} + w_x \right) \qquad (8.48)
\]

For purposes of building a state space model, the following set of state variables is defined.

x1 = r (8.49a)

x2 = r˙ (8.49b)

x3 =¨rTx (8.49c)

The following state space model is formed from Eq. 8.47 and Eq. 8.48.

\[
\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ \Omega_y^2 + \Omega_z^2 & 0 & 1 \\ 0 & 0 & -\tau_e^{-1} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} 0 \\ -\ddot{r}_{Mx} \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ \tau_e^{-1} \end{bmatrix} w_x \qquad (8.50)
\]
The following two measurement equations represent the measurement of the range and range rate.

z1 = x1 + vr (8.51a)

z2 = x2 + vr˙ (8.51b)

A Kalman filter can be used with Eq. 8.50 and Eq. 8.51. The process noise and measurement noise levels, as well as the target time-constant, are system dependent. For further information see Blackman [12], Pearson [65], or Toomay [91].
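As a concrete (and much simplified) illustration, the sketch below Euler-discretizes Eq. 8.50 over a short sample interval and runs one predict/update cycle of a standard Kalman filter against the two measurements of Eq. 8.51. All noise levels, the sample time, and the engagement numbers are hypothetical placeholders, not values from the dissertation:

```python
import numpy as np

def range_filter_step(x, P, z, dt, Om_y, Om_z, a_Mx, tau_e,
                      q=50.0, R=np.diag([25.0, 4.0])):
    """One predict/update cycle of the range & range-rate Kalman filter.

    States x = [r, r_dot, rTx_ddot] (Eq. 8.49); measurements z = [range,
    range rate] (Eq. 8.51). The continuous model of Eq. 8.50 is Euler-
    discretized over dt; q and R are illustrative noise levels."""
    F = np.array([[0.0, 1.0, 0.0],
                  [Om_y**2 + Om_z**2, 0.0, 1.0],
                  [0.0, 0.0, -1.0 / tau_e]])
    u = np.array([0.0, -a_Mx, 0.0])
    G = np.array([0.0, 0.0, 1.0 / tau_e])     # process-noise input vector
    Phi = np.eye(3) + F * dt                   # first-order transition matrix
    Q = np.outer(G, G) * q * dt                # discretized process noise
    H = np.array([[1.0, 0.0, 0.0],             # range measurement
                  [0.0, 1.0, 0.0]])            # range-rate measurement

    # Predict.
    x = Phi @ x + u * dt
    P = Phi @ P @ Phi.T + Q
    # Update.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(3) - K @ H) @ P
    return x, P

x = np.array([5000.0, -600.0, 0.0])            # hypothetical initial estimate
P = np.diag([100.0, 25.0, 100.0])
x, P = range_filter_step(x, P, z=np.array([4970.0, -602.0]),
                         dt=0.05, Om_y=0.001, Om_z=0.002,
                         a_Mx=0.0, tau_e=2.0)
print(x)
```

A production filter would use an exact transition matrix and a properly integrated process-noise covariance rather than the first-order approximations above, but the structure (predict with Eq. 8.50, update with Eq. 8.51) is the same.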

8.3.3 Azimuth Angle Filter

This sub-section is concerned with developing an estimator for the boresight azimuth angle λ_az, the component of the LOS angular rate along the z-axis of the LS coordinate system Ω_z, and the target acceleration along the y-axis of the LS coordinate system r̈_Ty. The primary equation for this estimator is given by Eq. 8.41b and repeated below.
\[
\dot{\Omega}_z = \frac{\ddot{r}_{Ty} - \ddot{r}_{My}}{r} - \frac{2\dot{r}\Omega_z}{r} - \Omega_y\Omega_x \qquad (8.52)
\]

The inertial missile acceleration along the y-axis of the LS coordinate system, r̈_My, is known from the missile's inertial navigation hardware and software. The component of the LOS angular velocity along the x-axis of the LS coordinate system, Ω_x, is equal to the seeker roll rate Ḋ_X, as shown by Eq. 8.44. The component of the LOS angular velocity along the y-axis of the LS coordinate system, Ω_y, is provided by the elevation angle filter. The range r is provided by the range and range rate filter. The target acceleration term r̈_Ty can be modeled as white noise, or possibly correlated noise. For example, the standard correlated noise model considered in Section 8.1.3 can be used. This results in the following equation.

\[
\frac{d\ddot{r}_{Ty}}{dt} = \frac{1}{\tau_e}\left( -\ddot{r}_{Ty} + w_y \right) \qquad (8.53)
\]
For purposes of building a state space model, the following set of state variables is defined.

x1 = λaz (8.54a)

x2 = Ωz (8.54b)

x3 =¨rTy (8.54c)

The following state space model is formed from Eq. 8.45, Eq. 8.52 and Eq. 8.53.
\[
\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & -\frac{2\dot{r}}{r} & \frac{1}{r} \\ 0 & 0 & -\tau_e^{-1} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} -\lambda_{el}\dot{D}_X - \dot{D}_Z \\ -\frac{\ddot{r}_{My}}{r} - \Omega_y\dot{D}_X \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ \tau_e^{-1} \end{bmatrix} w_y \qquad (8.55)
\]

The following measurement equation represents the measured azimuth boresight angle.

z1 = x1 + vaz (8.56)

8.3.4 Elevation Angle Filter

The development of the elevation angle filter follows in a very similar manner to that of the azimuth angle filter. For this reason, the final result will be shown without further discussion. The elevation angle state vector is defined by the following equations.

x1 = λel (8.57a)

x2 = Ωy (8.57b)

x3 =¨rTz (8.57c)

The state space model is given by the following equation.
\[
\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & -\frac{2\dot{r}}{r} & -\frac{1}{r} \\ 0 & 0 & -\tau_e^{-1} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} -\lambda_{az}\dot{D}_X - \dot{D}_Y \\ \frac{\ddot{r}_{Mz}}{r} - \Omega_z\dot{D}_X \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ \tau_e^{-1} \end{bmatrix} w_z \qquad (8.58)
\]
The following measurement equation represents the measured elevation boresight angle.

z1 = x1 + vel (8.59)

8.3.5 Estimating the ZEM Using Decoupled Filter Estimates

The three decoupled filters previously discussed provide all the information necessary to estimate the ZEM. The first component of the ZEM, the LOS vector, can be obtained from the range and range rate filter as shown in the following equation.

\[
\hat{r}^{(LS)} = \begin{bmatrix} \hat{r} \\ 0 \\ 0 \end{bmatrix} \qquad (8.60)
\]

The second component of the ZEM, the time derivative of the LOS vector, can be obtained by the use of Eq. B.161, as shown in the following equation.
\[
\dot{\hat{r}}^{(LS)} = \begin{bmatrix} \dot{\hat{r}} \\ 0 \\ 0 \end{bmatrix} + \hat{\Omega}\times\begin{bmatrix} \hat{r} \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} \dot{\hat{r}} \\ \hat{\Omega}_z\hat{r} \\ -\hat{\Omega}_y\hat{r} \end{bmatrix} \qquad (8.61)
\]
It should be noted that the time derivative in the above equation is with respect to inertial space, but the resulting vector is (as the superscript indicates) being expressed in the LS coordinate system. The target acceleration also appears in the ZEM and can be expressed in the LS coordinate system using the filter estimates as shown in the following equation.

\[
\hat{A}_e^{(LS)}(t) = \begin{bmatrix} \hat{\ddot{r}}_{Tx} \\ \hat{\ddot{r}}_{Ty} \\ \hat{\ddot{r}}_{Tz} \end{bmatrix} \qquad (8.62)
\]
It is convenient to define a missile body frame B. Herein, assume B is attached to the missile body and is a convenient frame to use for expressing the ZEM. The orientation of the antenna frame relative to the missile body, which is known from other missile hardware and software, is denoted by the rotation matrix \( R_A^B \). Using this rotation matrix and Eq. 8.29, the variables in the LS coordinate frame can be expressed in the missile body frame as shown in the following equations.

\[
\hat{r}^{(B)} = R_A^B R_{LS}^A\, \hat{r}^{(LS)} \qquad (8.63)
\]
\[
\dot{\hat{r}}^{(B)} = R_A^B R_{LS}^A\, \dot{\hat{r}}^{(LS)}
\]
\[
\hat{A}_e^{(B)} = R_A^B R_{LS}^A\, \hat{A}_e^{(LS)}
\]

Substituting these results into Eq. 8.21 gives an equation for the ZEM estimate in body coordinates as shown in the following equation.

\[
\widehat{\overrightarrow{ZEM}}^{(B)} = \hat{r}^{(B)}(t) + \tau\,\dot{\hat{r}}^{(B)} + \tau_e^2\left( \bar{h} - 1 + e^{-\bar{h}} \right)\hat{A}_e^{(B)}(t) - C_p F_p^{-2}\left[ e^{F_p\tau} - I - \tau F_p \right]\hat{m}_p^{(B)}(t) \qquad (8.64)
\]

Although this is a perfectly acceptable form for the ZEM estimate, it is not necessarily or even likely the one that would be used in practice. It is more common to express the ZEM in the line-of-sight coordinate system and then assume that the first element of the ZEM, that is along the LOS itself, is approximately zero. This is a generalization of the collision course assumption that gave rise to Eq. 7.29. Practically speaking, one is not too concerned with motion along the LOS, so long as ṙ is negative (i.e. the missile is closing on the target). Rather, the missile must counter any heading error or target maneuver normal to the LOS vector. If this is accomplished, intercept is ensured. In the LS coordinate system, the ZEM is given by the following equation.
\[
\widehat{\overrightarrow{ZEM}}^{(LS)} = \hat{r}^{(LS)}(t) + \tau\,\dot{\hat{r}}^{(LS)} + \tau_e^2\left( \bar{h} - 1 + e^{-\bar{h}} \right)\hat{A}_e^{(LS)}(t) - C_p F_p^{-2}\left[ e^{F_p\tau} - I - \tau F_p \right]\hat{m}_p^{(LS)}(t) \qquad (8.65)
\]
Making appropriate substitutions gives the following equation.

\[
\widehat{\overrightarrow{ZEM}}^{(LS)} = \begin{bmatrix} \hat{r} \\ 0 \\ 0 \end{bmatrix} + \tau\begin{bmatrix} \dot{\hat{r}} \\ \hat{\Omega}_z\hat{r} \\ -\hat{\Omega}_y\hat{r} \end{bmatrix} + \tau_e^2\left( \bar{h} - 1 + e^{-\bar{h}} \right)\begin{bmatrix} \hat{\ddot{r}}_{Tx} \\ \hat{\ddot{r}}_{Ty} \\ \hat{\ddot{r}}_{Tz} \end{bmatrix} - C_p F_p^{-2}\left[ e^{F_p\tau} - I - \tau F_p \right]\hat{m}_p^{(LS)}(t) \qquad (8.66)
\]
Using the assumption that the ZEM is approximately zero along the line-of-sight gives the following expression for the ZEM.

\[
\widehat{\overrightarrow{ZEM}}^{(LS)} \simeq \tau\begin{bmatrix} 0 \\ \hat{\Omega}_z\hat{r} \\ -\hat{\Omega}_y\hat{r} \end{bmatrix} + \tau_e^2\left( \bar{h} - 1 + e^{-\bar{h}} \right)\begin{bmatrix} 0 \\ \hat{\ddot{r}}_{Ty} \\ \hat{\ddot{r}}_{Tz} \end{bmatrix} - \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} C_p F_p^{-2}\left[ e^{F_p\tau} - I - \tau F_p \right]\hat{m}_p^{(LS)}(t) \qquad (8.67)
\]
If the flight control system is modeled with a single pole then the following expression for the ZEM can be used:

\[
\widehat{\overrightarrow{ZEM}}^{(LS)} \simeq \tau\begin{bmatrix} 0 \\ \hat{\Omega}_z\hat{r} \\ -\hat{\Omega}_y\hat{r} \end{bmatrix} + \tau_e^2\left( \bar{h} - 1 + e^{-\bar{h}} \right)\begin{bmatrix} 0 \\ \hat{\ddot{r}}_{Ty} \\ \hat{\ddot{r}}_{Tz} \end{bmatrix} - \tau_p^2\left( h - 1 + e^{-h} \right)\begin{bmatrix} 0 \\ \hat{\ddot{r}}_{My} \\ \hat{\ddot{r}}_{Mz} \end{bmatrix}\,. \qquad (8.68)
\]
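Assembling Eq. 8.68 from the decoupled filter outputs is a direct transcription. The sketch below is a minimal version; the definitions h = τ/τ_p and h̄ = τ/τ_e follow the text, and the values in the example call are hypothetical:

```python
import numpy as np

def zem_single_pole(tau, tau_e, tau_p, r_hat, Om_y_hat, Om_z_hat,
                    aT_y, aT_z, aM_y, aM_z):
    """ZEM estimate in the LS frame, Eq. 8.68 (single-pole flight control).

    tau -- time to go; h = tau/tau_p, h_bar = tau/tau_e as in the text.
    aT_*, aM_* -- estimated target/missile accelerations normal to the LOS."""
    h, h_bar = tau / tau_p, tau / tau_e
    los_rate_term = tau * np.array([0.0, Om_z_hat * r_hat, -Om_y_hat * r_hat])
    target_term = tau_e**2 * (h_bar - 1.0 + np.exp(-h_bar)) \
        * np.array([0.0, aT_y, aT_z])
    missile_term = tau_p**2 * (h - 1.0 + np.exp(-h)) \
        * np.array([0.0, aM_y, aM_z])
    return los_rate_term + target_term - missile_term

# Hypothetical snapshot: 4 s to go, 5 km range, small LOS rates.
zem = zem_single_pole(tau=4.0, tau_e=2.0, tau_p=0.5,
                      r_hat=5000.0, Om_y_hat=0.001, Om_z_hat=0.002,
                      aT_y=30.0, aT_z=0.0, aM_y=20.0, aM_z=0.0)
print(zem)   # first component is zero by the collision-course assumption
```

The first component is identically zero, reflecting the assumption that the ZEM along the LOS itself is negligible.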

8.4 Estimation without Range Information

Many missile systems sense the target by passively collecting electromagnetic emissions and reflections from naturally occurring sources, for example visible light reflected from the sun, or infrared emissions which cause the target to contrast with its surroundings due to a difference in temperature. Missiles that operate in this way are known as passive systems. Alternately, systems that illuminate the target with detectable energy and then sense the reflected energy are known as active systems⁵. Active systems have the ability to discern range and range rate while passive systems do not⁶. Clearly, without range and range-rate measurements the range and range-rate filter discussed in Section 8.3.2 is of limited use. Furthermore, the azimuth and elevation filters of Sections 8.3.3 and 8.3.4 must be modified because they are based on the assumption of accurate range and range rate information. More specifically, without range and range rate information, it is not possible to make use of Eqs. 8.47 and 8.52. Instead, the angular velocity of the LOS must be modeled stochastically. Since both the missile and target possess inertia, we can expect that the angular velocity of the LOS will be time correlated. The standard correlated stochastic motion model can be used, as shown by the following equations.

\[
\dot{\Omega}_y = \frac{1}{\tau_\Omega}\left( -\Omega_y + w_y \right) \qquad (8.69a)
\]
\[
\dot{\Omega}_z = \frac{1}{\tau_\Omega}\left( -\Omega_z + w_z \right) \qquad (8.69b)
\]

Using this model, the azimuth filter is composed of only two states, the LOS angle

λ_az and the LOS rate Ω_z.

x1 = λaz (8.70a)

x2 = Ωz (8.70b)

⁵ A third party illuminator results in what is known as a semi-active system.
⁶ This is not entirely true since passive ranging is possible, but at a much degraded level of accuracy [12, Section 5.3].

The following state space model is formed from Eq. 8.45 and Eq. 8.69b.
\[
\begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & -\tau_\Omega^{-1} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} -\lambda_{el}\dot{D}_X - \dot{D}_Z \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ \tau_\Omega^{-1} \end{bmatrix} w_z \qquad (8.71)
\]
The following measurement equation represents the measured azimuth boresight angle.

z1 = x1 + vaz (8.72)

A very similar set of equations is valid for the elevation angle filter. The correlated noise model given by Eq. 8.69 implies that incremental changes in the LOS are white, but bounded. The solution to Eq. 8.69 can be expressed using the convolution integral, as shown in the following equation.

\[
\Omega_y(\gamma) = e^{-\gamma/\tau_\Omega}\left[ e^{t/\tau_\Omega}\,\Omega_y(t) + \frac{1}{\tau_\Omega}\int_t^{\gamma} e^{s/\tau_\Omega}\, w_y(s)\, ds \right]
= e^{-(\gamma-t)/\tau_\Omega}\,\Omega_y(t) + \frac{1}{\tau_\Omega}\int_t^{\gamma} e^{-(\gamma-s)/\tau_\Omega}\, w_y(s)\, ds \qquad (8.73)
\]
Taking the conditional expectation gives the following equation.

\[
E\left\{ \Omega_y(\gamma) \,\middle|\, \mathcal{D}(t) \right\} = e^{-(\gamma-t)/\tau_\Omega}\, E\left\{ \Omega_y(t) \,\middle|\, \mathcal{D}(t) \right\} = e^{-(\gamma-t)/\tau_\Omega}\,\hat{\Omega}_y(t) \qquad (8.74)
\]

This result has important implications for evaluating the ZEM. The relative velocity can be expressed using the following general relationship.

\[
\dot{r} = \dot{r}\,u_r + \Omega\times r \qquad (8.75)
\]

In the LS coordinate system, the relative velocity is given by the following equation

\[
\dot{r}^{(LS)} = \begin{bmatrix} \dot{r} \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ \Omega_z r \\ -\Omega_y r \end{bmatrix}\,. \qquad (8.76)
\]
It is desired to use this equation to obtain an equation for the ZEM. To do so, it is necessary to assume that r varies independently of Ω, a reasonable assumption under the circumstances. The LS coordinate system rotates very little until the

final seconds of the engagement, when the range r is very small. Therefore, we can integrate the previous equation in the LS coordinate system to obtain the relative position. The coordinate system experiences a noticeable rotation for only a very small portion of time, so one can safely integrate up to and including the final time t_f with only a little error. Doing so results in the following equation,

\[
\begin{bmatrix} r(t_f) - r(t) \\ y(t_f) - y(t) \\ z(t_f) - z(t) \end{bmatrix} = \begin{bmatrix} \int_t^{t_f} \dot{r}(\gamma)\, d\gamma \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ \int_t^{t_f} \Omega_z(\gamma)\, r(\gamma)\, d\gamma \\ -\int_t^{t_f} \Omega_y(\gamma)\, r(\gamma)\, d\gamma \end{bmatrix}\,, \qquad (8.77)
\]
where all results are to be interpreted along the axes of the current LS frame. At the end of the integration, the final value of r(t_f) is approximately zero. Furthermore, due to the orientation of the LS frame, the current values of y(t) and z(t) are zero. This results in the following set of equations.

\[
r^{(LS)}(t_f) = \begin{bmatrix} 0 \\ y(t_f) \\ z(t_f) \end{bmatrix} = \begin{bmatrix} r(t) + \int_t^{t_f} \dot{r}(\gamma)\, d\gamma \\ \int_t^{t_f} \Omega_z(\gamma)\, r(\gamma)\, d\gamma \\ -\int_t^{t_f} \Omega_y(\gamma)\, r(\gamma)\, d\gamma \end{bmatrix} \qquad (8.78)
\]
Taking the expected value conditioned on the information state gives

\[
E\left\{ r^{(LS)}(t_f) \,\middle|\, \mathcal{D}(t) \right\} = \begin{bmatrix} 0 \\ \hat{y}(t_f) \\ \hat{z}(t_f) \end{bmatrix}
= \begin{bmatrix} E\left\{ r(t) + \int_t^{t_f} \dot{r}(\gamma)\, d\gamma \,\middle|\, \mathcal{D}(t) \right\} \\ E\left\{ \int_t^{t_f} r(\gamma)\,\Omega_z(\gamma)\, d\gamma \,\middle|\, \mathcal{D}(t) \right\} \\ -E\left\{ \int_t^{t_f} r(\gamma)\,\Omega_y(\gamma)\, d\gamma \,\middle|\, \mathcal{D}(t) \right\} \end{bmatrix}
\]
\[
= \begin{bmatrix} E\left\{ r(t) + \int_t^{t_f} \dot{r}(\gamma)\, d\gamma \,\middle|\, \mathcal{D}(t) \right\} \\ \int_t^{t_f} E\{ r(\gamma) \,|\, \mathcal{D}(t) \}\, E\{ \Omega_z(\gamma) \,|\, \mathcal{D}(t) \}\, d\gamma \\ -\int_t^{t_f} E\{ r(\gamma) \,|\, \mathcal{D}(t) \}\, E\{ \Omega_y(\gamma) \,|\, \mathcal{D}(t) \}\, d\gamma \end{bmatrix}
= \begin{bmatrix} E\left\{ r(t) + \int_t^{t_f} \dot{r}(\gamma)\, d\gamma \,\middle|\, \mathcal{D}(t) \right\} \\ \left( \int_t^{t_f} E\{ r(\gamma) \,|\, \mathcal{D}(t) \}\, e^{-(\gamma-t)/\tau_\Omega}\, d\gamma \right)\hat{\Omega}_z(t) \\ -\left( \int_t^{t_f} E\{ r(\gamma) \,|\, \mathcal{D}(t) \}\, e^{-(\gamma-t)/\tau_\Omega}\, d\gamma \right)\hat{\Omega}_y(t) \end{bmatrix} \qquad (8.79)
\]

The previous equation gives the estimate in terms of the unknown range and final time. One method of overcoming this difficulty is to use a preprogrammed value of the range rate ṙ and final-time t_f. It is true that if this is done, then the azimuth and elevation filters discussed in Section 8.3.3 and Section 8.3.4 could be used, because the range r under this assumption is given by the following equation.

\[
r = (t_f - t)\,\dot{r} \qquad (8.80)
\]

Substituting Eq. 8.80 into Eq. 8.79 gives the following equation.

\[
E\left\{ r^{(LS)}(t_f) \,\middle|\, \mathcal{D}(t) \right\} = \left( \int_t^{t_f} \dot{r}\,(t_f - \gamma)\, e^{-(\gamma-t)/\tau_\Omega}\, d\gamma \right)\begin{bmatrix} 0 \\ \hat{\Omega}_z(t) \\ -\hat{\Omega}_y(t) \end{bmatrix}
= \left( \dot{r}\, e^{-\tau/\tau_\Omega} \int_0^{\tau} \beta\, e^{\beta/\tau_\Omega}\, d\beta \right)\begin{bmatrix} 0 \\ \hat{\Omega}_z(t) \\ -\hat{\Omega}_y(t) \end{bmatrix}
= \dot{r}\,\tau_\Omega^2\left( e^{-\tau/\tau_\Omega} - 1 + \frac{\tau}{\tau_\Omega} \right)\begin{bmatrix} 0 \\ \hat{\Omega}_z(t) \\ -\hat{\Omega}_y(t) \end{bmatrix} \qquad (8.81)
\]
For the purposes of missile guidance, filtering without range information does not require the assumptions for the range rate and final-time that were used to develop Eq. 8.81. It is evident from Eq. 8.79 that this assumption only assists in assigning a magnitude to Ω, not in modifying the direction of Ω. It will soon be clear that the magnitude of Ω only affects the magnitude of the resulting optimal controller and not its direction. Stable, but possibly suboptimal, control is still possible for a range of different magnitudes, and so the results that follow do not critically depend on the assumption made by Eq. 8.80. Nevertheless, the author is certain that practical implementation will make use of Eq. 8.80. All guidance laws considered in Chapter 7 result in a control that is related to r(t_f). For example, using the general flight control system model (see Eq. 7.5), the optimal pursuer acceleration is given by Eq. 7.90, and repeated in the following equation.

\[
a_p = k_r \left[ C_p F_p^{-2}\left[ e^{F_p\tau} - I - \tau F_p \right] G_p \right]^T r(t_f) \qquad (8.82)
\]
If the separation theorem is used, then the actual controller will implement this control using the estimate of r(t_f) given by Eq. 8.79 or Eq. 8.81. For the sake of simplicity, assume that the flight control system is modeled with a single pole. The guidance law for this model, which was developed in Section 7.7, is given by the following equation.

\[
a_p = \frac{k_r \tau}{h}\left( e^{-h} - 1 + h \right) r(t_f) \qquad (8.83)
\]
Substituting Eq. 8.81 into Eq. 8.83 gives the following equation.

\[
a_p^{(LS)} = \frac{k_r \tau}{h}\left( e^{-h} - 1 + h \right)\dot{r}\,\tau_\Omega^2\left( e^{-\tau/\tau_\Omega} - 1 + \frac{\tau}{\tau_\Omega} \right)\begin{bmatrix} 0 \\ \hat{\Omega}_z(t) \\ -\hat{\Omega}_y(t) \end{bmatrix} \qquad (8.84)
\]
This result can be written in a more compact form by defining a variable h̃ = τ/τ_Ω in a similar manner to which h is defined. Doing so results in the following equation.

\[
a_p^{(LS)} = \frac{k_r \tau \dot{r}\, \tau_\Omega^2}{h}\left( e^{-h} - 1 + h \right)\left( e^{-\tilde{h}} - 1 + \tilde{h} \right)\begin{bmatrix} 0 \\ \hat{\Omega}_z(t) \\ -\hat{\Omega}_y(t) \end{bmatrix} \qquad (8.85)
\]
Notice that the control is proportional to k_r, which is true even with the more general estimator given by Eq. 8.79. A larger value of k_r means that the control engineer is placing a higher emphasis on miss than on accumulated control effort (see Eq. 7.70).
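A sketch of Eq. 8.85 as an acceleration-command routine follows; k_r is left as a free design gain and the engagement values in the example call are hypothetical:

```python
import numpy as np

def ap_no_range(k_r, tau, tau_Om, tau_p, r_dot, Om_y_hat, Om_z_hat):
    """Acceleration command without range information, Eq. 8.85.

    h = tau/tau_p, h_tilde = tau/tau_Om; the command is expressed in the
    LS frame and lies entirely normal to the line of sight."""
    h, h_t = tau / tau_p, tau / tau_Om
    gain = (k_r * tau * r_dot * tau_Om**2 / h) \
        * (np.exp(-h) - 1.0 + h) * (np.exp(-h_t) - 1.0 + h_t)
    return gain * np.array([0.0, Om_z_hat, -Om_y_hat])

# Hypothetical call: 4 s to go, closing at 600 m/s, small LOS rate estimates.
a = ap_no_range(k_r=1.0, tau=4.0, tau_Om=2.0, tau_p=0.5,
                r_dot=-600.0, Om_y_hat=0.001, Om_z_hat=0.002)
print(a)
```

As the text notes, k_r only scales the magnitude of this command; grouping all the scalar factors into a single navigation gain recovers the proportional navigation form.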

This is consistent with the control given by Eq. 8.85. Since k_r affects only the magnitude of the optimal control, the control engineer is free to choose it directly, or simply group the scaling terms and choose a value for them collectively. This is why it was earlier stated that the assumption indicated by Eq. 8.80 is not critical to the resulting controller. One possibility is to group the terms so that the PN guidance law is used, which implies that different values of k_r are used throughout the engagement. Doing so results in the following optimal controller, which is the standard equation for proportional navigation in three dimensions.

\[
a_p^{(LS)} = N\dot{r}\begin{bmatrix} 0 \\ \hat{\Omega}_z(t) \\ -\hat{\Omega}_y(t) \end{bmatrix} \qquad (8.86)
\]

8.5 Lack of A Priori Information

Lack of a priori information is a very plausible situation to consider for low cost weapons that lack processing capability. Estimation without a priori information is discussed in detail in Section 3.6. The basic sensor on a missile is a gimbaled sensor that measures the direction of the LOS vector r. If the sensor is an active sensor then the magnitude of the LOS, r = ||r||, will also be available. Alternately, there may be more than one passive sensor (for example human eyes), which can jointly be used (via image registration) to estimate the magnitude of the LOS. In any case, the LOS can be represented using a unit vector u_r(t) along the LOS

r (t)=r (t) ur (t) . (8.87)

An estimate of the LOS ˆr (t) can also be written in the form

ˆr (t)=ˆr (t) uˆr (t) . (8.88)

The unit vector ur (t) must be estimated if any form of guidance is to be possible.

When u_r(t) is the only quantity measured and there is no a priori information available, then the estimates of ṙ(t), A_e and A_p must all be zero, resulting in the following equation for the ZEM estimate

\[
\widehat{\overrightarrow{ZEM}} = \hat{r}(t) = \hat{r}(t)\,\hat{u}_r(t)\,. \qquad (8.89)
\]

To use a non-zero value for ṙ(t), A_e or A_p would contradict the assumption of no a priori information. However, it would not be appropriate to use a non-positive value for the estimate of the range r(t) because, by definition, the range r is a non-negative quantity; that is, as long as the sensor provided a measurement then the target must have been detected in front of the sensor (missile), thereby indicating a positive value of r(t). For all control laws developed in Chapter 7, an unknown (but positive) range r(t) would simply serve to amplify the ZEM in Eq. 8.89 and thereby amplify the acceleration command.

Chapter 9 Guidance Strategies

The numerous engagement models considered in Chapter 7 all result in guidance laws that make direct use of the zero-effort miss. This chapter explores the implications of this result in a stochastic context. It will be shown that the information available to the pursuer (information constraints) determines the form of the zero effort miss, resulting in what the author calls guidance strategies. Furthermore, it will be shown that the control constraints, magnitude and directional, determine the control law that most effectively implements an optimal guidance strategy.

9.1 Guidance Strategies and Information Constraints

All control laws developed in Chapter 7 were the result of applying deterministic optimal control theory to various engagement models. In practical implementation, the evader/target is often modeled, at least partly, by a white-noise stochastic input. Furthermore, the missile does not have direct access to the states required by the optimal guidance laws, and so control is only possible by estimation of these states. Numerous separation theorems ([20, Ch. 14], [39], [73], [75], and [86] among others) have been developed which specify the conditions under which an optimal estimator and optimal deterministic control law are jointly optimal. Typically, a practitioner simply assumes the separation theorem holds under the rationale that it is unreasonable to assume a system's states can be controlled more accurately than they can be estimated. Nevertheless, separation theorems continue to be developed, and rightly so. The goal of this section is not to apply certain separation theorems to the various engagement models discussed in Chapter 7, but rather to assume the separation theorem holds in all cases considered and then examine the ramifications of the separation theorem. The following three subsections present three missile guidance strategies. These strategies are each optimal depending on the information available to the missile. Each strategy is first stated and then shown to be optimal when certain information constraints apply.

9.1.1 Zero Effort Miss Guidance Strategy

The goal of the zero effort miss strategy is to reduce the zero effort miss in the most efficient manner possible, which itself is a function of the control constraints imposed on the missile. Clearly, this strategy necessitates the ability to estimate the zero effort miss. Therefore, the zero effort miss guidance strategy is applicable when there is sufficient information about the system state (in the form of measurements or other a priori information) to allow estimation of the LOS r and its time-derivative ṙ, as well as either the pursuer acceleration A_p or the evader acceleration A_e. This is a higher-level strategy that not only takes into account first-order perturbations in the ZEM due to the LOS rate ṙ, but also takes into account second-order changes in the ZEM due to some knowledge of either the pursuer's acceleration or evader's acceleration. An estimate that makes use of these second-order terms is given by Eq. 8.8 and repeated in the equation below.

\[
\widehat{\overrightarrow{ZEM}} = \hat{r}(t) + \tau\,\dot{\hat{r}}(t) + \int_t^{t_f} (t_f-\beta)\,\hat{A}_e(\beta)\,d\beta - C_p F_p^{-2}\left[ e^{F_p\tau} - I - \tau F_p \right]\hat{m}_p(t) \qquad (9.1)
\]
The expression for the ZEM is simplified depending on the choice of the target model and the flight control system model. Three of the target models listed in Table 8.1 result in non-zero values for the estimated target acceleration. Similarly, two of the flight control system models listed in Table 8.2 result in non-zero values for the estimated missile acceleration state. Although other target models and flight control system models are possible, the number of combinations of those already mentioned is six. It does not seem necessary to explicitly state the zero-effort miss for all six combinations, as they can easily be constructed directly from Table 8.1 and Table 8.2. In addition to estimating the missile and target acceleration, the LOS r and its time derivative ṙ must also be estimated. Therefore, to be able to effectively implement this type of strategy, an active sensor is necessary, or alternately multiple passive sensors (e.g. human eyes). In any case, the sensors and estimator must be able to estimate the LOS and its time derivative in three dimensional space. One possible estimator for estimating these vectors, as well as the target's acceleration, is the estimator discussed in Section 8.3.

9.1.2 Parallel Navigation Strategy

The goal of the parallel navigation strategy is to stabilize the line-of-sight between the missile and the target, with the implicit assumption that the range is continuously decreasing. If the range is decreasing and the angular velocity of the LOS is zero, then the missile and target are on a collision course. A non-zero LOS rate results in the possibility of a miss. The word possibility is used because the target may perform a maneuver that temporarily increases the LOS rate, but later perform another that brings the rate back to zero without the missile performing any acceleration. In such a situation, the zero-effort miss strategy would recognize when current perturbations in the LOS rate would be offset by later perturbations, without needlessly expending control effort to correct for them as they occur. The parallel navigation strategy, by contrast, seeks to actively nullify any perturbations in the LOS rate without regard to future perturbations.

The parallel navigation strategy is appropriate when it is uncertain how the missile's acceleration state and target's acceleration state affect the zero effort miss. Uncertainty in how these accelerations affect the zero effort miss can occur because of extreme uncertainty about the range between the missile and its target. This is a problem in single-sensor passive systems, in which the range is very difficult to accurately estimate. However, this is not the only situation in which the parallel navigation strategy is optimal.

Consider the more general pursuit and evasion problem, beyond just missiles and targets. A pursuer, whether human, animal, or mechanical (e.g., a missile), may wish to minimize its miss distance while using a feasible amount of control (precisely the cost function used throughout this dissertation). Regardless of the overall optimality of the controller and estimator, the pursuer can only make use of those quantities it knows how to estimate.
Therefore, it does not seem appropriate to be concerned with the overall optimality of the system, but rather with making optimal use of the knowledge the pursuer is capable of discerning. For example, an adult (or perhaps younger) person is capable of implementing the zero effort miss strategy to the extent to which they are able to predict their opponent's maneuvers. However, a much younger person may lack the same reasoning powers and be completely unable to process such higher-level information. That is, the younger person may be able to discern the momentum of the game at hand and how it will influence the final result, but be unable to see the long-term effects of their own actions as well as those of their opponent. This is as true in a game of chase as it is in a game of chess. In terms of pursuit and evasion, the ability to discern the momentum of the game means understanding how the LOS and its time derivative affect the zero effort miss, while the ability to discern the long-term effects corresponds to understanding how the acceleration states affect the zero effort miss. The next few paragraphs will show that the primary means by which the LOS and its time derivative affect the zero effort miss is through the angular motion of the line-of-sight.

The parallel navigation strategy is closely linked to the ZEM strategy. In fact, it is possible to derive the parallel navigation strategy directly from the ZEM strategy. Consider the situation in which the pursuer is completely unable to estimate the target acceleration state, and in addition the missile either has a "fast" flight control system or is unable to estimate its own acceleration state (e.g., the missile does not have GPS or an IMU). In this case, the missile must assume that $A_p$ and $A_e$ are both zero. To assume otherwise would be to contradict the premise that the missile is totally ignorant of these states. In any case, the estimate of the ZEM is given by the following equation.
$$\widehat{\overrightarrow{ZEM}} = \hat{r}(t) + \tau\dot{\hat{r}}(t) \tag{9.2}$$

Making use of the approximation given by Eq. 7.29b results in the following equation for the ZEM.

$$\widehat{\overrightarrow{ZEM}} = \tau\hat{\Omega}_\perp(t)\times\hat{r}(t) \tag{9.3}$$

In the coordinate system introduced in Section 8.3, the ZEM estimate is given by the following equation.

$$\widehat{\overrightarrow{ZEM}} = \tau\hat{r}\begin{bmatrix} 0 \\ \hat{\Omega}_z(t) \\ -\hat{\Omega}_y(t) \end{bmatrix} \tag{9.4}$$

Presumably, the range $r$ is decreasing primarily due to the components of the missile's and target's velocity vectors along the line-of-sight. The ZEM can be reduced to zero by reducing the angular velocity of the LOS. Therefore, application of the ZEM strategy to this situation implies that optimal control requires the missile to stabilize the LOS (i.e., reduce its angular rate to zero).

The previous paragraph showed that the ZEM strategy reduces to the parallel navigation strategy when the pursuer is ignorant of the target's acceleration state as well as its own, or when these states can for other reasons be set equal to zero (for several instances when this is true, see Table 8.1 and Table 8.2). It is also possible to arrive at the optimality of parallel navigation under the assumption that the range cannot be accurately estimated, as is the case for a single passive sensor. This situation was already considered in detail in Section 8.4. Therein, the following stochastic motion model was used for the angular motion of the LOS.

$$\dot{\Omega}_y = \frac{1}{\tau_\Omega}\left(-\Omega_y + w_y\right) \tag{9.5a}$$

$$\dot{\Omega}_z = \frac{1}{\tau_\Omega}\left(-\Omega_z + w_z\right) \tag{9.5b}$$

This resulted in an estimate of the LOS at the final time, given by the following equation.

$$\hat{r}^{(LS)}(t_f) = \dot{r}\,\tau_\Omega^2\left(e^{-\tau/\tau_\Omega} - 1 + \frac{\tau}{\tau_\Omega}\right)\begin{bmatrix} 0 \\ \hat{\Omega}_z(t) \\ -\hat{\Omega}_y(t) \end{bmatrix} \tag{9.6}$$

The previous estimate of $r(t_f)$ makes use of the closing speed $\dot{r}$ and the time-to-go $\tau$. However, it was explained in Section 8.4 that optimal guidance does not require accurate estimates of these quantities. As indicated by any appropriate cost function considered in Chapter 7, the goal of optimal missile guidance is to minimize the magnitude of $r(t_f)$ while using a reasonable level of control. The general flight control system considered in Section 7.6 results in an optimal control that is directly proportional to $r(t_f)$. Application of the separation theorem with Eq. 9.6 implies that the resulting control is zero when the angular rate of the LOS is zero. As such, the optimal control is attempting to stabilize the LOS, i.e., implement parallel navigation.

The previous discussion demonstrated that parallel navigation can be optimal for a missile with arbitrary acceleration dynamics. The parallel navigation strategy is also optimal in the presence of control constraints. For example, consider a missile that can only produce lateral acceleration. Optimal control theory was used to develop a guidance law for such a missile in Section 7.10. During this development, the following two constraints were found on the missile's control vector $\omega$ (see Eq. 7.191).

$$\omega(t)\cdot r(t_f) = 0 \tag{9.7a}$$

$$\omega(t)\cdot V_p(t) = 0 \tag{9.7b}$$

From these constraints, one can infer that ω must be aligned with the cross product of r (tf ) and Vp (t). Therefore, the control can be written as

$$\omega(t) = g(t)\,V_p(t)\times r(t_f)\,, \tag{9.8}$$

where $g(t)$ is an unknown scalar function. The control can be expressed in the line-of-sight coordinate system by making use of Eq. 9.6. Clearly, the resulting controller will have a control that is proportional to the angular rate of the LOS vector. Therefore, the control constraints do not change the conditions under which parallel navigation is an optimal strategy.
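As a numerical illustration of the reduction in Eqs. 9.3-9.4, the following sketch (names are illustrative) computes $\tau\,\hat{\Omega}_\perp\times\hat{r}$ in the LOS frame, where the range vector is $[\hat{r},\,0,\,0]^T$, and shows that the result has no component along the LOS, matching the closed form of Eq. 9.4.

```python
import numpy as np

def zem_from_los_rate(tau, r_mag, omega_y, omega_z):
    """ZEM estimate when both acceleration states are unknown (Eqs. 9.3-9.4).

    In the LOS frame the range vector is [r, 0, 0] and the transverse LOS
    angular velocity is [0, omega_y, omega_z]; the cross product
    tau * Omega x r then lies entirely in the plane normal to the LOS.
    """
    omega_perp = np.array([0.0, omega_y, omega_z])
    r_vec = np.array([r_mag, 0.0, 0.0])
    return tau * np.cross(omega_perp, r_vec)
```

Stabilizing the LOS (driving both angular-rate components to zero) drives this ZEM estimate to zero, which is exactly the parallel navigation strategy.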

9.1.3 Direct Pursuit

The goal of the direct pursuit strategy is to force the missile to accelerate so as to reduce the range between the missile and the target. Direct pursuit is perhaps the most naive approach to guidance, but under certain conditions it is an optimal strategy. The direct pursuit strategy is optimal when the pursuer only has a single measurement of the target's relative position. In this situation, the pursuer must reason without a priori information, and therefore can make no estimates of velocity or acceleration. This situation was considered in more detail in Section 8.5. Without making use of mathematics, it should be clearly evident that the only reasonable choice for a pursuer when there is only a single measurement of the target's location is to travel in the direction of the target's measured position. Nevertheless, this heuristic will not be relied upon to prove the optimality of the direct pursuit strategy. The proof by which parallel navigation was shown to be an optimal strategy can also be used (with slight modification) to show that direct pursuit is also an optimal strategy. Accordingly, suppose that the pursuer has no information by which the velocity or acceleration states can be estimated. This situation may occur in a low-cost weapon that has a sensor, but lacks sophisticated processing and memory hardware. In this case, the zero effort miss is given by Eq. 8.89 and repeated below.

$$\widehat{\overrightarrow{ZEM}} = \hat{r}(t_f) = \hat{r}(t) = \hat{r}(t)\,\hat{u}_r(t) \tag{9.9}$$

Depending on the missile's sensor, it may or may not be possible to measure the range $r(t)$. However, even a passive sensor is capable of measuring the direction vector $\hat{u}_r(t)$ from the missile to the target. Regardless of the ability to accurately estimate the range $r$, the optimality of the direct pursuit strategy follows directly from the optimality of the ZEM strategy, as shown by Eq. 9.9. A missile without control constraints, but with uncoupled acceleration dynamics, results in an optimal acceleration that is along the ZEM vector. Therefore, the optimal control will cause the missile to accelerate in the direction of the target, thus implementing the direct pursuit strategy.

In a similar manner, consider again the pursuer that does not have axial acceleration. The optimal control is given by Eq. 9.8. Substituting Eq. 9.9 into Eq. 9.8 gives the following optimal control law.

$$\omega(t) = g(t)\,\hat{r}(t)\,V_p(t)\times\hat{u}_r(t) \tag{9.10}$$

The direction of the control $\omega$ is such that it forces the missile's velocity vector to rotate toward the line-of-sight vector $\hat{u}_r(t)$, thus implementing the direct pursuit strategy. Therefore, the control constraints do not change the conditions under which direct pursuit is an optimal strategy.
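A minimal sketch of the constrained direct-pursuit command of Eq. 9.10 follows, with an illustrative constant gain in place of the unspecified $g(t)$. The check below verifies the geometric claim just made: the induced turn $\omega\times V_p$ points toward the measured LOS direction.

```python
import numpy as np

def direct_pursuit_omega(r_mag, v_p, u_r, g=1.0):
    """Direct-pursuit turn-rate command (a sketch of Eq. 9.10).

    r_mag : estimated range scale factor (may be crude for a passive sensor)
    v_p   : pursuer velocity vector
    u_r   : unit LOS direction, measurable even by a passive sensor
    g     : unspecified positive scalar gain (illustrative)
    """
    return g * r_mag * np.cross(v_p, u_r)
```

Since only the direction of $\omega$ matters for which way the velocity vector rotates, an inaccurate range estimate merely rescales the command.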

9.2 Control Strategies and Control Constraints

The numerous engagement models discussed in Chapter 7, and the corresponding estimators discussed in Chapter 8, all support the claim that control constraints do not influence the optimal guidance strategy. Rather, the control constraints dictate the optimal way of implementing a guidance strategy that is itself determined by the information constraints, as discussed in Section 9.1. There are basically three types of control constraints: (1) magnitude constraints, which limit the magnitude of the control available; (2) directional constraints, which limit the direction in which the control can be exerted; and (3) dynamic constraints, which limit how quickly the control can be applied.

It would certainly be a stretch to claim that magnitude constraints always result in a saturating controller. However, when directional constraints are not also present, it was shown in Section 7.9 that magnitude constraints do result in a saturating controller, so long as the dynamic constraints on the pursuer's acceleration are minimum phase. Directional constraints result in a very nonlinear system, which makes direct application of optimal control difficult. However, it was shown in Section 7.10 that the optimal control is clearly related to the control law when the directional constraints are not present. One aspect of this relationship that is of prime importance here is that the pursuer with directional constraints will implement the same strategy as that without directional constraints. Dynamic constraints are really only important in the ZEM guidance strategy. If the information constraints dictate that the ZEM guidance strategy is optimal, then the dynamic constraints ensure that the ZEM properly accounts for the zero-input response of the missile's flight control system. Conversely, if the parallel navigation strategy or direct pursuit strategy is deemed optimal (based on the information constraints), then the "higher-order" effects of the flight control system's latent energy are not of importance.

9.3 Guidance Strategy Paradigm

The three guidance strategies discussed in Section 9.1 each have tremendous intuitive appeal. If one has sufficient higher-level information, such as the ability to estimate the target's acceleration and predict its future location, then it seems very logical that the optimal strategy would simply be to position the missile such that it is on a collision course with the target's future location; this is the ZEM guidance strategy. However, if one lacks such higher-level information, but still has the ability to estimate the target's position and velocity relative to the missile, then it seems very logical to position the missile on a constant-velocity collision course with the target (stabilize the angular motion of the line-of-sight); this is the parallel navigation strategy. Finally, if one only has knowledge of the target's position relative to the missile, then the only logical course of action seems to be to steer the missile directly in that direction; this is the direct pursuit strategy.

The separation theorem was useful in demonstrating the optimality of these logical conclusions, but by itself it is not sufficient. More evidence is necessary to claim that these strategies are indeed fundamentally optimal. Specifically, the optimality of these strategies must also be demonstrated when the missile has directional constraints on its acceleration vector. The problem of directional constraints was first addressed by Guelman's work in the early 1980's (see [37] and [38]). Because Guelman's final result was not a closed-loop guidance law, he never realized the importance of the zero-effort miss for this engagement model. Prior to this dissertation, the importance of zero-effort miss and proportional navigation were well known, but these concepts were not recognized or classified as fundamental guidance strategies, and only appeared in the context of the guidance laws themselves.
The one exception to this, which has already been mentioned, is the book by Shneydor [78]. Shneydor did not discuss optimal strategies, but rather recognized that a wide number of guidance laws could be classified as either the parallel navigation type or the direct pursuit type. It was only after extensive study of Guelman's engagement model that the author of this dissertation finally arrived at an optimal control in which the ZEM explicitly appeared. That is, this author did not begin the study of Guelman's engagement model with the intention of finding a control law that implemented the ZEM strategy, because it was not suspected that such nonlinear equations would yield such a result. It was only after the optimal guidance law was obtained and compared with the guidance law without directional constraints that this author began to see that there must be a guidance strategy and control strategy dichotomy in guidance law development. It seems that there is now sufficient evidence to claim that the strategies discussed in Section 9.1 are indeed optimal regardless of the control constraints. The next and final chapter of this dissertation will use this observation as a premise (or at least as an aid) for developing new guidance laws.

Chapter 10 Applications of the Strategy Paradigm and Conclusion

The previous chapter of this dissertation developed the concept of guidance strategies and control strategies. This chapter will exploit this concept in two ways. First, the guidance strategy paradigm will be used with Lyapunov control theory as a method of developing sub-optimal control strategies for optimal guidance strategies. Next, the guidance strategy paradigm will be used to extend the well-known pure proportional navigation guidance law to the zero-effort miss strategy, thus making it effective against maneuvering targets. The last section of this chapter concludes the dissertation.

10.1 Lyapunov Guidance

This section will show how the guidance strategy paradigm can be used with Lyapunov control theory as a method of developing sub-optimal control strategies for optimal guidance strategies. The theory of Lyapunov, or function-minimizing, controls was introduced in Section 5.3, and a slight extension will be included in this section. The key to a successful Lyapunov controller is the choice of the Lyapunov function $W(x, u)$. The approach taken in this section is to use the guidance strategy as the basis for the Lyapunov function. As discussed in Section 5.3, the control is chosen to minimize the cost functional

$$J_u = \frac{k_u}{2}u^Tu + \dot{W}(x,u)\,. \tag{10.1}$$

It may be the case that the control cannot directly affect $\dot{W}(x,u)$, so the next best thing is to choose a control that decreases $\ddot{W}(x,u)$ as much as possible. One might generalize this idea as follows

$$J_u = \frac{k_u}{2}u^Tu + \sum_{n=1}^{N}k_n\frac{d^nW}{dt^n}\,. \tag{10.2}$$

The next two subsections will show how Lyapunov control theory can be used to obtain sub-optimal control strategies for the direct pursuit strategy and the zero-effort miss guidance strategy.

10.1.1 Direct Pursuit Guidance

This section uses Lyapunov control theory to develop a sub-optimal control strategy for the direct pursuit guidance strategy. Recall that the goal of the direct pursuit guidance strategy is to force the missile to accelerate so as to reduce the range between the missile and the target. This strategy is optimal when the missile only has information about the relative position of the target, that is, when the missile does not have information about relative velocity and relative acceleration. The following Lyapunov function is appropriate for the direct pursuit guidance strategy:

$$W(x) = \frac{1}{2}r^Tr\,. \tag{10.3}$$

The time derivative of this function is given by

$$\dot{W} = r^T(V_e - V_p)\,. \tag{10.4}$$

Since the pursuer has no control over the evader's velocity vector $V_e$, the pursuer's strategy is to align its velocity vector $V_p$ with the line-of-sight vector $r$. In this case, one can reason out the control strategy from the form of $\dot{W}(x)$. Because this may not always be possible, a mathematical approach to arriving at the same result is now discussed. A cost function can be formed by substituting Eq. 10.4 into Eq. 10.1 as follows

$$J = \frac{k_u}{2}u^Tu + r^T(V_e - V_p)\,. \tag{10.5}$$

However, this cost function is not useful for obtaining a control law, because the pursuer's control has no direct effect on $\dot{W}(x)$. The solution to this problem is to instead seek a control to minimize $\ddot{W}(x)$, as indicated by Eq. 10.2. The second derivative of the Lyapunov function is given by the following equation.

$$\ddot{W} = r^T\left(\dot{V}_e - \dot{V}_p\right) + \dot{r}^T\dot{r} \tag{10.6}$$

To proceed further, we must specify whether there are control constraints on the pursuer's acceleration vector. Suppose that the pursuer can only accelerate normal to its velocity vector, as discussed in Section 7.10. The pursuer's acceleration vector is given by

$$\dot{V}_p = \omega\times V_p\,. \tag{10.7}$$

From purely geometric considerations, one can see that the pursuer's control will minimize $\ddot{W}(x)$ when

$$\omega = kV_p\times r\,. \tag{10.8}$$

Alternatively, one may select a control that will minimize Eq. 10.2 for $N = 2$:

$$J_\omega = \frac{k_u}{2}\omega^T\omega + k_1\dot{W} + k_2\ddot{W} = \frac{k_u}{2}\omega^T\omega + k_1 r^T(V_e - V_p) + k_2\left(r^T\left(\dot{V}_e - \dot{V}_p\right) + \dot{r}^T\dot{r}\right)\,. \tag{10.9}$$

Necessary conditions for a minimum of $J$ with respect to $\omega$ are

$$\frac{\partial J}{\partial\omega} = 0^T = k_u\omega^T + k_2\,r^T\left(V_p\times\right)\,. \tag{10.10}$$

Rearranging this equation gives

$$\omega = kV_p\times r\,. \tag{10.11}$$

The choice of descent function determines the guidance strategy, while the derivative terms included in the cost function assist in explicitly developing the closed-loop control.
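A closed-loop sanity check of Eq. 10.11 under the turning dynamics $\dot{V}_p = \omega\times V_p$ can be sketched as follows. The gain, time step, and initial conditions are illustrative assumptions; the point is only that the heading error (the angle between $V_p$ and the LOS $r$) decays toward zero for a non-maneuvering evader.

```python
import numpy as np

def simulate_lyapunov_direct_pursuit(k=2.0, dt=0.01, steps=800):
    """Closed-loop sketch of omega = k * Vp x r (Eq. 10.11) against a
    stationary evader. Returns (initial, final) heading error in radians."""
    r = np.array([10.0, 5.0, 0.0])      # LOS from pursuer to evader
    v_p = np.array([1.0, 0.0, 0.0])     # pursuer velocity
    v_e = np.zeros(3)                   # non-maneuvering (stationary) evader

    def heading_error(r, v_p):
        c = np.dot(r, v_p) / (np.linalg.norm(r) * np.linalg.norm(v_p))
        return np.arccos(np.clip(c, -1.0, 1.0))

    e0 = heading_error(r, v_p)
    for _ in range(steps):
        omega = k * np.cross(v_p, r)          # Lyapunov direct-pursuit command
        v_p = v_p + dt * np.cross(omega, v_p) # lateral-only turning dynamics
        r = r + dt * (v_e - v_p)              # LOS kinematics
    return e0, heading_error(r, v_p)
```

Because $\omega\times V_p$ is always normal to $V_p$, the command turns the velocity vector without (to first order) changing its magnitude, which is consistent with the directional constraint of Section 7.10.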

For the sake of completeness, assume that the pursuer can exert control in any direction, as discussed in Section 7.5. The pursuer's acceleration vector is given by

$$\dot{V}_p = A_p\,. \tag{10.12}$$

The control can be chosen to minimize Eq. 10.2 for N =2

$$J = \frac{k_u}{2}A_p^TA_p + k_1\dot{W} + k_2\ddot{W} = \frac{k_u}{2}A_p^TA_p + k_1 r^T\dot{r} + k_2\left(r^T(A_e - A_p) + \dot{r}^T\dot{r}\right)\,. \tag{10.13}$$

$$\frac{\partial J}{\partial A_p} = 0^T = k_uA_p^T - k_2\,r^T\,. \tag{10.14}$$

Rearranging this equation gives

$$A_p = kr\,. \tag{10.15}$$

This subsection has developed two sub-optimal controllers, Eq. 10.11 and Eq. 10.15, for the direct pursuit guidance strategy. Comparing these results to those of Chapter 7, one can see that these controllers are nearly optimal implementations of the direct pursuit guidance strategy. More precisely, they are optimal to within a time-varying scale factor! The next subsection develops similar results for the ZEM guidance strategy.

10.1.2 Zero-Effort Miss Guidance

This section uses Lyapunov control theory to develop a sub-optimal control strategy for the ZEM guidance strategy. Recall that the goal of the ZEM guidance strategy is to reduce the magnitude of the ZEM. The ZEM guidance strategy is optimal when the pursuer can estimate the relative position, relative velocity, and relative acceleration. The following Lyapunov function is appropriate for the ZEM guidance strategy:

$$W(x) = \frac{k_L(t)}{2}\,\overrightarrow{ZEM}^T\overrightarrow{ZEM}\,, \tag{10.16}$$

where the scaling factor $k_L(t)$ is an unspecified function of time. To proceed, it is necessary to specify the engagement model, and in particular the pursuer's control constraints. Suppose that the pursuer can only accelerate normal to its velocity vector, as discussed in Section 7.10. The pursuer's acceleration vector is given by

$$\dot{V}_p = \omega\times V_p\,. \tag{10.17}$$

The engagement model is given by Eq. 7.173 and repeated below.

$$\dot{r} = V_e - V_p \tag{10.18a}$$

$$\dot{V}_e = A_e \tag{10.18b}$$

$$\dot{V}_p = \omega\times V_p \tag{10.18c}$$

The ZEM is given by Eq. 7.24 and repeated below.

$$\overrightarrow{ZEM} = r(t) + \tau\dot{r}(t) + \int_0^\tau \eta A_e(t_f - \eta)\,d\eta \tag{10.19}$$

The applied control will minimize the cost function

$$J_L = \dot{W} + \frac{k_\omega}{2}\omega^T\omega\,. \tag{10.20}$$

The term $\dot{W}$ is given by

$$\dot{W} = \frac{\partial W}{\partial t} + \frac{\partial W}{\partial x}\frac{dx}{dt}$$

$$= \frac{\partial W}{\partial t} + k_L(t)\,\overrightarrow{ZEM}^T\,\frac{\partial\overrightarrow{ZEM}}{\partial x}\frac{dx}{dt}\,, \tag{10.21}$$

where

$$x = \begin{bmatrix} r \\ V_e \\ V_p \end{bmatrix}\,. \tag{10.22}$$

The time derivative of the zero-effort miss is given by

$$\frac{\partial\overrightarrow{ZEM}}{\partial x}\frac{dx}{dt} = \begin{bmatrix} I_{3\times3} & \tau I_{3\times3} & -\tau I_{3\times3} \end{bmatrix}\begin{bmatrix} V_e - V_p \\ A_e \\ \omega\times V_p \end{bmatrix} = V_e - V_p + \tau A_e + \tau V_p\times\omega\,. \tag{10.23}$$

Substituting this result into Eq. 10.21 gives

$$\dot{W} = \frac{\partial W}{\partial t} + k_L(t)\,\overrightarrow{ZEM}^T\,\frac{\partial\overrightarrow{ZEM}}{\partial x}\frac{dx}{dt} = \frac{\partial W}{\partial t} + k_L(t)\,\overrightarrow{ZEM}^T\left(V_e - V_p + \tau A_e + \tau V_p\times\omega\right)\,. \tag{10.24}$$

Eq. 10.20 has an extremum with respect to ω when

$$\frac{\partial J_L}{\partial\omega} = 0^T = k_\omega\omega^T + k_L(t)\,\frac{\partial}{\partial\omega}\left[\overrightarrow{ZEM}^T\,\frac{\partial\overrightarrow{ZEM}}{\partial x}\frac{dx}{dt}\right] = k_\omega\omega^T + k_L(t)\,\tau\,\overrightarrow{ZEM}^T\left(V_p\times\right)\,. \tag{10.25}$$

Rearranging to solve for ω yields

$$\omega = N(\tau)\,V_p\times\overrightarrow{ZEM}\,, \tag{10.26}$$

where $N(\tau)$ is a completely general (i.e., unconstrained) scale factor. The optimal guidance law for this system was considered in detail in Section 7.10. For reasons explained therein, a good (but sub-optimal) choice for the scale factor is

$$N \simeq \frac{3}{V_p^2\tau^2}\,. \tag{10.27}$$

Using this value for $N$ results in what the author refers to as the Lyapunov ZEM (LZEM) guidance law:

$$\omega = \frac{3}{V_p^2\tau^2}\,V_p\times\overrightarrow{ZEM}\,. \tag{10.28}$$
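The LZEM law of Eq. 10.28 is a one-line computation. The sketch below (function and argument names are illustrative) also checks a structural property of the law: being a cross product with $V_p$, the command never calls for axial acceleration.

```python
import numpy as np

def lzem_command(v_p, zem, tau):
    """Lyapunov ZEM (LZEM) turn-rate command, Eq. 10.28:
    omega = 3 / (|Vp|^2 * tau^2) * (Vp x ZEM)."""
    vp_sq = np.dot(v_p, v_p)                 # |Vp|^2
    return (3.0 / (vp_sq * tau**2)) * np.cross(v_p, zem)
```

The scale factor $3/(V_p^2\tau^2)$ grows as the time-to-go shrinks, so residual ZEM is corrected ever more aggressively late in the engagement.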

Substituting Eq. 7.30 into Eq. 10.28 results in a guidance law that the author refers to as the Lyapunov Augmented Proportional Navigation (LAPN) guidance law

$$\omega = \frac{3}{V_p^2\tau^2}\,V_p\times\left(\tau\Omega_\perp(t)\times r(t) + \int_0^\tau \eta A_e(t_f - \eta)\,d\eta\right)\,. \tag{10.29}$$

Both Eq. 10.28 and Eq. 10.29 implement the ZEM guidance strategy. Omitting the velocity and acceleration terms for the ZEM in Eq. 10.28 results in a guidance law that implements the direct pursuit guidance strategy:

$$\omega = \frac{3}{V_p^2\tau^2}\,V_p\times r(t)\,. \tag{10.30}$$

Note that this result is equivalent to Eq. 10.11. In a similar manner, a sub-optimal guidance law can be obtained for the parallel navigation strategy by omitting the target acceleration term in Eq. 10.29. Doing so results in a guidance law that the author refers to as Lyapunov Proportional Navigation (LPN):

$$\omega = \frac{3}{V_p^2\tau}\,V_p\times\left(\Omega_\perp\times r\right)\,. \tag{10.31}$$

This is precisely the guidance law that results when the following Lyapunov function (which is appropriate for the parallel navigation strategy) is used:

$$W(x) = \frac{k_L(t)}{2}\,\Omega_\perp^T\Omega_\perp\,. \tag{10.32}$$

This section has used Lyapunov control theory and the guidance strategy paradigm to develop several guidance laws. More specifically, two guidance laws were developed to implement the direct pursuit guidance strategy, one of which is valid for a missile with directional control constraints (Eq. 10.11), and the other is valid for a missile without control constraints (Eq. 10.15). Two other guidance laws were developed for a missile with directional constraints, one of which implements the ZEM guidance strategy (see Eq. 10.28), and the other implements the parallel navigation strategy (see Eq. 10.31). In all cases, the guidance laws are identical in structure to optimal guidance laws, with the only difference being a (possibly time-varying) scale factor.

Three of the guidance laws developed in this section are very similar to existing guidance laws. Accordingly, the names of these laws were used as the basis for naming the new guidance laws by simply preceding the appropriate names with the word "Lyapunov." This naming convention is also appealing for reasons that will become clear in the next section of this chapter. The next section of this chapter will use the guidance strategy paradigm in a slightly different manner than it was used in this section. In the following section a well-known guidance law, pure proportional navigation (PPN), will be analyzed and separated into a guidance strategy and a control strategy. Once the control strategy is explained, it is applied to the ZEM guidance strategy, thereby greatly extending the capabilities of the PPN guidance law against maneuvering targets.

10.2 Extending PPN for Maneuvering Targets

The underlying theme of this dissertation has been the development of guidance laws in the context of guidance strategies. This abstraction aids in the extension of the well-known pure proportional navigation (PPN) guidance law, which implements the parallel navigation strategy, to a guidance law that implements the ZEM strategy. Pure proportional navigation (PPN) is defined as [82]

$$\omega = N\Omega_\perp\,. \tag{10.33}$$

PPN has been found to have the following advantages over TPN (given in Eq. 7.68) [79, p. 382]

1. Unlike TPN, PPN does not require forward acceleration or braking

2. PPN is more efficient than TPN in terms of control effort

3. TPN has stronger restrictions on initial conditions to ensure intercept, resulting in larger capture zones for PPN

[Figure 10.1 here: engagement geometry showing the vectors $V_e$, $V_p$, and $r$, the transverse LOS rate $\Omega_\perp$, the ZEM vector, and the command directions $\omega_{PPN}$, $A_{TPN}$, and $\omega_{LPN}$.]

Figure 10.1. Comparison between three forms of proportional navigation.

4. TPN requires larger acceleration commands

To demonstrate the differences between TPN (Eq. 7.68), LPN (Eq. 10.31), and PPN (Eq. 10.33), consider the engagement geometry illustrated in Figure 10.1. The cross plane is defined to be the plane containing $r$ and $\Omega_\perp$, and the engagement plane is defined to be perpendicular to the cross plane and containing $r$. The engagement illustrated in Figure 10.1 is completely general and could represent any scenario. However, the viewing angle of the engagement is such that the plane containing $r$ and $\Omega_\perp$ is in the page. For the moment, assume the evader has a constant velocity. For a constant-velocity evader, the ZEM is perpendicular to the engagement plane. If the missile can accelerate in any direction (i.e., it has no control constraints), then TPN would be optimal. The TPN guidance command, indicated by the vector $A_{TPN}$ in Figure

10.1, is collinear with the ZEM. The TPN law would keep the heading error zero in the cross plane and simply accelerate the missile along the direction of the ZEM. That is, the control strategy of TPN is to maintain zero heading error in the cross plane while reducing the ZEM in the engagement plane. On the contrary, the LPN guidance law rotates the missile velocity vector directly toward the ZEM, thereby creating a heading error in the cross plane where there was previously none. That is, the control strategy of LPN is to reduce the angle between the missile's velocity vector and the ZEM, so as to reduce the ZEM in the most direct manner possible. The PPN guidance law rotates the missile velocity vector in such a way that the ZEM is reduced without the (possibly adverse) effect of creating a heading error in the cross plane. Because of this, it can be reasoned that the PPN control strategy is identical to the TPN control strategy.

It is not possible to conclude which guidance law is best (LPN or PPN) from this qualitative analysis. LPN may reduce the zero effort miss faster, while PPN may conserve energy better. One guidance law may result in severely degraded performance when radome errors or other measurement degradations are present. It is only possible to thoroughly analyze the merits (and demerits) of these guidance laws by use of a high-fidelity, application-specific simulation. Furthermore, the goal of this section is not to make such assessments, but rather to better understand the PPN guidance law in the context of the guidance strategy paradigm. This goal was achieved by identifying both the guidance strategy and the control strategy of the PPN guidance law. The following subsection uses this information to extend the PPN guidance law to a form that implements the ZEM guidance strategy.

10.2.1 Pure Augmented Proportional Navigation

The previous subsection identified PPN as a guidance law that implements the parallel navigation strategy by use of a control strategy that is identical to that used by TPN.

This section will use these results to develop an advanced form of PPN that is capable of engaging accelerating targets, in a similar manner to the True Augmented PN (TAPN) guidance law. It should be clear from the work in this chapter and previous chapters that the ZEM guidance strategy is to be preferred to the parallel navigation strategy when sufficient information (knowledge of target acceleration) is available to the missile. To further illustrate this point, consider the conventional (True) forms of the proportional navigation and augmented proportional navigation guidance laws. True augmented proportional navigation (TAPN), as given by Eq. 7.67, is superior to true proportional navigation (TPN), as given by Eq. 7.68, against maneuvering targets. This can be demonstrated with the aid of Figure 10.1. One could imagine that the evader performs a constant acceleration maneuver that directly cancels the heading error, resulting in a zero-magnitude ZEM. While TAPN would issue a zero guidance command, TPN would attempt to (unnecessarily) remove the heading error, thereby increasing the ZEM and wasting energy.

The previous discussion demonstrates the advantage of including a target acceleration term in the TPN guidance law and gives motivation for including one in the PPN guidance law. The guidance strategy paradigm will be the means by which this is accomplished. In the previous subsection, it was found that PPN attempts to maneuver the missile in such a way as to prevent inducing a cross-plane heading error. This can be thought of as the control strategy used by PPN to implement the parallel navigation guidance strategy. For a maneuvering target, this control strategy can be accomplished by forcing $\omega$ to be perpendicular to the LOS and the ZEM:

$$\begin{aligned}
\omega &= k_{PAPN}\left(r\times\overrightarrow{ZEM}\right)\\
&= k_{PAPN}\left(r\times\left[r + \tau\dot{r} + \int_0^\tau \eta A_e\,d\eta\right]\right)\\
&= k_{PAPN}\left(r\times\left[\tau\Omega_\perp\times r + \int_0^\tau \eta A_e\,d\eta\right]\right)\\
&= k_{PAPN}\left[r^2\tau\Omega_\perp + r\times\left(\int_0^\tau \eta A_e\,d\eta\right)\right]\,.
\end{aligned} \tag{10.34}$$

The value of $k_{PAPN}$ is selected so that PAPN reduces to PPN for zero target acceleration:

$$\omega = \frac{N}{r^2\tau}\left(r\times\overrightarrow{ZEM}\right) \tag{10.35a}$$

$$= N\left[\Omega_\perp + \frac{1}{r^2\tau}\,r\times\left(\int_0^\tau \eta A_e\,d\eta\right)\right]\,. \tag{10.35b}$$

This results in a control law that shares the velocity-referenced advantages of PPN and the evader-acceleration advantages of TAPN. The next section will demonstrate the effectiveness of the PAPN guidance law by use of a simulation.
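A sketch of the PAPN command of Eq. 10.35b follows, under the simplifying assumption of a constant target-acceleration estimate, so that $\int_0^\tau \eta A_e\,d\eta = \frac{\tau^2}{2}A_e$. Function and argument names are illustrative.

```python
import numpy as np

def papn_command(N, omega_perp, r_vec, tau, a_e):
    """PAPN turn-rate command (Eq. 10.35b) with a constant Ae estimate."""
    r_sq = np.dot(r_vec, r_vec)                  # |r|^2
    accel_integral = 0.5 * tau**2 * a_e          # integral of eta*Ae for constant Ae
    return N * (omega_perp + np.cross(r_vec, accel_integral) / (r_sq * tau))
```

With `a_e = 0` the augmentation term vanishes and the command reduces to pure proportional navigation, `omega = N * omega_perp`.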

10.2.2 Numerical Results

Much has been said about the necessity of relying on qualitative analysis when comparing complex guidance laws. For example, a qualitative analysis revealed the control strategy used by pure proportional navigation. However, a quantitative analysis, even if it is limited, is often of great value, as long as one is aware of its limitations. For this reason, this section will present simulation results that assist in showing some important aspects of the PAPN guidance law. It will be verified that the PAPN guidance law does indeed perform better against maneuvering targets than the PPN guidance law. Doing so requires a simulation of the guidance laws against a maneuvering target. It will also be verified that the PPN guidance law maintains the ZEM in a single plane. Such a simulation is important because this

aspect of PPN was revealed in a qualitative analysis, and subsequently used as the control strategy effectively defining the family of "Pure" guidance laws. The engagement model used for these simulations is given in Section 7.10. A thorough analysis of this engagement model and the applicability of optimal control theory was performed in that section as well. While it is possible to implement the optimal guidance law by making use of the results therein, to do so would necessitate the development of an estimator for certain engagement parameters. Developing such an estimator is not the goal of this section. However, it is desirable to use the optimal control as a benchmark by which other guidance laws may be evaluated. To this end, an optimal control trajectory was produced by integrating the equations of motion from a specified terminal end point backwards in time by a specified time-to-go, thereby producing an initial state that leads to the target in an optimal fashion. To make the results more readable, the entire engagement is then transformed to a new coordinate system in which the optimal trajectory begins at the origin with the missile heading in a desired direction. Suboptimal guidance laws are then applied from the origin until terminal criteria are met. This method allows one to compare performance metrics, such as the value of the cost integral and the acceleration profile, for other guidance laws used in the engagement. Table 10.1 shows the parameter values for the engagement. The initial and terminal states are irrelevant, as long as Eq. 7.200 is satisfied for the optimal trajectory.

Table 10.1. Simulation parameters

    Parameter   Value   Reference
    ‖r(tf)‖     10^-4   Eq. 7.189b
    kr          10^-6   Eq. 7.49
    kt          0       Eq. 7.49
    N_PAPN      5       Eq. 10.35b
    ‖Vp‖        1       Eq. 7.173
    ‖Ve‖        1       Eq. 7.173
    tf          8       Time duration of optimal trajectory
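The backward-integration benchmark construction described above can be illustrated with a toy one-dimensional double integrator. This is a minimal sketch under assumed dynamics, not the dissertation's engagement model; `forward_step` and `backward_step` are illustrative names:

```python
def forward_step(x, v, a, dt):
    """One explicit Euler step of xdot = v, vdot = a."""
    return x + v*dt, v + a*dt

def backward_step(x, v, a, dt):
    """Exact inverse of forward_step: recover the previous state."""
    v_prev = v - a*dt
    return x - v_prev*dt, v_prev

x_f, v_f = 0.0, -1.0        # specified terminal state (e.g. intercept point)
accel, dt, n = 0.2, 0.01, 800

# Integrate backwards by the time-to-go n*dt to obtain a consistent initial state.
x, v = x_f, v_f
for _ in range(n):
    x, v = backward_step(x, v, accel, dt)

# Replaying forwards from that initial state recovers the terminal state,
# so the replayed trajectory can serve as a benchmark for other laws.
for _ in range(n):
    x, v = forward_step(x, v, accel, dt)
print(abs(x - x_f) < 1e-9, abs(v - v_f) < 1e-9)  # True True
```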

[Plot omitted; legend: OPT, PPN, PAPN, LAPN, TARGET; planar trajectories.]

Figure 10.2. Plane (2D) engagement with accelerating target

Accelerating Target  A computer simulation was conducted with an evader that traversed the curved path shown in Figure 10.2. The goal was to demonstrate the advantage of including knowledge of the target maneuver in the guidance command. The engagement was confined to a plane to make the results easier to interpret. However, simulations in 3D space were conducted as well, and similar performance was observed. Due to PPN's inability to anticipate the target maneuver, it grossly overcorrected and had a cost of nearly fourteen times that incurred by optimal guidance. Both LAPN and PAPN incurred a cost approximately fifteen percent more than optimal guidance. These results demonstrate the excellent performance of both LAPN and PAPN against an agile evader.

[Plot omitted; legend: Optimal, LPN, PPN; axes: time vs. X-axis ZEM.]

Figure 10.3. X-axis ZEM trajectories.

Constant Velocity Target  A computer simulation was conducted with a constant velocity evader to demonstrate the effectiveness of PPN and LPN. The data presented here are characteristic of constant velocity engagements. It was observed that PPN performed better for some parameter settings and LPN performed better for other parameter settings. Figures 10.3-10.5 show the ZEM for the Optimal, LPN, and PPN guidance laws. Note that the viewing angle is such that the initial engagement plane is the X-Z plane. This was done to show that the PPN guidance law does indeed keep the ZEM confined to this plane. Figure 10.6 shows the magnitude of the ZEM (decreasing trajectories) and the normalized cost (increasing trajectories), which is a constant times the integral in Eq. 7.174. Figure 10.7 shows the magnitude of the pursuer's acceleration vector.

[Plot omitted; legend: Optimal, LPN, PPN; axes: time vs. Y-axis ZEM.]

Figure 10.4. Y-axis ZEM trajectories.

[Plot omitted; legend: Optimal, LPN, PPN; axes: time vs. Z-axis ZEM.]

Figure 10.5. Z-axis ZEM trajectories.

[Plot omitted; legend: Optimal, LPN, PPN; axes: time vs. ZEM magnitude and normalized cost.]

Figure 10.6. Asymptotically decreasing ZEM trajectories and asymptotically increasing cost trajectories for guidance laws.

[Plot omitted; legend: Optimal, LPN, PPN; axes: time vs. acceleration magnitude.]

Figure 10.7. Pursuer acceleration as a function of time.

10.3 Conclusions and Recommendations for Future Research

The author's goal in writing this dissertation was to develop novel approaches to missile guidance. Such a dissertation is important, and will continue in its relevance, because of the evolving capabilities of the threats faced by modern defensive weapons. This dissertation should serve as a useful reference to those entering the field of missile guidance, as well as those who have been in the field for some time. The following three subsections conclude this dissertation. Subsection 10.3.1 provides a brief summary of the dissertation, mainly focusing on content. Subsection 10.3.2 provides a discussion of the contributions of this dissertation to the field of missile guidance. The final subsection indicates some possible avenues of future research.

10.3.1 Summary of Dissertation

This subsection provides a brief summary of the topics covered in this dissertation. The first several chapters of this dissertation were included to provide a theoretical framework that could be used to develop the theory of missile guidance. The theory of estimation (Chapter 2), and in particular estimation in linear sampled data systems (Chapter 3), was developed in detail. These topics are of direct importance to missile guidance because an estimate of pertinent engagement parameters is necessary to guide the missile to its target. The theory of estimation was essential to the discussion of estimating the zero-effort miss in Chapter 8 and to the discussion of guidance strategies in Chapter 9. It is possible that the chapters on estimation could have been omitted and the author could have instead relied on extensive referencing of various textbooks. Besides being cumbersome, this approach suffers other problems. The first problem is that the author did not always have a reference for all the material needed. Another problem is the inconsistency of conventions across the many discourses on estimation. Therefore, it was decided to include these chapters on estimation with the hope that doing so would at least make the dissertation more self-contained and more easily read. Chapter 4 addressed the problem of tracking a target with an unknown motion profile. Therein, the target's inertial properties were shown to result in a Markov model, which itself has been extensively studied elsewhere in the literature. Chapter 5 developed fundamental concepts in optimization and advanced control theory. This chapter was included because these theories are central to the study of missile guidance. The problem of modeling the airframe and associated controller (autopilot), which are together referred to as the flight control system, was taken up in Chapter 6.
A tail controlled missile was analyzed and the equations governing the commanded-to-achieved acceleration were developed. The resulting model was shown to be a non-minimum phase, third-order system. Two lower-fidelity alternatives were also proposed and compared with the full model. The target models developed in Chapter 4 and the missile models developed in Chapter 6 were used to construct various engagement models in Chapter 7. General equations describing the miss and zero-effort-miss (ZEM) were developed for these engagement models. These results were then used to concisely develop optimal controllers for the missile using the optimal control theory developed in Chapter 5. Chapter 8 addressed the problem of estimating the variables required by the optimal controllers. The results of Chapters 7-8 were used in Chapter 9 to develop the guidance strategy paradigm. The usefulness of the guidance strategy paradigm was made evident by the two examples given in Chapter 10 of the dissertation.

10.3.2 Contributions

This dissertation has made several contributions to the field of missile guidance. Some of these contributions were minor, while others are likely to make an important impact on future work in missile guidance. It is suspected that some minor contributions may also have been made to fields outside of missile guidance. However, because of the author's limited knowledge of these fields, the author notes that others may have previously arrived at the same results.

Minor Contributions  Chapters 2-4 of this dissertation may very well make some minor contributions to estimation theory. It is difficult to claim with certainty that contributions were made to this field because it is a very broad field and it is not as well known to the author as the field of missile guidance. Chapter 2 presents the theory of estimation in terms of risk. This proved an excellent method of developing Bayesian estimation and enabled the author to develop maximum likelihood estimators in the same context as minimum mean square error (MMSE) estimators. Nevertheless, the author doubts that this is an original approach to estimation. Chapter 3 presents the theory of estimation in linear sampled data systems. This is relevant to this dissertation because the missile is a continuous-time system that receives measurements at discrete time intervals. As such, it can be modeled as a sampled data system. This is an interesting topic because estimation is usually discussed for continuous-time or discrete-time systems, but not for sampled data systems. The results of this analysis did not reveal anything unexpected and resulted only in a blending of the results for continuous-time and discrete-time systems. Furthermore, it is very unlikely that this was the first time a Bayesian estimator was developed specifically for a sampled data system. Chapter 4 developed the theory of stochastic motion models. The author developed this theory in the context of the principle of inertia, and continuously referred back to this principle throughout the development of the various stochastic motion models considered therein. This could represent a minor contribution to stochastic motion modeling. While this principle is unlikely to result in any new tracking systems, it may assist in formalizing the theory or at least in providing a better context in which it can be presented.
Another minor contribution may have been made in describing the uncertainty of a Gaussian distributed random variable. Uncertainty regions were discussed briefly in Chapter 4. However, a more thorough discussion can be found in Appendix A.7.1. Therein, the uncertainty of a multivariate Gaussian random variable was shown to be related to the eigenvalues of the covariance matrix (a fact that is well known). However, the different types of uncertainty regions considered by the author, for example a box type of uncertainty region (see Figure A.4 and Table A.2), may represent a minor contribution to the field of statistics. As a final minor contribution, the author will mention the development of Pearson's estimation equations in Chapter 8. Pearson's original Ph.D. dissertation was well written and concise [66], but occasionally lacked rigor. The model proposed by Pearson is rigorously developed in Section 8.3. The author felt it necessary to derive these equations because of their importance to the field of estimation in radar systems and the lack of a source containing a detailed derivation of them.¹

Notable Contributions  The extension of Lyapunov control theory discussed in Section 10.1 is likely a new contribution to the field of control theory. More specifically, the modified cost function, given by Eq. 10.2, can be used for systems in which the control cannot instantaneously affect the gradient of the Lyapunov function. The flight control system models developed in Chapter 7 of this dissertation have all been developed in other references (see [101] and [84]). However, the time-domain and frequency-domain comparisons of the single pole and pole/zero flight control system approximations may have resulted in some new insight into the problem of flight control system modeling. It was observed that the wrong-way effect present in the pole-zero model may make it a more attractive model from a guidance system design perspective. However, the lack of high-frequency magnitude attenuation makes the pole model preferable from a filtering and estimation perspective (see Sections 6.4-6.5). Chapter 8 of this dissertation developed optimal guidance in a very concise manner. This was made possible by first developing equations for the miss and the zero effort miss and then using these equations to facilitate development of subsequent guidance laws. This is a novel approach to guidance law development that not only facilitates understanding of the resulting guidance laws, but also greatly simplifies the resulting mathematics. Another aspect that made this approach more understandable is the use of the state transition matrix rather than the Laplace transform approach used by Rusnak to develop guidance laws for missiles with acceleration dynamics [70]. A new definition of the heading error that is applicable in three dimensional space is given in Section 7.4. While the importance of the zero effort miss is well known in the field of missile guidance, the importance of the heading error (especially in the context of maneuvering targets) has not been fully realized. The

results of Section 7.10 show that the heading error is the metric of importance for a missile with directional acceleration constraints, thereby underscoring the importance of the heading error definition provided in Section 7.4. A new time-to-go estimation algorithm was developed in Chapter 8. This new algorithm, which is given by Eq. 7.227, is superior to the commonly used Bryson and Ho time-to-go algorithm [43]. The importance of this algorithm will likely be seen in the years that follow the publication of this dissertation.

¹ Although Pearson originally developed the equations, his derivation was somewhat heuristic.

Important Contributions  This dissertation makes at least five very important contributions to the field of missile guidance. The first major contribution is obtaining a closed loop controller for the engagement model originally proposed by Guelman in 1984 [38]. This model was considered in Section 7.10, and the closed loop controller is given by Eq. 7.196. This guidance law should be effective when the missile cannot control axial acceleration. The same guidance law is applicable if the missile has a (possibly time-varying) drag component that acts parallel to the missile velocity vector. The only difference is that the drag must be accounted for in computing the zero effort miss. The second major contribution to the field of missile guidance is the development of the guidance strategy paradigm. This new result will likely change the way many engineers view the process of missile guidance and the way they approach guidance law development. The guidance strategy paradigm considers optimal guidance as a dichotomy that is composed of an optimal guidance strategy and an optimal control strategy. The optimal guidance strategy is dictated by the missile's information constraints (i.e. level of knowledge about the state of the system), and the optimal control strategy is determined by the missile's control constraints. The third major contribution of this dissertation to the field of missile guidance is the use of Lyapunov control theory in conjunction with the guidance strategy paradigm as a means of developing missile guidance laws. The same approach can be used in more complicated engagement models that have previously been completely unassailable by optimal control. The fourth major contribution to the field of missile guidance is the extension of the pure proportional navigation (PPN) guidance law to a guidance law that is effective against maneuvering targets.
The resulting guidance law was developed as a result of the guidance strategy paradigm and will likely be the topic of several papers over the next few years. The fifth contribution (perhaps the least significant of the important contributions, but important nonetheless) is the development of an optimal guidance law for a missile that does not have the ability to estimate range effectively. The optimal guidance, which is given by Eq. 8.85, should be very useful in passive sensor missiles.

10.3.3 Future Research

The previous subsection detailed this dissertation's contributions to the field of missile guidance. Upon reading this dissertation, the reader will likely see that it provides many avenues for future research. Several of the areas that the author is most interested in will be briefly discussed in this subsection. One of the major contributions of this dissertation is in obtaining a closed loop controller for a missile without axial acceleration capability. This controller has a scale factor that is a function of the angle φ(tf) between the final-time values of the pursuer's velocity vector and the LOS vector. A plot of the scale factor for different values of φ(tf) is shown in Figure 7.8. To make the best use of this guidance law, an estimator should be developed to estimate the angle φ(tf). One possibility is to use the current value of the angle as a starting point, φ(tf) = φ(t), for further refinement. This estimate could then be improved by use of an onboard simulation. A simulation would be conducted using the current estimate of φ(tf) in the guidance law until terminal conditions are reached. A new value of φ(tf) could then be obtained from the resulting terminal state. This process would be repeated to produce increasingly accurate estimates of φ(tf). Even with modern hardware, it is unlikely that a simulation could be conducted between each measurement update (which likely occur at 30-100 Hz). However, it is likely that the parameter could be updated at a lower rate of 1 to 10 Hz. The new time-to-go algorithm, developed in Section 7.11, has a very interesting property. This algorithm makes use of the angle β between the relative position vector and relative velocity vector. When this angle is greater than thirty degrees, the resulting time-to-go estimate is complex (i.e. it has an imaginary component). This algorithm has been found to work perfectly for values of β less than 30 degrees.
Further research is needed to determine what the optimal time-to-go is for values of β greater than 30 degrees. The most promising avenue of future research is further development of Lyapunov controllers under the guidance strategy paradigm. The guidance strategy paradigm is likely extendable to other types of systems outside the field of missile guidance. The most successful applications of this technique will likely be for systems that are fundamentally linear but have controls that interact nonlinearly with the system state. Within the field of missile guidance, the methods of Section 10.1 could be applied to more complex engagement models. More specifically, an engagement model of a pursuer with both a directional control constraint and a dynamic control constraint should be studied in the context of guidance strategies and Lyapunov control techniques.
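The onboard-simulation refinement of φ(tf) described in this subsection is, in effect, a fixed-point iteration: simulate with the current estimate, then replace the estimate with the angle observed at the simulated terminal state. A heavily simplified sketch follows; `simulate_to_intercept` is a hypothetical stand-in for the onboard simulation, and a simple contraction map plays its role here:

```python
def refine_phi_tf(phi, simulate_to_intercept, iters=8):
    """Repeatedly run the (stand-in) simulation and adopt the terminal angle."""
    for _ in range(iters):
        phi = simulate_to_intercept(phi)
    return phi

# Stand-in "simulation": any contraction map converges to its fixed point.
simulate = lambda p: 0.5 * p + 0.1   # fixed point at phi = 0.2

phi_hat = refine_phi_tf(1.0, simulate)
print(round(phi_hat, 4))  # converges toward 0.2
```

Whether the real guidance-loop map is a contraction would of course have to be established for the engagement model in question.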

Appendix A Probability and Stochastic Processes

The intent of this appendix is to provide a review of some of the important concepts from probability theory and random process theory. The two most popular textbooks for this area of study are [63] and [85]. Another excellent book is [21].

A.1 Concepts in Probability

Set theory is used throughout probability. It is assumed that the reader can follow the discussion without formal set-theoretic definitions. The axiomatic approach to probability is based on the following three axioms.

Axiom A.1. 0 ≤ P(A) ≤ 1

Axiom A.2. P(S) = 1

Axiom A.3. A ∩ B = ∅ ⇒ P(A ∪ B) = P(A) + P(B)

The first axiom states that the probability of an event A occurring is between zero and one. The second axiom states that the probability of the certain event S is equal to one. The third axiom states that if the intersection of two events A and B is the null set (the events are mutually exclusive), the probability of either event occurring (the union of the events) is equal to the sum of their probabilities. The concept of a field is also important in the discussion of probability. A field F must satisfy two properties.

1. A ∈ F ⇔ Ā ∈ F

2. (A ∈ F) and (B ∈ F) ⇒ A ∪ B ∈ F

We can think of the set S as arising from an experiment. In this context, the set S is simply the set of all possible experimental outcomes. In fact, the axiomatic definition of an experiment consists of three specifications

1. The set S of all experimental outcomes

2. The set of events, which comprises a field F over the outcomes in S

3. The probability of these events

Assume that the experiment consists of a single toss of a fair coin. The set S is given by

S = {H, T} .    (A.36)

The field is given by

F = {∅, H, T, S} .    (A.37)

The probabilities of these events are

P(∅) = 0    (A.38a)
P(H) = 1/2    (A.38b)
P(T) = 1/2    (A.38c)
P(S) = 1 .    (A.38d)
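The fair-coin probability space above can be checked mechanically against Axioms A.1-A.3 and the field properties. A minimal sketch, with events represented as frozensets of outcomes (the representation is an illustrative choice):

```python
S = frozenset({'H', 'T'})
F = {frozenset(), frozenset({'H'}), frozenset({'T'}), S}
P = {frozenset(): 0.0,
     frozenset({'H'}): 0.5,
     frozenset({'T'}): 0.5,
     S: 1.0}

assert all(0.0 <= P[A] <= 1.0 for A in F)          # Axiom A.1
assert P[S] == 1.0                                  # Axiom A.2
for A in F:
    for B in F:
        if not (A & B):                             # disjoint events
            assert P[A | B] == P[A] + P[B]          # Axiom A.3
assert all(S - A in F for A in F)                   # closed under complement
assert all(A | B in F for A in F for B in F)        # closed under union
print("axioms and field properties hold")
```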

The conditional probability that event B occurred given that A occurred is defined to be

P(B|A) = P(A ∩ B) / P(A) .    (A.39)

An obvious consequence of this definition is Bayes' theorem

P(B|A) = P(A|B) P(B) / P(A) .    (A.40)

A.2 Random Variables

A random variable X is a mapping from the set of experimental outcomes S to the set of real numbers ℝ¹:

X = f(S) .    (A.41)

The set {X = x}, where x is a real number, is

{X = x} = {ζ_i : ζ_i ∈ S and f(ζ_i) = x} ,    (A.42)

and

P(X = x) = P(ζ_i : ζ_i ∈ S and f(ζ_i) = x) .    (A.43)

Similarly, the set {X ≤ x} is given by

{X ≤ x} = {ζ_i : ζ_i ∈ S and f(ζ_i) ≤ x} .    (A.44)

The probability of the event X ≤ x is known as the distribution of X

F_X(x) ≜ P({X ≤ x}) .    (A.45)

The derivative of this function is known as the density of X

f_X(x) ≜ dF_X(x)/dx    (A.46a)
F_X(x) = ∫_{-∞}^{x} f_X(ξ) dξ .    (A.46b)

If there are multiple random variables then

F_X(x) ≜ P(X1 ≤ x1, X2 ≤ x2, ..., Xn ≤ xn)    (A.47a)
f_X(x) ≜ ∂ⁿ F_X(x) / (∂x1 ∂x2 ··· ∂xn)    (A.47b)
F_X(x) = ∫_{-∞}^{xn} ··· ∫_{-∞}^{x2} ∫_{-∞}^{x1} f_X(x) dx1 dx2 ··· dxn ,    (A.47c)

where X = [X1 X2 ··· Xn]. If the random variables are independent then

f(x) = f1(x1) f2(x2) ··· fn(xn) .    (A.48)

The density function may be integrated to remove the dependency on a random variable

f_X1(x1) = ∫_{-∞}^{∞} f_X1,X2(x1, ξ2) dξ2 .    (A.49)

A consequence of Bayes' theorem is

F_X2|X1(x2|x1) ≜ P({X2 ≤ x2} | {x1 ≤ X1 ≤ x1 + dx1})
  = P({X2 ≤ x2} ∩ {x1 ≤ X1 ≤ x1 + dx1}) / P({x1 ≤ X1 ≤ x1 + dx1})
  = [∫_{-∞}^{x2} ∫_{x1}^{x1+dx1} f_X1,X2(ξ1, ξ2) dξ1 dξ2] / [∫_{x1}^{x1+dx1} f_X1(ξ1) dξ1]
  = [∫_{-∞}^{x2} f_X1,X2(x1, ξ2) dξ2 · dx1] / [f_X1(x1) · dx1]
  = [∫_{-∞}^{x2} f_X1,X2(x1, ξ2) dξ2] / f_X1(x1) .    (A.50)

Taking the partial derivative gives the conditional density

f_X2|X1(x2|x1) = ∂F_X2|X1(x2|x1) / ∂x2
  = f_X1,X2(x1, x2) / f_X1(x1) .    (A.51)

Similarly, for higher dimensional vectors

f_X|Z(x|z) = f_X,Z(x, z) / f_Z(z) ,    (A.52)

where Z = [Z1 Z2 ··· Zm].

One of the more useful concepts related to density functions is the moment generating function (MGF), which is defined by

M_X(t) ≜ E{e^(tX)} = ∫_{-∞}^{∞} e^(tx) f_X(x) dx .    (A.53)

Note that this is precisely the Laplace transform of the density function with s = -t. Therefore, all properties of the Laplace transform hold for the moment generating function; in particular, uniqueness

M_X(t) ↔ f_X(x) .    (A.54)

M_X(t) = E{e^(tᵀX)} = E{e^(t1·X1) e^(t2·X2) ··· e^(tn·Xn)} .    (A.55)

If the n random variables Xi are all independent of one another then the MGF is given by

M_X(t) = M_X1(t1) M_X2(t2) ··· M_Xn(tn) .    (A.56)

A.3.1 Computation of Moments

Expanding the MGF into a Taylor series gives

M_X(t) = E{e^(tX)}
  = E{1 + tX + (tX)²/2! + (tX)³/3! + ···}
  = 1 + t µ_x^(1) + (t²/2!) µ_x^(2) + (t³/3!) µ_x^(3) + ··· ,    (A.57)

where µ_x^(n) = E{X^n}. Thus, the nth order moment can be computed from the MGF

µ_x^(n) = [∂^n M_X(t) / ∂t^n]_{t=0} ,    (A.58)

as can be seen directly from the previously given Taylor series expansion of the MGF.
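Eq. A.58 can be verified numerically. A minimal sketch for X ~ N(2, 9), whose MGF is given later by Eq. A.76; the finite-difference step h is an arbitrary choice:

```python
import math

# MGF of N(mu, var): exp(mu*t + var*t^2/2); derivatives at t = 0 give moments.
mu, var = 2.0, 9.0
M = lambda t: math.exp(mu*t + 0.5*var*t*t)

h = 1e-5
m1 = (M(h) - M(-h)) / (2*h)            # first moment:  E{X} = mu = 2
m2 = (M(h) - 2*M(0.0) + M(-h)) / h**2  # second moment: E{X^2} = mu^2 + var = 13

print(round(m1, 4), round(m2, 4))  # close to 2.0 and 13.0
```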

Theorem A.4. In the case of multivariate random variables, the first moment (mean) can be obtained from

E{X} = (∂M_X(t)/∂t)ᵀ|_{t=0} .    (A.59)

Similarly, the correlation can be obtained from

E{X Xᵀ} = [(∂/∂t)(∂M_X(t)/∂t)ᵀ]_{t=0} .    (A.60)

∂M_X(t)/∂t = (∂/∂t) E{e^(tᵀX)} = E{(∂/∂t) e^(tᵀX)} = E{Xᵀ e^(tᵀX)} .    (A.61)

Setting t = 0 gives

µ = E{X} = (∂M_X(t)/∂t)ᵀ|_{t=0} .    (A.62)

(∂/∂t)(∂M_X(t)/∂t)ᵀ = (∂/∂t) E{Xᵀ e^(tᵀX)}ᵀ
  = (∂/∂t) E{X e^(tᵀX)}
  = E{X ∂e^(tᵀX)/∂t}
  = E{X Xᵀ e^(tᵀX)} .    (A.63)

Setting t = 0 gives

E{X Xᵀ} = [(∂/∂t)(∂M_X(t)/∂t)ᵀ]_{t=0} .    (A.64)

A.3.2 Functions of Independent Random Variables

The uniqueness property of the MGF makes it very convenient to compute the PDF of a random variable that is a function of other random variables. For example, suppose Y = g(X); then the MGF of Y is given by

M_Y(t) = E{e^(tY)} = E{e^(t g(X))} = ∫_{-∞}^{∞} e^(t g(x)) f(x) dx .    (A.65)

Theorem A.5. The sum of n independent random variables has an MGF equal to the product of the individual random variables' MGFs.

Proof. Let Y denote the sum of the independent RV's

Y = Σ_{i=1}^{n} Xi .    (A.66)

The MGF of Y is given by

M_Y(t) = E{e^(tY)} = E{e^(t(X1+X2+···+Xn))}
  = E{e^(tX1)} E{e^(tX2)} ··· E{e^(tXn)} = Π_{i=1}^{n} M_Xi(t) .    (A.67)

Corollary A.6. The sum of n independent and identically distributed (IID) random variables has an MGF equal to the MGF of the individual RV’s raised to the nth power

Proof. Since the random variables are independent and identically distributed,

M_Y(t) = Π_{i=1}^{n} M_Xi(t) = Π_{i=1}^{n} M_X(t) = [M_X(t)]^n .    (A.68)
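Corollary A.6 is easy to check by Monte Carlo. A sketch for the sum of three IID U(0,1) random variables, using the standard U(0,1) MGF (e^t - 1)/t; the sample size and the value of t are arbitrary choices:

```python
import math
import random

# Estimate M_Y(t) = E{e^(tY)} for Y = X1 + X2 + X3 with Xi ~ U(0,1),
# and compare with [(e^t - 1)/t]^3 as predicted by Corollary A.6.
random.seed(1)
t, n, trials = 0.5, 3, 200_000

est = sum(math.exp(t * sum(random.random() for _ in range(n)))
          for _ in range(trials)) / trials
exact = ((math.exp(t) - 1.0) / t) ** n

print(abs(est - exact) < 0.02)  # True (within Monte Carlo error)
```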

A.3.3 Marginal Distributions

The MGF provides a convenient way of evaluating marginal distributions. To show this, consider partitioning a random variable X as follows

X = [X1; X2] .    (A.69)

Then, the moment generating function of X is

M_X(t) = E{e^(tᵀX)}
  = E{e^(t1ᵀX1 + t2ᵀX2)}
  = ∬ e^(t1ᵀx1 + t2ᵀx2) f(x1, x2) dx1 dx2 .    (A.70)

Suppose the MGF is evaluated at t2 = 0:

M_X(t1, t2 = 0) = ∬ e^(t1ᵀx1) f(x1, x2) dx1 dx2
  = ∫ e^(t1ᵀx1) f(x1) dx1 .    (A.71)

That is, the MGF of X1 can be obtained by evaluating the joint MGF at t2 = 0

M_X1(t1) = M_X1,X2(t1, t2 = 0) .    (A.72)

A.4 Normal (Gaussian) Distribution

The (univariate) Normal probability density function (PDF) is given by

f(x|µ, σ) = (1/(√(2π) σ)) e^(-(x-µ)²/(2σ²)) .    (A.73)

A R.V. X that has a normal density is often denoted by X ∼ N(µ, σ²). The normal PDF can be evaluated using the MATLAB normpdf command

f(x|µ, σ) = normpdf(x, µ, σ) .    (A.74)

Similarly, the normal CDF can be evaluated using the MATLAB normcdf command

F(x|µ, σ) = normcdf(x, µ, σ) .    (A.75)

Theorem A.7. The MGF of the normal distribution is given by

M_X(t) = e^(µt + σ²t²/2) .    (A.76)

Proof. This is a special case of the multivariate Gaussian random variable, and so the more general proof applies (see Theorem A.14).

Theorem A.8. The mean and variance of a normally distributed R.V. are given by

µ_x = µ    (A.77)
σ_x² = σ² .    (A.78)

Proof. This is a special case of the multivariate Gaussian random variable, and so the more general proof applies (see Theorem A.15).

A normal R.V. is usually converted to a standardized normal R.V. Z that has zero mean and unit variance through the transformation

Z = (X - µ)/σ    or    X = Zσ + µ .    (A.79)

One can show that Z ∼ N(0, 1) using standard transformation techniques from probability theory. Since the transformation is one-to-one,

f_Z(z) = (1/|dz/dx|) f_X(zσ + µ)
  = σ (1/(√(2π) σ)) e^(-(zσ+µ-µ)²/(2σ²))
  = (1/√(2π)) e^(-z²/2)
  ∼ N(0, 1) .    (A.80)

This same relationship may also be obtained by a simple change of variables in the cumulative distribution integral of N(µ, σ²). A table of the Gaussian CDF is given in [56, pp. 638-639].

Example A.1. Let X be distributed as N(2, 9). Find the probability that X is less than 8; that is, find P(X < 8) = P(X ≤ 8), since P(X = 8) = 0. Using the standard normal transformation given by Eq. A.79 gives

Z = (X - 2)/3    or    X = 3Z + 2 ,    (A.81)

and

P(X ≤ 8) = P(3Z + 2 ≤ 8) = P(Z ≤ (8 - 2)/3) = P(Z ≤ 2) ,    (A.82)

where Z ∼ N(0, 1), as required to use a standard normal table. Using the table found in [56, pp. 638-639], this probability is 0.97725

P(X ≤ 8) = P(Z ≤ 2) = 0.97725 .    (A.83)

This same result could have been found using MATLAB

P(X ≤ 8) = F(8|2, 3) = normcdf(8, 2, 3) = 0.97724986805182 .    (A.84)
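For readers without MATLAB, the same number can be reproduced with Python's standard library. Note that math.erf uses the common convention rather than the one defined later in Eq. A.87, so Φ(z) = (1 + erf(z/√2))/2; the helper normcdf below is illustrative, not a library function:

```python
import math

def normcdf(x, mu, sigma):
    """Gaussian CDF via the standardization of Eq. A.79 and math.erf."""
    z = (x - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

print(normcdf(8, 2, 3))  # approximately 0.97725, matching Eq. A.84
```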

A.4.1 Confidence Intervals

When one is interested in determining how large a random variable will be, the cu- mulative distribution is usually used to determine an upper limit with a specified probability. This limit corresponds to the argument of the cumulative distribution that yields the specified probability level. In other words, to say that a random vari- able X is less than x with 95% probability (confidence level) means that x corresponds to the 95 percentile of the random variable’s probability distribution

X x 95% probability P (X x)=0.95 .(A.85) ≤ ⇔ ≤ For a given probability (confidence level), the value of x is usually obtained from a table of the CDF or from an inverse CDF function. An inverse CDF function requires the same distributional parameters as the CDF function would. While the CDF uses x to compute a probability, the inverse CDF uses the probability to compute a value of x. An inverse CDF is guaranteed to exist and be unique because the CDF is always a one-to-one mapping. When one is interested in dispersion about the mean of a random variable then a region about the mean is chosen such that the space within the region will contain the random variable with a specified probability. It is especially convenient if the region can be normalized to a random variables distributional parameters. For Gaussian random variables one is interested in the region M M 1 θ2/2 1 θ2/2 P ( Z M)= e− dθ =2 e− dθ =2erf(M) ,(A.86) | | ≤ √2π M √2π 0 Z− Z where M is a positive integer (i.e. a natural number) and erf ( ) is the error function · defined by x 1 θ2/2 erf (x)= e− dθ .(A.87) √2π Z0 302

There are slight variations on the definition of the error function, but the one given by Eq. A.87 is the one used here. Substituting Eq. A.79 into Eq. A.86 gives

P(|Z| ≤ M) = P(|X - µ|/σ ≤ M) = P(µ - Mσ ≤ X ≤ µ + Mσ) = 2 erf(M) .    (A.88)

This result indicates that M is the width of the confidence interval in terms of the number of standard deviations. This equation can also be expressed as follows.

X ∈ µ ± Mσ with 2 erf(M) × 100% confidence    (A.89)

Typical values of M are one, two, and three:

X ∈ µ ± σ     68.3% confidence
X ∈ µ ± 2σ    95.4% confidence    (A.90)
X ∈ µ ± 3σ    99.7% confidence

The last result indicates that one can be 99.7% certain that the value of X will be between plus or minus three standard deviations of the mean µ.
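Eq. A.90 can be reproduced numerically. Because the dissertation's erf (Eq. A.87) differs from the common library convention, the confidence level 2 erf(M) equals math.erf(M/√2) in Python:

```python
import math

# Confidence level for the region mu +/- M*sigma of a Gaussian:
# 2*erf_doc(M) = math.erf(M / sqrt(2)).
for M in (1, 2, 3):
    level = math.erf(M / math.sqrt(2.0))
    print(f"mu +/- {M} sigma: {100*level:.1f}% confidence")

# mu +/- 1 sigma: 68.3% confidence
# mu +/- 2 sigma: 95.4% confidence
# mu +/- 3 sigma: 99.7% confidence
```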

A.4.2 Sum and Difference of Independent Normally Distributed Random Variables

There are many ways to show that a sum of normally distributed random variables is also a normally distributed random variable. One such method is to use Theorem A.5:

Y = Σ_{i=1}^{n} Xi    (A.91a)
M_Y(t) = Π_{i=1}^{n} M_Xi(t) ,    (A.91b)

where M_Xi(t) is the MGF of an RV Xi that is N(µi, σi²). Using this result and Eq. A.76 gives

M_Y(t) = Π_{i=1}^{n} e^(µi t + σi² t²/2) = exp{ (Σ_{i=1}^{n} µi) t + (Σ_{i=1}^{n} σi²) t²/2 } .    (A.92)

Thus, by the uniqueness of the MGF, we have

n n 2 y N µi, σi .(A.93) ∼ Ã i=1 i=1 ! X X A similar problem is the difference between two normal random variables d = x1 x2 −

2 xi N µ ,σ (A.94a) ∼ i i

d = x1 ¡ x2 .(A.94b)¢ −

It is easy to show, using moment generating functions, that

d N µ µ ,σ2 + σ2 .(A.95) ∼ 1 − 2 1 2 ¡ ¢ These same results could have been arrived at by recognizing that summations and differences can be considered linear transformations of a multivariate Gaussian ran- dom variable X.
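Eq. A.95 can be spot-checked by simulation. The sketch below (stdlib Python; the distribution parameters are illustrative, not from the text) draws the difference of two independent normals and compares the sample mean and variance against µ₁ − µ₂ and σ₁² + σ₂²:

```python
import random

random.seed(0)
N = 200_000

# X1 ~ N(1, 2^2), X2 ~ N(3, 1.5^2); illustrative parameters only
d = [random.gauss(1, 2) - random.gauss(3, 1.5) for _ in range(N)]

mean_d = sum(d) / N
var_d = sum((x - mean_d) ** 2 for x in d) / N

# Eq. A.95 predicts d ~ N(1 - 3, 2^2 + 1.5^2) = N(-2, 6.25)
print(mean_d, var_d)  # approximately -2 and 6.25
```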

A.4.3 Relationship to Chi-Square Distribution

The primary reason for introducing the chi-square distribution is its relationship to the Gaussian distribution.

Theorem A.9. Given n independent N(0,1) random variables Z_i, we form the sum of their squares

X = \sum_{i=1}^{n} Z_i^2 . \quad (A.96)

The distribution of X is \chi^2(n).

Proof. Suppose that a random variable Z is N(0,1) and a transformation is made, Y = Z^2. This problem is addressed in [63, p. 133]; the solution is straightforward. The function y = z^2 has two solutions,

z = \pm\sqrt{y} , \quad (A.97)

and

\frac{dy}{dz} = 2z = \pm 2\sqrt{y} . \quad (A.98)

Thus, the density of Y is given by

f_Y(y) = \left[ \left. \frac{1}{|dy/dz|} \right|_{z=-\sqrt{y}} f_Z(-\sqrt{y}) + \left. \frac{1}{|dy/dz|} \right|_{z=+\sqrt{y}} f_Z(+\sqrt{y}) \right] S(y)
= \left[ \frac{1}{2\sqrt{y}} \frac{1}{\sqrt{2\pi}} e^{-y/2} + \frac{1}{2\sqrt{y}} \frac{1}{\sqrt{2\pi}} e^{-y/2} \right] S(y)
= \frac{1}{\sqrt{2\pi y}}\, e^{-y/2}\, S(y) , \quad (A.99)

where S(y) is the step function that is zero for all y < 0 and one otherwise. The density of Y is the density of a chi-square R.V. with one degree of freedom, Y \sim \chi^2(1). Now, let X be the sum of the Y_i:

X = \sum_{i=1}^{n} Y_i = \sum_{i=1}^{n} Z_i^2 . \quad (A.100)

Each of the Y_i is \chi^2(1). That is, X is a sum of chi-square distributed R.V.'s, each with one degree of freedom. It is shown in Section A.6 that the sum has a distribution \chi^2(n).
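Theorem A.9 can be checked empirically: the first two moments of a sum of n squared standard normals should match the chi-square values n and 2n (Eq. A.127, derived below). A stdlib Python sketch with illustrative n:

```python
import random

random.seed(1)
n, N = 5, 100_000

# X = sum of n squared N(0,1) variables should be chi-square with n dof
samples = [sum(random.gauss(0, 1) ** 2 for _ in range(n)) for _ in range(N)]

mean_x = sum(samples) / N
var_x = sum((s - mean_x) ** 2 for s in samples) / N

# chi-square(n) has mean n and variance 2n
print(mean_x, var_x)  # approximately 5 and 10
```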

A.5 Gamma Distribution

The basis for the gamma distribution is the gamma function defined by

\Gamma(\alpha) = \int_0^{\infty} t^{\alpha-1} e^{-t}\, dt . \quad (A.101)

It is not necessary to have much insight into the gamma function for it to be useful in this context. Suffice it to say that the gamma function is finite and has a positive integrand for any positive constant α. This makes the function appealing from a probability density standpoint, in which all densities must be non-negative and integrate to unity.

Theorem A.10. The gamma function satisfies the recursion property

\Gamma(\alpha + 1) = \alpha\,\Gamma(\alpha) . \quad (A.102)

Proof. The gamma function, defined by Eq. A.101, can be evaluated at α + 1:

\Gamma(\alpha + 1) = \int_0^{\infty} x^{\alpha} e^{-x}\, dx . \quad (A.103)

Integrating by parts with u = x^{\alpha} and v' = e^{-x} gives

u' = \alpha x^{\alpha-1} \quad (A.104a)

v = -e^{-x} , \quad (A.104b)

and

\Gamma(\alpha + 1) = \left[ u \cdot v \right]_0^{\infty} - \int_0^{\infty} u' \cdot v\, dx
= \left[ -x^{\alpha} e^{-x} \right]_0^{\infty} + \int_0^{\infty} \alpha x^{\alpha-1} e^{-x}\, dx
= \alpha \int_0^{\infty} x^{\alpha-1} e^{-x}\, dx
= \alpha\,\Gamma(\alpha) . \quad (A.105)

Q.E.D.
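The recursion of Eq. A.102 is exercised directly by the standard library's gamma function, which is a quick sanity check on the theorem for non-integer as well as integer arguments:

```python
import math

# Check Gamma(alpha + 1) = alpha * Gamma(alpha) for a few values of alpha,
# including non-integers; math.gamma implements Eq. A.101.
for alpha in (0.5, 1.7, 3.0, 6.25):
    lhs = math.gamma(alpha + 1)
    rhs = alpha * math.gamma(alpha)
    assert abs(lhs - rhs) < 1e-9 * max(1.0, abs(lhs))

print(math.gamma(4))  # prints 6.0, since Gamma(n+1) = n! for integer n
```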

One can define a probability density function using the integrand of the gamma function,

f(t|\alpha) = \frac{t^{\alpha-1} e^{-t}}{\Gamma(\alpha)}\, S(t) , \quad (A.106)

where S(t) is the step function which has a value of 0 if t is less than zero and one otherwise. The \Gamma(\alpha) in the denominator ensures that the cumulative distribution will have a maximum value of one:

F(t|\alpha) = \int_{-\infty}^{t} f(\tau|\alpha)\, d\tau = \int_0^{t} \frac{\tau^{\alpha-1} e^{-\tau}}{\Gamma(\alpha)}\, d\tau = \frac{1}{\Gamma(\alpha)} \int_0^{t} \tau^{\alpha-1} e^{-\tau}\, d\tau \quad (A.107a)

F(\infty|\alpha) = \frac{1}{\Gamma(\alpha)} \int_0^{\infty} \tau^{\alpha-1} e^{-\tau}\, d\tau = \frac{1}{\Gamma(\alpha)}\, \Gamma(\alpha) = 1 . \quad (A.107b)

This is not quite the general gamma density because it only has one parameter, α. Using a change of variables t = x/β in the cumulative distribution integral, one arrives at the general two-parameter (α, β) gamma density function

f(x|\alpha, \beta) = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}}\, x^{\alpha-1} e^{-x/\beta}\, S(x) . \quad (A.108)

Theorem A.11. The gamma PDF is non-negative and integrates to one, as required of any PDF, so as to satisfy the axioms of probability.

Proof. Each component of the gamma density is non-negative for any positive values

of α, β and x. When x is less than zero, it is possible that the term x^{\alpha-1} is less than zero. However, the step function S(x) is zero for all values of x less than zero. Therefore, the gamma density is non-negative for all values of x as long as α and β are greater than zero. To be a density function that will satisfy the axioms of probability (namely, that the certain event has a probability of one), the density function must integrate to unity:

F(\infty|\alpha,\beta) = \int_{-\infty}^{\infty} f(x|\alpha,\beta)\, dx = \int_0^{\infty} \frac{1}{\Gamma(\alpha)\beta^{\alpha}}\, x^{\alpha-1} e^{-x/\beta}\, dx = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \int_0^{\infty} x^{\alpha-1} e^{-x/\beta}\, dx . \quad (A.109)

Let t = x/β, then dt = dx/β and

\frac{1}{\Gamma(\alpha)\beta^{\alpha}} \int_0^{\infty} x^{\alpha-1} e^{-x/\beta}\, dx = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \int_0^{\infty} (\beta t)^{\alpha-1} e^{-t}\, \beta\, dt = \frac{\beta^{\alpha}}{\Gamma(\alpha)\beta^{\alpha}} \int_0^{\infty} t^{\alpha-1} e^{-t}\, dt = \frac{\Gamma(\alpha)}{\Gamma(\alpha)} = 1 . \quad (A.110)

Q.E.D.

The gamma PDF can be evaluated using the MATLAB gampdf command

f(x|\alpha, \beta) = \text{gampdf}(x, \alpha, \beta) . \quad (A.111)

Similarly, the gamma CDF can be evaluated using the MATLAB gamcdf command

F(x|\alpha, \beta) = \text{gamcdf}(x, \alpha, \beta) . \quad (A.112)

Theorem A.12. The MGF of the gamma distribution is given by

M_X(t) = (1 - \beta t)^{-\alpha} . \quad (A.113)

Proof. The MGF of an RV X that has a gamma distribution is given by

M_X(t) = E\{e^{tX}\} = \int_{-\infty}^{\infty} e^{tx} f_X(x)\, dx = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \int_0^{\infty} e^{tx}\, x^{\alpha-1} e^{-x/\beta}\, dx = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \int_0^{\infty} x^{\alpha-1} e^{-x\left(\frac{1}{\beta} - t\right)}\, dx . \quad (A.114)

The goal at this point is to reduce the integral to a form that is equivalent to the gamma function given by Equation A.101. A change of variables y = x\left(\frac{1}{\beta} - t\right), valid for t < 1/\beta, gives

M_X(t) = \frac{1}{\Gamma(\alpha)\beta^{\alpha}} \int_0^{\infty} \frac{y^{\alpha-1}}{\left(\frac{1}{\beta} - t\right)^{\alpha-1}}\, e^{-y}\, \frac{dy}{\frac{1}{\beta} - t}
= \frac{1}{\Gamma(\alpha)\beta^{\alpha} \left(\frac{1}{\beta} - t\right)^{\alpha}} \int_0^{\infty} y^{\alpha-1} e^{-y}\, dy
= \frac{\Gamma(\alpha)}{\Gamma(\alpha)\beta^{\alpha} \left(\frac{1}{\beta} - t\right)^{\alpha}}
= \frac{1}{(1 - \beta t)^{\alpha}} . \quad (A.115)

Q.E.D.

The following theorem and proof use the gamma distribution's MGF to calculate the mean and variance of a gamma random variable.

Theorem A.13. The mean and variance of a gamma distributed R.V. are given by

\mu_X = \alpha\beta \quad (A.116)

\sigma_X^2 = \alpha\beta^2 . \quad (A.117)

Proof. The MGF can be used to compute the mean:

\mu_X = E\{X\} = \left. \frac{dM_X(t)}{dt} \right|_{t=0} = \left. \frac{\alpha\beta}{(1 - \beta t)^{\alpha+1}} \right|_{t=0} = \alpha\beta . \quad (A.118)

The MGF can also be used to compute the second moment:

E\{X^2\} = \left. \frac{d^2 M_X(t)}{dt^2} \right|_{t=0} = \left. \frac{(\alpha+1)\,\alpha\beta^2}{(1 - \beta t)^{\alpha+2}} \right|_{t=0} = (\alpha+1)\,\alpha\beta^2 . \quad (A.119)

Therefore, the variance is given by

\sigma_X^2 = E\{(X - \mu_X)^2\} = E\{X^2\} - \mu_X^2 = \alpha\beta^2 . \quad (A.120)

Q.E.D.
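Eqs. A.116 and A.117 can be verified by integrating the density of Eq. A.108 numerically. A stdlib Python sketch (the shape/scale pair α = 3, β = 2 is illustrative; the integral is truncated at x = 100, where the tail mass is negligible for these parameters):

```python
import math

def gamma_pdf(x, alpha, beta):
    """Gamma density of Eq. A.108 (shape alpha, scale beta), for x > 0."""
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)

alpha, beta = 3.0, 2.0
dx = 0.001
xs = [i * dx for i in range(1, 100_000)]  # Riemann sum over (0, 100)

m1 = sum(x * gamma_pdf(x, alpha, beta) for x in xs) * dx      # E{X}
m2 = sum(x * x * gamma_pdf(x, alpha, beta) for x in xs) * dx  # E{X^2}

print(m1, m2 - m1 ** 2)  # approximately alpha*beta = 6 and alpha*beta^2 = 12
```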

An R.V. X that has a gamma distribution is often denoted by X \sim \text{gamma}(\alpha, \beta) or X \sim G(\alpha, \beta).

A.5.1 Sum of Independently Distributed Gamma Random Variables

Suppose that a set of independent random variables X_i are gamma distributed with parameters \alpha_i and β:

X_i \sim G(\alpha_i, \beta) . \quad (A.121)

The algebraic sum of these random variables is given by

Y = \sum_{i=1}^{n} X_i . \quad (A.122)

The MGF of Y can be found using Theorem A.5 and Eq. A.113:

M_Y(t) = \prod_{i=1}^{n} M_{X_i}(t) = \prod_{i=1}^{n} (1 - \beta t)^{-\alpha_i} = (1 - \beta t)^{-(\alpha_1 + \alpha_2 + \cdots + \alpha_n)} . \quad (A.123)

Because of the uniqueness of the MGF, the RV Y is also gamma with parameters \alpha = \sum_{i=1}^{n} \alpha_i and β:

Y \sim G\!\left( \sum_{i=1}^{n} \alpha_i ,\ \beta \right) . \quad (A.124)

A.6 Chi-Square Distribution

An R.V. X that has a chi-square distribution with n degrees of freedom is often denoted by X \sim \chi^2(n). The chi-square distribution is a special case of the gamma distribution when \alpha = n/2 and \beta = 2:

\chi^2(n) = G(n/2,\ 2) . \quad (A.125)

Accordingly, the chi-square PDF is defined by

f_X(x|n) = \frac{x^{n/2-1} e^{-x/2}}{2^{n/2}\,\Gamma(n/2)}\, S(x) . \quad (A.126)

Since the chi-square R.V. is a special case of the gamma R.V., all statements about the gamma R.V. apply to the chi-square random variable. In particular, the mean and variance are given by Eqs. A.116-A.117,

\mu_x = n \quad (A.127a)

\sigma_x^2 = 2n , \quad (A.127b)

and the MGF is given by

M_X(t) = (1 - 2t)^{-n/2} . \quad (A.128)

The chi-square PDF can be evaluated using the MATLAB chi2pdf command

f(x|n) = \text{chi2pdf}(x, n) . \quad (A.129)

Similarly, the chi-square CDF can be evaluated using the MATLAB chi2cdf command

F(x|n) = \text{chi2cdf}(x, n) . \quad (A.130)

A table of the \chi^2(n) CDF is provided in [56, p. 641]. To use the table, define \chi^2_{\alpha,n} such that

P\!\left( X < \chi^2_{\alpha,n} \right) = 1 - \alpha . \quad (A.131)

Given a particular value of α, one can use a table to determine \chi^2_{\alpha,n}, or the MATLAB command chi2inv:

\chi^2_{\alpha,n} = \text{chi2inv}(1 - \alpha,\ n) . \quad (A.132)

Example A.2. Let X be a random variable that is chi-square with 10 degrees of freedom; i.e. X \sim \chi^2(10). We wish to find a value that we can be 95% confident that X is less than. Therefore, \alpha = 0.05 and, from the table in [56, p. 641], the value of \chi^2_{0.05,10} is found to be 18.31:

P(X < 18.31) = 0.95 . \quad (A.133)

This same result could have been found using MATLAB:

\chi^2_{0.05,10} = \text{chi2inv}(0.95, 10) = 18.30703805327514 . \quad (A.134)
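The chi2inv result of Eq. A.134 can be reproduced without MATLAB. For even degrees of freedom the chi-square survival function has the closed form e^{-x/2} Σ_{k=0}^{n/2-1} (x/2)^k / k!, so the CDF can be inverted by simple bisection. A stdlib Python sketch (the function names are hypothetical stand-ins for chi2cdf/chi2inv):

```python
import math

def chi2_cdf_even(x, n):
    """Chi-square CDF, valid for even degrees of freedom n only."""
    s = sum((x / 2) ** k / math.factorial(k) for k in range(n // 2))
    return 1.0 - math.exp(-x / 2) * s

def chi2_inv_even(p, n, lo=0.0, hi=1e3):
    """Invert the CDF by bisection (a stand-in for MATLAB's chi2inv)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if chi2_cdf_even(mid, n) < p else (lo, mid)
    return 0.5 * (lo + hi)

print(chi2_inv_even(0.95, 10))  # approximately 18.307, matching Eq. A.134
```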

Example A.3. As with Example A.2, let X \sim \chi^2(10). In Example A.2, we actually found an interval in which X would very likely (95% likely) be found. That is,

P(0 < X < 18.31) = 0.95 . \quad (A.135)

In effect, the previously found interval is the 0 to 95 percentile range for the random variable. Other such intervals can be examined. For example, we could find the probability that X is in the 2.5 to 97.5 percentile range. We could be 95% confident that X would be in this interval:

P\!\left( \chi^2_{1-\alpha/2,n} < X < \chi^2_{\alpha/2,n} \right) = 1 - \alpha . \quad (A.136)

To see this, define the events

A = \left\{ \chi^2_{1-\alpha/2,n} < X < \chi^2_{\alpha/2,n} \right\} , \qquad B = \left\{ X < \chi^2_{1-\alpha/2,n} \right\} , \quad (A.137)

so that

A \cap B = \emptyset . \quad (A.138)

The union of A and B is the set

A \cup B = \left\{ X < \chi^2_{\alpha/2,n} \right\} . \quad (A.139)

Since the intersection of A and B is the null set, we have

P(A \cup B) = P(A) + P(B) - P(A \cap B) = P(A) + P(B) . \quad (A.140)

Solving for P(A) gives

P(A) = P(A \cup B) - P(B) , \quad (A.141)

or

P\!\left( \chi^2_{1-\alpha/2,n} < X < \chi^2_{\alpha/2,n} \right) = P\!\left( X < \chi^2_{\alpha/2,n} \right) - P\!\left( X < \chi^2_{1-\alpha/2,n} \right) = \left( 1 - \frac{\alpha}{2} \right) - \frac{\alpha}{2} = 1 - \alpha . \quad (A.142)
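The two-sided interval of Eq. A.136 can be checked numerically. For n = 10 and α = 0.05, standard tables give the endpoints χ²_{0.975,10} ≈ 3.247 and χ²_{0.025,10} ≈ 20.48 (the 2.5 and 97.5 percentiles). A stdlib Python sketch using the closed-form even-n chi-square CDF:

```python
import math

def chi2_cdf_even(x, n):
    # closed-form chi-square CDF, valid for even degrees of freedom
    s = sum((x / 2) ** k / math.factorial(k) for k in range(n // 2))
    return 1.0 - math.exp(-x / 2) * s

n, alpha = 10, 0.05
lower, upper = 3.247, 20.48  # standard table values for the 2.5/97.5 percentiles

prob = chi2_cdf_even(upper, n) - chi2_cdf_even(lower, n)
print(prob)  # approximately 0.95 = 1 - alpha, as in Eq. A.142
```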

A.6.1 Sum of Independently Distributed Chi-Square Random Variables

The sum of independent gamma R.V.'s was discussed in Section A.5.1. From these results, it directly follows that the sum of m independent chi-square R.V.'s, each with n = 1 degree of freedom, is chi-square with n = m degrees of freedom. That is, if

X_i \sim \chi^2(1) = G(1/2,\ 2) , \quad (A.144)

and

Y = \sum_{i=1}^{m} X_i , \quad (A.145)

then

Y \sim G(m/2,\ 2) = \chi^2(m) . \quad (A.146)

A.7 Multivariate Normal Distribution

The multivariate normal distribution is of prime importance to this dissertation and specifically to the discussion of estimation. Let the set of random variables X_i be arranged in vector form

X = \begin{bmatrix} X_1 & X_2 & \cdots & X_n \end{bmatrix}^T . \quad (A.147)

The expected value of the vector is

\mu \triangleq E\{X\} = \begin{bmatrix} E\{X_1\} & E\{X_2\} & \cdots & E\{X_n\} \end{bmatrix}^T . \quad (A.148)

The covariance matrix is

V \triangleq E\left\{ (X - \mu)(X - \mu)^T \right\} = \begin{bmatrix} \sigma^2_{11} & \sigma^2_{12} & \cdots & \sigma^2_{1n} \\ \sigma^2_{21} & \sigma^2_{22} & \cdots & \sigma^2_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma^2_{n1} & \sigma^2_{n2} & \cdots & \sigma^2_{nn} \end{bmatrix} , \quad (A.149)

where

\sigma^2_{ij} = \sigma^2_{ji} = E\left\{ (x_i - \mu_i)(x_j - \mu_j) \right\} . \quad (A.150)

Of course, when X_i is independent of X_j, then \sigma_{ij} = 0. The multivariate normal density function is defined by

f_X(x_1, x_2, \ldots, x_n) = \frac{e^{-\frac{1}{2}(x-\mu)^T V^{-1} (x-\mu)}}{(2\pi)^{n/2}\, |V|^{1/2}} . \quad (A.151)

To indicate that a random variable X has a PDF given by Eq. A.151, the following notation is used:

X \sim N(\mu, V) . \quad (A.152)

In order for Eq. A.151 to be a valid PDF, it must satisfy certain properties (to conform with the axioms of probability): namely, it must be non-negative and integrate to one. The following lemma is used to show that Eq. A.151 does indeed integrate to one.

Lemma A.1. For a positive definite symmetric matrix A of order n, the following integral (known as Aitken's integral) is satisfied:

\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-\frac{1}{2} x^T A x}\, dx_1 \cdots dx_n = (2\pi)^{n/2}\, |A|^{-1/2} . \quad (A.153)

Proof. Because A is positive definite there exists a non-singular matrix P such that P^T A P = I_n. Hence |P^T A P| = |P|^2 |A| = 1, and so |P| = |A|^{-1/2}; and letting x = Py gives x^T A x = y^T P^T A P y = y^T y, and so

\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-\frac{1}{2} x^T A x}\, dx_1 \cdots dx_n = |P| \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-\frac{1}{2} y^T y}\, dy_1 \cdots dy_n
= |P| \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-\frac{1}{2} \sum_{i=1}^{n} y_i^2}\, dy_1 \cdots dy_n
= |P| \prod_{i=1}^{n} \left\{ \int_{-\infty}^{\infty} e^{-\frac{1}{2} y_i^2}\, dy_i \right\}
= |P|\, (2\pi)^{n/2}
= (2\pi)^{n/2}\, |A|^{-1/2} . \quad (A.154)

Rearranging gives

\frac{1}{(2\pi)^{n/2}\, |A|^{-1/2}} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{-\frac{1}{2} x^T A x}\, dx_1 \cdots dx_n = 1 , \quad (A.155)

which is required for a probability density function of the form

f(x_1, x_2, \ldots, x_n) = \frac{1}{(2\pi)^{n/2}\, |A|^{-1/2}}\, e^{-\frac{1}{2} x^T A x} . \quad (A.156)

An alternate proof is given by [76, p. 43].

A simple application of Lemma A.1 and the variable transformation z = x - \mu can be used to show that the multivariate Gaussian PDF, given by Eq. A.151, does integrate to one, as required of any density function. The other requirement on a PDF, that it must be non-negative, is clearly evident from Eq. A.151, with all terms being non-negative. It is often easier to prove certain properties of a random variable by working with the random variable's moment generating function rather than the PDF itself. For this reason, the MGF of the multivariate Gaussian random variable is given in the following theorem.

Theorem A.14. The MGF of the multivariate Gaussian random variable is given by

M_X(t) = E\left\{ e^{t^T x} \right\} = e^{\left( t^T \mu + \frac{1}{2} t^T V t \right)} .

Proof. By definition,

M_X(t) = E\left\{ e^{t^T x} \right\}
= \frac{1}{(2\pi)^{n/2} |V|^{1/2}} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{t^T x}\, e^{-\frac{1}{2}(x-\mu)^T V^{-1}(x-\mu)}\, dx_1 \cdots dx_n
= \frac{1}{(2\pi)^{n/2} |V|^{1/2}} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{\left[ -\frac{1}{2}(x - \mu - Vt)^T V^{-1}(x - \mu - Vt) + t^T \mu + \frac{1}{2} t^T V t \right]}\, dx_1 \cdots dx_n
= e^{\left( t^T \mu + \frac{1}{2} t^T V t \right)} \frac{1}{(2\pi)^{n/2} |V|^{1/2}} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} e^{\left[ -\frac{1}{2}(x - \mu - Vt)^T V^{-1}(x - \mu - Vt) \right]}\, dx_1 \cdots dx_n
= e^{\left( t^T \mu + \frac{1}{2} t^T V t \right)} . \quad (A.157)

The last result follows because the integral term is that of a multivariate Gaussian RV with N(\mu + Vt,\ V), which integrates to one.

Theorem A.15. The mean and covariance of the multivariate Gaussian density function are the density parameters µ and V.

Proof. The mean can be found by use of the MGF:

E\{X\} = \left. \frac{\partial M_X(t)}{\partial t} \right|_{t=0} = \left. \frac{\partial}{\partial t}\, e^{\left( t^T \mu + \frac{1}{2} t^T V t \right)} \right|_{t=0} = \left. \left( \mu^T + t^T V \right)^T M_X(t) \right|_{t=0} = \mu . \quad (A.158)

Similarly, the correlation of X can be evaluated using the MGF:

E\left\{ X X^T \right\} = \left. \frac{\partial}{\partial t} \left( \frac{\partial M_X(t)}{\partial t} \right)^T \right|_{t=0}
= \left. \frac{\partial}{\partial t} \left[ \left( \mu + V^T t \right) M_X(t) \right] \right|_{t=0}
= \left. \left[ V^T M_X(t) + \left( \mu + V^T t \right) \left( \mu^T + t^T V \right) M_X(t) \right] \right|_{t=0}
= \mu\mu^T + V^T . \quad (A.159)

The covariance is then given by

\mathrm{cov}(X) = E\left\{ (X - \mu_X)(X - \mu_X)^T \right\} = E\left\{ X X^T \right\} - \mu\mu^T = V^T . \quad (A.160)

Since the covariance matrix is symmetric,

\mathrm{cov}(X) = V . \quad (A.161)

Linear Transformations One of the most attractive features of the multivariate Gaussian PDF is that normality is preserved under linear transformations.

Theorem A.16. A linear transformation of the form Y = AX, with X \sim N(\mu, V), preserves normality, yielding Y \sim N\!\left( A\mu,\ A V A^T \right).

Proof. If the matrix A is square and of full rank, it has an inverse A^{-1} and

f_X(x) = \frac{e^{-\frac{1}{2}(x-\mu)^T V^{-1}(x-\mu)}}{(2\pi)^{n/2} |V|^{1/2}} \quad (A.162)

f_Y(y) = \frac{1}{|A|}\, f_X\!\left( A^{-1} y \right)
= \frac{1}{|A|}\, \frac{e^{-\frac{1}{2}(A^{-1}y - \mu)^T V^{-1}(A^{-1}y - \mu)}}{(2\pi)^{n/2} |V|^{1/2}}
= \frac{e^{-\frac{1}{2}(y - A\mu)^T (A^T)^{-1} V^{-1} A^{-1}(y - A\mu)}}{(2\pi)^{n/2} |A V A^T|^{1/2}}
= \frac{e^{-\frac{1}{2}(y - A\mu)^T (A V A^T)^{-1}(y - A\mu)}}{(2\pi)^{n/2} |A V A^T|^{1/2}}
= N\!\left( A\mu,\ A V A^T \right) . \quad (A.163)

When A is not square but is of full rank, the same result holds; the proof is common in many statistics books. A simple proof that makes use of the moment generating function follows:

M_Y(t) = E\left\{ e^{t^T Y} \right\} = E\left\{ e^{t^T A X} \right\}
= \frac{1}{(2\pi)^{n/2} |V|^{1/2}} \int_{-\infty}^{\infty} e^{t^T A x}\, e^{-\frac{1}{2}(x-\mu)^T V^{-1}(x-\mu)}\, dx
= e^{\left( t^T A\mu + \frac{1}{2} t^T A V A^T t \right)} \frac{1}{(2\pi)^{n/2} |V|^{1/2}} \int_{-\infty}^{\infty} e^{\left[ -\frac{1}{2}(x - \mu - V A^T t)^T V^{-1}(x - \mu - V A^T t) \right]}\, dx
= e^{\left( t^T A\mu + \frac{1}{2} t^T A V A^T t \right)} . \quad (A.164)

This is the moment generating function of an RV with PDF given by N(A\mu, A V A^T). Since the moment generating function uniquely defines the PDF, the RV Y must be N(A\mu, A V A^T).

Marginal and Conditional Distributions Another convenient property of multivariate normal random variables is that marginal and conditional random variables are also normally distributed. To prove these properties, the RV X is partitioned as follows:

X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} , \quad (A.165)

where X_1 has dimension k \times 1 and X_2 has dimension (n-k) \times 1. The associated partitioning of the mean and covariance matrices is given by

\mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} , \quad (A.166)

and

V = \begin{bmatrix} V_{11} & V_{12} \\ V_{12}^T & V_{22} \end{bmatrix} . \quad (A.167)

Lemma A.2. The marginal distributions of a multivariate Gaussian random variable are also Gaussian:

X_1 \sim N(\mu_1, V_{11}) \quad (A.168)

X_2 \sim N(\mu_2, V_{22}) . \quad (A.169)

Proof. Making use of Theorem A.14, the MGF of X is given by

M_X(t) = M_{X_1,X_2}(t_1, t_2) = e^{\left( t^T \mu + \frac{1}{2} t^T V t \right)} . \quad (A.170)

The moment generating function (MGF) of the marginal random variable (RV) X_1 can be obtained using Eq. A.72:

M_{X_1}(t_1) = M_{X_1,X_2}(t_1, t_2 = 0) = e^{\left( t_1^T \mu_1 + \frac{1}{2} t_1^T V_{11} t_1 \right)} . \quad (A.171)

Since the MGF of the marginal RV X_1 has the form of a Gaussian RV, we can conclude the PDF of X_1 is given by

f_{X_1}(x_1) = \frac{e^{-\frac{1}{2}(x_1 - \mu_1)^T V_{11}^{-1}(x_1 - \mu_1)}}{(2\pi)^{k/2}\, |V_{11}|^{1/2}} . \quad (A.172)

An identical result holds for X_2:

f_{X_2}(x_2) = \frac{e^{-\frac{1}{2}(x_2 - \mu_2)^T V_{22}^{-1}(x_2 - \mu_2)}}{(2\pi)^{(n-k)/2}\, |V_{22}|^{1/2}} . \quad (A.173)
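Theorem A.16 and Lemma A.2 can be exercised together by simulation: construct a correlated Gaussian vector as a linear transformation of standard normals, then check that each marginal has the mean and variance read off the diagonal of V = A A^T. A stdlib Python sketch with an illustrative 2x2 transformation (not from the text):

```python
import random

random.seed(2)
N = 200_000

# Correlate two standard normals with A = [[2, 0], [1, 1]] and mean (1, -1);
# then V = A A^T = [[4, 2], [2, 2]], so Lemma A.2 predicts the marginals
# Y1 ~ N(1, 4) and Y2 ~ N(-1, 2).
y1, y2 = [], []
for _ in range(N):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    y1.append(2 * z1 + 1)
    y2.append(z1 + z2 - 1)

def mean_var(v):
    m = sum(v) / len(v)
    return m, sum((x - m) ** 2 for x in v) / len(v)

print(mean_var(y1))  # approximately (1, 4)
print(mean_var(y2))  # approximately (-1, 2)
```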

Theorem A.17. The conditional distributions of a multivariate Gaussian random variable are also Gaussian:

X_1 | X_2 = x_2 \sim N\!\left( \mu_1 + V_{12} V_{22}^{-1}(x_2 - \mu_2) ,\ V_{11} - V_{12} V_{22}^{-1} V_{12}^T \right) \quad (A.174)

X_2 | X_1 = x_1 \sim N\!\left( \mu_2 + V_{12}^T V_{11}^{-1}(x_1 - \mu_1) ,\ V_{22} - V_{12}^T V_{11}^{-1} V_{12} \right) . \quad (A.175)

Proof. There are many ways to prove this theorem. One of the simplest [3, p. 25] makes use of the following easily checked formula:

\begin{bmatrix} I & -V_{12}V_{22}^{-1} \\ 0 & I \end{bmatrix} V \begin{bmatrix} I & 0 \\ -V_{22}^{-1}V_{12}^T & I \end{bmatrix} = \begin{bmatrix} V_{11} - V_{12}V_{22}^{-1}V_{12}^T & 0 \\ 0 & V_{22} \end{bmatrix} , \quad (A.176)

or equivalently

V^{-1} = \begin{bmatrix} I & 0 \\ -V_{22}^{-1}V_{12}^T & I \end{bmatrix} \begin{bmatrix} \left( V_{11} - V_{12}V_{22}^{-1}V_{12}^T \right)^{-1} & 0 \\ 0 & V_{22}^{-1} \end{bmatrix} \begin{bmatrix} I & -V_{12}V_{22}^{-1} \\ 0 & I \end{bmatrix} . \quad (A.177)

This decomposition of V greatly assists in expressing the determinant as

|V| = \left| V_{11} - V_{12}V_{22}^{-1}V_{12}^T \right| \left| V_{22} \right| . \quad (A.178)

The quadratic form that appears in the PDF of X (see Eq. A.151) can be expressed as

Q = (x - \mu)^T V^{-1} (x - \mu)
= \begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{bmatrix}^T \begin{bmatrix} I & 0 \\ -V_{22}^{-1}V_{12}^T & I \end{bmatrix} \begin{bmatrix} \left( V_{11} - V_{12}V_{22}^{-1}V_{12}^T \right)^{-1} & 0 \\ 0 & V_{22}^{-1} \end{bmatrix} \begin{bmatrix} I & -V_{12}V_{22}^{-1} \\ 0 & I \end{bmatrix} \begin{bmatrix} x_1 - \mu_1 \\ x_2 - \mu_2 \end{bmatrix}
= Q_{1|2} + Q_2 , \quad (A.179)

where, with the shorthand \bar{\mu}_1 = \mu_1 + V_{12}V_{22}^{-1}(x_2 - \mu_2),

Q_{1|2} = \left( x_1 - \bar{\mu}_1 \right)^T \left( V_{11} - V_{12}V_{22}^{-1}V_{12}^T \right)^{-1} \left( x_1 - \bar{\mu}_1 \right) \quad (A.180a)

Q_2 = (x_2 - \mu_2)^T V_{22}^{-1} (x_2 - \mu_2) . \quad (A.180b)

The conditional density of X_1 given X_2 is found to be

f_{X_1|X_2}(x_1, x_2) = \frac{f_{X_1 X_2}(x_1, x_2)}{f_{X_2}(x_2)}
= \frac{(2\pi)^{-n/2} |V|^{-1/2} e^{-\frac{1}{2} Q}}{(2\pi)^{-(n-k)/2} |V_{22}|^{-1/2} e^{-\frac{1}{2} Q_2}}
= (2\pi)^{-k/2} \left| V_{11} - V_{12}V_{22}^{-1}V_{12}^T \right|^{-1/2} e^{-\frac{1}{2}\left( Q_{1|2} + Q_2 \right) + \frac{1}{2} Q_2}
= \frac{\exp\left\{ -\frac{1}{2} \left( x_1 - \bar{\mu}_1 \right)^T \left( V_{11} - V_{12}V_{22}^{-1}V_{12}^T \right)^{-1} \left( x_1 - \bar{\mu}_1 \right) \right\}}{(2\pi)^{k/2} \left| V_{11} - V_{12}V_{22}^{-1}V_{12}^T \right|^{1/2}} . \quad (A.181)

That is,

f_{X_1|X_2}(x_1, x_2) = N\!\left( \mu_1 + V_{12}V_{22}^{-1}(x_2 - \mu_2) ,\ V_{11} - V_{12}V_{22}^{-1}V_{12}^T \right) . \quad (A.182)

An identical result holds for X_2:

f_{X_2|X_1}(x_1, x_2) = N\!\left( \mu_2 + V_{12}^T V_{11}^{-1}(x_1 - \mu_1) ,\ V_{22} - V_{12}^T V_{11}^{-1} V_{12} \right) . \quad (A.183)

An alternate proof of this theorem is given in [76, pp. 46-47].
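In the bivariate case, Eq. A.182 reduces to scalar formulas: E{X1 | X2 = x2} = µ1 + (V12/V22)(x2 − µ2) and var{X1 | X2} = V11 − V12²/V22. These can be checked by conditioning a simulation on a thin slice of X2 values. A stdlib Python sketch (the covariance structure is illustrative):

```python
import random

random.seed(3)
N = 400_000

# Bivariate normal: X1 = z1 + 0.8*z2, X2 = z2 (zero means), giving
# V11 = 1.64, V12 = 0.8, V22 = 1.  Theorem A.17 then predicts
#   E{X1 | X2 = 1}   = V12/V22 * 1        = 0.8
#   var{X1 | X2 = 1} = V11 - V12^2/V22    = 1.0
cond = []
for _ in range(N):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    x1, x2 = z1 + 0.8 * z2, z2
    if abs(x2 - 1.0) < 0.05:        # condition (approximately) on X2 = 1
        cond.append(x1)

m = sum(cond) / len(cond)
v = sum((x - m) ** 2 for x in cond) / len(cond)
print(m, v)  # approximately 0.8 and 1.0
```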

As shown in Chapter 2, the conditional mean is the minimum variance estimator. If the random variables X_1 and X_2 are Gaussian, then the conditional mean is obtained directly from Theorem A.17:

E\{X_1 | X_2 = x_2\} = \mu_1 + V_{12} V_{22}^{-1}(x_2 - \mu_2) . \quad (A.184)

It can also be shown that the conditional mean is the minimum variance linear estimator even when X_1 and X_2 are not Gaussian [3, p. 93]. The next theorem shows that the linear estimator given by

\hat{x}_1 = \mu_1 + V_{12} V_{22}^{-1}(x_2 - \mu_2) \quad (A.185)

has an error that is orthogonal to the random variable X_2.

Theorem A.18. Orthogonality Principle. Let X_1 and X_2 be jointly Gaussian random variables, the same as those discussed in Theorem A.17. Then, the random variable X_1 - E\{X_1|X_2\} is orthogonal to X_2:

E\left\{ \left[ X_1 - E\{X_1|X_2\} \right] X_2^T \right\} = 0 . \quad (A.186)

Proof. The proof follows directly from the conditional mean given by Theorem A.17:

E\left\{ \left[ X_1 - E\{X_1|X_2\} \right] X_2^T \right\} = E\left\{ \left[ X_1 - \mu_1 - V_{12}V_{22}^{-1}(X_2 - \mu_2) \right] X_2^T \right\}
= E\left\{ X_1 X_2^T \right\} - \mu_1 E\left\{ X_2^T \right\} - V_{12}V_{22}^{-1} E\left\{ (X_2 - \mu_2) X_2^T \right\}
= \left( V_{12} + \mu_1\mu_2^T \right) - \mu_1\mu_2^T - V_{12}V_{22}^{-1} V_{22}
= V_{12} - V_{12}
= 0 . \quad (A.187)

Corollary A.19. Regardless of the distribution of X_1 and X_2, the linear minimum variance estimator \hat{x}_1 = \mu_1 + V_{12}V_{22}^{-1}(x_2 - \mu_2) has an error x_1 - \hat{x}_1 that is orthogonal to the random variable X_2.

Proof. The proof follows directly from the steps used in the proof of Theorem A.18:

E\left\{ \left[ X_1 - \mu_1 - V_{12}V_{22}^{-1}(X_2 - \mu_2) \right] X_2^T \right\} = 0 . \quad (A.188)
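The orthogonality principle is easy to observe numerically: the estimator error (X1 − x̂1) should have zero correlation with X2, while X1 itself does not. A stdlib Python sketch with the same illustrative bivariate setup used above (V12 = 0.8, V22 = 1, zero means):

```python
import random

random.seed(4)
N = 300_000

# X1 = z1 + 0.8*z2, X2 = z2, so V12 = 0.8 and V22 = 1.
# The estimator x1_hat = (V12/V22) * x2 should leave an error orthogonal to X2.
acc_err_x2 = 0.0   # accumulates (X1 - x1_hat) * X2
acc_x1_x2 = 0.0    # accumulates X1 * X2 (not orthogonal, for contrast)
for _ in range(N):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    x1, x2 = z1 + 0.8 * z2, z2
    acc_err_x2 += (x1 - 0.8 * x2) * x2
    acc_x1_x2 += x1 * x2

print(acc_err_x2 / N)  # approximately 0 (orthogonal, Eq. A.186)
print(acc_x1_x2 / N)   # approximately 0.8 = V12 (not orthogonal)
```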

A.7.1 Confidence Regions for Multivariate Normal Random Variables

For scalar (i.e. univariate) random variables it is often convenient to express an interval within which the random variable is likely to occur at a specified level of probability. For univariate Gaussian random variables, the most convenient type of interval is the number M of standard deviations about the mean. The reason for this is that the associated probability of the random variable occurring within such regions can be conveniently computed using the standard normal distribution. The probability of a Gaussian random variable having a value within one, two and three standard deviations of the mean is given by Eq. A.90. Knowledge of these values is often more convenient to use than the cumulative distribution function itself. For example, rather than displaying a graph of the cumulative distribution function, we can simply state that X will take on a value within three standard deviations of its mean with a 99.7% level of confidence.

X \in \mu \pm 3\sigma \quad \text{with } 99.7\% \text{ confidence} \quad (A.189)

Clearly the confidence interval is a more compact way of indicating the variability of X than using the CDF of X. It is desired to extend the use of the confidence interval to the multivariate case. In the multivariate case one would like to specify a region or volume in n-dimensional space in which the random variable X is likely to occur at a specified level of probability. In the discussion that follows, three n-dimensional random vectors X, Y, and Z will be used. The random vector Z has uncorrelated elements, with Z_i \sim N(0,1). The random vector Y is related to Z by the transformation

Y_i = \sigma_i Z_i + \mu_i , \quad (A.190)

or

Y = \Sigma_Y Z + \mu_Y , \quad (A.191)

where ΣY is a diagonal matrix with diagonal elements σi

\Sigma_Y = \begin{bmatrix} \sigma_1 & & 0 \\ & \ddots & \\ 0 & & \sigma_n \end{bmatrix} . \quad (A.192)

The covariance of Y is given by

V_Y = E\left\{ (Y - \mu_Y)(Y - \mu_Y)^T \right\} = E\left\{ (\Sigma_Y Z)(\Sigma_Y Z)^T \right\} = \Sigma_Y\, E\left\{ Z Z^T \right\} \Sigma_Y^T = \Sigma_Y \Sigma_Y^T = \begin{bmatrix} \sigma_1^2 & & 0 \\ & \ddots & \\ 0 & & \sigma_n^2 \end{bmatrix} . \quad (A.193)

Let R be an orthonormal matrix (i.e. a rotation) and X be a random vector defined by

y = Rx . \quad (A.194)

For example, a two-dimensional rotation matrix is of the form

R = \begin{bmatrix} \cos\beta & -\sin\beta \\ \sin\beta & \cos\beta \end{bmatrix} . \quad (A.195)

Since the matrix R is orthonormal, we have the following two properties:

R^{-1} = R^T \quad (A.196a)

|R| = 1 . \quad (A.196b)

Thus, the random variable X can be written as

x = R^{-1} y = R^T y . \quad (A.197)

The mean of X is

\mu_X = R^T \mu_Y . \quad (A.198)

The covariance of X is

V_X = E\left\{ (X - \mu_X)(X - \mu_X)^T \right\}
= E\left\{ \left( R^T Y - R^T \mu_Y \right)\left( R^T Y - R^T \mu_Y \right)^T \right\}
= R^T E\left\{ (Y - \mu_Y)(Y - \mu_Y)^T \right\} R
= R^T V_Y R
= R^T \begin{bmatrix} \sigma_1^2 & & 0 \\ & \ddots & \\ 0 & & \sigma_n^2 \end{bmatrix} R . \quad (A.199)

Equation A.199 represents a spectral decomposition of V_X. The columns of the matrix R correspond to the eigenvectors of V_X, and the \sigma_i^2 are the eigenvalues of V_X. The density of X is given by Eq. A.151:

f_X(x) = \frac{e^{-\frac{1}{2}(x - \mu_X)^T V_X^{-1}(x - \mu_X)}}{(2\pi)^{n/2}\, |V_X|^{1/2}} . \quad (A.200)

The set of all points x such that f_X(x) is a constant forms an (n-1)-dimensional surface S_X^{(\alpha)} of equal probability:

S_X^{(\alpha)} = \{ x : f_X(x) = \alpha \} , \quad (A.201)

where α is a constant. Let dV(x) denote an infinitesimally small n-dimensional volume located at x. Since f_X(x) is a constant on S_X^{(\alpha)}, the probability that X will take on a value in dV(x) is the same for all points x on the surface S_X^{(\alpha)}. Using Eq. A.200 and Eq. A.201 results in the surface

S_X^{(\alpha)} = \left\{ x : (x - \mu_X)^T V_X^{-1}(x - \mu_X) = \alpha \right\} , \quad (A.202)

which is an (n-1)-dimensional ellipsoid. Similarly, for Y the equiprobability surface is

S_Y^{(\alpha)} = \left\{ y : (y - \mu_Y)^T V_Y^{-1}(y - \mu_Y) = \alpha \right\} = \left\{ y : \sum_{i=1}^{n} \left( \frac{y_i - \mu_i}{\sigma_i} \right)^2 = \alpha \right\} . \quad (A.203)

Of course, there are an infinite number of equiprobability ellipsoids, one for each value of the constant α. For a given value of α, the ellipsoid intersects axis j at a distance \mu_j + \sqrt{\alpha}\,\sigma_j from the origin. The eigenvalues of the diagonal matrix V_Y are related to \sigma_i by

\lambda_i = \sigma_i^2 . \quad (A.204)

The ellipsoid defined by S_Y^{(\alpha)} has its principal axes aligned with the eigenvectors of V_Y. Since V_Y is diagonal, the eigenvectors are themselves aligned with the axes defining the n-dimensional vector space. The ellipsoid defined by S_Y^{(\alpha)} intersects these axes at a distance of \sqrt{\alpha}\sqrt{\lambda_i} = \sqrt{\alpha}\,\sigma_i from the center of the ellipsoid. Now, consider the random variable X. The first thing to note is that the determinant of V_X is equal to the determinant of V_Y, as can be shown by taking the determinant of Eq. A.199:

|V_X| = \left| R^T V_Y R \right| = \left| R^T \right| |V_Y| |R| = |V_Y| .

Since |V_X| = |V_Y|, the value of f_X(x) on S_X^{(\alpha)} is equal to the value of f_Y(y) on S_Y^{(\alpha)}:

f_X\!\left( S_X^{(\alpha)} \right) = f_Y\!\left( S_Y^{(\alpha)} \right) = \frac{e^{-\frac{1}{2}\alpha}}{(2\pi)^{n/2}\, |V_Y|^{1/2}} = \frac{e^{-\frac{1}{2}\alpha}}{(2\pi)^{n/2} \prod_{i=1}^{n} \sigma_i} . \quad (A.205)

Substituting Eq. A.199 into Eq. A.202 gives

S_X^{(\alpha)} = \left\{ x : (x - \mu_X)^T V_X^{-1}(x - \mu_X) = \alpha \right\}
= \left\{ x : (x - \mu_X)^T \left( R^T V_Y R \right)^{-1}(x - \mu_X) = \alpha \right\}
= \left\{ x : (x - \mu_X)^T R^T V_Y^{-1} R\, (x - \mu_X) = \alpha \right\}
= \left\{ x : \left[ R(x - \mu_X) \right]^T V_Y^{-1} \left[ R(x - \mu_X) \right] = \alpha \right\} . \quad (A.206)

Define δx such that

x = \mu_X + \delta x . \quad (A.207)

Then

S_X^{(\alpha)} = \left\{ x : \left[ R(x - \mu_X) \right]^T V_Y^{-1} \left[ R(x - \mu_X) \right] = \alpha \right\} = \left\{ \delta x : \left[ R\,\delta x \right]^T V_Y^{-1} \left[ R\,\delta x \right] = \alpha \right\} . \quad (A.208)

Similarly, define δy such that

y = µY + δy . (A.209)

Then,

S_Y^{(\alpha)} = \left\{ y : (y - \mu_Y)^T V_Y^{-1}(y - \mu_Y) = \alpha \right\} = \left\{ \delta y : \delta y^T V_Y^{-1}\, \delta y = \alpha \right\} . \quad (A.210)

Any vector δy that results in the vector y being on the surface S_Y^{(\alpha)} has a counterpart R δx that results in the vector x being on the surface S_X^{(\alpha)}. Since this is true for any α, the general result is

R\,\delta x = \delta y . \quad (A.211)

All vectors δy resulting in a vector y that extends to the surface S_Y^{(\alpha)} corporately trace out an ellipsoid centered at the origin of the n-dimensional vector space. The size of the ellipsoid is the same as that described by S_Y^{(\alpha)} itself; that is, the radius of principal axis j is equal to \sqrt{\alpha}\,\sigma_j. The vectors δx can be obtained by pre-multiplying Eq. A.211 by R^T:

\delta x = R^T \delta y . \quad (A.212)

Since R is an n-dimensional rotation matrix, the ellipsoid formed by the δy is rotated to form a new ellipsoid of the same size; i.e., the ellipsoid described by δx is a rotated, undistorted version of the ellipsoid described by δy. To interpret this important result, consider the bivariate case (i.e. n = 2) with R given by Eq. A.195. In the case of bivariate random variables, the equiprobability surface is a (one-dimensional) ellipse.

The ellipses traced out by δx and δy are shown in Figure A.1. The corresponding equiprobability surfaces are shown in Figure A.2. Some very important observations are now made. First among these is that the probability that the random variable X is in the n-dimensional volume R_X^{(\alpha)} enclosed by the surface S_X^{(\alpha)} is equal to the probability that the random variable Y is in the n-dimensional volume R_Y^{(\alpha)} enclosed by the surface S_Y^{(\alpha)}. This follows from the fact that f_X(x) on the surface S_X^{(\alpha)},

Figure A.1. Relationship between the ellipses traced out by δx and δy.

Figure A.2. Equiprobability ellipsoids for random variables X \sim N(\mu_X, V_X), Y \sim N(\mu_Y, V_Y) and Z \sim N(0, I).

denoted by f_X\!\left( S_X^{(\alpha)} \right), is equal to the density f_Y(y) evaluated on the surface S_Y^{(\alpha)}, denoted by f_Y\!\left( S_Y^{(\alpha)} \right), for any α (see Eq. A.205).¹ Furthermore, the region need not even be one with an equiprobability boundary. For example, the probability that X belongs to R_X shown in Figure A.3 is equal to the probability that Y belongs to R_Y shown in the same figure:

P(X \in R_X) = P(Y \in R_Y) . \quad (A.213)

This is a direct result of the one-to-one mapping of the orthonormal transformation given by Eq. A.194; that is, the same result would hold for non-Gaussian random variables. This means that one can consider uncorrelated random variables, each of any (possibly different) type of distribution, and draw conclusions about probability regions. Then, the same result applies to the orthonormal transformation given by Eq. A.194, which correlates the random variables.

Rectangular Region To begin the discussion, consider the uncorrelated vector Y. One might want to find the probability that each of the random variables Y_i is within an M\sigma_i boundary about its mean \mu_i. The boundary of this region is the surface of an n-dimensional box B_Y^{(M)} (e.g. a rectangle in the bivariate case) centered about the mean \mu_Y, with half-widths equal to M\sigma_j. Since the random variables are uncorrelated, the probability that Y is in the region enclosed by the box is

P\!\left( Y \in B_Y^{(M)} \right) = P(\mu_Y - M\sigma_Y \le Y \le \mu_Y + M\sigma_Y) = P(-M\sigma_Y \le Y - \mu_Y \le M\sigma_Y) = P(-M\sigma_Y \le \delta Y \le M\sigma_Y) . \quad (A.214)

¹To avoid interrupting the flow of the discussion, we note here the minor point that while f_X\!\left( S_X^{(\alpha)} \right) = f_Y\!\left( S_Y^{(\alpha)} \right), it is not in general true that the value f_Z\!\left( S_Z^{(\alpha)} \right) is also equal to f_X\!\left( S_X^{(\alpha)} \right) and f_Y\!\left( S_Y^{(\alpha)} \right).

Figure A.3. Regions R_X formed by rotating region R_Y such that P(X \in R_X) = P(Y \in R_Y) for the orthonormal transformation Y = RX.

This probability may be evaluated using the definition of Zi, given by Eq. A.190

P\!\left( Y \in B_Y^{(M)} \right) = P(-M\sigma_Y \le Y - \mu_Y \le M\sigma_Y)
= P\!\left( \bigcap_{i=1}^{n} \left\{ -M\sigma_i \le Y_i - \mu_i \le M\sigma_i \right\} \right)
= \prod_{i=1}^{n} P(-M\sigma_i \le Y_i - \mu_i \le M\sigma_i)
= \prod_{i=1}^{n} P\!\left( \left| \frac{Y_i - \mu_i}{\sigma_i} \right| \le M \right)
= \prod_{i=1}^{n} P(|Z_i| \le M)
= \left[ P(|Z_i| \le M) \right]^n
= \left[ \frac{1}{\sqrt{2\pi}} \int_{-M}^{M} e^{-z^2/2}\, dz \right]^n , \quad (A.215)

where Z_i \sim N(0,1). That is, the probability that all of the random variables are within their M\sigma_i boundaries (about the mean \mu_Y) is equal to the n-th power of the probability that

a single Y_j random variable is within its M\sigma_j boundary (centered about its mean \mu_j). Clearly, our uncertainty about the region of occurrence of a random vector increases with the dimension of the random vector. From the previous discussion (see Figure A.3) we know that the same result applies to the case when the random variables x are correlated, which can be accomplished by the orthonormal transformation given by Eq. A.194. The probability that X belongs to the region R_X^{(M)} enclosed by the box B_X^{(M)} is

P\!\left( X \in B_X^{(M)} \right) = P\!\left( Y \in B_Y^{(M)} \right) = \left[ P(|Z_i| \le M) \right]^n = \left[ \frac{1}{\sqrt{2\pi}} \int_{-M}^{M} e^{-z^2/2}\, dz \right]^n . \quad (A.216)

In the case of bivariate random variables (i.e. n = 2), the box B_X^{(2)} is a rectangle of width 2M\sigma_1 and height 2M\sigma_2, as shown in Figure A.4.

Figure A.4. Rectangular confidence region R_X^{(M)} enclosed by box B_X^{(M)}.
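Eq. A.216 is a one-liner to evaluate, since the bracketed integral is P(|Z| ≤ M) = erf(M/√2). A stdlib Python sketch that reproduces several rectangular entries of Table A.1 below:

```python
import math

def rect_prob(M, n):
    """Probability that an n-vector of independent (or rotated) Gaussians
    falls in the box with half-width M*sigma_i on each axis (Eq. A.216)."""
    return math.erf(M / math.sqrt(2)) ** n

# Rectangular-region probabilities (x100) for a few dimensions
for n in (1, 2, 5, 10):
    print(n, [round(100 * rect_prob(M, n), 1) for M in (1, 2, 3)])
```

The n = 1 row recovers Eq. A.90 (68.3, 95.4, 99.7); higher rows simply raise those values to the power n.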

Ellipsoidal Region Aside from rectangular confidence regions, ellipsoidal confidence regions are the most convenient to work with. To begin with, consider the random vector z, which has elements that are uncorrelated standard normal random variables:

f_Z(z) = \prod_{j=1}^{n} f_{Z_j}(z_j) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}}\, e^{-z_j^2/2} = \frac{1}{(2\pi)^{n/2}} \exp\left\{ -\frac{1}{2} z^T z \right\} . \quad (A.217)

The probability that Z is in an n-dimensional region R is found by integrating the PDF over R:

P(Z \in R) = \int_{\theta \in R} f_Z(\theta)\, d\theta = \frac{1}{(2\pi)^{n/2}} \int_{\theta \in R} \exp\left\{ -\frac{1}{2} \theta^T \theta \right\} d\theta . \quad (A.218)

More specifically, we might specify the region R as the volume enclosed by an (n-1)-dimensional surface S. One very convenient choice is to use an (n-1)-dimensional sphere of radius r,

S = \left\{ \theta : \theta^T \theta < r^2 \right\} , \quad (A.219)

so that

P\!\left( Z^T Z < r^2 \right) = \frac{1}{(2\pi)^{n/2}} \int_{\theta^T \theta < r^2} \exp\left\{ -\frac{1}{2} \theta^T \theta \right\} d\theta . \quad (A.220)

Now define the random variable

R^2 = Z^T Z = \sum_{j=1}^{n} Z_j^2 . \quad (A.221)

From Section A.4.3, each squared standard normal variable is chi-square with one degree of freedom,

Z_j^2 \sim \chi^2(1) . \quad (A.222)

Since each of the Z_j are independent, the random variable R^2 (see Section A.6) is chi-square with n degrees of freedom:

R^2 \sim \chi^2(n) . \quad (A.223)

Thus the desired probability is readily obtainable by use of the well-known chi-square distribution:

P\!\left( Z^T Z < r^2 \right) = P\!\left( R^2 < r^2 \right) = F_{R^2}(r^2 | n) , \quad (A.224)

where the cumulative distribution F_{R^2}(r^2 | n) refers to the chi-square distribution with n degrees of freedom. As already described, the surface S is an (n-1)-dimensional sphere centered at the origin of the n-dimensional vector space. This sphere intersects each of the coordinate axes at a radius r from the origin. Now, consider the random variable Y, given by Eq. A.191:

y = \Sigma_Y z + \mu_Y , \quad (A.225)

where \Sigma_Y is a diagonal matrix with diagonal elements \sigma_i:

\Sigma_Y = \begin{bmatrix} \sigma_1 & & 0 \\ & \ddots & \\ 0 & & \sigma_n \end{bmatrix} . \quad (A.226)

Using the transformation \phi = \Sigma_Y \theta + \mu_Y in Eq. A.220 gives

P\!\left( Z \in R_Z^{(r^2)} \right) = P\!\left( Z^T Z < r^2 \right) = P\!\left( (Y - \mu_Y)^T V_Y^{-1}(Y - \mu_Y) < r^2 \right) = P\!\left( Y \in R_Y^{(r^2)} \right) . \quad (A.227)

Applying the rotation of Eq. A.194 then correlates the random variables without changing the probability, so that

P\!\left( X \in R_X^{(r^2)} \right) = P\!\left( Y \in R_Y^{(r^2)} \right) = P\!\left( Z \in R_Z^{(r^2)} \right) = P\!\left( Z^T Z < r^2 \right) = F_{R^2}(r^2 | n) . \quad (A.228)

Theorem A.20. For any random variable Q, with elements of arbitrary distrib- ution and correlation, a one to one mapping Q W, results in P (Q Q)= → ∈ R 332

P (Y W), where the region W (with bounding surface W) is the region that ∈ R R S results from applying the mapping to the region Q (with bounding surface Q). Al- R S ternately, the mapping can be applied directly to Q,thusproducing W (which defines S S the boundary of W). R A general proof of this theorem is not provided here because we only make use of this result for the special case of Gaussian random variables. A proof of the thistheoremforGaussianrandomvariablesisgivenbyEq.A.227. Itisusefulto interpret this result for the bivariate case (i.e. n =2). In this case, the hatched regions shown in Figure A.2 correspond to regions (α) (red), (α) (black), and (α) RZ RY RX (blue), all with equal probability as indicated by Eq. A.228. It is instructive to see how the dimension n and region (shape and size) affect the probability of occurrence of a Gaussian random variable. For this reason, Table A.1 is provided. The values 1σ, 2σ and 3σ in Table A.1 correspond to M = 1, 2, 3 fortherectangularregion { } and α = r2 = 1, 4, 9 for the ellipsoidal region. The rectangular regions are n 1 { } − dimensional boxes with widths corresponding to Mσi. Similarly, the ellipsoidal regions correspond to n 1 dimensional ellipsoids with principle axes equal to rσ − in radius. The rectangular region probabilities were obtained by use of Eq. A.216. More specifically, the first row of results for the rectangular region was obtained directly from Eq. A.90. Subsequent rows are simply the first row values to the power n. For the columns pertaining to the ellipsoidal region, MATLAB’s chi2cdf command was used to evaluate Eq. A.228

P(X ∈ R_X^(r²)) = chi2cdf(r², n) . (A.229)

Note that the ellipsoidal probabilities are always less than the counterpart rectangular probabilities. This is because the ellipsoidal region is a subset of the rectangular region, as shown in Figure A.4. While the confidence levels (probability × 100) in Table A.1 are very useful, it is perhaps even more useful to determine the values of M and α (or equivalently r) that will yield a desired level of probability P.

n     Rectangular              Ellipsoidal
      1σ     2σ     3σ        1σ     2σ     3σ
1     68.3   95.4   99.7      68.3   95.4   99.7
2     46.6   91.1   99.4      39.3   86.5   98.9
3     31.8   87.0   99.2      19.9   73.9   97.1
4     21.7   83.0   98.9      9.0    59.4   93.9
5     14.8   79.2   98.7      3.7    45.1   89.1
6     10.1   75.6   98.4      1.5    32.3   82.6
7     6.9    72.2   98.1      0.5    22.0   74.7
8     4.7    68.9   97.9      0.1    14.3   65.8
9     3.2    65.8   97.6      0.1    8.9    56.3
10    2.2    62.8   97.3      <0.1   5.2    46.8

Table A.1. Probabilities (×100) for different dimensions and different regions.

In the case of the ellipsoidal region, the region size corresponding to a desired level of probability can be found using MATLAB's chi2inv command

r² = chi2inv(P, n) , (A.230)

or

r = sqrt(chi2inv(P, n)) . (A.231)

In the case of the rectangular region, MATLAB does not provide an inverse of the function given by Eq. A.216. However, numerical methods or simple brute force can be used to determine the value of M such that Eq. A.216 is satisfied for a prescribed probability P. In either case, rectangular or ellipsoidal, the values of M and r denote the size of the region measured in number of standard deviations. Table A.2 lists the required region size (normalized to standard deviations) for various values of n and probability levels. For example, with n = 6, the 95% confidence (0.95 probability) ellipsoid is centered on the mean µ and intersects its principal axes (which are the eigenvectors of P_k) at a distance of 3.55σ_i from the ellipsoid center. Similarly, with n = 6, the 95% confidence (0.95 probability) box is centered on the mean µ and has faces that intersect its principal axes at a distance of 2.63σ_i from the box center.

n     Rectangular              Ellipsoidal
      50%    90%    95%       50%    90%    95%
1     0.67   1.64   1.96      0.67   1.64   1.96
2     1.05   1.95   2.24      1.18   2.15   2.45
3     1.26   2.11   2.39      1.54   2.50   2.80
4     1.45   2.23   2.49      1.83   2.79   3.08
5     1.52   2.31   2.57      2.09   3.04   3.33
6     1.60   2.38   2.63      2.31   3.26   3.55
7     1.67   2.43   2.68      2.52   3.47   3.75
8     1.73   2.48   2.73      2.71   3.66   3.94
9     1.79   2.52   2.77      2.89   3.83   4.11
10    1.83   2.56   2.80      3.06   4.00   4.28

Table A.2. Required region size normalized to number of standard deviations along principal axes.
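The MATLAB chi2cdf/chi2inv calculations described above can be reproduced with other tools. The following is a minimal sketch in Python using SciPy (the use of SciPy, and the function names below, are this sketch's assumptions, not part of the dissertation); the rectangular case uses the brute-force root search the text suggests, applied to the one-dimensional normal CDF raised to the power n.

```python
# Reproducing Table A.2: ellipsoidal region size via the chi-square quantile
# (Python analogue of MATLAB's chi2inv), rectangular size via root search.
from scipy.stats import chi2, norm
from scipy.optimize import brentq

def ellipsoid_radius(P, n):
    """Region size r (in standard deviations) of the P-probability ellipsoid."""
    return chi2.ppf(P, n) ** 0.5

def box_halfwidth(P, n):
    """Half-width M (in standard deviations) of the P-probability box."""
    # P(box) = [P(-M < Z < M)]^n for independent unit-variance components
    return brentq(lambda M: (norm.cdf(M) - norm.cdf(-M)) ** n - P, 0.01, 10.0)

print(round(ellipsoid_radius(0.95, 6), 2))  # 3.55, matching Table A.2
print(round(box_halfwidth(0.95, 6), 2))     # 2.63, matching Table A.2
```

The printed values reproduce the n = 6, 95% entries discussed in the text.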

A.8 Random Processes and Random Sequences

A random process is a mapping from the set of experimental outcomes S to a set of continuous-time functions Ω, such that at each time t_i the mapping is a random variable [85]. Similarly, a random sequence is a mapping from the set of experimental outcomes S to a set of indexed sequences Ω, such that at each index k the mapping is a random variable [85]. Let X_k be a random sequence. An Nth-order joint distribution function can be defined as

F_X(x_k, x_{k+1}, ..., x_{k+N−1}; k, k+1, ..., k+N−1)
    ≜ P(X_k ≤ x_k, X_{k+1} ≤ x_{k+1}, ..., X_{k+N−1} ≤ x_{k+N−1}; k, k+1, ..., k+N−1) .

The density function is defined by

f_X(x_k, x_{k+1}, ..., x_{k+N−1}; k, k+1, ..., k+N−1)
    = ∂^N F_X(x_k, x_{k+1}, ..., x_{k+N−1}; k, k+1, ..., k+N−1) / (∂x_k ∂x_{k+1} ··· ∂x_{k+N−1}) .

Sometimes the explicit time notation is dropped, giving f_X(x_k, x_{k+1}, ..., x_{k+N−1}). As with any joint density function, the conditional density is given by

f_X(x_{n+k} | x_k, ..., x_{k+N−1}) = f_X(x_{n+k}, x_k, ..., x_{k+N−1}) / f_X(x_k, ..., x_{k+N−1}) . (A.232)

The conditional density can be used to expand a joint density function

f_X(x_N, x_{N−1}, ..., x_1, x_0) = f_X(x_{N−1}, ..., x_1, x_0) f_X(x_N | x_{N−1}, ..., x_1, x_0)
    = f_X(x_{N−2}, ..., x_1, x_0) f_X(x_{N−1} | x_{N−2}, ..., x_1, x_0) f_X(x_N | x_{N−1}, ..., x_1, x_0)
    = f_X(x_0) f_X(x_1 | x_0) f_X(x_2 | x_1, x_0) ··· f_X(x_N | x_{N−1}, ..., x_1, x_0) . (A.233)

The autocorrelation of a random process is defined to be

R_XX(m, n) = E{x_m x_n} (A.234)
    = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x_m x_n f_X(x_m, x_n; m, n) dx_m dx_n .

In many instances, the autocorrelation function is shift invariant (continuous time: time invariant), meaning that it is only a function of the difference k = n − m (continuous time: t_n − t_m):

discrete:    R_XX(n + k, n) = R_XX(k, 0) = R_XX(k)
continuous:  R_XX(t + τ, t) = R_XX(τ, 0) = R_XX(τ) . (A.235)

A process that has a constant mean E{x_m} = µ and a shift-invariant autocorrelation function is called wide sense stationary (WSS). The power spectral density (PSD) is defined to be the Fourier transform of the autocorrelation function.

discrete:    S_XX(ω) = Σ_{k=−∞}^{∞} R_XX(k) e^{−jωk}
continuous:  S_XX(Ω) = ∫_{−∞}^{∞} R_XX(τ) e^{−jΩτ} dτ (A.236)

The autocorrelation function can be found by taking the inverse Fourier transform of the PSD.

A.8.1 White Processes and Sequences

A white process/sequence has the property that the autocorrelation function is zero unless its argument is zero:

discrete:    R_XX(k) = A δ(k)
continuous:  R_XX(τ) = A δ(τ) , (A.237)

where A is a constant and the delta functions satisfy

x(k) δ(k) = x(0) if k = 0, and 0 if k ≠ 0 (A.238a)

∫_{t_1}^{t_2} x(τ) δ(τ) dτ = x(0) if t_1 ≤ 0 ≤ t_2, and 0 otherwise . (A.238b)

Therefore, the PSD of white noise is uniform across all frequencies

discrete:    S_XX(ω) = A Σ_{k=−∞}^{∞} δ(k) e^{−jωk} = A
continuous:  S_XX(Ω) = A ∫_{−∞}^{∞} δ(τ) e^{−jΩτ} dτ = A . (A.239)

A.8.2 Markov Random Sequences

A Markov random sequence has the property that the conditional distribution/density of the RV Xn+k conditioned on previous values only depends on the most recent value.

f_X(x_{n+k} | x_k, ..., x_{k+N−1}) = f_X(x_{n+k} | x_k) (A.240)

A Markov sequence will be produced by any model of the form

X_{k+1} = f(X_k, V_k) (A.241a)

E{V_i V_j} = 0 if i ≠ j , (A.241b)

where f is a linear or nonlinear function. The reader should see that this represents a wide class of real-world problems. The joint density given by Eq. A.233 can be simplified in the case of a Markov process

f_X(x_N, x_{N−1}, ..., x_1, x_0) = f_X(x_0) f_X(x_1 | x_0) f_X(x_2 | x_1, x_0) ··· f_X(x_N | x_{N−1}, ..., x_1, x_0)
    = f_X(x_0) f_X(x_1 | x_0) f_X(x_2 | x_1) ··· f_X(x_N | x_{N−1})
    = f_X(x_0) Π_{k=1}^{N} f_X(x_k | x_{k−1}) . (A.242)

If the values of X_k are discrete, then the Markov process is called a Markov chain.
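As a numerical illustration of the model of Eq. A.241, the sketch below simulates a hypothetical linear scalar instance X_{k+1} = a X_k + V_k (the coefficient a = 0.9 and Gaussian V_k are assumptions of this sketch, not taken from the text) and checks the Markov property empirically: given X_k, the residual driving X_{k+1} carries no correlation with the earlier value X_{k−1}.

```python
# Sketch of a scalar Markov sequence X_{k+1} = a X_k + V_k with white V_k.
import numpy as np

rng = np.random.default_rng(0)
a, N = 0.9, 100_000
x = np.zeros(N)
for k in range(N - 1):
    x[k + 1] = a * x[k] + rng.standard_normal()

# Markov check: the residual X_{k+1} - a X_k is the driving noise V_k,
# which should be (sample-)uncorrelated with the earlier value X_{k-1}.
resid = x[2:] - a * x[1:-1]
print(abs(np.corrcoef(resid, x[:-2])[0, 1]) < 0.02)  # True
```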

A.9 Innovations Sequence

The following discussion provides a brief introduction to the innovations sequence. For more details, the interested reader is referred to [3, Ch. 5], [54, Lesson 16], and [86, Ch. 4]. Let z_j denote a random vector and z_{0:j} denote a collection of such random vectors

z_{0:j} = {z_0, z_1, ..., z_j} . (A.243)

The innovations sequence is defined as

z̃_0 = z_0 − E{z_0} (A.244a)
z̃_1 = z_1 − E{z_1 | z_0} (A.244b)
z̃_2 = z_2 − E{z_2 | z_1, z_0} (A.244c)
...
z̃_k = z_k − E{z_k | z_{1:k−1}} . (A.244d)

The innovations have a zero mean

E{z̃_k} = E{z_k − E{z_k | z_{1:k−1}}}
    = E{z_k} − E{E{z_k | z_{1:k−1}}}
    = E{z_k} − E{z_k}
    = 0 . (A.245)

Theorem A.21. The innovations sequence is composed of orthogonal elements.

E{z̃_k z̃_m^T} = 0 ∀ m ≠ k (A.246)

Proof. Expanding the product of two innovations gives

E{z̃_k z̃_m^T} = E{(z_k − E{z_k | z_{1:k−1}})(z_m − E{z_m | z_{1:m−1}})^T}
    = E{z_k z_m^T} − E{E{z_k | z_{1:k−1}} z_m^T} − E{z_k E{z_m | z_{1:m−1}}^T}
      + E{E{z_k | z_{1:k−1}} E{z_m | z_{1:m−1}}^T} . (A.247)

Assume that m < k. Since z_m is then contained in z_{1:k−1}, the smoothing property of conditional expectation gives

E{E{z_k | z_{1:k−1}} z_m^T} = E{E{z_k z_m^T | z_{1:k−1}}} = E{z_k z_m^T} . (A.248)

Therefore,

E{z̃_k z̃_m^T} = E{z_k z_m^T} − E{z_k z_m^T} − E{z_k E{z_m | z_{1:m−1}}^T} + E{E{z_k | z_{1:k−1}} E{z_m | z_{1:m−1}}^T}
    = −E{z_k E{z_m | z_{1:m−1}}^T} + E{E{z_k | z_{1:k−1}} E{z_m | z_{1:m−1}}^T} . (A.249)

The first expectation can be expressed as

E{z_k E{z_m | z_{1:m−1}}^T} = ∫ z_k E{z_m | z_{1:m−1}}^T f(z_{1:k}) dz_{1:k}
    = ∫∫ z_k E{z_m | z_{1:m−1}}^T f(z_k | z_{1:k−1}) f(z_{1:k−1}) dz_k dz_{1:k−1}
    = ∫ [∫ z_k f(z_k | z_{1:k−1}) dz_k] E{z_m | z_{1:m−1}}^T f(z_{1:k−1}) dz_{1:k−1}
    = ∫ E{z_k | z_{1:k−1}} E{z_m | z_{1:m−1}}^T f(z_{1:k−1}) dz_{1:k−1}
    = E{E{z_k | z_{1:k−1}} E{z_m | z_{1:m−1}}^T} . (A.250)

Thus,

E{z̃_k z̃_m^T} = −E{z_k E{z_m | z_{1:m−1}}^T} + E{E{z_k | z_{1:k−1}} E{z_m | z_{1:m−1}}^T}
    = −E{E{z_k | z_{1:k−1}} E{z_m | z_{1:m−1}}^T} + E{E{z_k | z_{1:k−1}} E{z_m | z_{1:m−1}}^T}
    = 0 . (A.251)

The same result would have been obtained if, instead of assuming m < k, we had assumed k < m.

The innovations sequence has an important implication in mean square estimation.

The minimum mean square estimator of a random vector zk is

ẑ_k = E{z_k | z_{0:k−1}} . (A.252)

The mean square estimator takes a sequence of correlated measurements z_k and produces a series of uncorrelated random variables z̃_k

z̃_k = z_k − E{z_k | z_{1:k−1}} = z_k − ẑ_k . (A.253)

That is, the error in estimating z_k is uncorrelated with the error in estimating z_j (with j ≠ k).
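The orthogonality of Theorem A.21 can be checked numerically. The sketch below (an illustration, not from the text) uses a hypothetical Gauss-Markov measurement sequence z_{k+1} = a z_k + v_k, for which the conditional mean E{z_k | past} is a z_{k−1}, so the innovations are z̃_k = z_k − a z_{k−1}; distinct innovations should be sample-orthogonal.

```python
# Empirical check that innovations of a Gauss-Markov sequence are orthogonal.
import numpy as np

rng = np.random.default_rng(1)
a, N = 0.8, 100_000
z = np.zeros(N)
for k in range(N - 1):
    z[k + 1] = a * z[k] + rng.standard_normal()

innov = z[1:] - a * z[:-1]            # ztilde_k = z_k - E{z_k | past}
for lag in (1, 2, 5):
    c = np.mean(innov[lag:] * innov[:-lag])
    print(abs(c) < 0.02)              # True: E{ztilde_k ztilde_m} ~ 0, m != k
```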

Appendix B Concepts from Systems Theory

This appendix provides material that is referenced elsewhere in the dissertation, but is considered of secondary importance. The main focus is on linear systems theory. However, information on coordinate systems and dynamics is also provided.

B.1 Common Signals

This section gives a definition of basic deterministic signals, such as the impulse and step functions. It also provides a basic discussion of stochastic signals.

B.1.1 Deterministic Signals

Dirac Impulse Function

The Dirac impulse function is likely the most important mathematical construct in all the work that follows. A rectangular pulse with unity area is the basis for the Dirac impulse [?, p. 276]. The equation describing the rectangular pulse is

D(τ; ε) = 1/ε if 0 ≤ τ < ε, and 0 otherwise . (B.1)

Examples of the unity-area rectangular pulse with different ε are shown in Figure B.1. Consider the following integral

lim_{ε→0} ∫_0^∞ g(τ) D(τ; ε) dτ = lim_{ε→0} (1/ε) ∫_0^ε g(τ) dτ = g(0) . (B.2)

It is customary to dispense with calling explicit attention to the limit and to write the integral as

∫_0^∞ g(τ) δ(τ) dτ ≜ lim_{ε→0} ∫_0^∞ g(τ) D(τ; ε) dτ . (B.3)


Figure B.1. Rectangular pulses with unity area.

The Dirac impulse can be shifted to give

∫_0^∞ g(τ) δ(τ − a) dτ = g(a) . (B.4)

It is necessary to consider what happens when the limits of integration are changed:

∫_A^B g(τ) δ(τ − a) dτ = g(a) if A ≤ a < B, and 0 if a < A or a ≥ B . (B.5)

One can also define the Dirac impulse to be centered about the origin rather than to the right of it. For a centered impulse, the definition is given by [?, p. 7]

∫_A^B g(τ) δ(τ − a) dτ = g(a) if A < a < B; 0 if a < A or a > B; undefined if a = A or a = B . (B.6)

An impulse function can be used to describe a function g(t) as

g(t) = ∫_{−∞}^{∞} g(τ) δ(t − τ) dτ .¹ (B.7)

¹ This is often taken as the definition of the impulse function.
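The limiting behavior in Eq. B.2 can be checked numerically. In the sketch below (the choice g = cos is an arbitrary assumption of this sketch), the unit-area pulse D(τ; ε) is integrated against g for shrinking ε, and the result approaches g(0) = 1.

```python
# Numerical check of Eq. B.2: the unit-area pulse sifts out g(0) as eps -> 0.
import numpy as np

def sift(g, eps, n=100_000):
    # integrate g(tau) * (1/eps) over [0, eps) with a midpoint rule
    tau = (np.arange(n) + 0.5) * (eps / n)
    return np.sum(g(tau)) * (eps / n) / eps

g = np.cos  # g(0) = 1
for eps in (1.0, 0.1, 0.001):
    print(round(sift(g, eps), 4))  # 0.8415, 0.9983, 1.0
```

The printed values are (1/ε) sin(ε), which converges to g(0) = 1 as ε shrinks.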

Step Function

The step function also plays an important role in the work that follows. It is defined by

S(t − τ) = 1 if t > τ, and 0 if t < τ . (B.8)

It can be expressed as a convolution integral

S(t − τ) = ∫_{−∞}^{t} δ(γ − τ) dγ . (B.9)

Discrete-Time Impulse Function

The discrete-time impulse function is much simpler than its continuous-time counterpart

δ[m] = 1 if m = 0, and 0 if m ≠ 0 . (B.10)

A stochastic signal is a signal that is described by its statistical properties (autocorrelation, etc.). One can model any signal as a stochastic signal simply by specifying its statistical properties. In most cases, much more is known about a signal than its statistical properties. For example, a sine wave can be modeled as the output of a second-order differential equation that is subjected to an impulsive input. One can specify an autocorrelation function for such a signal, but there is little motivation to do so when a more complete model exists. Therefore, one uses a stochastic (statistical) representation of a signal when that is the best that can be done. If more information is available, then perhaps a conditional autocorrelation function can be defined. Like deterministic signals, stochastic signals may be either continuous or discrete. A discrete-time stochastic signal is referred to as a stochastic sequence, whereas a continuous-time stochastic signal is referred to as a stochastic process.

Stochastic Processes

The most fundamental statistics of a stochastic signal are its mean and autocorrelation²

µ_RR(t) ≜ E{r(t)} (B.11a)

R_RR(t + τ, t) ≜ E{r(t + τ) r(t)} . (B.11b)

A wide sense stationary process has a constant mean µ_R and an autocorrelation function that is stationary (independent of time t)

R_RR(τ) ≜ E{r(t + τ) r(t)} . (B.12)

The autocorrelation function gives an indication of how quickly a signal is changing with time. The power spectral density is simply the Fourier transform of the autocorrelation function

P_RR(Ω) ≜ F{R_RR(τ)} = ∫_{−∞}^{∞} R_RR(τ) e^{−jΩτ} dτ . (B.13)

The inverse Fourier transform gives the autocorrelation

R_RR(τ) = F⁻¹{P_RR(Ω)} = (1/2π) ∫_{−∞}^{∞} P_RR(Ω) e^{jΩτ} dΩ . (B.14)

The mean-square value of the signal r(t), which is given by R_RR(0), can be obtained from the PSD by evaluating Eq. B.14 at τ = 0:

R_RR(0) = (1/2π) ∫_{−∞}^{∞} P_RR(Ω) dΩ . (B.15)

That is, the mean-square value is obtained by integrating the power spectrum. The autocorrelation function is a real-valued, even function. This means the power spectrum will also be a real-valued, even function. For this reason, many engineers make use of a one-sided PSD, which has twice the magnitude of its two-sided counterpart. The reason for this is that the area under the one-sided PSD gives the same mean-square value as the area under the two-sided PSD.

² It is assumed that the reader is familiar with basic concepts from probability theory, such as the expectation operator.

White Noise

A white noise process has an autocorrelation

R_WW(τ) ≜ A δ(τ) . (B.16)

The corresponding PSD is given by

P_WW(Ω) = A . (B.17)

Needless to say, white noise cannot actually be generated. However, it is a useful modeling approximation. If a stochastic signal has a bandwidth greater than the bandwidth of the system to which it is applied, then the stochastic signal bandwidth can be increased without affecting the system response. In such a case, a white noise model is useful.

Stochastic Sequences

A stochastic sequence has a mean and autocorrelation function

µ_RR[k] ≜ E{r[k]} (B.18a)

R_RR[m + k, m] ≜ E{r[m + k] r[m]} . (B.18b)

If the sequence is stationary, then the autocorrelation is only a function of k

R_RR[k] ≜ E{r[m + k] r[m]} . (B.19)

The PSD of a sequence is simply the discrete-time Fourier transform of the autocorrelation function

P_RR(ω) ≜ F{R_RR[k]} = Σ_{k=−∞}^{∞} R_RR[k] e^{−jωk} . (B.20)

White Noise

A white sequence is similar to a white process in that it has an impulsive autocorrelation function

E{w[n] w[m]} = σ² δ[m − n] , (B.21)

where δ[m − n] is the discrete-time impulse defined by Eq. B.10. A white sequence also has a zero mean

E{w[n]} = 0 . (B.22)
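Eq. B.21 can be checked empirically. The sketch below (a unit-variance Gaussian white sequence is this sketch's assumption) estimates the sample autocorrelation, which is approximately σ² at lag 0 and approximately zero elsewhere, i.e., an impulse, which is why the PSD is flat.

```python
# Empirical check of the impulsive autocorrelation of a white sequence.
import numpy as np

rng = np.random.default_rng(2)
w = rng.standard_normal(100_000)

def autocorr(w, lag):
    return np.mean(w[lag:] * w[:len(w) - lag]) if lag else np.mean(w * w)

print(abs(autocorr(w, 0) - 1.0) < 0.02)                        # True: R[0] ~ 1
print(all(abs(autocorr(w, k)) < 0.02 for k in range(1, 10)))   # True: R[k] ~ 0
```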

B.2 Convolution and Impulse Response

This section presents two important concepts related to differential (and difference) equations: a system's impulse response and the convolution integral. A system's impulse response is, as the name implies, its response to an impulse input function. Once computed, a linear system's impulse response can be considered a complete description of the system. That is, the impulse response of a linear system contains just as much information about the system as the original system description (most likely differential equations). Furthermore, by a process known as convolution, the system's response to other inputs can be computed using only the impulse response. Consider the linear, time-varying state space system

x˙ = F (t) x (t)+B (t) u (t) (B.23a)

y = C (t) x (t) .(B.23b)

To avoid any possible ambiguity, the dimension of F is n × n, the dimension of B is n × r, and the dimension of C is m × n. This makes x an n-dimensional vector, u an r-dimensional vector, and y an m-dimensional vector. The approach to deriving the impulse response of a vector system will be somewhat different than that of the scalar system. First, the solution to Eq. B.23a with u(t) = 0 will be discussed. This will lead to a definition of the state transition matrix. The state transition matrix will then be used to construct the convolution integral of Eq. B.23.

B.2.1 Principle of Superposition

A linear system, by definition, must satisfy the principle of superposition. Suppose that a linear system given by Eq. B.23a is subjected to the two inputs u_1(t) and

u2 (t). The respective responses are given by

x˙ 1 = F (t) x1 (t)+B (t) u1 (t) (B.24a)

x˙ 2 = F (t) x2 (t)+B (t) u2 (t) .(B.24b)

These equations can be added to yield

ẋ_1 + ẋ_2 = d/dt (x_1(t) + x_2(t))
    = F(t)(x_1(t) + x_2(t)) + B(t)(u_1(t) + u_2(t)) . (B.25)

Define x(t) = x_1(t) + x_2(t) and u(t) = u_1(t) + u_2(t). The differential equation is then

ẋ = F(t) x(t) + B(t) u(t) , (B.26)

which is the original equation describing the system. Therefore, the system response to u_1(t) + u_2(t) is simply x_1(t) + x_2(t). The output of the system given by Eq. B.23b is also linear

y(t) = y_1(t) + y_2(t)
    = C(t) x_1(t) + C(t) x_2(t)
    = C(t) x(t) . (B.27)

Nonlinear systems, by definition, do not satisfy the principle of superposition.

B.2.2 Homogeneous System

The homogeneous system is defined by

x˙ = F (t) x (t) ,(B.28)

with initial conditions x(t_0). A general solution to the homogeneous system can be expressed as a vector addition of n linearly independent solutions (since the solutions are linearly independent, any vector of dimension n can be expressed as a linear combination of them; that is, the solutions provide a basis for the n-dimensional vector space). The n solutions are found by solving the system with the initial conditions

x_1(t_0) = [1, 0, ..., 0]^T , x_2(t_0) = [0, 1, ..., 0]^T , ..., x_n(t_0) = [0, 0, ..., 1]^T . (B.29)

An n × n matrix Φ_x(t, t_0) can be formed such that its columns correspond to the solutions

Φ_x(t, t_0) = [ x_1(t)  x_2(t)  ···  x_n(t) ] . (B.30)

The second argument t_0 is necessary because the initial condition x_0 was specified at t = t_0. With Φ_x(t, t_0) computed, the solution for a general initial condition vector x_0 is given by

x (t)=Φx (t, t0) x (t0) .(B.31)

This is the homogeneous solution and Φx (t, t0) is known as the fundamental solution matrix or state transition matrix.
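The columnwise construction of Eqs. B.29-B.31 can be carried out numerically. The sketch below uses the constant double-integrator F = [[0, 1], [0, 0]] as a test case (this choice of F, and the forward-Euler integrator, are assumptions of the sketch); each column of Φ is obtained by propagating a unit initial condition, and the result is compared with the matrix exponential.

```python
# Building the state transition matrix column by column from unit ICs.
import numpy as np
from scipy.linalg import expm

F = np.array([[0.0, 1.0], [0.0, 0.0]])
t, steps = 2.0, 2000
dt = t / steps

def propagate(x):
    for _ in range(steps):          # forward-Euler integration of xdot = F x
        x = x + dt * (F @ x)
    return x

Phi = np.column_stack([propagate(e) for e in np.eye(2)])
print(np.allclose(Phi, expm(F * t), atol=1e-6))  # True; here e^{Ft} = [[1,t],[0,1]]
```

For this nilpotent F, forward Euler happens to be exact (F² = 0); in general a finer step or a better integrator would be needed.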

Manipulations of the fundamental matrix According to Eq. B.31, the state transition matrix must satisfy

Φx (t3,t1)=Φx (t3,t2) Φx (t2,t1) .(B.32)

Similarly, the following identity must hold

Φ_x⁻¹(t_2, t_1) = Φ_x(t_1, t_2) .

At t = t_0, the solution to Eq. B.31 must equal the initial condition, which gives rise to the identity

Φ_x(t_0, t_0) = I_{n×n} . (B.33)

Each column of Φx must satisfy the homogeneous equation (otherwise they would not be solutions to the equation), which can be expressed in matrix notation as

Φ̇_x(t, t_0) = F(t) Φ_x(t, t_0) . (B.34)

Since the columns of Φ_x are linearly independent, an inverse is guaranteed to exist such that

Φ_x(t, t_0) Φ_x⁻¹(t, t_0) = I_{n×n} . (B.35)

Differentiating Eq. B.35 gives

Φ̇_x(t, t_0) Φ_x⁻¹(t, t_0) + Φ_x(t, t_0) Φ̇_x⁻¹(t, t_0) = 0 . (B.36)

Substituting Eq. B.34 into Eq. B.36 gives

F(t) Φ_x(t, t_0) Φ_x⁻¹(t, t_0) + Φ_x(t, t_0) Φ̇_x⁻¹(t, t_0) = 0 , (B.37)

or

F(t) = −Φ_x(t, t_0) Φ̇_x⁻¹(t, t_0) . (B.38)

Pre-multiplying by Φ_x⁻¹(t, t_0) gives

Φ̇_x⁻¹(t, t_0) = −Φ_x⁻¹(t, t_0) F(t) . (B.39)

Post multiplying each side of Eq. B.39 by x (t) gives

Φ̇_x⁻¹(t, t_0) x(t) = −Φ_x⁻¹(t, t_0) F(t) x(t) . (B.40)

B.2.3 The Particular Solution

If Eq. B.23a is pre-multiplied by Φ_x⁻¹(t, t_0), the result is

Φ_x⁻¹(t, t_0) ẋ(t) = Φ_x⁻¹(t, t_0) F(t) x(t) + Φ_x⁻¹(t, t_0) B(t) u(t) . (B.41)

Adding Eq. B.41 to Eq. B.40 gives

Φ_x⁻¹(t, t_0) ẋ(t) + Φ̇_x⁻¹(t, t_0) x(t) = Φ_x⁻¹(t, t_0) B(t) u(t) , (B.42)

or

d/dt [Φ_x⁻¹(t, t_0) x(t)] = Φ_x⁻¹(t, t_0) B(t) u(t) . (B.43)

Integrating this result gives

Φ_x⁻¹(t, t_0) x(t) = Φ_x⁻¹(t_0, t_0) x(t_0) + ∫_{t_0}^{t} Φ_x⁻¹(τ, t_0) B(τ) u(τ) dτ . (B.44)

Pre-multiplying by Φx (t, t0) gives

x(t) = Φ_x(t, t_0) x(t_0) + Φ_x(t, t_0) ∫_{t_0}^{t} Φ_x⁻¹(τ, t_0) B(τ) u(τ) dτ
    = Φ_x(t, t_0) x(t_0) + ∫_{t_0}^{t} Φ_x(t, t_0) Φ_x(t_0, τ) B(τ) u(τ) dτ
    = Φ_x(t, t_0) x(t_0) + ∫_{t_0}^{t} Φ_x(t, τ) B(τ) u(τ) dτ . (B.45)

Letting t_0 go to minus infinity (which requires x(−∞) = 0 for a stable system) gives

x(t) = ∫_{−∞}^{t} Φ_x(t, τ) B(τ) u(τ) dτ . (B.46)

The system output y(t) is given by

y(t) = C(t) x(t) = ∫_{−∞}^{t} C(t) Φ_x(t, τ) B(τ) u(τ) dτ = ∫_{−∞}^{t} H(t, τ) u(τ) dτ , (B.47)

where the impulse response H(t, τ) is an m × r matrix with the i, j element corresponding to the response (at time t) at the ith output port due to an impulse applied at the jth input port (at time τ)

H(t, τ) ≜ C(t) Φ_x(t, τ) B(τ) if τ ≤ t, and 0 if τ > t . (B.48)

To prove this, consider applying an impulse to the first input port at τ = τ α

u(τ) = [1, 0, ..., 0]^T δ(τ − τ_α) ≜ 1_1 δ(τ − τ_α) . (B.49)

The system response is a column vector of dimension m (the same dimension as the output y)

y(t) = ∫_{−∞}^{t} H(t, τ) u(τ) dτ
    = ∫_{−∞}^{t} H(t, τ) 1_1 δ(τ − τ_α) dτ
    = H(t, τ_α) 1_1 . (B.50)

The vector 11 selects the first column of the impulse response matrix for output. Similarly, an impulse applied to the jth input port will produce the jth column of the impulse response matrix

u(τ) = 1_j δ(τ − τ_α)  →  y(t) = H(t, τ_α) 1_j . (B.51)
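The particular solution of Eq. B.45 can be evaluated numerically for a case whose answer is known in closed form. The sketch below assumes the constant double-integrator F = [[0, 1], [0, 0]] with B = [0, 1]^T and a unit step input (all assumptions of the sketch, not from the text), for which Φ(t, τ) = [[1, t − τ], [0, 1]] and the exact answer with zero initial state is x(t) = [t²/2, t].

```python
# Midpoint-rule evaluation of x(t) = integral of Phi(t,tau) B u(tau) dtau.
import numpy as np

B = np.array([0.0, 1.0])
t, n = 1.5, 2000
taus = (np.arange(n) + 0.5) * (t / n)          # midpoint rule on [0, t]

def Phi(t, tau):
    # transition matrix of the double integrator
    return np.array([[1.0, t - tau], [0.0, 1.0]])

x = sum(Phi(t, tau) @ B * 1.0 for tau in taus) * (t / n)
print(np.allclose(x, [t**2 / 2, t]))  # True: x(t) = [t^2/2, t]
```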

B.2.4 Time-Invariant Systems

When the system given by Eq. B.23 has constant matrices, the system is said to be time-invariant. The homogeneous system is given by Eq. B.28 with F (t)=F

x˙ = Fx .(B.52)

To proceed further, one must define the matrix exponential

e^{Ft} ≜ I + Ft + F² t²/2 + F³ t³/3! + ··· . (B.53)

Taking the time derivative gives

d/dt (e^{Ft}) = F + F² t + F³ t²/2 + ··· = e^{Ft} F . (B.54)

Bringing both terms of the homogeneous equation to one side of the equal sign and multiplying by the matrix exponential e^{−Ft} gives

e^{−Ft} (ẋ − Fx) = e^{−Ft} ẋ − e^{−Ft} Fx = d/dt (e^{−Ft} x) = 0 . (B.55)

Integrate from t0 to t

e^{−Ft} x(t) − e^{−Ft_0} x(t_0) = 0 . (B.56)

Multiplying by e^{Ft} and rearranging gives

x(t) = e^{F(t − t_0)} x(t_0) . (B.57)

Comparing this result with Eq. B.31, one finds

Φ_x(t, t_0) = e^{F(t − t_0)} . (B.58)

The total solution (homogeneous plus particular) can be obtained by substituting Eq. B.58 into Eq. B.45:

x(t) = Φ_x(t, t_0) x(t_0) + ∫_{t_0}^{t} Φ_x(t, τ) B u(τ) dτ
    = e^{F(t − t_0)} x(t_0) + ∫_{t_0}^{t} e^{F(t − τ)} B u(τ) dτ . (B.59)

The output can be expressed in terms of the convolution integral given by Eq. B.47 with

H(t, τ) ≜ C e^{F(t − τ)} B if τ ≤ t, and 0 if τ > t . (B.60)

B.3 Linear Systems with Stochastic Inputs

Consider the system given by Eq. B.23. Suppose that the system has a deterministic input uD (t) and a stochastic input uS (t)

B u(t) = B_D u_D(t) + B_S u_S(t) . (B.61)

As is usually the case, the stochastic input is assumed to be white noise with autocorrelation

E{u_S(τ) u_S^T(γ)} = diag( A_1 δ(τ − γ), A_2 δ(τ − γ), ..., A_n δ(τ − γ) ) = A δ(τ − γ) . (B.62)

The total response can be found using Eq. B.45. For further analysis, it is convenient to let t0 = tk and t = tk+1

x(t_{k+1}) = Φ_x(t_{k+1}, t_k) x(t_k) + ∫_{t_k}^{t_{k+1}} Φ_x(t_{k+1}, τ) B(τ) u(τ) dτ
    = Φ_x(t_{k+1}, t_k) x(t_k) + ∫_{t_k}^{t_{k+1}} Φ_x(t_{k+1}, τ) B_D(τ) u_D(τ) dτ + ∫_{t_k}^{t_{k+1}} Φ_x(t_{k+1}, τ) B_S(τ) u_S(τ) dτ
    = Φ_x(t_{k+1}, t_k) x(t_k) + w_k + d_k , (B.63)

where

d_k = ∫_{t_k}^{t_{k+1}} Φ_x(t_{k+1}, τ) B_D(τ) u_D(τ) dτ (B.64a)

w_k = ∫_{t_k}^{t_{k+1}} Φ_x(t_{k+1}, τ) B_S(τ) u_S(τ) dτ . (B.64b)

One needs to determine the statistical properties of w_k. Suppose that w_k is evaluated as a limiting Riemann sum with T = t_{k+1} − t_k:

w_k = ∫_{t_k}^{t_{k+1}} Φ_x(t_{k+1}, τ) B_S(τ) u_S(τ) dτ = lim_{N→∞} (T/N) Σ_{j=1}^{N} Φ_x(t_{k+1}, τ_j) B_S(τ_j) u_S(τ_j) , (B.65)

where

τ_j = t_k + j T/N . (B.66)

The random variables Φ_x(t_{k+1}, τ_j) B_S(τ_j) u_S(τ_j) in the summation are all independent because u_S is white noise. According to the central limit theorem, an infinite sum of independently distributed random variables has a Gaussian distribution, regardless of the distribution of the random variables themselves. Therefore, the random variable w_k is Gaussian and completely characterized by its mean and covariance. The mean is given by

E{w_k} = ∫_{t_k}^{t_{k+1}} Φ_x(t_{k+1}, τ) B_S(τ) E{u_S(τ)} dτ . (B.67)

Since the process noise u_S(τ) is zero mean,

E{w_k} = 0 . (B.68)

The covariance is given by

cov(w_k) = E{w_k w_k^T}
    = E{ [∫_{t_k}^{t_{k+1}} Φ_x(t_{k+1}, τ) B_S(τ) u_S(τ) dτ] [∫_{t_k}^{t_{k+1}} Φ_x(t_{k+1}, γ) B_S(γ) u_S(γ) dγ]^T }
    = ∫_{t_k}^{t_{k+1}} ∫_{t_k}^{t_{k+1}} Φ_x(t_{k+1}, τ) B_S(τ) E{u_S(τ) u_S^T(γ)} B_S^T(γ) Φ_x^T(t_{k+1}, γ) dτ dγ
    = ∫_{t_k}^{t_{k+1}} ∫_{t_k}^{t_{k+1}} Φ_x(t_{k+1}, τ) B_S(τ) A δ(τ − γ) B_S^T(γ) Φ_x^T(t_{k+1}, γ) dτ dγ
    = ∫_{t_k}^{t_{k+1}} Φ_x(t_{k+1}, τ) B_S(τ) A B_S^T(τ) Φ_x^T(t_{k+1}, τ) dτ
    = Q_k , (B.69)

where

Q_k = cov(w_k) = ∫_{t_k}^{t_{k+1}} Φ_x(t_{k+1}, τ) B_S(τ) A B_S^T(τ) Φ_x^T(t_{k+1}, τ) dτ . (B.70)

Thus, w_k is a white sequence with

w_k ∼ N(0, Q_k) . (B.71)

Example B.1. Consider a system with the following properties:

A = [ 4  0 ;  0  16 ] (B.72a)

F = [ 0  1 ;  0  0 ] (B.72b)

B_S = [ 0  1 ;  1  0 ] . (B.72c)

A discrete Kalman filter model is desired. Find Q for this model. The terms needed to compute Q are

B_S A B_S^T = [ 0 1 ; 1 0 ][ 4 0 ; 0 16 ][ 0 1 ; 1 0 ] = [ 0 16 ; 4 0 ][ 0 1 ; 1 0 ] = [ 16 0 ; 0 4 ] , (B.73)

and e^{νF} . (B.74)

The eigenvalues of the matrix F are

|λI − F| = 0  ⇒  λ_{1,2} = 0 . (B.75)

Using this, the transition matrix is found to be

e^{νF} = [ 1  ν ;  0  1 ] . (B.76)

The integrand in Eq. B.70 is found to be

e^{νF} B_S A B_S^T (e^{νF})^T = [ 1 ν ; 0 1 ][ 16 0 ; 0 4 ][ 1 0 ; ν 1 ] = [ 16 4ν ; 0 4 ][ 1 0 ; ν 1 ] = [ 16 + 4ν²  4ν ;  4ν  4 ] . (B.77)

The matrix Q is found to be

Q = ∫_0^T e^{νF} B_S A B_S^T (e^{νF})^T dν
  = ∫_0^T [ 16 + 4ν²  4ν ;  4ν  4 ] dν
  = [ 16ν + 4ν³/3  2ν² ;  2ν²  4ν ] evaluated from 0 to T
  = [ 16T + 4T³/3  2T² ;  2T²  4T ] . (B.78)

This section will show how the autocorrelation of an LTI system's response may be computed. The motivation for tracking applications is that an LTI system can be used to shape white noise. That is, when white noise is passed through an LTI system, the output autocorrelation is colored (non-uniform).
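The closed-form Q of Example B.1 can be verified by integrating Eq. B.70 numerically. The sketch below (midpoint-rule quadrature and T = 0.5 are this sketch's assumptions) compares the numerical integral against the closed form of Eq. B.78.

```python
# Numerical check of Example B.1: integrate Eq. B.70 and compare to Eq. B.78.
import numpy as np

A  = np.array([[4.0, 0.0], [0.0, 16.0]])
Bs = np.array([[0.0, 1.0], [1.0, 0.0]])
F  = np.array([[0.0, 1.0], [0.0, 0.0]])
T, n = 0.5, 4000

def phi(nu):                      # e^{nu F} = [[1, nu], [0, 1]] since F^2 = 0
    return np.eye(2) + nu * F

Q_num = sum(phi(nu) @ Bs @ A @ Bs.T @ phi(nu).T
            for nu in (np.arange(n) + 0.5) * (T / n)) * (T / n)

Q_closed = np.array([[16*T + 4*T**3/3, 2*T**2],
                     [2*T**2,          4*T  ]])
print(np.allclose(Q_num, Q_closed, atol=1e-4))  # True
```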

B.4 Covariance and State Propagation

In this section we consider how the state x(t_k) changes with time. When x is deterministic (i.e., u_S = 0), the trajectory x(t) is completely described by the differential equation and the deterministic input u_D. That is, given x(t_k) and u_D, one can compute x(t_{k+1}) with complete certainty if u_S = 0. However, when u_S is not equal to zero, the differential equation does not allow one to compute x(t_{k+1}) with complete certainty because of the randomness of u_S. In the stochastic setting we must treat the state at t_{k+1} as a random variable denoted by X(t_{k+1}). Using the system equations and any other a priori information, it is possible to specify x(t_{k+1}) and an associated covariance P(t_{k+1}) that indicates a region within which the true value of X(t_{k+1}) is to be found at a specified level of probability. For notational convenience we use x_k to denote the random variable x(t_k) and Φ_k to denote the state transition matrix Φ_x(t_{k+1}, t_k). Suppose that the state X_k has a normal distribution

Xk N (xˆk, Pk) .(B.79) ∼

It may very well be that the state at time t_k is known with complete certainty, in which case P_k = 0. One of the most interesting features of a stochastic system is how the uncertainty changes with time. An excellent discussion of this is given in [86, Sec. 4.2]. Since both X_k and W_k are Gaussian, the linear combination used to form X_{k+1} will also be Gaussian and therefore completely characterized by its mean and covariance. The mean can be found by taking the expectation of Eq. B.63

x̂_{k+1} = E{X_{k+1}}
    = E{Φ_k X_k + W_k + d_k}
    = Φ_k x̂_k + d_k . (B.80)

Similarly, the covariance is given by

P_{k+1} = E{(X_{k+1} − x̂_{k+1})(X_{k+1} − x̂_{k+1})^T}
    = E{(Φ_k(X_k − x̂_k) + W_k)(Φ_k(X_k − x̂_k) + W_k)^T}
    = E{Φ_k(X_k − x̂_k)(X_k − x̂_k)^T Φ_k^T} + E{W_k W_k^T}
    = Φ_k E{(X_k − x̂_k)(X_k − x̂_k)^T} Φ_k^T + Q_k
    = Φ_k P_k Φ_k^T + Q_k . (B.81)

The random variable X_{k+1} is distributed as

X_{k+1} ∼ N(x̂_{k+1}, P_{k+1}) = N(Φ_k x̂_k + d_k, Φ_k P_k Φ_k^T + Q_k) . (B.82)

The value of the state at time t_{k+1} is given in a stochastic context as a mean value and an associated covariance. The mean propagates just as if u_S were zero, as in the deterministic case. The confidence region associated with the covariance P_{k+1} is discussed in detail in Appendix ??. The subject of covariance and state propagation is continued in Chapter 3. Therein, it is shown how measurements of the state can be used to reduce the uncertainty in the state (i.e., reduce the covariance). As a result of this discussion, one can readily see that the mean satisfies the differential equation

dx̂/dt = F(t) x̂ + B_D u_D . (B.83)

The covariance satisfies the equation

P(t) = Φ(t, t_0) P(t_0) Φ^T(t, t_0) + ∫_{t_0}^{t} Φ(t, τ) B_S(τ) A B_S^T(τ) Φ^T(t, τ) dτ . (B.84)

Taking the derivative gives

Ṗ(t) = Φ̇(t, t_0) P(t_0) Φ^T(t, t_0) + Φ(t, t_0) P(t_0) Φ̇^T(t, t_0)
    + Φ(t, t) B_S(t) A B_S^T(t) Φ^T(t, t)
    + ∫_{t_0}^{t} Φ̇(t, τ) B_S(τ) A B_S^T(τ) Φ^T(t, τ) dτ
    + ∫_{t_0}^{t} Φ(t, τ) B_S(τ) A B_S^T(τ) Φ̇^T(t, τ) dτ . (B.85)

Substituting Eq. B.34 into this result gives

Ṗ(t) = F(t) Φ(t, t_0) P(t_0) Φ^T(t, t_0) + Φ(t, t_0) P(t_0) Φ^T(t, t_0) F^T(t)
    + Φ(t, t) B_S(t) A B_S^T(t) Φ^T(t, t)
    + F(t) ∫_{t_0}^{t} Φ(t, τ) B_S(τ) A B_S^T(τ) Φ^T(t, τ) dτ
    + [∫_{t_0}^{t} Φ(t, τ) B_S(τ) A B_S^T(τ) Φ^T(t, τ) dτ] F^T(t) . (B.86)

Substituting t_0 = t gives the differential equation for the covariance

Ṗ(t) = F(t) P(t) + P(t) F^T(t) + B_S(t) A B_S^T(t) . (B.87)
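Eq. B.87 can be tied back to Eq. B.70: with P(t_0) = 0, integrating the covariance differential equation over [0, T] should reproduce the process-noise covariance Q of Eq. B.78. The sketch below does this with forward Euler, reusing the Example B.1 matrices (the integrator and step count are assumptions of the sketch).

```python
# Euler-integrating Pdot = F P + P F^T + Bs A Bs^T from P(0) = 0 over [0, T].
import numpy as np

F   = np.array([[0.0, 1.0], [0.0, 0.0]])
BAB = np.array([[16.0, 0.0], [0.0, 4.0]])   # Bs A Bs^T from Eq. B.73
T, n = 0.5, 50_000
dt = T / n

P = np.zeros((2, 2))
for _ in range(n):
    P = P + dt * (F @ P + P @ F.T + BAB)

Q_closed = np.array([[16*T + 4*T**3/3, 2*T**2],
                     [2*T**2,          4*T  ]])
print(np.allclose(P, Q_closed, atol=1e-3))  # True: matches Eq. B.78
```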

B.5 Shaping Filters

This section discusses shaping filters for continuous-time signals. In the case of deterministic signals, shaping filters are used to generate a wide variety of signals from impulsive inputs. In the case of stochastic signals, shaping filters are used to create a signal with a desired power spectral density (PSD) from a signal that has a uniform PSD (i.e. white noise).

B.5.1 Continuous-Time Shaping Filters for Deterministic Signals

Let u(t) denote a signal that is to have some desired form, achieved by processing a signal r(t) with a shaping filter. The set of possible shaping filters considered here can be described by a linear single-input, single-output (SISO) system of the form

m˙ = D (t) m + P (t) r (t) (B.88a)

u = V(t) m , (B.88b)

where the dimension of m is c (this fixes the dimensions of P_{c×1}, D_{c×c}, and V_{1×c}). The signal r(t) will generally be a delayed impulse

r(t) = δ(t − t*) . (B.89)

One important application of shaping filters, at least for this dissertation, is in adjoint analysis. For adjoint analysis, it is undesirable to have Eq. B.88b be an explicit function of t*, a point that will be made clear later in the dissertation. This places an additional constraint on the design of shaping filters for deterministic signals. In the subsections that follow, time-varying and time-invariant shaping filters will be discussed.

Linear Time-Varying Shaping Filters

Suppose that one would like to construct a delayed ramp input

u(t) = (t − t*) S(t − t*) . (B.90)

ṁ = δ(t − t*) (B.91a)
u = t · m . (B.91b)

However, the system response is given by

u(t) = t S(t − t*) . (B.92)

This system does not produce the desired response. The problem could be fixed with a model of the form u = (t − t*) · m. While it is acceptable to have the input to the system be a function of t*, it may not be desirable, for reasons already mentioned, to have the system response (Eq. B.88b) be an explicit function of t*. Another possibility is the time-invariant shaping filter defined by

m̈ = δ(t − t*) . (B.93)

The solution to this equation is

u(t) = (t − t*) S(t − t*) . (B.94)

The shaping filter produces the desired results and has shaping matrices given by

D = [ 0  1 ;  0  0 ] , P = [ 0 ;  1 ] . (B.95)

In some cases, one has no choice but to use a time-varying shaping filter. For example, suppose one would like to generate the signal

u(t) = √(t − t*) S(t − t*) . (B.96)

The only possible time-invariant filter is nonlinear

[ ṁ_1 ; ṁ_2 ] = [ 0  1 ;  0  0 ][ m_1 ; m_2 ] + [ 0 ; δ(t − t*) ] (B.97a)

u = √m_1 . (B.97b)

However, nonlinear shaping filters may not be desirable. The other choice is to use the time-varying filter

ṁ = δ(t − t*) (B.98a)
u = √t · m . (B.98b)

The solution to this equation is

u = √t S(t − t*) . (B.99)

As before, the solution is not the one desired, but it is the closest one that a linear shaping filter will permit without making the output u an explicit function of t∗.

Linear Time-Invariant Shaping Filters  Many functions (step, ramp, sinusoid, etc.) can be represented as the impulse response of a continuous-time LTI system. Any continuous-time LTI system can be represented by a constant-coefficient differential equation

\[
\frac{d^c u}{dt^c} + a_1 \frac{d^{c-1} u}{dt^{c-1}} + \cdots + a_{c-1} \frac{du}{dt} + a_c u
= b_0 \frac{d^c r}{dt^c} + b_1 \frac{d^{c-1} r}{dt^{c-1}} + \cdots + b_{c-1} \frac{dr}{dt} + b_c r \,. \tag{B.100}
\]

Assuming zero initial conditions, the Laplace transform of this equation is

\[
G(s) = \frac{U(s)}{R(s)}
= \frac{b_0 s^c + b_1 s^{c-1} + \cdots + b_{c-1} s + b_c}{s^c + a_1 s^{c-1} + \cdots + a_{c-1} s + a_c} \,. \tag{B.101}
\]

Conversion of Eq. B.101 back to differential equation form is straightforward

\[
\frac{d^c u}{dt^c} \;\leftrightarrow\; s^c U(s) \,. \tag{B.102}
\]

General methods exist [?, Ch. 3] for converting the transfer function into a state-space system that is equivalent to Eq. B.100. The result will be of the form

\[
\dot{\mathbf{m}} = D\mathbf{m} + P\, r(t) \tag{B.103a}
\]
\[
u = V\mathbf{m} \,. \tag{B.103b}
\]

Suppose one wanted to model a sine function of the form

\[
u(t) = \sin\left[\omega(t - t^*)\right] S(t - t^*) \,. \tag{B.104}
\]

The shaping filter can be found by taking the Laplace transform of sin(ωt)

\[
G(s) = \int_0^\infty \sin(\omega t)\, e^{-st}\, dt
= \frac{\omega}{s^2 + \omega^2} \,. \tag{B.105}
\]

Using Eq. B.102, the equivalent differential equation is given by

\[
\ddot{u} + \omega^2 u = \omega\, r(t) \,. \tag{B.106}
\]

Taking the Laplace transform with r(t) = δ(t − t∗) gives

\[
U(s) = \frac{\omega}{s^2 + \omega^2}\, e^{-s t^*} \,. \tag{B.107}
\]

The inverse Laplace transform gives Eq. B.104, which is the desired result. The shaping matrices to implement Eq. B.106 are given by

\[
D = \begin{bmatrix} 0 & 1 \\ -\omega^2 & 0 \end{bmatrix}, \qquad
P = \begin{bmatrix} 0 \\ \omega \end{bmatrix}, \qquad
V = \begin{bmatrix} 1 & 0 \end{bmatrix} \,. \tag{B.108}
\]

As a final note, it is typically possible to create a shaping filter for any signal that has a closed-form Laplace transform. In such cases, one must simply convert the Laplace transform into an equivalent set of differential equations to obtain the shaping matrices.
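As a quick numerical illustration of the shaping matrices in Eq. B.108, the sketch below (assuming a simple forward-Euler integration; function names and step sizes are arbitrary choices, not from the original text) drives the state-space system with an approximate impulse at t∗ and compares the output against the delayed sine of Eq. B.104.

```python
import numpy as np

def sine_filter(omega, t_star, dt=1e-4, t_end=1.0):
    """Simulate m' = D m + P r(t), u = V m, with r an approximate
    unit impulse at t*, using the shaping matrices of Eq. B.108."""
    D = np.array([[0.0, 1.0], [-omega**2, 0.0]])
    P = np.array([0.0, omega])
    V = np.array([1.0, 0.0])
    n = int(t_end / dt)
    t = np.arange(n) * dt
    m = np.zeros(2)
    u = np.zeros(n)
    for k in range(n):
        r = 1.0 / dt if abs(t[k] - t_star) < dt / 2 else 0.0
        m = m + (D @ m + P * r) * dt   # forward-Euler step
        u[k] = V @ m
    return t, u

t, u = sine_filter(omega=2 * np.pi, t_star=0.2)
# u should approximate sin[omega (t - t*)] S(t - t*)
expected = np.sin(2 * np.pi * (t - 0.2)) * (t >= 0.2)
assert np.max(np.abs(u - expected)) < 0.05
```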

B.5.2 Continuous-Time Shaping Filters for Stochastic Signals

Shaping filters can be used to shape white noise in much the same way they are used to shape a deterministic impulse. This subsection will show this for a scalar signal r (t), but a similar result applies for the more general vector case. Let h denote the impulse response of a linear time-invariant (LTI) shaping filter. The response of the system to the stochastic input r (t) is given by the convolution integral

\[
u(t) = \int_{-\infty}^{\infty} r(\gamma)\, h(t - \gamma)\, d\gamma
= \int_{-\infty}^{\infty} r(t - \gamma)\, h(\gamma)\, d\gamma \,. \tag{B.109}
\]

The cross-correlation of r(t) and u(t) is given by

\[
\begin{aligned}
R_{RU}(\tau) &\triangleq E\{r(t + \tau)\, u(t)\} \\
&= E\left\{ r(t + \tau) \int_{-\infty}^{\infty} r(t - \gamma)\, h(\gamma)\, d\gamma \right\} \\
&= \int_{-\infty}^{\infty} E\{r(t + \tau)\, r(t - \gamma)\}\, h(\gamma)\, d\gamma \\
&= \int_{-\infty}^{\infty} R_{RR}(\tau + \gamma)\, h(\gamma)\, d\gamma \\
&= \int_{-\infty}^{\infty} R_{RR}(\tau - \gamma)\, h(-\gamma)\, d\gamma \\
&= R_{RR}(\tau) * h(-\tau) \,. \tag{B.110}
\end{aligned}
\]

The autocorrelation of u (t) is given by

\[
\begin{aligned}
R_{UU}(\tau) &\triangleq E\{u(t)\, u(t - \tau)\} \\
&= E\left\{ \left[ \int_{-\infty}^{\infty} r(t - \gamma)\, h(\gamma)\, d\gamma \right] u(t - \tau) \right\} \\
&= \int_{-\infty}^{\infty} E\{r(t - \gamma)\, u(t - \tau)\}\, h(\gamma)\, d\gamma \\
&= \int_{-\infty}^{\infty} R_{RU}(\tau - \gamma)\, h(\gamma)\, d\gamma \\
&= R_{RU}(\tau) * h(\tau) \,. \tag{B.111}
\end{aligned}
\]

Combining the previous results gives

\[
\begin{aligned}
R_{UU}(\tau) &= R_{RU}(\tau) * h(\tau) \\
&= R_{RR}(\tau) * h(-\tau) * h(\tau) \,. \tag{B.112}
\end{aligned}
\]

Convolution in the time domain corresponds to multiplication in the frequency domain

\[
\begin{aligned}
P_{UU}(\Omega) &= P_{RR}(\Omega)\, H^*(\Omega)\, H(\Omega) \\
&= P_{RR}(\Omega)\, |H(\Omega)|^2 \,. \tag{B.113}
\end{aligned}
\]

This result is especially useful when r(t) is white noise

\[
P_{UU}(\Omega) = A\, |H(\Omega)|^2 \,. \tag{B.114}
\]

Simple Correlation Low-Pass Filter  One of the most common systems found in stochastic signal analysis is the low-pass filter subject to a white-noise input. A low-pass filter has the Laplace transform representation

\[
H(s) = \frac{1}{1 + \tau_c s} \,, \tag{B.115}
\]

where τ_c is the time constant of the filter. The impulse response of the system is given by

\[
h(t) = \frac{1}{\tau_c}\, e^{-t/\tau_c}\, S(t) \,. \tag{B.116}
\]

The cross-correlation between the input and output of the filter is found by evaluating Eq. B.110 with R_{RR}(τ) = Aδ(τ)

\[
\begin{aligned}
R_{RU}(\tau) &= \int_{-\infty}^{\infty} R_{RR}(\tau - \gamma)\, h(-\gamma)\, d\gamma \\
&= \frac{A}{\tau_c} \int_{-\infty}^{\infty} \delta(\tau - \gamma)\, e^{\gamma/\tau_c}\, S(-\gamma)\, d\gamma \\
&= \frac{A}{\tau_c}\, e^{\tau/\tau_c}\, S(-\tau) \,. \tag{B.117}
\end{aligned}
\]

Note that the cross-correlation is zero for all τ > 0, because the input signal is white. The output autocorrelation is found by evaluating Eq. B.111

\[
\begin{aligned}
R_{UU}(\tau) &= \int_{-\infty}^{\infty} R_{RU}(\tau - \gamma)\, h(\gamma)\, d\gamma \\
&= \frac{A}{\tau_c^2} \int_{-\infty}^{\infty} e^{(\tau - \gamma)/\tau_c}\, S(\gamma - \tau)\, e^{-\gamma/\tau_c}\, S(\gamma)\, d\gamma \\
&= \frac{A}{\tau_c^2} \int_0^{\infty} e^{(\tau - \gamma)/\tau_c}\, e^{-\gamma/\tau_c}\, S(\gamma - \tau)\, d\gamma \,. \tag{B.118}
\end{aligned}
\]

Assume that τ < 0

\[
\begin{aligned}
R_{UU}(\tau) &= \frac{A}{\tau_c^2} \int_0^{\infty} e^{(\tau - \gamma)/\tau_c}\, e^{-\gamma/\tau_c}\, S(\gamma - \tau)\, d\gamma \\
&= \frac{A}{\tau_c^2} \int_0^{\infty} e^{(\tau - \gamma)/\tau_c}\, e^{-\gamma/\tau_c}\, d\gamma \\
&= \frac{A}{\tau_c^2}\, e^{\tau/\tau_c} \int_0^{\infty} e^{-2\gamma/\tau_c}\, d\gamma \\
&= -\frac{A}{2\tau_c}\, e^{\tau/\tau_c} \left[ e^{-2\gamma/\tau_c} \right]_0^{\infty} \\
&= \frac{A}{2\tau_c}\, e^{\tau/\tau_c} \,. \tag{B.119}
\end{aligned}
\]

Similarly, if τ > 0 then $R_{UU}(\tau) = \frac{A}{2\tau_c}\, e^{-\tau/\tau_c}$. Therefore, the output autocorrelation is given by

\[
R_{UU}(\tau) = \frac{A}{2\tau_c}\, e^{-|\tau|/\tau_c} \,. \tag{B.120}
\]

The signal's autocorrelation decays exponentially with time. Since the input signal has zero mean, the output signal will also be zero mean. The variance of the output signal is given by evaluating the autocorrelation function at τ = 0

\[
\sigma_U^2 = R_{UU}(0) = \frac{A}{2\tau_c} \,. \tag{B.121}
\]

The value of A can be selected to produce a signal with the desired variance $\sigma^2$

\[
A = 2\tau_c\, \sigma^2 \,. \tag{B.122}
\]

This results in the autocorrelation function

\[
R_{UU}(\tau) = \sigma^2\, e^{-|\tau|/\tau_c} \,. \tag{B.123}
\]

For more information on noise modeling, see [?].
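The scaling rule of Eq. B.122 can be demonstrated with a short simulation. The sketch below is a minimal discrete approximation: the Euler step and the variance A/dt used for the discrete white-noise samples are standard modeling choices, not from the original text, and all names are illustrative. It filters white noise through the low-pass system and checks that the steady-state variance approaches σ².

```python
import numpy as np

def colored_noise(sigma, tau_c, dt=1e-3, n=200_000, seed=0):
    """Euler simulation of a low-pass filter driven by white noise whose
    PSD level is A = 2*tau_c*sigma**2 (Eq. B.122)."""
    rng = np.random.default_rng(seed)
    A = 2.0 * tau_c * sigma**2
    # discrete-time samples approximating white noise of PSD level A
    r = rng.normal(0.0, np.sqrt(A / dt), n)
    u = np.zeros(n)
    for k in range(1, n):
        u[k] = u[k - 1] + dt * (r[k - 1] - u[k - 1]) / tau_c
    return u

u = colored_noise(sigma=2.0, tau_c=0.1)
# after the start-up transient, the sample variance approaches sigma^2 = 4
var = np.var(u[10_000:])
```

Because the samples are correlated over roughly τ_c, the variance estimate converges slowly; a long record (or averaging over runs) is needed for a tight check.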

Transient Response  In the design of shaping filters, one must keep in mind that a transient will exist at the start of a simulation. For example, suppose one wishes to create a signal with an autocorrelation given by Eq. B.123. This autocorrelation was derived by using a white-noise input signal. If the white noise is zero prior to t = 0, then the input autocorrelation is given by

\[
R_{RR}(t_2, t_1) = A\, \delta(t_2 - t_1)\, S(t_1) \,. \tag{B.124}
\]

The impulse response of the system is given by

\[
h(t) = \frac{1}{\tau_c}\, e^{-t/\tau_c}\, S(t) \,. \tag{B.125}
\]

The cross-correlation of r (t) and u (t) is given by

\[
\begin{aligned}
R_{RU}(t + \tau, t) &= E\{r(t + \tau)\, u(t)\} \\
&= E\left\{ r(t + \tau) \int_{-\infty}^{\infty} r(t - \gamma)\, h(\gamma)\, d\gamma \right\} \\
&= \int_{-\infty}^{\infty} E\{r(t + \tau)\, r(t - \gamma)\}\, h(\gamma)\, d\gamma \\
&= \int_{-\infty}^{\infty} R_{RR}(t + \tau, t - \gamma)\, h(\gamma)\, d\gamma \\
&= \frac{1}{\tau_c} \int_0^{\infty} R_{RR}(t + \tau, t - \gamma)\, e^{-\gamma/\tau_c}\, d\gamma \\
&= \frac{1}{\tau_c} \int_0^{\infty} A\, \delta(\tau + \gamma)\, S(t - \gamma)\, e^{-\gamma/\tau_c}\, d\gamma \\
&= \frac{A}{\tau_c} \int_0^{t} \delta(\tau + \gamma)\, e^{-\gamma/\tau_c}\, d\gamma \\
&= \frac{A}{\tau_c}\, e^{\tau/\tau_c}\, S(t)\, S(-\tau)\, S(t + \tau) \,. \tag{B.126}
\end{aligned}
\]

The output autocorrelation is given by

\[
\begin{aligned}
R_{UU}(t, t - \tau) &= E\{u(t)\, u(t - \tau)\} \\
&= E\left\{ \left[ \int_{-\infty}^{\infty} r(t - \gamma)\, h(\gamma)\, d\gamma \right] u(t - \tau) \right\} \\
&= \int_{-\infty}^{\infty} E\{r(t - \gamma)\, u(t - \tau)\}\, h(\gamma)\, d\gamma \\
&= \int_{-\infty}^{\infty} R_{RU}\big((t - \tau) + (\tau - \gamma),\; t - \tau\big)\, h(\gamma)\, d\gamma \\
&= \frac{A}{\tau_c} \int_{-\infty}^{\infty} e^{(\tau - \gamma)/\tau_c}\, S(t - \tau)\, S(\gamma - \tau)\, S(t - \gamma)\, h(\gamma)\, d\gamma \\
&= \frac{A}{\tau_c} \int_{\tau}^{t} e^{(\tau - \gamma)/\tau_c}\, h(\gamma)\, d\gamma\; S(t - \tau) \\
&= \frac{A}{\tau_c} \int_{\tau}^{t} e^{(\tau - \gamma)/\tau_c}\, \frac{1}{\tau_c}\, e^{-\gamma/\tau_c}\, S(\gamma)\, d\gamma\; S(t - \tau) \\
&= \frac{A}{\tau_c^2}\, e^{\tau/\tau_c} \int_{\max(0,\tau)}^{t} e^{-2\gamma/\tau_c}\, d\gamma\; S(t - \tau) \\
&= \frac{A}{2\tau_c}\, e^{\tau/\tau_c}\, S(t - \tau) \left( e^{-2\max(0,\tau)/\tau_c} - e^{-2t/\tau_c} \right) \\
&= \frac{A}{2\tau_c}\, e^{-|\tau|/\tau_c}\, S(t - \tau) - \frac{A}{2\tau_c}\, e^{(\tau - 2t)/\tau_c}\, S(t - \tau) \,. \tag{B.127}
\end{aligned}
\]

The variable t is arbitrary and can be replaced with t + τ

\[
R_{UU}(t + \tau, t) = \frac{A}{2\tau_c}\, e^{-|\tau|/\tau_c}\, S(t) - \frac{A}{2\tau_c}\, e^{-(\tau + 2t)/\tau_c}\, S(t) \,. \tag{B.128}
\]

The second term in the autocorrelation function is a transient that goes to zero as t gets large. Taking the limit as t → ∞, the transient autocorrelation goes to zero and the steady-state autocorrelation function equals the autocorrelation given by Eq. B.120.

B.5.3 Discrete-Time Shaping Filters for Stochastic Signals

A linear shift-invariant (LSI) system driven by a white-noise sequence will produce a Markov sequence. The output PSD will be non-uniform due to the correlation introduced by the LSI system.

It is possible to design the LSI system such that a desired spectrum or autocorrelation is achieved. Consider the first-order Markov sequence

\[
\begin{aligned}
X(k) &= \alpha X(k - 1) + \beta V(k - 1) \\
&= \alpha \left[ \alpha X(k - 2) + \beta V(k - 2) \right] + \beta V(k - 1) \\
&= \alpha^2 X(k - 2) + \beta \left[ \alpha V(k - 2) + V(k - 1) \right] \\
&= \alpha^k X(0) + \beta \sum_{n=1}^{k} \alpha^{n-1}\, V(k - n) \,, \tag{B.129}
\end{aligned}
\]

where V(k) satisfies

\[
E\{V(k)\, V(k + m)\} = \gamma(k)\, \delta(m) \,. \tag{B.130}
\]

At index k + m

\[
X(k + m) = \alpha^{k+m} X(0) + \beta \sum_{r=1}^{k+m} \alpha^{r-1}\, V(k + m - r) \,. \tag{B.131}
\]

The autocorrelation of X is

\[
\begin{aligned}
R_{XX}[k, m] &= E\{X(k)\, X(k + m)\} \\
&= \alpha^{2k+m} E\{X^2(0)\} + \beta^2 \sum_{n=1}^{k} \sum_{r=1}^{k+m} \alpha^{n-1} \alpha^{r-1}\, E\{V(k + m - r)\, V(k - n)\} \,. \tag{B.132}
\end{aligned}
\]

The second summation is only nonzero when k + m − r = k − n, which is true when r = m + n

\[
\begin{aligned}
R_{XX}[k, m] &= \alpha^{2k+m} E\{X^2(0)\} + \beta^2 \sum_{n=1}^{k} \sum_{r=1}^{k+m} \alpha^{n-1} \alpha^{r-1}\, E\{V(k + m - r)\, V(k - n)\} \\
&= \alpha^{2k+m} E\{X^2(0)\} + \beta^2 \sum_{n=1}^{k} \alpha^{n-1} \alpha^{m+n-1}\, \gamma(k - n) \\
&= \alpha^{2k+m} E\{X^2(0)\} + \beta^2 \alpha^m \sum_{n=0}^{k} \alpha^{2n}\, \gamma(k - n - 1) \,. \tag{B.133}
\end{aligned}
\]

If γ(k) = γ, a constant,

\[
\begin{aligned}
R_{XX}[k, m] &= \alpha^{2k+m} E\{X^2(0)\} + \beta^2 \alpha^m \gamma \sum_{n=0}^{k} \alpha^{2n} \\
&= \alpha^{2k+m} E\{X^2(0)\} + \beta^2 \alpha^m \gamma\, \frac{1 - \alpha^{2(k+1)}}{1 - \alpha^2} \,. \tag{B.134}
\end{aligned}
\]

Since α < 1, for large k, α^k → 0 and

\[
\begin{aligned}
R_{XX}[k, m] &= \alpha^{2k+m} E\{X^2(0)\} + \beta^2 \alpha^m \gamma\, \frac{1 - \alpha^{2(k+1)}}{1 - \alpha^2} \\
&\simeq \frac{\beta^2 \gamma}{1 - \alpha^2}\, \alpha^m \,. \tag{B.135}
\end{aligned}
\]

That is, for large k and a constant γ, the autocorrelation is stationary

\[
R_{XX}[m] \simeq \frac{\beta^2 \gamma}{1 - \alpha^2}\, \alpha^m \,. \tag{B.136}
\]

Just as in the continuous-time example given in Section B.5.2, the end result is a signal with an exponentially decreasing autocorrelation.
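The stationary autocorrelation of Eq. B.136 can be checked with a short simulation. In the sketch below (illustrative; V(k) is taken as unit-variance Gaussian white noise, so γ = 1, and all names are arbitrary), the sample autocorrelation of the generated sequence is compared against β²γα^m/(1 − α²) at a few lags.

```python
import numpy as np

def markov_sequence(alpha, beta, n, seed=2):
    """X(k) = alpha X(k-1) + beta V(k-1), with unit-variance white V(k)."""
    rng = np.random.default_rng(seed)
    v = rng.normal(0.0, 1.0, n)
    x = np.zeros(n)
    for k in range(1, n):
        x[k] = alpha * x[k - 1] + beta * v[k - 1]
    return x[1000:]            # discard the start-up transient

def autocorr(x, m):
    """Sample estimate of R_XX[m] = E{X(k) X(k+m)}."""
    return np.mean(x[: len(x) - m] * x[m:])

x = markov_sequence(alpha=0.9, beta=1.0, n=500_000)
# Eq. B.136: R_XX[m] ~ beta^2 gamma alpha^m / (1 - alpha^2), with gamma = 1
for m in range(4):
    predicted = 0.9**m / (1.0 - 0.81)
    assert abs(autocorr(x, m) - predicted) < 0.3
```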

B.6 Coordinate Frames

Parts of this dissertation make use of specific coordinate frames. It is necessary to relate vectors in one coordinate frame to vectors in another coordinate frame. What follows is a very brief introduction to the subject, providing only material that is essential to this dissertation. For further information, the reader can consult reference [57].

B.6.1 Direction Cosine Matrix (DCM)

Consider the case of two coordinate systems C and C′ with axes denoted by x, y, z and x′, y′, z′, respectively [34, pp. 57-61]. Figure B.2 shows the relationship between

Figure B.2. Direction angles.

the x′ axis of C′ and the x, y, z axes of coordinate system C. The components of the unit vector (a vector of unit magnitude) i′ are the projections of the vector onto the axes of coordinate system C

\[
\mathbf{i}' = (\cos\alpha)\, \mathbf{i} + (\cos\beta)\, \mathbf{j} + (\cos\gamma)\, \mathbf{k} \,. \tag{B.137}
\]

Define $l_{p'q} = l_{qp'}$ to be the cosine of the angle between axis p′ and axis q. Extending Eq. B.137 to each of the axes of the C′ coordinate system

\[
\begin{aligned}
\mathbf{i}' &= l_{x'x}\, \mathbf{i} + l_{x'y}\, \mathbf{j} + l_{x'z}\, \mathbf{k} \tag{B.138a} \\
\mathbf{j}' &= l_{y'x}\, \mathbf{i} + l_{y'y}\, \mathbf{j} + l_{y'z}\, \mathbf{k} \tag{B.138b} \\
\mathbf{k}' &= l_{z'x}\, \mathbf{i} + l_{z'y}\, \mathbf{j} + l_{z'z}\, \mathbf{k} \,, \tag{B.138c}
\end{aligned}
\]

or in matrix notation

\[
\begin{bmatrix} \mathbf{i}' \\ \mathbf{j}' \\ \mathbf{k}' \end{bmatrix}
= R \begin{bmatrix} \mathbf{i} \\ \mathbf{j} \\ \mathbf{k} \end{bmatrix} \,, \tag{B.139a}
\]

where

\[
R = \begin{bmatrix}
l_{x'x} & l_{x'y} & l_{x'z} \\
l_{y'x} & l_{y'y} & l_{y'z} \\
l_{z'x} & l_{z'y} & l_{z'z}
\end{bmatrix} \,. \tag{B.140}
\]

The inverse transformation is obtained by

\[
\begin{bmatrix} \mathbf{i} \\ \mathbf{j} \\ \mathbf{k} \end{bmatrix}
= R^{-1} \begin{bmatrix} \mathbf{i}' \\ \mathbf{j}' \\ \mathbf{k}' \end{bmatrix} \,. \tag{B.141}
\]

This equation can be simplified by noting that

\[
R R^T = I \quad\Longrightarrow\quad R^T = R^{-1} \,, \tag{B.142}
\]

resulting in

\[
\begin{bmatrix} \mathbf{i} \\ \mathbf{j} \\ \mathbf{k} \end{bmatrix}
= R^T \begin{bmatrix} \mathbf{i}' \\ \mathbf{j}' \\ \mathbf{k}' \end{bmatrix} \,. \tag{B.143}
\]

B.6.2 Euler Angles

Euler angles are used to uniquely specify the angular orientation of an object with three unconstrained coordinates. The fact that an object has only three degrees of freedom in its angular orientation and the Euler angle method requires only three coordinates (other methods, such as the quaternion, require four constrained coordinates) makes this description attractive. Furthermore, it is very easy to construct a DCM from a set of Euler angles.

The Euler angles are a set of three angles that are used in a body-fixed rotation sequence. There are a total of twelve distinct Euler angle descriptions in a right-hand coordinate system. A distinct description consists of specifying the sequence and the axes about which the rotations occur. For example, the xzx sequence denotes a rotation about the object's x-axis, followed by a rotation about the object's z-axis, followed by another rotation about the object's x-axis. An example of a complete Euler angle description is given below.

Yaw-Pitch-Roll DCM  This development of the yaw-pitch-roll DCM is intended to give the reader an appreciation of the simplicity of the Euler description and to show its limitations. This DCM is a body-fixed rotation sequence that is perhaps best visualized through its application to an aircraft. Let ABC (aircraft body coordinates) be a body-fixed frame with x forward, y starboard (to the right), and z down in the aircraft. The following rotations are made [88, pp. 36-37]

Algorithm B.1.
(1) Rotate about the z-axis, nose right (positive "yaw" ψ):
\[
R_z(\psi) \triangleq \begin{bmatrix}
\cos\psi & \sin\psi & 0 \\
-\sin\psi & \cos\psi & 0 \\
0 & 0 & 1
\end{bmatrix} \,; \tag{B.144}
\]
(2) Rotate about the new y-axis, nose up (positive "pitch" θ):
\[
R_y(\theta) \triangleq \begin{bmatrix}
\cos\theta & 0 & -\sin\theta \\
0 & 1 & 0 \\
\sin\theta & 0 & \cos\theta
\end{bmatrix} \,; \tag{B.145}
\]
(3) Rotate about the new x-axis, right wing down (positive "roll" φ):
\[
R_x(\phi) \triangleq \begin{bmatrix}
1 & 0 & 0 \\
0 & \cos\phi & \sin\phi \\
0 & -\sin\phi & \cos\phi
\end{bmatrix} \,. \tag{B.146}
\]

Such a sequence takes a vector in the NED (North-East-Down) frame and transforms it into the ABC frame. Alternately, this can be viewed as aligning the NED frame with the ABC frame. With either interpretation, the result is

\[
\begin{aligned}
\mathbf{r}^{(ABC)} &= R_x(\phi)\, R_y(\theta)\, R_z(\psi)\, \mathbf{r}^{(NED)} \\
&= \begin{bmatrix}
1 & 0 & 0 \\
0 & \cos\phi & \sin\phi \\
0 & -\sin\phi & \cos\phi
\end{bmatrix}
\begin{bmatrix}
\cos\theta & 0 & -\sin\theta \\
0 & 1 & 0 \\
\sin\theta & 0 & \cos\theta
\end{bmatrix}
\begin{bmatrix}
\cos\psi & \sin\psi & 0 \\
-\sin\psi & \cos\psi & 0 \\
0 & 0 & 1
\end{bmatrix} \mathbf{r}^{(NED)} \\
&= \begin{bmatrix}
\cos\theta\cos\psi & \cos\theta\sin\psi & -\sin\theta \\
\sin\phi\sin\theta\cos\psi - \cos\phi\sin\psi & \sin\phi\sin\theta\sin\psi + \cos\phi\cos\psi & \sin\phi\cos\theta \\
\cos\phi\sin\theta\cos\psi + \sin\phi\sin\psi & \cos\phi\sin\theta\sin\psi - \sin\phi\cos\psi & \cos\phi\cos\theta
\end{bmatrix} \mathbf{r}^{(NED)} \\
&= R^{ABC}_{NED}\, \mathbf{r}^{(NED)} \,. \tag{B.147}
\end{aligned}
\]

This DCM is more general than the context in which it was derived above. Let there be defined a right-handed coordinate system A. Another right-handed coordinate system B is formed by performing a positive yaw ψ about A_z, a positive pitch θ about A_{y′}, and a positive roll φ about A_{x″}. A vector r^{(A)} in the A coordinate system can be expressed in the B coordinate system by the rotation

\[
R^B_A = \begin{bmatrix}
\cos\theta\cos\psi & \cos\theta\sin\psi & -\sin\theta \\
\sin\phi\sin\theta\cos\psi - \cos\phi\sin\psi & \sin\phi\sin\theta\sin\psi + \cos\phi\cos\psi & \sin\phi\cos\theta \\
\cos\phi\sin\theta\cos\psi + \sin\phi\sin\psi & \cos\phi\sin\theta\sin\psi - \sin\phi\cos\psi & \cos\phi\cos\theta
\end{bmatrix} \tag{B.148a}
\]
\[
\mathbf{r}^{(B)} = R^B_A \cdot \mathbf{r}^{(A)} \,. \tag{B.148b}
\]

The transformation from A to B will be written as either R^B_A or T_{BA} (the latter form is more convenient for code). Alternately, if the vector r is expressed in the B coordinate system, use

\[
\begin{aligned}
\mathbf{r}^{(A)} &= \left( R^B_A \right)^T \mathbf{r}^{(B)} \\
&= R^A_B\, \mathbf{r}^{(B)} \,, \tag{B.149}
\end{aligned}
\]

to rotate to the A coordinate system. Note that the matrix R^A_B consists of the opposite sequence of elementary transformations: first roll, then pitch, and finally yaw. Given a DCM, one can compute a set of Euler angles using the following formulas:

\[
\psi = \operatorname{atan2}\left( R^A_B[2,1],\; R^A_B[1,1] \right) \tag{B.150a}
\]
\[
\theta = -\sin^{-1}\left( R^A_B[3,1] \right) \tag{B.150b}
\]
\[
\phi = \operatorname{atan2}\left( R^A_B[3,2],\; R^A_B[3,3] \right) \,, \tag{B.150c}
\]

where atan2(y, x) is the four-quadrant inverse tangent function. The range of the yaw ψ and roll φ angles is [0, 2π] and the range of the pitch θ angle is [−π/2, π/2]. If θ = ±π/2, then the yaw ψ and roll φ are not unique. Euler angles are a bad choice for non-static angular relations that cannot preclude this configuration.
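The yaw-pitch-roll construction of Eqs. B.144-B.147 and the inverse formulas of Eq. B.150 can be exercised together. The sketch below is illustrative (function names are arbitrary, and NumPy's zero-based indexing replaces the one-based indices of Eq. B.150): it builds R^B_A, verifies the orthogonality property of Eq. B.142, and recovers the original angles.

```python
import numpy as np

def dcm_from_ypr(psi, theta, phi):
    """R^B_A = R_x(phi) @ R_y(theta) @ R_z(psi) (Eqs. B.144-B.147)."""
    cps, sps = np.cos(psi), np.sin(psi)
    cth, sth = np.cos(theta), np.sin(theta)
    cph, sph = np.cos(phi), np.sin(phi)
    Rz = np.array([[cps, sps, 0], [-sps, cps, 0], [0, 0, 1]])
    Ry = np.array([[cth, 0, -sth], [0, 1, 0], [sth, 0, cth]])
    Rx = np.array([[1, 0, 0], [0, cph, sph], [0, -sph, cph]])
    return Rx @ Ry @ Rz

def ypr_from_dcm(R_BA):
    """Recover (psi, theta, phi) per Eq. B.150, indexing R^A_B = (R^B_A)^T."""
    R_AB = R_BA.T
    psi = np.arctan2(R_AB[1, 0], R_AB[0, 0])   # [2,1], [1,1] one-based
    theta = -np.arcsin(R_AB[2, 0])             # [3,1]
    phi = np.arctan2(R_AB[2, 1], R_AB[2, 2])   # [3,2], [3,3]
    return psi, theta, phi

R = dcm_from_ypr(0.3, -0.5, 1.1)
assert np.allclose(R @ R.T, np.eye(3))         # Eq. B.142: R R^T = I
assert np.allclose(ypr_from_dcm(R), (0.3, -0.5, 1.1))
```

The round trip fails exactly where the text warns it will: at θ = ±π/2 the arctangent arguments in Eq. B.150a and B.150c vanish and ψ, φ become indistinct.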

B.7 Dynamics (Kinematics)


B.7.1 Time-Derivative of a Constant Magnitude Vector

Consider a constant-magnitude vector r = (r_x, r_y, r_z) in coordinate system C. Assume that the vector undergoes (from the active point of view) a sequence of three infinitesimal rotations about the axes of C.

\[
R_1 = \begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & -d\theta_X \\
0 & d\theta_X & 1
\end{bmatrix}, \quad
R_2 = \begin{bmatrix}
1 & 0 & d\theta_Y \\
0 & 1 & 0 \\
-d\theta_Y & 0 & 1
\end{bmatrix}, \quad
R_3 = \begin{bmatrix}
1 & -d\theta_Z & 0 \\
d\theta_Z & 1 & 0 \\
0 & 0 & 1
\end{bmatrix} \,. \tag{B.151}
\]

Keeping only first-order terms and noting that the sequence of the infinitesimal rotations does not matter, we have

\[
\begin{aligned}
\mathbf{r}' &= R_1 R_2 R_3\, \mathbf{r} \\
&= \mathbf{r} + \begin{bmatrix}
0 & -d\theta_Z & d\theta_Y \\
d\theta_Z & 0 & -d\theta_X \\
-d\theta_Y & d\theta_X & 0
\end{bmatrix} \mathbf{r} \\
&= \mathbf{r} + \begin{bmatrix} d\theta_X \\ d\theta_Y \\ d\theta_Z \end{bmatrix} \times \mathbf{r} \,. \tag{B.152}
\end{aligned}
\]

Thus we have the infinitesimal change in r

\[
d\mathbf{r} = \mathbf{r}' - \mathbf{r}
= \begin{bmatrix} d\theta_X \\ d\theta_Y \\ d\theta_Z \end{bmatrix} \times \mathbf{r} \,. \tag{B.153}
\]

Dividing each side of Eq. B.153 by dt gives

\[
\dot{\mathbf{r}} = \begin{bmatrix} \frac{d\theta_X}{dt} \\[2pt] \frac{d\theta_Y}{dt} \\[2pt] \frac{d\theta_Z}{dt} \end{bmatrix} \times \mathbf{r}
\;\triangleq\; \boldsymbol{\omega} \times \mathbf{r} \,, \tag{B.154}
\]

where

\[
\boldsymbol{\omega} \triangleq \begin{bmatrix} \frac{d\theta_X}{dt} \\[2pt] \frac{d\theta_Y}{dt} \\[2pt] \frac{d\theta_Z}{dt} \end{bmatrix} \,. \tag{B.155}
\]

Thus, a constant-magnitude vector r with angular velocity ω has a time derivative given by ṙ = ω × r.

B.7.2 Time-Derivative of an Arbitrary Vector

Let the orthogonal unit vectors (x̂_A, ŷ_A, ẑ_A) define a coordinate system C_A and the orthogonal unit vectors (x̂_B, ŷ_B, ẑ_B) define a coordinate system C_B. Coordinate system C_A is fixed in space and does not rotate. The origin of coordinate system C_B translates relative to coordinate system C_A with translational velocity ṙ_{A,B} and translational acceleration r̈_{A,B}. Coordinate system C_B also rotates with angular velocity ω_{B/A} and angular acceleration ω̇_{B/A} relative to coordinate system C_A.

Position  A point P has C_B coordinates denoted by

\[
\mathbf{r}_{B,P} = x_{B,P}\, \hat{\mathbf{x}}_B + y_{B,P}\, \hat{\mathbf{y}}_B + z_{B,P}\, \hat{\mathbf{z}}_B \,, \tag{B.156}
\]

and CA coordinates denoted by

\[
\begin{aligned}
\mathbf{r}_{A,P} &= x_{A,P}\, \hat{\mathbf{x}}_A + y_{A,P}\, \hat{\mathbf{y}}_A + z_{A,P}\, \hat{\mathbf{z}}_A \\
&= \mathbf{r}_{A,B} + \mathbf{r}_{B,P} \,. \tag{B.157}
\end{aligned}
\]

Velocity The time derivative of rB,P relative to CB is given by

\[
\dot{\mathbf{r}}_{B,P}\big|_B \triangleq \dot{x}_{B,P}\, \hat{\mathbf{x}}_B + \dot{y}_{B,P}\, \hat{\mathbf{y}}_B + \dot{z}_{B,P}\, \hat{\mathbf{z}}_B \,. \tag{B.158}
\]

Since the unit vectors defining C_B are rotating relative to C_A, the same vector will appear to have a different time derivative relative to C_A

\[
\dot{\mathbf{r}}_{B,P}\big|_A \triangleq \dot{x}_{B,P}\, \hat{\mathbf{x}}_B + \dot{y}_{B,P}\, \hat{\mathbf{y}}_B + \dot{z}_{B,P}\, \hat{\mathbf{z}}_B
+ x_{B,P}\, \frac{d\hat{\mathbf{x}}_B}{dt} + y_{B,P}\, \frac{d\hat{\mathbf{y}}_B}{dt} + z_{B,P}\, \frac{d\hat{\mathbf{z}}_B}{dt} \,. \tag{B.159}
\]

The unit vectors defining C_B are of constant magnitude and have time derivatives given by Eq. B.154

\[
\begin{aligned}
\dot{\mathbf{r}}_{B,P}\big|_A &= \dot{x}_{B,P}\, \hat{\mathbf{x}}_B + \dot{y}_{B,P}\, \hat{\mathbf{y}}_B + \dot{z}_{B,P}\, \hat{\mathbf{z}}_B
+ x_{B,P}\, \boldsymbol{\omega}_{B/A} \times \hat{\mathbf{x}}_B + y_{B,P}\, \boldsymbol{\omega}_{B/A} \times \hat{\mathbf{y}}_B + z_{B,P}\, \boldsymbol{\omega}_{B/A} \times \hat{\mathbf{z}}_B \\
&= \dot{\mathbf{r}}_{B,P}\big|_B + \boldsymbol{\omega}_{B/A} \times \mathbf{r}_{B,P} \,. \tag{B.160}
\end{aligned}
\]

Using these results, the time derivative of r_{A,P} is given by

\[
\begin{aligned}
\dot{\mathbf{r}}_{A,P}\big|_A &= \dot{\mathbf{r}}_{A,B}\big|_A + \dot{\mathbf{r}}_{B,P}\big|_A \\
&= \dot{\mathbf{r}}_{A,B}\big|_A + \dot{\mathbf{r}}_{B,P}\big|_B + \boldsymbol{\omega}_{B/A} \times \mathbf{r}_{B,P} \,. \tag{B.161}
\end{aligned}
\]

Acceleration  The acceleration of point P relative to C_B is given by

\[
\ddot{\mathbf{r}}_{B,P}\big|_B \triangleq \ddot{x}_{B,P}\, \hat{\mathbf{x}}_B + \ddot{y}_{B,P}\, \hat{\mathbf{y}}_B + \ddot{z}_{B,P}\, \hat{\mathbf{z}}_B \,. \tag{B.162}
\]

The acceleration of point P relative to C_A is given by

\[
\begin{aligned}
\ddot{\mathbf{r}}_{A,P}\big|_A &= \frac{d}{dt}\left( \dot{\mathbf{r}}_{A,B}\big|_A + \dot{\mathbf{r}}_{B,P}\big|_B + \boldsymbol{\omega}_{B/A} \times \mathbf{r}_{B,P} \right) \\
&= \ddot{\mathbf{r}}_{A,B}\big|_A + \ddot{\mathbf{r}}_{B,P}\big|_B + \boldsymbol{\omega}_{B/A} \times \frac{d\mathbf{r}_{B,P}}{dt}\bigg|_B + \dot{\boldsymbol{\omega}}_{B/A} \times \mathbf{r}_{B,P} + \boldsymbol{\omega}_{B/A} \times \dot{\mathbf{r}}_{B,P}\big|_A \\
&= \ddot{\mathbf{r}}_{A,B}\big|_A + \ddot{\mathbf{r}}_{B,P}\big|_B + 2\boldsymbol{\omega}_{B/A} \times \dot{\mathbf{r}}_{B,P}\big|_B + \dot{\boldsymbol{\omega}}_{B/A} \times \mathbf{r}_{B,P} + \boldsymbol{\omega}_{B/A} \times \left( \boldsymbol{\omega}_{B/A} \times \mathbf{r}_{B,P} \right) \,. \tag{B.163}
\end{aligned}
\]

Time Derivatives Using DCMs  Equivalent results can be obtained with the use of direction cosine matrices. This approach is advantageous when the time derivatives of the DCM are known analytic functions. The point P has C_A coordinates denoted by

\[
\mathbf{r}^{(A)}_{A,P} = \mathbf{r}^{(A)}_{A,B} + R^A_B\, \mathbf{r}^{(B)}_{B,P} \,. \tag{B.164}
\]

The time-derivative of this expression is given by

\[
\dot{\mathbf{r}}^{(A)}_{A,P} = \dot{\mathbf{r}}^{(A)}_{A,B} + R^A_B\, \dot{\mathbf{r}}^{(B)}_{B,P} + \dot{R}^A_B\, \mathbf{r}^{(B)}_{B,P} \,. \tag{B.165}
\]

The second time-derivative is given by

\[
\ddot{\mathbf{r}}^{(A)}_{A,P} = \ddot{\mathbf{r}}^{(A)}_{A,B} + R^A_B\, \ddot{\mathbf{r}}^{(B)}_{B,P} + 2\dot{R}^A_B\, \dot{\mathbf{r}}^{(B)}_{B,P} + \ddot{R}^A_B\, \mathbf{r}^{(B)}_{B,P} \,. \tag{B.166}
\]

Evaluating Eq. B.161 in coordinate system C_A gives

\[
\begin{aligned}
\dot{\mathbf{r}}^{(A)}_{A,P} &= \left( \dot{\mathbf{r}}_{A,B}\big|_A \right)^{(A)} + \left( \dot{\mathbf{r}}_{B,P}\big|_B + \boldsymbol{\omega}_{B/A} \times \mathbf{r}_{B,P} \right)^{(A)} \\
&= \left( \dot{\mathbf{r}}_{A,B}\big|_A \right)^{(A)} + R^A_B \left( \dot{\mathbf{r}}_{B,P}\big|_B \right)^{(B)} + \boldsymbol{\omega}^{(A)}_{B/A} \times \left( R^A_B\, \mathbf{r}^{(B)}_{B,P} \right) \,. \tag{B.167}
\end{aligned}
\]

Comparing this result with Eq. B.165 gives

\[
\dot{\mathbf{r}}^{(A)}_{A,B} \triangleq \left( \dot{\mathbf{r}}_{A,B}\big|_A \right)^{(A)} \tag{B.168a}
\]
\[
\dot{\mathbf{r}}^{(B)}_{B,P} \triangleq \left( \dot{\mathbf{r}}_{B,P}\big|_B \right)^{(B)} \tag{B.168b}
\]
\[
\dot{R}^A_B \triangleq \boldsymbol{\omega}^{(A)}_{B/A} \times R^A_B \,. \tag{B.168c}
\]

Similarly, evaluating Eq. B.163 in coordinate system C_A gives

\[
\begin{aligned}
\ddot{\mathbf{r}}^{(A)}_{A,P} &= \left( \ddot{\mathbf{r}}_{A,B}\big|_A \right)^{(A)} + \left[ \ddot{\mathbf{r}}_{B,P}\big|_B + 2\boldsymbol{\omega}_{B/A} \times \dot{\mathbf{r}}_{B,P}\big|_B + \dot{\boldsymbol{\omega}}_{B/A} \times \mathbf{r}_{B,P} + \boldsymbol{\omega}_{B/A} \times \left( \boldsymbol{\omega}_{B/A} \times \mathbf{r}_{B,P} \right) \right]^{(A)} \\
&= \left( \ddot{\mathbf{r}}_{A,B}\big|_A \right)^{(A)} + R^A_B \left( \ddot{\mathbf{r}}_{B,P}\big|_B \right)^{(B)} + 2\boldsymbol{\omega}^{(A)}_{B/A} \times R^A_B \left( \dot{\mathbf{r}}_{B,P}\big|_B \right)^{(B)} \\
&\quad + \dot{\boldsymbol{\omega}}^{(A)}_{B/A} \times R^A_B\, \mathbf{r}^{(B)}_{B,P} + \boldsymbol{\omega}^{(A)}_{B/A} \times \left( \boldsymbol{\omega}^{(A)}_{B/A} \times R^A_B\, \mathbf{r}^{(B)}_{B,P} \right) \\
&= \left( \ddot{\mathbf{r}}_{A,B}\big|_A \right)^{(A)} + R^A_B \left( \ddot{\mathbf{r}}_{B,P}\big|_B \right)^{(B)} + 2\boldsymbol{\omega}^{(A)}_{B/A} \times R^A_B \left( \dot{\mathbf{r}}_{B,P}\big|_B \right)^{(B)} \\
&\quad + \left[ \dot{\boldsymbol{\omega}}^{(A)}_{B/A} \times R^A_B + \boldsymbol{\omega}^{(A)}_{B/A} \times \left( \boldsymbol{\omega}^{(A)}_{B/A} \times R^A_B \right) \right] \mathbf{r}^{(B)}_{B,P} \,. \tag{B.169}
\end{aligned}
\]

Comparing this result with Eq. B.166 gives

\[
\ddot{\mathbf{r}}^{(A)}_{A,B} \triangleq \left( \ddot{\mathbf{r}}_{A,B}\big|_A \right)^{(A)} \tag{B.170a}
\]
\[
\ddot{\mathbf{r}}^{(B)}_{B,P} \triangleq \left( \ddot{\mathbf{r}}_{B,P}\big|_B \right)^{(B)} \tag{B.170b}
\]
\[
\ddot{R}^A_B \triangleq \dot{\boldsymbol{\omega}}^{(A)}_{B/A} \times R^A_B + \boldsymbol{\omega}^{(A)}_{B/A} \times \left( \boldsymbol{\omega}^{(A)}_{B/A} \times R^A_B \right) \,. \tag{B.170c}
\]

A Note on Vector Derivatives and Notation  Even if coordinate systems C_A and C_B have a common origin,

\[
\left( \dot{\mathbf{r}}_{A,P}\big|_A \right)^{(A)} \neq R^A_B \left( \dot{\mathbf{r}}_{A,P}\big|_B \right)^{(B)} \,. \tag{B.171}
\]

This is why we distinguish between ṙ_{A,P}|_A and ṙ_{A,P}|_B. That is, ṙ_{A,P} alone may be somewhat ambiguous because it has no coordinate system specified. When one sees a vector derivative in this form, it should be assumed that it is being evaluated relative to the coordinate system denoted by the first letter of the subscript, as in Eqs. B.165 and B.167.

Time-Derivatives of the DCM  Let the matrix R^i_j be a direction cosine matrix. According to the DCM interpretation, the matrix R^i_j can be thought of as a matrix of column vectors in the coordinate system C_i, each with angular velocity ω_{j/i}. According to Eq. B.154, the time derivative of the matrix R^i_j is given by

\[
\dot{R}^i_j = \boldsymbol{\omega}^{(i)}_{j/i} \times R^i_j \,. \tag{B.172}
\]

The second time derivative of the DCM is given by

\[
\begin{aligned}
\ddot{R}^i_j &= \frac{d}{dt}\left( \boldsymbol{\omega}^{(i)}_{j/i} \times R^i_j \right) \\
&= \dot{\boldsymbol{\omega}}^{(i)}_{j/i} \times R^i_j + \boldsymbol{\omega}^{(i)}_{j/i} \times \dot{R}^i_j \\
&= \dot{\boldsymbol{\omega}}^{(i)}_{j/i} \times R^i_j + \boldsymbol{\omega}^{(i)}_{j/i} \times \left( \boldsymbol{\omega}^{(i)}_{j/i} \times R^i_j \right) \,. \tag{B.173}
\end{aligned}
\]

An alternate proof of these relationships is given by Eqs. B.168 and B.170.
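Eq. B.172 can be verified numerically for a simple spinning frame. In the sketch below (illustrative; function names and rates are arbitrary), a DCM that rotates at a constant rate about the z-axis is differentiated by central differences and compared against ω^{(i)} × R, with the cross product applied column-wise via a skew-symmetric matrix.

```python
import numpy as np

def skew(w):
    """Cross-product (skew-symmetric) matrix: skew(w) @ v == np.cross(w, v)."""
    return np.array([[0, -w[2], w[1]],
                     [w[2], 0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rot_z(a):
    """DCM whose columns are frame-j unit vectors rotated by angle a about z."""
    return np.array([[np.cos(a), -np.sin(a), 0],
                     [np.sin(a), np.cos(a), 0],
                     [0, 0, 1.0]])

# R(t) spins at a constant rate about z, so omega = (0, 0, rate)
rate, t, dt = 0.7, 1.3, 1e-6
R_dot_fd = (rot_z(rate * (t + dt)) - rot_z(rate * (t - dt))) / (2 * dt)
R_dot_eq = skew([0.0, 0.0, rate]) @ rot_z(rate * t)   # Eq. B.172
assert np.allclose(R_dot_fd, R_dot_eq, atol=1e-6)
```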

References

[1] American Heritage College Dictionary, 3rd Ed., Houghton Mifflin Co., New York, 1993, ISBN 0-395-66917-0.

[2] Anderson, E. W., "Navigational Principles as Applied to Animals", The Duke of Edinburgh Lecture, Royal Institute of Navigation, Journal of Navigation, Vol. 35, No. 1, 1982, pp. 1-27.

[3] Anderson, Brian, and John Moore, Optimal Filtering, Dover Publications, New York, 2005, ISBN 0-486-43938-0.

[4] Aggarwal, Romesh, and Daniel Boudreau, "An Implementable High-Order Guidance Law for Homing Missiles," IEEE Conference on Decision and Control, New Orleans, LA, December 11-13, 1985.

[5] Anderson, G. M., "Comparison of Optimal Control and Differential Game Intercept Missile Guidance Laws," Journal of Guidance and Control, Vol. 4, No. 2, 1981, pp. 109-115.

[6] Arulampalam, Sanjeev M., et al., "A Tutorial on Particle Filters for Online Nonlinear/Non-Gaussian Bayesian Tracking," IEEE Trans. on Signal Processing, Vol. 50, No. 2, Feb. 2002, pp. 174-188.

[7] Bar-Shalom, Yaakov, and Thomas E. Fortmann, Tracking and Data Association, Mathematics in Science and Engineering Vol. 179, Academic Press, 1988, ISBN 0-12-079760-7.

[8] Bellman, Richard, Dynamic Programming, Princeton University Press, Princeton, N. J., 1957.

[9] Ben-Asher, Joseph Z., and Isaac Yaesh, "Optimal Guidance with Reduced Sensitivity to Time-to-Go Estimation Errors," Journal of Guidance, Control, and Dynamics, Vol. 20, No. 1, 1997, pp. 158-163.

[10] Ben-Asher, Joseph Z., and Isaac Yaesh, Advances in Missile Guidance Theory, Progress in Astronautics and Aeronautics, Vol. 180, 1998.

[11] Bezick, Scott, et al., "Guidance of a Homing Missile via Nonlinear Geometric Control Methods," AIAA J. Guidance, Control and Dynamics, Vol. 18, No. 3, June 1995, pp. 441-448.

[12] Blackman, Samuel, and Robert Popoli, Design and Analysis of Modern Tracking Systems, Artech House, Boston, 1999, ISBN 1-58053-006-0.

[13] Blaquiere, A., F. Gerard, and G. Leitmann, Quantitative and Qualitative Games, Academic Press, New York, 1969.

[14] Blake, Andrew, and Michael Isard, Active Contours, Springer, 1998, ISBN 3-540-76217-5.

[15] Blom, H. A. P., "An Efficient Filter for Abruptly Changing Systems," Proc. 23rd IEEE Conf. on Decision and Control, Las Vegas, NV, Dec. 1984, pp. 656-658.

[16] Blom, H. A. P., and Y. Bar-Shalom, "The Interacting Multiple Model Algorithm for Systems with Markovian Switching Coefficients," IEEE Trans. on Automatic Control, Vol. 33, No. 8, Aug. 1988, pp. 780-783.

[17] Britting, Kenneth R., Inertial Navigation Systems Analysis, John Wiley and Sons, 1972.

[18] Brogan, William L., Modern Control Theory, 3rd Ed., Prentice Hall, 1991, ISBN 0-13-589763-7.

[19] Brown, Robert, and Patrick Hwang, Introduction to Random Signals and Applied Kalman Filtering, 3rd Ed., John Wiley & Sons, 1997, ISBN 0-471-12839-2.

[20] Bryson, Arthur E. Jr., and Yu-Chi Ho, Applied Optimal Control, Ginn and Company, 1969.

[21] Casella, George, and Roger L. Berger, Statistical Inference, 2nd Ed., Duxbury, 2002, ISBN 0-534-24312-6.

[22] Chiou, Ying-Chwan, and Chen-Yuan Kuo, "Geometric Approach to Three Dimensional Missile Guidance Problem," AIAA J. of Guidance, Control, and Dynamics, Vol. 21, No. 2, April 1998, pp. 335-341.

[23] Cottrell, R. G., "Optimal Intercept Guidance for Short-Range Tactical Missiles," AIAA Journal, Vol. 9, No. 7, 1971, pp. 1414-1415.

[24] Curry, Renwick E., "A Separation Theorem for Nonlinear Measurements," IEEE Trans. on Automatic Control, Vol. 14, No. 5, Oct. 1969, pp. 561-564.

[25] Duflos, E., et al., "General 3D Guidance Law Modeling," IEEE International Conference on Systems, Man and Cybernetics, Vol. 3, Oct. 1995, pp. 2013-2018.

[26] Daum, Frederick E., and Robert J. Fitzgerald, "Decoupled Kalman Filters for Phased Array Radar Tracking," IEEE Transactions on Automatic Control, Vol. 28, No. 3, March 1983, pp. 269-283.

[27] Djuric, Petar M., et al., "Particle Filtering," IEEE Signal Processing Magazine, September 2003.

[28] Fitts, J., Video Correlation Tracker, U.S. Pat. 4,133,004, 1979.

[29] Fitts, J., "Correlation Tracking via Optimal Weighting Functions," Hughes Aircraft Co., El Segundo, CA, Report Number P73-240, April 1973.

[30] Fossier, M. W., "The Development of Radar Homing Missiles," Journal of Guidance, Vol. 7, No. 6, 1984, pp. 641-651.

[31] Garnell, P., Guided Weapon Control Systems, 2nd Ed., Pergamon Press Inc., New York, 1980.

[32] Gelb, Arthur, Applied Optimal Estimation, M.I.T. Press, 2001, ISBN 0-262-57048-3.

[33] Geng, Z. Jason, and Claire L. McCullough, "Missile Control Using Fuzzy Cerebellar Model Arithmetic Computer Neural Networks," AIAA J. of Guidance, Control and Dynamics, June 1997, pp. 557-565.

[34] Ginsberg, Jerry H., Advanced Engineering Dynamics, Cambridge University Press, 1998.

[35] Gonzalez, J., "New Methods in Terminal Guidance and Control of Tactical Missiles," Guidance and Control for Tactical Guided Weapons with Emphasis on Simulation & Testing, Technical Editing and Reproduction Ltd, Harford House, 709 Charlotte St, London, W1P1HD, 1979.

[36] Greenwood, Donald T., Principles of Dynamics, Prentice Hall, 1965, ISBN 64-19700.

[37] Guelman, M. and J. Shinar, "Optimal Guidance Law in the Plane," AIAA Journal of Guidance, Vol. 7, No. 4, July-August 1984, pp. 471-476.

[38] Guelman, M., and O. M. Golan, "Three-Dimensional Minimum Energy Guidance," IEEE Transactions on Automatic Control, Vol. 31, No. 2, April 1995, pp. 835-841.

[39] Gunckel, T. L., II, and G. F. Franklin, "A General Solution for Linear, Sampled-Data Control," ASME Journal of Basic Engineering, Vol. 85D, June 1963, pp. 197-203.

[40] Gurstelle, William, The Art of the Catapult, Chicago Review Press, Chicago, Illinois, 2004, ISBN-13: 978-1-55652-526-1.

[41] Gutman, S., and G. Leitmann, "Stabilizing Feedback Control for Dynamical Systems with Bounded Uncertainty," Proc. IEEE Conference on Decision and Control, 1976, pp. 94-99.

[42] Gurvine, R. Jeff, and Edwin G. Stauss, Fundamentals of Tactical Missiles, Raytheon Missile Systems, 1999.

[43] Ho, Y. C., Bryson, A. E., and S. Baron, "Differential Games and Optimal Pursuit-Evasion Strategies," IEEE Transactions on Automatic Control, Vol. AC-10, No. 4, Oct. 1965, pp. 385-389.

[44] Isaacs, R., "Differential Games I, II, III, IV," RAND Corporation Research Memorandum RM-1391, 1399, 1411, 1468, 1954-1956.

[45] Isaacs, R., Differential Games, Wiley, New York, 1965.

[46] Kalman, R. E., "A New Approach to Linear Filtering and Prediction Problems," J. Basic Engineering, Trans. ASME, Series D, Vol. 82, No. 1, March 1960, pp. 35-45.

[47] Kirk, Donald E., Optimal Control Theory: An Introduction, Dover Publications, 2004.

[48] Lee, G. K. F., "Estimation of the Time-to-Go Parameter for Air-to-Air Missiles," Journal of Guidance, Control, and Dynamics, Vol. 8, No. 2, 1985, pp. 262-266.

[49] Lee, R. G., et al., Guided Weapons, 3rd Ed., Brassey's, Washington, 1997.

[50] Lechevin, N., and C. A. Rabbath, "Lyapunov-Based Nonlinear Missile Guidance," AIAA J. Guidance, Control, and Dynamics, Vol. 27, No. 6, 2004, pp. 1096-1101.

[51] Lin, Chih-Min, and Chun-Fei Hsu, "Guidance Law Design by Adaptive Fuzzy Sliding-Mode Control," AIAA J. of Guidance, Control, and Dynamics, Vol. 25, No. 2, April 2002, pp. 248-254.

[52] Lin, Chun-Liang, and Huai-Wen Su, "Adaptive Fuzzy Gain Scheduling in Guidance System Design," J. of Guidance, Control and Dynamics, Vol. 24, No. 4, August 2001, pp. 683-692.

[53] Mahapatra, P. R., and U. S. Shukla, "The Advantages of Velocity Vector Referencing in Proportional Navigation," IEEE Position Location and Navigation Symposium 1990, IEEE PLANS, 1990, pp. 102-109.

[54] Mendel, Jerry M., Lessons in Estimation Theory for Signal Processing, Communications, and Control, Prentice Hall, NJ, 1995, ISBN 0-13-120981-7.

[55] Montera, Dennis A., et al., "Object Tracking through Adaptive Correlation," Optical Engineering, Vol. 33, No. 1, 1994, pp. 294-302.

[56] Montgomery, Douglas, Design and Analysis of Experiments, 5th Ed., John Wiley & Sons, 2001, ISBN 0-471-31649-0.

[57] Morgan, R. Wes, Coordinate Frames, Dynamics and Geodesy for Simulation of Aerospace Systems, unpublished work, 2006, 166 pages.

[58] Winters, Niall, Jose Gaspar, Gerard Lacey, and Jose Santos-Victor, "Omni-Directional Vision for Robot Navigation," IEEE Workshop on Omnidirectional Vision (OMNIVIS'00), 2000.

[59] Nesline, F. William, and Paul Zarchan, "A New Look at Classical vs Modern Homing Missile Guidance," AIAA J. Guidance and Control, Vol. 4, No. 1, Feb. 1981, pp. 78-85.

[60] Nesline, F. William Jr., and Paul Zarchan, "Missile Guidance Design Tradeoffs for High-Altitude Air Defense," AIAA J. Guidance and Control, Vol. 6, No. 3, May-June 1983, pp. 207-212.

[61] Nesline, F. William Jr., and Paul Zarchan, "Digital Homing Guidance—Stability vs Performance Tradeoffs," AIAA J. Guidance and Control, Vol. 8, No. 2, March-April 1985, pp. 255-261.

[62] Neumann, John von, and Oskar Morgenstern, Theory of Games and Economic Behavior, Princeton University Press, Princeton, N. J., 1944.

[63] Papoulis, Athanasios, Probability, Random Variables and Stochastic Processes, 4th Ed., McGraw-Hill, 2002, ISBN 0-07-112256-7.

[64] Pastrick, H. J., et al., "Guidance Laws for Short Range Tactical Missiles," AIAA J. of Guidance and Control, Vol. 4, No. 2, 1981, pp. 98-108.

[65] Pearson III, John Bartling, "Basic Studies in Airborne Radar Tracking Systems," Ph.D. Dissertation, University of California at Los Angeles, September 1970.

[66] Pearson III, John Bartling, "Kalman Filter Applications in Airborne Radar Tracking Systems," IEEE Transactions on Aerospace and Electronic Systems, Vol. 10, No. 3, May 1974, pp. 319-329.

[67] Pontryagin, L. S., et al., The Mathematical Theory of Optimal Processes, John Wiley, New York, 1962.

[68] Raytheon Missile Systems, www.raytheon.com, 2007.

[69] Riggs, T. L. Jr., "Linear optimal guidance for short range air-to-air missiles," Proceedings of NAECON, Vol. 2, Oakland, MI, 1979, 757-764.

[70] Rusnak, I., and L. Meir, "Optimal Guidance for High Order and Acceleration Constrained Missile," Journal of Guidance, Control and Dynamics, Vol. 14, No. 3, May-June 1991, pp. 589-596.

[71] Rusnak, I., and L. Meir, "Optimal Guidance for Acceleration Constrained Missile and Maneuvering Target," IEEE Transactions on Aerospace and Electronic Systems, Vol. 26, No. 4, July 1990, pp. 618-624.

[72] Rusnak, I., "Advanced Guidance Laws for Acceleration-Constrained Missile, Randomly Maneuvering Target and Noisy Measurements," IEEE Transactions on Aerospace and Electronic Systems, Vol. 32, No. 1, January 1996, pp. 456-464.

[73] Rusnak, I., and A. Guez, "Optimal adaptive control of uncertain stochastic linear systems", IEEE Proc. of ACC, Vol. 4, June 1995, pp. 2520-2524.

[74] Sawasaki, Naoyuki, et al., "Design and Implementation of High-Speed Visual Tracking System for Real-Time Motion Analysis," IEEE Proc. of ICPR, 1996, pp. 478-483.

[75] Schmotzer, R., and G. Blankenship, "A Simple Proof of the Separation Theorem for Linear Stochastic Systems with Time Delays," IEEE Transactions on Automatic Control, Vol. 23, No. 4, August 1978, pp. 734-735.

[76] Searle, S. R., Linear Models, John Wiley & Sons, 1971, ISBN: 0-471-18499-3.

[77] Searle, S. R., Matrix Algebra Useful for Statistics, John Wiley & Sons, 1982, ISBN 0-471-86681-4.

[78] Shneydor, N. A., Missile Guidance and Pursuit, Coll House, Westergate, Chichester, West Sussex, PO20 6QL, England, 1998.

[79] Shukla, U. S., and P. R. Mahapatra, "The Proportional Navigation Dilemma, Pure or True?," IEEE Transactions on Aerospace and Electronic Systems, Vol. 26, No. 2, March 1990, pp. 382-392.

[80] Singer, Robert A., "Estimating Optimal Tracking Filter Performance for Manned Maneuvering Targets," IEEE Trans. on Aerospace and Electronic Systems, Vol. AES-6, No. 4, July 1970, pp. 473-483.

[81] Slotine, J. E., and Weiping Li, Applied Nonlinear Control, Prentice Hall, 1991, ISBN 0-13-040890-5.

[82] Song, S., and I. Ha, "A Lyapunov-Like Approach to Performance Analysis of 3-Dimensional Pure PNG Laws," IEEE Transactions on Aerospace and Electronic Systems, Vol. 30, No. 1, January 2004, pp. 238-248.

[83] Sonka, Milan, et al., Image Processing, Analysis, and Machine Vision, 2nd Ed., Brooks/Cole Publishing Co., 1999, ISBN 0-534-95393-X.

[84] Stallard, David V., "Near-Optimal Stochastic Terminal Controllers," Ph.D. Thesis, MIT, 1971.

[85] Stark, Henry, and John Woods, Probability and Random Processes with Applications to Signal Processing, 3rd Ed., Prentice Hall, 2002, ISBN 0-13-020071-9.

[86] Stengel, Robert F., Optimal Control and Estimation, Dover Publications, New York, 1994, ISBN 0-486-68200-5.

[87] Stephan, Larisa, et al., "Fitts Correlation Tracker Fidelity in the Presence of Target Translation, Rotation, and Size Change," Proc. of SPIE, Vol. 4714, 2002, pp. 196-207.

[88] Stevens, Brian L., and Frank L. Lewis, Aircraft Control and Simulation, John Wiley & Sons, New York, 1992.

[89] Tao, Hai, "Object Tracking and Kalman Filtering," Department of Computer Engineering, University of California Santa Cruz, CMPE 264 course notes.

[90] Tahk, Min-Jea, et al., "Recursive Time-to-Go Estimation for Homing Guidance Missiles," IEEE Transactions on Aerospace and Electronic Systems, Vol. 38, No. 1, January 2002, pp. 13-23.

[91] Toomay, J. C., and Paul J. Hannen, Radar Principles for the Non-specialist, 3rd Ed., Scitech Publishing, 2004.

[92] Vincent, Thomas L., and Walter J. Grantham, Nonlinear and Optimal Control Systems, Wiley, New York, 1997.

[93] Vincent, Thomas L., and Walter J. Grantham, Optimality in Parametric Systems, Wiley, New York, 1981.

[94] Vincent, Thomas L., and Walter J. Grantham, "Trajectory Following Methods in Control System Design," J. of Global Optimization, Vol. 23, Nos. 3-4, August 2002, pp. 267-282.

[95] Vincent, T. L., and R. W. Morgan, "Guidance against Maneuvering Targets Using Lyapunov Optimizing Feedback Control," Proceedings of the American Control Conference, Anchorage, AK, May 8-10, 2002, pp. 215-220.

[96] Wikipedia, The Free Encyclopedia, Wikipedia Foundation, Inc., http://en.wikipedia.org.

[97] Wiener, N., Extrapolation, Interpolation, and Smoothing of Stationary Time Series, Wiley, New York, 1949.

[98] Willems, G., "Optimal Controllers for Homing Missiles," Report No. RE-TR- 6815, U.S. Army Missile Command, Redstone Arsenal, Alabama, September 1968, Unclassified.

[99] Willems, G., "Optimal Controllers for Homing Missiles with Two Time Constants," Report No. RE-TR-69-20, U.S. Army Missile Command, Redstone Arsenal, Alabama, October 1969.

[100] Yuan, Luke Chia-Liu, "Homing and Navigational Courses of Automatic Target Seeking Devices," RCA Laboratories, Princeton, N. J., Report No. PTR-12C, Dec. 1943, and Journal of Applied Physics, Vol. 19, 1948, pp. 1122-1128.

[101] Zarchan, Paul, Tactical and Strategic Missile Guidance, 3rd Ed., Progress in Astronautics and Aeronautics, Vol. 176, Virginia, 1997.

[102] Zarchan, Paul, E. Greenberg, and J. Alpert, "Improving the High Altitude Performance of Tail-Controlled Endoatmospheric Missiles," AIAA Guidance, Navigation, and Control Conference and Exhibit, Monterey, California, August 5-8, 2002.