A New Paradigm in Optimal Missile Guidance
Item Type: text; Electronic Dissertation
Authors: Morgan, Robert W.
Publisher: The University of Arizona.
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Link to Item: http://hdl.handle.net/10150/194121

A New Paradigm in Optimal Missile Guidance
by Robert W. Morgan
A Dissertation Submitted to the Faculty of the
Department of Electrical & Computer Engineering
In Partial Fulfillment of the Requirements
For the Degree of
Doctor of Philosophy
In the Graduate College
The University of Arizona
2007
The University of Arizona Graduate College
As members of the Dissertation Committee, we certify that we have read the dissertation prepared by Robert W. Morgan entitled A New Paradigm in Optimal Missile Guidance and recommend that it be accepted as fulfilling the dissertation requirement for the
Degree of Doctor of Philosophy
Date: 04/05/2007 Dr. Hal Tharp
Date: 04/05/2007 Dr. Jeffrey J. Rodriguez
Date: 04/05/2007 Dr. Jerzy W. Rozenblit
Date: 04/05/2007 Dr. Thomas L. Vincent
Final approval and acceptance of this dissertation is contingent upon the candidate’s submission of the final copies of the dissertation to the Graduate College.
I hereby certify that I have read this dissertation prepared under my direction and recommend that it be accepted as fulfilling the dissertation requirement.
Date: 04/05/2007 Dissertation Director: Dr. Hal Tharp
Statement by Author
This dissertation has been submitted in partial fulfillment of requirements for an advanced degree at The University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the Library.
Brief quotations from this dissertation are allowable without special permission, provided that accurate acknowledgment of source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the head of the major department or the Dean of the Graduate College when in his or her judgment the proposed use of the material is in the interests of scholarship. In all other instances, however, permission must be obtained from the author.
Signed: Robert W. Morgan
Acknowledgments
The author wishes to express his appreciation to Professors J.J. Rodriguez, J.W. Rozenblit, H.S. Tharp, and T.L. Vincent for their service on his doctoral committee. Special acknowledgement is due the committee chairman, Professor H.S. Tharp, for his advice and efforts to review and scrutinize the content and quality of this dissertation. The author would also like to express his appreciation for the inspiration and encouragement given him by Professor T.L. Vincent throughout the author’s academic career.
This work was made possible by the generous fellowship awarded to the author by Raytheon Missile Systems. Several people at Raytheon have been instrumental to the author’s ability to participate in the program. Ron Reid, the director of the program, has been a source of encouragement from the very beginning. The author’s acceptance into and desire to participate in the program is, with great thanks, due to Martin Ulehla. Special acknowledgement is also due Chris Poage, who has been an exemplary professional and a friend to the author while participating in the program.
The author would also like to thank his wife, Angela, and his children for their support and encouragement these past four years.
Table of Contents
List of Figures ...... 11
List of Tables ...... 14
Abstract ...... 15
Chapter 1. Introduction ...... 16
1.1. Background and Scope ...... 16
1.2. History and State of the Art ...... 18
1.3. Organization ...... 25
1.4. Notation ...... 30
Chapter 2. Estimation ...... 32
2.1. General Concepts in Estimation ...... 32
2.1.1. Estimation Defined ...... 32
2.1.2. Markov Model ...... 33
2.1.3. Optimal Estimation and Risk ...... 34
2.2. Bayesian Estimation ...... 38
2.3. Multiple Model ...... 40
2.3.1. Interactive Multiple Model ...... 43
2.4. Linear Estimation ...... 43
2.4.1. Unbiased Estimation ...... 44
2.4.2. BLUE Estimation ...... 45
2.4.3. Least Square Estimation ...... 49
2.4.4. Maximum Likelihood Estimation ...... 50
2.4.5. Full Rank Models ...... 51
2.4.6. A Priori Estimates and Rank Deficient Models ...... 52
Chapter 3. Estimation in Linear Sampled Data Systems ...... 55
3.1. Bayes’ Estimator ...... 57
3.1.1. First Measurement ...... 58
3.1.2. kth Measurement and Final Result by Induction ...... 64
3.2. Estimates and Confidence Regions (Error Ellipsoids) for the Bayes’ Estimator ...... 67
3.3. The White Noise Assumption and Bayesian Estimation ...... 68
3.4. The Deterministic Input Assumption and Bayesian Estimation ...... 70
3.5. Bayesian Estimation Between Measurements ...... 71
3.6. No A Priori Information and Bayesian Estimation ...... 74
3.7. The Kalman Filter ...... 77
3.7.1. Covariance Simulations ...... 79
3.7.2. Scalar System Estimation Example ...... 79
3.8. Multiple Model ...... 83
Chapter 4. Stochastic Motion Models ...... 91
4.1. Markov Models ...... 91
4.1.1. Principle of Inertia ...... 92
4.2. Process Noise Models ...... 93
4.3. Random Walk ...... 95
4.3.1. Continuous Time Random Walk ...... 96
4.4. White Acceleration ...... 98
4.4.1. Discrete Equivalent ...... 99
4.5. Correlated Acceleration ...... 101
4.5.1. Discrete Equivalent ...... 101
Chapter 5. Optimization and Control Theory ...... 104
5.1. Basic Control Theory Concepts ...... 104
5.2. Parametric Optimization ...... 105
5.2.1. Constraints ...... 106
5.2.2. Necessary Conditions for a Local Minimum ...... 107
5.3. Lyapunov Control Theory ...... 111
5.3.1. Quickest Descent Control ...... 113
5.3.2. Quickest Descent with Minimum Incremental Cost ...... 115
5.4. Optimal Control Theory ...... 117
5.4.1. Optimal Return Function ...... 118
5.4.2. The Augmented State Vector and the Augmented State Space ...... 119
5.4.3. Temporal Boundary Conditions ...... 121
5.4.4. The Optimal Control H Function ...... 123
5.4.5. State Independent Control Constraints ...... 124
5.4.6. The Optimal Control Minimum Principle ...... 126
5.4.7. Linear Quadratic Regulator (LQR) ...... 127
5.4.8. Linear Systems with Process Noise ...... 133
5.4.9. Linear Systems with Process Noise and Measurement Noise ...... 134
5.5. Differential Game Theory ...... 140
5.5.1. System Definition ...... 140
5.5.2. Control Constraints ...... 141
5.5.3. Terminal Set ...... 141
5.5.4. The Payoff ...... 141
5.5.5. Games of Kind and Games of Degree ...... 142
5.5.6. Min-Max Principle ...... 143
5.5.7. Min-Max Necessary Conditions ...... 144
Chapter 6. Airframe and Autopilot Modeling ...... 145
6.1. Introduction ...... 145
6.2. Airframe Modeling of Tail Controlled Missiles ...... 151
6.2.1. Aerodynamic Forces ...... 151
6.2.2. Airframe Dynamics ...... 156
6.2.3. Airframe Transfer Function ...... 158
6.3. Three-Loop Autopilot ...... 162
6.3.1. High Bandwidth Actuator ...... 165
6.3.2. 3-Loop Summary ...... 166
6.3.3. 3-Loop Parameters ...... 166
6.3.4. 3-Loop Performance ...... 167
6.4. First Order (Pole) Approximation of Flight Control System ...... 168
6.5. Pole-Zero Approximation of Flight Control System ...... 170
6.6. Binomial Approximation of Flight Control System ...... 173
Chapter 7. Optimal Guidance ...... 176
7.1. Engagement Geometry and Dynamics ...... 176
7.2. Miss ...... 178
7.2.1. Acceleration Dynamics ...... 179
7.3. Zero Effort Miss (ZEM) ...... 181
7.3.1. Acceleration Dynamics ...... 182
7.4. Heading Error ...... 183
7.5. Basic Model ...... 185
7.5.1. Augmented Proportional Navigation Guidance ...... 188
7.5.2. Proportional Navigation Guidance ...... 188
7.6. Acceleration Dynamics ...... 189
7.6.1. Decoupled Dynamics ...... 192
7.7. Single Pole Flight Control System Model ...... 193
7.8. Optimal Evasion ...... 198
7.9. Magnitude Constraints (Saturation) ...... 203
7.9.1. Single Pole Flight Control System ...... 205
7.10. Directional Constraints ...... 207
7.10.1. Simulation ...... 212
7.10.2. Engagement Configuration ...... 215
7.10.3. Trajectory Shaping ...... 216
7.10.4. Navigation Ratio ...... 219
7.11. Endgame Geometry and Final Time tf ...... 223
7.11.1. Approximate Final Time Estimate ...... 224
7.11.2. Unconstrained Missile Acceleration ...... 225
Chapter 8. Estimating the Zero Effort Miss ...... 227
8.1. Target Modeling ...... 228
8.1.1. Step Change in Target Acceleration ...... 228
8.1.2. Uncorrelated Target Acceleration ...... 229
8.1.3. Correlated Target Acceleration ...... 230
8.1.4. Optimal Evasion ...... 232
8.1.5. Summary ...... 232
8.2. Flight Control System Modeling ...... 233
8.2.1. Single Pole Flight Control System ...... 234
8.2.2. Fast Flight Control System ...... 235
8.2.3. Summary ...... 236
8.3. Estimation ...... 236
8.3.1. Kinematic Estimator Equations ...... 237
8.3.2. Range and Range-Rate Filter ...... 242
8.3.3. Azimuth Angle Filter ...... 244
8.3.4. Elevation Angle Filter ...... 245
8.3.5. Estimating the ZEM Using Decoupled Filter Estimates ...... 245
8.4. Estimation without Range Information ...... 248
8.5. Lack of A Priori Information ...... 253
Chapter 9. Guidance Strategies ...... 255
9.1. Guidance Strategies and Information Constraints ...... 255
9.1.1. Zero Effort Miss Guidance Strategy ...... 256
9.1.2. Parallel Navigation Strategy ...... 257
9.1.3. Direct Pursuit ...... 261
9.2. Control Strategies and Control Constraints ...... 262
9.3. Guidance Strategy Paradigm ...... 263
Chapter 10. Applications of the Strategy Paradigm and Conclusion ...... 266
10.1. Lyapunov Guidance ...... 266
10.1.1. Direct Pursuit Guidance ...... 267
10.1.2. Zero-Effort Miss Guidance ...... 269
10.2. Extending PPN for Maneuvering Targets ...... 273
10.2.1. Pure Augmented Proportional Navigation ...... 275
10.2.2. Numerical Results ...... 277
10.3. Conclusions and Recommendations for Future Research ...... 284
10.3.1. Summary of Dissertation ...... 285
10.3.2. Contributions ...... 286
10.3.3. Future Research ...... 290
Appendix A. Probability and Stochastic Processes ...... 292
A.1. Concepts in Probability ...... 292
A.2. Random Variables ...... 294
A.3. Moment Generating Functions ...... 295
A.3.1. Computation of Moments ...... 296
A.3.2. Functions of Independent Random Variables ...... 297
A.3.3. Marginal Distributions ...... 298
A.4. Normal (Gaussian) Distribution ...... 299
A.4.1. Confidence Intervals ...... 301
A.4.2. Sum and Difference of Independent Normally Distributed Random Variables ...... 302
A.4.3. Relationship to Chi-Square Distribution ...... 303
A.5. Gamma Distribution ...... 304
A.5.1. Sum of Independently Distributed Gamma Random Variables ...... 308
A.6. Chi-Square Distribution ...... 309
A.6.1. Sum of Independently Distributed Chi-Square Random Variables ...... 312
A.7. Multivariate Normal Distribution ...... 312
A.7.1. Confidence Regions for Multivariate Normal Random Variables ...... 321
A.8. Random Processes and Random Sequences ...... 334
A.8.1. White Processes and Sequences ...... 335
A.8.2. Markov Random Sequences ...... 336
A.9. Innovations Sequence ...... 337
Appendix B. Concepts from Systems Theory ...... 340
B.1. Common Signals ...... 340
B.1.1. Deterministic Signals ...... 340
B.1.2. Stochastic Signals ...... 342
B.2. Convolution and Impulse Response ...... 345
B.2.1. Principle of Superposition ...... 345
B.2.2. Homogeneous System ...... 346
B.2.3. The Particular Solution ...... 348
B.2.4. Time-Invariant Systems ...... 350
B.3. Linear Systems with Stochastic Inputs ...... 351
B.4. Covariance and State Propagation ...... 355
B.5. Shaping Filters ...... 357
B.5.1. Continuous-Time Shaping Filters for Deterministic Signals ...... 357
B.5.2. Continuous-Time Shaping Filters for Stochastic Signals ...... 361
B.5.3. Discrete-Time Shaping Filters for Stochastic Signals ...... 366
B.6. Coordinate Frames ...... 368
B.6.1. Direction Cosine Matrix (DCM) ...... 368
B.6.2. Euler Angles ...... 370
B.7. Dynamics (Kinematics) ...... 372
B.7.1. Time-Derivative of a Constant Magnitude Vector ...... 373
B.7.2. Time-Derivative of an Arbitrary Vector ...... 374
References ...... 378
List of Figures
Figure 1.1. Typical tactical missile trajectory [35] ...... 17
Figure 1.2. Major missile subsystems ...... 26
Figure 2.1. Block diagram of multiple-model system [7, p. 127] ...... 40
Figure 3.1. Kalman filter performance for a 1st order Gauss-Markov process ...... 83
Figure 3.2. Estimation error (blue) and ±√Pk (red) for 1st order Gauss-Markov process ...... 84
Figure 3.3. Kalman gain for 1st order Gauss-Markov process ...... 85
Figure 5.1. Hatched region indicating intersection of a spherical ball and the control constraint set ...... 106
Figure 5.2. Borrowed with permission from [92, p. 127]. Cost and cost gradient geometry for minimizing G(u). (a) At a minimizing point u*. (b) At a nonminimizing point u ...... 108
Figure 5.3. Geometry for steepest descent control us (green) and quickest descent control uq (red) ...... 112
Figure 5.4. Quickest descent trajectories (black) and controllable set boundary for Zermelo’s problem ...... 116
Figure 5.5. A Σ surface in augmented state space ...... 120
Figure 5.6. Plot of control constraint function and its derivative ...... 125
Figure 5.7. Optimal control gains as a function of time ...... 131
Figure 5.8. Controllable set and LQR domain of attraction for system with bounded control ...... 132
Figure 6.1. Aerodynamic force, which is resolved into lift and drag components, is generated by creating an angle of attack, α, with respect to the direction of airflow against the airfoil ...... 147
Figure 6.2. A standard measure of aerodynamic stability is static margin, the distance between the center of gravity and the center of pressure ...... 148
Figure 6.3. Canard control requires actuators to be located near the nose of the missile, typically in the same area as the seeker ...... 149
Figure 6.4. Movement of tail control surfaces will not disturb the airflow across the wings or missile body ...... 150
Figure 6.5. A qualitative comparison of canard and tail control ...... 151
Figure 6.6. Forces on a tail-controlled missile ...... 153
Figure 6.7. Missile with normal force FN at CP and moment M about CG ...... 157
Figure 6.8. Block diagram of linearized airframe ...... 158
Figure 6.9. Standard Three-Loop Autopilot Topology [101, p. 508] ...... 162
Figure 6.10. Bode plot (frequency response) of flight control system ...... 168
Figure 6.11. Flight control system step response for different flight control system time constants ...... 169
Figure 6.12. Step response comparison for 1st order (pole) approximation of 3-loop autopilot ...... 170
Figure 6.13. Frequency response comparison of 3-loop autopilot and pole model ...... 171
Figure 6.14. Step response comparison for pole-zero approximation of 3-loop autopilot ...... 174
Figure 6.15. Frequency response comparison of 3-loop autopilot and pole-zero model ...... 174
Figure 6.16. Binomial approximations of flight control system with TA = 0.3 seconds ...... 175
Figure 7.1. Typical engagement geometry ...... 177
Figure 7.2. Typical engagement geometry for spiraling target (green), constant acceleration target (blue), and constant velocity target (red) ...... 214
Figure 7.3. Typical pursuer trajectories for ks = 0.01 (red) and ks = 0.05 (green) ...... 218
Figure 7.4. Heading error as a function of time-to-go for various trajectory shaping parameter values ...... 218
Figure 7.5. Trajectories A (green) and B (red) generated from parameters listed in Table 7.2 ...... 221
Figure 7.6. Heading error curves A (green) and B (red) as a function of time-to-go ...... 221
Figure 7.7. Navigation gain as a function of heading error ...... 222
Figure 7.8. Navigation gain vs. heading error for different values of φ ...... 222
Figure 8.1. Engagement relative to seeker boresight ...... 238
Figure 10.1. Comparison between three forms of proportional navigation ...... 274
Figure 10.2. Plane (2D) engagement with accelerating target ...... 279
Figure 10.3. X-axis ZEM trajectories ...... 280
Figure 10.4. Y-axis ZEM trajectories ...... 281
Figure 10.5. Z-axis ZEM trajectories ...... 282
Figure 10.6. Asymptotically decreasing ZEM trajectories and asymptotically increasing cost trajectories for guidance laws ...... 283
Figure 10.7. Pursuer acceleration as a function of time ...... 284
Figure A.1. Relationship between the ellipses traced out by δx and δy ...... 326
Figure A.2. Equiprobability ellipsoids for random variables X ∼ N(µX, VX), Y ∼ N(µY, VY) and Z ∼ N(0, I) ...... 326
Figure A.3. Regions RX formed by rotating region RY such that P(X ∈ RX) = P(Y ∈ RY) for the orthonormal transformation Y = RX ...... 328
Figure A.4. Rectangular confidence region RX(M) enclosed by box BX(M) ...... 329
Figure B.1. Rectangular pulses with unity area ...... 341
Figure B.2. Direction angles ...... 369
List of Tables
Table 6.1. Nominal values for flight control system parameters [60], [61] ...... 167
Table 7.1. Simulation parameters ...... 214
Table 7.2. Simulation parameters for navigation gain analysis ...... 220
Table 8.1. Target model assumptions and impact on ZEM equation ...... 233
Table 8.2. Flight control system model assumptions and impact on ZEM equation ...... 236
Table 10.1. Simulation parameters ...... 278
Table A.1. Probabilities (×100) for different dimensions and different regions ...... 333
Table A.2. Required region size normalized to number of standard deviations along principal axes ...... 334
Abstract
This dissertation investigates advanced concepts in terminal missile guidance. The terminal phase of missile guidance usually lasts less than ten seconds and calls for very accurate maneuvering to ensure intercept. Technological advancements have produced increasingly sophisticated threats that greatly reduce the effectiveness of traditional approaches to missile guidance. Because of this, terminal missile guidance is, and will remain, an important and active area of research. The complexity of the problem and the desire for an optimal solution have led researchers to focus on simplistic, usually linear, models. These endeavors have produced some of the world’s most advanced weapon systems. Even so, the resulting guidance schemes cannot counter evolving threats that push the system outside the linear envelope for which it was designed. The research in this dissertation greatly extends previous work in optimal missile guidance. Herein it is shown that optimal missile guidance is fundamentally a pairing of an optimal guidance strategy and an optimal control strategy. The optimal guidance strategy is determined by a missile’s information constraints, which are themselves largely determined by the missile’s sensors. The optimal control strategy is determined by the missile’s control constraints, and works to achieve a specified guidance strategy. This dichotomy of missile guidance is demonstrated by showing that missiles having different control constraints utilize the same guidance strategy so long as their information constraints are the same. This concept has hitherto gone unrecognized because of the difficulty of developing an optimal control for the nonlinear set of equations that results from control constraints. Once this difficulty was overcome by indirect means, evidence of the guidance strategy paradigm emerged. The guidance strategy paradigm is used to develop two advanced guidance laws. The new guidance laws are compared qualitatively and quantitatively with existing guidance laws.
Chapter 1
Introduction
This introductory chapter is organized into four sections. The first, titled "Background and Scope," introduces some of the issues in guided weapons and defines the focus of this dissertation: terminal missile guidance. The second, titled "History and State of the Art," provides a brief history of weaponry, with particular attention given to missile guidance and its historical development. The third, titled "Organization," lists the chapters of this dissertation along with an explanation of their content and their relation to the other chapters. The final section, titled "Notation," discusses some of the conventions used elsewhere in the dissertation that may not be familiar to the reader.
1.1 Background and Scope
A missile is an object (weapon) that is fired, thrown, dropped, or otherwise projected at a target [1]. The three primary factors affecting a missile’s ability to successfully intercept a target are: (1) incorrect direction at takeoff (heading error), (2) environmental influences, and (3) unpredictable target maneuvers. One way of countering these factors is to guide the weapon to the target; another way is to use a (larger) warhead [49]. Guided weapons are generally classified as the command type or the homing type. In command systems, guidance commands are transmitted to the missile by a data link (e.g. radio, wire cables or fiber optics). Homing systems are characterized by the ability of the missile to detect, acquire and track a target. The essential difference between the two systems is the location of the target tracking device [31].
Figure 1.1. Typical tactical missile trajectory [35].
Among guided weapons, further distinction needs to be made concerning the mission objective. Will the engagement be surface-to-surface (SSGW), surface-to-air (SAGW), air-to-surface (ASGW), air-to-air (AAGW) or something else? What is the expected engagement range: short, medium, or long? Surely, the guidance scheme will differ depending on the mission objective. A typical missile trajectory is shown in Figure 1.1. There are essentially five phases to a typical missile engagement: launch, midcourse guidance, detection, acquisition, and terminal guidance.¹ During the launch phase, the missile is launched from some platform. Depending on the length of the engagement, midcourse guidance is initiated directly after launch. A typical midcourse strategy might be to maintain launch heading and a constant altitude. "Detection is the process whereby the seeker senses a certain amount of energy (in some region of the spectrum) above that normally expected from background or internal seeker noise. Acquisition is the process whereby the seeker, after experiencing one or more incidents of detection, decides (according to some pre-established criteria or algorithm) that a valid target has been located. Tracking is the process whereby the seeker continually specifies the angular location of the target relative to some fixed coordinate system" [35, 1979, p. 3-3]. Once the missile is tracking the target, a terminal guidance algorithm is initiated. It is this final phase of the missile engagement (terminal guidance) that this dissertation addresses.

¹ Some missiles have a discrimination phase that occurs between acquisition and terminal guidance.
1.2 History and State of the Art
This section of the introduction will briefly discuss the history and state of the art in weapon systems. For the sake of brevity, the scope and thoroughness of the discussion must be sufficiently bounded. The discussion that follows will first give a very brief summary of the history of weapons. This will be followed by more specific information on guided weapons, and in particular the process of guidance itself as it is central to this dissertation. Clearly, the general concept of weapons dates to antiquity, when they were used by early humans as a method of obtaining food, as well as possibly settling disputes. First, there was likely the simple throwing of a stone, but doubtless it was contempo- rary with the club — this may be the early "sticks and stones" class of weapons. As time passed, spears were fashioned, maybe out of only wood at first but eventually tipped with a sharp piece of rock. The earliest account of these type of weapons are flint knives and flint (or flintstone) tipped spears which can be found on display in museums. Itisuncertainifthebowandarroworthesling(distinctfromslingshot) appeared first. The earliest textual account of the sling is found in the Bible in the Book of Judges, wherein "a shepherd David, unarmored and equipped only with a sling, defeats the warrior champion Goliath with a well aimed shot [96, encyclopedia entry "sling"]." However, archeological evidence indicates that the sling and arrow were used as early as 20,000 B.C.. The Bronze Age brought about bronze swords as early as 2000 B.C. and the Iron Age brought about iron swords as early as 800 B.C.. 19
As history progressed, weaponry technology advanced at a fairly slow rate until the Medieval period. During this period castles became common, which necessitated a new class of weapons to defeat them: siege weapons. There are likely many categories and variations of siege weapons, but the main ones were the catapult, the trebuchet, and the ballista [40]. The Renaissance period (14th-16th century) marked, among other things, the beginning of the implementation of combustion-based devices in warfare. The most long-lasting effect of this was the introduction of cannon and firearms to the battlefield, where they are still at the core of modern weaponry [96, encyclopedia entry "weapon"]. Rapid shooting weapons, such as the Gatling gun (circa 1860), began to appear in the 19th century. The early 20th century brought about the tank and the airplane, the latter of which was itself used by the Japanese as a guided weapon in the World War II kamikaze attacks. During World War II, the first nuclear weapons were developed in the United States (an international effort that involved both European and American scientists) under the top secret Manhattan Project and were eventually unleashed against the Japanese cities of Hiroshima and Nagasaki in August 1945. The 20th century also gave birth to and matured the guided weapon, which will be discussed more in the following paragraphs. In [31, p. 1], Garnell defines a guided weapon as a "weapon system in which the warhead is delivered by an unmanned guided vehicle." This definition may be too narrow. The term "guide" means to lead or show the way, and a weapon is an item or object used to injure, kill, disarm or incapacitate an opponent [96]. Perhaps a better definition of a guided weapon is simply a weapon that makes course corrections subsequent to initiation based upon post-initiation data.
Note then that an arrow, which by design corrects for environmental disturbances, is not a guided weapon, because disturbance rejection is not considered a course correction. Similarly, a ballistic missile (one that falls under the force of gravity) can be launched in a manner that will cause it to fall within close proximity to its intended target. However, the ballistic missile is not a guided weapon because no post-launch data are used to alter its course subsequent to launch. An example that does satisfy the author’s definition of a guided weapon, but fails Garnell’s definition, is the Japanese kamikaze attacks. Clearly the kamikaze were guiding their planes in a manner so as to cause damage to their enemy. The fact that a person was guiding the plane, as opposed to a computer, does not change the fact that the weapon (the plane) was guided to a target with an intent to cause damage. Having accepted the definition of a guided weapon as a weapon that uses closed loop control to engage an enemy (target, evader, etc.), we proceed to look at some of the history of guided weaponry. In 1870, a man by the name of Werner Siemens submitted a proposal for "the destruction of enemy vessels by guided torpedoes" to the Prussian ministry of war. The following account is given in [78, p. 12].
It consisted of a torpedo mounted beneath a sailing boat, controlled by pneumatic pulses transmitted through rubber tubes. The commands were to be transmitted from a control post on land or on a marine vessel, the position of the guided boat being marked by a shielded lamp. By the time the system had finally been developed and deployed by the German navy (in 1916), the boats were propelled by advanced internal-combustion engines, could achieve speeds exceeding 30 knots (45 feet/sec), and were guided from airborne command posts via radio and 50km-long electrical cables. In October 1917, the first operational success was attained when a British ship was hit and sunk.
In Siemens’s original proposal, the torpedo was to be guided to the target by directing its course (via pneumatic pulses) such that the torpedo traversed the line-of-sight between the controlling station and the target. This is a very primitive guidance scheme, which directs the torpedo to where the target is, as opposed to where it will be at the time of intercept. Nevertheless, it has been shown to be effective against speed-disadvantaged targets. The German military also developed and employed at least two radio-guided air-to-sea bombs during World War II, both of which were guided to the target in a manner similar to the previously described sea torpedo based on Siemens’s design [78, p. 13]. The next major advance in guided weapons came with the development of the "proportional navigation" (PN) guidance law, which was invented by C. L. Yuan at the RCA Laboratories in the United States in 1943, then declassified and published in 1948 in the Journal of Applied Physics [100]. Qualitatively speaking, the PN guidance law seeks to stabilize the angular motion of the line-of-sight (LOS) between the missile and the target. As long as the distance along the LOS, referred to as the range, is decreasing, intercept is assured if the rotation rate of the LOS is minimal. The "horribly effective kamikaze attacks" against US ships in World War II were sufficient motivation to promote rapid development of a PN guided missile. The first successful intercept made by a missile against a (pilotless) aircraft occurred in December 1950, by a Hughes-developed² Lark missile [30]. Although the mathematical development of PN and its eventual weapon system implementation are relatively modern, its use by humans and animals has evidently occurred for eons. As evidence of this, consider a football player attempting a tackle or a lion chasing its prey.
These will often run nearly parallel with their targets while slowly decreasing the range between them until intercept occurs; this is in contrast to, and superior to, the direct pursuit strategy, in which the pursuer directs all of his effort straight at the target. An interesting account of navigation principles appearing in nature can be found in [2]. Many extensions and variations of PN were developed in the decade or so after its original formulation. In fact, there is still active research in this area, and this dissertation presents some entirely new and promising results related to the PN guidance scheme. The mathematical theory of optimal control began in the 1950s with Bellman’s dynamic programming [8]. A short time later (circa 1960) a blind Russian mathematician by the name of Pontryagin developed an alternate approach to optimal control theory in what has been termed the "Maximum Principle" [67]. An American scientist by the name of Rufus Isaacs initiated the study of differential game theory³ in 1954 [45]. The work of Isaacs and others that soon followed led to the seminal work of Ho and Bryson [43], wherein they showed that the PN guidance law is optimal under reasonable conditions (which will be discussed in Chapter 7). The aforementioned reference assumed that the missile’s flight control system could be well approximated by a unity gain. That is, the model used by Ho and Bryson presumed that the missile could instantaneously achieve a desired acceleration. In 1968, a few years after the work of Ho and Bryson, Willems developed a guidance law that is optimal when the missile flight control system can be adequately modeled by a single lag [98], [99]. In 1971, Ronald Cottrell, of Hughes Aircraft Company, independently developed and published the guidance law for a single lag flight control system in collaboration with University of Arizona Professor Dr. Thomas Vincent, who is also an advisor for the author of this dissertation [23]. Although Willems or others may have arrived at the result prior to Cottrell, most references cite Cottrell’s work as the original source of this important guidance law; this is likely due to the accessibility, clarity, and conciseness of the work. Various papers on missile guidance continued to appear through the 1980s; for a literature survey see Pastrick [64]. In 1980, Guelman and Shinar made significant progress in the optimal control of aerodynamic missiles that cannot effect axial acceleration [37], [38].

² Hughes Aircraft Company was purchased by Raytheon in 1997.
However, Guelman and Shinar's work did not lead to a closed-form feedback controller, which limits its potential application to practical weapon systems; this shortcoming is overcome in the work contained in this dissertation. In 1991, Rusnak and Meir considered the intercept problem for a general flight control system model with inequality constraints
on acceleration magnitude [70]. In their paper, Rusnak and Meir showed that the effect of the magnitude constraint on control results in a saturating controller when the flight control system is minimum phase. In 1995, Aggarwall used the approach taken by Rusnak and Meir to develop an "implementable" guidance law that is applicable to a tail-controlled missile exhibiting what is known as the "wrong way effect" [4]. The distinction of Aggarwall's approach is that the resulting guidance law makes use of states that are directly measurable, thus making implementation feasible. In 2002, Zarchan and Alpert also developed an optimal controller for a tail-controlled missile by numerically solving the Riccati equation to obtain the controller gains as a function of time-to-go [102]. These gains could theoretically be used in an engagement by simple table look-up. The most noteworthy of Zarchan and Alpert's results is that the more sophisticated guidance law fails to show remarkable improvement over the one developed by Cottrell [23], which models the flight control system as a single lag. In fact, the only case where a significant performance increase was observed seems to be at a set of flight control system parameters that are unlikely ever to be observed in a physical missile (a zero at five radians per second and a pole at 20 radians per second). Other control techniques have been applied to the problem of missile guidance in the past few decades. What distinguishes these techniques from much of the pre-1990s literature is the focus on nonlinearities and other difficulties. This said, it is very difficult to assess the quality of the resulting guidance laws because an exhaustive study would necessitate a designed experiment [56] involving possibly hundreds of parameters in a high-fidelity, six-degree-of-freedom (6-DOF) simulation.

³ Differential game theory is a branch of the more general field of game theory developed by von Neumann in 1944 [62].
As evidence of this, the reader is referred to Nesline and Zarchan's classic paper [59], wherein PN is compared with Cottrell's single-pole model [23]. In this reference, Nesline and Zarchan show that there are real-world situations, such as large radome slope and poor time-to-go estimation, where PN is superior to the more advanced optimal control law; that is, PN is more robust. In fact, the reason PN is still used in modern missiles is that it has been shown to be an extremely robust guidance law. Despite the difficulty in determining the performance and robustness of complex guidance schemes, new guidance schemes continue to be developed, often accompanied by paltry numerical results that compare them (usually favorably) to PN or some variant. This point is made here because several new guidance schemes are presented in this dissertation. While numerical comparisons are necessarily made, they are not relied on to imply superiority of certain guidance schemes and should be viewed with the same skepticism as one would view any other paper showing similar results. This is not because these situations were selectively picked, but rather because it is nearly impossible to exhaustively compare complex guidance schemes. Fortunately, the guidance laws developed in this dissertation are amenable to qualitative study and also to direct comparison with other well-known guidance schemes. As the previous paragraph states, several other approaches have been used to develop complex guidance schemes that are applicable in challenging engagement situations. Lyapunov methods have been applied by Vincent and Morgan in 2002 [95] and by Lechevin and Rabbath in 2004 [50]. Geometric approaches to missile guidance have been examined by Duflos et al. in 1995 [25] and by Chiou and Kuo [22] in 1998. Feedback linearization was used by Bezick et al. in 1995 [11].
The universal approximation ability of fuzzy models and neural network models has also been used in addressing the missile guidance problem. For the problem of missile guidance, the author's preference is fuzzy models, simply because the resulting models are interpretable, whereas the counterpart neural networks are "black box" systems.⁴ Application of these "soft" techniques to missile guidance has produced some promising results and seems to be where the momentum is heading in the area of missile guidance. There are many new papers in this area, three of which are [33], [51], and [52].

⁴ In a missile that costs upwards of a million dollars, risk is an important factor. Accordingly, an interpretable system, from which intuitive explanations can be drawn, will be preferred to a black box system, even if the black box system performs slightly better.
1.3 Organization
This section explains the organization and content of this dissertation. While the main focus of this dissertation is on missile guidance, the dissertation also considers closely related topics. To better explain this, a functional diagram of a missile's major subsystems and their inter-relationships is shown in Figure 1.2. The major subsystems shown in Figure 1.2 are explained in the sentences that follow. The seeker is an instrument used to detect, acquire, and track a target by sensing some band(s) of the electromagnetic spectrum. The seeker's measurements are then passed to an appropriate filter (or estimator) to obtain an approximation of pertinent engagement parameters needed for guidance. These parameters (possibly along with other sensor measurements) are used by the guidance law to determine the desired missile maneuver. The autopilot generates appropriate actuator commands that will most efficiently produce the desired missile maneuver while maintaining missile stability. An actuator is used to alter the external geometry of the missile by means of fin deflection, tail deflection, canard deflection, thrust control, or some combination of these commands. Modern missiles may also employ divert thrusters (e.g. Raytheon's Exoatmospheric Kill Vehicle, EKV [68, EKV product sheet]) to achieve the desired missile maneuver. "The airframe serves two purposes. First, it is the container for all the other subsystems (including the payload). Secondly, by proper design and in partnership with the propulsion, it can be used to effectively produce the required lift and drag forces for accomplishing the mission objectives" [35, 1979, pp. 3-4]. The kinematic blocks represent the governing physics for the engagement. Having introduced and briefly explained a functional diagram of the missile, the reader may now better appreciate the organization of this dissertation.
Figure 1.2. Major missile subsystems.
This dissertation contains ten chapters and two appendices. The two appendices, one on probability and stochastic processes and the other on certain topics in systems theory, provide information and derivations that are used elsewhere in the dissertation but are of secondary importance. The first chapter is an introductory chapter that provides background information and defines the scope of the dissertation. It also contains a section on the dissertation's organization. The second, third, and fourth chapters of this dissertation deal with estimation theory and stochastic motion modeling. These topics are of direct importance to missile guidance because an estimate of pertinent engagement parameters is necessary to guide the missile to its target. Chapter 2 of the dissertation presents fundamental concepts in estimation theory and lays the groundwork for Chapter 3 on estimation in linear sampled data systems. Chapter 3 is included because the equations describing the engagement are fundamentally linear and the measurements come at discrete time intervals. In practice, there are often deviations from the linearity assumption, but these are often accounted for by modifying the basic equations developed for linear systems, e.g. the Extended Kalman Filter (EKF) and many other variants. Chapter 4, which deals with stochastic motion modeling, is included because of the inherent uncertainty involved in tracking a target. Application of an appropriate stochastic motion model allows the missile tracking algorithm to place bounds on the uncertainty and often to test different motion model hypotheses (multiple model estimation) during the tracking process. Chapter 5 develops fundamental concepts in optimization and advanced control theory. Geometrical arguments are used to develop the theory of constrained parametric optimization. Lyapunov control theory ([41], [94], and [92, ch. 5]) is developed and an example of a quickest descent Lyapunov controller is given.
Optimal control theory is developed using geometrical arguments based on semipermeable surfaces in n-dimensional space. This technique, which is covered in more detail in Vincent's book on optimal control [92], avoids the use of functionals common to many other variational approaches to optimal control theory (e.g. [20] and [47]). The linear quadratic regulator is developed and an example is given. A moderately rigorous derivation of the separation theorem for linear sampled data systems is then provided. Chapter 5 concludes with a lengthy statement of the min-max principle for continuous-time systems, the results of which should be directly evident after the development of optimal control theory. Chapter 6 develops the equations describing the missile airframe and associated controller (autopilot), which together are referred to as the flight control system. An aerodynamic missile cannot instantaneously achieve a desired level of acceleration. A tail-controlled missile is analyzed and the equations governing the commanded-to-achieved acceleration are developed. This configuration results in a non-minimum phase, third-order system. In order for a missile to accelerate in a desired direction, the tail fins must first incline in the opposite direction so as to pitch the body (the main lifting surface) in the desired direction. The initial lift that is created by the tail fins causes the missile to accelerate in the direction that is opposite to the desired direction, a phenomenon known as the "wrong-way" effect that is characteristic of non-minimum phase systems. The complexity of the non-minimum phase, third-order system motivates the guidance engineer to seek an appropriate modeling approximation. In this vein, a single-pole and a pole-zero model are developed and compared to the full flight control system model. Chapter 7 develops the theory of optimal missile guidance.
The equations governing the miss are developed and then used to define the zero effort miss (ZEM), which is the miss that occurs when the pursuer exerts zero effort. Guidance laws are developed for a missile with a very generic flight control system, where the only requirement is that it (the flight control system) can be represented by a linear, time-invariant ordinary differential equation (or equivalently a transfer function). The resulting guidance law is related to the previously defined zero effort miss. An explicit closed-loop guidance law for the single-pole flight control system model is given. Optimal evasion (differential game theory) is also considered and shown to result in a guidance law that is simply an amplified (increased gain) version of the guidance law for zero target acceleration. The nonlinear equations describing a missile without axial acceleration capabilities are considered and new results are presented. Chapter 7 concludes with a discussion of time-to-go estimation, wherein a new time-to-go algorithm is presented. Chapter 8 takes up the important topic of estimating the variables required by all previously developed guidance laws, more specifically estimating the zero effort miss. Stochastic motion modeling, as discussed in Chapter 4, is used to specify the equations describing the motion of the target. The missile is modeled by the flight control system discussed in Chapter 6. The engagement equations are given in terms of variables that are relevant for a sensor mounted on a gimbaled platform, namely equations relating the measured boresight angles and other sensor data. The resulting equations are shown to be amenable to estimation by the Kalman filter that was developed in Chapter 3. In Chapter 8, numerical estimation results are not of importance and are therefore not pursued, but the interested reader can find them elsewhere, for example in [84].
Chapter 9 of this dissertation ties together Chapters 7 and 8 by invoking the separation theorem. This leads to the author's observation that there are optimal guidance strategies, which are themselves dictated by the pursuer's (missile's) information constraints (i.e. level of knowledge about the state of the system). Similarly, there are also optimal control laws to achieve a given guidance strategy. Hitherto, the guidance strategy and control strategy have been relatively inseparable and corporately referred to as guidance laws. The main exception to this is the work by Schneydor [78], where it is evident from the organization of his book that he too thought such a dichotomy is inherent to missile guidance. Until now, any definitive sweeping statement would have been difficult to support, even if it were suspected; the main reason is that optimal guidance laws have only been devised for linear systems. Any nonlinearity makes solving the differential equations by analytical means impossible (at least using present day solution techniques that do not involve power series), and this has therefore limited the practical applicability of optimal control theory to nonlinear missile guidance.⁵ However, the new results obtained in Chapter 7 solve the optimal control problem for a nonlinear engagement model by indirect means. The solution indicates that the guidance strategies that are optimal for linear engagement problems are also optimal for nonlinear engagements. The nonlinearities only affect how the control law will achieve an optimal strategy. Chapter 10 demonstrates the usefulness of the strategy and control dichotomy paradigm. The paradigm is first used to show that Lyapunov control theory can be used to obtain a guidance law for a missile by choosing the Lyapunov function to be associated with the applicable guidance strategy. Next, the paradigm is used to separate existing guidance laws into strategies and control laws.
Once this has been done, a guidance law can be extended to higher-level strategies. This concept is used to extend the well-known pure proportional navigation guidance law to a higher-level strategy that should be much more effective against maneuvering targets. The last section of Chapter 10 provides a conclusion for the dissertation.

⁵ There are exceptions to this, for example see [95].
1.4 Notation
This dissertation covers several areas of engineering, each of which has a relatively standard notation. Whenever possible, the commonly accepted standards were adhered to. For example, vectors are usually depicted as boldfaced, lower-case letters and matrices are usually depicted as boldfaced, capital letters. Most of the other uses of symbols and accents should be clear from the context in which they occur. However, the author feels the need to briefly discuss the notation used for random variables. This dissertation makes frequent use of random variables. It is standard practice to distinguish the name of a random variable from a particular value the random variable takes. This is typically done by using either a boldface font or capitalization. This convention presents a difficulty in this dissertation because boldface is often reserved to distinguish vector quantities from scalar quantities, and capitalization is often used to distinguish matrices from vectors. This conflict becomes most apparent when working with functions of random variables, where the previously mentioned notation is used to distinguish scalars, vectors, and matrices. Without any other satisfactory solution to this dilemma, the author chose to capitalize the name of a random variable only when it would have caused confusion not to do so. For example, the standard syntax for the expectation of a scalar random variable X is
E\{X\} = \int_{-\infty}^{\infty} x f_X(x) \, dx .  (1.1)

However, this dissertation often uses the syntax

E\{x\} = \int_{-\infty}^{\infty} x f_X(x) \, dx ,  (1.2)

because it is clear that x in the expectation operator refers to the random variable itself rather than any value the random variable might take. A case when such an abuse of notation would not have been acceptable is when stating the probability that the random variable X takes on a value less than x:

P(X \le x) = F_X(x) = \int_{-\infty}^{x} f_X(\xi) \, d\xi .  (1.3)

In this case, it is necessary to explicitly distinguish the name of the random variable from the argument x.
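To make the notation above concrete, the expectation in Eq. 1.1 and the distribution function in Eq. 1.3 can be checked numerically. The following sketch is illustrative only: the standard normal density, the integration limits, and the midpoint-rule integrator are assumptions introduced here, not part of the dissertation.

```python
import math

# Standard normal density, used here as a concrete, illustrative f_X.
def f_X(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

# Midpoint-rule numerical integration on [a, b].
def integrate(g, a, b, n=20000):
    dx = (b - a) / n
    return sum(g(a + (i + 0.5) * dx) for i in range(n)) * dx

# Eq. 1.1: E{x} = integral of x f_X(x) dx (zero for the standard normal).
mean = integrate(lambda x: x * f_X(x), -8.0, 8.0)

# Eq. 1.3: F_X(0) = integral of f_X(xi) dxi up to 0 (one half by symmetry).
F_at_0 = integrate(f_X, -8.0, 0.0)
print(round(mean, 6), round(F_at_0, 4))
```

The integration limits stand in for the infinite limits of Eqs. 1.1 and 1.3; the normal tails beyond eight standard deviations are negligible.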
Chapter 2 Estimation
This chapter presents fundamental concepts in estimation theory and lays the groundwork for Chapter 3 on estimation in linear sampled data systems. The first section of this chapter provides a definition of estimation and a discussion of optimal estimation. This section is followed by a section that develops the fundamental equations for Bayesian estimation. The next section extends the Bayesian estimator to the case when there are multiple model descriptions of the system, with each model having a specified probability of being the correct model. The final section of this chapter provides a discussion of linear estimation.
2.1 General Concepts in Estimation
Estimation has been the topic of many papers as well as textbooks. Perhaps one of the most popular books on the subject is [32], which is in its 16th printing since its original publication in 1974. The book has probably enjoyed so much success because of its practical approach to the subject and its many examples. Another excellent reference is [86], which develops the topic succinctly from a least-squares perspective. A moderately rigorous approach to the subject is given in [3]. Reference [19] is useful for those unfamiliar with both random processes and estimation; this book also provides Matlab exercises and solutions. Bryson and Ho's celebrated optimal control book provides a nice introduction to the subject for those familiar with control theory [43].
2.1.1 Estimation Defined
The following definitions come from reference [1]. An estimation (noun) is an approximate calculation (or evaluation/opinion/judgement) of the amount, magnitude, extent, position, or value of something. An estimator is an algorithm or tool used to compute an estimation. To estimate (verb) something means to generate an estimation. Alternately, the noun form of the word estimate has the same meaning as the noun form of the word estimation. For this reason, the terms estimate (noun) and estimation (noun) are used interchangeably. However, the term estimator should never be used interchangeably with the terms estimate or estimation. The remainder of this section develops basic concepts in estimation theory. The Markov model is defined and related to the problem of object tracking. The concepts of optimal estimation and risk are developed and related to maximum likelihood estimation and minimum mean square error (MMSE) estimation.
2.1.2 Markov Model
Consider the vector system [6]
x_k = f_k(x_{k-1}, v_{k-1}) ,  (2.1)

where x_k is the system state vector; f_k : \mathbb{R}^{n_x} \times \mathbb{R}^{n_v} \to \mathbb{R}^{n_x} is a possibly nonlinear function; v_k is an i.i.d. (independently and identically distributed) process noise sequence; and n_x and n_v are the dimensions of the state and noise vectors. The system produces a Markov sequence because v_k is i.i.d. and the function f_k depends only on x_{k-1} (and no prior values of x). A Markov sequence has properties, some of which are derived in Section A.8.2, that make it amenable to state estimation. These properties will be exploited in the sections that follow. A measurement equation is defined by
z_k = h_k(x_k, n_k) ,  (2.2)

where z_k is the measurement vector; h_k : \mathbb{R}^{n_x} \times \mathbb{R}^{n_n} \to \mathbb{R}^{n_z} is a possibly nonlinear function; n_k is an i.i.d. measurement noise sequence; and n_z and n_n are the dimensions of the measurement and measurement noise vectors. The notation z_{1:k} will be used to denote the set of z_i measurements with i \in [1, k]. The model given by Eq. 2.1 and Eq. 2.2 is a hidden Markov model (HMM) because the data available to the observer are not the evolution of the states but a second stochastic process that is a probabilistic function of the states [85, Sec. 9.5].
In the context of object tracking, the state xk usually consists of the position, velocity, acceleration, and other features of the object. The observation zk is usually a video frame at the current time instance [89], [83, p. 744].
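The hidden Markov model of Eqs. 2.1 and 2.2 can be made concrete with a small simulation. The sketch below is illustrative only: the near-constant-velocity dynamics, the noise levels, and the scalar position measurement are assumptions chosen for demonstration, not a model from the dissertation.

```python
import random

# Illustrative hidden Markov model (Eqs. 2.1-2.2): a near-constant-velocity
# target observed through noisy position measurements. All numbers assumed.
random.seed(0)
dt = 1.0
x = [0.0, 1.0]                      # state x_k = [position, velocity]
states, meas = [], []
for k in range(20):
    v = random.gauss(0.0, 0.1)      # i.i.d. process noise v_{k-1}
    # f_k: position integrates velocity; velocity takes a small random step.
    x = [x[0] + dt * x[1], x[1] + v]
    n = random.gauss(0.0, 0.5)      # i.i.d. measurement noise n_k
    meas.append(x[0] + n)           # h_k: only the position is observed
    states.append(list(x))

# The state sequence is "hidden": an observer sees only the meas list.
print(len(meas), round(states[-1][0], 2))
```

The model is "hidden" in exactly the sense of the text: the observer receives only the measurement sequence, a stochastic function of the evolving state.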
2.1.3 Optimal Estimation and Risk
Reference [32] has the following to say about optimal estimation: "An optimal estimator is a computational algorithm that processes measurements to deduce a minimum error (in accordance with some stated criterion of optimality) estimate of the state of a system by utilizing: knowledge of system and measurement dynamics, assumed statistics of system noises and measurement errors, and initial condition information." Any time the term optimal is used, one should immediately look for the associated index or cost. That is, a quantity that is optimal with respect to one index is not necessarily optimal with respect to another index.
Suppose that k measurements are available, along with an initial estimate \hat{x}_0 of the state x_0. Using this information, the goal is to form an optimal estimate \hat{x}_k of the state x_k. Let a scalar function L(\hat{x}_k, x_k) represent the loss that occurs as a result of choosing \hat{x}_k as the estimation when the true state is x_k. The estimate can be a function of all measurements and the initial estimate:

\hat{x}_k = g(z_{1:k}, \hat{x}_0) .  (2.3)
To simplify notation, we note that the initial estimate xˆ0 can also be thought of as a measurement z0 and so the estimator takes the form
\hat{x}_k = g(z_{0:k}) .  (2.4)

The risk \mathcal{R} of using the function g(z_{0:k}) to compute an estimate \hat{x}_k of the state x_k is defined to be the expected value of the loss conditioned on the measurement data z_{0:k}:

\mathcal{R}(g(\cdot)) \triangleq E\{L(\hat{x}_k, x_k) \mid z_{0:k}\}
= \int_{-\infty}^{\infty} L(\hat{x}_k, x_k) f(x_k \mid z_{0:k}) \, dx_k
= \int_{-\infty}^{\infty} L(g(z_{0:k}), x_k) f(x_k \mid z_{0:k}) \, dx_k .  (2.5)

Minimum Mean Square Error. Suppose that the loss is a quadratic function of the estimation error:

L(\hat{x}_k, x_k) = (\hat{x}_k - x_k)^T S (\hat{x}_k - x_k) .  (2.6)

Then, the risk is given by
\mathcal{R}(g(\cdot)) = E\{L(\hat{x}_k, x_k) \mid z_{0:k}\}
= E\{(\hat{x}_k - x_k)^T S (\hat{x}_k - x_k) \mid z_{0:k}\}
= \hat{x}_k^T S \hat{x}_k - \hat{x}_k^T S E\{x_k \mid z_{0:k}\} - E\{x_k^T \mid z_{0:k}\} S \hat{x}_k + E\{x_k^T S x_k \mid z_{0:k}\} ,  (2.7)

where \hat{x}_k can be removed from the expectation because it depends only on the conditioning variables z_{0:k}. Further simplification of the risk gives

\mathcal{R}(g(\cdot)) = \hat{x}_k^T S \hat{x}_k - \hat{x}_k^T S E\{x_k \mid z_{0:k}\} - E\{x_k^T \mid z_{0:k}\} S \hat{x}_k + E\{x_k^T S x_k \mid z_{0:k}\}
= (\hat{x}_k - E\{x_k \mid z_{0:k}\})^T S (\hat{x}_k - E\{x_k \mid z_{0:k}\}) - E\{x_k^T \mid z_{0:k}\} S E\{x_k \mid z_{0:k}\} + E\{x_k^T S x_k \mid z_{0:k}\} .  (2.8)

Only the first term depends on \hat{x}_k, and it can be made zero if
\hat{x}_k = E\{x_k \mid z_{0:k}\} .  (2.9)
Therefore, the estimate provided by Eq. 2.9 minimizes the risk of the quadratic loss function defined in Eq. 2.6. In this sense, the conditional expectation defined in Eq. 2.9 is optimal, and is often referred to as the minimum mean square error (MMSE) estimate. The conditional mean is an unbiased estimate, which can be shown by taking its expectation:

E\{\hat{x}_k\} = \int_{-\infty}^{\infty} E\{x_k \mid z_{0:k}\} f(z_{0:k}) \, dz_{0:k}
= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x_k f(x_k \mid z_{0:k}) f(z_{0:k}) \, dx_k \, dz_{0:k}
= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x_k f(x_k, z_{0:k}) \, dx_k \, dz_{0:k}
= \int_{-\infty}^{\infty} x_k f(x_k) \, dx_k
= E\{x_k\} .  (2.10)
It has now been established that the MMSE estimate is the conditional mean given in Eq. 2.9. However, Eq. 2.9 is an equation describing an estimate; it is not an estimator. Suppose one had knowledge of the conditional density function f(x_k \mid z_{0:k}). Then one could compute the MMSE estimation by taking the expectation over the conditional density function:

\hat{x}_k = E\{x_k \mid z_{0:k}\} = \int_{-\infty}^{\infty} x_k f(x_k \mid z_{0:k}) \, dx_k .  (2.11)

Maximum Likelihood Estimation. Suppose that the loss is defined to be 1 if the absolute value of the error is greater than \varepsilon and zero otherwise:

L(\hat{x}_k, x_k) = \begin{cases} 1 & \text{if } |\hat{x}_k - x_k| > \varepsilon \\ 0 & \text{otherwise} \end{cases} .  (2.12)

The loss could be modified slightly; for example, a quadratic form could be used:

L(\hat{x}_k, x_k) = \begin{cases} 1 & \text{if } (\hat{x}_k - x_k)^T S (\hat{x}_k - x_k) > \varepsilon \\ 0 & \text{otherwise} \end{cases} .  (2.13)

The main point is to choose a loss that places zero penalty on estimates that are very accurate and places a large, but equal, penalty on all other estimates. The risk is given by
\mathcal{R}(\hat{x}_k) = \int_{|\hat{x}_k - x_k| > \varepsilon} f(x_k \mid z_{0:k}) \, dx_k
= 1 - \int_{|\hat{x}_k - x_k| < \varepsilon} f(x_k \mid z_{0:k}) \, dx_k .  (2.14)

If \varepsilon is very small,

\mathcal{R}(\hat{x}_k) - 1 \propto -f(\hat{x}_k \mid z_{0:k}) .  (2.15)

Minimizing \mathcal{R}(\hat{x}_k) over all \hat{x}_k is equivalent to minimizing \mathcal{R}(\hat{x}_k) - 1 over all \hat{x}_k. The minimum value occurs when f(\hat{x}_k \mid z_{0:k}) is maximum. That is, the optimal estimate corresponds to the value of x_k that maximizes the conditional density:
\hat{x}_k = \arg\max_{x_k} f(x_k \mid z_{0:k}) .  (2.16)

A density function has a maximum at a point where its derivative is zero. Any point in a density function with a zero derivative is called a mode of the density function. Modes are important because they represent a locally most likely value:

P(x \le x_k \le x + dx \mid z_{0:k}) = \int_{x}^{x+dx} f(x_k \mid z_{0:k}) \, dx_k = f(x \mid z_{0:k}) \, dx + \text{higher order terms} .  (2.17)

For small dx, this probability is a maximum when the conditional density is a maximum. The estimate \hat{x} that corresponds to the peak in the conditional density function f(x \mid z_{0:k}) is more likely than any other value of x. For this reason, the estimate provided by Eq. 2.16 is called a maximum likelihood estimate (MLE).
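The distinction between the MMSE estimate (Eq. 2.11) and the MLE (Eq. 2.16) is easy to demonstrate on a discretized posterior. In the illustrative sketch below, the bimodal density is an assumed example (not from the dissertation), chosen so that the conditional mean and the conditional mode disagree.

```python
import math

# Assumed bimodal posterior f(x_k | z_{0:k}) on a grid: a broad mode at 0 and
# a sharper, taller mode at 5, so mean (MMSE) and mode (MLE) differ clearly.
xs = [i * 0.01 for i in range(-400, 801)]
def post(x):
    broad = math.exp(-(x - 0.0) ** 2 / (2 * 0.25))        # sigma = 0.5
    sharp = 2.0 * math.exp(-(x - 5.0) ** 2 / (2 * 0.04))  # sigma = 0.2
    return broad + sharp
w = [post(x) for x in xs]
Z = sum(w)
p = [wi / Z for wi in w]            # normalized grid posterior

# Eq. 2.11: MMSE estimate is the conditional mean.
mmse = sum(x * pi for x, pi in zip(xs, p))

# Eq. 2.16: MLE is the argmax of the conditional density.
mle = xs[max(range(len(xs)), key=lambda i: p[i])]
print(round(mmse, 2), round(mle, 2))
```

The MLE sits on the tall mode at 5, while the MMSE estimate is pulled between the two modes by the broad mode's probability mass.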
General Loss Functions. If the loss were not of the types already mentioned, one could still make use of the conditional density to compute the risk of choosing a particular \hat{x}_k:

\mathcal{R}(\hat{x}_k) \triangleq E\{L(\hat{x}_k, x_k) \mid z_{0:k}\} = \int_{-\infty}^{\infty} L(\hat{x}_k, x_k) f(x_k \mid z_{0:k}) \, dx_k .  (2.18)

Then, one could select the value of \hat{x}_k that produces the minimum risk.
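For a loss outside the standard families, Eq. 2.18 can be minimized by brute force: evaluate the risk at each candidate estimate and keep the minimizer. In the sketch below, the gamma-shaped posterior and the absolute-error loss are illustrative assumptions; the sketch also cross-checks the known result that absolute-error risk is minimized by the posterior median.

```python
import math

# Assumed skewed posterior on [0, 10] (shape of a Gamma(2,1) density), with an
# absolute-error loss L(xhat, x) = |xhat - x| standing in for a general loss.
xs = [i * 0.02 for i in range(0, 501)]
w = [x * math.exp(-x) for x in xs]
Z = sum(w)
p = [wi / Z for wi in w]

def risk(xhat):
    # Eq. 2.18, discretized on the grid.
    return sum(abs(xhat - x) * pi for x, pi in zip(xs, p))

best = min(xs, key=risk)            # exhaustive search over candidates

# Cross-check: the absolute-error risk is minimized at the posterior median.
c, median = 0.0, xs[-1]
for x, pi in zip(xs, p):
    c += pi
    if c >= 0.5:
        median = x
        break
print(round(best, 2), round(median, 2))
```

The exhaustive minimizer and the median agree to within the grid spacing, which is the expected behavior for this loss.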
It seems that knowledge of the conditional density f(x_k \mid z_{0:k}) is the key to optimal estimation, whether one is concerned with an MMSE estimate, a maximum likelihood estimate, or some other criterion. Bayes' formula provides a means by which the conditional density f(x_k \mid z_{0:k}) may be obtained given an a priori density function f(x_k \mid z_{1:k-1}) and a new measurement z_k.
2.2 Bayesian Estimation
This section develops the general equations for Bayesian estimation. Assume that an initial density function for the state vector is given as f (x0) and define
f(x_0 \mid z_0) \triangleq f(x_0) .  (2.19)

The goal is to use f(x_{k-1} \mid z_{0:k-1}) to form a new estimate f(x_k \mid z_{0:k}) when a new measurement z_k becomes available. The a priori density function f(x_k \mid z_{0:k-1}) can be found by integration of the density function f(x_k, x_{k-1} \mid z_{0:k-1}):

f(x_k \mid z_{0:k-1}) = \int_{-\infty}^{\infty} f(x_k, x_{k-1} \mid z_{0:k-1}) \, dx_{k-1}
= \int_{-\infty}^{\infty} f(x_k \mid x_{k-1}, z_{0:k-1}) f(x_{k-1} \mid z_{0:k-1}) \, dx_{k-1}
= \int_{-\infty}^{\infty} f(x_k \mid x_{k-1}) f(x_{k-1} \mid z_{0:k-1}) \, dx_{k-1} ,  (2.20)

where the last equality follows because the state x_k described by Eq. 2.1 is a Markov sequence. The probabilistic model of the state evolution f(x_k \mid x_{k-1}) is known from Eq. 2.1 and the known statistics of v_{k-1}. The density function f(x_{k-1} \mid z_{0:k-1}) is known from a previous iteration. Another consequence that is readily recognizable from the Markov model given by Eq. 2.1 and the measurement equation given by Eq. 2.2 is

f(z_k \mid x_k, z_{0:k-1}) = f(z_k \mid x_k) .  (2.21)

The a posteriori estimate f(x_k \mid z_{0:k}) can be found using Bayes' formula:

f(x_k \mid z_{0:k}) = \frac{f(z_{0:k}, x_k)}{f(z_{0:k})}
= \frac{f(z_k \mid z_{0:k-1}, x_k) f(z_{0:k-1}, x_k)}{f(z_k \mid z_{0:k-1}) f(z_{0:k-1})}
= \frac{f(z_k \mid z_{0:k-1}, x_k) f(x_k \mid z_{0:k-1}) f(z_{0:k-1})}{f(z_k \mid z_{0:k-1}) f(z_{0:k-1})}
= \frac{f(z_k \mid z_{0:k-1}, x_k) f(x_k \mid z_{0:k-1})}{f(z_k \mid z_{0:k-1})}
= \frac{f(z_k \mid x_k) f(x_k \mid z_{0:k-1})}{f(z_k \mid z_{0:k-1})} .  (2.22)

The density f(z_k \mid x_k) in Eq. 2.22 is known from Eq. 2.2 and the known statistics of n_k. The term f(x_k \mid z_{0:k-1}) is known from Eq. 2.20. The term f(z_k \mid z_{0:k-1}) is obtained by rearranging Eq. 2.22 and integrating:

f(z_k \mid z_{0:k-1}) = \int_{-\infty}^{\infty} f(z_k \mid x_k) f(x_k \mid z_{0:k-1}) \, dx_k .  (2.23)

Eqs. 2.20-2.23 are the necessary equations for updating the conditional density function f(x_{k-1} \mid z_{0:k-1}) \to f(x_k \mid z_{0:k}). The algorithm works as follows. Process the measurement by evaluating the density f(z_k \mid x_k) at the measurement value z_k. Next compute Eq. 2.20. Using Eq. 2.20 and f(z_k \mid x_k), compute Eq. 2.23. Finally, compute the a posteriori density function given by Eq. 2.22. With the a posteriori estimation known, one could compute an MMSE estimate (Eq. 2.11), an MLE estimate (Eq. 2.16), or any other minimum-risk estimate. The following section extends the Bayesian equations to a multiple model context, in which the underlying process is described by a set of models, each with an associated probability of being the true process model.
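The recursion of Eqs. 2.20-2.23 can be implemented directly on a grid when the state is a scalar. The following sketch assumes a Gaussian random-walk state and Gaussian measurement noise; the grid, the noise parameters, and the measurement values are all illustrative, not drawn from the dissertation.

```python
import math

# Grid-based Bayesian recursion (Eqs. 2.20-2.23) for a scalar random-walk
# state with Gaussian process and measurement noise. All values illustrative.
def gauss(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

dx = 0.1
xs = [i * dx for i in range(-100, 101)]     # state grid on [-10, 10]
post = [gauss(x, 0.0, 3.0) for x in xs]     # f(x_0), in the role of Eq. 2.19

def bayes_step(post, z, q=0.5, r=1.0):
    # Eq. 2.20: prediction (Chapman-Kolmogorov integral on the grid).
    pred = [sum(gauss(x, xp, q) * pp for xp, pp in zip(xs, post)) * dx
            for x in xs]
    # Eq. 2.22 numerator: likelihood f(z_k | x_k) times the prediction.
    num = [gauss(z, x, r) * p for x, p in zip(xs, pred)]
    norm = sum(num) * dx                    # Eq. 2.23: f(z_k | z_{0:k-1})
    return [v / norm for v in num]

for z in [2.0, 2.3, 1.9, 2.1]:              # four measurements near x = 2
    post = bayes_step(post, z)

# MMSE estimate (Eq. 2.11) from the final grid posterior.
xhat = sum(x * p for x, p in zip(xs, post)) * dx
print(round(xhat, 2))
```

Because this model is linear with Gaussian noise, the grid filter's posterior mean should track the Kalman filter solution; the grid version is shown only because it follows the general equations step by step.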
Figure 2.1. Block diagram of multiple-model system [7, p. 127].
2.3 Multiple Model
This section extends the Bayesian estimator to the case when there are multiple model descriptions of the system, with each model having a specified probability of being the correct model. The basic idea behind the multiple model approach is shown in Figure 2.1. Each model can have a different state (process) model and/or a different measurement model. The model probabilities are computed according to Bayes' rule, as discussed in Section 2.2, with all quantities conditioned on a particular model. Let M_j be the event that model j is correct, with prior probability

P(M_j) = \mu_j(0) ,  (2.24)

and corresponding density

f_M(m) = \sum_{j=1}^{r} \mu_j(0) \, \delta(m - j) .  (2.25)
Assume that an initial density function for the state vector is given as f(x_0 \mid m) and define

f(x_0 \mid z_0, m) \triangleq f(x_0 \mid m) .  (2.26)
Assume that k measurements have been made and are represented by z1:k.The density of the current state xk conditioned on the measurements z1:k 1 and the model − choice m can be computed using Eq. 2.20
∞ f (xk z1:k 1,m)= f (xk xk 1,m) f (xk 1 z1:k 1,m) dxk 1 , (2.27) | − | − − | − − Z−∞ where the density f (xk xk 1,m) is implicitly given from model m and f (xk 1 z1:k 1,m) | − − | − is known from a previous iteration. The density of a new measurement zk condi- tioned on the measurements z1:k 1 and the model choice m can be computed using − Eq. 2.23 ∞ f (zk z1:k 1,m)= f (zk xk,m) f (xk z1:k 1,m) dxk , (2.28) | − | | − Z−∞ where the density f (zk xk,m) is implicitly given from model m and f (xk z1:k 1,m) | | − is given by Eq. 2.27. The density of xk conditioned on the measurements z1:k and the model choice m can be computed using Eq. 2.22
f (zk xk,m) f (xk z1:k 1,m) f (xk z1:k,m)= | | − , (2.29) | f (zk z1:k 1,m) | − where the density f (zk xk,m) is implicitly given from model m; the density f (xk z1:k 1,m) | | − is given by Eq. 2.27; and f (zk z1:k 1,m) is given by Eq. 2.29. The density z1:k con- | − ditioned on the chosen model m is given by Bayes’ rule
f (z1:k m)=f (zk, z1:k 1 m) | − |
= f (zk z1:k 1,m) f (z1:k 1 m) , (2.30) | − − | 42
where f (zk z1:k 1,m) is given by Eq. 2.28 and f (z1:k 1 m) is known from a previous | − − | iteration. The a posteriori model density, which is the model density m conditioned on the measurements z1:k is given by Bayes’ rule
f (z1:k m) fM (m) f (m z1:k)= | | f (z1:k) f (z1:k m) fM (m) = | , (2.31) ∞ f (z1:k m) fM (m) dm −∞ | where f (z1:k m) is given by Eq. 2.30R and fM (m) is given by Eq. 2.25. The expec- | tation of the state conditioned on the measurements z1:k and the model choice m is given by ∞ E xk z1:k,m = xkf (xk z1:k,m) dxk , (2.32) { | } | Z−∞ where f (xk z1:k,m) is given by Eq. 2.29. The expectation of the state given the | measurements z1:k is given by
$$\begin{aligned}
E\{x_k \mid z_{1:k}\} &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_k\, f(x_k, m \mid z_{1:k})\, dx_k\, dm \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_k\, f(x_k \mid m, z_{1:k})\, f(m \mid z_{1:k})\, dx_k\, dm \\
&= \int_{-\infty}^{\infty} E\{x_k \mid z_{1:k}, m\}\, f(m \mid z_{1:k})\, dm \,,
\end{aligned} \tag{2.33}$$
where $E\{x_k \mid z_{1:k}, m\}$ is given by Eq. 2.32 and $f(m \mid z_{1:k})$ is given by Eq. 2.31. Different estimates will be used depending on the loss function chosen (see Section 2.1.3). If the loss is the mean-square estimation error, then the conditional expectation, given by Eq. 2.33, will be used. If the loss is such that a maximum-likelihood estimator results, then the value of $x_k$ will be chosen such that Eq. 2.29 attains a maximum. The results in this section can be greatly simplified if the models are linear and the process and measurement noise are Gaussian [7, Ch. 4]. These are the same simplifications that result in a Kalman filter being not only the best linear unbiased estimator (BLUE), but also the minimum mean square error (MMSE) estimator. An excellent derivation and example of the multiple model for Gaussian noise is given in [19, pp. 353-361].
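For a finite set of candidate models, the integrals over $m$ in Eqs. 2.31 and 2.33 reduce to sums, and the recursion becomes a simple probability-weighted mix. The sketch below is a minimal illustration of that discrete case; all numbers (priors, likelihoods, per-model estimates) are hypothetical.

```python
import numpy as np

def update_model_probs(probs, likelihoods):
    """One step of Eqs. 2.30-2.31 for a finite model set: scale each
    model's probability by its measurement likelihood
    f(z_k | z_{1:k-1}, m) and renormalize."""
    probs = probs * likelihoods
    return probs / probs.sum()

def mixed_estimate(probs, estimates):
    """Eq. 2.33 with the integral over m replaced by a finite sum:
    the overall MMSE estimate is the probability-weighted mix of the
    per-model conditional means E{x_k | z_{1:k}, m}."""
    return np.einsum("m,mn->n", probs, estimates)

# Illustrative two-model example (numbers are hypothetical):
probs = np.array([0.5, 0.5])            # prior model density f_M(m)
likelihoods = np.array([0.9, 0.1])      # f(z_k | z_{1:k-1}, m)
estimates = np.array([[1.0, 0.0],       # E{x_k | z_{1:k}, m = 1}
                      [3.0, 2.0]])      # E{x_k | z_{1:k}, m = 2}
probs = update_model_probs(probs, likelihoods)
xhat = mixed_estimate(probs, estimates)
```

In a full implementation each model would run its own filter to supply the likelihoods and conditional means; the mixing step itself is exactly this weighted sum.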
2.3.1 Interacting Multiple Model
The latest research in the area of multiple models is the interacting multiple model (IMM). The basic idea is that each model is considered a state in a Markov chain. A model transition matrix is then constructed that contains the probabilities that the object will transition from one motion model to another at a given instant. The interacting multiple model was first proposed by Blom [12, p. 221], [15] and [16]. Sections 2.2 and 2.3 have developed the framework for Bayesian estimation. Practical application of the equations developed therein is application specific. It is often the case that a sub-optimal solution is settled on, with the most common choices being particle filters or linear estimators. The theory of particle filters is beyond the scope of this dissertation, but the interested reader can consult the literature [6], [27]. The other common approach to estimation is the use of linear filters, which is the topic of Section 2.4.
2.4 Linear Estimation
An MMSE estimator provides the minimum mean square error estimate among all estimators that are a (possibly nonlinear) function of the measurements zk. Except for the special, but important, case when the system equation and measurement equation are linear and have Gaussian noise, it may be difficult to derive an MMSE estimate. For this reason, it is helpful to restrict the estimator to the class of functions that are linear. In this section, no restrictions are imposed on the state model (i.e. it may be nonlinear), but the measurement equation is assumed to have the form
$$z_k = H_k x_k + v_k \,. \tag{2.34}$$
For the moment, assume no a priori knowledge about $x_k$ is available, either in the form of an initial estimate $\hat{x}_{k/k-1}$ or in the form of measurements $z_{0:k-1}$. A linear estimator has the form
$$\hat{x}_k = L z_k \,, \tag{2.35}$$
where $L$ is an $n \times m$ matrix. Let $e_k$ denote the error in the estimate
$$e_k = \hat{x}_k - x_k \,. \tag{2.36}$$
2.4.1 Unbiased Estimation
Recall from Eq. 2.10 that an MMSE estimate is unbiased. Requiring the same of the linear estimator means the expected value of $e_k$ must be zero:
$$E\{e_k\} = E\{\hat{x}_k - x_k\} = E\{L z_k - x_k\} = E\{L(H_k x_k + v_k) - x_k\} = (L H_k - I_n)\, E\{x_k\} \,. \tag{2.37}$$
Since the state $x_k$ is not (in general) zero mean, the expected value of $e_k$ is zero if and only if
$$L H_k = I_n \,. \tag{2.38}$$
At this point, a very important implication of unbiased estimation becomes clear: the matrix $H_k$ must have full column rank for unbiased estimation to be possible. If the matrix $H_k$ is less than full column rank, then it is impossible for any matrix $L$ to satisfy Eq. 2.38. Suppose that the loss function is the mean square error
$$L(\hat{x}_k, x_k) = e_k^T e_k \,. \tag{2.39}$$
The associated risk is the mean square error. The error covariance is defined by
$$\begin{aligned}
P_k &= E\left\{e_k e_k^T\right\} = E\left\{(\hat{x}_k - x_k)(\hat{x}_k - x_k)^T\right\} = E\left\{(L z_k - x_k)(L z_k - x_k)^T\right\} \\
&= E\left\{(L(H_k x_k + v_k) - x_k)(L(H_k x_k + v_k) - x_k)^T\right\} = E\left\{(L v_k)(L v_k)^T\right\} \\
&= L\, E\left\{v_k v_k^T\right\} L^T = L R_k L^T \,,
\end{aligned} \tag{2.40}$$
where the unbiasedness constraint $L H_k = I_n$ has been used.
The trace of the error covariance is equal to the risk of the loss function
$$\operatorname{Tr}(P_k) = E\left\{(\hat{x}_k - x_k)^T(\hat{x}_k - x_k)\right\} = E\{L(\hat{x}_k, x_k)\} = \mathcal{R}(\hat{x}_k) \,. \tag{2.41}$$
If one can find a matrix $L$ that minimizes the trace of the error covariance $P_k$, then the mean square error will also be minimized. The estimator that satisfies this criterion is often referred to as the best linear unbiased estimator (BLUE).
2.4.2 BLUE Estimation
The best linear unbiased estimator (BLUE) provides an estimate $\hat{x}_k$ that is unbiased and has the minimum variance among all linear estimators. The BLUE can be obtained by minimizing the trace of the covariance $P_k$, given by Eq. 2.40, subject to the constraint imposed on all unbiased estimators, given by Eq. 2.38
$$\min_{L : L H_k = I_n} \operatorname{Tr}\left(L R_k L^T\right) \,. \tag{2.42}$$

Theorem 2.1. The BLUE estimator satisfies the equation
$$H_k^T R_k^{-1} H_k \hat{x}_k = H_k^T R_k^{-1} z_k \,. \tag{2.43}$$
Proof. Determining the matrix L that satisfies Eq. 2.42 is a constrained optimization problem and can be solved using Lagrange multipliers. A slight modification is necessary because the constraint is a matrix constraint. To accomplish this, we will first find the BLUE of a linear combination of the elements of the state vector
$$y = \gamma^T x_k \,, \tag{2.44}$$
where $\gamma$ is an $n \times 1$ vector. A linear unbiased estimator of $y$, denoted by $\hat{y}$, will be of the form
$$\hat{y} = \theta^T z_k \,, \tag{2.45}$$
where $\theta$ is an $m \times 1$ vector. If the estimator is to be unbiased, we must have $\gamma = H_k^T \theta$, as can be shown by
$$\begin{aligned}
E\{y - \hat{y}\} &= E\{y\} - E\{\hat{y}\} = \gamma^T E\{x_k\} - \theta^T E\{z_k\} \\
&= \gamma^T E\{x_k\} - \theta^T H_k E\{x_k\} = \left(\gamma^T - \theta^T H_k\right) E\{x_k\} = 0 \,,
\end{aligned} \tag{2.46}$$
or
$$\gamma = H_k^T \theta \,. \tag{2.47}$$
The corresponding covariance is given by
$$\begin{aligned}
P_{\hat{y}} &= E\left\{(\hat{y} - y)^2\right\} = E\left\{\left(\theta^T z_k - \gamma^T x_k\right)^2\right\} = E\left\{\left(\theta^T (H_k x_k + v_k) - \gamma^T x_k\right)^2\right\} \\
&= E\left\{\left(\theta^T H_k x_k - \gamma^T x_k + \theta^T v_k\right)^2\right\} = E\left\{\left(\theta^T H_k x_k - \gamma^T x_k\right)^2\right\} + E\left\{\left(\theta^T v_k\right)^2\right\} \\
&= E\left\{\left(\theta^T H_k x_k - \theta^T H_k x_k\right)^2\right\} + \theta^T R_k \theta = \theta^T R_k \theta \,,
\end{aligned} \tag{2.48}$$
where the cross term vanishes because $v_k$ is zero mean and independent of $x_k$, and the first term vanishes upon applying the unbiasedness constraint $\gamma^T = \theta^T H_k$.
Thus, we seek to minimize $\theta^T R_k \theta$ subject to $\gamma = H_k^T \theta$:
$$\mathcal{L} = \theta^T R_k \theta + \lambda^T \left(\gamma - H_k^T \theta\right) \,, \tag{2.49}$$
where $\lambda$ is an $n \times 1$ vector of Lagrange multipliers. Taking the derivative with respect to $\theta$ and setting the result equal to zero gives
$$2\theta^T R_k - \lambda^T H_k^T = 0 \,, \tag{2.50}$$
or
$$\theta = \frac{1}{2} R_k^{-1} H_k \lambda \,. \tag{2.51}$$
Substituting this result into the constraint equation gives
$$\gamma = \frac{1}{2} H_k^T R_k^{-1} H_k \lambda \,. \tag{2.52}$$
Before proceeding further, assume that the above optimization is performed for $\gamma = \gamma_i$, a unit vector with the $i$th element equal to one. Then $y_i$ is the $i$th element of the state vector $x_k$, and the equations of importance from the preceding analysis are given below.
$$y_i = \gamma_i^T x_k \tag{2.53a}$$
$$\hat{y}_i = \theta_i^T z_k \tag{2.53b}$$
$$\theta_i = \frac{1}{2} R_k^{-1} H_k \lambda_i \tag{2.53c}$$
$$\gamma_i = \frac{1}{2} H_k^T R_k^{-1} H_k \lambda_i \tag{2.53d}$$
Define the following matrices.
$$\Lambda = \begin{bmatrix} \lambda_1 & \lambda_2 & \cdots & \lambda_n \end{bmatrix} \tag{2.54a}$$
$$I_n = \begin{bmatrix} \gamma_1 & \gamma_2 & \cdots & \gamma_n \end{bmatrix} \tag{2.54b}$$
$$\Theta = \begin{bmatrix} \theta_1 & \theta_2 & \cdots & \theta_n \end{bmatrix} \tag{2.54c}$$
Then, in matrix form, the collected equations are given below.
$$y = x_k \tag{2.55a}$$
$$\hat{y} = \Theta^T z_k \tag{2.55b}$$
$$\Theta = \frac{1}{2} R_k^{-1} H_k \Lambda \tag{2.55c}$$
$$I_n = \frac{1}{2} H_k^T R_k^{-1} H_k \Lambda \tag{2.55d}$$
From these, our estimate of $x_k$ becomes
$$\hat{x}_k = \hat{y} = \Theta^T z_k \,. \tag{2.56}$$
Pre-multiplying by $H_k^T R_k^{-1} H_k$ gives
$$H_k^T R_k^{-1} H_k \hat{x}_k = H_k^T R_k^{-1} H_k \Theta^T z_k \,. \tag{2.57}$$
Substituting Eq. 2.55c gives
$$H_k^T R_k^{-1} H_k \hat{x}_k = H_k^T R_k^{-1} H_k \Theta^T z_k = H_k^T R_k^{-1} H_k \left(\frac{1}{2} R_k^{-1} H_k \Lambda\right)^T z_k = \frac{1}{2} H_k^T R_k^{-1} H_k \Lambda^T H_k^T R_k^{-1} z_k \,. \tag{2.58}$$
To simplify this result, we must show that the matrix $\Lambda$ is symmetric. To do this, we make use of Eq. 2.55d and note that $S = \frac{1}{2} H_k^T R_k^{-1} H_k$ is symmetric:
$$S \Lambda = I_n \,. \tag{2.59}$$
As has already been stated, the matrix $H_k$ must have full column rank for unbiased estimation to be possible. This result and the previous equation indicate that $\Lambda$ is the inverse of $S$. Since $S$ is symmetric, $\Lambda$ must also be symmetric:
$$I_n = (S\Lambda)^T = \Lambda^T S \,. \tag{2.60}$$
That is, both $\Lambda$ and $\Lambda^T$ are inverses of the matrix $S$, and therefore
$$\Lambda = \Lambda^T \,. \tag{2.61}$$
Substituting this result into Eq. 2.58 gives
$$H_k^T R_k^{-1} H_k \hat{x}_k = \frac{1}{2} H_k^T R_k^{-1} H_k \Lambda^T H_k^T R_k^{-1} z_k = \frac{1}{2} H_k^T R_k^{-1} H_k \Lambda\, H_k^T R_k^{-1} z_k = H_k^T R_k^{-1} z_k \,. \tag{2.62}$$
Thus, the theorem is proved:
$$H_k^T R_k^{-1} H_k \hat{x}_k = H_k^T R_k^{-1} z_k \,. \tag{2.63}$$
2.4.3 Least Square Estimation
A least square estimator is one that minimizes
$$L = (z_k - H_k\hat{x}_k)^T R_k^{-1} (z_k - H_k\hat{x}_k) \,. \tag{2.64}$$
As will now be shown, the least square estimator is also the BLUE estimator. This is an unconstrained optimization problem. Taking the partial derivative with respect to $\hat{x}_k$ and setting the result equal to zero gives
$$-2(z_k - H_k\hat{x}_k)^T R_k^{-1} H_k = 0 \,, \tag{2.65}$$
or
$$H_k^T R_k^{-1} H_k \hat{x}_k = H_k^T R_k^{-1} z_k \,. \tag{2.66}$$
2.4.4 Maximum Likelihood Estimation
Another way of arriving at the BLUE estimator is to assume that the noise is normally distributed, $v_k \sim N(0, R_k)$, and use a technique known as maximum likelihood estimation. When the noise is Gaussian, the measurement $z_k$ is distributed as $N(H_k x_k, R_k)$ with the corresponding joint PDF
$$f_Z(z_k \mid H_k, x_k, R_k) = \frac{1}{(2\pi)^{m/2} |R_k|^{1/2}} \exp\left\{-\frac{1}{2}(z_k - H_k x_k)^T R_k^{-1} (z_k - H_k x_k)\right\} \,. \tag{2.67}$$
The idea behind maximum likelihood estimation is to choose the parameter values, in this case $x_k$, such that the observed values of $z_k$ are the most likely ones to have occurred. Since the value of $x_k$ is to be selected so as to maximize the likelihood function, we replace it with the estimate $\hat{x}_k$ to distinguish it from the true value of the state $x_k$
$$f_Z(z_k \mid H_k, \hat{x}_k, R_k) = \frac{1}{(2\pi)^{m/2} |R_k|^{1/2}} \exp\left\{-\frac{1}{2}(z_k - H_k\hat{x}_k)^T R_k^{-1} (z_k - H_k\hat{x}_k)\right\} \,. \tag{2.68}$$
Now, the likelihood function, which is always non-negative, is maximized when its logarithm is maximized, which is more convenient to work with
$$\log(L) = \log\left(\frac{1}{(2\pi)^{m/2} |R_k|^{1/2}}\right) - \frac{1}{2}(z_k - H_k\hat{x}_k)^T R_k^{-1} (z_k - H_k\hat{x}_k) \,. \tag{2.69}$$
Taking the partial derivative with respect to $\hat{x}_k$ gives
$$\frac{\partial \log(L)}{\partial \hat{x}_k} = \frac{\partial}{\partial \hat{x}_k}\left[-\frac{1}{2}(z_k - H_k\hat{x}_k)^T R_k^{-1} (z_k - H_k\hat{x}_k)\right] = (z_k - H_k\hat{x}_k)^T R_k^{-1} H_k \,. \tag{2.70}$$
Setting this result equal to zero gives
$$\hat{x}_k^T H_k^T R_k^{-1} H_k = z_k^T R_k^{-1} H_k \,, \tag{2.71}$$
or
$$H_k^T R_k^{-1} H_k \hat{x}_k = H_k^T R_k^{-1} z_k \,. \tag{2.72}$$
2.4.5 Full Rank Models
For the purposes of this discussion, full rank models are those models for which the $m \times n$ dimensional matrix $H_k$ has full column rank; i.e., its rank is $n$. All three methods of estimation (least squares, maximum likelihood, and BLUE) result in an estimation equation given by
$$H_k^T R_k^{-1} H_k \hat{x}_k = H_k^T R_k^{-1} z_k \,. \tag{2.73}$$
If the matrix $H_k$ is full rank, the inverse of $H_k^T R_k^{-1} H_k$ exists and $\hat{x}$ can be solved for
$$\hat{x} = \left(H_k^T R_k^{-1} H_k\right)^{-1} H_k^T R_k^{-1} z_k \,. \tag{2.74}$$
That is, the estimate is of the form given below.
$$\hat{x} = L z \tag{2.75a}$$
$$L = \left(H_k^T R_k^{-1} H_k\right)^{-1} H_k^T R_k^{-1} \tag{2.75b}$$
Since this is a linear unbiased estimator, the covariance of the estimate is given by Eq. 2.40
$$\begin{aligned}
P_k &= L R_k L^T = \left(H_k^T R_k^{-1} H_k\right)^{-1} H_k^T R_k^{-1}\, R_k \left[\left(H_k^T R_k^{-1} H_k\right)^{-1} H_k^T R_k^{-1}\right]^T \\
&= \left(H_k^T R_k^{-1} H_k\right)^{-1} \left(H_k^T R_k^{-1} H_k\right) \left(H_k^T R_k^{-1} H_k\right)^{-1} = \left(H_k^T R_k^{-1} H_k\right)^{-1} \,.
\end{aligned} \tag{2.76}$$
The matrix $L$ can be expressed using the updated covariance matrix:
$$L = P_k H_k^T R_k^{-1} \,. \tag{2.77}$$
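The full-rank estimate of Eq. 2.74 and its covariance (Eq. 2.76) are straightforward to compute numerically. The following is a minimal sketch; the measurement geometry and noise levels are hypothetical, and a noiseless measurement is used so that the estimate should recover the true state exactly.

```python
import numpy as np

def blue_full_rank(H, R, z):
    """BLUE for a full-column-rank measurement model z = H x + v,
    v ~ (0, R): solves H^T R^{-1} H xhat = H^T R^{-1} z (Eq. 2.73)
    and returns the error covariance (H^T R^{-1} H)^{-1} (Eq. 2.76)."""
    Ri = np.linalg.inv(R)
    P = np.linalg.inv(H.T @ Ri @ H)   # information matrix must be invertible
    xhat = P @ H.T @ Ri @ z
    return xhat, P

# Three scalar measurements of a 2-state vector (hypothetical numbers):
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
R = np.diag([0.1, 0.1, 0.2])
x_true = np.array([2.0, -1.0])
z = H @ x_true                        # noiseless, so xhat recovers x_true
xhat, P = blue_full_rank(H, R, z)
```

In practice one would solve the normal equations with a linear solver (or a QR factorization) rather than forming explicit inverses, but the explicit form mirrors the derivation above.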
2.4.6 A Priori Estimates and Rank Deficient Models
It has been found that unbiased estimation is only possible if the matrix Hk is of full column rank. It turns out that this condition is no longer necessary if an a priori estimate is available.
Theorem 2.2. Suppose that an a priori estimate $\hat{x}_{k/k-1}$ is available with a covariance $P_{k/k-1}$. The BLUE (or least square or maximum likelihood) estimator is given by $\hat{x}_{k/k} = \hat{x}_{k/k-1} + K_k\left(z - H_k \hat{x}_{k/k-1}\right)$, and the associated error covariance is given by $P_{k/k} = \left(P_{k/k-1}^{-1} + H_k^T R_k^{-1} H_k\right)^{-1}$, where $K_k = P_{k/k} H_k^T R_k^{-1}$.

Proof. Combining the a priori information with the measurement equation results in the system of equations given below.
$$\begin{bmatrix} \hat{x}_{k/k-1} \\ z \end{bmatrix} = \begin{bmatrix} I_n \\ H_k \end{bmatrix} x_k + m \tag{2.78a}$$
$$E\left\{m m^T\right\} = \begin{bmatrix} P_{k/k-1} & 0 \\ 0 & R_k \end{bmatrix} \tag{2.78b}$$
Note that the combined system is in the form of a new measurement equation. To obtain an a posteriori unbiased estimate of $x_k$, we require
$$\operatorname{rank}\left(\begin{bmatrix} I_n \\ H_k \end{bmatrix}\right) = n \,, \tag{2.79}$$
which is ensured by the presence of $I_n$ in the stacked matrix. The BLUE estimator of this system is
$$\begin{aligned}
\hat{x}_{k/k} &= L \begin{bmatrix} \hat{x}_{k/k-1} \\ z \end{bmatrix}
= P_{k/k} \begin{bmatrix} I_n \\ H_k \end{bmatrix}^T \begin{bmatrix} P_{k/k-1} & 0 \\ 0 & R_k \end{bmatrix}^{-1} \begin{bmatrix} \hat{x}_{k/k-1} \\ z \end{bmatrix} \\
&= P_{k/k} \begin{bmatrix} I_n \\ H_k \end{bmatrix}^T \begin{bmatrix} P_{k/k-1}^{-1} & 0 \\ 0 & R_k^{-1} \end{bmatrix} \begin{bmatrix} \hat{x}_{k/k-1} \\ z \end{bmatrix}
= P_{k/k} \begin{bmatrix} I_n \\ H_k \end{bmatrix}^T \begin{bmatrix} P_{k/k-1}^{-1}\, \hat{x}_{k/k-1} \\ R_k^{-1}\, z \end{bmatrix} \\
&= P_{k/k} \left( P_{k/k-1}^{-1}\, \hat{x}_{k/k-1} + H_k^T R_k^{-1} z \right) \,,
\end{aligned} \tag{2.80}$$
where the a posteriori covariance matrix Pk/k is given by
$$P_{k/k} = \left( \begin{bmatrix} I_n \\ H_k \end{bmatrix}^T \begin{bmatrix} P_{k/k-1} & 0 \\ 0 & R_k \end{bmatrix}^{-1} \begin{bmatrix} I_n \\ H_k \end{bmatrix} \right)^{-1} \,. \tag{2.81}$$
Taking the inverse of the a posteriori covariance gives
$$P_{k/k}^{-1} = \begin{bmatrix} I_n \\ H_k \end{bmatrix}^T \begin{bmatrix} P_{k/k-1}^{-1} & 0 \\ 0 & R_k^{-1} \end{bmatrix} \begin{bmatrix} I_n \\ H_k \end{bmatrix} = P_{k/k-1}^{-1} + H_k^T R_k^{-1} H_k \,. \tag{2.82}$$
Solving this equation for $P_{k/k-1}^{-1}$ and substituting into Eq. 2.80 gives
$$\begin{aligned}
\hat{x}_{k/k} &= P_{k/k}\left(P_{k/k-1}^{-1}\, \hat{x}_{k/k-1} + H_k^T R_k^{-1} z\right) \\
&= P_{k/k}\left[\left(P_{k/k}^{-1} - H_k^T R_k^{-1} H_k\right) \hat{x}_{k/k-1} + H_k^T R_k^{-1} z\right] \\
&= \hat{x}_{k/k-1} + P_{k/k} H_k^T R_k^{-1}\left(z - H_k \hat{x}_{k/k-1}\right) \\
&= \hat{x}_{k/k-1} + K_k\left(z - H_k \hat{x}_{k/k-1}\right) \,,
\end{aligned} \tag{2.83}$$
where
$$K_k = P_{k/k} H_k^T R_k^{-1} \,. \tag{2.84}$$
Therefore, the a posteriori covariance and state estimate are given below.
$$P_{k/k} = \left(P_{k/k-1}^{-1} + H_k^T R_k^{-1} H_k\right)^{-1} \tag{2.85}$$
$$K_k = P_{k/k} H_k^T R_k^{-1} \tag{2.86}$$
$$\hat{x}_{k/k} = \hat{x}_{k/k-1} + K_k\left(z - H_k \hat{x}_{k/k-1}\right) \tag{2.87}$$
The use of this estimator only requires that the rows of $H_k$ are linearly independent — a very reasonable assumption.
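The information-form update of Eqs. 2.85–2.87 can be sketched as follows. The numbers are illustrative; $H_k$ deliberately has rank $1 < n = 2$ to show that the prior supplies the information the measurement lacks.

```python
import numpy as np

def fuse_prior_and_measurement(xprior, Pprior, H, R, z):
    """Eqs. 2.85-2.87: combine an a priori estimate with a (possibly
    rank-deficient) measurement z = H x + v, v ~ (0, R)."""
    # Eq. 2.85: information (inverse-covariance) addition.
    Ppost = np.linalg.inv(np.linalg.inv(Pprior) + H.T @ np.linalg.inv(R) @ H)
    # Eq. 2.86: gain built from the *updated* covariance.
    K = Ppost @ H.T @ np.linalg.inv(R)
    # Eq. 2.87: correct the prior with the measurement residual.
    xpost = xprior + K @ (z - H @ xprior)
    return xpost, Ppost

# One scalar measurement of the first component of a 2-state vector:
xprior = np.array([0.0, 0.0])
Pprior = np.eye(2)
H = np.array([[1.0, 0.0]])       # rank 1 -- unbiased estimation alone fails
R = np.array([[1.0]])
z = np.array([2.0])
xpost, Ppost = fuse_prior_and_measurement(xprior, Pprior, H, R, z)
```

Note how only the measured component's variance shrinks (from 1 to 0.5), while the unmeasured component keeps its prior variance.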
This chapter has introduced the subject of estimation. Optimal estimation was defined in the context of risk. Bayesian estimation was discussed and fundamental equations were derived. Linear estimation was discussed in detail as a possibly sub-optimal alternative to Bayesian estimation. Chapter 3 will develop optimal estimators for linear systems, with the final result being that linear estimators are optimal for linear systems having linear measurement equations and Gaussian noise.
Chapter 3 Estimation in Linear Sampled Data Systems
In the present day when computers are inexpensive and operate at speeds measured in gigahertz, linear sampled data systems are extremely common, and in fact are quickly replacing older (pure) analog systems. This is certainly true of missiles, where traditional analog control systems are now being replaced by more advanced digital control systems. This chapter is concerned with developing estimators for a linear sampled data system. The linear system has the general form discussed in Appendix B and repeated here
$$\dot{x} = F(t)\, x(t) + B_S(t)\, u_S(t) + B_D(t)\, u_D(t) \,, \tag{3.1}$$
where $u_S$ is a white stochastic input with autocorrelation given by
$$E\left\{u_S(t)\, u_S^T(t+\tau)\right\} = A\,\delta(\tau) \,, \tag{3.2}$$
and $u_D$ is a deterministic input. The system response may be expressed in terms of the state transition matrix $\Phi_x$, as given by Eq. B.45 and repeated here
$$x(t) = \Phi_x(t, t_0)\, x(t_0) + \int_{t_0}^{t} \Phi_x(t, \tau)\left[B_S(\tau)\, u_S(\tau) + B_D(\tau)\, u_D(\tau)\right] d\tau \,. \tag{3.3}$$
Every $T$ seconds, a linear measurement $z_k$ of the system state $x(t_k)$ is made
$$z_k = H_k x_k + v_k \,, \tag{3.4}$$
where $v_k$ is a white random sequence. Letting $t_0 = t_k$ and $t = t_{k+1}$ in Eq. 3.3 gives
$$\begin{aligned}
x(t_{k+1}) &= \Phi_x(t_{k+1}, t_k)\, x(t_k) + \int_{t_k}^{t_{k+1}} \Phi_x(t_{k+1}, \tau)\left[B_S(\tau)\, u_S(\tau) + B_D(\tau)\, u_D(\tau)\right] d\tau \\
&= \Phi_x(t_{k+1}, t_k)\, x(t_k) + w(t_k) + d(t_k) \,,
\end{aligned} \tag{3.5}$$
where
$$w(t_k) = \int_{t_k}^{t_{k+1}} \Phi_x(t_{k+1}, \tau)\, B_S(\tau)\, u_S(\tau)\, d\tau \,, \tag{3.6}$$
and
$$d(t_k) = \int_{t_k}^{t_{k+1}} \Phi_x(t_{k+1}, \tau)\, B_D(\tau)\, u_D(\tau)\, d\tau \,. \tag{3.7}$$
As discussed in Appendix B, the random variable $w(t_k)$ is a Gaussian white random sequence. For notational convenience, Eq. 3.5 can be written as
$$x_{k+1} = \Phi_k x_k + w_k + d_k \,, \tag{3.8}$$
where
$x_k$ = ($n \times 1$) process state vector at time $t_k$
$\Phi_k$ = ($n \times n$) matrix relating $x_k$ to $x_{k+1}$ in the absence of a forcing function
$w_k$ = ($n \times 1$) white sequence with known covariance structure
$d_k$ = ($n \times 1$) deterministic input
$z_k$ = ($m \times 1$) vector measurement at time $t_k$
$H_k$ = ($m \times n$) matrix giving the ideal (noiseless) connection between the measurement and the state vector at time $t_k$
$v_k$ = ($m \times 1$) measurement error; a white sequence with known covariance structure and zero cross-correlation with the $w_k$ sequence. (3.9)
The autocorrelations of the measurement noise and process noise are denoted by
$$E\left\{v_k v_{k+m}^T\right\} = R_k\,\delta(m) \tag{3.10a}$$
$$E\left\{w_k w_{k+m}^T\right\} = Q_k\,\delta(m) \,, \tag{3.10b}$$
where $Q_k$ can be computed as shown in Appendix B
$$Q_k = \int_{t_k}^{t_{k+1}} \Phi_x(t_{k+1}, \tau)\, B_S(\tau)\, A\, B_S^T(\tau)\, \Phi_x^T(t_{k+1}, \tau)\, d\tau \,. \tag{3.11}$$
The requirement that the process noise and measurement noise be uncorrelated is not necessary for developing an optimal estimator of the system. Rather, this assumption is merely used to simplify the ensuing derivation of the optimal estimator. Appropriate extensions are available in the literature should the reader have need to make use of them [3], [54].
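When no closed form is convenient, the integral of Eq. 3.11 can be evaluated numerically. The sketch below assumes a simple constant-velocity model (an illustrative example, not a model from this chapter) with $F = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ and $B_S = [0\ 1]^T$, whose transition matrix over an interval $s$ is $\Phi(s) = \begin{bmatrix} 1 & s \\ 0 & 1 \end{bmatrix}$, and checks the quadrature against the well-known closed form for that model.

```python
import numpy as np

def Qk_numeric(T, A, num=2000):
    """Midpoint-rule evaluation of Eq. 3.11 for the assumed
    constant-velocity model driven by white noise of strength A."""
    dt = T / num
    taus = (np.arange(num) + 0.5) * dt        # midpoints of each slice
    Bs = np.array([[0.0], [1.0]])
    Q = np.zeros((2, 2))
    for tau in taus:
        s = T - tau                           # elapsed time t_{k+1} - tau
        Phi = np.array([[1.0, s], [0.0, 1.0]])
        Q += A * (Phi @ Bs @ Bs.T @ Phi.T) * dt
    return Q

# Closed form for this model: Q = A * [[T^3/3, T^2/2], [T^2/2, T]].
T, A = 0.5, 2.0
Q_num = Qk_numeric(T, A)
Q_exact = A * np.array([[T**3 / 3, T**2 / 2], [T**2 / 2, T]])
```

For time-varying systems where $\Phi_x$ itself must be computed numerically, the same quadrature applies with $\Phi_x(t_{k+1}, \tau)$ obtained by integrating the state transition equation.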
3.1 Bayes’ Estimator
The Bayes’ approach to optimal estimation comes from the equations developed in
Section 2.2. Estimation begins with an initial estimate xˆ0, which is related to the true state x0 by
$$x_0 \sim N\left(\hat{x}_{0/0}, P_{0/0}\right) \,. \tag{3.12}$$
That is,
$$f(x_0) = f(x_0 \mid z_0) = \frac{e^{-\frac{1}{2}(x_0 - \hat{x}_{0/0})^T P_{0/0}^{-1}(x_0 - \hat{x}_{0/0})}}{(2\pi)^{n/2}\, |P_{0/0}|^{1/2}} \,, \tag{3.13}$$
where the conditioning variable $z_0$ represents the initial estimate $\hat{x}_{0/0}$ and covariance $P_{0/0}$. The random variable $x_k \mid x_{k-1}$ will always be Gaussian because the process noise is Gaussian. Since the random variable $x_k \mid x_{k-1}$ is Gaussian, it is completely characterized by its mean and covariance, which are readily computed using Eq. 3.8. The conditional mean is given by
$$E\{x_k \mid x_{k-1}\} = \Phi_{k-1} x_{k-1} + d_{k-1} \,. \tag{3.14}$$
The conditional covariance is given by
$$\begin{aligned}
\operatorname{cov}\{x_k \mid x_{k-1}\} &= E\left\{(x_k - E\{x_k \mid x_{k-1}\})(x_k - E\{x_k \mid x_{k-1}\})^T \mid x_{k-1}\right\} \\
&= E\left\{w_{k-1} w_{k-1}^T \mid x_{k-1}\right\} = Q_{k-1} \,.
\end{aligned} \tag{3.15}$$
Since $w_{k-1}$ is Gaussian, the conditioned random variable is also Gaussian
$$x_k \mid x_{k-1} \sim N\left(\Phi_{k-1} x_{k-1} + d_{k-1},\, Q_{k-1}\right) \,. \tag{3.16}$$
Therefore,
$$f(x_k \mid x_{k-1}) = \frac{e^{-\frac{1}{2}(x_k - \Phi_{k-1} x_{k-1} - d_{k-1})^T Q_{k-1}^{-1}(x_k - \Phi_{k-1} x_{k-1} - d_{k-1})}}{(2\pi)^{n/2}\, |Q_{k-1}|^{1/2}} \,. \tag{3.17}$$
The random variable $z_k \mid x_k$ will always be Gaussian because the measurement noise $v_k$ is Gaussian. In the same manner in which $f(x_k \mid x_{k-1})$ was determined, it is easy to show that the conditional measurement density is given by
$$f(z_k \mid x_k) = \frac{e^{-\frac{1}{2}(z_k - H_k x_k)^T R_k^{-1}(z_k - H_k x_k)}}{(2\pi)^{m/2}\, |R_k|^{1/2}} \,. \tag{3.18}$$
3.1.1 First Measurement
After $T$ seconds a measurement $z_1$ is processed. The density $f(x_0 \mid z_0)$ can be updated using Eq. 2.20
$$f(x_1 \mid z_0) = \int_{-\infty}^{\infty} f(x_1 \mid x_0)\, f(x_0 \mid z_0)\, dx_0 \,, \tag{3.19}$$
where $f(x_1 \mid x_0)$ is given by Eq. 3.17 and $f(x_0 \mid z_0)$ is given by Eq. 3.13. However, it is not necessary to evaluate the integral in Eq. 3.19 directly. The random variable $x_1$ is a linear combination of the Gaussian distributed random variables $x_0$ and $w_0$ and is therefore Gaussian distributed as well. Furthermore, the conditioning random variable $z_0$ is also Gaussian, and so must be $x_1 \mid z_0$. Since all Gaussian random variables are uniquely characterized by their mean and covariance, the distribution is most easily obtained by finding the mean and covariance of the random variable $x_1 \mid z_0$. The mean of $x_1 \mid z_0$ is obtained from Eq. 3.8
$$E\{x_1 \mid z_0\} = E\{\Phi_0 x_0 + w_0 + d_0 \mid z_0\} = \Phi_0 E\{x_0 \mid z_0\} + d_0 = \Phi_0 \hat{x}_{0/0} + d_0 \,. \tag{3.20}$$
The covariance of $x_1 \mid z_0$ is given by
$$\begin{aligned}
\operatorname{cov}\{x_1 \mid z_0\} &= E\left\{(x_1 - E\{x_1 \mid z_0\})(x_1 - E\{x_1 \mid z_0\})^T \mid z_0\right\} \\
&= E\left\{(\Phi_0 x_0 + w_0 + d_0 - E\{x_1 \mid z_0\})(\Phi_0 x_0 + w_0 + d_0 - E\{x_1 \mid z_0\})^T \mid z_0\right\} \\
&= E\left\{\left(\Phi_0(x_0 - \hat{x}_{0/0}) + w_0\right)\left(\Phi_0(x_0 - \hat{x}_{0/0}) + w_0\right)^T \mid z_0\right\} \\
&= E\left\{\Phi_0(x_0 - \hat{x}_{0/0})(x_0 - \hat{x}_{0/0})^T \Phi_0^T \mid z_0\right\} + E\left\{w_0 w_0^T\right\} \,.
\end{aligned} \tag{3.21}$$
The last result follows because $w_0$ is independent of $z_0$, $x_0$, and $\hat{x}_{0/0}$. Continuing with the computation gives
$$\operatorname{cov}\{x_1 \mid z_0\} = \Phi_0\, E\left\{(x_0 - \hat{x}_{0/0})(x_0 - \hat{x}_{0/0})^T \mid z_0\right\} \Phi_0^T + Q_0 = \Phi_0 P_{0/0} \Phi_0^T + Q_0 \,. \tag{3.22}$$
Thus, the random variable $x_1 \mid z_0$ is distributed as
$$x_1 \mid z_0 \sim N\left(\hat{x}_{1/0}, P_{1/0}\right) \,, \tag{3.23}$$
where
$$\hat{x}_{1/0} = \Phi_0 \hat{x}_{0/0} + d_0 \tag{3.24}$$
$$P_{1/0} = \Phi_0 P_{0/0} \Phi_0^T + Q_0 \,. \tag{3.25}$$
That is, the density of $x_1 \mid z_0$ is given by
$$f(x_1 \mid z_0) = \frac{e^{-\frac{1}{2}(x_1 - \hat{x}_{1/0})^T P_{1/0}^{-1}(x_1 - \hat{x}_{1/0})}}{(2\pi)^{n/2}\, |P_{1/0}|^{1/2}} \,. \tag{3.26}$$
Now, we must evaluate $f(z_1 \mid z_0)$. One option is to use Eq. 2.23
$$f(z_1 \mid z_0) = \int_{-\infty}^{\infty} f(z_1 \mid x_1)\, f(x_1 \mid z_0)\, dx_1 \,. \tag{3.27}$$
Alternately, we can use the fact that the variables $z_1$ and $z_0$ are Gaussian. Therefore, the conditional density $f(z_1 \mid z_0)$ will also be Gaussian. Since a Gaussian random variable is completely determined by its mean and covariance, we proceed by finding the mean and covariance of Eq. 3.4
$$E\{z_1 \mid z_0\} = E\{H_1 x_1 + v_1 \mid z_0\} = H_1 E\{x_1 \mid z_0\} = H_1 \hat{x}_{1/0} \,, \tag{3.28}$$
where Eq. 3.23 has been used. The covariance is given by
$$\begin{aligned}
\operatorname{cov}\{z_1 \mid z_0\} &= E\left\{(z_1 - E\{z_1 \mid z_0\})(z_1 - E\{z_1 \mid z_0\})^T \mid z_0\right\} \\
&= E\left\{\left(H_1 x_1 + v_1 - H_1\hat{x}_{1/0}\right)\left(H_1 x_1 + v_1 - H_1\hat{x}_{1/0}\right)^T \mid z_0\right\} \\
&= E\left\{\left(H_1(x_1 - \hat{x}_{1/0}) + v_1\right)\left(H_1(x_1 - \hat{x}_{1/0}) + v_1\right)^T \mid z_0\right\} \\
&= H_1\, E\left\{(x_1 - \hat{x}_{1/0})(x_1 - \hat{x}_{1/0})^T \mid z_0\right\} H_1^T + E\left\{v_1 v_1^T\right\} \,,
\end{aligned} \tag{3.29}$$
where we have used the fact that $v_1$ is independent of $z_0$, $x_1$, and $\hat{x}_{1/0}$, and
$$R_1 = E\left\{v_1 v_1^T\right\} \,. \tag{3.30}$$
From Eq. 3.26 we know that
$$E\{x_1 \mid z_0\} = \hat{x}_{1/0} \,. \tag{3.31}$$
Substituting this result into Eq. 3.29 gives
$$\begin{aligned}
\operatorname{cov}\{z_1 \mid z_0\} &= H_1\, E\left\{(x_1 - \hat{x}_{1/0})(x_1 - \hat{x}_{1/0})^T \mid z_0\right\} H_1^T + R_1 \\
&= H_1\, E\left\{(x_1 - E\{x_1 \mid z_0\})(x_1 - E\{x_1 \mid z_0\})^T \mid z_0\right\} H_1^T + R_1 \\
&= H_1\, \operatorname{cov}(x_1 \mid z_0)\, H_1^T + R_1 \,.
\end{aligned} \tag{3.32}$$
Substituting Eq. 3.22 and Eq. 3.25 into the previous result gives
$$\operatorname{cov}\{z_1 \mid z_0\} = H_1\, \operatorname{cov}(x_1 \mid z_0)\, H_1^T + R_1 = H_1 P_{1/0} H_1^T + R_1 \,. \tag{3.33}$$
The conditional density of z1 given z0 can be found using Eq. 3.28 and Eq. 3.33
$$f(z_1 \mid z_0) = \frac{e^{-\frac{1}{2}(z_1 - H_1\hat{x}_{1/0})^T (H_1 P_{1/0} H_1^T + R_1)^{-1}(z_1 - H_1\hat{x}_{1/0})}}{(2\pi)^{m/2}\, |H_1 P_{1/0} H_1^T + R_1|^{1/2}} \,. \tag{3.34}$$
We now have enough information to compute $f(x_1 \mid z_{0:1})$. The random variable $x_1 \mid z_0$ is Gaussian (see Eq. 3.26), and the random variable $z_1$ is also Gaussian because it is a linear combination of the two independent Gaussian random variables $x_1$ and $v_1$. Therefore, the random variable $x_1 \mid z_0$ conditioned on $z_1$, which is written as $x_1 \mid z_0, z_1 = x_1 \mid z_{0:1}$, must also be Gaussian. Because the random variable $x_1 \mid z_{0:1}$ is Gaussian, it will have the form
$$f(x_1 \mid z_{0:1}) = \frac{e^{-\frac{1}{2}(x_1 - E\{x_1 \mid z_{0:1}\})^T [\operatorname{cov}(x_1 \mid z_{0:1})]^{-1}(x_1 - E\{x_1 \mid z_{0:1}\})}}{(2\pi)^{n/2}\, |\operatorname{cov}(x_1 \mid z_{0:1})|^{1/2}} = \frac{e^{-\frac{1}{2} Q_{1/1}}}{(2\pi)^{n/2}\, |\operatorname{cov}(x_1 \mid z_{0:1})|^{1/2}} \,, \tag{3.35}$$
where
$$\begin{aligned}
Q_{1/1} &= (x_1 - E\{x_1 \mid z_{0:1}\})^T [\operatorname{cov}(x_1 \mid z_{0:1})]^{-1}(x_1 - E\{x_1 \mid z_{0:1}\}) \\
&= x_1^T [\operatorname{cov}(x_1 \mid z_{0:1})]^{-1} x_1 - 2 x_1^T [\operatorname{cov}(x_1 \mid z_{0:1})]^{-1} E\{x_1 \mid z_{0:1}\} \\
&\quad + E\{x_1 \mid z_{0:1}\}^T [\operatorname{cov}(x_1 \mid z_{0:1})]^{-1} E\{x_1 \mid z_{0:1}\} \,.
\end{aligned} \tag{3.36}$$
By themselves, Eqs. 3.35-3.36 are not of much use because we do not know the mean and covariance of $x_1 \mid z_{0:1}$. However, the density of $x_1$ conditioned on $z_{0:1}$ can be found using Eq. 2.22
$$f(x_1 \mid z_{0:1}) = \frac{f(z_1 \mid x_1)\, f(x_1 \mid z_0)}{f(z_1 \mid z_0)} \,. \tag{3.37}$$
Substituting Eq. 3.18 (with k =1), Eq. 3.26, and Eq. 3.34 into Eq. 3.37 gives
$$f(x_1 \mid z_{0:1}) = \frac{f(z_1 \mid x_1)\, f(x_1 \mid z_0)}{f(z_1 \mid z_0)} = \frac{e^{-\frac{1}{2}(z_1 - H_1 x_1)^T R_1^{-1}(z_1 - H_1 x_1)}}{(2\pi)^{m/2} |R_1|^{1/2}} \cdot \frac{e^{-\frac{1}{2}(x_1 - \hat{x}_{1/0})^T P_{1/0}^{-1}(x_1 - \hat{x}_{1/0})}}{(2\pi)^{n/2} |P_{1/0}|^{1/2}} \cdot \left( \frac{e^{-\frac{1}{2}(z_1 - H_1\hat{x}_{1/0})^T (H_1 P_{1/0} H_1^T + R_1)^{-1}(z_1 - H_1\hat{x}_{1/0})}}{(2\pi)^{m/2} |H_1 P_{1/0} H_1^T + R_1|^{1/2}} \right)^{-1} \,. \tag{3.38}$$
As it stands, the density $f(x_1 \mid z_{0:1})$ appears very complicated. It has already been reasoned that the random variable $x_1 \mid z_{0:1}$ must be Gaussian. Further evidence of this is that the density $f(x_1 \mid z_{0:1})$ is an exponential quadratic in $x_1$
$$f(x_1 \mid z_{0:1}) = \frac{|H_1 P_{1/0} H_1^T + R_1|^{1/2}}{(2\pi)^{n/2}\, |P_{1/0}|^{1/2}\, |R_1|^{1/2}}\, e^{-\frac{1}{2} Q_{1/1}} \,, \tag{3.39}$$
where
$$\begin{aligned}
Q_{1/1} &= (z_1 - H_1 x_1)^T R_1^{-1}(z_1 - H_1 x_1) + (x_1 - \hat{x}_{1/0})^T P_{1/0}^{-1}(x_1 - \hat{x}_{1/0}) \\
&\quad - (z_1 - H_1\hat{x}_{1/0})^T (H_1 P_{1/0} H_1^T + R_1)^{-1}(z_1 - H_1\hat{x}_{1/0}) \,.
\end{aligned} \tag{3.40}$$
The quadratic can be expanded as follows
$$\begin{aligned}
Q_{1/1} &= (z_1 - H_1 x_1)^T R_1^{-1}(z_1 - H_1 x_1) + (x_1 - \hat{x}_{1/0})^T P_{1/0}^{-1}(x_1 - \hat{x}_{1/0}) \\
&\quad - (z_1 - H_1\hat{x}_{1/0})^T (H_1 P_{1/0} H_1^T + R_1)^{-1}(z_1 - H_1\hat{x}_{1/0}) \\
&= x_1^T \left(P_{1/0}^{-1} + H_1^T R_1^{-1} H_1\right) x_1 + z_1^T R_1^{-1} z_1 - 2 x_1^T H_1^T R_1^{-1} z_1 + \hat{x}_{1/0}^T P_{1/0}^{-1}\hat{x}_{1/0} \\
&\quad - 2 x_1^T P_{1/0}^{-1}\hat{x}_{1/0} - (z_1 - H_1\hat{x}_{1/0})^T (H_1 P_{1/0} H_1^T + R_1)^{-1}(z_1 - H_1\hat{x}_{1/0}) \\
&= x_1^T \left(P_{1/0}^{-1} + H_1^T R_1^{-1} H_1\right) x_1 - 2 x_1^T \left(H_1^T R_1^{-1} z_1 + P_{1/0}^{-1}\hat{x}_{1/0}\right) + z_1^T R_1^{-1} z_1 \\
&\quad + \hat{x}_{1/0}^T P_{1/0}^{-1}\hat{x}_{1/0} - (z_1 - H_1\hat{x}_{1/0})^T (H_1 P_{1/0} H_1^T + R_1)^{-1}(z_1 - H_1\hat{x}_{1/0}) \,.
\end{aligned} \tag{3.41}$$
Comparing the quadratic term in Eq. 3.41 with Eq. 3.36, one can conclude that the covariance is given by
$$\operatorname{cov}(x_1 \mid z_{0:1}) = \left(P_{1/0}^{-1} + H_1^T R_1^{-1} H_1\right)^{-1} \,. \tag{3.42}$$
This inverse can be expressed using the easily checked formula
$$\left(P_{1/0}^{-1} + H_1^T R_1^{-1} H_1\right)^{-1} = (I - K_1 H_1)\, P_{1/0} \,, \tag{3.43}$$
where
$$K_1 = P_{1/0} H_1^T \left(H_1 P_{1/0} H_1^T + R_1\right)^{-1} \,. \tag{3.44}$$
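The "easily checked" identity of Eq. 3.43 (a form of the matrix inversion lemma) can also be confirmed numerically. The sketch below uses randomly generated, purely illustrative positive-definite matrices in the roles of $P_{1/0}$ and $R_1$.

```python
import numpy as np

# Illustrative positive-definite matrices standing in for P_{1/0} and R_1.
rng = np.random.default_rng(0)
n, m = 3, 2
G = rng.standard_normal((n, n))
P = G @ G.T + n * np.eye(n)
S = rng.standard_normal((m, m))
R = S @ S.T + m * np.eye(m)
H = rng.standard_normal((m, n))

# Eq. 3.44: K = P H^T (H P H^T + R)^{-1}
K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
# Left side of Eq. 3.43 (information form) vs. right side (gain form).
lhs = np.linalg.inv(np.linalg.inv(P) + H.T @ np.linalg.inv(R) @ H)
rhs = (np.eye(n) - K @ H) @ P
```

The two forms trade an $n \times n$ inversion for an $m \times m$ one, which is why the gain form is preferred when the measurement dimension is small.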
Using this definition, the covariance may be expressed as
$$P_{1/1} = \operatorname{cov}(x_1 \mid z_{0:1}) = (I - K_1 H_1)\, P_{1/0} \,. \tag{3.45}$$
Similarly, comparing the bilinear term in Eq. 3.41 with Eq. 3.36, one can conclude that the mean is given by
$$E\{x_1 \mid z_{0:1}\} = [\operatorname{cov}(x_1 \mid z_{0:1})]\left(H_1^T R_1^{-1} z_1 + P_{1/0}^{-1}\hat{x}_{1/0}\right) \,. \tag{3.46}$$
Substituting Eq. 3.45 into Eq. 3.46 gives
$$\begin{aligned}
E\{x_1 \mid z_{0:1}\} &= [\operatorname{cov}(x_1 \mid z_{0:1})]\left(H_1^T R_1^{-1} z_1 + P_{1/0}^{-1}\hat{x}_{1/0}\right) \\
&= \left(P_{1/0}^{-1} + H_1^T R_1^{-1} H_1\right)^{-1}\left(H_1^T R_1^{-1} z_1 + P_{1/0}^{-1}\hat{x}_{1/0}\right) \\
&= (I - K_1 H_1)\, P_{1/0} H_1^T R_1^{-1} z_1 + (I - K_1 H_1)\, \hat{x}_{1/0} \\
&= \left(P_{1/0} H_1^T R_1^{-1} - K_1 H_1 P_{1/0} H_1^T R_1^{-1}\right) z_1 + (I - K_1 H_1)\, \hat{x}_{1/0} \\
&= \left[P_{1/0} H_1^T R_1^{-1} - K_1\left(I + H_1 P_{1/0} H_1^T R_1^{-1}\right)\right] z_1 + K_1 z_1 + (I - K_1 H_1)\, \hat{x}_{1/0} \\
&= \left[P_{1/0} H_1^T R_1^{-1} - K_1\left(R_1 + H_1 P_{1/0} H_1^T\right) R_1^{-1}\right] z_1 + K_1 z_1 + (I - K_1 H_1)\, \hat{x}_{1/0} \\
&= \left[P_{1/0} H_1^T R_1^{-1} - P_{1/0} H_1^T R_1^{-1}\right] z_1 + K_1 z_1 + (I - K_1 H_1)\, \hat{x}_{1/0} \\
&= K_1 z_1 + (I - K_1 H_1)\, \hat{x}_{1/0} \\
&= \hat{x}_{1/0} + K_1\left(z_1 - H_1\hat{x}_{1/0}\right) \,.
\end{aligned} \tag{3.47}$$
We then define $\hat{x}_{1/1}$ as
$$\hat{x}_{1/1} = E\{x_1 \mid z_{0:1}\} = \hat{x}_{1/0} + K_1\left(z_1 - H_1\hat{x}_{1/0}\right) \,. \tag{3.48}$$
The conditional density can then be written as
$$f(x_1 \mid z_{0:1}) = \frac{e^{-\frac{1}{2}(x_1 - \hat{x}_{1/1})^T P_{1/1}^{-1}(x_1 - \hat{x}_{1/1})}}{(2\pi)^{n/2}\, |P_{1/1}|^{1/2}} \,. \tag{3.49}$$

3.1.2 kth Measurement and Final Result by Induction
A general proof by induction involves showing that a result holds initially and then showing that it also holds after $k+1$ steps by assuming it held after $k$ steps. For the problem at hand, the result is a recursive formula for the conditional density of a random variable $x$. The random variable $x$ has an initial distribution given by Eq. 3.12
$$x_0 \sim N\left(\hat{x}_{0/0}, P_{0/0}\right) \,, \tag{3.50}$$
where $\hat{x}_{0/0}$ denotes the conditional mean of $x_0$ after processing 0 measurements. More generally, a subscript $i/j$ indicates that the subscripted quantity is valid at time step $i$ and is conditioned on $j$ measurements, collectively denoted as $z_{1:j}$. A recursive formula for updating the conditional density of $x$ is now stated. With the state model given by Eq. 3.8, the a priori conditional density of $x_k$ is given by
$$f(x_k \mid z_{0:k-1}) = N\left(\hat{x}_{k/k-1}, P_{k/k-1}\right) \,, \tag{3.51}$$
where
$$\hat{x}_{k/k-1} = \Phi_{k-1}\hat{x}_{k-1/k-1} + d_{k-1} \tag{3.52a}$$
$$P_{k/k-1} = \Phi_{k-1} P_{k-1/k-1} \Phi_{k-1}^T + Q_{k-1} \,. \tag{3.52b}$$
Given a new measurement $z_k$, the a priori density is updated according to
$$f(x_k \mid z_{0:k}) = N\left(\hat{x}_{k/k}, P_{k/k}\right) \,, \tag{3.53}$$
where
$$\hat{x}_{k/k} = \hat{x}_{k/k-1} + K_k\left(z_k - H_k\hat{x}_{k/k-1}\right) \tag{3.54a}$$
$$P_{k/k} = (I - K_k H_k)\, P_{k/k-1} \tag{3.54b}$$
$$K_k = P_{k/k-1} H_k^T \left(H_k P_{k/k-1} H_k^T + R_k\right)^{-1} \,. \tag{3.54c}$$
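The recursion of Eqs. 3.52 and 3.54 translates directly into code. The following is a minimal sketch; the scalar random-walk model and all numbers are illustrative.

```python
import numpy as np

def predict(xhat, P, Phi, Q, d):
    """Time update, Eq. 3.52: propagate the conditional mean and
    covariance through the state model x_{k+1} = Phi x_k + w_k + d_k."""
    return Phi @ xhat + d, Phi @ P @ Phi.T + Q

def update(xhat, P, H, R, z):
    """Measurement update, Eq. 3.54: form the gain K_k and condition
    the estimate on the new measurement z_k = H_k x_k + v_k."""
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
    xhat = xhat + K @ (z - H @ xhat)
    P = (np.eye(len(xhat)) - K @ H) @ P
    return xhat, P

# One predict/update cycle for a scalar random walk (hypothetical numbers):
Phi = np.array([[1.0]]); Q = np.array([[0.01]]); d = np.zeros(1)
H = np.array([[1.0]]); R = np.array([[0.04]])
xhat, P = np.zeros(1), np.array([[1.0]])
xhat, P = predict(xhat, P, Phi, Q, d)
xhat, P = update(xhat, P, H, R, np.array([0.5]))
```

Each cycle shrinks the covariance in the update and inflates it by $Q_k$ in the prediction, exactly mirroring Eqs. 3.52b and 3.54b.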
The detailed derivation contained in Section 3.1.1 proved the validity of this result for the initial measurement $z_1$ and the initial density $f(x_0)$ given by Eq. 3.50. Proceeding with a proof by induction, we assume that this result holds for $k-1$ measurements and demonstrate that the same result holds for $k$ measurements. By assumption, the a posteriori density of $x_{k-1}$ is
$$f(x_{k-1} \mid z_{0:k-1}) = N\left(\hat{x}_{k-1/k-1}, P_{k-1/k-1}\right) \,. \tag{3.55}$$
The random variable $x_k \mid z_{0:k-1}$ is Gaussian because $x_k$ and $z_{0:k-1}$ are both Gaussian. The mean and covariance of $x_k \mid z_{0:k-1}$ are readily obtained in the same manner as was done for $x_1 \mid z_0$ and are given by
$$E\{x_k \mid z_{0:k-1}\} = \Phi_{k-1}\hat{x}_{k-1/k-1} + d_{k-1} = \hat{x}_{k/k-1} \tag{3.56a}$$
$$\operatorname{cov}(x_k \mid z_{0:k-1}) = \Phi_{k-1} P_{k-1/k-1} \Phi_{k-1}^T + Q_{k-1} = P_{k/k-1} \,. \tag{3.56b}$$
Thus, the density of $x_k \mid z_{0:k-1}$ is consistent with the result we are trying to prove and is given by Eq. 3.51. The next step in the estimation process is to obtain the conditional density of $z_k \mid z_{0:k-1}$. Since the random variables $z_k$ are Gaussian, the conditional density $f(z_k \mid z_{0:k-1})$ will also be Gaussian, and therefore completely determined by its mean and covariance. We proceed by finding the mean and covariance of Eq. 3.4
$$E\{z_k \mid z_{0:k-1}\} = E\{H_k x_k + v_k \mid z_{0:k-1}\} = H_k E\{x_k \mid z_{0:k-1}\} = H_k \hat{x}_{k/k-1} \,, \tag{3.57}$$
where Eq. 3.56a has been used. The covariance is given by
$$\begin{aligned}
\operatorname{cov}\{z_k \mid z_{0:k-1}\} &= E\left\{(z_k - E\{z_k \mid z_{0:k-1}\})(z_k - E\{z_k \mid z_{0:k-1}\})^T \mid z_{0:k-1}\right\} \\
&= E\left\{\left(H_k x_k + v_k - H_k\hat{x}_{k/k-1}\right)\left(H_k x_k + v_k - H_k\hat{x}_{k/k-1}\right)^T \mid z_{0:k-1}\right\} \\
&= E\left\{\left(H_k(x_k - \hat{x}_{k/k-1}) + v_k\right)\left(H_k(x_k - \hat{x}_{k/k-1}) + v_k\right)^T \mid z_{0:k-1}\right\} \\
&= H_k\, E\left\{(x_k - \hat{x}_{k/k-1})(x_k - \hat{x}_{k/k-1})^T \mid z_{0:k-1}\right\} H_k^T + E\left\{v_k v_k^T\right\} \\
&= H_k P_{k/k-1} H_k^T + R_k \,,
\end{aligned} \tag{3.58}$$
where Eq. 3.56b has been used. The fact that the random variable $v_k$ is independent white noise, and therefore independent of $x_k$, $\hat{x}_{k/k-1}$, and $z_{0:k-1}$, was also used in obtaining Eq. 3.58. Using Eq. 3.57 and Eq. 3.58, the density of $z_k \mid z_{0:k-1}$ is given by
$$f(z_k \mid z_{0:k-1}) = \frac{e^{-\frac{1}{2}(z_k - H_k\hat{x}_{k/k-1})^T (H_k P_{k/k-1} H_k^T + R_k)^{-1}(z_k - H_k\hat{x}_{k/k-1})}}{(2\pi)^{m/2}\, |H_k P_{k/k-1} H_k^T + R_k|^{1/2}} \,. \tag{3.59}$$
We now have enough information to compute $f(x_k \mid z_{0:k})$. The random variables $x_k$ and $z_{0:k}$ are all Gaussian. Therefore, the conditional random variable $x_k \mid z_{0:k}$ is also Gaussian. The density of $x_k \mid z_{0:k}$ can be found using Eq. 2.22
$$f(x_k \mid z_{0:k}) = \frac{f(z_k \mid x_k)\, f(x_k \mid z_{0:k-1})}{f(z_k \mid z_{0:k-1})} \,. \tag{3.60}$$
Substituting Eq. 3.18, Eq. 3.51, and Eq. 3.59 into Eq. 3.60 gives
$$f(x_k \mid z_{0:k}) = \frac{e^{-\frac{1}{2}(z_k - H_k x_k)^T R_k^{-1}(z_k - H_k x_k)}}{(2\pi)^{m/2} |R_k|^{1/2}} \cdot \frac{e^{-\frac{1}{2}(x_k - \hat{x}_{k/k-1})^T P_{k/k-1}^{-1}(x_k - \hat{x}_{k/k-1})}}{(2\pi)^{n/2} |P_{k/k-1}|^{1/2}} \cdot \left( \frac{e^{-\frac{1}{2}(z_k - H_k\hat{x}_{k/k-1})^T (H_k P_{k/k-1} H_k^T + R_k)^{-1}(z_k - H_k\hat{x}_{k/k-1})}}{(2\pi)^{m/2} |H_k P_{k/k-1} H_k^T + R_k|^{1/2}} \right)^{-1} \,. \tag{3.61}$$
The density given by Eq. 3.61 is directly comparable to the density given by Eq. 3.38. Therefore, one can proceed to the final result by appropriate substitutions into Eq. 3.49. Doing so results in the following PDF
$$f(x_k \mid z_{0:k}) = \frac{e^{-\frac{1}{2}(x_k - \hat{x}_{k/k})^T P_{k/k}^{-1}(x_k - \hat{x}_{k/k})}}{(2\pi)^{n/2}\, |P_{k/k}|^{1/2}} \,, \tag{3.62}$$
where $\hat{x}_{k/k}$ and $P_{k/k}$ are given by Eq. 3.54a and Eq. 3.54b, respectively. Since Eq. 3.62 is identical to Eq. 3.53, the proof by induction is complete.
3.2 Estimates and Confidence Regions (Error Ellipsoids) for the Bayes’ Estimator
A central aspect of the Bayesian estimator given by Eqs. 3.51-3.54 is that the conditional densities are Gaussian. This is true for both the a priori conditional density (Eq. 3.51)
$$f(x_k \mid z_{0:k-1}) = N\left(\hat{x}_{k/k-1}, P_{k/k-1}\right) \,, \tag{3.63}$$
and the a posteriori density (Eq. 3.53)
$$f(x_k \mid z_{0:k}) = N\left(\hat{x}_{k/k}, P_{k/k}\right) \,. \tag{3.64}$$
This is convenient since one may wish to form an estimate of the state (a priori or a posteriori) and an associated confidence region for that estimate. The MMSE estimate is the conditional mean. The conditional a priori mean estimate of $x_k$ and the associated covariance are computed by Eq. 3.52. Similarly, the conditional a posteriori mean estimate of $x_k$ and the associated covariance are computed by Eq. 3.54. As discussed in Appendix ??, the eigenvalues and eigenvectors of the covariance matrix are used to specify an ellipsoidal confidence region about the estimate $\hat{x}_k$. The exponential in the Gaussian PDF is a quadratic function defining an ellipsoid $\mathcal{S}$ in $n$-dimensional space
$$\mathcal{S}_{X_k}^{(\alpha)} = \left\{x_k : (x_k - \hat{x}_k)^T P_k^{-1}(x_k - \hat{x}_k) = \alpha\right\} \,. \tag{3.65}$$
The ellipsoid defined by $\mathcal{S}_{X_k}^{(\alpha)}$ has principal axes aligned with the eigenvectors of the covariance matrix $P_k$. The eigenvalues $\lambda_i$ of the covariance matrix $P_k$ are related to the variances $\sigma_i^2$ along each of the principal axes of the ellipsoid
$$\lambda_i = \sigma_i^2 \,. \tag{3.66}$$
(α) The ellipsoid defined by intersects its principal axes at a distance of √α√λi = SXk √ασi from the center of the ellipsoid. The probability that the true state xk belongs 68 to the region (α) enclosed by the ellipsoid (α) can be found by Eq. A.228 RXk SXk
2 (r ) 2 P Xk = FR2 r n , (3.67) ∈ RXk | µ ¶ 2 ¡ ¢ where the cumulative distribution F 2 (r n) refers to the chi-square distribution with R | n degrees of freedom. More generally, we would like to determine the value of α = r2 required for a given level of probability. Table A.2 lists the required value of r = √α for various values of n at probability levels of 0.5, 0.90, and 0.95. For example, with n =6, the 95% confidence (0.95 probability) ellipsoid is centered on the estimate xˆk and intersects its principal axes (which are the eigenvectors of Pk)atadistanceof
3.55σi from the ellipsoid center.
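The confidence-region construction above can be sketched numerically. The following is a minimal numpy illustration (not from the dissertation; the function name and the n = 2 example are mine). For n = 2 the chi-square CDF has the closed form F(r² | 2) = 1 − e^{−r²/2}, so the required radius is r = √(−2 ln(1 − p)) and no table lookup is needed.

```python
# Sketch: 95% confidence ellipse for a 2-state estimate from the covariance
# eigendecomposition (Eqs. 3.65-3.67 specialized to n = 2).
import numpy as np

def confidence_ellipse(P, p=0.95):
    """Return principal-axis directions and semi-axis lengths r*sqrt(lambda_i)."""
    r = np.sqrt(-2.0 * np.log(1.0 - p))   # n = 2 closed form for r = sqrt(alpha)
    lam, V = np.linalg.eigh(P)            # eigenvalues lam_i = sigma_i^2
    return V, r * np.sqrt(lam)

P = np.array([[4.0, 1.0],
              [1.0, 2.0]])
V, axes = confidence_ellipse(P)
print(axes)   # semi-axis lengths along the eigenvectors of P
```

For larger n, the radius r would instead come from the chi-square inverse CDF (as tabulated in Table A.2).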
3.3 The White Noise Assumption and Bayesian Estimation
The derivation of the Bayesian estimator made explicit use of the independent nature of the process and measurement noise. The Bayesian estimator given by Eqs. 3.51–3.54 is only valid if the noise sources are white. Unfortunately, many noise sources are far from white and exhibit some type of correlation. If the noise autocorrelation can be adequately described by a decaying exponential, which is very common, then a shaping filter can be used to obtain a Bayesian estimator. The noise sources of the Bayes estimator must be white and therefore have an autocorrelation given by Eq. 3.10a. The following continuous-time shaping filter was considered in Appendix B

    ẋ = −a x + b u .    (3.68)

The input u is white noise described by the autocorrelation

    R_U(τ) = E{ u(t) u(t + τ) } = A δ(τ) ,    (3.69)

where δ(τ) is a Dirac delta function. The output autocorrelation is given by

    R_X(τ) = (A b² / 2a) e^{−a|τ|} .    (3.70)
The shaping filter described by Eq. 3.68 can be augmented to the continuous-time state equations used to form the discrete-time sampled-data state equations given by Eq. 3.8. The input of the shaping filter is white, as required for the Bayesian estimator being discussed, and the output is correlated according to Eq. 3.70. The output of the shaping filter could then be added to the state equations, the measurement equation, or both. This is a very effective way to model colored noise. The same idea can be used directly with the discrete-time state equations. The following discrete-time shaping filter was considered in Appendix B.5.3

    x_k = α x_{k−1} + β w_{k−1} .    (3.71)

The input w_k is white noise described by the autocorrelation

    R_W(m) = E{ w_k w_{k+m} } = A δ(m) ,    (3.72)

where δ(m) is the unit impulse function (with a value of one when m = 0 and zero otherwise). The output autocorrelation is given by

    R_X(m) = (A β² / (1 − α²)) α^{|m|} .    (3.73)

The shaping filter described by Eq. 3.71 can be directly augmented to the discrete-time sampled-data state equations given by Eq. 3.8. The augmented state representing the output of the shaping filter could then be added to the other state equations, the measurement equation, or both. It is only a matter of preference whether to use a discrete-time shaping filter directly, or to begin with a continuous-time shaping filter and discretize it along with the other state equations when forming the discrete-time sampled-data equations. The use of shaping filters is, of course, not restricted to scalar inputs, but easily extends to vector inputs. In the vector-input case, a system of linear, time-invariant differential (or difference) equations can be used to create a noise signal that has an exponentially decreasing autocorrelation and, if so desired, a non-trivial cross-correlation.
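The discrete shaping filter of Eq. 3.71 can be checked empirically. The following is an illustrative sketch (not from the text; the parameter values and seed are mine): drive the filter with a unit-variance white sequence (A = 1) and compare the sample variance against the predicted R_X(0) = A β² / (1 − α²).

```python
# Sketch: colored noise from the discrete shaping filter x_k = a*x_{k-1} + b*w_{k-1}.
import numpy as np

rng = np.random.default_rng(0)
alpha, beta, A = 0.9, 0.5, 1.0
N = 200_000

w = rng.normal(0.0, np.sqrt(A), N)    # white sequence, R_W(m) = A*delta(m)
x = np.zeros(N)
for k in range(1, N):
    x[k] = alpha * x[k - 1] + beta * w[k - 1]

predicted = A * beta**2 / (1 - alpha**2)   # R_X(0) from Eq. 3.73
print(np.var(x), predicted)                # sample variance should be close
```

The lag-one sample autocorrelation, normalized by the variance, should likewise be close to α, in line with Eq. 3.73.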
3.4 The Deterministic Input Assumption and Bayesian Estimation
The linear state equation is given by Eq. 3.8 and repeated here

    x_{k+1} = Φ_k x_k + w_k + d_k .    (3.74)

In the derivation of the Bayes' estimator, it was assumed that the function d_k is deterministic. In many situations, the function d_k is a function of the estimate x̂_{k/k}

    d_k = g( x̂_{k/k} ) .    (3.75)

Is the Bayes' estimator still valid when d_k is not deterministic, but rather given by Eq. 3.75? The answer to this question is yes, as will soon become clear. In the case when d_k is deterministic, the estimate x̂_{k/k} satisfies the recursive formula given by Eq. 3.54a and repeated here

    x̂_{k/k} = x̂_{k/k−1} + K_k ( z_k − H_k x̂_{k/k−1} ) .    (3.76)

Because of this, x̂_{k/k} only depends on the initial estimate x̂_{0/0} and the measurement sequence z_{1:k}, which together are represented by the sequence z_{0:k}. The only time d_k is used in the derivation of the Bayes' estimator is in the computation of the a priori density (see Eq. 3.51) f(x_{k+1} | z_{0:k}), which is given by

    f(x_{k+1} | z_{0:k}) = f( Φ_k x_k + w_k + d_k | z_{0:k} ) = f( Φ_k x_k + w_k + g(x̂_{k/k}) | z_{0:k} ) .    (3.77)

However, this density is conditioned on z_{0:k}, the very information used to construct x̂_{k/k}. Therefore, the resulting Bayes' estimator will be the same whether the input d_k is deterministic or a function of x̂_{k/k}. A very interesting observation is now made regarding the fact that g(·) can be a nonlinear function. The development of the Bayes' estimator for d_k being deterministic made much use of the fact that the variables x_k and z_k were Gaussian. Clearly, if g(·) is a nonlinear function, then neither x_k nor z_k will (in general) be Gaussian. However, the conditional densities of x_k | z_{0:k−1} and z_k | z_{0:k−1} are Gaussian, because conditioning on z_{0:k−1} effectively also conditions on d_{0:k−1}, since the inputs d_{0:k−1} are functions of the conditioning variables z_{0:k−1}.
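The argument above can be exercised with a small scalar sketch (entirely illustrative; the system, the nonlinear g, and all parameter values are my own assumptions, not the dissertation's). The filter equations are unchanged: the prediction simply uses the known value d_k = g(x̂_{k/k}), which is a function of the measurements already conditioned on.

```python
# Sketch: scalar Kalman recursion with input d_k = g(xhat_{k/k}), g nonlinear.
import numpy as np

rng = np.random.default_rng(1)
phi, Q, H, R = 0.95, 0.01, 1.0, 0.04
g = lambda s: -0.1 * np.tanh(s)       # nonlinear feedback of the estimate

x, xhat, P = 1.0, 0.0, 1.0
for _ in range(200):
    d = g(xhat)                       # known at prediction time
    x = phi * x + d + rng.normal(0, np.sqrt(Q))     # truth driven by same input
    xhat, P = phi * xhat + d, phi * P * phi + Q     # a priori update
    z = H * x + rng.normal(0, np.sqrt(R))
    K = P * H / (H * P * H + R)
    xhat, P = xhat + K * (z - H * xhat), (1 - K * H) * P   # a posteriori update
print(P)   # converged a posteriori variance
```

Note that P follows the same Riccati recursion as in the purely deterministic-input case; the feedback input affects the estimate but not the covariance.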
3.5 Bayesian Estimation Between Measurements
As developed so far, the Bayesian estimator produces the conditional density at measurement instances separated by intervals of time T. It is also possible to generate the conditional density at instants of time that occur in between measurements. An example of how this is accomplished is the a priori prediction, which provides the conditional density of x_k given the measurements z_{0:k−1}. Recall that x_k is simply a convenient notation for the random variable x(t_k). In the same manner as the a priori density is generated, it is also possible to provide a conditional density of x(t_k + ΔT) given the measurements z_{0:k}. Although not necessary, assume that Δ is any number such that

    0 ≤ Δ ≤ 1 .    (3.78)

For notational convenience, let x_{k+Δ} denote the random variable x(t_k + ΔT). The state x(t_k + ΔT) can be expressed in terms of the state x(t_k) by using Eq. 3.5

    x_{k+Δ} = x(t_{k+Δ})
            = Φ_x(t_{k+Δ}, t_k) x(t_k) + ∫_{t_k}^{t_{k+Δ}} Φ_x(t_{k+Δ}, τ) [ B_S(τ) u_S(τ) + B_D(τ) u_D(τ) ] dτ
            = Φ_{k,Δ} x_k + w_{k,Δ} + d_{k,Δ} ,    (3.79)

where for notational convenience, the following substitutions have been made
    Φ_{k,Δ} = Φ_x(t_{k+Δ}, t_k)    (3.80a)
    w_{k,Δ} = ∫_{t_k}^{t_{k+Δ}} Φ_x(t_{k+Δ}, τ) B_S(τ) u_S(τ) dτ    (3.80b)
    d_{k,Δ} = ∫_{t_k}^{t_{k+Δ}} Φ_x(t_{k+Δ}, τ) B_D(τ) u_D(τ) dτ .    (3.80c)

Because the process noise u_S is white, the random variable w_{k,Δ} is a white Gaussian random variable with density

    f(w_{k,Δ}) = N( 0, Q_{k,Δ} ) ,    (3.81)

where Q_{k,Δ} can be computed as shown in Appendix B

    Q_{k,Δ} = E{ w_{k,Δ} w_{k,Δ}^T } = ∫_{t_k}^{t_{k+Δ}} Φ_x(t_{k+Δ}, τ) B_S(τ) A B_S^T(τ) Φ_x^T(t_{k+Δ}, τ) dτ .    (3.82)

The conditional density of x_k given z_{0:k} is given by Eq. 3.53
    f(x_k | z_{0:k}) = N( x̂_{k/k}, P_{k/k} ) .    (3.83)

Because the random variables x_k and w_{k,Δ} are Gaussian, the random variable x_{k+Δ}, which is a linear combination of x_k and w_{k,Δ}, will also be Gaussian. Furthermore, because x_{k+Δ} and z_j are both Gaussian, the conditional density of x_{k+Δ} given z_{0:k} is also Gaussian and therefore completely characterized by the conditional mean and conditional covariance

    f(x_{k+Δ} | z_{0:k}) = N( x̂_{k+Δ/k}, P_{k+Δ/k} ) .    (3.84)

The conditional mean is found by taking the expectation of Eq. 3.79

    x̂_{k+Δ/k} = E{ x_{k+Δ} | z_{0:k} }
              = E{ Φ_{k,Δ} x_k + w_{k,Δ} + d_{k,Δ} | z_{0:k} }
              = Φ_{k,Δ} E{ x_k | z_{0:k} } + d_{k,Δ}
              = Φ_{k,Δ} x̂_{k/k} + d_{k,Δ} .    (3.85)

Similarly, the conditional covariance is given by

    P_{k+Δ/k} = E{ ( x_{k+Δ} − x̂_{k+Δ/k} )( x_{k+Δ} − x̂_{k+Δ/k} )^T | z_{0:k} }
              = E{ ( Φ_{k,Δ}(x_k − x̂_k) + w_{k,Δ} )( Φ_{k,Δ}(x_k − x̂_k) + w_{k,Δ} )^T | z_{0:k} }
              = E{ ( Φ_{k,Δ}(x_k − x̂_k) )( Φ_{k,Δ}(x_k − x̂_k) )^T | z_{0:k} } + E{ w_{k,Δ} w_{k,Δ}^T | z_{0:k} }
              = Φ_{k,Δ} E{ (x_k − x̂_k)(x_k − x̂_k)^T | z_{0:k} } Φ_{k,Δ}^T + E{ w_{k,Δ} w_{k,Δ}^T }
              = Φ_{k,Δ} P_{k/k} Φ_{k,Δ}^T + Q_{k,Δ} .    (3.86)
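The inter-measurement prediction of Eqs. 3.85–3.86 can be sketched for a scalar system (an illustration under my own assumptions: a first-order process dx/dt = −βx driven by white noise of PSD A, for which Φ_{k,Δ} = e^{−βΔT} and Q_{k,Δ} follows from Eq. 3.82).

```python
# Sketch: propagating the a posteriori mean and covariance to t_k + frac*T.
import numpy as np

beta, A, T = 1.0, 2.0, 0.1
xhat, P, d = 0.8, 0.5, 0.0     # a posteriori estimate, covariance, input term

def predict(xhat, P, frac):
    """Mean and covariance at t_k + frac*T, 0 <= frac <= 1 (Eqs. 3.85-3.86)."""
    phi = np.exp(-beta * frac * T)                      # Phi_{k,Delta}
    Q = A / (2 * beta) * (1 - np.exp(-2 * beta * frac * T))  # Q_{k,Delta}
    return phi * xhat + d, phi * P * phi + Q

for frac in (0.0, 0.5, 1.0):
    print(frac, predict(xhat, P, frac))
```

At frac = 0 the prediction reduces to the a posteriori values, as it must; at frac = 1 it coincides with the usual one-step a priori prediction.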
It is clear that between measurements, the state and covariance satisfy the differential equations developed in Appendix B, specifically Eq. B.83 and Eq. B.87. When a measurement z_j is processed, the state estimate and covariance are instantaneously adjusted according to Eq. 3.54. These results show that the estimate x̂(t) satisfies the differential equation

    dx̂/dt = F(t) x̂ + B_D u_D + Σ_j δ(t − t_j) K_j ( z_j − H_j x̂(t) ) .    (3.87)

The covariance satisfies the differential equation

    Ṗ(t) = F(t) P(t) + P(t) F^T(t) + B_S(t) A B_S^T(t) − Σ_j δ(t − t_j) K_j H_j P(t) .    (3.88)

The sum of all knowledge available at time t, referred to as the information state, is denoted by D(t). At time t_0 the information state consists of the a priori density of x(t_0) given by Eq. 3.12

    D(0) = { x̂_{0/0}, P_{0/0} } .    (3.89)

The distribution of the state at time t = 0 can be expressed as

    f( x(0) | D(0) ) ∼ N( x̂_{0/0}, P_{0/0} ) .    (3.90)

At time instances denoted by t_k, measurements z_k become available and are added to the information state
    D(t) = { x̂_{0/0}, P_{0/0}, z_{1:k} } ,   t_k ≤ t < t_{k+1} ,    (3.91)

where for t_0 ≤ t < t_1 the information state contains only the a priori data. The conditional density of the state is then

    f( x(t) | D(t) ) = N( x̂(t), P(t) ) ,    (3.92)

where x̂(t) and P(t) are obtained from Eq. 3.87 and Eq. 3.88, respectively.

3.6 No A Priori Information and Bayesian Estimation

The case of no a priori information can be adequately handled by using the following conditional distribution for x_1 [3, p. 99]

    f(x_1 | z_0) ∼ lim_{ρ→∞} N( 0, ρ I_n ) .    (3.93)

That is,

    x̂_{1/0} = 0    (3.94a)
    P_{1/0} = lim_{ρ→∞} ρ I_n .    (3.94b)

This situation could, for example, occur when a filter is just initialized and absolutely nothing is known about the state x_1, and yet the filter is now expected to process a measurement z_1. Note that the lack of a priori information could have been assigned to the state x_0. However, if such were done, it would do no good to use the state equations to update the density and covariance (Eq. 3.51), since the covariance itself is infinite. The a posteriori density is given by Eq. 3.53

    f(x_1 | z_{0:1}) = N( x̂_{1/1}, P_{1/1} ) ,    (3.95)

where the a posteriori estimate x̂_{1/1} and a posteriori covariance P_{1/1} are to be determined from Eqs. 3.54. The Kalman gain given by Eq. 3.54c can be written as

    K_1 = P_{1/0} H_1^T ( H_1 P_{1/0} H_1^T + R_1 )^{−1}
        = lim_{ρ→∞} ρ H_1^T ( ρ H_1 H_1^T + R_1 )^{−1}
        = lim_{ρ→∞} H_1^T ( H_1 H_1^T + (1/ρ) R_1 )^{−1} .    (3.96)

The a posteriori estimate x̂_{1/1} can be determined from Eq. 3.54a

    x̂_{1/1} = x̂_{1/0} + K_1 ( z_1 − H_1 x̂_{1/0} )
            = K_1 z_1
            = lim_{ρ→∞} H_1^T ( H_1 H_1^T + (1/ρ) R_1 )^{−1} z_1
            = H_1^T ( H_1 H_1^T )^{−1} z_1 .    (3.97)

It is not possible to determine the updated covariance by use of Eq. 3.54b. In fact, unless H_1 has full column rank (not likely), it is tedious (but possible) to obtain a closed-form expression for the a posteriori covariance matrix. To understand why this is so, one must reexamine Eq. 3.46 with x̂_{1/0} = 0

    E{ x_1 | z_{0:1} } = [cov(x_1 | z_{0:1})] H_1^T R_1^{−1} z_1
                       = ( P_{1/0}^{−1} + H_1^T R_1^{−1} H_1 )^{−1} H_1^T R_1^{−1} z_1
                       = lim_{ρ→∞} ( (1/ρ) I + H_1^T R_1^{−1} H_1 )^{−1} H_1^T R_1^{−1} z_1
                       = lim_{ρ→∞} ( I + ρ H_1^T R_1^{−1} H_1 )^{−1} ρ H_1^T R_1^{−1} z_1 .    (3.98)

Before proceeding, we note the following matrix inversion rule [77, p. 151]

    ( I + AB )^{−1} A = A ( I + BA )^{−1} .    (3.99)

With A = H_1^T R_1^{−1} and B = ρ H_1,

    E{ x_1 | z_{0:1} } = lim_{ρ→∞} ( I + H_1^T R_1^{−1} ρ H_1 )^{−1} H_1^T R_1^{−1} ρ z_1
                       = lim_{ρ→∞} H_1^T R_1^{−1} ( I + ρ H_1 H_1^T R_1^{−1} )^{−1} ρ z_1
                       = lim_{ρ→∞} H_1^T R_1^{−1} ( (1/ρ) I + H_1 H_1^T R_1^{−1} )^{−1} z_1
                       = H_1^T R_1^{−1} ( H_1 H_1^T R_1^{−1} )^{−1} z_1
                       = H_1^T ( H_1 H_1^T )^{−1} z_1 .    (3.100)

Note that this is the same result given by Eq. 3.97. Comparing this result with Eq. 3.46 gives the following relationship

    [cov(x_1 | z_{0:1})] H_1^T R_1^{−1} z_1 = H_1^T ( H_1 H_1^T )^{−1} z_1 ,    (3.101)

or

    { [cov(x_1 | z_{0:1})] H_1^T R_1^{−1} − H_1^T ( H_1 H_1^T )^{−1} } z_1 = 0 .    (3.102)

Since z_1 is arbitrary, the previous equation requires

    [cov(x_1 | z_{0:1})] H_1^T R_1^{−1} = H_1^T ( H_1 H_1^T )^{−1} ,    (3.103)

or

    [cov(x_1 | z_{0:1})] H_1^T = H_1^T ( H_1 H_1^T )^{−1} R_1 .    (3.104)

When the matrix H_1 has full column rank, the covariance is given by

    [cov(x_1 | z_{0:1})] = H_1^T ( H_1 H_1^T )^{−1} R_1 H_1 ( H_1^T H_1 )^{−1} .    (3.105)

The matrix H_1 is not, in general, of full rank. However, it is possible to use a generalized inverse. For example, suppose that H_1 is such that the measurement is only influenced by a subset of the states

    H_1 = [ M  0 ] ,    (3.106)

where for sake of simplicity in the illustration we assume that M is invertible. Then,

    [cov(x_1 | z_{0:1})] H_1^T = [cov(x_1 | z_{0:1})] [ M^T ; 0 ]
                               = H_1^T ( H_1 H_1^T )^{−1} R_1
                               = [ M^T ; 0 ] ( M M^T )^{−1} R_1
                               = [ M^T ( M M^T )^{−1} R_1 ; 0 ]
                               = [ M^{−1} R_1 ; 0 ] ,    (3.107)

or

    [cov(x_1 | z_{0:1})] [ I ; 0 ] = [ M^{−1} R_1 (M^{−1})^T ; 0 ] .    (3.108)

The measurement z_1 provided information about the subset of the states that were measured. The resulting covariance calculation was only valid for the portion of the states that influenced the measurement. Since no information is provided on the other set of states (those that did not influence the measurement), the uncertainty in the estimate of these states should not be affected, and we can therefore write

    [cov(x_1 | z_{0:1})] = lim_{ρ→∞} [ M^{−1} R_1 (M^{−1})^T   0 ;  0   ρ I ] .    (3.109)

Certainly other special cases exist where an explicit expression of the a posteriori covariance has a reasonable analytical expression. The purpose here is not to discuss such special cases, but rather to point out that the covariance does exist (in the limit) and the estimate provided by Eq. 3.97 is indeed an MMSE estimate.

3.7 The Kalman Filter

Early approaches to MMSE estimation were developed by Wiener in the 1940s [97]. However, the Wiener solution does not lend itself well to the more complicated time-variable, multiple-input, multiple-output (MIMO) systems, precisely the type described by Eqs. 3.4–3.8 [19, Chs. 4–5]. In 1960, R. E. Kalman provided an alternative way of formulating the MMSE estimation problem using state-space methods [46]. The resulting estimator is usually referred to as the Kalman filter, with other variations of the name reflecting slight modifications to the original estimator proposed by Kalman. The Kalman filter is an MMSE estimator for the system described by Eqs. 3.4–3.8. As a result of the Markov form of the model and measurement equations, the Kalman filter is a recursive filter, making it a very attractive approach to real-time estimation problems.

There are many approaches to developing the famous Kalman filter equations. Some are designed to arrive at the result in the quickest fashion and often lack the insight of a more detailed derivation. The simplified approach usually begins by showing the Kalman filter to be the best linear unbiased estimator that recursively processes measurements. This is precisely the approach taken in Section 2.4. However, contrary to this discussion, the results of Section 2.4 are not limited to linear state equations of the type given by Eq. 3.8. The two most common approaches to rigorously deriving the Kalman filter as an MMSE estimator are: (1) the innovations approach [3, Ch. 5], and (2) the Bayesian approach.
While both approaches are insightful and arrive at the same result, the Bayesian approach is used here because the Bayes' estimator has already been discussed in Section 3.1. In fact, having developed the Bayes' estimator, the Kalman filter equations immediately follow.

The Kalman filter process is initialized with an initial estimate x̂_{0/0} of the true state x_0 and an initial covariance P_{0/0}. It is assumed that measurements z_k are available at each index k. From the derivation of the Bayes' estimator, it is known that the MMSE estimates are updated recursively. Assume that the Kalman filter has been in operation through measurement k − 1. Then, the a posteriori estimate of x_{k−1} is denoted by x̂_{k−1/k−1} and the associated a posteriori covariance is denoted by P_{k−1/k−1}. The a posteriori state estimate and covariance for step k − 1 are then projected ahead to step k using Eq. 3.52

    x̂_{k/k−1} = Φ_{k−1} x̂_{k−1/k−1} + d_{k−1}    (3.110a)
    P_{k/k−1} = Φ_{k−1} P_{k−1/k−1} Φ_{k−1}^T + Q_{k−1} .    (3.110b)

The estimate x̂_{k/k−1} and covariance P_{k/k−1} are referred to as the a priori estimate and a priori covariance because the measurement at step k has not been used to obtain them. That is, x̂_{k/k−1} is the MMSE estimate of x_k conditioned on x̂_{k−1/k−1}. The MMSE estimate of x_k and the associated covariance conditioned on x̂_{k−1/k−1} and z_k, which are referred to as the a posteriori estimate and a posteriori covariance for step k, are obtained using Eq. 3.54

    x̂_{k/k} = x̂_{k/k−1} + K_k ( z_k − H_k x̂_{k/k−1} )    (3.111a)
    P_{k/k} = ( I − K_k H_k ) P_{k/k−1}    (3.111b)
    K_k = P_{k/k−1} H_k^T ( H_k P_{k/k−1} H_k^T + R_k )^{−1} .    (3.111c)

3.7.1 Covariance Simulations

One can see from the Kalman filter equations that it is not necessary to update the state at all. In application, the state estimate is probably more useful than the covariance. However, in analysis, the covariance is often the only quantity of interest.
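One cycle of the recursion in Eqs. 3.110–3.111 can be sketched in matrix form (a minimal numpy sketch; the function and variable names are mine, not the dissertation's):

```python
# Sketch: one Kalman cycle, time update (Eq. 3.110) then measurement update (Eq. 3.111).
import numpy as np

def kf_step(xhat, P, z, Phi, Q, H, R, d=None):
    """Propagate (xhat, P) one step ahead, then fold in the measurement z."""
    d = np.zeros_like(xhat) if d is None else d
    # a priori (time update, Eq. 3.110)
    xhat = Phi @ xhat + d
    P = Phi @ P @ Phi.T + Q
    # a posteriori (measurement update, Eq. 3.111)
    S = H @ P @ H.T + R                       # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)            # Kalman gain (Eq. 3.111c)
    xhat = xhat + K @ (z - H @ xhat)
    P = (np.eye(len(xhat)) - K @ H) @ P
    return xhat, P

# Example: 2-state system with a scalar position measurement.
x1, P1 = kf_step(np.zeros(2), np.eye(2), np.array([0.7]),
                 np.eye(2), 0.01 * np.eye(2),
                 np.array([[1.0, 0.0]]), np.array([[0.25]]))
```

A covariance-only analysis, as discussed next, would keep just the two P lines and the gain.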
While the state is just a sample from the random process, the covariance represents the statistical properties of all such samples. As such, the covariance is much more representative of the system behavior. A covariance simulation only requires one to compute the a priori covariance given by Eq. 3.110b

    P_{k/k−1} = Φ_{k−1} P_{k−1/k−1} Φ_{k−1}^T + Q_{k−1} .    (3.112)

The a posteriori covariance can be computed using Eq. 3.111b, which also requires computing the Kalman gain given by Eq. 3.111c. It is possible to compute the a posteriori covariance without computing the Kalman gain by using the following equation [19, p. 247]

    P_{k/k} = ( P_{k/k−1}^{−1} + H_k^T R_k^{−1} H_k )^{−1} .    (3.113)

The covariance can be used to determine if the quality of the state estimate is sufficient for a particular application.

3.7.2 Scalar System Estimation Example

Consider the system described by the first-order differential equation

    ẋ = −β x + √(2β) σ u(t) ,    (3.114)

where u(t) is a white noise process

    R_U(τ) = δ(τ)    (3.115a)
    P_U(Ω) = 1 .    (3.115b)

This system was analyzed in Appendix B, where the autocorrelation was found to be

    R_X(τ) = σ² e^{−β|τ|} .    (3.116)

Using the results of Appendix B, the sampled system and measurement equation are given by

    x_{k+1} = φ_k x_k + w_k    (3.117a)
    z_k = H_k x_k + v_k .    (3.117b)

The state transition matrix (a scalar in this example) for the differential equation is given by

    φ_k = e^{−βΔt} .    (3.118)

The variance of w_k is given by

    Q_k = E[ w_k² ]
        = E[ ( ∫_0^{Δt} √(2β) σ e^{−βξ} u(ξ) dξ )( ∫_0^{Δt} √(2β) σ e^{−βη} u(η) dη ) ]
        = 2βσ² ∫_0^{Δt} ∫_0^{Δt} e^{−βξ} e^{−βη} E[ u(ξ) u(η) ] dξ dη
        = 2βσ² ∫_0^{Δt} ∫_0^{Δt} e^{−βξ} e^{−βη} δ(ξ − η) dξ dη
        = 2βσ² ∫_0^{Δt} e^{−2βξ} dξ
        = σ² ( 1 − e^{−2βΔt} )
        = Q .    (3.119)

This results in the autocorrelation function

    R_W[k] = δ(k) Q = σ² ( 1 − e^{−2βΔt} ) δ(k) .    (3.120)

The same result could have been obtained using Eq. 4.45. The mean value of x(t) is zero [19, pp. 83–84]

    lim_{τ→∞} R_X(τ) = 0  ⇒  E[X(t)] = 0 .    (3.121)

The mean-square value for the process (also the variance, since the process has a zero mean) at t = 0 is found by

    E[ X²(t) ] = R_X(0) = σ² .    (3.122)

If a Kalman filter is to be used, the initial estimate of x(t) is given by

    x̂_{0/0} = E[ X(0) ] = 0 ,    (3.123)

and the error covariance is given by

    P_{0/0} = R_X(0) = σ² .    (3.124)

Numerical Results [19, pp. 223–225]

If the process noise w_k is drawn from a Gaussian density function with variance σ², then the sequence x is said to be a first-order Gauss-Markov process. It is first-order because the equation is a first-order difference equation; it is Gauss because the density of the process noise w_k is Gaussian; it is Markov because (due to the state transition matrix) the state update depends only on the previous state. Let the autocorrelation parameters be

    β = 1    (3.125a)
    σ = 1 .    (3.125b)

Then, the autocorrelation is given by

    R_X(τ) = e^{−|τ|} .    (3.126)

Suppose we have a sequence of noisy measurements of this process taken 0.02 seconds apart beginning at t = 0, with autocorrelation given by

    R_k = E{ v_k² } = 1 .    (3.127)

The measurement relationship to x is

    H_k = 1 .    (3.128)

A Kalman filter is to be used to obtain an optimal estimate of x(t). The state transition matrix is given by

    φ_k = e^{−βΔt} = e^{−0.02} ≈ 0.9802 .    (3.129)

The discrete-time process noise has a variance given by

    Q = σ² ( 1 − e^{−2βΔt} ) = 1 − e^{−2(0.02)} ≈ 0.03921 .    (3.130)

The process has a zero mean and unity variance. In summary, the process parameters are given below.

    φ_k = 0.98020
    Q_k = 0.03921
    R_k = 1.00000    (3.131)
    P_{0/0} = 1.00000
    x̂_{0/0} = 0

Figures 3.1–3.3 show the Kalman filter's performance for this example. The estimation error is shown in Figure 3.2, along with plus or minus the square root of the covariance P_k. The estimation error is Gaussian and, therefore, it should lie within one standard deviation about 68% of the time. Of the twenty-four samples shown, eight fall outside of the 1-σ error bound.
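The numerical setup above is easy to reproduce. The following sketch (my own code and seed, not the dissertation's) simulates the first-order Gauss-Markov process with φ = 0.9802, Q = 0.03921, R = 1 and runs the scalar Kalman recursion; the gain starts at 0.5 and settles to a steady-state value, consistent with the behavior plotted in Figure 3.3.

```python
# Sketch: scalar Kalman filter on the first-order Gauss-Markov process above.
import numpy as np

rng = np.random.default_rng(42)
phi, Q, R = np.exp(-0.02), 1 - np.exp(-0.04), 1.0
x, xhat, P = rng.normal(0, 1), 0.0, 1.0      # P_{0/0} = sigma^2 = 1

gains = []
for k in range(500):
    x = phi * x + rng.normal(0, np.sqrt(Q))  # truth
    z = x + rng.normal(0, np.sqrt(R))        # measurement (H = 1)
    P = phi * P * phi + Q                    # a priori covariance (Eq. 3.110b)
    K = P / (P + R)                          # scalar Kalman gain (Eq. 3.111c)
    gains.append(K)
    xhat = phi * xhat + K * (z - phi * xhat) # a posteriori estimate
    P = (1 - K) * P                          # a posteriori covariance
print(gains[0], gains[-1])                   # 0.5 at start, ~0.165 at steady state
```

Note the gain recursion is deterministic (it does not depend on the measurements), which is exactly why a covariance-only simulation, as in Section 3.7.1, suffices for analysis.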
Therefore, the experimental results show that the state estimate lies within the 1-σ boundary 67% of the time, in excellent agreement with the theoretical value of 68%.

[Figure 3.1. Kalman Filter Performance for a 1st Order Gauss-Markov Process.]
[Figure 3.2. Estimation Error (blue) and ±√P_k (red) for a 1st Order Gauss-Markov Process.]
[Figure 3.3. Kalman Gain for a 1st Order Gauss-Markov Process.]

The Kalman gain for the problem quickly reaches a steady-state value. For this reason, it is more likely that one would simply implement the steady-state gains rather than the time-varying gains. The full utility of the Kalman filter is typically only realized when the process or measurement noise matrices are time-varying (i.e., when R_k or Q_k are functions of k). The use of steady-state gains is discussed further in [7, pp. 89–100].

3.8 Multiple Model

The equations governing optimal estimation in a general multiple model context have been developed in Section 2.1. These general multiple model equations will now be applied to the linear sampled-data system. As in the general multiple model context, the model probabilities are given by

    P(M_j) = µ_0^{(j)} ,    (3.132)

with corresponding density

    f_M(m) = Σ_{j=1}^{r} µ_0^{(j)} δ(m − j) .    (3.133)

Each model m has a linear measurement equation of the form given by Eq. 3.4

    z_k^{(m)} = H_k^{(m)} x_k^{(m)} + v_k^{(m)} ,    (3.134)

where the superscript (m) indicates that the measurement equation is valid for model m. The measurement noise is white with covariance

    E{ v_k^{(m)} (v_k^{(m)})^T } = R_k^{(m)} .    (3.135)

Similarly, the process dynamics are of the type specified by Eq.
3.8

    x_{k+1}^{(m)} = Φ_k^{(m)} x_k^{(m)} + w_k^{(m)} + d_k^{(m)} .    (3.136)

The process noise is white with covariance

    E{ w_k^{(m)} (w_k^{(m)})^T } = Q_k^{(m)} .    (3.137)

For each model m, an initial PDF f(x_0 | m) is given, and as usual we include z_0 as a placeholder for measurements

    f(x_0 | z_0, m) ≜ f(x_0 | m) .    (3.138)

For the linear sampled-data system, the density is given by

    f(x_0 | z_0, m) = N( x̂_{0/0}^{(m)}, P_{0/0}^{(m)} ) ,    (3.139)

where

    x̂_{0/0}^{(m)} = E{ x_0 | M = m }    (3.140a)
    P_{0/0}^{(m)} = cov( x_0 | M = m ) .    (3.140b)

A Bayes' estimator for model m is given by Eqs. 3.51–3.54c with all estimates including the superscript (m). With the state model given by Eq. 3.136, the a priori conditional density of x_k is given by

    f(x_k | z_{0:k−1}, m) = N( x̂_{k/k−1}^{(m)}, P_{k/k−1}^{(m)} ) ,    (3.141)

where

    x̂_{k/k−1}^{(m)} = Φ_{k−1}^{(m)} x̂_{k−1/k−1}^{(m)} + d_{k−1}^{(m)}    (3.142)
    P_{k/k−1}^{(m)} = Φ_{k−1}^{(m)} P_{k−1/k−1}^{(m)} (Φ_{k−1}^{(m)})^T + Q_{k−1}^{(m)} .    (3.143)

Given a new measurement z_k, the a priori density is updated according to

    f(x_k | z_{0:k}, m) = N( x̂_{k/k}^{(m)}, P_{k/k}^{(m)} ) ,    (3.144)

where

    x̂_{k/k}^{(m)} = x̂_{k/k−1}^{(m)} + K_k^{(m)} ( z_k − H_k^{(m)} x̂_{k/k−1}^{(m)} )    (3.145)
    P_{k/k}^{(m)} = ( I − K_k^{(m)} H_k^{(m)} ) P_{k/k−1}^{(m)}    (3.146)
    K_k^{(m)} = P_{k/k−1}^{(m)} (H_k^{(m)})^T ( H_k^{(m)} P_{k/k−1}^{(m)} (H_k^{(m)})^T + R_k^{(m)} )^{−1} .    (3.147)

The a posteriori model density, which is the density of the model m conditioned on the measurements z_{0:k}, is given by Bayes' rule

    f(m | z_{0:k}) = f(m, z_{0:k}) / f(z_{0:k})
                   = f(m, z_k | z_{0:k−1}) f(z_{0:k−1}) / [ f(z_k | z_{0:k−1}) f(z_{0:k−1}) ]
                   = f(m, z_k | z_{0:k−1}) / f(z_k | z_{0:k−1})
                   = f(z_k | z_{0:k−1}, m) f(m | z_{0:k−1}) / f(z_k | z_{0:k−1}) .    (3.148)

That is, the density f(m | z_{0:k}) can be recursively computed. The value of f(z_k | z_{0:k−1}, m) can be obtained using Eq. 3.59

    f(z_k | z_{0:k−1}, m) =
      exp{ −(1/2) ( z_k − H_k^{(m)} x̂_{k/k−1}^{(m)} )^T [ H_k^{(m)} P_{k/k−1}^{(m)} (H_k^{(m)})^T + R_k^{(m)} ]^{−1} ( z_k − H_k^{(m)} x̂_{k/k−1}^{(m)} ) }
      / { (2π)^{m/2} | H_k^{(m)} P_{k/k−1}^{(m)} (H_k^{(m)})^T + R_k^{(m)} |^{1/2} } .    (3.149)

The conditional density f(m | z_{0:k−1}) is known from a previous iteration and is initialized by

    f(m | z_{0:0}) = f_M(m) .    (3.150)

For the first measurement z_1, Eq. 3.148 is computed by

    f(m | z_{0:1}) = f(z_1 | z_0, m) f(m | z_0) / f(z_1 | z_0)
                   = f(z_1 | z_0, m) f_M(m) / f(z_1 | z_0)
                   = [ f(z_1 | z_0, m) / f(z_1 | z_0) ] Σ_{j=1}^{r} µ_0^{(j)} δ(m − j) .    (3.151)

Alternately, using probability mass functions for M, we have

    P(m | z_{0:1}) = [ f(z_1 | z_0, m) / f(z_1 | z_0) ] µ_0^{(m)} = [ f(z_1 | z_0, m) / f(z_1 | z_0) ] P(m | z_0) .    (3.152)

Using the PMF representation, Eq. 3.148 gives the following recursion formula

    P(m | z_{0:k}) = [ f(z_k | z_{0:k−1}, m) / f(z_k | z_{0:k−1}) ] P(m | z_{0:k−1}) .    (3.153)

The value of f(z_k | z_{0:k−1}) is computed by

    f(z_k | z_{0:k−1}) = ∫ f(z_k, m | z_{0:k−1}) dm
                       = ∫ f(z_k | z_{0:k−1}, m) f(m | z_{0:k−1}) dm
                       = Σ_{m=1}^{r} f(z_k | z_{0:k−1}, m) P(m | z_{0:k−1}) .    (3.154)

Substituting this result into Eq. 3.153 gives

    P(m | z_{0:k}) = f(z_k | z_{0:k−1}, m) P(m | z_{0:k−1}) / Σ_{m=1}^{r} f(z_k | z_{0:k−1}, m) P(m | z_{0:k−1}) .    (3.155)

The expectation of the state conditioned on the measurements z_{0:k} and the model choice m is given by Eq. 3.145

    E{ x_k | z_{0:k}, m } = x̂_{k/k}^{(m)} .    (3.156)

The MMSE estimate is the conditional mean and is given by

    x̂_{k/k} = E{ x_k | z_{0:k} }
            = ∫ x_k f(x_k | z_{0:k}) dx_k
            = ∫∫ x_k f(x_k, m | z_{0:k}) dx_k dm
            = ∫∫ x_k f(x_k | m, z_{0:k}) f(m | z_{0:k}) dx_k dm
            = Σ_{m=1}^{r} ∫ x_k f(x_k | m, z_{0:k}) dx_k P(m | z_{0:k})
            = Σ_{m=1}^{r} x̂_{k/k}^{(m)} P(m | z_{0:k}) .    (3.157)

The density f(x_k | z_{0:k}) can be computed as follows

    f(x_k | z_{0:k}) = ∫ f(x_k, m | z_{0:k}) dm
                     = ∫ f(x_k | m, z_{0:k}) f(m | z_{0:k}) dm
                     = Σ_{m=1}^{r} f(x_k | m, z_{0:k}) P(m | z_{0:k}) .    (3.158)

Since each of the f(x_k | m, z_{0:k}) are Gaussian, the conditional density f(x_k | z_{0:k}) is a weighted sum of Gaussian PDFs. This of course means that the conditional density f(x_k | z_{0:k}) is not Gaussian, making statistical inference about the estimation error difficult (i.e.
confidence regions).

Kalman Filter. The multiple model Kalman filter equations follow directly from previous results. Each of the m Kalman filters is initialized with an initial estimate x̂_{0/0}^{(m)}, an initial covariance P_{0/0}^{(m)}, and an initial model probability µ_0^{(m)}. At each step k, the standard Kalman filter equations are computed for each model m

    x̂_{k/k−1}^{(m)} = Φ_{k−1}^{(m)} x̂_{k−1/k−1}^{(m)} + d_{k−1}^{(m)}    (3.159)
    P_{k/k−1}^{(m)} = Φ_{k−1}^{(m)} P_{k−1/k−1}^{(m)} (Φ_{k−1}^{(m)})^T + Q_{k−1}^{(m)}    (3.160)
    K_k^{(m)} = P_{k/k−1}^{(m)} (H_k^{(m)})^T ( H_k^{(m)} P_{k/k−1}^{(m)} (H_k^{(m)})^T + R_k^{(m)} )^{−1}    (3.161)
    x̂_{k/k}^{(m)} = x̂_{k/k−1}^{(m)} + K_k^{(m)} ( z_k − H_k^{(m)} x̂_{k/k−1}^{(m)} )    (3.162)
    P_{k/k}^{(m)} = ( I − K_k^{(m)} H_k^{(m)} ) P_{k/k−1}^{(m)} .    (3.163)

The conditional density f(z_k | z_{0:k−1}, m) is evaluated at the current measurement z_k

    f(z_k | z_{0:k−1}, m) =
      exp{ −(1/2) ( z_k − H_k^{(m)} x̂_{k/k−1}^{(m)} )^T [ H_k^{(m)} P_{k/k−1}^{(m)} (H_k^{(m)})^T + R_k^{(m)} ]^{−1} ( z_k − H_k^{(m)} x̂_{k/k−1}^{(m)} ) }
      / { (2π)^{m/2} | H_k^{(m)} P_{k/k−1}^{(m)} (H_k^{(m)})^T + R_k^{(m)} |^{1/2} } .    (3.164)

The conditional model probability P(m | z_{0:k}) is updated

    P(m | z_{0:k}) = f(z_k | z_{0:k−1}, m) P(m | z_{0:k−1}) / Σ_{m=1}^{r} f(z_k | z_{0:k−1}, m) P(m | z_{0:k−1}) .    (3.165)

The MMSE estimate is formed as

    x̂_{k/k} = Σ_{m=1}^{r} x̂_{k/k}^{(m)} P(m | z_{0:k}) .    (3.166)

Chapter 4

Stochastic Motion Models

As with any other chapter of this dissertation, an entire book could be written on the subject of stochastic motion modeling. An excellent introduction to the subject is given in [12, Ch. 4], under the title of "Modeling and Tracking Dynamic Targets". Without question, domain knowledge can greatly assist in motion modeling. For example, one would use a different motion model for tracking a boost-stage missile [12, p. 242] than one would for tracking people (deformable objects) in a cluttered environment [14, p. 267], [83, p. 744]. It may be that one would like to use a tracker to guide a missile to a target.
If such is the case, then the motion model will likely consist of the states that are required by the guidance law. That is, the system model will be influenced by how the tracker fits into a larger design, whether it be a missile guidance system or an automated visual-to-audio sign language interpretation system. This chapter will introduce some of the more common stochastic motion models. The chapter will begin with a discussion of Markov models and their relation to motion modeling. Next, process noise modeling will be discussed. This will be followed by motion models of increasing complexity.

4.1 Markov Models

Regardless of the motion model, there is a distinguishing feature which makes the problem manageable: motion, by its very nature, is Markov (see Section A.8.2). Thus, we can expect models to be of the form

    discrete:   x_k = f_k( x_{k−1}, u_{k−1}, v_{k−1} )
    continuous: ẋ = f( x, u, v ) ,    (4.1)

where x is the system state vector and typically consists of some type of position and velocity information; u is a known control or disturbance input; v is an unknown white forcing function usually referred to as process noise. The fact that v is white is not limiting, as a correlated process can be formed by filtering a white process and incorporating the filter states into the process model [32, pp. 78–84], [19, pp. 226–228] and [3, Ch. 11]. It can also be argued that the Markov property is a direct result of state-space modeling. By definition, the state of a system is a set of quantities which allows one to uniquely determine the status of the system for all future times if all inputs are specified [18, p. 76]. That is, only the current state, not past values of it, influences the future state of the system. The generality of Markov processes is briefly discussed in [20, p. 317].

4.1.1 Principle of Inertia

Most of the objects one can conceive of tracking possess inertia. That is, they are resistant to changes in their current state of motion.
The time derivative of position is velocity and the time derivative of velocity is acceleration.¹ Let r represent a vector locating an object of mass M. Newton's law states that a rigid object of mass M will undergo an acceleration r̈ if acted upon by a force F according to the relationship

    r̈ = (1/M) F .    (4.2)

Let x be a state vector with the first three states position and the next three states velocity (additional states would be needed if one were interested in the rotational motion of an object)

    ẋ = d/dt [ r ; ṙ ] = [ 0_{3×3}  I_{3×3} ; 0_{3×3}  0_{3×3} ] x + [ 0_{3×3} ; (1/M) I_{3×3} ] F .    (4.3)

¹ The time derivative of acceleration is jerk. The author hesitates to say how a jerk changes with time!

Physically realizable forces are finite in magnitude. For such systems, the process model is given by Eq. 4.3. The force F can consist of a known, deterministic component and an unknown stochastic component. If the stochastic component has a non-uniform PSD, then additional states must be added to shape a white process to the desired form. Process noise models are discussed in the next section. The principle of inertia, which represents domain knowledge for the tracking problem, gives much of the structure to stochastic motion models. While the model given by Eq. 4.3 is only applicable to Cartesian coordinate systems, the principle of inertia is more general. In other, more complex coordinate systems (e.g., spherical) with interacting states, the principle of inertia will still apply, but the resulting model will often be nonlinear, which will likely complicate state estimation. Generalized coordinate systems and the principle of inertia are discussed in [36, Ch. 6]. Since the position states in a Cartesian system are decoupled, the state model need only be developed for one spatial direction. The remaining two spatial directions have state models that are equivalent to the one developed.
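One axis of the model in Eq. 4.3, sampled with period T, gives the familiar constant-velocity transition matrix. The sketch below (my own illustration; the white-acceleration noise covariance is the standard discretization result, not derived in the text above) shows the discrete position-velocity pair for a single Cartesian direction.

```python
# Sketch: one Cartesian axis of the inertia model (Eq. 4.3), discretized.
import numpy as np

T = 0.5
Phi = np.array([[1.0, T],      # position += velocity * T
                [0.0, 1.0]])   # velocity unchanged by the deterministic part

# Discrete process noise for white acceleration of PSD q (standard result):
q = 0.1
Q = q * np.array([[T**3 / 3, T**2 / 2],
                  [T**2 / 2, T]])

x = np.array([0.0, 1.0])       # initial position, velocity
print(Phi @ x)                 # deterministic propagation: [0.5, 1.0]
```

The same Phi and Q blocks would be repeated for each of the three decoupled spatial directions.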
4.2 Process Noise Models

The most common state estimation schemes are only feasible for processes that have white noise inputs. White noise is discussed in Section A.8.1. Even if the process noise is not white, a white noise model is appropriate when the noise is nearly uniform over frequencies that define the system (model) bandwidth. However, it should be no surprise that white noise does not reflect reality in many situations. More often the noise is colored, which means that the power spectral density (PSD) of the noise is not uniform across all frequencies, as it is in the case of a white noise process. Fortunately, many colored noise processes can be adequately modeled as the output of a linear time-invariant (LTI) filter that is driven by a white noise process. Similarly, a colored noise sequence can usually be adequately modeled as the output of a discrete-time linear shift-invariant filter that is driven by a white noise sequence. The following scalar system is to be used as a shaping filter

    ẋ = −a x + b u ,    (4.4)

where a > 0 and u is a white noise process with autocorrelation and PSD (Section A.8.1)

    R_U(t) = A δ(t)    (4.5a)
    P_U(Ω) = A .    (4.5b)

The output autocorrelation and PSD are given by the following equations.

    R_X(τ) = (A b² / 2a) e^{−a|τ|}
    P_X(Ω) = A b² / (a² + Ω²)    (4.6)

Thus, the simple scalar system given by Eq. 4.4 has been used to create a process with an exponential autocorrelation function. The parameter a determines the correlation time of the process. Needless to say, a white process is not possible to synthesize. However, a white sequence is. Consider the system that results from sampling the continuous-time system

    x_{k+1} = φ_k x_k + w_k .    (4.7)

The state transition function φ_k and autocorrelation R_W are given by the following equations.

    φ_k = e^{−aT}    (4.8a)
    R_W = δ(n − k) Q    (4.8b)
    Q = (b² / 2a) ( 1 − e^{−2aT} )    (4.8c)

The sampled system can be simulated by applying a white noise sequence with variance Q to the discrete equivalent system.
Although filtering a white sequence with a shaping filter produces the desired autocorrelation, it does have a shortcoming: the density of the output signal x_k cannot be specified. This is explained in more detail in Appendix B.5.3.

4.3 Random Walk

One can hardly discuss motion models without mention of the infamous random walk. The term random walk derives its name from the example of a man who takes fixed-length steps in arbitrary directions [32, p. 79]. It is most common to see a random walk model in discrete form

\[
x(k+1) = x(k) + w(k) \,. \tag{4.9}
\]

After N steps we have

\[
x(N) = x(N-1) + w(N-1)
= x(N-2) + w(N-2) + w(N-1)
= x(0) + \sum_{n=0}^{N-1} w(n) \,. \tag{4.10}
\]

Taking the expectation with x(0) = 0 gives the mean

\[
E\{x(N)\} = E\Big\{x(0) + \sum_{n=0}^{N-1} w(n)\Big\}
= E\{x(0)\} + \sum_{n=0}^{N-1} E\{w(n)\}
= E\{x(0)\} = 0 \,. \tag{4.11}
\]

The autocorrelation (for unit-variance steps) is

\[
R_X(m,n) = E\{x(m)\,x(n)\}
= E\Big\{\Big(x(0) + \sum_{r=0}^{m-1} w(r)\Big)\Big(x(0) + \sum_{k=0}^{n-1} w(k)\Big)\Big\}
= E\{x^2(0)\} + E\Big\{x(0)\sum_{r=0}^{m-1} w(r)\Big\}
+ E\Big\{x(0)\sum_{k=0}^{n-1} w(k)\Big\}
+ \sum_{r=0}^{m-1}\sum_{k=0}^{n-1} E\{w(r)\,w(k)\} \tag{4.12}
\]
\[
= \sum_{r=0}^{m-1}\sum_{k=0}^{n-1} E\{w(r)\,w(k)\}
= \min(m,n)
= m\,S(n-m) + n\,S(m-n) \,. \tag{4.13}
\]

Thus, the process is non-stationary.

4.3.1 Continuous Time Random Walk

The continuous-time version of the random walk is [19, pp. 100-102]

\[
\dot{x} = u(t) \,, \tag{4.14}
\]

where u(t) is white noise

\[
E\{u(t+\tau)\,u(t)\} = A\,\delta(\tau) \,. \tag{4.15}
\]

Integrating with x(t=0) = 0 gives

\[
x(t) = \int_0^t u(\gamma)\, d\gamma \,. \tag{4.16}
\]

Taking the expectation gives

\[
E\{x(t)\} = \int_0^t E\{u(\gamma)\}\, d\gamma = 0 \,. \tag{4.17}
\]

The autocorrelation is

\[
E\{x(t_1)\,x(t_2)\} = E\Big\{\int_0^{t_1} u(\gamma)\, d\gamma \int_0^{t_2} u(\beta)\, d\beta\Big\}
= \int_0^{t_2}\!\!\int_0^{t_1} E\{u(\gamma)\,u(\beta)\}\, d\gamma\, d\beta
= A \int_0^{t_2}\!\!\int_0^{t_1} \delta(\gamma-\beta)\, d\gamma\, d\beta
= A \min(t_1,t_2)
= A\left[\,t_1 S(t_2-t_1) + t_2 S(t_1-t_2)\,\right] \,. \tag{4.18}
\]

Sampled Continuous Time Random Walk. The continuous-time state transition matrix is equal to one (this can be seen by setting u(t) = 0 and solving the homogeneous equation ẋ = 0)

\[
\phi(t) = 1 \,, \tag{4.19}
\]

and the equivalent noise is given by

\[
Q = E\{w_k^2\} = \int_{t_k}^{t_{k+1}} A\, d\eta = AT \,. \tag{4.20}
\]
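The sampled random walk just derived is easy to check by Monte Carlo: driving Eq. 4.9 with increments of variance Q = AT should give E{x} = 0 and Var{x(t)} = A t, consistent with Eqs. 4.17–4.20. The use of Gaussian increments and all parameter values below are illustrative assumptions.

```python
import math
import random

random.seed(7)

A, T, N, WALKS = 2.0, 0.01, 100, 20_000   # illustrative values; t_N = N*T = 1 s
Q = A * T                                  # discrete increment variance, Eq. 4.20

sum_x, sum_xx = 0.0, 0.0
for _ in range(WALKS):
    x = 0.0
    for _ in range(N):                     # x(k+1) = x(k) + w(k), Var{w} = Q
        x += random.gauss(0.0, math.sqrt(Q))
    sum_x += x
    sum_xx += x * x

mean = sum_x / WALKS                       # should be near 0
var = sum_xx / WALKS                       # should be near A * N * T = 2
```

The growth of the sample variance with elapsed time (rather than with anything else) is exactly the non-stationarity noted after Eq. 4.13.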
In summary, the sampled continuous-time random walk is given by

\[
x(k+1) = x(k) + w(k) \tag{4.21a}
\]
\[
R_{WW}[k] = Q\,\delta(k) \,. \tag{4.21b}
\]

Therefore, the discrete equivalent is given by Eq. 4.9 with E{w_{m+k} w_m} = R_WW[k] = Q δ(k) = AT δ(k).

Relation to Inertia. The concept of a random walk is something that one can easily visualize: a person taking random, uncorrelated steps. It seems that the definition itself is sufficient to justify the credibility of a random walk. However, one would still like a justification for the continuous-time random walk model and its relationship to Newton's law of motion.

The system model given by Eq. 4.3 can be generalized such that the force F has components due to a restoring force, viscous friction, and noise. The restoring force is essentially a spring; it acts in the direction opposite to r. Viscous friction is a dissipative force that can be modeled as a constant times the velocity of the object

\[
M\ddot{\mathbf{r}} = \mathbf{F} = \mathbf{u} - k\mathbf{r} - c\dot{\mathbf{r}} \,. \tag{4.22}
\]

Rearranging gives

\[
M\ddot{\mathbf{r}} + k\mathbf{r} + c\dot{\mathbf{r}} = \mathbf{u} \,. \tag{4.23}
\]

This model has a wide range of application in engineering. It can be used to represent a mass-spring-damper system or an inductor-capacitor-resistor system. It can also be used to describe the movement of a particle in a liquid, subjected to collisions and other forces, resulting in a motion termed Brownian motion [63, pp. 447-449]. If the restoring constant k and the mass M are small in comparison to the viscous damping c, then an approximate model is given by

\[
\dot{\mathbf{r}} = \frac{1}{c}\,\mathbf{u} \,. \tag{4.24}
\]

This random process is a random walk and is often called the Wiener process.

4.4 White Acceleration

Consider one direction in a Cartesian coordinate system. This may represent the row or column coordinate on an FPA. Alternately, it may represent one linear direction in three-dimensional space.
The second time-derivative of this coordinate is

\[
\ddot{r} = u \,, \tag{4.25}
\]

where u is a white noise process

\[
R_U(\tau) = A\,\delta(\tau) \,. \tag{4.26}
\]

This model corresponds to a random walk velocity

\[
\ddot{r} = \frac{d}{dt}\,\dot{r} = u \,. \tag{4.27}
\]

Let the state be composed of the coordinate and its first time-derivative

\[
\mathbf{x} = \begin{bmatrix} r & \dot{r} \end{bmatrix}^T \,. \tag{4.28}
\]

The state model is given by

\[
\dot{\mathbf{x}} = \mathbf{F}\mathbf{x} + \mathbf{G}u \,, \tag{4.29}
\]

where

\[
\mathbf{F} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \tag{4.30a}
\]
\[
\mathbf{G} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \,. \tag{4.30b}
\]

4.4.1 Discrete Equivalent

The homogeneous system has u = 0

\[
\dot{\mathbf{x}} = \mathbf{F}\mathbf{x} \,. \tag{4.31}
\]

Taking the Laplace transform gives

\[
s\,\mathbf{x}(s) - \mathbf{x}_0 = \mathbf{F}\mathbf{x}(s) \,. \tag{4.32}
\]

Solving for x gives

\[
\mathbf{x}(s) = (s\mathbf{I} - \mathbf{F})^{-1}\mathbf{x}_0 = \mathbf{\Phi}(s)\,\mathbf{x}_0 \,, \tag{4.33}
\]

where Φ(s) is the Laplace transform of the state transition matrix

\[
\mathbf{\Phi}(s) = (s\mathbf{I} - \mathbf{F})^{-1}
= \begin{bmatrix} s & -1 \\ 0 & s \end{bmatrix}^{-1}
= \frac{1}{s^2}\begin{bmatrix} s & 1 \\ 0 & s \end{bmatrix} \,. \tag{4.34}
\]

Taking the inverse Laplace transform gives

\[
\mathbf{\Phi}(t) = \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} S(t) \,, \tag{4.35}
\]

where S(t) is the unit step function. The complete solution is given by

\[
\mathbf{x}(t_{k+1}) = \mathbf{\Phi}(t_{k+1}-t_k)\,\mathbf{x}(t_k) + \mathbf{w}_k
= \begin{bmatrix} 1 & T \\ 0 & 1 \end{bmatrix}\mathbf{x}(t_k) + \mathbf{w}_k \,, \tag{4.36}
\]

where T is the sample period (T⁻¹ is the sample rate) and w_k is white noise with an autocorrelation that can be found by Eq. B.70

\[
R_W[k,n] = \delta(n-k)\,\mathbf{Q} \,, \tag{4.37}
\]

where

\[
\mathbf{Q} = \int_0^T \mathbf{\Phi}(\nu)\,\mathbf{G} A \mathbf{G}^T \mathbf{\Phi}^T(\nu)\, d\nu
= A\int_0^T \begin{bmatrix} \nu \\ 1 \end{bmatrix}\begin{bmatrix} \nu & 1 \end{bmatrix} d\nu
= A\int_0^T \begin{bmatrix} \nu^2 & \nu \\ \nu & 1 \end{bmatrix} d\nu
= A\begin{bmatrix} \frac{T^3}{3} & \frac{T^2}{2} \\ \frac{T^2}{2} & T \end{bmatrix} \,. \tag{4.38}
\]

A stochastic motion model for a white noise acceleration process is completely defined by Eqs. 4.36-4.38.

4.5 Correlated Acceleration

A correlated acceleration model can be obtained by appending the process noise model given by Eq. 4.4 to Newton's equations given by Eq. 4.3. The equations are uncoupled, so only one dimension needs to be analyzed. The state is the position, velocity, and acceleration

\[
\mathbf{x} = \begin{bmatrix} r & \dot{r} & \ddot{r} \end{bmatrix}^T \,. \tag{4.39}
\]
The process model is given by

\[
\dot{\mathbf{x}} = \mathbf{F}\mathbf{x} + \mathbf{G}u \,, \tag{4.40}
\]

where

\[
\mathbf{F} = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & -\alpha \end{bmatrix} \tag{4.41a}
\]
\[
\mathbf{G} = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix}^T \,. \tag{4.41b}
\]

4.5.1 Discrete Equivalent

The Laplace transform of the state transition matrix is

\[
\mathbf{\Phi}(s) = (s\mathbf{I} - \mathbf{F})^{-1}
= \begin{bmatrix} s & -1 & 0 \\ 0 & s & -1 \\ 0 & 0 & s+\alpha \end{bmatrix}^{-1}
= \frac{1}{s^2(s+\alpha)}\begin{bmatrix} s(s+\alpha) & s+\alpha & 1 \\ 0 & s(s+\alpha) & s \\ 0 & 0 & s^2 \end{bmatrix}
= \begin{bmatrix} \frac{1}{s} & \frac{1}{s^2} & \frac{1}{s^2(s+\alpha)} \\ 0 & \frac{1}{s} & \frac{1}{s(s+\alpha)} \\ 0 & 0 & \frac{1}{s+\alpha} \end{bmatrix} \,. \tag{4.42}
\]

Taking the inverse Laplace transform gives

\[
\mathbf{\Phi}(t) = \begin{bmatrix}
1 & t & \frac{1}{\alpha^2}\left(-1+\alpha t+e^{-\alpha t}\right) \\
0 & 1 & \frac{1}{\alpha}\left(1-e^{-\alpha t}\right) \\
0 & 0 & e^{-\alpha t}
\end{bmatrix} \,. \tag{4.43}
\]

The discrete-time process noise has an autocorrelation given by

\[
R_W[k,n] = \delta(n-k)\,\mathbf{Q} \,, \tag{4.44}
\]

where Q can be determined from Eq. B.70

\[
\mathbf{Q} = \int_0^T \mathbf{\Phi}(t)\,\mathbf{G} A \mathbf{G}^T \mathbf{\Phi}^T(t)\, dt
= A\int_0^T
\begin{bmatrix} \frac{1}{\alpha^2}\left(-1+\alpha t+e^{-\alpha t}\right) \\ \frac{1}{\alpha}\left(1-e^{-\alpha t}\right) \\ e^{-\alpha t} \end{bmatrix}
\begin{bmatrix} \frac{1}{\alpha^2}\left(-1+\alpha t+e^{-\alpha t}\right) \\ \frac{1}{\alpha}\left(1-e^{-\alpha t}\right) \\ e^{-\alpha t} \end{bmatrix}^T dt \,. \tag{4.45}
\]

This equation would be tedious to evaluate, but certainly possible (the result is given in [80]). Expand e^{−αt} in a Taylor series about αt = 0

\[
e^{-\alpha t} = 1 - \alpha t + \tfrac{1}{2}(\alpha t)^2 + O\!\left((\alpha t)^3\right) \,. \tag{4.46}
\]

It is often the case that the sample period T is short compared to the correlation time 1/α, so that

\[
e^{-\alpha t} \simeq 1 - \alpha t + \tfrac{1}{2}(\alpha t)^2 \,. \tag{4.47}
\]

Using this approximation, the state transition matrix is approximated by

\[
\mathbf{\Phi}(t) = \begin{bmatrix}
1 & t & \frac{1}{\alpha^2}\left(-1+\alpha t+e^{-\alpha t}\right) \\
0 & 1 & \frac{1}{\alpha}\left(1-e^{-\alpha t}\right) \\
0 & 0 & e^{-\alpha t}
\end{bmatrix}
\simeq \begin{bmatrix}
1 & t & \frac{1}{2}t^2 \\
0 & 1 & t - \frac{1}{2}\alpha t^2 \\
0 & 0 & 1 - \alpha t + \frac{1}{2}(\alpha t)^2
\end{bmatrix}
\simeq \begin{bmatrix}
1 & t & \frac{1}{2}t^2 \\
0 & 1 & t \\
0 & 0 & 1 - \alpha t
\end{bmatrix} \,. \tag{4.48}
\]

Similarly, the matrix Q can be approximated by

\[
\mathbf{Q} = \int_0^T \mathbf{\Phi}(t)\,\mathbf{G} A \mathbf{G}^T \mathbf{\Phi}^T(t)\, dt
\simeq A\int_0^T \begin{bmatrix} \frac{1}{2}t^2 \\ t \\ 1-\alpha t \end{bmatrix}\begin{bmatrix} \frac{1}{2}t^2 & t & 1-\alpha t \end{bmatrix} dt
= A\int_0^T \begin{bmatrix}
\frac{1}{4}t^4 & \frac{1}{2}t^3 & \frac{1}{2}t^2(1-\alpha t) \\
\frac{1}{2}t^3 & t^2 & t(1-\alpha t) \\
\frac{1}{2}t^2(1-\alpha t) & t(1-\alpha t) & 1-2\alpha t+\alpha^2 t^2
\end{bmatrix} dt
\simeq A\int_0^T \begin{bmatrix}
\frac{1}{4}t^4 & \frac{1}{2}t^3 & \frac{1}{2}t^2 \\
\frac{1}{2}t^3 & t^2 & t \\
\frac{1}{2}t^2 & t & 1-2\alpha t
\end{bmatrix} dt
= A\begin{bmatrix}
\frac{1}{20}T^5 & \frac{1}{8}T^4 & \frac{1}{6}T^3 \\
\frac{1}{8}T^4 & \frac{1}{3}T^3 & \frac{1}{2}T^2 \\
\frac{1}{6}T^3 & \frac{1}{2}T^2 & T(1-\alpha T)
\end{bmatrix} \,. \tag{4.49}
\]
In summary, the discrete equivalent system is given by

\[
\mathbf{x}(t_{k+1}) = \mathbf{\Phi}(t_{k+1}-t_k)\,\mathbf{x}(t_k) + \mathbf{w}_k
= \begin{bmatrix}
1 & T & \frac{1}{2}T^2 \\
0 & 1 & T \\
0 & 0 & 1-\alpha T
\end{bmatrix}\mathbf{x}(t_k) + \mathbf{w}_k \,, \tag{4.50}
\]

where the process noise has the autocorrelation function

\[
R_W[k,n] = \delta(n-k)\,\mathbf{Q} \tag{4.51}
\]
\[
\mathbf{Q} = A\begin{bmatrix}
\frac{1}{20}T^5 & \frac{1}{8}T^4 & \frac{1}{6}T^3 \\
\frac{1}{8}T^4 & \frac{1}{3}T^3 & \frac{1}{2}T^2 \\
\frac{1}{6}T^3 & \frac{1}{2}T^2 & T(1-\alpha T)
\end{bmatrix} \,. \tag{4.52}
\]

This model can be further simplified by letting α = 0. The resulting motion model has a random walk acceleration.

Chapter 5

Optimization and Control Theory

This chapter reviews important material related to optimal control theory. Section 5.1 introduces terms and concepts from control theory, such as the controllable set. Section 5.2 addresses the topic of constrained optimization and the use of Lagrange multipliers. Section 5.3 shows how parametric optimization can be used to obtain suboptimal controllers for nonlinear systems. Section 5.4 provides a derivation of the minimum principle using a geometric approach, as opposed to the (perhaps more common) calculus-of-variations approach. Section 5.5, the last section of this chapter, contains a statement of the Min-Max Principle for continuous-time systems.

5.1 Basic Control Theory Concepts

The discussion that follows introduces the reader to the notation used throughout this chapter, as well as to concepts with which the reader may not be familiar, such as that of the controllable set [92]. It is not the purpose of this section to discuss the more common aspects of control theory, such as the definition of a state space or linear systems specifically.

A rather general class of nonlinear control systems can be described by equations of the form

\[
\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}, \mathbf{u}) \,, \tag{5.1}
\]

where x is an n_x-dimensional state vector, u is an n_u-dimensional control vector, and f(·) is assumed to be continuous and continuously differentiable in all of its arguments.
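To make the system class of Eq. 5.1 concrete, the sketch below simulates a hypothetical nonlinear plant, a damped pendulum with torque input, under a simple bounded feedback law that drives the state toward the origin. The model, gains, and saturation limit are illustrative assumptions, not from the dissertation.

```python
import math

def f(x, u):
    """A hypothetical instance of x_dot = f(x, u) (Eq. 5.1):
    a damped pendulum with torque input u."""
    return (x[1], -math.sin(x[0]) - 0.5 * x[1] + u)

def rollout(x, control, dt=1e-3, steps=10_000):
    """Euler-integrate x_dot = f(x, u) under a feedback law u = control(x)."""
    for _ in range(steps):
        dx = f(x, control(x))
        x = (x[0] + dt * dx[0], x[1] + dt * dx[1])
    return x

# A bounded feedback law u = -sat(x1), illustrating a control constraint |u| <= 1
sat = lambda s: max(-1.0, min(1.0, s))
x_final = rollout((1.0, 0.0), lambda x: -sat(x[0]))
# The state is driven close to the origin within the simulated interval
```

In the language of the following paragraphs, the saturation limit plays the role of a control constraint, and initial states from which this law actually reaches a neighborhood of the origin belong to its domain of effectiveness.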
There are typical constraints imposed on the control vector u. These are generally in the form of n_h inequalities

\[
\mathbf{u} \in \mathcal{U} = \{\mathbf{u} \mid \mathbf{h}(\mathbf{x}, \mathbf{u}) \geq 0\} \,. \tag{5.2a}
\]

A controller u that satisfies Eq. 5.2a and is piecewise differentiable is said to be an admissible control. Although not necessary for all problems, control design is much easier if the control constraints are state-independent

\[
\mathbf{u} \in \mathcal{U} = \{\mathbf{u} \mid \mathbf{h}(\mathbf{u}) \geq 0\} \,. \tag{5.2b}
\]

The goal of control theory is to find an admissible controller that, among other things, transfers the system state to a target set that can be described by a set of equalities

\[
\mathcal{X} = \{\mathbf{x} \mid \mathbf{g}(\mathbf{x}) = 0\} \,. \tag{5.3}
\]

Given a control system (Eq. 5.1) with specified control constraints (Eq. 5.2a), an initial state is said to be controllable to the target (Eq. 5.3) if there exists an admissible control u(x) such that the solution to Eq. 5.1 transfers the initial state to the target in a finite time [92, p. 90]. The set of all controllable initial states is said to be the controllable set. For a given admissible control law u(x), the set of initial points that actually get transferred to the target in finite time is called the domain of effectiveness for u(x) and is a subset of the controllable set.

5.2 Parametric Optimization

This section presents the basic elements of nonlinear parametric optimization. The basic topic of nonlinear optimization is covered in [92, Ch. 3] and in extensive detail in [93]. Suppose we wish to optimize a scalar-valued function G of the following form

\[
\text{minimize } G(\mathbf{u}) \,, \tag{5.4}
\]

subject to a set of inequality constraints of the form

\[
\mathbf{u} \in \mathcal{U} = \{\mathbf{u} \mid \mathbf{h}(\mathbf{u}) \geq 0\} \,. \tag{5.5}
\]

The parameter vector u has dimension n_u and the vector-valued function h(u) has dimension n_h.

Figure 5.1. Hatched region indicating the intersection of a spherical ball B and the control constraint set U.

A ball is a special type of open neighborhood, shown in Figure 5.1, and defined by B = {u* + Δu | ‖Δu‖ <