Least Squares Methods for System Identification of Structured Models

MIGUEL GALRINHO

Licentiate Thesis
Stockholm, Sweden 2016
KTH School of Electrical Engineering
Automatic Control Lab
SE-100 44 Stockholm, Sweden
TRITA-EE 2016:115
ISSN 1653-5146
ISBN 978-91-7729-066-7

Academic dissertation which, with the permission of Kungliga Tekniska högskolan (KTH Royal Institute of Technology), is submitted for public examination for the degree of Licentiate of Engineering in Automatic Control on Friday, September 9, 2016, at 10:15 in hall Q2, Kungliga Tekniska högskolan, Osquldas väg 10, Stockholm.

© Miguel Galrinho, September 2016. All rights reserved.

Tryck: Universitetsservice US AB

Abstract

The purpose of system identification is to build mathematical models for dynamical systems from experimental data. With the current increase in complexity of engineering systems, an important challenge is to develop accurate and computationally efficient algorithms. For estimation of parametric models, the prediction error method (PEM) is a benchmark in the field. When the noise is Gaussian and a quadratic cost function is used, PEM provides asymptotically efficient estimates if the model orders are correct. A disadvantage with PEM is that, in general, it requires minimizing a non-convex function. Alternative methods are then needed to provide initialization points for the optimization. Two important classes of such methods are subspace and instrumental variable methods. Other methods, such as Steiglitz-McBride, use iterative least squares to avoid the non-convexity of PEM.

This thesis focuses on this class of methods, with the purpose of addressing common limitations in existing algorithms and suggesting more accurate and computationally efficient ones. In particular, the proposed methods first estimate a high order non-parametric model and then reduce this estimate to a model of lower order by iteratively applying least squares.

Two methods are proposed. First, the weighted null-space fitting (WNSF) method uses iterative weighted least squares to reduce the high order model to a parametric model of interest. Second, the model order reduction Steiglitz-McBride (MORSM) method uses pre-filtering and Steiglitz-McBride to estimate a parametric model of the plant. The asymptotic properties of the methods are studied, which show that one iteration provides asymptotically efficient estimates. We also discuss two extensions for this type of methods: transient estimation and estimation of unstable systems. Simulation studies provide promising results regarding accuracy and convergence properties in comparison with PEM.

Sammanfattning

The purpose of system identification is to build mathematical models of dynamical systems from observed data. The increasing complexity of today's engineering systems has raised the need for accurate and computationally efficient algorithms. For estimation of parametric models, the prediction error method (PEM) is a standard method in the field. When the disturbance is Gaussian and a quadratic cost function is used, the estimates of the prediction error method are asymptotically efficient if the model orders are correct. A drawback with this method is that it often requires minimizing a non-convex function. In this case, alternative methods are needed to find initialization points for the optimization. Two important classes of these are subspace methods and instrumental variable methods. Other methods, such as Steiglitz-McBride, use the least squares method iteratively to avoid the non-convexity of the prediction error method.

This thesis focuses on this class of methods, with the aim of reviewing limitations of existing methods and proposing more accurate and computationally efficient algorithms. The proposed methods first estimate a non-parametric model of high order, and then reduce this estimate to a model of lower order through iterative use of the least squares method.

Two methods are proposed. The first, weighted null-space fitting (WNSF), uses a weighted variant of the least squares method to reduce the non-parametric model to the parametric model of interest. The second, model order reduction Steiglitz-McBride (MORSM), uses pre-filtering and Steiglitz-McBride to estimate a parametric model. The asymptotic properties of the methods are studied, showing that one iteration gives asymptotically efficient estimates. In addition, two extensions of this class of methods are discussed: transient estimation and estimation of unstable systems.

Simulations show promising results with respect to accuracy and convergence properties compared with PEM.

Contents

Acknowledgements ix

Abbreviations x

Notation xi

1 Introduction 1
1.1 Motivation 3
1.2 Outline and Contributions 5

2 Background 9
2.1 System Description 9
2.2 Models 11
2.3 Identification Methods 14

3 Asymptotic Properties of the Least Squares Method 37
3.1 Definitions and Assumptions 38
3.2 Convergence Results 41
3.3 Variance Results 42

4 Weighted Null-Space Fitting 43
4.1 Problem Statement 43
4.2 Motivation 44
4.3 The Algorithm 48
4.4 Asymptotic Properties 55
4.5 Simulation Examples 56
4.6 Comparison with Explicit ML Optimization 61
4.7 Conclusions 64
4.A Consistency of Step 2 65
4.B Consistency of Step 3 70
4.C Asymptotic Distribution and Covariance of Step 3 75

5 Model Order Reduction Steiglitz-McBride 81


5.1 Problem Statement 82
5.2 Motivation 82
5.3 Open Loop 86
5.4 Closed Loop 90
5.5 Comparison to WNSF 92
5.6 Simulation Examples 94
5.7 Conclusions 100

6 Transient Estimation with Unstructured Models 103
6.1 Problem Statement 104
6.2 Estimating the Transient 105
6.3 WNSF with Transient Estimation 106
6.4 Generalizations 109
6.5 Simulation Example 113
6.6 Conclusions 115

7 ARX Modeling of Unstable Box-Jenkins Systems 117
7.1 Problem Statement 118
7.2 The ARX Minimizer of a Stable BJ Model 119
7.3 The ARX Minimizer of an Unstable BJ Model 121
7.4 Practical Aspects 124
7.5 Examples 125
7.6 PEM with Unstable Predictors 128
7.7 WNSF with Unstable Predictors 131
7.8 Conclusions 138

8 Conclusions and Future Research Directions 141

Bibliography 145

Acknowledgements

This thesis has been influenced by many people who should be acknowledged for their contributions.

I would like to thank Håkan, my main supervisor, for giving me this opportunity and guiding me in accomplishing it. I would also like to thank Cristian, my co-supervisor, for having his door always open to discuss whatever questions arise; and Bo, for having provided me with some of his previous work, which has helped me much in understanding my own.

I would like to thank everyone at the department of Automatic Control and the System Identification group for pleasurable company and interesting discussions. In particular, I would like to thank those who have reviewed parts of this thesis: Henrik (for being the official reviewer), Patricio (for his availability and eye for detail), Riccardo (for knowing the Chicago Manual of Style better than me), Niklas (for contributing directly by being a co-author in several papers), and Linnea (for the help in translating the abstract).

Last but not least, I would like to thank my friends and family who have given me all their support during these challenging and fascinating years.

Miguel Galrinho
Stockholm, August 2016


Abbreviations

ARMAX   autoregressive moving average with exogenous input
ARX     autoregressive with exogenous input
BJ      Box-Jenkins
BJSM    Box-Jenkins Steiglitz-McBride
FIR     finite impulse response
MIMO    multi-input multi-output
MISO    multi-input single-output
ML      maximum likelihood
MORSM   model order reduction Steiglitz-McBride
OE      output-error
PDF     probability density function
PEM     prediction error method
SISO    single-input single-output
WNSF    weighted null-space fitting
w.p.1   with probability one


Notation

$\|x\|$   $l_2$ norm of the vector $x$—that is, $\|x\| := \sqrt{\sum_{k=1}^{n} |x_k|^2}$, with $x$ an $n \times 1$ vector
$\|A\|$   induced Euclidean norm of the matrix $A$—that is, $\|A\| := \sup_x \|Ax\| / \|x\|$, $x \neq 0$
$\mathrm{E}\,x$   mathematical expectation of the random vector $x$
$\bar{\mathrm{E}}\,x$   defined by $\bar{\mathrm{E}}\,x := \lim_{N \to \infty} \frac{1}{N} \sum_{t=1}^{N} \mathrm{E}\,x_t$
$x_N = O(f_N)$   the function $x_N$ tends to zero at a rate not slower than $f_N$, as $N \to \infty$, w.p.1
$q^{-1}$   backward shift operator, $q^{-1} x_t = x_{t-1}$
$A^\top$   transpose of $A$
$\mathrm{cov}(x)$   covariance matrix of $x$
$\mathrm{As\,cov}(x)$   asymptotic covariance matrix of $x$
$\mathrm{As}\,\mathcal{N}(a, R)$   asymptotic normal distribution with mean $a$ and covariance $R$
$C$   any bounded constant, which need not be the same in different expressions
$\det A$   determinant of $A$
$\mathbb{R}$   set of real numbers
$\mathbb{R}^n$   set of $n$-dimensional column vectors with real entries
$:=$   definition

$\mathcal{T}_{n,m}(X(q))$   Toeplitz matrix of size $n \times m$ ($m \le n$) with zeros above the main diagonal and whose first column is $[x_0 \ \cdots \ x_{n-1}]^\top$, where $X(q) = \sum_{k=0}^{\infty} x_k q^{-k}$


Chapter 1

Introduction

Creating models to describe physical phenomena is the core of science. Models are representations of the object of study. As such, they are not true descriptions of the real world, but abstractions that attempt to explain how some object behaves. Models can provide insight and understanding about the object being studied and make predictions about its behavior. Also, their accuracy can be tested through their ability to make predictions, but their usefulness depends as well on the simplicity and propensity to give insight and understanding. It is thus natural that the same object may have different models, none being true or false, but each having different points of interest and ranges of applicability. A classical example is Newton's universal law of gravitation and Einstein's general theory of relativity, which are both models for the force of gravity. However, while Newton's theory is sufficient to send a spaceship to Mars, general relativity would be unnecessarily complex for that purpose; on the other hand, it is required to explain certain anomalies in the orbit of Mercury, due to the strong influence of the Sun's gravitational field.

System identification is about using experimental data to build mathematical models for dynamical systems. A system is a process where different variables interact, producing observable signals, which are called outputs. The system can often be affected by external signals manipulated by the observer—called inputs—and by disturbances, either measurable or only observable indirectly through their influence on the outputs. System identification deals with dynamical systems, which are systems whose current output value depends not only on the current input, but also on its history. Mathematical models for such systems can be written as differential or difference equations that describe the relations between the interacting variables.

One possibility to obtain these mathematical models is to derive them from the physical laws that govern the system dynamics. This approach can become increasingly complex with the complexity of the system we intend to model. The alternative is system identification, which consists of using experimental data from the system—collected input and output signals—to infer the mathematical model.

As pointed out above, these mathematical models are not true descriptions of the system. The reason is, first and foremost, that physical systems are entities

different in nature than our mathematical models. Mathematics is simply the tool we use when attempting to describe the behavior of a system. Mathematical models can be more or less accurate in describing some behavior, or more or less intuitive in helping to understand that behavior, but an evaluation about how truthfully they capture the nature of the system cannot be performed. Another reason is that physical systems are very complex objects, whose behavior is influenced by many factors that cannot be precisely and individually accounted for. In this sense, a model is also a simplification of a real life system. Nevertheless, we will often in this thesis use the notion of a true system, in the sense of a particular mathematical model that we assume to have generated the collected data. This notion is, of course, an idealization; nevertheless, it is useful for performing theoretical analysis of system identification methods.

Using system identification requires going through a procedure that consists, essentially, of four steps [1]. The first step is to collect the data. Here, an experiment is conducted with the real system, and the input and measured output signals are recorded. Sometimes, the user has the possibility to choose the input of the experiment. The second step is to choose a set of candidate models. Many choices for model structures exist, which differ in the way that they describe how the input and the output signals relate. Also, one can choose between different model orders—this is related to the order of the differential or difference equations that describe the system. The third step is to choose the best model among the candidate ones. The best model is typically considered to be the one that best explains the data according to some criterion. In practice, this is the choice of identification method. Finally, the fourth step is to validate the model. Here, the purpose is to test if the model that was obtained in the third step is appropriate for its intended use. This can be based on prior knowledge about the system or on the model's ability to explain other data sets. If the model does not pass the validation test, some of the previous steps should be revised and repeated, until an appropriate model is found.

There are important challenges in all steps of the system identification procedure. If the user has the freedom to choose the input of the experiment, for example, it is important to conduct the experiment that maximizes the information in the data about the system dynamics. This field of system identification is known as experiment design (see, e.g., [2–6]). Concerning the choice of model order, some identification methods bypass this issue by estimating non-parametric models—that is, they estimate truncated versions of models whose number of parameters ideally tends to infinity. The main problem with estimating non-parametric models (also referred to as unstructured models) is that, as the number of parameters increases, so does the variance of the estimated model. Regularization and kernel methods deal with this problem (see, e.g., [7–10]). Concerning parametric models (also referred to as structured models), which have a fixed number of parameters, the model order can sometimes be decided based on physical intuition. Alternatively, cross-validation can be used, which consists of testing the model's ability to explain fresh data sets.
Also, criteria exist based on the parsimonious principle, according to which a model should not use unnecessarily many parameters. These criteria—such as the Akaike Information Criterion (AIC) or the Bayesian Information Criterion (BIC)—weigh the ability of a model to explain the data against its complexity. For an overview on model order selection and validation, see [1, Chapter 16].

When estimating parametric models, the best model within a set of candidate models is typically defined as the one that minimizes a cost function that measures the ability of the model to predict future outputs based on past data. The main difficulty here is that this cost function is typically non-convex. Consequently, local non-linear optimization techniques are required, which depend on good initialization points to be able to find the global optimum.

1.1 Motivation

One of the main challenges in system identification is thus to find computationally efficient methods to obtain a model, avoiding the burden of non-linear optimization techniques, while not compromising the accuracy of the estimated model. This thesis deals with this challenge. To better specify its motivation, we proceed to cover advantages and disadvantages of the prediction error method (PEM), which has been extensively studied in system identification and is a benchmark in the field for estimation of structured models [1, 11]. Then, we discuss alternative methods to PEM and how the contributions proposed in this thesis complement them.

The basic idea of PEM is to minimize a cost function of the prediction errors—that is, the error between the output at a certain time and its predicted value based on past data. To analyze the properties of PEM, we consider that the true system (i.e., the mathematical model we assume to have generated the data) is in the model set. In other words, there is a particular set of model parameters for which the model corresponds to the true system. Then, PEM is a consistent estimator, meaning that the model corresponding to the minimizer of the cost function will approach the true system, as the sample size tends to infinity. Also, for a correctly parametrized model, if the unmeasured additive output disturbance is Gaussian distributed and a quadratic cost function is used, the estimates of the model parameters obtained with PEM are asymptotically efficient. This means that, as the sample size tends to infinity, the covariance matrix of the estimated parameters corresponds to the Cramér-Rao bound, which is the minimum covariance achievable by a consistent estimator.

PEM can also be interpreted as a maximum likelihood (ML) approach. From this perspective, we consider the joint probability density function of the random output to be observed, for a certain parametrization. Then, for a particular realization of the output, the parameters can be chosen such that the likelihood of the observed event is maximized.

A drawback with PEM (or ML) is that the cost function is, in general, non-convex in the model parameters. Therefore, PEM requires local non-linear optimization techniques and good initialization points. Except for some particular model structures, there are no guarantees of finding the global optimum, as the algorithm may converge to minima that are only local.

There exist methods in system identification that avoid the non-convexity of PEM. Two important classes of such methods are subspace methods [12] and instrumental variable methods [13]. Subspace methods are non-iterative, and thus attractive for their computational simplicity. However, they are in general not as accurate as PEM, and it is difficult to incorporate structural information in the model. As for instrumental variable methods, they do not employ local non-linear optimization techniques, but the optimal instruments require knowledge of the true system. This limitation can be overcome by using multi-step [14] or iterative algorithms [15]. However, to be asymptotically efficient in open loop for a quite general class of linear systems, [14] still requires applying PEM in some step, while [15] is dependent on iterations. In closed loop, the Cramér-Rao bound cannot be attained with IV methods [16], although an extension of [15] exists that attains a lower bound on the covariance for this family of methods [17].
Moreover, some methods attempt to minimize an ML cost function using alternative approaches. One such class of methods consists of first estimating a higher order model, and then using an ML cost function to reduce this estimate to a parametric model, instead of using the observed data directly. Although this procedure still requires local non-linear optimization techniques, it may have numerical advantages in comparison to a direct PEM estimation. For the first step, the estimation of the higher order model can be made computationally simple, as this model can typically be chosen linear in the model parameters. For the second step, the data has now been replaced by a high order model estimate. Even if, in general, the order of this model has to tend to infinity to approximate the true system arbitrarily well, an appropriate order to approximately capture the important system dynamics is potentially of smaller size than the data. Thus, the dimension of the problem is reduced while asymptotically no information is lost. Indirect PEM [18] and the ASYM method [19] can be included in this class of methods.

To avoid a non-convex optimization problem, other methods attempt to minimize an ML cost function by iteratively applying least squares. For example, the Evans-Fischl algorithm [20] and the Steiglitz-McBride method [21] are part of this class of methods. The problem with this type of approach is that their range of applicability is quite limited, with consistency and asymptotic efficiency only achieved in special cases. Moreover, consistency can typically only be achieved as the number of iterations tends to infinity.

The Box-Jenkins Steiglitz-McBride (BJSM) method [22] is an important contribution within this class of methods, combining non-parametric estimation with the Steiglitz-McBride method. The method is consistent for a more general class of systems than the Steiglitz-McBride, and asymptotic efficiency is achieved in open loop provided that the Steiglitz-McBride iterations converge [23]. However, these asymptotic results still rely on an infinite number of Steiglitz-McBride iterations. On the other hand, it is well known that an ML approach asymptotically provides an efficient estimate with one Gauss-Newton iteration, if the algorithm is initialized with a consistent estimate [24].

This thesis continues this development, by proposing methods that also estimate a non-parametric model as an intermediate step to obtain the structured model of interest. The motivation is to provide least squares methods that are non-iterative and asymptotically efficient, in the sense that, as the sample size grows, an efficient estimate is obtained in one step. This is an important step to establish a bridge between methods based on iterative least squares and ML, as well as to provide computationally efficient methods that have the asymptotic performance of PEM.

1.2 Outline and Contributions

In this section, we provide the outline of the thesis and indicate the contributions in each chapter.

Chapter 2

In Chapter 2, we provide the necessary background for the thesis. Here, we introduce generic assumptions regarding the true system and review typical models and system identification methods that are in some way connected to the methods proposed in this thesis. The covered material is mostly based on [1]. References to other material are indicated when appropriate.

Chapter 3

In Chapter 3, we provide more technical assumptions regarding the true system and external signals, which will be required to prove the asymptotic results of the proposed methods. Here, we also provide some convergence and variance results for the least squares method when the model order needs to tend to infinity for the system to be in the model set. The approach taken is to let the model order increase as a function of the sample size at a particular rate. The results presented have been derived in [25].

Chapter 4

Chapter 4 contains the main contribution of the thesis. Here, we propose a weighted least squares method for estimation of structured models. We provide motivation and relate the proposed method to other existing methods. Also, we perform an analysis of the asymptotic properties; namely, we consider consistency, asymptotic distribution and covariance. Finally, we discuss practical aspects in terms of implementation and illustrate the properties and performance of the method with a simulation study. The covered material is based on the following contributions:

• M. Galrinho, C. R. Rojas, and H. Hjalmarsson. A weighted least-squares method for parameter estimation in structured models. In 53rd IEEE Conference on Decision and Control, pp. 3322–3327, Los Angeles, California, USA, 2014.

• M. Galrinho, C. R. Rojas, and H. Hjalmarsson. The weighted null-space fitting method. (in preparation).

Chapter 5

In Chapter 5, we propose a method for estimation of structured plant models that does not require a low order estimate of the noise model to be computed. As done for the method in the previous chapter, we provide motivation and relate to existing methods. Despite being derived with a different motivation than the method in Chapter 4, and having apparently distinct algorithms, we observe that the methods share close similarities. Simulations similar to those in Chapter 4 are performed. The asymptotic analysis is not provided in this thesis, and it will be submitted as a journal paper. The covered material is based on the following contribution, to be submitted soon:

• N. Everitt, M. Galrinho, and H. Hjalmarsson. Optimal model order reduction with the Steiglitz-McBride method for open loop data. (in preparation).

Chapter 6

In Chapter 6, we deal with a particular limitation of methods that estimate a high order unstructured model, regarding the truncation of the data due to unknown initial conditions. We propose a procedure to include the complete data set and estimate some transient related parameters. Although this procedure does not improve the quality of the estimated high order model, we observe that the estimated transient parameters also contain information about the system. Thus, they can be used to improve the estimate of a low order model of interest with methods that do so by making use of the high order model estimate. As the method proposed in Chapter 4 is one such method, we discuss how the procedure can be incorporated in this method. Then, in a simulation example, we observe clear improvements for finite sample size. The covered material is based on the following contribution:

• M. Galrinho, C. R. Rojas, and H. Hjalmarsson. On estimating initial conditions in unstructured models. In 54th IEEE Conference on Decision and Control, pp. 2725–2730, Osaka, Japan, 2015.

Chapter 7

High order unstructured models are often used to model systems with a low order structured description. For stable systems, there is a well known correspondence between the unstructured model polynomials and the polynomials of the structured model that generated the data. Some methods use this correspondence to perform model order reduction from the unstructured estimate (e.g., the method proposed in Chapter 4). In this chapter, we consider unstable systems and the possibility of using unstructured models to model them. We derive the relation between the unstructured model polynomials obtained asymptotically and the polynomials of the structured model that generated the data, and observe how this relation is more general than the one for the stable case. This is an important result for methods that use unstructured models to estimate a structured model. As an example, we use it to discuss how the method proposed in Chapter 4 can be used to estimate unstable systems. The covered material is based on the following contributions:

• M. Galrinho, N. Everitt, and H. Hjalmarsson. ARX modeling of unstable systems. (accepted for publication in Automatica).

• M. Galrinho, C. R. Rojas, and H. Hjalmarsson. A weighted least squares method for estimation of unstable systems. (submitted for publication to 55th IEEE Conference on Decision and Control).

Chapter 8

In Chapter 8, we summarize the main conclusions of the thesis and provide possibilities for future research directions.

Contributions not included in this thesis

The following contribution has not been included in this thesis:

• M. Galrinho, C. R. Rojas, and H. Hjalmarsson. A least squares method for identification of feedback cascade systems. In Proceedings of the 17th IFAC Symposium on System Identification, pp. 98–103, Beijing, China, 2015.

However, the work therein may be considered for further development regarding application of the proposed methods to systems embedded in networks (see Chapter 8 for more details).

Chapter 2

Background

System identification deals with building mathematical models for dynamical systems using experimental data from the system. In this chapter, we introduce the type of systems considered and review common model structures and methods to identify them. As the purpose of this chapter is to provide the essential background to contextualize the contributions proposed in the thesis, the choice of covered material, as well as the point of view and depth with which we review it, relates to how it will be used in subsequent chapters. The chapter is structured as follows. In Section 2.1, we provide a mathematical description of the type of systems we consider. In Section 2.2, we review model structures typically used in system identification that will be used throughout the thesis. In Section 2.3, we discuss several identification methods that are in some sense related to the proposed methods. Most of the covered material is based on [1]. When based on other sources, references are provided.

2.1 System Description

The systems considered in this thesis are linear time invariant (LTI). A system is linear when the output to a linear combination of inputs is equal to the same linear combination of outputs obtained from the individual inputs. It is time invariant when the response does not depend on absolute time. The system will also be assumed causal, meaning that the output at a certain time depends only on inputs up to that time.

Let $u(t)$ be the continuous time input of such a system. Then, the output $y(t)$ of this system can be written as

$$y(t) = \int_0^\infty g_o(\tau)\, u(t - \tau)\, d\tau, \qquad (2.1)$$

where $g_o(\tau)$ is the impulse response of the system. If $y(t)$ and $u(t)$ are scalars, the system is said to be single-input single-output (SISO). For multiple-input multiple-output (MIMO) systems, $y(t)$ and $u(t)$ are vector-valued signals, and $g_o(\tau)$ is a matrix.

Typically, data is acquired in discrete time. Therefore, we assume that the output of the system is only available at sampling instants $t_k = kT$, $k = 1, 2, \dots$, at which we have

$$y_t := y(kT) = \int_0^\infty g_o(\tau)\, u(kT - \tau)\, d\tau. \qquad (2.2)$$

For control applications, the input signal is typically kept constant between sampling instants. Thus, we have

$$u(t) = u_k, \qquad kT \le t < (k + 1)T. \qquad (2.3)$$

With these assumptions, and considering $T = 1$ for notational simplicity (the choice of sampling time is not considered in this thesis), we have that

$$y_t = \sum_{k=1}^{\infty} g_k^o u_{t-k}, \qquad t = 0, 1, 2, \dots, \qquad (2.4)$$

where $t$ is, from here onwards, used to refer to the sampling instants, and

$$g_k^o = \int_{k-1}^{k} g_o(\tau)\, d\tau. \qquad (2.5)$$

According to (2.4), the output can be observed exactly. In practice, this is often not the case, due to signals beyond our control that affect the system. In our system description, we consider that the effect of these signals can be represented by an additive term $\{v_t\}$ at the output, according to

$$y_t = \sum_{k=1}^{\infty} g_k^o u_{t-k} + v_t. \qquad (2.6)$$

It is then assumed that the disturbance signal $\{v_t\}$ can be described as generated by a stable and inversely stable filter driven by white noise,

$$v_t = \sum_{k=0}^{\infty} h_k^o e_{t-k}, \qquad (2.7)$$

where the white noise sequence $\{e_t\}$ is zero-mean, has finite variance and some probability density function (PDF) $f_e^o$. Introducing the backward shift operator $q^{-1}$, where

$$q^{-1} u_t = u_{t-1}, \qquad (2.8)$$

we can re-write (2.4) as

$$y_t = \sum_{k=1}^{\infty} g_k^o q^{-k} u_t = G_o(q)\, u_t, \qquad (2.9)$$

where

$$G_o(q) := \sum_{k=1}^{\infty} g_k^o q^{-k}. \qquad (2.10)$$

With this description, we call $G_o(q)$ the plant transfer function of the system (2.4). Treating the disturbance analogously, we can write

$$H_o(q) = \sum_{k=0}^{\infty} h_k^o q^{-k}, \qquad (2.11)$$

where $H_o(q)$ is the true noise model transfer function. Finally, our LTI system with additive disturbance can be written as

$$y_t = G_o(q)\, u_t + H_o(q)\, e_t. \qquad (2.12)$$

All the systems considered in this thesis have the form (2.12).
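As a concrete illustration of (2.12) (an added sketch, not part of the original text), the following Python snippet simulates data from a SISO system with an arbitrarily chosen first-order plant and noise model; `scipy.signal.lfilter` applies the rational filters $G_o(q)$ and $H_o(q)$ to the input and noise sequences.

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
N = 1000

# Illustrative (not from the thesis) first-order system:
#   Go(q) = 0.5 q^-1 / (1 - 0.8 q^-1),  Ho(q) = (1 + 0.4 q^-1) / (1 - 0.6 q^-1)
# Coefficient arrays are in increasing powers of q^-1.
b_go, a_go = [0.0, 0.5], [1.0, -0.8]
b_ho, a_ho = [1.0, 0.4], [1.0, -0.6]

u = rng.standard_normal(N)   # input sequence {u_t}
e = rng.standard_normal(N)   # zero-mean white noise {e_t}

# y_t = Go(q) u_t + Ho(q) e_t, cf. (2.12)
y = lfilter(b_go, a_go, u) + lfilter(b_ho, a_ho, e)
```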

2.2 Models

Our purpose with system identification is to construct a model that approximates the behavior of the system description (2.12), using the knowledge of the input sequence $\{u_t\}$ and the observed output $\{y_t\}$. If the true system is given by (2.12), a natural choice for a complete model description is given by

$$y_t = G(q, \{g_k\})\, u_t + H(q, \{h_k\})\, e_t \qquad (2.13)$$

and by the PDF of $e_t$, $f_e(\cdot)$. Here, $\{g_k\}$ and $\{h_k\}$ are an infinite set of parameters to be estimated such that, when $g_k = g_k^o$ and $h_k = h_k^o$, we have that $G_o(q) \equiv G(q, \{g_k\})$ and $H_o(q) \equiv H(q, \{h_k\})$. If, in addition, $f_e \equiv f_e^o$, the output $\{y_t\}$ generated by the model description (2.13) has the same statistical properties as the one generated by the true system (2.12).

Obtaining this model requires estimating the parameters $\{g_k\}$, $\{h_k\}$, and the PDF $f_e$. However, due to the infinite sequences $\{g_k\}$ and $\{h_k\}$, and also the infinite number of candidates for the function $f_e$, it is impractical to obtain such a model. Alternatively, the transfer functions $G_o(q)$ and $H_o(q)$ can be parametrized by a finite number of parameters, and the sequence $\{e_t\}$ is characterized by its first and second-order moments (often, $\{e_t\}$ will also be considered Gaussian, in which case the first two moments completely characterize the sequence). If we lump the alternative finite set of parameters to be determined in a vector $\theta$, we may write our model description with the new parametrization as

$$y_t = G(q, \theta)\, u_t + H(q, \theta)\, e_t. \qquad (2.14)$$

When the parameters characterizing the noise sequence are unknown, the PDF $f_e(\cdot)$ can also be parametrized as $f_e(x, \theta)$ or previously estimated from data. In this section, we overview different types of models that are common in system identification, and which will be considered throughout this thesis.

2.2.1 Transfer function models

In transfer function models, $G(q, \theta)$ and $H(q, \theta)$ are parametrized as rational transfer functions, and the parameters to be determined are the coefficients of the numerator and denominator polynomials. We now consider two parametrizations for this purpose, which will be used in this thesis, and two particular cases of these parametrizations. The models considered are SISO, but they can be extended to the MIMO case.

Autoregressive Exogenous (ARX)

Consider the following model describing a linear difference equation,

$$y_t + a_1 y_{t-1} + \dots + a_{n_a} y_{t-n_a} = b_1 u_{t-1} + \dots + b_{n_b} u_{t-n_b} + e_t. \qquad (2.15)$$

For this model, the parameters to be determined are

$$\theta = [a_1 \ \cdots \ a_{n_a} \ \ b_1 \ \cdots \ b_{n_b}]^\top \in \mathbb{R}^{n_a + n_b}. \qquad (2.16)$$

In terms of the transfer functions $G(q, \theta)$ and $H(q, \theta)$ in the model description (2.14), we have that

$$G(q, \theta) = \frac{B(q, \theta)}{A(q, \theta)}, \qquad H(q, \theta) = \frac{1}{A(q, \theta)}, \qquad (2.17)$$

where

$$A(q, \theta) = 1 + a_1 q^{-1} + \dots + a_{n_a} q^{-n_a}, \qquad B(q, \theta) = b_1 q^{-1} + \dots + b_{n_b} q^{-n_b}. \qquad (2.18)$$

Based on (2.18), (2.15) can alternatively be written as

$$A(q, \theta)\, y_t = B(q, \theta)\, u_t + e_t. \qquad (2.19)$$

This model is called autoregressive exogenous (ARX), and, in the particular case that $n_a = 0$, a finite impulse response (FIR) model. An advantage of the ARX model is that it can be estimated using prediction error methods by solving a linear regression problem. The main limitation of this type of model is that the noise sequence $\{e_t\}$ is simply filtered by the same denominator dynamics as the input, and there is no independence in the parametrization of $H(q, \theta)$ from $G(q, \theta)$.

Autoregressive Moving Average Exogenous (ARMAX)

A more flexible model description than ARX is the ARMAX model, given by

$$F(q, \theta)\, y_t = L(q, \theta)\, u_t + C(q, \theta)\, e_t, \qquad (2.20)$$

where

$$\begin{aligned} F(q, \theta) &= 1 + f_1 q^{-1} + \dots + f_{n_f} q^{-n_f}, \\ L(q, \theta) &= l_1 q^{-1} + \dots + l_{n_l} q^{-n_l}, \\ C(q, \theta) &= 1 + c_1 q^{-1} + \dots + c_{n_c} q^{-n_c}. \end{aligned} \qquad (2.21)$$

Here, the parameter vector to be estimated is

$$\theta = [f_1 \ \cdots \ f_{n_f} \ \ l_1 \ \cdots \ l_{n_l} \ \ c_1 \ \cdots \ c_{n_c}]^\top \in \mathbb{R}^{n_f + n_l + n_c}. \qquad (2.22)$$

In terms of plant and noise model parametrization, we have

$$G(q, \theta) = \frac{L(q, \theta)}{F(q, \theta)}, \qquad H(q, \theta) = \frac{C(q, \theta)}{F(q, \theta)}. \qquad (2.23)$$

Thus, for ARMAX models, we observe that there is some independence in the parametrization of $H(q, \theta)$, through the numerator $C(q, \theta)$. However, the noise sequence $\{e_t\}$ is still filtered by the same denominator dynamics as the input.

Box-Jenkins (BJ)

A more flexible model description is given by

$$G(q, \theta) = \frac{L(q, \theta)}{F(q, \theta)}, \qquad H(q, \theta) = \frac{C(q, \theta)}{D(q, \theta)}, \qquad (2.24)$$

where

$$\begin{aligned} F(q, \theta) &= 1 + f_1 q^{-1} + \dots + f_{n_f} q^{-n_f}, \\ L(q, \theta) &= l_1 q^{-1} + \dots + l_{n_l} q^{-n_l}, \\ C(q, \theta) &= 1 + c_1 q^{-1} + \dots + c_{n_c} q^{-n_c}, \\ D(q, \theta) &= 1 + d_1 q^{-1} + \dots + d_{n_d} q^{-n_d}. \end{aligned} \qquad (2.25)$$

Here, the parameter vector to be estimated is

$$\theta = [f_1 \ \cdots \ f_{n_f} \ \ l_1 \ \cdots \ l_{n_l} \ \ c_1 \ \cdots \ c_{n_c} \ \ d_1 \ \cdots \ d_{n_d}]^\top \in \mathbb{R}^{n_f + n_l + n_c + n_d}. \qquad (2.26)$$

This model is called Box-Jenkins (BJ), as it was proposed by Box and Jenkins in [26]. Its main advantage is the completely independent parametrization of $G(q, \theta)$ and $H(q, \theta)$, both parametrized with numerator and denominator polynomials. It is, thus, a quite general and flexible parametrization for (2.14).

In the case where the additive output disturbance is white noise, we have that $n_d = 0 = n_c$. In terms of $G(q, \theta)$ and $H(q, \theta)$, we have

$$G(q, \theta) = \frac{L(q, \theta)}{F(q, \theta)}, \qquad H(q, \theta) = 1. \qquad (2.27)$$

This is called an output-error (OE) model.

2.2.2 State-Space Models

An alternative model description is the state-space model. In this case, the relations between input, disturbance, and output signals are described by a set of first order difference equations (in discrete time), using an auxiliary vector $x_t$ called the state. The methods proposed in this thesis do not use a state-space form. However, this form will be used sporadically for comparison with other methods. Thus, we will only present one state-space description in discrete time that serves our purposes.

Although there are several state-space forms, we present here the innovations form. In this form, we model the relation between input $\{u_t\}$, white noise source $\{e_t\}$, and output $\{y_t\}$ as

$$\begin{aligned} x_{t+1} &= A(\theta)\, x_t + B(\theta)\, u_t + K(\theta)\, e_t, \\ y_t &= C(\theta)\, x_t + e_t, \end{aligned} \qquad (2.28)$$

where $A$, $B$, $C$, and $K$ are matrices parametrized by the parameter vector $\theta$.

Although one transfer function description can have several state-space descriptions, due to the freedom of choosing the state vector $x_t$, the transfer function description is unique. Thus, from a state-space model (2.28), $G(q, \theta)$ and $H(q, \theta)$ in (2.14) are given by

$$\begin{aligned} G(q, \theta) &= C(\theta)\,[qI - A(\theta)]^{-1} B(\theta), \\ H(q, \theta) &= C(\theta)\,[qI - A(\theta)]^{-1} K(\theta) + I. \end{aligned} \qquad (2.29)$$

We observe that the denominator polynomials of both $G(q, \theta)$ and $H(q, \theta)$ are obtained from the inverse of $qI - A(\theta)$. Therefore, if the state-space matrices are not parametrized to impose specific pole-zero cancellations when converting to transfer function, estimates of $G(q, \theta)$ and $H(q, \theta)$ will have the same poles, but different zeros. This makes a state-space model equivalent to an ARMAX model in the SISO case. If $K(\theta) \equiv 0$, the state-space model is equivalent to an OE model. A BJ model, on the other hand, is a more general description than state-space, since, in that case, $G(q, \theta)$ and $H(q, \theta)$ are parametrized independently.
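To illustrate the conversion (2.29) (an added sketch, with matrices chosen arbitrarily), `scipy.signal.ss2tf` can compute both transfer functions from an innovations-form model; the shared denominator $\det(qI - A)$ makes the common poles of $G(q, \theta)$ and $H(q, \theta)$ explicit.

```python
import numpy as np
from scipy.signal import ss2tf

# Arbitrary second-order innovations-form matrices, for illustration only
A = np.array([[1.5, -0.7],
              [1.0,  0.0]])
B = np.array([[1.0], [0.0]])
K = np.array([[0.5], [0.0]])
C = np.array([[0.5, 0.3]])

# G(q) = C (qI - A)^{-1} B      (direct term D = 0)
num_G, den_G = ss2tf(A, B, C, np.array([[0.0]]))
# H(q) = C (qI - A)^{-1} K + 1  (direct term D = 1)
num_H, den_H = ss2tf(A, K, C, np.array([[1.0]]))

print(den_G)  # [ 1.  -1.5  0.7] -- same denominator det(qI - A) ...
print(den_H)  # [ 1.  -1.5  0.7] -- ... hence the same poles, different zeros
```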

2.3 Identification Methods

The purpose of identification methods is to estimate models for dynamical systems. System identification methods can be divided in several classes. Three important classes of methods are prediction error methods (PEM), subspace methods, and instrumental variable methods. Also, some methods are guided by the same objective function as PEM, but use some indirect approach. We will divide this class of methods in two separate subclasses: methods that perform an explicit optimization of a maximum likelihood criterion and methods that use least squares iteratively to avoid a non-convex optimization. The methods proposed in this thesis belong to the latter subclass.

In this section, we review methods that are in some sense related to the methods that will be proposed. We cover PEM, methods that perform model order reduction with an explicit maximum likelihood optimization, methods that use iterative least squares, and subspace methods.

2.3.1 Prediction Error Method

The idea of the prediction error method is to obtain an estimate of the parameter vector $\theta$ by minimizing a cost function of the prediction errors—that is, the difference between the observed output at a certain time and its predicted value with a certain model, based on past data.

At a particular value of the parameter vector $\theta$, the model (2.14) can be used to predict future outputs of the system (2.12), based on past inputs and outputs. Typically, a one-step-ahead predictor is used. In this case, we want to use our model to predict $y_t$ based on $y_k$ and $u_k$, with $k \le t - 1$. The conditional expectation of $y_t$—given this information—is denoted by $\hat{y}_{t|t-1}(\theta)$, and given by [1]

$$\hat{y}_{t|t-1}(\theta) = H^{-1}(q, \theta)\, G(q, \theta)\, u_t + [1 - H^{-1}(q, \theta)]\, y_t. \qquad (2.30)$$

Then, the prediction error is the difference between the observed output $y_t$ and its predicted value $\hat{y}_{t|t-1}(\theta)$:

$$\varepsilon_t(\theta) = y_t - \hat{y}_{t|t-1}(\theta) = H^{-1}(q, \theta)\,[y_t - G(q, \theta)\, u_t]. \qquad (2.31)$$

The variable $\varepsilon_t$ represents the part of the output $y_t$ that cannot be predicted from past data. Thus, if the model describes the true system, the prediction error sequence $\{\varepsilon_t(\theta_o)\}$ is a white noise sequence. Here, $\theta_o$ are the parameter values describing the true system, assuming that there exists $\theta = \theta_o$ for which $G_o(q) = G(q, \theta_o)$ and $H_o(q) = H(q, \theta_o)$. In particular, at the true parameter values we have

$$\varepsilon_t(\theta_o) = e_t. \qquad (2.32)$$

Under the criterion of PEM, the parameter vector $\theta$ that gives the model that best describes the system is the one that minimizes a cost function of the prediction errors,

$$V_N(\theta) = \frac{1}{N} \sum_{t=1}^{N} l(\varepsilon_t(\theta)), \qquad (2.33)$$

where $N$ is the sample size and $l(\cdot)$ is a scalar-valued function. Typically, a quadratic cost function is used, which is given by

$$V_N(\theta) = \frac{1}{N} \sum_{t=1}^{N} \frac{1}{2}\, \varepsilon_t^2(\theta) \qquad (2.34)$$

in the SISO case. The parameter estimate of $\theta$ obtained with PEM is then given by

$$\hat{\theta}_N = \arg\min_\theta V_N(\theta). \qquad (2.35)$$
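As a sketch of how (2.35) can be attacked numerically (a generic illustration, not the implementation used in the thesis), the quadratic cost (2.34) for a first-order output-error model, where $H(q, \theta) = 1$ and thus $\varepsilon_t(\theta) = y_t - G(q, \theta)\, u_t$ by (2.31), can be handed to a local optimizer; the need for a good starting point `theta0` reflects the non-convexity discussed later in this section.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.signal import lfilter

rng = np.random.default_rng(1)
N = 2000
u = rng.standard_normal(N)
# Illustrative OE data: G(q) = 0.5 q^-1 / (1 - 0.8 q^-1), white output noise
y = lfilter([0.0, 0.5], [1.0, -0.8], u) + 0.1 * rng.standard_normal(N)

def V_N(theta, y, u):
    """Quadratic PEM cost (2.34) for G(q, theta) = b1 q^-1 / (1 + f1 q^-1),
    H(q, theta) = 1, so eps_t(theta) = y_t - G(q, theta) u_t, cf. (2.31)."""
    f1, b1 = theta
    eps = y - lfilter([0.0, b1], [1.0, f1], u)
    return 0.5 * np.mean(eps ** 2)

theta0 = np.array([-0.5, 0.1])   # initialization point; PEM is a local search
res = minimize(V_N, theta0, args=(y, u), method="BFGS")
print(res.x)                     # should end up near the true values (-0.8, 0.5)
```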

The prediction error method can also be seen as a maximum likelihood (ML) estimator. The idea of maximum likelihood is to describe the observed data as realizations of stochastic variables, which have a certain PDF, parametrized by θ. Then, the ML estimate of θ is the one that maximizes the likelihood that the realization should take the observed values. Suppose that, for a particular SISO model structure, we have a predictor function and an assumed PDF for the prediction errors, such that

t 1 t 1 yˆt t 1 gt u , t ; θ − − (2.36) εt θ yt yˆ , with PDF fe x, t, θ , S − = ( t t 1 ) S − where gt is a sequence of( functions) = − parametrized by θ(and depending) on past input t 1 t 1 and output data, contained in u ut 1, ut 2,... and y yt 1, yt 2,... , respectively. According to the model− (2.36), the output is generated− by ∶= { − − } ∶= { − − } t 1 t 1 yt gt u , y ; θ εt θ . (2.37) − − Then, the joint PDF of the observations= ( ) + ( )

N y y1 yN (2.38) ⊺ is given by [1] =  ⋯  N N fy θ; y fe εt θ , t; θ . (2.39) t 1

Instead of maximizing this function,( ) = weM= can( maximize( ) ) its scaled logarithm,

N 1 N 1 L θ log fy θ; y log fe εt θ , t; θ . (2.40) N N t 1 ( ) ∶= ( ) = Q ( ( ) ) Defining = l εt θ log fe εt θ , t; θ , (2.41) and replacing (2.41) in (2.40)( , we( )) observe= − that( the( ) ML estimate) of θ can be obtained by minimizing (2.33). It is thus possible to see the ML estimator as a particular case of PEM, with l εt θ chosen as (2.41). In the particular case that the prediction errors are assumed to be Gaussian, ( ( ))2 zero mean, with variance σo, we have that log σ2 1 l ε θ f ε θ , t θ C o ε2 θ , t log e t ; 2 t (2.42) 2 2σo ( ( )) = − ( ( ) ) = + + ( ) where C is a constant. In this case, we obtain an estimate of the parameters by minimizing 1 N log σ2 1 V θ C o ε2 θ . N 2 t (2.43) N t 1 2 2σo

Since $\sigma_o^2$ is a scalar, it does not play a role in the minimization if it is considered independent of $\theta$, and (2.43) has the same minimizer as the PEM cost function (2.34). This implies that PEM with a quadratic cost function is optimal for Gaussian errors, in a maximum likelihood sense.

Having obtained an estimate $\hat{\theta}_N$ from (2.35), its asymptotic statistical properties can be analyzed. To do so, we assume that the system generating the data is in the model set (2.14)—that is, there exists a parameter vector $\theta_o$ such that data is generated according to (2.14), with $\theta = \theta_o$. In this case, we denote $G_o(q) = G(q, \theta_o)$ and $H_o(q) = H(q, \theta_o)$. Then, under mild regularity conditions, we have that [1]

$$\hat{\theta}_N \to \theta_o, \quad \text{as } N \to \infty, \ \text{w.p.1}, \qquad (2.44)$$

and the error $\sqrt{N}(\hat{\theta}_N - \theta_o)$ converges in distribution to the normal distribution according to

$$\sqrt{N}(\hat{\theta}_N - \theta_o) \sim \mathrm{As}\,\mathcal{N}(0, M_{\mathrm{CR}}^{-1}), \qquad (2.45)$$

where

$$M_{\mathrm{CR}} := \frac{1}{\sigma_o^2}\, \bar{\mathrm{E}}[\psi_t(\theta_o)\, \psi_t^\top(\theta_o)] \qquad (2.46)$$

and

$$\psi_t(\theta_o) = \left. \frac{d}{d\theta}\, \varepsilon_t(\theta) \right|_{\theta = \theta_o}. \qquad (2.47)$$

From (2.44), we have that PEM is consistent, meaning that the estimate $\hat{\theta}_N$ converges to the true parameter $\theta_o$, as $N \to \infty$, w.p.1. Otherwise, the estimate is said to be biased. Moreover, for Gaussian errors and with a quadratic cost function, the estimate $\hat{\theta}_N$ obtained in (2.35) is asymptotically efficient, meaning that a consistent estimator cannot asymptotically achieve a covariance lower than (2.46), which corresponds to the Cramér-Rao bound (the exception are super-efficient estimators, which can beat the Cramér-Rao bound if the set of parameters $\theta$ is a Lebesgue set of measure zero [27]). In this case, for all $\hat{\theta}_N$ obtained by a consistent estimator,

$$\mathrm{As\,cov}[\sqrt{N}(\hat{\theta}_N - \theta_o)] \ge M_{\mathrm{CR}}^{-1}. \qquad (2.48)$$

The drawback with PEM is that the optimization problem (2.35) is, in general, non-convex. Therefore, solving this problem might be computationally expensive, and the obtained estimate of $\theta$ might not be the global minimizer of $V_N(\theta)$. Guarantees of convergence to the global minimum exist only under special conditions, which depend on the model structure. For example, for ARMA models, it has been shown that asymptotically there are no non-global minima [28]. For ARMAX, output-error (OE), or Box-Jenkins (BJ) models, additional conditions are required [29–32]. Moreover, input design [33] or pre-filtering [34] techniques can be used to improve the convergence properties of PEM. Nevertheless, for some model structures, the PEM estimate can be obtained by solving a linear regression problem. This is the case of the ARX model structure, which is considered next.

ARX modeling

Consider the ARX model (2.15), which can be written in regressor form as

$$y_t = (\varphi_t^n)^\top \eta^n + e_t, \qquad (2.49)$$

where

$$\varphi_t^n = [-y_{t-1} \ \cdots \ -y_{t-n} \ \ u_{t-1} \ \cdots \ u_{t-n}]^\top \qquad (2.50)$$

and

$$\eta^n := [a_1 \ \cdots \ a_n \ \ b_1 \ \cdots \ b_n]^\top \in \mathbb{R}^{2n} \qquad (2.51)$$

is now the parameter vector to be estimated (for reasons of consistency with later notation for ARX models, we now use $\eta^n$ for the ARX model parameters). Here, for simplicity of notation, and without loss of generality, we assumed $n_a = n_b = n$. Using (2.31) with (2.17), we have that the prediction errors for this model structure are given by

$$\varepsilon_t(\eta^n) = A(q, \eta^n) \left[ y_t - \frac{B(q, \eta^n)}{A(q, \eta^n)}\, u_t \right] = A(q, \eta^n)\, y_t - B(q, \eta^n)\, u_t. \qquad (2.52)$$

Then, using (2.18) and (2.49), we can write

$$\varepsilon_t(\eta^n) = y_t + a_1 y_{t-1} + \dots + a_n y_{t-n} - b_1 u_{t-1} - \dots - b_n u_{t-n} = y_t - (\varphi_t^n)^\top \eta^n. \qquad (2.53)$$

Using a quadratic cost function, the PEM estimate of the parameter $\eta^n$ is given by minimizing

$$V_N^n(\eta^n) = \frac{1}{N} \sum_{t=1}^{N} \frac{1}{2} \left[ y_t - (\varphi_t^n)^\top \eta^n \right]^2. \qquad (2.54)$$

The corresponding minimizer—$\hat{\eta}_N^n$—is obtained by the least squares solution

$$\hat{\eta}_N^n = [R_N^n]^{-1} r_N^n, \qquad (2.55)$$

where

$$R_N^n = \frac{1}{N} \sum_{t=1}^{N} \varphi_t^n (\varphi_t^n)^\top, \qquad r_N^n = \frac{1}{N} \sum_{t=1}^{N} \varphi_t^n\, y_t. \qquad (2.56)$$

Thus, for an ARX model structure, minimizing a PEM quadratic cost function is a linear regression problem. Moreover, if the data are generated by (2.49) at some $\eta^n = \eta_o^n$, the estimates $\hat{\eta}_N^n$ are asymptotically distributed according to [1]

$$\sqrt{N}(\hat{\eta}_N^n - \eta_o^n) \sim \mathrm{As}\,\mathcal{N}(0, \sigma_o^2 [R_N^n]^{-1}), \qquad (2.57)$$

where $\sigma_o^2$ is the variance of $\{e_t\}$.

Note that constructing $\varphi_t^n$ for $t \le n$ requires knowledge of $u_t, y_t$ for $t \le 0$, which is typically not the case. However, knowledge of initial conditions does not influence the asymptotic properties; thus, in Chapter 3, where our assumptions are formally introduced, we consider the sums in (2.56) starting at $t = n + 1$. However, initial conditions can be important for estimation with finite sample size. A procedure for estimation of initial conditions is proposed in Chapter 6.
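The least squares solution (2.55)-(2.56) is direct to implement. The sketch below (an added illustration; the function name is hypothetical) forms the regressors (2.50) starting at $t = n + 1$, as in the remark above, and solves for $\hat{\eta}_N^n$.

```python
import numpy as np

def arx_ls(y, u, n):
    """ARX least squares estimate (2.55): returns eta = [a_1..a_n, b_1..b_n].
    y, u are 1-D numpy arrays of equal length."""
    N = len(y)
    # Regressors (2.50): phi_t = [-y_{t-1} ... -y_{t-n}  u_{t-1} ... u_{t-n}]^T,
    # starting at t = n + 1 (0-based index n) to avoid unknown initial conditions.
    Phi = np.array([np.concatenate((-y[t - n:t][::-1], u[t - n:t][::-1]))
                    for t in range(n, N)])
    eta_hat, *_ = np.linalg.lstsq(Phi, y[n:], rcond=None)
    return eta_hat
```

Here `eta_hat[:n]` are the estimated $a$-coefficients and `eta_hat[n:]` the $b$-coefficients; `np.linalg.lstsq` solves the normal equations (2.55)-(2.56) in a numerically sounder way than forming $[R_N^n]^{-1}$ explicitly.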

Another advantage of ARX models is that, if the model order $n$ is allowed to tend to infinity, they can approximate a quite general class of systems arbitrarily well, which includes systems with BJ structure [25]. This will be formally described in Chapter 3, but we can observe it with the following intuitive argument. Assume that data was generated according to (2.12), with $G_o(q)$ and $H_o(q)$ given by (2.10) and (2.11), respectively. Consider also the ARX model (2.19), but with polynomials of infinite order:

$$A(q, \eta) = 1 + \sum_{k=1}^{\infty} a_k q^{-k}, \qquad B(q, \eta) = \sum_{k=1}^{\infty} b_k q^{-k}. \qquad (2.58)$$

This ARX model can also be written in regressor form as

$$y_t = \varphi_t^\top \eta + e_t. \qquad (2.59)$$

Here, we have that $\varphi_t$ and $\eta$ are infinite dimensional vectors, given by

$$\varphi_t := [-y_{t-1} \ -y_{t-2} \ \cdots \ \ u_{t-1} \ u_{t-2} \ \cdots]^\top, \qquad \eta := [a_1 \ a_2 \ \cdots \ \ b_1 \ b_2 \ \cdots]^\top. \qquad (2.60)$$

Asymptotically in $N$, estimates of $A(q, \eta)$ and $B(q, \eta)$ are obtained by minimizing the cost function

$$\bar{V}(\eta) = \frac{1}{2}\, \bar{\mathrm{E}} \left[ A(q, \eta)\, y_t - B(q, \eta)\, u_t \right]^2. \qquad (2.61)$$

If we assume that $G_o(q)$ is stable—that is, its coefficients in (2.10) decay to zero with $k$ at some rate—and $H_o(q)$ is stable and inversely stable, the cost function (2.61) has minimizers $A_o(q)$ and $B_o(q)$ given by [25]

$$A_o(q) = \frac{1}{H_o(q)}, \qquad B_o(q) = \frac{G_o(q)}{H_o(q)}. \qquad (2.62)$$

Furthermore, we observe that the ARX model estimate and its covariance matrix form an asymptotic (in sample size and model order) sufficient statistic for our problem. A statistic is said to be sufficient with respect to a model and its unknown parameters when no other statistic calculated from the same sample can provide additional information about the unknown parameters. Consequently, for a generic system given by (2.12), the estimate $\hat{\eta}_N$ and its covariance matrix can replace the data samples, while no information is lost. Mathematically, if the joint PDF of the data set $y^N$ can be written as

$$f(\eta; y^N) = g\big(\eta; T(y^N)\big)\, h(y^N), \qquad (2.63)$$

$T(y^N)$ is a sufficient statistic with respect to $\eta$ and the data $y^N$ [35]. We have that the joint PDF of $y^N$ is given by

$$\begin{aligned} f(\eta; y^N) &= \prod_{t=1}^{N} \frac{1}{\sqrt{2\pi\sigma_o^2}} \exp\left\{ -\frac{(y_t - \varphi_t^\top \eta)^2}{2\sigma_o^2} \right\} \\ &= C \exp\left\{ -\frac{1}{2\sigma_o^2} \left( \eta^\top \sum_{t=1}^{N} \varphi_t \varphi_t^\top\, \eta - 2\, \eta^\top \sum_{t=1}^{N} \varphi_t\, y_t \right) \right\} \exp\left\{ -\frac{1}{2\sigma_o^2} \sum_{t=1}^{N} y_t^2 \right\}, \end{aligned} \qquad (2.64)$$

where we treat $\sigma_o^2$ as a known constant (as discussed before, for practical estimation purposes the noise variance does not play a role in the SISO case). Letting $T(y^N) := \{R_N, r_N\}$, we have that $R_N := \sum_{t=1}^{N} \varphi_t \varphi_t^\top$ and $r_N := \sum_{t=1}^{N} \varphi_t\, y_t$ form a sufficient statistic for the data $y^N$. Alternatively, since $\hat{\eta}_N = R_N^{-1} r_N$, we can let instead $T(y^N) := \{R_N, \hat{\eta}_N\}$, and say that $\hat{\eta}_N$ and $R_N$ are the sufficient statistic.

The problem is that, if only $N$ data samples are available, an infinite order ARX model cannot be estimated. However, this does not render the ARX model useless for the purpose of approximating more general structures. Under the assumption that $G_o(q)$ is stable and $H_o(q)$ is inversely stable, it can be observed in (2.62) that the coefficients of $A_o(q)$ and $B_o(q)$ decay to zero with $k$. Therefore, despite the bias error introduced by the truncation of the ARX model polynomials for finite $n$, this error can be made arbitrarily small by letting $n$ increase. This makes $\hat{\eta}_N^n$ and $R_N^n$ form an asymptotic (in model order) sufficient statistic for our problem. This means also that an ARX model with polynomials (2.18) can approximate the system (2.12) arbitrarily well, if $n = n_a = n_b$ can be chosen arbitrarily large. For a rigorous asymptotic analysis, we will let the model order $n$ increase to infinity as a function of the sample size $N$ according to some rate (as in [25], and first introduced in this thesis in Chapter 3). For now, we simply write that, for sufficiently large $N$ and $n$,

$$\frac{1}{H_o(q)} \approx A(q, \hat{\eta}_N^n), \qquad \frac{G_o(q)}{H_o(q)} \approx B(q, \hat{\eta}_N^n), \qquad (2.65)$$

where the estimate $\hat{\eta}_N^n$ is obtained by (2.55).

The limitation of this approach is that, as the number of parameters in the ARX model increases, so does the variance of the estimated model. Nevertheless, some methods use a high order ARX model as an intermediate step to obtain an estimate of a low order model of interest. Because it is of low order, the final estimated model will have smaller variance. Because the ARX model is asymptotically a sufficient statistic, the low order estimate may be asymptotically efficient if the model reduction is performed in a statistically sound way.
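The approximation (2.65) can be checked numerically. In the sketch below (illustrative values; `arx_ls` is the helper sketched earlier), data is generated with $H_o(q) = 1/(1 - 0.6 q^{-1})$, so that by (2.62) the minimizer satisfies $A_o(q) = 1/H_o(q) = 1 - 0.6 q^{-1}$, and a high order ARX estimate recovers these coefficients.

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(2)
N = 100_000
u = rng.standard_normal(N)
e = rng.standard_normal(N)
# Go(q) = 0.5 q^-1 / (1 - 0.8 q^-1), Ho(q) = 1 / (1 - 0.6 q^-1)  (illustrative)
y = lfilter([0.0, 0.5], [1.0, -0.8], u) + lfilter([1.0], [1.0, -0.6], e)

n = 30                                     # "sufficiently large" ARX order
eta = arx_ls(y, u, n)                      # helper from the earlier sketch
a_hat = np.concatenate(([1.0], eta[:n]))   # coefficients of A(q, eta_hat)

# Ao(q) = 1/Ho(q) = 1 - 0.6 q^-1, so a_hat should be close to [1, -0.6, 0, ...]
print(np.round(a_hat[:4], 3))
```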

2.3.2 Model Order Reduction with Explicit Maximum Likelihood Optimization

The importance of the ARX model lies in the computational simplicity of the estimation. At the same time, it can asymptotically approximate more general model structures. Therefore, an alternative to applying PEM on the measured data is to first estimate a high order ARX model and then use a maximum likelihood criterion to reduce it to the model of interest. We now proceed to discuss this type of approach.

Exact Maximum Likelihood

Consider that data is generated by (2.12), and that $G_o(q)$ and $H_o(q)$ have low order representations—$G(q, \theta_o)$ and $H(q, \theta_o)$, respectively—and we are interested in estimating the parameter vector $\theta$. As discussed, an ARX model (2.49) can be used to consistently model $G_o(q)$ and $H_o(q)$ if the order of the ARX model is allowed to tend to infinity. Then, fixing the order $n$ of the ARX model to be sufficiently large to capture the dynamics of the true system to a desired precision, the maximum likelihood estimate of $\theta$ given $\hat{\eta}_N^n$ can be obtained. Following an approach similar to the derivation of (2.40), the negative log-likelihood function of $\theta$ given $\hat{\eta}_N^n$ is

$$L(\theta) = C - \frac{1}{2} \log \det \mathrm{cov}(\hat{\eta}_N^n) - \frac{1}{2\sigma_o^2} [\hat{\eta}_N^n - \eta^n(\theta)]^\top [\mathrm{cov}(\hat{\eta}_N^n)]^{-1} [\hat{\eta}_N^n - \eta^n(\theta)], \qquad (2.66)$$

where $C$ is a constant, $\mathrm{cov}(\hat{\eta}_N^n)$ is the covariance of the estimates $\hat{\eta}_N^n$ (although this quantity is typically not available, it can be replaced by a consistent estimate based on (2.56) without affecting the asymptotic properties [36]), and $\eta^n(\theta)$ is the high order parameter vector given as a function of the low order parameter $\theta$. Because the first two terms in (2.66) do not depend on $\theta$, we can simply minimize

$$V_N(\theta) = [\hat{\eta}_N^n - \eta^n(\theta)]^\top [\mathrm{cov}(\hat{\eta}_N^n)]^{-1} [\hat{\eta}_N^n - \eta^n(\theta)]. \qquad (2.67)$$

Since $\hat{\eta}_N^n$ and $R_N^n$ (which can replace $[\mathrm{cov}(\hat{\eta}_N^n)]^{-1}$) are asymptotically a sufficient statistic for this problem, (2.67) can be minimized to obtain an estimate of $\theta$ that is almost asymptotically efficient. The error only lies in the truncation of the ARX model. However, for exponentially stable systems, this error decreases fast as $n$ increases, compared to the increase in variance from estimating more parameters. Therefore, as shown in [36], by first letting $N \to \infty$ and then $n \to \infty$, minimizing (2.67) will provide asymptotically efficient estimates of $\theta$.

The limitation with this approach is that minimizing (2.67) can still be a rather complex non-convex optimization problem. Although it is a standard result that, if initialized with a consistent estimate, an ML criterion provides an asymptotically efficient estimate in one iteration of a Gauss-Newton method [24], the same is true about PEM when applied directly to data, which commonly has convergence problems. Nevertheless, if the data can now be condensed into an asymptotic sufficient statistic of smaller dimensions, minimizing (2.67) can have numerical advantages with respect to minimizing (2.34). This has been observed in [18], where this procedure—called indirect PEM—is systematized for the case where the higher order model is of fixed order.
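Evaluating (2.67) requires the map $\eta^n(\theta)$. For a BJ parametrization (2.24), the relations $A_o = 1/H_o$ and $B_o = G_o/H_o$ in (2.62) suggest computing the truncated series coefficients of $D/C$ and $LD/(FC)$; the sketch below (an added illustration, not the thesis code; the function name is hypothetical) does this by filtering an impulse.

```python
import numpy as np
from scipy.signal import lfilter

def eta_of_theta(F, L, C, D, n):
    """Truncated high order ARX parameters eta^n(theta) for a BJ model (2.24):
    series coefficients of A(q) = D(q)/C(q) and B(q) = L(q)D(q)/(F(q)C(q)).
    F, C, D are monic coefficient arrays in powers of q^-1;
    L starts with a zero since L(q) has no constant term."""
    imp = np.zeros(n + 1)
    imp[0] = 1.0
    a = lfilter(D, C, imp)                                   # expansion of D/C
    b = lfilter(np.convolve(L, D), np.convolve(F, C), imp)   # expansion of LD/(FC)
    return np.concatenate((a[1:], b[1:]))                    # drop a_0 = 1 and b_0 = 0
```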

Indirect PEM Indirect PEM is presented in [18] as follows. Consider that we are interested in estimating a model structure parametrized by the vector θ, which we will denote n 1 θ . Consider also a model structure parametrized by the vector η , which we n will denote 2 η , such that 1 is a subset of 2. Note that indirect PEM n n assumesM ( ) that 2 η is of fixed order—that is, 2 η is a subset of 1 θ for finite n. M ( ) M M n The idea ofM indirect( ) PEM is to, in a first step, applyM ( PEM) to 2 η ,M obtaining( ) n n n an estimate ηˆN of η . Then, in a second step, this estimate ηˆN is used to obtain an M ( ) estimate of θ using a maximum likelihood approach. The estimate θˆN obtained in this manner has the same asymptotic properties as if PEM had been applied to 1 θ directly. Although, in general, both steps of indirect PEM require solving non-convex optimization problems, this can be avoided in some settings. In particular, forM some( ) model structures 1, the model structure 2 can be chosen such that Step 1 becomes a linear regression problem. Consider, for example,M the model 1 θ M

\[
A_1(q,\theta)\, y_t = B_1(q,\theta)\, u_t + \frac{1}{D_1(q,\theta)}\, e_t, \tag{2.68}
\]

where A_1(q, θ), B_1(q, θ), and D_1(q, θ) are finite order polynomials in q^{-1}, and {e_t} is a Gaussian white noise sequence with variance σ_o². The prediction errors associated with (2.68) are given by

\[
\varepsilon_t(\theta) = A_1(q,\theta)\, D_1(q,\theta)\, y_t - B_1(q,\theta)\, D_1(q,\theta)\, u_t, \tag{2.69}
\]

which is non-linear in the model parameters θ. Therefore, the corresponding cost function (2.34) will, in general, require local non-linear optimization techniques to be minimized. Alternatively, we re-write (2.68) as

\[
A_1(q,\theta)\, D_1(q,\theta)\, y_t = B_1(q,\theta)\, D_1(q,\theta)\, u_t + e_t. \tag{2.70}
\]

Then, if we use an over-parametrization according to

\[
A_2(q,\eta^n)\, y_t = B_2(q,\eta^n)\, u_t + e_t, \tag{2.71}
\]

we observe that this model, denoted M_2(η^n), is an ARX model, and can be estimated with PEM by solving a linear regression problem. Also, the model parameters η^n of M_2 can be written as a function of the model parameters θ of M_1—that is, η^n = η^n(θ)—based on the polynomial equivalences

\[
A_2(q,\eta^n) = A_1(q,\theta)\, D_1(q,\theta), \qquad B_2(q,\eta^n) = B_1(q,\theta)\, D_1(q,\theta). \tag{2.72}
\]

Having obtained an estimate η̂_N^n, an estimate θ̂_N can be obtained by minimizing (2.67).

We recall that minimizing (2.67) still requires, in general, local non-linear optimization techniques, as the function η^n(θ) is non-linear. To minimize it, a Gauss-Newton algorithm may be used, which consists of updating the low order estimates according to

\[
\hat\theta_N^{k+1} = \hat\theta_N^k + \left[ Q^\top(\hat\theta_N^k) \left[\operatorname{cov}(\hat\eta_N^n)\right]^{-1} Q(\hat\theta_N^k) \right]^{-1} Q^\top(\hat\theta_N^k) \left[\operatorname{cov}(\hat\eta_N^n)\right]^{-1} \left[\hat\eta_N^n - \eta^n(\hat\theta_N^k)\right], \tag{2.73}
\]

where

\[
Q(\hat\theta_N^k) = \left. \frac{\partial \eta^n(\theta)}{\partial \theta} \right|_{\theta = \hat\theta_N^k}. \tag{2.74}
\]

Because it is an ML criterion, one Gauss-Newton step is enough to obtain an estimate of θ that is asymptotically efficient, as long as the algorithm is initialized with a consistent estimate. For our example, a consistent estimate can be obtained by replacing η^n with its estimate η̂_N^n in (2.72). In particular, estimates of A_1(q, θ) and B_1(q, θ) can be obtained by least squares, based on the polynomial relation

\[
A_1(q,\theta)\, B_2(q,\eta^n) - B_1(q,\theta)\, A_2(q,\eta^n) = 0, \tag{2.75}
\]

replacing η^n by its estimate η̂_N^n. Then, an estimate of D_1(q, θ) can also be obtained by least squares, based on, for example,

\[
A_2(q,\eta^n) - A_1(q,\theta)\, D_1(q,\theta) = 0, \tag{2.76}
\]

replacing η^n by its estimate η̂_N^n, and A_1(q, θ) by the estimate previously obtained. This provides a consistent estimate, which can be used to initialize the Gauss-Newton algorithm.

In conclusion, indirect PEM may be used to ease the computational burden of PEM when the model structure M_2 can be chosen such that Step 1 becomes a least squares problem. Then, Step 2 provides estimates with the same asymptotic statistical properties as if PEM had been applied to M_1 directly. Although Step 2 still requires local non-linear optimization techniques, numerical issues and convergence properties are considerably better than with a direct PEM estimation, because the size of the problem is reduced, as M_2 is a model of fixed order.
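To make the update (2.73) concrete, here is a minimal sketch in Python, assuming the user supplies the structure-dependent map η^n(θ) from (2.72) and its Jacobian (2.74) as callables; the names eta_of_theta and jac_eta are hypothetical placeholders, not part of any library.

```python
import numpy as np

def gauss_newton_step(theta, eta_hat, eta_of_theta, jac_eta, cov_inv):
    """One Gauss-Newton step (2.73) for the criterion (2.67).

    theta        : current low order estimate (should be consistent)
    eta_hat      : high order estimate (e.g., from ARX least squares)
    eta_of_theta : callable theta -> eta^n(theta), the map (2.72)
    jac_eta      : callable theta -> Q(theta), the Jacobian (2.74)
    cov_inv      : inverse covariance of eta_hat, or a consistent estimate
    """
    Q = jac_eta(theta)
    r = eta_hat - eta_of_theta(theta)      # residual of the fit (2.67)
    A = Q.T @ cov_inv @ Q                  # Gauss-Newton normal matrix
    return theta + np.linalg.solve(A, Q.T @ cov_inv @ r)
```

Since (2.67) is an ML criterion, a single such step from a consistent initialization already attains the asymptotic properties discussed above.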

Asymptotic Maximum Likelihood for Model Order Reduction

Although minimizing criterion (2.67) gives asymptotically efficient estimates, it can be re-written to provide additional intuition about the problem. This is done by considering a particular form of the asymptotic maximum likelihood criterion [36]. We now return to the case where the order n may tend to infinity. Also, for the methods that follow, we assume that the system is operating in open loop.

Consider again criterion (2.67); however, instead of replacing the unknown matrix [cov(η̂_N^n)]^{-1} by the consistent estimate R_N^n, we parametrize it with θ, as R^n(θ) (as argued in [36], this can be done while maintaining the asymptotic properties of the estimate). Thus, we minimize

\[
V_N(\theta) = \left[\hat\eta_N^n - \eta^n(\theta)\right]^\top R^n(\theta) \left[\hat\eta_N^n - \eta^n(\theta)\right]. \tag{2.77}
\]

Although minimizing (2.77) seems, at first sight, more complicated than when [cov(η̂_N^n)]^{-1} is replaced by R_N^n (due to the extra dependence on the parameter vector θ), this criterion can be written as an asymptotic ML criterion that allows separating the estimation of G(q, θ) from the estimation of H(q, θ). As shown in [36], if we ignore the truncated tails of the ARX model (which tend to zero exponentially, and therefore do not affect the asymptotic properties of the estimate), criterion (2.77) can be translated to the frequency domain using Parseval's formula, for large N, as

\[
V_N(\theta) = \int_0^{2\pi} \left| G(e^{i\omega}, \hat\eta_N^n) - G(e^{i\omega}, \theta) \right|^2 \frac{\Phi_u(e^{i\omega})}{\left| H(e^{i\omega}, \hat\eta_N^n) \right|^2}\, d\omega
+ \hat\sigma^2 \int_0^{2\pi} \frac{\left| H(e^{i\omega}, \hat\eta_N^n) - H(e^{i\omega}, \theta) \right|^2}{\left| H(e^{i\omega}, \hat\eta_N^n) \right|^2}\, d\omega, \tag{2.78}
\]

where σ̂² is a consistent estimate of σ_o² and Φ_u is the spectrum of the input signal {u_t}. Criterion (2.78) allows us to observe that, asymptotically in N and in open loop operation, the minimization problem to estimate θ can be simplified. This is because the first term in (2.78) depends only on G(q, θ) and the second term only on H(q, θ). Therefore, if G(q, θ) and H(q, θ) are parametrized independently, the estimation of θ can be separated into two smaller problems, one for estimating G(q, θ) and another for estimating H(q, θ). If only a low order model of the plant is of interest, we simply have to minimize

\[
V_N(\theta) = \int_0^{2\pi} \left| G(e^{i\omega}, \hat\eta_N^n) - G(e^{i\omega}, \theta) \right|^2 \frac{\Phi_u(e^{i\omega})}{\left| H(e^{i\omega}, \hat\eta_N^n) \right|^2}\, d\omega. \tag{2.79}
\]

The asymptotic ML criterion can also provide intuition on how to use alternative model reduction techniques (e.g., balanced reduction, Hankel norm model reduction). Some procedures have been proposed in [37]. However, they do not provide optimal estimates and are considered outside the scope of this thesis.

The ASYM Method

Motivated by (2.79), the idea of the ASYM method [19] is to minimize

\[
V_N(\theta) = \frac{1}{N} \sum_{t=1}^{N} \left[ \left( \frac{B(q, \hat\eta_N^n)}{A(q, \hat\eta_N^n)} - G(q,\theta) \right) A(q, \hat\eta_N^n)\, u_t \right]^2, \tag{2.80}
\]

which is a consistent estimate of (2.79) in the time domain. The limitation here is that minimizing (2.80) is still a non-convex optimization problem. However, it is pointed out in [19] that this minimization problem has an advantage over directly estimating G(q, θ) using PEM, which makes the method numerically more reliable: because the output is not used explicitly in (2.80), and the noise contribution is only present indirectly through the high order estimates, the influence of the disturbance is reduced.
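As an illustration, the minimization of (2.80) can be set up with a general-purpose least squares solver. The sketch below assumes scalar data and G(q, θ) = L(q, θ)/F(q, θ); it is only a minimal rendering of the criterion under these assumptions, not a reference implementation of ASYM.

```python
import numpy as np
from scipy.signal import lfilter
from scipy.optimize import least_squares

def asym_residuals(theta, u, a_hat, b_hat, mf, ml):
    """Residuals of (2.80): (B-hat(q)/A-hat(q) - L(q)/F(q)) A-hat(q) u_t.

    Polynomial arrays are in powers of q^{-1}, with a_hat[0] = 1 and
    b_hat[0] = 0; theta = [f_1..f_mf, l_1..l_ml].
    """
    f = np.r_[1.0, theta[:mf]]                  # F(q, theta)
    l = np.r_[0.0, theta[mf:mf + ml]]           # L(q, theta), one delay
    lhs = lfilter(b_hat, [1.0], u)              # (B-hat/A-hat) A-hat u = B-hat u
    rhs = lfilter(l, f, lfilter(a_hat, [1.0], u))   # G(q, theta) A-hat(q) u
    return lhs - rhs

# Hypothetical usage, with (a_hat, b_hat) from a high order ARX estimate:
# sol = least_squares(asym_residuals, theta0, args=(u, a_hat, b_hat, mf, ml))
```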

2.3.3 Linear Least Squares Methods

Some methods avoid a non-convex optimization procedure by applying least squares to some fitting criterion, potentially with weighting and iteratively. A great variety of such methods have been proposed and studied in signal processing and system identification, for time and frequency domain applications. We now proceed to cover a few methods that are of interest for comparison with the methods proposed in this thesis.

For transfer function synthesis from frequency response data, various least squares methods are presented under a unified framework in [38], which can be considered to have the Sanathanan-Koerner method [39] as precursor. For sensor array processing, several methods have been unified under weighted subspace fitting [40, 41]. There, the optimal weighting is derived, and it is also shown that a consistent estimate of the weighting matrix is sufficient to minimize the variance of the obtained estimates.

The class of methods we will call iterative quadratic maximum likelihood (IQML) consists essentially in attempting to optimize a maximum likelihood criterion by applying iterative weighted least squares to a quadratic approximation. These methods have been applied to different problems, with different choices of weighting (e.g., [20, 42–47]). In some cases, these algorithms are asymptotically efficient for the considered problem. Although formulated as a seemingly different algorithm, the Steiglitz-McBride method [48] can also be considered an IQML method; in some sense, it is even equivalent to IQML [49].

We now proceed to review least squares methods related to IQML that we consider relevant for the methods proposed in this thesis. Although more advanced techniques have been proposed, for the purposes of our discussion we present the algorithm in [20], by Evans and Fischl, as an example of what we call IQML. Then, we review the Steiglitz-McBride method [48], an iterative least squares method, and establish connections with the previous method. Finally, we present the Box-Jenkins Steiglitz-McBride algorithm [23], which improves on the Steiglitz-McBride method, extending its range of applicability and improving its asymptotic properties.

The Evans-Fischl Method

Consider an OE model, given by (2.24) with C(q) = 1 = D(q)—that is,

\[
y_t = \frac{L(q,\theta)}{F(q,\theta)}\, u_t + e_t, \tag{2.81}
\]

where

\[
\theta = \begin{bmatrix} f_1 & \cdots & f_n & l_1 & \cdots & l_n \end{bmatrix}^\top \in \mathbb{R}^{2n}. \tag{2.82}
\]

For simplicity of notation, we assume the same orders for L(q, θ) and F(q, θ); for convenience, we also define

\[
f := \begin{bmatrix} 1 & f_1 & \cdots & f_n \end{bmatrix}^\top \in \mathbb{R}^{n+1}, \qquad
l := \begin{bmatrix} l_1 & \cdots & l_n \end{bmatrix}^\top \in \mathbb{R}^{n}. \tag{2.83}
\]

Consider that

\[
G(q,\theta) := \frac{L(q,\theta)}{F(q,\theta)} \tag{2.84}
\]

has impulse response coefficients {g_k(θ)} according to

\[
G(q,\theta) = \sum_{k=1}^{\infty} g_k(\theta)\, q^{-k}, \tag{2.85}
\]

and assume that we have measured the first N coefficients g_k(θ) from an impulse response, which we denote by {ĝ_k}. Then, defining the error

\[
\varepsilon_k(\theta) = \hat g_k - g_k(\theta), \tag{2.86}
\]

we can obtain an estimate of θ by minimizing the error criterion

\[
V_N(\theta) = \sum_{k=1}^{N} \varepsilon_k^2(f, l). \tag{2.87}
\]

Instead of performing the minimization explicitly, this can be done as follows. Note that, by re-writing (2.84) as

\[
F(q,\theta)\, G(q,\theta) - L(q,\theta) = 0 \tag{2.88}
\]

and expanding its polynomials, we can write the relation between the respective coefficients as

\[
\begin{bmatrix} l \\ 0 \end{bmatrix} = \begin{bmatrix} Q_1 \\ Q_2 \end{bmatrix} f, \tag{2.89}
\]

where

\[
Q_1 := \begin{bmatrix}
g_1 & 0 & \cdots & 0 & 0 \\
g_2 & g_1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
g_n & g_{n-1} & \cdots & g_1 & 0
\end{bmatrix} \in \mathbb{R}^{n \times (n+1)},
\qquad
Q_2 := \begin{bmatrix}
g_{n+1} & g_n & \cdots & g_1 \\
g_{n+2} & g_{n+1} & \cdots & g_2 \\
\vdots & \vdots & \ddots & \vdots \\
g_N & g_{N-1} & \cdots & g_{N-n}
\end{bmatrix} \in \mathbb{R}^{(N-n) \times (n+1)}. \tag{2.90}
\]

Although the matrices Q_1 and Q_2 are not known, we can replace the coefficients {g_k} by their estimates {ĝ_k}, denoting the resulting matrices by Q̂_1 and Q̂_2, respectively. Observing that

\[
\hat Q_2 f \approx 0, \tag{2.91}
\]

and because the first entry of f equals 1, an initial estimate f̂⁰ of f can be obtained by least squares, according to

\[
\hat f^0 = \begin{bmatrix} 1 \\ -\left( \bar Q_2^\top \bar Q_2 \right)^{-1} \bar Q_2^\top\, \hat q_2 \end{bmatrix}, \tag{2.92}
\]

where

\[
\hat q_2 := \begin{bmatrix} \hat g_{n+1} & \hat g_{n+2} & \cdots & \hat g_N \end{bmatrix}^\top \tag{2.93}
\]

is the first column of Q̂_2 and Q̄_2 denotes the matrix of its remaining n columns. Note that f̂⁰ is a crude estimate, as (2.91) is not satisfied exactly when Q_2 is replaced by Q̂_2. In particular, when we make this replacement, we can write

\[
\hat Q_2 f = T(f)\, \hat g, \tag{2.94}
\]

with

\[
T(f) := \begin{bmatrix}
f_n & f_{n-1} & \cdots & 1 & & & 0 \\
 & f_n & f_{n-1} & \cdots & 1 & & \\
 & & \ddots & \ddots & & \ddots & \\
0 & & & f_n & f_{n-1} & \cdots & 1
\end{bmatrix}, \qquad
\hat g := \begin{bmatrix} \hat g_1 \\ \hat g_2 \\ \vdots \\ \hat g_N \end{bmatrix}. \tag{2.95}
\]

Then, the Evans-Fischl algorithm re-computes an estimate of f_o using weighted least squares, according to

\[
\hat f^1 = \begin{bmatrix} 1 \\ -\left( \bar Q_2^\top\, W(\hat f^0)\, \bar Q_2 \right)^{-1} \bar Q_2^\top\, W(\hat f^0)\, \hat q_2 \end{bmatrix}, \tag{2.96}
\]

where

\[
W(f) = \left[ T(f)\, T^\top(f) \right]^{-1}. \tag{2.97}
\]

Then, one can continue to iterate, solving the weighted least squares problem

\[
\hat f^{k+1} = \begin{bmatrix} 1 \\ -\left( \bar Q_2^\top\, W(\hat f^k)\, \bar Q_2 \right)^{-1} \bar Q_2^\top\, W(\hat f^k)\, \hat q_2 \end{bmatrix} \tag{2.98}
\]

until the estimates converge to a parameter vector that we denote by f̂^∞. Upon convergence, the relation l = Q̂_1 f can be used to obtain an estimate of l with a similar approach. In this case, no more iterations are required, as the weighting matrix is the same: it depends only on f, whose estimates have already converged.

A limitation of this procedure is that it does not consider the statistical properties of the estimates {ĝ_k}. In Chapter 4, we will return to this question, and propose a related algorithm that provides asymptotically efficient estimates for a more general class of systems.
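A minimal numpy sketch of the iterations (2.92)–(2.98) is given below, assuming an impulse response estimate ĝ = [ĝ_1, ..., ĝ_N] is available; convergence tests and numerical safeguards are omitted, and Q̂_2 is partitioned into its first column q̂_2 and the remaining columns, as in the text.

```python
import numpy as np
from scipy.linalg import toeplitz

def evans_fischl(g_hat, n, iters=20):
    """Iterative weighted least squares (2.92)-(2.98) on an impulse
    response estimate g_hat = [g_1, ..., g_N] (a sketch)."""
    N = len(g_hat)
    # Q2: rows k = n+1..N of the Toeplitz relation (2.90)
    Q2 = toeplitz(g_hat[n:], np.r_[g_hat[n], g_hat[n-1::-1]])
    q2, Q2bar = Q2[:, 0], Q2[:, 1:]
    # Initial unweighted least squares estimate (2.92)
    fbar = -np.linalg.lstsq(Q2bar, q2, rcond=None)[0]
    for _ in range(iters):
        # T(f): banded rows [f_n ... f_1 1 0 ... 0], as in (2.95)
        row = np.r_[fbar[::-1], 1.0, np.zeros(N - n - 1)]
        T = np.stack([np.roll(row, k) for k in range(N - n)])
        W = np.linalg.inv(T @ T.T)                  # weighting (2.97)
        A = Q2bar.T @ W @ Q2bar                     # weighted LS (2.98)
        fbar = -np.linalg.solve(A, Q2bar.T @ W @ q2)
    f = np.r_[1.0, fbar]
    # With f fixed, recover l from the upper block l = Q1 f, cf. (2.89)
    Q1 = np.stack([np.r_[g_hat[k::-1], np.zeros(n - k)] for k in range(n)])
    return f, Q1 @ f
```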

The Steiglitz-McBride Method

We now present the Steiglitz-McBride method [48], which is related to IQML. Consider the OE model (2.24) again, and assume that the data are generated by (2.81) at some θ = θ_o. With a PEM estimator, the estimate of θ is obtained by minimizing

\[
V_N(\theta) = \frac{1}{N} \sum_{t=1}^{N} \frac{1}{2} \left[ y_t - \frac{L(q,\theta)}{F(q,\theta)}\, u_t \right]^2. \tag{2.99}
\]

We observe that, at θ = θ_o, we have

\[
V_N(\theta_o) = \frac{1}{N} \sum_{t=1}^{N} \frac{1}{2}\, e_t^2. \tag{2.100}
\]

Therefore, the minimum V_N(θ_o) corresponds to a function of a white error sequence, as is the case when the true system is in the model set. The problem with minimizing (2.99) is that it is, in general, non-convex in θ, requiring local non-linear optimization routines. To avoid this limitation, the Steiglitz-McBride method uses iterative least squares to estimate θ.

Consider, first, the following three step scheme. In the first step, an ARX model

\[
F(q,\theta)\, y_t = L(q,\theta)\, u_t + e_t \tag{2.101}
\]

is estimated by least squares, as in (2.55), and an estimate θ̂¹_N of θ is obtained. In the second step, the input and output are filtered by

\[
y_t^{\mathrm{F}} = \frac{1}{F(q, \hat\theta_N^1)}\, y_t, \qquad u_t^{\mathrm{F}} = \frac{1}{F(q, \hat\theta_N^1)}\, u_t. \tag{2.102}
\]

In the third step, the ARX model

\[
F(q,\theta)\, y_t^{\mathrm{F}} = L(q,\theta)\, u_t^{\mathrm{F}} + e_t \tag{2.103}
\]

is estimated, providing a new estimate θ̂²_N of θ. Then, steps two and three can be repeated to refine the estimate.

To understand the motivation for the algorithm, we start by observing that the true system is not in the model set defined by (2.101), which is the model being estimated. Thus, the cost function being minimized in the first step is

\[
V_N(\theta) = \frac{1}{N} \sum_{t=1}^{N} \frac{1}{2} \left[ F(q,\theta)\, y_t - L(q,\theta)\, u_t \right]^2. \tag{2.104}
\]

At the true parameters, we have

\[
V_N(\theta_o) = \frac{1}{N} \sum_{t=1}^{N} \frac{1}{2} \left[ F(q,\theta_o)\, e_t \right]^2. \tag{2.105}
\]

Therefore, at the true parameter vector θ_o, the cost function does not correspond to a function of a white sequence. However, in Step 3, at some iteration k, the function being minimized is

\[
V_N(\theta) = \frac{1}{N} \sum_{t=1}^{N} \frac{1}{2} \left[ \frac{F(q,\theta)}{F(q, \hat\theta_N^k)}\, y_t - \frac{L(q,\theta)}{F(q, \hat\theta_N^k)}\, u_t \right]^2. \tag{2.106}
\]

Then, at θ = θ_o, we have

\[
V_N(\theta_o) = \frac{1}{N} \sum_{t=1}^{N} \frac{1}{2} \left[ \frac{F(q,\theta_o)}{F(q, \hat\theta_N^k)}\, e_t \right]^2. \tag{2.107}
\]

Thus, assuming convergence of θ̂ᵏ_N to the true parameters θ_o as k → ∞, (2.107) asymptotically corresponds to (2.100).

Convergence of the Steiglitz-McBride method has been studied in [21, 50]. The method is locally convergent when the additive output noise is white; additionally, if the signal-to-noise ratio is sufficiently large, it is also globally convergent. However, for arbitrarily poor signal-to-noise ratios, the spectral characteristics of the disturbance term can always be chosen such that no stationary points exist. Assuming convergence, the estimates are asymptotically Gaussian distributed. However, the method is not asymptotically efficient: in general, the asymptotic covariance is larger than M_CR^{-1}.

Thus, although the Steiglitz-McBride method is not a PEM algorithm, it attempts to minimize a loss function of prediction errors without using non-convex optimization techniques. Instead, it uses iterative least squares with data filtered by a model based on the estimate obtained at the previous iteration. The main limitations of the Steiglitz-McBride method are that it is only consistent for additive white noise (OE models), and that even then it is not asymptotically efficient.

An important remark is that the Steiglitz-McBride method can also be applied to an impulse response estimate instead of data. In this case, we simply set

\[
u_t = \begin{cases} 1, & t = 0 \\ 0, & t > 0 \end{cases} \tag{2.108}
\]

and y_t = ĝ_k, with t = k, where {ĝ_k} are defined as in the Evans-Fischl algorithm. If this is done, Steiglitz-McBride and Evans-Fischl are exactly equivalent, in the sense that they provide the same estimates at each iteration. Although this has been shown formally in [49], it can be understood intuitively as follows. At each iteration of both methods, the previous denominator estimate is used to filter the signals. In Steiglitz-McBride, this is done by filtering with 1/F(q, θ). In Evans-Fischl, it is done by weighting; the equivalence is due to the fact that the matrix T(f) (which has Toeplitz structure) can be interpreted as filtering by F(q, θ).

Evans-Fischl and Steiglitz-McBride (when applied to an impulse response estimate) can be seen as ways of performing model order reduction with an indirect procedure to minimize a cost function. In particular, instead of using a local non-linear optimization technique, they apply least squares (combined with weighting or filtering) iteratively. However, they are restricted to models with additive white noise, and they do not consider the statistical properties of the impulse response estimate. Consequently, the estimates will in general not be asymptotically efficient. The following method attempts to deal with these limitations.
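The three step scheme (2.101)–(2.103) translates directly into code. The sketch below, for scalar data, handles initial conditions crudely by zero-padding and runs a fixed number of iterations rather than testing for convergence.

```python
import numpy as np
from scipy.signal import lfilter

def steiglitz_mcbride(y, u, mf, ml, iters=10):
    """Steiglitz-McBride iterations for the OE model (2.81): the ARX model
    (2.101) is re-estimated on data filtered by 1/F(q, theta-hat) from the
    previous iteration (a sketch)."""
    N = len(y)
    f = np.array([1.0])                     # F(q) = 1 initially: no filtering
    for _ in range(iters):
        yF = lfilter([1.0], f, y)           # filtering step (2.102)
        uF = lfilter([1.0], f, u)
        # Least squares ARX fit (2.103):
        # yF_t = -f_1 yF_{t-1} - ... + l_1 uF_{t-1} + ...
        m = max(mf, ml)
        Phi = np.column_stack(
            [-np.r_[np.zeros(k), yF[:N-k]] for k in range(1, mf + 1)]
            + [np.r_[np.zeros(k), uF[:N-k]] for k in range(1, ml + 1)])
        theta = np.linalg.lstsq(Phi[m:], yF[m:], rcond=None)[0]
        f = np.r_[1.0, theta[:mf]]          # updated denominator estimate
    l = np.r_[0.0, theta[mf:]]              # L(q, theta-hat), one delay
    return f, l
```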

The Box-Jenkins Steiglitz-McBride Method

The Box-Jenkins Steiglitz-McBride (BJSM) method [22] combines ARX modeling and the Steiglitz-McBride method. The idea is to estimate a high order ARX model and use it to pre-filter the input and output data. Then, the Steiglitz-McBride method is applied to the pre-filtered data set. This procedure not only extends the range of applicability of the Steiglitz-McBride method to BJ models, but also provides asymptotically efficient estimates in open loop.

The method can be motivated as follows. Consider that data are generated by (2.12), with

\[
G_o(q) = \frac{L(q,\theta_o)}{F(q,\theta_o)}, \qquad H_o(q) = \frac{C(q,\theta_o)}{D(q,\theta_o)}, \tag{2.109}
\]

where F(q, θ_o), L(q, θ_o), C(q, θ_o), and D(q, θ_o) are finite order polynomials, according to (2.25). Here, we are interested in estimating the parameter vector

\[
\theta = \begin{bmatrix} f_1 & \cdots & f_{n_f} & l_1 & \cdots & l_{n_l} \end{bmatrix}^\top \in \mathbb{R}^{n_f + n_l}, \tag{2.110}
\]

with parametrization F(q, θ) and L(q, θ). We observe that, for the Steiglitz-McBride iterations to be applicable, the model of interest must be of OE structure, as in (2.81). The idea is then to estimate a high order ARX model and use it to create a data set that has approximately been generated by a system of OE structure.

The first step of BJSM is to estimate an ARX model according to (2.55), obtaining the estimated parameter vector η̂_N^n. Having estimated the ARX model, the next step of the method is to create pre-filtered versions of the input and output data—y_t^pf and u_t^pf, respectively. This is done according to

\[
y_t^{\mathrm{pf}} = A(q, \hat\eta_N^n)\, y_t, \qquad u_t^{\mathrm{pf}} = A(q, \hat\eta_N^n)\, u_t. \tag{2.111}
\]

Then, notice that y_t^pf and u_t^pf are related by

\[
y_t^{\mathrm{pf}} = \frac{L(q,\theta_o)}{F(q,\theta_o)}\, u_t^{\mathrm{pf}} + A(q, \hat\eta_N^n)\, H_o(q)\, e_t. \tag{2.112}
\]

From (2.65), we observe that, for sufficiently large N and n, we have

\[
A(q, \hat\eta_N^n)\, H_o(q) \approx 1. \tag{2.113}
\]

Thus, we can write

\[
y_t^{\mathrm{pf}} \approx \frac{L(q,\theta_o)}{F(q,\theta_o)}\, u_t^{\mathrm{pf}} + e_t. \tag{2.114}
\]

We conclude that the pre-filtered sequences {y_t^pf} and {u_t^pf} are related by a system approximately of OE structure, which is appropriate for the Steiglitz-McBride method. The final step of BJSM is then to apply Steiglitz-McBride to the model structure

\[
y_t^{\mathrm{pf}} = \frac{L(q,\theta)}{F(q,\theta)}\, u_t^{\mathrm{pf}} + e_t \tag{2.115}
\]

to obtain an estimate of θ.

The estimate of θ obtained with BJSM is asymptotically efficient, provided that the Steiglitz-McBride iterations converge and that the data are collected in open loop [23]. Nevertheless, notice that not all the information in η̂_N^n is used, as the filtering (2.111) only uses A(q, η̂_N) and not B(q, η̂_N). This implies that the high order ARX model is not used as a sufficient statistic for this problem. The reason why the method is still asymptotically efficient is that the output is still used. Despite having optimal statistical properties, re-using the data leads to two limitations, which we proceed to discuss.

The first is quite counter-intuitive. Suppose that H_o(q) = 1 (i.e., the true system is already of OE structure). Then, we have that A_o(q) = 1, and estimating a finite impulse response (FIR) model would suffice to asymptotically model the true system. However, if that is done, we notice that the filtering (2.111) would leave the data set unchanged, and BJSM would reduce to the Steiglitz-McBride method, which is not asymptotically efficient. If, on the other hand, it is not assumed that A_o(q) = 1, and an estimate A(q, η̂_N) is obtained as if the noise in the true system were not white, BJSM is still asymptotically efficient. Thus, although an FIR model is asymptotically a sufficient statistic for a system of OE structure (like the ARX model is for BJ structures), it is not possible to make use of that information when applying the BJSM method, since the method does not exploit the full statistical properties of the high order model.

Second, we observe that although BJSM avoids a non-convex optimization problem by applying the Steiglitz-McBride algorithm, it is asymptotically an iterative method, in the sense that it requires the number of Steiglitz-McBride iterations to tend to infinity in order to provide consistent and asymptotically efficient estimates.

These two limitations separate BJSM from the model order reduction methods based on maximum likelihood previously discussed. For those methods, the high order model estimate and its covariance are used as a sufficient statistic, and consistency with asymptotic efficiency can be guaranteed in one iteration. We return to this question in Chapters 4 and 5, where we establish a deeper connection between the maximum likelihood methods in Section 2.3.2 and methods that use least squares iteratively.
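Under the same assumptions as before, BJSM amounts to one high order ARX fit, the prefiltering (2.111), and a call to a Steiglitz-McBride routine; the sketch below reuses the steiglitz_mcbride function sketched in the previous section.

```python
import numpy as np
from scipy.signal import lfilter

def bjsm(y, u, n, mf, ml, sm_iters=20):
    """BJSM sketch: high order ARX prefiltering (2.111) followed by
    Steiglitz-McBride on the prefiltered data, cf. (2.115)."""
    N = len(y)
    # Step 1: high order ARX model A(q) y = B(q) u + e by least squares
    Phi = np.column_stack(
        [-np.r_[np.zeros(k), y[:N-k]] for k in range(1, n + 1)]
        + [np.r_[np.zeros(k), u[:N-k]] for k in range(1, n + 1)])
    eta = np.linalg.lstsq(Phi[n:], y[n:], rcond=None)[0]
    a_hat = np.r_[1.0, eta[:n]]
    # Step 2: prefilter input and output with A(q, eta-hat), cf. (2.111)
    y_pf = lfilter(a_hat, [1.0], y)
    u_pf = lfilter(a_hat, [1.0], u)
    # Step 3: Steiglitz-McBride on the prefiltered data
    return steiglitz_mcbride(y_pf, u_pf, mf, ml, iters=sm_iters)
```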

2.3.4 Subspace Identification Methods

We now proceed to discuss another class of identification methods, known as subspace methods [12]. Various types of subspace identification methods exist in the literature (e.g., [51–55]; an overview is given in [56]). Although the focus of this thesis is not on subspace identification, the method proposed in Chapter 4 has conceptual similarities with the main ideas present in this methodology. With that purpose in mind, we provide an overview of what characterizes these methods, which we will later connect to the method proposed in Chapter 4.

Here, we present Kung's algorithm [57]. Although this method is dated, as more powerful subspace methods have since been proposed in the literature, it captures some important ideas that characterize subspace algorithms. Subspace identification methods are characterized by a rank reduction step based on a singular value decomposition (SVD) of a weighted matrix. Consider the state-space model in (2.28); then, the extended observability and controllability matrices are written as

\[
\mathcal{O}_e = \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{f-1} \end{bmatrix}, \qquad
\mathcal{C}_e = \begin{bmatrix} B & AB & \cdots & A^{p-1}B \end{bmatrix}, \tag{2.116}
\]

respectively, where f and p are user-defined parameters, known as the future and past horizons, whose influence has been discussed in [54].

Also, the impulse response coefficients {g_k^o} of the system, corresponding to the parameters in (2.10), can be used to construct the Hankel matrix

\[
\mathcal{H}(\{g_k^o\}) = \begin{bmatrix}
g_1^o & g_2^o & \cdots & g_p^o \\
g_2^o & g_3^o & \cdots & g_{p+1}^o \\
\vdots & \vdots & & \vdots \\
g_f^o & g_{f+1}^o & \cdots & g_{f+p-1}^o
\end{bmatrix}. \tag{2.117}
\]

A key observation is that H({g_k^o}) = O_e C_e. Thus, it has rank at most equal to the order of the system [58], denoted by n. A truncated version of the impulse response parameters can easily be estimated by applying PEM to an FIR model, as in (2.15) with n_a = 0. Note that, with the notation in (2.15) and n_a = 0, the truncated impulse response parameters correspond to {b_k}_{k=1}^{n_b}. Early versions of subspace identification use this procedure to obtain estimates {ĝ_k} of the impulse response parameters. These coefficients are then used to construct an estimate H({ĝ_k}) of the Hankel matrix. Due to the estimation error, this Hankel matrix will usually have full rank. The key step is to use a singular value decomposition of H({ĝ_k}),

\[
\mathcal{H}(\{\hat g_k\}) = U S V^\top, \tag{2.118}
\]

to form low rank estimates of O_e and C_e. For example, with f = p,

\[
\mathcal{H}(\{\hat g_k\}) = \sum_{k=1}^{p} s_k u_k v_k^\top \approx \sum_{k=1}^{n} s_k u_k v_k^\top = \hat{\mathcal{O}}_e\, \hat{\mathcal{C}}_e, \tag{2.119}
\]

where s_k are the diagonal entries of S, and u_k and v_k are the column vectors of U and V, respectively. Then, the extended observability and controllability matrices can be estimated, for example, by

\[
\hat{\mathcal{O}}_e = \begin{bmatrix} \sqrt{s_1}\, u_1 & \cdots & \sqrt{s_n}\, u_n \end{bmatrix}, \qquad
\hat{\mathcal{C}}_e = \begin{bmatrix} \sqrt{s_1}\, v_1 & \cdots & \sqrt{s_n}\, v_n \end{bmatrix}^\top. \tag{2.120}
\]

In turn, these can be used to estimate the A, B, and C matrices in (2.28) (e.g., by least squares).

Subspace methods estimate the range space of H({g_k^o}), represented by Ô_e. They do not employ local non-linear optimization, and thus do not suffer from problems with local minima. This family of methods is also consistent under general conditions. Statistical (and numerical) properties can be improved by pre- and post-multiplying H({ĝ_k}) with weighting matrices before the SVD. Different methods are characterized by different weighting matrices and by different procedures to estimate the impulse response coefficients.
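A compact sketch of Kung's algorithm follows, assuming a SISO impulse response estimate; A is recovered from the shift-invariance of the estimated observability matrix, which is one common choice among several.

```python
import numpy as np

def kung(g_hat, order, f=None):
    """Kung's algorithm sketch: realize (A, B, C) from impulse response
    estimates g_hat = [g_1, g_2, ...] via SVD of the Hankel matrix (2.117).
    f is the future horizon; the past horizon p is chosen so that
    f + p - 1 = len(g_hat)."""
    if f is None:
        f = len(g_hat) // 2
    p = len(g_hat) - f + 1
    H = np.array([[g_hat[i + j] for j in range(p)] for i in range(f)])
    U, s, Vt = np.linalg.svd(H)
    # Rank reduction (2.119)-(2.120): keep the `order` dominant directions
    sqrt_s = np.sqrt(s[:order])
    Obs = U[:, :order] * sqrt_s          # extended observability estimate
    Con = (Vt[:order, :].T * sqrt_s).T   # extended controllability estimate
    C = Obs[0, :]                        # first block row of Obs
    B = Con[:, 0]                        # first block column of Con
    # Shift-invariance: Obs[1:] = Obs[:-1] @ A, solved by least squares
    A = np.linalg.lstsq(Obs[:-1, :], Obs[1:, :], rcond=None)[0]
    return A, B, C
```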

We now proceed to cover SSARX [55]. This method uses a high order ARX model, which makes it especially connected to the methods in this thesis. Consider, alternatively to the state-space model (2.28), the predictor form

\[
\begin{aligned}
x_{t+1} &= \tilde A\, x_t + B u_t + K y_t, \\
y_t &= C x_t + e_t,
\end{aligned} \tag{2.121}
\]

where

\[
\tilde A = A - KC. \tag{2.122}
\]

The state-space form (2.121) can be used to write the state x_t as

\[
x_t = \sum_{k=0}^{p-1} \tilde A^k \left( K y_{t-k-1} + B u_{t-k-1} \right) + \tilde A^p x_{t-p}. \tag{2.123}
\]

Therefore, for stable Ã, and by choosing p large enough, the state can be estimated by a linear combination of past inputs and outputs,

\[
\hat x_t = \mathcal{K}\, \varphi_t^p, \tag{2.124}
\]

where K is a matrix with unknown coefficients, and

\[
\varphi_t^p := \begin{bmatrix} y_{t-1} & \cdots & y_{t-p} & u_{t-1} & \cdots & u_{t-p} \end{bmatrix}^\top. \tag{2.125}
\]

Then, if an estimate of K is obtained, the state can be reconstructed, and the state-space matrices recovered. If we define

\[
y_t^f := \begin{bmatrix} y_t & y_{t+1} & \cdots & y_{t+f-1} \end{bmatrix}^\top, \tag{2.126}
\]

where f is a user-defined parameter, and with analogous definitions for u_t^f and e_t^f, we can write the relation

\[
y_t^f = \Gamma x_t + \Phi u_t^f + \Psi y_t^f + e_t^f, \tag{2.127}
\]

with

\[
\Gamma = \begin{bmatrix} C \\ C\tilde A \\ \vdots \\ C\tilde A^{f-1} \end{bmatrix}, \qquad
\Phi = \begin{bmatrix}
0 & 0 & \cdots & 0 \\
CB & 0 & & \vdots \\
C\tilde A B & CB & \ddots & \\
\vdots & \ddots & \ddots & 0 \\
C\tilde A^{f-2} B & \cdots & CB & 0
\end{bmatrix}, \qquad
\Psi = \begin{bmatrix}
0 & 0 & \cdots & 0 \\
CK & 0 & & \vdots \\
C\tilde A K & CK & \ddots & \\
\vdots & \ddots & \ddots & 0 \\
C\tilde A^{f-2} K & \cdots & CK & 0
\end{bmatrix}. \tag{2.128}
\]

Replacing the state in (2.127) with its estimate (2.124) yields

\[
y_t^f - \Phi u_t^f - \Psi y_t^f \approx \Gamma \mathcal{K}\, \varphi_t^p + e_t^f. \tag{2.129}
\]

An important observation is that (2.129) describes the stacked outputs of an ARX model, which is in general of infinite order. However, assuming Ã to be stable, we can make the approximation of truncating the ARX model. The main idea of SSARX is then to estimate a high order ARX model to obtain estimates of CÃᵏB and CÃᵏK, for k = 0, 1, ..., f−2, since

\[
b_k = C\tilde A^{k-1} B, \qquad a_k = -C\tilde A^{k-1} K, \tag{2.130}
\]

where a_k and b_k are the ARX model coefficients. Having obtained estimates of these coefficients, they can be used to construct estimates of Φ and Ψ, which we denote by Φ̂ and Ψ̂, respectively. Using these estimates, define

\[
z_t := y_t^f - \hat\Phi u_t^f - \hat\Psi y_t^f \approx \Gamma \mathcal{K}\, \varphi_t^p + e_t^f. \tag{2.131}
\]

Then, (2.131) can be seen as a low rank linear regression problem in ΓK. Recall that the objective is to estimate K, so that we can reconstruct the state sequence.

This is done by performing canonical correlation analysis (CCA) on z_t and φ_t^p as follows. First, define

\[
M = \left( \hat R_{zz} \right)^{-1/2} \hat R_{z\varphi^p} \left( \hat R_{\varphi^p\varphi^p} \right)^{-1/2}, \tag{2.132}
\]

where

\[
\hat R_{z\varphi^p} = \frac{1}{N} \sum_{t=1}^{N} z_t (\varphi_t^p)^\top, \tag{2.133}
\]

and analogously for R̂_zz and R̂_{φ^p φ^p}. Then, compute the singular value decomposition

\[
U S V^\top = M. \tag{2.134}
\]

Finally, the CCA estimate of K is given by

\[
\hat{\mathcal{K}} = V_n^\top \left( \hat R_{\varphi^p\varphi^p} \right)^{-1/2}. \tag{2.135}
\]

The state sequence is then estimated from (2.124), using K̂ as the estimate of K. Having an estimate of the state sequence, the state-space matrices can be estimated by applying least squares to the state-space equations (2.121). In particular, C is obtained by regressing y_t on x̂_t, while A, B, and K are obtained by regressing x̂_{t+1} on x̂_t, u_t, and y_t.

A drawback of subspace identification is that it is not easy to incorporate structural information in the A, B, and C estimates—for example, if, in transfer function form, the desired model has a different number of parameters in the numerator and denominator. Also, these methods are difficult to analyze statistically, although significant contributions have been provided in [54, 59–63]. In general, it is believed that these methods are asymptotically efficient only in special cases. They are also commonly used to provide initial estimates for PEM.
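To make the CCA step (2.132)–(2.135) concrete, the following sketch computes K̂ and the state estimates from data matrices whose columns are z_t and φ_t^p; regularization of the sample covariances, which a practical implementation would need, is omitted.

```python
import numpy as np

def inv_sqrt(R):
    """Inverse symmetric square root of a positive definite matrix."""
    w, V = np.linalg.eigh(R)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

def cca_state_basis(z, phi, order):
    """CCA step (2.132)-(2.135) of SSARX: z and phi hold z_t and phi_t^p
    as columns, for t = 1..N (a sketch)."""
    N = z.shape[1]
    Rpp_is = inv_sqrt(phi @ phi.T / N)           # R_{phi phi}^{-1/2}
    Rzz_is = inv_sqrt(z @ z.T / N)               # R_zz^{-1/2}
    M = Rzz_is @ (z @ phi.T / N) @ Rpp_is        # normalized cross-covariance (2.132)
    U, s, Vt = np.linalg.svd(M)                  # SVD (2.134)
    K_hat = Vt[:order, :] @ Rpp_is               # CCA estimate (2.135)
    return K_hat, K_hat @ phi                    # state estimates via (2.124)
```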

Chapter 3

Asymptotic Properties of the Least Squares Method

As discussed in Chapter 2, the prediction error method is a standard approach to estimate a model description

\[
y_t = G(q,\theta)\, u_t + H(q,\theta)\, e_t, \tag{3.1}
\]

as in (2.14), where G(q, θ) and H(q, θ) are rational transfer functions parametrized by the parameter vector θ. The limitation of this approach is that, for most parametrizations, PEM requires solving a non-convex optimization problem, whose search can converge to non-global minima. However, for some parametrizations, finding the PEM estimate consists of solving a linear regression problem. Consider, for example, the ARX model

\[
\left( 1 + \sum_{k=1}^{n} a_k q^{-k} \right) y_t = \left( \sum_{k=1}^{n} b_k q^{-k} \right) u_t + e_t, \tag{3.2}
\]

introduced in (2.15). Because (3.2) is linear in the model parameters {a_k, b_k}_{k=1}^n, PEM with a quadratic cost function can be minimized explicitly. For the model (3.2), we have that

\[
G(q,\theta) = \frac{\sum_{k=1}^{n} b_k q^{-k}}{1 + \sum_{k=1}^{n} a_k q^{-k}}, \qquad
H(q,\theta) = \frac{1}{1 + \sum_{k=1}^{n} a_k q^{-k}}. \tag{3.3}
\]

Alternatively,

\[
\frac{1}{H(q,\theta)} = 1 + \sum_{k=1}^{n} a_k q^{-k}, \qquad
\frac{G(q,\theta)}{H(q,\theta)} = \sum_{k=1}^{n} b_k q^{-k}. \tag{3.4}
\]

One limitation of this approach is clear from (3.4). If the data are generated by (3.1) at some θ = θ_o such that 1/H(q, θ_o) and G(q, θ_o)/H(q, θ_o) have infinite impulse responses, {a_k, b_k}_{k=1}^n will not be able to capture the complete dynamics of the true system. Consequently, the obtained model estimate will be biased.

However, under the additional assumption that G(q, θ_o) and 1/H(q, θ_o) are stable, their respective impulse response coefficients tend to zero at some rate. Then, if n can be made arbitrarily large, the true system can be captured with arbitrary accuracy by the parameters {a_k, b_k}_{k=1}^n. Thus, asymptotically in n, the obtained model estimate will be consistent. The limitation of making n too large is that, as more parameters are estimated, the variance of the estimated model increases. Nevertheless, as discussed in Chapter 2, high order ARX models are still useful as an intermediate step to obtain a structured model, since their estimate—together with the covariance matrix—forms an asymptotic sufficient statistic for our problem.

One problem is that n cannot simply be set to infinity, due to the practical impossibility of estimating a model with more parameters than the number of available data samples N. The approach taken in [25] is to let n depend on N. Then, for an asymptotic analysis of the estimated model, we let n → ∞ as N → ∞ according to a specified rate. Consequently, even if G(q, θ_o) and 1/H(q, θ_o) have impulse responses of infinite length, the true system is asymptotically in the model set defined by (3.2).

In [25], the asymptotic statistical properties of the estimated ARX model are analyzed. Because this thesis deals with methods that estimate high order ARX models as an intermediate step to obtain the model of interest, the results in [25] will be used for the statistical analysis of the proposed methods. This chapter summarizes the results obtained therein that are of importance for the analysis performed in this thesis. In Section 3.1, we establish some definitions and assumptions. Results related to convergence are presented in Section 3.2, and results related to variance in Section 3.3.

3.1 Definitions and Assumptions

Before introducing the assumption on the true system, we have the following definition.

Definition 3.1 (f_N-stability). Let f_N be a decreasing sequence of positive scalars, such that f_N → 0 as N → ∞. A filter G(q) = Σ_{k=0}^∞ g_k q^{-k} is f_N-stable if

\[
\sum_{k=0}^{\infty} \frac{\| g_k \|_2}{f_k} < \infty. \tag{3.5}
\]

The true system is subject to the following assumption.

Assumption 3.1 (True system). The data generating system has scalar input {u_t} and scalar output {y_t}, and is subject to the scalar noise {e_t}, related by

\[
A_o(q)\, y_t = B_o(q)\, u_t + e_t, \tag{3.6}
\]

where

\[
A_o(q) = 1 + \sum_{k=1}^{\infty} a_k^o\, q^{-k}, \qquad B_o(q) = \sum_{k=1}^{\infty} b_k^o\, q^{-k}. \tag{3.7}
\]

Moreover, A_o(q) and B_o(q) are f_N-stable with f_N = 1/√N—that is,

\[
\sum_{k=1}^{\infty} \sqrt{k}\, |a_k^o| < \infty, \qquad \sum_{k=1}^{\infty} \sqrt{k}\, |b_k^o| < \infty. \tag{3.8}
\]

Note that, for the methods proposed in this thesis, the true system is typically of Box-Jenkins structure, as in (2.24). Assumption 3.1 includes this case by letting

\[
A_o(q) = \frac{1}{H(q,\theta_o)} =: 1 + \sum_{k=1}^{\infty} a_k^o\, q^{-k}, \qquad
B_o(q) = \frac{G(q,\theta_o)}{H(q,\theta_o)} =: \sum_{k=1}^{\infty} b_k^o\, q^{-k}. \tag{3.9}
\]

The input {u_t} will later be assumed to have a stochastic part. Then, we let F_{t−1} be the σ-algebra generated by {e_s, u_s, s ≤ t − 1}. For the noise, the following assumption applies.

Assumption 3.2 (Noise). The noise sequence {e_t} is a stochastic process that satisfies

\[
\mathrm{E}\left[ e_t \mid \mathcal{F}_{t-1} \right] = 0, \qquad
\mathrm{E}\left[ e_t^2 \mid \mathcal{F}_{t-1} \right] = \sigma_o^2, \qquad
\mathrm{E}\left[ |e_t|^{10} \right] \le C, \quad \forall t, \tag{3.10}
\]

for some positive finite constant C.

Before stating the assumption on the input sequence, we introduce the following definition.

Definition 3.2 (f_N-quasi-stationarity). Define

\[
R_{vv}^N(\tau) = \begin{cases}
\dfrac{1}{N} \displaystyle\sum_{t=\tau+1}^{N} v_t v_{t-\tau}^\top, & 0 \le \tau < N, \\[2mm]
\dfrac{1}{N} \displaystyle\sum_{t=1}^{N+\tau} v_{t-\tau} v_t^\top, & -N < \tau < 0, \\[2mm]
0, & \text{otherwise}.
\end{cases} \tag{3.11}
\]

The vector sequence {v_t} is f_N-quasi-stationary if

i) there exists R_vv(τ) such that

\[
\sup_{|\tau| \le N} \left\| R_{vv}^N(\tau) - R_{vv}(\tau) \right\| \le C_1 f_N;
\]

ii) \( \frac{1}{N} \sum_{t=-N}^{N} \| v_t \|^2 \le C_2 \)

for N large enough, where C_1 and C_2 are finite constants.

For the input, the following assumption applies.

Assumption 3.3 (Input). The input sequence {u_t} is defined by u_t = −K_o(q) y_t + r_t, under the following conditions:

i) the sequence {r_t} is independent of {e_t}, f_N-quasi-stationary with f_N = ((log N)/N)^{1/4}, and uniformly bounded: |r_t| ≤ C, ∀t;

ii) let the spectral factorization of the power spectral density of {r_t} be Φ_rr(z) = F_r(z) F_r(z^{-1}); then, F_r(q) is f_N-stable with f_N = 1;

iii) the closed loop system is f_N-stable with f_N = 1/√N;

iv) the feedback transfer function K_o(z) is bounded on the unit circle;

v) the spectral density of the process {[r_t  e_t]^⊤} is bounded from below by the matrix δI, δ > 0 (this implies an informative experiment).

Operation in open loop is obtained by taking K_o(q) = 0. For the model to be estimated, the following assumption applies.

Assumption 3.4 (ARX model order). The model is given by

\[
A(q,\eta^n)\, y_t = B(q,\eta^n)\, u_t + e_t, \tag{3.12}
\]

where

\[
\eta^n = \begin{bmatrix} a_1 & \cdots & a_n & b_1 & \cdots & b_n \end{bmatrix}^\top. \tag{3.13}
\]

The estimate η̂_N^n of the parameter vector η^n is obtained similarly to (2.55), according to

\[
\hat\eta_N^n = \left[ R_N^n \right]^{-1} r_N^n, \tag{3.14}
\]

where

\[
R_N^n = \frac{1}{N} \sum_{t=n+1}^{N} \varphi_t^n (\varphi_t^n)^\top, \qquad
r_N^n = \frac{1}{N} \sum_{t=n+1}^{N} \varphi_t^n\, y_t, \tag{3.15}
\]

and

\[
\varphi_t^n = \begin{bmatrix} -y_{t-1} & \cdots & -y_{t-n} & u_{t-1} & \cdots & u_{t-n} \end{bmatrix}^\top. \tag{3.16}
\]

For the model order n, it holds that it depends on the sample size N—n = n(N)—according to the following conditions:

D1. n(N) → ∞, as N → ∞;

D2. √N d(N) → 0, where d(N) = Σ_{k=n(N)+1}^∞ (|a_k^o| + |b_k^o|), as N → ∞;

D3. n²(N) log N / N → 0, as N → ∞;

D4. n^{4+δ}(N)/N → 0, for some δ > 0, as N → ∞.

Note that, although D4 implies D3, we maintain both conditions for convenience. Although η̂_N^n is obtained by (3.14), a regularized estimate is considered for the statistical analysis, according to

\[
\hat\eta_N^n := \hat\eta_N^{n,\mathrm{reg}} = \left[ R_{\mathrm{reg}}^n(N) \right]^{-1} r_N^n, \tag{3.17}
\]

where

\[
R_{\mathrm{reg}}^n(N) = \begin{cases}
R_N^n, & \text{if } \left\| \left[ R_N^n \right]^{-1} \right\| < 2/\delta, \\
R_N^n + \frac{\delta}{2} I_{2n}, & \text{otherwise},
\end{cases} \tag{3.18}
\]

for some small δ > 0. Asymptotically, the first and second order properties of the regularized estimate and the least squares estimate (3.14) are identical [25]. Also, when n = n(N) according to Assumption 3.4, we denote η̂_N := η̂_N^{n(N)}.
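For reference, the estimator (3.14)–(3.16) with the regularization rule (3.18) can be sketched as follows for scalar data; the default value for δ is a placeholder.

```python
import numpy as np

def arx_ls_regularized(y, u, n, delta=1e-6):
    """High order ARX least squares estimate (3.14)-(3.16) with the
    regularization rule (3.18) (a sketch for scalar y, u)."""
    N = len(y)
    # Regressor phi_t^n = [-y_{t-1} ... -y_{t-n}  u_{t-1} ... u_{t-n}]^T
    Phi = np.column_stack(
        [-np.r_[np.zeros(k), y[:N-k]] for k in range(1, n + 1)]
        + [np.r_[np.zeros(k), u[:N-k]] for k in range(1, n + 1)])
    Phi, yv = Phi[n:], y[n:]                  # sums start at t = n + 1
    R = Phi.T @ Phi / N                       # R_N^n in (3.15)
    r = Phi.T @ yv / N                        # r_N^n in (3.15)
    # Regularize if R is close to singular, cf. (3.18)
    if np.linalg.norm(np.linalg.pinv(R), 2) >= 2.0 / delta:
        R = R + (delta / 2.0) * np.eye(2 * n)
    return np.linalg.solve(R, r)              # eta-hat, (3.14)/(3.17)
```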

3.2 Convergence Results

In this section, we present results from [25] related to convergence. Before introducing the first result, we define R̄ⁿ and r̄ⁿ by

\[
\begin{aligned}
R_N^n &\to \left( \bar R^n := \bar{\mathrm{E}}\left[ \varphi_t^n (\varphi_t^n)^\top \right] \right), && \text{as } N \to \infty, \text{ w.p.1}, \\
r_N^n &\to \left( \bar r^n := \bar{\mathrm{E}}\left[ \varphi_t^n\, y_t \right] \right), && \text{as } N \to \infty, \text{ w.p.1},
\end{aligned} \tag{3.19}
\]

where the order n is fixed. The fact that these limits exist is a consequence of Assumptions 3.1, 3.2, and 3.3, and [25, Theorem 4.1]. The following lemma concerns convergence of R_N^{n(N)}.

Lemma 3.1. Let Assumptions 3.1, 3.2, 3.3, and 3.4 hold. Then, w.p.1,

\[
\left\| R_N^{n(N)} - \bar R^{n(N)} \right\| \le C \left[ 2 n(N) \sqrt{\frac{\log N}{N}} + \frac{n^2(N)}{N} \right]. \tag{3.20}
\]

Proof. See [25, Lemma 4.1].

Lemma 3.2. Let Assumptions 3.1, 3.2, 3.3, and 3.4 hold. Then, there exists N̄ such that, w.p.1,

\[
\left\| R_N^n \right\| \le C, \qquad \forall n, \ \forall N > \bar N. \tag{3.21}
\]

Proof. See [25, Lemma 4.2].

Before stating the next result, let

\[
\bar\eta^n := \left[ \bar R^n \right]^{-1} \bar r^n. \tag{3.22}
\]

Note that, from (3.14) and (3.19), we have that, for fixed n,

\[
\hat\eta_N^n \to \bar\eta^n, \quad \text{as } N \to \infty, \text{ w.p.1}. \tag{3.23}
\]

The following theorem concerns the convergence of η̂_N.

Theorem 3.1. Let Assumptions 3.1, 3.2, 3.3, and 3.4 hold. Then,

\[
\left\| \hat\eta_N - \bar\eta^{n(N)} \right\| = \mathcal{O}\left( n(N) \sqrt{\frac{\log N}{N}} \left[ 1 + d(N) \right] \right), \tag{3.24}
\]

and

\[
\hat\eta_N - \bar\eta^{n(N)} \to 0, \quad \text{as } N \to \infty, \text{ w.p.1}. \tag{3.25}
\]

Proof. For the first part, see [25, Theorem 5.1]. The second part follows from the fact that

\[
n(N) \sqrt{\frac{\log N}{N}} \left[ 1 + d(N) \right] \to 0, \quad \text{as } N \to \infty, \tag{3.26}
\]

using Conditions D2 and D3 in Assumption 3.4.

The following lemma concerns the error made, for fixed n, between the limit vector η̄ⁿ and the truncated true parameter vector

\[
\eta_o^n := \begin{bmatrix} a_1^o & \cdots & a_n^o & b_1^o & \cdots & b_n^o \end{bmatrix}^\top. \tag{3.27}
\]

Lemma 3.3. Let Assumptions 3.1, 3.2, and 3.3 hold. Then,

\[
\left\| \bar\eta^n - \eta_o^n \right\| \le C \sum_{k=n+1}^{\infty} \left( |a_k^o| + |b_k^o| \right) \to 0, \quad \text{as } n \to \infty. \tag{3.28}
\]

Proof. The lemma is a direct consequence of [25, Lemma 5.1] and Condition D2 in Assumption 3.4.

3.3 Variance Results

The following theorem concerns the asymptotic distribution and covariance of variables of the form θ = Υⁿηⁿ, where Υⁿ is an m × 2n deterministic matrix, with m fixed.

Theorem 3.2. Let Assumptions 3.1, 3.2, 3.3, and 3.4 hold. Also, let ||Υⁿ|| be bounded independently of the dimension 2n. Then, we have that

\[
\sqrt{N}\, \Upsilon^n \left( \hat\eta_N - \bar\eta^{n(N)} \right) \sim \mathrm{As}\,\mathcal{N}(0, P), \tag{3.29}
\]

where

\[
P = \sigma_o^2 \lim_{n \to \infty} \Upsilon^n \left[ \bar R^n \right]^{-1} \left( \Upsilon^n \right)^\top, \tag{3.30}
\]

if the limit exists.

Proof. See [25, Theorem 7.3].

Chapter 4

Weighted Null-Space Fitting

In this chapter, we present the weighted null-space fitting (WNSF) method, a three step weighted least squares method for parameter estimation of structured models. This is the main contribution of this thesis. Essentially, the method consists of the following steps. In the first step, a high order ARX model is estimated. In the second step, this high order estimate is reduced to a structured one by least squares. In the third step, the structured model is re-estimated by weighted least squares. We argue that this method lies at the crossroads between PEM and subspace methods and that it shares favorable properties with both domains—namely, asymptotic efficiency and flexibility in the parametrization with PEM, and guaranteed convergence with subspace methods. The algorithm also has close similarities with the iterative least squares methods in Section 2.3.3.

The chapter is organized as follows. In Section 4.1, we introduce the problem statement with an output-error model. Although this is only a particular case of a Box-Jenkins model, it allows for a more intuitive explanation of the motivation for the method, which is provided in Section 4.2, where we compare it with subspace and maximum likelihood methods. In Section 4.3, the algorithm is presented for both output-error models and, more generally, Box-Jenkins models, where it is also pointed out that it is applicable to other model structures, such as ARMAX. In Section 4.4, we analyze the asymptotic properties of the method. In Section 4.5, we illustrate the method and discuss practical aspects with a simulation example. Finally, we summarize the main conclusions in Section 4.7. The contributions in this chapter are mostly based on [64, 65].

4.1 Problem Statement

Consider an output-error model, which is a particular case of the Box-Jenkins model (2.24), with C(q, θ) = 1 = D(q, θ), given by

\[
y_t = \frac{L(q,\theta)}{F(q,\theta)}\, u_t + e_t, \tag{4.1}
\]

where F(q, θ) and L(q, θ) are polynomials in the delay operator q^{-1}, as

\[
\begin{aligned}
L(q,\theta) &= l_1 q^{-1} + \cdots + l_{m_l} q^{-m_l}, \\
F(q,\theta) &= 1 + f_1 q^{-1} + \cdots + f_{m_f} q^{-m_f},
\end{aligned} \tag{4.2}
\]

and the sequence {e_t} is Gaussian white noise with variance σ_o². The problem is to estimate the coefficients of the polynomials F(q, θ) and L(q, θ),

\[
\theta = \begin{bmatrix} f_1 & \cdots & f_{m_f} & l_1 & \cdots & l_{m_l} \end{bmatrix}^\top \in \mathbb{R}^{m_f + m_l}, \tag{4.3}
\]

using the known input sequence {u_t} and the observed output {y_t}, for t = 1, 2, ..., N. We assume that the output sequence was generated by (4.1) with some θ = θ_o. At the true parameter values θ_o, we define F_o(q) := F(q, θ_o) and L_o(q) := L(q, θ_o).

This is a standard problem in system identification, and various methods to solve it exist in the literature. Some of these methods were covered in Chapter 2, where their advantages and drawbacks were pointed out. In this chapter, we propose a method that we consider to be formally between PEM and subspace identification, sharing favorable properties with both domains. Conceptually, the method is close to the iterative least squares methods discussed in Section 2.3.3.

4.2 Motivation

As motivation for the proposed method, we consider the Hankel matrix H({g_k^o}) in (2.117), as subspace methods do, but we use the information it contains in a different way. As covered in Section 2.3.4, subspace methods use the information about the system contained in the range space of H({g_k^o}). Also, it was pointed out that a limitation of these methods is the difficulty of including structural information—for example, if F(q, θ) and L(q, θ) in (4.2) are parametrized with m_f ≠ m_l. With this in mind, we will instead look into the null-space of H({g_k^o}) and observe that it contains information about the structure of the system.

We begin by expanding the rational transfer function L_o(q)/F_o(q) as

\[
\frac{L_o(q)}{F_o(q)} = \sum_{k=1}^{\infty} g_k^o\, q^{-k}. \tag{4.4}
\]

Then, using (4.2), we can write

\[
\left( 1 + f_1^o q^{-1} + \cdots + f_{m_f}^o q^{-m_f} \right) \sum_{k=1}^{\infty} g_k^o\, q^{-k} = l_1^o q^{-1} + \cdots + l_{m_l}^o q^{-m_l}, \tag{4.5}
\]

where f_1^o, ..., f_{m_f}^o, l_1^o, ..., l_{m_l}^o are the parameters in θ evaluated at θ_o. This can be expressed using H({g_k^o}) in the following way (to keep matters simple, we consider


m_f = 2 and m_l = 1). First, write (4.5) in matrix form, as

\[
\begin{bmatrix}
0 & 0 & g_1^o \\
0 & g_1^o & g_2^o \\
g_1^o & g_2^o & g_3^o \\
g_2^o & g_3^o & g_4^o \\
\vdots & \vdots & \vdots
\end{bmatrix}
\begin{bmatrix} f_2^o \\ f_1^o \\ 1 \end{bmatrix}
=
\begin{bmatrix} l_1^o \\ 0 \\ 0 \\ \vdots \end{bmatrix}. \tag{4.6}
\]

Then, extend it by adding columns:

\[
\begin{bmatrix}
0 & 0 & g_1^o & g_2^o & \cdots \\
0 & g_1^o & g_2^o & g_3^o & \cdots \\
g_1^o & g_2^o & g_3^o & g_4^o & \cdots \\
g_2^o & g_3^o & g_4^o & g_5^o & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}
\begin{bmatrix}
f_2^o & 0 & 0 & \cdots \\
f_1^o & f_2^o & 0 & \cdots \\
1 & f_1^o & f_2^o & \cdots \\
0 & 1 & f_1^o & \cdots \\
0 & 0 & 1 & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix}
=
\begin{bmatrix}
l_1^o & 0 & 0 & \cdots \\
0 & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix}. \tag{4.7}
\]

We recognize that the lower block of the {g_k^o}-dependent matrix in (4.7) is H({g_k^o}). Thus, we see that the null-space of H({g_k^o}) determines f_1^o and f_2^o. Therefore, using an estimate {ĝ_k} of {g_k^o} (which can be obtained, for example, by estimating a high order FIR model), it is possible to construct H({ĝ_k}). As starting point, one can, in a first step, estimate the null-space of H({ĝ_k}), and then obtain estimates of f_1 and f_2.

This method has been outlined in [66]. As in the subspace methods reviewed in Section 2.3.4, which estimate the range space of H({ĝ_k}), the SVD of H({ĝ_k}) can also be used here to obtain an estimate of the space of interest (the null-space). Likewise, the noisy Hankel matrix can be pre- and post-multiplied by weighting matrices that influence the statistical properties.

An advantage of this procedure is that it is flexible in parametrization; for example, we can specify structures with m_f ≠ m_l, which is not possible with the standard approach to subspace identification. To see this, notice that the first equation in (4.6) relates to l_1^o. If there are more l-parameters, there will be additional equations that determine these parameters.

Despite this advantage, this procedure still shares with subspace methods the characteristic that it is not, in general, asymptotically efficient. We believe that the problem can be traced to the weighting that is applied to the noisy matrix for which the SVD is computed. Since it is a matrix weighting, it cannot be tailored to the statistical properties of the individual elements of the matrix in question.

With this in mind, we will start afresh from (4.6), but will not add any more columns. This will allow us to tailor the weighting exactly to the statistics of {ĝ_k}, paving the way for an asymptotically efficient estimate. Proceeding with the same example, we define

\[
g^o := \begin{bmatrix} g_1^o & g_2^o & g_3^o & g_4^o & \cdots \end{bmatrix}^\top \tag{4.8}
\]

and re-write (4.5) as

\[
\underbrace{\left( g_1^o - l_1^o \right)}_{=:\, z_1(g^o,\theta_o)} q^{-1} + \underbrace{\left( f_1^o g_1^o + g_2^o \right)}_{=:\, z_2(g^o,\theta_o)} q^{-2} + \cdots = 0, \tag{4.9}
\]

where

\[
\theta = \begin{bmatrix} f_1 & f_2 & l_1 \end{bmatrix}^\top. \tag{4.10}
\]

Defining the vector z(g^o, θ_o) by

\[
z(g^o, \theta_o) := \begin{bmatrix} z_1(g^o,\theta_o) & z_2(g^o,\theta_o) & z_3(g^o,\theta_o) & \cdots \end{bmatrix}^\top = 0, \tag{4.11}
\]

we reformulate (4.6) as

\[
z(g^o, \theta_o) = \begin{bmatrix}
g_1^o & 0 & 0 & -1 \\
g_2^o & g_1^o & 0 & 0 \\
g_3^o & g_2^o & g_1^o & 0 \\
g_4^o & g_3^o & g_2^o & 0 \\
\vdots & \vdots & \vdots & \vdots
\end{bmatrix}
\begin{bmatrix} 1 \\ f_1^o \\ f_2^o \\ l_1^o \end{bmatrix} = 0. \tag{4.12}
\]

From (4.12), it is clear that the parameter values we want to estimate are part of the null-space of a matrix that is a function of the impulse response of the system. Since it is a null-space, the statistical considerations can be tailored better than in the range space approach. To see how, we re-write (4.12) as

\[
z(g^o, \theta_o) = g^o - Q_\infty(g^o)\, \theta_o = 0, \tag{4.13}
\]

where

\[
Q_\infty(g^o) = \begin{bmatrix}
0 & 0 & 1 \\
-g_1^o & 0 & 0 \\
-g_2^o & -g_1^o & 0 \\
-g_3^o & -g_2^o & 0 \\
\vdots & \vdots & \vdots
\end{bmatrix}. \tag{4.14}
\]

Here, we observe that (4.13) is linear in θ. Therefore, if the impulse response is available, θ is easily obtained. This is true even if we truncate g^o to

\[
g_o^n := \begin{bmatrix} g_1^o & \cdots & g_n^o \end{bmatrix}^\top. \tag{4.15}
\]

In this case, we write

\[
z(g_o^n, \theta_o) = g_o^n - Q_n(g_o^n)\, \theta_o = 0, \tag{4.16}
\]

( ) = − ( ) = 4.2. Motivation 47 where 0 0 1 ⎡ g1 0 0⎤ ⎢ ⎥ ⎢ g g 0⎥ n ⎢ 2 1 ⎥ Qn g ⎢ − ⎥ . (4.17) ⎢ g3 g2 0⎥ ⎢ ⎥ ⎢ − − ⎥ ( ) = ⎢ ⎥ ⎢ − − ⎥ ⎢ ⎥ ⎢ gn 2 gn 1 0⎥ ⎢ ⋮ ⋮ ⋮ ⎥ ⎢ ⎥ The truncated impulse response can⎢ be− estimated− with⎥ a finite impulse response ⎣− − ⎦ (FIR) model—that is, (2.19) with na 0. This is not computationally costly, since an FIR model can be obtained with PEM by solving a linear regression problem. = n n However, when (4.16) is computed with an estimate gˆN of go , we have that n n n z gˆN , θo gˆN Qn gˆN θo, (4.18) no longer equals zero. Rather,( (4.16)) = and (4.18)− ( can) be used to write n n n z gˆN , θo z gˆN , θo z go , θo n n n n gˆN go Qn gˆN Qn gˆo θo (4.19) ( ) = ( )n− ( n ) Tn θ gˆ g , = ( −o )N− [ o ( ) − ( )] where = ( )( − ) 1 0 0 0 ⎡f1 1 0 ⎤ ⎢ ⋯ ⋯ ⎥ ⎢f f 1 ⎥ ⎢ 2 1 ⎥ Tn θ ⎢ ⋮ ⎥ . (4.20) ⎢ 0 f2 f1 ⎥ ⎢ ⎥ ⎢ ⋱ ⋮ ⎥ ( ) = ⎢ 0⎥ ⎢ ⋱ ⋱ ⋮ ⎥ ⎢ ⎥ ⎢ 0 0 0 f1 1⎥ ⎢ ⋮ ⋱ ⋱ ⎥ ⎢ n ⎥ n n The key observation here is that z⎢gˆN , θo is linear in the noise⎥ term gˆN go . Moreover, n n ⎣ ⋯ ⎦ although gˆN go is unknown, we may assume that its asymptotic statistical properties are known; in particular, we assume( it) to be Gaussian distributed with− zero mean and a certain− covariance matrix Pg, obtained from the least squares estimator—that is, n n N gˆN go As 0,Pg . (4.21) √ P This assumption is motivated by( (2.57)− ,) where∼ N the( covariance) matrix g is specified. The difference here is that the true system is not in the model set defined by the FIR model. However, the assumption is reasonable if, for n large enough, the truncation error is small (a formal argument is given later). Assuming (4.21), we then have that n Nz gˆN , θo As 0,Tn θo PgTn θo . (4.22) ⊺ √ z gn , θ θ Knowing the statistics( of ˆ)N∼ o Nallows( solving( ) for( ))using a maximum like- lihood (ML) approach. First, we observe that, by taking the order n of the FIR ( ) 48 Weighted Null-Space Fitting

First, we observe that, by taking the order n of the FIR model large enough, ĝ_N^n is almost a sufficient statistic for our problem. The error only lies in the truncation of the impulse response. Thus, ĝ_N^n can be used to obtain an estimate of θ that is, in some sense, almost asymptotically efficient. Second, we observe that the map from z(ĝ_N^n, θ_o) to ĝ_N^n is unique. Therefore, we can use z(ĝ_N^n, θ_o) to obtain an estimate of θ that is almost asymptotically efficient. This we can achieve by considering the ML estimate of θ based on z(ĝ_N^n, θ). Analogously to (2.66), the log-likelihood function of z(ĝ_N^n, θ) is given by

\[
\mathcal{L}(\theta) = \log C - \frac{1}{2} \log\left( 2\pi \det P_z(\theta) \right) - \frac{1}{2} \left[ z(\hat g_N^n, \theta) \right]^\top P_z^{-1}(\theta)\, z(\hat g_N^n, \theta), \tag{4.23}
\]

where

\[
P_z(\theta) = T_n(\theta)\, P_g\, T_n^\top(\theta) \tag{4.24}
\]

and C is a constant. This is a non-convex problem, as both z(ĝ_N^n, θ) and T_n(θ) depend on θ. However, the parameter-dependent part in the second term of (4.23) is constant—namely, det T_n(θ) = 1. Thus, the first two terms are not parameter-dependent, and we only need to maximize the last term. This is equivalent to (2.67), where we discussed that a higher order model could be reduced to a structured model by optimizing an ML criterion. Although this is still a non-convex problem, we have now written it in a way that allows us to use iterative weighted least squares instead, as will be detailed in the next section.

We now present our method, which we call weighted null-space fitting (WNSF). We argue that the proposed approach can be used to obtain an asymptotically efficient estimate.
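Before turning to the algorithm, the null-space relation (4.16) can be checked numerically; the system below is hypothetical, chosen only for illustration with m_f = 2 and m_l = 1.

```python
import numpy as np
from scipy.signal import lfilter

# Numerical check of (4.16): the true impulse response satisfies
# g - Q_n(g) theta_o = 0 for the example m_f = 2, m_l = 1.
f1, f2, l1 = -1.1, 0.3, 0.5        # F(q) = 1 + f1 q^-1 + f2 q^-2 (stable), L(q) = l1 q^-1
n = 20
imp = np.zeros(n + 1); imp[0] = 1.0
g = lfilter([0.0, l1], [1.0, f1, f2], imp)[1:]   # g_1, ..., g_n of L/F

# Build Q_n(g) as in (4.17)
Q = np.zeros((n, 3))
Q[0, 2] = 1.0                       # the l_1 column
Q[1:, 0] = -g[:-1]                  # -g_{k-1} in the f_1 column
Q[2:, 1] = -g[:-2]                  # -g_{k-2} in the f_2 column
theta_o = np.array([f1, f2, l1])

print(np.max(np.abs(g - Q @ theta_o)))   # ~ 1e-16: z(g, theta_o) = 0
```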

4.3 The Algorithm

In this section, we present the WNSF method in detail. It is a three step weighted least squares method, consisting of the following steps. First, a high order ARX model is computed by least squares. Second, this model is reduced to a structured estimate using the least squares method. Third, the structured estimate is re-estimated, using weighted least squares, with the weighting matrix obtained from the first structured estimate.

4.3.1 Output-Error

For a more intuitive understanding, we first treat the OE case, which does not require as heavy notation as the BJ case.

Step 1: High order FIR model

Consider the FIR model

\[
y_t = \sum_{k=1}^{n} g_k\, q^{-k} u_t + e_t. \tag{4.25}
\]


Using PEM, the parameter vector

\[
g^n := \begin{bmatrix} g_1 & \cdots & g_n \end{bmatrix}^\top \tag{4.26}
\]

can be estimated using a least squares approach, as (4.25) is only a particular case of the ARX model (2.49). Using the same steps as for the ARX model, we re-write (4.25) in the regressor form (2.49), as

\[
y_t = (\varphi_t^n)^\top g^n + e_t, \tag{4.27}
\]

where

\[
\varphi_t^n = \begin{bmatrix} u_{t-1} & \cdots & u_{t-n} \end{bmatrix}^\top. \tag{4.28}
\]

Then, the least squares estimate of g^n, analogously to (2.55), is obtained by

\[
\hat g_N^n = \left[ R_N^n \right]^{-1} r_N^n, \tag{4.29}
\]

with R_N^n and r_N^n as in (2.56)—that is,

\[
R_N^n = \frac{1}{N} \sum_{t=n+1}^{N} \varphi_t^n (\varphi_t^n)^\top, \qquad
r_N^n = \frac{1}{N} \sum_{t=n+1}^{N} \varphi_t^n\, y_t. \tag{4.30}
\]

If initial conditions are known, the sums in (4.30) may be started at t = 1. We recall from Chapter 3 that, as the sample size increases, we have

\[
\begin{aligned}
R_N^n &\to \left( \bar R^n := \bar{\mathrm{E}}\left[ \varphi_t^n (\varphi_t^n)^\top \right] \right), && \text{as } N \to \infty, \text{ w.p.1}, \\
r_N^n &\to \left( \bar r^n := \bar{\mathrm{E}}\left[ \varphi_t^n\, y_t \right] \right), && \text{as } N \to \infty, \text{ w.p.1}.
\end{aligned} \tag{4.31}
\]

Then, we also have that

\[
\hat g_N^n \to \bar g^n = \left[ \bar R^n \right]^{-1} \bar r^n, \quad \text{as } N \to \infty, \text{ w.p.1}. \tag{4.32}
\]

Regarding the asymptotic distribution of the estimates, we have

\[
\sqrt{N} \left( \hat g_N^n - \bar g^n \right) \sim \mathrm{As}\,\mathcal{N}\left( 0,\, \sigma_o^2 \left[ \bar R^n \right]^{-1} \right). \tag{4.33}
\]

If the order n is chosen large enough, (4.25) can model (4.1) with arbitrary accuracy. However, the drawback is that the unstructured estimate (4.29) may have high variance, especially when n has to be chosen large.

Step 2: Reduction to an OE model

The second step of the WNSF method is to use (4.16) to obtain an estimate of θ from ĝ_N^n. If g_o^n were known, it would be satisfied that

\[
\theta_o = \left( Q_n^\top(g_o^n)\, Q_n(g_o^n) \right)^{-1} Q_n^\top(g_o^n)\, g_o^n. \tag{4.34}
\]

With an estimate ĝ_N^n, we use a least squares approach to obtain an estimate of θ_o:

\[
\hat\theta_N^{\mathrm{LS}} = \left( Q_n^\top(\hat g_N^n)\, Q_n(\hat g_N^n) \right)^{-1} Q_n^\top(\hat g_N^n)\, \hat g_N^n. \tag{4.35}
\]

Step 3: Re-estimation of the OE model

The third step consists of re-estimating θ in a statistically sound way. Recall that, when ĝ_N^n replaces g_o^n in (4.16), we obtain (4.19), which are the residuals we attempt to minimize. These residuals are distributed according to (4.22), but we still have to determine the matrix P_g in (4.21). If the true system were in the model set defined by the FIR model (4.25), then, because PEM is a consistent estimator, we would have that ḡⁿ = g_o^n, where ḡⁿ is given by (4.32). Then, from (4.33), the statistical properties of ĝ_N^n − g_o^n would be known. The difference between ḡⁿ and g_o^n is due to the bias error induced by truncation of the impulse response to n parameters. Because this bias error can be made arbitrarily small by taking n arbitrarily large, the difference between these vectors should, in some sense, be close to zero for sufficiently large n. Then, we will approximately consider that

\[
\sqrt{N} \left( \hat g_N^n - g_o^n \right) \sim \mathrm{As}\,\mathcal{N}\left( 0,\, \sigma_o^2 \left[ \bar R^n \right]^{-1} \right), \tag{4.36}
\]

from where we observe that P_g in (4.21) is given by P_g = σ_o² [R̄ⁿ]^{-1}. Then, for the residuals (4.19), we have

\[
\sqrt{N}\, z(\hat g_N^n, \theta_o) \sim \mathrm{As}\,\mathcal{N}\left( 0,\, T_n(\theta_o)\, \sigma_o^2 \left[ \bar R^n \right]^{-1} T_n^\top(\theta_o) \right). \tag{4.37}
\]

Knowing the statistical properties of z(ĝ_N^n, θ_o), θ can be estimated by weighted least squares, where the estimate with minimum variance is given by solving the weighted least squares problem

\[
\hat\theta_N^{\mathrm{WLS_o}} = \left( Q_n^\top(\hat g_N^n)\, \bar W_n(\theta_o)\, Q_n(\hat g_N^n) \right)^{-1} Q_n^\top(\hat g_N^n)\, \bar W_n(\theta_o)\, \hat g_N^n, \tag{4.38}
\]

where the weighting matrix

\[
\bar W_n(\theta_o) = \left( T_n(\theta_o)\, \sigma_o^2 \left[ \bar R^n \right]^{-1} T_n^\top(\theta_o) \right)^{-1} \tag{4.39}
\]

is the inverse of the covariance of the residuals z(ĝ_N^n, θ_o) in (4.19) [67]. This corresponds to maximizing the log-likelihood function (4.23) with P_z(θ_o)—that is, with P_z(θ) fixed at the true parameters.

In practice, θ_o is not available, so we use its estimate θ̂_N^LS instead. Likewise, instead of R̄ⁿ, we use the sample covariance R_N^n. Also, σ_o² can be disregarded, since a scalar contribution to the weighting does not change the solution of a weighted least squares problem. Thus, the third step consists of re-estimating θ by

\[
\hat\theta_N^{\mathrm{WLS}} = \left( Q_n^\top(\hat g_N^n)\, W_n(\hat\theta_N^{\mathrm{LS}})\, Q_n(\hat g_N^n) \right)^{-1} Q_n^\top(\hat g_N^n)\, W_n(\hat\theta_N^{\mathrm{LS}})\, \hat g_N^n, \tag{4.40}
\]

where

\[
W_n(\hat\theta_N^{\mathrm{LS}}) = T_n^{-\top}(\hat\theta_N^{\mathrm{LS}})\, R_N^n\, T_n^{-1}(\hat\theta_N^{\mathrm{LS}}). \tag{4.41}
\]

Recalling the log-likelihood function (4.23), (4.40) maximizes (4.23) if we fix P_z(θ) to P_z(θ̂_N^LS). Although, as we will show in Section 4.4, one iteration is enough to obtain an asymptotically efficient estimate, it is also possible to continue to iterate and attempt to maximize (4.23) by solving

\[
\hat\theta_N^{k+1} = \left( Q_n^\top(\hat g_N^n)\, W_n(\hat\theta_N^{k})\, Q_n(\hat g_N^n) \right)^{-1} Q_n^\top(\hat g_N^n)\, W_n(\hat\theta_N^{k})\, \hat g_N^n \tag{4.42}
\]

iteratively, where θ̂ᵏ_N is the estimate obtained at iteration k. This technique is known as iterative weighted least squares, and it can be beneficial for finite sample sizes, as discussed in Section 4.5.

The proposed method is related to the IQML algorithm presented in Section 2.3.3. However, WNSF performs simultaneous estimation of L(q, θ) and F(q, θ) and considers the statistical properties of the impulse response estimate. As a consequence, the obtained estimates are asymptotically efficient.

Moreover, there is a strong conceptual similarity with the weighted subspace fitting method [40, 41] mentioned in Section 2.3.3. For that method, the optimal weighting for a subspace fitting problem is also derived, and it is shown that a consistent estimate of the weighting matrix is sufficient to asymptotically minimize the variance of the obtained estimates. The proposed method takes a similar approach, but starts by considering the null-space of a certain matrix; hence, we name it weighted null-space fitting.

Summary

The WNSF method consists, for OE models, of the following steps (a code sketch follows the list):

1. estimate a high order FIR model by least squares, according to (4.29);

2. reduce the high order model to an OE model, according to (4.35);

3. re-estimate the OE model, according to (4.40).
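The sketch below renders these three steps in numpy for the parametrization of Section 4.2 generalized to arbitrary m_f and m_l. It is an illustration under the stated assumptions (scalar data, known orders, zero initial conditions), not a complete implementation.

```python
import numpy as np
from scipy.linalg import toeplitz

def wnsf_oe(y, u, n, mf, ml, iters=1):
    """WNSF sketch for the OE model (4.1): FIR estimation (4.29), least
    squares reduction (4.35), weighted least squares re-estimation
    (4.40)-(4.42). theta = [f_1..f_mf, l_1..l_ml]; requires n > mf."""
    N = len(y)
    # Step 1: high order FIR model by least squares
    Phi = np.column_stack(
        [np.r_[np.zeros(k), u[:N-k]] for k in range(1, n + 1)])
    Phi, yv = Phi[n:], y[n:]
    R = Phi.T @ Phi / N                      # R_N^n in (4.30)
    g = np.linalg.solve(R, Phi.T @ yv / N)   # FIR estimate (4.29)

    # Q_n(g), cf. (4.16)-(4.17): row k reads
    # g_k = -f_1 g_{k-1} - ... - f_mf g_{k-mf} + l_k  (l_k = 0 for k > ml)
    Q = np.zeros((n, mf + ml))
    for j in range(mf):
        Q[j+1:, j] = -g[:n-j-1]
    Q[:ml, mf:] = np.eye(ml)

    # Step 2: least squares reduction (4.35)
    theta = np.linalg.lstsq(Q, g, rcond=None)[0]

    # Step 3: weighted least squares (4.40)-(4.42)
    for _ in range(iters):
        Tcol = np.r_[1.0, theta[:mf], np.zeros(n - mf - 1)]
        T = toeplitz(Tcol, np.zeros(n))      # T_n(theta) as in (4.20)
        Tinv = np.linalg.inv(T)
        W = Tinv.T @ R @ Tinv                # weighting (4.41)
        A = Q.T @ W @ Q
        theta = np.linalg.solve(A, Q.T @ W @ g)
    return theta
```

With iters=1, the call corresponds to the single weighted step (4.40); larger values implement the iterations (4.42).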

A limitation here is the assumption that the system is affected by a white disturbance. In the next subsection, we relax this assumption by considering colored disturbances. In particular, we assume that the system is of Box-Jenkins (BJ) structure.

4.3.2 Box-Jenkins

Consider the BJ model

\[
y_t = \frac{L(q,\theta)}{F(q,\theta)}\, u_t + \frac{C(q,\theta)}{D(q,\theta)}\, e_t, \tag{4.43}
\]

where

\[
\begin{aligned}
C(q,\theta) &= 1 + c_1 q^{-1} + \cdots + c_{m_c} q^{-m_c}, \\
D(q,\theta) &= 1 + d_1 q^{-1} + \cdots + d_{m_d} q^{-m_d},
\end{aligned} \tag{4.44}
\]

and F(q, θ) and L(q, θ) are according to (4.2). Here, the parameter vector we are interested in estimating is

θ f1 fmf l1 lml (4.45) mf ml mc md . = [c1 ⋯ cmc d1 ⋯ dmd R ⊺ + + +

We assume that data is⋯ generated according⋯ to] ∈(4.43) for some θ θo, at which we define Co q C q, θo and Do q D q, θo , while maintaining analogous definitions for Fo q and Lo q from the OE case. = Consider also( ) ∶ the= ARX( ) model ( ) ∶= ( ) ( ) ( ) n n A q, η yt B q, η ut et, (4.46) where ( ) = ( ) + n 1 n A q, η 1 a1q anq , (4.47) B q, ηn b q −1 b q −n, ( ) = + 1 + ⋯ + n − − and the parameter vector to( be estimated) = is + ⋯ +

n η a1 an b1 bn . (4.48) ⊺ If we let the order n tend=  to infinity,⋯ the ARX⋯ model (4.46) approximates the BJ model (4.43) arbitrarily well. In particular, with

ηo a1 a2 b1 b2 , (4.49) ⊺ we have that =  ⋯ ⋯ L q, θ B q, η C q, θ 1 o o , o . (4.50) F q, θo A q, ηo D q, θo A q, ηo ( ) ( ) ( ) = = In other words, (4.43)( evaluated) ( at θ ) θo and( (4.46)) evaluated( ) at η ηo generate the same output. = = Step 1: High order ARX model Step 1 of the algorithm consists in estimating the parameter ηn of the high order ARX model, analogously to the OE case with a high order FIR model. This is done by writing the ARX model in its regressor form (2.49),

n n yt ϕt η et, (4.51) ⊺ n where ϕt is, as in (2.50), given by = ( ) +

n ϕt yt 1 yt n ut 1 ut n . (4.52) ⊺ = − − ⋯ − − − ⋯ −  4.3. The Algorithm 53

Then, as in (2.55), we compute n n 1 n ηˆN RN rN , (4.53) − n n n with RN and rN defined as in (4.30),= but[ with] the corresponding ϕt from (4.52). As n n in the FIR case, (4.31) still holds with corresponding RN and rN , and, analogously to (4.32), we now write n n ¯n 1 n ηˆN η¯ R r¯ , as N , w.p.1. (4.54) − Regarding the asymptotic→ distribution=   of the→ estimates,∞ we have, analogously to (4.33), n n 2 ¯n 1 N ηˆN η¯ As 0, σo R . (4.55) √ − Step 2: Reduction to a BJ( model− ) ∼ N Š    Inserting the respective polynomial expressions in (4.50), we can write

1 coq 1 co q mc 1 ao q k 1 doq 1 do q md 0, 1 mc ∞ k 1 md − − k 1 − − − ( + + ⋯ + ) Œ + Q ‘ − ( + + ⋯ + ) = 1 f oq 1 f o q mf bo=q k loq 1 lo q ml 1 ao q k 0. 1 mf ∞ k 1 ml ∞ k − − k 1 − − − k 1 − (4.56) ( + + ⋯ + ) Q= − ( + ⋯ + ) Œ + Q= ‘ = In matrix form, the coefficients of q 1 to q n of these polynomials are related by n − n − n z ηo , θo ηo Qn ηo θo 0, (4.57) where ( ) = − ( ) = n ηo a1 an b1 bn , (4.58) ⊺ c n d n = 0⋯ 0 ⋯Qn η Qn Qn η f n l n . (4.59) ⎡ Qn η Qn η 0 0 ⎤ ⎢ − ( ) ⎥ ( ) = ⎢ ⎥ Moreover, ⎢ ⎥ ⎣− ( ) ( ) ⎦ c n n l n n Qn η n,mc A q, η ,Qn η n,ml A q, η , I (4.60) f ( n) = T ( ( n)) d( ) =mTd,md ( ( )) Qn η n,mf B q, η ,Qn , ⎡0n md,md ⎤ ⎢ ⎥ ( ) = T ( ( )) = ⎢ − ⎥ where n,m X q is the Toeplitz matrix of size⎢ n m (⎥m n) with zeros above the ⎣ ⎦ k main diagonal and whose first column is x0 xn 1 , where X q k 0 xoq . WeT notice( that( )) the relation (4.57) between the× high⊺ order≤ model parameters∞ −ηn − = o and the low order model coefficients θo is linear[ ⋯ in θo. Therefore,] motivated( ) = ∑ by (4.57), and analogously to the OE case, Step 2 of the algorithm consists in solving the least squares problem ˆLS n n 1 n n θN Qn ηˆN Qn ηˆN Qn ηˆN ηˆN . (4.61) ⊺ − ⊺ = ‰ ( ) ( )Ž ( ) 54 Weighted Null-Space Fitting

Step 3: Re-estimation of the BJ model Finally, Step 3 will also be analogous to the third step in the OE case, with a weighting matrix that we proceed to determine. The optimal weighting matrix is given by the inverse of the covariance of

n n n n n z ηˆN , θo ηˆN Qn ηˆN θo Tn θo ηˆN ηo , (4.62) where Tn θ is now( given by) = − ( ) = ( )( − )

c ( ) Tn θ 0 Tn θ l f , (4.63) ⎡ Tn θ Tn θ ⎤ ⎢ ( ) ⎥ ( ) = ⎢ ⎥ with ⎢− ( ) ( )⎥ c ⎣ l ⎦ Tn θ n,n C q, θ ,Tn θ n,n L q, θ , (4.64) T f θ F q, θ . n( ) = T n,n( ( )) ( ) = T ( ( )) Then, we have that( the) = residualsT ( ( in (4.62))) are still distributed according to (4.37), but with the corresponding Tn θo , given by (4.63) evaluated at θ θo. Thus, when applied to BJ models, Step 3 of the WNSF algorithm consists in solving ( ) = ˆWLS n ˆLS n 1 ˆLS n n θN Qn ηˆN Wn θN Qn ηˆN Wn θN Qn ηˆN ηˆN , (4.65) ⊺ − ⊺ ˆLS ˆLS where W θN is computed= ‰ ( as) in((4.41)) , but( with)Ž the( corresponding) ( ) T θN , given ˆLS by (4.63) evaluated at the estimated parameters θN . ( ) ( ) 4.3.3 Other Model Structures Although we have presented the WNSF method for OE models (unitary noise model) and BJ models (fully independently parametrized noise model), the method is applicable for other model structures, for example when G q, θ and H q, θ share parameters. This is because the WNSF method is flexible in parametrization, as long as there is a linear relation between the high order ARX( ) model parameters( ) and the parameters of the low order model of interest. Consider, for example, an ARMAX model (2.20), given by

F q, θ yt L q, θ ut C q, θ et, (4.66) where F q, θ , L q, θ , and( C )q, θ= are( given) + as( the) corresponding polynomials in (4.44), and ( ) ( ) ( ) mf ml mc θ f1 fmf l1 lml c1 cmc R . (4.67) + + In the first= step, a⋯ high order ARX⋯ model is estimated.⋯  ∈ Then, steps two and three can be derived by writing a relation similar to (4.57), relating the high order ARX 4.4. Asymptotic Properties 55 model parameters with the low order model parameters. In particular, at the true parameter values, we have

1 coq 1 co q mc 1 ao q k 1 f oq 1 f o q md 0, 1 mc ∞ k 1 md − − k 1 − − − ( + + ⋯ + ) Œ + Q ‘ − ( + + ⋯ + ) = 1 f oq 1 f o q mf bo=q k loq 1 lo q ml 1 ao q k 0, 1 mf ∞ k 1 ml ∞ k − − k 1 − − − k 1 − (4.68) ( + + ⋯ + ) Q= − ( + ⋯ + ) Œ + Q= ‘ = which are linear in θo. Therefore, (4.68) can be written in form (4.57), and the remaining steps follow accordingly.

4.4 Asymptotic Properties

We now turn to the asymptotic analysis of the WNSF method. In particular, we will show that the method is consistent and asymptotically efficient for the model (4.43), with corresponding true system at θ θo. n N When we let n n N according to Assumption 3.4, we use ηˆN ηˆN . We will n N n N = ( ) also denote η¯ and ηo , defined in (4.54) and (4.58), respectively. Concerning n = ( ) ∶= the matrices R¯( (4.31)) , Q(n (4.59)) , Tn (4.63), W¯ n (4.39), and Wn (4.41), we maintain the subscript n even if n n N for notational simplicity. However, we will specify whether n is fixed or increasing with sample size when it is not obvious from the context. = ( ) ˆLS Regarding consistency of the estimate θN (4.61), we have the following result. ˆLS Theorem 4.1. Let Assumptions 3.1, 3.2, 3.3, and 3.4 hold. Then, the estimate θN in (4.61) is a consistent estimate of θo—that is,

ˆLS θN θo, as N , w.p.1. (4.69)

Moreover, we have that → → ∞

log N θˆLS θ n N 1 d N . (4.70) N o ¾ N ⎛ ⎞ Z − Z = O ( ) ( + ( )) Proof. See Appendix 4.A.1. ⎝ ⎠

ˆWLS Regarding consistency of the estimate θN (4.65), we have the following result. Theorem 4.2. Let Assumptions 3.1, 3.2, 3.3, and 3.4 hold. Then, the estimate ˆWLS θN in (4.65) is a consistent estimate of θo—that is,

ˆWLS θN θo, as N , w.p.1. (4.71)

Proof. See Appendix 4.B.1. → → ∞ 56 Weighted Null-Space Fitting

Regarding asymptotic distribution and covariance, we first recall that the estimate 1 of θ obtained with PEM has asymptotic covariance MCR, where ¯ − MCR E ψt θo ψt θo , (4.72) ⊺ according to (2.46). For the considered=  ( parameter) ( ) vector θ (4.45), we have that MCR is given by (4.72) with

Go Γm ut HoFo f 1 1 q Γm ut ⎡ HoFo l ⎤ − ψt θo ⎢− 1 ⎥ , Γm ⎡ ⎤ . (4.73) ⎢ Γm et ⎥ ⎢ Co c ⎥ ⎢ m⎥ ⎢ 1 ⎥ ⎢q ⎥ ( ) = ⎢ Γm et ⎥ = ⎢ ⋮ ⎥ ⎢ Do d ⎥ ⎢ − ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ 1 With Gaussian noise and a correctly⎢ parametrized⎥ model,⎣ M⎦ corresponds to the ⎣ − ⎦ CR minimum covariance asymptotically achievable by a consistent− estimator. Then, we have the following result regarding asymptotic distribution and covari- ˆWLS ance of the estimate θN (4.65). Theorem 4.3. Let Assumptions 3.1, 3.2, 3.3, and 3.4 hold. Then, the estimate ˆWLS θN in (4.65) is asymptotically Gaussian distributed according to ˆWLS 2 1 N θN θo As 0, σoMCR , (4.74) √ − where MCR is given by (4.72).( − ) ∼ N ( ) Proof. See Appendix 4.C. As consequence of Theorem 4.3, the WNSF method is asymptotically efficient when the noise is Gaussian, as it has the same asymptotic covariance as PEM.

4.5 Simulation Examples

In this section, we perform two simulation examples. First, we consider a fixed system, which allows us to illustrate the asymptotic properties of the method. Second, we perform a simulation with random systems to evaluate performance of the method compared to PEM. We also discuss practical aspects in terms of implementation.

4.5.1 Fixed System The first simulation has the purpose of illustrating that the method is asymptotically efficient. We consider that data is generated in closed loop with unit feedback, according to Go q Ho q yt rt et, 1 Go q 1 Go q ( ) ( ) (4.75) = 1 + Ho q ut + ( )rt + ( )et, 1 Go q 1 Go q ( ) = − + ( ) + ( ) 4.5. Simulation Examples 57 where q 1 0.1q 2 1 0.7q 1 G q ,H q , (4.76) o 1 −q 1 0.9−q 2 o 1 0.9q−1 + + with r and e Gaussian( ) = white− sequences− with( variances) = 1 and− 0.25, respectively. t t − + − The model we estimate has the correct structure and orders. We perform 1000 Monte{ Carlo} runs,{ } with ten different sample sizes between N 300 and N 300000, as indicated in Fig. 4.1. We compare PEM and WNSF. As this simulation has the= purpose of illustrating= asymptotic properties, PEM is started at the true parameter values. For WNSF, we consider two different alternatives. One is the standard three step procedure. We will refer to this alternative as WNSF with one iteration, since Step 2 serves to find an initial point for the iterative algorithm (4.42), and Step 3 is considered the first iteration. The other alternative is to continue to iterate, and we choose the model estimate obtained at the iteration that minimizes the PEM cost function (2.34). Another concern is how to choose the order n of the ARX model. The approach we take is to apply WNSF for different ARX model orders. Then, we compute the PEM cost function

N 1 1 2 VN θ H q, θ yt G q, θ ut (4.77) N t 1 − ( ) = Q ™ ( )[ − ( ) ]ž for each of the low order models= obtained, and the one that provides the lowest value is chosen. The following methods are compared: • prediction error method, as implemented in MATLAB2014b, started at the true parameters (PEM true); • weighted null-space fitting, with a grid of ARX model orders n between 50 and 150 with intervals of 20, and one iteration (WNSF1); • similar to the method above, except with a maximum of 20 iterations (WNSF20). All the methods are implemented with a stopping criterion of 10 5 as function tolerance, and zero initial conditions are assumed. PEM performs a− maximum of 1000 iterations. Performance is evaluated by the root mean squared error of the estimated impulse response, RMSE go gˆ , (4.78) where go is a vector with the impulse response= Y − parametersY of Go q , and similarly for gˆ but for the estimated model. Sufficiently long impulse responses are taken to make sure that the respective tails are sufficiently close to zero( to) not affect the mean squared error. The results are presented in Fig. 4.1, with the RMSE as function of the sample size. We observe that, according to our theoretical result, one iteration of the WNSF 58 Weighted Null-Space Fitting

100

10 1

RMSE −

WNSF20 WNSF1 PEM true 2 10 3 4 5 − 10 10 10 N Figure 4.1: Average RMSE of the plant model impulse response estimate for several sample sizes, obtained from 1000 Monte Carlo runs. has the same asymptotic performance as PEM. For small sample sizes, we observe that continuing to iterate, with a maximum fixed number of times, can be beneficial.

4.5.2 Random Systems Minimizing the cost function of PEM is, in general, a non-convex optimization problem. Therefore, good initializations points are often required to find the global minimum, especially as the number of parameters to estimate increases. In the following simulation, we use randomly generated systems with structure

loq 1 loq 2 loq 3 loq 4 G q 1 2 3 4 , o 1 f oq− 1 f o−q 2 f−oq 3 f−oq 4 1 + 2 + 3 + 4 (4.79) ( ) = 1 coq−1 coq −2 coq −3 coq 4− H q + 1 + 2 + 3 + 4 . o 1 doq −1 doq−2 doq−3 doq− 4 + 1 + 2 + 3 + 4 ( ) = − − − − The coefficients of Lo q are generated+ + from+ a uniform+ distribution, with values between 1 and 1. The coefficients of the remaining polynomials are generated such that Fo q , Co q , Do( q) have all roots inside a half ring in the unit disc with a radius between− 0.7 and 0.9 and positive real part. We do this with the objective of studying( ) a particular( ) ( ) class of systems: namely, the systems are effectively of fourth order (i.e., no poles are considerably dominant over others), they can be 4.5. Simulation Examples 59

103

101 S ) ω i 10 1 G(e

S −

10 3 −

10 5 − 10 4 10 3 10 2 10 1 100 − − ω− [rad/s] −

Figure 4.2: Magnitude Bode plots of the random systems drawn. approximated by ARX models roughly of orders between 30 and 100, and they resemble physical systems. The model we estimate has the correct structure and orders. The magnitude Bode plots of the 100 drawn systems are given in Figure 4.2 to provide an idea of the class of systems used. The data is generated in closed loop by

Ko q Go q Ho q yt rt et, 1 Ko q Go q 1 Ko q Go q ( ) ( ) ( ) (4.80) = Ko q + Ko q Ho q ut + ( ) ( )rt + ( ) ( )et. 1 Ko q Go q 1 Ko q Go q ( ) ( ) ( ) = − The controller Ko q is derived+ ( using) ( a) Youla-parametrization+ ( ) ( ) to obtain a closed loop transfer function such that: the fastest open loop pole pair is maintained; the slowest open loop pole( ) pair is multiplied by a factor of 0.9; the controller has a pure integration term. Here, rt and et are Gaussian white sequences, where rt has unit variance, and the variance of et is chosen to achieve a signal-to-noise ratio (SNR) of 10, defined by{ } { } { } { } 2 N Ko q r t 1 1 Ko q Go q t SNR 10 log ( ) . (4.81) 10 ⎡ N 2 ⎤ ⎢ = t 1+ H(o )q e(t ) ⎥ ⎢∑ Š  ⎥ ∶= ⎢ ⎥ ⎢ = ⎥ For the class of systems we consider,⎢ it∑ is appropriate( ( ) ) to choose⎥ the ARX model ⎣ ⎦ 60 Weighted Null-Space Fitting order from a grid of values between 25 and 125, spaced with intervals of 25. We compare the following methods: • weighted null-space fitting, with 20 iterations (WNSF20); • weighted null-space fitting, with one iteration (WNSF1). • prediction error method, initialized at the true parameters (PEM true); • prediction error method, initialized with the standard MATLAB procedure (PEM); • prediction error method, initialized with WNSF1 (PEM wnsf1); • prediction error method, initialized with WNSF20 (PEM wnsf20) Iterations stop with a function tolerance of 10 5, while PEM also stops with a maximum of 1000 iterations. Moreover, PEM estimates− initial conditions and WNSF n n truncates them. The need for truncation is due to the following. As RN and rN are n given by (4.30) and ϕt by (4.52), we observe that, if the sums in (4.30) start at t 1, we need information about ut and yt for t 0. This is usually not available, N = as a batch of data ut, yt t 1 is typically collected. Therefore, the sums in (4.30) need to start at t n 1. This{ can} be{ a limitation} ≤ when the ARX model order n = needs to be chosen large{ and} the sample size N is small, as a considerable amount of data is neglected.= In+ Chapter 6, a procedure to also estimate initial conditions is proposed. However, this procedure is only applicable if the plant and noise model share the same poles (e.g., ARMA, ARMAX) or if the noise model poles are known (e.g., OE), which is not the case of BJ models. The performance of each method is evaluated by calculating the average FIT (in percent) of the impulse response of the plant, where

RMSE FIT 100 1 . (4.82) go mean go = Œ − ‘ One run is performed for each system, withY − the reference( )Y signal fixed and different noise realizations. The results are presented in Fig. 4.3, with the average FIT as function of sample size. We assume that PEM, when initialized at the true parameters, converges to the global optimum. As expected, this method performs best, as it uses knowledge of the true system. WNSF with 20 iterations, although performing slightly worse than PEM (initialized at the true parameters) for small sample sizes, attains the same performance for sample sizes around 4000 and larger. With only one iteration, the proposed method performs worse than with 20 iterations for the range of sample sizes used. On average, WNSF with one iteration performed only slightly better than PEM when initialized with the standard MATLAB procedure. Therefore, in practice, it can be worth to continue to iterate, as WNSF20 performs considerably better. 4.6. Comparison with Explicit ML Optimization 61

100

98

96

94 FIT 92 PEM WNSF1 90 WNSF20 PEM true 88 PEM wnsf1 PEM wnsf20 86 103 104 N Figure 4.3: Average FIT for the plant model estimate obtained with different methods for randomly drawn BJ systems operating in closed loop.

Fig. 4.3 also allows us to analyze WNSF as a method to initialize PEM. We ob- serve that, if PEM is initialized with WNSF with only one iteration, its performance is greatly improved comparing with the standard MATLAB initialization. However, WNSF with 20 iterations still performs better. Initializing PEM with WNSF20 is of little value in this example, since WNSF20 already performs very close to PEM initialized at the true parameters.

4.6 Comparison with Explicit ML Optimization

According to the discussion in Chapter 2, as the available data tends to infinity, a maximum likelihood method provides an asymptotically efficient estimate in one iteration, provided that a Gauss-Newton algorithm is initialized with a consistent estimate. For PEM applied directly to data, one iteration rarely provides an estimate for which efficiency can be observed in practice, even for considerably large sample sizes. On the other hand, for WNSF, we have observed one case in Fig. 4.1 where one iteration provides an estimate for which, in practice, efficiency can be observed. With random systems (Fig. 4.3), only twenty iterations were enough to observe efficiency at reasonable sample sizes, while PEM performs considerably worse with more iterations allowed. However, PEM uses the complete data set in the optimization procedure, while 62 Weighted Null-Space Fitting

WNSF uses the high order model estimate, which is potentially of smaller size. Therefore, if one wants to compare the convergence speed of WNSF with an explicit minimization algorithm, it makes more sense to compare with the exact ML criterion for model order reduction (2.67) minimized with, for example, a Gauss-Newton algorithm. This can be seen as an extension of indirect PEM, for which (2.73) is minimized, but with the order of the over-parametrized model allowed to tend to infinity. Both WNSF and minimizing criterion (2.67) with a Gauss-Newton algorithm reduce a high order model to the low order model of interest based on an ML criterion, with similar computational effort required at each iteration, as (2.73) and (4.42) consist of weighted least squares problems of the same dimension. The difference lies in the optimization procedure: while Gauss-Newton is a direct minimization method, WNSF uses iterative weighted least squares. As both approaches provide asymptotically efficient estimates in one iteration, it makes sense to analyze if there is a practical difference for finite sample size. Although an extensive simulation study has not been performed yet, initial results suggest that the standard 3-step WNSF has the same or better performance than taking one Gauss-Newton iteration initialized at a consistent estimate. Moreover, WNSF seems to be less sensitive to the choice of initial condition. We proceed to present a few examples. Consider the system q 1 0.1q 2 yt ut et, (4.83) 1 1−.2q 1 0−.6q 2 + = − − + where et is Gaussian white noise− with unit+ variance, and ut may be given by w { } ut ut { } (4.84) or = 1 w ut u , (4.85) 1 0.9q 1 t where uw is Gaussian white noise= with variance− chosen to achieve an SNR t − N { } u2 SNR 10 log t 1 t 0. (4.86) 10 N e2 ∑t=1 t ∶=  = We perform 500 Monte Carlo runs with different∑ = noise sequences et and sample size N 1000. At each Monte Carlo run, we estimate the parameter vector { }

= θ f1 f2 l1 l2 , (4.87) ⊺ with the model =   l q 1 l q 2 y 1 2 e . (4.88) t − 1 − 1 t 1 f1q f2q + The following methods are compared:= − − + + + 4.6. Comparison with Explicit ML Optimization 63

Table 4.1: NMSE obtained from 500 Monte Carlo runs, comparing weighted null- space fitting with one iteration (WNSF) and exact ML model reduction with one Gauss-Newton step (ML-GN) initialized with the estimate from Step 2 of WNSF.

White input Colored input WNSF 4.20 52.9 ML-GN 7.83 1.19 103

×

• weighted null-space fitting with one iteration (WNSF),

• exact ML model reduction (2.67) using one Gauss-Newton step (2.73) initial- ized with Step 2 (4.35) of WNSF (ML-GN).

The high order model is an FIR model with 50 coefficients. To evaluate performance, we consider the normalized sample mean square error of the estimated parameter ˆk vector—that is, if we let θN be the estimate obtained for some method at the Monte Carlo run k, we consider the performance measure

500 N ˆk ˆk NMSE θN θo θN θo . (4.89) 500 k 1 ⊺ = Q ‰ − Ž ‰ − Ž The results are presented in Table 4.1,= where we observe that—with one iteration— WNSF performs better than one Gauss-Newton step. In particular, the performance in the colored input case is very poor for Gauss-Newton. However, we observe that there are other possibilities to initialize a Gauss-Newton algorithm. For example, Step 2 (4.35) of WNSF, using least squares, provides a consistent estimate of the parameter of interest. But this is chosen only for convenience, as there are other ways to obtain consistent estimates. For example, solving the weighted least squares problem

ˆ n N n 1 n N n θN Qn gˆN Rn Qn gˆN Qn gˆN Rn gˆN (4.90) ⊺ − ⊺ also provides a consistent= ‰ estimate( ) of θ. This( ) correspondsŽ ( ) to Step 3 (4.40) of WNSF but with Tn set to identity, as there is yet no estimate of the parameter vector θ. To analyze the effect of the initialization, we repeat the previous simulation, but with Step 2 of WNSF replaced by (4.90), and we also use the estimate obtained in this step to initialize the Gauss-Newton algorithm. The results are presented in Table 4.2, where we observe that, for white input, the results are essentially the same as in Table 4.1. This was expectable, since in this n case RN tends (asymptotically in N) to a scaled identity. Therefore, it does not play n a significant role to solve a weighted least squares problem with RN as weighting instead of the original LS problem. However, using (4.90) as initialization improved, with colored input, both WNSF and ML with Gauss-Newton. Particularly, the 64 Weighted Null-Space Fitting

Table 4.2: NMSE obtained from 500 Monte Carlo runs, comparing weighted null- space fitting with one iteration and Step 2 from (4.90) (WNSF) and exact ML model reduction with one Gauss-Newton step (ML-GN) initialized with the estimate from Step 2 of WNSF.

White input Colored input WNSF 4.38 12.5 ML-GN 7.71 89.9

difference is considerably more significant for the Gauss-Newton algorithm, which suggests that this algorithm is in practice more sensitive to the initial condition than WNSF. Still, WNSF performs better. These results are preliminary, and cannot be generalized to conclude that WNSF is always better than performing one Gauss-Newton step on an exact ML criterion. Nevertheless, they suggest that these two procedures have different numerical properties; also, iterative least squares methods have potential in terms of accuracy and computational effort for model order reduction, as an alternative to explicitly seeking the optimum of an ML cost function.

4.7 Conclusions

In this chapter, we proposed a method for identification of structured models, which we denoted weighted null-space fitting (WNSF). It is a three step weighted least squares method consisting of the following steps: (1) a high order model is estimated using least squares; (2) least squares is applied again to perform model reduction; (3) the model is re-estimated using weighted least squares, with the weights obtained from the estimate in the previous step. We show that these three steps provide consistent and asymptotically efficient estimates for SISO Box-Jenkins models, although we point out that the method can be applied analogously to other model structures, such as OE or ARMAX, with the same asymptotic properties. We argue that the method connects ideas from PEM and subspace methods, sharing favorable properties with both domains; particularly, asymptotic efficiency with PEM, and no converge issues with subspace. Also, the method shares close similarities with the maximum likelihood methods for model order reduction in Section 2.3.2 and the least squares methods in Section 2.3.3. In particular, we use a maximum likelihood criterion to perform model order reduction from a high order ARX model estimate, but we re-write it in a way that allows us to apply iterative weighted least squares. This provides asymptotically efficient estimates in one iteration. Extending the method to include other model structures, as well as MIMO versions of the model structures considered, is possible. A general formulation to 4.A. Consistency of Step 2 65 include those cases is considered for future work, as discussed in Chapter 8.

4.A Consistency of Step 2

The main purpose of this appendix is to prove Theorem 4.1. However, before we do so, we introduce some results regarding the norm of some vectors and matrices.

n N ηˆN ηo 0 when N w.p.1 ( ) [ − [ → → ∞ n N Consider the estimated parameter vector ηˆN ηˆN (3.17) and the truncated true n N ( ) parameter vector ηo (4.58). Using the triangular inequality, we have ( ) ∶= n N n N n N n N ηˆN ηo ηˆN η¯ η¯ ηo , (4.91) ( ) ( ) ( ) ( ) n where η¯ is definedZ by−(4.54).Z Then,≤ Z from− LemmaZ + Z 3.3, the− secondZ term on the right side of (4.91) tends to zero as n N . Also, from Theorem 3.1, the first term on the right side of (4.91) tends to zero, as N , w.p.1. Thus, ( ) → ∞ n N ηˆN ηo 0, as N→ ∞ , w.p.1. (4.92) ( ) n N Z − Z → → ∞ Qn ηˆN Qn ηo 0 when N w.p.1 ( ) n N Consider[ ( ) − the matrix( )[Qn→ηo , given→ by∞ (4.59) evaluated at the truncated true n N ( ) parameter vector ηo , and the matrix Qn ηˆN , given by (4.59) evaluated at the ( ) estimated parameters( ηˆ)N . We have that ( ) n N c c n N l l n N Qn ηˆN Qn ηo Qn ηˆN Qn ηo Qn ηˆN Qn ηo ( ) ( ) f f( )n N Z ( ) − ( )Z ≤ Z ( ) − ( )Z + Z (Qn)ηˆ−N (Qn ηo)Z + n N ( ) C ηˆN ηo . Z ( ) − ( )Z ( ) (4.93) ≤ Z − Z Then, due to (4.92), we conclude that

n N Qn ηˆN Qn ηo 0, as N , w.p.1. (4.94) ( ) n Z ( ) − ( )Z → → ∞ Qn ηo is bounded for all n

WeY have( )Y that n c n l n f n d Qn ηo Qn ηo Qn ηo Qn ηo Qn C ηn 1 (4.95) Y ( )Y ≤ Y (o )Y + Z ( )Z + Z ( )Z + Z Z C η 1, n ≤ Y o Y + which is bounded, by stability≤ Y ofY the+ true∀ system. 66 Weighted Null-Space Fitting

Qn ηˆN is bounded for sufficiently large N w.p.1

YUsing( the)Y triangular inequality, we have that n N n N Qn ηˆN Qn ηˆN Qn ηo Qn ηo . (4.96) ( ) ( ) Then, using (4.95)Y and( (4.93))Y ≤ Z, we( observe) − that( while)Z + theZ first( term)Z on the right side of (4.96) can be made arbitrarily small as N increases, the second term is bounded for all n N . Then, there exists N¯ such that ¯ ( ) Qn ηˆN C, N N. (4.97)

Y ( )Y ≤ ∀ > Tn θo is bounded for all n

YConsider( )Y Tn θo as the matrix given by (4.20) evaluated at θ θo. We first introduce k the following result. Let X q k 0 xkq and define the infinite dimensional Toeplitz matrix( ) ∞ − = ( ) = = ∑ x0 0 0 ⎡x1 x0 0 ⎤ X q ⎢ ⋯⎥ . (4.98) T ⎢ ⎥ ⎢x2 x1 x0 ⎥ ⎢ ⋱⎥ [ ( )] = ⎢ ⎥ ⎢ ⎥ ⎢ ⋱⎥ 2 ⎢ ⎥ If xk C1, we have, as consequence⎢ of [68, Chapter⎥ 3, Theorem 1.1], that k 0 ⎣ ⋮ ⋱ ⋱ ⋱⎦ » ∞ ∑ = S S < T X q C. (4.99) When X q can be written as a rationalY [ ( transfer)]Y ≤ function, (4.99) follows from X q having all poles strictly inside the unit circle, since, in this case, the sum of squares of its impulse( ) response coefficients is bounded. ( ) f c l We observe that Tn θo , Tn θo , and Tn θo —the blocks of Tn θo —are sub- matrices of T Fo q , T Co q , and T Lo q , respectively. Then, we have that ( ) ( ) ( ) ( ) f c l [ (T)]n θo[ ( T)]n θo [ Tn( θ)]o Tn θo F q C q L q (4.100) Y ( )Y ≤ ZT (o )Z + Y T( )Yo + Z ( T)Z o C, n ≤ Y [ ( )]Y + Y [ ( )]Y + Y [ ( )]Y where the last inequality follows≤ ∀ from (4.99) and the fact that Fo q , Co q and Lo q are finite order polynomials. The following theorem deals with invertibility of Qn ηˆN Qn ηˆN ,( which) ( appears) when( ) solving the least squares problem (4.61). ⊺ ( ) ( ) Theorem 4.4. Let Assumptions 3.1, 3.2, 3.3, and 3.4 hold. Also, let

M ηˆN Qn ηˆN Qn ηˆN , (4.101) ⊺ n N where ηˆN ηˆN is defined as( in (3.17)) ∶= , and( )Qn (ηˆN )is defined by (4.59) evaluated at the estimated( ) parameters ηˆN . Then, assuming that F q, θo and L q, θo are ∶= ( ) ( ) ( ) 4.A. Consistency of Step 2 67

co-prime, and the same is true for C q, θo and D q, θo , there exists N¯ such that M ηˆN is invertible for all N N¯, w.p.1. ( ) ( ) Proof.( )First, we consider the matrix> n n M ηo lim Qn η Qn η Q ηo Q ηo , (4.102) n o o ⊺ ⊺ ∞ ∞ Here, Q ηo is defined( ) ∶= analogously→∞ ( ) to ((4.59)) =,∶ but( using) the( ) infinite vector ηo from (4.58)—that is, Q η is block Toeplitz with an infinite number of rows, and ∞ o its blocks( are) given by ∞ c ( ) l Q ηo ,mc A q, ηo ,Q ηo ,ml A q, ηo ,

∞ ∞ ∞ I ∞ (4.103) f ( ) = T ( ( )) d ( ) =mdT,md ( ( )) Q ηo ,mf B q, ηo ,Q . ⎡ 0 ,md ⎤ ∞ ∞ ∞ ⎢ ⎥ ( ) = T ( ( )) = ⎢ ∞ ⎥ n We observe that the limit (4.102) is well defined,⎢ since⎥ the entries of M ηo n n ⎣ ⎦ Q ηo Q ηo are either zero or sums of the form ⊺ n n n ( ) ∶= ( ) ( ) o o o o o o akak p, akbk p, bkbk p, (4.104) k 1 k 1 k 1 + + + Q= Q= o Q= o for some finite integers p, and the coefficients ak and bk are coefficients from stable impulse responses. Thus, these sums converge as n . We start by showing that M ηo is invertible.{ We} have{ that} → ∞ ( )M1 ηo 0 M ηo , (4.105) ⎡ 0 M2 ηo ⎤ ⎢ ( ) ⎥ ( ) = ⎢ ⎥ where ⎢ ( )⎥ Ql η ⎣Ql η Ql η⎦ Qf η M η o o o o , 1 o f ⊺ l f ⊺ f ⎡ Q∞ ηo Q∞ ηo Q ∞ ηo Q ∞ ηo ⎤ ⎢ ( ) ( ) − ( ) ( )⎥ (4.106) ( ) = ⎢ c ⊺ c c ⊺ d ⎥ ⎢Q ∞ηo Q ∞ηo Q ∞ ηo Q ∞ ⎥ M η ⎣− ( ) ( ) ( ) (. ) ⎦ 2 o Qd ⊺Qc η I ⊺ ⎡ ∞ ∞ o ∞md,md ∞⎤ ⎢ ( ) ( ) − ( ) ⎥ ( ) = ⎢ ⊺ M η M η ⎥ Noticing the block Toeplitz⎢ structure−[ ∞] of∞( 1 ) o and 2 o , which⎥ follows from (4.103), Parseval’s formula can be⎣ used to write these matrices in the⎦ frequency domain as ( ) ( ) π 2 1 ¯ Ao AoBo ¯ M1 ηo Γml;mf 2 Γm ;m dω, 2π π A B B ∗ l f ⎡ o o o ⎤ ∗ ⎢ S ∗ S − ⎥ (4.107) ( ) = S− π ⎢ 2 ⎥ 1 ¯ ⎢−Ao AoS ¯S ⎥ M2 ηo Γmc;md ⎣ Γmc⎦;m dω, 2π π A ∗ d ⎡ o 1 ⎤ ∗ ⎢S S − ⎥ ( ) = − ⎢ ⎥ iω S iω ⎢ ⎥ where Ao Ao e , Bo Bo e ,⎣ and− ⎦

∶= ( ¯ ) ∶= Γm(l )0 ¯ Γmc 0 Γml;mf , Γmc;md , (4.108) ⎡ 0 Γmf ⎤ ⎡ 0 Γmd ⎤ ⎢ ⎥ ⎢ ⎥ ∶= ⎢ ⎥ ∶= ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎣ ⎦ ⎣ ⎦ 68 Weighted Null-Space Fitting

with Γm given by (4.73) Due to its block diagonal structure, M ηo is invertible if both M1 ηo and M2 ηo are invertible. We start by analyzing M1 ηo , which is invertible if ( ) ( ) ( ) x M1 ηo x C (0, ) (4.109) ∗ where x is a unitary vector of dimension( )m≥l m>f . We let x1 comprise the first ml entries of x and x2 the last mf entries. Then, we can write + 1 π 2 x M1 ηo x Aox1Γml Box2Γmf dω, (4.110) π ∗ 2π ∗ ∗ which is bounded from( below) = if S− T − T

Aox1Γml Box2Γmf 0 (4.111) ∗ ∗ cannot be verified by whatever choice of− x such that= x 1. Re-writing (4.111) as 1 Fox1Γml Lox2Γmf Y Y0=, (4.112) HoFo ∗ ∗ we notice that (4.112) has no solution‰ if F−o and Lo areŽ = co-prime [1]. Consequently, M1 ηo is invertible under the theorem’s assumptions. Following a similar approach for M2 ηo , we have that ( ) π 1 2 x M2 ηo x (Ao)x1Γmc x2Γmd dω. (4.113) π ∗ 2π ∗ ∗ Then, ( ) = S− S − S Aox1Γmc x2Γmd 0 (4.114) can be re-written as ∗ ∗ 1 − = Dox1Γmc Cox2Γmd 0, (4.115) Co ∗ ∗ which cannot be satisfied if D(o q and C−o q are co-prime.) = Then, we have that

(x)M2 ηo x( )C 0 (4.116) ∗ is satisfied if x 1. Consequently, M(2 η)o ≥is invertible,> and so is M ηo . We are now interested in using this result to show that M ηˆN is invertible for sufficiently largeY Y N= , w.p.1. Write ( ) ( ) n N n N ( n N) M ηˆN M ηo Qn ηˆN Qn ηˆN Qn ηo Qn ηo ( ) ⊺ n N ⊺ ( ) ( ) n N Z ( ) − ( )Z = ZQn(ηˆN ) Q(n η)o− ( Qn )ηˆN ( Qn)Zηo ( ) ( )(4.117) ≤ Z ( ) − ( )Z ‰Y ( )Y + Z ( )ZŽ n Using (4.93), (4.95), and (4.97), and since M ηo M ηo as n , we have that

M ηˆN M ηo , as (N ) → , w.p.1( ). → ∞ (4.118) ¯ Since M ηo is invertible,( by)(4.118)→ ( and) continuity→ ∞ of eigenvalues there exists N such that M ηˆN is invertible for all N N¯, w.p.1. ( ) Finally, we( have) the necessary results> to prove Theorem 4.1. 4.A. Consistency of Step 2 69

4.A.1 Proof of Theorem 4.1 We start by using (4.61) to write

ˆLS 1 θN θo Qn ηˆN Qn ηˆN Qn ηˆN ηˆN θo − ⊺ 1 ⊺ − = Qn(ηˆN )Qn(ηˆN ) Qn(ηˆN ) ηˆN− Qn ηˆN θo (4.119) − ⊺ 1 ⊺ n N = Qn(ηˆN )Qn(ηˆN ) Qn(ηˆN )[Tn θ−o ηˆN( η)o ] ⊺ − ⊺ ( ) where the last equality=  ( is due) to( (4.62)) . If n( were) fixed,( )[ consistency− ] would follow if n ηˆN ηo would approach zero as N , provided the inverse of Qn ηˆN Qn ηˆN existed for sufficiently large N. However, for consistency to be achieved,⊺ n n N must− increase according to Assumption→ ∞ 3.4. This implies also that the( dimensions) ( ) n N = ( ) of the vectors ηˆN and ηo , and of the matrices Qn ηˆN (number of rows) and Tn θo (number of rows and( ) columns), become arbitrarily large. Therefore, extra requirements are necessary. In particular, we use (4.119)( to) write ( ) ˆLS 1 n N θN θo M ηˆN Qn ηˆN Tn θo ηˆN ηo (4.120) −1 ⊺ ( ) n N Z − Z = ZM (ηˆN ) Q(n ηˆ)N ( T)(n θo − ηˆN )ηZo , − ( ) where M ηˆN Qn≤ ηZˆN Qn( ηˆN)Z,Y and( observe)Y Y ( that)Y Z consistency− Z is achieved if the last factor on the right⊺ side of the inequality in (4.120) approaches zero, as N , w.p.1, and( the) ∶ remaining= ( ) factors( ) are bounded for sufficiently large N, w.p.1. Now, using (4.97), (4.100), and Theorem 4.4, we have that there exists N¯ such that, w.p.1,→ ∞

ˆLS n N ¯ θN θo C ηˆN ηˆo , N N. (4.121) ( ) Using also (4.92), we haveZ that− Z ≤ Z − Z ∀ >

ˆLS θN θo 0, as N , w.p.1. (4.122)

Moreover, using (4.91),Z we− canZ → re-write (4.121)→ ∞ as

ˆLS n N n N n N θN θo C ηˆN η¯ η¯ ηo . (4.123) ( ) ( ) ( ) Z − Z ≤ n N‰Z −n N Z + Z − ZŽ From Lemma 3.3, we have η¯ ηo Cd N , which approaches zero faster n N than ηˆN η¯ , whose decay( ) rate is( according) to (3.24). Thus, we can neglect n SS n − SS ≤ ( ) the contribution( ) from η¯ ηo and simply write Z − Z Y − Y log N θˆLS θ n N 1 d N . (4.124) N o ¾ N ⎛ ⎞ Z − Z = O ( ) ( + ( )) ⎝ ⎠ 70 Weighted Null-Space Fitting

4.B Consistency of Step 3

The main purpose of this appendix is to prove Theorem 4.2. However, before we do so, we introduce some results regarding the norm of some vectors and matrices.

1 Tn θo is bounded for all n − WeZ observe( )Z that, with Tn θ given by (4.63), the inverse of Tn θo is given by

( ) T c θ 1 0 ( ) T θ n , (4.125) n T f θ 1T l θ−T c θ 1 T f θ 1 −⊺ ⎡ n n n c ⎤ ⎢ ( ) ⎥ ( ) = ⎢ − − − ⎥ ⎢ ( ) ( ) ( f) 1 ( c) ⎥ 1 l evaluated at the true parameters⎣ θo. Also, Tn θo , Tn θo⎦ , and Tn θo are sub-matrices of T 1 Fo q , T 1 Co q , and T Lo q −, respectively,− where T X q is defined by (4.98). Then, we have that ( ) ( ) ( ) [ ~ ( )] [ ~ ( )] [ ( )] [ ( )] 1 f 1 c 1 f 1 l c 1 Tn θo Tn θo Tn θo Tn θo Tn θo Tn θo − 1 F −q 1 C− q 1− F q L q − 1 C q Z ( )Z ≤ ZT ( o) Z + Z T( ) oZ + Z ( T) ZZo ( )TZZ o ( ) TZ o C, n ≤ Y [ ~ ( )]Y + Y [ ~ ( )]Y + Y [ ~ ( )]Y Y [ ( )]Y Y [ ~ ( )]Y (4.126) where the last≤ inequality∀ follows from (4.99) and the fact that 1 Fo q , 1 Co q , and Lo q are stable transfer functions. ~ ( ) ~ ( ) 1 ˆLS( ) Tn θN is bounded for sufficiently large N w.p.1. − 1 ˆLS 1 ˆLS ˆLS ConsiderZ ( the)Z term Tn θN , where Tn θN is given by (4.125) evaluated at θN . We have that, proceeding− as in (4.126),− SS ( )SS ( ) 1 ˆLS ˆLS ˆLS Tn θN T 1 C q, θN T 1 F q, θN (4.127) − ˆLS ˆLS ˆLS Z ( )Z ≤ Z T[ ~1 C( q, θN)]Z + ZT [L~q,(θN )]TZ 1 F q, θN + Z [ ~ ( )]ˆZZLS [ ( )]ˆZZLS [ ~ ( )]Z for all n. This will be bounded if F q, θN and C q, θN have all poles strictly inside the unit circle. From Theorem 4.1 and stability of the true system by Assumption 3.1, ¯ ( ) (ˆLS ) ˆLS we conclude that there exists N such that F q, θN and C q, θN have all roots strictly inside the unit circle for all N N¯. Thus, we have that, w.p.1, ( ) ( ) 1 ˆLS ¯ Tn θN >C, n, N N. (4.128) − Z ( )Z ≤ ∀ ∀ > 1 ˆLS 1 Tn θN Tn θo 0 when N w.p.1. − − 1 ˆLS 1 ConsiderZ ( ) the− term( )TZn→ θN Tn θ→o ∞with n n N . We have that − − 1 ˆLS 1 1 ˆLS ˆLS 1 Tn θN TnSS θo( ) −Tn θ(N )SS Tn θN = (Tn )θo Tn θo . (4.129) − − − − Z ( ) − ( )Z ≤ Z ( )ZZ ( ) − ( )ZZ ( )Z 4.B. Consistency of Step 3 71

Note that T θˆLS T θ T θˆLS T θ n N n o n N n o F m m m Z ( ) − ( )Z ≤ Z ( ) −f (l )cZ n N θˆk,LS θk 2 (4.130) ¿ + + N o Á k 1 Á ≤ À ( ) ˆLSQ ( − ) n N θN = θo . » Then, using Theorem 4.1, we have≤ that ( ) Z − Z

log N T θˆLS T θ n2 N 1 d N . (4.131) n N n o ¾ N ⎛ ⎞ Z ( ) − ( )Z = O ( ) ( + ( )) Due to Conditions D2 and D3 in Assumption⎝ 3.4, ⎠

log N n2 N 1 d N 0, as N , (4.132) ¾ N and thus ( ) ( + ( )) → → ∞ ˆLS Tn θN Tn θo 0, as N , w.p.1. (4.133)

Together with (4.126),Z (4.128),( ) − and( (4.129),)Z → this implies→ ∞ that 1 ˆLS 1 Tn θN Tn θo 0, as N , w.p.1. (4.134) − − Z ( ) − ( )Z → → ∞ ˆLS The following theorem deals with invertibility of Qn ηˆN Wn θN Qn ηˆN , which appears when solving the weighted least squares problem⊺ (4.65). ( ) ( ) ( ) Theorem 4.5. Let Assumptions 3.1, 3.2, 3.3, and 3.4 hold. Also, let

ˆLS ˆLS M ηˆN , θN Qn ηˆN Wn θN Qn ηˆN , (4.135) ⊺ n N ( ) ∶= ( ) ( ) ( ) where ηˆN ηˆN is defined by (3.17), Qn ηˆN is defined by (4.59) evaluated at ( ) ˆLS the estimated parameters ηˆN , and Wn θN is defined by (4.41). Then, there exists ¯ ∶= ˆLS ( ) ¯ N such that M ηˆN , θN is invertible for all N N, w.p.1. ( ) Proof. Consider( the matrix) > ¯ n ¯ n M ηo, θo lim Qn η Wn θo Qn η (4.136) n o o ⊺ ¯ ( ) ∶= →∞ ( )n ( ) ( ) where Wn θo is given by (4.39), and Qn ηo is defined by (4.59) at the true n parameters ηo . We first establish that the limit in (4.136) is well defined and that M¯ ηo, θo (is invertible.) ( ) Using (4.31) and (4.39), we can re-write (4.136) as ( ) ¯ n ¯ n n 1 n M ηo, θo lim Qn η Tn θo E ϕt ϕt Tn θo Qn η . (4.137) n o o ⊺ −⊺ ⊺ − ( ) = →∞ ( ) ( )  ( )  ( ) ( ) 72 Weighted Null-Space Fitting

n Re-writing ϕt (4.52) as

n Γnyt ΓnGo q ΓnHo q ut ϕt , (4.138) ⎡ Γnut ⎤ ⎡ Γn 0 ⎤ ⎡et ⎤ ⎢− ⎥ ⎢− ( ) − ( )⎥ ⎢ ⎥ = ⎢ ⎥ = ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ we can use Parseval’s relation⎣ ⎦ to write⎣ ⎦ ⎣ ⎦ π ¯ n n 1 iω iω E ϕt ϕt Ωn e ΦzΩn e dω, (4.139) π ⊺ 2π ∗ where  ( )  = S− ( ) ( ) ΓnGo q ΓnHo q Ωn q , (4.140) ⎡ Γn 0 ⎤ ⎢− ( ) − ( )⎥ ( ) = ⎢ ⎥ and Φ is the spectrum of u e ⎢ . Then, we can re-write⎥ (4.137) as z t t⎣ ⎦ ⊺ π ¯ 1 [ n] iω iω 1 n M ηo, θo lim Qn ηo Tn θo Ωn e ΦzΩn e Tn θo Qn ηo dω. 2π π n ⊺ −⊺ ∗ − (4.141) ( ) = − →∞ ( ) ( ) ( ) ( ) ( ) ( ) Moreover, we canS write

0 Bo n,mf Fo ⊺ A ⎡ 0 o ⎤ n −Tn,ml ŠFo  Q η T θo ⎢ ⎥ , (4.142) n o n ⎢ Ao ⊺ LoAo ⎥ ⎢ ⎥ ⊺ −⊺ ⎢ n,mc Co Tn,mc ŠFoCo ⎥ ⎢ ⊺ ⊺ ⎥ ( ) ( ) = ⎢ 1 Lo ⎥ ⎢ n,md Co n,md FoCo ⎥ ⎢−T Š  −T Š ⎥ ⎢ ⊺ ⊺ ⎥ ⎢ ⎥ where the argument q of the polynomials⎣ T wasŠ  droppedT forŠ notational ⎦ simplicity. In turn, we can write

Bo Γ 0 n,mf Fo n ⊺ Ao ⎡ Γn 0 ⎤ n −Tn,ml ŠFo  Q η T θo Ωn ⎢ ⎥ . n o n ⎢ Ao ⊺ LoAo Ao ⎥ ⎢ ΓnGo Γn ΓnHo ⎥ ⊺ −⊺ ⎢ n,mc Co T Š n,m c FoCo n,mc Co ⎥ ( ) ( ) = ⎢ ⊺ 1 ⊺ Lo ⊺ 1 ⎥ ⎢ ΓnGo Γn ΓnHo⎥ ⎢ n,md Co n,md FoCo n,md Co ⎥ ⎢ T Š  − T Š  T Š  ⎥ ⎢ ⊺ ⊺ ⊺ (4.143)⎥ ⎢−T Š  + T Š  −T kŠ  ⎥ It is possible to observe⎣ that, for some polynomial X q k 0 xkq , ⎦ ∞ − = lim n,m X q Γn X q Γ(m).= (4.144) n ∑ ⊺ Then, we have that, using also→∞ T (4.50),( ( )) = ( )

Go Γ 0 HoFo mf 1 Γm 0 n ⎡ HoFo l ⎤ ¯ lim Qn ηo Tn θo Ωn ⎢− 1 ⎥ Ωn θo . (4.145) n ⎢ 0 Γm ⎥ ⊺ −⊺ ⎢ Co c ⎥ ⎢ 1 ⎥ →∞ ( ) ( ) = ⎢ 0 Γm ⎥ =∶ ( ) ⎢ Do d ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎣ − ⎦ 4.B. Consistency of Step 3 73

This allows us to re-write (4.141) as

π ¯ 1 ¯ ¯ M ηo, θo Ωn θo ΦzΩn θo dω, (4.146) π 2π ∗ or, equivalently in the time( domain,) = S− ( ) ( )

¯ ¯ M ηo, θo E ψt θo ψt θo , (4.147) ⊺ where ψt θo is given by (4.73).( From) = (4.72), ( we) observe( ) that

( ) M¯ ηo, θo MCR, (4.148) which is invertible since the Cramér-Rao( ) bound= exists for an informative experi- ment [1]. We now proceed to show that

ˆLS ¯ M ηˆN , θN M ηo, θo , as N , w.p.1. (4.149)

For that purpose, we( analyze ) → ( ) → ∞

ˆLS n N ¯ n N M ηˆN , θN Qn ηo Wn θo Qn ηo ˆLS⊺ ( ) n N( ) Z Q( n ηˆN W) n− θN( Qn )ηˆN ( Q)n η(o )Z ≤ ⊺ ( ) (4.150) n N ˆLS n N ≤ Z Q(n ηˆ)N (Qn )η‰o ( )W−n θN( Qn )ηŽZo ( ) ⊺ ( ) n N ˆLS ¯ n N + [Q‰ n η(o ) − Wn( θN )ŽWn θ(o Q) n η(o )[, ⊺ ( ) ( ) and want to show that+ Z (4.150)( tends) ‰ to( zero,) − as N( )Ž , w.p.1.( We)Z analyze the three terms on the right side of (4.150). ˆLS →n ∞N ˆLS Starting with Qn ηˆN Wn θN Qn ηˆN Qn ηo , and recalling that Wn θN is given by (4.41), we⊺ have that ( ) SS ( ) ( )[ ( )− ( )]SS ( ) ˆLS n N Qn ηˆN Wn θN Qn ηˆN Qn ηo ⊺ ( ) 1 ˆLS 2 n n N (4.151) Z Q(n ηˆ)N (Tn ) θ‰N ( R) −N Q(n ηˆN )ŽZQ≤n ηo − 0, as N , w.p.1, ( ) ≤ Y ( )Y Z ( )Z Y Y Z ( ) − ( )Z where we used Lemma 3.2, (4.94), (4.97)→, and (4.128)→ .∞ Using the same equations, we proceed with the second term on the right side of (4.150), for which we have that

n N ˆLS n N Qn ηˆN Qn ηo Wn θN Qn ηo ( ) ⊺ ( ) n N ˆLS 2 n n N (4.152) [‰ Q(n ηˆN) − Q(n ηo )Ž Tn( θN) ( RN )[Q≤n ηo ( ) −⊺ 0, as N ,( w.p.1) . ≤ Z ( ) − ( )ZZ ( )Z Y Y Z ( )Z → → ∞ 74 Weighted Null-Space Fitting

For the third term on the right side of (4.150), we can bound it by

n N ˆLS ¯ n N Qn ηo Wn θN Wn θo Qn ηo ⊺ ( ) ( ) n N 2 ˆLS ¯ Z ( ) ‰ ( ) − ( )Ž Q(n ηo )Z ≤ Wn θN Wn θo . (4.153) ( ) ≤ Z (ˆLS )Z¯ Z ( ) − ( )Z We start by analyzing the difference Wn θN Wn θo , for which we recall that n n N . We have that ( ) − ( ) ˆLS ¯ ˆLS n 1 ˆLS 1 = ( )Wn θN Wn θo Tn θN RN Tn θN Tn θo −⊺ − − 1 ˆLS 1 n 1 (4.154) Z ( ) − ( )Z ≤ Z T(n θ)N ‰ Tn ( θo ) R− N Tn( θ)oŽZ T − θ Rn −R¯n T 1 θ − . + Z‰ n (o ) −N ( )Žn o ( )Z −⊺ − and proceed with the three terms+ Z on the( ) right‰ side− ofŽ (4.154)( ).Z Concerning the third term, we have

n ¯n 1 1 2 n ¯n Tn θo RN R Tn θo Tn θo RN R 0, as N , w.p.1, −⊺ − − (4.155) whereZ we( used) ‰ (4.126)− Žand( Lemma)Z ≤ Z 3.1.( In turn,)Z Z using− alsoZ → Lemma 3.2,→ it∞ implies for the first two terms on the right side of (4.154) that

ˆLS n 1 ˆLS 1 Tn θN RN Tn θN Tn θo −⊺ ˆLS n− 1 ˆLS− 1 Z Tn( θN) ‰RN ( Tn ) −θN ( Tn)ŽZθ≤o 0, as N , w.p.1, −⊺ − − (4.156) 1 ˆLS 1 n 1 ≤ ZTn (θN )ZTY n YθoZ R( N T)n − θo ( )Z → → ∞ − − ⊺ − 1 ˆLS 1 n 1 [‰ Tn ( θN) − Tn ( θo)Ž RN T(n )[θo≤ 0, as N , w.p.1. − − − Then,≤ weZ can( also) − use (4.154)( )Z Y andY (4.155)Z ( to)Z conclude→ that→ ∞

ˆLS ¯ Wn θN Wn θo 0, as N , w.p.1. (4.157)

Together with (4.95)Z and( (4.153),) − we( have)Z → that → ∞

n N ˆLS ¯ n N Qn ηo Wn θN Wn θo Qn ηo 0, as N , w.p.1. (4.158) ⊺ ( ) ( ) ZFinally,( (4.150),) ‰ ( (4.151),) − (4.152),( )Ž and( (4.158))Z → give → ∞

ˆLS n N ¯ n N M ηˆN , θN Qn ηo Wn θo Qn ηo 0, as N , w.p.1, (4.159) ⊺ ( ) ( ) which,Z ( in turn, implies) − ((4.149)). Since( )M η(o, θo is)Z invertible,→ continuity→ ∞ of eigenvalues ¯ ˆLS ¯ gives that there exists N such that M ηˆN , θN is invertible for all N N, w.p.1. ( ) We now have the necessary results( to prove) Theorem 4.2. > 4.C. Asymptotic Distribution and Covariance of Step 3 75

4.B.1 Proof of Theorem 4.2 Using a similar approach to (4.119), we write

ˆWLS 1 ˆLS n ˆLS n N θN θo M ηˆN , θN Qn ηo Wn θN Tn θo ηˆN ηo , (4.160) − ⊺ ( ) and analyze− = ( ) ( ) ( ) ( )( − ) ˆWLS 1 ˆLS n ˆLS n N θN θo M ηˆN , θN Qn ηo Wn θN Tn θo ηˆN ηo , (4.161) − ⊺ ( ) whichZ can be− boundedZ = Z by( ) ( ) ( ) ( )( − )Z ˆWLS 1 ˆLS n ˆLS n N θN θo M ηˆN , θN Qn ηo Wn θN Tn θo ηˆN ηo . − ( (4.162)) Z − Z ≤ Z ( )Z Y ( )YˆZLS ( )Z Y ( )Y Z − Z Due to Theorem 4.5, the inverse of M ηˆN , θN exists for sufficiently large N, and therefore its norm is bounded, since it is a matrix of fixed dimensions. Moreover, we have that, making explicit that n n N( , )

ˆLS = ( )1 ˆLS 2 n N Wn N θN Tn N θN RN . (4.163) − ( ) ( ) ( ) Then, from (4.128) andZ Lemma( 3.2,)Z ≤ weZ have( )Z [ [

ˆLS ¯ Wn N θN C, N N. (4.164)

Finally, using also (4.95), (4.100)Z ( ) and( (4.92),)Z ≤ we∀ conclude> from (4.162) that

ˆWLS θN θo 0, as N , w.p.1. (4.165)

Z − Z → → ∞

4.C Asymptotic Distribution and Covariance of Step 3

The purpose of this appendix is to prove Theorem 4.3. Before we do so, we recall a result from [69].

Lemma 4.1. Let xn be a sequence of random variables that is asymptotically Gaussian distributed— xn As 0,P . Let Mn be a sequence of random square matrices that converge{ } in probability to a non-singular matrix M, and bn be a sequence of random vectors{ } ∼ thatN converges( ) in{ probability} to b. Also, let { } yn Mnxn bn. (4.166)

Then, yn converges in distribution to= b, MP+ M . ⊺ Proof. See [69, Lemma B.4]. N ( ) We now proceed to prove Theorem 4.3. 76 Weighted Null-Space Fitting

4.C.1 Proof of Theorem 4.3 We begin by reformulating (4.160) as ˆWLS 1 ˆLS ˆLS N θN θo M ηˆN , θN x θo;η ˆN , θN , (4.167) √ − where ( − ) = ( ) ( ) ˆLS ˆLS M ηˆN , θN Qn ηˆN Wn θN Qn ηˆN , (4.168) ˆLS ⊺ ˆLS n N x θo(;η ˆN , θN ) = NQ( n)ηˆN (Wn )θN (Tn )θo ηˆN ηo . √ ⊺ ( ) From (4.148)( and (4.149),) = we have( that) ( ) ( )( − ) 1 ˆLS 1 M ηˆN , θN MCR, as N , w.p.1. (4.169) − − Then, if we assume that ( ) → → ∞ ˆLS x θo;η ˆN , θN As 0,P , (4.170) we have that, using Lemma 4.1,( ) ∼ N ( ) ˆWLS 1 1 N θN θo As 0,MCRPMCR . (4.171) √ − − We will now proceed to show( that− (4.170)) ∼ N is‰ verified with Ž 2 n ¯ n 2 P σ lim Qn η Wn θo Qn η σ MCR. (4.172) o n o o o ⊺ The second equality follows= →∞ directly( ) from( (4.136)) ( and) =(4.148), so we are now con- cerned with the first equality. Using (4.62) and (4.168), we can write ˆLS ˆLS n N x θo;η ˆN , θN NQn ηˆN Wn θN Tn θo ηˆN ηo , (4.173) ⊺ ( ) √ n whose asymptotic( distribution) = and( covariance) ( ) we( want)( to− derive.) If ηˆN ηo were of fixed dimension, and pre-multiplied by a square matrix converging w.p.1 to a − ˆLS deterministic matrix, the asymptotic distribution and covariance of x θo; ηˆN , θN n would follow directly from Lemma 4.1. Because here the dimension of ηˆN ηo grows with n N , Theorem 3.2 must be used instead. However, (4.173)( is not) n − ready to be used with Theorem 3.2, since it requires ηˆN ηo to be pre-multiplied by a deterministic( ) matrix. We will thus proceed to show that (4.173) has the n same asymptotic distribution and covariance as an expression− of the form Υ ηˆN n N n ηo , where Υ is a deterministic matrix. For this purpose, we will repeatedly use Lemma( ) 4.1. [ − We] start by re-writing (4.173) as

ˆLS ˆLS n N x θo;η ˆN , θN NQn ηˆN Wn θN Tn θo ηˆN η¯ √ ⊺ ˆLS ( ) n N n N ( ) = ( Q) n ηˆ(N W)n θ(N )(Tn θ−o N )η¯ ηo , (4.174) ⊺ √ ( ) ( ) + ( ) ( ) ( ) ( − ) 4.C. Asymptotic Distribution and Covariance of Step 3 77 where ˆLS n N n N Qn ηˆN Wn θN Tn θo N η¯ ηo ⊺ √ ( ) ( ) (4.175) 1 ˆLS 2 n n N n N [ Q(n ηˆ)N (Tn )θN( ) RN( Tn θ−o N)[η¯≤ ηo . − √ ( ) ( ) Then, using≤ Y Lemma( )Y Z 3.2,( (4.97))Z ,Y (4.100)YY ,((4.128))Y , andZ Lemma− 3.3,Z we have, for sufficiently large N, w.p.1, ˆLS n N n N Qn ηˆN Wn θN Tn θo N η¯ ηo ⊺ √ ( ) ( ) (4.176) [ ( ) ( ) C( Nd) N( 0, −as N )[ ≤. √ ˆLS Using Lemma 4.1, we have that x θ≤o;η ˆN , θN( and) → → ∞ ˆLS n N NQn ηˆN W( n θN Tn )θo ηˆN η¯ (4.177) √ ⊺ ( ) have the same asymptotic distribution( ) ( and) the( )( same− asymptotic) covariance, so we will analyze (4.177) instead. Now, we re-write (4.177) as ˆLS n N NQn ηˆN Wn θN Tn θo ηˆN η¯ √ ⊺ n n N( ) NQn ηˆN Tn θo RN ηˆN η¯ ( ) ( ) ( )( − ) = (4.178) √ ⊺ −⊺ ( ˆ)LS ˆLS n n N = NQn (ηˆN )Tn (θo ) Tn( θo − Tn θN) Tn θN RN ηˆN η¯ ⊺ √ ⊺ −⊺ ˆLS n ˆLS −⊺ n N ( ) + NQn(ηˆN )Tn (θN) ‰RN( T)n−θo ( Tn)ŽθN (ηˆN ) η¯ ( .− ) √ ⊺ −⊺ ( ) For+ the second( and) third( ) terms‰ on( ) the− right( ) sideŽ ( of−(4.178)), we can write, for sufficiently large N, w.p.1, ˆLS ˆLS n n N NQn ηˆN Tn θo Tn θo Tn θN Tn θN RN ηˆN η¯ ⊺ −⊺ ⊺ −⊺ ( ) √ 1 1 ˆLS n ˆLS n N [Qn ηˆN ( T)n θ(o ) ‰Tn ( θN) − R(N )ŽN Tn( θo ) Tn( θN− ηˆN )[η¯≤ − ˆLS − n N √ ( ) ≤ YC N( T)Yn Zθo (Tn)ZZθN (ηˆN )Zη¯Y Y Z ( ) − ( )ZZ − Z ≤ √ ( ) (4.179) ≤ Z ( ) − ( )ZZ − Z and ˆLS n ˆLS n N NQn ηˆN Tn θN RN Tn θo Tn θN ηˆN η¯ ⊺ −⊺ ( ) √ 1 ˆLS n ˆLS n N (4.180) [Qn ηˆN ( T)n θ(N ) RN‰ (N )T−n θo( T)nŽ (θN − ηˆN )[η¯ ≤ − ˆLS √ n N ( ) ≤ CY N( T)Yn Zθo (Tn )θZNY YηˆN η¯Z ( ,) − ( )ZZ − Z √ ( ) where≤ we usedZ Lemma( ) − 3.2,( (4.97))ZZ, (4.126)− , andZ (4.128). Notice that, due to (4.130) and Theorem 3.1,

3 2 ˆLS n N n N log N 2 N Tn θo Tn θN ηˆN η¯ 1 d N , (4.181) ~ N √ ( ) ( ) Z ( ) − ( )ZZ − Z = O Œ √ ( + ( )) ‘ 78 Weighted Null-Space Fitting where

3 n3 2 N log N n4 δ N 2(4+δ) log N 0, as N , (4.182) ~ + 1+δ N N N 2(4+δ) ( ) ( ) √ = Œ ‘ → → ∞ using Condition D4. So, we conclude that

ˆLS n N N Tn θo Tn θN ηˆN η¯ 0, as N , w.p.1. (4.183) √ ( ) This implies,Z using( ) also− (4.179)( )ZZand −(4.180),Z that→ the second→ ∞ and third terms on the ˆLS right side of (4.178) tend to zero, as N , w.p.1. Therefore, x θo;η ˆN , θN and

n n N NQn ηˆN Tn →θo∞RN ηˆN η¯ ( (4.184)) √ ⊺ −⊺ ( ) have the same asymptotic distribution( ) ( and) the( same− asymptotic) covariance, so we will analyze (4.184) instead. In turn, (4.184) can be written as

n n N ¯n n N NQn ηˆN Tn θo RN ηˆN η¯ NQn ηˆN Tn θo R ηˆN η¯ √ ⊺ −⊺ ( ) √ ⊺ n −⊺¯n n N ( ) ( ) ( ) ( NQ− n ηˆN) =Tn θo (RN) R( )ηˆN ( η¯ − , ) √ ⊺ −⊺ ( ) (4.185) + ( ) ( ) ‰ − Ž ( − ) where, for sufficiently large N, w.p.1,

n ¯n n N NQn ηˆN Tn θo RN R ηˆN η¯ ⊺ −⊺ ( ) √ 1 n ¯n n N (4.186) [ Qn ηˆ(N ) Tn (θo) ‰ N− RNŽ (R − ηˆN )[η¯ ≤ n −¯n √ n N ( ) ≤ YC N( R)YNZ R ( )ηˆZN η¯Z −, ZZ − Z ≤ √ ( ) due to (4.97) and≤ (4.126).Z Also,− usingZZ Theorem− Z 3.1 and Lemma 3.1,

n ¯n n N N RN R ηˆN η¯ √ ( ) n3 2 N log N n2 N log N n3 N Z 2 − ZZ − 1 dZN= C 1 d N , ~ N ¾ N ¾ N ⎛ ( ) ( ) ( ) ⎞ = O √ ( + ( )) + ( + ( ))(4.187) ⎝ ⎠ where the first term tends to zero due to (4.182), and the second term tends to zero due to Conditions D3 and D4. Then, from (4.186), the second term on the right ˆLS side of (4.185) tends to zero, as N , w.p.1. Thus, x θo;η ˆN , θN and ¯n n N NQn ηˆN →Tn∞ θo R ηˆN η¯ ( ) (4.188) √ ⊺ −⊺ ( ) have the same asymptotic distribution( ) ( and) the( same− asymptotic) covariance, so we will analyze (4.188) instead. 4.C. Asymptotic Distribution and Covariance of Step 3 79

Finally, we write (4.188) as

¯n n N NQn ηˆN Tn θo R ηˆN η¯ √ ⊺ n N−⊺ ¯n ( ) n N NQn ηo Tn θo R ηˆN η¯ ( ) ( ) ( − ) = (4.189) √ ⊺ ( ) −⊺ n N ( ¯)n n N = N Qn( ηˆN )Qn (η¯ ) ( Tn− θo R) ηˆN η¯ √ n N ( n) N⊺ −⊺ ¯n ( n) N + N[Qn(η¯ ) − Q( n ηo )] T(n )θo R( ηˆ−N η¯ ) . √ ( ) ( ) ⊺ −⊺ ( ) The second+ term[ on( the right) − side( of (4.189))] satisfies,( ) ( for sufficiently− ) large N, w.p.1,

n N ¯n n N N Qn ηˆN Qn η¯ Tn θo R ηˆN η¯ ( ) ⊺ −⊺ ( ) √ 1 ¯n n N n N [ Tn[ θo( )R− (N Qn)]ηˆN (Qn) η¯ ( − ηˆN )[η¯ ≤ − ( ) ( ) C N Q η √ Q ηn N η ηn N ≤ Z 1 ( )ZZn ˆNZ Zn ¯ ( ) − ˆN ( ¯ )ZZ − Z (4.190) ( ) ( ) √ n N 2 ≤ C2 N ZηˆN ( η¯ ) − ( )ZZ − Z √n N log N ( ) ≤ Z − 1 Zd N 2 , N ( ) = O Œ √ ( + ( )) ‘ using Lemma 3.2, (4.93), (4.126), and Theorem 3.1. Since

n N log N 1 d N 2 0, as N , (4.191) N ( ) √ ( + ( )) → → ∞ because the dominating term decreases at a faster rate than Condition D3 imposes, we have that the second term on the right side of (4.189) tends to zero, as N , w.p.1. Concerning the third term, we have, using also Lemma 3.3, for sufficiently large N, w.p.1, → ∞

n N n N ¯n n N N Qn η¯ Qn ηo Tn θo R ηˆN η¯ ( ) ( ) ⊺ −⊺ ( ) √ 1 ¯n n N n N n N [ Tn[ θo( R ) − N (Qn η¯)] (Qn) ηo ( − ηˆN )[η¯ ≤ − ( ) ( ) ( ) C N Q η¯n N√ Q ηn ηˆ η¯n N ≤ Z 1 ( )ZZn Z Z n( o ) N− ( )ZZ − Z (4.192) √ n N ( )n N n N ( ) ≤ C2 N Zη¯ ( ηo) − η(ˆN )ZZη¯ − Z √ ( ) ( ) ( ) n N log N ≤ Z − 1 ZZd N − Nd ZN , ¾ N ⎛ ( ) √ ⎞ = O ( + ( )) ( ) ⎝ ⎠ ˆLS which tends to zero as N , due to Conditions D2 and D3. Thus, x θo; ηˆN , θN and → ∞ n N ¯n n N ( ) NQn ηo Tn θo R ηˆN η¯ (4.193) √ ⊺ ( ) −⊺ ( ) have the same asymptotic distribution( ) and( ) the( same− asymptotic) covariance, so we will analyze (4.193) instead. 80 Weighted Null-Space Fitting

n n ¯n Let Υ Qn ηo Tn θo R . Then, using Theorem 3.2, ⊺ −⊺ n n N ∶= ( ) (N)Υ ηˆN η¯ As 0,P , (4.194) √ ( ) where P is given by (4.172). ( − ) ∼ N ( ) Finally, using (4.148), (4.171), and (4.172), we have that

ˆWLS 2 1 N θN θo As 0, σoMCR . (4.195) √ − ( − ) ∼ N ‰ Ž Chapter 5 Model Order Reduction Steiglitz-McBride

In this chapter, we propose a method to estimate structured plant models, while only a high order noise model is considered. The method estimates a high order ARX model and then reduces it to a low order model for the plant using the Steiglitz-McBride method, while the noise model is non-parametric. Moreover, the method provides an optimal estimate with one iteration of the Steiglitz-McBride. We name this method Model Order Reduction Steiglitz-McBride (MORSM). For this method, we separate the open loop and closed loop cases. When PEM is used in open loop, the plant and noise model estimates are asymptotically uncorrelated if they are independently parametrized. Therefore, asymptotically efficient estimates of the plant can be obtained even if a high order noise model is used. However, in closed loop, a correctly parametrized noise model is required to achieve asymptotic efficiency for the plant estimates. Nevertheless, there exists an upper bound on the covariance of the plant estimates when PEM is applied with a noise model order that tends to infinity. As the proposed method only considers a high order noise model, efficiency cannot be attained in closed loop. However, the estimates have the same asymptotic covariance as PEM with an infinite order noise model. The chapter is organized as follows. In Section 5.1, we state the problem we want to solve. In Section 5.2, we provide the motivation for the method. In Section 5.3, we present the method for open loop, and argue that it provides asymptotic efficient estimates of the plant. A formal proof is not presented in this thesis, but will be included in [70], which is under preparation. In Section 5.4, we present the method for closed loop. Although efficiency is not attained, we argue that the plant estimates have the same asymptotic covariance as PEM when used with an infinite order noise model. In Section 5.5, we provide a comparison with WNSF, where we observe that the methods are essentially equivalent when the noise model is known to be identity. Simulation examples are performed in Section 5.6, followed by conclusions in Section 5.7. The contributions in this chapter are mostly based on [70].

81 82 Model Order Reduction Steiglitz-McBride

5.1 Problem Statement

Consider that data are generated by

yt Go q ut Ho q et (5.1) where ut is the input, et is= an( unknown) + ( Gaussian) white noise signal with 2 variance σo, and Go q and Ho q are rational functions according to { } { } o o 1 o ml ( ) Lo( q) l1q lml q G q o , o o− − m Fo q 1 f q 1 f o q f ( ) 1 + ⋯ + mf − − o (5.2) ( ) = = 1 coq 1 co q mc Co(q) + 1 + ⋯ + mc H q o . o o −1 o − m Do q 1 d q dm q d ( ) + 1 + ⋯ + d ( ) = = − − The transfer functions Go q , H(o)q , and+ 1 Ho+ ⋯q +are assumed to be stable. The polynomials Lo q and Fo q —as well as Co q and Do q —do not share common factors. ( ) ( ) ~ ( ) We consider( a) Box-Jenkins( ) model, with ( ) ( )

yt G q, θ ut H q, γ et, (5.3) where = ( ) + ( ) L q, θ l q 1 l q ml G q, θ 1 ml , − 1 − mf F q, θ 1 f1q fm q ( ) + ⋯ + f (5.4) ( ) = C q, γ = 1 c q−1 c q −mc H q, γ ( ) + 1 + ⋯ + mc , −1 −md D q, γ 1 d1q dm q ( ) + + ⋯ + d and ( ) = = − − ( ) + + ⋯ + mf ml θ f1 . . . fmf l1 . . . lml R , ⊺ + (5.5) mc md γ = c1 . . . cmc d1 . . . dmd ∈ R . ⊺ + o The orders of the polynomials=  of Go q are assumed ∈ to be known (i.e., mf mf o and ml ml ). For simplicity of notation only, we also assume that mg mf ml. ( ) o o = Concerning H q, γ , we do not assume that the orders mc and md are known. Again, = o o o ∶= = for simplicity, we assume mh mc md and mh mc md. In this chapter,( ) we will discuss the consequences of under-parametrizing and over-parametrizing H q, γ for∶= the= estimation of∶=θ. The= objective is to propose a least squares method that non-iteratively provides asymptotically optimal estimates of G q, θ , when H q,( γ is) of infinite order. The motivation to do this is explained next. ( ) ( ) 5.2 Motivation

As discussed in Chapter 1, the model for a system is an abstraction. The model can never be a true description of the system, but some form of representation to be 5.2. Motivation 83 used with a specific purpose. The choice of a model structure and a model order is thus an important part of system identification, although not covered in this thesis, as we assume that the true system is given by the chosen model evaluated at a particular set of parameter values. Nevertheless, in this chapter we separate the need to know the true model order of the plant and of the noise model, if only the plant estimate is of interest. Even if a model is always an abstraction, an appropriate choice of plant model order can often be based on physical knowledge or intuition. Surely, different model orders can be chosen depending on the desired precision of the system dynamics. However, that is still a choice that, in many cases, feels directly connected to the physical reality. On the other hand, the choice of noise model order is typically more difficult to grasp from a physical perspective. The noise affecting the output can come from many different sources: sensor noise, unmodeled dynamics, process noise. However, because it is impossible to distinguish the noise sources individually on the measured output, their aggregate effect is lumped into a signal consisting of white noise filtered by the noise model. The objective is to create a noise signal whose spectrum resembles that of the aggregate effect of the disturbances. From this discussion, it makes sense that the concept of true order for a noise model requires a higher level of abstraction than for the plant model. Therefore, a method where the choice of noise model order is not required would be more user friendly. This has been observed in [71]. Therein, a frequency domain method that estimates a parametric model for the plant and a non-parametric noise model is proposed for estimation of Box-Jenkins systems. Then, it makes sense to discuss what the consequences are for the plant model estimate when a noise model that is under- or over-parametrized is chosen. We consider the prediction error method in open loop and closed loop. If the data is collected in open loop, an under-parametrized noise model will still provide consistent estimates of the plant, but asymptotic efficiency is lost. For an over-parametrized noise model, the plant estimates will still be asymptotically efficient, even if the noise model order tends to infinity. Therefore, in open loop, a precise choice of noise model order is not so important: if one is only interested in consistent estimates, there is no need to worry about the noise model order; to obtain asymptotic efficiency, one needs to choose the order arbitrarily large. The closed loop case has been studied in [72], where different approaches to identification are considered. In this thesis, we consider direct estimation, which uses the plant input and the measured output to directly estimate the system. If an under-parametrized noise model is used, the plant estimates will be biased. If an over-parametrized model is used, the plant estimates will be consistent, but not asymptotically efficient. Asymptotic efficiency is only achieved when a noise model with the correct order is chosen. However, when the noise model order tends to infinity, there is an upper bound on the asymptotic covariance of the plant estimates. 
From these considerations, we observe that, if the true noise model order is unknown, it is appropriate to guarantee that it is in the model set. Therefore, one can simply choose the noise model order to be arbitrarily large: in open loop, the plant estimates are asymptotically efficient; in closed loop, they are consistent and the asymptotic covariance is below a certain bound. The problem is that the more parameters PEM is required to estimate, the more difficult it is to converge to the global optimum, due to increased problems with local minima. Thus, choosing the noise model order arbitrarily large is in general not appropriate with the prediction error method. However, with an ARX model, the estimate obtained with PEM has a closed form solution. In this case, as discussed in Chapter 2, the order can be chosen arbitrarily large without compromising attainment of the global optimum. Moreover, an ARX model estimate asymptotically (in model order) provides consistent estimates of more general classes of systems, such as Box-Jenkins. Consider, as in (4.46),

$A(q,\eta^n)\,y_t = B(q,\eta^n)\,u_t + e_t$, (5.6)

where

$A(q,\eta^n) = 1 + a_1 q^{-1} + \cdots + a_n q^{-n}$,
$B(q,\eta^n) = b_1 q^{-1} + \cdots + b_n q^{-n}$, (5.7)

and the parameter vector to be estimated is

$\eta^n = [a_1 \ \cdots \ a_n \ b_1 \ \cdots \ b_n]^\top$. (5.8)

Then, the ARX model parameter estimate $\hat\eta_N^n$ is obtained with PEM by least squares, as in (4.53), according to

$\hat\eta_N^n = [R_N^n]^{-1} r_N^n$, (5.9)

where

$R_N^n = \frac{1}{N}\sum_{t=1}^{N} \varphi_t^n (\varphi_t^n)^\top, \quad r_N^n = \frac{1}{N}\sum_{t=1}^{N} \varphi_t^n y_t$, (5.10)

and

$\varphi_t^n = [-y_{t-1} \ \cdots \ -y_{t-n} \ u_{t-1} \ \cdots \ u_{t-n}]^\top$. (5.11)

Having an estimate $\hat\eta_N^n$, we have (for sufficiently large $N$ and $n$), as pointed out in (2.65), that

$G_o(q) \approx \frac{B(q,\hat\eta_N^n)}{A(q,\hat\eta_N^n)}, \quad H_o(q) \approx \frac{1}{A(q,\hat\eta_N^n)}$. (5.12)

Then, using (5.12) and the statistical properties of $\hat\eta_N^n$ (presented in detail in Chapter 3), we can obtain low order estimates of $G_o(q)$ and $H_o(q)$ from a maximum likelihood criterion, using the parametrization (5.4). The WNSF method has this ML interpretation. However, we are now not interested in a low order estimate of $H_o(q)$. Then, we could use a similar approach to WNSF, but considering only the first part of (5.12).
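To make this step concrete, the following is a minimal numerical sketch of the high order ARX estimate (5.9)–(5.11), assuming numpy arrays y and u; the function name and the truncation of the first n equations (initial conditions are treated in Chapter 6) are our own choices, not part of the method's formal definition.

    import numpy as np

    def arx_ls(y, u, n):
        # High order ARX estimate (5.9)-(5.11):
        # A(q, eta) y_t = B(q, eta) u_t + e_t, with regressor
        # phi_t = [-y_{t-1} ... -y_{t-n}  u_{t-1} ... u_{t-n}]^T.
        N = len(y)
        rows = [np.concatenate((-y[t - 1::-1][:n], u[t - 1::-1][:n]))
                for t in range(n, N)]   # first n equations truncated (t <= n)
        eta, *_ = np.linalg.lstsq(np.array(rows), y[n:], rcond=None)
        return eta[:n], eta[n:]         # (a_1, ..., a_n), (b_1, ..., b_n)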

Letting $\eta_o^n$, similarly to (4.58), be the truncated parameter values of the ARX model for which $G_o(q) = B_o(q)/A_o(q)$, we can write

$F(q,\theta)B(q,\hat\eta_N^n) - L(q,\theta)A(q,\hat\eta_N^n) \approx F(q,\theta)B(q,\hat\eta_N^n - \eta_o^n) - L(q,\theta)A(q,\hat\eta_N^n - \eta_o^n)$. (5.13)

In vector form, we may define, analogously to (5.13),

$z(\hat\eta_N^n,\theta) := \hat\eta_N^n - Q_n(\hat\eta_N^n)\,\theta = T_n(\theta)\,(\hat\eta_N^n - \eta_o^n)$, (5.14)

where

$Q_n(\eta) = [\,Q_n^f(\eta) \ \ -Q_n^l(\eta)\,], \quad T_n(\theta) = [\,T_n^f(\theta) \ \ -T_n^l(\theta)\,]$, (5.15)

with $Q_n^f(\eta)$ and $Q_n^l(\eta)$ given by (4.60), and $T_n^f(\theta)$ and $T_n^l(\theta)$ by (4.64). Similarly to (4.23), the log-likelihood function of $z(\hat\eta_N^n,\theta)$ is given by

$L(\theta) = \log C - \frac{1}{2}\log\{2\pi \det[P_z(\theta)]\} - \frac{1}{2}[z(\hat\eta_N^n,\theta)]^\top P_z^{-1}(\theta)\, z(\hat\eta_N^n,\theta)$, (5.16)

where

$P_z(\theta) = T_n(\theta)\,\mathrm{cov}(\hat\eta_N^n)\,[T_n(\theta)]^\top$. (5.17)

For WNSF, we argued that we could apply iterative weighted least squares to the third term of (5.16) as an attempt to minimize it. This is because, when also a low order noise model is obtained, the second term in (5.16) is not parameter dependent. We can see this by noticing that $T_n(\theta)$ is, in that case, given by (4.63), and thus $\det P_z(\theta) = [\det T_n(\theta)]^2 \det \mathrm{cov}(\hat\eta_N^n) = \det \mathrm{cov}(\hat\eta_N^n)$, since $\det T_n(\theta) = 1$. However, $T_n(\theta)$ is now given by (5.15), which is not square. Therefore, the term $\det P_z(\theta)$ will be parameter dependent in general, and the WNSF idea to minimize (5.16) cannot be applied.

Alternatively, as we argued in Section 2.3.1, an asymptotic ML approach allows us to reduce a high order ARX model by considering the estimation of $G(q,\theta)$ and $H(q,\gamma)$ separately. In particular, the estimation of $G(q,\theta)$ is (in open loop) obtained by minimizing the criterion (2.79):

$V_N(\theta) = \int_0^{2\pi} \left| G(e^{i\omega},\hat\eta_N^n) - G(e^{i\omega},\theta) \right|^2 \frac{\Phi_u(e^{i\omega})}{|H(e^{i\omega},\hat\eta_N^n)|^2}\, d\omega$. (5.18)

The idea of the ASYM method [19], described also in Section 2.3.1, is to minimize the time domain equivalent

$V_N(\theta) = \frac{1}{N}\sum_{t=1}^{N}\left[\left(\frac{B(q,\hat\eta_N^n)}{A(q,\hat\eta_N^n)} - G(q,\theta)\right) A(q,\hat\eta_N^n)\, u_t\right]^2$, (5.19)

as given in (2.80). However, (5.19) is non-convex. In this thesis, we are interested in using least squares methods. With that purpose, we make a connection between the ASYM method and BJSM [22] (see also Section 2.3.1). We use ideas from both methods to propose a procedure that overcomes the limitations of each.

5.3 Open Loop

Before stating the proposed method, we will present some results regarding the asymptotic properties of the plant estimates obtained with PEM when the noise model is over-parametrized. In this section, the discussion only considers open loop operation. Then, we will recall the main ideas of the BJSM algorithm as a way to introduce the proposed method.

5.3.1 Prediction Error Method

For BJ models (5.3), the one step ahead prediction errors (2.31) are given by

$\varepsilon_t(\theta,\gamma) = \frac{D(q,\gamma)}{C(q,\gamma)}\left[y_t - \frac{L(q,\theta)}{F(q,\theta)}\, u_t\right]$. (5.20)

The parameter estimates, using PEM with a quadratic cost function, are determined by minimizing the loss function

$V_N(\theta,\gamma) = \frac{1}{N}\sum_{t=1}^{N}\varepsilon_t^2(\theta,\gamma)$, (5.21)

where $N$ is the number of data samples. We denote by $\hat\theta_N^{\mathrm{PEM}}$ the estimate of $\theta$ obtained by minimizing (5.21). Moreover, $\theta_o$ corresponds to the vector $\theta$ evaluated at the coefficients of $F_o(q)$ and $L_o(q)$. Since the system operates in open loop, it is well known that the estimate of $\theta$ is asymptotically uncorrelated with the estimated parameters $\gamma$ in the noise model $H(q,\gamma)$, as long as the true noise model is in the model set [1]. In particular, when PEM is applied to the model (5.3), we have (independently of the estimate of $\gamma$)

$\sqrt{N}\,(\hat\theta_N^{\mathrm{PEM}} - \theta_o) \sim \mathrm{As}\mathcal{N}\!\left(0, [M_{\mathrm{OL}}^{\mathrm{CR}}]^{-1}\right)$, (5.22)

where (we omit the argument of the transfer functions for brevity)

$M_{\mathrm{OL}}^{\mathrm{CR}} = \frac{1}{2\pi\sigma_o^2}\int_0^{2\pi} \begin{bmatrix} \frac{G_o}{F_o H_o}\Gamma_{m_g} \\ -\frac{1}{F_o H_o}\Gamma_{m_g} \end{bmatrix} \Phi_u \begin{bmatrix} \frac{G_o}{F_o H_o}\Gamma_{m_g} \\ -\frac{1}{F_o H_o}\Gamma_{m_g} \end{bmatrix}^{*} d\omega$, (5.23)

with $\Gamma_m(q) = [q^{-1} \ \ldots \ q^{-m}]^\top$ and $\Phi_u$ the spectrum of the input $\{u_t\}$. When $\{e_t\}$ is Gaussian, PEM with a quadratic cost function is asymptotically efficient, meaning that $[M_{\mathrm{OL}}^{\mathrm{CR}}]^{-1}$ corresponds to the Cramér-Rao lower bound, the smallest possible asymptotic covariance matrix for a consistent estimator [1]. Again, we observe that only the orders of $G_o(q)$ need to be chosen correctly to achieve efficiency, while $H(q,\gamma)$ only needs to include $H_o(q)$ for some particular values of $\gamma$. Thus, if only a model of $G_o(q)$ is of interest, and the order of $H_o(q)$ is unknown, $m_h$ can be allowed to grow to infinity (guaranteeing that $H_o(q)$ is in the model set) without affecting the asymptotic properties of the estimate of $\theta$.

An important remark is that the non-convex optimization problem of minimizing (5.21) becomes yet more challenging if we let the order of the noise model $H(q,\gamma)$ be arbitrarily large, due to the presence of more local minima. Thus, the risk of PEM failing to attain the global optimum increases.

5.3.2 Box-Jenkins Steiglitz-McBride

In Chapter 2, we described the Steiglitz-McBride method [48]. It is an iterative method to deal with the non-convexity of PEM. The setting of the method is for OE models. However, instead of estimating the OE model directly, it iteratively estimates ARX models using data sets filtered by the estimate obtained in the previous iteration. The estimates obtained in this way can be shown to be consistent if the true system is of OE structure, assuming convergence of the Steiglitz-McBride iterations. However, the method is in general not asymptotically efficient (i.e., the asymptotic covariance of the estimates is larger than $[M_{\mathrm{OL}}^{\mathrm{CR}}]^{-1}$). The BJSM method [22] uses the Steiglitz-McBride iterations, but extends its range of applicability and improves its asymptotic properties. In particular, the method provides consistent and asymptotically efficient estimates of the plant if the true system is of BJ structure, under the assumptions that the data is obtained in open loop and that the Steiglitz-McBride iterations converge. The algorithm has been described in Chapter 2. Here, we recall that the main idea is to first estimate an ARX model, obtaining an estimate $\hat\eta_N^n$. Then, the Steiglitz-McBride method is applied to the pre-filtered data set

$y_t^{\mathrm{pf}} = A(q,\hat\eta_N^n)\, y_t, \quad u_t^{\mathrm{pf}} = A(q,\hat\eta_N^n)\, u_t$. (5.24)

The motivation is that the pre-filtered data set satisfies

$y_t^{\mathrm{pf}} = \frac{L_o(q)}{F_o(q)}\, u_t^{\mathrm{pf}} + A(q,\hat\eta_N^n) H_o(q)\, e_t$, (5.25)

which asymptotically behaves according to

$y_t^{\mathrm{pf}} \approx \frac{L_o(q)}{F_o(q)}\, u_t^{\mathrm{pf}} + e_t$. (5.26)

Since (5.26) is of OE structure, the Steiglitz-McBride algorithm can be applied to the data set $\{y_t^{\mathrm{pf}}, u_t^{\mathrm{pf}}\}$. The BJSM method is then used with open loop data [23]. Notice that, if we were to apply PEM to the pre-filtered data set, we would minimize, motivated by (5.26),

$V_N(\theta) = \frac{1}{N}\sum_{t=1}^{N}\left( y_t^{\mathrm{pf}} - \frac{L(q,\theta)}{F(q,\theta)}\, u_t^{\mathrm{pf}} \right)^2$. (5.27)

To avoid a non-convex optimization problem, the Steiglitz-McBride method is used instead.

In Section 2.3.3, we observed two limitations of BJSM. First, $A(q,\hat\eta_N^n)$ needs to be estimated even if $H_o(q) = 1$ (i.e., an ARX model is still required, although a high order FIR model would be sufficient to asymptotically model the true system). This is because BJSM does not use the complete high order ARX estimate: notice that the data $\{y_t\}$ is still used in the pre-filtering step, and the estimate $B(q,\hat\eta_N^n)$ is never used. Second, the method requires the number of Steiglitz-McBride iterations to tend to infinity to provide a consistent and asymptotically efficient estimate. We now apply the idea of using the Steiglitz-McBride method together with the ASYM criterion. This will allow us to use the full statistical properties of the high order ARX model and obtain an asymptotically efficient estimate in one iteration.

5.3.3 Model Order Reduction Steiglitz-McBride

The objective of our approach is to minimize (5.19) without using a non-convex optimization method. To do so, we use an approach similar to BJSM and apply the Steiglitz-McBride method. First, we write (5.19) as

$V_N(\theta) = \frac{1}{N}\sum_{t=1}^{N}\left[ B(q,\hat\eta_N^n)\, u_t - \frac{L(q,\theta)}{F(q,\theta)}\, A(q,\hat\eta_N^n)\, u_t \right]^2$. (5.28)

Then, we notice that (5.28) has the same form as (5.27) if we define

$y_t^{\mathrm{pf}} := B(q,\hat\eta_N^n)\, u_t, \quad u_t^{\mathrm{pf}} := A(q,\hat\eta_N^n)\, u_t$. (5.29)

Thus, the same idea (i.e., applying the Steiglitz-McBride to $\{y_t^{\mathrm{pf}}, u_t^{\mathrm{pf}}\}$) can be used. The only difference between this approach and BJSM is in how the pre-filtered data set is generated. Comparing (5.29) and (5.24), we observe that $\{u_t^{\mathrm{pf}}\}$ are identical, but $\{y_t^{\mathrm{pf}}\}$ are different. The difference lies in the measured output not being used to construct the new pre-filtered data set. In (5.24), the measured output is used; in (5.29), the output is simulated from the input and the ARX model estimate. In particular, we simulate the output according to

$y_t^{\mathrm{s}} := \frac{B(q,\hat\eta_N^n)}{A(q,\hat\eta_N^n)}\, u_t$, (5.30)

and then apply the same filter as in (5.24), but on the simulated output, according to

$y_t^{\mathrm{pf}} = A(q,\hat\eta_N^n)\, y_t^{\mathrm{s}} = B(q,\hat\eta_N^n)\, u_t$, (5.31)

obtaining the proposed pre-filter (5.29). In summary, the proposed method is as follows:

(1) estimate an ARX model using the input-output data $\{u_t, y_t\}$, according to (5.9);

(2) construct the pre-filtered data $\{u_t^{\mathrm{pf}}, y_t^{\mathrm{pf}}\}$, according to (5.29);


(3) apply the Steiglitz-McBride method to $\{u_t^{\mathrm{pf}}, y_t^{\mathrm{pf}}\}$ to obtain estimates $L(q,\hat\theta_N)$ and $F(q,\hat\theta_N)$ of $L_o(q)$ and $F_o(q)$, respectively (a numerical sketch of the full procedure is given at the end of this section).

Note that the pre-filtered data set (5.29) only depends on the original output data $\{y_t\}$ through the least squares estimate $\hat\eta_N^n$. Note also that, as with WNSF, the output is not explicitly required after this step to produce an asymptotically efficient estimate. With the presented algorithm, MORSM is asymptotically efficient for open loop data. Moreover, there are two advantages of disregarding the data after the high order ARX model has been estimated. First, if the noise contribution affecting the true system (5.1) is white, a high order FIR model can be estimated instead of an ARX model, and still provide an asymptotically efficient estimate. Second, asymptotically this procedure requires only one Steiglitz-McBride iteration. To intuitively understand why this is the case, we recall why the Steiglitz-McBride is an iterative method. Recall from (2.105) that, at the first Steiglitz-McBride iteration,

$V_N(\theta_o) = \frac{1}{N}\sum_{t=1}^{N}\left[ F(q,\theta_o)\, e_t \right]^2$; (5.32)

so, the true parameter vector $\theta_o$ does not correspond to the cost function of a white sequence. Therefore, the estimate $\hat\theta_N^1$ obtained in this step is biased. In the next step, we have from (2.107) that

$V_N(\theta_o) = \frac{1}{N}\sum_{t=1}^{N}\left[ \frac{F(q,\theta_o)}{F(q,\hat\theta_N^1)}\, e_t \right]^2$; (5.33)

so, $\theta_o$ still does not correspond to the cost function of a white sequence, since $\hat\theta_N^1$ is not a consistent estimate of $\theta_o$. Therefore, the new estimate $\hat\theta_N^2$ is still not consistent. We can then iterate, obtaining a new estimate $\hat\theta_N^{k+1}$ with $F(q,\hat\theta_N^k)$. Under the conditions observed in [21], we have that $\hat\theta_N^k \to \theta_o$ as $k \to \infty$. Concerning the original BJSM method, since the pre-filtered data is according to (5.25), it is asymptotically approximately of OE structure, and a similar procedure takes place. On the other hand, the MORSM pre-filtering (5.29), which disregards the original data, satisfies

$y_t^{\mathrm{pf}} = \frac{L_o(q)}{F_o(q)}\, u_t^{\mathrm{pf}} + \left( \frac{B(q,\hat\eta_N^n)}{A(q,\hat\eta_N^n)} - \frac{L_o(q)}{F_o(q)} \right) u_t^{\mathrm{pf}}$. (5.34)

This is a noise-free equation, except for the noisy parameters in the estimated ARX model. However, the second term in (5.34) tends to zero asymptotically. As a consequence, the variance of the error sequence being minimized by the Steiglitz-McBride iterations disappears asymptotically, and only one Steiglitz-McBride iteration is required to obtain an asymptotically efficient estimate.

We observe that the essential idea of the proposed method is to use the Steiglitz-McBride algorithm to perform model order reduction based on an asymptotic ML criterion. We thus refer to the method as Model Order Reduction Steiglitz-McBride (MORSM). Also, we note that the idea of using the Steiglitz-McBride to, in some sense, perform model order reduction is not new. Variants of the Steiglitz-McBride method have been applied to estimate rational filters from an impulse response estimate, instead of applying the method directly to data (e.g., [20, 43, 49]). However, although some of these procedures are in some sense optimal under specific conditions, we consider a quite general system identification problem and motivate the application of the method based on an ML criterion. This, as will be shown in [70], not only provides asymptotically efficient estimates under a quite general class of systems and external signals, but also does so in one iteration.
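As a concrete companion to the steps above, the following is a minimal sketch of open loop MORSM, assuming numpy and scipy are available and reusing the arx_ls function sketched in Section 5.2; the orders mf and ml and the iteration loop are illustrative choices, with one iteration being asymptotically sufficient.

    import numpy as np
    from scipy.signal import lfilter

    def morsm_open_loop(y, u, n, mf, ml, iters=1):
        # Step 1: high order ARX model (5.9).
        a, b = arx_ls(y, u, n)
        A = np.concatenate(([1.0], a))          # A(q, eta_N^n)
        B = np.concatenate(([0.0], b))          # B(q, eta_N^n)
        # Step 2: pre-filtered data (5.29); the measured output is not used.
        y_pf = lfilter(B, [1.0], u)
        u_pf = lfilter(A, [1.0], u)
        # Step 3: Steiglitz-McBride on {u_pf, y_pf}.
        F = np.array([1.0])
        for _ in range(iters):
            yf = lfilter([1.0], F, y_pf)        # filter by 1/F(q, theta^k)
            uf = lfilter([1.0], F, u_pf)
            m = max(mf, ml)
            rows = [np.concatenate((-yf[t - 1::-1][:mf], uf[t - 1::-1][:ml]))
                    for t in range(m, len(u))]
            theta, *_ = np.linalg.lstsq(np.array(rows), yf[m:], rcond=None)
            F = np.concatenate(([1.0], theta[:mf]))   # F(q, theta^{k+1})
            L = np.concatenate(([0.0], theta[mf:]))   # L(q, theta^{k+1})
        return F, L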

5.4 Closed Loop

Consider now that the system (5.1) operates in closed loop. In particular, the input sequence $\{u_t\}$ is generated by

$u_t = r_t - K_o(q)\, y_t$, (5.35)

where $\{r_t\}$ is an external reference signal with spectrum $\Phi_r$, uncorrelated with $\{e_t\}$, and $K_o(q)$ is a linear controller. The input and output data are then given by

$y_t = G_o(q) S_o(q)\, r_t + H_o(q) S_o(q)\, e_t$,
$u_t = S_o(q)\, r_t - K_o(q) H_o(q) S_o(q)\, e_t$, (5.36)

where $S_o(q) = [1 + K_o(q) G_o(q)]^{-1}$ is the sensitivity function. We assume that data $\{u_t, y_t\}$ are available, where the input can either be available through measurements according to (5.36) or computed using (5.35) from a known external reference $\{r_t\}$ and controller $K_o(q)$. We will now discuss the consequences for the statistical properties of the plant estimates obtained when using PEM with an over-parametrized noise model. Then, we will propose a generalization of MORSM that is applicable in closed loop with optimal statistical properties.
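For illustration, closed loop data according to (5.35)–(5.36) can be simulated through the closed loop transfer functions. The following is a minimal sketch, assuming scipy and a controller given as a rational transfer function $K_o = P/Q$ in $q^{-1}$ (this polynomial representation is our own assumption).

    import numpy as np
    from scipy.signal import lfilter

    def polyadd(p, q):
        # Add two polynomials in q^{-1} given as coefficient arrays.
        n = max(len(p), len(q))
        return np.pad(p, (0, n - len(p))) + np.pad(q, (0, n - len(q)))

    def closed_loop_data(r, e, L, F, C, D, P, Q):
        # Simulate (5.36) with G_o = L/F, H_o = C/D, K_o = P/Q, so that
        # S_o = F Q / (F Q + P L), and
        #   y = [L Q / (F Q + P L)] r + [C F Q / (D (F Q + P L))] e,
        #   u = [F Q / (F Q + P L)] r - [P C F / (D (F Q + P L))] e.
        cl = polyadd(np.convolve(F, Q), np.convolve(P, L))   # F Q + P L
        y = (lfilter(np.convolve(L, Q), cl, r)
             + lfilter(np.convolve(np.convolve(C, F), Q),
                       np.convolve(D, cl), e))
        u = (lfilter(np.convolve(F, Q), cl, r)
             - lfilter(np.convolve(np.convolve(P, C), F),
                       np.convolve(D, cl), e))
        return u, y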

5.4.1 Prediction Error Method

As discussed in Section 5.2, PEM does not provide consistent estimates of $\theta$ if $m_h < m_h^o$ (i.e., if the noise model is under-parametrized). If $m_h \ge m_h^o$, the plant estimates will be consistent. However, unlike the open loop case, the asymptotic covariance of $\hat\theta_N$ depends on the exact value of $m_h$ chosen. In closed loop, if $\hat\theta_N^{\mathrm{PEM}}$ is the estimate obtained with PEM, we have that [72]

$\sqrt{N}\,(\hat\theta_N^{\mathrm{PEM}} - \theta_o) \sim \mathrm{As}\mathcal{N}\!\left(0, M_{\mathrm{CL}}^{-1}\right)$, (5.37)

where

$M_{\mathrm{CL}} = \frac{1}{2\pi\sigma_o^2}\int_0^{2\pi} \begin{bmatrix} \frac{G_o}{F_o H_o}\Gamma_{m_g} \\ -\frac{1}{F_o H_o}\Gamma_{m_g} \end{bmatrix} \Phi_u^r \begin{bmatrix} \frac{G_o}{F_o H_o}\Gamma_{m_g} \\ -\frac{1}{F_o H_o}\Gamma_{m_g} \end{bmatrix}^{*} d\omega + \Delta$, (5.38)

with $\Phi_u^r$ the spectrum of $\{u_t\}$ due to $\{r_t\}$ (i.e., if $e_t \equiv 0$) and $\Delta$ a positive semi-definite matrix. In particular, this matrix reaches a maximum when $m_h = m_h^o$, at which point the covariance $M_{\mathrm{CL}}^{-1}$ reaches a minimum, corresponding to the Cramér-Rao bound for closed loop. On the other hand, $\Delta \to 0$ when $m_h \to \infty$. In conclusion, although $\hat\theta_N$ is only asymptotically efficient in closed loop when $m_h = m_h^o$, its asymptotic covariance can be upper bounded if the noise model is over-parametrized. In particular, for an infinite order noise model, we have

$M_{\mathrm{CL}} = \frac{1}{2\pi\sigma_o^2}\int_0^{2\pi} \begin{bmatrix} \frac{G_o}{F_o H_o}\Gamma_{m_g} \\ -\frac{1}{F_o H_o}\Gamma_{m_g} \end{bmatrix} \Phi_u^r \begin{bmatrix} \frac{G_o}{F_o H_o}\Gamma_{m_g} \\ -\frac{1}{F_o H_o}\Gamma_{m_g} \end{bmatrix}^{*} d\omega =: M_{\mathrm{CL}}^{\infty}$. (5.39)

Moreover, comparing (5.23) and (5.39), we see that estimating $\theta$ from a closed loop experiment with a particular reference signal $\{\bar r_t\}$ and an infinite order noise model provides estimates with the same covariance as performing an open loop experiment with the input sequence $\{u_t\}$ defined by $u_t = S_o(q)\,\bar r_t$. Guaranteeing consistency in closed loop without having to decide on a noise model order has been addressed by other system identification methods. A two stage prediction error method to obtain consistent plant estimates in closed loop without having to estimate a noise model has been proposed in [73]. However, this approach does not give guarantees in terms of asymptotic covariance. On the other hand, the approach with MORSM is to estimate a non-parametric noise model. This will not only guarantee consistent plant estimates, but also that they are asymptotically optimal for an infinite order noise model.

5.4.2 Model Order Reduction Steiglitz-McBride

The idea of MORSM is to use the Steiglitz-McBride on an asymptotic maximum likelihood cost function. In closed loop, the asymptotic negative log-likelihood function to estimate $G(q,\theta)$ from $G(q,\hat\eta_N^n)$ is given by [19]

$V(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \left| G(e^{i\omega},\hat\eta_N^n) - G(e^{i\omega},\theta) \right|^2 \frac{\Phi_u(\omega) - |\Phi_{ue}(\omega)|^2/\sigma_o^2}{\Phi_v(\omega)/\sigma_o^2}\, d\omega$, (5.40)

where

$\Phi_u(\omega) = |S_o(e^{i\omega})|^2 \Phi_r(\omega) + |K_o(e^{i\omega})|^2 |S_o(e^{i\omega})|^2 |H_o(e^{i\omega})|^2 \sigma_o^2$,
$\Phi_{ue}(\omega) = -K_o(e^{i\omega}) S_o(e^{i\omega}) H_o(e^{i\omega}) \sigma_o^2$, (5.41)
$\Phi_v(\omega) = |H_o(e^{i\omega})|^2 \sigma_o^2$.

Replacing (5.41) in (5.40), we obtain

$V(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \left| G(e^{i\omega},\hat\eta_N^n) - G(e^{i\omega},\theta) \right|^2 \frac{|S_o(e^{i\omega})|^2 \Phi_r(\omega)}{|H_o(e^{i\omega})|^2}\, d\omega$. (5.42)

Since $S_o(q)$ is unknown, we replace it by its high order estimate

$S(q,\hat\eta_N^n) := \frac{A(q,\hat\eta_N^n)}{A(q,\hat\eta_N^n) + K_o(q) B(q,\hat\eta_N^n)}$. (5.43)

Concerning $H_o(q)$, we replace it by $A^{-1}(q,\hat\eta_N^n)$ as before. Then, in the time domain and with finite sample size, we minimize

$V_N(\theta) = \frac{1}{N}\sum_{t=1}^{N}\left\{ \left[ G(q,\hat\eta_N^n) - G(q,\theta) \right] S(q,\hat\eta_N^n) A(q,\hat\eta_N^n)\, r_t \right\}^2 = \frac{1}{N}\sum_{t=1}^{N}\left[ B(q,\hat\eta_N^n) S(q,\hat\eta_N^n)\, r_t - G(q,\theta) S(q,\hat\eta_N^n) A(q,\hat\eta_N^n)\, r_t \right]^2$. (5.44)

Comparing (5.44) and (5.27), we may define

$y_t^{\mathrm{pf}} := B(q,\hat\eta_N^n) S(q,\hat\eta_N^n)\, r_t, \quad u_t^{\mathrm{pf}} := A(q,\hat\eta_N^n) S(q,\hat\eta_N^n)\, r_t$. (5.45)

Then, similarly to the open loop case, the Steiglitz-McBride method can be applied to the pre-filtered data set (5.45). A formal proof that this procedure provides consistent estimates of $\theta$ with asymptotic covariance given by (5.39) is still unpublished.
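The closed loop pre-filtering step can be sketched as follows, assuming scipy, a controller $K_o = P/Q$ and the polyadd helper as in the sketch of Section 5.4, and a high order ARX estimate (a, b); this is an illustration of (5.43) and (5.45), not a complete implementation of the method.

    import numpy as np
    from scipy.signal import lfilter

    def morsm_cl_prefilter(r, a, b, P, Q):
        # S(q, eta_N^n) = A Q / (A Q + P B), cf. (5.43) with K_o = P/Q.
        A = np.concatenate(([1.0], a))
        B = np.concatenate(([0.0], b))
        den = polyadd(np.convolve(A, Q), np.convolve(P, B))  # A Q + P B
        Sr = lfilter(np.convolve(A, Q), den, r)       # S(q, eta) r_t
        y_pf = lfilter(B, [1.0], Sr)                  # B(q, eta) S(q, eta) r_t
        u_pf = lfilter(A, [1.0], Sr)                  # A(q, eta) S(q, eta) r_t
        return u_pf, y_pf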

5.5 Comparison to WNSF

Both WNSF and MORSM can be motivated by different approximations of a maximum likelihood criterion. Also, both methods use least squares iteratively instead of explicitly minimizing the corresponding cost function. A difference between the methods is that MORSM separates the estimation of the plant, and only considers a high order noise model. However, when we assume that $H_o(q) = 1$, neither of the methods estimates a noise model. We will then use this case to understand the fundamental difference between WNSF and MORSM.

Consider the model

$y_t = G(q,\theta)\, u_t + e_t$, (5.46)

where, for illustration purposes, we assume $G(q,\theta)$ to be of second order:

$G(q,\theta) = \frac{L(q,\theta)}{F(q,\theta)} = \frac{l_1 q^{-1} + l_2 q^{-2}}{1 + f_1 q^{-1} + f_2 q^{-2}}$. (5.47)

The parameter vector to be estimated is given by

$\theta = [f_1 \ f_2 \ l_1 \ l_2]^\top$. (5.48)

We will consider the expression for $\hat\theta_N^{k+1}$ given an estimate $\hat\theta_N^k$, using both WNSF and MORSM. MORSM consists in minimizing the cost function (5.28) using the Steiglitz-McBride. Then, given an estimate $\hat\theta_N^k$, we minimize

$V_N(\theta) = \frac{1}{N}\sum_{t=1}^{N}\left[ \left[ F(q,\theta) B(q,\hat\eta_N^n) - L(q,\theta) \right] \frac{1}{F(q,\hat\theta_N^k)}\, u_t \right]^2 = \frac{1}{N}\sum_{t=1}^{N}\left[ F(q,\theta)\, y_t^{\mathrm{f}} - L(q,\theta)\, u_t^{\mathrm{f}} \right]^2$, (5.49)

where we have $A(q,\hat\eta_N^n) = 1$ because we assume that $H_o(q) = 1$, and where we have defined

$y_t^{\mathrm{f}} := \frac{B(q,\hat\eta_N^n)}{F(q,\hat\theta_N^k)}\, u_t, \quad u_t^{\mathrm{f}} := \frac{1}{F(q,\hat\theta_N^k)}\, u_t$. (5.50)

The minimization problem (5.49) has the least squares solution

$\hat\theta_N^{k+1} = \left[ \frac{1}{N}\sum_{t=1}^{N} \varphi_t^{\mathrm{f}} (\varphi_t^{\mathrm{f}})^\top \right]^{-1} \frac{1}{N}\sum_{t=1}^{N} \varphi_t^{\mathrm{f}}\, y_t^{\mathrm{f}}$, (5.51)

where

$\varphi_t^{\mathrm{f}} = [-y_{t-1}^{\mathrm{f}} \ -y_{t-2}^{\mathrm{f}} \ u_{t-1}^{\mathrm{f}} \ u_{t-2}^{\mathrm{f}}]^\top = \left[ -\frac{B(q,\hat\eta_N^n)}{F(q,\hat\theta_N^k)} u_{t-1} \ \ -\frac{B(q,\hat\eta_N^n)}{F(q,\hat\theta_N^k)} u_{t-2} \ \ \frac{1}{F(q,\hat\theta_N^k)} u_{t-1} \ \ \frac{1}{F(q,\hat\theta_N^k)} u_{t-2} \right]^\top$. (5.52)

For WNSF, the estimate $\hat\theta_N^{k+1}$ is given by, expanding (4.42),

$\hat\theta_N^{k+1} = \left[ Q_n^\top(\hat\eta_N^n) T_n^{-\top}(\hat\theta_N^k) \left( \frac{1}{N}\sum_{t=1}^{N} \varphi_t^n (\varphi_t^n)^\top \right) T_n^{-1}(\hat\theta_N^k) Q_n(\hat\eta_N^n) \right]^{-1} \cdot Q_n^\top(\hat\eta_N^n) T_n^{-\top}(\hat\theta_N^k) \left( \frac{1}{N}\sum_{t=1}^{N} \varphi_t^n (\varphi_t^n)^\top \right) T_n^{-1}(\hat\theta_N^k)\, \hat\eta_N^n$, (5.53)

where, analogously to (4.28), (4.17), and (4.20), we have

$\varphi_t^n = [u_{t-1} \ \cdots \ u_{t-n}]^\top$ (5.54)

and

$Q_n(\eta) = \begin{bmatrix} 0 & 0 & 1 & 0 \\ -b_1 & 0 & 0 & 1 \\ -b_2 & -b_1 & 0 & 0 \\ -b_3 & -b_2 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots \\ -b_{n-1} & -b_{n-2} & 0 & 0 \end{bmatrix}, \quad T_n(\theta) = \begin{bmatrix} 1 & 0 & \cdots & \cdots & 0 \\ f_1 & 1 & 0 & \cdots & 0 \\ f_2 & f_1 & 1 & \ddots & \vdots \\ 0 & f_2 & f_1 & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & f_1 & 1 \end{bmatrix}$, (5.55)

respectively. Then, we can re-write (5.53) as (5.51), where, in the WNSF case,

$(\varphi_t^{\mathrm{f}})^\top = (\varphi_t^n)^\top T_n^{-1}(\hat\theta_N^k)\, Q_n(\hat\eta_N^n)$,
$y_t^{\mathrm{f}} = (\varphi_t^n)^\top T_n^{-1}(\hat\theta_N^k)\, \hat\eta_N^n$. (5.56)

The matrix products in (5.56) can be written in filtering form as

$\varphi_t^{\mathrm{f}} = \left[ -\left\{ \frac{B(q,\hat\eta_N^n)}{F(q,\hat\theta_N^k)} \right\}_n u_{t-1} \ \ -\left\{ \frac{B(q,\hat\eta_N^n)}{F(q,\hat\theta_N^k)} \right\}_n u_{t-2} \ \ \left\{ \frac{1}{F(q,\hat\theta_N^k)} \right\}_n u_{t-1} \ \ \left\{ \frac{1}{F(q,\hat\theta_N^k)} \right\}_n u_{t-2} \right]^\top$,
$y_t^{\mathrm{f}} = \left\{ \frac{B(q,\hat\eta_N^n)}{F(q,\hat\theta_N^k)} \right\}_n u_t$, (5.57)

where, if

$X(q) = \sum_{k=0}^{\infty} x_k q^{-k}$, (5.58)

we have defined

$\{X(q)\}_n := \sum_{k=0}^{n} x_k q^{-k}$. (5.59)

Comparing (5.57) with (5.50) and (5.52), we observe that, in the OE case, the difference between MORSM and WNSF is only in how the filters are truncated when constructing the matrices in the least squares problem. In particular, MORSM uses the infinite coefficients of the filter (in practice, they are limited by the past data available), while WNSF truncates the filter coefficients. Also, with MORSM, the data is first filtered, and then the regressor matrix is constructed; with WNSF, the regressor is first constructed based on the original data and then weighted. For finite $n$, this leads to the differences observed. However, as $n$ tends to infinity with the sample size $N$, the methods are asymptotically the same. Essentially, the difference between the methods in this setting lies in the choice of weighting or pre-filtering. While weighting and pre-filtering serve the same purpose and yield identical theoretical results, numerical differences for finite sample size are observed. In practice, from a user perspective, WNSF gives more intuition into how to tackle problems where we want to estimate a low order noise model, while the MORSM formulation is appropriate when only a low order estimate of the plant is of interest.
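As an illustration of this distinction, the following minimal sketch (assuming numpy and scipy, with hypothetical inputs) contrasts the truncated operator $\{X(q)\}_n$ of (5.59) with full filtering.

    import numpy as np
    from scipy.signal import lfilter

    def truncated_filter(num, den, n, u):
        # {X(q)}_n u_t: apply only the first n+1 impulse response
        # coefficients of X(q) = num/den, as in (5.59) (WNSF-style).
        d = np.zeros(n + 1)
        d[0] = 1.0
        x = lfilter(num, den, d)            # x_0, ..., x_n
        return np.convolve(x, u)[:len(u)]

    # MORSM instead uses the full filter, limited only by the available
    # past data: lfilter(num, den, u).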

5.6 Simulation Examples

In this section, we perform several Monte Carlo simulations to illustrate the performance of MORSM. We first consider open loop and then closed loop operation.

5.6.1 Open Loop

We perform two open loop simulations. First, we illustrate how MORSM only requires one Steiglitz-McBride iteration to provide an asymptotically efficient estimate, while BJSM requires more. Then, we perform a study with random systems, and observe that MORSM often has better finite sample convergence properties than PEM.

Asymptotically Efficient Estimates in One Iteration

The first simulation is for illustration purposes regarding the number of iterations required for MORSM and BJSM. The implementation difference between these methods is in the pre-filtering. In particular, MORSM does not use the noisy output to construct the pre-filtered data set. The consequence is that MORSM only requires one iteration to provide an asymptotically efficient estimate. In this simulation, the data is generated by

$y_t = \frac{q^{-1} + 0.1 q^{-2}}{1 - 1.2 q^{-1} + 0.6 q^{-2}}\, u_t + \frac{1 + 0.7 q^{-1}}{1 - 0.9 q^{-1}}\, e_t$. (5.60)

One hundred Monte Carlo simulations are performed with eight sample sizes logarithmically spaced between $N = 200$ and $N = 20000$. The sequence $\{u_t\}$ is obtained by

$u_t = \frac{1}{1 - q^{-1} + 0.89 q^{-2}}\, u_t^{\mathrm{w}}$, (5.61)

where $\{u_t^{\mathrm{w}}\}$ and $\{e_t\}$ are white and Gaussian with unit variance. Performance is evaluated by the root mean squared error (RMSE) of the estimated impulse response. We compare the following methods:

• the prediction error method, initialized at the true parameters (PEM true);
• the Box-Jenkins Steiglitz-McBride method, with 100 iterations (BJSM100);
• the Box-Jenkins Steiglitz-McBride method, with one iteration (BJSM1);
• the model order reduction Steiglitz-McBride method, with 100 iterations (MORSM100);
• the model order reduction Steiglitz-McBride method, with one iteration (MORSM1).

All methods estimate a plant parametrized with the correct orders, and PEM also estimates a correctly parametrized noise model. For BJSM and the proposed method, an ARX model of order 50 is estimated in the first step. As the objective of this simulation is to observe convergence and asymptotic variance properties, PEM is started at the true parameters, and all methods assume known initial conditions. With MORSM and BJSM, the plant model we estimate has the correct structure and orders, while only a high order noise model is considered; with PEM, both the plant and noise model are correctly parametrized.

Figure 5.1: Average RMSE as a function of sample size for several methods, obtained from 100 Monte Carlo runs with a fixed system.

The results are presented in Fig. 5.1, where the average RMSE from 100 Monte Carlo runs is shown for each sample size. The RMSE is defined by

$\mathrm{RMSE} := \| g^o - \hat g \|$, (5.62)

where $g^o$ is a vector with the impulse response coefficients of $G_o(q)$, and analogously for $\hat g$ with the impulse response of the estimated plant model. In Fig. 5.1, we observe that MORSM and BJSM perform similarly with 100 iterations over the whole sample size range used. MORSM performs slightly worse with one iteration than with 100 for small sample sizes, but one iteration gives an asymptotically efficient estimate, unlike with BJSM. Thus, if allowed to converge, both MORSM and BJSM attain the asymptotic covariance of PEM. However, BJSM theoretically needs the number of Steiglitz-McBride iterations to tend to infinity, while MORSM only needs one iteration.

Comparison with PEM

In the following simulation, we will compare the performance of PEM and the proposed method with randomly generated systems, which are the same as in Chapter 4 (see Fig. 4.2), with structure (4.79). Here, the input $\{u_t\}$ is given as in the previous simulation, and $\{e_t\}$ is Gaussian white noise with variance chosen to obtain a signal-to-noise ratio (SNR)

$\mathrm{SNR} = 10 \log_{10} \frac{\sum_{t=1}^{N} (u_t)^2}{\sum_{t=1}^{N} (H_o(q)\, e_t)^2} = 10$. (5.63)

We perform one hundred Monte Carlo runs, one for each system, with the same input and different noise realizations. A low order plant model with the correct structure is estimated, while only a high order noise model is considered. An important practical aspect in implementing the proposed method is how to choose the ARX model order, in case we do not previously have information about an appropriate order, which is typically the case. As we have seen, theoretically the ARX model order should tend to infinity as function of the sample size. However, for practical purposes it is sufficient to choose an order that can correctly capture the dynamics of the true system. The approach taken is similar to the one taken for WNSF in Section 4.5, consisting of computing a low order estimate based on several high order ARX models and choosing the low order model that minimizes the loss function (5.21). However, MORSM does not compute a low order noise model to replace in (5.21). Therefore, the highest order ARX polynomial $A(q,\hat\eta_N)$ is used in place of $1/H(q,\gamma)$ when computing this loss function. Although this is a very noisy estimate, the error induced will be the same for every computation of the cost function, and should not have a considerable influence on choosing the best model. For the class of systems we are considering, we choose the ARX model order from a grid of values between 25 and 125, spaced with intervals of 25 (a sketch of this selection procedure is given after the list of compared methods below). Moreover, when more than one iteration is used, the same criterion can be applied to optimize over the number of iterations—that is, we choose the model obtained at the iteration that minimizes the cost function (5.21). We compare the following methods:

• the prediction error method, initialized at the true parameters (PEM true);

• the prediction error method, initialized with the standard MATLAB procedure (PEM);

• the Box-Jenkins Steiglitz-McBride method, with 20 iterations (BJSM20);

• the model order reduction Steiglitz-McBride method, with 20 iterations (MORSM20);

• the model order reduction Steiglitz-McBride, with one iteration (MORSM1).

PEM stops with a maximum of 1000 iterations and a function tolerance of $10^{-5}$, and estimates initial conditions. MORSM and BJSM truncate initial conditions. With MORSM and BJSM, the plant model we estimate has the correct structure and orders, while only a high order noise model is considered; with PEM, both the plant and noise model are correctly parametrized.
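The order-selection heuristic described above can be sketched as follows, reusing the arx_ls and morsm_open_loop functions from the earlier sketches; the grid matches the one in the text, while the code structure and function name are our own illustration.

    import numpy as np
    from scipy.signal import lfilter

    def select_arx_order(y, u, grid=(25, 50, 75, 100, 125), mf=2, ml=2):
        # Highest order ARX polynomial stands in for 1/H(q, gamma)
        # when evaluating the loss (5.21).
        a_max, _ = arx_ls(y, u, max(grid))
        A_max = np.concatenate(([1.0], a_max))
        best, best_loss = None, np.inf
        for n in grid:
            F, L = morsm_open_loop(y, u, n, mf, ml)
            eps = lfilter(A_max, [1.0], y - lfilter(L, F, u))
            loss = np.mean(eps**2)          # quadratic loss as in (5.21)
            if loss < best_loss:
                best, best_loss = (F, L), loss
        return best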

The performance of each method is evaluated by calculating the average FIT of the impulse response of the plant, where

$\mathrm{FIT} = 100 \left( 1 - \frac{\mathrm{RMSE}}{\| g^o - \mathrm{mean}(g^o) \|} \right)$, (5.64)

in percent.

The results are presented in Fig. 5.2, with the average FIT as function of sample size. As expected, PEM initialized at the true parameters performs best, as it uses knowledge of the true system. MORSM with 20 iterations, although performing worse than PEM initialized at the true parameters for small sample sizes, is very close for sample sizes around 4000 and larger. With only one iteration, MORSM performs worse (compared with the 20 iterations variant) for the range of sample sizes used, although asymptotically the same performance is guaranteed. In this simulation, MORSM1 still performs better than PEM when initialized with the standard MATLAB procedure, but we observe that it improves with a few iterations. Moreover, the fact that in practice (i.e., for finite sample size) more than one iteration is required to observe that MORSM is efficient does not render the proposed method irrelevant in comparison to BJSM. As we observe in Fig. 5.2, BJSM with 20 iterations does not attain the same asymptotic performance as MORSM because 20 iterations are not sufficient for BJSM to converge in this simulation, while they are sufficient for MORSM. Thus, we experimentally observe that, even when (for finite sample size) the proposed method benefits from performing more than one iteration, it still has better convergence properties than BJSM.
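The two performance measures used in this chapter's simulations can be computed as in the following sketch, assuming scipy; the impulse response length is a hypothetical choice, to be taken long enough for the tails to vanish.

    import numpy as np
    from scipy.signal import lfilter

    def impulse_response(num, den, length):
        d = np.zeros(length)
        d[0] = 1.0
        return lfilter(num, den, d)

    def rmse_and_fit(F_o, L_o, F_hat, L_hat, length=200):
        # RMSE (5.62) and FIT (5.64) of the estimated impulse response.
        g_o = impulse_response(L_o, F_o, length)
        g_hat = impulse_response(L_hat, F_hat, length)
        rmse = np.linalg.norm(g_o - g_hat)
        fit = 100.0 * (1.0 - rmse / np.linalg.norm(g_o - np.mean(g_o)))
        return rmse, fit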

5.6.2 Closed Loop

We now turn to the application of MORSM to systems in closed loop operation. Our main purpose here is to illustrate the asymptotic properties of the method. As discussed in Section 5.4, MORSM is not asymptotically efficient in closed loop. The reason is that, in closed loop, a correctly parametrized noise model must be estimated to achieve efficiency, while MORSM estimates a high order noise model. As the order of the noise model increases, the asymptotic covariance of the plant estimates obtained with PEM tends to a finite limit, given by the inverse of (5.39). Using a similar procedure to the one used for the open loop case, it is possible to show that the asymptotic covariance of the plant estimates obtained with MORSM corresponds to that same limit. However, this work is still unpublished. Here, we present a simulation to observe that, in closed loop, the plant estimates obtained with MORSM have the same asymptotic covariance as PEM with an infinite order noise model. However, due to the numerical difficulty of estimating a too high order model with PEM, we take a different approach. We observed that the asymptotic covariance obtained with PEM in these conditions, being given by the inverse of (5.39), corresponds to the asymptotic covariance obtained with an open loop experiment where the input has spectrum $\Phi_u^r$. We expect this approach and MORSM in closed loop to have the same asymptotic properties.

Figure 5.2: Average FIT for several methods, obtained from 100 Monte Carlo runs with random systems.

To illustrate the asymptotic properties, we compare the following methods:

• the prediction error method, initialized at the true parameters, operating in closed loop, with correctly parametrized noise model (PEM);
• the prediction error method, initialized at the true parameters, operating in open loop with input spectrum $\Phi_u^r$, with correctly parametrized noise model (PEM OL);
• the model order reduction Steiglitz-McBride, with one iteration (MORSM1).

The closed loop data is obtained by (5.36), with

$G_o(q) = \frac{q^{-1} + 0.1 q^{-2}}{1 - 1.2 q^{-1} + 0.6 q^{-2}}, \quad H_o(q) = \frac{1 + 0.7 q^{-1}}{1 - 0.9 q^{-1}}$, (5.65)

and $K_o(q) = 1$. For PEM with open loop data, to guarantee that the input has spectrum $\Phi_u^r$ from the closed loop experiment, we have that

$u_t = S_o(q)\, r_t, \quad y_t = G_o(q)\, u_t + H_o(q)\, e_t$. (5.66)

Fig. 5.3 shows the average RMSE obtained from 1000 Monte Carlo runs. We observe that PEM in closed loop performs best.

Figure 5.3: Average RMSE as a function of sample size for several methods, obtained from 1000 Monte Carlo runs with a fixed system operating in closed loop.

As it is a second order system initialized at the true parameters, we assume that PEM converged to the global minimum, and that this corresponds to the Cramér-Rao bound. Then, as expected, MORSM is not asymptotically efficient. However, it performs asymptotically as PEM in open loop with input chosen to have spectrum $\Phi_u^r$ from the closed loop data. This corresponds to the asymptotic covariance obtained with PEM in closed loop with an infinite order noise model. As expected, MORSM attains this covariance asymptotically.

5.7 Conclusions

In this chapter, we proposed a least squares method for estimation of models with a plant parametrized by a rational transfer function and a non-parametric noise model. The essential idea of the method is to perform model order reduction based on an asymptotic ML criterion using the Steiglitz-McBride method. We thus name it Model Order Reduction Steiglitz-McBride (MORSM). The method uses ideas from the ASYM and BJSM methods. This provides asymptotically efficient plant estimates in open loop. In closed loop, the estimates are optimal for a high order noise model.

We perform simulation studies to analyze the performance of the method, from which the following main conclusions are drawn. First, the proposed method provides asymptotically efficient estimates with one Steiglitz-McBride iteration, while BJSM does not. Second, even when extra iterations are required for convergence with finite sample sizes, the proposed method still converges with fewer iterations than BJSM. Third, the proposed method often has better finite sample size convergence properties than PEM when initialized with the standard MATLAB approach.

We also provided a theoretical comparison with the WNSF method in case an OE model is estimated. In this case, the methods are almost identical. The essential difference is between filtering and weighting. From a user perspective, WNSF is appropriate when low order estimates of the noise model are of interest, while MORSM is a more user friendly approach when we want to avoid choosing a low order noise model.

Chapter 6 Transient Estimation with Unstructured Models

Dealing with unknown initial conditions is an important problem in system identification. In an experiment, a batch of input and output data $\{u_t, y_t\}$ is typically collected for $t = 1, 2, \ldots, N$. If the system was started from rest at $t = 1$, the user usually knows the initial conditions. Otherwise, the input is unknown for past values of $t$.

When using the prediction error method (PEM), there are typically three ways to deal with unknown initial conditions. One is to disregard the transient part of the data for the estimation. Another is to treat the initial conditions as parameters to be estimated. Finally, the principle of backforecasting [26, 74] can also be applied.

In this chapter, we propose a technique to deal with unknown initial conditions in methods that use a high order model. In particular, when the high order model is estimated, we also estimate some transient related parameters simultaneously. Then, we show how to incorporate this procedure in the weighted null-space fitting (WNSF) method. We observe that it can be applied to output-error (OE) models and model structures where the plant and noise model share the same poles (e.g., ARMAX), but limitations exist for application to BJ models.

The chapter is structured as follows. In Section 6.1, we state the problem. In Section 6.2, we propose a procedure to estimate the transient when the noise is white. In Section 6.3, we show how to incorporate this procedure into WNSF when the model of interest is OE. In Section 6.4, we show how to extend the transient estimate when the noise is not white, and discuss the consequences of incorporating the procedure into the WNSF method. A simulation example is performed in Section 6.5 to illustrate the potential benefits for small sample size. Conclusions are drawn in Section 6.6. The contributions in this chapter are mostly based on [75].


6.1 Problem Statement

The setting is the same as in Chapter 4, where we consider an OE model (4.1)

$y_t = \frac{L(q,\theta)}{F(q,\theta)}\, u_t + e_t$, (6.1)

with

$L(q,\theta) = l_1 q^{-1} + \cdots + l_{m_l} q^{-m_l}$,
$F(q,\theta) = 1 + f_1 q^{-1} + \cdots + f_{m_f} q^{-m_f}$, (6.2)

and where the sequence $\{e_t\}$ is Gaussian white noise with variance $\sigma_o^2$. The parameter vector to be estimated is

$\theta = [f_1 \ \cdots \ f_{m_f} \ l_1 \ \cdots \ l_{m_l}]^\top \in \mathbb{R}^{m_f + m_l}$, (6.3)

and we assume that the output sequence was generated by (6.1) at some $\theta = \theta_o$, and that we have collected data $\{u_t, y_t\}$ for $t = 1, \ldots, N$. Also analogously to Chapter 4, we approximate this OE model by a high order FIR model (4.25),

$y_t = \sum_{k=1}^{n} g_k q^{-k}\, u_t + e_t$, (6.4)

written in regressor form (4.27) as

$y_t = (\varphi_t^n)^\top g^n + e_t$, (6.5)

where

$\varphi_t^n = [u_{t-1} \ \cdots \ u_{t-n}]^\top, \quad g^n := [g_1 \ \cdots \ g_n]^\top$. (6.6)

Defining

$y := [y_1 \ \cdots \ y_N]^\top, \quad e := [e_1 \ \cdots \ e_N]^\top$, (6.7)

and

$\Phi_u^n := \begin{bmatrix} u_0 & u_{-1} & \cdots & u_{-n+1} \\ u_1 & u_0 & \cdots & u_{-n+2} \\ \vdots & \vdots & & \vdots \\ u_{N-1} & u_{N-2} & \cdots & u_{N-n} \end{bmatrix}$, (6.8)

we can re-write (6.5) in vector form as

$y = \Phi_u^n g^n + e$. (6.9)

Equivalently to (4.29), but using the notation in (6.9), an estimate $\hat g_N^n$ of $g^n$ can then be obtained by least squares, according to

$\hat g_N^n = \left[ (\Phi_u^n)^\top \Phi_u^n \right]^{-1} (\Phi_u^n)^\top y$. (6.10)

A limitation of this approach is that the first $n$ rows in (6.8) usually have to be truncated. The reason is that a batch of data $\{u_t, y_t\}$ is typically obtained from an experiment for certain time instants $t = 1, 2, \ldots, N$. Therefore, $u_t$ is not available for $t \le 0$, which makes the first $n$ rows of $\Phi_u^n$ unknown, requiring these equations to be removed. This also implies that the first $n$ rows of $y$ are truncated when estimating $\hat g_N^n$ in (6.10), and not all the available data are used in the estimation. This situation is particularly limiting for poorly damped systems, when the order $n$ has to be chosen large. The problem we consider in this chapter is how these missing data $\{y_1, \ldots, y_n\}$ can be considered in the estimation, and how the proposed approach can be applied to the WNSF method, with performance benefits for finite sample size.

6.2 Estimating the Transient

A straightforward approach to estimate these missing data is as follows. Let

$\tau^n := \begin{bmatrix} u_0 & u_{-1} & \cdots & u_{-n+1} \\ 0 & u_0 & \cdots & u_{-n+2} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_0 \end{bmatrix} g^n$. (6.11)

Then, if $\tau^n$ is treated as a standalone parameter vector, it can be estimated simultaneously with $g^n$ from

$y = \Phi_{\bar u}^n g^n + \begin{bmatrix} I_n \\ 0_{(N-n) \times n} \end{bmatrix} \tau^n + e = \Phi_{\bar u}^n g^n + \Phi_\delta^n \tau^n + e = \bar\Phi^n \bar\eta^n + e$, (6.12)

where $\Phi_{\bar u}^n$ is constructed as $\Phi_u^n$ in (6.8), but using

$\bar u_t := \begin{cases} 0, & t \le 0 \\ u_t, & t > 0 \end{cases}$, (6.13)

and with

$\bar\eta^n = \begin{bmatrix} g^n \\ \tau^n \end{bmatrix}, \quad \Phi_\delta^n = \begin{bmatrix} I_n \\ 0_{(N-n) \times n} \end{bmatrix}, \quad \bar\Phi^n = [\Phi_{\bar u}^n \ \ \Phi_\delta^n]$. (6.14)

Notice that (6.12) corresponds, if written in a similar form to (6.4), to the multiple-input single-output (MISO) FIR model

$y_t = \sum_{k=1}^{n} g_k q^{-k}\, \bar u_t + \sum_{k=1}^{n} \tau_k q^{-k}\, \delta_t + e_t$. (6.15)


Then, least squares can be applied to (6.12) as before, now using all $N$ measurements, according to

$\hat{\bar\eta}_N^n = \left[ (\bar\Phi^n)^\top \bar\Phi^n \right]^{-1} (\bar\Phi^n)^\top y$. (6.16)

Despite including $n$ extra equations, (6.12) also includes $n$ extra unknowns. This makes the transient parameters unidentifiable in general, and the accuracy of the estimate $\hat g_N^n$ does not increase. Nevertheless, trying to estimate these transient parameters can still help in estimating other identifiable parameters—in particular, the complete estimate $\hat{\bar\eta}_N^n$ can be useful in a later step for methods that use a high order estimate to obtain an estimate of a lower order model. In the next section, we observe how using the complete estimate $\hat{\bar\eta}_N^n$ can be beneficial to the WNSF method.
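A minimal numerical sketch of (6.16) for the FIR case, assuming numpy and with a function name of our own choosing, is the following.

    import numpy as np

    def fir_ls_with_transient(y, u, n):
        # Solve (6.16): y = Phi_ubar g^n + Phi_delta tau^n + e,
        # using all N samples of y.
        N = len(y)
        u_pad = np.concatenate((np.zeros(n), u))   # u_bar_t = 0 for t <= 0, (6.13)
        Phi_ubar = np.array([u_pad[t:t + n][::-1] for t in range(N)])
        Phi_delta = np.vstack((np.eye(n), np.zeros((N - n, n))))   # (6.14)
        Phi_bar = np.hstack((Phi_ubar, Phi_delta))
        # lstsq returns a minimum-norm solution when, as discussed above,
        # the transient parameters are not identifiable.
        eta_bar, *_ = np.linalg.lstsq(Phi_bar, y, rcond=None)
        return eta_bar[:n], eta_bar[n:]            # (g^n, tau^n)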

6.3 WNSF with Transient Estimation

Consider that the OE model in (6.1), at $\theta = \theta_o$, has a state-space realization

$x_{t+1} = A x_t + B u_t$,
$y_t = C x_t + e_t$, (6.17)

initialized with $x_0 = x^o$ and $u_0 = u^o$. This system is equivalent to [58]

$\bar x_{t+1} = A \bar x_t + [B \ \ \bar B] \begin{bmatrix} \bar u_t \\ \delta_t \end{bmatrix}$,
$y_t = C \bar x_t + e_t$, (6.18)

initialized with $\bar x_0 = 0 = \bar u_0$, where $\delta_t$ is the Kronecker delta, and

$\bar B = A x^o + B u^o$. (6.19)

The state-space realization (6.18) can be seen as a MISO system, where one of the inputs is $\delta_t$. In the same way that the OE model (6.1) has a state-space realization (6.17), the MISO state-space realization (6.18) has a MISO OE equivalent. Using (2.29), we have that the input $\bar u_t$ is related to the output $y_t$ by the transfer function

$C(qI - A)^{-1} B \equiv \frac{L(q,\bar\theta_o)}{F(q,\bar\theta_o)}$, (6.20)

and the contribution from $\delta_t$ is filtered by the transfer function

$C(qI - A)^{-1} \bar B =: \frac{\bar L(q,\bar\theta_o)}{F(q,\bar\theta_o)}$, (6.21)

where $\bar\theta$ is an extension of $\theta$ to include the parameters of the numerator

$\bar L(q,\bar\theta) = \bar l_1 q^{-1} + \cdots + \bar l_{m_l} q^{-m_l}$, (6.22)

according to

$\bar\theta = [\theta^\top \ \bar l_1 \ \cdots \ \bar l_{m_l}]^\top \in \mathbb{R}^{m_f + 2 m_l}$. (6.23)

As usual, subscript $o$ denotes evaluation at the true parameters. Note that $\bar L(q,\bar\theta_o)$ depends on $\bar B$, which, in turn, depends on the input realization (due to the initial conditions), according to (6.19). Therefore, in general, there will be no pole-zero cancellations with $F(q,\bar\theta_o)$. Then, we can write that the data are generated by

$y_t = \frac{L(q,\bar\theta)}{F(q,\bar\theta)}\, \bar u_t + \frac{\bar L(q,\bar\theta)}{F(q,\bar\theta)}\, \delta_t + e_t$ (6.24)

at $\bar\theta = \bar\theta_o$, and we would like to estimate the model (6.24). Comparing (6.24) and (6.15), we observe that the coefficients $\{\tau_k\}$ estimated with the proposed approach are the impulse response coefficients of the transfer function $\bar L(q,\bar\theta)/F(q,\bar\theta)$, while, as usual, $\{g_k\}$ are the impulse response coefficients of $L(q,\bar\theta)/F(q,\bar\theta)$. The key observation here is that these two transfer functions share the same denominator $F(q,\bar\theta)$, determined by $(qI - A)^{-1}$ both in (6.20) and (6.21). Therefore, an estimate of $\{\tau_k\}$ contains information about the poles of the system, and contributes to the estimation of the original parameter vector $\theta$. To see how this information can be incorporated in the WNSF method, let

$\sum_{k=1}^{\infty} g_k^o q^{-k} := \frac{L(q,\bar\theta_o)}{F(q,\bar\theta_o)}$, (6.25)
$\sum_{k=1}^{\infty} \tau_k^o q^{-k} := \frac{\bar L(q,\bar\theta_o)}{F(q,\bar\theta_o)}$. (6.26)

While Steps 2 and 3 of the standard WNSF algorithm are based on the relation between $\{g_k^o\}$ and the coefficients of $L(q,\bar\theta_o)$ and $F(q,\bar\theta_o)$, established by (6.25), we can now also make use of the relation between $\{\tau_k^o\}$ and the coefficients of $\bar L(q,\bar\theta_o)$ and $F(q,\bar\theta_o)$, established by (6.26). Then, re-writing (6.25) and (6.26) as

$F(q,\bar\theta_o) \sum_{k=1}^{\infty} g_k^o q^{-k} - L(q,\bar\theta_o) = 0$,
$F(q,\bar\theta_o) \sum_{k=1}^{\infty} \tau_k^o q^{-k} - \bar L(q,\bar\theta_o) = 0$, (6.27)

we can relate the first $n$ coefficients of the polynomials in (6.27) by

$\bar\eta_o^n - \bar Q_n(\bar\eta_o^n)\, \bar\theta_o = 0$, (6.28)

where

$\bar\eta_o^n := [g_1^o \ \cdots \ g_n^o \ \tau_1^o \ \cdots \ \tau_n^o]^\top$ (6.29)

and

$\bar Q_n(\bar\eta^n) = \begin{bmatrix} -Q_n^f(g^n) & Q_n^l & 0 \\ -Q_n^f(\tau^n) & 0 & \bar Q_n^l \end{bmatrix}$, (6.30)

with

$Q_n^f(g^n) = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ g_1 & 0 & \cdots & 0 \\ g_2 & g_1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & 0 \\ g_{m_f} & g_{m_f - 1} & \cdots & g_1 \\ \vdots & \vdots & & \vdots \\ g_{n-1} & g_{n-2} & \cdots & g_{n - m_f} \end{bmatrix}, \quad Q_n^l = \begin{bmatrix} I_{m_l \times m_l} \\ 0_{(n - m_l) \times m_l} \end{bmatrix} = \bar Q_n^l, \quad Q_n^f(\tau^n) = \begin{bmatrix} 0 & 0 & \cdots & 0 \\ \tau_1 & 0 & \cdots & 0 \\ \tau_2 & \tau_1 & \ddots & \vdots \\ \vdots & \vdots & \ddots & 0 \\ \tau_{m_f} & \tau_{m_f - 1} & \cdots & \tau_1 \\ \vdots & \vdots & & \vdots \\ \tau_{n-1} & \tau_{n-2} & \cdots & \tau_{n - m_f} \end{bmatrix}$. (6.31)

From (6.28), which is in the same form as (4.16) for the standard WNSF, the remaining steps of the method may be applied. With inclusion of transient estimation, the WNSF method becomes as follows. In the first step, compute an estimate $\hat{\bar\eta}_N^n$ of $\bar\eta^n$ from (6.16). In the second step, we replace $\bar\eta_o^n$ by $\hat{\bar\eta}_N^n$ in (6.28) and, similarly to (4.35), use a least squares approach, computing

$\hat{\bar\theta}_N^{\mathrm{LS}} = \left[ \bar Q_n^\top(\hat{\bar\eta}_N^n)\, \bar Q_n(\hat{\bar\eta}_N^n) \right]^{-1} \bar Q_n^\top(\hat{\bar\eta}_N^n)\, \hat{\bar\eta}_N^n$. (6.32)

Finally, in the third step, similarly to (4.40), we use

$\hat{\bar\theta}_N^{\mathrm{WLS}} = \left[ \bar Q_n^\top(\hat{\bar\eta}_N^n)\, \bar W_n(\hat{\bar\theta}_N^{\mathrm{LS}})\, \bar Q_n(\hat{\bar\eta}_N^n) \right]^{-1} \bar Q_n^\top(\hat{\bar\eta}_N^n)\, \bar W_n(\hat{\bar\theta}_N^{\mathrm{LS}})\, \hat{\bar\eta}_N^n$, (6.33)

where

$\bar W_n(\hat{\bar\theta}_N^{\mathrm{LS}}) = \bar T_n^{-\top}(\hat{\bar\theta}_N^{\mathrm{LS}})\, (\bar\Phi^n)^\top \bar\Phi^n\, \bar T_n^{-1}(\hat{\bar\theta}_N^{\mathrm{LS}})$. (6.34)

The matrix $\bar T_n(\bar\theta)$ is, as in (4.19), determined by writing the residuals of (6.28) when $\bar\eta_o^n$ is replaced by $\hat{\bar\eta}_N^n$ in the form

$\hat{\bar\eta}_N^n - \bar Q_n(\hat{\bar\eta}_N^n)\, \bar\theta_o = \bar T_n(\bar\theta_o)\, (\hat{\bar\eta}_N^n - \bar\eta_o^n)$. (6.35)

Performing the required computations, we obtain that

$\bar T_n(\bar\theta) = \begin{bmatrix} T_n^f(\bar\theta) & 0 \\ 0 & T_n^f(\bar\theta) \end{bmatrix}$, (6.36)

with $T_n^f$ given, as in the standard WNSF case, by (4.64).

6.4 Generalizations

The idea presented in this chapter for estimation of the transient in high order models can be straightforwardly extended to the case when the noise contribution is not white. In this case, instead of an FIR model, an ARX model is estimated, and we consider the transient contribution in a similar way. Consider an ARX model (2.15)

$\left( 1 + \sum_{k=1}^{n} a_k q^{-k} \right) y_t = \sum_{k=1}^{n} b_k q^{-k}\, u_t + e_t$, (6.37)

which we can write, analogously to (6.9), as

$y = \Phi^n \eta^n + e$, (6.38)

where

$\eta^n = [a_1 \ \cdots \ a_n \ b_1 \ \cdots \ b_n]^\top, \quad \Phi^n = [-\Phi_y^n \ \ \Phi_u^n], \quad \Phi_y^n = \begin{bmatrix} y_0 & y_{-1} & \cdots & y_{-n+1} \\ y_1 & y_0 & \cdots & y_{-n+2} \\ \vdots & \vdots & & \vdots \\ y_{N-1} & y_{N-2} & \cdots & y_{N-n} \end{bmatrix}$. (6.39)

If $\{u_t, y_t\}$ are unknown for $t \le 0$, we can proceed analogously to the FIR case, defining

$\tau^n := \begin{bmatrix} u_0 & u_{-1} & \cdots & u_{-n+1} \\ 0 & u_0 & \cdots & u_{-n+2} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & u_0 \end{bmatrix} b^n - \begin{bmatrix} y_0 & y_{-1} & \cdots & y_{-n+1} \\ 0 & y_0 & \cdots & y_{-n+2} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & y_0 \end{bmatrix} a^n$, (6.40)

where

$a^n := [a_1 \ \cdots \ a_n]^\top, \quad b^n := [b_1 \ \cdots \ b_n]^\top$. (6.41)

Then, we re-write (6.38) as

$y = \bar\Phi^n \bar\eta^n + e$, (6.42)

where we now have

$\bar\Phi^n = [-\Phi_{\bar y}^n \ \ \Phi_{\bar u}^n \ \ \Phi_\delta^n], \quad \bar\eta^n = [(\eta^n)^\top \ (\tau^n)^\top]^\top$, (6.43)

and $\Phi_{\bar u}^n$ and $\Phi_{\bar y}^n$ are similar to $\Phi_u^n$ and $\Phi_y^n$, but constructed with

$\bar u_t := \begin{cases} 0, & t \le 0 \\ u_t, & t > 0 \end{cases}, \quad \bar y_t := \begin{cases} 0, & t \le 0 \\ y_t, & t > 0 \end{cases}$. (6.44)

Notice that (6.42) corresponds to the MISO ARX model

$\left( 1 + \sum_{k=1}^{n} a_k q^{-k} \right) \bar y_t = \sum_{k=1}^{n} b_k q^{-k}\, \bar u_t + \sum_{k=1}^{n} \tau_k q^{-k}\, \delta_t + e_t$. (6.45)

Then, as before, we can obtain an estimate $\hat{\bar\eta}_N^n$ of $\bar\eta^n$ according to (6.16). However, we notice that there is a significant difference to the FIR case. With $\tau^n$ defined by (6.40) instead of (6.11), $\tau^n$ now contains contributions from initial conditions both in $\{u_t\}$ and $\{y_t\}$ (or, alternatively, in $\{e_t\}$). Moreover, these separate contributions are not identifiable, which will have consequences for application of the procedure with the WNSF method, depending on the model of interest. Consider that the data was generated by the ARMAX model

$F(q,\theta)\, y_t = L(q,\theta)\, u_t + C(q,\theta)\, e_t$, (6.46)

at some $\theta = \theta_o$, with $F(q,\theta)$ and $L(q,\theta)$ given by (6.2), and

$C(q,\theta) = 1 + c_1 q^{-1} + \cdots + c_{m_c} q^{-m_c}$. (6.47)

We now have

$\theta = [f_1 \ \cdots \ f_{m_f} \ l_1 \ \cdots \ l_{m_l} \ c_1 \ \cdots \ c_{m_c}]^\top \in \mathbb{R}^{m_f + m_l + m_c}$. (6.48)

Alternatively to (6.46), we may say that the data is generated by a state-space realization

$x_{t+1} = A x_t + B u_t + K e_t$,
$y_t = C x_t + e_t$. (6.49)

Then, from the state-space matrices in (6.49), $F(q,\theta_o)$, $L(q,\theta_o)$, and $C(q,\theta_o)$ are obtained by

$C(qI - A)^{-1} B \equiv \frac{L(q,\theta_o)}{F(q,\theta_o)}, \quad C(qI - A)^{-1} K \equiv \frac{C(q,\theta_o)}{F(q,\theta_o)}$. (6.50)

Assuming (6.49) was initialized with $x_0 = x^o$, $u_0 = u^o$, and $e_0 = e^o$, it can alternatively be written as

$\bar x_{t+1} = A \bar x_t + [B \ \ \bar B] \begin{bmatrix} \bar u_t \\ \delta_t \end{bmatrix} + K \bar e_t$,
$\bar y_t = C \bar x_t + \bar e_t$, (6.51)

similarly to (6.18), initialized with $\bar x_0 = \bar u_0 = \bar e_0 = 0$, where $\delta_t$ is the Kronecker delta, and

$\bar B = A x^o + B u^o + K e^o$. (6.52)

A MISO ARMAX model equivalent to (6.51) is then given by

$F(q,\bar\theta)\, \bar y_t = L(q,\bar\theta)\, \bar u_t + \bar L(q,\bar\theta)\, \delta_t + C(q,\bar\theta)\, \bar e_t$, (6.53)

where

$\bar e_t := \begin{cases} 0, & t \le 0 \\ e_t, & t > 0 \end{cases}$ (6.54)

and

$\bar\theta = [\theta^\top \ \bar l_1 \ \cdots \ \bar l_{m_l}]^\top \in \mathbb{R}^{m_f + 2 m_l + m_c}$. (6.55)

Here, $\{\bar l_1, \ldots, \bar l_{m_l}\}$ are given analogously to (6.22), and

$C(qI - A)^{-1} \bar B \equiv \frac{\bar L(q,\bar\theta_o)}{F(q,\bar\theta_o)}$. (6.56)

To observe how the proposed approach can be applied to WNSF, we start by defining

$1 + \sum_{k=1}^{\infty} a_k^o q^{-k} := \frac{F(q,\bar\theta_o)}{C(q,\bar\theta_o)}, \quad \sum_{k=1}^{\infty} b_k^o q^{-k} := \frac{L(q,\bar\theta_o)}{C(q,\bar\theta_o)}, \quad \sum_{k=1}^{\infty} \tau_k^o q^{-k} := \frac{\bar L(q,\bar\theta_o)}{C(q,\bar\theta_o)}$, (6.57)

and re-writing (6.53) at the true parameter values as

$\frac{F(q,\bar\theta_o)}{C(q,\bar\theta_o)}\, \bar y_t = \frac{L(q,\bar\theta_o)}{C(q,\bar\theta_o)}\, \bar u_t + \frac{\bar L(q,\bar\theta_o)}{C(q,\bar\theta_o)}\, \delta_t + \bar e_t$. (6.58)

Replacing (6.57) in (6.58), we obtain

$\left( 1 + \sum_{k=1}^{\infty} a_k^o q^{-k} \right) \bar y_t = \sum_{k=1}^{\infty} b_k^o q^{-k}\, \bar u_t + \sum_{k=1}^{\infty} \tau_k^o q^{-k}\, \delta_t + \bar e_t$. (6.59)

Comparing (6.45) and (6.59), we observe that truncated versions of $\{a_k^o, b_k^o, \tau_k^o\}$ can be estimated through the ARX model (6.45). Then, if we re-write (6.57) as

$C(q,\bar\theta_o) \left( 1 + \sum_{k=1}^{\infty} a_k^o q^{-k} \right) - F(q,\bar\theta_o) = 0$,
$F(q,\bar\theta_o) \sum_{k=1}^{\infty} b_k^o q^{-k} - L(q,\bar\theta_o) \left( 1 + \sum_{k=1}^{\infty} a_k^o q^{-k} \right) = 0$, (6.60)
$F(q,\bar\theta_o) \sum_{k=1}^{\infty} \tau_k^o q^{-k} - \bar L(q,\bar\theta_o) \left( 1 + \sum_{k=1}^{\infty} a_k^o q^{-k} \right) = 0$,

we observe that we can relate $\{a_k^o, b_k^o, \tau_k^o\}$ to the coefficients of $F(q,\bar\theta_o)$, $L(q,\bar\theta_o)$, $C(q,\bar\theta_o)$, and $\bar L(q,\bar\theta_o)$, in a form similar to (6.28)—that is, linear in $\bar\theta_o$. Thus, we can make use of the estimate $\hat{\bar\eta}_N^n$ to estimate $\bar\theta_o$ by least squares, and then weighted least squares, according to the WNSF algorithm. So, if the model of interest is ARMAX, the transient estimation approach presented in this chapter can be applied to WNSF.

However, if we consider a BJ model

$y_t = \frac{L(q,\theta)}{F(q,\theta)}\, u_t + \frac{C(q,\theta)}{D(q,\theta)}\, e_t$, (6.61)

with

$D(q,\theta) = 1 + d_1 q^{-1} + \cdots + d_{m_d} q^{-m_d}$ (6.62)

and

$\theta = [f_1 \ \cdots \ f_{m_f} \ l_1 \ \cdots \ l_{m_l} \ c_1 \ \cdots \ c_{m_c} \ d_1 \ \cdots \ d_{m_d}]^\top \in \mathbb{R}^{m_f + m_l + m_c + m_d}$, (6.63)

the same procedure is not applicable. To observe why, we first re-write (6.61) in an ARMAX-like form, as

$F(q,\theta) D(q,\theta)\, y_t = D(q,\theta) L(q,\theta)\, u_t + F(q,\theta) C(q,\theta)\, e_t$, (6.64)

and, analogously to (6.53), we re-write (6.64) as

$F(q,\bar\theta) D(q,\bar\theta)\, \bar y_t = D(q,\bar\theta) L(q,\bar\theta)\, \bar u_t + \bar L(q,\bar\theta)\, \delta_t + F(q,\bar\theta) C(q,\bar\theta)\, \bar e_t$. (6.65)

Alternatively (and at the true parameters), we can also write

$\frac{D(q,\bar\theta_o)}{C(q,\bar\theta_o)}\, \bar y_t = \frac{D(q,\bar\theta_o) L(q,\bar\theta_o)}{C(q,\bar\theta_o) F(q,\bar\theta_o)}\, \bar u_t + \frac{\bar L(q,\bar\theta_o)}{F(q,\bar\theta_o) C(q,\bar\theta_o)}\, \delta_t + \bar e_t$, (6.66)

where, as before, $\bar\theta$ is obtained by extending $\theta$ with the parameters of $\bar L(q,\bar\theta)$. Also as before, this polynomial depends on the input and noise initial conditions, through (6.52). Therefore, in general, there will be no pole-zero cancellations with the denominator $F(q,\bar\theta) C(q,\bar\theta)$. Then, comparing (6.66) and (6.59), we can write the relations

$C(q,\bar\theta_o) \left( 1 + \sum_{k=1}^{\infty} a_k^o q^{-k} \right) - D(q,\bar\theta_o) = 0$,
$F(q,\bar\theta_o) \sum_{k=1}^{\infty} b_k^o q^{-k} - L(q,\bar\theta_o) \left( 1 + \sum_{k=1}^{\infty} a_k^o q^{-k} \right) = 0$, (6.67)
$F(q,\bar\theta_o) D(q,\bar\theta_o) \sum_{k=1}^{\infty} \tau_k^o q^{-k} - \bar L(q,\bar\theta_o) \left( 1 + \sum_{k=1}^{\infty} a_k^o q^{-k} \right) = 0$.

We observe that the last equation in (6.67), which appears because we estimate the transient, is non-linear in $\theta_o$, since the product $F(q,\bar\theta_o) D(q,\bar\theta_o)$ involves products of the parameters. Therefore, the WNSF method cannot be applied with transient estimation to BJ models if the initial conditions are unknown both in the input and in the noise.

6.5 Simulation Example

In this section, we perform a simulation example to illustrate how WNSF benefits from the transient estimates. The data is generated by

$y_t = G_o(q)\, u_t + e_t$, (6.68)

where

$G_o(q) = \frac{q^{-1} - 2 q^{-2}}{1 - 1.7 q^{-1} + 0.9 q^{-2}}$, (6.69)

and $\{u_t\}$ and $\{e_t\}$ are uncorrelated Gaussian white noise sequences with variances 1 and 3.5, respectively. The transfer function in (6.69) has an impulse response that is well approximated by 100 coefficients, as shown in Fig. 6.1. Performance is evaluated by computing the FIT, in percent, given by

$\mathrm{FIT} = 100 \left( 1 - \frac{\| g_o - \hat g \|}{\| g_o - \mathrm{mean}(g_o) \|} \right)$, (6.70)

where, in this case, $g_o$ is a vector with the true impulse response coefficients, and $\hat g$ with the impulse response coefficients from the estimated model. Sufficiently long impulse responses are taken to make sure that the respective tails (i.e., for the true and the estimated system) are sufficiently close to zero to not affect the mean squared error. The simulation consists of 200 Monte Carlo runs and seven different sample sizes between $N = 10^2$ and $N = 10^4$, as indicated in Fig. 6.2. The following methods are compared:

• the prediction error method, starting at the true parameters, with a maximum of 1000 iterations;
• the weighted null-space fitting method with truncated transient, with a maximum of 10 iterations;
• the weighted null-space fitting method with transient estimated, with a maximum of 10 iterations.

For all the methods, a function tolerance of $10^{-5}$ is used as stopping criterion. For WNSF, the unstructured model order $n$ is chosen from a grid of ten linearly spaced values between 20 and $N/2$, according to which minimizes a quadratic cost function of the prediction errors for the low order model. To do this, we distinguish two situations. When the transient is not computed, we consider the cost function value, for some estimate $\hat\theta_N$, as

$V_N(\hat\theta_N) = \frac{1}{N - n} \sum_{t=n+1}^{N} \left( y_t - \frac{L(q,\hat\theta_N)}{F(q,\hat\theta_N)}\, u_t \right)^2$. (6.71)


Figure 6.1: Impulse response of Go(q).

This means that we neglect the first $n$ values of $y_t$ and their predictions, since to compute the latter we require knowledge of $u_t$ for $t \le 0$. When the transient is computed, we consider the cost function value to be

$V_N(\hat{\bar\theta}_N) = \frac{1}{N} \sum_{t=1}^{N} \left( y_t - \frac{L(q,\hat{\bar\theta}_N)}{F(q,\hat{\bar\theta}_N)}\, \bar u_t - \frac{\bar L(q,\hat{\bar\theta}_N)}{F(q,\hat{\bar\theta}_N)}\, \delta_t \right)^2$, (6.72)

with $\bar u_t$ defined by (6.13). Thus, estimating the transient also allows us to have a longer prediction error sequence for computing the cost function.

Fig. 6.2 shows the average FIT as function of the sample size, where it can be observed that WNSF with transient estimation (full red curve) has approximately the same average performance as PEM (dotted blue curve) for sample sizes $N \ge 200$. If WNSF is applied without transient estimation (dashed green curve), a comparable performance is only observed for much larger values of $N$. Regarding the case $N = 100$, PEM performs better than WNSF even with estimated transient. This can be explained by the fact that the maximum order $n = N/2$ is not large enough to fully capture the dynamics of the system. Nevertheless, we still observe a great improvement with respect to the case where the transient parameters are not estimated.

Figure 6.2: Average FIT obtained for each sample size, with 200 Monte Carlo runs, comparing PEM (dotted blue curve) and WNSF, with transient truncated (dashed green curve) and estimated (full red curve).

6.6 Conclusions

In this chapter, we have proposed an approach for transient estimation with high order unstructured models. Although the estimates of the high order model coefficients are not improved in this step, some methods can benefit from the extra estimated variables in subsequent steps to obtain a structured model. The WNSF method is an example of such a method, as it first estimates a high order model and then reduces this unstructured estimate to a structured one. In the model reduction step, the method uses the relation between the high order model coefficients and the structured model parameters. Since the transient variables estimated in the unstructured model step also have a relation to the low order model polynomials, they can be used to improve the estimate of the model of interest. A simulation showed that this procedure improves the accuracy of the estimates. In particular, with transient estimation, WNSF performs similarly to PEM started at the true parameters for much smaller sample sizes. Although the asymptotic properties remain unchanged, improvements are observed for finite sample sizes. Although WNSF was chosen to illustrate how the proposed procedure to estimate the transient parameters can be used, the same idea is in principle applicable to other methods that use a high order model as an intermediate step to obtain a low order model, of which MORSM is an example.

Chapter 7 ARX Modeling of Unstable Box-Jenkins Systems

The methods proposed in this thesis use high order ARX models to approximate systems described by other structured models, such as ARMAX or Box-Jenkins (BJ). As a particular case, high order FIR models can be used to approximate output-error (OE) models. Because high order models have the inherent limitation that their estimates have high variance in comparison to more parsimonious model descriptions, the estimated high order model is then used to obtain a structured model with lower variance. This is the essential idea of the methods proposed in this thesis. For many linear system structures, the asymptotic (in both model order and sample size) values of the estimated ARX model polynomials can be related to the polynomials of the true system. This standard result has been discussed in Chapters 2 and 3, and is the basis of methods like WNSF. One limitation of this result is that it uses the assumption that the plant is stable, or that the unstable poles are shared with the true noise model. Although this is always the case for systems with structures where the plant and noise model share the same poles (e.g., ARMA, ARMAX), it is not generally true for OE or BJ models. In the latter cases, the noise model is either fixed to identity or independent of the plant, respectively. The consequence is that, if the plant dynamics are unstable, these models have unstable predictors. Then, the result that has been used as basis for the WNSF method, relating the high order FIR or ARX model parameters to the parameters of the low order model, is not valid. In this chapter, we study the following question: if the plant is unstable and the true noise model does not share the unstable poles, what are the limit values of an infinite order ARX model? We derive this result and show how the plant, noise model and noise variance can still be recovered from the ARX model, using appropriate corrections for the noise model and the noise variance. Also, we demonstrate that, in the unstable case, an FIR model is no longer sufficient to model an OE structure; instead, an ARX is required. Then, we observe that the increasing variance associated

with an increasing number of parameters in the ARX model does not affect the estimation of the unstable poles. Although a formal proof is left for future work, we provide the theoretical background for that observation. Finally, we discuss how this procedure can be incorporated in the WNSF method for estimation of unstable systems. We observe that straightforward application of the method is not possible, and suggest a procedure for extending WNSF to this case. The proposed procedure loses asymptotic efficiency, but a simulation example indicates promising results. The chapter is structured as follows. In Section 7.1, we state the problem to be analyzed. In Section 7.2, we recall how the analogous result for the stable case can be derived; in Section 7.3, we extend the result to the unstable case and show how to asymptotically recover the plant, noise model, and noise variance from the high order ARX model. In Section 7.4, we discuss practical aspects, followed by numerical examples in Section 7.5. In Section 7.6, we discuss how PEM can be applied with unstable predictors, making a parallel with our result. In Section 7.7, we use the result obtained to suggest a procedure to apply WNSF to unstable systems. Concluding remarks are given in Section 7.8. The contributions in this chapter are mostly based on [76, 77].

7.1 Problem Statement

Consider that data is generated by

y_t = G_o(q) u_t + H_o(q) e_t,    (7.1)

where u_t is the plant input, \{e_t\} is Gaussian white noise with variance \sigma_o^2, y_t is the output, and G_o(q) and H_o(q) are the true plant and noise models, respectively, which are rational transfer functions in the delay operator q^{-1}, given by

G_o(q) := \frac{L_o(q)}{F_o(q)} := \frac{l_1^o q^{-1} + \cdots + l_{m_l}^o q^{-m_l}}{1 + f_1^o q^{-1} + \cdots + f_{m_f}^o q^{-m_f}},
H_o(q) := \frac{C_o(q)}{D_o(q)} := \frac{1 + c_1^o q^{-1} + \cdots + c_{m_c}^o q^{-m_c}}{1 + d_1^o q^{-1} + \cdots + d_{m_d}^o q^{-m_d}},    (7.2)

where m_l, m_f, m_c, and m_d are finite positive integers. We impose that C_o(q) and D_o(q) are stable polynomials (i.e., all the roots lie strictly inside the unit circle), while F_o(q) does not have roots on the unit circle. Since F_o(q) is not required to be stable, we consider that the data is obtained with stabilizing feedback,

y_t = G_o(q) S_o(q) r_t + H_o(q) S_o(q) e_t,
u_t = S_o(q) r_t - K_o(q) H_o(q) S_o(q) e_t,    (7.3)

where \{r_t\} is a known external reference uncorrelated with \{e_t\}, and S_o(q) is the sensitivity function,

S_o(q) = \frac{1}{1 + K_o(q) G_o(q)},    (7.4)

with K_o(q) a stabilizing regulator. Consider also the ARX model

A(q,\eta) y_t = B(q,\eta) u_t + e_t,    (7.5)

with infinite order polynomials

A(q,\eta) = 1 + \sum_{k=1}^{\infty} a_k q^{-k}, \qquad B(q,\eta) = \sum_{k=1}^{\infty} b_k q^{-k},    (7.6)

where

\eta = [a_1 \ a_2 \ \cdots \ b_1 \ b_2 \ \cdots]^\top.    (7.7)

Using a quadratic cost, the PEM estimate of the ARX model asymptotically minimizes the cost function

J(\eta) := \bar E\left[\left(A(q,\eta) y_t - B(q,\eta) u_t\right)^2\right],    (7.8)

as the number of samples tends to infinity. Because the data are generated by (7.3), the limit cost function can be expressed as

J = \bar E\left[\left((A G_o - B) S_o r_t + (A + K_o B) H_o S_o e_t\right)^2\right].    (7.9)

Here, and in similar expressions that follow, the polynomial arguments are dropped for notational simplicity, and we recall that we want to minimize J with respect to the polynomials A and B, given by (7.6) and parametrized by (7.7). By minimizing (7.9), the global minimizers of (7.9)—A_o(q) and B_o(q)—can be related to G_o(q) and H_o(q).

7.2 The ARX Minimizer of a Stable BJ Model

To find the global minimizers of (7.9), we first introduce the following result.

Proposition 7.1. Let Z_o(q) and its inverse be power series in q^{-1} that are analytic outside and on the unit circle, and such that Z_o(\infty) = 1. Let X(q)—also a power series in q^{-1} satisfying X(\infty) = 1—be the argument of the cost function

J = \frac{1}{2\pi} \int_{-\pi}^{\pi} |X(e^{i\omega})|^2 |Z_o(e^{i\omega})|^2 \, d\omega.    (7.10)

Then, (7.10) has the unique minimizer X_o(q) = Z_o^{-1}(q) and minimum J_o = 1.

Proof. This is a standard result (e.g., Problem 3G.3 in [1]). A proof is included for completeness. The product X(q) Z_o(q) can be expanded as a polynomial,

X(q) Z_o(q) = \sum_{k=0}^{\infty} g_k q^{-k},    (7.11)

where g_0 = 1, since X(\infty) = Z_o(\infty) = 1. Using Parseval's identity on (7.10) together with (7.11) yields

J = 1 + \sum_{k=1}^{\infty} |g_k|^2 \ge 1.    (7.12)

The minimum J_o = 1 is obtained for X_o(q) = Z_o^{-1}(q), since Z_o(q) is inversely stable. Because the inverse is unique, the minimum will not be attained for any other X(q), since then at least one g_k \ne 0, k > 0.

In the following theorem, we derive the global minimizers of (7.9) when G_o(q) is stable, and H_o(q) is stable and inversely stable. Although this is a standard result, we include it for completeness.

Proposition 7.2. Let G_o(q) be stable and H_o(q) be stable and inversely stable. The asymptotic minimizers of (7.9) are given by

A_o(q) = \frac{1}{H_o(q)}, \qquad B_o(q) = \frac{G_o(q)}{H_o(q)},    (7.13)

and the attained global minimum is

J_o = \sigma_o^2.    (7.14)

Proof. Using Parseval's identity and the uncorrelation between \{r_t\} and \{e_t\}, (7.9) can be written as

J = J_r + J_e,    (7.15)

where

J_r = \frac{1}{2\pi} \int_{-\pi}^{\pi} |A G_o - B|^2 |S_o|^2 \Phi_r \, d\omega,    (7.16)
J_e = \frac{1}{2\pi} \int_{-\pi}^{\pi} |A + K_o B|^2 |H_o S_o|^2 \sigma_o^2 \, d\omega,    (7.17)

with \Phi_r the spectrum of \{r_t\}. Therefore, since A(q) + K_o(q) B(q) is monic, and H_o(q) S_o(q) is monic, stable, and inversely stable, they can be expanded as power series in q^{-1} satisfying the conditions of X(q) and Z_o(q), respectively, in Proposition 7.1. Then, we can use this result to conclude that J_e is minimized by

A(q) + K_o(q) B(q) = \frac{1}{H_o(q) S_o(q)}.    (7.18)

If (7.16) can be made zero while satisfying (7.18), a global minimizer for J has been found. This is done by setting A_o(q) and B_o(q) according to (7.13), which yields the minimum (7.14).

A consequence of Proposition 7.2 is that an infinite order ARX model can be used to asymptotically recover a BJ model, as well as the noise variance \sigma_o^2. In fact, (7.13) and (7.14) also hold for unstable G_o(q), as long as H_o(q) shares the same unstable poles. We can see this by noticing that, in this case, H_o(q) S_o(q) is still inversely stable. However, that property is lost when G_o(q) is unstable and its unstable poles do not cancel with those of H_o(q). This is the reason why Proposition 7.2 does not apply to unstable systems of BJ structure. In the next section, we seek the minimizers A_o(q) and B_o(q) of (7.9) when G_o(q) is unstable and H_o(q) does not share its unstable poles.

7.3 The ARX Minimizer of an Unstable BJ Model

Before stating the main result of this chapter, we introduce the following definitions. Consider the factorization

F(q) = F^s(q) F^a(q),    (7.19)

where F^s(q) and F^a(q) contain the stable (magnitude less than one) and anti-stable (magnitude larger than one) roots of F(q), respectively. Also, we define F^{a\star}(q) as the polynomial with the roots of F^a(q) mirrored inside the unit circle—that is, if

F^a(q) = \prod_{k=1}^{m_a} (1 - p_k q^{-1}),    (7.20)

where \{p_1, \dots, p_{m_a}\} are the unstable roots of F(q), then

F^{a\star}(q) = \prod_{k=1}^{m_a} (1 - p_k^{-1} q^{-1}).    (7.21)

If evaluation at the true values is considered, we denote the unstable roots of F_o(q) by \{p_1^o, \dots, p_{m_a}^o\}. We now extend Proposition 7.2 to the case where G_o(q) is allowed to be unstable.

Theorem 7.1. Let H_o(q) be stable and inversely stable, and factorize F_o(q) according to (7.19). The asymptotic minimizers of (7.9) are given by

A_o(q) = \frac{1}{H_o(q)} \frac{F_o^a(q)}{F_o^{a\star}(q)}, \qquad B_o(q) = \frac{L_o(q)}{H_o(q) F_o^s(q) F_o^{a\star}(q)},    (7.22)

and the attained global minimum is

J_o = \left|\frac{F_o^a(e^{i\omega})}{F_o^{a\star}(e^{i\omega})}\right|^2 \sigma_o^2 = \left|\frac{\prod_{k=1}^{m_a} [1 - p_k^o]}{\prod_{k=1}^{m_a} [1 - 1/p_k^o]}\right|^2 \sigma_o^2.    (7.23)

Proof. Using Parseval's identity and the uncorrelation between \{r_t\} and \{e_t\}, (7.9) can be written as

J = J_r + J_e,    (7.24)

where

J_r = \frac{1}{2\pi} \int_{-\pi}^{\pi} |A G_o - B|^2 |S_o|^2 \Phi_r \, d\omega,    (7.25)
J_e = \frac{1}{2\pi} \int_{-\pi}^{\pi} |A + K_o B|^2 |H_o S_o|^2 \sigma_o^2 \, d\omega,    (7.26)

with \Phi_r the spectrum of \{r_t\}. Unlike in Proposition 7.2, we cannot directly use Proposition 7.1 to minimize J_e because the inverse of H_o(q) S_o(q) is not analytic outside the unit circle. To still apply Proposition 7.1, let

\tilde S_o(q) := \frac{S_o(q)}{F_o^a(q)} = \frac{F_o^s(q)}{F_o(q) + K_o(q) L_o(q)},    (7.27)

and re-write (7.26) as

J_e = \left|\frac{F_o^a(e^{i\omega})}{F_o^{a\star}(e^{i\omega})}\right|^2 \frac{1}{2\pi} \int_{-\pi}^{\pi} |A + K_o B|^2 |H_o \tilde S_o F_o^{a\star}|^2 \sigma_o^2 \, d\omega,    (7.28)

where |F_o^a(e^{i\omega})/F_o^{a\star}(e^{i\omega})| is moved outside the integral since it is an all-pass filter, and can be written as

\left|\frac{F_o^a(e^{i\omega})}{F_o^{a\star}(e^{i\omega})}\right| = \left|\frac{F_o^a(1)}{F_o^{a\star}(1)}\right| = \left|\frac{\prod_{k=1}^{m_a} [1 - p_k^o]}{\prod_{k=1}^{m_a} [1 - 1/p_k^o]}\right|.    (7.29)

Therefore, since A(q) + K_o(q) B(q) is monic, and H_o(q) \tilde S_o(q) F_o^{a\star}(q) is monic, stable, and inversely stable, we are in the situation of Proposition 7.1, and J_e is minimized by

A(q) + K_o(q) B(q) = \frac{1}{H_o(q) \tilde S_o(q) F_o^{a\star}(q)}.    (7.30)

If (7.25) can be made zero while satisfying (7.30), a global minimizer for J has been found. This is done by setting A_o(q) and B_o(q) according to (7.22), which yields the minimum (7.23).

Notice that if the plant is stable, F_o^a(q) = 1 = F_o^{a\star}(q) and F_o^s(q) = F_o(q), and (7.22) reduces to (7.13). Moreover, using a similar approach to Theorem 7.1, it is straightforward to extend this result to the case with a non-minimum phase noise model H_o(q). In this case, the noise model H_o(q) in (7.22) is replaced by its minimum phase equivalent. This corresponds to the well known result that PEM identifies an equivalent minimum phase noise model if the true noise model is non-minimum phase [78].
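As a quick numerical check of the constant in (7.23) and (7.29)—a sketch under our own conventions, using the pole locations of the example in Section 7.5—the all-pass gain equals the product of the squared pole magnitudes:

```python
import numpy as np

p = np.array([1 + 1j, 1 - 1j])   # unstable poles of F_o(q) in Section 7.5

# |F_o^a(1) / F_o^a*(1)|^2 from (7.29), the factor relating J_o and sigma_o^2:
gain = np.abs(np.prod(1 - p) / np.prod(1 - 1 / p)) ** 2
print(gain)                      # 4.0, so J_o = 4 sigma_o^2; cf. (7.47)

# Since F_o^a/F_o^a* is all-pass, the same constant is prod |p_k|^2:
print(np.prod(np.abs(p) ** 2))   # 4.0
```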

Recovering G_o(q) from the asymptotic minimizers of the ARX model is straightforward, as

G_o(q) = \frac{B_o(q)}{A_o(q)}.    (7.31)

The following corollary describes how the noise model and the variance \sigma_o^2 can be retrieved.

Corollary 7.1. Let J_o be the asymptotic minimum of (7.9), and A_o(q) the corresponding minimizer. Then, the noise model H_o(q) and the noise variance \sigma_o^2 can be retrieved by

H_o(q) = \frac{1}{A_o(q)} \frac{A_o^a(q)}{A_o^{a\star}(q)}    (7.32)

and

\sigma_o^2 = J_o \left|\frac{A_o^{a\star}(e^{i\omega})}{A_o^a(e^{i\omega})}\right|^2 = J_o \left|\frac{\prod_{k=1}^{m_a} [1 - 1/p_k^o]}{\prod_{k=1}^{m_a} [1 - p_k^o]}\right|^2,    (7.33)

respectively.

Proof. The asymptotic minimizer A_o(q) can be factorized into one polynomial F_o^a(q) with anti-stable roots and one factor with only stable roots, corresponding to 1/(H_o(q) F_o^{a\star}(q)). Thus, F_o^a(q) can be retrieved as the anti-stable roots of A_o(q)—that is,

F_o^a(q) = A_o^a(q).    (7.34)

Then, (7.32) follows directly from (7.22) and (7.34), while (7.33) follows from (7.23) and (7.34).

Comparing (7.32) with (7.13), we observe that, if the standard result (7.13) is used, the magnitude of the noise model H_o(q) will be underestimated by a constant bias, since A_o^a(q)/A_o^{a\star}(q) is an amplifying all-pass filter. Also, comparing (7.33) and (7.14), we observe that the variance \sigma_o^2 will be overestimated if the appropriate corrections due to the unstable plant are not made. However, notice that if we are only interested in recovering the noise spectrum

\Phi_v(\omega) := |H_o(e^{i\omega})|^2 \sigma_o^2    (7.35)

based on the high order ARX model, we do not need to perform any correction due to the unstable plant. To see this, assume that we have obtained the asymptotic minimizer A_o(q) of (7.9), as well as the minimum J_o. Then, using Corollary 7.1, we can write (7.35) as

\Phi_v(\omega) = \left|\frac{1}{A_o(e^{i\omega})} \frac{A_o^a(e^{i\omega})}{A_o^{a\star}(e^{i\omega})}\right|^2 J_o \left|\frac{A_o^{a\star}(e^{i\omega})}{A_o^a(e^{i\omega})}\right|^2 = \left|\frac{1}{A_o(e^{i\omega})}\right|^2 J_o.    (7.36)

On the other hand, if we do not assume that G_o(q) is unstable, and use (7.13) and (7.14) instead, we still directly obtain

\Phi_v(\omega) = J_o \left|\frac{1}{A_o(e^{i\omega})}\right|^2    (7.37)

by replacing (7.13) and (7.14) in (7.35). Nevertheless, if we want to use the high order ARX model estimate to obtain a parametric noise model—as we will later do with WNSF—Corollary 7.1 is used.
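If only the noise spectrum is needed, (7.36)–(7.37) can be evaluated directly from the high order ARX estimate; the following is a minimal sketch (the function name and the use of scipy.signal.freqz are our assumptions):

```python
import numpy as np
from scipy.signal import freqz

def noise_spectrum(a, J, n_freq=512):
    # Phi_v(w) = J_o * |1/A_o(e^{iw})|^2, cf. (7.36)-(7.37): no correction
    # for the unstable plant is needed when only the spectrum is wanted.
    w, Aw = freqz([1.0], np.concatenate(([1.0], np.asarray(a))), worN=n_freq)
    return w, J / np.abs(Aw) ** 2
```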

7.4 Practical Aspects

So far, we have only discussed consistency of the ARX model. For the consistency results to be valid, the ARX model has to be of infinite order. Otherwise, the system (7.1) is not in the model set defined by (7.5), and a bias is induced by the truncation. However, both for practical estimation aspects and for a statistical analysis of the asymptotic properties, considering an infinite order model is not adequate. Instead, consider the ARX model (7.5), but with polynomials given by

A(q,\eta^n) = 1 + \sum_{k=1}^{n} a_k q^{-k}, \qquad B(q,\eta^n) = \sum_{k=1}^{n} b_k q^{-k},    (7.38)

with parameter vector

\eta^n = [a_1 \ \cdots \ a_n \ b_1 \ \cdots \ b_n]^\top.    (7.39)

For a particular estimation problem, the order n is chosen large enough such that these polynomials capture the dynamics of the true system to a desired accuracy. In that case, the bias error due to the truncation is assumed to be small. For a statistical analysis of the asymptotic properties of the ARX estimate, consistency is guaranteed by letting the order n tend to infinity as function of the sample size N at an appropriate rate, according to [25] and Chapter 4 of this thesis. The inherent limitation of estimating a high order model is that the estimated model will have high variance. However, we observe that this does not apply to the unstable poles of the ARX model, as will be illustrated with a simulation in the next section. Thus, the estimate of F_o^a(q) obtained from (7.34) will have high accuracy in comparison to the complete high order ARX model estimate. In turn, this means that the noise variance can be estimated according to (7.33) with high accuracy. There is theoretical support for the observation that the estimate of the unstable roots of A_o(q) has low variance. For a very general class of models, encompassing both ARX models and Box-Jenkins models, it has been shown that the variance of any unstable root will converge to a finite limit (cf. Theorem 5.1 in [79]). However, this result is not directly applicable to our setting, since such a variance analysis requires that the model order tend to infinity as function of the sample size N. There is no conceptual reason limiting the extension of this theorem to our setting. However, due to the technical effort required, it is considered for future work, as discussed in Chapter 8. Theorem 7.1 also has implications for modeling OE structures with high order models. In this case, a high order finite impulse response (FIR) model is typically estimated instead of an ARX model, since H_o(q) = 1 implies A_o(q) = 1. However, that is only the case if G_o(q) is stable. If G_o(q) is unstable, a finite order polynomial B(q,\eta^n) is not sufficient to approximate G_o(q) arbitrarily well. An approach to estimate FIR models for unstable systems has been proposed in [80]. Alternatively, an ARX model can be used to asymptotically capture an unstable OE system, as (7.31) remains valid.

7.5 Examples

Consider the plant and noise models

G_o(q) = \frac{q^{-1} - 1.7q^{-2}}{1 - 2q^{-1} + 2q^{-2}}, \qquad H_o(q) = \frac{1 + 0.2q^{-1}}{1 - 0.9q^{-1}},    (7.40)

which are used to generate data according to (7.3) with controller K_o(q) = 1, where \{r_t\} and \{e_t\} are uncorrelated Gaussian white noise sequences with unit variance. Notice that the plant G_o(q) has a pair of unstable complex poles at 1 \pm i. We use this system for two examples. First, we illustrate the limit properties of the ARX model that were shown in Section 7.3; then, we use different orders of the ARX model to illustrate the observation in Section 7.4 regarding the variance of the estimated unstable poles.
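The closed-loop data generation (7.3) with K_o(q) = 1 can be sketched as follows; the seed, sample size, and variable names are illustrative assumptions (note that with unit feedback, u_t = r_t − y_t):

```python
import numpy as np
from scipy.signal import lfilter

rng = np.random.default_rng(0)
N = 100_000
r, e = rng.standard_normal(N), rng.standard_normal(N)

L_o = [0.0, 1.0, -1.7]              # numerator of G_o: q^{-1} - 1.7 q^{-2}
F_o = [1.0, -2.0, 2.0]              # unstable: poles at 1 +/- 1j
C_o, D_o = [1.0, 0.2], [1.0, -0.9]  # H_o = C_o/D_o

# S_o = 1/(1 + G_o) = F_o/(F_o + L_o); the closed loop is stable by design.
den = np.polyadd(F_o, L_o)
y = lfilter(L_o, den, r) + lfilter(np.convolve(C_o, F_o),
                                   np.convolve(D_o, den), e)
u = r - y                           # u = S_o r - K_o H_o S_o e, with K_o = 1
```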

7.5.1 Limit Properties of the ARX Model

The objective of this example is to illustrate the result obtained in Theorem 7.1, and how Corollary 7.1 can be used to obtain estimates of G_o(q) and H_o(q) when the plant is unstable. Because these results concern limit values (in both model order and sample size) of the estimates of A(q,\eta^n) and B(q,\eta^n), they can be more clearly illustrated if the estimation error is kept small. Thus, to minimize the bias error due to the ARX model truncation, we choose G_o(q) and H_o(q) such that the coefficients of A_o(q) and B_o(q) decay quickly, which allows us to use a relatively low order (n = 15). The low model order together with a large sample size (N = 100000) ensure that also the variance error will be small. Finally, we are also interested in estimating the noise variance, \sigma_o^2. The procedure is as follows. First, the ARX polynomials A(q,\eta^n) and B(q,\eta^n) are estimated by minimizing the cost function

J(\eta^n) = \frac{1}{N} \sum_{t=1}^{N} \left[A(q,\eta^n) y_t - B(q,\eta^n) u_t\right]^2,    (7.41)

which is a consistent estimate of (7.8) for finite sample size. This is a least squares problem, solved, similarly to (4.53), by

\hat\eta_N^n = [R_N^n]^{-1} r_N^n,    (7.42)

where

R_N^n = \frac{1}{N} \sum_{t=1}^{N} \varphi_t^n (\varphi_t^n)^\top, \qquad r_N^n = \frac{1}{N} \sum_{t=1}^{N} \varphi_t^n y_t,    (7.43)

and

\varphi_t^n = [-y_{t-1} \ \cdots \ -y_{t-n} \ u_{t-1} \ \cdots \ u_{t-n}]^\top.    (7.44)

With (7.42), we obtain estimates of A_o(q) and B_o(q)—A(q,\hat\eta_N^n) and B(q,\hat\eta_N^n)—at which the minimum J(\hat\eta_N^n) is obtained. Then, we estimate the plant and noise model from the estimated ARX polynomials. Motivated by (7.31) and (7.32), we compute

\hat G(q) := \frac{B(q,\hat\eta_N^n)}{A(q,\hat\eta_N^n)}, \qquad \hat H(q) := \frac{1}{A(q,\hat\eta_N^n)} \frac{A^a(q,\hat\eta_N^n)}{A^{a\star}(q,\hat\eta_N^n)}.    (7.45)

In Fig. 7.1 and Fig. 7.2, the Bode plots of G_o(q) and H_o(q) are shown together with their corresponding estimates, and, in the case of the noise model, also the estimate without the correction for the unstable plant is shown. Here, it is observed that the ARX model correctly captures the true system, according to (7.22). In particular, this illustrates the main result of this chapter: when the plant G_o(q) has unstable poles and is parametrized independently of the noise model H_o(q), a high order ARX model is still appropriate to consistently model this system. However, while the standard result (7.13) still applies to consistently retrieve the plant, a consistent estimate of the noise model is obtained by using a correction factor according to (7.32).
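The estimation procedure (7.42)–(7.45) can be condensed into the following sketch, reusing the y and u generated above; the helper names are ours, and np.linalg.lstsq stands in for the explicit normal equations (7.42):

```python
import numpy as np

def arx_ls(y, u, n):
    # Least squares ARX estimate (7.42)-(7.44), with regressors
    # phi_t = [-y_{t-1} ... -y_{t-n} u_{t-1} ... u_{t-n}]^T.
    N = len(y)
    Phi = np.column_stack([-y[n - k:N - k] for k in range(1, n + 1)]
                          + [u[n - k:N - k] for k in range(1, n + 1)])
    eta, *_ = np.linalg.lstsq(Phi, y[n:], rcond=None)
    J = np.mean((y[n:] - Phi @ eta) ** 2)   # attained cost, cf. (7.41)
    return eta[:n], eta[n:], J              # a-, b-coefficients, and J

a, b, J = arx_ls(y, u, n=15)
roots = np.roots(np.concatenate(([1.0], a)))
p_hat = roots[np.abs(roots) > 1.0]          # estimated anti-stable roots
A_a = np.real(np.poly(p_hat))               # A^a(q, eta_hat)
A_astar = np.real(np.poly(1.0 / p_hat))     # A^a*(q, eta_hat), cf. (7.21)
# G_hat = B/A and H_hat = (1/A) * A^a/A^a* then follow as in (7.45).
```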

Finally, we are interested in estimating the noise variance, \sigma_o^2. With this purpose, motivated by (7.33), we calculate

\hat\sigma_o^2 := J(\hat\eta_N^n) \left|\frac{A^{a\star}(e^{i\omega},\hat\eta_N^n)}{A^a(e^{i\omega},\hat\eta_N^n)}\right|^2.    (7.46)

The results obtained for this example were

J(\hat\eta_N^n) = 3.9816, \qquad \hat\sigma_o^2 = 0.9988.    (7.47)

Recalling that \sigma_o^2 = 1, this example supports our theoretical result that the high order ARX model can be used to obtain a consistent estimate of the noise variance, even in our generalized setting for unstable plants, as long as the appropriate correction (7.46) is made. Otherwise, taking (7.14) as estimate for the noise variance would overestimate it by a factor of four, in this case.


Figure 7.1: Bode plot of G(q) (red, dashed) and its estimate Gˆ(q) (blue, full).

7.5.2 Variance of the Estimated Unstable Poles

As ARX models typically need to be of high order to capture the dynamics of BJ systems, the variance of the estimated model will be large. Although this intrinsic limitation was not evident in the previous example, since the dynamics of the considered system can be captured with relatively low orders of A(q) and B(q), and a large sample size was used, it can be made clear by letting the order of the ARX polynomials increase. As the variance of the estimated A(q,\hat\eta_N^n) will be large if more parameters are estimated, also the variance of the estimated poles of G_o(q) should be large, since the poles of G_o(q) are obtained from the roots of A(q,\hat\eta_N^n). However, following the discussion in Section 7.4, we observe that this does not apply to the variance of the unstable poles. To illustrate this, we perform a Monte Carlo simulation with 50 runs, where two ARX models with different orders are computed. The roots of A(q,\hat\eta_N^n) are plotted in Fig. 7.3 and Fig. 7.4 for n = 15 and n = 100, respectively. From these results, it is clear that the variance of the unstable poles is small relative to that of the stable ones, and also that there is no apparent variance increase for the unstable poles when the number of estimated parameters increases.


Figure 7.2: Bode plot of H(q) (red, dashed), its estimate Ĥ(q) (blue, full), and its uncorrected estimate Â^{-1}(q) (green, dash-dotted).

7.6 PEM with Unstable Predictors

In this section, we relate our result to PEM when used to directly estimate a correctly parametrized model with an unstable predictor around the true parameter values. The prediction error method requires a stable predictor to be applied. However, as we have discussed in this chapter, BJ models have unstable predictors if the plant is unstable, and thus a standard prediction error approach cannot be applied in this case. Techniques to deal with unstable predictors have been proposed in the literature (e.g., [78, 81]). In this section, we review the approach in [81]. The purpose is to observe a connection with our result in Theorem 7.1, and it will also be used in our extension of the WNSF method to unstable BJ systems. Let the data be generated by (7.1), according to (7.2). We are interested in estimating a model with the same structure,


Figure 7.3: Roots of Â(q) obtained with 50 Monte Carlo runs, for n = 15.


Figure 7.4: Roots of Â(q) obtained with 50 Monte Carlo runs, for n = 100.

y_t = G(q,\theta) u_t + H(q,\theta) e_t,    (7.48)

with

G(q,\theta) := \frac{L(q,\theta)}{F(q,\theta)} := \frac{l_1 q^{-1} + \cdots + l_{m_l} q^{-m_l}}{1 + f_1 q^{-1} + \cdots + f_{m_f} q^{-m_f}},
H(q,\theta) := \frac{C(q,\theta)}{D(q,\theta)} := \frac{1 + c_1 q^{-1} + \cdots + c_{m_c} q^{-m_c}}{1 + d_1 q^{-1} + \cdots + d_{m_d} q^{-m_d}},    (7.49)

parametrized with the vector

\theta = [f_1 \ \cdots \ f_{m_f} \ l_1 \ \cdots \ l_{m_l} \ c_1 \ \cdots \ c_{m_c} \ d_1 \ \cdots \ d_{m_d}]^\top \in \mathbb{R}^{m_f + m_l + m_c + m_d}.    (7.50)

This is exactly the BJ model estimation described in Section 4.3.2, except that G_o(q) is now not required to be stable. We recall that, as discussed in Section 2.3.1, the main idea of PEM is to minimize a cost function of the prediction errors. For the model (7.48), parametrized by (7.50), the one step ahead prediction errors are given by

\varepsilon_t(\theta) = H^{-1}(q,\theta)[y_t - G(q,\theta) u_t].    (7.51)

The PEM estimate of \theta, if a quadratic cost is used, is then given by the minimizer of the cost function

V_N(\theta) = \frac{1}{N} \sum_{t=1}^{N} \frac{1}{2} \varepsilon_t^2(\theta),    (7.52)

as in (2.34), where N is the sample size. If G_o(q) is unstable, it is not feasible to minimize the cost function (7.52). The reason is that G(q,\theta) u_t explodes when \theta corresponds to an unstable model, and so does the prediction error sequence (7.51). There are several ways to deal with estimating an unstable system with PEM. A simple approach is to parametrize the noise model H(q,\theta) so that the predictor becomes stable. Consider, for example, an ARMAX model (2.20), for which

G(q,\theta) = \frac{L(q,\theta)}{F(q,\theta)}, \qquad H(q,\theta) = \frac{C(q,\theta)}{F(q,\theta)}.    (7.53)

Then, (7.51) becomes

\varepsilon_t(\theta) = \frac{F(q,\theta)}{C(q,\theta)} y_t - \frac{L(q,\theta)}{C(q,\theta)} u_t,    (7.54)

which is stable under the assumption that the noise model does not have zeros on the unit circle. Thus, the ARMAX model guarantees stability of the predictor by forcing G(q,\theta) and H(q,\theta) to share the same poles.
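For illustration, the ARMAX prediction errors (7.54) are easily computed with stable filters only, since C(q,θ) appears as the only denominator; a minimal sketch (names and the use of scipy.signal.lfilter are our assumptions):

```python
import numpy as np
from scipy.signal import lfilter

def pe_armax(y, u, F, L, C):
    # ARMAX prediction errors (7.54): eps = (F/C) y - (L/C) u. Stable as long
    # as C(q) has no roots on or outside the unit circle, even if F(q) does.
    return lfilter(F, C, y) - lfilter(L, C, u)
```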


Parametrizing the noise model to have the same poles as the plant is not always realistic. For example, if the noise model is used to model a sensor, restricting this model to contain eventual unstable plant dynamics is unreasonable from a physical perspective. The BJ model structure has the flexibility of allowing the plant and noise model to be parametrized independently. Thus, if G(q,\theta) is unstable, PEM will have an unstable predictor. The approach proposed in [81] is a way to deal with unstable predictors. For BJ models, the idea is to, instead of (7.49), estimate the model

G(q,\theta) = \frac{L(q,\theta)}{F(q,\theta)}, \qquad H(q,\theta) = \frac{C(q,\theta)}{D(q,\theta)} \frac{F^{a\star}(q,\theta)}{F^a(q,\theta)}.    (7.55)

When this model is used, the prediction errors are

\varepsilon_t(\theta) = \frac{D(q,\theta) F^a(q,\theta)}{C(q,\theta) F^{a\star}(q,\theta)} \left[y_t - \frac{L(q,\theta)}{F^s(q,\theta) F^a(q,\theta)} u_t\right],    (7.56)

which is a stable sequence if the noise model is inversely stable, since the anti-stable factor F^a(q,\theta) cancels in the filter applied to u_t. Notice that this technique, despite parametrizing the noise model so that the unstable poles of G(q,\theta) are canceled out in the predictor, does so without affecting the spectrum of the additive noise. The reason is that the factor F^{a\star}(q,\theta)/F^a(q,\theta) is an all-pass filter, and its effect can be compensated with the variance of \{e_t\}. In fact, it has been shown in [81] that the model structures (7.49) and (7.55) are asymptotically equivalent. The difference is that it is numerically possible to minimize (7.52) with the model structure (7.55), since the prediction errors (7.56) will be stable, even if F(q,\theta) is not. We observe that the result in Theorem 7.1 has an interesting correspondence with the approach [81] described above. In particular, the additional factor F_o^{a\star}(q)/F_o^a(q) appearing in the noise model in (7.55) corresponds to the same quantity appearing in the minimizer 1/A_o(q) in (7.22).
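A sketch of the prediction errors (7.56) for the modified BJ model follows; the mirroring of the anti-stable roots and the expanded, manifestly stable form of the two filters are spelled out in the comments (function and variable names are ours):

```python
import numpy as np
from scipy.signal import lfilter

def pe_modified_bj(y, u, L, F_s, F_a, C, D):
    # Prediction errors (7.56): after the anti-stable factor F^a cancels in
    # the input filter, eps = (D F^a)/(C F^a*) y - (D L)/(C F^a* F^s) u,
    # where F^a* mirrors the roots of F^a inside the unit circle (7.21),
    # so every denominator (C, F^a*, F^s) is a stable filter.
    p = np.roots(F_a)                     # anti-stable roots, |p| > 1
    F_astar = np.real(np.poly(1.0 / p))   # mirrored polynomial F^a*(q)
    den_y = np.convolve(C, F_astar)
    den_u = np.convolve(den_y, F_s)
    return (lfilter(np.convolve(D, F_a), den_y, y)
            - lfilter(np.convolve(D, L), den_u, u))
```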

7.7 WNSF with Unstable Predictors

The WNSF method has been described and analyzed in Chapter 4. Consider that the model of interest is of BJ structure. Then, the method uses the relation between the high order ARX model parameters and the parameters of the BJ model polynomials. Due to the results presented in this chapter, it is expected that the relation between these parameters will change if the system is unstable. In this section, we consider again the WNSF method. For convenience, we start by presenting the method from the first step, to observe where application in its standard form fails when the model of interest has an unstable predictor. Then, we observe that the WNSF cannot be straightforwardly extended to the unstable case, as the relation between the high order ARX model parameters and the parameters of the BJ model polynomials is no longer linear. Finally, we propose a solution to maintain the WNSF as a three step weighted least squares method. Although this is done at the cost of asymptotic efficiency, a simulation study still shows good performance.

7.7.1 Algorithm

Following the standard WNSF steps, the first step is to estimate a high order ARX model (7.5), parametrized according to (7.38) and (7.39). The second step consists of writing a relation between the high order ARX model parameters and the BJ model parameters. This was, in the stable case, based on (7.13). However, we must now consider (7.22) instead. This provides the relations

C(q,\tilde\theta_o) F^{a\star}(q,\tilde\theta_o) A_o(q) - F^a(q,\tilde\theta_o) D(q,\tilde\theta_o) = 0,
F^s(q,\tilde\theta_o) F^a(q,\tilde\theta_o) B_o(q) - L(q,\tilde\theta_o) A_o(q) = 0,    (7.57)

where we have changed the parameter vector of interest due to the factorization of F(q) into a stable and an anti-stable part. Accordingly, in (7.57) we consider

\tilde\theta = [f_1^s \ \cdots \ f_{m_s}^s \ f_1^a \ \cdots \ f_{m_a}^a \ l_1 \ \cdots \ l_{m_l} \ c_1 \ \cdots \ c_{m_c} \ d_1 \ \cdots \ d_{m_d}]^\top,    (7.58)

where

F^s(q,\tilde\theta) = 1 + f_1^s q^{-1} + \cdots + f_{m_s}^s q^{-m_s},
F^a(q,\tilde\theta) = 1 + f_1^a q^{-1} + \cdots + f_{m_a}^a q^{-m_a},    (7.59)

with m_s and m_a the number of stable and anti-stable roots of F_o(q), respectively, and \tilde\theta_o denoting, as usual, evaluation at the true parameters. We observe from (7.57) that a similar approach to the stable case cannot be taken here. Due to products between entries of \tilde\theta_o, (7.57) cannot be written linearly in \tilde\theta_o. Thus, \tilde\theta cannot be estimated by (weighted) least squares. One possibility is to attempt to minimize an ML cost function explicitly. When, motivated by (7.57), A_o(q) and B_o(q) are replaced by A(q,\hat\eta_N^n) and B(q,\hat\eta_N^n), respectively, we can write

C(q,\tilde\theta_o) F^{a\star}(q,\tilde\theta_o) A(q,\hat\eta_N^n) - F^a(q,\tilde\theta_o) D(q,\tilde\theta_o) = C(q,\tilde\theta_o) F^{a\star}(q,\tilde\theta_o) A(q,\tilde\eta_N^n),
F^s(q,\tilde\theta_o) F^a(q,\tilde\theta_o) B(q,\hat\eta_N^n) - L(q,\tilde\theta_o) A(q,\hat\eta_N^n) = F^s(q,\tilde\theta_o) F^a(q,\tilde\theta_o) B(q,\tilde\eta_N^n) - L(q,\tilde\theta_o) A(q,\tilde\eta_N^n),    (7.60)

where \tilde\eta_N^n := \hat\eta_N^n - \eta_o^n. Since the statistics of A(q,\tilde\eta_N^n) and B(q,\tilde\eta_N^n) are known from the least squares estimate of \hat\eta_N^n, according to (4.55), the left side of (7.60) could be minimized using a maximum likelihood approach. A non-linear optimization procedure could be used to obtain an estimate of the parameter vector \tilde\theta. Then, since F(q,\theta) = F^s(q,\tilde\theta) F^a(q,\tilde\theta), the parameter vector \theta, as defined in (7.50), can be estimated.

To avoid an explicit non-convex optimization, which is the purpose of WNSF, we want to use weighted least squares iteratively. Because that is not applicable directly, we propose the following alternative. Define A^a(q) and A^{a\star}(q) from A(q) in the same way that F^a(q) and F^{a\star}(q) are defined from F(q) in (7.20) and (7.21), respectively. Using the observation that the unstable roots of F_o(q) are estimated with relatively high accuracy through the estimate A(q,\hat\eta_N^n), we use A^a(q,\hat\eta_N^n) as an estimate of F_o^a(q), and assume that this estimate is accurate enough to not require re-estimation. With these considerations, we re-write (7.57) as

C(q,\bar\theta_o) A_o^{a\star}(q) A_o(q) - A_o^a(q) D(q,\bar\theta_o) = 0,
F^s(q,\bar\theta_o) A_o^a(q) B_o(q) - L(q,\bar\theta_o) A_o(q) = 0,    (7.61)

which is linear in \bar\theta_o, where

\bar\theta = [f_1^s \ \cdots \ f_{m_s}^s \ l_1 \ \cdots \ l_{m_l} \ c_1 \ \cdots \ c_{m_c} \ d_1 \ \cdots \ d_{m_d}]^\top \in \mathbb{R}^{m_{f_s} + m_l + m_c + m_d}.    (7.62)

This allows us to re-write (7.61) as

\eta_o^n - \bar Q_n(\eta_o^n) \bar\theta_o = 0,    (7.63)

where

\bar Q_n(\eta^n) = \begin{bmatrix} 0 & 0 & T_n^{a\star}(\eta^n) Q_n^c(\eta^n) & -T_n^a(\eta^n) Q_n^d \\ -T_n^a(\eta^n) Q_n^{f^s}(\eta^n) & Q_n^l(\eta^n) & 0 & 0 \end{bmatrix}.    (7.64)

Here, we have that

T_n^a(\eta^n) = \begin{bmatrix}
1 & & & & \\
a_1^a & 1 & & & \\
\vdots & \ddots & \ddots & & \\
a_{m_a}^a & \cdots & a_1^a & 1 & \\
& \ddots & & \ddots & \\
0 & \cdots & a_{m_a}^a & \cdots & a_1^a & 1
\end{bmatrix},    (7.65)

with T_n^{a\star}(\eta^n) defined analogously from the coefficients a_k^{a\star}, and

Q_n^{f^s}(\eta^n) = \begin{bmatrix}
0 & 0 & \cdots & 0 \\
b_1 & 0 & \cdots & 0 \\
b_2 & b_1 & \cdots & 0 \\
\vdots & & \ddots & \\
b_{m_s} & b_{m_s-1} & \cdots & b_1 \\
\vdots & \vdots & & \vdots \\
b_{n-1} & b_{n-2} & \cdots & b_{n-m_s}
\end{bmatrix},    (7.66)

where the parameters \{a_1^a, \dots, a_{m_a}^a\} and \{a_1^{a\star}, \dots, a_{m_a}^{a\star}\} are given by

A^a(q,\eta^n) = 1 + a_1^a q^{-1} + \cdots + a_{m_a}^a q^{-m_a},
A^{a\star}(q,\eta^n) = 1 + a_1^{a\star} q^{-1} + \cdots + a_{m_a}^{a\star} q^{-m_a},    (7.67)

which are evaluated at the true parameters when the subscript o is used in \eta_o^n. Notice that the parameters \{a_1^a, \dots, a_{m_a}^a\} and \{a_1^{a\star}, \dots, a_{m_a}^{a\star}\} can be obtained from \eta^n, and thus we maintain the notation A^a(q,\eta^n) and A^{a\star}(q,\eta^n). Concerning the remaining matrices in (7.64)—Q_n^c(\eta^n), Q_n^l(\eta^n), and Q_n^d—they are given as in the standard WNSF method, according to (4.60). Then, the second step of the WNSF is, motivated by (7.63) and using an approach analogous to the stable case,

\hat{\bar\theta}_N^{\mathrm{LS}} = \left[\bar Q_n^\top(\hat\eta_N^n) \bar Q_n(\hat\eta_N^n)\right]^{-1} \bar Q_n^\top(\hat\eta_N^n) \hat\eta_N^n.    (7.68)

For the third step, we observe that, when we replace \eta_o^n by \hat\eta_N^n in (7.63), we obtain

\hat\eta_N^n - \bar Q_n(\hat\eta_N^n) \bar\theta_o = \bar T_n(\bar\theta_o, \eta_o^n)(\hat\eta_N^n - \eta_o^n),    (7.69)

with

\bar T_n(\bar\theta, \eta^n) = \begin{bmatrix} T_n^{a\star}(\eta^n) T_n^c(\bar\theta) & 0 \\ -T_n^l(\bar\theta) & T_n^a(\eta^n) T_n^{f^s}(\bar\theta) \end{bmatrix},    (7.70)

where

T_n^{f^s}(\bar\theta) = \begin{bmatrix}
1 & & & & \\
f_1^s & 1 & & & \\
\vdots & \ddots & \ddots & & \\
f_{m_s}^s & \cdots & f_1^s & 1 & \\
& \ddots & & \ddots & \\
0 & \cdots & f_{m_s}^s & \cdots & f_1^s & 1
\end{bmatrix},    (7.71)

and T_n^c(\bar\theta) and T_n^l(\bar\theta) are given as in the standard WNSF algorithm, according to (4.64). Also, note that \bar T_n(\bar\theta, \eta^n) is now a function of \eta^n because the coefficients a_k^a are obtained

from \eta_o. Then, using a similar approach to the stable case, we compute, in the third step,

\hat{\bar\theta}_N^{\mathrm{WLS}} = \left[\bar Q_n^\top(\hat\eta_N^n) W_n(\hat{\bar\theta}_N^{\mathrm{LS}}, \hat\eta_N^n) \bar Q_n(\hat\eta_N^n)\right]^{-1} \bar Q_n^\top(\hat\eta_N^n) W_n(\hat{\bar\theta}_N^{\mathrm{LS}}, \hat\eta_N^n) \hat\eta_N^n,    (7.72)

where

W_n(\hat{\bar\theta}_N^{\mathrm{LS}}, \hat\eta_N^n) = \left[\bar T_n(\hat{\bar\theta}_N^{\mathrm{LS}}, \hat\eta_N^n) [R_N^n]^{-1} \bar T_n^\top(\hat{\bar\theta}_N^{\mathrm{LS}}, \hat\eta_N^n)\right]^{-1}    (7.73)

and R_N^n is given, as usual, according to (2.56). We recall that, despite the high variance of \hat\eta_N^n, A^a(q,\hat\eta_N^n) provides an estimate of the unstable poles of F_o(q) with relatively high accuracy. That motivates its use in the weighting matrix. Moreover, since \bar\theta does not include the anti-stable part of F(q,\theta), we can also use A^a(q,\hat\eta_N^n) to estimate F_o(q), according to

F(q, \hat{\bar\theta}_N^{\mathrm{WLS}}, \hat\eta_N^n) := F^s(q, \hat{\bar\theta}_N^{\mathrm{WLS}}) A^a(q, \hat\eta_N^n).    (7.74)
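Both (7.68) and (7.72) are weighted least squares problems of the same form; a generic sketch is given below (take W as the identity for the second step (7.68)). Solving via a Cholesky factor and lstsq, rather than forming inverses explicitly, anticipates the numerical remark at the end of this section; the function name is ours:

```python
import numpy as np

def wls_step(Q, eta, W):
    # Weighted least squares step, cf. (7.68)/(7.72):
    # theta = (Q^T W Q)^{-1} Q^T W eta, computed without explicit inverses.
    Wh = np.linalg.cholesky(W)           # W = Wh @ Wh.T
    theta, *_ = np.linalg.lstsq(Wh.T @ Q, Wh.T @ eta, rcond=None)
    return theta
```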

Although the WNSF method is asymptotically efficient in the stable case, some accuracy is expected to be lost with this approach. This is because an estimate of F_o^a(q) is not obtained simultaneously with the remaining polynomials, which implies that some correlations are neglected. Nevertheless, because the variance of A^a(q,\hat\eta_N^n) is not significantly affected by the high order of the ARX model, the method is still expected to have good performance.

Output-Error Special Case

In an output-error (OE) setting—that is, with H_o(q) = 1—we have that, in the stable case, A_o(q) = 1. Therefore, a finite impulse response (FIR) model is, in that case, sufficient in the first step of the WNSF method. Then, in the second and third steps, only L(q,\theta) and F(q,\theta) are estimated, as we now have C_o(q) = 1 = D_o(q). However, we observe by taking H_o(q) = 1 in (7.22) that

A_o(q) = \frac{F_o^a(q)}{F_o^{a\star}(q)}, \qquad B_o(q) = \frac{L_o(q)}{F_o^s(q) F_o^{a\star}(q)}.    (7.75)

Therefore, even if the model of interest is OE, a high order ARX model is still required to capture the dynamics of the corresponding unstable system. To discuss the application of WNSF to this case, we re-write (7.75) as

F^{a\star}(q,\tilde\theta_o) A_o(q) - F^a(q,\tilde\theta_o) = 0,    (7.76)
F^s(q,\tilde\theta_o) F^{a\star}(q,\tilde\theta_o) B_o(q) - L(q,\tilde\theta_o) = 0,    (7.77)

and observe that (7.76) and (7.77) are still non-linear in \tilde\theta_o, since the coefficients of F^{a\star}(q,\tilde\theta) depend non-linearly on those of F^a(q,\tilde\theta). A similar approach to the BJ case can be used, where F_o^a(q) is considered already estimated with high accuracy through A^a(q,\hat\eta_N^n). Then, (7.76) no longer contains parameters to be estimated. A similar procedure follows based on the relation

F^s(q,\bar\theta_o) A_o^{a\star}(q) B_o(q) - L(q,\bar\theta_o) = 0    (7.78)

to estimate L_o(q) and F_o^s(q).

7.7.2 Simulation Example

To study the performance of the method, we perform a simulation study with 500 Monte Carlo runs and ten sample sizes logarithmically spaced between N = 200 and N = 200000 (rounded to the nearest integer). The data are collected in closed loop with unit feedback, as

y_t = \frac{G(q,\theta_o)}{1 + G(q,\theta_o)} r_t + \frac{H(q,\theta_o)}{1 + G(q,\theta_o)} e_t,
u_t = \frac{1}{1 + G(q,\theta_o)} r_t - \frac{H(q,\theta_o)}{1 + G(q,\theta_o)} e_t,    (7.79)

where \{r_t\} and \{e_t\} are Gaussian white noise sequences with variances 0.25 and 1, respectively, and

G(q,\theta_o) = \frac{q^{-1} - 0.5q^{-2}}{1 - 1.9q^{-1} + 0.78q^{-2}}, \qquad H(q,\theta_o) = \frac{1 + 0.7q^{-1}}{1 - 0.8q^{-1}}.    (7.80)

The plant G(q,\theta_o) is thus a second order system with a stable pole at q = 0.6 and an unstable pole at q = 1.3. A model with the correct orders will be estimated. Although the WNSF method converges asymptotically at Step 3 of the algorithm, continuing to iterate could improve the statistical properties for finite sample size, as observed in Section 4.5. The method is then implemented with 20 iterations. Also, \theta is estimated for a grid of ARX model orders n \in \{10, 20, \dots, 90, 100\}. Then, based on (7.55), we choose the low order model that minimizes the cost function

V_N(\hat{\bar\theta}_N, \hat\eta_N^n) = \frac{1}{N} \sum_{t=1}^{N} \frac{1}{2} \left(\frac{A^a(q,\hat\eta_N^n)}{A^{a\star}(q,\hat\eta_N^n)} H^{-1}(q,\hat{\bar\theta}_N) \left[y_t - \frac{L(q,\hat{\bar\theta}_N)}{F^s(q,\hat{\bar\theta}_N) A^a(q,\hat\eta_N^n)} u_t\right]\right)^2.    (7.81)

This approach is compared with PEM, initialized at the true parameters, using the same cost function. Performance is evaluated by the normalized mean square error on simulated data—that is,

\mathrm{NMSE} = \frac{\sum_{t=1}^{N} (x_t - \hat x_t)^2}{\sum_{t=1}^{N} \left(x_t - \frac{1}{N}\sum_{t=1}^{N} x_t\right)^2},    (7.82)

where

x_t = \frac{G(q,\theta_o)}{1 + G(q,\theta_o)} r_t,    (7.83)

and \hat x_t is obtained similarly, but by replacing G(q,\theta_o) with a particular model estimate. The results are presented in Fig. 7.5, with the average normalized MSE as function of the sample size. We observe that the WNSF method has a performance very close to PEM started at the true parameters for the whole range of sample sizes used.
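The evaluation criterion (7.82) is a one-liner; a minimal sketch under our naming:

```python
import numpy as np

def nmse(x, x_hat):
    # Normalized MSE (7.82) on noise-free simulated data (7.83).
    x, x_hat = np.asarray(x), np.asarray(x_hat)
    return np.sum((x - x_hat) ** 2) / np.sum((x - np.mean(x)) ** 2)
```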


Figure 7.5: Average normalized MSE as function of sample size, obtained from 500 Monte Carlo runs, comparing the WNSF method (full blue) and PEM (dashed red).

Nevertheless, the accuracy of PEM does not seem to be attained asymptotically. This was expected due to the separate estimation of F_o^a(q). Two practical implementation issues are of importance. First, it is known that, due to the large variance of the high order ARX model, some roots of A(q,\hat\eta_N^n) that converge to stable roots can become unstable for finite N [82]. Since we want to use A^a(q,\hat\eta_N^n) as an estimate of F_o^a(q), it is imperative to be able to detect "fake" unstable roots. To do so, we may again use the fact that the unstable roots of F_o(q) are estimated with relatively high accuracy from A(q,\hat\eta_N^n), compared to the remaining roots. Therefore, if one estimates different ARX model orders or uses different data sets, the location of the "fake" unstable roots will change considerably, while those corresponding to F_o^a(q) will remain within a small region of the complex plane (a sketch of such a detection heuristic is given at the end of this section). Another issue is related to the invertibility of the weighted least squares problem. Note that the inverse of (7.70) can be written as

\bar T^{-1}(\bar\theta, \eta^n) = \begin{bmatrix} [T_n^c(\bar\theta)]^{-1} [T_n^{a\star}(\eta^n)]^{-1} & 0 \\ [T_n^{f^s}(\bar\theta)]^{-1} [T_n^a(\eta^n)]^{-1} T_n^l(\bar\theta) [T_n^c(\bar\theta)]^{-1} [T_n^{a\star}(\eta^n)]^{-1} & [T_n^{f^s}(\bar\theta)]^{-1} [T_n^a(\eta^n)]^{-1} \end{bmatrix},    (7.84)

which is used in the weighting matrix (7.73) when we write it as

W(\hat{\bar\theta}_N^{\mathrm{LS}}, \hat\eta_N^n) = \bar T_n^{-\top}(\hat{\bar\theta}_N^{\mathrm{LS}}, \hat\eta_N^n) \, R_N^n \, \bar T_n^{-1}(\hat{\bar\theta}_N^{\mathrm{LS}}, \hat\eta_N^n).    (7.85)

Evaluated at the estimated parameters \hat\eta_N^n, the matrix T_n^a(\hat\eta_N^n) is a Toeplitz matrix of the coefficients of A^a(q,\hat\eta_N^n). Likewise, its inverse will be a Toeplitz matrix containing the impulse response coefficients of 1/A^a(q,\hat\eta_N^n). Because this transfer function is unstable, its impulse response coefficients explode, and so do the entries of the matrix [T_n^a(\eta^n)]^{-1}, when n is allowed to increase. Nevertheless, it is still possible to solve the weighted least squares problem in a numerically stable way. Instead of computing the inverse explicitly, one should use a QR decomposition.
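The "fake" unstable root detection discussed above can be sketched as a simple persistence check across ARX orders; the tolerance and the matching rule are illustrative assumptions, not a prescription from the thesis:

```python
import numpy as np

def persistent_unstable_roots(a_by_order, tol=0.05):
    # Keep unstable roots of A(q, eta_N^n) that reappear (within tol) for
    # every ARX order n tried; isolated "fake" unstable roots are dropped.
    unstable = [np.roots(np.concatenate(([1.0], np.asarray(a))))
                for a in a_by_order]
    unstable = [r[np.abs(r) > 1.0] for r in unstable]
    kept = [r0 for r0 in unstable[0]
            if all(np.any(np.abs(rs - r0) < tol) for rs in unstable[1:])]
    return np.array(kept)
```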

7.8 Conclusions

In this chapter, we first derived the limit values of ARX models when modeling a BJ structure with an unstable plant. High order ARX models can still be used to model the underlying system in this situation. However, while the estimated B(q)/A(q) still captures the plant, 1/A(q) no longer corresponds to the noise model H(q), but will also depend on the unstable part of the plant G(q). We have shown how the plant, noise model, and noise variance of a system of BJ structure can be obtained from an estimated ARX model in this case. Also, we note that the variance of the estimated unstable poles seems to tend to a finite limit when the number of ARX model parameters increases. These results have implications for methods that use high order ARX models and for the possibility to use them to estimate unstable systems. In particular, we use these results to analyze the applicability of the WNSF method to unstable BJ systems. We observe that WNSF is not straightforwardly extendable to this case, since there is no longer a linear relation between the high order model coefficients and the parameters of the low order model of interest. To deal with this problem, we use A^a(q,\hat\eta_N^n) as an estimate of the anti-stable part of F_o(q), arguing that the unstable poles of the system are estimated with high accuracy through the high order ARX model. Then, the remaining low order model parameters are estimated using the standard WNSF approach. In Chapter 4, it is shown that the WNSF method is asymptotically efficient for stable systems. For unstable systems, our simulation study shows good performance, although it supports the argument that the extension to unstable systems is not asymptotically efficient. This was expected, since the non-simultaneous estimation of F_o^a(q) and the remaining model polynomials ignores correlations between the complete set of model parameters. Deriving the limit variance of A^a(q,\hat\eta_N^n) is of importance to quantify how much is lost in terms of asymptotic covariance compared to the Cramér-Rao bound. Finally, we note that although WNSF has been given as an application of the result on the limit values of high order ARX models for unstable systems, the result is useful for all methods that estimate a high order ARX model as an intermediate step to obtain a structured model of interest.

Chapter 8 Conclusions and Future Research Directions

The purpose of this thesis was to propose and analyze least squares methods for system identification, contributing to the development of accurate and computationally efficient algorithms. The proposed methods provide a bridge between iterative least squares methods and maximum likelihood for model order reduction. Although iterative least squares is applied to avoid a non-convex optimization problem, the methods provide asymptotically efficient estimates in one iteration. The main contribution of this thesis is the weighted null-space fitting (WNSF) method and the analysis of its asymptotic properties, performed in Chapter 4. It is a weighted least squares method for estimation of structured models, consisting of the following three steps. In Step 1, a high order ARX model is estimated by least squares. In Step 2, this high order estimate is reduced to a lower order one by least squares. In Step 3, the low order model is re-estimated by weighted least squares. We show that the method is consistent and asymptotically efficient for Box-Jenkins models, and the same is true for other common linear model structures. In Chapter 5, we proposed a pre-filtered least squares method for estimation of structured plant models, while only a high order estimate of the noise model is considered. This can be seen as a more user friendly method, as it avoids the need to choose an order for the noise model. We refer to this method as model order reduction Steiglitz-McBride (MORSM). In open loop, the plant estimates are asymptotically efficient; in closed loop, they are optimal for a high order noise model. From simulations performed in Chapters 4 and 5, we observed that the proposed methods provide promising convergence properties compared to PEM when initialized with the standard MATLAB procedure. Asymptotic efficiency could also be observed in practice for reasonable sample sizes, by comparing with PEM initialized at the true parameters. The remaining chapters concern extensions for this type of methods. In Chapter 6, we propose an approach for transient estimation in high order models. This approach does not improve the estimation accuracy of the high order model parameters, but the estimates of the transient parameters contain information about the low order

model of interest. Therefore, for methods that use a high order model estimate as an intermediate step to obtain a low order model, the transient estimates can be used to improve the quality of the low order estimate. As WNSF is such a method, we show how the proposed procedure can be incorporated in the method. Also, we illustrate how this improves performance for finite sample size. In Chapter 7, we consider unstable systems such that the unstable poles of the plant are not shared with the noise model. This is the case for output-error and Box-Jenkins models. Then, we derive the limit values of high order ARX models when used to model such unstable systems. We observe that the ARX model estimate still converges to well defined polynomials, but more general expressions are required compared to the standard result. As the WNSF uses the relation between the high order ARX model coefficients and the parameters of the low order model of interest, this is an important result if the WNSF is to be applied to estimate unstable systems. We propose a procedure to generalize WNSF to this case. We now proceed to discuss possibilities for future research directions.

Exact Likelihood Optimization

It is a standard result that applying a maximum likelihood criterion to a sufficient statistic requires only one Gauss-Newton iteration to provide an asymptotically efficient estimate, if the algorithm is initialized at a consistent estimate. In this thesis, we make a connection between this type of methods and methods that apply least squares iteratively. Particularly, given a high order ARX model estimate, the proposed methods also provide asymptotically efficient estimates in two steps: the first to obtain a consistent estimate, the second to obtain an asymptotically efficient one. As the methods proposed in this thesis have an ML interpretation for reducing a high order ARX model to a structured model, it makes sense to investigate further the differences between using iterative least squares and explicit minimization of the cost function (e.g., by Gauss-Newton). Some preliminary results have shown advantages for the proposed methods compared to an exact likelihood optimization with Gauss-Newton, but a more complete analysis is required.

WNSF Generalized Formulation

In this thesis, we presented the WNSF method for common SISO linear model structures. However, the method can be applied to most MIMO variants of such structures. Subspace methods are especially appealing for estimating MIMO state- space models, for which PEM has increasing difficulties in converging to the global optimum as the number of inputs and outputs increases. A study comparing state- space estimation with subspace methods and transfer function estimation with instrumental variables for MIMO systems has been performed in [83]. Being formally between PEM and subspace methods, WNSF is an appropriate alternative to subspace methods, being computationally lighter than PEM, but having optimal statistical properties. Conclusions and Future Research Directions 143

Furthermore, WNSF can be seen as addressing a more general order reduction problem, where there is a linear relation between the high order and the low order model parameters. Therefore, with a proper generalized formulation, WNSF could be used to solve other classes of problems.

Recursive WNSF

Recursive implementation of system identification methods may have computational benefits if there is interest in updating an estimate as more data becomes available (e.g., online estimation). The least squares form of the WNSF should be appropriate for recursive estimation in two steps: first, recursive least squares would be used to update the high order parameters with the new available data; then, the weighted least squares step for order reduction would use this re-computed high order estimate, along with the already available low order estimate to construct the weighting.

MORSM for Systems in Networks

With the rising complexity of engineering systems, estimation of individual systems embedded in networks is an important problem in system identification. Two-stage [84] and instrumental variable [85] methods have recently been proposed to consistently estimate such systems. An initial attempt to apply WNSF to networks has been made in [86]. However, the model order reduction Steiglitz-McBride method has more interesting potential for this application, since it only estimates a structured model of the system of interest, while modeling the remaining part of the network as high order.

Simulation Studies

The simulation examples in this thesis focused on comparing the performance of the proposed methods and the prediction error method. In the future, more methods will be compared in simulation studies. Moreover, the simulations were performed on systems of fixed and known order. If the order of the system is unknown, methods that estimate structured models need to be complemented with some model order selection technique. Future simulation studies will also include this case. Finally, a discussion on implementation issues and computational complexity will also be included.

Limit Variance of Unstable Poles in High Order ARX Model

High order models have the inherent limitation that their estimates have high variance. However, we observed in Chapter 7 that this does not apply to the estimated unstable poles of the ARX model. Theorem 5.1 in [79] provides theoretical support for this observation. Therein, it is shown for a very general class of systems that the variance of any unstable root will converge to a finite limit. However, this theorem is not directly applicable to our setting, because it requires the system to be in the model set. In our setting, the system is only in the model set when the order of the ARX model tends to infinity. Therefore, a variance analysis similar to [79, Theorem 5.1] requires that the model order tend to infinity as function of the sample size N, similarly to the approach in [25] and in the theoretical analysis performed in this thesis. An extension of this theorem to such a setting would allow us to compare the accuracy of the unstable pole estimates obtained through the high order model with the corresponding estimates when PEM is applied with a low order model.

Bibliography

[1] L. Ljung. System Identification. Theory for the User, 2nd ed. Prentice-Hall, 1999.

[2] M. Gevers and L. Ljung. Optimal experiment designs with respect to the intended model application. Automatica, 22(5):543–554, 1986.

[3] U. Forssell and L. Ljung. Some results on optimal experiment design. Auto- matica, 36(5):749–756, 2000.

[4] H. Hjalmarsson. From experiment design to closed-loop control. Automatica, 41(3):393–438, 2005.

[5] M. Gevers. Identification for control: From the early achievements to the revival of experiment design. European Journal of Control, 11(4,5):335–352, 2005.

[6] C. R. Rojas, J. S. Welsh, G. C. Goodwin, and A. Feuer. Robust optimal experiment design for system identification. Automatica, 43(6):993–1008, 2007.

[7] J. Sjöberg, T. McKelvey, and L. Ljung. On the use of regularization in system identification. In Proceedings of the 12th IFAC World Congress, Sydney, Australia, pages 381–386, 1993.

[8] K. Mohan and M. Fazel. Reweighted nuclear norm minimization with appli- cation to system identification. In 2010 American Control Conference, pages 2953–2959. IEEE, 2010.

[9] T. Chen, H. Ohlsson, and L. Ljung. On the estimation of transfer functions, regularizations and Gaussian processes revisited. Automatica, 48(8):1525–1535, 2012.

[10] G. Pillonetto, F. Dinuzzo, T. Chen, G. De Nicolao, and L. Ljung. Kernel methods in system identification, machine learning and function estimation: A survey. Automatica, 50(3):657–682, 2014.

[11] T. Söderström and P. Stoica. System Identification. Prentice Hall, New York, 1989.


[12] P. van Overschee and B. de Moor. Subspace identification for linear systems: theory, implementation, applications. Kluwer Academic Publishers, Boston, 1996.

[13] T. Söderström and P. Stoica. Instrumental Variable Methods for System Identification. Springer Verlag, New York, 1983.

[14] T. Söderström and P. Stoica. Optimal instrumental variable estimation and approximate implementations. IEEE Transactions on Automatic Control, 28(7):757–772, 1983.

[15] P. Young. The refined instrumental variable method: Unified estimation of discrete and continuous-time transfer function models. Journal Européen des Systèmes Automatisés, 42:149–179, 2008.

[16] M. Gilson and P. van den Hof. Instrumental variable methods for closed-loop system identification. Automatica, 41(2):241–249, 2005.

[17] M. Gilson, H. Garnier, P. Young, and P. van den Hof. Refined Instrumental Variable methods for closed-loop system identification. In 15th IFAC Symposium on System Identification, Saint-Malo, France, July 2009.

[18] T. Söderström, P. Stoica, and B. Friedlander. An indirect prediction error method for system identification. Automatica, 27(1):183–188, 1991.

[19] Y. Zhu. Multivariable System Identification for Process Control. Pergamon, 2001.

[20] A. G. Evans and R. Fischl. Optimal least squares time-domain synthesis of recursive digital filters. IEEE Transactions on Audio and Electroacoustics, 21(1):61–65, 1973.

[21] P. Stoica and T. Söderström. The Steiglitz-McBride identification algorithm revisited – convergence analysis and accuracy aspects. IEEE Transactions on Automatic Control, 26(3):712–717, 1981.

[22] Y. Zhu. A Box-Jenkins method that is asymptotically globally convergent for open loop data. In 18th IFAC World Congress, pages 9047–9051, 2011.

[23] Y. Zhu and H. Hjalmarsson. The Box-Jenkins Steiglitz-McBride algorithm. Automatica, 65:170–182, 2016.

[24] E. J. Hannan. Multiple Time Series. Wiley, 1970.

[25] L. Ljung and B. Wahlberg. Asymptotic properties of the least-squares method for estimating transfer functions and disturbance spectra. Advances in Applied Probability, 24:412–440, 1992.

[26] G. Box and G. Jenkins. Time Series Analysis, Forecasting and Control. Holden Day, San Francisco, 1970.

[27] L. Le Cam. On some asymptotic properties of maximum likelihood estimates and related Bayes estimates. University of California Publications in Statistics, 1:277–330, 1953.

[28] K. Åström and T. Söderström. Uniqueness of the maximum likelihood estimates of the parameters of an ARMA model. IEEE Transactions on Automatic Control, 19:769–773, 1974.

[29] T. Söderström. On the uniqueness of maximum likelihood identification. Automatica, 11:193–197, 1975.

[30] G. Goodwin, J. Agüero, and R. Skelton. Conditions for local convergence of maximum likelihood estimation for ARMAX models. In Proc. 13th IFAC Symposium on System Identification, pages 797–802, 2003.

[31] Y. Zhu and W. Heath. Conditions for attaining global minimum in maximum likelihood system identification. In Proc. 15th IFAC Symposium on System Identification, pages 1110–1115, 2009.

[32] D. Eckhard, A. Bazanella, C. R. Rojas, and H. Hjalmarsson. On the convergence of the prediction error method to its global minimum. In 16th IFAC Symposium on System Identification, pages 698–703, 2012.

[33] D. Eckhard, A. S. Bazanella, C. R. Rojas, and H. Hjalmarsson. Input design as a tool to improve the convergence of PEM. Automatica, 49(11), 2013.

[34] D. Eckhard, A. S. Bazanella, C. R. Rojas, and H. Hjalmarsson. Cost function shaping of the output error criterion. Automatica, 2016. Provisionally accepted.

[35] E. L. Lehmann and G. Casella. Theory of Point Estimation. Springer Texts in Statistics. Springer, New York, 1998.

[36] B. Wahlberg. Model reduction of high-order estimated models: the asymptotic ML approach. International Journal of Control, 49(1):169–192, 1989.

[37] B. Wahlberg. On model order reduction in system identification. In Proceedings of the 1986 American Control Conference, pages 1260–1266, Seattle, WA, USA, 1986.

[38] A. H. Whitfield. Transfer function synthesis using frequency response data. International Journal of Control, 43(5):1413–1426, 1986.

[39] C. Sanathanan and J. Koerner. Transfer function synthesis as a ratio of two complex polynomials. IEEE Transactions on Automatic Control, 8(1):56–58, 1963.

[40] M. Viberg and B. Ottersten. Sensor array processing based on subspace fitting. IEEE Transactions on Signal Processing, 39(5):1110–1121, 1991.

[41] B. Ottersten, M. Viberg, and T. Kailath. Analysis of subspace fitting and ML techniques for parameter estimation from sensor array data. IEEE Transactions on Signal Processing, 40(3):590–600, 1992.

[42] Y. Bresler and A. Macovski. Exact maximum likelihood parameter estimation of superimposed exponential signals in noise. IEEE Transactions on Acoustics, Speech, and Signal Processing, 34(5):1081–1089, 1986.

[43] A. K. Shaw. Optimal identification of discrete-time systems from impulse response data. IEEE Transactions on Signal Processing, 42(1):113–120, 1994.

[44] A. K. Shaw, P. Misra, and R. Kumaresan. Identification of a class of multivari- able systems from impulse response data: Theory and computational algorithm. Circuits, Systems and Signal Processing, 13(6):759–782, 1994.

[45] P. Stoica, J. Li, and T. Söderström. On the inconsistency of IQML. Signal Processing, 56(2):185–190, 1997.

[46] M. Kristensson, M. Jansson, and B. Ottersten. Modified IQML and weighted subspace fitting without eigendecomposition. Signal Processing, 79:29–44, 1999.

[47] P. Lemmerling, L. Vanhamme, S. van Huffel, and B. de Moor. IQML-like algorithms for solving structured total least squares problems: a unified view. Signal Processing, 81:1935–1945, 2001.

[48] K. Steiglitz and L. E. McBride. A technique for the identification of linear systems. IEEE Transactions on Automatic Control, 10:461–464, 1965.

[49] J. H. McClellan and D. Lee. Exact equivalence of the Steiglitz-McBride iteration and IQML. IEEE Transactions on Signal Processing, 39(2):509–512, 1991.

[50] P. Regalia. Adaptive IIR Filtering in Signal Processing and Control. Electrical and Computer Engineering. Taylor & Francis, 1994.

[51] W. E. Larimore. Canonical variate analysis in identification, filtering and adaptive control. In Proceedings of the 29th Conference on Decision and Control, pages 596–604, 1990.

[52] M. Verhaegen and P. DeWilde. Subspace model identification, part I: The output-error state-space model identification class of algorithms. International Journal of Control, 56:1187–1210, 1992.

[53] P. van Overschee and B. de Moor. N4SID: Subspace algorithms for the identification of combined deterministic-stochastic systems. Automatica, 30:75–93, 1994.

[54] M. Jansson and B. Wahlberg. A linear regression approach to state-space subspace system identification. Signal Processing, 52:103–129, 1996.

[55] M. Jansson. Subspace identification and ARX modeling. In IFAC Symposium on System Identification, Rotterdam, The Netherlands, 2003.

[56] S. J. Qin. An overview of subspace identification. Computers and Chemical Engineering, 30:1502–1513, 2006.

[57] S. Y. Kung. A new identification and model reduction algorithm via singular value decomposition. In Proceedings of the 12th Asilomar Conference on Circuits, Systems and Computers, pages 705–714, Pacific Grove, CA, 1978.

[58] T. Kailath. Linear Systems. Prentice-Hall, Englewood Cliffs, 1980.

[59] D. Bauer, M. Deistler, and W. Scherrer. Consistency and asymptotic normality of some subspace algorithms for systems without observed inputs. Automatica, 35(7):1243–1254, 1999.

[60] D. Bauer and M. Jansson. Analysis of the asymptotic properties of the MOESP type of subspace algorithms. Automatica, 36(4):497–509, 2000.

[61] D. Bauer. Asymptotic properties of subspace estimators. Automatica, 41(3):359–376, 2005.

[62] A. Chiuso and G. Picci. The asymptotic variance of subspace estimates. Journal of Econometrics, 118(1–2):257–291, 2004.

[63] A. Chiuso and G. Picci. Consistency analysis of some closed-loop subspace identification methods. Automatica, 41(3):377–391, 2005.

[64] M. Galrinho, C. R. Rojas, and H. Hjalmarsson. A weighted least-squares method for parameter estimation in structured models. In 53rd IEEE Conference on Decision and Control, pages 3322–3327, 2014.

[65] M. Galrinho, C. R. Rojas, and H. Hjalmarsson. The weighted null-space fitting method. In preparation.

[66] M. Viberg, B. Wahlberg, and B. Ottersten. Analysis of state space system identification methods based on instrumental variables and subspace fitting. Automatica, 33(9):1603–1616, 1997.

[67] T. Kailath, A. H. Sayed, and B. Hassibi. Linear Estimation. Prentice-Hall, New Jersey, 2000.

[68] V. Peller. Hankel Operators and Their Applications. Springer Monographs in Mathematics. Springer, New York, 2003.

[69] T. Söderström and P. Stoica. System Identification. Prentice Hall International Series in Systems and Control Engineering. Prentice Hall, 1989.

[70] N. Everitt, M. Galrinho, and H. Hjalmarsson. Optimal model order reduction with the Steiglitz-McBride method for open loop data. In preparation.

[71] J. Schoukens, Y. Rolain, G. Vandersteen, and R. Pintelon. User friendly Box-Jenkins identification using nonparametric noise models. In 50th IEEE Conference on Decision and Control, Orlando, Florida, USA, 2011.

[72] U. Forssell and L. Ljung. Closed-loop identification revisited. Automatica, 35:1215–1241, 1999.

[73] P. van den Hof and R. Schrama. An indirect method for transfer function estimation from closed loop data. Automatica, 29(6):1523–1527, 1993.

[74] T. Knudsen. A new method for estimating ARMAX models. In 10th IFAC Symposium on System Identification, pages 611–617, 1994.

[75] M. Galrinho, C. R. Rojas, and H. Hjalmarsson. On estimating initial conditions in unstructured models. In 54th IEEE Conference on Decision and Control, pages 2725–2730, 2015.

[76] M. Galrinho, N. Everitt, and H. Hjalmarsson. ARX modeling of unstable systems. Accepted for publication in Automatica.

[77] M. Galrinho, C. R. Rojas, and H. Hjalmarsson. A weighted least squares method for estimation of unstable systems. In 55th IEEE Conference on Decision and Control (accepted for publication), 2016.

[78] H. Hjalmarsson and U. Forssell. Maximum likelihood estimation of models with unstable dynamics and non-minimum phase noise zeros. In Proc. of 14th IFAC World Congress, pages 13–18, 1999.

[79] J. Mårtensson and H. Hjalmarsson. Variance-error quantification for identified poles and zeros. Automatica, 45(11):2512–2525, 2009.

[80] K. Aljanaideh, B. J. Coffer, and D. S. Bernstein. Closed-loop identification of unstable systems using noncausal FIR models. In 2013 American Control Conference, pages 1672–1677, 2013.

[81] U. Forssell and L. Ljung. Identification of unstable systems using output error and Box-Jenkins model structures. IEEE Transactions on Automatic Control, 45(1):137–141, January 2000.

[82] T. Söderström and P. Stoica. On the stability of dynamic models obtained by least-squares identification. IEEE Transactions on Automatic Control, 26(2):575–577, 1981.

[83] P. Stoica and M. Jansson. MIMO system identification: state-space and subspace approximations versus transfer function and instrumental variables. IEEE Transactions on Signal Processing, 48(11):3087–3099, 2000.

[84] P. M. J. Van den Hof, A. Dankers, P. S. C. Heuberger, and X. Bombois. Identification of dynamic models in complex networks with prediction error methods – basic methods for consistent module estimates. Automatica, 49(10):2994–3006, 2013.

[85] A. Dankers, P. M. J. Van den Hof, X. Bombois, and P. S. C. Heuberger. Errors-in-variables identification in dynamic networks by an instrumental variable approach. In 19th IFAC World Congress, pages 2335–2340, 2014.

[86] M. Galrinho, C. R. Rojas, and H. Hjalmarsson. A least squares method for identification of feedback cascade systems. In Proceedings of the 17th IFAC Symposium on System Identification, pages 98–103, 2015.