Lesson 4 – Instrumental Variables


Ph.D. in Engineering and Applied Sciences — Ph.D. course "Advanced methods for system identification"
Lesson 4: Instrumental variables
Teacher: Mirko Mazzoleni
Place: University of Bergamo

Outline

1. Introduction to error-in-variables problems
2. Least Squares revised
3. The instrumental variable method
4. Estimate an ARX model from ARMAX generated data
5. Application study: the VRFT approach for direct data-driven control
6. Conclusions

Introduction

Many different solutions have been presented for the system identification of linear dynamic systems from noise-corrupted output measurements. On the other hand, estimating the parameters of linear dynamic systems when the input is also affected by noise is recognized as a more difficult problem.

Representations where errors or measurement noises are present on both inputs and outputs are usually called Errors-in-Variables (EIV) models. In the case of static systems, errors-in-variables representations are closely related to other well-known topics such as latent variable models and factor models.

Errors-in-variables models can be motivated in several situations:
• modeling of the dynamics between the noise-free input and the noise-free output. The reason can be to gain a better understanding of the underlying relations (e.g. in econometrics), rather than to make a good prediction from noisy data
• when the user lacks enough information to classify the available signals into inputs and outputs and prefers to use a "symmetric" system model. This is closely connected to the behavioral approach to modeling
• in some settings, especially in non-technical areas such as nature, biology or economics, the identification experiment may not be under the full "control" of the modeler, so we have to work with observational data and treat confounding variables accordingly

Example: data-driven control design

Consider the Virtual Reference Feedback Tuning (VRFT) algorithm [Campi, M. C., Lecchini, A., & Savaresi, S. M. (2002)]:
• an unknown SISO linear system $G(z)$, producing the measured output $y(t) = G(z)\,u(t) + \tilde{y}(t)$, where $\tilde{y}(t)$ is the output noise
• a family of linear parametric 1-DOF controllers $R(z;\boldsymbol{\theta}) = \boldsymbol{\beta}^{T}(z)\,\boldsymbol{\theta}$, with basis functions $\boldsymbol{\beta}(z) = [\beta_1(z)\ \beta_2(z)\ \dots\ \beta_n(z)]^{T}$ and parameter vector $\boldsymbol{\theta} = [\theta_1\ \theta_2\ \dots\ \theta_n]^{T}$ of dimension $d \times 1$
• a model reference specification, given by a reference model $M(z)$: design $R(z;\boldsymbol{\theta})$ such that the resulting closed loop matches $M(z)$,

$$M(z) \;\approx\; \frac{G(z)\,R(z;\boldsymbol{\theta})}{1 + G(z)\,R(z;\boldsymbol{\theta})}$$

[Figure: closed-loop block diagram with reference $\bar{r}(t)$, controller $R(z;\boldsymbol{\theta})$, plant $G(z)$ and output noise $\tilde{y}(t)$]

Given a set of I/O measurements $\{u(t), y(t)\}_{t=1,\dots,N}$, compute:
• the virtual noisy reference $\bar{r}_v(t) = M^{-1}(z)\,y(t)$ — note that the noisy output here plays the role of a noisy «input»
• the filtered noisy virtual error $e_L(t) = L(z)\,[\bar{r}_v(t) - y(t)]$
• the filtered input $u_L(t) = L(z)\,u(t)$

[Figure: virtual-reference scheme — $\bar{r}_v(t)$ feeds the unknown controller, whose output should reproduce $u(t)$]

What we can then minimize is the cost

$$J_N^{VR}(\boldsymbol{\theta}) = \frac{1}{N}\sum_{t=1}^{N}\big(u_L(t) - R(z;\boldsymbol{\theta})\,e_L(t)\big)^2$$

If $R(z;\boldsymbol{\theta})$ is linear in the parameters, we can estimate $\boldsymbol{\theta}$ via Least Squares.
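To make the procedure concrete, the sketch below implements the three VRFT steps and the Least Squares fit in MATLAB. It is a minimal illustration under assumed choices — the reference model $M(z)$, the filter $L(z) = M(z)$, the PI-type controller basis and all numerical values are placeholders, not the course implementation — and it expects plant data vectors u, y to already be in the workspace.

```matlab
% Minimal VRFT sketch (illustrative assumptions; not the course code).
% Assumed: column vectors u, y of measured plant I/O data.
N = numel(y);

% Placeholder reference model M(z) = 0.4 z^-1 / (1 - 0.6 z^-1).
% Virtual reference rv = M^{-1}(z) y, i.e. rv(t) = (y(t+1) - 0.6 y(t)) / 0.4
rv = (y(2:N) - 0.6*y(1:N-1)) / 0.4;        % length N-1 (one-step advance)
ev = rv - y(1:N-1);                        % virtual tracking error

% Filter L(z) = M(z), applied to the virtual error and to the input
eL = filter([0 0.4], [1 -0.6], ev);        % e_L(t) = L(z) [rv(t) - y(t)]
uL = filter([0 0.4], [1 -0.6], u(1:N-1));  % u_L(t) = L(z) u(t)

% Controller linear in theta: R(z;theta) = theta1 + theta2 * z^-1/(1 - z^-1)
% (a discrete PI). Each regressor column is beta_i(z) applied to e_L.
Phi = [eL, filter([0 1], [1 -1], eL)];
theta_LS = Phi \ uL;                       % LS fit of u_L = Phi * theta
```

Note that both regressor columns are built from the filtered virtual error $e_L(t)$, which is computed from the noisy output $y(t)$: this is exactly the noisy-regressor situation discussed next.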
However, in this case the noise is on the regressor signal $e_L(t)$ and not on the output $u_L(t)$.

Basic setup

We now describe the basic setup for an EIV problem:
• $u_0(t)$ and $y_0(t)$ denote the undisturbed input and output, respectively, such that
$$A(z)\,y_0(t) = B(z)\,u_0(t)$$
• $A(z) = 1 - a_1 z^{-1} - \dots - a_{n_a} z^{-n_a}$
• $B(z) = b_1 z^{-1} + \dots + b_{n_b} z^{-n_b}$

so that

$$y_0(t) = a_1 y_0(t-1) + \dots + a_{n_a} y_0(t-n_a) + b_1 u_0(t-1) + \dots + b_{n_b} u_0(t-n_b)$$

[Figure: EIV setup — system $S$ driven by $u_0(t)$ with output $y_0(t)$; the measured $u(t)$ and $y(t)$ are obtained by adding $\tilde{u}(t)$ and $\tilde{y}(t)$]

• $\tilde{u}(t)$ and $\tilde{y}(t)$ are zero-mean disturbances on input and output. The available signals are
$$u(t) = u_0(t) + \tilde{u}(t), \qquad y(t) = y_0(t) + \tilde{y}(t)$$

Substituting the available signals into the noise-free relation gives

$$y(t) - \tilde{y}(t) = a_1\big[y(t-1) - \tilde{y}(t-1)\big] + \dots + a_{n_a}\big[y(t-n_a) - \tilde{y}(t-n_a)\big] + b_1\big[u(t-1) - \tilde{u}(t-1)\big] + \dots + b_{n_b}\big[u(t-n_b) - \tilde{u}(t-n_b)\big]$$

that is,

$$A(z)\,y(t) = B(z)\,u(t) + \underbrace{A(z)\,\tilde{y}(t) - B(z)\,\tilde{u}(t)}_{\eta(t)}$$

Defining
• $\boldsymbol{\varphi}(t) = \big[y(t-1)\ \dots\ y(t-n_a)\ \ u(t-1)\ \dots\ u(t-n_b)\big]^\top$
• $\boldsymbol{\theta} = \big[a_1\ \dots\ a_{n_a}\ \ b_1\ \dots\ b_{n_b}\big]^\top$

we can then write the system as

$$y(t) = \boldsymbol{\varphi}(t)^\top \boldsymbol{\theta} + \eta(t)$$

Estimation problem. Given $N$ observations $\{u(t), y(t)\}_{t=1}^{N}$, estimate the $d \times 1$ system parameter vector $\boldsymbol{\theta} = [a_1\ \dots\ a_{n_a}\ b_1\ \dots\ b_{n_b}]^\top$, with $d = n_a + n_b$.

We denote by $\boldsymbol{\theta}_0$, $A_0(z)$, $B_0(z)$ the true values of the parameters and polynomials.

Consistency. An estimated parameter vector $\hat{\boldsymbol{\theta}}_N$ based on $N$ data points is said to be consistent if it converges to its "true" value $\boldsymbol{\theta}_0$ as the number of measured data grows without limit:
$$\hat{\boldsymbol{\theta}}_N \to \boldsymbol{\theta}_0, \qquad N \to +\infty$$

Least squares revised

Consider a linear model with the following structure:
$$y(t) = a_1 y(t-1) + \dots + a_{n_a} y(t-n_a) + b_1 u(t-1) + \dots + b_{n_b} u(t-n_b) + \eta(t)$$
i.e. $A(z)\,y(t) = B(z)\,u(t) + \eta(t)$, where $\eta(t)$ denotes an equation error, which can describe disturbances as well as unmodeled dynamics ($\eta(t)$ does not need to be white noise).

The Least Squares estimate, found by minimizing $\frac{1}{N}\sum_{t=1}^{N}\eta(t)^2$, reads as

$$\hat{\boldsymbol{\theta}}_{LS} = \left[\frac{1}{N}\sum_{t=1}^{N}\boldsymbol{\varphi}(t)\,\boldsymbol{\varphi}(t)^\top\right]^{-1}\cdot\frac{1}{N}\sum_{t=1}^{N}\boldsymbol{\varphi}(t)\,y(t)$$

Consider now a linear system and write it as
$$A_0(z)\,y(t) = B_0(z)\,u(t) + v(t) \qquad\Longleftrightarrow\qquad y(t) = \boldsymbol{\varphi}(t)^\top\boldsymbol{\theta}_0 + v(t)$$

The $v(t)$ term can be evaluated as
$$v(t) = A_0(z)\,y(t) - B_0(z)\,u(t) = A_0(z)\big[y_0(t) + \tilde{y}(t)\big] - B_0(z)\big[u_0(t) + \tilde{u}(t)\big] = A_0(z)\,\tilde{y}(t) - B_0(z)\,\tilde{u}(t)$$

Focus now on the estimation error $\hat{\boldsymbol{\theta}}_{LS} - \boldsymbol{\theta}_0$:

$$\begin{aligned}
\hat{\boldsymbol{\theta}}_{LS} - \boldsymbol{\theta}_0 &= \left[\tfrac{1}{N}\sum_{t=1}^{N}\boldsymbol{\varphi}(t)\boldsymbol{\varphi}(t)^\top\right]^{-1}\tfrac{1}{N}\sum_{t=1}^{N}\boldsymbol{\varphi}(t)\,y(t) - \boldsymbol{\theta}_0 \\
&= \left[\tfrac{1}{N}\sum_{t=1}^{N}\boldsymbol{\varphi}(t)\boldsymbol{\varphi}(t)^\top\right]^{-1}\tfrac{1}{N}\left[\sum_{t=1}^{N}\boldsymbol{\varphi}(t)\,y(t) - \sum_{t=1}^{N}\boldsymbol{\varphi}(t)\boldsymbol{\varphi}(t)^\top\boldsymbol{\theta}_0\right] \\
&= \left[\tfrac{1}{N}\sum_{t=1}^{N}\boldsymbol{\varphi}(t)\boldsymbol{\varphi}(t)^\top\right]^{-1}\tfrac{1}{N}\sum_{t=1}^{N}\boldsymbol{\varphi}(t)\big[y(t) - \boldsymbol{\varphi}(t)^\top\boldsymbol{\theta}_0\big] \\
&= \left[\tfrac{1}{N}\sum_{t=1}^{N}\boldsymbol{\varphi}(t)\boldsymbol{\varphi}(t)^\top\right]^{-1}\tfrac{1}{N}\sum_{t=1}^{N}\boldsymbol{\varphi}(t)\,v(t)
\end{aligned}$$
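As a minimal sketch of how $\hat{\boldsymbol{\theta}}_{LS}$ is computed in practice (an assumed example: the data vectors u, y and the orders na, nb are placeholders), the regressor matrix can be built row by row and the estimate obtained with MATLAB's backslash operator:

```matlab
% Minimal sketch (assumed example): LS estimate of A(z) y = B(z) u + eta
% from data column vectors u, y and assumed orders na, nb.
na = 2; nb = 2;
n  = max(na, nb);
N  = numel(y);

Phi = zeros(N-n, na+nb);            % one row per time instant t = n+1..N
for t = n+1:N
    Phi(t-n, :) = [y(t-1:-1:t-na)', u(t-1:-1:t-nb)'];   % phi(t)'
end
Y = y(n+1:N);

theta_LS = Phi \ Y;                 % estimate of [a1..a_na, b1..b_nb]'
```

The backslash solution coincides with $\big[\frac{1}{N}\sum\boldsymbol{\varphi}\boldsymbol{\varphi}^\top\big]^{-1}\frac{1}{N}\sum\boldsymbol{\varphi}\,y$ whenever the inverse exists, but is computed via a QR factorization for numerical robustness.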
The summations in the last equation converge to their expected values under mild conditions. Thus, the estimate $\hat{\boldsymbol{\theta}}_{LS}$ is consistent if:
• $\mathbb{E}\big[\boldsymbol{\varphi}(t)\,\boldsymbol{\varphi}(t)^\top\big]$ is invertible
• $\mathbb{E}\big[\boldsymbol{\varphi}(t)\,v(t)\big] = \mathbf{0}$

The second condition is the most restrictive one:
$$\mathbb{E}\big[\boldsymbol{\varphi}(t)\,v(t)\big] = \mathbf{0} \qquad (d \times 1)$$

If $v(t)$ is a white noise, it is uncorrelated with all the entries of $\boldsymbol{\varphi}(t) = [y(t-1)\ \dots\ y(t-n_a)\ \ u(t-1)\ \dots\ u(t-n_b)]^\top$: indeed $y(t-1)$ is a function of $y(t-2), \dots, u(t-2), \dots, v(t-1), \dots$, all orthogonal to $v(t)$. In this case the estimate is consistent.

In the EIV case, instead, $v(t) = A_0(z)\,\tilde{y}(t) - B_0(z)\,\tilde{u}(t)$ is correlated with the variables in $\boldsymbol{\varphi}(t)$, and so the estimate is NOT consistent.

Example (least squares with noise on the regressor)

Suppose that we want to estimate a static linear model, using regression and least squares:
$$y(i) = \theta\,x_0(i) + e_y(i), \qquad \theta = 3, \quad e_y \sim \mathcal{N}(0, \sigma_y^2)$$
so that $x_0(i)$ is noiseless. Suppose now that we measure and employ
$$x(i) = x_0(i) + e_x(i), \qquad e_x \sim \mathcal{N}(0, \sigma_x^2)$$
instead of $x_0(i)$, i.e. we have a noisy regressor $x(i)$. The LS estimates are then biased, proportionally to $\sigma_x^2$.

Let's investigate the problem. The relation assumed by LS is $y(i) = \theta\,x_0(i) + e_y(i)$, but we employ $x(i) = x_0(i) + e_x(i)$ in place of $x_0(i)$:
$$y(i) = \theta\big[x(i) - e_x(i)\big] + e_y(i) = \theta\,x(i) - \theta\,e_x(i) + e_y(i) = \theta\,x(i) + \eta(i)$$

The employed regressor $x$ is therefore correlated with the output noise $\eta$:
$$\mathbb{E}\big[x(i)\cdot\eta(i)\big] = \mathbb{E}\big[\big(x_0(i) + e_x(i)\big)\cdot\big(-\theta\,e_x(i) + e_y(i)\big)\big] = -\theta\,\sigma_x^2 \neq 0$$
and a bias is present in the estimates.

See the static_system_regressor_noise.m MATLAB file.
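The bias can be reproduced in a few lines. The sketch below is an illustrative stand-in for the referenced static_system_regressor_noise.m file (whose exact contents are not reproduced here); the sample size and noise levels are assumed values.

```matlab
% Illustrative sketch of the noisy-regressor experiment (assumed values;
% not the original static_system_regressor_noise.m).
rng(0);                                    % reproducible noise
N       = 10000;
theta   = 3;                               % true parameter
sigma_y = 0.5;                             % output noise std
sigma_x = 0.8;                             % regressor noise std

x0 = randn(N,1);                           % noiseless regressor
y  = theta*x0 + sigma_y*randn(N,1);        % noisy output
x  = x0 + sigma_x*randn(N,1);              % noisy regressor actually used

theta_LS = (x'*y) / (x'*x);                % scalar LS estimate with noisy x

% Asymptotic value: theta * E[x0^2] / (E[x0^2] + sigma_x^2) (attenuation bias)
theta_inf = theta * mean(x0.^2) / (mean(x0.^2) + sigma_x^2);
fprintf('theta_LS = %.3f  (asymptotic %.3f, true %.3f)\n', ...
        theta_LS, theta_inf, theta);
```

With these values the estimate lands near $3/(1 + 0.64) \approx 1.83$ rather than 3, confirming that the bias grows with $\sigma_x^2$; the instrumental variable method of the next section is designed to remove exactly this kind of inconsistency.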