Approximate Solution of an Overdetermined System of Equations
Appendix A
Approximate Solution of an Overdetermined System of Equations

Approximate solution of an overdetermined system of linear equations AX ≈ B is one of the main topics in (numerical) linear algebra and is covered in any linear algebra textbook, see, e.g., Strang (1976, Sect. 3.3), Meyer (2000, Sects. 4.6 and 5.14), and Trefethen and Bau (1997, Lecture 11). The classical approach is approximate solution in the least squares sense:

    minimize  over B̂ and X   ‖B − B̂‖_F   subject to  AX = B̂,            (LS)

where the matrix B is modified as little as possible, in the sense of minimizing the correction size ‖B − B̂‖_F, so that the modified system of equations AX = B̂ is compatible. The classical least squares problem has an analytic solution: assuming that the matrix A is of full column rank, the unique least squares approximate solution is

    X_ls = (A⊤A)⁻¹A⊤B   and   B̂_ls = A(A⊤A)⁻¹A⊤B.

In the case when A is rank deficient, the least squares solution is either nonunique or does not exist. Such least squares problems are solved numerically by regularization techniques, see, e.g., Björck (1996, Sect. 2.7).

There are many variations and generalizations of the least squares method for solving approximately an overdetermined system of equations. Well known ones are methods for recursive least squares approximation (Kailath et al. 2000, Sect. 2.6), regularized least squares (Hansen 1997), and linear and quadratically constrained least squares problems (Golub and Van Loan 1996, Sect. 12.1).

Next, we list generalizations related to the class of total least squares methods, because of their close connection to corresponding low rank approximation problems. Total least squares methods are low rank approximation methods using an input/output representation of the rank constraint. In all these problems the basic idea is to modify the given data as little as possible, so that the modified data define a consistent system of equations. The methods differ, however, in how the correction is done and in how its size is measured. This results in different properties of the methods in a stochastic estimation setting and motivates the use of the methods in different practical setups.

• The data least squares (Degroat and Dowling 1991) method is the "reverse" of the least squares method in the sense that the matrix A is modified and the matrix B is not:

    minimize  over Â and X   ‖A − Â‖_F   subject to  ÂX = B.            (DLS)

As in the least squares problem, the solution of the data least squares problem is computable in closed form.

• The classical total least squares (Golub and Reinsch 1970; Golub 1973; Golub and Van Loan 1980) method modifies the matrices A and B symmetrically:

    minimize  over Â, B̂, and X   ‖[A B] − [Â B̂]‖_F   subject to  ÂX = B̂.            (TLS)

Conditions for existence and uniqueness of a total least squares approximate solution are given in terms of the singular value decomposition of the augmented data matrix [A B]. In the generic case when a unique solution exists, that solution is given in terms of the right singular vectors of [A B] corresponding to the smallest singular values.
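The generic-case computation can be sketched in a few lines of MATLAB. The script below is not part of the book's function library; it is a minimal sketch, for a single right-hand side b and example data chosen here purely for illustration, contrasting the least squares solution with the total least squares solution obtained from the right singular vector of [A b] corresponding to the smallest singular value.

    % Least squares vs. classical total least squares, single right-hand side.
    % Example data: a is N x n with full column rank (values are illustrative).
    a = [1 2; 3 4; 5 7];
    b = [1; 2; 4];
    x_ls = a \ b;                        % least squares; equals (a'*a)\(a'*b)
    n = size(a, 2);
    [~, ~, v] = svd([a b]);              % SVD of the augmented matrix [a b]
    x_tls = -v(1:n, end) / v(end, end);  % generic case: v(end, end) is nonzero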
In this generic case, the optimal total least squares approximation [Â B̂] of the data matrix [A B] coincides with the Frobenius-norm optimal low rank approximation of [A B], i.e., the model obtained by the total least squares method coincides with the model obtained by unstructured low rank approximation in the Frobenius norm.

Theorem A.1 Let D∗ be a solution to the low rank approximation problem

    minimize  over D̂   ‖D − D̂‖_F   subject to  rank(D̂) ≤ m

and let ℬ∗ = image(D∗) be the corresponding optimal linear static model. The parameter X∗ of an input/output representation ℬ∗ = ℬ_i/o(X∗) of the optimal model is a solution to the total least squares problem (TLS) with data matrices A and B defined by

    D = [A B]⊤ ∈ ℝ^(q×N),   with A ∈ ℝ^(N×m) and B ∈ ℝ^(N×(q−m)).

A total least squares solution X∗ exists if and only if an input/output representation ℬ_i/o(X∗) of ℬ∗ exists, and is unique if and only if an optimal model ℬ∗ is unique. In the case of existence and uniqueness of a total least squares solution,

    D∗ = [Â∗ B̂∗]⊤,   where Â∗X∗ = B̂∗.

The theorem makes explicit the link between low rank approximation and total least squares. From a data modeling point of view, total least squares is low rank approximation of the data matrix D = [A B]⊤, followed by input/output representation of the optimal model.

229a ⟨Total least squares 229a⟩≡
    function [x, ah, bh] = tls(a, b)
    n = size(a, 2); [r, p, dh] = lra([a b]', n);
    ⟨low rank approximation → total least squares solution 229b⟩
  Defines: tls, never used.
  Uses lra 64.

229b ⟨low rank approximation → total least squares solution 229b⟩≡    (229a 230)
    x = p2x(p); ah = dh(1:n, :); bh = dh((n + 1):end, :);
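As a usage illustration, the following sketch calls tls on data perturbed in both A and B, the errors-in-variables setting that total least squares is suited for. It assumes that the book's functions lra (chunk 64) and p2x are on the MATLAB path; the data, dimensions, and noise level are chosen here for illustration only.

    % Usage sketch for tls; assumes the book's lra (chunk 64) and p2x
    % are on the MATLAB path. Data and noise level are illustrative.
    N = 100; n = 2;
    x0 = [1; -1];                    % "true" parameter
    a0 = rand(N, n); b0 = a0 * x0;   % exact data satisfying a0 * x0 = b0
    a = a0 + 0.01 * randn(N, n);     % perturb both a and b
    b = b0 + 0.01 * randn(N, 1);     % (errors-in-variables setup)
    [x, ah, bh] = tls(a, b);         % total least squares estimate of x0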
Lack of a solution of the total least squares problem (TLS), a case called the nongeneric total least squares problem, is caused by lack of existence of an input/output representation of the model. Nongeneric total least squares problems are considered in Van Huffel and Vandewalle (1988, 1991) and Paige and Strakos (2005).

• The generalized total least squares (Van Huffel and Vandewalle 1989) method measures the size of the data correction matrix

    [ΔA ΔB] := [A B] − [Â B̂]

after row and column weighting:

    minimize  over Â, B̂, and X   ‖W_l([A B] − [Â B̂])W_r‖_F   subject to  ÂX = B̂.            (GTLS)

Here, W_l and W_r are positive semidefinite weight matrices: W_l corresponds to weighting of the rows and W_r to weighting of the columns of the correction [ΔA ΔB]. Similarly to the classical total least squares problem, the existence and uniqueness of a generalized total least squares approximate solution is determined from the singular value decomposition. The data least squares and total least squares problems are special cases of the generalized total least squares problem.

• The restricted total least squares (Van Huffel and Zha 1991) method constrains the correction to be of the form

    [ΔA ΔB] = P_e E L_e,   for some E,

i.e., the column and row spans of the correction matrix are constrained to be within the given subspaces image(P_e) and image(L_e⊤), respectively. The restricted total least squares problem is:

    minimize  over Â, B̂, E, and X   ‖E‖_F
    subject to  [A B] − [Â B̂] = P_e E L_e  and  ÂX = B̂.            (RTLS)

The generalized total least squares problem is a special case of (RTLS).

• The Procrustes problem: given m × n real matrices A and B,

    minimize  over X   ‖B − AX‖_F   subject to  X⊤X = I_n

is a least squares problem with the constraint that the unknown X is an orthogonal matrix. The solution is given by X = UV⊤, where UΣV⊤ is the singular value decomposition of A⊤B, see Golub and Van Loan (1996, p. 601).

• The weighted total least squares (De Moor 1993, Sect. 4.3) method generalizes the classical total least squares problem by measuring the correction size by a weighted matrix norm ‖·‖_W:

    minimize  over Â, B̂, and X   ‖[A B] − [Â B̂]‖_W   subject to  ÂX = B̂.            (WTLS)

Special weighted total least squares problems correspond to weight matrices W with special structure, e.g., a diagonal W corresponds to element-wise weighted total least squares (Markovsky et al. 2005). In general, the weighted total least squares problem has no analytic solution in terms of the singular value decomposition, so that, contrary to the generalizations listed above, weighted total least squares problems, in general, cannot be solved globally and efficiently. Weighted low rank approximation problems, corresponding to the weighted total least squares problem, are considered in Wentzell et al. (1997), Manton et al. (2003), and Markovsky and Van Huffel (2007a).

230 ⟨Weighted total least squares 230⟩≡
    function [x, ah, bh, info] = wtls(a, b, s, opt)
    n = size(a, 2); [p, l, info] = wlra([a b]', n, s, opt); dh = p * l;
    ⟨low rank approximation → total least squares solution 229b⟩
  Defines: wtls, never used.
  Uses wlra 139.

• The regularized total least squares (Fierro et al. 1997; Golub et al. 1999; Sima et al. 2004; Beck and Ben-Tal 2006; Sima 2006) method is defined as

    minimize  over Â, B̂, and X   ‖[A B] − [Â B̂]‖_F + γ‖DX‖_F   subject to  ÂX = B̂.            (RegTLS)

Global and efficient solution methods for solving regularized total least squares problems are derived in Beck and Ben-Tal (2006).

• The structured total least squares (De Moor 1993; Abatzoglou et al. 1991) method is a total least squares method with the additional constraint that the correction should have a certain specified structure:

    minimize  over Â, B̂, and X   ‖[A B] − [Â B̂]‖_F
    subject to  ÂX = B̂  and  [Â B̂] has a specified structure.            (STLS)

Hankel and Toeplitz structured total least squares problems are the most often studied ones, due to their application in signal processing and system theory.

• The structured total least norm method (Rosen et al. 1996) is the same as the structured total least squares method, but with a general matrix norm in the approximation criterion instead of the Frobenius norm.

For generalizations and applications of the total least squares problem in the periods 1990–1996, 1996–2001, and 2001–2006, see, respectively, the edited books (Van Huffel 1997; Van Huffel and Lemmerling 2002), and the special issues (Van Huffel et al.