Block Matrices with L-Block-Banded Inverse

Total Page:16

File Type:pdf, Size:1020Kb

Block Matrices with L-Block-Banded Inverse 630 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 53, NO. 2, FEBRUARY 2005 Block Matrices With v-Block-banded Inverse: Inversion Algorithms Amir Asif, Senior Member, IEEE, and José M. F. Moura, Fellow, IEEE Abstract—Block-banded matrices generalize banded matrices. scalars. In contrast with banded matrices, there are much fewer We study the properties of positive definite full matrices whose results published for block-banded matrices. Any block-banded inverses are -block-banded. We show that, for such matrices, matrix is also banded; therefore, the methods in [3]–[10] could the blocks in the -block band of completely determine ; namely, all blocks of outside its -block band are computed in principle be applicable to block-banded matrices. Exten- from the blocks in the -block band of . We derive fast inver- sion of inversion algorithms designed for banded matrices to sion algorithms for and its inverse that, when compared block-banded matrices generally fails to exploit the sparsity to direct inversion, are faster by two orders of magnitude of patterns within each block and between blocks; therefore, the linear dimension of the constituent blocks. We apply these they are not optimal [11]. Further, algorithms that use the inversion algorithms to successfully develop fast approximations to Kalman–Bucy filters in applications with high dimensional block-banded structure of the matrix to be inverted are compu- states where the direct inversion of the covariance matrix is tationally more efficient than those that just manipulate scalars computationally unfeasible. because with block-banded matrices more computations are Index Terms—Block-banded matrix, Cholesky decomposition, associated with a given data movement than with scalar banded covariance matrix, Gauss–Markov random process, Kalman–Bucy matrices. An example of an inversion algorithm that uses the filter, matrix inversion, sparse matrix. inherent structure of the constituent blocks in a block-banded matrix to its advantage is in [3]; this reference proposes a fast I. INTRODUCTION algorithm for inverting block Toeplitz matrices with Toeplitz blocks. References [12]–[17] describe alternative algorithms LOCK-banded matrices and their inverses arise fre- for matrices with a similar Toeplitz structure. B quently in signal processing applications, including This paper develops results for positive definite and sym- autoregressive or moving average image modeling, covariances metric -block-banded matrices and their inverses . of Gauss–Markov random processes (GMrp) [1], [2], or with An example of is the covariance matrix of a Gauss–Markov finite difference numerical approximations to partial differen- random field (GMrp); see [2] with its inverse referred to as the tial equations. For example, the point spread function (psf) in information matrix. Unlike existing approaches, no additional image restoration problems has a block-banded structure [3]. structure or constraint is assumed on ; in particular, the algo- Block-banded matrices are also used to model the correlation rithms in this paper do not require to be Toeplitz. Our inver- of cyclostationary processes in periodic time series [4]. We sion algorithms generalize our earlier work presented in [18] for are motivated by signal processing problems with large di- tridiagonal block matrices to block matrices with an arbitrary mensional states where a straightforward implementation of a bandwidth . We show that the matrix , whose inverse is recursive estimation algorithm, such as the Kalman–Bucy filter an -block-banded matrix, is completely defined by the blocks (KBf), is prohibitively expensive. In particular, the inversion within its -block band. In other words, when the block matrix of the error covariance matrices in the KBf is computationally has an -block-banded inverse, is highly structured. Any intensive precluding the direct implementation of the KBf to block entry outside the -block diagonals of can be obtained such problems. from the block entries within the -block diagonals of . The Several approaches1 [3]–[10] have been proposed in the paper proves this fact, which is at first sight surprising, and de- literature for the inversion of (scalar, not block) banded ma- rives the following algorithms for block matrices whose in- trices. In banded matrices, the entries in the band diagonals are verses are -block-banded: 1) Inversion of : An inversion algorithm for that uses Manuscript received January 7, 2003; revised February 24, 2004. This work only the block entries in the -block band of . This is a was supported in part by the Natural Science and Engineering Research Council (NSERC) of Canada under Grants 229862-01 and 228415-03 and by the United very efficient inversion algorithm for such ; it is faster States Office of Naval Research under Grant N00014-97-1-0800. The associate than direct inversion by two orders of magnitude of the editor coordinating the review of this paper and approving it for publication was linear dimension of the blocks used. Prof Trac D. Tran. A. Asif is with the Department of Computer Science and Engineering, York 2) Inversion of : A fast inversion algorithm for the University, Toronto, ON, M3J 1P3 Canada (Email: [email protected]) -block-banded matrix that is faster than its direct J. M. F. Moura is with the Department of Electrical and Computer Engi- inversion by up to one order of magnitude of the linear neering, Carnegie Mellon University, Pittsburgh, PA, 15213 USA (e-mail: [email protected]). dimension of its constituent blocks. Digital Object Identifier 10.1109/TSP.2004.840709 Compared with the scalar banded representations, the block- 1This review is intended only to provide context to the work and should not banded implementations of Algorithms 1 and 2 provide com- be treated as complete or exhaustive. putational savings of up to three orders of magnitude of the 1053-587X/$20.00 © 2005 IEEE ASIF AND MOURA: BLOCK MATRICES WITH -BLOCK-BANDED INVERSE: INVERSION ALGORITHMS 631 dimension of the constituent blocks used to represent and spanning block rows and columns through its inverse . The inversion algorithms are then used to de- is given by velop alternative, computationally efficient approximations of the KBf. These near-optimal implementations are obtained by imposing an -block-banded structure on the inverse of the error . (2) covariance matrix (information matrix) and correspond to mod- .. eling the error field as a reduced-order Gauss–Markov random process (GMrp). Controlled simulations show that our KBf im- plementations lead to results that are virtually indistinguishable The Cholesky factorization of results in the from the results for the conventional KBf. Cholesky factor that is an upper triangular matrix. To indi- The paper is organized as follows. In Section II, we define cate that the matrix is the upper triangular Cholesky factor of the notation used to represent block-banded matrices and derive , we work often with the notation chol . three important properties for -block-banded matrices. These Lemma 1.1 shows that the Cholesky factor of the -block- properties express the block entries of an -block-banded ma- banded matrix has exactly nonzero block diagonals above trix in terms of the block entries of its inverse, and vice versa. the main diagonal. Section III applies the results derived in Section II to derive in- Lemma 1.1: A positive definite and symmetric matrix version algorithms for an -block-banded matrix and its in- is -block-banded if and only if (iff) the constituent blocks verse . We also consider special block-banded matrices that in its upper triangular Cholesky factor are have additional zero block diagonals within the first -block di- agonals. To illustrate the application of the inversion algorithms, for (3) we apply them in Section IV to inverting large covariance ma- trices in the context of an approximation algorithm to a large The proof of Lemma 1.1 is included in the Appendix. state space KBf problem. These simulations show an almost per- The inverse of the Cholesky factor is an upper trian- fect agreement between the approximate filter and the exact KBf gular matrix estimate. Finally, Section V summarizes the paper. II. BANDED MATRIX RESULTS A. Notation .. .. (4) Consider a positive-definite symmetric matrix represented . by its constituent blocks , , . The matrix is assumed to have an -block-banded inverse , , , with the following structure: where the lower diagonal entries in are zero blocks . More importantly, the main diagonal entries in are block inverses of the corresponding blocks in . These features are used next to derive three important theorems for -block-banded matrices where we show how to obtain the following: (1) a) block entry of the Cholesky factor from a selected number of blocks of without inverting the full matrix ; b) block entries of recursively from the blocks where the square blocks and the zero square blocks are of without inverting the complete Cholesky of order . A diagonal block matrix is a 0-block-banded matrix factor ; and . A tridiagonal block matrix has exactly one nonzero c) block entries , outside the first diago- block diagonal above and below the main diagonal and is there- nals of from the blocks within the first -diagonals of fore a 1-block-banded matrix and similarly for higher . values of . Unless otherwise specified, we use calligraphic Since we operate at a block level, the three theorems offer con- fonts to denote matrices (e.g., or ) with dimensions siderable computational savings over direct computations of . Their constituent blocks are denoted by capital letters (e.g., from and vice versa. The proofs are included in the Appendix. or ) with the subscript representing their location inside the full matrix in terms of the number of block rows B. Theorems and block columns . The blocks (or ) are of dimen- sions , implying that there are block rows and block Theorem 1: The Cholesky blocks ’s3 on columns in matrix (or ). block row of the Cholesky factor of an -block-banded ma- To be concise, we borrow the MATLAB2 notation to refer to an ordered combination of blocks .
Recommended publications
  • On Multivariate Interpolation
    On Multivariate Interpolation Peter J. Olver† School of Mathematics University of Minnesota Minneapolis, MN 55455 U.S.A. [email protected] http://www.math.umn.edu/∼olver Abstract. A new approach to interpolation theory for functions of several variables is proposed. We develop a multivariate divided difference calculus based on the theory of non-commutative quasi-determinants. In addition, intriguing explicit formulae that connect the classical finite difference interpolation coefficients for univariate curves with multivariate interpolation coefficients for higher dimensional submanifolds are established. † Supported in part by NSF Grant DMS 11–08894. April 6, 2016 1 1. Introduction. Interpolation theory for functions of a single variable has a long and distinguished his- tory, dating back to Newton’s fundamental interpolation formula and the classical calculus of finite differences, [7, 47, 58, 64]. Standard numerical approximations to derivatives and many numerical integration methods for differential equations are based on the finite dif- ference calculus. However, historically, no comparable calculus was developed for functions of more than one variable. If one looks up multivariate interpolation in the classical books, one is essentially restricted to rectangular, or, slightly more generally, separable grids, over which the formulae are a simple adaptation of the univariate divided difference calculus. See [19] for historical details. Starting with G. Birkhoff, [2] (who was, coincidentally, my thesis advisor), recent years have seen a renewed level of interest in multivariate interpolation among both pure and applied researchers; see [18] for a fairly recent survey containing an extensive bibli- ography. De Boor and Ron, [8, 12, 13], and Sauer and Xu, [61, 10, 65], have systemati- cally studied the polynomial case.
    [Show full text]
  • Determinants of Commuting-Block Matrices by Istvan Kovacs, Daniel S
    Determinants of Commuting-Block Matrices by Istvan Kovacs, Daniel S. Silver*, and Susan G. Williams* Let R beacommutative ring, and Matn(R) the ring of n × n matrices over R.We (i,j) can regard a k × k matrix M =(A ) over Matn(R)asablock matrix,amatrix that has been partitioned into k2 submatrices (blocks)overR, each of size n × n. When M is regarded in this way, we denote its determinant by |M|.Wewill use the symbol D(M) for the determinant of M viewed as a k × k matrix over Matn(R). It is important to realize that D(M)isann × n matrix. Theorem 1. Let R be acommutative ring. Assume that M is a k × k block matrix of (i,j) blocks A ∈ Matn(R) that commute pairwise. Then | | | | (1,π(1)) (2,π(2)) ··· (k,π(k)) (1) M = D(M) = (sgn π)A A A . π∈Sk Here Sk is the symmetric group on k symbols; the summation is the usual one that appears in the definition of determinant. Theorem 1 is well known in the case k =2;the proof is often left as an exercise in linear algebra texts (see [4, page 164], for example). The general result is implicit in [3], but it is not widely known. We present a short, elementary proof using mathematical induction on k.Wesketch a second proof when the ring R has no zero divisors, a proof that is based on [3] and avoids induction by using the fact that commuting matrices over an algebraically closed field can be simultaneously triangularized.
    [Show full text]
  • QUADRATIC FORMS and DEFINITE MATRICES 1.1. Definition of A
    QUADRATIC FORMS AND DEFINITE MATRICES 1. DEFINITION AND CLASSIFICATION OF QUADRATIC FORMS 1.1. Definition of a quadratic form. Let A denote an n x n symmetric matrix with real entries and let x denote an n x 1 column vector. Then Q = x’Ax is said to be a quadratic form. Note that a11 ··· a1n . x1 Q = x´Ax =(x1...xn) . xn an1 ··· ann P a1ixi . =(x1,x2, ··· ,xn) . P anixi 2 (1) = a11x1 + a12x1x2 + ... + a1nx1xn 2 + a21x2x1 + a22x2 + ... + a2nx2xn + ... + ... + ... 2 + an1xnx1 + an2xnx2 + ... + annxn = Pi ≤ j aij xi xj For example, consider the matrix 12 A = 21 and the vector x. Q is given by 0 12x1 Q = x Ax =[x1 x2] 21 x2 x1 =[x1 +2x2 2 x1 + x2 ] x2 2 2 = x1 +2x1 x2 +2x1 x2 + x2 2 2 = x1 +4x1 x2 + x2 1.2. Classification of the quadratic form Q = x0Ax: A quadratic form is said to be: a: negative definite: Q<0 when x =06 b: negative semidefinite: Q ≤ 0 for all x and Q =0for some x =06 c: positive definite: Q>0 when x =06 d: positive semidefinite: Q ≥ 0 for all x and Q = 0 for some x =06 e: indefinite: Q>0 for some x and Q<0 for some other x Date: September 14, 2004. 1 2 QUADRATIC FORMS AND DEFINITE MATRICES Consider as an example the 3x3 diagonal matrix D below and a general 3 element vector x. 100 D = 020 004 The general quadratic form is given by 100 x1 0 Q = x Ax =[x1 x2 x3] 020 x2 004 x3 x1 =[x 2 x 4 x ] x2 1 2 3 x3 2 2 2 = x1 +2x2 +4x3 Note that for any real vector x =06 , that Q will be positive, because the square of any number is positive, the coefficients of the squared terms are positive and the sum of positive numbers is always positive.
    [Show full text]
  • Irreducibility in Algebraic Groups and Regular Unipotent Elements
    PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 141, Number 1, January 2013, Pages 13–28 S 0002-9939(2012)11898-2 Article electronically published on August 16, 2012 IRREDUCIBILITY IN ALGEBRAIC GROUPS AND REGULAR UNIPOTENT ELEMENTS DONNA TESTERMAN AND ALEXANDRE ZALESSKI (Communicated by Pham Huu Tiep) Abstract. We study (connected) reductive subgroups G of a reductive alge- braic group H,whereG contains a regular unipotent element of H.Themain result states that G cannot lie in a proper parabolic subgroup of H. This result is new even in the classical case H =SL(n, F ), the special linear group over an algebraically closed field, where a regular unipotent element is one whose Jor- dan normal form consists of a single block. In previous work, Saxl and Seitz (1997) determined the maximal closed positive-dimensional (not necessarily connected) subgroups of simple algebraic groups containing regular unipotent elements. Combining their work with our main result, we classify all reductive subgroups of a simple algebraic group H which contain a regular unipotent element. 1. Introduction Let H be a reductive linear algebraic group defined over an algebraically closed field F . Throughout this text ‘reductive’ will mean ‘connected reductive’. A unipo- tent element u ∈ H is said to be regular if the dimension of its centralizer CH (u) coincides with the rank of H (or, equivalently, u is contained in a unique Borel subgroup of H). Regular unipotent elements of a reductive algebraic group exist in all characteristics (see [22]) and form a single conjugacy class. These play an important role in the general theory of algebraic groups.
    [Show full text]
  • 8.3 Positive Definite Matrices
    8.3. Positive Definite Matrices 433 Exercise 8.2.25 Show that every 2 2 orthog- [Hint: If a2 + b2 = 1, then a = cos θ and b = sinθ for × cos θ sinθ some angle θ.] onal matrix has the form − or sinθ cosθ cos θ sin θ Exercise 8.2.26 Use Theorem 8.2.5 to show that every for some angle θ. sinθ cosθ symmetric matrix is orthogonally diagonalizable. − 8.3 Positive Definite Matrices All the eigenvalues of any symmetric matrix are real; this section is about the case in which the eigenvalues are positive. These matrices, which arise whenever optimization (maximum and minimum) problems are encountered, have countless applications throughout science and engineering. They also arise in statistics (for example, in factor analysis used in the social sciences) and in geometry (see Section 8.9). We will encounter them again in Chapter 10 when describing all inner products in Rn. Definition 8.5 Positive Definite Matrices A square matrix is called positive definite if it is symmetric and all its eigenvalues λ are positive, that is λ > 0. Because these matrices are symmetric, the principal axes theorem plays a central role in the theory. Theorem 8.3.1 If A is positive definite, then it is invertible and det A > 0. Proof. If A is n n and the eigenvalues are λ1, λ2, ..., λn, then det A = λ1λ2 λn > 0 by the principal axes theorem (or× the corollary to Theorem 8.2.5). ··· If x is a column in Rn and A is any real n n matrix, we view the 1 1 matrix xT Ax as a real number.
    [Show full text]
  • Inner Products and Norms (Part III)
    Inner Products and Norms (part III) Prof. Dan A. Simovici UMB 1 / 74 Outline 1 Approximating Subspaces 2 Gram Matrices 3 The Gram-Schmidt Orthogonalization Algorithm 4 QR Decomposition of Matrices 5 Gram-Schmidt Algorithm in R 2 / 74 Approximating Subspaces Definition A subspace T of a inner product linear space is an approximating subspace if for every x 2 L there is a unique element in T that is closest to x. Theorem Let T be a subspace in the inner product space L. If x 2 L and t 2 T , then x − t 2 T ? if and only if t is the unique element of T closest to x. 3 / 74 Approximating Subspaces Proof Suppose that x − t 2 T ?. Then, for any u 2 T we have k x − u k2=k (x − t) + (t − u) k2=k x − t k2 + k t − u k2; by observing that x − t 2 T ? and t − u 2 T and applying Pythagora's 2 2 Theorem to x − t and t − u. Therefore, we have k x − u k >k x − t k , so t is the unique element of T closest to x. 4 / 74 Approximating Subspaces Proof (cont'd) Conversely, suppose that t is the unique element of T closest to x and x − t 62 T ?, that is, there exists u 2 T such that (x − t; u) 6= 0. This implies, of course, that u 6= 0L. We have k x − (t + au) k2=k x − t − au k2=k x − t k2 −2(x − t; au) + jaj2 k u k2 : 2 2 Since k x − (t + au) k >k x − t k (by the definition of t), we have 2 2 1 −2(x − t; au) + jaj k u k > 0 for every a 2 F.
    [Show full text]
  • 9. Properties of Matrices Block Matrices
    9. Properties of Matrices Block Matrices It is often convenient to partition a matrix M into smaller matrices called blocks, like so: 01 2 3 11 ! B C B4 5 6 0C A B M = B C = @7 8 9 1A C D 0 1 2 0 01 2 31 011 B C B C Here A = @4 5 6A, B = @0A, C = 0 1 2 , D = (0). 7 8 9 1 • The blocks of a block matrix must fit together to form a rectangle. So ! ! B A C B makes sense, but does not. D C D A • There are many ways to cut up an n × n matrix into blocks. Often context or the entries of the matrix will suggest a useful way to divide the matrix into blocks. For example, if there are large blocks of zeros in a matrix, or blocks that look like an identity matrix, it can be useful to partition the matrix accordingly. • Matrix operations on block matrices can be carried out by treating the blocks as matrix entries. In the example above, ! ! A B A B M 2 = C D C D ! A2 + BC AB + BD = CA + DC CB + D2 1 Computing the individual blocks, we get: 0 30 37 44 1 2 B C A + BC = @ 66 81 96 A 102 127 152 0 4 1 B C AB + BD = @10A 16 0181 B C CA + DC = @21A 24 CB + D2 = (2) Assembling these pieces into a block matrix gives: 0 30 37 44 4 1 B C B 66 81 96 10C B C @102 127 152 16A 4 10 16 2 This is exactly M 2.
    [Show full text]
  • Block Matrices in Linear Algebra
    PRIMUS Problems, Resources, and Issues in Mathematics Undergraduate Studies ISSN: 1051-1970 (Print) 1935-4053 (Online) Journal homepage: https://www.tandfonline.com/loi/upri20 Block Matrices in Linear Algebra Stephan Ramon Garcia & Roger A. Horn To cite this article: Stephan Ramon Garcia & Roger A. Horn (2020) Block Matrices in Linear Algebra, PRIMUS, 30:3, 285-306, DOI: 10.1080/10511970.2019.1567214 To link to this article: https://doi.org/10.1080/10511970.2019.1567214 Accepted author version posted online: 05 Feb 2019. Published online: 13 May 2019. Submit your article to this journal Article views: 86 View related articles View Crossmark data Full Terms & Conditions of access and use can be found at https://www.tandfonline.com/action/journalInformation?journalCode=upri20 PRIMUS, 30(3): 285–306, 2020 Copyright # Taylor & Francis Group, LLC ISSN: 1051-1970 print / 1935-4053 online DOI: 10.1080/10511970.2019.1567214 Block Matrices in Linear Algebra Stephan Ramon Garcia and Roger A. Horn Abstract: Linear algebra is best done with block matrices. As evidence in sup- port of this thesis, we present numerous examples suitable for classroom presentation. Keywords: Matrix, matrix multiplication, block matrix, Kronecker product, rank, eigenvalues 1. INTRODUCTION This paper is addressed to instructors of a first course in linear algebra, who need not be specialists in the field. We aim to convince the reader that linear algebra is best done with block matrices. In particular, flexible thinking about the process of matrix multiplication can reveal concise proofs of important theorems and expose new results. Viewing linear algebra from a block-matrix perspective gives an instructor access to use- ful techniques, exercises, and examples.
    [Show full text]
  • Amath/Math 516 First Homework Set Solutions
    AMATH/MATH 516 FIRST HOMEWORK SET SOLUTIONS The purpose of this problem set is to have you brush up and further develop your multi-variable calculus and linear algebra skills. The problem set will be very difficult for some and straightforward for others. If you are having any difficulty, please feel free to discuss the problems with me at any time. Don't delay in starting work on these problems! 1. Let Q be an n × n symmetric positive definite matrix. The following fact for symmetric matrices can be used to answer the questions in this problem. Fact: If M is a real symmetric n×n matrix, then there is a real orthogonal n×n matrix U (U T U = I) T and a real diagonal matrix Λ = diag(λ1; λ2; : : : ; λn) such that M = UΛU . (a) Show that the eigenvalues of Q2 are the square of the eigenvalues of Q. Note that λ is an eigenvalue of Q if and only if there is some vector v such that Qv = λv. Then Q2v = Qλv = λ2v, so λ2 is an eigenvalue of Q2. (b) If λ1 ≥ λ2 ≥ · · · ≥ λn are the eigen values of Q, show that 2 T 2 n λnkuk2 ≤ u Qu ≤ λ1kuk2 8 u 2 IR : T T Pn 2 We have u Qu = u UΛUu = i=1 λikuk2. The result follows immediately from bounds on the λi. (c) If 0 < λ < λ¯ are such that 2 T ¯ 2 n λkuk2 ≤ u Qu ≤ λkuk2 8 u 2 IR ; then all of the eigenvalues of Q must lie in the interval [λ; λ¯].
    [Show full text]
  • Linear Algebra Review
    Linear Algebra Review Kaiyu Zheng October 2017 Linear algebra is fundamental for many areas in computer science. This document aims at providing a reference (mostly for myself) when I need to remember some concepts or examples. Instead of a collection of facts as the Matrix Cookbook, this document is more gentle like a tutorial. Most of the content come from my notes while taking the undergraduate linear algebra course (Math 308) at the University of Washington. Contents on more advanced topics are collected from reading different sources on the Internet. Contents 3.8 Exponential and 7 Special Matrices 19 Logarithm...... 11 7.1 Block Matrix.... 19 1 Linear System of Equa- 3.9 Conversion Be- 7.2 Orthogonal..... 20 tions2 tween Matrix Nota- 7.3 Diagonal....... 20 tion and Summation 12 7.4 Diagonalizable... 20 2 Vectors3 7.5 Symmetric...... 21 2.1 Linear independence5 4 Vector Spaces 13 7.6 Positive-Definite.. 21 2.2 Linear dependence.5 4.1 Determinant..... 13 7.7 Singular Value De- 2.3 Linear transforma- 4.2 Kernel........ 15 composition..... 22 tion.........5 4.3 Basis......... 15 7.8 Similar........ 22 7.9 Jordan Normal Form 23 4.4 Change of Basis... 16 3 Matrix Algebra6 7.10 Hermitian...... 23 4.5 Dimension, Row & 7.11 Discrete Fourier 3.1 Addition.......6 Column Space, and Transform...... 24 3.2 Scalar Multiplication6 Rank......... 17 3.3 Matrix Multiplication6 8 Matrix Calculus 24 3.4 Transpose......8 5 Eigen 17 8.1 Differentiation... 24 3.4.1 Conjugate 5.1 Multiplicity of 8.2 Jacobian......
    [Show full text]
  • Facts from Linear Algebra
    Appendix A Facts from Linear Algebra Abstract We introduce the notation of vector and matrices (cf. Section A.1), and recall the solvability of linear systems (cf. Section A.2). Section A.3 introduces the spectrum σ(A), matrix polynomials P (A) and their spectra, the spectral radius ρ(A), and its properties. Block structures are introduced in Section A.4. Subjects of Section A.5 are orthogonal and orthonormal vectors, orthogonalisation, the QR method, and orthogonal projections. Section A.6 is devoted to the Schur normal form (§A.6.1) and the Jordan normal form (§A.6.2). Diagonalisability is discussed in §A.6.3. Finally, in §A.6.4, the singular value decomposition is explained. A.1 Notation for Vectors and Matrices We recall that the field K denotes either R or C. Given a finite index set I, the linear I space of all vectors x =(xi)i∈I with xi ∈ K is denoted by K . The corresponding square matrices form the space KI×I . KI×J with another index set J describes rectangular matrices mapping KJ into KI . The linear subspace of a vector space V spanned by the vectors {xα ∈V : α ∈ I} is denoted and defined by α α span{x : α ∈ I} := aαx : aα ∈ K . α∈I I×I T Let A =(aαβ)α,β∈I ∈ K . Then A =(aβα)α,β∈I denotes the transposed H matrix, while A =(aβα)α,β∈I is the adjoint (or Hermitian transposed) matrix. T H Note that A = A holds if K = R . Since (x1,x2,...) indicates a row vector, T (x1,x2,...) is used for a column vector.
    [Show full text]
  • Similar Matrices and Jordan Form
    Similar matrices and Jordan form We’ve nearly covered the entire heart of linear algebra – once we’ve finished singular value decompositions we’ll have seen all the most central topics. AT A is positive definite A matrix is positive definite if xT Ax > 0 for all x 6= 0. This is a very important class of matrices; positive definite matrices appear in the form of AT A when computing least squares solutions. In many situations, a rectangular matrix is multiplied by its transpose to get a square matrix. Given a symmetric positive definite matrix A, is its inverse also symmet­ ric and positive definite? Yes, because if the (positive) eigenvalues of A are −1 l1, l2, · · · ld then the eigenvalues 1/l1, 1/l2, · · · 1/ld of A are also positive. If A and B are positive definite, is A + B positive definite? We don’t know much about the eigenvalues of A + B, but we can use the property xT Ax > 0 and xT Bx > 0 to show that xT(A + B)x > 0 for x 6= 0 and so A + B is also positive definite. Now suppose A is a rectangular (m by n) matrix. A is almost certainly not symmetric, but AT A is square and symmetric. Is AT A positive definite? We’d rather not try to find the eigenvalues or the pivots of this matrix, so we ask when xT AT Ax is positive. Simplifying xT AT Ax is just a matter of moving parentheses: xT (AT A)x = (Ax)T (Ax) = jAxj2 ≥ 0.
    [Show full text]