Block Matrix Formulas

Total Page:16

File Type:pdf, Size:1020Kb

Block Matrix Formulas BlockMatrixFormulas.tex July 7, 2015 Block Matrix Formulas John A. Gubner Department of Electrical and Computer Engineering University of Wisconsin–Madison Abstract We derive a number of formulas for block matrices, including the block matrix inverse formulas, determinant formulas, psuedoinverse formulas, etc. If you find this writeup useful, or if you find typos or mistakes, please let me know at [email protected] 1. Preliminary Observations Given a block matrix AB F := ; CD if D is invertible, then the Schur complement of D is S := A − BD−1C: It is easy to check that I BD−1 S 0 I 0 F = : (1) 0 I 0 D D−1CI Then detF = detSdetD, and we see that given D is invertible, F is invertible if and only if S is invertible. When F is invertible, taking the inverse of (1) yields I 0S−1 0 I −BD−1 F−1 = −D−1CI 0 D−1 0 I " # S−1 −S−1BD−1 = : (2) − D−1CS−1 D−1CS−1BD−1 + D−1 Instead of assuming D is invertible, suppose A is invertible. The Schur comple- ment of A is Q := D −CA−1B; 1 BlockMatrixFormulas.tex July 7, 2015 and I 0 A 0 IA−1B F = : (3) CA−1 I 0 Q 0 I Then detF = detAdetQ, and we see that given A is invertible, F is invertible if and only if Q is invertible. If F is invertible, I −A−1BA−1 0 I 0 F−1 = 0 I 0 Q−1 −CA−1 I " # A−1 + A−1BQ−1CA−1 −A−1BQ−1 = : (4) − Q−1CA−1 Q−1 2. Results Using (1)–(4), there are several identities that can easily be found. 2.1. Determinant Formulas First, comparing (1) and (3) yields AB det = det(A)det(D −CA−1B) = det(A − BD−1C)det(D): (5) CD In particular, taking A = In and D = Ik yields det(Ik −CB) = det(In − BC): (6) 2.2. Matrix Inversion Formulas Next, comparing the upper-left blocks of (2) and (4), we see that [A − BD−1C]−1 = A−1 + A−1B[D −CA−1B]−1CA−1; (7) which is known as the Sherman–Morrison–Woodbury formula or sometimes just the Woodbury formula. The remaining corresponding blocks are also equal. For example, combining the top row of (2) with the bottom row of (4), we obtain " # AB −1 [A − BD−1C]−1 −[A − BD−1C]−1BD−1 = : (8) CD − [D −CA−1B]−1CA−1 [D −CA−1B]−1 2 BlockMatrixFormulas.tex July 7, 2015 2.3. Psuedoinverse Formulas A nice application of (8) is to the pseudoinverse of a 2 × 2 block matrix. Recall that if M is full rank, then its pseudoinverse is M† = (M∗M)−1M∗, where M∗ is the complex conjugate transpose of M. Consider the case h i G∗ M = GH and M∗ = : H∗ Then G∗GG∗H M∗M = ; H∗GH∗H and by (8), " # (G∗P?G)−1 −(G∗P?G)−1G∗H(H∗H)−1 (M∗M)−1 = H H ; (9) ∗ ? −1 ∗ ∗ −1 ∗ ? −1 − (H PG H) H G(G G) (H PG H) ? ∗ −1 ∗ ? ∗ −1 ∗ where PH := I − H(H H) H and PG := I − G(G G) G . It now easily follows that " # " # (G∗P?G)−1G∗P? (P?G)† (M∗M)−1M∗ = H H = H : (10) ∗ ? −1 ∗ ? ? † (H PG H) H PG (PG H) 2.4. Positive Semidefinite Matrices A matrix U is positive semidefinite if U∗ = U and x∗Ux ≥ 0 for all vectors x. In this case, we use the notation U 0. If U and V are Hermitian, we write U V if U −V is positive semidefinite. For real matrices, the condition U = U∗ is equivalent to U = U T, where U T denotes the transpose of U. Suppose that F is real with FT = F. Then A = AT, D = DT, and C = BT. Now rewrite (1) as A − BD−1BT 0 I −BD−1 AB I 0 = : 0 D 0 I BT D −D−1BT I Suppose the center matrix on the right is known to be positive semidefinite. Then since the matrices on either side are transposes of each other, it is easy to see that the matrix on the left-hand side is also positive semidefinite. It then follows that A − BD−1BT is positive semidefinite; in symbols, A BD−1BT. 3 BlockMatrixFormulas.tex July 7, 2015 2.4.1. Upper-Left Block Submatrices Given a 2 × 2 block matrix U, we denote its upper-left block by fUgULB. For the ∗ matrix F defined earlier, fFgULB = A. We show below that if F = F , then ∗ −1 ∗ −1 fU F UgULB fU gULBfFgULBfUgULB: More specifically, if PQ U := ; RS then ∗ −1 ∗ −1 fU F UgULB P A P: To derive the result, we first use (4) to compute the left-hand block column of F−1U. We get " # fA−1 + A−1BQ−1CA−1gP − A−1BQ−1R : − Q−1CA−1P + Q−1R It follows that " # fA−1 + A−1BQ−1CA−1gP − A−1BQ−1R fU∗F−1Ug = P∗ R∗ ULB − Q−1CA−1P + Q−1R = P∗A−1P + P∗A−1BQ−1CA−1P − P∗A−1BQ−1R − RQ−1CA−1P + R∗Q−1R = P∗A−1P + (B∗A−1P − R)∗Q−1(CA−1P − R): ∗ ∗ ∗ −1 ∗ −1 Since F = F , we have C = B , it follows that fU F UgULB P A P. Remarks. (i) The above display does not depend on Q or S. (ii) The only requirement on U is that its size be such that the required matrix multiplications are defined. The number of columns in Q and S must be the same, but this number is arbitrary, and could be zero, in which case U is a 2 × 1 block matrix. In particular, there is no requirement that U be a square matrix. References [1] W. W. Hager, “Updating the inverse of a matrix,” SIAM Review, vol. 31, no. 2, pp. 221–239, 1989. [2] Wikipedia, “Schur complement — Wikipedia, The Free Encyclopedia,” [Online]. Available: https://en.wikipedia.org/w/index.php?title=Schur_complement&oldid= 658902757, accessed July 6, 2015. [3] Wikipedia, “Woodbury matrix identity — Wikipedia, The Free Encyclopedia,” [Online]. Available: https://en.wikipedia.org/w/index.php?title=Woodbury_matrix_ identity&oldid=661757721, accessed July 6, 2015. 4.
Recommended publications
  • 21. Orthonormal Bases
    21. Orthonormal Bases The canonical/standard basis 011 001 001 B C B C B C B0C B1C B0C e1 = B.C ; e2 = B.C ; : : : ; en = B.C B.C B.C B.C @.A @.A @.A 0 0 1 has many useful properties. • Each of the standard basis vectors has unit length: q p T jjeijj = ei ei = ei ei = 1: • The standard basis vectors are orthogonal (in other words, at right angles or perpendicular). T ei ej = ei ej = 0 when i 6= j This is summarized by ( 1 i = j eT e = δ = ; i j ij 0 i 6= j where δij is the Kronecker delta. Notice that the Kronecker delta gives the entries of the identity matrix. Given column vectors v and w, we have seen that the dot product v w is the same as the matrix multiplication vT w. This is the inner product on n T R . We can also form the outer product vw , which gives a square matrix. 1 The outer product on the standard basis vectors is interesting. Set T Π1 = e1e1 011 B C B0C = B.C 1 0 ::: 0 B.C @.A 0 01 0 ::: 01 B C B0 0 ::: 0C = B. .C B. .C @. .A 0 0 ::: 0 . T Πn = enen 001 B C B0C = B.C 0 0 ::: 1 B.C @.A 1 00 0 ::: 01 B C B0 0 ::: 0C = B. .C B. .C @. .A 0 0 ::: 1 In short, Πi is the diagonal square matrix with a 1 in the ith diagonal position and zeros everywhere else.
    [Show full text]
  • Partitioned (Or Block) Matrices This Version: 29 Nov 2018
    Partitioned (or Block) Matrices This version: 29 Nov 2018 Intermediate Econometrics / Forecasting Class Notes Instructor: Anthony Tay It is frequently convenient to partition matrices into smaller sub-matrices. e.g. 2 3 2 1 3 2 3 2 1 3 4 1 1 0 7 4 1 1 0 7 A B (2×2) (2×3) 3 1 1 0 0 = 3 1 1 0 0 = C I 1 3 0 1 0 1 3 0 1 0 (3×2) (3×3) 2 0 0 0 1 2 0 0 0 1 The same matrix can be partitioned in several different ways. For instance, we can write the previous matrix as 2 3 2 1 3 2 3 2 1 3 4 1 1 0 7 4 1 1 0 7 a b0 (1×1) (1×4) 3 1 1 0 0 = 3 1 1 0 0 = c D 1 3 0 1 0 1 3 0 1 0 (4×1) (4×4) 2 0 0 0 1 2 0 0 0 1 One reason partitioning is useful is that we can do matrix addition and multiplication with blocks, as though the blocks are elements, as long as the blocks are conformable for the operations. For instance: A B D E A + D B + E (2×2) (2×3) (2×2) (2×3) (2×2) (2×3) + = C I C F 2C I + F (3×2) (3×3) (3×2) (3×3) (3×2) (3×3) A B d E Ad + BF AE + BG (2×2) (2×3) (2×1) (2×3) (2×1) (2×3) = C I F G Cd + F CE + G (3×2) (3×3) (3×1) (3×3) (3×1) (3×3) | {z } | {z } | {z } (5×5) (5×4) (5×4) 1 Intermediate Econometrics / Forecasting 2 Examples (1) Let 1 2 1 1 2 1 c 1 4 2 3 4 2 3 h i A = = = a a a and c = c 1 2 3 2 3 0 1 3 0 1 c 0 1 3 0 1 3 3 c1 h i then Ac = a1 a2 a3 c2 = c1a1 + c2a2 + c3a3 c3 The product Ac produces a linear combination of the columns of A.
    [Show full text]
  • On Multivariate Interpolation
    On Multivariate Interpolation Peter J. Olver† School of Mathematics University of Minnesota Minneapolis, MN 55455 U.S.A. [email protected] http://www.math.umn.edu/∼olver Abstract. A new approach to interpolation theory for functions of several variables is proposed. We develop a multivariate divided difference calculus based on the theory of non-commutative quasi-determinants. In addition, intriguing explicit formulae that connect the classical finite difference interpolation coefficients for univariate curves with multivariate interpolation coefficients for higher dimensional submanifolds are established. † Supported in part by NSF Grant DMS 11–08894. April 6, 2016 1 1. Introduction. Interpolation theory for functions of a single variable has a long and distinguished his- tory, dating back to Newton’s fundamental interpolation formula and the classical calculus of finite differences, [7, 47, 58, 64]. Standard numerical approximations to derivatives and many numerical integration methods for differential equations are based on the finite dif- ference calculus. However, historically, no comparable calculus was developed for functions of more than one variable. If one looks up multivariate interpolation in the classical books, one is essentially restricted to rectangular, or, slightly more generally, separable grids, over which the formulae are a simple adaptation of the univariate divided difference calculus. See [19] for historical details. Starting with G. Birkhoff, [2] (who was, coincidentally, my thesis advisor), recent years have seen a renewed level of interest in multivariate interpolation among both pure and applied researchers; see [18] for a fairly recent survey containing an extensive bibli- ography. De Boor and Ron, [8, 12, 13], and Sauer and Xu, [61, 10, 65], have systemati- cally studied the polynomial case.
    [Show full text]
  • Handout 9 More Matrix Properties; the Transpose
    Handout 9 More matrix properties; the transpose Square matrix properties These properties only apply to a square matrix, i.e. n £ n. ² The leading diagonal is the diagonal line consisting of the entries a11, a22, a33, . ann. ² A diagonal matrix has zeros everywhere except the leading diagonal. ² The identity matrix I has zeros o® the leading diagonal, and 1 for each entry on the diagonal. It is a special case of a diagonal matrix, and A I = I A = A for any n £ n matrix A. ² An upper triangular matrix has all its non-zero entries on or above the leading diagonal. ² A lower triangular matrix has all its non-zero entries on or below the leading diagonal. ² A symmetric matrix has the same entries below and above the diagonal: aij = aji for any values of i and j between 1 and n. ² An antisymmetric or skew-symmetric matrix has the opposite entries below and above the diagonal: aij = ¡aji for any values of i and j between 1 and n. This automatically means the digaonal entries must all be zero. Transpose To transpose a matrix, we reect it across the line given by the leading diagonal a11, a22 etc. In general the result is a di®erent shape to the original matrix: a11 a21 a11 a12 a13 > > A = A = 0 a12 a22 1 [A ]ij = A : µ a21 a22 a23 ¶ ji a13 a23 @ A > ² If A is m £ n then A is n £ m. > ² The transpose of a symmetric matrix is itself: A = A (recalling that only square matrices can be symmetric).
    [Show full text]
  • Determinants of Commuting-Block Matrices by Istvan Kovacs, Daniel S
    Determinants of Commuting-Block Matrices by Istvan Kovacs, Daniel S. Silver*, and Susan G. Williams* Let R beacommutative ring, and Matn(R) the ring of n × n matrices over R.We (i,j) can regard a k × k matrix M =(A ) over Matn(R)asablock matrix,amatrix that has been partitioned into k2 submatrices (blocks)overR, each of size n × n. When M is regarded in this way, we denote its determinant by |M|.Wewill use the symbol D(M) for the determinant of M viewed as a k × k matrix over Matn(R). It is important to realize that D(M)isann × n matrix. Theorem 1. Let R be acommutative ring. Assume that M is a k × k block matrix of (i,j) blocks A ∈ Matn(R) that commute pairwise. Then | | | | (1,π(1)) (2,π(2)) ··· (k,π(k)) (1) M = D(M) = (sgn π)A A A . π∈Sk Here Sk is the symmetric group on k symbols; the summation is the usual one that appears in the definition of determinant. Theorem 1 is well known in the case k =2;the proof is often left as an exercise in linear algebra texts (see [4, page 164], for example). The general result is implicit in [3], but it is not widely known. We present a short, elementary proof using mathematical induction on k.Wesketch a second proof when the ring R has no zero divisors, a proof that is based on [3] and avoids induction by using the fact that commuting matrices over an algebraically closed field can be simultaneously triangularized.
    [Show full text]
  • Week 8-9. Inner Product Spaces. (Revised Version) Section 3.1 Dot Product As an Inner Product
    Math 2051 W2008 Margo Kondratieva Week 8-9. Inner product spaces. (revised version) Section 3.1 Dot product as an inner product. Consider a linear (vector) space V . (Let us restrict ourselves to only real spaces that is we will not deal with complex numbers and vectors.) De¯nition 1. An inner product on V is a function which assigns a real number, denoted by < ~u;~v> to every pair of vectors ~u;~v 2 V such that (1) < ~u;~v>=< ~v; ~u> for all ~u;~v 2 V ; (2) < ~u + ~v; ~w>=< ~u;~w> + < ~v; ~w> for all ~u;~v; ~w 2 V ; (3) < k~u;~v>= k < ~u;~v> for any k 2 R and ~u;~v 2 V . (4) < ~v;~v>¸ 0 for all ~v 2 V , and < ~v;~v>= 0 only for ~v = ~0. De¯nition 2. Inner product space is a vector space equipped with an inner product. Pn It is straightforward to check that the dot product introduces by ~u ¢ ~v = j=1 ujvj is an inner product. You are advised to verify all the properties listed in the de¯nition, as an exercise. The dot product is also called Euclidian inner product. De¯nition 3. Euclidian vector space is Rn equipped with Euclidian inner product < ~u;~v>= ~u¢~v. De¯nition 4. A square matrix A is called positive de¯nite if ~vT A~v> 0 for any vector ~v 6= ~0. · ¸ 2 0 Problem 1. Show that is positive de¯nite. 0 3 Solution: Take ~v = (x; y)T . Then ~vT A~v = 2x2 + 3y2 > 0 for (x; y) 6= (0; 0).
    [Show full text]
  • Irreducibility in Algebraic Groups and Regular Unipotent Elements
    PROCEEDINGS OF THE AMERICAN MATHEMATICAL SOCIETY Volume 141, Number 1, January 2013, Pages 13–28 S 0002-9939(2012)11898-2 Article electronically published on August 16, 2012 IRREDUCIBILITY IN ALGEBRAIC GROUPS AND REGULAR UNIPOTENT ELEMENTS DONNA TESTERMAN AND ALEXANDRE ZALESSKI (Communicated by Pham Huu Tiep) Abstract. We study (connected) reductive subgroups G of a reductive alge- braic group H,whereG contains a regular unipotent element of H.Themain result states that G cannot lie in a proper parabolic subgroup of H. This result is new even in the classical case H =SL(n, F ), the special linear group over an algebraically closed field, where a regular unipotent element is one whose Jor- dan normal form consists of a single block. In previous work, Saxl and Seitz (1997) determined the maximal closed positive-dimensional (not necessarily connected) subgroups of simple algebraic groups containing regular unipotent elements. Combining their work with our main result, we classify all reductive subgroups of a simple algebraic group H which contain a regular unipotent element. 1. Introduction Let H be a reductive linear algebraic group defined over an algebraically closed field F . Throughout this text ‘reductive’ will mean ‘connected reductive’. A unipo- tent element u ∈ H is said to be regular if the dimension of its centralizer CH (u) coincides with the rank of H (or, equivalently, u is contained in a unique Borel subgroup of H). Regular unipotent elements of a reductive algebraic group exist in all characteristics (see [22]) and form a single conjugacy class. These play an important role in the general theory of algebraic groups.
    [Show full text]
  • New Foundations for Geometric Algebra1
    Text published in the electronic journal Clifford Analysis, Clifford Algebras and their Applications vol. 2, No. 3 (2013) pp. 193-211 New foundations for geometric algebra1 Ramon González Calvet Institut Pere Calders, Campus Universitat Autònoma de Barcelona, 08193 Cerdanyola del Vallès, Spain E-mail : [email protected] Abstract. New foundations for geometric algebra are proposed based upon the existing isomorphisms between geometric and matrix algebras. Each geometric algebra always has a faithful real matrix representation with a periodicity of 8. On the other hand, each matrix algebra is always embedded in a geometric algebra of a convenient dimension. The geometric product is also isomorphic to the matrix product, and many vector transformations such as rotations, axial symmetries and Lorentz transformations can be written in a form isomorphic to a similarity transformation of matrices. We collect the idea Dirac applied to develop the relativistic electron equation when he took a basis of matrices for the geometric algebra instead of a basis of geometric vectors. Of course, this way of understanding the geometric algebra requires new definitions: the geometric vector space is defined as the algebraic subspace that generates the rest of the matrix algebra by addition and multiplication; isometries are simply defined as the similarity transformations of matrices as shown above, and finally the norm of any element of the geometric algebra is defined as the nth root of the determinant of its representative matrix of order n. The main idea of this proposal is an arithmetic point of view consisting of reversing the roles of matrix and geometric algebras in the sense that geometric algebra is a way of accessing, working and understanding the most fundamental conception of matrix algebra as the algebra of transformations of multiple quantities.
    [Show full text]
  • Matrix Determinants
    MATRIX DETERMINANTS Summary Uses ................................................................................................................................................. 1 1‐ Reminder ‐ Definition and components of a matrix ................................................................ 1 2‐ The matrix determinant .......................................................................................................... 2 3‐ Calculation of the determinant for a matrix ................................................................. 2 4‐ Exercise .................................................................................................................................... 3 5‐ Definition of a minor ............................................................................................................... 3 6‐ Definition of a cofactor ............................................................................................................ 4 7‐ Cofactor expansion – a method to calculate the determinant ............................................... 4 8‐ Calculate the determinant for a matrix ........................................................................ 5 9‐ Alternative method to calculate determinants ....................................................................... 6 10‐ Exercise .................................................................................................................................... 7 11‐ Determinants of square matrices of dimensions 4x4 and greater ........................................
    [Show full text]
  • 6 Inner Product Spaces
    Lectures 16,17,18 6 Inner Product Spaces 6.1 Basic Definition Parallelogram law, the ability to measure angle between two vectors and in particular, the concept of perpendicularity make the euclidean space quite a special type of a vector space. Essentially all these are consequences of the dot product. Thus, it makes sense to look for operations which share the basic properties of the dot product. In this section we shall briefly discuss this. Definition 6.1 Let V be a vector space. By an inner product on V we mean a binary operation, which associates a scalar say u, v for each pair of vectors u, v) in V, satisfying h i the following properties for all u, v, w in V and α, β any scalar. (Let “ ” denote the complex − conjugate of a complex number.) (1) u, v = v, u (Hermitian property or conjugate symmetry); h i h i (2) u, αv + βw = α u, v + β u, w (sesquilinearity); h i h i h i (3) v, v > 0 if v =0 (positivity). h i 6 A vector space with an inner product is called an inner product space. Remark 6.1 (i) Observe that we have not mentioned whether V is a real vector space or a complex vector space. The above definition includes both the cases. The only difference is that if K = R then the conjugation is just the identity. Thus for real vector spaces, (1) will becomes ‘symmetric property’ since for a real number c, we have c¯ = c. (ii) Combining (1) and (2) we obtain (2’) αu + βv, w =α ¯ u, w + β¯ v, w .
    [Show full text]
  • 9. Properties of Matrices Block Matrices
    9. Properties of Matrices Block Matrices It is often convenient to partition a matrix M into smaller matrices called blocks, like so: 01 2 3 11 ! B C B4 5 6 0C A B M = B C = @7 8 9 1A C D 0 1 2 0 01 2 31 011 B C B C Here A = @4 5 6A, B = @0A, C = 0 1 2 , D = (0). 7 8 9 1 • The blocks of a block matrix must fit together to form a rectangle. So ! ! B A C B makes sense, but does not. D C D A • There are many ways to cut up an n × n matrix into blocks. Often context or the entries of the matrix will suggest a useful way to divide the matrix into blocks. For example, if there are large blocks of zeros in a matrix, or blocks that look like an identity matrix, it can be useful to partition the matrix accordingly. • Matrix operations on block matrices can be carried out by treating the blocks as matrix entries. In the example above, ! ! A B A B M 2 = C D C D ! A2 + BC AB + BD = CA + DC CB + D2 1 Computing the individual blocks, we get: 0 30 37 44 1 2 B C A + BC = @ 66 81 96 A 102 127 152 0 4 1 B C AB + BD = @10A 16 0181 B C CA + DC = @21A 24 CB + D2 = (2) Assembling these pieces into a block matrix gives: 0 30 37 44 4 1 B C B 66 81 96 10C B C @102 127 152 16A 4 10 16 2 This is exactly M 2.
    [Show full text]
  • Notes on Change of Bases Northwestern University, Summer 2014
    Notes on Change of Bases Northwestern University, Summer 2014 Let V be a finite-dimensional vector space over a field F, and let T be a linear operator on V . Given a basis (v1; : : : ; vn) of V , we've seen how we can define a matrix which encodes all the information about T as follows. For each i, we can write T vi = a1iv1 + ··· + anivn 2 for a unique choice of scalars a1i; : : : ; ani 2 F. In total, we then have n scalars aij which we put into an n × n matrix called the matrix of T relative to (v1; : : : ; vn): 0 1 a11 ··· a1n B . .. C M(T )v := @ . A 2 Mn;n(F): an1 ··· ann In the notation M(T )v, the v showing up in the subscript emphasizes that we're taking this matrix relative to the specific bases consisting of v's. Given any vector u 2 V , we can also write u = b1v1 + ··· + bnvn for a unique choice of scalars b1; : : : ; bn 2 F, and we define the coordinate vector of u relative to (v1; : : : ; vn) as 0 1 b1 B . C n M(u)v := @ . A 2 F : bn In particular, the columns of M(T )v are the coordinates vectors of the T vi. Then the point of the matrix M(T )v is that the coordinate vector of T u is given by M(T u)v = M(T )vM(u)v; so that from the matrix of T and the coordinate vectors of elements of V , we can in fact reconstruct T itself.
    [Show full text]