An Extension of the Sherman-Morrison-Woodbury Formula ∗
Sciencepaper Online (中国科技论文在线) http://www.paper.edu.cn

Yan Zi-zong †

∗ Supported by the National Natural Science Foundation of China (70771080).
† Department of Information and Mathematics, Yangtze University, Jingzhou, Hubei, China ([email protected]).

Abstract

This paper focuses on applications of Schur complements to matrix identities and presents an extension of the Sherman-Morrison-Woodbury formula that includes many matrix identities, such as Hua's identity and its extensions.

Keywords: Sherman-Morrison-Woodbury formula, Hua's identity, Schur complement

AMS subject classifications. 15A45, 15A48, 15A24

1 Introduction

The well-known matrix identity
\[ (A + UV^*)^{-1} = A^{-1} - A^{-1}U(I + V^*A^{-1}U)^{-1}V^*A^{-1} \tag{1} \]
is called the Sherman-Morrison-Woodbury formula, and is usually attributed to Sherman and Morrison [8] and, independently, to Woodbury [11]. In (1), $A \in \mathbb{C}^{k \times k}$, $U, V \in \mathbb{C}^{k \times p}$, $I$ is an identity matrix, and both $A$ and $I + V^*A^{-1}U$ are nonsingular. In linear algebra terms, this identity says that the inverse of a low-rank correction of a matrix can be computed by applying a low-rank correction to the inverse of the original matrix. Alternative names for this formula are the matrix inversion lemma, the Woodbury formula, and, when $U$ and $V$ are vectors, the Sherman-Morrison formula.

There are numerous applications of the Sherman-Morrison-Woodbury formula in various fields [4, 10]. For example, the formula is useful in numerical computations where $A^{-1}$ has already been computed and $(A + UV^*)^{-1}$ is required. With the inverse of $A$ available, it is only necessary to invert $I + V^*A^{-1}U$ in order to obtain the result from the right-hand side of the identity. Since $I + V^*A^{-1}U$ is only $p \times p$, its inverse is cheap to compute when $p$ is much smaller than $k$, and this is more efficient than inverting $A + UV^*$ directly.
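The update described above can be illustrated with a small exact computation. The following is a minimal sketch, not taken from the paper: it assumes real rational matrices stored as lists of rows (so that $V^*$ is simply the transpose), and the helper names `matmul`, `prod`, `add`, `eye`, `transpose`, `inverse` and `smw_inverse`, as well as the sample matrices, are hypothetical choices made for illustration.

```python
from fractions import Fraction
from functools import reduce

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def prod(*mats):
    """Left-to-right product of several matrices."""
    return reduce(matmul, mats)

def add(X, Y, sign=1):
    """Entrywise X + sign * Y."""
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def eye(n):
    """n x n identity matrix with Fraction entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse(X):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(X)
    aug = [[Fraction(v) for v in row] + unit for row, unit in zip(X, eye(n))]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def smw_inverse(A_inv, U, V):
    """Right-hand side of (1) in the real case: (A + U V^T)^{-1} from A^{-1}."""
    Vt = transpose(V)
    small = add(eye(len(Vt)), prod(Vt, A_inv, U))  # I + V^T A^{-1} U, only p x p
    correction = prod(A_inv, U, inverse(small), Vt, A_inv)
    return add(A_inv, correction, sign=-1)

# Illustrative data (an assumption, not from the paper): k = 3, p = 1.
A = [[4, 1, 0], [1, 3, 1], [0, 1, 2]]
U = [[1], [0], [2]]
V = [[1], [1], [0]]

updated = smw_inverse(inverse(A), U, V)            # uses only a 1 x 1 inversion
direct = inverse(add(A, matmul(U, transpose(V))))  # inverts the full 3 x 3 matrix
print(updated == direct)
```

Exact `Fraction` arithmetic is used here so that the two results can be compared with `==`; a floating-point version would need a tolerance instead.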
The Sherman-Morrison-Woodbury formula (1) implies that
\[ (I - A^*A)^{-1} = I + A^*(I - AA^*)^{-1}A. \tag{2} \]
Using formula (2), Loo-Keng Hua [5] proposed the elegant matrix identity
\[ I - B^*B = (I - B^*A)(I - A^*A)^{-1}(I - A^*B) - (A - B)^*(I - AA^*)^{-1}(A - B). \tag{3} \]
A short proof of (3) can be found in [14]. Zhang [13, 14] also presented a nice generalization of Hua's matrix identity (3):
\[ AA^* + BB^* = (B + AX)(I + X^*X)^{-1}(B + AX)^* + (A - BX^*)(I + XX^*)^{-1}(A - BX^*)^*. \tag{4} \]
Recently, Yan [12] presented another extension of (3):
\[ AA^* - BB^* = (A - BX^*)(I - XX^*)^{-1}(A - BX^*)^* - (B - AX)(I - X^*X)^{-1}(B - AX)^*. \tag{5} \]
Our purpose in this paper is to present an extension of the Sherman-Morrison-Woodbury formula, together with many useful matrix identities, including Hua's identity, all obtained using Schur complements. We do this in Sections 3 and 4, after presenting the necessary background in Section 2.

2 Background

Let $M$ be an $n \times n$ invertible matrix partitioned as
\[ M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}, \tag{6} \]
in which $M_{11}$ is a square $k \times k$ block with $1 \le k < n$. Letting
\[ M_{22.1} = M_{22} - M_{21}M_{11}^{-1}M_{12} \]
denote the Schur complement of $M_{11}$ in $M$, the Banachiewicz identity in [2] is
\[ M^{-1} = \begin{pmatrix} M_{11}^{-1} + M_{11}^{-1}M_{12}M_{22.1}^{-1}M_{21}M_{11}^{-1} & -M_{11}^{-1}M_{12}M_{22.1}^{-1} \\ -M_{22.1}^{-1}M_{21}M_{11}^{-1} & M_{22.1}^{-1} \end{pmatrix}, \tag{7} \]
which can be derived from the so-called Aitken block-diagonalization formula
\[ \begin{pmatrix} I & 0 \\ -M_{21}M_{11}^{-1} & I \end{pmatrix} \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix} \begin{pmatrix} I & -M_{11}^{-1}M_{12} \\ 0 & I \end{pmatrix} = \begin{pmatrix} M_{11} & 0 \\ 0 & M_{22.1} \end{pmatrix}. \tag{8} \]
Formula (8) was apparently first established explicitly by Aitken [1] and first published in 1939.

When $M_{22}$ is an identity matrix, the Sherman-Morrison-Woodbury formula (1) is an important special case of the following Duncan identity
\[ (M_{11} - M_{12}M_{22}^{-1}M_{21})^{-1} = M_{11}^{-1} + M_{11}^{-1}M_{12}M_{22.1}^{-1}M_{21}M_{11}^{-1}, \tag{9} \]
established by Duncan (1942) in [3]. It follows at once from the Banachiewicz identity (7). The Duncan identity (9) and the Sherman-Morrison-Woodbury formula (1) are essentially equivalent.
In fact, we obtain the Duncan identity (9) if we replace $A$, $U$ and $V^*$ in (1) by $M_{11}$, $-M_{12}M_{22}^{-1}$ and $M_{21}$, respectively. These well-known matrix identities can be found in, for example, [3, 6, 9, 13].

The following interesting lemma can be found in [13, 14]; we nevertheless give a complete proof.

Lemma 2.1. Let $M$ be partitioned as in (6), let
\[ L = \begin{pmatrix} L_{11} & 0 \\ L_{21} & L_{22} \end{pmatrix}, \qquad R = \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix} \]
be partitioned conformally with $M$, and let $\mathcal{R}(\cdot)$ denote the column space. Suppose that the blocks $L_{11}$ and $R_{11}$ are invertible. If
\[ \mathcal{R}(M_{12}) \subset \mathcal{R}(M_{11}), \tag{10} \]
then
\[ (LM)_{22.1} = L_{22}M_{22.1}, \tag{11} \]
\[ (MR)_{22.1} = M_{22.1}R_{22}, \tag{12} \]
\[ (LMR)_{22.1} = L_{22}M_{22.1}R_{22}. \tag{13} \]
In particular, if $L_{22} = R_{22} = I$, then
\[ (LMR)_{22.1} = M_{22.1}. \tag{14} \]

Proof. When $L_{22} = R_{22} = I$, (14) follows immediately from (13). We only need to prove (11), since (12) and (13) are handled similarly. First assume that $M_{11}$ is invertible. Since
\[ LM = \begin{pmatrix} L_{11}M_{11} & L_{11}M_{12} \\ L_{21}M_{11} + L_{22}M_{21} & L_{21}M_{12} + L_{22}M_{22} \end{pmatrix}, \]
we have
\[ \begin{aligned} (LM)_{22.1} &= L_{21}M_{12} + L_{22}M_{22} - (L_{21}M_{11} + L_{22}M_{21})(L_{11}M_{11})^{-1}L_{11}M_{12} \\ &= L_{21}M_{12} + L_{22}M_{22} - (L_{21}M_{11} + L_{22}M_{21})M_{11}^{-1}M_{12} \\ &= L_{22}M_{22} - L_{22}M_{21}M_{11}^{-1}M_{12} \\ &= L_{22}M_{22.1}. \end{aligned} \]
If $M_{11}$ is singular, condition (10) implies that the Schur complement $M_{22.1}$ of $M_{11}$ in $M$ is unique (see [14]), and that $\mathcal{R}(L_{11}M_{12}) \subset \mathcal{R}(L_{11}M_{11})$, which shows that the Schur complement $(LM)_{22.1}$ of $L_{11}M_{11}$ in $LM$ is unique. So (11) is still valid.

3 Main results

The main result of this paper is as follows.

Theorem 3.1. Let $N$ be an $n \times n$ matrix partitioned conformally with $M$ in (6). If the blocks $M_{11}$, $N_{11}$ and $M_{11}N_{11} + M_{12}N_{21}$ are invertible, then
\[ \begin{aligned} M_{21}N_{12} + M_{22}N_{22} &= (M_{21}N_{11} + M_{22}N_{21})(M_{11}N_{11} + M_{12}N_{21})^{-1}(M_{11}N_{12} + M_{12}N_{22}) \\ &\quad + M_{22.1}(I + N_{21}N_{11}^{-1}M_{11}^{-1}M_{12})^{-1}N_{22.1}. \end{aligned} \tag{15} \]
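As a sanity check of (15), the identity can be evaluated on a small example. The sketch below is an illustration under stated assumptions rather than part of the paper: it uses real rational $2 \times 2$ blocks, exact `Fraction` arithmetic, and hypothetical helper names (`matmul`, `prod`, `add`, `eye`, `inverse`, `schur`, `theorem_sides`). It compares the Schur complement $P_{22.1}$ of $P = MN$ with $M_{22.1}(I + N_{21}N_{11}^{-1}M_{11}^{-1}M_{12})^{-1}N_{22.1}$, which is (15) with the first term on the right moved to the left.

```python
from fractions import Fraction
from functools import reduce

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def prod(*mats):
    """Left-to-right product of several matrices."""
    return reduce(matmul, mats)

def add(X, Y, sign=1):
    """Entrywise X + sign * Y."""
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def eye(n):
    """n x n identity matrix with Fraction entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def inverse(X):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(X)
    aug = [[Fraction(v) for v in row] + unit for row, unit in zip(X, eye(n))]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def schur(X11, X12, X21, X22):
    """Schur complement X22 - X21 X11^{-1} X12 of the block X11."""
    return add(X22, prod(X21, inverse(X11), X12), sign=-1)

def theorem_sides(M11, M12, M21, M22, N11, N12, N21, N22):
    """Return (P_{22.1}, M_{22.1} (I + E)^{-1} N_{22.1}) for P = M N."""
    P11 = add(matmul(M11, N11), matmul(M12, N21))
    P12 = add(matmul(M11, N12), matmul(M12, N22))
    P21 = add(matmul(M21, N11), matmul(M22, N21))
    P22 = add(matmul(M21, N12), matmul(M22, N22))
    E = prod(N21, inverse(N11), inverse(M11), M12)
    right = prod(schur(M11, M12, M21, M22),
                 inverse(add(eye(len(E)), E)),
                 schur(N11, N12, N21, N22))
    return schur(P11, P12, P21, P22), right

# Illustrative blocks (assumptions): M11, N11 and M11 N11 + M12 N21 are invertible.
M11, M12 = [[2, 1], [1, 1]], [[1, 0], [2, 1]]
M21, M22 = [[0, 1], [1, 1]], [[3, 1], [1, 2]]
N11, N12 = [[1, 2], [0, 1]], [[1, 1], [0, 2]]
N21, N22 = [[1, 0], [1, 1]], [[2, 0], [1, 3]]

lhs, rhs = theorem_sides(M11, M12, M21, M22, N11, N12, N21, N22)
print(lhs == rhs)
```

A single example does not prove the theorem, of course; it only guards against sign or ordering slips when the formula is transcribed.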
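Proof. Let
\[ P = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix} \begin{pmatrix} N_{11} & N_{12} \\ N_{21} & N_{22} \end{pmatrix}, \qquad Q = \begin{pmatrix} I & 0 \\ -M_{21}M_{11}^{-1} & I \end{pmatrix} P \begin{pmatrix} I & -N_{11}^{-1}N_{12} \\ 0 & I \end{pmatrix}. \]
Then
\[ Q = \begin{pmatrix} M_{11} & M_{12} \\ 0 & M_{22.1} \end{pmatrix} \begin{pmatrix} N_{11} & 0 \\ N_{21} & N_{22.1} \end{pmatrix} = \begin{pmatrix} M_{11}N_{11} + M_{12}N_{21} & M_{12}N_{22.1} \\ M_{22.1}N_{21} & M_{22.1}N_{22.1} \end{pmatrix}, \]
and
\[ P_{22.1} = M_{21}N_{12} + M_{22}N_{22} - (M_{21}N_{11} + M_{22}N_{21})(M_{11}N_{11} + M_{12}N_{21})^{-1}(M_{11}N_{12} + M_{12}N_{22}), \]
\[ Q_{22.1} = M_{22.1}N_{22.1} - M_{22.1}N_{21}(M_{11}N_{11} + M_{12}N_{21})^{-1}M_{12}N_{22.1}. \]
On the other hand, the Sherman-Morrison-Woodbury formula (1) implies
\[ (M_{11}N_{11} + M_{12}N_{21})^{-1} = N_{11}^{-1}M_{11}^{-1} - N_{11}^{-1}M_{11}^{-1}M_{12}(I + N_{21}N_{11}^{-1}M_{11}^{-1}M_{12})^{-1}N_{21}N_{11}^{-1}M_{11}^{-1}. \]
Let $E = N_{21}N_{11}^{-1}M_{11}^{-1}M_{12}$. Using the basic relation
\[ E(I + E)^{-1}E - E = (I + E)^{-1} - I, \]
we have
\[ \begin{aligned} Q_{22.1} &= M_{22.1}N_{22.1} - M_{22.1}N_{21}N_{11}^{-1}M_{11}^{-1}M_{12}N_{22.1} \\ &\quad + M_{22.1}N_{21}N_{11}^{-1}M_{11}^{-1}M_{12}(I + N_{21}N_{11}^{-1}M_{11}^{-1}M_{12})^{-1}N_{21}N_{11}^{-1}M_{11}^{-1}M_{12}N_{22.1} \\ &= M_{22.1}(I + N_{21}N_{11}^{-1}M_{11}^{-1}M_{12})^{-1}N_{22.1}. \end{aligned} \]
By Lemma 2.1, $P_{22.1} = Q_{22.1}$, which implies the desired result.

The matrix identity (15) and the Sherman-Morrison-Woodbury formula (1) are essentially equivalent. The above proof shows that the latter implies the former. Conversely, if we choose
\[ P = \begin{pmatrix} A & U \\ 0 & I \end{pmatrix} \begin{pmatrix} I & 0 \\ V^* & I \end{pmatrix} \]
in the matrix identity (15), we obtain the Sherman-Morrison-Woodbury formula (1).

4 Applications

In what follows we show that many existing identities are in fact consequences of Theorem 3.1, obtained by making special choices of the matrix $P$. In general, we always choose $P$ such that $P_{22.1}$ is a Hermitian matrix.

The first choice of $P$ is
\[ P = \begin{pmatrix} Y^* & X^* \\ B & A \end{pmatrix} \begin{pmatrix} Y & B^* \\ X & A^* \end{pmatrix}, \]
which gives rise to the matrix identity
\[ \begin{aligned} AA^* + BB^* &= (BY + AX)(Y^*Y + X^*X)^{-1}(BY + AX)^* \\ &\quad + (A^* - XY^{-1}B^*)^*(I + X(Y^*Y)^{-1}X^*)^{-1}(A^* - XY^{-1}B^*). \end{aligned} \tag{16} \]
A special case of (16), when $Y$ is an identity matrix, is the identity (4).

The second choice of $P$ is
\[ P = \begin{pmatrix} Y^* & X^* \\ B & A \end{pmatrix} \begin{pmatrix} Y & -B^* \\ -X & A^* \end{pmatrix}, \]
which gives rise to the matrix identity
\[ \begin{aligned} AA^* - BB^* &= (A^* - XY^{-1}B^*)^*(I - X(Y^*Y)^{-1}X^*)^{-1}(A^* - XY^{-1}B^*) \\ &\quad - (BY - AX)(Y^*Y - X^*X)^{-1}(BY - AX)^*. \end{aligned} \tag{17} \]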
A special case of (17), when $A$ is equal to $B$, is
\[ A(I - XY^{-1})^*(I - X(Y^*Y)^{-1}X^*)^{-1}(I - XY^{-1})A^* = A(Y - X)(Y^*Y - X^*X)^{-1}(Y - X)^*A^*. \tag{18} \]
When $Y$ is an identity matrix, we obtain the identity (5) and
\[ A(I - X)^*(I - XX^*)^{-1}(I - X)A^* = A(I - X)(I - X^*X)^{-1}(I - X)^*A^* \tag{19} \]
from (17) and (18), respectively. Furthermore, Hua's identity (3) follows from the identity (5).

The third choice of $P$ is
\[ P = \begin{pmatrix} Y^* & X^* \\ B^* & A^* \end{pmatrix} \begin{pmatrix} Y & A \\ X & B \end{pmatrix}, \]
which gives rise to the matrix identity
\[ \begin{aligned} B^*A + A^*B &= (B^*Y + A^*X)(Y^*Y + X^*X)^{-1}(Y^*A + X^*B) \\ &\quad + (A^* - B^*(Y^*)^{-1}X^*)(I + X(Y^*Y)^{-1}X^*)^{-1}(B - XY^{-1}A) \end{aligned} \tag{20} \]
and, when $Y$ is an identity matrix,
\[ B^*A + A^*B = (B^* + A^*X)(I + X^*X)^{-1}(A + X^*B) + (A^* - B^*X^*)(I + XX^*)^{-1}(B - XA). \tag{21} \]
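Identities of this kind are easy to mistranscribe, so a numerical spot check can be useful. The sketch below is only an illustration under stated assumptions: it restricts to real rational matrices (so that $^*$ becomes the transpose), uses exact `Fraction` arithmetic, and checks Zhang's identity (4) and Hua's identity (3) on sample data. All helper names (`matmul`, `prod`, `add`, `eye`, `transpose`, `inverse`, `zhang_sides`, `hua_sides`) and the sample matrices are hypothetical.

```python
from fractions import Fraction as F
from functools import reduce

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def prod(*mats):
    """Left-to-right product of several matrices."""
    return reduce(matmul, mats)

def add(X, Y, sign=1):
    """Entrywise X + sign * Y."""
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def eye(n):
    """n x n identity matrix with Fraction entries."""
    return [[F(int(i == j)) for j in range(n)] for i in range(n)]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse(X):
    """Gauss-Jordan inverse in exact rational arithmetic."""
    n = len(X)
    aug = [[F(v) for v in row] + unit for row, unit in zip(X, eye(n))]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def zhang_sides(A, B, X):
    """Both sides of identity (4) in the real case."""
    left = add(matmul(A, transpose(A)), matmul(B, transpose(B)))
    S = add(B, matmul(A, X))                       # B + A X
    T = add(A, matmul(B, transpose(X)), sign=-1)   # A - B X^T
    rows, cols = len(X), len(X[0])
    right = add(
        prod(S, inverse(add(eye(cols), matmul(transpose(X), X))), transpose(S)),
        prod(T, inverse(add(eye(rows), matmul(X, transpose(X)))), transpose(T)))
    return left, right

def hua_sides(A, B):
    """Both sides of Hua's identity (3) in the real case."""
    At, Bt = transpose(A), transpose(B)
    I_n, I_m = eye(len(A[0])), eye(len(A))
    G = add(I_n, matmul(Bt, A), sign=-1)           # I - B^T A
    D = add(A, B, sign=-1)                         # A - B
    left = add(I_n, matmul(Bt, B), sign=-1)
    right = add(
        prod(G, inverse(add(I_n, matmul(At, A), sign=-1)), transpose(G)),
        prod(transpose(D), inverse(add(I_m, matmul(A, At), sign=-1)), D),
        sign=-1)
    return left, right

# Illustrative data (assumptions): A is 2 x 2, B and X are 2 x 3 for (4).
zl, zr = zhang_sides([[1, 2], [0, 1]], [[1, 0, 2], [3, 1, 0]], [[1, 0, 1], [2, 1, 0]])
# For (3) the entries are kept small so that I - A^T A and I - A A^T are invertible.
hl, hr = hua_sides([[F(1, 2), 0], [F(1, 4), F(1, 3)]], [[F(1, 3), F(1, 2)], [0, F(1, 5)]])
print(zl == zr, hl == hr)
```

The same pattern extends to the other identities of this section by swapping in the corresponding left-hand and right-hand sides.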