
An extension of the Sherman-Morrison-Woodbury formula ∗

Yan Zi-zong †

Abstract This paper focuses on applications of Schur complements to matrix identities and presents an extension of the Sherman-Morrison-Woodbury formula, which contains a number of matrix identities, such as Hua's identity and its extensions, as special cases.

Keywords: Sherman-Morrison-Woodbury formula, Hua's identity, Schur complement

AMS subject classifications. 15A45, 15A48, 15A24

1 Introduction

The well-known matrix identity

\[
(A + UV^*)^{-1} = A^{-1} - A^{-1}U(I + V^*A^{-1}U)^{-1}V^*A^{-1}, \tag{1}
\]

is called the Sherman-Morrison-Woodbury matrix formula; it is usually attributed independently to Sherman and Morrison [8] and to Woodbury [11]. In (1), A ∈ C^{k×k}, U, V ∈ C^{k×p}, I is an identity matrix, and both A and I + V^*A^{-1}U are nonsingular. In mathematics (specifically, in linear algebra), this identity says that the inverse of a low-rank correction of some matrix can be computed by performing a low-rank correction to the inverse of the original matrix. Alternative names for this formula are the matrix inversion lemma, the Sherman-Morrison formula (when U and V are vectors), or simply the Woodbury formula. There are numerous applications of the Sherman-Morrison-Woodbury formula in various fields [4, 10]. For example, the formula is useful in certain numerical computations where A^{-1} has already been computed and it is desired to compute (A + UV^*)^{-1}. With the inverse of A available, it is only necessary to find the inverse of I + V^*A^{-1}U in order to obtain the result from the right-hand side of the identity.
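As a simple illustration, consider the rank-one case in which U = u and V = v are column vectors. The formula then reduces to the classical Sherman-Morrison formula
\[
(A + uv^*)^{-1} = A^{-1} - \frac{A^{-1}uv^*A^{-1}}{1 + v^*A^{-1}u},
\]
so that updating the known inverse A^{-1} costs only a few matrix-vector products and one scalar division, rather than a fresh matrix inversion.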

∗Supported by the National Natural Science Foundation of China (70771080). †Department of Information and Mathematics, Yangtze University, Jingzhou, Hubei, China([email protected]).


Since the inverse of I + V^*A^{-1}U is easily computed, this is more efficient than inverting A + UV^* directly. The Sherman-Morrison-Woodbury formula (1) implies that

\[
(I - A^*A)^{-1} = I + A^*(I - AA^*)^{-1}A. \tag{2}
\]
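Indeed, (2) follows from (1) by replacing A, U and V in (1) with I, A^* and -A^*, respectively:
\[
(I - A^*A)^{-1} = I - A^*\bigl(I + (-A)A^*\bigr)^{-1}(-A) = I + A^*(I - AA^*)^{-1}A.
\]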

By the use of the formula (2), Loo-Keng Hua [5] proposed the elegant matrix identity

\[
I - B^*B = (I - B^*A)(I - A^*A)^{-1}(I - A^*B) - (A - B)^*(I - AA^*)^{-1}(A - B). \tag{3}
\]

A short proof of formula (3) can be found in [14]. Meanwhile, Zhang [13, 14] also presented a nice generalization of Hua's matrix identity (3) as follows:

\[
AA^* + BB^* = (B + AX)(I + X^*X)^{-1}(B + AX)^* + (A - BX^*)(I + XX^*)^{-1}(A - BX^*)^*. \tag{4}
\]

Recently, Yan [12] presented another extension of (3) as follows

\[
AA^* - BB^* = (A - BX^*)(I - XX^*)^{-1}(A - BX^*)^* - (B - AX)(I - X^*X)^{-1}(B - AX)^*. \tag{5}
\]

Our purpose in this paper is to present an extension of the Sherman-Morrison-Woodbury formula, together with a number of useful matrix identities, including Hua's identity, both by means of Schur complements. We do this in Sections 3 and 4, after presenting the necessary background in Section 2.

2 Background

Let M be an n × n matrix partitioned as

\[
M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}, \tag{6}
\]

in which M11 is a square k × k block with 1 ≤ k < n. Letting

\[
M_{22.1} = M_{22} - M_{21}M_{11}^{-1}M_{12}
\]

denote the Schur complement of M11 in M, the Banachiewicz identity in [2] is

\[
M^{-1} = \begin{pmatrix} M_{11}^{-1} + M_{11}^{-1}M_{12}M_{22.1}^{-1}M_{21}M_{11}^{-1} & -M_{11}^{-1}M_{12}M_{22.1}^{-1} \\ -M_{22.1}^{-1}M_{21}M_{11}^{-1} & M_{22.1}^{-1} \end{pmatrix}, \tag{7}
\]
which can be derived from the following so-called Aitken block-diagonalization formula

\[
\begin{pmatrix} I & 0 \\ -M_{21}M_{11}^{-1} & I \end{pmatrix}
\begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}
\begin{pmatrix} I & -M_{11}^{-1}M_{12} \\ 0 & I \end{pmatrix}
= \begin{pmatrix} M_{11} & 0 \\ 0 & M_{22.1} \end{pmatrix}. \tag{8}
\]
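As a concrete illustration of (6)-(8) with scalar blocks (k = 1), take
\[
M = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad M_{22.1} = 4 - 3\cdot 1^{-1}\cdot 2 = -2.
\]
Formula (8) then diagonalizes M to \mathrm{diag}(1,-2), while (7) gives
\[
M^{-1} = \begin{pmatrix} 1 + 1\cdot 2\cdot(-\tfrac12)\cdot 3\cdot 1 & -1\cdot 2\cdot(-\tfrac12) \\ -(-\tfrac12)\cdot 3\cdot 1 & -\tfrac12 \end{pmatrix} = \begin{pmatrix} -2 & 1 \\ \tfrac32 & -\tfrac12 \end{pmatrix},
\]
which agrees with the usual formula for the inverse of a 2 × 2 matrix.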


The formula (8) was apparently first established explicitly by Aitken [1] and first published in 1939. When M22 is an identity matrix, the Sherman-Morrison-Woodbury formula (1) is a special and important case of the following Duncan identity
\[
(M_{11} - M_{12}M_{22}^{-1}M_{21})^{-1} = M_{11}^{-1} + M_{11}^{-1}M_{12}M_{22.1}^{-1}M_{21}M_{11}^{-1}, \tag{9}
\]
established by Duncan [3]. It follows at once from the Banachiewicz identity (7), since both sides express the (1,1) block of M^{-1}.

The Duncan identity (9) and the Sherman-Morrison-Woodbury formula (1) are essentially equivalent. In fact, we can obtain the Duncan identity (9) if we replace A, U and V^* in (1) by M11, M12 and -M22^{-1}M21, respectively. These well-known matrix identities can be found in, for example, [3, 6, 9, 13].

The following lemma, which can be found in [13, 14], is interesting in its own right; here we still present a complete proof.

Lemma 2.1. Let M be a partitioned matrix defined as in (6), let
\[
L = \begin{pmatrix} L_{11} & 0 \\ L_{21} & L_{22} \end{pmatrix}, \qquad R = \begin{pmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{pmatrix}
\]
have the same block structure as M, and let \mathcal{R}(\cdot) denote the column space. Suppose that the blocks L11 and R11 are invertible. If

\[
\mathcal{R}(M_{12}) \subset \mathcal{R}(M_{11}), \tag{10}
\]
then

\[
(LM)_{22.1} = L_{22}M_{22.1}, \tag{11}
\]

\[
(MR)_{22.1} = M_{22.1}R_{22}, \tag{12}
\]

\[
(LMR)_{22.1} = L_{22}M_{22.1}R_{22}. \tag{13}
\]

In particular, if L22 = R22 = I, then

\[
(LMR)_{22.1} = M_{22.1}. \tag{14}
\]

Proof. On the assumption L22 = R22 = I, it is obvious that (14) is valid whenever (13) is true. We only need to prove (11), since (12) follows in the same way and (13) follows by combining (11) and (12). Firstly we assume that M11 is invertible. Since
\[
LM = \begin{pmatrix} L_{11}M_{11} & L_{11}M_{12} \\ L_{21}M_{11} + L_{22}M_{21} & L_{21}M_{12} + L_{22}M_{22} \end{pmatrix},
\]
then
\[
\begin{aligned}
(LM)_{22.1} &= L_{21}M_{12} + L_{22}M_{22} - (L_{21}M_{11} + L_{22}M_{21})(L_{11}M_{11})^{-1}L_{11}M_{12} \\
&= L_{21}M_{12} + L_{22}M_{22} - (L_{21}M_{11} + L_{22}M_{21})M_{11}^{-1}M_{12} \\
&= L_{22}M_{22} - L_{22}M_{21}M_{11}^{-1}M_{12} \\
&= L_{22}M_{22.1}.
\end{aligned}
\]

If M11 is singular, the condition (10) implies that the Schur complement M22.1 of M11 in M is unique (see [14]), and that \mathcal{R}(L_{11}M_{12}) \subset \mathcal{R}(L_{11}M_{11}), which shows that the Schur complement (LM)_{22.1} of L11M11 in LM is unique. So (11) is still valid. □
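As a quick numerical check of (11) with scalar blocks, take
\[
M = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad L = \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix},
\]
so that M_{22.1} = -2 and L_{22} = 1. Then
\[
LM = \begin{pmatrix} 2 & 4 \\ 4 & 6 \end{pmatrix}, \qquad (LM)_{22.1} = 6 - 4\cdot 2^{-1}\cdot 4 = -2 = L_{22}M_{22.1},
\]
as claimed.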

3 Main results

Now we state the main result of this paper.

Theorem 3.1. Let N be an n × n matrix partitioned with the same block structure as M in (6). If the blocks M11, N11 and M11N11 + M12N21 are invertible, then

\[
\begin{aligned}
M_{21}N_{12} + M_{22}N_{22} ={}& (M_{21}N_{11} + M_{22}N_{21})(M_{11}N_{11} + M_{12}N_{21})^{-1}(M_{11}N_{12} + M_{12}N_{22}) \\
&+ M_{22.1}\,(I + N_{21}N_{11}^{-1}M_{11}^{-1}M_{12})^{-1}N_{22.1}.
\end{aligned} \tag{15}
\]

Proof. Letting
\[
P = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}\begin{pmatrix} N_{11} & N_{12} \\ N_{21} & N_{22} \end{pmatrix}, \qquad
Q = \begin{pmatrix} I & 0 \\ -M_{21}M_{11}^{-1} & I \end{pmatrix} P \begin{pmatrix} I & -N_{11}^{-1}N_{12} \\ 0 & I \end{pmatrix},
\]
then
\[
Q = \begin{pmatrix} M_{11} & M_{12} \\ 0 & M_{22.1} \end{pmatrix}\begin{pmatrix} N_{11} & 0 \\ N_{21} & N_{22.1} \end{pmatrix}
= \begin{pmatrix} M_{11}N_{11} + M_{12}N_{21} & M_{12}N_{22.1} \\ M_{22.1}N_{21} & M_{22.1}N_{22.1} \end{pmatrix},
\]
and
\[
\begin{aligned}
P_{22.1} &= M_{21}N_{12} + M_{22}N_{22} - (M_{21}N_{11} + M_{22}N_{21})(M_{11}N_{11} + M_{12}N_{21})^{-1}(M_{11}N_{12} + M_{12}N_{22}), \\
Q_{22.1} &= M_{22.1}N_{22.1} - M_{22.1}N_{21}(M_{11}N_{11} + M_{12}N_{21})^{-1}M_{12}N_{22.1}.
\end{aligned}
\]
On the other hand, the Sherman-Morrison-Woodbury formula (1) implies

\[
(M_{11}N_{11} + M_{12}N_{21})^{-1} = N_{11}^{-1}M_{11}^{-1} - N_{11}^{-1}M_{11}^{-1}M_{12}\,(I + N_{21}N_{11}^{-1}M_{11}^{-1}M_{12})^{-1}N_{21}N_{11}^{-1}M_{11}^{-1}.
\]
Let E = N_{21}N_{11}^{-1}M_{11}^{-1}M_{12}. By the use of the basic relation E(I + E)^{-1}E - E = (I + E)^{-1} - I, we have

\[
\begin{aligned}
Q_{22.1} &= M_{22.1}N_{22.1} - M_{22.1}N_{21}N_{11}^{-1}M_{11}^{-1}M_{12}N_{22.1} \\
&\quad + M_{22.1}N_{21}N_{11}^{-1}M_{11}^{-1}M_{12}\,(I + N_{21}N_{11}^{-1}M_{11}^{-1}M_{12})^{-1}N_{21}N_{11}^{-1}M_{11}^{-1}M_{12}N_{22.1} \\
&= M_{22.1}(I + N_{21}N_{11}^{-1}M_{11}^{-1}M_{12})^{-1}N_{22.1}.
\end{aligned}
\]

From Lemma 2.1, P_{22.1} = Q_{22.1}, which implies the desired result. □

The matrix identity (15) and the Sherman-Morrison-Woodbury formula (1) are essentially equivalent. The above proof shows that the latter implies the former. Conversely, if we choose
\[
P = \begin{pmatrix} A & U \\ 0 & I \end{pmatrix}\begin{pmatrix} I & 0 \\ V^* & I \end{pmatrix}
\]
in the matrix identity (15), we can recover the Sherman-Morrison-Woodbury formula (1).
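As a quick numerical check of (15) with scalar blocks, take M11 = 1, M12 = 2, M21 = 3, M22 = 4 and N11 = 1, N12 = 1, N21 = 1, N22 = 2, so that M22.1 = -2 and N22.1 = 1. The left-hand side of (15) equals 3·1 + 4·2 = 11, and the right-hand side equals 7·(1/3)·5 + (-2)·(1/3)·1 = 33/3 = 11 as well.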

4 Applications

In what follows we show that many existing identities are in fact consequences of Theorem 3.1, obtained by making special choices of the matrix P. In general, we choose P such that P_{22.1} is a Hermitian matrix. The first choice of P is
\[
P = \begin{pmatrix} Y^* & X^* \\ B & A \end{pmatrix}\begin{pmatrix} Y & B^* \\ X & A^* \end{pmatrix},
\]
to give rise to the following matrix identity

\[
\begin{aligned}
AA^* + BB^* ={}& (BY + AX)(Y^*Y + X^*X)^{-1}(BY + AX)^* \\
&+ \bigl(A - B(Y^*)^{-1}X^*\bigr)\bigl(I + X(Y^*Y)^{-1}X^*\bigr)^{-1}\bigl(A - B(Y^*)^{-1}X^*\bigr)^*.
\end{aligned} \tag{16}
\]

A special case of (16), when Y is an identity matrix, is the identity (4). The second choice of P is
\[
P = \begin{pmatrix} Y^* & X^* \\ B & A \end{pmatrix}\begin{pmatrix} Y & -B^* \\ -X & A^* \end{pmatrix},
\]
to give rise to the following matrix identity

\[
\begin{aligned}
AA^* - BB^* ={}& \bigl(A - B(Y^*)^{-1}X^*\bigr)\bigl(I - X(Y^*Y)^{-1}X^*\bigr)^{-1}\bigl(A - B(Y^*)^{-1}X^*\bigr)^* \\
&- (BY - AX)(Y^*Y - X^*X)^{-1}(BY - AX)^*.
\end{aligned} \tag{17}
\]
A special case of (17), when A is equal to B, is
\[
A(I - XY^{-1})^*\bigl(I - X(Y^*Y)^{-1}X^*\bigr)^{-1}(I - XY^{-1})A^* = A(Y - X)(Y^*Y - X^*X)^{-1}(Y - X)^*A^*. \tag{18}
\]
When Y is an identity matrix, we obtain the identity (5) and
\[
A(I - X)^*(I - XX^*)^{-1}(I - X)A^* = A(I - X)(I - X^*X)^{-1}(I - X)^*A^* \tag{19}
\]
from (17) and (18), respectively. Furthermore, Hua's identity (3) can be obtained from the identity (5).
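Indeed, replacing A, B and X in (5) by I, B^* and A^*, respectively, gives
\[
I - B^*B = (I - B^*A)(I - A^*A)^{-1}(I - B^*A)^* - (B^* - A^*)(I - AA^*)^{-1}(B^* - A^*)^*,
\]
which is exactly Hua's identity (3), since (I - B^*A)^* = I - A^*B and (B^* - A^*)(I - AA^*)^{-1}(B^* - A^*)^* = (A - B)^*(I - AA^*)^{-1}(A - B).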

The third choice of P is
\[
P = \begin{pmatrix} Y^* & X^* \\ B^* & A^* \end{pmatrix}\begin{pmatrix} Y & A \\ X & B \end{pmatrix},
\]
to give rise to the matrix identities
\[
\begin{aligned}
B^*A + A^*B ={}& (B^*Y + A^*X)(Y^*Y + X^*X)^{-1}(Y^*A + X^*B) \\
&+ \bigl(A^* - B^*(Y^*)^{-1}X^*\bigr)\bigl(I + X(Y^*Y)^{-1}X^*\bigr)^{-1}(B - XY^{-1}A)
\end{aligned} \tag{20}
\]


and

\[
B^*A + A^*B = (B^* + A^*X)(I + X^*X)^{-1}(A + X^*B) + (A^* - B^*X^*)(I + XX^*)^{-1}(B - XA). \tag{21}
\]
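As a quick consistency check of (21), set A = 0: the right-hand side becomes B^*(I + X^*X)^{-1}X^*B - B^*X^*(I + XX^*)^{-1}B, which vanishes because (I + X^*X)^{-1}X^* = X^*(I + XX^*)^{-1}, in agreement with the left-hand side.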

The fourth choice of P is
\[
P = \begin{pmatrix} Y^* & X^* \\ B^* & A^* \end{pmatrix}\begin{pmatrix} Y & -A \\ -X & B \end{pmatrix},
\]

to give rise to the matrix identities

\[
\begin{aligned}
A^*B - B^*A ={}& (B^*Y - A^*X)(Y^*Y - X^*X)^{-1}(X^*B - Y^*A) \\
&+ \bigl(A^* - B^*(Y^*)^{-1}X^*\bigr)\bigl(I - X(Y^*Y)^{-1}X^*\bigr)^{-1}(B - XY^{-1}A)
\end{aligned} \tag{22}
\]

and

\[
A^*B - B^*A = (B^* - A^*X)(I - X^*X)^{-1}(X^*B - A) + (A^* - B^*X^*)(I - XX^*)^{-1}(B - XA). \tag{23}
\]

Of course, we may also obtain generalized matrix identities without Hermitian constraints from the formula (15). For instance, we can choose

\[
P = \begin{pmatrix} Y^* & X^* \\ A & B \end{pmatrix}\begin{pmatrix} B^* & X \\ A^* & Y \end{pmatrix}
\]

such that

\[
\begin{aligned}
AX + BY ={}& (AB^* + BA^*)(Y^*B^* + X^*A^*)^{-1}(Y^*X + X^*Y) \\
&+ \bigl(B - A(Y^*)^{-1}X^*\bigr)\bigl(I + A^*(Y^*B^*)^{-1}X^*\bigr)^{-1}\bigl(Y - A^*(B^*)^{-1}X\bigr).
\end{aligned} \tag{24}
\]

References

[1] Aitken, A.C. Determinants and Matrices. University Mathematical Texts, Oliver & Boyd, Edinburgh, 1939. (2nd-9th editions, 1942-1956; 9th edition reset and reprinted, 1967; reprint edition: Greenwood Press, Westport, Connecticut, 1983.)

[2] Banachiewicz, T. Zur Berechnung der Determinanten, wie auch der Inversen, und zur darauf basierten Auflösung der Systeme linearer Gleichungen. Acta Astronomica, Série C, 3 (1937), 41-67.

[3] Duncan, W.J. Some devices for the solution of large sets of simultaneous linear equations. (With an appendix on the reciprocation of partitioned matrices.) The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, Seventh Series, 35 (1944), 660-670.

[4] Hager, W.W. Updating the inverse of a matrix. SIAM Review, 31 (1989), 221-239.


[5] Hua, L.K. Additive Theory of Prime Numbers (translated by N.B. Ng), Translations of Mathematical Monographs, 13, Amer. Math. Soc., Providence, RI, 1965.

[6] Ouellette, D.V. Schur complements and statistics. Linear Algebra Appl., 36 (1981), 187-295.

[7] Schur, I. Über Potenzreihen, die im Innern des Einheitskreises beschränkt sind [I]. Journal für die reine und angewandte Mathematik, 147 (1917), 205-232.

[8] Sherman, J. and Morrison, W.J. Adjustment of an inverse matrix corresponding to changes in the elements of a given column or a given row of the original matrix. The Annals of Mathematical Statistics, 21 (1950), 124-127.

[9] Styan, G.P.H. Schur complements and linear statistical models, in: Pukkila, S. Puntanen (Eds.), Proceedings of the First International Tampere Seminar on Linear Statistical Models and their Applications, University of Tampere, Finland, 1985, pp. 37-75.

[10] Vemuri, B.C. and Lai, S.H. A fast solution to the surface reconstruction problem. In SPIE Conference on Mathematical Imaging: Geometric Methods in Computer Vision, 27-37, San Diego, CA, July 1993, SPIE.

[11] Woodbury, M.A. Inverting modified matrices. Memorandum Report 42, Statistical Research Group, Institute for Advanced Study, Princeton, New Jersey, June 14, 1950.

[12] Yan, Z. Schur complements and inequalities. Journal of Mathematical Inequalities, Vol. 3, No. 2 (2009), 161-167.

[13] Zhang, F. The Schur Complement and Its Applications. Springer, New York, 2005.

[14] Zhang, F. Hua's matrix equality and Schur complements. International Journal of Information and Systems Sciences, Vol. 1, No. 1 (2008), 124-135.
