How and Why to Estimate Condition

HowResearch and Why Matters to Estimate ConditionFebruary Numbers 25, for 2009 Matrix Functions

NickNick Higham Higham DirectorSchool of of Research Mathematics TheSchool University of Mathematics of Manchester http://www.maths.manchester.ac.uk/~higham @nhigham, nickhigham.wordpress.com

Joint work with Sam Relton and Lijing Lin

Householder Symposium XIX Spa, Belgium, June 8-13, 2014 1 / 6 The Big Picture

(2) f Lf Lf

Cmpw cond(f ) cond(L ) cond(f ) f

cond[2](f )

Nick Higham Matrix Function Condition Numbers 2 / 29 Examples:

2 2 2 (A + E) − A = AE + EA + E ⇒ Lx2 (A, E) = AE + EA.

(A + E)−1 − A−1 = −A−1EA−1 + O(E 2)

−1 −1 ⇒ Lx−1 (A, E) = −A EA .

0 Scalar case: Lf (x, e) = f ( x)e.

Fréchet Derivative

Fréchet derivative of f : Cn×n → Cn×n at X ∈ Cn×n n×n n×n n×n A linear mapping Lf : C → C s.t. for all E ∈ C

f (A + E) − f (A) − Lf (A, E) = o(kEk).

Nick Higham Matrix Function Condition Numbers 3 / 29 0 Scalar case: Lf (x, e) = f ( x)e.

Fréchet Derivative

Fréchet derivative of f : Cn×n → Cn×n at X ∈ Cn×n n×n n×n n×n A linear mapping Lf : C → C s.t. for all E ∈ C

f (A + E) − f (A) − Lf (A, E) = o(kEk).

Examples:

2 2 2 (A + E) − A = AE + EA + E ⇒ Lx2 (A, E) = AE + EA.

(A + E)−1 − A−1 = −A−1EA−1 + O(E 2)

−1 −1 ⇒ Lx−1 (A, E) = −A EA .

Nick Higham Matrix Function Condition Numbers 3 / 29 Fréchet Derivative

Fréchet derivative of f : Cn×n → Cn×n at X ∈ Cn×n n×n n×n n×n A linear mapping Lf : C → C s.t. for all E ∈ C

f (A + E) − f (A) − Lf (A, E) = o(kEk).

Examples:

2 2 2 (A + E) − A = AE + EA + E ⇒ Lx2 (A, E) = AE + EA.

(A + E)−1 − A−1 = −A−1EA−1 + O(E 2)

−1 −1 ⇒ Lx−1 (A, E) = −A EA .

0 Scalar case: Lf (x, e) = f ( x)e.

Nick Higham Matrix Function Condition Numbers 3 / 29 Applications of Fréchet Derivative

First and second Fréchet derivative in Halley’s method on Banach space (2003).

Matrix geometric mean computation (2012).

Model reduction (2012).

Computation of correlated choice probabilities (2013).

Registration of MRI images (2013).

Markov models applied to cancer data (2014).

Computing linearized backward errors for matrix functions (2014).

Nick Higham Matrix Function Condition Numbers 4 / 29 Computing Lf : Methods for Speciﬁc f

exponential Kenney & Laub (1998), Al-Mohy & H (2009). logarithm Al-Mohy, H & Relton (2013). fractional power H & Lin (2013).

Latter three methods obtained by differentiating alg for f .

Nick Higham Matrix Function Condition Numbers 5 / 29 Computing Lf : General Methods

Theorem (Mathias, 1996) If f is 2n − 1 times ctsly diffble,

AE f (A) L (A, E) f = f . 0 A 0 f (A)

Complex step method . Assume f : Rn×n → Rn×n and A, E ∈ Rn×n. Then (H & Al-Mohy, 2010):

f (A + i hE) L (A, E) ≈ Im . f h

Nick Higham Matrix Function Condition Numbers 6 / 29 Componentwise Sensitivity (1)

T Let E = ei ej . Then

( + T ) − ( ) f A ei ej f A T lim = Lf (A, ei ej ). →0

T (Lf (A, ei ej ))rs measures the sensitivity of f (A)rs to perturbations in aij .

Nick Higham Matrix Function Condition Numbers 7 / 29 In progress: sensitivity of y 0 = A(t)y (nuclear physics).

Componentwise Sensitivity (2)

Since Lf is a linear operator,

vec(Lf (A, E)) = Kf (A)vec(E)

n2×n2 where Kf (A) ∈ C is the Kronecker form of Lf (A).

a11 a21 a31 ... ann f11 f21

f31 Kf (A) ≡ . T . Lf (A, ei ej ) fnn

Nick Higham Matrix Function Condition Numbers 8 / 29 Componentwise Sensitivity (2)

Since Lf is a linear operator,

vec(Lf (A, E)) = Kf (A)vec(E)

n2×n2 where Kf (A) ∈ C is the Kronecker form of Lf (A).

a11 a21 a31 ... ann f11 f21

f31 Kf (A) ≡ . T . Lf (A, ei ej ) fnn

In progress: sensitivity of y 0 = A(t)y (nuclear physics).

Nick Higham Matrix Function Condition Numbers 8 / 29 0 Scalar case: condabs(f , x) = |f (x)|.

Absolute Condition Number

kf (A + E) − f (A)k condabs(f , A) = lim sup . →0 kEk≤

kLf (A, E)k kLf (A)k := max . E6=0 kEk

Lemma

condabs(f , A) = kLf (A)k.

Nick Higham Matrix Function Condition Numbers 9 / 29 Absolute Condition Number

kf (A + E) − f (A)k condabs(f , A) = lim sup . →0 kEk≤

kLf (A, E)k kLf (A)k := max . E6=0 kEk

Lemma

condabs(f , A) = kLf (A)k.

0 Scalar case: condabs(f , x) = |f (x)|.

Nick Higham Matrix Function Condition Numbers 9 / 29 Relative Condition Number

kf (A + E) − f (A)k condrel(f , A) := lim sup →0 kEk≤kAk kf (A)k kAk = cond (f , A) . abs kf (A)k

Nick Higham Matrix Function Condition Numbers 10 / 29 The Setup

3 Assume we have O(n ) algs for f (A) and Lf (A, E).

4 Kf (A) has n entries.

5 Computing Kf (A) requires O(n ) ﬂops.

Wish to estimate norms and condition numbers in O(n3) ﬂops.

In practice we only ever work with n × n matrices.

Nick Higham Matrix Function Condition Numbers 11 / 29 Condition Estimation (1)

Can show kL (A)k f 1 ≤ kK (A)k ≤ nkL (A)k . n f 1 f 1 So problem reduces to matrix norm estimation.

Use the block 1-norm estimator of H & Tisseur (2000). ∗ For kBk1 it needs Bx and B y for several x and y. Here,

Kf (A)x = vec(Lf (A, X)), vec(X) = x, ∗ ? Kf (A) y = vec(Lf (A, Y )), vec(Y ) = y.

Nick Higham Matrix Function Condition Numbers 12 / 29 Condition Estimation (2)

Theorem (H & Lin, 2013) Let f ∈ C2n−1 and ef (z) := f (z). If ef (A)∗ = ef (A∗) for all A ∈ Cn×n then ( )∗ = ( ( , ∗)∗) ( ) = Kf A x vec Lef A X , where vec X x.

For most functions of interest, ef = f , and we can deduce that ∗ ∗ ∗ Kf (A) x = vec(Lf (A, X ) ).

Nick Higham Matrix Function Condition Numbers 13 / 29 kth Fréchet derivative deﬁned by

(k−1) (k−1) Lf (A+Ek , E1,..., Ek−1) − Lf (A, E1,..., Ek−1) (k) − Lf (A, E1,..., Ek ) = o(kEk k).

Higher Derivatives

Needed for level-2 condition number, understanding accuracy of algorithms for computing Lf (A, E). (2) Second Fréchet derivative Lf (A, E1, E2) is unique multilinear function of E1, E2 satisfying

(2) Lf (A+E2, E1) − Lf (A, E1) − Lf (A, E1, E2) = o(kE2k).

Nick Higham Matrix Function Condition Numbers 14 / 29 Higher Derivatives

Needed for level-2 condition number, understanding accuracy of algorithms for computing Lf (A, E). (2) Second Fréchet derivative Lf (A, E1, E2) is unique multilinear function of E1, E2 satisfying

(2) Lf (A+E2, E1) − Lf (A, E1) − Lf (A, E1, E2) = o(kE2k).

kth Fréchet derivative deﬁned by

(k−1) (k−1) Lf (A+Ek , E1,..., Ek−1) − Lf (A, E1,..., Ek−1) (k) − Lf (A, E1,..., Ek ) = o(kEk k).

Nick Higham Matrix Function Condition Numbers 14 / 29 Existing Literature

Large literature on Fréchet derivatives in Banach space. Need specialized results for matrix functions.

Nick Higham Matrix Function Condition Numbers 15 / 29 Existence & Continuity of Fréchet Derivatives

D = open subset of C. Cn×n(D, p) = matrices with spectrum in D and largest Jordan block of size p.

Theorem (H & Relton, 2013)

k If f ∈ C2 p−1(D) and A ∈ Cn×n(D, p) then the kth Fréchet (k) (k) derivative Lf (A) exists and Lf (A, E1,..., Ek ) is continuous n×n in A and E1,..., Ek ∈ C .

Proof uses Gâteaux derivative. k = 1: Mathias (1996). Open problem: necessary conditions.

Nick Higham Matrix Function Condition Numbers 16 / 29 Properties

Assume from now on conditions of theorem satisﬁed. Then the Ei are interchangeable:

(2) (2) Lf (A, E1, E2) = Lf (A, E2, E1). Indeed

(k) ∂ Lf (A, E1,..., Ek ) = f (A + s1E1 + ··· + sk Ek ). ∂s1 ··· ∂sk s=0

Nick Higham Matrix Function Condition Numbers 17 / 29 (2) How to Compute Lf h i AE1 f (A) Lf (A,E) X1 = 0 A . Know f (X1) = 0 f (A) . Let   AE1 E2 0 0 1  0 A 0 E2  X2 = I2 ⊗ X1 + ⊗ E2 =   . 0 0  0 0 AE1  | {z } 0 0 0 A “nivellateur” Then

 (2)  f (A) Lf (A, E1) Lf (A, E2) Lf (A, E1, E2)  0 f (A) 0 Lf (A, E2)  f (X2) =   .  0 0 f (A) Lf (A, E1)  0 0 0 f (A)

Nick Higham Matrix Function Condition Numbers 18 / 29 (k) How to Compute Lf

2i n×2i n Deﬁne Xi ∈ C by 0 1 X = I ⊗ X + ⊗ I i−1 ⊗ E , X = A. i 2 i−1 0 0 2 i 0

Theorem (H & Relton, 2013) (k) The (1, n) block of f (Xk ) is Lf (A, E1,..., Ek ).

Nick Higham Matrix Function Condition Numbers 19 / 29 (2) f Lf Lf

Cmpw cond(f ) cond(L ) cond(f ) f

cond[2](f )

Nick Higham Matrix Function Condition Numbers 20 / 29 Condition Number of the Fréchet Derivative

kLf (A + ∆A, E + ∆E) − Lf (A, E)k condrel(Lf , A, E) = lim sup →0 k∆Ak≤kAk kLf (A, E)k k∆Ek≤kEk

Theorem (H & Relton, 2013)

max(condabs(f , A), sM)r ≤ condrel(Lf , A, E)

≤ (condabs(f , A) + sM)r,

kAk kEk (2) s = , r = ,M = max kLf (A, E, ∆A)k. kEk kLf (A, E)k k∆Ak=1

Hence we can compute condrel(Lf , A, E) to within a factor 2.

Nick Higham Matrix Function Condition Numbers 21 / 29 3 Get O(n ) estimate of condrel(Lf , A, E) within factor ≈ 6n.

Estimating condrel(Lf , A, E)

(2) n4×n2 Need second Kronecker form satisfying Kf (A) ∈ C

(2) T (2) vec(Lf (A, E1, E2)) = (vec(E1) ⊗ In2 )Kf (A)vec(E2).

T (2) Can show M is within factor n of k(vec(E) ⊗ In2 )Kf (A)k1. Theorem (H & Relton 2013) For most functions of interest,

∗ h T (2) i ∗ T (2) ∗ (vec(E) ⊗ In2 )Kf (A) = (vec(E ) ⊗ In2 )Kf (A ).

Nick Higham Matrix Function Condition Numbers 22 / 29 Estimating condrel(Lf , A, E)

(2) n4×n2 Need second Kronecker form satisfying Kf (A) ∈ C

(2) T (2) vec(Lf (A, E1, E2)) = (vec(E1) ⊗ In2 )Kf (A)vec(E2).

T (2) Can show M is within factor n of k(vec(E) ⊗ In2 )Kf (A)k1. Theorem (H & Relton 2013) For most functions of interest,

∗ h T (2) i ∗ T (2) ∗ (vec(E) ⊗ In2 )Kf (A) = (vec(E ) ⊗ In2 )Kf (A ).

3 Get O(n ) estimate of condrel(Lf , A, E) within factor ≈ 6n.

Nick Higham Matrix Function Condition Numbers 22 / 29 Level-2 Condition Number

“Condition number of the condition number”. Demmel (1987) showed that for matrix inversion (and D. J. Higham, 1995), the eigenproblem, polynomial zero-ﬁnding, pole assignment in linear control problems (relative) level-1 and level-2 cond no’s are equivalent. Cheung & Cucker (2005) show same holds when “condition number = 1 / distance to nearest ill-posed problem”.

Nick Higham Matrix Function Condition Numbers 23 / 29 Level-2 Condition Number

[2] |condabs(f , A + Z ) − condabs(f , A)| condabs(f , A) = lim sup . →0 kZk≤

Theorem (H & Relton, 2013) In the Frobenius norm,

[2] (2) condabs(f , A) ≤ kKf (A)k2.

Nick Higham Matrix Function Condition Numbers 24 / 29 Theorem Let A ∈ Cn×n be normal. Then in the 2-norm

[2] 1 ≤ condrel(exp, A) ≤ 2condrel(exp, A) + 1.

Level-2 Condition Number: Exponential

Theorem Let A ∈ Cn×n be normal. Then in the 2-norm

[2] condabs(exp, A) = condabs(exp, A).

Nick Higham Matrix Function Condition Numbers 25 / 29 Level-2 Condition Number: Exponential

Theorem Let A ∈ Cn×n be normal. Then in the 2-norm

[2] condabs(exp, A) = condabs(exp, A).

Theorem Let A ∈ Cn×n be normal. Then in the 2-norm

[2] 1 ≤ condrel(exp, A) ≤ 2condrel(exp, A) + 1.

Nick Higham Matrix Function Condition Numbers 25 / 29 Level-2 Condition Number: Hermitian A

Theorem (H & Relton, 2013)

Let A ∈ Cn×n be Hermitian and let f : R → R have a strictly monotonic derivative. Then in the Frobenius norm,

0 condabs(f , A) = max |f (λi )|. i Moreover, if the maximum is attained uniquely for i = k then

[2] 00 condabs(f , A) ≥ |f (λk )|.

Applies to log, square root, inverse. For inverse the inequality is an equality.

Nick Higham Matrix Function Condition Numbers 26 / 29 Logarithm Experiment

10 10 cond (log,A) abs

8 lvl2_bnd 10 cond (log,A)2 abs

6 10

4 10

2 10

0 10

−2 10 0 5 10 15 20 25 30 35 40 45 50

Nick Higham Matrix Function Condition Numbers 27 / 29 Summary

f L (2) f Lf O(n3) O(n3) O(n3)

cond(f ) Cmpw (2) Kf O(n3) cond(f ) O(n7)

[2] cond (f ) cond(Lf ) Kf O(n3) u.b. O(n3) O(n5)

Nick Higham Matrix Function Condition Numbers 28 / 29 Open Questions

Necessary conditions for existence of Lf ?

Quality of upper bound for cond[2](f )?

Relation between level-1 and level-2 condition numbers?

How can we exploit the symmetry of (k) Lf (E1, E2,..., Ek ) in the Ei ?

Nick Higham Matrix Function Condition Numbers 29 / 29 ReferencesI

S. D. Ahipasaoglu, X. Li, and K. Natarajan. A convex optimization approach for computing correlated choice probabilities with many alternatives. Preprint 4034, Optimization Online, 2013. 26 pp. A. H. Al-Mohy and N. J. Higham. Computing the Fréchet derivative of the matrix exponential, with an application to condition number estimation. SIAM J. Matrix Anal. Appl., 30(4):1639–1657, 2009.

Nick Higham Matrix Function Condition Numbers 1 / 11 ReferencesII

A. H. Al-Mohy, N. J. Higham, and S. D. Relton. Computing the Fréchet derivative of the matrix logarithm and estimating the condition number. SIAM J. Sci. Comput., 35(4):C394–C410, 2013. S. Amat, S. Busquier, and J. M. Gutiérrez. Geometric constructions of iterative functions to solve nonlinear equations. J. Comput. Appl. Math., 157(1):197–205, 2003. J. Ashburner and G. R. Ridgway. Symmetric diffeomorphic modelling of longitudinal structural MRI. Frontiers in Neuroscience, 6, 2013. Article 197.

Nick Higham Matrix Function Condition Numbers 2 / 11 References III

D. Cheung and F. Cucker. A note on level-2 condition numbers. J. Complexity, 21:314–319, 2005. E. Deadman and N. J. Higham. Testing matrix function algorithms using identities. MIMS EPrint 2014.13, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, Mar. 2014. 15 pp. J. W. Demmel. On condition numbers and the distance to the nearest ill-posed problem. Numer. Math., 51:251–289, 1987.

Nick Higham Matrix Function Condition Numbers 3 / 11 ReferencesIV

B. García-Mora, C. Santamaría, G. Rubio, and J. L. Pontones. Computing survival functions of the sum of two independent Markov processes. An application to bladder carcinoma treatment. Int. Journal of Computer Mathematics, 91(2):209–220, 2014. D. J. Higham. Condition numbers and their condition numbers. Linear Algebra Appl., 214:193–213, 1995.

Nick Higham Matrix Function Condition Numbers 4 / 11 ReferencesV

N. J. Higham. Functions of Matrices: Theory and Computation. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2008. ISBN 978-0-898716-46-7. xx+425 pp. N. J. Higham and A. H. Al-Mohy. Computing matrix functions. Acta Numerica, 19:159–208, 2010. N. J. Higham and L. Lin. An improved Schur–Padé algorithm for fractional powers of a matrix and their Fréchet derivatives. SIAM J. Matrix Anal. Appl., 34(3):1341–1360, 2013.

Nick Higham Matrix Function Condition Numbers 5 / 11 ReferencesVI

N. J. Higham and S. D. Relton. Estimating the condition number of the Fréchet derivative of a matrix function. MIMS EPrint 2013.84, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, Dec. 2013. 18 pp.

Nick Higham Matrix Function Condition Numbers 6 / 11 References VII

N. J. Higham and S. D. Relton. Higher order Fréchet derivatives of matrix functions and the level-2 condition number. MIMS EPrint 2013.58, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, Nov. 2013. 19 pp. Revised April 2014. To appear in SIAM J. Matrix Anal. Appl. B. Jeuris, R. Vandebril, and B. Vandereycken. A survey and comparison of contemporary algorithms for computing the matrix geometric mean. Electron. Trans. Numer. Anal., 39:379–402, 2012.

Nick Higham Matrix Function Condition Numbers 7 / 11 References VIII

C. S. Kenney and A. J. Laub. A Schur–Fréchet algorithm for computing the logarithm and exponential of a matrix. SIAM J. Matrix Anal. Appl., 19(3):640–663, 1998. R. Mathias. A chain rule for matrix functions and applications. SIAM J. Matrix Anal. Appl., 17(3):610–620, 1996.

Nick Higham Matrix Function Condition Numbers 8 / 11 ReferencesIX

D. Petersson. A Nonlinear Optimization Approach to H2-Optimal Modeling and Control. PhD thesis, Department of Electrical Engineering, Linköping University, Sweden, SE-581 83 Linköping, Sweden, 2013. Dissertation No. 1528. D. Petersson and J. Löfberg. Model reduction using a frequency-limited H2-cost. Technical report, Department of Electrical Engineering, Linköpings Universitet, SE-581 83, Linköping, Sweden, Mar. 2012. 13 pp.

Nick Higham Matrix Function Condition Numbers 9 / 11 Extra slides Level-2 Condition Number: Matrix Inverse

Theorem

For nonsingular A ∈ Cn×n,

[2] −1 −1 3/2 condabs(x , A) = 2condabs(x , A) .

Latter is what is expected from the scalar case: f (x) = x −1 ⇒ |f 00| = |2(f 0)3/2|. Have experimental evidence that matrix case can be similar to scalar case.

Nick Higham Matrix Function Condition Numbers 11 / 11