How and Why to Estimate Condition Numbers for Matrix Functions

Nick Higham
School of Mathematics, The University of Manchester
http://www.maths.manchester.ac.uk/~higham
@nhigham, nickhigham.wordpress.com

Joint work with Sam Relton.

Computational Linear Algebra and Optimization for the Digital Economy, Edinburgh, October 2013

Sources of Error

$f : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$. We want to compute $f(A)$ but are given $A + \Delta A$, not $A$.

$\Delta A$ may come from
- rounding errors in storing $A$,
- measurement errors,
- errors from an earlier computation,
- photocopying errors!

Photocopier

http://en.wikipedia.org/wiki/Photocopier

“Most current photocopiers use a technology called xerography, a dry process that uses electrostatic charges on a light sensitive photoreceptor to first attract and then transfer toner particles (a powder) onto paper in the form of an image”

“There is an increasing trend for new photocopiers to adopt digital technology, thus replacing the older analog technology. With digital copying, the copier effectively consists of an integrated scanner and laser printer.”

Xerox WorkCentre 7535, 7556

August 2013: German computer scientist David Kriesel discovered that the machines often change “6” to “8”. The JBIG2 compression algorithm is implicated.

Maxim

Every code for solving a numerical problem should return an error estimate or bound with the computed result.
Conditioning of f(x)

Scalar function $f \in C^2$, and $y = f(x)$, $y + \Delta y = f(x + \Delta x)$. Then
\[ \frac{\Delta y}{y} = \frac{x f'(x)}{f(x)} \, \frac{\Delta x}{x} + O(\Delta x^2). \]

The relative condition number is
\[ c(x) = \left| \frac{x f'(x)}{f(x)} \right|. \]

Condition number of the condition number: assuming $x f'(x) > 0$ and $f(x) > 0$,
\[ c^{[2]}(x) := \left| \frac{x c'(x)}{c(x)} \right| = \left| 1 + x \left( \frac{f''(x)}{f'(x)} - \frac{f'(x)}{f(x)} \right) \right|. \]

Fréchet Derivative

The Fréchet derivative of $f : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ at $X \in \mathbb{C}^{n\times n}$ is a linear mapping $L_f : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ such that, for all $E \in \mathbb{C}^{n\times n}$,
\[ f(X + E) - f(X) - L_f(X, E) = o(\|E\|). \]

Examples:
\[ (X + E)^2 - X^2 = XE + EX + E^2 \;\Rightarrow\; L_{x^2}(X, E) = XE + EX. \]
\[ (X + E)^{-1} - X^{-1} = -X^{-1} E X^{-1} + O(E^2) \;\Rightarrow\; L_{x^{-1}}(X, E) = -X^{-1} E X^{-1}. \]

Scalar case: $L_f(x, e) = f'(x)e$.
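The defining property of the Fréchet derivative can be checked numerically for the two examples above. The following is a minimal sketch, not part of the talk: the test matrices, the step sizes, and the helper names `L_square`, `L_inverse`, `square`, and `remainder` are all illustrative assumptions. If the formulas for $L_{x^2}$ and $L_{x^{-1}}$ are right, the remainder $f(X + tE) - f(X) - L_f(X, tE)$ should shrink roughly like $t^2$ as $t$ decreases.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
# Hypothetical test data: X is a small perturbation of n*I so it is safely invertible.
X = n * np.eye(n) + 0.25 * rng.standard_normal((n, n))
E = rng.standard_normal((n, n))


def L_square(X, E):
    """Frechet derivative of f(X) = X^2 in the direction E."""
    return X @ E + E @ X


def L_inverse(X, E):
    """Frechet derivative of f(X) = X^{-1} in the direction E."""
    Xinv = np.linalg.inv(X)
    return -Xinv @ E @ Xinv


def square(M):
    return M @ M


def remainder(f, L, X, E, t):
    """Frobenius norm of f(X + tE) - f(X) - L(X, tE)."""
    return np.linalg.norm(f(X + t * E) - f(X) - L(X, t * E))


# Reducing t by a factor 10 should reduce each remainder by about a factor 100.
for t in (1e-2, 1e-3, 1e-4):
    r_sq = remainder(square, L_square, X, E, t)
    r_inv = remainder(np.linalg.inv, L_inverse, X, E, t)
    print(f"t = {t:.0e}: square {r_sq:.3e}, inverse {r_inv:.3e}")
```

For the square the remainder is exactly $t^2 E^2$, so the quadratic decay is exact up to rounding; for the inverse it holds only to leading order in $t$.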
Gâteaux Derivative

The Gâteaux derivative of the matrix function $f$ at $A$ in the direction $E$ is
\[ G_f(A, E) = \lim_{\epsilon \to 0} \frac{f(A + \epsilon E) - f(A)}{\epsilon}. \]

This is a weaker notion of differentiability.

Applications of Fréchet Derivative

- Computation of correlated choice probabilities (2013),
- registration of MRI images (2013),
- Markov models applied to cancer data (2013),
- matrix geometric mean computation (2012),
- model reduction (2012),
- first and second Fréchet derivative in Halley’s method on Banach space (2003).

Componentwise Sensitivity (1)

Let $E = e_i e_j^T$. Then
\[ \lim_{\epsilon \to 0} \frac{f(X + \epsilon e_i e_j^T) - f(X)}{\epsilon} = L_f(X, e_i e_j^T). \]

$(L_f(X, e_i e_j^T))_{rs}$ measures the sensitivity of $f(X)_{rs}$ to perturbations in the $(i,j)$ entry of $X$.

Componentwise Sensitivity (2)

Since $L_f$ is a linear operator,
\[ \mathrm{vec}(L_f(A, E)) = K_f(A)\,\mathrm{vec}(E), \]
where $K_f(A) \in \mathbb{C}^{n^2 \times n^2}$ is the Kronecker form of the Fréchet derivative.

[Diagram: the columns of $K_f(A)$ are labelled by the entries $a_{11}, a_{21}, a_{31}, \dots, a_{nn}$ of $A$ and the rows by the entries $f_{11}, f_{21}, f_{31}, \dots, f_{nn}$ of $f(A)$; the column labelled $a_{ij}$ holds $L_f(A, e_i e_j^T)$.]

Condition Number

\[ \mathrm{cond}_{\mathrm{abs}}(f, A) = \lim_{\epsilon \to 0} \sup_{\|E\| \le \epsilon} \frac{\|f(A + E) - f(A)\|}{\epsilon}. \]
\[ \|L_f(A)\| := \max_{E \ne 0} \frac{\|L_f(A, E)\|}{\|E\|}. \]

Lemma: $\mathrm{cond}_{\mathrm{abs}}(f, A) = \|L_f(A)\|$.

Scalar case: $\mathrm{cond}_{\mathrm{abs}}(f, x) = |f'(x)|$.
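For small $n$ the Kronecker form can be built explicitly, one column per unit matrix $e_i e_j^T$, and the absolute condition number read off from it. The sketch below is an illustration rather than the method of the talk. It assumes $f = \exp$, uses SciPy's `scipy.linalg.expm_frechet` to evaluate $L_f$, takes vec to be column-major stacking, and works in the Frobenius norm, for which $\|L_f(A)\|_F = \|K_f(A)\|_2$. The helper names `vec` and `kronecker_form` are hypothetical.

```python
import numpy as np
from scipy.linalg import expm_frechet


def vec(M):
    """Stack the columns of M into one long vector (column-major order)."""
    return M.flatten(order="F")


def kronecker_form(A):
    """Explicit n^2-by-n^2 Kronecker form K_f(A) for f = exp (O(n^5) work)."""
    n = A.shape[0]
    K = np.empty((n * n, n * n))
    for j in range(n):
        for i in range(n):
            E = np.zeros((n, n))
            E[i, j] = 1.0  # E = e_i e_j^T
            # The column matching entry a_ij sits at column-major index j*n + i.
            K[:, j * n + i] = vec(expm_frechet(A, E, compute_expm=False))
    return K


rng = np.random.default_rng(1)
n = 3
A = rng.standard_normal((n, n))  # hypothetical test matrix
K = kronecker_form(A)

# Check vec(L_f(A, E)) = K_f(A) vec(E) for a random direction E.
E = rng.standard_normal((n, n))
L = expm_frechet(A, E, compute_expm=False)
print("vec identity holds:", np.allclose(K @ vec(E), vec(L)))

# Absolute condition number in the Frobenius norm: the largest singular value of K.
cond_abs = np.linalg.norm(K, 2)

# The leading right singular vector, reshaped, is a worst-case direction of unit norm.
_, _, Vh = np.linalg.svd(K)
E_worst = Vh[0].reshape((n, n), order="F")
growth = np.linalg.norm(expm_frechet(A, E_worst, compute_expm=False), "fro")
print("cond_abs:", cond_abs, "attained growth:", growth)
```

Forming $K_f(A)$ explicitly needs $n^2$ Fréchet derivative evaluations, so this is only practical for small matrices.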
Relative Condition Number

\[ \mathrm{cond}_{\mathrm{rel}}(f, A) := \lim_{\epsilon \to 0} \sup_{\|E\| \le \epsilon \|A\|} \frac{\|f(A + E) - f(A)\|}{\epsilon \|f(A)\|} = \mathrm{cond}_{\mathrm{abs}}(f, A) \, \frac{\|A\|}{\|f(A)\|}. \]

Computing $L_f$: Methods for Specific f

- exponential: Kenney & Laub (1998), Al-Mohy & Higham (2009)
- logarithm: Al-Mohy, Higham & Relton (2013)
- fractional power: Higham & Lin (2013)

The latter three methods are obtained by differentiating the algorithm for $f$.

Computing $L_f$: Via 2n × 2n Matrix

Theorem (Mathias, 1996). If $f$ is $2n - 1$ times continuously differentiable,
\[ f\left( \begin{bmatrix} A & E \\ 0 & A \end{bmatrix} \right) = \begin{bmatrix} f(A) & L_f(A, E) \\ 0 & f(A) \end{bmatrix}. \]

Note that $L_f(A, \alpha E) = \alpha L_f(A, E)$, but $\alpha$ may affect the algorithm used for the evaluation.

Computing $L_f$: Complex Step

Assume that $f : \mathbb{R}^{n\times n} \to \mathbb{R}^{n\times n}$ and $A, E \in \mathbb{R}^{n\times n}$. Then
\[ f(A + ihE) - f(A) - ih L_f(A, E) = o(h). \]

Thus (Al-Mohy & Higham, 2010)
\[ f(A) \approx \mathrm{Re}\, f(A + ihE), \qquad L_f(A, E) \approx \mathrm{Im}\, \frac{f(A + ihE)}{h}. \]

- The errors are $O(h^2)$.
- $h$ is not restricted by floating point arithmetic considerations; we can take $h = 10^{-100}$.
- The algorithm for $f$ must not employ complex arithmetic.

Condition Estimation (1)

Since $L_f$ is a linear operator,
\[ \mathrm{vec}(L_f(A, E)) = K_f(A)\,\mathrm{vec}(E), \]
where $K_f(A) \in \mathbb{C}^{n^2 \times n^2}$ is the Kronecker form of the Fréchet derivative.

It can be shown that
\[ \|L_f(A)\|_F = \|K_f(A)\|_2, \qquad \frac{\|L_f(A)\|_1}{n} \le \|K_f(A)\|_1 \le n \|L_f(A)\|_1. \]

So the problem reduces to matrix norm estimation.

Condition Estimation (2)

Use the block 1-norm estimator of Higham & Tisseur (2000). For $\|B\|_1$ it needs $Bx$ and $B^* y$ for several $x$ and $y$.

Let $\mathrm{vec}(X) = x$. Then $K_f(A) x = \mathrm{vec}(L_f(A, X))$.

Theorem (Higham & Lin, 2013). Let $f \in C^{2n-1}$ and $\tilde f(z) := \overline{f(\bar z)}$. If $\tilde f(A)^* = \tilde f(A^*)$ for all $A \in \mathbb{C}^{n\times n}$ then
\[ K_f(A)^* x = \mathrm{vec}\bigl( L_{\tilde f}(A, X^*)^* \bigr), \quad \text{where } \mathrm{vec}(X) = x. \]

- $\tilde f = f$ for most functions of interest.
- $\tilde f = f \;\Rightarrow\; \tilde f(A)^* = \tilde f(A^*)$.
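The two general-purpose formulas for $L_f$ given above, the 2n × 2n block formula and the complex step approximation, can be compared against a dedicated method. The sketch below is an illustration under stated assumptions: $f = \exp$, random real test matrices, and SciPy's `scipy.linalg.expm` and `scipy.linalg.expm_frechet`, with `expm_frechet` used as the reference value. The step $h = 10^{-20}$ is an arbitrary choice. The complex step part relies on the assumption that SciPy's `expm` treats a complex argument in a way that satisfies the "no complex arithmetic" requirement for this input; that is not guaranteed by the talk or checked here beyond the numerical comparison. The helper `rel_err` is hypothetical.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))  # hypothetical real test matrix
E = rng.standard_normal((n, n))  # hypothetical real direction

# Reference value from SciPy's dedicated Frechet derivative routine.
L_ref = expm_frechet(A, E, compute_expm=False)

# Block formula: the (1,2) block of f([[A, E], [0, A]]) is L_f(A, E).
B = np.block([[A, E], [np.zeros((n, n)), A]])
L_block = expm(B)[:n, n:]

# Complex step: L_f(A, E) is approximately Im f(A + i h E) / h.
h = 1e-20  # illustrative; no subtraction occurs, so h can be tiny
L_cs = expm(A + 1j * h * E).imag / h


def rel_err(L):
    """Relative Frobenius-norm difference from the reference value."""
    return np.linalg.norm(L - L_ref) / np.linalg.norm(L_ref)


print("block formula relative difference:", rel_err(L_block))
print("complex step relative difference: ", rel_err(L_cs))
```

The block formula costs one function evaluation at twice the dimension, while the complex step costs one evaluation in complex arithmetic at the original dimension.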
Software for 1-Norm Estimator

- MATLAB: normest1.
- NAG Library: F04YD, F04ZD, Mark 24.
- Python: SciPy 0.13.0: linalg/_expm_frechet.py, linalg/_expm_multiply.py, linalg/_sqrtm.py (blocked).

Higher Derivatives

Needed for
- the level-2 condition number,
- understanding the accuracy of algorithms for computing $L_f(A, E)$.

The second Fréchet derivative $L_f^{(2)}(A, E_1, E_2)$ is the unique multilinear function of $E_1, E_2$ satisfying
\[ L_f(A + E_2, E_1) - L_f(A, E_1) - L_f^{(2)}(A, E_1, E_2) = o(\|E_2\|). \]

The $k$th Fréchet derivative is defined by
\[ L_f^{(k-1)}(A + E_k, E_1, \dots, E_{k-1}) - L_f^{(k-1)}(A, E_1, \dots, E_{k-1}) - L_f^{(k)}(A, E_1, \dots, E_k) = o(\|E_k\|). \]

Existing Literature

Large literature on Fréchet derivatives in Banach space.
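Returning to the 1-norm estimator software listed above, the estimation strategy of the condition estimation slides can be sketched with SciPy. This is an illustration under assumptions, not the implementation from the talk: $f = \exp$, a random real test matrix, `scipy.linalg.expm_frechet` for $L_f$, and `scipy.sparse.linalg.onenormest` as the 1-norm estimator. $K_f(A)$ is never formed by the estimator; it only sees products $K_f(A)x = \mathrm{vec}(L_f(A, X))$ and, for real $A$ and the exponential, $K_f(A)^T y = \mathrm{vec}(L_f(A^T, Y))$. The helper name `kronecker_operator` is hypothetical.

```python
import numpy as np
from scipy.linalg import expm_frechet
from scipy.sparse.linalg import LinearOperator, onenormest


def kronecker_operator(A):
    """K_f(A) for f = exp as an implicit operator (real A assumed)."""
    n = A.shape[0]

    def matvec(x):
        # K_f(A) vec(X) = vec(L_f(A, X)), with column-major vec.
        X = np.ravel(x).reshape((n, n), order="F")
        return expm_frechet(A, X, compute_expm=False).flatten(order="F")

    def rmatvec(y):
        # For real A and f = exp: K_f(A)^T vec(Y) = vec(L_f(A^T, Y)).
        Y = np.ravel(y).reshape((n, n), order="F")
        return expm_frechet(A.T, Y, compute_expm=False).flatten(order="F")

    return LinearOperator((n * n, n * n), matvec=matvec, rmatvec=rmatvec, dtype=float)


rng = np.random.default_rng(3)
n = 4
A = rng.standard_normal((n, n))  # hypothetical test matrix
K_op = kronecker_operator(A)

# Estimate ||K_f(A)||_1 from a few products with K_f(A) and its transpose.
estimate = onenormest(K_op)

# For comparison only: form K_f(A) explicitly and take its exact 1-norm.
K = np.column_stack([K_op.matvec(e) for e in np.eye(n * n)])
exact = np.linalg.norm(K, 1)
print("estimate:", estimate, "exact:", exact)
```

The estimate is a lower bound on $\|K_f(A)\|_1$; by the inequalities on the first condition estimation slide, it then bounds $\|L_f(A)\|_1$ to within a factor $n$.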
