115 Nonlinear Eigenvalue Problems

Heinrich Voss, Hamburg University of Technology

115.1 Basic Properties
115.2 Analytic Matrix Functions
115.3 Variational Characterization of Eigenvalues
115.4 General Rayleigh Functionals
115.5 Methods for Dense Eigenvalue Problems
115.6 Iterative Projection Methods
115.7 Methods Using Invariant Pairs
115.8 The Infinite Arnoldi Method
References

This chapter considers the nonlinear eigenvalue problem of finding a parameter λ such that the linear system

T(λ)x = 0   (115.1)

has a nontrivial solution x, where T(·) : D → C^{n×n} is a family of matrices depending on a complex parameter λ ∈ D. It generalizes the linear eigenvalue problem Ax = λx, A ∈ C^{n×n}, where T(λ) = λI − A, and the generalized linear eigenvalue problem where T(λ) = λB − A, A, B ∈ C^{n×n}.

Nonlinear eigenvalue problems T(λ)x = 0 arise in a variety of applications in science and engineering, such as the dynamic analysis of structures, vibrations of fluid–solid structures, the electronic behavior of quantum dots, and delay eigenvalue problems, to name just a few. Due to its wide range of applications, the quadratic eigenvalue problem T(λ)x = λ²Mx + λCx + Kx = 0 is of particular interest, but polynomial, rational, and more general eigenvalue problems appear as well. A standard approach for investigating or numerically solving polynomial eigenvalue problems is linearization, where the original problem is transformed into a generalized linear eigenvalue problem with the same spectrum. Details on linearization and structure preservation are discussed in Chapter 102, Matrix Polynomials.

This chapter is concerned with the general nonlinear eigenvalue problem, which in general cannot be linearized. Unlike for linear and polynomial eigenvalue problems, there may exist infinitely many eigenvalues. In practice, however, one is usually interested only in a few eigenvalues close to a target value or a line in the complex plane.

If T is linear, then T(λ) = T(0) + λT′(0) has the form of a generalized eigenvalue problem, and in the general case linearization gives the approximation T(λ) = T(0) + λT′(0) + O(λ²), which is again a generalized linear eigenvalue problem. Hence it is not surprising that the (elementwise) derivative T′(λ) of T(λ) plays an important role in the analysis of nonlinear eigenvalue problems. We tacitly assume in the whole chapter that whenever a derivative T′(λ̂) appears, T is analytic in a neighborhood of λ̂, or, in the real case T : D → R^{n×n}, D ⊂ R, that T is differentiable in a neighborhood of λ̂. ‖·‖ always denotes the Euclidean vector norm and the spectral matrix norm, respectively, and we use the notation [x; y] := [x^T, y^T]^T for column vectors.

115.1 Basic Properties

This section presents basic properties of the nonlinear eigenvalue problem (115.1).

Definitions:
As for a linear eigenvalue problem, λ̂ ∈ D is called an eigenvalue of T(·) if T(λ̂)x = 0 has a nontrivial solution x̂ ≠ 0. Then x̂ is called a corresponding eigenvector or right eigenvector, and (λ̂, x̂) is called an eigenpair of T(·). Any nontrivial solution ŷ ≠ 0 of the adjoint equation T(λ̂)*y = 0 is called a left eigenvector of T(·), and the vector-scalar-vector triplet (ŷ, λ̂, x̂) is called an eigentriplet of T(·).
The eigenvalue problem (115.1) is regular if det T(λ) ≢ 0, and otherwise it is called singular. The spectrum σ(T(·)) of T(·) is the set of all eigenvalues of T(·).
An eigenvalue λ̂ of T(·) has algebraic multiplicity k if

(d^ℓ/dλ^ℓ) det(T(λ))|_{λ=λ̂} = 0 for ℓ = 0, …, k−1 and (d^k/dλ^k) det(T(λ))|_{λ=λ̂} ≠ 0.

An eigenvalue λ̂ is simple if its algebraic multiplicity is one.
The geometric multiplicity of an eigenvalue λ̂ is the dimension of the kernel ker(T(λ̂)) of T(λ̂). An eigenvalue λ̂ is called semi-simple if its algebraic and geometric multiplicity coincide.
T(·) : J → R^{n×n} is real symmetric if T(λ)^T = T(λ) for every λ ∈ J ⊂ R.
T(·) : D → C^{n×n} is complex symmetric if T(λ)^T = T(λ) for every λ ∈ D.
T(·) : D → C^{n×n} is Hermitian if D is symmetric with respect to the real line and T(λ)* = T(λ̄) for every λ ∈ D.

Facts:

1. For A ∈ C^{n×n} and T(λ) = λI − A, the terms eigenvalue, (left and right) eigenvector, eigenpair, eigentriplet, spectrum, algebraic and geometric multiplicity, and semi-simple have their standard meaning.
2. For linear eigenvalue problems,
– eigenvectors corresponding to distinct eigenvalues are linearly independent, which is not the case for nonlinear eigenvalue problems (cf. Example 1);
– left and right eigenvectors corresponding to distinct eigenvalues are orthogonal, which does not hold for nonlinear eigenproblems (cf. Example 2);
– the algebraic multiplicities of eigenvalues sum up to the dimension of the problem, whereas for nonlinear problems there may exist an infinite number of eigenvalues (cf. Example 2) and an eigenvalue may have any algebraic multiplicity (cf. Example 3).

3. [Sch08] If λ̂ is an algebraically simple eigenvalue of T(·), then λ̂ is geometrically simple.
4. [Neu85, Sch08] Let (ŷ, λ̂, x̂) be an eigentriplet of T(·). Then λ̂ is algebraically simple if and only if λ̂ is geometrically simple and ŷ*T′(λ̂)x̂ ≠ 0.
5. [Sch08] Let D ⊂ C and E ⊂ C^d be open sets. Let T : D × E → C^{n×n} be continuously differentiable, and let λ̂ be a simple eigenvalue of T(·, 0) and x̂ and ŷ right and left eigenvectors with unit norm. Then the first order perturbation expansion at λ̂ reads as follows:

λ(ε) − λ̂ = (1/(ŷ*T′(λ̂, 0)x̂)) Σ_{j=1}^{d} ε_j ŷ*(∂T/∂ε_j)(λ̂, 0)x̂ + o(‖ε‖).

The normwise condition number for λ̂ is given by

κ(λ̂) = lim sup_{‖ε‖→0} |λ(ε) − λ̂|/‖ε‖ = (1/|ŷ*T′(λ̂, 0)x̂|) sqrt( Σ_{j=1}^{d} |ŷ*(∂T/∂ε_j)(λ̂, 0)x̂|² ).

6. [Sch08] Let (ŷ, λ̂, x̂) be an eigentriplet of T(·) with simple eigenvalue λ̂. Then for sufficiently small |λ̂ − λ|

T(λ)^{-1} = (1/(λ − λ̂)) (x̂ŷ*)/(ŷ*T′(λ̂)x̂) + O(1).

7. [Neu85] Let λ̂ be a simple eigenvalue of T(·), and let x̂ be a right eigenvector normalized such that e*x̂ = 1 for some vector e. Then the matrix B := T(λ̂) + T′(λ̂)x̂e* is nonsingular.
8. If T(·) is real symmetric and λ is a real eigenvalue, then left and right eigenvectors corresponding to λ coincide.
9. If T(·) is complex symmetric and x is a right eigenvector, then x̄ is a left eigenvector corresponding to the same eigenvalue.
10. If T(·) is Hermitian, then eigenvalues are real (and left and right eigenvectors corresponding to λ coincide) or they come in pairs, i.e., if (y, λ, x) is an eigentriplet of T(·), then so is (x, λ̄, y).

Examples:

1. For the quadratic eigenvalue problem T (λ)x = 0 with

T(λ) := [0 1; −2 3] + λ [7 −5; 10 −8] + λ² [1 0; 0 1]   (115.2)

the distinct eigenvalues λ = 1 and λ = 2 share the eigenvector [1; 2].
2. Let T(λ)x := [e^{iλ²} 1; 1 1] x = 0. Then T(λ)x = 0 has a countable set of eigenvalues √(2kπ), k ∈ N ∪ {0}. λ̂ = 0 is an algebraically double and geometrically simple eigenvalue with left and right eigenvectors x̂ = ŷ = [1; −1], and ŷ*T′(0)x̂ = 0. Every λ̂k = √(2kπ), k ≠ 0, is algebraically and geometrically simple with the same eigenvectors x̂, ŷ as before, and ŷ*T′(λ̂k)x̂ = 2√(2kπ) i ≠ 0.
3. T(λ) = (λ^k), k ∈ N, has the eigenvalue λ̂ = 0 with algebraic multiplicity k.
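The claims of Example 1 are easy to confirm numerically. The following minimal sketch (Python with NumPy; all variable names are ours) checks that both T(1) and T(2) annihilate the vector [1; 2]:

import numpy as np

# Coefficient matrices of the quadratic problem (115.2):
# T(lam) = A0 + lam*A1 + lam^2*A2
A0 = np.array([[0.0, 1.0], [-2.0, 3.0]])
A1 = np.array([[7.0, -5.0], [10.0, -8.0]])
A2 = np.eye(2)

def T(lam):
    return A0 + lam * A1 + lam**2 * A2

x = np.array([1.0, 2.0])
for lam in (1.0, 2.0):
    print(lam, T(lam) @ x)   # both residuals are zero: the eigenvector is shared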

115.2 Analytic matrix functions

In this section we consider the eigenvalue problem (115.1) where T(·) : D → C^{n×n} is a regular matrix function which is analytic in a neighborhood of an eigenvalue λ̂.

Definitions:

A sequence of vectors x0, x1, …, x_{r−1} is called a Jordan chain (of length r) corresponding to λ̂ if x0 ≠ 0 and

Σ_{k=0}^{ℓ} (1/k!) (d^k T/dλ^k)(λ̂) x_{ℓ−k} = 0 for ℓ = 0, …, r−1.

x0 is an eigenvector and x1, …, x_{r−1} are generalized eigenvectors.
Let x0 be an eigenvector corresponding to an eigenvalue λ̂. The maximal length of a Jordan chain that starts with x0 is called the multiplicity of x0.
An eigenvalue λ̂ is said to be normal if it is a discrete point in σ(T(·)) and the multiplicity of each corresponding eigenvector is finite.
An analytic function x : D → C^n is called a root function of T(·) at λ̂ ∈ D if T(λ̂)x(λ̂) = 0 and x(λ̂) ≠ 0. The multiplicity of λ̂ as a zero of T(λ)x(λ) is called the multiplicity of x(·).
The rank of an eigenvector x0 is the maximum of the multiplicities of all root functions x(·) such that x(λ̂) = x0. A root function x(·) is called a maximal root function if the multiplicity of x(·) is equal to the rank of x0 := x(λ̂).
Let x0^(1) ∈ ker T(λ̂) be an eigenvector with maximal rank and let x^(1)(λ) = Σ_{j=0}^{∞} x_j^(1) (λ − λ̂)^j be a maximal root function such that x^(1)(λ̂) = x0^(1). Suppose that the root functions x^(k)(λ) = Σ_{j=0}^{∞} x_j^(k) (λ − λ̂)^j, k = 1, …, i−1, are already constructed, and let x0^(i) be an eigenvector with maximal rank in some direct complement to the linear span of the vectors x0^(1), …, x0^(i−1) in ker T(λ̂). Let x^(i)(λ) = Σ_{j=0}^{∞} x_j^(i) (λ − λ̂)^j be a maximal root function such that x^(i)(λ̂) = x0^(i). Then the ordered set

x0^(1), …, x_{r1−1}^(1), x0^(2), …, x_{r2−1}^(2), …, x0^(k), …, x_{rk−1}^(k),

where k = dim ker T(λ̂) and rj = rank x0^(j), is called a canonical set of Jordan chains, and the ordered set x^(1)(λ), …, x^(k)(λ) is called a canonical system of root functions.
Let X ∈ C^{n×α} contain in its columns the vectors of a canonical set of Jordan chains, and let J = diag(J1, …, Jk), where Jj is a Jordan block of size rj × rj corresponding to λ̂. Then the pair (X, J) is called a Jordan pair.
Let x^(1)(λ), …, x^(k)(λ) be a canonical system of root functions at λ̂, and let x^(k+1), …, x^(n) ∈ C^n be such that x^(1)(λ̂), …, x^(k)(λ̂), x^(k+1), …, x^(n) is a basis of C^n. Then the system x^(1)(λ), …, x^(k)(λ), x^(k+1), …, x^(n) is called an extended canonical system of root functions. To the constant functions x^(k+1), …, x^(n) ∈ C^n (which are not root functions in the strict sense of the definition) is assigned the multiplicity 0.
Let λ̂ be an eigenvalue of T(·), and let Φ(·) be an analytic matrix function such that its columns form an extended canonical system of root functions of T(·) at λ̂. Then (cf. [GKS93]) in a neighborhood of λ̂,

T(λ)Φ(λ) = P(λ)D(λ),   (115.3)

where D(λ) is a diagonal matrix with diagonal entries (λ − λ̂)^{κ1}, …, (λ − λ̂)^{κn} and P(·) is a matrix function analytic at λ̂ such that det P(λ̂) ≠ 0. Furthermore, the exponents κ1, …, κn are the multiplicities of the columns of Φ(·), also called partial multiplicities of T(·) at λ̂. (115.3) is the local Smith form of T(·) in a neighborhood of λ̂.
A pair of matrices (Y, Z) ∈ C^{n×p} × C^{p×p} is a regular pair if for some integer ℓ ≥ 1

rank [Y; YZ; …; YZ^{ℓ−1}] = p.

The number p is called the order of the regular pair (Y,Z).

Facts: The following facts for which no specific reference is given can be found in [GLR82, GR81].

1. In contrast to linear eigenvalue problems, the vectors in a Jordan chain need not be linearly independent. Even the zero vector is admissible as a generalized eigenvector.
2. Let x(·) be a root function at λ̂, and let x^(j) denote the jth derivative of x. Then the vectors xj := x^(j)(λ̂), j = 0, …, r−1, form a Jordan chain at λ̂, where r denotes the multiplicity of x(·).
3. The multiplicity of a root function at λ̂ (and hence the rank of an eigenvector) is at most the algebraic multiplicity of λ̂.
4. The numbers r1 ≥ · · · ≥ rk in a Jordan pair are uniquely determined.
5. The number α := r1 + · · · + rk is the algebraic multiplicity of the eigenvalue λ̂.
6. [GKS93] Let y1, …, yℓ : D → C^n be a set of root functions at λ̂ with multiplicities s1 ≥ · · · ≥ sℓ such that y1(λ̂), …, yℓ(λ̂) ∈ ker T(λ̂) are linearly independent. If the root functions x1, …, xk define a canonical set of Jordan chains of T(·) at λ̂ with multiplicities r1 ≥ · · · ≥ rk, then k ≥ ℓ and ri ≥ si for i = 1, …, ℓ. Moreover, y1, …, yℓ define a canonical set of Jordan chains of T(·) at λ̂ if and only if ℓ = k and sj = rj for j = 1, …, ℓ.
7. Let S(·) be an analytic matrix function with det S(λ̂) ≠ 0. Then x0, …, xk is a Jordan chain of T(·)S(·) corresponding to λ̂ if and only if the vectors y0, …, yk given by yj = Σ_{i=0}^{j} (1/i!) S^(i)(λ̂) x_{j−i}, j = 0, …, k, form a Jordan chain of T(·) corresponding to λ̂.
8. For S as in the last fact the Jordan chains of T(·) coincide with those of S(·)T(·) corresponding to the same λ̂.
9. Two regular analytic matrix functions T1(·) and T2(·) have a common Jordan pair at λ̂ if and only if T2(λ)T1(λ)^{-1} is analytic and invertible at λ̂.
10. [GKS93] Let T(·), Φ(·), D(·), and P(·) be regular n × n matrix functions, analytic at λ̂, such that T(λ)Φ(λ) = P(λ)D(λ) in a neighborhood of λ̂. Assume that Φ(λ̂) is invertible and that D(·) is a diagonal matrix polynomial with diagonal entries (λ − λ̂)^{κ1}, …, (λ − λ̂)^{κn}, where κ1 ≥ · · · ≥ κn. Then the following three conditions are equivalent:
(i) the columns of Φ(·) form an extended canonical system of root functions of T(·) at λ̂ with partial multiplicities κ1, …, κn;
(ii) det P(λ̂) ≠ 0;
(iii) Σ_{j=1}^{n} κj is the algebraic multiplicity of λ̂.
11. [GKS93] Let x(λ) = Σ_{j=0}^{∞} (λ − λ̂)^j xj be an analytic C^n-vector function with x0 ≠ 0, and set X := [x0, …, x_{p−1}]. Then x(·) is a root function of T(·) at λ̂ of multiplicity at least p if and only if T(λ)X(λI − J_{λ̂,p})^{-1} is an n × p analytic matrix function. Here J_{λ̂,p} denotes a p × p Jordan block with eigenvalue λ̂.
12. [AST09] T(·) admits a representation P(λ)T(λ)Q(λ) = D(λ), where P(·) and Q(·) are regular analytic matrix functions with constant nonzero determinants, and D(λ) = diag(d1(λ), …, dn(λ)) is a diagonal matrix of analytic functions such that dj(λ)/d_{j−1}(λ) is analytic for j = 2, 3, …, n. This representation is also called a local Smith form.
13. [AST09] With the representation in the last fact, if qj(λ) is the jth column of Q, and λ̂ a zero of dj(·), then (λ̂, qj(λ̂)) is an eigenpair of T(·).
14. The non-zero partial multiplicities κj in the local Smith form of T(·) at λ̂ coincide with the lengths r1 ≥ · · · ≥ rk of Jordan chains in a canonical set.
15. A Jordan pair (X, J) of T(·) at an eigenvalue λ̂ is regular.
16. [GR81] Let λ̂ be an eigenvalue of T(·) with algebraic multiplicity α, and let (Y, Z) ∈ C^{n×α} × C^{α×α} be a pair of matrices such that σ(Z) = {λ̂}. (Y, Z) is similar to a Jordan

pair (X, J) (i.e., Y = XS and Z = S^{-1}JS for some invertible matrix S) if and only if (Y, Z) is regular and the following equation holds:

Σ_{j=0}^{∞} Tj Y (Z − λ̂I)^j = 0, where Tj = (1/j!) T^(j)(λ̂)

(note that only a finite number of terms on the left-hand side of the equation is different from zero, because σ(Z) = {λ̂}).
17. [HL99] Suppose that B(λ) and C(λ) are analytic matrix-valued functions such that B(λ̂) and C(λ̂) are nonsingular. Then the partial multiplicities of the eigenvalue λ̂ of T(λ) and T̃(λ) := B(λ)T(λ)C(λ) coincide.
18. [HL99] Suppose that a matrix-valued function T(λ, τ) depends analytically on λ and continuously on τ and that λ̂ = 0 is an eigenvalue of T(·, 0) of algebraic multiplicity α. Then there exists a neighborhood O of λ̂ such that, for all τ sufficiently close to the origin, there are exactly α eigenvalues (counting with algebraic multiplicities) of the matrix-valued function T(·, τ) in O.

Examples:

1. [GLR82] For T(λ) = [λ² −λ; 0 λ²] we have det T(λ) = λ⁴, and hence λ̂ = 0 is an eigenvalue of T(·) with algebraic multiplicity 4 and geometric multiplicity 2.
For an eigenvector x0 = [x01; x02] the first generalized eigenvector x1 satisfies T(0)x1 + T′(0)x0 = 0, i.e., [0 −1; 0 0] x0 = 0. Hence x1 exists if and only if x02 = 0, and x1 can be chosen completely arbitrarily. For a second generalized eigenvector x2 we have (1/2)T″(0)x0 + T′(0)x1 + T(0)x2 = 0, i.e., [1 0; 0 1] x0 + [0 −1; 0 0] x1 = 0, i.e., x12 = x01, and if this equation is satisfied, x2 can be chosen arbitrarily. The condition for the third generalized eigenvector x3 reads [1 0; 0 1] x1 + [0 −1; 0 0] x2 = 0, which implies x12 = 0 and is contradictory.
To summarize, the length of a Jordan chain cannot exceed 3. Jordan chains of length 1 are x0, x0 ≠ 0; Jordan chains of length 2 are x0 = [x01; 0], x1 with x01 ≠ 0 and x1 arbitrary; and Jordan chains of length 3 are x0 = [x01; 0], x1 = [x11; x01], x2, where x01 ≠ 0, and x11 and x2 are arbitrary. One example of a canonical set of Jordan chains is x0^(1) = [1; 0], x1^(1) = [0; 1], x2^(1) = [1; 1], x0^(2) = [0; 1].
T(0) = 0 implies that x(·) = [x1(·); x2(·)] is a root function at λ̂ = 0 whenever x1 and x2 are analytic and x(0) ≠ 0. For x(0) = [x1(0); 0], T(λ)x(λ) = [λ²x1(λ) − λx2(λ); λ²x2(λ)] yields that x has at least the multiplicity 2, and if x2(λ) = λx1(λ), then the multiplicity is 3, and a higher multiplicity is not possible. In the latter case one obtains a Jordan chain as [x1(0); 0], [x1′(0); x1(0)], [x1″(0); 2x1′(0)].
2. For the quadratic eigenvalue problem in (115.2), det T(λ) = λ⁴ − λ³ − 3λ² + λ + 2. Hence, λ̂ = −1 is an eigenvalue with algebraic multiplicity 2 and geometric multiplicity 1. From

T(−1)x0 = [−6 6; −12 12] x0 = 0,   T(−1)x1 + T′(−1)x0 = [−6 6; −12 12] x1 + [5 −5; 10 −10] x0 = 0

it follows that x0 = [1; 1] is an eigenvector corresponding to λ̂, and x1 = [1; 1] is a generalized eigenvector as well. Then for X = [1 1; 1 1] and J = [−1 1; 0 −1] the pair (X, J) is a regular pair of order 2, namely the Jordan pair corresponding to λ̂ = −1.
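This Jordan pair can be verified numerically. A small sketch (Python; it anticipates the invariant pair relation Σ_j AjXfj(S) = 0 of Section 115.7, which for the polynomial (115.2) reads A0X + A1XJ + A2XJ² = 0):

import numpy as np

A0 = np.array([[0.0, 1.0], [-2.0, 3.0]])
A1 = np.array([[7.0, -5.0], [10.0, -8.0]])
A2 = np.eye(2)

X = np.array([[1.0, 1.0], [1.0, 1.0]])    # [eigenvector, generalized eigenvector]
J = np.array([[-1.0, 1.0], [0.0, -1.0]])  # Jordan block for lambda = -1

# invariance of the Jordan pair: A0 X + A1 X J + A2 X J^2 = 0
print(A0 @ X + A1 @ X @ J + A2 @ X @ J @ J)   # prints the zero matrix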

115.3 Variational Characterization of Eigenvalues

Variational characterizations of eigenvalues are very powerful tools when studying self-adjoint linear operators on a Hilbert space. Many things can be easily proved using these characterizations; for example, bounds for eigenvalues, comparison theorems, interlacing results and monotonicity of eigenvalues, to name just a few.
This section presents similar results for nonlinear eigenvalue problems. A minmax characterization was first proved in [Duf55] for overdamped quadratic eigenproblems, generalized in [Rog64] to general overdamped problems, and in [VW82] to non-overdamped problems. Although the characterizations also hold for infinite dimensional problems [Had68, VW82], the presentation here is restricted to the finite dimensional case.
We assume in this whole section that J ⊂ R is an open interval (which may be unbounded), and we consider a family of Hermitian matrices T : J → C^{n×n} depending continuously on the parameter λ ∈ J, such that the following two conditions are satisfied:
(i) For every x ∈ C^n, x ≠ 0, the real equation

f(λ; x) := x*T(λ)x = 0   (115.4)

has at most one solution λ =: p(x) in J. Then (115.4) implicitly defines a (nonlinear) functional p on some domain D(p).
(ii) (λ − p(x)) f(λ; x) > 0 for every x ∈ D(p) and every λ ∈ J, λ ≠ p(x).   (115.5)

Definitions:
The functional p : D(p) → J is called the Rayleigh functional.
If D(p) = C^n \ {0}, then the problem T(λ)x = 0 is called overdamped.
An eigenvalue λ̂ ∈ J of T(·) is a jth eigenvalue if µ = 0 is the jth largest eigenvalue of the matrix T(λ̂).

Facts:
In this subsection we denote by Sj the set of all j-dimensional subspaces of C^n. The following facts for which no specific reference is given can be found in [Had68, VW82, Vos09].

1. D(p) is an open set in C^n.
2. p(αx) = p(x) for every x ∈ D(p) and every α ∈ C \ {0}.
3. If T(·) is differentiable in a neighborhood of an eigenvalue λ̂ and x̂*T′(λ̂)x̂ ≠ 0 for a corresponding eigenvector x̂, then x̂ is a stationary point of p, i.e., |p(x̂ + h) − p(x̂)| = o(‖h‖). In the real case T : J → R^{n×n}, J ⊂ R, we have ∇p(x̂) = 0.
4. For every j ∈ {1, …, n} there is at most one jth eigenvalue of T(·).
5. T(·) has at most n eigenvalues in J.
6. [Rog64] If T(·) is overdamped, then T(·) has exactly n eigenvalues in J.
7. If

λj := inf_{V ∈ Sj, V ∩ D(p) ≠ ∅} sup_{x ∈ V ∩ D(p)} p(x) ∈ J,

then λj is a jth eigenvalue of T (·). 8. (minmax characterization) If λj ∈ J is a jth eigenvalue of T (·), then

λj = min_{V ∈ Sj, V ∩ D(p) ≠ ∅} max_{x ∈ V ∩ D(p)} p(x).

The minimum is attained for an invariant subspace of the matrix T(λj) corresponding to its j largest eigenvalues. The maximum is attained for some x ∈ ker T(λj).

9. Let λ1 := inf_{x ∈ D(p)} p(x) ∈ J and λj ∈ J for some j ∈ {1, …, n}. Then for every k ∈ {1, …, j} there exists Uk ∈ Sk with Uk ⊂ D(p) ∪ {0} and λk = max_{x ∈ Uk, x ≠ 0} p(x). Hence,

λk = min_{V ∈ Sk, V ⊂ D(p) ∪ {0}} max_{x ∈ V, x ≠ 0} p(x) ∈ J for k = 1, …, j.

10. [Vos03] (maxmin characterization) Assume that there exists a jth eigenvalue λj ∈ J. Then

λj = max_{V ∈ S_{j−1}, V⊥ ∩ D(p) ≠ ∅} inf_{x ∈ V⊥ ∩ D(p)} p(x).

The maximum is attained for every invariant subspace of T (λj) corresponding to its j − 1 largest eigenvalues. 11. Let λj ∈ J be a jth eigenvalue of T (·) and λ ∈ J. Then

< <   x∗T (λ)x   λ = λj ⇐⇒ max min ∗ = 0. V ∈Sj x∈V,x6=0 x x > >

12. (orthogonality) [Had68] Let T (·) be differentiable in J. Then eigenvectors can be chosen orthogonal with respect to the generalized scalar product

[x, y] := y* ((T(p(x)) − T(p(y)))/(p(x) − p(y))) x, if p(x) ≠ p(y),
[x, y] := y* T′(p(x)) x, if p(x) = p(y),

which is symmetric and homogeneous, but in general not bilinear. If T(·) is differentiable and condition (ii) is strengthened to x*T′(p(x))x > 0 for every x ∈ D(p), then [·, ·] is definite, i.e., [x, x] > 0 for every x ∈ D(p).
13. (Rayleigh's principle) Assume that J contains λ1, …, λ_{j−1}, where λk is a kth eigenvalue of T(·), and let xk, k = 1, …, j−1, be corresponding eigenvectors. If

λj := inf{p(x): x ∈ D(p), [x, xk] = 0, k = 1, . . . , j − 1} ∈ J,

then λj is a jth eigenvalue of T (·). 14. (Sylvester’s law; overdamped case) Assume that T (·) is overdamped. For σ ∈ J let (π, ν, δ) be the inertia of T (σ). Then T (·) has π eigenvalues that are smaller than σ, ν eigenvalues that exceed σ, and if δ 6= 0, then σ is an eigenvalue of multiplicity δ. 15. (Sylvester’s law; extreme eigenvalues) Assume that T (µ) is negative definite for some µ ∈ J, and for σ > µ let (π, ν, δ) be the inertia of T (σ). Then T (·) has exactly π eigenvalues in J that are smaller than σ. 16. (Sylvester’s law; general case) Let µ ∈ J, and assume that for every r dimensional subspace V with V ∩ D(p) 6= ∅ there exists x ∈ V ∩ D(p) with p(x) > µ. For σ ∈ J, σ > µ let (π, ν, δ) be the inertia of T (σ). Then for j = r, . . . , π there exists a jth eigenvalue λj of T (·) in [µ, σ).

Examples:

1. [Duf55] The quadratic pencil Q(λ) := λ²A + λB + C with positive definite A, B, C ∈ C^{n×n} is overdamped if and only if d(x) := (x*Bx)² − 4(x*Ax)(x*Cx) > 0 for every x ∈ C^n \ {0}. For x ≠ 0 the quadratic equation x*Q(λ)x = 0 has two real solutions p±(x) = (−x*Bx ± √d(x))/(2x*Ax), and γ− := sup_{x≠0} p−(x) < γ+ := inf_{x≠0} p+(x). Q(·) has n eigenvalues in (−∞, γ+), which are minmax values of p−, and n eigenvalues in (γ−, 0), which are minmax values of p+.

2. Assume that Q(·) as in the last example is not necessarily overdamped, and let in(Q(σ)) = (π, ν, δ) denote the inertia of Q(σ). If σ < γ+ := inf_{x≠0} {p+(x) : p+(x) ∈ R}, then Q(·) has exactly ν eigenvalues in (−∞, σ), and if σ > γ− := sup_{x≠0} {p−(x) : p−(x) ∈ R}, then Q(·) has ν eigenvalues in (σ, 0).
If µmin and µmax are the minimal and maximal eigenvalues of Cx = µAx, then −√µmax ≤ γ+ and −√µmin ≥ γ−. If κmin and κmax are the minimal and maximal eigenvalues of Cx = κBx, respectively, then −2κmax ≤ γ+ and −2κmin ≥ γ−. (A numerical sketch of the Rayleigh functionals p± follows below.)
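The Rayleigh functionals p± can be evaluated directly from their defining quadratic equation. A minimal sketch (Python; the diagonal matrices are our choice, and for them Duffin's overdamping condition d(x) > 0 is easy to check by hand):

import numpy as np

# Q(lam) = lam^2*A + lam*B + C with positive definite (here diagonal) matrices
A = np.eye(2)
B = np.diag([10.0, 12.0])
C = np.diag([1.0, 2.0])

def p(x, sign):
    """Rayleigh functionals p_- (sign=-1) and p_+ (sign=+1):
    the two real roots of x^* Q(lam) x = 0."""
    a, b, c = x @ A @ x, x @ B @ x, x @ C @ x
    d = b**2 - 4.0 * a * c        # d(x) > 0 by overdamping
    return (-b + sign * np.sqrt(d)) / (2.0 * a)

x = np.array([1.0, 3.0])
for sign in (-1.0, 1.0):
    lam = p(x, sign)
    Q = lam**2 * A + lam * B + C
    print(lam, x @ Q @ x)         # the quadratic form vanishes at lam = p(x)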

115.4 General Rayleigh Functionals

Whereas Section 115.3 presupposes the existence and uniqueness of a Rayleigh functional for problems allowing for a variational characterization, this section extends its definition to more general eigenproblems. It collects results on the existence and approximation properties of a Rayleigh functional in a vicinity of eigenvectors corresponding to algebraically simple eigenvalues. The material in this section is mostly taken from [Sch08, SS10].

Definitions:
Let T : D → C^{n×n} be a matrix valued mapping which is analytic, or which is differentiable with Lipschitz continuous derivative in the real case. Let (λ̂, x̂) be an eigenpair of T(·), and define neighborhoods B(λ̂, τ) := {λ ∈ C : |λ − λ̂| < τ} and Kε(x̂) := {x ∈ C^n : ∠(Span{x}, Span{x̂}) ≤ ε} of λ̂ and x̂, respectively.
p : Kε(x̂) → B(λ̂, τ) is a (one-sided) Rayleigh functional if the following conditions hold:
(i) p(αx) = p(x) for every α ∈ C, α ≠ 0;
(ii) x*T(p(x))x = 0 for every x ∈ Kε(x̂);
(iii) x*T′(p(x))x ≠ 0 for every x ∈ Kε(x̂).
Let (ŷ, λ̂, x̂) be an eigentriplet of T(·). p : Kε(x̂) × Kε(ŷ) → B(λ̂, τ) is a two-sided Rayleigh functional if the following conditions hold for every x ∈ Kε(x̂) and y ∈ Kε(ŷ):
(i) p(αx, βy) = p(x, y) for every α, β ∈ C \ {0};
(ii) y*T(p(x, y))x = 0;
(iii) y*T′(p(x, y))x ≠ 0.
The generalized Rayleigh quotient (which was introduced in [Lan61] for polynomial eigenvalue problems only) is defined as

pL : Kε(ŷ) × B(λ̂, τ) × Kε(x̂) → B(λ̂, τ),  pL(y, λ, x) := λ − (y*T(λ)x)/(y*T′(λ)x).

Facts:
The following facts can be found in [Sch08, SS10].
1. Let (ŷ, λ̂, x̂) be an eigentriplet of T(·) with ‖x̂‖ = ‖ŷ‖ = 1, and assume that ŷ*T′(λ̂)x̂ ≠ 0. Then there exist ε > 0 and τ > 0 such that the two-sided Rayleigh functional is defined in Kε(x̂) × Kε(ŷ), and

|p(x, y) − λ̂| ≤ (8/3) (‖T(λ̂)‖/|ŷ*T′(λ̂)x̂|) tan ξ tan η,

where ξ := ∠(Span{x}, Span{x̂}) and η := ∠(Span{y}, Span{ŷ}).

2. Under the conditions of Fact 1 let ξ < π/3 and η < π/3. Then

|p(x, y) − λ̂| ≤ (32/3) (‖T(λ̂)‖/|ŷ*T′(λ̂)x̂|) ‖x − x̂‖ ‖y − ŷ‖.

3. Under the conditions of Fact 1 the two-sided Rayleigh functional is stationary at (x̂, ŷ), i.e., |p(x̂ + s, ŷ + t) − λ̂| = O((‖s‖ + ‖t‖)²).
4. Let (λ̂, x̂) be an eigenpair of T(·) with ‖x̂‖ = 1 and x̂*T′(λ̂)x̂ ≠ 0, and suppose that T(λ̂) = T(λ̂)*. Then there exist ε > 0 and τ > 0 such that the one-sided Rayleigh functional p(·) is defined in Kε(x̂), and

|p(x) − λ̂| ≤ (8/3) (‖T(λ̂)‖/|x̂*T′(λ̂)x̂|) tan² ξ,

where ξ := ∠(Span{x}, Span{x̂}).
5. Let (λ̂, x̂) be an eigenpair of T(·) with ‖x̂‖ = 1 and x̂*T′(λ̂)x̂ ≠ 0. Then there exist ε > 0 and τ > 0 such that the one-sided Rayleigh functional p(·) is defined in Kε(x̂), and

|p(x) − λ̂| ≤ (10/3) (‖T(λ̂)‖/|x̂*T′(λ̂)x̂|) tan ξ,

where ξ := ∠(Span{x}, Span{x̂}).
6. Let x̂ be a right eigenvector of T(·) corresponding to λ̂ with x̂*T′(λ̂)x̂ ≠ 0. The one-sided Rayleigh functional p is stationary at x̂ only if x̂ is also a left eigenvector.
7. The generalized Rayleigh quotient pL is obtained when applying Newton's method to the equation defining the two-sided Rayleigh functional for fixed x and y.
8. [Lan61] Let (ŷ, λ̂, x̂) be an eigentriplet of T(·) with ŷ*T′(λ̂)x̂ ≠ 0. Then the generalized Rayleigh quotient pL is stationary at (ŷ, λ̂, x̂).
9. Under the conditions of Fact 1 the generalized Rayleigh quotient pL is defined for all λ ∈ B(λ̂, τ) and (x, y) ∈ Kε(x̂) × Kε(ŷ), and

|pL(y, λ, x) − λ̂| ≤ (4‖T(λ̂)‖/|ŷ*T′(λ̂)x̂|) tan ξ tan η + (2L/|ŷ*T′(λ̂)x̂|) |λ − λ̂|²/(cos ξ cos η),

where L denotes the Lipschitz constant of T′(·).

115.5 Methods for dense eigenvalue problems

The size of the eigenvalue problems that can be treated with the numerical methods considered in this section is limited to a few thousand, depending on the available storage capacity. Moreover, these methods require several factorizations of varying matrices to approximate one eigenvalue, and are therefore not appropriate for large and sparse problems. However, they are needed to solve the projected eigenproblems in most of the iterative projection methods for sparse problems.
For general nonlinear eigenvalue problems, the classical approach is to formulate the eigenvalue problem as a system of nonlinear equations and to use variations of Newton's method. These methods are local and therefore not guaranteed to converge, but as for linear eigenvalue problems their basin of convergence can be enlarged using homotopy methods [DP01] or trust region strategies [YMW07].

Facts:

1. [Kub70] Let T(λ)P(λ) = Q(λ)R(λ) be the QR factorization of T(λ), where P(λ) is a permutation matrix chosen such that the diagonal elements rjj(λ) of R(λ) are decreasing in magnitude, i.e., |r11(λ)| ≥ |r22(λ)| ≥ · · · ≥ |rnn(λ)|. Then λ is an eigenvalue of T(·) if and only if rnn(λ) = 0. Applying Newton's method to this equation, one obtains the iteration

λ_{k+1} = λk − 1/(e_n^T Q(λk)* T′(λk) P(λk) R(λk)^{-1} e_n)

for approximations to an eigenvalue of problem T(λ)x = 0, where en denotes the nth unit vector. Approximations to left and right eigenvectors can be obtained from yk = Q(λk)en and xk = P(λk)R(λk)^{-1}en.
However, this relatively simple idea is not efficient, since it computes eigenvalues one at a time and needs several O(n³) factorizations per eigenvalue. It is, however, useful in the context of iterative refinement of computed eigenvalues and eigenvectors.
2. [AR68] Applying Newton's method to the nonlinear system

F(x, λ) := [T(λ)x; v*x − 1] = 0,

where v ∈ C^n is suitably chosen, one obtains the inverse iteration given in Algorithm 1. Being a variant of Newton's method, it converges locally and quadratically for simple eigenpairs.

Algorithm 1: Inverse iteration
Require: Initial pair (λ0, x0) and normalization vector v with v*x0 = 1
1: for k = 0, 1, 2, … until convergence do
2:   solve T(λk)u_{k+1} = T′(λk)xk for u_{k+1}
3:   λ_{k+1} ← λk − (v*xk)/(v*u_{k+1})
4:   normalize x_{k+1} ← u_{k+1}/(v*u_{k+1})
5: end for
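A minimal sketch of Algorithm 1 (Python; the test problem (115.2) and all starting values are our choice, and only the callables T and Tprime change for another problem):

import numpy as np

A0 = np.array([[0.0, 1.0], [-2.0, 3.0]])
A1 = np.array([[7.0, -5.0], [10.0, -8.0]])
T = lambda lam: A0 + lam * A1 + lam**2 * np.eye(2)
Tprime = lambda lam: A1 + 2.0 * lam * np.eye(2)

def inverse_iteration(lam, x, v, maxit=25, tol=1e-12):
    for _ in range(maxit):
        u = np.linalg.solve(T(lam), Tprime(lam) @ x)   # step 2
        lam = lam - (v @ x) / (v @ u)                  # step 3
        x = u / (v @ u)                                # step 4: v^* x = 1
        if np.linalg.norm(T(lam) @ x) < tol:
            break
    return lam, x

lam, x = inverse_iteration(0.9, np.array([1.0, 2.1]), np.array([1.0, 0.0]))
print(lam)   # converges locally and quadratically to the simple eigenvalue 1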

3. If T(·) is Hermitian such that the general conditions of Section 115.3 are satisfied, one obtains the Rayleigh functional iteration if the update of λ_{k+1} in step 3 of Algorithm 1 is replaced with λ_{k+1} ← p(u_{k+1}). This method converges locally and cubically [Rot89] for simple eigenpairs.
4. [Lan61, Ruh73] Replacing the vector v in the normalization step of inverse iteration for a general matrix function T(·) with vk = T(λk)*yk, where yk is an approximation to a left eigenvector, the update for λ becomes

λ_{k+1} ← λk − (yk*T(λk)xk)/(yk*T′(λk)xk),

which is the generalized Rayleigh quotient pL.
5. [Sch08] For general T(·) and simple eigentriplets (ŷ, λ̂, x̂) cubic convergence is also achieved by the two-sided Rayleigh functional iteration in Algorithm 2. If the linear system in step 2 is solved by factorizing T(λk), then the factorization can be reused for the system in step 3. Hence, the cost of one iteration step is similar to that of the one-sided Rayleigh functional iteration.

Algorithm 2: Two-sided Rayleigh functional iteration
Require: Initial triplet (y0, λ0, x0) with x0*x0 = y0*y0 = 1
1: for k = 0, 1, 2, … until convergence do
2:   solve T(λk)u_{k+1} = T′(λk)xk for u_{k+1}; x_{k+1} ← u_{k+1}/‖u_{k+1}‖
3:   solve T(λk)*v_{k+1} = T′(λk)*yk for v_{k+1}; y_{k+1} ← v_{k+1}/‖v_{k+1}‖
4:   solve y_{k+1}*T(λ_{k+1})x_{k+1} = 0 for λ_{k+1}
5: end for

6. [Neu85] The cost for solving a linear system in each iteration step with a varying matrix is avoided in the residual inverse iteration in Algorithm 3 where the matrix T (λ0) is fixed during the whole iteration (or at least for several steps).

Algorithm 3: Residual inverse iteration
Require: Initial pair (λ0, x0) and normalization vector w with w*x0 = 1
1: for k = 0, 1, 2, … until convergence do
2:   solve w*T(λ0)^{-1}T(λ_{k+1})xk = 0 for λ_{k+1}
3:   solve T(λ0)uk = T(λ_{k+1})xk for uk
4:   set v_{k+1} ← xk − uk and normalize x_{k+1} ← v_{k+1}/(w*v_{k+1})
5: end for
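A sketch of Algorithm 3 (Python; T(λ0) is factorized only once, the scalar equation in step 2 is solved here by a few inner Newton steps on f(λ) = w*T(λ0)^{-1}T(λ)xk, and problem and starting data are again our choice):

import numpy as np
from scipy.linalg import lu_factor, lu_solve

A0 = np.array([[0.0, 1.0], [-2.0, 3.0]])
A1 = np.array([[7.0, -5.0], [10.0, -8.0]])
T = lambda lam: A0 + lam * A1 + lam**2 * np.eye(2)
Tprime = lambda lam: A1 + 2.0 * lam * np.eye(2)

def residual_inverse_iteration(lam, x, w, maxit=30, tol=1e-12):
    LU = lu_factor(T(lam))                    # factorize T(lam0) once
    for _ in range(maxit):
        for _ in range(5):                    # step 2: scalar Newton iteration
            f = w @ lu_solve(LU, T(lam) @ x)
            fp = w @ lu_solve(LU, Tprime(lam) @ x)
            lam = lam - f / fp
        u = lu_solve(LU, T(lam) @ x)          # step 3
        x = x - u                             # step 4
        x = x / (w @ x)
        if np.linalg.norm(T(lam) @ x) < tol:
            break
    return lam, x

lam, x = residual_inverse_iteration(0.8, np.array([1.0, 2.2]), np.array([1.0, 0.0]))
print(lam)   # tends to the eigenvalue 1 of (115.2)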

If T(·) is Hermitian and λ̂ ∈ R, then the convergence of Algorithm 3 can be improved by determining λ_{k+1} in step 2 via the Rayleigh functional, i.e., solving xk*T(λ_{k+1})xk = 0 for λ_{k+1}. If T(·) is twice continuously differentiable and λ̂ is algebraically simple, then the residual inverse iteration converges for all (λ0, x0) sufficiently close to (λ̂, x̂), and

‖x_{k+1} − x̂‖/‖xk − x̂‖ = O(|λ0 − λ̂|) and |λ_{k+1} − λ̂| = O(‖xk − x̂‖^t),

where t = 2 in the Hermitian case if λ_{k+1} is updated via the Rayleigh functional, and t = 1 in the general case.
7. [Ruh73] The first order approximation T(λ + σ)x = T(λ)x + σT′(λ)x + o(|σ|) suggests the method of successive linear problems in Algorithm 4, which also converges quadratically for simple eigenvalues.

Algorithm 4: Method of successive linear problems

Require: Initial approximation λ0
1: for k = 0, 1, … until convergence do
2:   solve the linear eigenproblem T(λk)u = θT′(λk)u
3:   choose an eigenvalue θ smallest in modulus
4:   λ_{k+1} = λk − θ
5: end for

If λ̂ is a semi-simple eigenvalue, xk converges to a right eigenvector x̂. If ŷ is a left eigenvector corresponding to λ̂ such that ŷ*T′(λ̂)x̂ ≠ 0 (which is guaranteed for a simple eigenvalue), then the convergence factor is given by (cf. [Jar12])

c := lim_{k→∞} (λ_{k+1} − λ̂)/(λk − λ̂)² = (1/2) (ŷ*T″(λ̂)x̂)/(ŷ*T′(λ̂)x̂).
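A sketch of Algorithm 4 (Python; the linear eigenproblem in step 2 is solved with scipy.linalg.eig, and the data are again the quadratic (115.2), our choice):

import numpy as np
from scipy.linalg import eig

A0 = np.array([[0.0, 1.0], [-2.0, 3.0]])
A1 = np.array([[7.0, -5.0], [10.0, -8.0]])
T = lambda lam: A0 + lam * A1 + lam**2 * np.eye(2)
Tprime = lambda lam: A1 + 2.0 * lam * np.eye(2)

lam = 0.9 + 0.0j                          # lam may move through the complex plane
for _ in range(20):
    theta, U = eig(T(lam), Tprime(lam))   # step 2: T(lam_k) u = theta T'(lam_k) u
    j = np.argmin(np.abs(theta))          # step 3: eigenvalue smallest in modulus
    lam = lam - theta[j]                  # step 4
    if np.abs(theta[j]) < 1e-12:
        break
print(lam)   # quadratic convergence to the simple eigenvalue 1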

8. [Wer70] If the nonlinear eigenvalue problem allows for a variational characterization of its eigenvalues, then the safeguarded iteration, which aims at a particular eigenvalue, is a natural choice.

Algorithm 5: Safeguarded iteration for determining an mth eigenvalue

Require: Approximation λ0 to an mth eigenvalue
1: for k = 0, 1, … until convergence do
2:   determine an eigenvector xk corresponding to the mth largest eigenvalue of T(λk)
3:   solve xk*T(λ_{k+1})xk = 0 for λ_{k+1}
4: end for

Under the conditions of Section 115.3, the safeguarded iteration has the following properties [NV10] (a numerical sketch follows below):
(i) If λ̂1 := inf_{x ∈ D(p)} p(x) ∈ J and x0 ∈ D(p), then the safeguarded iteration with m = 1 converges globally to λ̂1.
(ii) If T(·) is continuously differentiable and λ̂m is a simple eigenvalue, then the safeguarded iteration converges locally and quadratically to λ̂m.
(iii) Let T(·) be twice continuously differentiable and T′(λ̂m) be positive definite. If xk in step 2 is chosen to be an eigenvector corresponding to the mth largest eigenvalue of the generalized eigenvalue problem T(λk)x = µT′(λk)x, then the convergence is even cubic.
9. [SX11] For higher dimensions n it is too costly to solve the occurring linear systems exactly. Szyld and Xue [SX11] studied inexact versions of inverse iteration and residual inverse iteration and proved that the same order of convergence can be achieved as for the exact methods if the respective linear systems are solved sufficiently accurately.
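A sketch of Algorithm 5 for the overdamped pencil used in the examples of Section 115.3 (Python; on the interval J = (γ−, 0) the functional p+ applies, so step 3 amounts to taking the larger root of the quadratic equation x*Q(λ)x = 0; the diagonal data are our choice):

import numpy as np

A = np.eye(2)
B = np.diag([10.0, 12.0])
C = np.diag([1.0, 2.0])
Q = lambda lam: lam**2 * A + lam * B + C

def p_plus(x):                     # Rayleigh functional on J = (gamma_-, 0)
    a, b, c = x @ A @ x, x @ B @ x, x @ C @ x
    return (-b + np.sqrt(b**2 - 4.0 * a * c)) / (2.0 * a)

m = 1                              # aiming at a first eigenvalue
lam = -0.3
for _ in range(30):
    w, V = np.linalg.eigh(Q(lam))  # step 2: eigenvalues in ascending order
    x = V[:, -m]                   # eigenvector of the m-th largest eigenvalue
    lam_new = p_plus(x)            # step 3
    if abs(lam_new - lam) < 1e-14:
        break
    lam = lam_new
print(lam)                         # the first eigenvalue of Q(.) in J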

115.6 Iterative projection methods

For sparse linear eigenvalue problems Ax = λx iterative projection methods like the Lanczos, Arnoldi, rational Krylov or Jacobi–Davidson method are very efficient. Here the dimension of the eigenproblem is reduced by projecting it to a subspace of much smaller dimension, and the reduced problem is solved by a fast technique for dense problems. The subspaces are expanded in the course of the algorithm in an iterative way with the aim that some of the eigenvalues of the reduced matrix become good approximations to some of the wanted eigenvalues of the given large matrix.
Two types of iterative projection methods are in use: methods which expand the subspaces independently of the eigenpair of the projected problem and which take advantage of a normal form of A, like the Arnoldi, Lanczos, and rational Krylov method, and methods which aim at a particular eigenpair and choose the expansion such that it has a high approximation potential for a wanted eigenvector, like the Jacobi–Davidson method.
For general nonlinear eigenproblems a normal form does not exist. Therefore, generalizations of iterative projection methods to general nonlinear eigenproblems always have to be of the second type. There are essentially two methods of this type: the Jacobi–Davidson method (and its two-sided version), which is based on inverse iteration, and the nonlinear Arnoldi method, which is based on residual inverse iteration.

Jacobi–Davidson method
Assume that we are given a search space V and a matrix V with orthonormal columns containing a basis of V. Let (θ, y) be an eigenpair of the projected problem V*T(θ)Vy = 0 and x = Vy the corresponding Ritz vector. A direction with high approximation potential is given by inverse iteration, v = T(θ)^{-1}T′(θ)x; however, replacing v with an inexact solution of the linear system T(θ)v = T′(θ)x will spoil the favorable approximation properties of inverse iteration.
Actually, we are not interested in the direction v but in an expansion of V which contains v, and for every α ≠ 0 the vector t = x + αv is as qualified as v. It was shown in [Vos07] that the most robust expansion of this type is obtained if x and t := x + αv are orthogonal, and it is easily seen that this t solves the so-called correction equation

(I − (T′(θ)xx*)/(x*T′(θ)x)) T(θ) (I − (xx*)/(x*x)) t = T(θ)x,  t ⊥ x.

The resulting iterative projection method is called Jacobi–Davidson method, a template of which is given in Algorithm 6.

Algorithm 6: Nonlinear Jacobi–Davidson method
Require: Initial basis V, V*V = I; m = 1
1: determine K ≈ T(σ)^{-1}, σ close to the first wanted eigenvalue
2: while m ≤ number of wanted eigenvalues do
3:   compute an approximation θ to the mth wanted eigenvalue and corresponding eigenvector y of the projected problem TV(θ)y := V*T(θ)Vy = 0
4:   determine the Ritz vector u = Vy and the residual r = T(θ)u
5:   if ‖r‖/‖u‖ < ε then
6:     accept approximate eigenpair (λm, xm) := (θ, u); increase m ← m + 1;
7:     reduce search space V if indicated
8:     determine new preconditioner K ≈ T(λm)^{-1} if necessary
9:     choose approximation (θ, u) to next eigenpair
10:    compute residual r = T(θ)u;
11:  end if
12:  Find approximate solution of correction equation

(I − (T′(θ)uu*)/(u*T′(θ)u)) T(θ) (I − (uu*)/(u*u)) t = −r,  t ⊥ u   (115.6)

(by a preconditioned Krylov solver, e.g.)
13:  orthogonalize t = t − VV*t, v = t/‖t‖, and expand subspace V = [V, v]
14:  update projected problem
15: end while

Facts:

1. The Jacobi–Davidson method was introduced for polynomial eigenproblems in [SBF96] and studied for general nonlinear eigenvalue problems in [BV04, Vos07a].
2. As in the linear case the correction equation (115.6) does not have to be solved exactly to maintain fast convergence, but usually a few steps of a Krylov subspace solver with an appropriate preconditioner suffice to obtain a good expansion direction of the search space.
3. In the correction equation (115.6) the operator T(θ) is restricted to map the subspace u⊥ into itself. Hence, if K^{-1} ≈ T(θ) is a preconditioner of T(θ), then a preconditioner

for an iterative solver of (115.6) should be modified correspondingly to

K̃ := (I − (T′(θ)uu*)/(u*T′(θ)u)) K^{-1} (I − (uu*)/(u*u)).

With left-preconditioning equation (115.6) becomes

K̃^{-1}T̃(θ)t = −K̃^{-1}r,  t ⊥ u,  where T̃(θ) := (I − (T′(θ)uu*)/(u*T′(θ)u)) T(θ) (I − (uu*)/(u*u)).

Taking into account the projectors in the preconditioner, i.e., using K̃ instead of K^{-1} in a preconditioned Krylov solver, raises the cost only slightly: in every step one has to solve one linear system Kw = y, and initializing the solver requires only one additional solve (see the sketch after this list).
4. In step 1 of Algorithm 6 any preinformation such as a small number of known approximate eigenvectors of problem (115.1) corresponding to eigenvalues close to σ or of eigenvectors of a contiguous problem can and should be used. If no information on eigenvectors is at hand, and if one is interested in eigenvalues close to the parameter σ ∈ D, one can choose an initial vector at random, execute a few Arnoldi steps for the linear eigenproblem T(σ)u = θu or T(σ)u = θT′(σ)u, and choose the eigenvector corresponding to the smallest eigenvalue in modulus or a small number of Schur vectors as initial basis of the search space. Starting with a random vector without this preprocessing usually yields a value λm in step 4 which is far away from σ and averts convergence.
5. As the subspaces expand in the course of the algorithm, the increasing storage and the computational cost for solving the projected eigenvalue problems may make it necessary to restart the algorithm and purge some of the basis vectors. Since a restart destroys information on the eigenvectors, and particularly on the one the method is just aiming at, the method is restarted only if an eigenvector has just converged. An obvious way to restart is to determine a Ritz pair (µ, u) from the projection to the current search space span(V) approximating an eigenpair wanted next, and to restart the Jacobi–Davidson method with this single vector u. However, this may discard too much valuable information contained in span(V) and may slow down the convergence too much. Therefore, thick restarts with subspaces spanned by the Ritz vector u and a small number of eigenvector approximations obtained in previous steps which correspond to eigenvalues closest to µ are preferable.
6. A crucial point in iterative methods for general nonlinear eigenvalue problems when approximating more than one eigenvalue is to inhibit the method from converging to the same eigenvalue repeatedly. For linear eigenvalue problems locking of already converged eigenvectors can be achieved using an incomplete Schur factorization. For nonlinear problems allowing for a variational characterization of their eigenvalues one can determine the eigenpairs one after another, solving the projected problems by safeguarded iteration [BV04]. For general nonlinear eigenproblems a locking procedure based on invariant pairs was introduced in [Eff12] (cf. Section 115.7).
7. Often the matrix function T(·) is given in the form T(λ) := Σ_{j=1}^{m} fj(λ)Aj, where fj : Ω → C are continuous functions and Aj ∈ C^{n×n} are fixed matrices. Then the projected problem can be updated easily, appending one row and one column to each of the projected matrices V*AjV.
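Fact 3 can be realized with one preconditioner application per Krylov step. The following sketch (Python; solveK stands for an application of the given preconditioner, e.g. a back substitution with precomputed factors of T(σ), and is an assumption of ours): once Kp = solveK(T′(θ)u) has been precomputed, each call costs a single solve and returns a vector orthogonal to u.

import numpy as np

def projected_precond(solveK, u, p):
    """Apply the projected preconditioner of Fact 3: for z := apply(y) one has
    u^* z = 0 and (I - p u^*/(u^* p)) K^{-1} z = (I - p u^*/(u^* p)) y, i.e.,
    the preconditioner equation is solved in the complement of u."""
    Kp = solveK(p)                          # one additional solve to initialize
    def apply(y):
        Ky = solveK(y)                      # the one solve per iteration step
        return Ky - (u.conj() @ Ky) / (u.conj() @ Kp) * Kp
    return apply

# toy usage with a random "preconditioner" K and vectors u, p = T'(theta)u
rng = np.random.default_rng(0)
n = 4
K = rng.standard_normal((n, n)) + n * np.eye(n)
u = rng.standard_normal(n)
p = rng.standard_normal(n)
M = projected_precond(lambda y: np.linalg.solve(K, y), u, p)
print(abs(u @ M(rng.standard_normal(n))))   # ~ 1e-16: result is orthogonal to u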

Two-sided Jacobi–Davidson method
In Algorithm 6 approximations to an eigenvalue are obtained in step 3 from a Galerkin projection of T(λ)x = 0 to the search space Span(V) for right eigenvectors. Computing a left search space also, with a correction equation for left eigenvectors, and applying a Petrov–Galerkin projection, one arrives at the two-sided Jacobi–Davidson method in Algorithm 7 (where only the computation of one eigentriplet is considered):

Algorithm 7: Two-sided Jacobi–Davidson method
Require: Initial bases U with U*U = I and V with V*V = I
1: while not converged do
2:   solve V*T(θ)Uc = 0 and U*T(θ)*Vd = 0 for (θ, c, d)
3:   determine Ritz vectors u = Uc and v = Vd and residuals ru = T(θ)u, rv = T(θ)*v
4:   if min(‖ru‖/‖u‖, ‖rv‖/‖v‖) < ε then
5:     accept approximate eigentriplet (v, θ, u); STOP
6:   end if
7:   Solve (approximately) the correction equations

(I − (T′(θ)uv*)/(v*T′(θ)u)) T(θ) (I − (uu*)/(u*u)) s = −ru,  s ⊥ u,

(I − (T′(θ)*vu*)/(u*T′(θ)*v)) T(θ)* (I − (vv*)/(v*v)) t = −rv,  t ⊥ v

8:   orthogonalize s = s − UU*s, s = s/‖s‖, and expand the right search space U = [U, s]
9:   orthogonalize t = t − VV*t, t = t/‖t‖, and expand the left search space V = [V, t]
10: end while

Facts:

8. [Sch08] θ as computed in step 2 is the value of the two-sided Rayleigh functional at (u, v), and one may therefore expect local cubic convergence for simple eigenvalues.
9. [HS03] The correction equations in step 7 of Algorithm 7 can be replaced with

(I − (T′(θ)uv*)/(v*T′(θ)u)) T(θ) (I − (T′(θ)uv*)/(v*T′(θ)u)) s = −ru,  s ⊥ u,

(I − (T′(θ)*vu*)/(u*T′(θ)*v)) T(θ)* (I − (T′(θ)*vu*)/(u*T′(θ)*v)) t = −rv,  t ⊥ v.

This variant was suggested in [HS03] for linear eigenvalue problems, and its generalization to the nonlinear problem is obvious. Since again θ is the value of the two-sided Rayleigh functional, the convergence should also be cubic.
10. [SS06] Replacing the correction equations with

(I − vv*) T(θ) (I − uu*) s = −ru,  s ⊥ u,

(I − uu*) T(θ)* (I − vv*) t = −rv,  t ⊥ v,

one obtains the primal-dual Jacobi–Davidson method, which was shown to be quadratically convergent.

Nonlinear Arnoldi method
Expanding the current search space V by the direction v̂ = x − T(σ)^{-1}T(θ)x suggested by residual inverse iteration generates similar robustness problems as for inverse iteration. If v̂ is close to the desired eigenvector, then an inexact evaluation of v̂ spoils the favorable approximation properties of residual inverse iteration.

Similarly as in the Jacobi–Davidson method, one could replace v̂ by z := x + αv̂, where α is chosen such that x*z = 0, and one could determine an approximation to z by solving a correction equation. However, since the new search direction is orthonormalized against the previous search space V and since x is contained in V, we may choose the new direction v = T(σ)^{-1}T(θ)x as well. This direction satisfies the orthogonality condition x*v = 0 at least in the limit as θ approaches a simple eigenvalue λ̂ (cf. [Vos07]), i.e., lim_{θ→λ̂} x*T(σ)^{-1}T(θ)x = 0.
A template for the preconditioned nonlinear Arnoldi method with restarts and varying preconditioner is just like Algorithm 6; only step 12 has to be replaced with t = Kr.
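A compact sketch of the resulting nonlinear Arnoldi iteration for the quadratic problem (115.2) (Python; the projected quadratic eigenproblem is solved by linearization, the preconditioner is one fixed LU factorization of T(σ), and all concrete choices — σ, the starting basis, tolerances — are ours):

import numpy as np
from scipy.linalg import eig, lu_factor, lu_solve

A0 = np.array([[0.0, 1.0], [-2.0, 3.0]])
A1 = np.array([[7.0, -5.0], [10.0, -8.0]])
A2 = np.eye(2)
T = lambda lam: A0 + lam * A1 + lam**2 * A2

sigma = 0.7                              # pole close to the wanted eigenvalue
LU = lu_factor(T(sigma))                 # fixed preconditioner K ~ T(sigma)^{-1}

V = np.array([[1.0], [2.2]])
V /= np.linalg.norm(V)
for _ in range(10):
    # projected problem V^* T(theta) V y = 0, solved by linearization
    P0, P1, P2 = (V.T @ M @ V for M in (A0, A1, A2))
    k = P0.shape[0]
    Ac = np.block([[np.zeros((k, k)), np.eye(k)], [-P0, -P1]])
    Bc = np.block([[np.eye(k), np.zeros((k, k))], [np.zeros((k, k)), P2]])
    theta, Y = eig(Ac, Bc)
    j = np.argmin(np.abs(theta - sigma))      # Ritz value closest to the pole
    th = theta[j].real                        # real for this example
    u = (V @ Y[:k, j]).real
    u /= np.linalg.norm(u)
    r = T(th) @ u                             # residual
    if np.linalg.norm(r) < 1e-10:
        break
    t = lu_solve(LU, r)                       # expansion t = K r
    t -= V @ (V.T @ t)                        # orthogonalize against V
    V = np.hstack([V, t.reshape(-1, 1) / np.linalg.norm(t)])
print(th, u)                                  # eigenpair (1, [1; 2]/sqrt(5)) up to sign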

Facts:

11. The general remarks about the initial approximation to the eigenvector, restarts, and locking following the Jacobi–Davidson method apply to the nonlinear Arnoldi method as well.
12. Since the residual inverse iteration with fixed pole σ converges (at least) linearly with contraction rate O(|σ − λm|), it is reasonable to update the preconditioner if the convergence (measured by the quotient of the last two residual norms before convergence) has become too slow.
13. The nonlinear Arnoldi method was introduced for quadratic eigenvalue problems in [Mee01] and for general nonlinear eigenvalue problems in [Vos04].
14. [LBL10] studies a variant that avoids complex arithmetic by augmenting the search space by two vectors, the real and imaginary part of the expansion t = Kr.

115.7 Methods using invariant pairs

One of the most important problems when determining more than one eigenpair of a nonlinear eigenvalue problem is to prevent the method from determining the same pair repeatedly. Jordan chains are conceptually elegant but unstable under perturbations. More robust concepts for computing several eigenvalues along with the corresponding (generalized) eigenvectors were introduced only recently and are based on invariant pairs [Kre09, BT09].
It is convenient to consider the nonlinear eigenvalue problem in the following form:

T(λ)x := Σ_{j=1}^{m} fj(λ)Ajx = 0   (115.7)

where fj : Ω → C are analytic functions and Aj ∈ C^{n×n} are fixed matrices.

Definitions:
Let the eigenvalues of S ∈ C^{k×k} be contained in Ω and let X ∈ C^{n×k}. Then (X, S) is called an invariant pair of the nonlinear eigenvalue problem (115.7) if

Σ_{j=1}^{m} AjXfj(S) = 0.

A pair (X, S) ∈ C^{n×k} × C^{k×k} is minimal if there is ℓ ∈ N such that the matrix

Vℓ(X, S) := [X; XS; …; XS^{ℓ−1}]

has rank k. The smallest such ℓ is called the minimality index of (X, S).
An invariant pair (X, S) is called simple if (X, S) is minimal and the algebraic multiplicities of the eigenvalues of S are identical to the ones of the corresponding eigenvalues of T(·).

Facts:
The following facts for which no specific reference is given can be found in [Kre09].
1. Let (X, S) be a minimal invariant pair of (115.7). Then the eigenvalues of S are eigenvalues of T(·).
2. By the Cayley–Hamilton theorem the minimality index of a minimal pair cannot exceed k.
3. [BK11] For a regular matrix polynomial of degree m the minimality index of a minimal invariant pair cannot exceed m.
4. [Eff12] Let p0, …, p_{ℓ−1} be a basis for the polynomials of degree less than ℓ. Then the pair (X, S) is minimal with minimality index at most ℓ if and only if

V^p_ℓ(X, S) := [Xp0(S); Xp1(S); …; Xp_{ℓ−1}(S)]

has full column rank.
5. [BK11] If Vℓ(X, S) has rank k̃ < k, then there is a minimal pair (X̃, S̃) ∈ C^{n×k̃} × C^{k̃×k̃} such that Span(X̃) = Span(X) and Span(Vℓ(X̃, S̃)) = Span(Vℓ(X, S)).
6. If (X, S) is a minimal invariant pair, then (XZ, Z^{-1}SZ) is also a minimal invariant pair for every invertible matrix Z ∈ C^{k×k}.
7. Let (X, S) be a minimal invariant pair, and let pj ∈ Πk be the Hermite interpolating polynomials of fj at the spectrum of S of maximum degree k. Then (X, S) is a minimal invariant pair of P(λ)x := Σ_{j=1}^{m} pj(λ)Ajx = 0.
8. Let (λj, xj), j = 1, …, k, be eigenpairs of T(·) with λi ≠ λj for i ≠ j. Then the invariant pair (X, S) := ([x1, …, xk], diag(λ1, …, λk)) is minimal.
9. Consider the nonlinear matrix operator

T : C^{n×k} × C^{k×k}_Ω → C^{n×k},  T(X, S) := Σ_{j=1}^{m} AjXfj(S),   (115.8)

where C^{k×k}_Ω denotes the set of k × k matrices with eigenvalues in Ω. Then an invariant pair (X, S) satisfies T(X, S) = 0, but this relation is not sufficient to characterize (X, S). To define a scaling condition, choose ℓ such that the matrix Vℓ(X, S) has rank k, and define the partition

W = [W0; W1; …; W_{ℓ−1}] := Vℓ(X, S)(Vℓ(X, S)*Vℓ(X, S))^{-1} ∈ C^{nℓ×k}

with Wj ∈ C^{n×k}. Then V(X, S) = 0 for the operator

V : C^{n×k} × C^{k×k}_Ω → C^{k×k},  V(X, S) := W*Vℓ(X, S) − Ik.

If (X, S) is a minimal invariant pair for the nonlinear eigenvalue problem T(λ)x = 0, then (X, S) is simple if and only if the linear matrix operator

L : C^{n×k} × C^{k×k} → C^{n×k} × C^{k×k},  (∆X, ∆S) ↦ (DT(∆X, ∆S), DV(∆X, ∆S))

is invertible, where DT and DV denote the Fréchet derivatives of T and V, respectively.
10. [Kre09] The last fact motivates applying Newton's method to the system T(X, S) = 0, V(X, S) = 0, which can be written as

(X_{p+1}, S_{p+1}) = (Xp, Sp) − L^{-1}(T(Xp, Sp), V(Xp, Sp)),

where L = (DT, DV) is the Jacobian of the system, with

DT(∆X, ∆S) = T(∆X, S) + Σ_{j=1}^{m} AjX[Dfj(S)](∆S),

DV(∆X, ∆S) = W0*∆X + Σ_{j=1}^{ℓ−1} Wj*(∆XS^j + X[DS^j](∆S)).

Algorithm 8: Newton's method for computing invariant pairs
Require: Initial pair (X0, S0) ∈ C^{n×k} × C^{k×k} such that Vℓ(X0, S0)*Vℓ(X0, S0) = Ik
1: p ← 0, W ← Vℓ(X0, S0)
2: repeat
3:   Res ← T(Xp, Sp)
4:   Solve the linear matrix equation L(∆X, ∆S) = (Res, O)
5:   X̃_{p+1} ← Xp − ∆X, S̃_{p+1} ← Sp − ∆S
6:   Compute the compact QR decomposition Vℓ(X̃_{p+1}, S̃_{p+1}) = WR
7:   X_{p+1} ← X̃_{p+1}R^{-1}, S_{p+1} ← RS̃_{p+1}R^{-1}
8: until convergence

11. [Bey12, BEK11]

T(X, S) = (1/(2πi)) ∫_Γ T(z)X(zI − S)^{-1} dz,

where Γ is a contour (i.e., a simply closed curve) in Ω containing the spectrum of S in its interior.
12. [BEK11]

DT(X, S)(∆X, ∆S) = (1/(2πi)) ∫_Γ T(z)(∆X + X(zI − S)^{-1}∆S)(zI − S)^{-1} dz.

13. [Eff12] Let (X, S) be a minimal invariant pair of T(·) with minimality index ℓ. If ([Y; V], M) is a minimal invariant pair of the augmented analytic matrix function T̂ : Ω → C^{(n+k)×(n+k)} whose action on vectors [y; v] is given by

T̂(µ) [y; v] = [ T([X, y], [S v; 0 µ]) e_{k+1} ; (V^p_{ℓ+1}(X, S))* V^p_{ℓ+1}([X, y], [S v; 0 µ]) e_{k+1} ],

with T as in Fact 11, V^p_{ℓ+1} analogous to Fact 4, and e_{k+1} = (0, …, 0, 1)^T ∈ R^{k+1}, then ([X, Y], [S V; 0 M]) is a minimal invariant pair of T(·). Conversely, for any minimal invariant pair ([X, Y], [S V; 0 M]) of T(·) there exists a unique F such that ([Y − XF; V − (SF − FM)], M) is a minimal invariant pair of T̂(·).
14. The previous fact suggests that working with T̂(·) deflates the minimal invariant pair (X, S) from T(·).
15. [Eff12] Effenberger combined the deflation in Fact 13 with the Jacobi–Davidson method to determine several eigenpairs of a nonlinear eigenvalue problem one after another in a safe way.
16. [GKS93] The pair (X, S) is minimal if and only if the (n + k) × k matrix

[λI − S; X]

has full rank for every λ ∈ C (or, equivalently, for every eigenvalue λ of S).
17. [GKS93] Let λ̂ be an eigenvalue of T(·) and X := [x0, …, x_{k−1}] ∈ C^{n×k} with x0 ≠ 0. Then x0, …, x_{k−1} is a Jordan chain at λ̂ if and only if (X, Jk(λ̂)) is an invariant pair of T(·), where Jk(λ̂) denotes a k × k Jordan block corresponding to λ̂.
18. [BEK11] Let λ̂ be an eigenvalue of T(·) and consider a matrix X = [X^(1), …, X^(p)], X^(i) = [x0^(i), …, x_{mi}^(i)], with x0^(i) ≠ 0. Then every x0^(i), …, x_{mi}^(i), i = 1, …, p, is a Jordan chain if and only if (X, J) with J := diag(J_{m1}(λ̂), …, J_{mp}(λ̂)) is an invariant pair of T(·). Moreover, (X, J) is minimal if and only if x0^(1), …, x0^(p) are linearly independent.
19. [SX12] Suppose that (X, S) is a simple invariant pair of (115.7), λ̂ an eigenvalue of S, and J = Z^{-1}SZ the Jordan canonical form of S. Assume that J has m Jordan blocks corresponding to λ̂, each of size ki × ki, 1 ≤ i ≤ m. Then there are exactly m Jordan chains of T(·) corresponding to λ̂, the length of each being ki, and the geometric multiplicity of λ̂ is m.
This fact demonstrates that the spectral structure of an eigenvalue λ̂ of a matrix function T(·), including the algebraic, partial, and geometric multiplicities together with all Jordan chains, is completely represented in a simple invariant pair (X, S) for which λ̂ is an eigenvalue of S.

Examples:

1. For the quadratic eigenvalue problem (115.2) with eigenvalue λ̂ = −1 and eigenvector x = [1; 1] the pair (X, S) := (x, λ̂) is a minimal invariant pair with minimality index 1, which is not simple, because the algebraic multiplicity of λ̂ is 2 as an eigenvalue of T(λ)x = 0 and only 1 as an eigenvalue of S.
The Jordan pair (X1, S1) with X1 = [1 1; 1 1] and S1 = [−1 1; 0 −1] is a minimal invariant pair with minimality index 2, which is simple, and the same is true for the pairs (X2, S2) with X2 = [1 1; 2 2] and S2 = [1 0; 0 2], and (X3, S3) with X3 := [X1, X2] and S3 := diag(S1, S2).
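The minimality indices in this example can be read off from ranks (a Python sketch; the residual Σ_j AjXfj(S) checks invariance as in the definition):

import numpy as np

A0 = np.array([[0.0, 1.0], [-2.0, 3.0]])
A1 = np.array([[7.0, -5.0], [10.0, -8.0]])
A2 = np.eye(2)

def residual(X, S):           # invariance: A0 X + A1 X S + A2 X S^2 = 0
    return A0 @ X + A1 @ X @ S + A2 @ X @ S @ S

X1 = np.array([[1.0, 1.0], [1.0, 1.0]]); S1 = np.array([[-1.0, 1.0], [0.0, -1.0]])
X2 = np.array([[1.0, 1.0], [2.0, 2.0]]); S2 = np.diag([1.0, 2.0])

for X, S in ((X1, S1), (X2, S2)):
    V2 = np.vstack([X, X @ S])                      # V_2(X, S) = [X; XS]
    print(np.linalg.norm(residual(X, S)),           # 0: (X, S) is invariant
          np.linalg.matrix_rank(X),                 # 1: V_1 = X is rank deficient
          np.linalg.matrix_rank(V2))                # 2: minimality index is 2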

115.8 The infinite Arnoldi method

Let T : Ω → C^{n×n} be analytic on a neighborhood Ω of the origin, and assume that λ = 0 is not an eigenvalue of T(·). To determine eigenvalues close to 0, [JMM12] use the equivalence of T(λ)x = 0 to a linear, infinite dimensional eigenvalue problem and apply the linear Arnoldi method, which can be reformulated into an iteration involving only standard linear algebra operations on matrices and vectors of finite dimension.

Definitions:
B(λ) := T(0)^{-1}(T(0) − T(λ))/λ for λ ≠ 0 and B(0) := −T(0)^{-1}T′(0) is analytic on Ω, and λ̂ is an eigenvalue of T(·) if and only if λ̂ is an eigenvalue of λB(λ)x = x.
Let D(B) := {φ ∈ C^∞(R, C^n) : Σ_{i=0}^{∞} ‖B^(i)(0)φ^(i)(0)‖ < ∞}, and define

(Bφ)(θ) := ∫_0^θ φ(θ̂) dθ̂ + C(φ),  C(φ) := Σ_{i=0}^{∞} (B^(i)(0)/i!) φ^(i)(0) = (B(d/dθ)φ)(0).   (115.9)

(Ψ, R) ∈ D(B)^p × C^{p×p} is an invariant pair of the operator B if (BΨ)(θ) = Ψ(θ)R.
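Fact 1 below is easy to check numerically for the quadratic example (115.2): for an eigenpair (λ̂, x̂) of T(·) one has λ̂B(λ̂)x̂ = x̂ (a Python sketch):

import numpy as np

A0 = np.array([[0.0, 1.0], [-2.0, 3.0]])
A1 = np.array([[7.0, -5.0], [10.0, -8.0]])
T = lambda lam: A0 + lam * A1 + lam**2 * np.eye(2)

def B(lam):   # B(lam) = T(0)^{-1} (T(0) - T(lam)) / lam, lam != 0
    return np.linalg.solve(T(0.0), T(0.0) - T(lam)) / lam

lam, x = 1.0, np.array([1.0, 2.0])   # eigenpair of (115.2)
print(lam * B(lam) @ x - x)          # zero vector: lam B(lam) x = x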

Facts: The following facts can be found in [JMM12].

1. Let x ∈ C^n \ {0}, λ ∈ Ω, and denote φ(θ) := xe^{λθ}. Then the following two statements are equivalent:
(i) (λ, x) is an eigenpair of T(·);
(ii) (λ, φ) is an eigenpair of the linear, infinite dimensional eigenvalue problem λBφ = φ.
2. All eigenfunctions of B depend exponentially on θ, i.e., if λBψ = ψ, then ψ(θ) = xe^{λθ}.
3. The (linear) Arnoldi method for the operator B is given in Algorithm 9. Here ⟨·, ·⟩ denotes a scalar product on C^∞(R, C^n), and Hk = (hik) ∈ C^{k×k} is the Hessenberg matrix constructed in the algorithm.

Algorithm 9: Arnoldi method for B

Require: Initial function φ1 with ⟨φ1, φ1⟩ = 1
1: for k = 1, 2, … until convergence do
2:   ψ ← Bφk
3:   for i = 1, …, k do
4:     hik ← ⟨ψ, φi⟩
5:     ψ ← ψ − hikφi
6:   end for
7:   h_{k+1,k} ← √⟨ψ, ψ⟩
8:   φ_{k+1} ← ψ/h_{k+1,k}
9: end for
10: Compute eigenvalues µi of the Hessenberg matrix Hk
11: Return 1/µi as approximations to eigenvalues of T(·)

4. Since the Arnoldi method favors extreme eigenvalues of B, the values 1/µi approximate eigenvalues of T(·) close to the origin.
5. If φ1 is a polynomial of degree k, then Bφ1 is a polynomial of degree k + 1. Hence, if φ1 is a constant function, then Algorithm 9 after N steps arrives at a Krylov space KN(B, φ1) = Span{φ1, …, φN} of vectors of polynomials of degree N − 1.
6. Let {qi}_{i=0,1,…} be a sequence of polynomials such that qi is of degree i with nonzero leading coefficient, and let q0 ≡ 1. Let LN ∈ R^{N×N} be an integration map

corresponding to {qi} such that

[q0(θ); q1(θ); …; q_{N−1}(θ)] = LN [q1′(θ); q2′(θ); …; qN′(θ)].

Let the columns of (x0, …, x_{N−1}) =: X ∈ C^{n×N} denote the vector coefficients in the basis {qi}, and denote a vector of polynomials φ(θ) := Σ_{i=0}^{N−1} qi(θ)xi.
If ψ(θ) = (Bφ)(θ) =: Σ_{i=0}^{N} qi(θ)yi, then the coefficients yi of Bφ are given by

(y1, …, yN) = XLN and y0 = Σ_{i=0}^{N−1} (B(d/dθ)qi)(0) xi − Σ_{i=1}^{N} qi(0) yi.

This fact permits reformulating Algorithm 9 into an iteration involving only standard linear algebra operations on matrices and vectors of finite dimension. In [JMM12] the details are worked out for two polynomial bases, the monomial basis qi = θ^i and Chebyshev polynomials. A sketch for the monomial basis follows after this list.
7. [JMM11] Suppose that S ∈ C^{p×p} is invertible and suppose that (Ψ, S^{-1}) is an invariant pair of B. Then Ψ can be expressed as Ψ(θ) = X exp(θS) for some matrix X ∈ C^{n×p}.
8. [JMM11] Assume that T(λ) := Σ_{j=1}^{m} fj(λ)Aj. Let S ∈ C^{p×p} be nonsingular and X ∈ C^{n×p}. The following two statements are equivalent:
(i) (Ψ, S^{-1}), Ψ(θ) := X exp(θS), is an invariant pair of the operator B;
(ii) (X, S) is an invariant pair of T(·), i.e., Σ_{j=1}^{m} AjXfj(S) = 0.
9. Inspired by the implicitly restarted Arnoldi method for linear eigenproblems, [JMM11] proposes a variant of the infinite Arnoldi method for the nonlinear eigenvalue problem T(λ)x = 0 which allows for locking already converged eigenpairs. The locked part of the partial Schur factorization for linear problems is replaced by invariant pairs. The method uses functions φ(θ) = Xe^{Sθ}c + q(θ), where X ∈ C^{n×p}, S ∈ C^{p×p}, c ∈ C^p, and q : C → C^n is a vector of polynomials.
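For the monomial basis qi = θ^i the integration map is LN = diag(1, 1/2, …, 1/N), and for the quadratic problem (115.2) one has B(λ) = −T(0)^{-1}(A1 + λA2) exactly, so C(φ) = B(0)x0 + B′(0)x1. A sketch of one application of B in this basis (Python; all concrete choices are ours), checked against the eigenfunction property Bφ = φ/λ of Fact 1:

import numpy as np
import math

A0 = np.array([[0.0, 1.0], [-2.0, 3.0]])
A1 = np.array([[7.0, -5.0], [10.0, -8.0]])
A2 = np.eye(2)

def apply_B(X):
    """One application of B in the monomial basis: the columns of X are the
    Taylor coefficients x_0, ..., x_{N-1} of phi (N >= 2 assumed);
    returns the coefficients y_0, ..., y_N of B phi."""
    n, N = X.shape
    Y = np.zeros((n, N + 1))
    Y[:, 1:] = X / np.arange(1, N + 1)      # integration: y_i = x_{i-1}/i
    # constant term C(phi) = B(0) x_0 + B'(0) x_1 = -T(0)^{-1} (A1 x_0 + A2 x_1)
    Y[:, 0] = -np.linalg.solve(A0, A1 @ X[:, 0] + A2 @ X[:, 1])
    return Y

lam, x = 1.0, np.array([1.0, 2.0])          # eigenpair of (115.2)
N = 6
X = np.column_stack([x * lam**i / math.factorial(i) for i in range(N)])
Y = apply_B(X)
print(np.linalg.norm(Y[:, :N] - X / lam))   # ~ 0: B phi = phi / lam (up to rounding)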

References
[AR68] P.M. Anselone, L.B. Rall, The solution of characteristic value-vector problems by Newton's method, Numer. Math., 11:38–45, 1968.
[AST09] J. Asakura, T. Sakurai, H. Tadano, T. Ikegami, K. Kimura, A numerical method for nonlinear eigenvalue problems using contour integrals, JSIAM Letters, 1:52–55, 2009.
[BV04] T. Betcke, H. Voss, A Jacobi–Davidson-type projection method for nonlinear eigenvalue problems, Future Generation Comput. Syst., 20:363–372, 2004.
[Bey12] W.-J. Beyn, An integral method for solving nonlinear eigenvalue problems, Linear Algebra Appl., 436:3839–3863, 2012.
[BEK11] W.-J. Beyn, C. Effenberger, D. Kressner, Continuation of eigenvalues and invariant pairs for parametrized nonlinear eigenvalue problems, Numer. Math., 119:489–516, 2011.
[BT09] W.-J. Beyn, V. Thümmler, Continuation of low-dimensional invariant subspaces in dynamical systems of large dimension, SIAM J. Matrix Anal. Appl., 31:1361–1381, 2009.
[BK11] T. Betcke, D. Kressner, Perturbation, extraction and refinement of invariant pairs, Linear Algebra Appl., 435:514–536, 2011.
[DP01] E.M. Daya, M. Potier-Ferry, A numerical method for nonlinear eigenvalue problems. Application to vibrations of viscoelastic structures, Computers & Structures, 79:533–541, 2001.
[Duf55] R.J. Duffin, A minimax theory for overdamped networks, J. Rat. Mech. Anal., 4:221–233, 1955.
[Eff12] C. Effenberger, Robust successive computation of eigenpairs for nonlinear eigenvalue problems, Tech. Report 27.2012, Math. Inst. of Comput. Sc. Engn., EPF Lausanne, 2012.
[GKS93] I. Gohberg, M.A. Kaashoek, F. van Schagen, On the local theory of regular analytic functions, Linear Algebra Appl., 182:9–25, 1993.
[GLR82] I. Gohberg, P. Lancaster, L. Rodman, Matrix Polynomials, Academic Press, New York, 1982.
[GR81] I. Gohberg, L. Rodman, Analytic matrix functions with prescribed local data, J. Analyse Math., 40:90–128, 1981.
[Had68] K.P. Hadeler, Variationsprinzipien bei nichtlinearen Eigenwertaufgaben, Arch. Ration. Mech. Anal., 30:297–307, 1968.
[HS03] M.E. Hochstenbach, G.L.G. Sleijpen, Two-sided and alternating Jacobi–Davidson, Linear Algebra Appl., 358:145–172, 2003.
[HL99] V. Hryniv, P. Lancaster, On the perturbation of analytic matrix functions, Integral Equations Operator Theory, 34:325–338, 1999.
[Jar12] E. Jarlebring, Convergence factors of Newton methods for nonlinear eigenvalue problems, Linear Algebra Appl., 436:3943–3953, 2012.
[JMM11] E. Jarlebring, W. Michiels, K. Meerbergen, Computing a partial Schur factorization of nonlinear eigenvalue problems using the infinite Arnoldi method, Tech. Report, Dept. Comp. Science, K.U. Leuven, 2011.
[JMM12] E. Jarlebring, W. Michiels, K. Meerbergen, A linear eigenvalue algorithm for the nonlinear eigenvalue problem, Numer. Math., accepted, 2012.
[Kre09] D. Kressner, A block Newton method for nonlinear eigenvalue problems, Numer. Math., 114:355–372, 2009.
[Kub70] V.N. Kublanovskaya, On an approach to the solution of the generalized latent value problem for λ-matrices, SIAM J. Numer. Anal., 7:532–537, 1970.
[Lan61] P. Lancaster, A generalised Rayleigh quotient iteration for lambda-matrices, Arch. Rat. Mech. Anal., 8:309–322, 1961.
[LBL10] B.-S. Liao, Z. Bai, L.-Q. Lee, K. Ko, Nonlinear Rayleigh–Ritz iterative method for solving large scale nonlinear eigenvalue problems, Taiwanese J. Math., 14:869–883, 2010.
[Mee01] K. Meerbergen, Locking and restarting quadratic eigenvalue solvers, SIAM J. Sci. Comput., 22:1814–1839, 2001.
[Neu85] A. Neumaier, Residual inverse iteration for the nonlinear eigenvalue problem, SIAM J. Numer. Anal., 22:914–923, 1985.
[NV10] V. Niendorf, H. Voss, Detecting hyperbolic and definite matrix polynomials, Linear Algebra Appl., 432:1017–1035, 2010.
[Rog64] E.H. Rogers, A minimax theory for overdamped systems, Arch. Ration. Mech. Anal., 16:89–96, 1964.
[Rot89] K. Rothe, Lösungsverfahren für nichtlineare Matrixeigenwertaufgaben mit Anwendungen auf die Ausgleichselementmethode, Ph.D. Thesis, Universität Hamburg, Germany, 1989.
[Ruh73] A. Ruhe, Algorithms for the nonlinear eigenvalue problem, SIAM J. Numer. Anal., 10:674–689, 1973.
[Sch08] K. Schreiber, Nonlinear Eigenvalue Problems: Newton-type Methods and Nonlinear Rayleigh Functionals, Ph.D. Thesis, TU Berlin, Germany, 2008.
[SS06] H. Schwetlick, K. Schreiber, A primal-dual Jacobi–Davidson-like method for nonlinear eigenvalue problems, Tech. Report ZIH-IR-0613, Technische Universität Dresden, Germany, 2006.
[SS10] H. Schwetlick, K. Schreiber, Nonlinear Rayleigh functionals, Linear Algebra Appl., 436:3991–4016, 2012.
[SBF96] G.L.G. Sleijpen, A.G.L. Booten, D.R. Fokkema, H.A. van der Vorst, Jacobi–Davidson type methods for generalized eigenproblems and polynomial eigenproblems, BIT Numerical Mathematics, 36:595–633, 1996.
[SX11] D. Szyld, F. Xue, Local convergence analysis of several inexact Newton-type algorithms for general nonlinear eigenvalue problems, Tech. Report 11-08-09, Temple University, Philadelphia, USA, 2011.
[SX12] D. Szyld, F. Xue, Several properties of invariant pairs of nonlinear algebraic eigenvalue problems, Tech. Report 12-02-09, Temple University, Philadelphia, USA, 2012.
[Vos03] H. Voss, A maxmin principle for nonlinear eigenvalue problems with application to a rational spectral problem in fluid–solid vibration, Appl. Math., 48:607–622, 2003.
[Vos04] H. Voss, An Arnoldi method for nonlinear eigenvalue problems, BIT Numerical Mathematics, 44:387–401, 2004.
[Vos07] H. Voss, A new justification of the Jacobi–Davidson method for large eigenproblems, Linear Algebra Appl., 424:448–455, 2007.
[Vos07a] H. Voss, A Jacobi–Davidson method for nonlinear and nonsymmetric eigenproblems, Computers & Structures, 85:1284–1292, 2007.
[Vos09] H. Voss, A minmax principle for nonlinear eigenproblems depending continuously on the eigenparameter, Numer. Linear Algebra Appl., 16:899–913, 2009.
[VW82] H. Voss, B. Werner, A minimax principle for nonlinear eigenvalue problems with applications to nonoverdamped systems, Math. Meth. Appl. Sci., 4:415–424, 1982.
[Wer70] B. Werner, Das Spektrum von Operatorenscharen mit verallgemeinerten Rayleighquotienten, Ph.D. Thesis, Universität Hamburg, Germany, 1970.
[YMW07] C. Yang, J.C. Meza, L.-W. Wang, A trust region direct constrained minimization algorithm for the Kohn–Sham equation, SIAM J. Sci. Comput., 29:1854–1875, 2007.