ARNOLDI METHODS FOR THE EIGENVALUE PROBLEM, GENERALIZED OR NOT

MARKO HUHTANEN∗

Abstract. Arnoldi methods are devised for polynomially computing either a Hessenberg-triangular form or a triangular-triangular+rank-one form to numerically solve large eigenvalue problems, generalized or not. If generalized, then no transformations into a standard form take place. If standard, then a new Arnoldi method arises. The equivalence transformations involved are unitary, or almost, and polynomially generated. The normal generalized eigenvalue structure is identified for which separate Arnoldi methods of moderate complexity are derived. Averaging elements of the Grassmannian Gr_k(C^n) is suggested to optimally generate subspaces in the codomain. Then the two canonical forms described get averaged, giving rise to an optimal Arnoldi method. For k = 1 this replaces the Rayleigh quotient by yielding the field of directed norms instead of the field of values.

Key words. generalized eigenvalue problem, Arnoldi method, Krylov subspace, polynomial method, normal eigenvalue problem, optimality, Rayleigh quotient, Grassmannian average

AMS subject classifications. 65F15, 65F50

1. Introduction. Consider a large eigenvalue problem

Mx = λNx (1.1)

of computing a few eigenvalues λ and corresponding eigenvectors x ∈ C^n. Here the matrices M, N ∈ C^{n×n} are possibly sparse, as is often the case in applications [18, 20, 24]. Using the generalized Schur decomposition as a starting point, an algorithm for the task consists of first computing Qk ∈ C^{n×k} with orthonormal columns, or almost, for k ≪ n. Depending on the cost, a number of suggestions have been made to this end; see [7] and [2, Chapter 8] as well as [22]. (There is a vast literature on iterative methods for eigenvalue problems; for a concise review on the generalized eigenvalue problem, see, e.g., [23, Section 7].) Thereafter Zk ∈ C^{n×k} with orthonormal columns is generated. The eigenvalue problem then gets reduced in dimension once some linear combinations A and (nonsingular) B of M and N are compressed by forming the partial equivalence transformation

Zk^*AQk and Zk^*BQk. (1.2)

In this paper, Arnoldi methods are devised for polynomially computing Qk and Zk in the domain and codomain. Without any transformations into a standard form taking place, classical iterations get covered in a natural way. A new Arnoldi method for the standard eigenvalue problem arises. The normal generalized eigenvalue structure is identified, admitting Arnoldi methods of moderate computational complexity to be devised for interior eigenvalues. For a given Qk, a criterion for optimally computing Zk is devised based on averaging two elements of the Grassmannian Gr_k(C^n). An optimal Arnoldi method for the eigenvalue problem arises. Altogether, it is shown that the standard and generalized eigenvalue problems should not be treated separately, and that the classical Arnoldi method is not optimal. To derive Arnoldi methods for (1.1), for appropriate Krylov subspaces inspect the associated resolvent operator

λ ↦ R(A, B, λ) = (λB − A)^{-1}.

∗ Division of Mathematics, Department of Electrical and Information Engineering, University of Oulu, 90570 Oulu 57, Finland ([email protected]).

Analogously to the "shift-and-invert" paradigm, here the choice of the linear combinations A and B is a delicate issue in view of convergence. The Neumann series expansion of the resolvent operator reveals a Krylov space structure for computing Qk in the domain. For computing Zk in the codomain, two Krylov subspaces surface in a natural way, both giving rise to finitely computable canonical forms.
With the first choice for a Krylov subspace to compute Zk, the matrix compressions attain the well-known Hessenberg-triangular form. Recall that the classical Arnoldi method is an iterative alternative to using elementary unitary transformations to convert a single matrix into a Hessenberg form. For the generalized eigenvalue problem, elementary unitary transformations can be used to bring a pair of matrices into a Hessenberg-triangular form [16].^1 Here an Arnoldi method for iteratively computing this form is devised. Even though we are dealing with the generalized eigenvalue problem, polynomially the Ritz values result from a standard Arnoldi minimization problem. (In [22, Section 3] the same partial form is derived purely algebraically, without establishing a polynomial connection.) Polynomials are important by the fact that they link iterative methods with other fields of mathematics, offering a large pool of tools accordingly; see [15, 11] and references therein.
With the second choice for a Krylov subspace to compute Zk, the matrix compressions attain a triangular-triangular+rank-one form. We have encountered such a structure neither in connection with the generalized eigenvalue problem nor with the classical Arnoldi method. Because of the symmetric roles of the matrices A and B, it can be argued that it is equally natural as the Hessenberg-triangular form by the fact that the Ritz values arise polynomially as well, although somewhat surprisingly, through a GMRES (generalized minimal residual) minimization problem. This hence provides an explicit link between the GMRES method and eigenvalue approximations. In the case of a standard eigenvalue problem, a triangular-companion matrix form arises.
There are Krylov subspace methods for computing Ritz values for normal matrices [11]. These methods can be applied to normal generalized eigenvalue problems. A generalized eigenvalue problem is said to be normal if the generalized Schur decomposition involves diagonal matrices. This leads to a natural extension of the classical notion of normality, providing computationally by far the most attractive setting. That is, for interior eigenvalues the normal case turns out to be strikingly different by not requiring applying shift-and-invert techniques. In this context, the so-called folded spectrum method can be treated as a special case.
Partly motivated by a possibly inaccurate computation of Qk in forming the partial equivalence transformation (1.2), a criterion for optimally computing Zk for a given Qk is devised. Formulated in terms of Grassmannians, it consists of comparing two k-dimensional subspaces of C^n so as to simultaneously compute a maximal projection onto them in terms of applying Zk. For the two Arnoldi methods just described, the Zk's suggested get replaced with their average such that the resulting Arnoldi method can be regarded as optimal. In particular, for k = 1 solving this for the standard eigenvalue problem replaces the Rayleigh quotient q^*Aq with

(q^*Aq/|q^*Aq|) ‖Aq‖

such that the field of values accordingly becomes the field of directed norms.
Consequently, the corresponding power method converges more aggressively towards an

^1 Serves as a "front end" decomposition before executing the actual QZ iteration yielding the generalized Schur decomposition [8, p. 380].

exterior eigenvalue. This is not insignificant since the simple power method effect is the reason behind the success of more complex iterative methods for eigenvalues. For k ≥ 2, optimally computing Zk for a given Qk is not costly and thereby provides a noteworthy option for any method relying on the use of a partial equivalence transformation.^2
In Section 2, Krylov subspaces and two Arnoldi methods for the generalized eigenvalue problem are described. Although both are polynomial methods, they give rise to very different canonical forms. Section 3 deals with the normal generalized eigenvalue problem and how shift-and-invert can then be avoided. In Section 4 a way to perform the partial equivalence transformation optimally is derived. An optimal Arnoldi method arises. The field of directed norms is introduced. In Section 5 numerical experiments are conducted.

2. Arnoldi methods for the eigenvalue problem. In the generalized eigenvalue problem (1.1), the task is to locate the singular elements of the nonsingular^3 two dimensional matrix subspace

V = span{M, N}. (2.1)

In practice this is solved by computing invertible matrices X, Y ∈ C^{n×n} so as to perform an equivalence transformation

W = X V Y (2.2)

whose singular elements are readily identifiable. For problems of moderate size, a numerically reliable way [16] to achieve this is based on fixing a basis of V and then computing the generalized Schur decomposition of the following theorem.
Theorem 2.1. Suppose A, B ∈ C^{n×n}. Then there exist unitary matrices Q and Z such that Z^*AQ = T and Z^*BQ = S are both upper triangular.
If the unitary matrices Q and Z can be chosen in such a way that T and S are diagonal, then there are good reasons to call the matrix subspace V normal. Correspondingly, the generalized eigenvalue problem (1.1) is then said to be normal. This yields a very natural extension of the notion of normal matrix; see Section 3.
For large problems, computing the generalized Schur decomposition is typically not realistic. This paper is concerned with devising Arnoldi methods to construct a partial equivalence transformation of V, i.e., with iteratively computing matrices X^*, Y ∈ C^{n×k} with orthonormal columns for k ≪ n, to accordingly perform a dimension reduction in (2.2).^4 Bearing in mind that the generalized Schur decomposition carries two unitary matrices, i.e., two different orthonormal bases, a natural Arnoldi method to this end should involve two Krylov subspaces.

2.1. Krylov subspaces of the eigenvalue problem. To compute Krylov subspaces, fix a basis of V by setting

A = aM + bN and B = cM + dN (2.3)

for some scalars a, b, c, d ∈ C such that det [ a b ; c d ] ≠ 0. The choice of these parameters is a delicate issue determined by which eigenvalues are being searched. Expressed by

^2 Practically all the methods for the eigenvalue problem can be recast in such a way that they rely on the use of a partial equivalence transformation.
^3 Nonsingular means that V contains invertible elements.
^4 Assuming biorthogonality, oblique projections are used to describe similar approaches for eigenvalue problems; see [20] and [2, Chapter 3].

using the "shift-and-invert" paradigm, it is unavoidable in any iterative computation of a small number of specific eigenvalues. Once this is done, solving the eigenvalue problem

Ax = λBx (2.4)

is equivalent to solving the original formulation (1.1) in the following sense.
Proposition 2.2. Let M, N ∈ C^{n×n} and suppose (2.3) holds with det [ a b ; c d ] ≠ 0. Then the matrix αA + βB is singular if and only if δM + γN is singular, where [ δ ; γ ] = [ a b ; c d ]^T [ α ; β ].
Investigating the inverses of invertible elements of a matrix subspace can be very useful for devising numerical methods [1, 10]. For the eigenvalue problem (2.4) this means inspecting the resolvent operator

λ ↦ R(A, B, λ) = (λB − A)^{-1}. (2.5)

Analytically, finding the poles of the resolvent operator is equivalent to solving the generalized eigenvalue problem (1.1). Assume B (equivalently A) is invertible. For algebraic information, invoking the Neumann series yields

R(A, B, λ) = Σ_{j=0}^{∞} (B^{-1}A)^j B^{-1} / λ^{j+1},

which is valid for λ large enough in absolute value. Bearing in mind the classical Arnoldi method, the truncated Neumann series for the resolvent operator hints at canonical Krylov subspaces.^5 For the generalized eigenvalue problem this means taking the appearing powers and expressing their linear combination in terms of a polynomial p to have

(λB − A) p(B^{-1}A) B^{-1} = p(AB^{-1}) (λB − A) B^{-1} (2.6)

for any λ ∈ C. This reveals what type of polynomial methods and Krylov subspaces are naturally associated with the generalized eigenvalue problem. In particular, there is not much room for other polynomial methods in the following sense.
Proposition 2.3. Let A, B ∈ C^{n×n} with B invertible and assume

(λB − A) p(X) = p(Y) (λB − A)

holds for any polynomial p and any λ ∈ C. Then X commutes with B^{-1}A and Y = BXB^{-1}.^6
Proof. By taking λ = 0 and p(λ) = λ we have AX = YA. Varying λ with p(λ) = λ then gives BX = YB, so that Y = BXB^{-1} follows. Combining this with AX = YA forces X to commute with B^{-1}A.
Recall the generalized Schur decomposition of Theorem 2.1. To iteratively construct a partial equivalence transformation (1.2), the identity (2.6) suggests how to generate Krylov subspaces. By taking a starting vector b ∈ C^n, in the domain first form

K_k(B^{-1}A; B^{-1}b) = span{ B^{-1}b, B^{-1}AB^{-1}b, ..., (B^{-1}A)^{k-1}B^{-1}b } (2.7)

^5 For a linear system Ax = b with A ∈ C^{n×n} and b ∈ C^n, the resolvent operator λ ↦ (λI − A)^{-1} of A has the Neumann series expansion involving powers of A for |λ| large enough. Modern iterative methods, such as GMRES, rely on forming optimal linear combinations of these powers applied to b.
^6 This means that X is, generically, a polynomial in B^{-1}A.

for k = 1, 2, .... Then we have the images

B K_k(B^{-1}A; B^{-1}b) = K_k(AB^{-1}; b) (2.8)

A K_k(B^{-1}A; B^{-1}b) = K_k(AB^{-1}; AB^{-1}b) (2.9)

which are Krylov subspaces as well. Plainly, it appears natural to construct Qk based on the Krylov subspace (2.7). However, having now two Krylov subspaces (2.8) and (2.9) for constructing Zk in the codomain, we are forced to make some choices. From the outset, because of the symmetric roles of A and B, neither of them appears to be preferable over the other. It turns out that to both of these extremes there corresponds a natural Arnoldi method with very different canonical forms for (2.4). These alternatives will be discussed in the two subsections that follow. Then in Section 4, an average of (2.8) and (2.9) is computed for an optimal Arnoldi method.
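As a quick sanity check of the identity (2.6) and of the subspace relations (2.8) and (2.9), the following minimal NumPy sketch verifies them on a small dense example; the matrices, the linear combinations and the polynomial below are arbitrary choices made only for illustration and are not taken from the paper.

```python
# Numerical check of (2.6) and of the image relations (2.8)-(2.9)
# on a small, dense random example.
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 4
M = rng.standard_normal((n, n))
N = rng.standard_normal((n, n))
A, B = M + 2 * N, 3 * M - N          # some basis (2.3) of span{M, N}
b = rng.standard_normal(n)
Binv = np.linalg.inv(B)

def krylov(C, v, k):
    """Return a matrix whose columns are v, Cv, ..., C^{k-1}v."""
    cols = [v]
    for _ in range(k - 1):
        cols.append(C @ cols[-1])
    return np.column_stack(cols)

# Identity (2.6) for an arbitrary polynomial, here p(t) = t^2 + 5t + 1.
p = lambda X: X @ X + 5 * X + np.eye(n)
lam = 1.7
lhs = (lam * B - A) @ p(Binv @ A) @ Binv
rhs = p(A @ Binv) @ (lam * B - A) @ Binv
print(np.allclose(lhs, rhs))                      # True

# Relations (2.8) and (2.9): the column spaces coincide.
K = krylov(Binv @ A, Binv @ b, k)                 # basis of (2.7)
for image, target in [(B @ K, krylov(A @ Binv, b, k)),
                      (A @ K, krylov(A @ Binv, A @ Binv @ b, k))]:
    Q1, _ = np.linalg.qr(image)
    Q2, _ = np.linalg.qr(target)
    print(np.allclose(Q1 @ Q1.T, Q2 @ Q2.T))      # equal projectors: True
```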

2.2. Arnoldi method for the Hessenberg-triangular form. Take Qk having orthonormal columns spanning the Krylov subspace (2.7), orthogonalized by invoking the Arnoldi method. Bearing in mind the identity (2.6), a natural option is to take Zk having orthonormal columns spanning (2.8). (For any λ, the image of λB − A applied to (2.7) is contained in K_{k+1}(AB^{-1}; b), i.e., in (2.8) augmented with just the vector (AB^{-1})^k b.) In terms of factorizations, Zk can be found by computing the QR factorization of BQk as

BQk = ZkRk, (2.10)

where Rk ∈ C^{k×k} is upper triangular. Then, by using the relationship between (2.8) and (2.9), we have

AQk = Zk+1 H̃k, (2.11)

where H̃k ∈ C^{(k+1)×k} is of Hessenberg type. Constructing the subspaces in the prescribed manner hence results in a Hessenberg matrix

Hk = Zk^*AQk, while Rk = Zk^*BQk (2.12)

is upper triangular. Observe that, as opposed to the original matrices M and N in (1.1), it is the chosen basis matrices A and B which are brought into Hessenberg and upper triangular forms, respectively. A direct algorithm to this end based on using elementary orthogonal transformations was devised in [16]. (See also [14] for recent developments.) It extends the direct algorithm transforming a single matrix into Hessenberg form under a unitary similarity.
It is noteworthy that the scheme can be regarded as extending the classical polynomial iterations in a natural way. That is, in the derivation above, absolutely no difference was made between standard and generalized eigenvalue problems. This makes it conceptually simpler to cover such problems at one stroke. For an illustration, the classical Arnoldi method is realized with the following choices.
Example 1. In the standard eigenvalue problem we have N = I. The classical Arnoldi method corresponds to choosing A = M and B = N. Namely, then Qk = Zk coincide and (2.7) is simply K_k(A; b) = span{b, Ab, ..., A^{k-1}b}.
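To make the construction of this subsection concrete, here is a minimal dense-matrix sketch (not an efficient implementation, and with no breakdown handling): Qk is built by the Arnoldi process for B^{-1}A started from B^{-1}b, Zk and Rk come from the QR factorization (2.10) of BQk, and Hk = Zk^*AQk then exhibits the Hessenberg form of (2.12); the Ritz values are the eigenvalues of the pencil (Hk, Rk). The helper name and the test data are assumptions made only for illustration.

```python
# Sketch of the Arnoldi method for the Hessenberg-triangular form
# (Section 2.2), assuming small dense real matrices and exact solves with B.
import numpy as np
import scipy.linalg as sla

def hessenberg_triangular_arnoldi(A, B, b, k):
    n = len(b)
    lu, piv = sla.lu_factor(B)                   # exact solves with B
    solveB = lambda v: sla.lu_solve((lu, piv), v)
    Q = np.zeros((n, k + 1))
    q = solveB(b)
    Q[:, 0] = q / np.linalg.norm(q)
    for j in range(k):                           # Arnoldi for B^{-1}A
        w = solveB(A @ Q[:, j])
        for i in range(j + 1):                   # modified Gram-Schmidt
            w = w - (Q[:, i] @ w) * Q[:, i]
        Q[:, j + 1] = w / np.linalg.norm(w)      # no breakdown handling
    Qk = Q[:, :k]
    Zk, Rk = np.linalg.qr(B @ Qk)                # (2.10)
    Hk = Zk.T @ (A @ Qk)                         # (2.12), Hessenberg
    return Qk, Zk, Hk, Rk

rng = np.random.default_rng(1)
n, k = 200, 20
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + n * np.eye(n)  # keep B safely invertible
b = rng.standard_normal(n)
Qk, Zk, Hk, Rk = hessenberg_triangular_arnoldi(A, B, b, k)

print(np.max(np.abs(np.tril(Hk, -2))))           # ~0: Hessenberg structure
print(np.max(np.abs(np.tril(Rk, -1))))           # ~0: Rk upper triangular
print(sla.eigvals(Hk, Rk))                       # Ritz values of the pencil
```

With A = M and B = N = I the scheme reduces to the classical Arnoldi method of Example 1.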

Changing the roles of A and B in Example 1 results in another classical iteration, the inverse iteration, with the minor modification that (1.2) then becomes a generalized eigenvalue problem.
Example 2. Consider again the standard eigenvalue problem, i.e., N = I. Applying the inverse iteration corresponds to choosing A = N and B = M. Then (2.7) is K_k(B^{-1}; B^{-1}b) and (2.8) is K_k(B^{-1}; b). In particular, then we have Qk ≠ Zk.
Also the following underscores that we are dealing with a natural extension of the classical Arnoldi method.
Example 3. Suppose dim K_{k+1}(B^{-1}A; B^{-1}b) = dim K_k(B^{-1}A; B^{-1}b). Let X = [ B^{-1}b  B^{-1}AB^{-1}b  ···  (B^{-1}A)^{k-1}B^{-1}b ] and Y = [ b  AB^{-1}b  ···  (AB^{-1})^{k-1}b ]. Then

(λB − A)X = Y(λI − C), (2.13)

where C is a companion matrix.
The subspace (2.9) equals (2.8) multiplied by AB^{-1}. This is the relationship in the classical Arnoldi method for the matrix AB^{-1} using b as the starting vector. The matrix AB^{-1} is a linear fractional transformation of M and N; see [13] and references there for linear fractional transformations of operators.^7 In particular, the amount of lack of invariance (caused by A) gets expressed in terms of the standard Chebyshev minimization problem

min_{p ∈ P_k(∞)} ‖p(AB^{-1}) b‖, (2.14)

where P_k(∞) denotes the set of monic polynomials of degree k, as follows.
Theorem 2.4. With Rk and Hk defined in (2.12), let pk(λ) = det(λRk − Hk). Then pk realizes the minimum (2.14).
Proof. Denote by Pk the orthogonal projection onto (2.8). Then the pencil λRk − Hk is equivalent to λI − Ck, where Ck is the companion matrix whose last column is obtained by expanding Pk(AB^{-1})^k b in the (power) basis of (2.8) as

Pk(AB^{-1})^k b = c_0 b + c_1 AB^{-1}b + ··· + c_{k-1} (AB^{-1})^{k-1} b. (2.15)

(That is, the columns of X and Y of Example 3 are used as bases.) This yields the claim, after expanding the determinant to have pk(λ) = λ^k − Σ_{j=0}^{k-1} c_j λ^j which, because of (2.15), also realizes (2.14).
The basis (2.3) chosen entirely determines Qk and Zk. One should be aware that (2.7) and (2.8) do not genuinely involve a linear fractional transformation; only shift-and-invert takes place, as follows.
Proposition 2.5. Let C = αA + βB with α ≠ 0. Then

K_k(B^{-1}C; B^{-1}b) = K_k(B^{-1}A; B^{-1}b) and K_k(AB^{-1}; b) = K_k(CB^{-1}; b).

Proof. Krylov subspaces generated with a matrix are translation and dilation invariant. Therefore the identities αB^{-1}A + βI = B^{-1}C and αAB^{-1} + βI = CB^{-1} yield the claim.

^7 Among numerical linear algebraists, a linear fractional transformation of matrices is sometimes called a spectral transformation.

Hence, the way Qk and Zk were chosen above, only the parameters c and d were critical. Thereby, to have the sparsest possible A, one should take either a = 0 or b = 0, provided det [ a b ; c d ] ≠ 0.
To implement the scheme, there are many options. For instance, compute Qk+1 first. Then compute the QR factors Zk+1 and Rk+1 of BQk+1 by overwriting Qk+1 simultaneously. To compute Hk of (2.11), one can always recover a column of Qk by applying B^{-1} to the respective column of Zk+1Rk+1. Once Rk and Hk are available, Ritz values can be computed. If Ritz vectors are needed, then return to Qk+1 by computing B^{-1}Zk+1Rk+1 and overwriting Zk+1. In particular, like with the classical Arnoldi method, this implementation requires storing only a single "tall and skinny" matrix, i.e., either Qk+1 or Zk+1.

2.3. Arnoldi method for the triangular-triangular+rank-one form. Another natural option is to take Zk having orthonormal columns spanning (2.9), orthogonalized by the Arnoldi method. (For any λ, the image of λB − A applied to (2.7) is contained in (2.9) augmented with the vector b.) In constructing a basis of (2.7), it turns out worthwhile to make a minor modification by setting B^{-1}b to be the first basis vector. Thereafter the remaining vectors are obtained by computing an orthonormal basis of

K_{k-1}(B^{-1}A; B^{-1}AB^{-1}b)

by the Arnoldi process.^8 (That is, aside from the first vector, the basis is orthonormal.) This minuscule change causes no numerical problems while leading to a very structured form. Denote by Q̂k ∈ C^{n×k} the matrix having these vectors as columns. In terms of factorizations, Zk can be found by computing the QR factorization of AQ̂k as

AQ̂k = ZkRk, (2.16)

where Rk ∈ C^{k×k} is upper triangular. Then, because of the way the basis of (2.7) was constructed, we have

BQ̂k = [ b   Zk-1 T̂k-1 ], (2.17)

where T̂k-1 ∈ C^{(k−1)×(k−1)} is upper triangular.
These manipulations can readily be modified to have an orthonormal basis of (2.7), yielding a finitely computable "structure approximation" to the generalized Schur decomposition through an Arnoldi process as follows.
Theorem 2.6. Let A, B ∈ C^{n×n} and b ∈ C^n satisfy dim K_n(B^{-1}A; B^{-1}b) = n. Then the prescribed Arnoldi process yields unitary matrices Q and Z such that Z^*AQ = T is upper triangular and Z^*BQ − bv^* = S is strictly upper triangular with v ∈ C^n.
Proof. Consider (2.16). Compute the QR factorization QkR̂k of Q̂k and then multiply with the inverse of R̂k from the right. Then

AQk = ZkR̃k (2.18)

with an upper triangular R̃k, and

BQk = Zk-1 T̃k-1 + b v_{k-1}^* (2.19)

^8 Generically, the starting vector leads to an orthonormal basis for the column space of B^{-1}A. This should be compared with the QR factorization.

with an upper triangular T̃k-1 ∈ C^{(k−1)×k} and v_{k-1} ∈ C^k. Of course, if the dimension growth of (2.7) stops for some k < n, such a structure has been obtained exactly already at that step. For the compressions

Rk = Zk^*AQ̂k and Tk = Zk^*BQ̂k, (2.20)

consider (2.17) for the latter. In applying Zk^* from the left one obtains a rank-one perturbation of a strictly upper triangular matrix by the fact that the first column is nonzero. This column simply consists of the Fourier coefficients of b with respect to the columns of Zk.
The arising rank-one structure fundamentally differs from Hessenberg matrices which are typically associated with Krylov subspace methods for eigenvalue computations. This is intriguing by the fact that rank-one perturbed eigenvalue problems have been an object of study for some time; see [19] and references therein. Thereby it is of interest to see how the classical Arnoldi method of Example 1 works out with these choices.
Example 4. In the standard eigenvalue problem N = I. Like in Example 1, let us take A = M and B = N. Then (2.7) equals K_k(A; b) = span{b, Ab, ..., A^{k-1}b} while (2.9) equals K_k(A; Ab) = span{Ab, A^2 b, ..., A^k b}. (Because K_k(A; Ab) = A K_k(A; b), this resembles the structure appearing in connection with the harmonic Ritz values.) This leads to a very particular structure by the fact that then Rk is upper triangular and Tk is a companion matrix. (The first column of Tk is nonzero while the other entries are zero except for the ones on the first superdiagonal.^9) Of course, these computations require no more storage than the classical Arnoldi method.
Again, no difference was made between standard and generalized eigenvalue problems. This is underscored by Example 4 where a standard eigenvalue problem got converted into a very structured generalized eigenvalue problem. That is, it is not at all clear that in a numerical approximation one should avoid transforming a standard eigenvalue problem into a generalized eigenvalue problem. Here this happens naturally. (Of course, it has long been common that the opposite takes place [5].) In Section 4 this viewpoint is further supported both qualitatively and quantitatively when the choice of Zk is made in an optimal way.
The norm of b has not been constrained. It has no effect on the approximations generated.
Proposition 2.7. Compute (2.16) and (2.17) with the starting vector tb for a nonzero t ∈ C. Then the eigenvalue approximations coincide.
As opposed to the Chebyshev minimization problem (2.14), now the amount of lack of invariance (caused now by B) gets expressed in terms of the GMRES minimization problem

min_{p ∈ P_k(0)} ‖p(AB^{-1}) b‖, (2.21)

where P_k(0) denotes the set of polynomials of degree k at most satisfying the normalization p(0) = 1.

^9 Typically a companion matrix is represented as P^T Tk P where P is the permutation matrix having ones on the anti-diagonal joining the lower-left corner with the upper-right corner.

Theorem 2.8. With Rk and Tk defined in (2.20), let

pk(λ) = 1 + λ( det(λRk − Tk) − λ^k ).

Then pk realizes the minimum (2.21).^{10}
Proof. Denote by Pk the orthogonal projection onto (2.9). Then the pencil λRk − Tk is equivalent to λI − Ck, where Ck is the companion matrix whose last column is obtained by expanding Pk b in the (power) basis of (2.9) as

Pk b = c_1 AB^{-1}b + ··· + c_k (AB^{-1})^k b.

This yields the claim, after expanding the determinant and reordering the characteristic polynomial.
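As a small illustration of the construction of this subsection, the following sketch realizes the standard-problem setting of Example 4 (A = M, B = N = I) with dense NumPy matrices and displays the triangular-triangular+rank-one (here triangular-companion) structure of the compressions (2.20); the function name and the test data are illustrative assumptions only.

```python
# Sketch of the Arnoldi method for the triangular-triangular+rank-one
# form (Section 2.3) in the standard case B = I of Example 4.
import numpy as np

def tri_tri_rank_one(A, b, k):
    n = len(b)
    # First basis vector of the modified (2.7) is B^{-1}b = b itself; the
    # remaining ones form an orthonormal basis of K_{k-1}(A; Ab).
    Qhat = np.zeros((n, k))
    Qhat[:, 0] = b
    W = np.zeros((n, k - 1))
    w = A @ b
    W[:, 0] = w / np.linalg.norm(w)
    for j in range(1, k - 1):                   # Arnoldi for K_{k-1}(A; Ab)
        w = A @ W[:, j - 1]
        for i in range(j):
            w = w - (W[:, i] @ w) * W[:, i]
        W[:, j] = w / np.linalg.norm(w)
    Qhat[:, 1:] = W
    Zk, Rk = np.linalg.qr(A @ Qhat)             # (2.16): A Qhat = Zk Rk
    Tk = Zk.T @ Qhat                            # (2.20) with B = I
    return Qhat, Zk, Rk, Tk

rng = np.random.default_rng(2)
n, k = 100, 8
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)
Qhat, Zk, Rk, Tk = tri_tri_rank_one(A, b, k)

print(np.allclose(np.tril(Rk, -1), 0))          # Rk upper triangular: True
# Tk: nonzero first column, ones (up to QR signs) on the first
# superdiagonal, zeros elsewhere.
print(np.allclose(np.abs(Tk[:, 1:]), np.eye(k)[:, :k - 1]))   # True
```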

2.4. Preconditioning and related iterations. If the inverse of B is applied, there arises the obvious possibility to non-unitarily transform the problem into a standard eigenvalue problem, as suggested already in [5]. We do not regard it as entirely natural. This approach may run into serious problems when B is very ill-conditioned. As is well-known, this is not an issue that can be neglected by the fact that these schemes require accurate solving of the linear systems involving B; see [2, Chapter 11]. This, of course, is not a trivial matter.
Our aim has been at finding a natural Krylov subspace method to generate a unitarily equivalent eigenvalue problem to the original one, whose truncation is then computed in practice. Although the methods require inversion (while iterating), it is only for producing orthonormal bases. If the inversion is not done accurately, we still obtain a unitary equivalence. Only (2.6) then gets lost. We will address how to deal with this problem in Section 4 in terms of optimally choosing Zk, no matter how Qk has been generated.
With a similar main argument, also [7, 22] are concerned with devising preconditioned iterative methods for solving the original eigenvalue problem with the help of a (truncated) unitary equivalence aiming at the generalized Schur decomposition. In our approach, it is B which is the subject of preconditioning, for the purpose of achieving speed-ups in solving linear systems involving B. The construction of a good preconditioner is likely by far the most time consuming part of the scheme. Thereby the linear combination of M and N yielding B should be carefully chosen; changing it later can be very costly if a preconditioner needs to be regenerated.^{11} Instead of regenerating the preconditioner, restarting is often the most realistic alternative to adaptively use information during the iteration. See [22, 3] for restarting. See [6] for polynomial filtering to compute a large number of eigenvalues.
Let us emphasize that performing an equivalence transformation (2.2) and thereafter iterating cannot be interpreted as performing preconditioning. That simply turns into a similarity transformation in computing the Krylov subspaces (2.7), (2.8) and (2.9). (The power method effect remains unchanged under a similarity since the spectrum does not change.) Instead, preconditioning is aimed at a clever construction of a (partial) equivalence transformation. In this the effect of the inverse of B must somehow be present. In the section that follows, a generalized eigenvalue structure which admits avoiding the use of the inverse is described.
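For illustration of the preceding remarks on preconditioning B, the following sketch applies B^{-1} only approximately, through GMRES preconditioned with an incomplete LU factorization of B, using SciPy's sparse machinery; the test matrix and all parameters are assumptions made only for illustration.

```python
# Inexact application of B^{-1}: solve a linear system with B by
# preconditioned GMRES, with an incomplete LU factorization of B as
# the preconditioner.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 2000
rng = np.random.default_rng(3)
B = sp.random(n, n, density=5.0 / n, random_state=rng, format="csc")
B = B + 10.0 * sp.identity(n, format="csc")         # stands in for B in (2.3)

ilu = spla.spilu(B, drop_tol=1e-4, fill_factor=10)  # the preconditioner
M = spla.LinearOperator((n, n), matvec=ilu.solve)

v = rng.standard_normal(n)
x, info = spla.gmres(B, v, M=M)                     # approximate B^{-1} v
print(info, np.linalg.norm(B @ x - v) / np.linalg.norm(v))  # 0, small residual
```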

^{10} Observe that λRk − Tk is the partial equivalence transformation of λA − B.
^{11} Most efficient preconditioners are "approximate factorizations". Such preconditioners do not admit simple translations, i.e., they must be entirely regenerated.

3. The normal generalized eigenvalue problem. Assume the matrix subspace V is normal and consider (1.1). In what follows it is shown that there are many strikingly inexpensive options for Arnoldi methods then. By this we mean that applying the inverse of B can be completely avoided. Aside perhaps from the standard Hermitian eigenvalue problem, we do not claim that the normal generalized eigenvalue problem is of great practical importance per se. However, it provides an ideal structure with properties that otherwise can be strived for. And, of course, there exist ways to utilize normality in problems which are not normal [9]. The Gershgorin circle theorem provides a prime example of this. The following is a more involved, although standard, trick [18, Chapter 15].
Example 5. For a generalized eigenvalue problem appearing often in applications, suppose A is Hermitian and B positive definite. Then, if not too costly, it is customary to form the Cholesky factorization B = LL^* of B. It can be used to transform span{A, B} into span{L^{-1}AL^{-*}, I}, which is normal.
It is instructive to observe that the classical notion of normality is de facto a two dimensional notion in C^{n×n}, as follows.
Proposition 3.1. A nonscalar^{12} matrix A ∈ C^{n×n} is normal if and only if the matrix subspace span{I, A} is normal.
Proof. If A ∉ CI is normal, then span{I, A} is normal. For the converse, if span{I, A} is normal, then Z^*Q = e^{iΘ} for a real diagonal matrix Θ and Z^*AQ = Λ for a diagonal matrix Λ. From these it follows that we may choose Z = Q and hence A is a normal matrix.
^{12} That is, A ≠ αI for α ∈ C. This guarantees that the dimension of span{I, A} is two.
Example 6. Proposition 3.1 underscores how normality is also revealed in practice. That is, in applying and studying the resolvent operator, it does not suffice to inspect the matrix alone. Instead, one is concerned with the inverses of the elements of the matrix subspace span{I, A}. For normal matrices the growth of the norm of the resolvent operator is given by the reciprocal of the distance to the spectrum.
Because of Proposition 3.1 and Example 6, the suggested two dimensional extension of normality appears very natural.
In the Arnoldi methods described, in the domain we are dealing with B^{-1}A while in the codomain with AB^{-1}. These are both (similar) normal matrices, although in general unitarily diagonalizable with respect to entirely different orthonormal bases, obtained from the generalized Schur decomposition. There are Krylov subspace methods for iteratively producing orthonormal bases, as well as computing Ritz values, for normal matrices [11]. Algorithms which benefit from the structure provide, as a rule, the method of choice for solving problems.
Theorem 3.2. A non-singular two dimensional matrix subspace V is normal if and only if B^{-1}A and AB^{-1} are normal matrices for any linearly independent A, B ∈ V with B invertible.
Proof. It is clear that if V is normal, then B^{-1}A and AB^{-1} are normal matrices for any A, B ∈ V with B invertible. For the converse, assume B^{-1}A and AB^{-1} are normal for two linearly independent A, B ∈ V with B invertible. We may assume that, after applying the generalized Schur decomposition, both A and B are upper triangular. Then, by being normal, B^{-1}A = D1 and AB^{-1} = D2 are diagonal. Thus we have BD1 = D2B. Since B is upper triangular and invertible, from the diagonal entries we may conclude that D1 = D2. Thus B commutes with a diagonal matrix. If all the eigenvalues of D1

differ, the claim follows since B is then necessarily diagonal and A = BD1. If D1 has multiple eigenvalues, the condition D1B = BD1 forces B to have a conforming block structure. Compute the singular value decomposition of B in each of these blocks. In each block corresponding to nonzero entries of D1, the corresponding block of A is a scalar multiple of the corresponding block of B. Hence we may use the singular value decomposition of B for the corresponding block of A. Collecting these yields the unitary equivalence diagonalizing A and B. Since A and B span V, we may conclude that V is normal.
Similarly one can prove the following result.
Theorem 3.3. A non-singular two dimensional matrix subspace V is normal if and only if B^*A and AB^* are normal matrices for any linearly independent A, B ∈ V with B invertible.^{13}
Because of computational complexity, it is certainly desirable to avoid applying the inverse. In terms of the Hermitian transpose, for the eigenvalues we have

Λ(B^*A) = Λ(S^*T), (3.1)

where T and S are the matrices of Theorem 2.1. Because of normality, they are now diagonal, yielding in this case a spectral mapping identity. It can be used to obtain inclusion regions for the eigenvalues of (1.1).
Example 7. Consider (3.1) and suppose you know S^*T = D where D is a given diagonal matrix and T and S are unknown diagonal matrices. Then you know on which rays through the origin the eigenvalues of (2.4) are located.
In the normal generalized eigenvalue problem it suffices to approximate the unitary matrix Q of Theorem 2.1. Thereafter Z can be recovered by applying a linear combination of A and B to the columns of Q. (This should be compared with the SVD (singular value decomposition), i.e., how the left singular vectors of a matrix become available once the right singular vectors have been computed.) Computing Q this way does not require applying the inverse of B, by the fact that B^*A is a normal matrix diagonalized by Q. To iteratively generate an approximation to a part of Q, a natural choice is to execute the Arnoldi method for normal matrices of [11]. In particular, by making the following choices, the algorithm reduces to the Hermitian Lanczos method such that any shift-and-invert can be completely avoided.
Example 8. It is quite remarkable that in the normal case, applying the inverse is not necessary in searching for interior eigenvalues. Namely, consider (1.1) and suppose one is interested in eigenvalues λ near a prescribed point µ ∈ C. Form the linear combinations A = M − µN and B = A. (Observe that for approximating a part of Q, the matrices A and B can be linearly dependent.) Then B^*A is a positive semidefinite matrix such that its left-most, i.e., extreme eigenvalues are associated with the generalized eigenvalues near µ.^{14} These get approximated in a fairly well-known manner with the Hermitian Lanczos method for approximating the SVD.^{15}
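A small dense sketch of Example 8 (no inverses are applied): with A = M − µN, the Gram matrix A^*A is Hermitian positive semidefinite and its smallest eigenvalues single out the generalized eigenvalues closest to µ, up to the weighting visible below. Dense eigensolvers stand in here for the Hermitian Lanczos/SVD iteration mentioned above; the normal test pencil is built arbitrarily for illustration.

```python
# Example 8 in miniature: an interior eigenvalue of a normal pencil near a
# target mu, located without shift-and-invert via the Gram matrix A^*A,
# A = M - mu*N. Dense eigensolvers replace the Lanczos/SVD iteration here.
import numpy as np

rng = np.random.default_rng(4)
n = 300
# build a normal pencil: M = Q T Z^*, N = Q S Z^* with diagonal T, S
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
Z, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
T = np.diag(rng.standard_normal(n) + 1j * rng.standard_normal(n))
S = np.diag(rng.standard_normal(n) + 1j * rng.standard_normal(n))
M, N = Q @ T @ Z.conj().T, Q @ S @ Z.conj().T
eigs = np.diag(T) / np.diag(S)                     # exact eigenvalues

mu = 0.3 + 0.2j                                    # target point
A = M - mu * N
G = A.conj().T @ A                                 # Hermitian, PSD
w, V = np.linalg.eigh(G)                           # ascending eigenvalues
q = V[:, 0]                                        # vector for the smallest one
z = A @ q                                          # recover the codomain vector
lam = (z.conj() @ M @ q) / (z.conj() @ N @ q)      # one-dimensional (1.2)

idx = np.argmin(np.abs(np.diag(T) - mu * np.diag(S)))   # index singled out
print(np.allclose(lam, eigs[idx]))                       # True (generically)
```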

^{13} In practice, normality should be checked by performing matrix-vector products only, i.e., N ∈ C^{n×n} is normal with probability one if NN^*b − N^*Nb = 0 for a randomly chosen b ∈ C^n.
^{14} For the standard Hermitian eigenvalue problem the trick is well-known. It is called, among physicists at least, the folded spectrum method. This terminology is not entirely satisfactory since the success of the idea relies on the singular vectors of the SVD coinciding with the eigenvectors.
^{15} There exists the PROPACK software for the SVD of large and sparse matrices; see http://sun.stanford.edu/~rmunk/PROPACK/.

4. Optimal partial equivalence transformation and optimal Arnoldi method. When applying the inverse of B cannot be done to a high accuracy, Qk gets inaccurately computed in the sense that the identity (2.6) no longer holds. Then Zk cannot be computed with columns spanning either the Krylov subspace (2.8) or (2.9). Assume thus that Qk with orthonormal columns has been computed, possibly without satisfying these relationships. It has been suggested that Zk be then computed based on how a linear combination of A and B maps Qk [7]. See also [21]. The reasoning for this is somewhat unclear by the fact that the underlying mathematical structure is the Grassmannian Gr_k(C^n) of k dimensional subspaces of C^n. Within this structure, a linear combination of A and B applied to Qk does not satisfy any obvious optimality criteria.
For an optimality condition, identify Qk with an element of Gr_k(C^n) by taking the span of its columns. Consequently, Qk and QkV are indistinguishable elements of Gr_k(C^n) if and only if V ∈ C^{k×k} is unitary. Denote by Ẑ and Z̃ matrices having orthonormal columns spanning the column spaces of

AQk and BQk. (4.1)

If the elements of Gr_k(C^n) corresponding to Ẑ and Z̃ are equal (which is an unrealistic assumption), choose Zk to be either of them.^{16} Then the eigenvalue problem (1.1) gets partially exactly solved once the corresponding reduced k-by-k eigenvalue problem is solved.
If the elements of Gr_k(C^n) corresponding to Ẑ and Z̃ differ, then the problem arises how to choose Zk. It is clear that if Zk is poorly chosen, the respective partial equivalence transformation is of little use for eigenvalue approximations. It appears natural to aim at finding an average of them in some sense. By using the Frobenius norm, an optimality criterion to this end can be formulated as

max_{Z ∈ Gr_k(C^n)} ( ‖Z^*Ẑ‖_F^2 + ‖Z^*Z̃‖_F^2 ). (4.2)

For the eigenvalue problem this means finding a partial equivalence transformation which maximizes the attainable projection onto the column spaces of (4.1). The task appears approximation theoretically natural and is well-defined in Gr_k(C^n), i.e., it does not depend on the choice of the representing matrices Ẑ, Z̃ and Z. It is immediate that the value of (4.2) is bounded above by 2k such that the equality is attained in the ideal case of Ẑ and Z̃ being equal.
The optimality criterion is best illustrated by studying the case k = 1. Then the "average" is formed as follows.
Proposition 4.1. Let ẑ, z̃ ∈ C^n be unit vectors. Then

z = (e^{iθ}ẑ + z̃) / √(2 + 2|ẑ^*z̃|)

with θ = arg ẑ^*z̃ is the unique solution to (4.2) whenever ẑ^*z̃ ≠ 0. If ẑ^*z̃ = 0, then z can be any unit linear combination of ẑ and z̃.
Proof. The problem becomes that of finding the largest singular value of the matrix [ ẑ^* ; z̃^* ], which is √(1 + |ẑ^*z̃|), and the corresponding right singular vector, which is as claimed.
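A quick numerical check of Proposition 4.1 (with arbitrary random unit vectors): the averaged vector attains the optimal value 1 + |ẑ^*z̃| of the k = 1 objective in (4.2), which neither ẑ nor z̃ alone attains in general.

```python
# Numerical check of Proposition 4.1 for k = 1.
import numpy as np

rng = np.random.default_rng(5)
n = 50
zhat = rng.standard_normal(n) + 1j * rng.standard_normal(n)
ztil = rng.standard_normal(n) + 1j * rng.standard_normal(n)
zhat /= np.linalg.norm(zhat)
ztil /= np.linalg.norm(ztil)

c = np.vdot(zhat, ztil)                       # \hat z^* \tilde z
theta = np.angle(c)
z = (np.exp(1j * theta) * zhat + ztil) / np.sqrt(2 + 2 * abs(c))

objective = lambda w: abs(np.vdot(w, zhat))**2 + abs(np.vdot(w, ztil))**2
print(np.isclose(objective(z), 1 + abs(c)))   # optimal value: True
print(objective(z) >= objective(zhat), objective(z) >= objective(ztil))
```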

^{16} Hence, suppose Ẑ and Z̃ both have k columns.

Consider the eigenvalue problem (2.4). In the one dimensional case the partial equivalence transformation then reads

z^*Aq = λ z^*Bq (4.3)

with unit vectors z, q ∈ C^n. Hence λ must equal

z^*Aq / z^*Bq. (4.4)

With q fixed, choosing z according to Proposition 4.1 yields

λ = e^{iθ} ‖Aq‖ / ‖Bq‖ (4.5)

with θ = arg q^*B^*Aq, assuming q^*B^*Aq ≠ 0. In particular, the standard eigenvalue problem corresponds to A = M and B = N = I, yielding the following connection with the Rayleigh quotient q^*Aq.^{17}
Theorem 4.2. Let A ∈ C^{n×n} and suppose q^*Aq ≠ 0 for a unit vector q ∈ C^n. Then choosing z to satisfy (4.2) with ẑ = q and z̃ = Aq/‖Aq‖ yields

z^*Aq / z^*q = (q^*Aq / |q^*Aq|) ‖Aq‖. (4.6)

Proof. Construct z according to Proposition 4.1. Then solve λ from the identity

(e^{iθ}q + Aq/‖Aq‖)^* Aq = λ (e^{iθ}q + Aq/‖Aq‖)^* q

to have the claim.
This yields eigenvalue approximations moving more aggressively towards the boundary eigenvalues than the points q^*Aq in the field of values.
Corollary 4.3. Suppose A ∈ C^{n×n} is Hermitian. Then (4.4) equals ‖Aq‖ for q^*Aq > 0 and −‖Aq‖ for q^*Aq < 0.
Collecting these numbers yields a certain type of field of values.
Definition 4.4. Let A ∈ C^{n×n}. The set

F_k(A) = { (q^*Aq/|q^*Aq|) ‖Aq‖ : ‖q‖ = 1 such that if Aq ≠ 0, then require q^*Aq ≠ 0 }

is said to be the field of directed norms of A.
The field of directed norms contains the spectrum. Clearly, F_k(A) is contained in an annulus (centered at the origin) having the outer radius ‖A‖ and the inner radius min_{‖x‖=1} ‖Ax‖. The notion is unitarily invariant in the sense that

F_k(UAU^*) = F_k(A)

for any unitary U ∈ C^{n×n}. It can yield very accurate (nonconvex) information.
Example 9. Let A ∈ C^{n×n} be Hermitian and unitary. Then the field of directed norms equals the spectrum of A.
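Anticipating Example 10 below, the following sketch runs a plain power iteration on a random Hermitian matrix and reports, at each step, the error of the Rayleigh quotient q^*Aq and of the directed-norm value (4.6); by Corollary 4.3 the latter equals ±‖Aq‖ and its error is never larger. Matrix, size and step count are arbitrary illustrative choices.

```python
# Power iteration on a Hermitian matrix: Rayleigh quotient vs. the
# directed-norm value (4.6) as the eigenvalue approximation.
import numpy as np

rng = np.random.default_rng(6)
n = 400
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                                  # Hermitian test matrix
lam_max = np.max(np.abs(np.linalg.eigvalsh(A)))    # |dominant eigenvalue|

q = rng.standard_normal(n)
q /= np.linalg.norm(q)
for step in range(1, 11):
    Aq = A @ q
    rayleigh = q @ Aq                              # q^* A q
    directed = (rayleigh / abs(rayleigh)) * np.linalg.norm(Aq)   # (4.6)
    print(step, abs(abs(rayleigh) - lam_max), abs(abs(directed) - lam_max))
    q = Aq / np.linalg.norm(Aq)                    # power step
```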

^{17} The Rayleigh quotient consists of forming the partial equivalence transformation (4.3) by choosing z = q. As opposed to optimally computing z, this choice can be argued for by the fact that the corresponding λ solves min_λ ‖λq − Aq‖; see [25, pp. 203–209]. See also [17].

Proposition 4.5. Assume A ∈ C^{n×n} is unitary. Then F_k(A) = { e^{i arg q^*Aq} : q^*Aq ≠ 0 }.
Numerical experiments can be devised to conclude that F_k(A) is not translation invariant, though.^{18} Hence, for a tighter spectral inclusion, we may take an intersection of the sets

F_k(A − λI) + λ

for any finite collection of points λ. This can be used with the power method for a dominating eigenvalue (or with its generalizations based on the use of the Rayleigh quotient; see, e.g., [25, pp. 203–209] or [20, Section 4.1]). Simply replace the Rayleigh quotient with (4.6) accordingly for the eigenvalue approximation. If q^*Aq = 0, then (4.6) is not defined in an obvious, unique way. (We can always put z = q.) This is not a serious issue simply by the fact that then the eigenvalue approximation is extremely poor in any case.
Example 10. Consider executing the power method and using (4.6) in approximating the dominating eigenvalue. By Corollary 4.3, for Hermitian matrices the approximation is better than the one given by the Rayleigh quotient. This is somewhat striking since originally the Rayleigh quotient was particularly designed for Hermitian matrices, or compact Hermitian operators; see [17, pp. 679–680] for a concise history. This improvement is readily seen to extend to normal matrices as well. (As is well-known, with nonnormal matrices very nasty examples for Ritz values can be generated.)
The classical Arnoldi method can be regarded as being an extension of the Rayleigh quotient to subspaces generated with the power method. For a similar extension involving (4.6) instead, let us assume that Ẑ and Z̃ in (4.2) have been chosen in such a way that Ẑ^*Z̃ = Σ is diagonal with nonnegative entries. More precisely, compute the SVD

Ẑ^*Z̃ = UΣV^* (4.7)

of Ẑ^*Z̃ with unitary U, V ∈ C^{k×k} and a diagonal matrix Σ with nonnegative entries. Then take ẐU and Z̃V to replace the original Ẑ and Z̃. Denote now by ẑ_j and z̃_j the columns of Ẑ and Z̃. With these vectors, form the columns of Z according to Proposition 4.1. This yields

‖Z^*Ẑ‖_F^2 + ‖Z^*Z̃‖_F^2 = Σ_{j=1}^{k} (1 + σ_j), (4.8)

where σ_j are the diagonal entries of Σ.
Theorem 4.6. Assume Ẑ, Z̃ ∈ C^{n×k} have orthonormal columns. Then a solution Z to (4.2) satisfies (4.8).
Proof. Consider the linear map

Z ↦ [ Ẑ^* ; Z̃^* ] Z = [ Ẑ^*Z ; Z̃^*Z ] (4.9)

from C^{n×k} to C^{2k×k}. By the construction in connection with (4.7), we may assume that Ẑ and Z̃ are such that Ẑ^*Z̃ = Σ is diagonal with nonnegative entries. By

18 C In general, Fk(A − λI) does not equal Fk(A) − λ for λ ∈ . ARNOLDI METHODS FOR EIGENVALUES 15

Algorithm 1 to compute Zk satisfying (4.2) for (4.1)
1: Read n-by-n matrices A and B and the n-by-k matrix Qk with orthonormal columns
2: Compute the QR factorizations AQk = Q1R1 and BQk = Q2R2
3: Compute the SVD Q1^*Q2 = UΣV^*
4: Set Ẑ = Q1U and Z̃ = Q2V
5: With the columns ẑ_j and z̃_j of Ẑ and Z̃
6: for j = 1, ..., k do
7:   Compute a = ẑ_j^* z̃_j and θ = arg a
8:   Set z_j = (e^{iθ} ẑ_j + z̃_j) / √(2 + 2|a|)
9: end for
10: Set Zk = [ z_1 ··· z_k ]

regarding (4.9) as acting columnwise on Z, in terms of the Kronecker product we may consider

vect(Z) ↦ M vect(Z) = [ I ⊗ Ẑ^* ; I ⊗ Z̃^* ] vect(Z) (4.10)

with the identity I being of size k-by-k, so that M is of size 2k^2-by-nk. We have

M M^* = [ I ⊗ Ẑ^* ; I ⊗ Z̃^* ] [ I ⊗ Ẑ   I ⊗ Z̃ ] = [ I ⊗ I   I ⊗ Σ ; I ⊗ Σ   I ⊗ I ].

Hence, the nonzero singular values of this map are determined by taking, k times each, the positive square roots of the eigenvalues of [ I   Σ ; Σ   I ]. These eigenvalues are 1 ± σ_j for j = 1, ..., k. Because of (4.10), the problem separates and becomes that of how to position k orthonormal vectors with respect to the 2k singular values √(1 ± σ_j), j = 1, ..., k, of the matrix [ Ẑ^* ; Z̃^* ]. This is the dual problem of approximating with the singular value decomposition, with the optimal solution satisfying (4.8).
Consider (4.1). The solution provided by Theorem 4.6 can be used with any method based on the construction of a partial equivalence transformation (such as the Jacobi-Davidson method) for a given Qk. Algorithm 1 yields a routine to this end.
For an optimal Arnoldi method, let us now regard (2.8) and (2.9) as elements of Gr_k(C^n). (Hence, assume they are both of dimension k.) Suppose Ẑk is generated with the Arnoldi method with its columns spanning (2.8). Then Z̃k is obtained by orthonormalizing the columns of Ẑk+1H̃k, where

AB^{-1} Ẑk = Ẑk+1 H̃k (4.11)

with H̃k ∈ C^{(k+1)×k} being of upper-Hessenberg type. By using Ẑk and Z̃k, compute Zk to satisfy (4.8). Because of the construction, the columns of Zk are linear combinations of the columns of Ẑk+1 and thereby we are dealing with a Krylov subspace method. In particular, neither the method of Section 2.2 nor that of Section 2.3 can be regarded as being optimal with respect to any obvious criterion. An optimal method corresponds to Zk being an average of (2.8) and (2.9). In the following we assume that the Arnoldi method has not broken down so as to have (4.11), i.e., (2.8) and (2.9) are both of dimension k.

Definition 4.7. Suppose Qk is generated with the Arnoldi method for (2.7). Then the partial equivalence transformation with Zk satisfying (4.8) with respect to (2.8) and (2.9) is said to be an optimal Arnoldi method for Ritz values.
Example 11. It is noteworthy that the classical Arnoldi method (i.e., A = M and B = N = I) differs from the optimal Arnoldi method. Recall that in the classical Arnoldi method Zk = Qk is always chosen so as to keep the compressed problem standard. It is hard to argue why this should be the case unless the problem is Hermitian, or more generally, normal and one wants to preserve the structure. That is, by taking the optimal choice Zk satisfying (4.8), the compressed problem is in general not a standard but a generalized eigenvalue problem.
Proposition 4.8. Suppose A, B ∈ R^{n×n} and b ∈ R^n. Then the partial equivalence transformation with Zk satisfying (4.8) involves real matrices.
Proposition 4.9. Let Ẑk and Z̃k be generated with the optimal Arnoldi method for Ritz values such that (2.8) and (2.9) are both of dimension k. Then all the singular values of Ẑk^*Z̃k ∈ C^{k×k}, except the kth, are ones.
Proof. From (4.11) we obtain Z̃k by forming the QR factorization Q̃kR̃k of H̃k and then setting Z̃k = Qk+1Q̃k. Since Ẑk = Qk, we have

Ẑk^*Z̃k = ĨkQ̃k, (4.12)

where Ĩk is k-by-(k+1) with ones at the positions (j, j) for j = 1, ..., k while the other entries are zeros. Since H̃k is a Hessenberg type matrix, ĨkQ̃k is obtained by removing the last row of Q̃k, where there is just the (k+1, k) entry which is possibly nonzero. Since the classical Arnoldi method has not broken down, the entry is nonzero and therefore only the kth singular value is less than one. Since the columns of Q̃k are orthonormal, the columns of ĨkQ̃k are orthogonal.
To end this section, observe that in the optimality condition (4.2) the matrices Ẑ and Z̃ depend on how A and B have been chosen when (4.1) is formed.^{19}
Example 12. Consider the standard eigenvalue problem of having N = I. Let us set B = N = I, so that no preconditioning is going to be used. Suppose Qk has been generated with the classical Arnoldi method, i.e., the columns of Qk yield an orthonormal basis of K_k(A; b) = span{b, Ab, ..., A^{k-1}b}. Then it is not obvious how the linear combination A in (2.3) should be formed. If we take a = 1 and b = −trace(M)/n, then A and B are orthogonal and maximally linearly independent.
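A direct NumPy transcription of Algorithm 1 may read as follows (a sketch assuming dense matrices; the function name optimal_Z is not from the paper). Used with a Qk generated by the Arnoldi process of Section 2.2, the compressions Zk^*AQk and Zk^*BQk then realize, in exact arithmetic, the optimal Arnoldi method for Ritz values of Definition 4.7.

```python
# Algorithm 1: given Qk with orthonormal columns, compute Zk satisfying
# the optimality criterion (4.2) for the column spaces of AQk and BQk.
import numpy as np

def optimal_Z(A, B, Qk):
    Q1, _ = np.linalg.qr(A @ Qk)                  # orthonormal basis of A Qk
    Q2, _ = np.linalg.qr(B @ Qk)                  # orthonormal basis of B Qk
    U, _, Vh = np.linalg.svd(Q1.conj().T @ Q2)
    Zhat, Ztil = Q1 @ U, Q2 @ Vh.conj().T
    Z = np.empty(Zhat.shape, dtype=complex)
    for j in range(Zhat.shape[1]):
        a = np.vdot(Zhat[:, j], Ztil[:, j])       # \hat z_j^* \tilde z_j
        theta = np.angle(a)
        Z[:, j] = (np.exp(1j * theta) * Zhat[:, j] + Ztil[:, j]) \
                  / np.sqrt(2 + 2 * abs(a))
    return Z

# smoke test: the value of (4.2) equals sum(1 + sigma_j) as in (4.8)
rng = np.random.default_rng(7)
n, k = 60, 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Qk, _ = np.linalg.qr(rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k)))
Z = optimal_Z(A, B, Qk)
Zhat, _ = np.linalg.qr(A @ Qk)
Ztil, _ = np.linalg.qr(B @ Qk)
value = np.linalg.norm(Z.conj().T @ Zhat)**2 + np.linalg.norm(Z.conj().T @ Ztil)**2
sigma = np.linalg.svd(Zhat.conj().T @ Ztil, compute_uv=False)
print(np.isclose(value, np.sum(1 + sigma)))       # True, cf. (4.8)
```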

5. Numerical experiments. In what follows, by using Matlab, three small but illustrative numerical experiments are conducted to demonstrate different aspects of the methods proposed. Many of the experiments are well-documented benchmark eigenvalue problems. Example 13 is a standard Hermitian eigenvalue problem from [4]. It is taken here because of the common belief that the Hermitian Lanczos method is the method of choice then. Here it is shown that this may not be the case. (Recall also Example 10.) Example 14 is a well-known and carefully documented nonnormal eigenvalue problem from [20]. Example 15 is concerned with the generalized eigenvalue problem BFWAVE from the Matrix Market, with which the effect of inaccurate applications of B^{-1} is studied.

^{19} This should be compared with the shifted power method for the standard eigenvalue problem Ax = λx. When applied with A − µI, the dominating eigenvalue can be different from that of A.


Fig. 5.1. For Example 13, the convergence of the Hermitian Lanczos method (depicted with ’o’) and the optimal Arnoldi method (depicted with ’+’), drawn vertically. The first 10 steps are shown. Exact eigenvalues of A are on the right.

step j              1        2        3       4       5       6       7       8       9       10
Rayleigh quotient   -0.0326  -0.0031  0.2633  0.6609  1.0712  1.4502  1.7905  2.0915  2.3531  2.5761
Using (4.6)         -0.9990  -1.8060  2.3869  2.7896  3.0476  3.2096  3.3135  3.3825  3.4298  3.4632

Table 5.1. For Example 13, the power method used first with the Rayleigh quotient and then with (4.6).

Example 13. Let A ∈ C^{n×n} be Hermitian with its eigenvalues drawn randomly from a normal Gaussian distribution. Take B = I in (2.4) so that we have a standard eigenvalue problem. If A is diagonal, let the starting vector have equal entries. The purpose is to mimic the carefully documented test for Hermitian eigensolvers; see [4, Example 7.1]. Therefore we also take n = 10^3. In Figure 5.1 we have compared the Hermitian Lanczos method (with full orthogonalization) against the optimal Arnoldi method, drawn vertically while the iteration number runs horizontally. In Table 5.1 the power method is applied by using first the Rayleigh quotient and then (4.6) for the dominating eigenvalue approximation. Ten steps were taken. Altogether, the optimal Arnoldi method performed better than the Hermitian Lanczos method, although not by a wide margin. The differences were largest at the early steps. In applying the power method, the Rayleigh quotient clearly loses against using (4.6).
Example 14. A Markov model of a random walk on a triangular grid [20, Section 2.5.1] is a well-documented test for basic iterative eigensolvers; see [20, Example 4.1] and [20, Example 6.1].^{20} The problem is standard such that the eigenvalues at the right end of the spectrum are of interest. Here the Matlab script of [20, p. 44] is used to generate A ∈ R^{n×n} while B = I in (2.4). We took n = 5050. The starting vector was randn(n, 1) divided by its norm. In Figure 5.2 we have compared the classical Arnoldi method against the optimal Arnoldi method, drawn vertically while the iteration number runs horizontally. Since now A is a nonnormal matrix, the problem is tougher. Although A is real, there are complex eigenvalues and complex Ritz values. Whenever

^{20} Also to be found at the Matrix Market, http://math.nist.gov/MatrixMarket/.


Fig. 5.2. For Example 14, the convergence of the classical Arnoldi method (depicted with ’o’) and the optimal Arnoldi method (depicted with ’+’), drawn vertically. The first 12 steps are shown. Numerically computed eigenvalues of A are on the right.

the optimal Arnoldi method yields an extreme Ritz value appearing as a pair (i.e., two genuinely complex extreme Ritz values), the approximation is of the same order as that given by the classical Arnoldi method. Whenever the optimal Arnoldi method yields a single real extreme Ritz value, the approximation is better than that given by the classical Arnoldi method.
Example 15. This is the generalized eigenvalue problem BFWAVE: Bounded Finline Dielectric Waveguide from the Matrix Market. We used n = 782. Then A ∈ R^{n×n} is not Hermitian while B ∈ R^{n×n} is Hermitian, although indefinite. Without aiming at any particular eigenvalues, now the inversion must be performed in any case. We assume an inaccurate application of the inverse. To this end B^{-1} is replaced with

B^{-1} ∗ (1 + E ∗ ‖B^{-1}‖ ∗ 10^{-6}),

where E = R/‖R‖ with R = randn(n, n), so as to model inaccuracies in the computation of Qk. In Figure 5.3 we have compared the Arnoldi method (using Zk = Qk in the partial equivalence) and the optimal Arnoldi method. At the left end of the spectrum the optimal Arnoldi method behaves better whereas at the right end the Arnoldi method slightly wins.

REFERENCES

[1] E. Asplund, Inverses of matrices {a_ij} which satisfy a_ij = 0 for j > i + p, Math. Scand., 7 (1959), pp. 57–60.
[2] Z. Bai, J. Demmel, J. J. Dongarra, A. Ruhe, and H. A. van der Vorst, eds., Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide, SIAM, Philadelphia, PA, USA, 2000.
[3] C. A. Beattie, M. Embree, and D. C. Sorensen, Convergence of polynomial restart Krylov methods for eigenvalue computations, SIAM Rev., 47 (2005), pp. 492–515.
[4] J. W. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, 1997.
[5] T. Ericsson and A. Ruhe, The spectral transformation Lanczos method for the numerical solution of large sparse generalized symmetric eigenvalue problems, Math. Comp., 35 (1980), pp. 1251–1268.


Fig. 5.3. For Example 15, the convergence of the classical Arnoldi method (depicted with 'o') and the optimal Arnoldi method (depicted with '+'), drawn vertically. The first 11 steps are shown. The real parts of the numerically computed eigenvalues of B^{-1}A are on the right.

[6] H.-r. Fang and Y. Saad, A filtered Lanczos procedure for extreme and interior eigenvalue problems, SIAM J. Sci. Comput., 34 (2012), pp. A2220–A2246.
[7] D. Fokkema, G. Sleijpen, and H. van der Vorst, Jacobi-Davidson style QR and QZ algorithms for the reduction of matrix pencils, SIAM J. Sci. Comput., 20 (1998), pp. 94–125.
[8] G. H. Golub and C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press, 3rd ed., 1996.
[9] M. Huhtanen, A matrix nearness problem related to iterative methods, SIAM J. Numer. Anal., 39 (2001), pp. 407–422.
[10] M. Huhtanen, Differential geometry of matrix inversion, Math. Scand., 107 (2010), pp. 267–284.
[11] M. Huhtanen and R. M. Larsen, Exclusion and inclusion regions for the eigenvalues of a normal matrix, SIAM J. Matrix Anal. Appl., 23 (2002), pp. 1070–1091.
[12] M. Huhtanen and A. Perämäki, Factoring matrices into the product of circulant and diagonal matrices, submitted manuscript, 2013.
[13] V. A. Khatskevich, M. I. Ostrovskii, and V. S. Shulman, Linear fractional relations for operators, Math. Nachr., 279 (2006), pp. 875–890.
[14] B. Kågström, D. Kressner, E. S. Quintana-Ortí, and G. Quintana-Ortí, Blocked algorithms for the reduction to Hessenberg-triangular form revisited, BIT, 48 (2008), pp. 563–584.
[15] A. Kuijlaars, Convergence analysis of Krylov subspace iterations with methods from potential theory, SIAM Rev., 48 (2006), pp. 3–40.
[16] C. B. Moler and G. W. Stewart, An algorithm for generalized matrix eigenvalue problems, SIAM J. Numer. Anal., 10 (1973), pp. 241–256.
[17] B. Parlett, The Rayleigh quotient iteration and some generalizations for nonnormal matrices, Math. Comp., 28 (1974), pp. 679–693.
[18] B. Parlett, The Symmetric Eigenvalue Problem, Classics in Applied Mathematics 20, SIAM, Philadelphia, 1997.
[19] A. Ran and M. Wojtylak, Eigenvalues of rank one perturbations of unstructured matrices, Linear Algebra Appl., 437 (2012), pp. 589–600.
[20] Y. Saad, Numerical Methods for Large Eigenvalue Problems, 2nd ed., SIAM, Philadelphia, 2011.
[21] G. Sleijpen, A. Booten, D. Fokkema, and H. van der Vorst, Jacobi-Davidson type methods for generalized eigenproblems and polynomial eigenproblems, BIT, 36 (1996), pp. 595–633.
[22] D. C. Sorensen, Truncated QZ methods for large scale generalized eigenvalue problems, ETNA, 7 (1998), pp. 141–162.
[23] D. C. Sorensen, Numerical methods for large eigenvalue problems, in Acta Numerica, Cambridge University Press, Cambridge, UK, 2002, pp. 519–584.
[24] F. Tisseur and K. Meerbergen, The quadratic eigenvalue problem, SIAM Rev., 43 (2001),

pp. 235–286. [25] L.N. Trefethen and D. Bau, III, Numerical Linear Algebra, SIAM, Philadelphia, 1997.