Deflation by Restriction for the Inverse-Free Preconditioned Krylov Subspace Method
NUMERICAL ALGEBRA, CONTROL AND OPTIMIZATION
Volume 6, Number 1, March 2016, pp. 55-71
doi:10.3934/naco.2016.6.55

DEFLATION BY RESTRICTION FOR THE INVERSE-FREE PRECONDITIONED KRYLOV SUBSPACE METHOD

Qiao Liang
Department of Mathematics, University of Kentucky, Lexington, KY 40506-0027, USA

Qiang Ye*
Department of Mathematics, University of Kentucky, Lexington, KY 40506-0027, USA

(Communicated by Xiaoqing Jin)

Abstract. A deflation by restriction scheme is developed for the inverse-free preconditioned Krylov subspace method for computing a few extreme eigenvalues of the definite symmetric generalized eigenvalue problem Ax = λBx. The convergence theory for the inverse-free preconditioned Krylov subspace method is generalized to include this deflation scheme, and numerical examples are presented to demonstrate the convergence properties of the algorithm with the deflation scheme.

1. Introduction. The definite symmetric generalized eigenvalue problem for (A, B) is to find λ ∈ R and x ∈ R^n with x ≠ 0 such that

    Ax = λBx                                                          (1)

where A, B are n × n symmetric matrices and B is positive definite. The eigenvalue problem (1), also referred to as a pencil eigenvalue problem (A, B), arises in many scientific and engineering applications, such as structural dynamics, quantum mechanics, and machine learning. The matrices involved in these applications are usually large and sparse, and only a few of the eigenvalues are desired.

Iterative methods such as the Lanczos algorithm and the Arnoldi algorithm are some of the most efficient numerical methods developed in the past few decades for computing a few eigenvalues of a large-scale eigenvalue problem; see [1, 11, 19]. Their speed of convergence depends on the spectral distribution of (1), and they may suffer from slow convergence when the desired eigenvalues are not well separated. Preconditioning techniques may be used to accelerate the convergence of these iterative methods.
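The pencil (1) and the B-orthogonality of its eigenvectors can be illustrated with a small dense computation. The sketch below (added for illustration; the paper's setting is large and sparse, where such factorizations are avoided) reduces (1) to a standard symmetric eigenproblem via the Cholesky factor of B:

```python
import numpy as np

# Small dense example of the definite pencil (1): A symmetric, B
# symmetric positive definite. Reference computation only.
rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                    # symmetric A
N = rng.standard_normal((n, n))
B = N @ N.T + n * np.eye(n)          # symmetric positive definite B

L = np.linalg.cholesky(B)            # B = L L^T
Linv = np.linalg.inv(L)
lam, U = np.linalg.eigh(Linv @ A @ Linv.T)   # standard symmetric problem
X = Linv.T @ U                       # eigenvectors of the pencil (A, B)

assert np.allclose(A @ X, B @ X @ np.diag(lam))  # A x = lambda B x
assert np.allclose(X.T @ B @ X, np.eye(n))       # eigenvectors B-orthonormal
```

The B-orthonormality X^T B X = I seen here is exactly the normalization used for the converged eigenvectors in the deflation schemes discussed below.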
2010 Mathematics Subject Classification. Primary: 65F15.
Key words and phrases. Generalized eigenvalue problem, Krylov subspace, deflation.
Research supported in part by NSF under Grants DMS-1317424 and DMS-1318633.
* Corresponding author: Qiang Ye.

One of the most effective techniques is the shift-and-invert transformation. However, for truly large problems, it may be too expensive or even infeasible to employ the shift-and-invert method, as it requires the LU factorization of a shifted matrix. To avoid the fill-in of zero entries of a sparse matrix caused by the LU factorization, one can consider incomplete factorizations that are used as preconditioners for solving linear systems. Several methods have been proposed that can effectively accelerate convergence of an eigenvalue algorithm without using the (complete) LU factorization of the shift-and-invert. The Jacobi-Davidson method [20, 4], the JDCG algorithm [16], the locally optimal block preconditioned conjugate gradient method (LOBPCG) [6, 7], and the inverse-free preconditioned Krylov subspace method [5, 14] are some of such methods, among many others (see [2, 3, 6, 7, 12, 13, 15, 21, 22, 24, 26] for example).

The inverse-free preconditioned Krylov subspace method of [5] is a Krylov subspace projection method that computes the smallest (or the largest) eigenvalues of (1). The method is based on an inner-outer iteration that does not require the inversion of B or of any shifted matrix A − λB. Given an approximate eigenvector x_k and its Rayleigh quotient ρ_k, it approximates the smallest eigenvalue iteratively through the Rayleigh-Ritz projection on the Krylov subspace

    K_m(H_k, x_k) := span{x_k, H_k x_k, H_k^2 x_k, ..., H_k^m x_k}    (2)

where H_k := A − ρ_k B. It is proved in [5] that this method converges at least linearly, with a rate determined by the spectral separation of the smallest eigenvalue of H_k.
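A bare-bones sketch of this inner-outer iteration follows (unpreconditioned, dense, on an illustrative diagonal toy problem; function names are ours, and this is not the eigifp implementation):

```python
import numpy as np

def geigh(A, B):
    # dense solver for A x = lambda B x with B symmetric positive
    # definite, via the Cholesky factor of B; eigenvalues ascending
    L = np.linalg.cholesky(B)
    Linv = np.linalg.inv(L)
    w, U = np.linalg.eigh(Linv @ A @ Linv.T)
    return w, Linv.T @ U

def inverse_free_step(A, B, x, m=8):
    # one outer iteration: build K_m(H_k, x_k) of (2) and apply the
    # Rayleigh-Ritz projection of the pencil (A, B) on it
    rho = (x @ A @ x) / (x @ B @ x)          # Rayleigh quotient rho_k
    H = A - rho * B                          # H_k = A - rho_k B
    K = np.empty((len(x), m + 1))
    K[:, 0] = x
    for j in range(m):
        K[:, j + 1] = H @ K[:, j]            # x, Hx, ..., H^m x
    Z, _ = np.linalg.qr(K)                   # orthonormal basis of K_m
    mu, Y = geigh(Z.T @ A @ Z, Z.T @ B @ Z)  # Rayleigh-Ritz on (A, B)
    x_new = Z @ Y[:, 0]                      # Ritz vector of smallest Ritz value
    return mu[0], x_new / np.linalg.norm(x_new)

# toy problem: B = I and A diagonal, so the smallest eigenvalue is 1
n = 30
A = np.diag(np.arange(1.0, n + 1))
B = np.eye(n)
x = np.ones(n)
for _ in range(30):
    mu, x = inverse_free_step(A, B, x)
assert abs(mu - 1.0) < 1e-8                  # converged to the smallest eigenvalue
```

Note that no inverse of B or of a shifted matrix appears: each outer step costs only matrix-vector products with A and B plus a small dense projected problem.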
This convergence theory leads to a preconditioning scheme that accelerates the convergence through some equivalent congruent transformation based on incomplete factorizations. This procedure, however, computes one eigenvalue (the smallest) only. To compute additional eigenvalues, a deflation technique needs to be used. Note that a block version developed in [18] can compute several eigenvalues simultaneously, but it is efficient largely for severely clustered eigenvalues.

Deflation processes are standard methods used by iterative eigenvalue algorithms to compute additional eigenvalues after some eigenvalues have converged. Two widely used deflation techniques are the Wielandt deflation (also known as deflation by subtraction), where the matrix is modified with a low-rank perturbation to move converged eigenvalues to a different part of the spectrum, and deflation by restriction, where approximate eigenvectors and relevant subspaces are projected to the orthogonal complement of the converged eigenvectors; see [17, 19, 25]. Both of these deflation schemes can be used in the standard Lanczos and Arnoldi algorithms. There are variations of these schemes that are suitable for some particular methods. For example, the implicitly restarted Arnoldi algorithm [9, 10, 11] employs an elaborate deflation scheme (locking and purging) that is similar to deflation by restriction. The Jacobi-Davidson method [20, 23] incorporates a partial Schur decomposition deflation.

For the inverse-free preconditioned Krylov subspace method [5, 14], a natural deflation scheme is the Wielandt deflation, where the method is implicitly applied to a low-rank modified problem. Specifically, assume that ℓ eigenpairs (λ_i, v_i) (for 1 ≤ i ≤ ℓ) of (A, B) have been found, and let V_ℓ = [v_1, ..., v_ℓ] with V_ℓ^T B V_ℓ = I and Λ_ℓ = diag(λ_1, ..., λ_ℓ).
If λ_{ℓ+1} ≤ ... ≤ λ_n are the remaining eigenvalues, then λ_{ℓ+1} is the smallest eigenvalue of

    (A_ℓ, B) := (A + (B V_ℓ) Σ (B V_ℓ)^T, B)                          (3)

where Σ = diag(α − λ_i) (1 ≤ i ≤ ℓ) with α any value chosen to be greater than λ_{ℓ+1}. Therefore λ_{ℓ+1} can be computed by applying the inverse-free preconditioned Krylov subspace algorithm to (A_ℓ, B), and all the known convergence theory is directly applicable. This deflation method is implemented in the MATLAB program eigifp of [14].

In some applications, the deflation (3) alters the structure of the problem, which may cause some difficulties in its implementation. Consider computing a few smallest singular values of a matrix C. In [8], the inverse-free Krylov subspace method is implicitly applied to A = C^T C to compute the singular values of C. To carry out the Rayleigh-Ritz projection, it is proposed to apply the projection directly to C rather than to A = C^T C; namely, we compute the smallest singular values of Y_m^T C Z_m, rather than the smallest eigenvalues of Z_m^T C^T C Z_m, where Z_m and Y_m are orthogonal bases of the subspaces K_m and C K_m respectively. In this way, the singular values of C are approximated by the singular values of the projection matrix. Avoiding squaring of the singular values allows tiny singular values to be computed more accurately (see [8]). However, the deflation (3) changes the problem to the one for C^T C + V_ℓ Σ V_ℓ^T, for which the Ritz values cannot be formulated as the singular values of some projection matrix. In this setting, therefore, it is more desirable to work with the original problem without the low-rank modification.

Deflation by restriction is a method that ensures convergence to the desired eigenpairs through projecting all approximate eigenvectors (or subspaces of candidate eigenvectors) to the B-orthogonal complement of V_ℓ := span{v_1, ..., v_ℓ}.
This certainly ensures the resulting approximate eigenvectors are the ones desired, but the process alters the underlying algorithm, which may change its convergence properties. For the inverse-free Krylov subspace method, we can apply Rayleigh-Ritz projections directly on (A, B) but replace the Krylov subspace K_m(A − ρ_k B, x_k) by the projected subspace

    K_m((I − V_ℓ V_ℓ^T B)(A − ρ_k B), (I − V_ℓ V_ℓ^T B) x_k)          (4)

This enforces that all approximate eigenvectors obtained are in the B-orthogonal complement of V_ℓ, but the resulting algorithm is no longer equivalent to the original one, and there is no guarantee on the convergence of the new process. Note that the spectral projection I − V_ℓ V_ℓ^T B does not generally diagonalize the matrix A − ρ_k B (for B ≠ I), and hence the projected subspace (4) is not equivalent to the one produced by A − ρ_k B and a projected initial vector (I − V_ℓ V_ℓ^T B) x_k. Nevertheless, this deflation strategy is implemented in [8] for the singular value problem, and it is reported numerically that it has convergence characteristics similar to the original process.

In this paper, we will show that the original convergence theory in [5] can be generalized to the algorithm modified with the deflation by restriction scheme based on (4). We will show that the algorithm with deflation by restriction still converges globally and, under some conditions, converges locally at least linearly. However, the rate of convergence is now determined by the eigenvalues of (I − B V_ℓ V_ℓ^T)(A − ρ_k B)(I − V_ℓ V_ℓ^T B), as illustrated in Section 3. Numerical experiments will be given to demonstrate our theoretical results.

The paper is organized as follows. We briefly review the inverse-free preconditioned Krylov subspace method in Section 2. In Section 3 we introduce the new deflation strategy and prove some global and local convergence results that are similar to the ones in [5]. We will present some numerical examples in Section 4 to illustrate the convergence properties.
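A minimal sketch of deflation by restriction based on (4) is given below. It applies the projector P = I − V_ℓ V_ℓ^T B to every Krylov vector and then performs the Rayleigh-Ritz projection on the original, unmodified pencil (A, B). This is illustrative code written for this note (not the implementation of [8] or eigifp), on a toy diagonal problem where the two smallest eigenvalues are deflated:

```python
import numpy as np

def geigh(A, B):
    # dense solver for A x = lambda B x with B symmetric positive
    # definite, via the Cholesky factor of B; eigenvalues ascending
    L = np.linalg.cholesky(B)
    Linv = np.linalg.inv(L)
    w, U = np.linalg.eigh(Linv @ A @ Linv.T)
    return w, Linv.T @ U

def deflated_step(A, B, V, x, m=8):
    # P = I - V_l V_l^T B projects onto the B-orthogonal complement
    # of the converged eigenvectors (the columns of V)
    P = np.eye(len(x)) - V @ (V.T @ B)
    x = P @ x
    x = x / np.linalg.norm(x)
    rho = (x @ A @ x) / (x @ B @ x)
    # Arnoldi-style orthonormal basis of the projected subspace (4)
    Z = [x]
    w = x
    for _ in range(m):
        w = P @ ((A - rho * B) @ w)
        for z in Z:                          # modified Gram-Schmidt
            w = w - (z @ w) * z
        nw = np.linalg.norm(w)
        if nw < 1e-10:                       # breakdown: subspace exhausted
            break
        w = w / nw
        Z.append(w)
    Z = np.column_stack(Z)
    # Rayleigh-Ritz on the original pencil (A, B), not a modified one
    mu, Y = geigh(Z.T @ A @ Z, Z.T @ B @ Z)
    x_new = Z @ Y[:, 0]
    return mu[0], x_new / np.linalg.norm(x_new)

n, l = 20, 2
A = np.diag(np.arange(1.0, n + 1))           # eigenvalues 1, 2, ..., 20
B = np.eye(n)
V = np.eye(n)[:, :l]                         # deflate lambda_1 = 1, lambda_2 = 2
x = np.ones(n)
for _ in range(30):
    mu, x = deflated_step(A, B, V, x)
assert abs(mu - 3.0) < 1e-6                  # converges to lambda_3
assert np.allclose(V.T @ (B @ x), 0)         # iterate stays B-orthogonal to V
```

As the text notes, nothing in this construction guarantees convergence a priori; the point of Section 3 is that the convergence theory of [5] nonetheless carries over.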
We conclude the paper with some remarks in Section 5. Throughout the paper, we use ‖ · ‖ to denote the 2-norm.

2. Inverse-free Preconditioned Krylov Subspace Method. Given a vector x, the Rayleigh quotient ρ(x) = (x^T A x)/(x^T B x) is the best approximation to an eigenvalue in the sense that α = ρ(x) minimizes the 2-norm of the residual Ax − αBx.
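This optimality can be checked numerically. In the sketch below the residual is measured in the B^{-1}-norm, which coincides with the 2-norm when B = I; this choice of norm for general B is our interpretation, added for the sketch:

```python
import numpy as np

# Check that alpha = rho(x) = (x^T A x)/(x^T B x) minimizes the
# residual norm ||Ax - alpha Bx|| measured in the B^{-1}-norm
# (equal to the 2-norm when B = I). Random small dense pencil.
rng = np.random.default_rng(1)
n = 7
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                            # symmetric A
N = rng.standard_normal((n, n))
B = N @ N.T + n * np.eye(n)                  # symmetric positive definite B
x = rng.standard_normal(n)

rho = (x @ A @ x) / (x @ B @ x)              # Rayleigh quotient rho(x)

def res_norm(alpha):
    r = A @ x - alpha * (B @ x)
    return np.sqrt(r @ np.linalg.solve(B, r))   # ||r|| in the B^{-1}-norm

# rho is the minimizer: perturbing alpha enlarges the residual norm
assert res_norm(rho) <= res_norm(rho + 1e-3)
assert res_norm(rho) <= res_norm(rho - 1e-3)
```

Expanding ‖Ax − αBx‖²_{B^{-1}} as a quadratic in α and setting the derivative to zero gives α = (x^T A x)/(x^T B x), which is the calculation the assertions above probe.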