
Linear Algebra and its Applications 371 (2003) 315–331 www.elsevier.com/locate/laa

Condition numbers for Lanczos bidiagonalization with complete reorthogonalization

A. Malyshev a,∗, M. Sadkane b

a Department of Informatics, University of Bergen, N-5020 Bergen, Norway
b Département de Mathématiques, Université de Bretagne Occidentale, 6, Av. Le Gorgeu, BP 809, 29285 Brest Cedex, France

Received 4 March 2002; accepted 4 March 2003
Submitted by L.N. Trefethen

Abstract

We derive exact and computable formulas for the condition numbers characterizing the forward instability in Lanczos bidiagonalization with complete reorthogonalization. One series of condition numbers is responsible for stability of Krylov spaces, the second for stability of orthonormal bases in the Krylov spaces and the third for stability of the bidiagonal form. The behaviors of these condition numbers are illustrated numerically on several examples.
© 2003 Elsevier Inc. All rights reserved.

Keywords: Lanczos bidiagonalization; Krylov subspaces; Condition numbers; Stability

1. Introduction

Bidiagonalization of matrices by orthogonal transformations is an important part of many algorithms of numerical linear algebra. For example, it is the first step in the widely used Singular Value Decomposition algorithm [5]. Partial or truncated bidiagonalization by means of the Lanczos process is useful in approximating extreme, and especially the largest, singular values and the associated singular vectors of large sparse matrices, and in solving least squares problems [1].

∗ Corresponding author. E-mail addresses: [email protected] (A. Malyshev), [email protected] (M. Sadkane).


The standard Lanczos bidiagonalization of a matrix A ∈ R^{m×n}, m ≥ n, is briefly described as follows (more details are found in [1,5]):

Algorithm 1 (Lanczos Bidiagonalization of A).
• Choose v_1 ∈ R^n with ‖v_1‖_2 = 1
• Set s = A v_1; α_1 = ‖s‖_2; u_1 = s/α_1
• for j = 2, 3, . . . do
      t = A^T u_{j−1} − α_{j−1} v_{j−1}; β_{j−1} = ‖t‖_2; v_j = t/β_{j−1}
      s = A v_j − β_{j−1} u_{j−1}; α_j = ‖s‖_2; u_j = s/α_j
  end for
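For concreteness, here is a minimal NumPy sketch of Algorithm 1 (the function name and interface are ours, not the paper's); it omits any breakdown check, i.e., it assumes α_j ≠ 0 and β_j ≠ 0 throughout:

    import numpy as np

    def lanczos_bidiag(A, v1, steps):
        """Plain Lanczos bidiagonalization (Algorithm 1): returns U, V with
        (ideally) orthonormal columns and the coefficients alpha (diagonal
        of B1) and beta (superdiagonal of B1). No breakdown check: all
        alpha_j, beta_j are assumed nonzero."""
        m, n = A.shape
        U, V = np.zeros((m, steps)), np.zeros((n, steps))
        alpha, beta = np.zeros(steps), np.zeros(steps - 1)
        V[:, 0] = v1 / np.linalg.norm(v1)
        s = A @ V[:, 0]
        alpha[0] = np.linalg.norm(s)
        U[:, 0] = s / alpha[0]
        for j in range(1, steps):
            t = A.T @ U[:, j - 1] - alpha[j - 1] * V[:, j - 1]
            beta[j - 1] = np.linalg.norm(t)
            V[:, j] = t / beta[j - 1]
            s = A @ V[:, j] - beta[j - 1] * U[:, j - 1]
            alpha[j] = np.linalg.norm(s)
            U[:, j] = s / alpha[j]
        return U, V, alpha, beta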

The success of this algorithm is partly due to its simplicity. For its implementation, one only needs two subroutines that efficiently compute the matrix–vector products with A and with its transpose A^T. If the iteration is continued until j = n in exact arithmetic, then Algorithm 1 generates two matrices with orthonormal columns, V = [v_1, v_2, . . . , v_n] ∈ R^{n×n} and U_1 = [u_1, u_2, . . . , u_n] ∈ R^{m×n}, as well as an upper bidiagonal matrix

    B_1 = \begin{pmatrix}
    \alpha_1 & \beta_1 & & & \\
    & \alpha_2 & \beta_2 & & \\
    & & \ddots & \ddots & \\
    & & & \ddots & \beta_{n-1} \\
    & & & & \alpha_n
    \end{pmatrix}    (1)

such that

    A V = U_1 B_1, \qquad A^T U_1 = V B_1^T.    (2)

Accurate information on the largest singular values and vectors of A is often available after a few iterations j ≪ n of Algorithm 1. That is, only the leading j × j part of B_1 and the first j columns of V and U_1 suffice to approximate the desired singular values and vectors of A (see [1, Section 7.6.4]). In exact arithmetic, the vectors v_1, . . . , v_j and u_1, . . . , u_j form orthonormal bases of the Krylov spaces

    K_j(A^T A, v_1) = \mathrm{Span}\{v_1, A^T A v_1, \ldots, (A^T A)^{j-1} v_1\}    (3)

and

    L_j(A A^T, u_1) = \mathrm{Span}\{u_1, A A^T u_1, \ldots, (A A^T)^{j-1} u_1\}    (4)

respectively. These bases will be denoted by B_j = {v_1, v_2, . . . , v_j} and C_j = {u_1, u_2, . . . , u_j}. The Krylov spaces provide a convenient tool for the derivation of theoretical estimates of the convergence rate to the extreme singular values. Note that the vectors u_1, . . . , u_n can be completed by orthonormal vectors u_{n+1}, . . . , u_m so that the matrix U = [u_1, . . . , u_m] ∈ R^{m×m} is orthogonal and satisfies

    A V = U B, \qquad A^T U = V B^T    (5)

with

    B = \begin{pmatrix} B_1 \\ 0 \end{pmatrix} \in \mathbb{R}^{m \times n}.    (6)
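As a quick sanity check of the relations (2), the sketch above can be run to completion (steps = n) on a small random matrix; both residuals should then be at roundoff level (the test matrix and seed are ours):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((8, 5))
    U1, V, alpha, beta = lanczos_bidiag(A, np.eye(5)[:, 0], steps=5)
    B1 = np.diag(alpha) + np.diag(beta, 1)        # the matrix (1)
    print(np.linalg.norm(A @ V - U1 @ B1))        # first relation in (2)
    print(np.linalg.norm(A.T @ U1 - V @ B1.T))    # second relation in (2)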

Unfortunately, the vectors v_j and u_j generated by the Lanczos iteration of Algorithm 1 in finite precision arithmetic quickly lose their orthogonality as j increases. A more expensive algorithm, based on a complete reorthogonalization Lanczos scheme with the help of Householder transformations, does not suffer from this loss of orthogonality. This algorithm is equivalent to Algorithm 1 in exact arithmetic when α_j ≠ 0 and β_j ≠ 0 for all j. Below we give a pseudocode of this algorithm and refer to a similar scheme for the standard symmetric Lanczos in [5, p. 487]. Each matrix of the form H_j^v or H_j^u in this pseudocode stands for a Householder reflector.

Algorithm 2 (Lanczos Bidiagonalization with Reorthogonalization).
• Choose v_1 ∈ R^n with ‖v_1‖_2 = 1 and determine H_1^v so that H_1^v v_1 = e_1
• Set s = A v_1, determine H_1^u so that H_1^u s = α_1 e_1, and set u_1 = H_1^u e_1
• for j = 2, 3, . . . do
      t = (H_{j−1}^v · · · H_1^v)(A^T u_{j−1} − α_{j−1} v_{j−1})
      determine H_j^v so that H_j^v t = (t̂_1, . . . , t̂_{j−1}, β_{j−1}, 0, . . . , 0)^T
      v_j = H_1^v · · · H_j^v e_j
      s = (H_{j−1}^u · · · H_1^u)(A v_j − β_{j−1} u_{j−1})
      determine H_j^u so that H_j^u s = (ŝ_1, . . . , ŝ_{j−1}, α_j, 0, . . . , 0)^T
      set u_j = H_1^u · · · H_j^u e_j
  end for
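The following sketch implements Algorithm 2 with explicitly stored Householder vectors (helper functions and names are ours, not the authors' code). One caveat: a sign-stable reflector may produce α_j, β_j of either sign; this differs from the nonnegative convention of Algorithm 1 only by ±1 scalings of the columns of U and V.

    import numpy as np

    def householder(x, k):
        """Unit vector w of a reflector H = I - 2 w w^T that zeroes x
        below position k and leaves x[:k] untouched (w = 0 encodes H = I)."""
        x = np.asarray(x, dtype=float)
        w = np.zeros_like(x)
        sigma = np.linalg.norm(x[k:])
        if sigma == 0.0:
            return w
        pivot = -sigma if x[k] >= 0 else sigma   # sign chosen for stability
        w[k] = x[k] - pivot
        w[k + 1:] = x[k + 1:]
        return w / np.linalg.norm(w)

    def apply_refl(ws, x, reverse=False):
        """Apply H_k ... H_1 x (default) or H_1 ... H_k x (reverse=True)."""
        for w in (reversed(ws) if reverse else ws):
            x = x - 2.0 * w * (w @ x)
        return x

    def lanczos_bidiag_cr(A, v1, steps):
        """Lanczos bidiagonalization with complete reorthogonalization:
        a sketch of Algorithm 2 (alpha/beta may carry signs, see above)."""
        m, n = A.shape
        em, en = np.eye(m), np.eye(n)
        U, V = np.zeros((m, steps)), np.zeros((n, steps))
        alpha, beta = np.zeros(steps), np.zeros(steps - 1)
        wv = [householder(v1 / np.linalg.norm(v1), 0)]
        V[:, 0] = apply_refl(wv, en[:, 0], reverse=True)   # v1 up to sign
        s = A @ V[:, 0]
        wu = [householder(s, 0)]
        alpha[0] = apply_refl(wu, s)[0]
        U[:, 0] = apply_refl(wu, em[:, 0], reverse=True)
        for j in range(1, steps):
            t = apply_refl(wv, A.T @ U[:, j - 1] - alpha[j - 1] * V[:, j - 1])
            wv.append(householder(t, j))
            beta[j - 1] = apply_refl([wv[-1]], t)[j]
            V[:, j] = apply_refl(wv, en[:, j], reverse=True)
            s = apply_refl(wu, A @ V[:, j] - beta[j - 1] * U[:, j - 1])
            wu.append(householder(s, j))
            alpha[j] = apply_refl([wu[-1]], s)[j]
            U[:, j] = apply_refl(wu, em[:, j], reverse=True)
        return U, V, alpha, beta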

Ideally, starting with v_1, all that we can hope to get from Algorithm 2 in finite precision is the exact bidiagonalization of a perturbation A + Δ of A. More precisely, we assume that Algorithm 2 computes orthogonal matrices Ṽ = [ṽ_1, . . . , ṽ_n], where ṽ_1 = v_1, and Ũ = [ũ_1, . . . , ũ_m], and an upper bidiagonal matrix B̃ such that the relations

    (A + Δ) Ṽ = Ũ B̃, \qquad (A + Δ)^T Ũ = Ṽ B̃^T    (7)

are satisfied exactly. The main concern of this paper is the analysis of the difference between the exact quantities U, V, B and their computed counterparts Ũ, Ṽ, B̃ under variation of an infinitesimal perturbation Δ. More precisely, we are interested in the stability analysis of the Krylov spaces, the corresponding orthonormal bases and the bidiagonal reduction generated by the idealized Algorithm 2. When large perturbations occur in the computed Ũ, Ṽ, B̃, such a phenomenon is called forward instability of the Lanczos bidiagonalization. More details on this phenomenon in the context of the QR factorization can be found in [9,10].
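To observe the phenomenon numerically, one can compare the factors computed for A and for A + Δ with a tiny random Δ. The sketch below (ours; it reuses lanczos_bidiag_cr from above and borrows the matrix of Example 1 in Section 4) prints the column-by-column difference of the two computed V factors and the sine of the largest principal angle between the corresponding Krylov subspaces; for this ill-conditioned A the former can be many orders of magnitude larger than ‖Δ‖, while the latter stays much smaller:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 16
    A = np.diag(np.full(n, -10.0)) + np.diag(np.full(n - 1, 36.0), 1)
    A[0, 0] = -7.0                     # the matrix of Example 1, Section 4
    v1 = np.eye(n)[:, 0]
    Delta = 1e-12 * rng.standard_normal((n, n))

    _, V, _, _ = lanczos_bidiag_cr(A, v1, n)
    _, Vt, _, _ = lanczos_bidiag_cr(A + Delta, v1, n)
    j = 10
    print(np.linalg.norm(Vt[:, :j] - V[:, :j]))       # basis difference
    s = np.linalg.svd(V[:, :j].T @ Vt[:, :j], compute_uv=False)
    print(np.sqrt(max(0.0, 1.0 - s.min() ** 2)))      # subspace difference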

Similar topics have been previously studied by Carpraux et al. [2], Kuznetsov [7] and Paige and Van Dooren [8]. The authors of [2] consider the Krylov spaces and orthonormal bases generated by the Arnoldi method and develop corresponding condition numbers using a first order analysis under infinitesimal perturbations of the matrix. The theory of [2] is extended in [7] to the case where the starting vector of the Arnoldi iteration is also subject to perturbation. In [8], the authors use the perturbation techniques due to Chang [3] to thoroughly analyze the sensitivity of the nonsymmetric (and symmetric) Lanczos tridiagonalization. They also derive condition numbers for the corresponding Krylov bases and spaces. As expected, their condition numbers coincide with those from [2] in the symmetric case.

Our approach is similar to the one in [2], and the reader is assumed to be familiar with the arguments developed in this reference. In Sections 2 and 3 we derive formulas for the condition numbers of the Krylov spaces and orthonormal bases generated by Algorithm 2, as well as the condition numbers associated with the bidiagonal reduction. Section 4 is devoted to numerical experiments.

Throughout this paper, the symbol ‖·‖_2 denotes the Euclidean norm or its induced matrix norm. The symbol ‖·‖_F denotes the Frobenius norm. The space spanned by the columns of a matrix B is denoted by Span{B}. The identity matrix of order j is denoted by I_j, or just I when its order is clear from the context. Its kth column is denoted by e_k. The notation I_j^i = (0_{i−j} ⋮ I_j) with j ≤ i denotes the j × i matrix whose first i − j columns are zero. Note that I_j^j = I_j. We also use the MATLAB-style notation: if E = (e_{k,l})_{1≤k≤n, 1≤l≤m} ∈ R^{n×m} and 1 ≤ i_1 ≤ i_2 ≤ n, 1 ≤ l_1 ≤ l_2 ≤ m, then E(i_1 : i_2, l_1 : l_2) denotes the submatrix of E formed by rows i_1 through i_2 and columns l_1 through l_2.

2. Derivation of formulas

Let us look for Ṽ and Ũ in the form Ṽ = (I + X)V and Ũ = (I + Y)U, where X ∈ R^{n×n} and Y ∈ R^{m×m}, and assume that the starting vector v_1 is not perturbed.

The latter immediately implies that X v_1 = 0. Since we work with infinitesimal perturbations only, the matrices X and Y are skew-symmetric. Owing to the orthogonality of Ṽ and Ũ, B̃ = Ũ^T (A + Δ) Ṽ. Discarding all quadratic terms in the identity B̃ = U^T (I + Y^T)(A + Δ)(I + X) V and using (5), we deduce

    B - \tilde B + U^T \Delta V + (U^T Y^T U) B + B (V^T X V) = 0.

The matrices X̄ = V^T X V and Ȳ = U^T Y U are also skew-symmetric and satisfy the following equation with Δ̄ = U^T Δ V:

    \bar Y B - B \bar X = \bar\Delta + B - \tilde B.    (8)

The Krylov space K_j := K_j(A^T A, v_1) = Span{v_1, . . . , v_j} is the linear span of the columns of V [I_j; 0]. Its perturbation K̃_j := K_j((A + Δ)^T (A + Δ), v_1) is that of (I + X) V [I_j; 0]. The definition of the matrices X and Y related by the Sylvester equation (8) clearly suggests that it is more convenient to work with [I_j; 0] and V^T (I + X) V [I_j; 0] = [I_j; 0] + X̄ [I_j; 0] instead of V [I_j; 0] and (I + X) V [I_j; 0]. The norm of X̄ [I_j; 0] can be used to measure the sensitivity or conditioning of the basis B_j to the perturbation Δ. The conditioning of the space K_j is determined via the norm of some part of X̄ [I_j; 0], as shown later in Section 3. Because of the similarities between the bases B_j and C_j and their corresponding Krylov spaces, we will not study the conditioning of C_j.

By taking the transpose of (8) and using the skew-symmetry of X̄ and Ȳ, we obtain

    -B^T \bar Y + \bar X B^T = \bar\Delta^T + (B - \tilde B)^T.    (9)

Then we take the components below the main diagonal in (8) and the components below the subdiagonal in (9). The operation of taking the components below the main diagonal is denoted by L^{(1)} and that of taking the components below the subdiagonal by L^{(2)}. Since L^{(1)}(B − B̃) = 0 and L^{(2)}(B^T − B̃^T) = 0, we have

    L^{(1)}(\bar Y B - B \bar X) = L^{(1)}(\bar\Delta),    (10)

    L^{(2)}(-B^T \bar Y + \bar X B^T) = L^{(2)}(\bar\Delta^T).    (11)

Note that the diagonal elements of X̄ and Ȳ equal zero because of the skew-symmetry. Moreover, the first column and row of X̄ are also equal to zero. Indeed, X̄ e_1 = V^T X V e_1 = V^T X v_1 = 0 and X̄^T e_1 = −X̄ e_1 = 0. The structure of X̄ and Ȳ can be described with the help of vectors x_p ∈ R^{n−p−1}, p = 1, . . . , n − 2, and y_q ∈ R^{m−q}, q = 1, . . . , m − 1:

    \bar X = \begin{pmatrix}
    0 & 0 & \cdots & \cdots & 0 \\
    0 & 0 & -x_1^T & & \\
    \vdots & & \ddots & -x_2^T & \\
    \vdots & & & \ddots & \\
    0 & x_1 & x_2 & \cdots & 0
    \end{pmatrix}, \qquad
    \bar Y = \begin{pmatrix}
    0 & -y_1^T & & & \\
    & 0 & -y_2^T & & \\
    \vdots & & \ddots & -y_3^T & \\
    & & & \ddots & \\
    y_1 & y_2 & y_3 & \cdots & 0
    \end{pmatrix}.

Similarly, the matrix Δ̄ can be represented with the help of vectors f_p ∈ R^{m−p}, p = 1, . . . , n, and g_q ∈ R^{n−q−1}, q = 1, . . . , n − 2, as

    \bar\Delta = \begin{pmatrix}
    \times & \times & -g_1^T & & \\
    & \times & \ddots & -g_2^T & \\
    & & \ddots & \ddots & \\
    & & & \ddots & \times \\
    f_1 & f_2 & f_3 & \cdots & \times
    \end{pmatrix} \in \mathbb{R}^{m \times n},

where the symbol × denotes a diagonal or upper diagonal element of Δ̄. We also introduce the lower triangular matrices

    \mathcal X = \begin{pmatrix}
    0 & 0 & \cdots & \cdots & 0 \\
    0 & 0 & & & \vdots \\
    \vdots & & \ddots & & \vdots \\
    0 & x_1 & x_2 & \cdots & 0
    \end{pmatrix}, \qquad
    \mathcal Y = \begin{pmatrix}
    0 & 0 & \cdots & \cdots & 0 \\
    0 & & & & \vdots \\
    \vdots & \ddots & & & \vdots \\
    y_1 & y_2 & y_3 & \cdots & 0
    \end{pmatrix}

such that X̄ = 𝒳 − 𝒳^T and Ȳ = 𝒴 − 𝒴^T. Then, using (10), (11) and the fact that

    L^{(1)}(\mathcal Y^T B) = 0, \quad L^{(1)}(B \mathcal X^T) = 0, \quad L^{(2)}(B^T \mathcal Y^T) = 0, \quad L^{(2)}(\mathcal X^T B^T) = 0,

we obtain a nonsingular system

    L^{(1)}(\mathcal Y B - B \mathcal X) = L^{(1)}(\bar\Delta),    (12)

    L^{(2)}(-B^T \mathcal Y + \mathcal X B^T) = L^{(2)}(\bar\Delta^T).    (13)

Eq. (12) is equivalent to the system

    f_1 = \alpha_1 y_1, \qquad f_i = \alpha_i y_i + \beta_{i-1} I_{m-i}^{m-i+1} y_{i-1} - \widehat B_i x_{i-1}, \quad i = 2, \ldots, n,    (14)

where

    \widehat B_i = B(i+1 : m, \; i+1 : n),    (15)

and Eq. (13) is equivalent to the system

    g_1 = -\check B_1^T y_1 + \beta_1 x_1, \qquad g_i = -\check B_i^T y_i + \alpha_i I_{n-i-1}^{n-i} x_{i-1} + \beta_i x_i, \quad i = 2, \ldots, n-2,    (16)

where

    \check B_i = B(i+1 : m, \; i+2 : n).    (17)

Finally, for j = 1, 2, . . . , n − 2 we have

    \begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_j \end{pmatrix} =
    \begin{pmatrix}
    \alpha_1 I_{m-1} & & & \\
    \beta_1 I_{m-2}^{m-1} & \alpha_2 I_{m-2} & & \\
    & \ddots & \ddots & \\
    & & \beta_{j-1} I_{m-j}^{m-j+1} & \alpha_j I_{m-j}
    \end{pmatrix}
    \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_j \end{pmatrix}
    -
    \begin{pmatrix}
    0 & & & \\
    \widehat B_2 & 0 & & \\
    & \ddots & \ddots & \\
    & & \widehat B_j & 0
    \end{pmatrix}
    \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_j \end{pmatrix},

    \begin{pmatrix} g_1 \\ g_2 \\ \vdots \\ g_j \end{pmatrix} =
    -\begin{pmatrix}
    \check B_1^T & & & \\
    & \check B_2^T & & \\
    & & \ddots & \\
    & & & \check B_j^T
    \end{pmatrix}
    \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_j \end{pmatrix}
    +
    \begin{pmatrix}
    \beta_1 I_{n-2} & & & \\
    \alpha_2 I_{n-3}^{n-2} & \beta_2 I_{n-3} & & \\
    & \ddots & \ddots & \\
    & & \alpha_j I_{n-j-1}^{n-j} & \beta_j I_{n-j-1}
    \end{pmatrix}
    \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_j \end{pmatrix}.

The above system can be written in a more compact representation, with obvious notations, as follows:

    f = M_{11} y - M_{12} x, \qquad g = -M_{21} y + M_{22} x.    (18)

From here we obtain

    x = M_x [f; g] \qquad\text{and}\qquad y = M_y [f; g]    (19)

with

    M_x = (M_{22} - M_{21} M_{11}^{-1} M_{12})^{-1} \, (M_{21} M_{11}^{-1} \;\; I_{N_j}), \qquad
    M_y = (M_{11} - M_{12} M_{22}^{-1} M_{21})^{-1} \, (I_{M_j} \;\; M_{12} M_{22}^{-1}),    (20)

where N_j = nj − (j + 1)(j + 2)/2 + 1 and M_j = mj − j(j + 1)/2. Thus, we have prepared the necessary formulas to compute the condition numbers of the orthonormal bases, the corresponding Krylov spaces and the bidiagonal reduction.
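To make the bookkeeping in (14)–(20) concrete, the following dense sketch (ours; intended only for small examples) assembles M11, M12, M21, M22 block by block and forms M_x and M_y exactly as in (20). It assumes all α_i and β_i of B are nonzero, so that M11 and M22 are invertible.

    import numpy as np

    def Isel(j, i):
        """The j-by-i selector I_j^i = (0_{i-j} | I_j) from Section 1."""
        return np.hstack([np.zeros((j, i - j)), np.eye(j)])

    def build_Mx_My(B, j):
        """Assemble M11, M12, M21, M22 of (18) for an m-by-n upper
        bidiagonal B and return Mx, My of (20). The loop index i below
        equals the text's block index minus one."""
        m, n = B.shape
        a, b = np.diag(B), np.diag(B, 1)          # alpha_i and beta_i
        ry = [m - i - 1 for i in range(j)]        # y_i in R^{m-i}
        rx = [n - i - 2 for i in range(j)]        # x_i in R^{n-i-1}
        Mj, Nj = sum(ry), sum(rx)
        oy, ox = np.cumsum([0] + ry), np.cumsum([0] + rx)
        M11, M12 = np.zeros((Mj, Mj)), np.zeros((Mj, Nj))
        M21, M22 = np.zeros((Nj, Mj)), np.zeros((Nj, Nj))
        for i in range(j):
            M11[oy[i]:oy[i+1], oy[i]:oy[i+1]] = a[i] * np.eye(ry[i])
            M22[ox[i]:ox[i+1], ox[i]:ox[i+1]] = b[i] * np.eye(rx[i])
            M21[ox[i]:ox[i+1], oy[i]:oy[i+1]] = B[i+1:, i+2:].T   # (17)
            if i > 0:
                M11[oy[i]:oy[i+1], oy[i-1]:oy[i]] = b[i-1] * Isel(ry[i], ry[i-1])
                M12[oy[i]:oy[i+1], ox[i-1]:ox[i]] = B[i+1:, i+1:]  # (15)
                M22[ox[i]:ox[i+1], ox[i-1]:ox[i]] = a[i] * Isel(rx[i], rx[i-1])
        iM11, iM22 = np.linalg.inv(M11), np.linalg.inv(M22)
        Mx = np.linalg.solve(M22 - M21 @ iM11 @ M12,
                             np.hstack([M21 @ iM11, np.eye(Nj)]))
        My = np.linalg.solve(M11 - M12 @ iM22 @ M21,
                             np.hstack([np.eye(Mj), M12 @ iM22]))
        return Mx, My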

3. Condition numbers

The condition number of the orthonormal basis B_j with respect to variation in the matrix A is the smallest constant κ such that

    d(\mathcal B_j, \widetilde{\mathcal B}_j) \le \kappa \, \frac{\|\Delta\|_F}{\|A\|_F},    (21)

where d(B_j, B̃_j) is the distance between the bases B_j and B̃_j ≡ {ṽ_1, ṽ_2, . . . , ṽ_j}, defined as (see [2,7])

    d(\mathcal B_j, \widetilde{\mathcal B}_j) = \min_{\widehat X = \widehat{\mathcal X} - \widehat{\mathcal X}^T \in \mathbb X} \left\| \widehat X \, [I_j; 0] \right\|_F

and

    \mathbb X = \{ X : X^* = -X, \; \|X\|_2 \ll 1, \; \text{and } \widetilde V = (I + X)V \}.

We have from (19) and (20):

    d(\mathcal B_j, \widetilde{\mathcal B}_j) \le \|x\|_2 = \left\| M_x [f; g] \right\|_2 \le \|M_x\|_2 \, \|[f; g]\|_2 \le \|M_x\|_2 \, \|B\|_F \, \frac{\|\Delta\|_F}{\|A\|_F},

where the last step uses ‖[f; g]‖_2 ≤ ‖Δ̄‖_F = ‖Δ‖_F and ‖A‖_F = ‖B‖_F. To show that the above inequalities may become equalities, choose f and g such that

    \|x\|_2 = \left\| M_x [f; g] \right\|_2 = \|M_x\|_2 \, \|[f; g]\|_2 \qquad\text{and}\qquad \|[f; g]\|_2 = \|\Delta\|_F.

As a result, the quantity ‖M_x‖_2 ‖B‖_F can be seen as the condition number of the orthonormal basis B_j. The same strategy is used to get the condition numbers of the corresponding Krylov spaces, which are more stable than the computed orthonormal bases. The formula which gives the condition numbers of the Krylov spaces K_j and K̃_j is the same as (21), except that the distance is now taken on the spaces K_j and K̃_j. It may be defined as

    d(\mathcal K_j, \widetilde{\mathcal K}_j) = \left\| [0; I_{n-j}]^T \, \widehat X \, [I_j; 0] \right\|_F.

The same argument as for the condition number of B_j shows that if we delete the leading j × j submatrix from X̂, the resulting matrix gives information about the distance between the Krylov spaces, i.e., from each vector x_i only its last n − j components contribute to the difference. Let us introduce the matrix Π whose columns e_i correspond to these contributing components. In matrix form, Π can be written as

    \Pi = \begin{pmatrix}
    I_{n-j}^{n-2} & & & \\
    & I_{n-j}^{n-3} & & \\
    & & \ddots & \\
    & & & I_{n-j}^{n-j}
    \end{pmatrix}.    (22)

Then the condition number for the Krylov spaces K_j and K̃_j is given by ‖Π M_x‖_2 ‖B‖_F.

We now discuss the computation of the condition numbers associated with the bidiagonal reduction by Algorithm 2, or in other words, the stability of the perturbed bidiagonal matrix B̃. It follows from (8) that the main diagonal elements α_i, α̃_i and the upper diagonal elements β_i, β̃_i of B and B̃ satisfy

    \tilde\alpha_1 - \alpha_1 = \bar\Delta_{1,1}, \qquad \tilde\alpha_i - \alpha_i = \bar\Delta_{i,i} - \beta_{i-1} y_{i-1,1} + \beta_i x_{i-1,1}, \quad i = 2, \ldots, n,    (23)

and

    \tilde\beta_1 - \beta_1 = \bar\Delta_{1,2} + \alpha_2 y_{1,1}, \qquad \tilde\beta_i - \beta_i = \bar\Delta_{i,i+1} + \alpha_{i+1} y_{i,1} - \alpha_i x_{i-1,1}, \quad i = 2, \ldots, n,    (24)

where y_{i,1} and x_{i,1} denote the first components of the vectors y_i and x_i, respectively, and Δ̄_{i,j} = e_i^T Δ̄ e_j. In matrix notation, we have α̃_1 − α_1 = Δ̄_{1,1} and

    \begin{pmatrix} \tilde\alpha_2 - \alpha_2 \\ \vdots \\ \tilde\alpha_{j+1} - \alpha_{j+1} \end{pmatrix}
    = \begin{pmatrix} \bar\Delta_{2,2} \\ \vdots \\ \bar\Delta_{j+1,j+1} \end{pmatrix}
    + N_{11} \begin{pmatrix} y_{1,1} \\ \vdots \\ y_{j,1} \end{pmatrix}
    + N_{12} \begin{pmatrix} x_{1,1} \\ \vdots \\ x_{j,1} \end{pmatrix},    (25)

    \begin{pmatrix} \tilde\beta_1 - \beta_1 \\ \vdots \\ \tilde\beta_j - \beta_j \end{pmatrix}
    = \begin{pmatrix} \bar\Delta_{1,2} \\ \vdots \\ \bar\Delta_{j,j+1} \end{pmatrix}
    + N_{21} \begin{pmatrix} y_{1,1} \\ \vdots \\ y_{j,1} \end{pmatrix}
    + N_{22} \begin{pmatrix} x_{1,1} \\ \vdots \\ x_{j,1} \end{pmatrix},    (26)

where

    N_{11} = -\begin{pmatrix} \beta_1 & & \\ & \ddots & \\ & & \beta_j \end{pmatrix}, \qquad
    N_{12} = \begin{pmatrix} \beta_2 & & \\ & \ddots & \\ & & \beta_{j+1} \end{pmatrix},    (27)

    N_{21} = \begin{pmatrix} \alpha_2 & & \\ & \ddots & \\ & & \alpha_{j+1} \end{pmatrix}, \qquad
    N_{22} = -\begin{pmatrix} 0 & & & \\ \alpha_2 & 0 & & \\ & \ddots & \ddots & \\ & & \alpha_j & 0 \end{pmatrix}.    (28)

Now let Q_x and Q_y be the matrices such that

    \begin{pmatrix} x_{1,1} \\ \vdots \\ x_{j,1} \end{pmatrix} = Q_x x \qquad\text{and}\qquad \begin{pmatrix} y_{1,1} \\ \vdots \\ y_{j,1} \end{pmatrix} = Q_y y,    (29)

and denote by B_j and B̃_j the leading j × j submatrices of B and B̃. From (25), (26), (29) and (20) we have the inequality

    \left\| \begin{pmatrix} \tilde\alpha_2 - \alpha_2 \\ \vdots \\ \tilde\alpha_{j+1} - \alpha_{j+1} \\ \tilde\beta_1 - \beta_1 \\ \vdots \\ \tilde\beta_j - \beta_j \end{pmatrix} \right\|_2
    \le \left\| \begin{pmatrix} \bar\Delta_{2,2} \\ \vdots \\ \bar\Delta_{j+1,j+1} \\ \bar\Delta_{1,2} \\ \vdots \\ \bar\Delta_{j,j+1} \end{pmatrix} \right\|_2
    + \left\| \begin{pmatrix} N_{11} & N_{12} \\ N_{21} & N_{22} \end{pmatrix} \begin{pmatrix} Q_y M_y \\ Q_x M_x \end{pmatrix} [f; g] \right\|_2,

from which we obtain

    \| \widetilde B_j - B_j \|_F \le \left( 1 + \left\| \begin{pmatrix} N_{11} & N_{12} \\ N_{21} & N_{22} \end{pmatrix} \begin{pmatrix} Q_y M_y \\ Q_x M_x \end{pmatrix} \right\|_2 \right) \| B \|_F \, \frac{\|\Delta\|_F}{\|A\|_F}.

Thus, the quantity in parentheses, multiplied by ‖B‖_F, can be seen as the condition number corresponding to the bidiagonal reduction by Algorithm 2. The following theorem summarizes the results obtained so far.

Theorem 1. Let A ∈ R^{m×n}, m ≥ n, and v_1 ∈ R^n with ‖v_1‖_2 = 1, be such that the Lanczos bidiagonalization process applied to A with starting vector v_1 leads, for j = 2, 3, . . . , to the bidiagonal form B_j = B(1 : j, 1 : j) and to the orthonormal basis B_j of the Krylov space K_j(A^T A, v_1), as discussed in Algorithms 1 and 2. Then, for j = 2, . . . , n − 1,

• the condition number of the orthonormal basis B_j is

    \kappa_{j,b}(A, v_1) = \| M_x \|_2 \, \| B \|_F,    (30)

• the condition number of the Krylov space K_j(A^T A, v_1) is

    \kappa_j(A, v_1) = \| \Pi M_x \|_2 \, \| B \|_F,    (31)

• the condition number of the bidiagonal form B_j is

    \kappa_{j,B}(A, v_1) = \left( 1 + \left\| \begin{pmatrix} N_{11} & N_{12} \\ N_{21} & N_{22} \end{pmatrix} \begin{pmatrix} Q_y M_y \\ Q_x M_x \end{pmatrix} \right\|_2 \right) \| B \|_F,    (32)

where M_x, M_y, Π, N_{ij}, Q_x and Q_y are defined in (20), (22), (27), (28) and (29), respectively.
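Theorem 1 translates directly into code. The dense sketch below (ours; it builds on Isel and build_Mx_My above and is meaningful for j = 2, . . . , n − 2, since N12 involves β_{j+1}) assembles Π, Q_x, Q_y and the N_{ij} and evaluates (30)–(32). One reading choice is ours: Π is padded with a zero block for x_j, whose entries do not enter the first j columns of X̄.

    import numpy as np

    def Pi_matrix(n, j):
        """Pi of (22): selects the last n-j entries of x_1, ..., x_{j-1};
        x_j receives a zero block (a reading choice, see the lead-in)."""
        rx = [n - i - 2 for i in range(j)]
        P = np.zeros(((j - 1) * (n - j), sum(rx)))
        off = 0
        for i in range(j - 1):
            P[i*(n-j):(i+1)*(n-j), off:off + rx[i]] = Isel(n - j, rx[i])
            off += rx[i]
        return P

    def Q_matrix(dims):
        """Q of (29): picks the first component of each stacked subvector."""
        Q = np.zeros((len(dims), sum(dims)))
        Q[np.arange(len(dims)), np.cumsum([0] + dims[:-1])] = 1.0
        return Q

    def condition_numbers(B, j):
        """kappa_{j,b}, kappa_j and kappa_{j,B} of Theorem 1, (30)-(32)."""
        m, n = B.shape
        a, b = np.diag(B), np.diag(B, 1)
        Mx, My = build_Mx_My(B, j)
        nB = np.linalg.norm(B, 'fro')
        k_basis = np.linalg.norm(Mx, 2) * nB                        # (30)
        k_space = np.linalg.norm(Pi_matrix(n, j) @ Mx, 2) * nB      # (31)
        N = np.block([[-np.diag(b[:j]),     np.diag(b[1:j+1])],
                      [ np.diag(a[1:j+1]), -np.diag(a[1:j], -1)]])  # (27)-(28)
        QM = np.vstack([Q_matrix([m - i - 1 for i in range(j)]) @ My,
                        Q_matrix([n - i - 2 for i in range(j)]) @ Mx])
        k_bidiag = (1.0 + np.linalg.norm(N @ QM, 2)) * nB           # (32)
        return k_basis, k_space, k_bidiag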

4. Numerical illustrations

In this section we illustrate the theory developed in Sections 2 and 3 by numerical tests. First, we mention that the matrices involved in the condition numbers (30)–(32) are (block) lower triangular, having at most N_j = nj − (j + 1)(j + 2)/2 + 1 elements. The condition numbers can thus be estimated in O(N_j^2) arithmetic operations (see [6]). For large n the formulas (30)–(32) cannot be used as such in practice. This point is not the subject of this paper and will be treated in a future work. In these experiments, we compute the condition numbers exactly in order to understand their exact behaviors. For each test, we compute the condition numbers corresponding to the orthonormal bases, to the Krylov spaces and to the bidiagonal reductions.

Example 1. This example is taken from [2]. The matrix A and the starting vector v_1 are given by

    A = \begin{pmatrix}
    -7 & 36 & & \\
    & -10 & \ddots & \\
    & & \ddots & 36 \\
    & & & -10
    \end{pmatrix} \in \mathbb{R}^{16 \times 16}
    \qquad\text{and}\qquad
    v_1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.
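The setting of Example 1 is easy to reproduce with the sketches above (all printed quantities are ours, to be compared with Table 1 and Fig. 1; exact values will differ in the last digits and possibly in sign conventions):

    import numpy as np

    n = 16
    A = np.diag(np.full(n, -10.0)) + np.diag(np.full(n - 1, 36.0), 1)
    A[0, 0] = -7.0
    v1 = np.eye(n)[:, 0]

    U, V, alpha, beta = lanczos_bidiag_cr(A, v1, n)  # Algorithm 2 sketch
    B = np.diag(alpha) + np.diag(beta, 1)
    print(np.linalg.norm(A @ V - U @ B, 2))          # cf. Table 1, row 1
    print(np.linalg.norm(U.T @ U - np.eye(n), 2))    # cf. Table 1, row 4
    print(np.linalg.cond(A))                         # cf. cond_2(A)
    for j in (2, 6, 10, 14):
        print(j, condition_numbers(B, j))            # cf. Fig. 1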

We apply Algorithm 2 to A starting with v_1. Table 1 summarizes information about the computed orthonormal matrices Ũ and Ṽ and the bidiagonal matrix B̃. Fig. 1 shows the evolution of the condition numbers corresponding to the orthonormal bases (denoted by κ_b), to the Krylov spaces (denoted by κ) and to the bidiagonal reductions (denoted by κ_B).

Example 2. The matrix A is the same as in Example 1, but the starting vector v_1 has all its components equal to 1 (before normalization). As in Example 1, Table 1 summarizes information about the matrices U, V and B computed by Algorithm 2. Fig. 2 shows the evolution of the condition numbers corresponding to the orthonormal bases, to the Krylov spaces and to the bidiagonal reductions.

Example 3. In this example the bidiagonal matrix is of order 30 and constitutes a partial bidiagonalization resulting from Algorithm 2 applied to the matrix ILLC1850.

Table 1
Numerical results obtained from Algorithm 2

                             Example 1          Example 2
‖AV − UB‖_2                  5.80 × 10^{−14}    7.68 × 10^{−14}
‖A^T U − V B^T‖_2            7.66 × 10^{−14}    7.42 × 10^{−14}
‖U^T A V − B‖_2              6.10 × 10^{−14}    5.20 × 10^{−14}
‖U^T U − I_{16}‖_2           1.37 × 10^{−15}    2.44 × 10^{−15}
‖V^T V − I_{16}‖_2           2.14 × 10^{−15}    1.88 × 10^{−15}
cond_2(B)                    2.95 × 10^{12}     2.95 × 10^{12}
cond_2(A)                    2.95 × 10^{12}     2.95 × 10^{12}

Fig. 1. Example 1. (Condition numbers κ_B, κ_b and κ against the dimension of the Krylov basis, log scale.)

Fig. 2. Example 2.

This matrix comes from the Harwell–Boeing set of sparse matrices [4]. It has m = 1850 rows and n = 712 columns and originates from a least squares problem in surveying. The starting vector has all its components equal to 1 (before normalization). Table 2 summarizes the information obtained for the computed matrices U and V and the bidiagonal matrix B. Fig. 3 shows the evolution of the condition numbers.

Table 2
Numerical results obtained from Algorithm 2

                             Example 3          Example 4
‖AV − UB‖_2                  5.09 × 10^{−15}    8.58 × 10^{−15}
‖A^T U − V B^T‖_2            9.00 × 10^{−1}     7.81 × 10^{−1}
‖U^T A V − B‖_2              5.03 × 10^{−15}    7.12 × 10^{−15}
‖U^T U − I_{30}‖_2           6.00 × 10^{−15}    8.51 × 10^{−15}
‖V^T V − I_{30}‖_2           1.73 × 10^{−14}    1.73 × 10^{−15}
cond_2(B)                    3.24 × 10^{1}      3.88 × 10^{1}
cond_2(A)                    1.40 × 10^{3}      1.11 × 10^{2}

Fig. 3. Example 3.

Example 4. This example is similar to Example 3. The only change is in the matrix A, which is now the matrix WELL1850 ∈ R^{1850×712} from the Harwell–Boeing set (see [4]). The results of the bidiagonalization and the behaviors of the computed condition numbers are shown in Table 2 and Fig. 4, respectively.

Example 5. In this artificial example, we consider a family of bidiagonal matrices defined by

    B(\alpha) = \begin{pmatrix}
    1 & \alpha & & \\
    & 1 & \alpha & \\
    & & \ddots & \ddots \\
    & & & 1
    \end{pmatrix} \in \mathbb{R}^{20 \times 20}.

Fig. 4. Example 4.

The matrix B(α) is assumed to be the result of Algorithm 2 applied to some matrix. The larger α is, the more ill-conditioned the matrix B(α). Figs. 5–8 show the behaviors of the condition numbers for different values of α.
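The family B(α) and the trend visible in Figs. 5–8 can be explored with the condition_numbers sketch of Section 3 (the α values are those of the figures; the probed dimensions j are our choice):

    import numpy as np

    def B_family(alpha, n=20):
        """Upper bidiagonal matrix: ones on the diagonal, alpha above it."""
        return np.eye(n) + alpha * np.diag(np.ones(n - 1), 1)

    for alpha in (0.1, 1.0, 2.0, 4.0):
        B = B_family(alpha)
        print(f"alpha = {alpha}: cond_2(B) = {np.linalg.cond(B):.2e}")
        for j in (5, 10, 15):
            kb, ks, kB = condition_numbers(B, j)
            print(f"  j = {j:2d}: kappa_b = {kb:.1e}, "
                  f"kappa = {ks:.1e}, kappa_B = {kB:.1e}")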

Fig. 5. α = 10^{−1}; cond_2(B) = 1.22.

Fig. 6. α = 1; cond_2(B) = 26.03.

Fig. 7. α = 2; cond_2(B) = 2.09 × 10^{6}.

Fig. 8. α = 4; cond_2(B) = 1.46 × 10^{12}.

5. Conclusion

We have derived formulas for the condition numbers associated with the Lanczos bidiagonalization of rectangular matrices. These condition numbers allow one to analyze the stability of the Krylov spaces, the orthonormal bases and the bidiagonal reduction of the Lanczos bidiagonalization process. The formulas derived are of the same type as those obtained by Carpraux et al. [2] in the context of Arnoldi's method and those obtained by Paige and Van Dooren [8] in the context of the nonsymmetric Lanczos method. The approach taken in this paper is inspired by that in [2]. The experiments show that the Krylov spaces associated with the Lanczos bidiagonalization are generally more stable than the orthonormal bases. This is because, as the iterations unfold, the Krylov spaces tend to span the whole space, whereas the orthonormal bases may suffer from forward instability, i.e., the computed orthonormal bases may be far from the exact ones. The condition numbers of the bidiagonal reduction depend essentially on the conditioning of the leading submatrices of the bidiagonal form.

Acknowledgement

We thank the referee for the valuable comments and suggestions.

References

[1] Å. Björck, Numerical Methods for Least Squares Problems, SIAM, Philadelphia, PA, 1996.
[2] J.-F. Carpraux, S.K. Godunov, S.V. Kuznetsov, Condition number of the Krylov bases and subspaces, Linear Algebra Appl. 248 (1–3) (1996) 136–160.
[3] X.-W. Chang, Perturbation analysis of some matrix factorizations, Ph.D. Thesis, Computer Science, McGill University, Montreal, Canada, 1997.
[4] I.S. Duff, R.G. Grimes, J.G. Lewis, Sparse matrix test problems, ACM Trans. Math. Software 15 (1989) 1–14. Available from http://math.nist.gov/MatrixMarket/data/Harwell-Boeing.

[5] G.H. Golub, C.F. Van Loan, Matrix Computations, second ed., The Johns Hopkins University Press, Baltimore, MD, 1989.
[6] N.J. Higham, A survey of condition number estimation for triangular matrices, SIAM Rev. 29 (4) (1987) 575–596.
[7] S.V. Kuznetsov, Perturbation bounds of the Krylov bases and associated Hessenberg forms, Linear Algebra Appl. 265 (1–3) (1997) 1–28.
[8] Ch.C. Paige, P. Van Dooren, Sensitivity analysis of the Lanczos reduction, Numer. Linear Algebra Appl. 6 (1999) 29–50.
[9] B.N. Parlett, J. Le, QR; its forward instability and failure to converge, in: Numerical Treatment of Eigenvalue Problems, vol. 5, Proc. Workshop, Oberwolfach, 1990.
[10] B.N. Parlett, J. Le, Forward instability of tridiagonal QR, SIAM J. Matrix Anal. Appl. 14 (1) (1993) 279–316.