
Structure in loss of orthogonality✩

Xiao-Wen Chang^a, Christopher C. Paige^{a,∗}, David Titley-Peloquin^b

a School of Computer Science, McGill University, Montréal, Québec, Canada
b Department of Bioresource Engineering, McGill University, Ste-Anne-de-Bellevue, Québec, Canada

✩ With best wishes to Paul Van Dooren, one of the brightest and most likeable of people.
∗ Corresponding author
Email addresses: [email protected] (Xiao-Wen Chang), [email protected] (Christopher C. Paige), [email protected] (David Titley-Peloquin)

Preprint submitted to Linear Algebra and Its Applications, July 7, 2020

Abstract

In [SIAM J. Matrix Anal. Appl., 31 (2009), pp. 565–583] it was shown that for any sequence of k unit 2-norm n-vectors, the columns of Vk, there is a special (n+k)-square unitary matrix Q^{(k)} that can be used in the analysis of numerical algorithms based on orthogonality. A k × k submatrix Sk of Q^{(k)} provides valuable theoretical information on the loss of orthogonality among the columns of Vk. Here it is shown that the singular value decomposition (SVD) and Jordan canonical form (JCF) of Sk both reveal the null space of Vk as well as orthonormal vectors available from a right-side orthogonal transformation of Vk. The JCF of Sk is shown to reveal more than its SVD does. The Lanczos orthogonal tridiagonalization process for a Hermitian matrix is then used to indicate the occurrence of some of these properties in practical computations.

Keywords: Loss of orthogonality, singular value decomposition, Jordan canonical form, rounding error analysis, Lanczos process, eigenproblem.

2000 MSC: 65F15, 65F25, 65G50, 15A18

1. Introduction

If $V_k \in \mathbb{C}^{n\times k}$ has unit 2-norm columns, one can define the strictly upper triangular matrix $S_k \triangleq (I + U_k)^{-1}U_k$, where $U_k$ is the strictly upper triangular part of $V_k^H V_k$, as well as the unitary matrix

$$Q^{(k)} \triangleq \begin{bmatrix} S_k & (I_k - S_k)V_k^H \\ V_k(I_k - S_k) & I_n - V_k(I_k - S_k)V_k^H \end{bmatrix}, \qquad (1)$$

see Theorem 1 below. This Q^{(k)} was described in [13] and can be the basis of the rounding error analysis of several numerical algorithms based on orthogonality, see, e.g., [14, 15]. But more generally the matrix Sk provides valuable theoretical information on the loss of orthogonality among the columns of any such Vk. Here properties of Sk are developed in general, and used to show the various properties of Vk that can occur. In particular, we show that the Jordan canonical form (JCF) of Sk can reveal important properties that are not available from the singular value decomposition (SVD) of Sk.

The paper is organized as follows. In the next two sections we give a very brief history followed by the notation used here. Section 4 summarizes the theorem on unitary Q^{(k)} in (1), while section 5 derives some properties of Q^{(k)} that we need. Section 6 deals with the SVD of Sk, and shows how it defines important subspaces related to Vk. Section 7 introduces the JCF of Sk, then section 8 shows how this reveals more properties of Vk. These are new results for general Vk with unit length columns, so proofs are given in these sections 7 & 8. Section 9 summarizes the Lanczos process and the result of its rounding error analysis in [14], which shows that the finite precision Lanczos process behaves as a higher dimensional exact Lanczos process for a slightly perturbed (k+n) × (k+n) matrix Ak. Section 10 states a theorem on how the Lanczos process converges, and then uses the JCF of Sk to reveal some surprising numerical behaviors of the Lanczos process and therefore of some other numerical iterative algorithms.

2. A very brief history of the Lanczos process and orthogonalization

Although the orthogonal tridiagonalization of a Hermitian matrix A devised by Cornelius Lanczos [8] is simple and elegant mathematically, its numerical behavior has fascinated many for 70 years. The Lanczos process was originally discarded because of its loss of orthogonality, then brought back in importance and very gradually understood. There have been many useful works on this resuscitation, such as [11, 12, 20, 9, 10, 14, 15]. The ideas behind the Lanczos process led to other valuable algorithms such as in [4, 16, 2], and there has also been work on the sensitivity of the tridiagonal matrix and vectors resulting from the Lanczos process to perturbations in A, see for example [18]. But an understanding of the loss of orthogonality of the Lanczos process turned out to be crucial. A breakthrough in our understanding of loss of orthogonality in general was initiated by a comment by Charles Sheffield [21] to Gene Golub, which Gene related to Åke Björck and Chris Paige around 1990, see [1]. This concerned the loss of orthogonality in modified Gram-Schmidt (MGS), but it was shown in [13] that it could be extended to apply to any sequence of unit-length vectors vj. A more complete background of this is given in [13, Section 2.2]. This approach was applied in [14] to give an augmented backward stability result for the Hermitian matrix Lanczos process [8], and this was used in [15] to prove the iterative convergence of the Lanczos process for the eigenproblem and solution of equations, along with more history in [15, Section 2]. Here we look more deeply into the properties of Sk in (1) and what it tells us about loss of orthogonality in general.

3. Notation

We use “≜” for “is defined to be”, and “≡” for “is equivalent to”. Let In denote the n × n unit matrix, with j-th column ej. We say $Q_1 \in \mathbb{C}^{n\times k}$ has orthonormal columns if $Q_1^H Q_1 = I_k$ and write $Q_1 \in \mathcal{U}^{n\times k}$. For a vector v, we denote its Euclidean norm by $\|v\|_2 \triangleq \sqrt{v^Hv}$. For a matrix $B = [b_1, b_2, \ldots, b_m] \in \mathbb{C}^{n\times m}$ we denote its Frobenius norm by $\|B\|_F$, its spectral norm by $\|B\|_2 \triangleq \sigma_{\max}(B)$, the maximum singular value of B, and its range by Range(B). For indices, i:j means i, i+1, ..., j, while $B_{i:j} \equiv [b_i, b_{i+1}, \ldots, b_j]$.

We will be dealing with sequences of matrices of increasing dimensions, and will use the index k to denote the k-th matrix in a sequence, usually as a superscript, as for example Q^{(k)}, in which case subscripts denote partitioning, as in $Q^{(k)} \equiv [Q_1^{(k)}\,|\,Q_2^{(k)}]$. We often omit the particular superscript $\cdot^{(k)}$ when the meaning is clear. However there are five special matrices where we denote the k-th matrix by a subscript: Vk, Uk, Sk, Tk, and Ak. For these the (k+1)-st matrix can be obtained from the k-th by adding a column, e.g., $V_{k+1} = [V_k, v_{k+1}]$, or a column and a row, and there is no need for further subscripts. This makes their presentation and manipulation easier to understand in formulae.

4. Obtaining a unitary matrix from unit-length n-vectors

The next theorem was given in full with proofs in [13]. It allows us to develop a (k+n) × (k+n) unitary matrix Q^{(k)} from any n × k matrix Vk with unit-length columns.

Theorem 1 ([13, Theorem 2.1]). For integers n ≥ 1 and k ≥ 1 suppose that $v_j \in \mathbb{C}^n$ satisfies $\|v_j\|_2 = 1$, j = 1:k+1, and $V_k = [v_1, \ldots, v_k]$. If Uk is the strictly upper triangular matrix satisfying $V_k^H V_k = I + U_k + U_k^H$, define the strictly upper triangular matrix Sk via
$$S_k \triangleq (I_k + U_k)^{-1}U_k = U_k(I_k + U_k)^{-1} \in \mathbb{C}^{k\times k}. \qquad (2)$$
Then

$$\|S_k\|_2 \le 1; \qquad V_k^H V_k = I \Leftrightarrow \|S_k\|_2 = 0; \qquad V_k^H V_k \text{ singular} \Leftrightarrow \|S_k\|_2 = 1. \qquad (3)$$

Here Sk is the unique strictly upper triangular k × k matrix such that
$$Q^{(k)} \equiv \begin{bmatrix} Q_1^{(k)} & Q_2^{(k)} \end{bmatrix} \equiv \begin{bmatrix} Q_{11}^{(k)} & Q_{12}^{(k)} \\ Q_{21}^{(k)} & Q_{22}^{(k)} \end{bmatrix} \triangleq \begin{bmatrix} S_k & (I_k - S_k)V_k^H \\ V_k(I_k - S_k) & I_n - V_k(I_k - S_k)V_k^H \end{bmatrix} \in \mathcal{U}^{(k+n)\times(k+n)}. \qquad (4)$$
Finally Sk and Sk+1 have the following relations
$$S_{k+1} \equiv \begin{bmatrix} S_k & s_{k+1} \\ 0 & 0 \end{bmatrix} \in \mathbb{C}^{(k+1)\times(k+1)}, \qquad s_{k+1} = (I_k - S_k)V_k^H v_{k+1}. \qquad (5)$$
Here is an indication of a proof. From (2) it can be shown that

$$U_kS_k = S_kU_k, \qquad U_k = (I_k - S_k)^{-1}S_k \equiv S_k(I_k - S_k)^{-1}, \qquad (I_k - S_k)^{-1} = I_k + U_k. \qquad (6)$$

To prove $Q_1^{(k)H}Q_1^{(k)} = I_k$ in (4), use (2) and (6) to give (dropping $\cdot^{(k)}$ and $\cdot_k$):
$$Q_1 \equiv \begin{bmatrix} S \\ V(I - S) \end{bmatrix} = \begin{bmatrix} U(I + U)^{-1} \\ V(I + U)^{-1} \end{bmatrix} = \begin{bmatrix} U \\ V \end{bmatrix}(I + U)^{-1},$$
$$Q_1^HQ_1 = (I + U)^{-H}[V^HV + U^HU](I + U)^{-1} = (I + U)^{-H}[I + U + U^H + U^HU](I + U)^{-1} = (I + U)^{-H}[(I + U)^H(I + U)](I + U)^{-1} = I.$$

This was given in [15, §4]. Next, for example, $\|S_k\|_2 \le 1$ in (3) follows immediately. Finally, the first equation in (5) follows from the definition of Sk, and to prove the second equation in (5) we see from (6) that $S_{k+1} = (I_{k+1} - S_{k+1})U_{k+1}$, so that
$$\begin{bmatrix} s_{k+1} \\ 0 \end{bmatrix} = S_{k+1}e_{k+1} = (I_{k+1} - S_{k+1})U_{k+1}e_{k+1} = (I_{k+1} - S_{k+1})\begin{bmatrix} V_k^Hv_{k+1} \\ 0 \end{bmatrix} = \begin{bmatrix} (I_k - S_k)V_k^Hv_{k+1} \\ 0 \end{bmatrix}.$$

5. Some properties of Q^{(k)} in Equation (4)

Our analysis uses properties of the sub-blocks of Q^{(k)} in (4), see [15, §6]. From (5) $s_{k+1} = (I_k - S_k)V_k^Hv_{k+1} = Q_{12}^{(k)}v_{k+1}$, so together with (4)

$$Q_{22}^{(k)}v_{k+1} = [I_n - V_k(I_k - S_k)V_k^H]v_{k+1} = v_{k+1} - V_ks_{k+1} = Q_{21}^{(k+1)}e_{k+1}, \qquad (7)$$
$$q^{(k+1)} \triangleq \begin{bmatrix} s_{k+1} \\ v_{k+1} - V_ks_{k+1} \end{bmatrix} = \begin{bmatrix} Q_{12}^{(k)} \\ Q_{22}^{(k)} \end{bmatrix}v_{k+1} = Q_2^{(k)}v_{k+1}. \qquad (8)$$

For j = 1:k+1 define the orthogonal projectors $P_j \triangleq I_n - v_jv_j^H$. Because Sk is strictly upper triangular we see $S_1 = 0$, so from (4) we have $Q_{22}^{(1)} = P_1$, and we use

$$Q_{21}^{(k)} = V_k(I_k - S_k), \qquad Q_{12}^{(k)} = (I_k - S_k)V_k^H, \qquad Q_{22}^{(k)} = I_n - Q_{21}^{(k)}V_k^H,$$
$$V_{k+1} = [V_k, v_{k+1}], \qquad I_{k+1} - S_{k+1} = \begin{bmatrix} I_k - S_k & -s_{k+1} \\ 0 & 1 \end{bmatrix},$$

to prove several things with (8); in particular we now prove that $Q_{22}^{(k)} = P_1\cdots P_k$.

$$Q_{21}^{(k+1)} = V_{k+1}(I_{k+1} - S_{k+1}) = [V_k(I_k - S_k),\; v_{k+1} - V_ks_{k+1}] = [Q_{21}^{(k)},\; Q_{22}^{(k)}v_{k+1}], \qquad (9)$$
$$Q_{22}^{(k+1)} = I_n - Q_{21}^{(k+1)}V_{k+1}^H = I_n - Q_{21}^{(k)}V_k^H - Q_{22}^{(k)}v_{k+1}v_{k+1}^H = Q_{22}^{(k)}(I_n - v_{k+1}v_{k+1}^H), \qquad (10)$$

from which $Q_{22}^{(k+1)} = P_1\cdots P_kP_{k+1}$ follows by induction. The decrease in $\|Q_{22}^{(k)}\|_F$ is crucial for proving convergence and accuracy of the finite precision Lanczos process, so we discuss it here. First $\|Q_{22}^{(k+1)}\|_2 \le \|Q_{22}^{(k)}\|_2$ because

$$Q_{22}^{(k+1)}Q_{22}^{(k+1)H} = Q_{22}^{(k)}(I_n - v_{k+1}v_{k+1}^H)Q_{22}^{(k)H} = Q_{22}^{(k)}Q_{22}^{(k)H} - Q_{22}^{(k)}v_{k+1}v_{k+1}^HQ_{22}^{(k)H}.$$

This and (9) with $Q_{22}^{(0)} \triangleq I_n$ show how $\|Q_{22}^{(k)}\|_F$ decreases:

$$\|Q_{22}^{(k+1)}\|_F^2 = \mathrm{trace}[Q_{22}^{(k+1)}Q_{22}^{(k+1)H}] = \|Q_{22}^{(k)}\|_F^2 - \|Q_{22}^{(k)}v_{k+1}\|_2^2. \qquad (11)$$

Ideally $S_{k+1} = 0$, so in (7) $Q_{22}^{(k)}v_{k+1} = v_{k+1}$, and $\|Q_{22}^{(k)}\|_F^2 = n - k$ decreases by 1 each step until $Q_{22}^{(n)} = 0$ and $V_n \in \mathcal{U}^{n\times n}$, see (4). But with loss of orthogonality $\|Q_{22}^{(k)}\|_F^2$ can decrease far more slowly. This can lead to dramatic slowdown of the Lanczos process.
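
As a small numerical illustration (a sketch only: the random Vk, the real arithmetic, and the sizes n = 6, k = 4 are assumptions, not an example from the paper), the following Python/NumPy code builds Sk from (2), forms Q^{(k)} as in (4), and checks both that Q^{(k)} is orthogonal (the real analogue of unitary) and that $Q_{22}^{(k)} = P_1\cdots P_k$ as derived above:

    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 6, 4                                  # illustrative sizes
    Vk = rng.standard_normal((n, k))
    Vk /= np.linalg.norm(Vk, axis=0)             # unit 2-norm columns

    Uk = np.triu(Vk.T @ Vk, 1)                   # strictly upper part of Vk^H Vk
    Sk = np.linalg.solve(np.eye(k) + Uk, Uk)     # Sk = (I + Uk)^{-1} Uk, eq. (2)

    I_k, I_n = np.eye(k), np.eye(n)
    Q = np.block([[Sk,               (I_k - Sk) @ Vk.T],
                  [Vk @ (I_k - Sk),  I_n - Vk @ (I_k - Sk) @ Vk.T]])
    print(np.allclose(Q.T @ Q, np.eye(n + k)))   # Q^{(k)} is orthogonal, eq. (4)

    P = I_n.copy()
    for j in range(k):                           # accumulate P_1 P_2 ... P_k
        P = P @ (I_n - np.outer(Vk[:, j], Vk[:, j]))
    print(np.allclose(Q[k:, k:], P))             # Q_22^{(k)} = P_1 ... P_k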

6. The singular value decomposition (SVD) of Sk

We now develop the theoretical SVD $S_k = W^{(k)}\Sigma^{(k)}P^{(k)H}$ when Sk in (2) arises from any matrix Vk with unit-length columns. We remind the reader that we often omit the superscript $\cdot^{(k)}$ for readability, and write, e.g., $S_k = W\Sigma P^H$. From (3) $\sigma_{\max}(S_k) \le 1$, and any unit singular value of Sk will be important in this analysis. Also if $V_k^HV_k = I$ then $S_k = 0$ in (2), and it will help to label each singular vector of Sk according to its zero, unit, or in between singular values. Briefly, zero singular values correspond to no loss of orthogonality, unit singular values to loss of linear independence, and intermediate singular values to loss of orthogonality but not loss of linear independence. The rest of this section comes from [19].

Definition 1 (Partitioned SVD of Sk, [19, §4]). Let the k × k matrix Sk in Theorem 1 have mk unit and nk zero singular values with SVD

$$S_k = W\Sigma P^H = W_1P_1^H + W_2\Sigma_2P_2^H, \qquad I - S_kS_k^H = W\Gamma^2W^H = W_2\Gamma_2^2W_2^H + W_3W_3^H, \qquad (12)$$
$$W \equiv W^{(k)} \equiv [w_1^{(k)}, \ldots, w_k^{(k)}] \equiv [\underset{m_k}{W_1},\ \underset{\ell_k}{W_2},\ \underset{n_k}{W_3}] \in \mathcal{U}^{k\times k}, \qquad k = \ell_k + m_k + n_k,$$
$$P \equiv P^{(k)} \equiv [p_1^{(k)}, \ldots, p_k^{(k)}] \equiv [\underset{m_k}{P_1},\ \underset{\ell_k}{P_2},\ \underset{n_k}{P_3}] \in \mathcal{U}^{k\times k},$$

$$\Sigma \equiv \Sigma^{(k)} \equiv \mathrm{diag}(\sigma_1, \ldots, \sigma_k) \equiv \mathrm{diag}(I_{m_k}, \Sigma_2, O_{n_k}), \qquad \Sigma_2 \in \mathbb{R}^{\ell_k\times\ell_k},$$
$$\Gamma^2 \triangleq I_k - \Sigma^2, \qquad \Gamma \equiv \Gamma^{(k)} \equiv \mathrm{diag}(\gamma_1, \ldots, \gamma_k) \equiv \mathrm{diag}(O_{m_k}, \Gamma_2, I_{n_k}), \qquad \Gamma_2 \text{ positive definite},$$

where the singular values $\sigma_j$, 1 ≤ j ≤ k, of Sk in $\Sigma \equiv \Sigma^{(k)}$ are arranged as follows,

$$1 = \sigma_1 = \cdots = \sigma_{m_k} > \sigma_{m_k+1} \ge \cdots \ge \sigma_{m_k+\ell_k} > \sigma_{m_k+\ell_k+1} = \cdots = \sigma_k = 0. \qquad (13)$$

These singular vectors of Sk combine with (4) to reveal key properties of Vk:

  " # " # (k) SkP W1 W2Σ2 0 W1 W2Σ2 0 Q1 P = = ≡ , (14) Vk(Ik −Sk)P Vk(P1 −W1) Vk(P2 −W2Σ2) VkP3 0 Ve2Γ2 Ve3 where Ve2 and Ve3 are formally defined in the following theorem and it is easy to verify that [Ve2, Ve3] has orthonormal columns. The first equality in (14) follows from the structure of (k) (k) Q , and the second by applying (12). But the columns of Q1 P are orthonormal, giving the structure in the fourth expression. The fourth expression reveals the null space of Vk and indicates the columns of [Ve2, Ve3] span Range(Vk) due to the fact that Γ2 > 0 and (I − Sk)P in the second expression is nonsingular. This structure was used in proving the following theorem.

Theorem 2 (Range & null space of Vk,[19, Theorem 4.2]). With the notation in Theorem 1 and Definition 1, define

$$\widetilde V_2 \triangleq V_k(P_2 - W_2\Sigma_2)\Gamma_2^{-1}, \qquad \widetilde V_3 \triangleq V_kP_3, \qquad \widehat V_2 \triangleq V_k(W_2 - P_2\Sigma_2)\Gamma_2^{-1}, \qquad \widehat V_3 \triangleq V_kW_3.$$

Let the columns of $\widehat V_0$ comprise an orthonormal basis of $\mathrm{Range}(V_k)^\perp$. Then defining $\widetilde V^{(k)} \triangleq [\widehat V_0, \widetilde V_2, \widetilde V_3]$ and $\widehat V^{(k)} \triangleq [\widehat V_0, \widehat V_2, \widehat V_3]$,

$$\mathrm{Range}(V_k) = \mathrm{Range}([\widetilde V_2, \widetilde V_3]) = \mathrm{Range}([\widehat V_2, \widehat V_3]) \perp \mathrm{Range}(\widehat V_0), \qquad \mathrm{rank}(V_k) = k - m_k, \qquad (15)$$

$$\mathcal{N}(V_k) = \mathrm{Range}(P_1 - W_1), \qquad P_1 - W_1 \in \mathbb{C}^{k\times m_k}, \qquad \mathrm{rank}(P_1 - W_1) = m_k, \qquad (16)$$
$$\widetilde V \equiv \widetilde V^{(k)} \equiv [\widehat V_0, \widetilde V_2, \widetilde V_3] \in \mathcal{U}^{n\times n}, \qquad \widehat V \equiv \widehat V^{(k)} \equiv [\widehat V_0, \widehat V_2, \widehat V_3] \in \mathcal{U}^{n\times n}, \qquad (17)$$
$$Q_{22}^{(k)} = \begin{bmatrix} \widehat V_0 & \widetilde V_2 \end{bmatrix}\begin{bmatrix} I_{n-(k-m_k)} & 0 \\ 0 & -\Sigma_2 \end{bmatrix}\begin{bmatrix} \widehat V_0^H \\ \widehat V_2^H \end{bmatrix} = \widehat V_0\widehat V_0^H - \widetilde V_2\Sigma_2\widehat V_2^H, \qquad (18)$$
where $\mathrm{rank}(P_1 - W_1) = m_k$ follows since $(I - S_k)P_1 = P_1 - W_1$ and $I - S_k$ is nonsingular.

Some singular values of Sk have a useful persistency with increasing k.

Remark 1. It was shown in [19, §5] that if in Definition 1 we call $\{1, w_j^{(k)}, p_j^{(k)}\}$ for $j = 1:m_k$ the unit singular triplets (or unit triplets) of Sk, and $\{0, w_j^{(k)}, p_j^{(k)}\}$ for $j = m_k + \ell_k + 1 : k$ (see (13)) zero singular triplets (or zero triplets) of Sk, then for any unit (or zero) triplet of Sk there is a related unit (or zero) triplet for any $S_\ell$ having Sk as leading principal submatrix. For example if $S_kp = 0$ then $S_\ell\begin{bmatrix} p \\ 0 \end{bmatrix} = 0$, so $\begin{bmatrix} p \\ 0 \end{bmatrix}$ is always a singular vector. See [19, Remark 5.1] for more details.

Remark 2. In Definition 1 and Remark 1 it can be seen that W1 and P1 are arbitrary up to multiplication on the right by the same orthogonal transformation $Z \in \mathcal{U}^{m_k\times m_k}$, since $W_1P_1^H = (W_1Z)(P_1Z)^H$, while P3 and W3 are each arbitrary up to individual right orthogonal transformations. It follows that $\widetilde V_3 = V_kP_3$ in (14) is arbitrary up to a right orthogonal transformation. Then (14) shows that exactly nk orthonormal vectors $V_kP_3$ can be obtained via a right orthogonal transformation of Vk. We also see from (16) that each unit triplet of Sk corresponds to a unit loss of rank of Vk.
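
The structure in (14) and (16) is easy to observe numerically. The following toy example (a constructed assumption, not taken from the paper) uses a Vk with one exactly repeated column, so that $V_k^HV_k$ is singular; it computes the SVD of Sk and checks that the columns of $P_1 - W_1$ lie in the null space of Vk and that $V_kP_3$ has orthonormal columns:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 6
    v1 = rng.standard_normal(n); v1 /= np.linalg.norm(v1)
    v2 = rng.standard_normal(n); v2 /= np.linalg.norm(v2)
    Vk = np.column_stack([v1, v2, v1])          # v3 = v1, so rank(Vk) = 2 and m_k = 1
    k = Vk.shape[1]

    Uk = np.triu(Vk.T @ Vk, 1)
    Sk = np.linalg.solve(np.eye(k) + Uk, Uk)

    W, sig, PH = np.linalg.svd(Sk)              # Sk = W diag(sig) P^H, eq. (12)
    P = PH.T
    print(np.round(sig, 12))                    # generically: one 1, one in (0,1), one 0

    tol = 1e-12
    P1, W1 = P[:, sig > 1 - tol], W[:, sig > 1 - tol]   # unit singular vectors
    P3 = P[:, sig < tol]                                # zero singular vectors
    print(np.linalg.norm(Vk @ (P1 - W1)))       # ~0: N(Vk) = Range(P1 - W1), eq. (16)
    V3 = Vk @ P3                                # Vtilde_3 = Vk P3 is orthonormal
    print(np.allclose(V3.T @ V3, np.eye(V3.shape[1])))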

7. The Jordan canonical form (JCF) of Sk

Since Sk is strictly upper triangular its JCF is special, having all eigenvalues zero. If Vk in Theorem 1 had k random unit length n-vectors then we would expect rank(Vk) = k while k ≤ n, and to have $m_k = k - n$ unit singular values of Sk for k ≥ n, see (15). For k ≤ n we would also expect Sk to have one zero singular value with the other k − 1 being distributed in (0, 1), and the Jordan canonical form would probably be just one big Jordan block with no other interesting structure. Then $\|S_k\|_2$ would give a measure of the loss of orthogonality, but Sk would not give much else.

However the theory here is for understanding the structure in the loss of orthogonality when for example Vk comes from computations intended to produce $V_k \in \mathcal{U}^{n\times k}$, and then far more interesting properties can arise—there can be a lot of structure. Many properties we discuss arose with the computational Lanczos process analyzed in [15], and will presumably arise in some other large dimensional orthogonalization processes. We will need the Jordan canonical (or normal) form of Sk to fully describe some of these properties. For possible future use we give more properties of this JCF than needed here. Here we will state some standard results. Since Sk is nilpotent we will just state the theory for this case. The eigen-subspace for the zero eigenvalue is $\mathcal{N}(S_k) = \mathrm{Range}(P_3)$ in Definition 1, so the geometric multiplicity of this zero eigenvalue is nk.

Definition 2 (Jordan canonical form (JCF) of Sk. See, e.g., [7, §3.1.11]). For Sk in Definition 1 there exist $J \in \mathbb{R}^{k\times k}$ and nonsingular $Y \in \mathbb{C}^{k\times k}$ such that

$$S_kY = YJ, \qquad Y \triangleq [Y_1, \ldots, Y_{n_k}], \qquad J \triangleq \mathrm{diag}(J_1, \ldots, J_{n_k}), \qquad J_i \triangleq \begin{bmatrix} 0 & I_{\ell_i-1} \\ 0 & 0 \end{bmatrix} \in \mathbb{R}^{\ell_i\times\ell_i}. \qquad (19)$$

Writing $Y_i \triangleq [y_1^{(i)}, \ldots, y_{\ell_i}^{(i)}]$ gives $S_kY_i = Y_iJ_i$ and the complete chain

$$S_ky_1^{(i)} = 0, \qquad S_ky_j^{(i)} = y_{j-1}^{(i)}, \quad j = 2, 3, \ldots, \ell_i, \qquad (20)$$

where $\ell_i$ is the height or length of the chain, j is the grade of the principal vector $y_j^{(i)}$, and $y_1^{(i)}$ is the eigenvector. There are nk complete chains, $i = 1, 2, \ldots, n_k$ in (20). Partitioning $X \equiv [X_1, \ldots, X_{n_k}] \triangleq Y^{-H}$ identically to Y gives $X^HS_k = JX^H$ and with $X_i \triangleq [x_1^{(i)}, \ldots, x_{\ell_i}^{(i)}]$

$$S_k^HX_i = X_iJ_i^T; \qquad S_k^Hx_j^{(i)} = x_{j+1}^{(i)}, \quad j = 1, 2, \ldots, \ell_i - 1; \qquad S_k^Hx_{\ell_i}^{(i)} = 0, \qquad X_i^HY_i = I. \qquad (21)$$

A complete chain $Y_i = [y_1^{(i)}, \ldots, y_{\ell_i}^{(i)}]$ corresponds to the Jordan block $J_i$, and then $S_k^Hx_{\ell_i}^{(i)} = 0$. We say that $[y_1^{(i)}, y_2^{(i)}, \ldots, y_j^{(i)}]$, $1 \le j \le \ell_i$, is a chain, and it becomes a complete chain when $j = \ell_i$.
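
A tiny sketch (Python/NumPy; the 3 × 3 nilpotent matrix below is a generic illustration and is not an Sk obtained from some Vk) of a complete Jordan chain (20): pick a principal vector of grade 3 and generate the chain by $y_{j-1} = S y_j$:

    import numpy as np

    S = np.array([[0., 1., 2.],
                  [0., 0., 3.],
                  [0., 0., 0.]])                 # nilpotent: one 3x3 Jordan block

    y3 = np.array([0., 0., 1.])                  # principal vector of grade 3
    y2 = S @ y3                                  # grade 2:  S y3 = y2
    y1 = S @ y2                                  # grade 1 (eigenvector):  S y1 = 0
    Y1 = np.column_stack([y1, y2, y3])
    J1 = np.diag([1., 1.], k=1)                  # the single 3x3 Jordan block
    print(np.allclose(S @ Y1, Y1 @ J1))          # S Y1 = Y1 J1, eqs. (19)-(20)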

The principal vectors in a JCF can be far from unique, and so if Y in (19) does not already have desirable properties, we can alter it somewhat. We only need the theoretical JCF. We know a Y exists, but we need not compute Y or any of its transformations.

Remark 3. Applying standard nomenclature to our Sk (see for example Wilkinson [22, pp. 42–43]), any vector y which satisfies $S_k^jy = 0$, $S_k^{j-1}y \ne 0$, for integer j > 0 is called a (right side) principal vector of grade j of Sk. But the principal vectors are not unique, since if y is a principal vector of grade j then the same is true of any vector obtained by adding multiples of any vectors of grades not greater than j. If such changes produce $\widehat Y$ from Y in (19), this does not say that every chain in (20) will be preserved. We next show that every chain in (20) will be preserved if and only if $S_k\widehat Y = \widehat YJ$ where $\widehat Y = YG$ for some nonsingular G with certain properties.

Definition 3. A matrix is upper Toeplitz if it is zero except for an as large as possible upper triangular and Toeplitz matrix in the top right corner, for example

α β α β δ 0 α β 0 α , , 0 α β . (22)   0 0 α   0 0 0 0 α

Lemma 1 ([3, p. 159]). For Y and J in (19), $S_k\widehat Y = \widehat YJ$ for $\widehat Y = YG$ with nonsingular G if and only if JG = GJ. If we partition G conformably with J in (19), G will have $n_k^2$ blocks $G_{i,j} \in \mathbb{C}^{\ell_i\times\ell_j}$, and then, see Definition 3,

JG = GJ ⇔ JiGi,j = Gi,jJj, i = 1 : nk, j = 1 : nk, (23)

⇔ Gi,j is upper Toeplitz, i = 1 : nk, j = 1 : nk. (24)

Next with the same partitioning as G, $G^{-1}$ has all its sub-blocks upper Toeplitz. Then $\widehat X \triangleq \widehat Y^{-H}$ satisfies $\widehat X^HS_k = J\widehat X^H$, where $\widehat X^H = G^{-1}X^H$ with $X \triangleq Y^{-H}$. Finally if $JG_i = G_iJ$ for nonsingular $G_i$, i = 1, 2, then $JG_1G_2 = G_1G_2J$.

Proof. The first sentence holds since SkY = YJ and G is nonsingular, giving

$$S_k\widehat Y = \widehat YJ \ \Leftrightarrow\ S_k\widehat Y = S_kYG = YJG = \widehat YJ = YGJ \ \Leftrightarrow\ JG = GJ. \qquad (25)$$

Now $J_iG_{i,j} = G_{i,j}J_j$ in (23) if and only if $G_{i,j}$ is upper Toeplitz, see, for example, Gantmacher [3, p. 159], or just equate the elements on each side of $J_iG_{i,j} = G_{i,j}J_j$. Next for nonsingular G, $JG = GJ \Leftrightarrow G^{-1}J = JG^{-1}$ and therefore $G^{-1}$ has all its sub-blocks upper Toeplitz from (23). Then from $S_k\widehat Y = \widehat YJ$ we see that $\widehat X \triangleq \widehat Y^{-H}$ satisfies $\widehat X^HS_k = J\widehat X^H$, where $\widehat X^H = (YG)^{-1} = G^{-1}Y^{-1} = G^{-1}X^H$ with $X^H = Y^{-1}$ in $X^HS_k = JX^H$. Finally

JG1 = G1J & JG2 = G2J ⇒ JG1G2 = G1JG2 = G1G2J. (26)

We see from (26) that we can transform Y in $S_kY = YJ$ to $\widehat Y \triangleq YG_1G_2\cdots G_m$ having various different forms using a sequence of nonsingular matrices $G_i$ that commute with J. First we show that we can always make each non-eigenvector in a chain orthogonal to the eigenvector of that chain. This is true for the JCF of any matrix.

Lemma 2. For $S_kY_i = Y_iJ_i$, $i = 1:n_k$ in (19), we can choose nonsingular upper Toeplitz $G_{i,i} \in \mathbb{C}^{\ell_i\times\ell_i}$ so that with $\widehat Y_i \triangleq Y_iG_{i,i}$ we have $S_k\widehat Y_i = \widehat Y_iJ_i$ where $\widehat Y_i^H\widehat Y_ie_1 = e_1$. If $\|Y_ie_1\|_2 = 1$ then $\widehat Y_ie_1 = Y_ie_1$. Then for J in (19) $S_k\widehat Y = \widehat YJ$ where $\widehat Y \triangleq [Y_1G_{1,1}, \ldots, Y_{n_k}G_{n_k,n_k}]$.

Proof. We will prove the result for $Y_1$. The proofs for $Y_i$, $i = 2:n_k$ follow analogously. From Lemma 1, upper Toeplitz $G_{1,1}$ ensures that $J_1G_{1,1} = G_{1,1}J_1$, so with $\widehat Y_1 = Y_1G_{1,1}$

$$S_k\widehat Y_1 = S_kY_1G_{1,1} = Y_1J_1G_{1,1} = Y_1G_{1,1}J_1 = \widehat Y_1J_1, \qquad G_{1,1} \triangleq \begin{bmatrix} \gamma_1 & \gamma_2 & \cdots & \gamma_{\ell_1} \\ & \gamma_1 & \ddots & \vdots \\ & & \ddots & \gamma_2 \\ & & & \gamma_1 \end{bmatrix}.$$

Write $Y_1 \equiv [y_1, \ldots, y_{\ell_1}]$, $\widehat Y_1 \equiv [\hat y_1, \ldots, \hat y_{\ell_1}]$, and take $\gamma_1 = 1/\|y_1\|_2$ so that $\|\hat y_1\|_2 = 1$. Then we can make $\hat y_2, \hat y_3, \hat y_4, \ldots, \hat y_{\ell_1}$ orthogonal to $\hat y_1 = y_1\gamma_1$ via

$$\widehat Y_1 = Y_1G_{1,1} = [y_1\gamma_1,\ y_2\gamma_1 + y_1\gamma_2,\ y_3\gamma_1 + y_2\gamma_2 + y_1\gamma_3,\ y_4\gamma_1 + y_3\gamma_2 + y_2\gamma_3 + y_1\gamma_4,\ \ldots],$$
$$\gamma_2 = -y_1^Hy_2\gamma_1^3, \qquad \gamma_3 = -y_1^H(y_3\gamma_1 + y_2\gamma_2)\gamma_1^2, \qquad \gamma_4 = -y_1^H(y_4\gamma_1 + y_3\gamma_2 + y_2\gamma_3)\gamma_1^2, \qquad \ldots,$$
$$\gamma_{\ell_1} = -y_1^H(y_{\ell_1}\gamma_1 + y_{\ell_1-1}\gamma_2 + \cdots + y_2\gamma_{\ell_1-1})\gamma_1^2.$$

Finally if we also have $\|y_1\|_2 = 1$ then $\gamma_1 = 1$ and $\hat y_1 = y_1\gamma_1 = y_1$.

The following lemma shows that if we reorder the Jordan blocks, we can easily obtain the reordered JCF by reordering the complete chains in the same way.

Lemma 3. In Definition 2 we have $S_kY = YJ$. If we reorder the Jordan blocks by a permutation matrix $\Pi$ to give $\widehat J \triangleq \Pi^TJ\Pi$, then

$$S_k\widehat Y = \widehat Y\widehat J, \qquad \widehat Y \triangleq Y\Pi.$$

Proof. Since $S_kY = YJ$, $S_kY\Pi = Y\Pi\Pi^TJ\Pi$, leading to $S_k\widehat Y = \widehat Y\widehat J$.

For Sk in Theorem 1 we can obtain a lot of orthogonality in the principal vectors.

Theorem 3. Suppose that the JCF (19) of Sk in Theorem 1 has been permuted so that $\ell_1 \ge \ell_2 \ge \cdots \ge \ell_{n_k}$ in $S_kY = YJ$. We can always choose $G \in \mathbb{C}^{k\times k}$ to give the JCF:

$$S_k\widehat Y = \widehat YJ, \qquad \widehat Y \triangleq YG \equiv [\widehat Y_1, \widehat Y_2, \ldots, \widehat Y_{n_k}]; \qquad \widehat Y_j \in \mathbb{C}^{k\times\ell_j}, \quad j = 1:n_k, \qquad (27)$$
where every vector in $\widehat Y_j$ is orthogonal to the eigenvector $\widehat Y_ie_1$ of each previous block, $1 \le i < j$, every non-eigenvector in each block is orthogonal to the eigenvector of that block, and every eigenvector $\widehat Y_je_1$ has unit 2-norm, i.e.,

$$\widehat Y_j^H\widehat Y_ie_1 = 0, \quad i = 1:j-1, \ j = 2:n_k; \qquad \widehat Y_j^H\widehat Y_je_1 = e_1, \quad j = 1:n_k. \qquad (28)$$
Proof. The matrix $\widehat Y = YG$ in (27) is developed via a sequence of multiplications:

$$G \triangleq G_1G^{(2)}G_2G^{(3)}G_3\cdots G^{(n_k)}G_{n_k}; \qquad (29)$$
$$G_j \triangleq \mathrm{diag}(I_{\ell_1}, \ldots, I_{\ell_{j-1}}, G_{j,j}, I_{\ell_{j+1}}, \ldots, I_{\ell_{n_k}}), \qquad G_{j,j} \in \mathbb{C}^{\ell_j\times\ell_j} \text{ upper Toeplitz}, \qquad (30)$$
for $j = 1:n_k$, while

$$G^{(j)} \triangleq \begin{bmatrix} I & & & G_{1,j} & & \\ & \ddots & & \vdots & & \\ & & I & G_{j-1,j} & & \\ & & & I & & \\ & & & & \ddots & \\ & & & & & I \end{bmatrix}, \qquad G_{i,j} \triangleq \begin{bmatrix} \gamma_{ij}^{(1)} & \gamma_{ij}^{(2)} & \cdots & \gamma_{ij}^{(\ell_j-1)} & \gamma_{ij}^{(\ell_j)} \\ & \gamma_{ij}^{(1)} & \gamma_{ij}^{(2)} & \cdots & \gamma_{ij}^{(\ell_j-1)} \\ & & \ddots & & \vdots \\ & & & & \gamma_{ij}^{(2)} \\ & & & & \gamma_{ij}^{(1)} \\ & & & & \end{bmatrix} \in \mathbb{C}^{\ell_i\times\ell_j}, \qquad (31)$$

for $i = 1:j-1$ and $j = 2:n_k$. The matrices $G_j$ and $G^{(j)}$ are nonsingular and commute with J in (19). Lemma 1 with (29) then shows that $\widehat Y = YG$ satisfies (27). Note that the first multiplication $YG_1$ in (29) only alters $Y_1$ to become $\widehat Y_1$, which then remains unchanged in later steps. Then for j > 1, $Y_j$ is unchanged until step j,

$$\bigl(YG_1G^{(2)}G_2\cdots G^{(j-1)}G_{j-1}\bigr)G^{(j)}G_j,$$

where since $\widehat Y_1, \widehat Y_2, \ldots, \widehat Y_{j-1}$ will have been obtained, multiplication by $G^{(j)}$ in (31) gives

$$\widetilde Y_j \triangleq \widehat Y_1G_{1,j} + \widehat Y_2G_{2,j} + \cdots + \widehat Y_{j-1}G_{j-1,j} + Y_j \qquad (32)$$
and multiplication by $G_j$ in (30) creates

$$\widehat Y_j \triangleq \widetilde Y_jG_{j,j}, \qquad (33)$$
which then remains unchanged in later steps. In (32), $G_{1,j}, \ldots, G_{j-1,j}$ are designed to give $(\widehat Y_ie_1)^H\widetilde Y_j = 0$, $i = 1:j-1$, so that $(\widehat Y_ie_1)^H\widehat Y_j = (\widehat Y_ie_1)^H\widetilde Y_jG_{j,j} = 0$, see (28). Then in (33), $G_{j,j}$ gives $(\widehat Y_je_1)^H\widehat Y_j = e_1^T$, see (28). When $\widetilde Y_j$ is available, $G_{j,j}$ can be derived via Lemma 2 applied to $\widetilde Y_j$. Note that once $G_{1,1}$ is found, (28) holds for j = 1, since the first expression in (28) is non-existent when j = 1. Suppose (28) holds up to step j − 1. Then since $(\widehat Y_ie_1)^H\widehat Y_m = 0$ for $m = i+1:j-1$, and $(\widehat Y_ie_1)^H\widehat Y_i = e_1^T$ for i < j, applying $(\widehat Y_ie_1)^H$ to $\widetilde Y_j$ in (32) gives for i = 1:

$$(\widehat Y_1e_1)^H\widetilde Y_j = (\widehat Y_1e_1)^H(\widehat Y_1G_{1,j} + \widehat Y_2G_{2,j} + \cdots + \widehat Y_{j-1}G_{j-1,j} + Y_j) = e_1^TG_{1,j} + (\widehat Y_1e_1)^HY_j,$$
and setting this to zero gives, see (31),

$$[\gamma_{1,j}^{(1)}, \gamma_{1,j}^{(2)}, \cdots, \gamma_{1,j}^{(\ell_j-1)}, \gamma_{1,j}^{(\ell_j)}] = e_1^TG_{1,j} = -(\widehat Y_1e_1)^HY_j,$$
which fully defines $G_{1,j}$ since it is upper Toeplitz. We then find $G_{i,j}$ for $i = 2:j-1$ via

$$(\widehat Y_ie_1)^H\widetilde Y_j = (\widehat Y_ie_1)^H(\widehat Y_1G_{1,j} + \widehat Y_2G_{2,j} + \cdots + \widehat Y_{j-1}G_{j-1,j} + Y_j)$$
$$= (\widehat Y_ie_1)^H(\widehat Y_1G_{1,j} + \widehat Y_2G_{2,j} + \cdots + \widehat Y_{i-1}G_{i-1,j}) + e_1^TG_{i,j} + (\widehat Y_ie_1)^HY_j,$$
where $G_{1,j}, G_{2,j}, \ldots, G_{i-1,j}$ will be known at this point, so setting $(\widehat Y_ie_1)^H\widetilde Y_j = 0$ fully defines upper Toeplitz $G_{i,j}$, giving

$$e_1^TG_{i,j} = -(\widehat Y_ie_1)^H(Y_j + \widehat Y_1G_{1,j} + \widehat Y_2G_{2,j} + \cdots + \widehat Y_{i-1}G_{i-1,j}).$$

Thus having found $G_{1,j}, G_{2,j}, \ldots, G_{j-1,j}$ we can form $\widetilde Y_j$ in (32), and then $\widehat Y_j = \widetilde Y_jG_{j,j}$ via Lemma 2, and so on for $j = 2:n_k$, which completes the proof.

A simpler result that holds for any ordering of Jordan blocks is covered by Theorem 3, and is summarized here for convenience.

Corollary 1. For a theoretical JCF SkY = YJ of Sk in Theorem 1 we can assume that each complete chain in Y has every non-eigenvector in that chain orthogonal to the eigenvector of that chain, and that all right eigenvectors of Sk are orthonormal.
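
The recursion in the proof of Lemma 2 is easy to implement. The sketch below (illustrative Python/NumPy; the function name and the toy 3 × 3 chain are assumptions, not the paper's algorithm) post-multiplies a chain by a nonsingular upper Toeplitz matrix so that the chain relation is preserved while every non-eigenvector becomes orthogonal to the unit-norm eigenvector:

    import numpy as np

    def orthogonalize_chain(Y):
        # Y = [y1,...,yl] is a Jordan chain: S @ Y = Y @ J (one block).  Returns
        # Yhat = Y @ G with G nonsingular upper Toeplitz, so S @ Yhat = Yhat @ J
        # still holds, yhat_1 = y1/||y1||_2, and yhat_j _|_ yhat_1 for j > 1.
        l = Y.shape[1]
        g = np.zeros(l, dtype=Y.dtype)
        g[0] = 1.0 / np.linalg.norm(Y[:, 0])            # gamma_1
        for m in range(1, l):                           # gamma_{m+1}, as in Lemma 2
            s = sum(Y[:, m - i] * g[i] for i in range(m))
            g[m] = -g[0] ** 2 * np.vdot(Y[:, 0], s)
        G = sum(np.diag(np.full(l - m, g[m]), k=m) for m in range(l))
        return Y @ G, G

    # tiny test: a single chain of a 3x3 nilpotent matrix
    S = np.array([[0., 1., 2.], [0., 0., 3.], [0., 0., 0.]])
    y3 = np.array([0., 0., 1.]); y2 = S @ y3; y1 = S @ y2
    Y, J = np.column_stack([y1, y2, y3]), np.diag([1., 1.], k=1)
    Yhat, G = orthogonalize_chain(Y)
    print(np.allclose(S @ Yhat, Yhat @ J))              # chain preserved
    print(np.abs(Yhat[:, 1:].T @ Yhat[:, 0]).max())     # ~0: orthogonality achieved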

8. Connections between the SVD and JCF principal vectors of Sk

We describe relationships among the SVD singular vectors of Sk, the JCF principal vectors of Sk, and Range(Vk). We use the abbreviations: “zero singular vectors” meaning singular vectors corresponding to zero singular values and “unit singular vectors” meaning singular vectors corresponding to unit singular values.

Remark 4. It is important to remember that here the vectors $v_j$ are theoretical in that they satisfy $\|v_j\|_2 = 1$ exactly, so that everything derived from them: Sk, its SVD and JCF, are exact theoretical objects, and the results in this section hold exactly. So when we discuss zero and unit singular values of Sk we mean these exactly. But the computational cases seem to mimic these properties quite closely. The computed Lanczos vectors $v_j$ in sections 9 & 10 do not satisfy $\|v_j\|_2 = 1$ exactly, even though they are computationally normalized. Nevertheless in some examples using MATLAB™ with IEEE double precision floating-point arithmetic (unit roundoff $u = 2^{-53} \approx 10^{-16}$), the computed SVD of the computed Sk from the computed Lanczos vectors gave singular values which were 0.000000000000000 and 1.000000000000000 to the limit of the printed output, i.e., they were accurate to within $10^{-15}$, see for example [15, Example 6.1].

First consider zero singular vectors.

Lemma 4. The JCF principal vectors and SVD singular vectors of Sk in Theorem 1 can be chosen to give a one-to-one correspondence between the eigenvectors and zero singular vectors of Sk. That is, for P3, W3 in Definition 1, with all the right eigenvectors of Sk in $[\widehat Y_1e_1, \ldots, \widehat Y_{n_k}e_1]$ and all the left eigenvectors in $[\widetilde X_1e_{\ell_1}, \ldots, \widetilde X_{n_k}e_{\ell_{n_k}}]^H$, these can be chosen so that

$$[\widehat Y_1e_1, \ldots, \widehat Y_{n_k}e_1] = P_3 \in \mathcal{U}^{k\times n_k}, \qquad [\widetilde X_1e_{\ell_1}, \ldots, \widetilde X_{n_k}e_{\ell_{n_k}}] = W_3 \in \mathcal{U}^{k\times n_k}. \qquad (34)$$

Proof. Since $S_ky = 0$ for $0 \ne y \in \mathbb{C}^k$ if and only if $y \in \mathrm{Range}([\widehat Y_1e_1, \ldots, \widehat Y_{n_k}e_1])$, and $S_k[\widehat Y_1e_1, \ldots, \widehat Y_{n_k}e_1] = 0$, we have $\mathrm{Range}([\widehat Y_1e_1, \ldots, \widehat Y_{n_k}e_1]) = \mathrm{Range}(P_3)$. Also from Corollary 1 $[\widehat Y_1e_1, \ldots, \widehat Y_{n_k}e_1]$ can be taken to be orthonormal. But P3 is arbitrary up to a right orthogonal transformation, see Remark 2, so we can choose $P_3 = [\widehat Y_1e_1, \ldots, \widehat Y_{n_k}e_1]$. A similar argument proves the rest of (34).

The left and right eigenvectors of Sk form pairs, one pair for each Jordan block, e.g., $y_1^{(i)}$ in (20) and $x_{\ell_i}^{(i)}$ in (21), whereas from Remark 2 both the left and right zero singular vectors are somewhat arbitrary, and it is not clear which two go together. Therefore the zero singular triplets in Remark 1 could have been more carefully defined in terms of eigenvectors of Sk. Here the JCF of Sk has revealed more structure than the SVD.

Theorem 4. For Sk and Vk in Theorem 1, if $\check Y, \check X \in \mathbb{C}^{k\times t}$, then
$$S_k\check Y = 0 \ \&\ \check Y^H\check Y = I_t \ \Rightarrow\ \check Y^HV_k^HV_k\check Y = I_t, \qquad (35)$$
$$S_k^H\check X = 0 \ \&\ \check X^H\check X = I_t \ \Rightarrow\ \check X^HV_k^HV_k\check X = I_t.$$
Proof. These follow by forming $Q_1^{(k)}\check Y$ and $[\check X^H, 0]Q^{(k)}$ in (4). If $\check Y^H\check Y = \check X^H\check X = I_t$,
$$S_k\check Y = 0 \ \Rightarrow\ Q_1^{(k)}\check Y = \begin{bmatrix} S_k \\ V_k(I_k - S_k) \end{bmatrix}\check Y = \begin{bmatrix} 0 \\ V_k\check Y \end{bmatrix} \in \mathcal{U}^{(k+n)\times t}, \qquad (36)$$
$$S_k^H\check X = 0 \ \Rightarrow\ [\check X^H, 0]Q^{(k)} = [\check X^HS_k,\ \check X^H(I_k - S_k)V_k^H] = [0,\ \check X^HV_k^H], \qquad V_k\check X \in \mathcal{U}^{n\times t}.$$

From the above theorem it follows that orthonormal eigenvectors $\widehat Y_1e_1, \ldots, \widehat Y_{n_k}e_1$ of Sk lead to the orthonormal matrix $V_k[\widehat Y_1e_1, \ldots, \widehat Y_{n_k}e_1]$. This occurs in the finite precision Lanczos process, see Theorem 8, and is a nice theoretical way of showing what it maintains. The next theorem involves unit singular triplets of Sk.

Theorem 5. Let Vk and Sk be as in Theorem 1 where Sk has the JCF in Definition 2. The results here will hold for any complete block of right principal vectors $Y_i$ in (19), but for simplicity we will just derive the results for $Y_1 \equiv Y_{1:\ell_1} = [y_1, \ldots, y_{\ell_1}]$ such that $S_kY_1 = Y_1J_1$. Suppose there exists an integer t, $2 \le t \le \ell_1$, such that $\|y_1\|_2 = \|y_t\|_2$, where we have scaled so that $\|y_1\|_2 = 1$, then

$$\|V_ky_1\|_2 = 1, \quad \|S_k\|_2 = 1; \qquad \|y_j\|_2 = 1, \quad j = 1:t; \qquad (37)$$
$$V_ky_j = V_ky_{j-1}, \quad \|V_ky_j\|_2 = 1, \quad y_{j-1} = S_ky_j, \quad y_j = S_k^Hy_{j-1}, \quad j = 2:t. \qquad (38)$$
For future reference we state the last three expressions in (38) in two compact forms:

$$S_kY_{2:t} = Y_{1:t-1}, \quad S_k^HY_{1:t-1} = Y_{2:t}; \qquad Y_{2:t}^HY_{2:t} = Y_{1:t-1}^HY_{1:t-1}, \quad Y_{1:t}^HY_{1:t} = I_t. \qquad (39)$$

Finally the P3, W1, and P1 in Definition 1 can be chosen such that

P3 = [y1,...],W1 = [y1, y2, . . . , yt−1,...],P1 = [y2, y3, . . . , yt,...]. (40)

These principal vectors of Sk give unit singular triplets {1, yj, yj+1} of Sk, j = 1 : t−1, see (38). Note in (40) that the columns of Y2:t−1 are columns of both W1 and P1.

Proof. Since $S_ky_1 = 0$ and $\|y_1\|_2 = 1$, (4) gives $1 = \|Q_1^{(k)}y_1\|_2 = \|V_ky_1\|_2$. Next for $j = 2:t$, $y_{j-1} = S_ky_j$ so $\|y_{j-1}\|_2 = \|S_ky_j\|_2 \le \|y_j\|_2$, since from (3) $\|S_k\|_2 \le 1$. Thus $\|y_1\|_2 = \|y_t\|_2 = 1$ implies $\|y_j\|_2 = 1$, $j = 1:t$, and $\|S_k\|_2 = 1$, completing (37). Next from $\|y_j\|_2 = 1$ with (4), $\|Q_1^{(k)}y_j\|_2 = 1$, $j = 1:t$, so that for $j = 2:t$,
$$Q_1^{(k)}y_j = \begin{bmatrix} S_ky_j \\ V_k(I - S_k)y_j \end{bmatrix} = \begin{bmatrix} y_{j-1} \\ V_k(y_j - y_{j-1}) \end{bmatrix} \ \&\ \|y_{j-1}\|_2 = 1 \ \Rightarrow\ V_k(y_j - y_{j-1}) = 0, \qquad (41)$$
which with the first equality in (37) proves the first two equalities in (38). The third equality in (38) is just part of $S_kY_1 = Y_1J_1$. We then have

$$1 = y_{j-1}^Hy_{j-1} = y_{j-1}^HS_ky_j = y_j^H(S_k^Hy_{j-1}) \le \|y_j\|_2\|S_k^Hy_{j-1}\|_2 \le 1, \qquad j = 2:t, \qquad (42)$$

proving that $y_j = S_k^Hy_{j-1}$, the fourth equality in (38). Next (38) leads directly to the first two equalities in (39), and these two lead to the third equality in (39) since

$$Y_{2:t}^HY_{2:t} = Y_{2:t}^HS_k^HY_{1:t-1} = Y_{1:t-1}^HY_{1:t-1},$$

showing that $Y_{1:t}^HY_{1:t}$ is Toeplitz in (39). It is nonsingular since $y_1, \ldots, y_t$ are principal vectors of Sk. Now since $y_1^HS_k = 0$, it follows from the last equality in (38) that $y_1^Hy_j = y_1^HS_k^Hy_{j-1} = 0$ for $j = 2:t$. This gives $e_1^TY_{1:t}^HY_{1:t} = e_1^T$, and since $Y_{1:t}^HY_{1:t}$ is Toeplitz and Hermitian, it is the unit matrix, completing the proof of (39). Lemma 4 shows we can choose P3 to give $P_3 = [y_1, \ldots]$ in (40). From Definition 1, Range(P1) is the subspace of all eigenvectors of $S_k^HS_k$ with unit eigenvalues. But from (39) $S_k^HS_kY_{2:t} = Y_{2:t}$, so $Y_{2:t} \in \mathrm{Range}(P_1)$. Now $Y_{1:t}^HY_{1:t} = I$ in (39), so $Y_{2:t} = P_1Z_1$ for some $Z_1 \in \mathcal{U}^{m_k\times(t-1)}$. Then from (39) $Y_{1:t-1} = S_kY_{2:t} = W_1Z_1$ for the same $Z_1$, as required in Remark 2, completing the proof of (40).

The relationships $V_ky_1 = \cdots = V_ky_t$ in (38) with $Y_{1:t}^HY_{1:t} = I_t$ in (39) reveal how these principal vectors of Sk correspond to rank deficiency in Vk, more precisely than the unit singular vectors of Sk do in (12) and (16), since those do not reveal that $V_ky_1 = \cdots = V_ky_t$. The JCF here also shows how the zero singular vector (eigenvector) $y_1$ and the unit singular vectors (principal vectors) $y_2, \ldots, y_t$ are part of, and are ordered by, the same Jordan chain. The SVD reveals nothing of these relationships. These comments apply to all blocks $Y_i$ in (19). This essentially describes what happens with the Lanczos process when it produces repeated approximations to any single eigenvalue of A, see Remark 5 and Theorem 9. Each repeated approximation corresponds to a new unit singular vector (principal vector) in that Jordan block in the JCF of Sk corresponding to that eigenvalue.
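
Theorem 5 is easy to see in a toy example (a constructed assumption, not from the paper): take $v_3$ to be an exact copy of $v_1$, with $v_2$ orthogonal to $v_1$. Then Sk has the 2-chain $y_1 = e_1$, $y_2 = e_3$ with $\|y_1\|_2 = \|y_2\|_2 = 1$, and (37)–(38) can be checked directly:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 5
    v1 = rng.standard_normal(n); v1 /= np.linalg.norm(v1)
    w = rng.standard_normal(n); w -= v1 * (v1 @ w)       # make v2 orthogonal to v1
    v2 = w / np.linalg.norm(w)
    V = np.column_stack([v1, v2, v1])                    # v3 repeats v1

    U = np.triu(V.T @ V, 1)
    S = np.linalg.solve(np.eye(3) + U, U)                # here S = e1 e3^T
    y1, y2 = np.eye(3)[:, 0], np.eye(3)[:, 2]            # Jordan chain: S y2 = y1
    print(np.allclose(S @ y2, y1), np.allclose(S @ y1, 0))
    print(np.allclose(S.T @ y1, y2))                     # y2 = S^H y1, eq. (38)
    print(np.allclose(V @ y1, V @ y2),                   # V y1 = V y2, ||V y1||_2 = 1
          np.isclose(np.linalg.norm(V @ y1), 1.0))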

9. The Lanczos process

Given $A = A^H \in \mathbb{C}^{n\times n}$ and a vector $v_1 \in \mathbb{C}^n$ of unit-length, i.e., $v_1^Hv_1 = 1$, one good implementation of the Hermitian matrix tridiagonalization process of Cornelius Lanczos (see [8], [12, (2.1)–(2.8)], and, e.g., [5, §§10.1–10.3] and [6]) uses the following two 2-term recurrences. Compute $u_1 := Av_1$, then for $k = 1, 2, \ldots$

$$\left.\begin{aligned} &\alpha_k := v_k^Hu_k, \quad w_k := u_k - v_k\alpha_k, \quad \beta_{k+1} := +(w_k^Hw_k)^{1/2},\\ &\text{stop if } \beta_{k+1} \text{ is small enough, else}\\ &v_{k+1} := w_k/\beta_{k+1}, \quad u_{k+1} := Av_{k+1} - v_k\beta_{k+1}. \end{aligned}\right\} \qquad (43)$$
With $V_k = [v_1, \ldots, v_k] \in \mathbb{C}^{n\times k}$ in theory this gives after k steps

$$AV_k = V_kT_k + v_{k+1}\beta_{k+1}e_k^T = V_{k+1}T_{k+1,k}, \qquad V_k^HV_k = I_k, \qquad (44)$$

where $e_k^T = (0, \ldots, 0, 1)$, and we have real, symmetric, and tridiagonal
$$T_k = \begin{bmatrix} \alpha_1 & \beta_2 & & \\ \beta_2 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \beta_k \\ & & \beta_k & \alpha_k \end{bmatrix}.$$

In theory the process stops in at most n steps with βk+1 = 0.
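
A straightforward implementation of the two-term recurrence (43) is sketched below (Python/NumPy; the function name `lanczos` and the random test matrix are illustrative assumptions). In exact arithmetic it produces (44); in floating point the columns of Vk have unit norm to roundoff but may lose orthogonality, which is the subject of the rest of the paper.

    import numpy as np

    def lanczos(A, v1, k):
        # k steps of the Hermitian Lanczos recurrence (43); returns V_{k+1}, alphas, betas
        n = A.shape[0]
        V = np.zeros((n, k + 1)); alpha = np.zeros(k); beta = np.zeros(k + 1)
        V[:, 0] = v1 / np.linalg.norm(v1)
        u = A @ V[:, 0]
        for j in range(k):
            alpha[j] = V[:, j] @ u
            w = u - V[:, j] * alpha[j]
            beta[j + 1] = np.linalg.norm(w)
            if beta[j + 1] < 1e-14:               # "stop if beta_{k+1} is small enough"
                return V[:, :j + 1], alpha[:j + 1], beta[:j + 2]
            V[:, j + 1] = w / beta[j + 1]
            u = A @ V[:, j + 1] - V[:, j] * beta[j + 1]
        return V, alpha, beta

    # tiny check of (44):  A Vk = Vk Tk + v_{k+1} beta_{k+1} e_k^T
    rng = np.random.default_rng(3)
    B = rng.standard_normal((8, 8)); A = (B + B.T) / 2
    k = 5
    V, alpha, beta = lanczos(A, rng.standard_normal(8), k)
    Tk = np.diag(alpha) + np.diag(beta[1:k], 1) + np.diag(beta[1:k], -1)
    resid = A @ V[:, :k] - V[:, :k] @ Tk - np.outer(V[:, k] * beta[k], np.eye(k)[k - 1])
    print(np.linalg.norm(resid))                  # ~ machine precision times ||A||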

9.1. Finite precision effects

With finite precision computation, the columns of Vk each have a Euclidean norm that is 1 to almost machine precision, but with a possible severe loss of orthogonality. Because Vk can become very rank deficient, the process can continue indefinitely with $\beta_{k+1}$ never negligible, so that the resulting algorithms for finding eigenvalues or solving equations implemented in finite precision behave differently from the exact cases. To simplify this discussion we use the word "essentially" (without quotes) in the sense illustrated by: "essentially equal to" (also "≈") meaning "equal to within $O(\epsilon)\|A\|_2$", and "$\overset{\in}{\sim}$" similarly, where if $\|y\|_2 = 1$, then "$y \overset{\in}{\sim} \mathrm{Range}(P_3)$" means "$y + O(\epsilon)\|A\|_2 \in \mathrm{Range}(P_3)$". Here, together with the computer floating-point precision $\epsilon$, $O(\epsilon)$ may be polynomially dependent on the number of steps k, the dimension n of A, and the maximum number of nonzeros in a row of A, see [14, §3.2].

Definition 4. A possible solution to a given problem involving A or Tk is “backward stable” if it is the exact solution to that problem with a perturbed matrix A + δA or Tk + E where δA ≈ 0 or E ≈ 0 in the above sense.

A rounding error analysis of the Lanczos process led to the following result.

Theorem 6 ([14, Corollary 3.2]). After k finite precision steps of a good implementation (such as in (43)) of the Lanczos algorithm with $A = A^H$ and $v_1$ leading to the computed tridiagonal matrix Tk and $\beta_{k+1}$, let $V_{k+1} = [v_1, v_2, \ldots, v_{k+1}]$ be the matrix of computed Lanczos vectors normalized to have unit length. Then with Q^{(k)} in (4) we have an exact Lanczos process for the Hermitian matrix Ak in
$$A_k \triangleq \begin{bmatrix} T_k & 0 \\ 0 & A \end{bmatrix} + H^{(k)}, \qquad H^{(k)} = H^{(k)H} \equiv \begin{bmatrix} H_{11}^{(k)} & H_{12}^{(k)} \\ H_{21}^{(k)} & H_{22}^{(k)} \end{bmatrix}, \qquad \|H^{(k)}\|_2 \le O(\epsilon)\|A\|_2, \qquad (45)$$
$$A_k\begin{bmatrix} S_k \\ V_k(I - S_k) \end{bmatrix} = \begin{bmatrix} S_k \\ V_k(I - S_k) \end{bmatrix}T_k + \begin{bmatrix} s_{k+1} \\ v_{k+1} - V_ks_{k+1} \end{bmatrix}\beta_{k+1}e_k^T, \qquad (46)$$
$$\bigl[Q_1^{(k)}\ q^{(k+1)}\bigr] \triangleq \begin{bmatrix} S_k & s_{k+1} \\ V_k(I - S_k) & v_{k+1} - V_ks_{k+1} \end{bmatrix} \in \mathcal{U}^{(k+n)\times(k+1)}, \quad \text{see (8)}. \qquad (47)$$
Thus the computed $T_{k+1,k}$ is seen to be the exact result of k steps of an exact Lanczos process with exact orthogonality arising from the augmented Hermitian matrix Ak with its $O(\epsilon)\|A\|_2$ Hermitian backward error $H^{(k)}$, the only rounding error component. To help understanding, if $V_k^HV_k = I$ then Sk and $s_{k+1}$ will be zero, the top block-row of (46) will be zero, while the bottom block-row will correspond to the ideal Lanczos process. Theorem 6 shows what is happening in the finite precision Lanczos process. Let

$$AX = X\Lambda, \qquad X^HX = I_n, \qquad X \equiv [x_1, \ldots, x_n], \qquad \Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n) \qquad (48)$$
denote the eigensystem of A, and (with $Y \equiv Y^{(k)}$ and $M \equiv M^{(k)}$)

$$T_kY = YM, \qquad Y^TY = I_k, \qquad M \triangleq \mathrm{diag}(\mu_1^{(k)}, \ldots, \mu_k^{(k)}), \qquad Y \triangleq [y_1^{(k)}, y_2^{(k)}, \ldots, y_k^{(k)}], \qquad (49)$$
the eigensystem of the computed Tk. There is a clash of notation here, in that X and Y were also used for the JCF of Sk, see for example Definition 2. This clash should cause no confusion if it is kept in mind. Both uses are somewhat standard, and the present notation is worth keeping for consistency with [14, 15]. If an eigenpair $\{\mu_j^{(k)}, y_j^{(k)}\}$ of Tk has converged (i.e., if $\beta_{k+1}|e_k^Ty_j^{(k)}| \approx 0$) then in the exact case of (44) $AV_ky_j^{(k)} \approx V_ky_j^{(k)}\mu_j^{(k)}$, and $\{\mu_j^{(k)}, V_ky_j^{(k)}\}$ is a backward stable eigenpair for A. As discussed in the following section, the computational Lanczos process modelled by (45)–(47) parallels this very nicely.
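
A small experiment (illustrative Python/NumPy; the spectrum, sizes, and thresholds are assumptions, not data from the paper) shows the behavior modelled above: after more steps than the dimension of A the computed Vk has lost orthogonality, the computed Sk acquires singular values extremely close to 1 (cf. Remark 4), and a converged Ritz pair $\{\mu_j^{(k)}, V_ky_j^{(k)}\}$ nevertheless has a small eigenpair residual.

    import numpy as np

    rng = np.random.default_rng(4)
    lam = np.concatenate([np.arange(1.0, 20.0), [100.0]])   # one isolated eigenvalue
    Qx, _ = np.linalg.qr(rng.standard_normal((20, 20)))
    A = Qx @ np.diag(lam) @ Qx.T
    n, k = 20, 40

    V = np.zeros((n, k + 1)); alpha = np.zeros(k); beta = np.zeros(k + 1)
    V[:, 0] = rng.standard_normal(n); V[:, 0] /= np.linalg.norm(V[:, 0])
    u = A @ V[:, 0]
    for j in range(k):                                      # recurrence (43)
        alpha[j] = V[:, j] @ u
        w = u - V[:, j] * alpha[j]
        beta[j + 1] = np.linalg.norm(w)
        V[:, j + 1] = w / beta[j + 1]
        u = A @ V[:, j + 1] - V[:, j] * beta[j + 1]

    Vk = V[:, :k]
    print(np.linalg.norm(Vk.T @ Vk - np.eye(k)))            # severe loss of orthogonality

    Uk = np.triu(Vk.T @ Vk, 1)                              # computed Sk, as in (2)
    Sk = np.linalg.solve(np.eye(k) + Uk, Uk)
    print(np.linalg.svd(Sk, compute_uv=False)[:3])          # typically ~1 to many digits

    Tk = np.diag(alpha) + np.diag(beta[1:k], 1) + np.diag(beta[1:k], -1)
    mu, Y = np.linalg.eigh(Tk)                              # eigensystem (49) of Tk
    j = np.argmax(mu)                                       # Ritz pair for lambda = 100
    print(beta[k] * abs(Y[k - 1, j]))                       # convergence test, ~0
    x = Vk @ Y[:, j]
    print(np.linalg.norm(A @ x - mu[j] * x) / np.linalg.norm(x))  # small residual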

10. Numerical behavior of the finite precision Lanczos process

First some background on the convergence of the process, see [15, Remarks 3.2, 3.6].

Remark 5. If an eigenpair $\{\mu_j^{(k)}, y_j^{(k)}\}$ of Tk has converged then $\mu_j^{(k)}$ has essentially converged to an eigenvalue $\lambda_i$ of A. Orthogonality of $v_{k+1}$ can only be lost in the direction of those $V_ky_j^{(k)}$ for which $\mu_j^{(k)}$ has converged, see [12, (3.18)]. But such loss of orthogonality allows the same eigenvalue of A to be approximated again later. Therefore the first time any eigenvalue $\mu_j^{(k)}$ of Tk converges to an eigenvalue of A we call $\{\mu_j^{(k)}, y_j^{(k)}\}$ a "first converged" eigenpair of Tk (with respect to A), and $\{\mu_j^{(k)}, V_ky_j^{(k)}\}$ a "first converged" eigenpair of A. If as k increases another eigenvalue converges to $\lambda_i$ we call this a repeat. Once this has converged another repeat can converge, and so on.

The proof of convergence of the numerical Lanczos process in [15, §12] is long and difficult, and definitely needs improving. Theorem 7 here summarizes some of the main results. Briefly, the Lanczos process eventually converges in a backward stable way to each eigenvalue $\lambda_i$ of A with $x_i$ represented in $v_1$.

Theorem 7 ([15, Theorem 12.3]). For the Lanczos process (43) applied to $A = A^H$ with initial unit-length vector $v_1$ resulting in (46) with (4), consider the eigensystem (48). With a special choice of one eigenvector $x_i$ for each essentially multiple eigenvalue $\lambda_i$ of A, the computational Lanczos process modelled in (45)–(47) eventually makes available backward stable approximations to every such eigenpair $\{\lambda_i, x_i\}$ of A for which $|x_i^Hv_1| > 0$. If A has distinct eigenvalues and $|x_i^Hv_1| > 0$, $i = 1:n$, then $\|Q_{22}^{(k)}\|_F^2$ decreases monotonically to zero, when all the eigenvalues of A will have been satisfactorily approximated by n eigenvalues of Tk, and with the notation in Definition 1 and Theorem 2, $\widehat V_0$, $\widetilde V_2$, $\widehat V_2$, P2, and W2 are nonexistent, while

$$\widetilde V_3 = V_kP_3 \in \mathcal{U}^{n\times n}, \qquad Q_{22}^{(k)} = 0, \qquad P^{(k)} = [P_1, P_3], \qquad Q_{21}^{(k)} = \widetilde V_3P_3^H. \qquad (50)$$

Now we consider groups of essentially equal eigenvalues of Tk from the Lanczos process to illustrate practical occurrences of Theorems 4 and 5.

Remark 6. In the next two theorems let $T_k\widetilde Y_{1:t} = \widetilde Y_{1:t}\mathrm{diag}(\mu_1^{(k)}, \ldots, \mu_t^{(k)})$, $\widetilde Y_{1:t} \in \mathcal{U}^{k\times t}$, where these are t essentially equal converged eigenvalues $\mu_1^{(k)} \approx \cdots \approx \mu_t^{(k)}$ separated from the rest. Then in each of the next two theorems $Y_{1:t} = \widetilde Y_{1:t}Z$ for some $Z \in \mathcal{U}^{t\times t}$, so $T_kY_{1:t} \approx Y_{1:t}\mu_1^{(k)}$, $Y_{1:t} \in \mathcal{U}^{k\times t}$.

Theorem 8 (Unrelated eigenvalues [15, Corollary 11.6]). If $\mu_1^{(k)} \approx \cdots \approx \mu_t^{(k)}$ in $T_kY_{1:t} \approx Y_{1:t}\mathrm{diag}(\mu_1^{(k)}, \ldots, \mu_t^{(k)})$, $Y_{1:t} \in \mathcal{U}^{k\times t}$, are all first converged approximations to t eigenvalues of A for Tk from the Lanczos process, then for some Z in Remark 6,

$$Y_{1:t}^HY_{1:t} = I_t, \qquad Y_{1:t} \overset{\in}{\sim} \mathrm{Range}(P_3), \qquad V_kY_{1:t} \overset{\in}{\sim} \mathrm{Range}(\widetilde V_3), \qquad V_kY_{1:t} \overset{\in}{\sim} \mathcal{U}^{n\times t}, \qquad (51)$$
$$S_kY_{1:t} \approx 0, \qquad Q_1^{(k)}Y_{1:t} = \begin{bmatrix} S_k \\ V_k(I - S_k) \end{bmatrix}Y_{1:t} \approx \begin{bmatrix} 0 & \ldots & 0 \\ V_ky_1 & \ldots & V_ky_t \end{bmatrix}, \qquad (52)$$

where each $\{\mu_j^{(k)}, V_ky_j^{(k)}\}$ is a backward stable eigenpair of A.

Here the columns of Y1:t are essentially eigenvectors of Sk, so SkY1:t ≈ 0 leading to (52). This is a practical occurrence of the theory in Theorem 4, see how (52) parallels (36). Next there is a fascinating result for any group of essentially equal eigenvalues of Tk which are all repeats of a single eigenvalue of A. Each such group corresponds to a distinct Jordan block of Sk. This is a practical occurrence of the results in Theorem 5.

Theorem 9 (Repeated eigenvalues [15, Corollary 11.4]). If $\mu_2^{(k)} \approx \cdots \approx \mu_t^{(k)}$ are essentially repeats of the first converged $\mu_1^{(k)}$, where $T_kY_{1:t} \approx Y_{1:t}\mathrm{diag}(\mu_1^{(k)}, \ldots, \mu_t^{(k)})$, $Y_{1:t} \in \mathcal{U}^{k\times t}$, so there is only one eigenvector of A corresponding to these t converged eigenvalues of Tk, then for some Z in Remark 6,
$$Q_1^{(k)}Y_{1:t} \approx \begin{bmatrix} 0 & Y_{1:t-1} \\ V_ky_1 & 0 \end{bmatrix}, \qquad \|V_ky_1\|_2 \approx 1, \qquad S_kY_{1:t} \approx Y_{1:t}J_t, \qquad J_t = \begin{bmatrix} 0 & I_{t-1} \\ 0 & 0 \end{bmatrix}, \qquad (53)$$
$$S_ky_1 \approx 0; \qquad S_ky_j \approx y_{j-1}, \quad V_ky_j \approx V_ky_{j-1}, \quad \|V_ky_j\|_2 \approx 1, \quad j = 2:t; \qquad (54)$$
$$y_1 \overset{\in}{\sim} \mathrm{Range}(P_3), \qquad V_ky_1 \overset{\in}{\sim} \mathrm{Range}(\widetilde V_3), \qquad Y_{2:t} = [y_2, \ldots, y_t] \overset{\in}{\sim} \mathrm{Range}(P_1), \qquad (55)$$

and the $\{\mu_j^{(k)}, V_ky_j^{(k)}\}$, $j = 1:t$, are all essentially identical backward stable eigenpairs for the one eigenpair of A.

The results (53)–(55) can be seen to fit very nicely with those in (37)–(40).

Remark 7. [A variant of [15, Remark 11.1]] When there are repeats, (54) shows that $y_1, \ldots, y_t$ essentially form the start of a Jordan chain of principal vectors of Sk. Therefore if there is a mix of different repeats and non-repeats in a converged group of close eigenvalues $\mu_1^{(k)} \approx \cdots \approx \mu_t^{(k)}$ of Tk, in theory there is a right side unitary transformation of Y in (49) that will group each chain in its correct order, leading to Jordan blocks of the form shown in (53), so that we do not require any eigenvalues to be separated from the rest to split all the converged eigenvectors into their respective blocks. Each block starts with a $y_j \overset{\in}{\sim} \mathrm{Range}(P_3)$ followed by its repeats, if any, in Range(P1), see (55).

Theorem 7 showed that under easy conditions $Q_{22}^{(k)} \searrow 0$. When $Q_{22}^{(k)} = 0$, [15, §13 & 14] showed that the computational Lanczos process will eventually make available a complete backward stable eigensystem of $A = A^H$ or a backward stable solution to $Ax = b$, and this is as good an accuracy as can be expected with finite precision.

11. Summary

Understanding the loss of orthogonality was the key to proving the accuracy of the Lanczos process, and this will probably be important for other iterative numerical orthogonalization algorithms. Sections 6–8 showed that the matrix Sk in Theorem 1 used in the analysis of the process reveals the structure in loss of orthogonality for any set of n-vectors, and showed that the JCF of Sk gives even more information than its SVD.

The finite precision Lanczos process applied to $A = A^H$ with initial vector $v_1$ eventually makes available a backward stable eigenpair of A for every eigenvalue $\lambda_i$ of A with $|x_i^Hv_1| > 0$, see Theorem 7. Theorem 8 showed that in the converged approximations $\{\mu_j, V_ky_j\}$ to eigenpairs of A, each $y_j$ was essentially an eigenvector of Sk, and that these produced an essentially orthonormal set of $V_ky_j$, see (52). Paralleling this, Corollary 1 showed that for any set of unit 2-norm n-vectors in Vk, the eigenvectors $\widehat Y_1e_1, \ldots, \widehat Y_{n_k}e_1$ of Sk could be chosen to be an orthonormal set, and that these then led to the orthonormal matrix $V_k[\widehat Y_1e_1, \ldots, \widehat Y_{n_k}e_1]$.

However because of the possibility of deriving many repeats of eigenvalues of A the finite precision Lanczos process can take many more than the ideal number of steps. It was shown in Theorem 9 that for a first converged approximation $\{\mu_1, V_ky_1\}$ to any eigenpair of A, with converged repeats $\{\mu_j, V_ky_j\}$, $j = 2:t$, $\mu_1 \approx \cdots \approx \mu_t$, $Y_{1:t}^HY_{1:t} = I$, the $Y_{1:t}$ could be orthogonally transformed to essentially give the start of an orthonormal Jordan chain of Sk, with $V_ky_1 \approx \cdots \approx V_ky_t$ and $\|V_ky_1\|_2 \approx 1$, see (53)–(54). This is supported by the general result in Theorem 5 that if any such Sk in Theorem 1 has a Jordan chain $Y_{1:t}$ starting with $\|y_1\|_2 = \cdots = \|y_t\|_2$, then $Y_{1:t}$ can be scaled to be an orthonormal Jordan chain, i.e., satisfying $Y_{1:t}^HY_{1:t} = I_t$, and then $V_ky_1 = \cdots = V_ky_t$, $\|V_ky_1\|_2 = 1$, see (38). Also $Y_{1:t-1}$ is an orthonormal left Jordan chain of Sk, see (39).

Acknowledgements

The authors are very thankful for the extremely thorough and excellent suggestions made by the referees, which improved this paper greatly. This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) Grants RGPIN-2017-05138 and OGP0009236.

References

[1] Å. Björck and C. C. Paige, Loss and recapture of orthogonality in the modified Gram–Schmidt algorithm, SIAM J. Matrix Anal. Appl., 13 (1992), pp. 176–190, https://doi.org/10.1137/0613015.
[2] D. Fong and M. A. Saunders, LSMR: An iterative algorithm for sparse least-squares problems, SIAM J. Sci. Comput., 33:5 (2011), pp. 2950–2971, https://doi.org/10.1137/10079687X.
[3] F. R. Gantmacher, The Theory of Matrices, Vol. 1, Chelsea Publishing Co., New York, 1959.
[4] G. H. Golub and W. Kahan, Calculating the singular values and pseudo-inverse of a matrix, SIAM J. Numer. Anal., 2 (1965), pp. 205–224, https://doi.org/10.1137/0702016.
[5] G. H. Golub and C. F. Van Loan, Matrix Computations, 4th ed., The Johns Hopkins University Press, Baltimore, 2013.
[6] M. H. Gutknecht and Z. Strakoš, Accuracy of two three-term and three two-term recurrences for Krylov space solvers, SIAM J. Matrix Anal. Appl., 22 (2001), pp. 213–229, https://doi.org/10.1137/S0895479897331862.
[7] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, 1985.
[8] C. Lanczos, An iteration method for the solution of the eigenvalue problem of linear differential and integral operators, J. Research Nat. Bur. Standards, 45 (1950), pp. 255–282, https://doi.org/10.6028/jres.045.026.
[9] G. Meurant, The Lanczos and Conjugate Gradient Algorithms, SIAM, Philadelphia, 2006, https://doi.org/10.1137/1.9780898718140.
[10] G. Meurant and Z. Strakoš, The Lanczos and conjugate gradient algorithms in finite precision arithmetic, Acta Numerica, Cambridge University Press, 15 (2006), pp. 471–542, https://doi.org/10.1017/S096249290626001X.
[11] C. C. Paige, The Computation of Eigenvalues and Eigenvectors of Very Large Sparse Matrices, PhD thesis, London University, London, England, 1971.
[12] C. C. Paige, Accuracy and effectiveness of the Lanczos algorithm for the symmetric eigenproblem, Linear Algebra Appl., 34 (1980), pp. 235–258, https://doi.org/10.1016/0024-3795(80)90167-6.
[13] C. C. Paige, A useful form of unitary matrix obtained from any sequence of unit 2-norm n-vectors, SIAM J. Matrix Anal. Appl., 31 (2009), pp. 565–583, https://doi.org/10.1137/080725167.
[14] C. C. Paige, An augmented stability result for the Lanczos Hermitian matrix tridiagonalization process, SIAM J. Matrix Anal. Appl., 31 (2010), pp. 2347–2359, https://doi.org/10.1137/090761343.
[15] C. C. Paige, Accuracy of the Lanczos process for the eigenproblem and solution of equations, SIAM J. Matrix Anal. Appl., 40:4 (2019), pp. 1371–1398, https://doi.org/10.1137/17M1133725.
[16] C. C. Paige and M. A. Saunders, LSQR: An algorithm for sparse linear equations and sparse least squares, ACM Trans. Math. Software, 8 (1982), pp. 43–71, https://doi.org/10.1145/355984.355989.
[17] C. C. Paige and P. Van Dooren, On the quadratic convergence of Kogbetliantz's algorithm for computing the singular value decomposition, Linear Algebra Appl., 77 (1986), pp. 301–313, https://doi.org/10.1016/0024-3795(86)90173-4.
[18] C. C. Paige and P. Van Dooren, Sensitivity analysis of the Lanczos reduction, Numer. Linear Algebra Appl., 6 (1999), pp. 29–50, https://doi.org/10.1002/(SICI)1099-1506(199901/02)6:1<29::AID-NLA144>3.0.CO;2-I.
[19] C. C. Paige and W. Wülling, Properties of a unitary matrix obtained from a sequence of normalized vectors, SIAM J. Matrix Anal. Appl., 35 (2014), pp. 526–545, https://doi.org/10.1137/120897687.
[20] B. N. Parlett, The Symmetric Eigenvalue Problem, Classics in Appl. Math. 20, SIAM, Philadelphia, 1998, https://doi.org/10.1137/1.9781611971163.
[21] C. Sheffield, comment to Gene Golub, (Date unknown).
[22] J. H. Wilkinson, The Algebraic Eigenvalue Problem, Clarendon Press, Oxford, UK, 1965.
