
A Tour of the Lanczos and its Convergence Guarantees through the Decades

Qiaochu Yuan

Department of Mathematics, UC Berkeley
Joint work with Prof. Ming Gu and Bo Li

April 17, 2018

Qiaochu Yuan, Tour of Lanczos, April 17, 2018

Timeline

(1950) Lanczos iteration
(1965) Lanczos algorithm for SVD
(1966, 1971, 1980) Convergence bounds for Lanczos
(1974, 1977) Block Lanczos algorithm
(1975, 1980) Convergence bounds for Block Lanczos
(2009, 2010, 2015) Randomized Block Lanczos algorithm
(2015) Convergence bounds for Randomized Block Lanczos

Our work:
1) Convergence bounds for RBL, arbitrary block size b
2) Superlinear convergence for typical data matrices

Lanczos Iteration Algorithm

Developed by Lanczos in 1950 [Lan50].

Widely used iterative algorithm for computing the extremal eigenvalues and corresponding eigenvectors of a large, sparse, symmetric matrix A.

Goal: Given a symmetric matrix A ∈ R^{n×n}, with eigenvalues λ_1 > λ_2 > ··· > λ_n and associated eigenvectors u_1, ··· , u_n, we want to find approximations for

- λ_i, i = 1, ··· , k, the k largest eigenvalues of A
- u_i, i = 1, ··· , k, the associated eigenvectors

where k ≪ n.

Lanczos - Details

General idea:
1. Select an initial vector v.
2. Construct K(A, v, k) = span{v, Av, A²v, ··· , A^{k-1}v}.
3. Restrict and project A to the Krylov subspace: T = proj_K A|_K.
4. Use the eigenvalues and eigenvectors of T as approximations to those of A.

In matrices:

K_k = [v  Av  ···  A^{k-1}v] ∈ R^{n×k}
Q_k = [q_1  q_2  ···  q_k] ← qr(K_k)
T_k = Q_k^T A Q_k ∈ R^{k×k}

Lanczos - Details

At each step j = 1, ··· , k of the Lanczos iteration,

A Q_j = Q_j T_j + β_j q_{j+1} e_j^T,

where T_j is the j×j tridiagonal matrix with diagonal entries α_1, ··· , α_j and off-diagonal entries β_1, ··· , β_{j-1}. Use the three-term recurrence

A q_j = β_{j-1} q_{j-1} + α_j q_j + β_j q_{j+1}

and calculate the α's and β's as:

α_j = q_j^T A q_j                          (1)
r_j = (A − α_j I) q_j − β_{j-1} q_{j-1}    (2)
β_j = ‖r_j‖_2,  q_{j+1} = r_j / β_j        (3)
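The recurrence (1)-(3) can be sketched directly in NumPy. The function below is an illustrative implementation, not the production algorithm: full reorthogonalization is added, since the bare three-term recurrence famously loses orthogonality in floating point.

```python
import numpy as np

def lanczos(A, v, k):
    """Lanczos iteration sketch: build Q_k and tridiagonal T_k = Q_k^T A Q_k.

    A : symmetric (n, n) array
    v : starting vector, not orthogonal to the dominant eigenvector
    k : number of Lanczos steps (k << n)
    """
    n = A.shape[0]
    Q = np.zeros((n, k + 1))
    alpha, beta = np.zeros(k), np.zeros(k)
    Q[:, 0] = v / np.linalg.norm(v)
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = Q[:, j] @ w                    # alpha_j = q_j^T A q_j, eq. (1)
        w -= alpha[j] * Q[:, j]
        if j > 0:
            w -= beta[j - 1] * Q[:, j - 1]        # three-term recurrence, eq. (2)
        w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)  # full reorthogonalization (stability)
        beta[j] = np.linalg.norm(w)               # eq. (3)
        if beta[j] == 0:
            break                                 # invariant subspace found
        Q[:, j + 1] = w / beta[j]
    T = np.diag(alpha) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
    return Q[:, :k], T
```

Applied to a matrix with a well-separated top eigenvalue, the largest eigenvalue of T_k matches λ_1 after a modest number of steps.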


Convergence of Lanczos

How well do λ_i^{(k)}, the eigenvalues of T_k, approximate λ_i, the eigenvalues of A, for i = 1, ··· , k?

First answered by Kaniel in 1966 [Kan66] and Paige in 1971 [Pai71].

Theorem (Kaniel-Paige Inequality)
If v is chosen to be not orthogonal to the eigenspace associated with λ_1, then

0 ≤ λ_1 − λ_1^{(k)} ≤ (λ_1 − λ_n) · tan² θ(u_1, v) / T_{k-1}²(γ_1)    (4)

where T_i(x) is the Chebyshev polynomial of degree i, θ(·, ·) is the angle between two vectors, and

γ_1 = 1 + 2 (λ_1 − λ_2) / (λ_2 − λ_n)

Convergence of Lanczos

Later generalized by Saad in 1980 [Saa80].

Theorem (Saad Inequality)
For i = 1, ··· , k, if v is chosen such that u_i^T v ≠ 0, then

0 ≤ λ_i − λ_i^{(k)} ≤ (λ_i − λ_n) · ( L_i^{(k)} tan θ(u_i, v) / T_{k-i}(γ_i) )²    (5)

where T_i(x) is the Chebyshev polynomial of degree i, θ(·, ·) is the angle between two vectors, and

γ_i = 1 + 2 (λ_i − λ_{i+1}) / (λ_{i+1} − λ_n)

L_i^{(k)} = ∏_{j=1}^{i-1} (λ_j^{(k)} − λ_n) / (λ_j^{(k)} − λ_i)  if i ≠ 1,  and  L_1^{(k)} = 1

Aside - Chebyshev Polynomials

Recall

T_j(x) = (1/2) [ (x + √(x² − 1))^j + (x − √(x² − 1))^j ]    (6)

When j is large,

T_j(x) ≈ (1/2) (x + √(x² − 1))^j

and when g is small,

T_j(1 + g) ≈ (1/2) (1 + g + √(2g))^j

This Chebyshev growth (vs. the monomial growth of the power method) determines the convergence of Lanczos, with

g = Θ( (λ_i − λ_{i+1}) / (λ_{i+1} − λ_n) )

[Figure: growth of the degree-j Chebyshev polynomial (Lanczos) vs. the degree-j monomial (power method) on [0, 1.6].]
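A small numerical sketch of this comparison, evaluating (6) directly; the values of g and j below are arbitrary illustrative choices:

```python
import numpy as np

def cheb(j, x):
    """Chebyshev polynomial T_j evaluated at x >= 1, via the closed form (6)."""
    s = np.sqrt(x * x - 1.0)
    return 0.5 * ((x + s) ** j + (x - s) ** j)

g = 0.01                      # a small spectral gap
j = 50                        # polynomial degree ~ number of iterations
print(cheb(j, 1 + g))         # Chebyshev growth, ~ (1/2)(1 + g + sqrt(2g))^j
print((1 + g) ** j)           # monomial growth (power method)
```

Even for this tiny gap, the Chebyshev value exceeds the monomial by orders of magnitude, which is exactly the advantage Lanczos has over power iteration.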


Block Lanczos Algorithm

Introduced by Golub and Underwood in 1977 [GU77] and Cullum and Donath in 1974 [CD74].

The block generalization of the Lanczos method uses, instead of a single initial vector v, a block of b vectors V = [v_1 ··· v_b], and builds the Krylov subspace in q iterations as

K_q(A, V, q) = span{V, AV, ··· , A^{q-1}V},

with k ≤ b and bq ≪ n.

Compared to classical Lanczos, block Lanczos
- is more memory efficient, using BLAS3 operations;
- has the ability to converge to eigenvalues with cluster size > 1;
- has faster convergence with respect to the number of iterations.

Block Lanczos Algorithm - Details

At each step j = 1, ··· , q of the block Lanczos iteration,

A [Q_1 ··· Q_j] = [Q_1 ··· Q_j] T_j + Q_{j+1} [0 ··· 0 B_j],

where T_j is the block tridiagonal matrix with diagonal blocks A_1, ··· , A_j, superdiagonal blocks B_1^T, ··· , B_{j-1}^T, and subdiagonal blocks B_1, ··· , B_{j-1}. Use the three-term recurrence

A Q_j = Q_{j-1} B_{j-1}^T + Q_j A_j + Q_{j+1} B_j

and calculate the A's and B's as:

A_j = Q_j^T A Q_j                                        (7)
Q_{j+1} B_j ← qr( A Q_j − Q_j A_j − Q_{j-1} B_{j-1}^T )  (8)
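The block recurrence (7)-(8) can be sketched in NumPy as follows. The names and interface are illustrative, and full reorthogonalization stands in for the bare three-term recurrence for floating-point stability:

```python
import numpy as np

def block_lanczos(A, V, q):
    """Block Lanczos sketch: orthonormal basis Q of span{V, AV, ..., A^(q-1)V}
    and the projected matrix T = Q^T A Q (block tridiagonal in exact arithmetic).

    A : symmetric (n, n) array;  V : (n, b) starting block;  q : block steps.
    """
    Qs = [np.linalg.qr(V)[0]]                 # Q_1 from qr of the starting block
    for _ in range(q - 1):
        W = A @ Qs[-1]
        # In exact arithmetic only the last two blocks have nonzero projections
        # (three-term recurrence); two Gram-Schmidt passes against all previous
        # blocks keep the basis orthonormal numerically.
        for _ in range(2):
            for Qprev in Qs:
                W -= Qprev @ (Qprev.T @ W)
        Qs.append(np.linalg.qr(W)[0])         # Q_{j+1} B_j <- qr(residual), eq. (8)
    Q = np.hstack(Qs)
    return Q, Q.T @ A @ Q
```

The eigenvalues of the small matrix T are the Ritz approximations to the extremal eigenvalues of A; with block size b, a cluster of up to b nearby eigenvalues can be resolved at once.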


Convergence of Block Lanczos

First analyzed by Underwood in 1975 [Und75] with the introduction of the algorithm.

Theorem (Underwood Inequality)
Let λ_i, u_i, i = 1, ··· , n be the eigenvalues and eigenvectors of A, respectively. Let U = [u_1 ··· u_b]. If U^T V is of full rank b, then for i = 1, ··· , b,

0 ≤ λ_i − λ_i^{(q)} ≤ (λ_1 − λ_n) · tan² Θ(U, V) / T_{q-1}²(ρ_i)    (9)

where T_i(x) is the Chebyshev polynomial of degree i, cos Θ(U, V) = σ_min(U^T V), and

ρ_i = 1 + 2 (λ_i − λ_{b+1}) / (λ_{b+1} − λ_n)

Convergence of Block Lanczos

Theorem (Saad Inequality [Saa80])
Let λ_j, u_j, j = 1, ··· , n be the eigenvalues and eigenvectors of A, respectively. Let U_i = [u_i ··· u_{i+b-1}]. For i = 1, ··· , b, if U_i^T V is of full rank b, then

0 ≤ λ_i − λ_i^{(q)} ≤ (λ_i − λ_n) · ( L_i^{(q)} tan Θ(U_i, V) / T_{q-i}(γ̂_i) )²    (10)

where T_i(x) is the Chebyshev polynomial of degree i, cos Θ(U_i, V) = σ_min(U_i^T V), and

γ̂_i = 1 + 2 (λ_i − λ_{i+b}) / (λ_{i+b} − λ_n)

L_i^{(q)} = ∏_{j=1}^{i-1} (λ_j^{(q)} − λ_n) / (λ_j^{(q)} − λ_i)  if i ≠ 1,  and  L_1^{(q)} = 1

Aside - Block Size

Recall, when j is large and g is small,

T_j(1 + g) ≈ (1/2) (1 + g + √(2g))^j    (11)

classical:  g = Θ( (λ_i − λ_{i+1}) / (λ_{i+1} − λ_n) )
block:      g_b = Θ( (λ_i − λ_{i+b}) / (λ_{i+b} − λ_n) )

Suppose the eigenvalues are distributed so that λ_j > (1 + ε) λ_{j+1} for all j. Then, taking λ_n ≈ 0,

g_b ≈ (1 − (1 + ε)^{-b}) / (1 + ε)^{-b} = (1 + ε)^b − 1 ≈ b · ε ≈ b · g,

so blocking enlarges the effective gap by roughly a factor of b.
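A quick numerical check of the b·g heuristic, under an assumed geometric spectrum; the values of ε, b, and n below are arbitrary illustrative choices:

```python
import numpy as np

# Geometric spectrum lam_j = (1 + eps)^(-j), following the slide's assumption.
eps, b, i, n = 0.05, 10, 0, 1000
lam = (1.0 + eps) ** -np.arange(n)

g  = (lam[i] - lam[i + 1]) / (lam[i + 1] - lam[-1])   # classical Lanczos gap
gb = (lam[i] - lam[i + b]) / (lam[i + b] - lam[-1])   # block Lanczos gap

print(g, gb, gb / g)   # gb is roughly b times g
```

With these numbers g ≈ ε = 0.05 and g_b/g ≈ 12.6, on the order of b = 10 as the approximation predicts (slightly more, since (1+ε)^b − 1 > bε).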


Randomized Block Lanczos

In recent years, there has been increased interest in computing low-rank approximations of matrices, with
- high computational efficiency requirements, running on large matrices;
- low approximation accuracy requirements, 2-3 digits of accuracy.

Applications are mostly in big-data computations:
- compression of data matrices
- matrix processing techniques, e.g. PCA
- optimization of nuclear norm objective functions

Randomized Block Lanczos

Grew out of work on randomized algorithms, in particular Randomized Subspace Iteration, by
- Rokhlin, Szlam, and Tygert in 2009 [RST09]
- Halko, Martinsson, Shkolnisky, and Tygert in 2011 [HMST11]
- Gu [Gu15] and Musco and Musco [MM15] in 2015

Idea: Instead of taking any initial set of vectors V, an unfortunate choice of which could result in poor convergence, choose V = AΩ, a random projection of the columns of A, to better capture the range space.

RSI & RBL Pseudocode

Algorithm 1: Randomized Subspace Iteration (RSI)
  Input: A ∈ R^{m×n}, target rank k, block size b, number of iterations q
  Output: B_k ∈ R^{m×n}, a rank-k approximation
  1: Draw Ω ∈ R^{n×b}, ω_ij ∼ N(0, 1).
  2: Form K = (A A^T)^{q-1} A Ω.
  3: Orthogonalize qr(K) → Q.
  4: B_k = (Q Q^T A)_k = Q (Q^T A)_k.

Algorithm 2: Randomized Block Lanczos (RBL)
  Input: A ∈ R^{m×n}, target rank k, block size b, number of iterations q
  Output: B_k ∈ R^{m×n}, a rank-k approximation
  1: Draw Ω ∈ R^{n×b}, ω_ij ∼ N(0, 1).
  2: Form K = [A Ω, (A A^T) A Ω, ··· , (A A^T)^{q-1} A Ω].
  3: Orthogonalize qr(K) → Q.
  4: B_k = (Q Q^T A)_k = Q (Q^T A)_k.

* (M)_k denotes the k-truncated SVD of the matrix M.
* RBL requires bq ≥ k.
** Forming K explicitly as above is numerically unstable.
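The two algorithms differ only in line 2. Here is a compact NumPy sketch of Algorithm 2; the function name and interface are illustrative, and the Krylov block is formed explicitly for clarity despite the instability noted above:

```python
import numpy as np

def rbl(A, k, b, q, seed=0):
    """Randomized Block Lanczos sketch (Algorithm 2).

    Forms K = [A Om, (A A^T) A Om, ..., (A A^T)^(q-1) A Om] explicitly,
    which is numerically unstable for large q; fine for illustration.
    Requires b*q >= k.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = A @ rng.standard_normal((n, b))       # A @ Omega, omega_ij ~ N(0, 1)
    blocks = []
    for _ in range(q):
        blocks.append(W)
        W = A @ (A.T @ W)                     # next power of (A A^T) applied to A Omega
    Q, _ = np.linalg.qr(np.hstack(blocks))    # orthonormal basis of the Krylov block
    # B_k = Q (Q^T A)_k : k-truncated SVD of the projected matrix
    U, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return (Q @ U[:, :k]) * s[:k] @ Vt[:k]
```

On a matrix whose spectrum drops sharply after rank k, a handful of iterations already drives the approximation error down to the level of σ_{k+1}.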


Lanczos for SVD

A modification of the Lanczos algorithm can be used to find the extremal singular value pairs of a non-symmetric matrix A ∈ R^{m×n}.

Utilizing the connections between the singular value decomposition of A and the eigendecompositions of A A^T and A^T A:
1. Select a block of initial vectors V = [v_1 ··· v_b].
2. Construct Krylov subspaces K_V(A^T A, V, q) and K_U(A A^T, A V, q).
3. Restrict and project A to form B = proj_{K_U} A|_{K_V}.
4. Use the singular values and vectors of B as approximations to those of A.

Golub-Kahan [GK65]

A [V_1 ··· V_j] = [U_1 ··· U_j] R_j,

where R_j is block upper bidiagonal with diagonal blocks A_1, ··· , A_j and superdiagonal blocks B_1^T, ··· , B_{j-1}^T, and

A^T [U_1 ··· U_j] = [V_1 ··· V_j  V_{j+1}] L_j,

where L_j is block lower bidiagonal with diagonal blocks A_1^T, ··· , A_j^T and subdiagonal blocks B_1, ··· , B_j.

At each step j = 1, ··· , k of the bidiagonalization, calculate the A's and B's as:

U_j A_j ← qr( A V_j − U_{j-1} B_{j-1}^T )    (12)
V_{j+1} B_j ← qr( A^T U_j − V_j A_j^T )      (13)
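For intuition, here is the single-vector case (b = 1) of the bidiagonalization (12)-(13), sketched in NumPy with reorthogonalization added for floating-point stability; with b = 1 the qr factorizations reduce to normalizations. The function name and interface are illustrative:

```python
import numpy as np

def golub_kahan(A, v, k):
    """Golub-Kahan-Lanczos bidiagonalization sketch (single-vector, b = 1).

    Builds U (m, k) and V (n, k) with orthonormal columns and an upper
    bidiagonal B = U^T A V whose singular values approximate those of A.
    """
    m, n = A.shape
    U, V = np.zeros((m, k)), np.zeros((n, k + 1))
    alpha, beta = np.zeros(k), np.zeros(k)
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(k):
        u = A @ V[:, j]
        if j > 0:
            u -= beta[j - 1] * U[:, j - 1]            # eq. (12), b = 1
        u -= U[:, :j] @ (U[:, :j].T @ u)              # reorthogonalize (stability)
        alpha[j] = np.linalg.norm(u)
        U[:, j] = u / alpha[j]
        w = A.T @ U[:, j] - alpha[j] * V[:, j]        # eq. (13), b = 1
        w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)      # reorthogonalize
        beta[j] = np.linalg.norm(w)
        V[:, j + 1] = w / beta[j]
    B = np.diag(alpha) + np.diag(beta[:k - 1], 1)     # upper bidiagonal
    return U, V[:, :k], B
```

The singular values of the small bidiagonal B approximate the extremal singular values of A, just as the eigenvalues of T_k do in the symmetric case.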


Convergence of RBL

Analyzed by Musco and Musco in 2015 [MM15].

Theorem
For block size b ≥ k, in q = Θ(log n / ε) iterations of RSI and q = Θ(log n / √ε) iterations of RBL, with constant probability 99/100, the following bounds are satisfied:

|σ_i² − σ_i²(B_k)| ≤ ε σ_{k+1}²,  for i = 1, ··· , k    (14)

In the event that σ_{b+1} ≤ c σ_k with c < 1, taking q = Θ( log(n/ε) / min(1, σ_k/σ_{b+1} − 1) ) for RSI and q = Θ( log(n/ε) / √(min(1, σ_k/σ_{b+1} − 1)) ) for RBL suffices.


Aside: Convergence of RSI

Many of our analysis techniques are similar to those used by Gu to analyze RSI [Gu15]. The bounds for RBL resulting from the current work will be similar in form to the bounds for RSI.

Theorem (RSI convergence)

Let B_k be the approximation returned by the RSI algorithm on A = U Σ V^T, with target rank k, block size b ≥ k, and q iterations. If Ω̂_1 has full row rank in V^T Ω = [Ω̂_1^T  Ω̂_2^T]^T, then

σ_j ≥ σ_j(B_k) ≥ σ_j / √( 1 + ‖Ω̂_2‖_2² ‖Ω̂_1^†‖_2² (σ_{b+1}/σ_j)^{4q+2} )    (15)

Current Work - Core Ideas

The core pieces of our analysis are:
1. the growth behavior of Chebyshev polynomials,
2. the choice of a clever orthonormal basis for the Krylov subspace,
3. the creation of a spectrum "gap", by separating the spectrum of A into those singular values that are "close" to σ_k and those that are sufficiently smaller in magnitude.

Setup & Notation

For block size b and a random Gaussian matrix Ω ∈ R^{n×b}, we are interested in the column span of

K_q = [ A Ω  (A A^T) A Ω  ···  (A A^T)^{q-1} A Ω ]

Let the SVD of A ∈ R^{m×n} be U Σ V^T. For any 0 ≤ p ≤ q, define

K̂_p := U T_{2p+1}(Σ) [ Ω̂  Σ² Ω̂  ···  Σ^{2(q-p-1)} Ω̂ ] = U T_{2p+1}(Σ) V_{q-p}

Note V_{q-p} ∈ R^{n×b(q-p)}. We require b(q − p) ≥ k, the target rank.

Lemma
With K̂_p and K_q as previously defined,

span{K_q} ⊇ span{K̂_p}    (16)

Convergence of RBL

Theorem (Main Result)

Let B_k be the approximation returned by the RBL algorithm. With notation as previously defined, if the random starting Ω is initialized such that the block Vandermonde matrix formed by the sub-blocks of V_{q-p} is invertible, then for all 1 ≤ j ≤ k and all choices^a of s and r,

σ_j ≥ σ_j(B_k) ≥ σ_{j+s} / √( 1 + C² T_{2p+1}^{-2}( 1 + 2 (σ_j − σ_{j+s+r+1}) / σ_{j+s+r+1} ) )    (17)

where p = q − (k + r)/b, and C is a constant independent of q.

^a s is chosen to be non-zero to handle multiple singular values, and can be set to zero otherwise.

Summary of Convergence Bounds

Rewriting the bounds in comparable forms:

Saad [Saa80] (requires b ≥ k):
  λ_j^{(q)} ≥ λ_j / ( 1 + (L_j^{(q)})² tan² Θ(U, V) · T_{q-j}^{-2}( 1 + 2 (λ_j − λ_{j+b}) / λ_{j+b} ) )

Musco [MM15], spectrum-independent (requires b ≥ k):
  σ_j^{(q)} ≥ σ_j / √( 1 + C² log²(n) q^{-2} · σ_{k+1}² / σ_j² )

Musco [MM15], spectrum-dependent (requires b ≥ k):
  σ_j^{(q)} ≥ σ_j / √( 1 + C² n e^{−q √min(1, σ_k/σ_{b+1} − 1)} · σ_{k+1}² / σ_j² )

Current Work (requires only b ≥ 1, bq ≥ k + r):
  σ_j^{(q)} ≥ σ_j / √( 1 + C² T_{2q+1−2(k+r)/b}^{-2}( 1 + 2 (σ_j − σ_{j+r+1}) / σ_{j+r+1} ) )

Typical Case - Superlinear Convergence

Recall: {a_q} converges superlinearly to a if

lim_{q→∞} |a_{q+1} − a| / |a_q − a| = 0    (18)

In practice, Lanczos algorithms (classical, block, randomized) often exhibit superlinear convergence behavior. It has been shown that classical Lanczos iteration is theoretically superlinearly convergent under certain assumptions about the singular spectrum [Saa94, Li10]. We show this for block Lanczos algorithms, i.e., that under certain assumptions about the singular spectrum, block Lanczos produces rank-k approximations B_k such that σ_j(B_k) → σ_j superlinearly.

Typical Case - Superlinear Convergence

A typical data matrix might have a singular value spectrum decaying to 0, i.e., σ_j → 0. In this case our bound suggests that convergence is governed by

a_q := C(r) T_p^{-1}(1 + g) ≈ 2 C(r) (1 + g + √(2g))^{-p} → 0

with

g = 2 (σ_j − σ_{j+r+1}) / σ_{j+r+1} = 2 (σ_j / σ_{j+r+1} − 1) → ∞
p = 2 (q − (k + r)/b) + 1 = 2q + 1 − 2 (k + r)/b

We argue that a_{q+1}/a_q → 0 as follows: for all ε > 0, choose r so that 1 + g ≥ ε^{-1/2}.¹ Then

a_{q+1} / a_q ≤ ε

¹ Recall that 1) our main result holds for all r, and 2) k + r = (q − p)b, so choosing r amounts to choosing q.

Effect of r

There is some optimal value of r, typically non-zero, which achieves the best convergence factor. The balance is between larger (smaller) values of r, which imply a lower (higher) Chebyshev degree but a bigger (smaller) gap.
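This trade-off can be sketched numerically. Everything below is a hypothetical illustration: the spectrum σ_j = j^{-2} and the parameter choices are made up, not the dataset or parameters from the figure.

```python
import numpy as np

def cheb(j, x):
    """Chebyshev polynomial T_j at x >= 1, via the closed form (6)."""
    s = np.sqrt(x * x - 1.0)
    return 0.5 * ((x + s) ** j + (x - s) ** j)

# Hypothetical decaying spectrum sigma_j = j^(-2); illustrative parameters.
sigma = np.arange(1, 200, dtype=float) ** -2.0
k = j = 20; b = 5; q = 15

facs = []
for r in range(0, 40, 5):                  # r in multiples of b keeps the degree integral
    deg = 2 * q + 1 - 2 * (k + r) // b     # Chebyshev degree shrinks as r grows...
    gap = 1 + 2 * (sigma[j - 1] - sigma[j + r]) / sigma[j + r]   # ...but the gap widens
    facs.append(1.0 / cheb(deg, gap))      # reciprocal convergence factor
print(facs)
```

For this particular fast-decaying spectrum the gap effect dominates over the sampled range, so the factor keeps shrinking with r; on real data, as in the figure below, the two effects balance at some interior optimum.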

Figure: Value of the reciprocal convergence factor T_{2q+1−2(k+r)/b}^{-1}( 1 + 2 (σ_j − σ_{j+r+1}) / σ_{j+r+1} ) as r varies from 0 to 60 (vertical axis from 10^-5 to 10^-1, log scale), for the Daily Activities and Sports Dataset, k = j = 100, b = 10, q = 20.

Numerical Example

Experimentally, smaller block sizes 1 ≤ b < k appear favorable, with superlinear convergence observed for all block sizes.

Figure: Daily Activities and Sports Dataset, A ∈ R^{9120×5625}, target k = 200, j = 200. Relative error (σ_j − σ_j(B_k))/σ_j versus number of MATVECs b(1 + 2q), for RSI with b = 200+2, 200+5, 200+10 and RBL with b = 200, 100, 50, 20, 10, 2.

Numerical Example

Figure: Eigenfaces Dataset, A ∈ R^{10304×400}, target k = 100, j = 100. Relative error (σ_j − σ_j(B_k))/σ_j versus number of MATVECs b(1 + 2q), for RSI with b = 100, 100+2, 100+5 and RBL with b = 100, 50, 20, 10, 2, 1.

Conclusions

Both the theoretical analysis and the numerical evidence suggest that, holding the number of matrix-vector operations constant, RBL with a smaller block size b is better, and that for matrices with decaying spectra, RBL achieves superlinear convergence. However, the preference for a smaller b must be balanced against the advantages of a larger b for computational efficiency and other practical reasons, and should be further investigated.

References I

[CD74] Jane Cullum and William E. Donath, A block Lanczos algorithm for computing the q algebraically largest eigenvalues and a corresponding eigenspace of large, sparse, real symmetric matrices, Proc. 1974 IEEE Conference on Decision and Control including the 13th Symposium on Adaptive Processes, vol. 13, IEEE, 1974, pp. 505–509.

[GK65] Gene Golub and William Kahan, Calculating the singular values and pseudo-inverse of a matrix, Journal of the Society for Industrial and Applied Mathematics, Series B: Numerical Analysis 2 (1965), no. 2, 205–224.

[GU77] Gene Howard Golub and Richard Underwood, The block Lanczos method for computing eigenvalues, Mathematical Software, Elsevier, 1977, pp. 361–377.

References II

[Gu15] Ming Gu, Subspace iteration randomization and singular value problems, SIAM Journal on Scientific Computing 37 (2015), no. 3, A1139–A1173.

[HMST11] Nathan Halko, Per-Gunnar Martinsson, Yoel Shkolnisky, and Mark Tygert, An algorithm for the principal component analysis of large data sets, SIAM Journal on Scientific Computing 33 (2011), no. 5, 2580–2594.

[Kan66] Shmuel Kaniel, Estimates for some computational techniques in linear algebra, Mathematics of Computation 20 (1966), no. 95, 369–378.

[Lan50] Cornelius Lanczos, An iteration method for the solution of the eigenvalue problem of linear differential and integral operators, United States Government Press Office, Los Angeles, CA, 1950.

References III

[Li10] Ren-Cang Li, Sharpness in rates of convergence for the symmetric Lanczos method, Mathematics of Computation 79 (2010), no. 269, 419–435.

[MM15] Cameron Musco and Christopher Musco, Randomized block Krylov methods for stronger and faster approximate singular value decomposition, Advances in Neural Information Processing Systems, 2015, pp. 1396–1404.

[Pai71] Christopher Conway Paige, The computation of eigenvalues and eigenvectors of very large sparse matrices, Ph.D. thesis, University of London, 1971.

[RST09] Vladimir Rokhlin, Arthur Szlam, and Mark Tygert, A randomized algorithm for principal component analysis, SIAM Journal on Matrix Analysis and Applications 31 (2009), no. 3, 1100–1124.

References IV

[Saa80] Yousef Saad, On the rates of convergence of the Lanczos and the block-Lanczos methods, SIAM Journal on Numerical Analysis 17 (1980), no. 5, 687–706.

[Saa94] Yousef Saad, Theoretical error bounds and general analysis of a few Lanczos-type algorithms, 1994.

[Und75] Richard Underwood, An iterative block Lanczos method for the solution of large sparse symmetric eigenproblems, Tech. report, 1975.
