Nuclear Talent course: Computational Many-body Methods for Nuclear Physics

Morten Hjorth-Jensen(1), Calvin Johnson(2), Francesco Pederiva(3), and Kevin Schmidt(4)

(1)Michigan State University and University of Oslo (2)San Diego State University (3)University of Trento (4)Arizona State University

ECT*, June 25 - July 13. Lecture slides with exercises for Friday, July 6, 2012

1 / 90 Topics second week, Friday

Friday: diagonalization algorithms

I Householder’s (QR) algorithms for smaller systems, typically with dimensionalities less than 10^5 basis states. We will also discuss Jacobi’s method as a warm-up case.

I Basics of Krylov methods (iterative methods) with Lanczos algorithm for symmetric matrices

I Today’s aim is to build your own Lanczos algorithm using the single-particle basis and Slater determinant basis algorithms you built yesterday. The example we will use is the simple pairing model discussed on the first Monday.

2 / 90 Eigenvalue Solvers

One speaks normally of two main approaches to solving the eigenvalue problem.

The first is the formal method, involving determinants and the characteristic polynomial. This proves how many eigenvalues there are, and is the way most of you learned about how to solve the eigenvalue problem, but for matrices of dimensions greater than 2 or 3, it is rather impractical.

The other general approach is to use similarity or unitary transformations to reduce a matrix to diagonal form. Almost always this is done in two steps: first reduce to, for example, a tridiagonal form, and then to diagonal form. The main algorithms we will discuss in detail, the Householder algorithm (a so-called direct method) and the Lanczos algorithm (an iterative method), follow this methodology.

3 / 90 Diagonalization methods, direct methods

Direct or non-iterative methods require for matrices of dimensionality n × n typically O(n^3) operations. These methods are normally called standard methods and are used for dimensionalities n ∼ 10^5 or smaller. A brief historical overview

Year    n
1950    n = 20      (Wilkinson)
1965    n = 200     (Forsythe et al.)
1980    n = 2000    (LINPACK)
1995    n = 20000   (LAPACK)
2012    n ∼ 10^5    (LAPACK)

shows that in the course of 60 years the dimension that direct diagonalization methods can handle has increased by almost a factor of 10^4. However, it pales beside the progress achieved by computer hardware, from flops to petaflops, a factor of almost 10^15. We see clearly played out in history the O(n^3) bottleneck of direct matrix algorithms. Sloppily speaking, when n ∼ 10^4 is cubed we have O(10^12) operations, which is smaller than the 10^15 increase in flops.

4 / 90 Diagonalization methods

Why iterative methods? If the matrix to diagonalize is large and sparse, direct methods simply become impractical, also because many of the direct methods tend to destroy sparsity. As a result large dense matrices may arise during the diagonalization procedure. The idea behind iterative methods is to project the n-dimensional problem onto smaller spaces, so-called Krylov subspaces. Given a matrix Â and a vector v̂, the associated Krylov sequence of vectors (and thereby subspaces) v̂, Âv̂, Â²v̂, Â³v̂, ..., generates successively larger Krylov subspaces.

Matrix      Âx̂ = b̂                 Âx̂ = λx̂
Â = Â*      Conjugate gradient     Lanczos
Â ≠ Â*      GMRES etc.             Arnoldi

5 / 90 Important Matrix and vector handling packages

The Numerical Recipes codes have been rewritten in Fortran 90/95 and C/C++ by us. The original source codes are taken from the widely used software package LAPACK, which follows two other popular packages developed in the 1970s, namely EISPACK and LINPACK.

I LINPACK: package for linear equations and least squares problems.

I LAPACK: package for solving symmetric, unsymmetric and generalized eigenvalue problems. From LAPACK's website http://www.netlib.org it is possible to download for free all source codes from this library. Both C/C++ and Fortran versions are available.

I BLAS (I, II and III): the Basic Linear Algebra Subprograms are routines that provide standard building blocks for performing basic vector and matrix operations. BLAS I covers vector operations, BLAS II matrix-vector operations and BLAS III matrix-matrix operations. Highly parallelized and efficient codes, all available for download from http://www.netlib.org.

6 / 90 Basic Matrix Features

Matrix Properties Reminder

    ( a_{11}  a_{12}  a_{13}  a_{14} )          ( 1  0  0  0 )
A = ( a_{21}  a_{22}  a_{23}  a_{24} )      I = ( 0  1  0  0 )
    ( a_{31}  a_{32}  a_{33}  a_{34} )          ( 0  0  1  0 )
    ( a_{41}  a_{42}  a_{43}  a_{44} )          ( 0  0  0  1 )

The inverse of a matrix is defined by

A^{-1} A = I

7 / 90 Basic Matrix Features

Matrix Properties Reminder

Relation            Name              Matrix elements
A = A^T             symmetric         a_{ij} = a_{ji}
A = (A^T)^{-1}      real orthogonal   Σ_k a_{ik} a_{jk} = Σ_k a_{ki} a_{kj} = δ_{ij}
A = A^*             real matrix       a_{ij} = a_{ij}^*
A = A^†             hermitian         a_{ij} = a_{ji}^*
A = (A^†)^{-1}      unitary           Σ_k a_{ik} a_{jk}^* = Σ_k a_{ki}^* a_{kj} = δ_{ij}

8 / 90 Some famous Matrices

1. Diagonal if a_{ij} = 0 for i ≠ j

2. Upper triangular if a_{ij} = 0 for i > j

3. Lower triangular if a_{ij} = 0 for i < j

4. Upper Hessenberg if a_{ij} = 0 for i > j + 1

5. Lower Hessenberg if a_{ij} = 0 for i < j + 1

6. Tridiagonal if a_{ij} = 0 for |i − j| > 1

7. Lower banded with bandwidth p if a_{ij} = 0 for i > j + p

8. Upper banded with bandwidth p if a_{ij} = 0 for i < j + p

9. Banded, block upper triangular, block lower triangular, ...

9 / 90 Basic Matrix Features

Some Equivalent Statements

For an N × N matrix A the following properties are all equivalent:
1. If the inverse of A exists, A is nonsingular.
2. The equation Ax = 0 implies x = 0.
3. The rows of A form a basis of R^N.
4. The columns of A form a basis of R^N.
5. A is a product of elementary matrices.
6. 0 is not an eigenvalue of A.

10 / 90 Important Mathematical Operations

The basic matrix operations that we will deal with are addition and subtraction

A = B ± C  ⟹  a_{ij} = b_{ij} ± c_{ij},   (1)

scalar-matrix multiplication

A = γB  ⟹  a_{ij} = γ b_{ij},   (2)

vector-matrix multiplication

y = Ax  ⟹  y_i = Σ_{j=1}^n a_{ij} x_j,   (3)

matrix-matrix multiplication

A = BC  ⟹  a_{ij} = Σ_{k=1}^n b_{ik} c_{kj},   (4)

and transposition

A = B^T  ⟹  a_{ij} = b_{ji}.   (5)
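As a small illustration of Eqs. (3) and (4), here is a minimal C++ sketch of the matrix-vector and matrix-matrix products with the same plain double** storage used in the Jacobi code later in these slides; the function names and loop ordering are choices made only for this example.

// Minimal illustration of y = A x (Eq. 3) and C = A B (Eq. 4) for n x n arrays.
// The arrays are assumed to be allocated elsewhere; this is only a sketch.
void matvec(double **A, double *x, double *y, int n) {
   for (int i = 0; i < n; i++) {
      y[i] = 0.0;
      for (int j = 0; j < n; j++) y[i] += A[i][j]*x[j];
   }
}
void matmat(double **A, double **B, double **C, int n) {
   for (int i = 0; i < n; i++)
      for (int j = 0; j < n; j++) {
         C[i][j] = 0.0;
         for (int k = 0; k < n; k++) C[i][j] += A[i][k]*B[k][j];
      }
}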

11 / 90 Important Mathematical Operations

Similarly, important vector operations that we will deal with are addition and subtraction

x = y ± z  ⟹  x_i = y_i ± z_i,   (6)

scalar-vector multiplication

x = γy  ⟹  x_i = γ y_i,   (7)

vector-vector multiplication (called Hadamard multiplication)

x = yz  ⟹  x_i = y_i z_i,   (8)

the inner or so-called dot product resulting in a constant

x = y^T z  ⟹  x = Σ_{j=1}^n y_j z_j,   (9)

and the outer product, which yields a matrix,

A = y z^T  ⟹  a_{ij} = y_i z_j.   (10)

12 / 90 Eigenvalue Solvers

Let us consider the matrix A of dimension n. The eigenvalues of A are defined through the matrix equation

Ax^{(ν)} = λ^{(ν)} x^{(ν)},

where λ^{(ν)} are the eigenvalues and x^{(ν)} the corresponding eigenvectors. Unless otherwise stated, when we use the word eigenvector we mean the right eigenvector. The left eigenvector is defined as

x_L^{(ν)} A = λ^{(ν)} x_L^{(ν)}.

The above right eigenvector problem is equivalent to a set of n equations with n unknowns x_i.

13 / 90 Eigenvalue Solvers

The eigenvalue problem can be rewritten as

(A − λ^{(ν)} I) x^{(ν)} = 0,

with I being the unity matrix. This equation provides a solution to the problem if and only if the determinant is zero, namely

|A − λ^{(ν)} I| = 0,

which in turn means that the determinant is a polynomial of degree n in λ and in general we will have n distinct zeros.

14 / 90 Eigenvalue Solvers

The eigenvalues of a matrix A ∈ C^{n×n} are thus the n roots of its characteristic polynomial P(λ) = det(λI − A), or

P(λ) = Π_{i=1}^n (λ_i − λ).

The set of these roots is called the spectrum and is denoted as λ(A). If λ(A) = {λ_1, λ_2, ..., λ_n} then we have

det(A) = λ_1 λ_2 ... λ_n,

and if we define the trace of A as

Tr(A) = Σ_{i=1}^n a_{ii}

then Tr(A) = λ_1 + λ_2 + ··· + λ_n.

15 / 90 Abel-Ruffini Impossibility Theorem

The Abel-Ruffini theorem (also known as Abel’s impossibility theorem) states that there is no general solution in radicals to polynomial equations of degree five or higher.

The content of this theorem is frequently misunderstood. It does not assert that higher-degree polynomial equations are unsolvable. In fact, if the polynomial has real or complex coefficients, and we allow complex solutions, then every polynomial equation has solutions; this is the fundamental theorem of algebra. Although these solutions cannot always be computed exactly with radicals, they can be computed to any desired degree of accuracy using numerical methods such as the Newton-Raphson method or Laguerre's method, and in this way they are no different from solutions to polynomial equations of the second, third, or fourth degrees. The theorem only concerns the form that such a solution must take. The content of the theorem is that the solution of a higher-degree equation cannot in all cases be expressed in terms of the polynomial coefficients with a finite number of operations of addition, subtraction, multiplication, division and root extraction. Some polynomials of arbitrary degree, of which the simplest nontrivial example is the monomial equation ax^n = b, are always solvable with a radical.

16 / 90 Abel-Ruffini Impossibility Theorem

The Abel-Ruffini theorem says that there are some fifth-degree equations whose solution cannot be so expressed. The equation x^5 − x + 1 = 0 is an example. Some other fifth-degree equations can be solved by radicals, for example x^5 − x^4 − x + 1 = 0. The precise criterion that distinguishes between those equations that can be solved by radicals and those that cannot was given by Galois and is now part of Galois theory: a polynomial equation can be solved by radicals if and only if its Galois group is a solvable group. Today, in the modern algebraic context, we say that second, third and fourth degree polynomial equations can always be solved by radicals because the symmetric groups S_2, S_3 and S_4 are solvable groups, whereas S_n is not solvable for n ≥ 5.

17 / 90 Eigenvalue Solvers

In the present discussion we assume that our matrix is real and symmetric, that is A ∈ R^{n×n}. The matrix A has n eigenvalues λ_1 ... λ_n (distinct or not). Let D be the diagonal matrix with the eigenvalues on the diagonal

    ( λ_1  0    0    0    ...  0        0   )
    ( 0    λ_2  0    0    ...  0        0   )
D = ( 0    0    λ_3  0    ...  0        0   )
    ( ...  ...  ...  ...  ...  ...      ... )
    ( 0    ...  ...  ...  ...  λ_{n−1}  0   )
    ( 0    ...  ...  ...  ...  0        λ_n )

If A is real and symmetric then there exists a real orthogonal matrix S such that

S^T A S = diag(λ_1, λ_2, ..., λ_n),

and for j = 1 : n we have AS(:, j) = λ_j S(:, j).

18 / 90 Eigenvalue Solvers

To obtain the eigenvalues of A ∈ R^{n×n}, the strategy is to perform a series of similarity transformations on the original matrix A, in order to reduce it either into a diagonal form as above or into a tridiagonal form. We say that a matrix B is a similarity transform of A if

B = S^T A S,  where  S^T S = S^{-1} S = I.

The importance of a similarity transformation lies in the fact that the resulting matrix has the same eigenvalues, but the eigenvectors are in general different.

19 / 90 Eigenvalue Solvers

To prove this we start with the eigenvalue problem and a similarity-transformed matrix B,

Ax = λx  and  B = S^T A S.

We multiply the first equation on the left by S^T and insert S^T S = I between A and x. Then we get

(S^T A S)(S^T x) = λ S^T x,   (11)

which is the same as

B (S^T x) = λ (S^T x).

The variable λ is an eigenvalue of B as well, but with eigenvector S^T x.

20 / 90 Eigenvalue Solvers

The basic philosophy is to

I either apply subsequent similarity transformations (direct method) so that

S_N^T ... S_1^T A S_1 ... S_N = D,   (12)

I or apply subsequent similarity transformations so that A becomes tridiagonal (Householder) or upper/lower triangular (QR method). Thereafter, techniques for obtaining eigenvalues from tridiagonal matrices can be used.

I or use so-called power methods

I or use iterative methods (Krylov, Lanczos, Arnoldi). These methods are popular for huge matrix problems.

21 / 90 Eigenvalue Solvers, simplest possible method, Jacobi

Consider an (n × n) orthogonal transformation matrix

 1 0 ... 0 0 ... 0 0   0 1 ... 0 0 ... 0 0     ...... 0 ...     0 0 ... cos θ 0 ... 0 sin θ  S =    0 0 ... 0 1 ... 0 0     ...... 0 ...   0 0 ... − sin θ 0 ... 1 cos θ  0 0 ... 0 ...... 0 1

with property ST = S−1. It performs a plane rotation around an angle θ in the Euclidean n−dimensional space.

22 / 90 Eigenvalue Solvers, Jacobi

It means that its matrix elements that differ from zero are given by

s_{kk} = s_{ll} = cos θ,  s_{kl} = −s_{lk} = sin θ,  s_{ii} = 1  for i ≠ k, i ≠ l.

A similarity transformation

B = S^T A S

results in

b_{ik} = a_{ik} cos θ − a_{il} sin θ,  i ≠ k, i ≠ l
b_{il} = a_{il} cos θ + a_{ik} sin θ,  i ≠ k, i ≠ l
b_{kk} = a_{kk} cos²θ − 2a_{kl} cos θ sin θ + a_{ll} sin²θ
b_{ll} = a_{ll} cos²θ + 2a_{kl} cos θ sin θ + a_{kk} sin²θ
b_{kl} = (a_{kk} − a_{ll}) cos θ sin θ + a_{kl}(cos²θ − sin²θ)

The angle θ is arbitrary. The recipe is to choose θ so that all non-diagonal matrix elements b_{kl} become zero.

23 / 90 Eigenvalue Solvers, Jacobi

The main idea is thus to reduce systematically the norm of the off-diagonal matrix elements of a matrix A,

off(A) = sqrt( Σ_{i=1}^n Σ_{j=1, j≠i}^n a_{ij}² ).

To demonstrate the algorithm, we consider the simple 2 × 2 similarity transformation of the full matrix. The matrix is symmetric, we single out 1 ≤ k < l ≤ n and use the abbreviations c = cos θ and s = sin θ to obtain

( b_{kk}  0      )   ( c  −s ) ( a_{kk}  a_{kl} ) ( c   s )
( 0       b_{ll} ) = ( s   c ) ( a_{lk}  a_{ll} ) ( −s  c ) .

24 / 90 Eigenvalue Solvers, Jacobi

We require that the non-diagonal matrix elements b_{kl} = b_{lk} = 0, implying that

a_{kl}(c² − s²) + (a_{kk} − a_{ll}) c s = b_{kl} = 0.

If a_{kl} = 0 one sees immediately that cos θ = 1 and sin θ = 0. The Frobenius norm is always preserved under an orthogonal transformation. The Frobenius norm is defined as

||A||_F = sqrt( Σ_{i=1}^n Σ_{j=1}^n |a_{ij}|² ).

This means that for our 2 × 2 case we have

2a_{kl}² + a_{kk}² + a_{ll}² = b_{kk}² + b_{ll}²,

which leads to

off(B)² = ||B||_F² − Σ_{i=1}^n b_{ii}² = off(A)² − 2a_{kl}²,

since

||B||_F² − Σ_{i=1}^n b_{ii}² = ||A||_F² − Σ_{i=1}^n a_{ii}² + (a_{kk}² + a_{ll}² − b_{kk}² − b_{ll}²).

This result means that the matrix A moves closer to diagonal form for each transformation.

25 / 90 Eigenvalue Solvers, Jacobi

Defining the quantities tan θ = t = s/c and

cot 2θ = τ = (a_{ll} − a_{kk}) / (2a_{kl}),

we obtain the quadratic equation (using cot 2θ = (1/2)(cot θ − tan θ))

t² + 2τt − 1 = 0,

resulting in

t = −τ ± sqrt(1 + τ²),

and c and s are easily obtained via

c = 1 / sqrt(1 + t²),

and s = tc. Choosing t to be the smaller of the roots ensures that |θ| ≤ π/4 and has the effect of minimizing the difference between the matrices B and A, since

||B − A||_F² = 4(1 − c) Σ_{i=1, i≠k,l}^n (a_{ik}² + a_{il}²) + 2a_{kl}²/c².

26 / 90 Eigenvalue Solvers, Jacobi algo

I Choose a tolerance ε, making it a small number, typically 10^{-8} or smaller.

I Set up a while-test where one compares the norm of the newly computed off-diagonal matrix elements,

off(A) = sqrt( Σ_{i=1}^n Σ_{j=1, j≠i}^n a_{ij}² ) > ε.

I Now choose the matrix element a_{kl} with the largest value, that is |a_{kl}| = max_{i≠j} |a_{ij}|.

I Compute thereafter τ = (a_{ll} − a_{kk})/(2a_{kl}), tan θ, cos θ and sin θ.

I Compute thereafter the similarity transformation for this set of values (k, l), obtaining the new matrix B = S(k, l, θ)^T A S(k, l, θ).

I Compute the new norm of the off-diagonal matrix elements and continue until you have satisfied off(B) ≤ ε.

The convergence rate of the Jacobi method is however poor; one needs typically between 3n² and 5n² rotations, and each rotation requires 4n operations, resulting in a total of 12n³-20n³ operations in order to zero out the non-diagonal matrix elements.

27 / 90 Jacobi’s method, an example to convince you about the algorithm

We specialize to a symmetric 3 × 3 matrix A. We start the process as follows (assuming that a_{23} = a_{32} is the largest non-diagonal element) with c = cos θ and s = sin θ

    ( 1  0   0 ) ( a_{11}  a_{12}  a_{13} ) ( 1   0  0 )
B = ( 0  c  −s ) ( a_{21}  a_{22}  a_{23} ) ( 0   c  s ) .
    ( 0  s   c ) ( a_{31}  a_{32}  a_{33} ) ( 0  −s  c )

We will choose the angle θ in order to have a_{23} = a_{32} = 0. We get (symmetric matrix)

    ( a_{11}             a_{12}c − a_{13}s                      a_{12}s + a_{13}c                     )
B = ( a_{12}c − a_{13}s  a_{22}c² + a_{33}s² − 2a_{23}sc        (a_{22} − a_{33})sc + a_{23}(c² − s²) ) .
    ( a_{12}s + a_{13}c  (a_{22} − a_{33})sc + a_{23}(c² − s²)  a_{22}s² + a_{33}c² + 2a_{23}sc       )

Note that a_{11} is unchanged! As it should be.

28 / 90 Jacobi’s method, an example to convince you about the algorithm

We have

    ( a_{11}             a_{12}c − a_{13}s                      a_{12}s + a_{13}c                     )
B = ( a_{12}c − a_{13}s  a_{22}c² + a_{33}s² − 2a_{23}sc        (a_{22} − a_{33})sc + a_{23}(c² − s²) ) ,
    ( a_{12}s + a_{13}c  (a_{22} − a_{33})sc + a_{23}(c² − s²)  a_{22}s² + a_{33}c² + 2a_{23}sc       )

or

b_{11} = a_{11}
b_{12} = a_{12} cos θ − a_{13} sin θ,  1 ≠ 2, 1 ≠ 3
b_{13} = a_{13} cos θ + a_{12} sin θ,  1 ≠ 2, 1 ≠ 3
b_{22} = a_{22} cos²θ − 2a_{23} cos θ sin θ + a_{33} sin²θ
b_{33} = a_{33} cos²θ + 2a_{23} cos θ sin θ + a_{22} sin²θ
b_{23} = (a_{22} − a_{33}) cos θ sin θ + a_{23}(cos²θ − sin²θ)

We will fix the angle θ so that b_{23} = 0.

29 / 90 Jacobi’s method, an example to convince you about the algorithm

We get then a new matrix

    ( b_{11}  b_{12}  b_{13} )
B = ( b_{12}  b_{22}  0      ) .
    ( b_{13}  0       b_{33} )

We repeat then, assuming that b_{12} is the largest non-diagonal matrix element, and get a new matrix

    ( c  −s  0 ) ( b_{11}  b_{12}  b_{13} ) ( c   s  0 )
C = ( s   c  0 ) ( b_{12}  b_{22}  0      ) ( −s  c  0 ) .
    ( 0   0  1 ) ( b_{13}  0       b_{33} ) ( 0   0  1 )

We continue this process until all non-diagonal matrix elements are zero (ideally). You will notice that, in performing the above operations, the matrix element b_{23}, which was previously zero, becomes different from zero. This is one of the problems which slows down the Jacobi procedure.

30 / 90 Jacobi’s method, an example to convince you about the algorithm

The more general expressions for the new matrix elements are

b_{ii} = a_{ii},  i ≠ k, i ≠ l
b_{ik} = a_{ik} cos θ − a_{il} sin θ,  i ≠ k, i ≠ l
b_{il} = a_{il} cos θ + a_{ik} sin θ,  i ≠ k, i ≠ l
b_{kk} = a_{kk} cos²θ − 2a_{kl} cos θ sin θ + a_{ll} sin²θ
b_{ll} = a_{ll} cos²θ + 2a_{kl} cos θ sin θ + a_{kk} sin²θ
b_{kl} = (a_{kk} − a_{ll}) cos θ sin θ + a_{kl}(cos²θ − sin²θ)

This is what we will need to code.

31 / 90 Jacobi code example

Main part

// We have defined a matrix A and a matrix R for the eigenvectors, both of dimension n x n.
// The final matrix R contains the eigenvectors; it is initialized to the identity matrix
// (one on the diagonal, zero elsewhere).
....
double tolerance = 1.0E-10;
int iterations = 0;
while ( maxnondiag > tolerance && iterations <= maxiter ) {
   int p, q;
   maxnondiag = offdiag(A, p, q, n);
   Jacobi_rotate(A, R, p, q, n);
   iterations++;
}
...

32 / 90 Jacobi code example

Finding the max nondiagonal element

// the offdiag function; p and q are returned by reference (needs <cmath> for fabs)
double offdiag(double **A, int &p, int &q, int n)
{
   double max = 0.0;
   for ( int i = 0; i < n; ++i ) {
      for ( int j = i+1; j < n; ++j ) {
         double aij = fabs(A[i][j]);
         if ( aij > max ) {
            max = aij;  p = i;  q = j;
         }
      }
   }
   return max;
}
...

33 / 90 Jacobi code example

Finding the new matrix elements

void Jacobi_rotate(double **A, double **R, int k, int l, int n)
{
   double s, c;
   if ( A[k][l] != 0.0 ) {
      double t, tau;
      tau = (A[l][l] - A[k][k])/(2*A[k][l]);

      if ( tau >= 0 ) {
         t = 1.0/(tau + sqrt(1.0 + tau*tau));
      } else {
         t = -1.0/(-tau + sqrt(1.0 + tau*tau));
      }

      c = 1/sqrt(1+t*t);
      s = c*t;
   } else {
      c = 1.0;
      s = 0.0;
   }

34 / 90 Jacobi code example

   double a_kk, a_ll, a_ik, a_il, r_ik, r_il;
   a_kk = A[k][k];
   a_ll = A[l][l];
   A[k][k] = c*c*a_kk - 2.0*c*s*A[k][l] + s*s*a_ll;
   A[l][l] = s*s*a_kk + 2.0*c*s*A[k][l] + c*c*a_ll;
   A[k][l] = 0.0;   // hard-coding non-diagonal elements by hand
   A[l][k] = 0.0;   // same here
   for ( int i = 0; i < n; i++ ) {
      if ( i != k && i != l ) {
         a_ik = A[i][k];
         a_il = A[i][l];
         A[i][k] = c*a_ik - s*a_il;
         A[k][i] = A[i][k];
         A[i][l] = c*a_il + s*a_ik;
         A[l][i] = A[i][l];
      }

35 / 90 Jacobi code example

and finally the new eigenvectors

      r_ik = R[i][k];
      r_il = R[i][l];
      R[i][k] = c*r_ik - s*r_il;
      R[i][l] = c*r_il + s*r_ik;
   }
   return;
}  // end of function Jacobi_rotate

36 / 90 Eigenvalue Solvers, Householder

The first step consists in finding an orthogonal matrix S which is the product of (n − 2) orthogonal matrices, S = S_1 S_2 ... S_{n−2}, each of which successively transforms one row and one column of A into the required tridiagonal form. Only n − 2 transformations are required, since the last two elements are already in tridiagonal form. In order to determine each S_i let us see what happens after the first multiplication, namely

              ( a_{11}  e_1      0        0    ...  a'_{2n} after row 2 )
S_1^T A S_1 = ( e_1     a'_{22}  a'_{23}  ...  ...  a'_{2n} )
              ( 0       a'_{32}  a'_{33}  ...  ...  a'_{3n} )
              ( 0       ...      ...      ...  ...  ...     )
              ( 0       a'_{n2}  a'_{n3}  ...  ...  a'_{nn} )

where the first row and column are (a_{11}, e_1, 0, ..., 0) and the primed quantities represent a matrix A' of dimension n − 1 which will subsequently be transformed by S_2.

37 / 90 Eigenvalue Solvers, Householder

The factor e_1 is a possibly non-vanishing element. The next transformation, produced by S_2, has the same effect as S_1, but now on the submatrix A' only:

                        ( a_{11}  e_1      0         0    ...  0        )
                        ( e_1     a'_{22}  e_2       0    ...  0        )
(S_1 S_2)^T A (S_1 S_2) = ( 0       e_2      a''_{33}  ...  ...  a''_{3n} )
                        ( 0       ...      ...       ...  ...  ...      )
                        ( 0       0        a''_{n3}   ...  ...  a''_{nn} )

Note that the effective size of the matrix on which we apply the transformation reduces for every new step. In the previous Jacobi method each similarity transformation is in principle performed on the full size of the original matrix.

38 / 90 Eigenvalue Solvers, Householder

After a series of such transformations, we end up with a set of diagonal matrix elements

a_{11}, a'_{22}, a''_{33}, ..., a^{(n−1)}_{nn},

and off-diagonal matrix elements

e_1, e_2, e_3, ..., e_{n−1}.

The resulting matrix reads

          ( a_{11}  e_1      0         0     ...  0                     0              )
          ( e_1     a'_{22}  e_2       0     ...  0                     0              )
S^T A S = ( 0       e_2      a''_{33}  e_3   ...  0                     0              )
          ( ...     ...      ...       ...   ...  ...                   ...            )
          ( 0       ...      ...       ...   ...  a^{(n−2)}_{n−1,n−1}   e_{n−1}        )
          ( 0       ...      ...       ...   ...  e_{n−1}               a^{(n−1)}_{nn} )

39 / 90 Eigenvalue Solvers, Householder

It remains to find a recipe for determining the transformations S_n. We illustrate the method for S_1, which we assume takes the form

S_1 = ( 1  0^T )
      ( 0  P   ),

with 0^T being a zero row vector, 0^T = {0, 0, ···}, of dimension (n − 1). The matrix P is symmetric with dimension ((n − 1) × (n − 1)), satisfying P² = I and P^T = P. A possible choice which fulfils the latter two requirements is

P = I − 2uu^T,

where I is the (n − 1) unity matrix and u is an n − 1 column vector with unit norm, u^T u = 1 (inner product).

40 / 90 Eigenvalue Solvers, Householder

Note that uu^T is an outer product giving a matrix of dimension ((n − 1) × (n − 1)). Each matrix element of P then reads

P_{ij} = δ_{ij} − 2u_i u_j,

where i and j range from 1 to n − 1. Applying the transformation S_1 results in

S_1^T A S_1 = ( a_{11}  (Pv)^T )
              ( Pv      A'     ),

where v^T = {a_{21}, a_{31}, ···, a_{n1}} and P must satisfy (Pv)^T = {k, 0, 0, ···}. Then

Pv = v − 2u(u^T v) = ke,   (13)

with e^T = {1, 0, 0, ..., 0}.

41 / 90 Eigenvalue Solvers, Householder

Solving the latter equation gives us u and thus the needed transformation P. We do first, however, need to compute the scalar k by taking the scalar product of the last equation with its transpose and using the fact that P² = I. We then get

(Pv)^T Pv = k² = v^T v = |v|² = Σ_{i=2}^n a_{i1}²,

which determines the constant k = ±|v|.

42 / 90 Eigenvalue Solvers, Householder

Now we can rewrite Eq. (13) as

v − ke = 2u(u^T v),

and taking the scalar product of this equation with itself we obtain

2(u^T v)² = |v|² ± a_{21}|v|,   (14)

which finally determines

u = (v − ke) / (2(u^T v)).

In solving Eq. (14) great care has to be exercised so as to choose those values which make the right-hand side largest, in order to avoid loss of numerical precision. The above steps are then repeated for every transformation until we have a tridiagonal matrix suitable for obtaining the eigenvalues.
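To make the recipe concrete, here is a minimal C++ sketch (not the library routine used later) of how one could construct the Householder vector u for the first column, following Eqs. (13) and (14); the sign of k is chosen so that the right-hand side of Eq. (14) is largest. The function name and the use of std::vector are choices made only for this illustration.

#include <cmath>
#include <vector>

// Given v = (a_21, ..., a_n1), build a unit vector u such that P = I - 2 u u^T
// maps v onto (k, 0, ..., 0)^T; k is returned through the reference argument.
std::vector<double> householder_vector(const std::vector<double>& v, double& k) {
   int m = v.size();                          // m = n - 1
   double norm = 0.0;
   for (int i = 0; i < m; i++) norm += v[i]*v[i];
   norm = std::sqrt(norm);                    // |v|
   // choose the sign of k opposite to v[0] to avoid cancellation (largest right-hand side)
   k = (v[0] >= 0.0 ? -norm : norm);
   std::vector<double> u(v);
   u[0] -= k;                                 // u proportional to v - k e
   double unorm = 0.0;
   for (int i = 0; i < m; i++) unorm += u[i]*u[i];
   unorm = std::sqrt(unorm);
   if (unorm > 0.0)
      for (int i = 0; i < m; i++) u[i] /= unorm;   // normalize so that u^T u = 1
   return u;
}

With this u the reflection P = I − 2uu^T maps v onto (k, 0, ..., 0)^T; applying S_1 from the left and right then produces the transformed matrix of the previous slides.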

43 / 90 Eigenvalue Solvers, Householder, brute force

Our Householder transformation has given us a tridiagonal matrix. We discuss here how one can use Jacobi’s iterative procedure to obtain the eigenvalues. Let us specialize to a 4 × 4 matrix. The tridiagonal matrix takes the form

    ( d_1  e_1  0    0   )
A = ( e_1  d_2  e_2  0   ) .
    ( 0    e_2  d_3  e_3 )
    ( 0    0    e_3  d_4 )

As a first observation, if any of the elements e_i are zero, the matrix can be separated into smaller pieces before diagonalization. Specifically, if e_1 = 0 then d_1 is an eigenvalue.

44 / 90 Eigenvalue Solvers, Householder

Thus, let us introduce a transformation S_1 which operates like

      ( cos θ   0  0  sin θ )
S_1 = ( 0       1  0  0     )
      ( 0       0  1  0     )
      ( −sin θ  0  0  cos θ )

45 / 90 Eigenvalue Solvers, Householder

Then the similarity transformation

                   ( d'_1  e'_1  0     0    )
S_1^T A S_1 = A' = ( e'_1  d_2   e_2   0    )
                   ( 0     e_2   d_3   e'_3 )
                   ( 0     0     e'_3  d'_4 )

produces a matrix where the primed elements in A' have been changed by the transformation, whereas the unprimed elements are unchanged. If we now choose θ to give the element a'_{21} = e'_1 = 0, then we have the first eigenvalue = a'_{11} = d'_1. (This is actually what you are doing in project 2!!)

46 / 90 Eigenvalue Solvers, Householder’s and Francis’ algorithm

This procedure can be continued on the remaining three-dimensional submatrix for the next eigenvalue. Thus after a few transformations we have the wanted diagonal form. What we see here is just a special case of the more general procedure developed by Francis in two articles in 1961 and 1962. Using Jacobi’s method is not very efficient either. The algorithm is based on the so-called QR method (or just QR-algorithm). It follows from a theorem by Schur which states that any square matrix can be written out in terms of an orthogonal matrix Q̂ and an upper triangular matrix Û. Historically R was used instead of U since the wording right triangular matrix was first used.

47 / 90 Eigenvalue Solvers, Householder’s and Francis’ algorithm

The method is based on an iterative procedure similar to Jacobi’s method, by a succession of planar rotations. For a tridiagonal matrix it is simple to carry out in principle, but complicated in detail! Schur’s theorem, Â = Q̂Û, is used to rewrite any square matrix as an orthogonal (or unitary) matrix times an upper triangular matrix. We say that a square matrix is similar to a triangular matrix. Householder’s algorithm which we have derived is just a special case of the general Householder algorithm. For a symmetric square matrix we obtain a tridiagonal matrix. There is a corollary to Schur’s theorem which states that every normal matrix (in particular every hermitian matrix) is unitarily similar to a diagonal matrix.

48 / 90 Eigenvalue Solvers, Householder’s and Francis’ algorithm

It follows that we can define a new matrix

ÂQ̂ = Q̂ÛQ̂,

and multiplying from the left with Q̂^{-1} we get

Q̂^{-1}ÂQ̂ = B̂ = ÛQ̂,

where the matrix B̂ is a similarity transformation of Â and has the same eigenvalues as Â.

49 / 90 Eigenvalue Solvers, Householder’s and Francis’ algorithm

Suppose Â is the tridiagonal matrix we obtained after the Householder transformation,

Â = Q̂Û,

and multiply from the left with Q̂^{-1}, resulting in

Q̂^{-1}Â = Û.

Suppose that Q̂ consists of a series of planar Jacobi-like rotations acting on sub-blocks of Â so that all elements below the diagonal are zeroed out,

Q̂ = R̂_{12} R̂_{23} ... R̂_{n−1,n}.

50 / 90 Eigenvalue Solvers, Householder’s and Francis’ algorithm

A transformation of the type R̂_{12} looks like

         ( c   s  0  0  0  ...  0  0  0 )
         ( −s  c  0  0  0  ...  0  0  0 )
         ( 0   0  1  0  0  ...  0  0  0 )
R̂_{12} = ( ...                          )
         ( 0   0  0  0  0  ...  1  0  0 )
         ( 0   0  0  0  0  ...  0  1  0 )
         ( 0   0  0  0  0  ...  0  0  1 )

51 / 90 Eigenvalue Solvers, Householder’s and Francis’ algorithm

The matrix Û then takes the form

    ( x  x  x  0  0  ...  0  0  0 )
    ( 0  x  x  x  0  ...  0  0  0 )
    ( 0  0  x  x  x  ...  0  0  0 )
Û = ( ...                         )
    ( 0  0  0  0  0  ...  x  x  x )
    ( 0  0  0  0  0  ...  0  x  x )
    ( 0  0  0  0  0  ...  0  0  x )

which has a second superdiagonal.

52 / 90 Eigenvalue Solvers, Householder’s and Francis’ algorithm

We have now found Q̂ and Û, and this allows us to find the matrix B̂ which is, due to Schur’s theorem, unitarily similar to a triangular matrix (upper in our case). Since we have Q̂^{-1}ÂQ̂ = B̂, from Schur’s theorem the matrix B̂ is triangular, and its eigenvalues are the same as those of Â and are given by the diagonal matrix elements of B̂. Why? Our matrix is B̂ = ÛQ̂.

53 / 90 Eigenvalue Solvers, Householder functions

The functions which perform these transformations are

matrix A −→ tridiagonal matrix −→ diagonal matrix

C++:     void tred2(double **a, int n, double d[], double e[])
         void tqli(double d[], double e[], int n, double **z)
Fortran: CALL tred2(a, n, d, e)
         CALL tqli(d, e, n, z)
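A minimal usage sketch, assuming tred2 and tqli follow the Numerical Recipes-style conventions behind the declarations above (the matrix handed to tred2 returns the accumulated transformation, which is then passed to tqli so that its columns become the eigenvectors); the dimension and allocation details are invented for this example.

// Assumed external declarations (from the course library, as on the slide above)
void tred2(double **a, int n, double d[], double e[]);
void tqli(double d[], double e[], int n, double **z);

int main()
{
   const int n = 5;                       // example dimension
   double **a = new double*[n];
   for (int i = 0; i < n; i++) a[i] = new double[n];
   // ... fill a with the symmetric matrix to diagonalize ...
   double *d = new double[n];             // returns the diagonal, later the eigenvalues
   double *e = new double[n];             // work array for the off-diagonal elements
   tred2(a, n, d, e);                     // Householder reduction to tridiagonal form
   tqli(d, e, n, a);                      // eigenvalues in d, eigenvectors in the columns of a
   // ... use d[i] and the eigenvector stored in column i of a, then free the memory ...
   return 0;
}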

54 / 90 Using Lapack to solve linear algebra and eigenvalue problems

Suppose you want to solve a general system of linear equations Âx = b, where Â is an n × n square matrix and x and b are n-element column vectors. You opt to use the routine dgesv. The man page abstract is obtained with

man dgesv

if Lapack is properly installed. The C++ declaration is

void dgesv(int n, int nrhs, double *da, int lda, int *ipivot, double *db, int ldb, int *info) ;

55 / 90 Lapack example

#include <iostream>
#define MAX 10
using namespace std;

int main()
{
   // Values needed for dgesv
   int n;
   int nrhs = 1;
   double a[MAX][MAX];
   double b[1][MAX];
   int lda = MAX;
   int ldb = MAX;
   int ipiv[MAX];
   int info;
   // Other values
   int i, j;

56 / 90 Lapack example

   // Read the values of the matrix
   cout << "Enter n \n";
   cin >> n;
   cout << "On each line type a row of the matrix A followed by one element of b:\n";
   for ( i = 0; i < n; i++ ) {
      cout << "row " << i << " ";
      for ( j = 0; j < n; j++ ) cin >> a[j][i];   // store A(i,j) at a[j][i]: LAPACK expects Fortran (column-major) ordering
      cin >> b[0][i];
   }

57 / 90 Lapack example

   // Solve the linear system
   dgesv(n, nrhs, &a[0][0], lda, ipiv, &b[0][0], ldb, &info);
   // Check for success
   if ( info == 0 ) {
      // Write the answer
      cout << "The answer is\n";
      for ( i = 0; i < n; i++ )
         cout << "b[" << i << "]\t" << b[0][i] << "\n";
   } else {
      // Write an error message
      cerr << "dgesv returned error " << info << "\n";
   }
   return info;
}

58 / 90 Eigenvalue Solvers, Power methods

We assume Â can be diagonalized. Let λ_1, λ_2, ..., λ_n be the n eigenvalues (counted with multiplicity) of Â and let v_1, v_2, ..., v_n be the corresponding eigenvectors. We assume that λ_1 is the dominant eigenvalue, so that |λ_1| > |λ_j| for j > 1. The initial vector b_0 can be written:

b_0 = c_1 v_1 + c_2 v_2 + ··· + c_m v_m.

Then

A^k b_0 = c_1 A^k v_1 + c_2 A^k v_2 + ··· + c_m A^k v_m
        = c_1 λ_1^k v_1 + c_2 λ_2^k v_2 + ··· + c_m λ_m^k v_m
        = c_1 λ_1^k ( v_1 + (c_2/c_1)(λ_2/λ_1)^k v_2 + ··· + (c_m/c_1)(λ_m/λ_1)^k v_m ).

The expression within parentheses converges to v_1 because |λ_j/λ_1| < 1 for j > 1. On the other hand, we have

b_k = A^k b_0 / ||A^k b_0||.

Therefore, b_k converges to (a multiple of) the eigenvector v_1. The convergence is geometric, with ratio

|λ_2 / λ_1|,

where λ_2 denotes the second dominant eigenvalue. Thus, the method converges slowly if there is an eigenvalue close in magnitude to the dominant eigenvalue.

59 / 90 Eigenvalue Solvers, Power methods

Under the assumptions:

I A has an eigenvalue that is strictly greater in magnitude than its other eigenvalues

I The starting vector b0 has a nonzero component in the direction of an eigenvector associated with the dominant eigenvalue. then:

I A subsequence of (bk ) converges to an eigenvector associated with the dominant eigenvalue

Note that the sequence (b_k) does not necessarily converge. It can be shown that b_k = e^{iφ_k} v_1 + r_k, where v_1 is an eigenvector associated with the dominant eigenvalue and ||r_k|| → 0. The presence of the term e^{iφ_k} implies that (b_k) does not converge unless e^{iφ_k} = 1. Under the two assumptions listed above, the sequence (μ_k) defined by

μ_k = b_k^* A b_k / (b_k^* b_k)

converges to the dominant eigenvalue.
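A minimal C++ sketch of the power iteration just described, for a dense symmetric matrix stored as double**; the names, the fixed number of iterations and the use of the Rayleigh quotient μ_k are choices made only for this illustration.

#include <cmath>

// b is the current (normalized) vector, work is a scratch vector of length n;
// returns the estimate of the dominant eigenvalue after maxiter iterations.
double power_iteration(double **A, double *b, double *work, int n, int maxiter)
{
   double mu = 0.0;
   for (int iter = 0; iter < maxiter; iter++) {
      // work = A b
      for (int i = 0; i < n; i++) {
         work[i] = 0.0;
         for (int j = 0; j < n; j++) work[i] += A[i][j]*b[j];
      }
      // Rayleigh quotient mu = b^T A b / b^T b  (b is kept normalized, so b^T b = 1)
      mu = 0.0;
      for (int i = 0; i < n; i++) mu += b[i]*work[i];
      // b = A b / ||A b||
      double norm = 0.0;
      for (int i = 0; i < n; i++) norm += work[i]*work[i];
      norm = std::sqrt(norm);
      for (int i = 0; i < n; i++) b[i] = work[i]/norm;
   }
   return mu;
}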

60 / 90 Eigenvalue Solvers, Power methods

Power iteration is not used very much because it can find only the dominant eigenvalue. The algorithm is however very useful in some specific cases. For instance, Google uses it to calculate the page rank of documents in their search engine. For matrices that are well-conditioned and as sparse as the web matrix, the method can be more efficient than other methods of finding the dominant eigenvector. Some of the more advanced eigenvalue algorithms can be understood as variations of the power iteration. For instance, the inverse iteration method applies power iteration to the matrix Â^{-1}. Other algorithms look at the whole subspace generated by the vectors b_k. This subspace is known as the Krylov subspace. It can be computed by the Arnoldi iteration or the Lanczos iteration. The latter is the method of choice for diagonalizing symmetric matrices with huge dimensionalities.

61 / 90 Lanczos’ iteration

Basic features, with a real symmetric matrix (and normally huge, n > 10^6, and sparse) Â of dimension n × n:

I Lanczos’ algorithm generates a sequence of real tridiagonal matrices T_k of dimension k × k with k ≤ n, with the property that the extremal eigenvalues of T_k are progressively better estimates of Â's extremal eigenvalues.

I The method converges to the extremal eigenvalues.

I The similarity transformation is

T̂ = Q̂^T Â Q̂,

with the first vector Q̂ê_1 = q̂_1.

62 / 90 Lanczos’ iteration

We are going to solve iteratively

T̂ = Q̂^T Â Q̂,

with the first vector Q̂ê_1 = q̂_1. We can write out the matrix Q̂ in terms of its column vectors

Q̂ = [ q̂_1 q̂_2 ... q̂_n ].

63 / 90 Lanczos’ iteration

The matrix

T̂ = Q̂^T Â Q̂

can be written as

     ( α_1  β_1  0    ...      ...      0       )
     ( β_1  α_2  β_2  0        ...      0       )
T̂ =  ( 0    β_2  α_3  β_3      ...      0       )
     ( ...  ...  ...  ...      ...      0       )
     ( ...  ...  ...  β_{n−2}  α_{n−1}  β_{n−1} )
     ( 0    ...  ...  0        β_{n−1}  α_n     )

64 / 90 Lanczos’ iteration

Using the fact that Q̂Q̂^T = Î, we can rewrite

T̂ = Q̂^T Â Q̂

as

Q̂T̂ = ÂQ̂,

and if we equate columns (recall the form of T̂ from the previous slide)

     ( α_1  β_1  0    ...      ...      0       )
     ( β_1  α_2  β_2  0        ...      0       )
T̂ =  ( 0    β_2  α_3  β_3      ...      0       )
     ( ...  ...  ...  ...      ...      0       )
     ( ...  ...  ...  β_{n−2}  α_{n−1}  β_{n−1} )
     ( 0    ...  ...  0        β_{n−1}  α_n     )

we obtain

Âq̂_k = β_{k−1} q̂_{k−1} + α_k q̂_k + β_k q̂_{k+1}.

65 / 90 Lanczos’ iteration

We have thus

Âq̂_k = β_{k−1} q̂_{k−1} + α_k q̂_k + β_k q̂_{k+1},

with β_0 q̂_0 = 0 for k = 1 : n − 1. Remember that the vectors q̂_k are orthonormal, and this implies

α_k = q̂_k^T Â q̂_k;

these vectors are called Lanczos vectors.

66 / 90 Lanczos’ iteration, the algorithm

We have thus

Âq̂_k = β_{k−1} q̂_{k−1} + α_k q̂_k + β_k q̂_{k+1},

with β_0 q̂_0 = 0 for k = 1 : n − 1 and

α_k = q̂_k^T Â q̂_k.

If

r̂_k = (Â − α_k Î) q̂_k − β_{k−1} q̂_{k−1}

is non-zero, then

q̂_{k+1} = r̂_k / β_k,

with β_k = ±||r̂_k||_2.

67 / 90 A simple implementation of the Lanczos algorithm

r_0 = q_1; beta_0 = 1; q_0 = 0; int k = 0;
while (beta_k != 0)
   q_{k+1} = r_k/beta_k
   k = k+1
   alpha_k = q_k^T A q_k
   r_k = (A - alpha_k I) q_k - beta_{k-1} q_{k-1}
   beta_k = || r_k ||_2
end while
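A corresponding C++ sketch of the same loop for a dense symmetric matrix, keeping only the three vectors the recursion needs and the tridiagonal coefficients alpha[k] and beta[k]; the dense matrix-vector product and all names are choices of this illustration (in a shell-model code the product H q would instead be carried out in the Slater-determinant basis).

#include <cmath>
#include <vector>

// Builds the k_max x k_max Lanczos tridiagonal matrix of a symmetric n x n matrix A,
// starting from a normalized vector q1. No reorthogonalization is done here; see the
// discussion later in these slides.
void lanczos(double **A, const std::vector<double>& q1, int n, int k_max,
             std::vector<double>& alpha, std::vector<double>& beta)
{
   std::vector<double> q_prev(n, 0.0), q(q1), r(n);
   alpha.assign(k_max, 0.0);
   beta.assign(k_max, 0.0);
   for (int k = 0; k < k_max; k++) {
      // r = A q
      for (int i = 0; i < n; i++) {
         r[i] = 0.0;
         for (int j = 0; j < n; j++) r[i] += A[i][j]*q[j];
      }
      // alpha_k = q^T A q
      for (int i = 0; i < n; i++) alpha[k] += q[i]*r[i];
      // r = (A - alpha_k I) q - beta_{k-1} q_{k-1}
      double beta_prev = (k > 0 ? beta[k-1] : 0.0);
      for (int i = 0; i < n; i++) r[i] -= alpha[k]*q[i] + beta_prev*q_prev[i];
      // beta_k = || r ||_2
      double norm = 0.0;
      for (int i = 0; i < n; i++) norm += r[i]*r[i];
      beta[k] = std::sqrt(norm);
      if (beta[k] == 0.0) break;            // invariant subspace found
      // q_{k+1} = r / beta_k
      for (int i = 0; i < n; i++) { q_prev[i] = q[i]; q[i] = r[i]/beta[k]; }
   }
}

The eigenvalues of the small tridiagonal matrix with diagonal alpha and off-diagonal beta can then be obtained with, for example, the tqli routine discussed earlier.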

68 / 90 Lanczos’ iteration

Outline of the algorithm:

I We choose an initial Lanczos vector |lanc_0⟩ as the zeroth-order approximation to the lowest eigenvector. Our experience is that any reasonable choice is acceptable as long as the vector does not have special properties such as good angular momentum. That would usually terminate the iteration process at too early a stage.

I The next step involves generating a new vector through the process |new_{p+1}⟩ = H|lanc_p⟩. Throughout this process we construct the energy matrix elements of H in this Lanczos basis. The diagonal matrix elements of H are obtained by

⟨lanc_p|H|lanc_p⟩ = ⟨lanc_p|new_{p+1}⟩.   (15)

69 / 90 Lanczos’ iteration

I The new vector |new_{p+1}⟩ is then orthogonalized to all previously calculated Lanczos vectors,

|new'_{p+1}⟩ = |new_{p+1}⟩ − |lanc_p⟩⟨lanc_p|new_{p+1}⟩ − Σ_{q=0}^{p−1} |lanc_q⟩⟨lanc_q|new_{p+1}⟩,   (16)

and finally normalized,

|lanc_{p+1}⟩ = (1 / sqrt(⟨new'_{p+1}|new'_{p+1}⟩)) |new'_{p+1}⟩,   (17)

to produce a new Lanczos vector.

70 / 90 Lanczos’ iteration

I The off-diagonal matrix elements of H are calculated by

⟨lanc_{p+1}|H|lanc_p⟩ = sqrt(⟨new'_{p+1}|new'_{p+1}⟩),   (18)

and all others are zero.

I After p iterations we have an energy matrix of the form

( H_{0,0}  H_{0,1}  0        ···        0         )
( H_{0,1}  H_{1,1}  H_{1,2}  ···        0         )
( 0        H_{2,1}  H_{2,2}  ···        0         )   (19)
( ...      ...      ...      ...        H_{p−1,p} )
( 0        0        0        H_{p,p−1}  H_{p,p}   )

as the p'th approximation to the eigenvalue problem.

71 / 90 Lanczos’ iteration

The number p is a reasonably small number and we can diagonalize the matrix by standard methods to obtain eigenvalues and eigenvectors which are linear combinations of the Lanczos vectors.

I This process is repeated until a suitable convergence criterion has been reached. In this method each Lanczos vector is a linear combination of the basis Slater determinants |SD⟩, whose number is the dimension n. For large values of n we note one of the important difficulties associated with the Lanczos method: large disk storage is needed when the number of Lanczos vectors exceeds ≈ 100.

72 / 90 Problems with the Lanczos’ iteration procedure

I The main CPU time-consuming process is

H|q_i⟩ = |p⟩,

due to the large number of non-diagonal matrix elements,

H|q_i⟩ = Σ_{ν,μ} C_ν^i ⟨SD_μ|H|SD_ν⟩ |SD_μ⟩.

However, each individual matrix element is easy to calculate.

I Examples:

Type               122Sn            116Sn
Dimension          ≈ 2 · 10^6       ≈ 16 · 10^6
non-diag. elem.    ≈ 4.3 · 10^8     ≈ 1.2 · 10^9

73 / 90 Problems with Lanczos iteration

I The matrix elements are needed at every Lanczos iteration. There are too many non-diagonal two-particle matrix elements to calculate and save, so they must be recalculated at each iteration.

I Numerical roundoff errors require orthogonalization of all Lanczos vectors. All Lanczos vectors may be kept during the process (a sketch of this reorthogonalization step is given after this list). Slow convergence requires a large number of Lanczos vectors (≥ 100).

I Due to numerical roundoff errors, symmetry properties like the angular momentum J are destroyed during the process. However, the final converged eigenvectors have the symmetry properties given by the total Hamiltonian H.
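A minimal sketch of that reorthogonalization step, assuming all previous Lanczos vectors are kept in memory as rows of a matrix; this is a simple Gram-Schmidt pass as in Eqs. (16) and (17), with names chosen only for this illustration.

#include <cmath>
#include <vector>

// Orthogonalize the new vector against the Lanczos vectors lanc[0..p] (each of length n)
// and normalize it; assumes the new vector is not already in the span of the old ones.
void reorthogonalize(std::vector<std::vector<double> >& lanc, std::vector<double>& newvec, int p, int n)
{
   for (int q = 0; q <= p; q++) {
      double overlap = 0.0;
      for (int i = 0; i < n; i++) overlap += lanc[q][i]*newvec[i];
      for (int i = 0; i < n; i++) newvec[i] -= overlap*lanc[q][i];
   }
   double norm = 0.0;
   for (int i = 0; i < n; i++) norm += newvec[i]*newvec[i];
   norm = std::sqrt(norm);
   for (int i = 0; i < n; i++) newvec[i] /= norm;
}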

74 / 90 Homework

HOMEWORK!

75 / 90 Exercise from Monday, 25 June 2012, simple pairing Hamiltonian

This exercise is particularly relevant for the shell-model part, since we will use this Hamiltonian as a demonstration of how to build a shell-model code. Our specific model consists of N doubly-degenerate and equally spaced single-particle levels labelled by p = 1, 2, ..., 10 and spin σ = ±1/2. We write the Hamiltonian as

Ĥ = Ĥ_0 + V̂,

where

Ĥ_0 = d Σ_{pσ} (p − 1) a†_{pσ} a_{pσ},

with d the spacing between the single-particle levels, and

V̂ = −(1/2) G Σ_{pq} a†_{p+} a†_{p−} a_{q−} a_{q+},

that is, a simple pairing term only, with strength G.

76 / 90 Simple pairing model, illustration with model space

We can define a model space via the operator P = Σ_{i=1}^d |Φ_i⟩⟨Φ_i| and the excluded space via Q = Σ_{i=d+1}^∞ |Φ_i⟩⟨Φ_i|, where Φ_i is a Slater determinant which can be constructed by placing particles in the various single-particle states p.

[Figure: doubly degenerate single-particle levels p = 1, ..., 10 with particles distributed over them; the lowest levels define the model space P and the levels above belong to the excluded space Q.]

77 / 90 Simple pairing Hamiltonian, the warm up exercise from last week

I Assume no broken pairs and N = 4 particles and only p = 4 doubly degenerate levels. Find all possible Slater determinants with no broken pairs.

I Use this basis to set up the Hamiltonian matrix.

I Scale the Hamiltonian in terms of either the single-particle spacing d or the interaction strength g. Compute the eigenvalues using for example Matlab or Octave for values of g ∈ [−1, 1]. We will use this as a benchmark now in our setup of a shell-model code. The latter entails the coding of the Slater determinants (yesterday's exercise) and the Lanczos algorithm (today) for solving eigenvalue problems.

78 / 90 Reminder from yesterday’s homework

The goal of yesterday’s exercise is to construct the M-scheme basis of Slater determinants in binary representation. Here M-scheme means the total Jz of the many-body states is fixed. The steps are:

I Read in a user-supplied file of single-particle states (we will provide examples);

I Ask for the total M of the system and the number of particles N;

I Construct all the N-particle states with given M.

You will validate the code by comparing both the number of states and the specific states.

79 / 90 Homework Part 1: Thursday, 5 July 2012

We will supply you with sample files for single-particle states (ending in .spstates, e.g. sd.spstates). The format of the file is as follows

12 ! number of single-particle states
 1 1 0 1 -1   ! index nrnodes l 2 x j 2 x jz
 2 1 0 1  1
 3 0 2 3 -3
 4 0 2 3 -1
 5 0 2 3  1
 6 0 2 3  3
 7 0 2 5 -5
 8 0 2 5 -3
 9 0 2 5 -1
10 0 2 5  1
11 0 2 5  3
12 0 2 5  5

This represents the 1s1/2-0d3/2-0d5/2 valence space, or the sd-space. There are twelve single-particle states, labeled by an overall index, which have as associated quantum numbers the number of radial nodes (nrnodes), the orbital angular momentum l, the angular momentum j, and its third component jz. To keep everything an integer, we store 2 × j and 2 × jz.

80 / 90 Homework Part 1: Thursday, 5 July 2012

To read in the single-particle states you need to:

I Open the file

I Read the number of single-particle states (in the above example, 12); allocate memory; all you need is a single array storing 2 × jz for each state, labeled by the index;

I Read in the quantum numbers and store 2 × jz (and anything else you happen to want).
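A minimal C++ sketch of these three steps, assuming the file format shown two slides back; the file name, array names and the choice of storing only 2 × jz are illustrative only.

#include <fstream>
#include <iostream>
#include <vector>

int main()
{
   // Open the file of single-particle states (name assumed for this example)
   std::ifstream spfile("sd.spstates");
   if (!spfile) { std::cerr << "Could not open file\n"; return 1; }
   // Read the number of single-particle states; skip the rest of that line (the comment)
   int nsps;
   spfile >> nsps;
   spfile.ignore(1000, '\n');
   // Store 2 x jz for each state, labeled by its index
   std::vector<int> jz2(nsps);
   for (int i = 0; i < nsps; i++) {
      int index, nrnodes, l, j2;
      spfile >> index >> nrnodes >> l >> j2 >> jz2[i];
      spfile.ignore(1000, '\n');   // skip any trailing comment on the line
   }
   std::cout << "Read " << nsps << " single-particle states\n";
   return 0;
}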

81 / 90 Homework Part 1: Thursday, 5 July 2012

The next step is to read in the number of particles N and the fixed total M (or, actually, 2 × M). For this assignment we assume only a single species of particles, say neutrons, although this can be relaxed. Note: although it is often a good idea to try to write a more general code, given the short time allotted we would suggest you keep your ambition in check, at least in the initial phases of the project.

You should probably write an error trap to make sure N and M are congruent; if N is even, then 2 × M should be even, and if N is odd then 2 × M should be odd.

82 / 90 Homework Part 1: Thursday, 5 July 2012

The final step is to generate the set of N-particle Slater determinants with fixed M. The Slater determinants will be stored in occupation representation. In many codes this representation is done compactly in bit notation (1s and 0s). Many of you did this yesterday. We will discuss this in greater detail also next week.
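One possible C++ sketch of this step in the bit representation: each Slater determinant is an unsigned integer whose set bits mark occupied single-particle states, and we keep only those with N set bits whose 2 × jz values add up to the requested 2 × M. The brute-force loop over all bit patterns is fine for the small spaces used here; the names and the jz2 array from the reading sketch above are assumptions of this illustration.

#include <vector>

// Generate all N-particle Slater determinants (bit representation) with fixed 2 x M.
// jz2[i] holds 2 x jz for single-particle state i (i = 0, ..., nsps-1).
std::vector<unsigned long> build_basis(const std::vector<int>& jz2, int nsps, int N, int M2)
{
   std::vector<unsigned long> basis;
   for (unsigned long sd = 0; sd < (1UL << nsps); sd++) {
      int npart = 0, m2 = 0;
      for (int i = 0; i < nsps; i++) {
         if (sd & (1UL << i)) { npart++; m2 += jz2[i]; }
      }
      if (npart == N && m2 == M2) basis.push_back(sd);
   }
   return basis;
}

For the (1/2)^2 space on the next slide, build_basis should return the 4 states with M = 0 for N = 2, which is a useful first check.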

83 / 90 Homework Part 1: Thursday, 5 July 2012

Also, for much of the validation of your codes we will consider the pairing interaction. A simple space is the (1/2)^2 space

4 ! number of single-particle states
1 0 0 1 -1
2 0 0 1  1
3 1 0 1 -1
4 1 0 1  1

For N = 2 there are 4 states with M = 0; show this by hand and confirm your code reproduces it. Similarly, for N = 4 there are 6 states with M = 0.

84 / 90 Homework Part 1: Thursday, 5 July 2012

For application in the pairing model we can go further and consider only states with no "broken pairs," that is, if +k is filled (m = +1/2), so is −k (m = −1/2). If you want, you can modify your code to accept only these, and obtain the following 6 states:

|1, 2, 3, 4⟩, |1, 2, 5, 6⟩, |1, 2, 7, 8⟩, |3, 4, 5, 6⟩, |3, 4, 7, 8⟩, |5, 6, 7, 8⟩

We will be using this as our example in setting up our own Lanczos algorithm.

85 / 90 Example case: pairing Hamiltonian

We consider a space with 2Ω single-particle states, with each state labeled by k = 1, 2, 3, ..., Ω and m = ±1/2. The convention is that the state with k > 0 has m = +1/2 while −k has m = −1/2. The Hamiltonian we consider is

Ĥ = −G P̂_+ P̂_−,

where

P̂_+ = Σ_{k>0} â†_k â†_{−k}

and P̂_− = (P̂_+)†.

We will first solve this using a trick, the quasi-spin formalism, to obtain the exact results. Then we will try again using the explicit Slater determinant formalism.

86 / 90 Example case: pairing Hamiltonian

One can show (and this is a good exercise!) that

[P̂_+, P̂_−] = Σ_{k>0} ( â†_k â_k + â†_{−k} â_{−k} − 1 ) = N̂ − Ω.

Now define

P̂_z = (1/2)(N̂ − Ω).

Finally you can show

[P̂_z, P̂_±] = ±P̂_±.

This means the operators P̂_±, P̂_z form an SU(2) algebra, and we can bring to bear all our intuition about angular momentum, even though there is no actual angular momentum involved. So we rewrite the Hamiltonian to make this explicit:

Ĥ = −G P̂_+ P̂_− = −G ( P̂² − P̂_z² + P̂_z ).

87 / 90 Example case: pairing Hamiltonian

Because of the SU(2) algebra, we know that the eigenvalues of P̂² must be of the form p(p + 1), with p either integer or half-integer, and the eigenvalues of P̂_z are m_p with p ≥ |m_p|, with m_p also integer or half-integer.

But because P̂_z = (1/2)(N̂ − Ω), we know that for N particles the value m_p = (N − Ω)/2. Furthermore, the values of m_p range from −Ω/2 (for N = 0) to +Ω/2 (for N = 2Ω, with all states filled). We deduce the maximal p = Ω/2, and for a given N the values of p range from |N − Ω|/2 to Ω/2 in steps of 1 (for an even number of particles). Following Racah we introduce the notation p = (Ω − v)/2, where v = 0, 2, 4, ..., Ω − |N − Ω|. With this it is easy to deduce that the eigenvalues of the pairing Hamiltonian are

−G(N − v)(2Ω + 2 − N − v)/4.

This also works for N odd, with v = 1, 3, 5, ... .
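As a quick worked example, which will serve as a benchmark for the test case on the next slide, take Ω = 4 and N = 4. The fully paired ground state has v = 0, so

E_0 = −G(4 − 0)(2·4 + 2 − 4 − 0)/4 = −6G,

while v = 2 gives E = −G·2·4/4 = −2G and v = 4 gives E = 0. One can also check, by counting the quasi-spin multiplets of four doubly degenerate levels, that the six fully paired M = 0 states split into one state at −6G, three at −2G and two at 0.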

88 / 90 Example case: pairing Hamiltonian, our specific test case

Using the (1/2)^4 single-particle space

8 ! number of single-particle states
1 0 0 1 -1
2 0 0 1  1
3 1 0 1 -1
4 1 0 1  1
5 2 0 1 -1
6 2 0 1  1
7 3 0 1 -1
8 3 0 1  1

and then taking only 4-particle, M = 0 states that have no 'broken pairs', there are six basis Slater determinants:

|1, 2, 3, 4⟩, |1, 2, 5, 6⟩, |1, 2, 7, 8⟩, |3, 4, 5, 6⟩, |3, 4, 7, 8⟩, |5, 6, 7, 8⟩
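A possible C++ sketch of how the Hamiltonian matrix could be assembled in this six-dimensional basis, for Ĥ = Ĥ_0 − G P̂_+P̂_− with Ĥ_0 = d Σ_{pσ}(p − 1) a†_{pσ} a_{pσ} (for the V̂ of the earlier homework slide, which carries an extra factor 1/2, halve the pairing amplitudes). Moving a complete pair from level q to level p costs no fermionic sign, so the pairing part contributes −G on the diagonal for each occupied pair and −G between configurations that differ by exactly one pair. The representation of a basis state as a sorted list of doubly occupied levels and the function name are choices of this sketch; check the resulting matrix against your own derivation before trusting it.

#include <algorithm>
#include <iterator>
#include <vector>

// Build the Hamiltonian matrix in a basis of pair configurations (no broken pairs).
// Each basis state is a sorted vector of doubly occupied levels, e.g. {1,2} for |1,2,3,4>.
std::vector<std::vector<double> > pairing_hamiltonian(const std::vector<std::vector<int> >& basis,
                                                      double d, double G)
{
   int dim = basis.size();
   std::vector<std::vector<double> > H(dim, std::vector<double>(dim, 0.0));
   for (int i = 0; i < dim; i++) {
      // single-particle part: two particles in level p contribute 2 d (p - 1)
      for (size_t a = 0; a < basis[i].size(); a++) H[i][i] += 2.0*d*(basis[i][a] - 1);
      // pairing part, diagonal: the p = q terms give -G for every occupied pair
      H[i][i] -= G*basis[i].size();
      // pairing part, off-diagonal: -G if the configurations differ by exactly one pair
      for (int j = 0; j < dim; j++) {
         if (j == i) continue;
         std::vector<int> common;
         std::set_intersection(basis[i].begin(), basis[i].end(),
                               basis[j].begin(), basis[j].end(),
                               std::back_inserter(common));
         if (common.size() == basis[i].size() - 1) H[i][j] = -G;
      }
   }
   return H;
}

Called with basis = {{1,2},{1,3},{1,4},{2,3},{2,4},{3,4}}, this produces the 6 × 6 matrix to feed to the Householder and Lanczos solvers of the tasks below; for d = 0 its eigenvalues should come out as −6G, −2G (three times) and 0 (twice), in agreement with the quasi-spin formula above.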

89 / 90 Simple pairing Hamiltonian, today’s explicit tasks

I Assume no broken pairs and N = 4 particles and only p = 4 doubly degenerate levels. Find all possible Slater determinants with no broken pairs and encode these and the single-particle basis.

I Set up the Hamiltonian matrix elements.

I Diagonalize this matrix using Householder’s method and check with your results from the previous week.

I Then use the Lanczos method to find the same eigenvalues. You will then have functions which set up a single-particle basis, a Slater determinant basis, and the Lanczos algorithm itself that uses this information. Next week we will generalize these functions to larger problems and develop a function that uses a Hamiltonian operator in second-quantized form acting on a fermion wave function also in second-quantized form.

90 / 90