Random Matrices
Estelle Basor
Mathematics Department, California Polytechnic State University, San Luis Obispo, CA 93407
Winter 1995 and Spring 2007
1 Introduction
These notes are based on a series of lectures given by Estelle Basor in an applied analysis graduate course at Cal Poly, first given in Winter of 1996. They were based on notes taken by the students in the class, typed by Mike Robertson, and then edited by Estelle Basor and Jon Jacobsen. The notes were used again in Spring of 2007; corrections were provided by the students in the class, and some additional sections were added.
These notes are not meant to be complete or perfect, but rather an informal set of notes describing a new and lively field in mathematics. It is the intention of the author that they provide a heuristic background so that one can begin to understand random matrices and consider future questions in the subject. The notes provide an introduction to the basics of the Gaussian Unitary Ensemble (GUE), a derivation of the semi-circle law, a derivation of the Painlevé equation connected to the basic probabilities of the GUE, and at the end a derivation of the distribution function for linear statistics.
Many of the tools used are provided in the text, while others are missing. In particular, the asymptotics of the relevant orthogonal polynomials are used but not derived. Several properties of Hilbert spaces, operators, and trace class operators are used but not proved.
2 The probability distribution for the eigenvalues of a random Hermitian matrix
Given a random Hermitian matrix, what can we say about its eigenvalues? Some questions include: What is the largest one? How are they spaced? What is the probability that one does (or does not) lie in a given interval? This section will describe a probability distribution on the space of matrices and show how this distribution induces one on the space of eigenvalues. To begin, we consider a general 2 × 2 Hermitian matrix
" (0) (1) # x11 x12 + ix12 H = (0) (1) . x12 − ix12 x22
The matrix $H$ has four free parameters, namely $x_{11}$, $x_{12}^{(0)}$, $x_{12}^{(1)}$, $x_{22}$. Thus we may identify the space of 2 × 2 Hermitian matrices with $\mathbb{R}^4$. That is to say, if $A \subseteq \mathbb{R}^4$, then $H \in A$ iff $(x_{11}, x_{12}^{(0)}, x_{12}^{(1)}, x_{22}) \in A$.
It is not hard to see that in a more general situation an $n \times n$ Hermitian matrix will have $n^2$ free parameters, and we may similarly identify the space of $n \times n$ Hermitian matrices with $\mathbb{R}^{n^2}$. Given $A \subseteq \mathbb{R}^{n^2}$, we seek the probability that $H \in A$ and also information about the eigenvalues of $H$.
Definition 1. Let $H$ be a Hermitian matrix. We define the probability density $P(H)$ by

$$P(H) = c_n e^{-\operatorname{trace}(H^2)}.$$
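As an aside, this density is straightforward to sample numerically, since it factors over the free parameters of $H$. The following sketch (our own illustration in Python with NumPy; the function name `sample_gue` is not from the text) draws a matrix from this distribution and checks that it is Hermitian with real eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_gue(n):
    """Draw H from the density c_n exp(-trace(H^2)).

    The exponent is -(sum_i x_ii^2 + 2 sum_{i<j} [(x_ij^(0))^2 + (x_ij^(1))^2]),
    so the diagonal entries are N(0, 1/2) and the real and imaginary parts of
    each off-diagonal entry are N(0, 1/4), all independent.
    """
    H = np.diag(rng.normal(0.0, np.sqrt(1 / 2), n)).astype(complex)
    iu = np.triu_indices(n, k=1)
    H[iu] = rng.normal(0.0, 0.5, len(iu[0])) + 1j * rng.normal(0.0, 0.5, len(iu[0]))
    return H + np.tril(H.conj().T, -1)  # fill in the lower triangle

H = sample_gue(4)
assert np.allclose(H, H.conj().T)                 # H is Hermitian
assert np.allclose(np.linalg.eigvals(H).imag, 0)  # eigenvalues are real
```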
This means that the probability of a random $n \times n$ Hermitian matrix occurring in $A \subseteq \mathbb{R}^{n^2}$ is

$$c_n \int \cdots \int_A P(H)\, \prod_{i=1}^{n} dx_{ii} \prod_{1 \le i < j \le n} dx_{ij}^{(0)}\, dx_{ij}^{(1)}. \qquad (1)$$

In the 2 × 2 case this is

$$c_2 \iiiint_A e^{-\left( x_{11}^2 + 2 \left(x_{12}^{(0)}\right)^2 + 2 \left(x_{12}^{(1)}\right)^2 + x_{22}^2 \right)}\, dx_{11}\, dx_{22}\, dx_{12}^{(0)}\, dx_{12}^{(1)}. \qquad (2)$$

As it stands this contains too much information to be useful. We need to scale our search down to specifically the eigenvalues of $H$. Our goal is to apply a change of variables to (2) and let the constant absorb some information, leaving us with a finer look at the eigenvalues themselves. To do this we will need some general facts.

1. The probability given in (1) is independent with respect to any change of basis.

Proof: Let $H' = M^{-1}HM$ for some invertible matrix $M$. Then $(H')^2 = M^{-1}HMM^{-1}HM = M^{-1}H^2M$, and

$$\operatorname{tr}(H'^2) = \operatorname{tr}\left(M^{-1}H^2M\right) = \operatorname{tr}\left(M^{-1}MH^2\right) = \operatorname{tr}\left(H^2\right).$$

So $P(H') = P(H)$. Thus $P(H)$ is left unchanged by a change of basis, in particular, a unitary change of basis.

2. Every Hermitian matrix is unitarily diagonalizable; that is, if $H$ is Hermitian then $H$ may be written as the product $H = U^{-1}DU$ where $D$ is diagonal and $U$ is unitary ($U^{-1} = U^*$).

3. The eigenvalues of a Hermitian matrix are real.

Proof: Using the inner product, notice $\langle v, \lambda v \rangle = \bar{\lambda}\, \|v\|^2$ and $\langle v, \lambda v \rangle = \langle v, Hv \rangle = \langle Hv, v \rangle = \langle \lambda v, v \rangle = \lambda\, \|v\|^2$. Hence $\lambda = \bar{\lambda}$ and is therefore real.

Thus $D$ is a real diagonal matrix containing the eigenvalues of $H$. Our goal is to apply a change of variables to (1) corresponding to $H \leftrightarrow U^{-1}DU$. For clarity we will restrict our attention to the 2 × 2 case first and then address the $n \times n$ case. A problem arises, though, in that in general the $U^{-1}DU$ representation is not unique, since the eigenvalues can be interchanged and there are many choices for the eigenvectors forming $U$. We need a method of selecting a particular $U$ and $D$ so that they are unique. Since the entries of $D$ are real we may simply order them, specifying

$$D = \begin{bmatrix} \theta_1 & 0 \\ 0 & \theta_2 \end{bmatrix}$$

where $\theta_1 \le \theta_2$. Since $H$ has four free variables and $D$ has only two, $U$ must depend on two free parameters. Let $U$ consist of two column vectors $U = \begin{bmatrix} v_1 & v_2 \end{bmatrix}$. We may choose $a$ so that $0 \le a \le 1$ and $v_1 = \begin{bmatrix} a \\ b e^{i\alpha} \end{bmatrix}$. Since $U$ is unitary, $\|v_1\| = 1$, thus $b = \sqrt{1-a^2}$, and so

$$v_1 = \begin{bmatrix} a \\ \sqrt{1-a^2}\, e^{i\alpha} \end{bmatrix}.$$

Now since $\langle v_1, v_2 \rangle = 0$ and $\|v_2\| = 1$, $v_2$ is completely determined if we also insist that its first non-zero entry is positive. Suppose $v_2 = \begin{bmatrix} c \\ d e^{i\beta} \end{bmatrix}$ with $c \ge 0$. Since $\|v_2\| = 1$ we have as above

$$v_2 = \begin{bmatrix} c \\ \sqrt{1-c^2}\, e^{i\beta} \end{bmatrix}.$$

Since $\langle v_1, v_2 \rangle = 0$ we have $ac + \sqrt{1-a^2}\sqrt{1-c^2}\, e^{i(\alpha - \beta)} = 0$, so

$$e^{i(\alpha - \beta)} = \frac{-ac}{\sqrt{(1-a^2)(1-c^2)}}.$$

Thus

1) $e^{i(\alpha - \beta)} \in \mathbb{R}$,
2) $\left| \dfrac{-ac}{\sqrt{(1-a^2)(1-c^2)}} \right| = 1$.

Since $a \ge 0$, $c \ge 0$, it must be that

$$\frac{-ac}{\sqrt{(1-a^2)(1-c^2)}} = -1, \qquad \frac{c}{\sqrt{1-c^2}} = \frac{\sqrt{1-a^2}}{a}.$$

Thus our choice of $a$ determines $c$. If we let $\frac{1-a^2}{a^2} = k$, then

$$\frac{c^2}{1-c^2} = k, \qquad c^2 = \frac{k}{1+k}, \qquad c = \sqrt{\frac{k}{1+k}}.$$

Furthermore, since $e^{i(\alpha - \beta)} = -1$, given $\alpha$ we can determine $\beta$. Thus we have shown that $a$, $\alpha$, $\theta_1$, and $\theta_2$ completely determine a unique representation for $H$ as $H = U^{-1}DU$ with $U$ and $D$ specified as above. Conversely, given $U$ and $D$ we may find $H$ by multiplication. We have established a one to one correspondence between Hermitian matrices and products of the form $U^{-1}DU$, and can use this correspondence to perform a change of variables

$$f\left(x_{11}, x_{12}^{(0)}, x_{12}^{(1)}, x_{22}\right) = (\theta_1, \theta_2, p_1, p_2)$$

with $\theta_1 \le \theta_2$, $p_1 \in [0, 1]$, $p_2 \in [0, 2\pi)$.

Now for the general case we cannot be as explicit, but we can say that $H = U^{-1}DU$ where $D$ is diagonal with diagonal elements $\theta_1, \ldots, \theta_n$ satisfying $\theta_1 \le \theta_2 \le \cdots \le \theta_n$, and $U$ unitary with column vectors of length one and with first non-zero coordinate positive.
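The uniqueness convention above (ordered eigenvalues, each eigenvector's first non-zero entry positive) is easy to realize numerically. The sketch below is our own illustration, not from the text; it assumes distinct eigenvalues, which holds with probability one.

```python
import numpy as np

def canonical_eig(H, tol=1e-12):
    """Diagonalize Hermitian H with the text's uniqueness conventions:
    eigenvalues sorted so theta_1 <= ... <= theta_n, and each eigenvector
    rescaled by a phase so its first non-zero entry is real and positive.
    Assumes the eigenvalues are distinct (true with probability one)."""
    theta, V = np.linalg.eigh(H)              # eigh already sorts the eigenvalues
    for j in range(V.shape[1]):
        k = np.argmax(np.abs(V[:, j]) > tol)  # index of first non-zero entry
        V[:, j] /= V[k, j] / abs(V[k, j])     # divide out its phase
    return theta, V

H = np.array([[1.0, 2 - 1j], [2 + 1j, -1.0]])
theta, V = canonical_eig(H)
assert np.allclose(V @ np.diag(theta) @ V.conj().T, H)
```

Here the columns of $V$ are the eigenvectors, so $H = V\,\mathrm{diag}(\theta)\,V^*$; in the notation of the text, $U = V^*$.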
Notice that if $u_1$ is the first column vector of $U$ it has $2n-2$ real free parameters, that $u_2$ has $2n-4$ free parameters, and so on, so that the variables in $U$ account for $n^2 - n$ parameters. We thus have a change of variables given by some function $f$ so that

$$f\left(x_{11}, \ldots, x_{(n-1)n}^{(1)}\right) = (\theta_1, \ldots, \theta_n, p_1, \ldots, p_{n^2-n}).$$

Using this change of variables we need to transform the integral

$$c_n \int \cdots \int_A P(H)\, dx_{11} \cdots dx_{(n-1)n}^{(1)} \qquad (3)$$

into one with the new variables:

$$c_n \int \cdots \int_{f(A)} P(H)\, |J(\theta_i, p_l)|\, d\theta_1 \cdots dp_{n^2-n}.$$

Here the expression $|J(\theta_i, p_l)|$ stands for the absolute value of the determinant of the Jacobian matrix. Notice that the $P(H)$ term is easy to convert, since

$$P(H) = e^{-\operatorname{trace} H^2} = e^{-\operatorname{trace}\left((U^*DU)^2\right)} = e^{-\operatorname{trace} D^2} = e^{-\sum_{i=1}^{n} \theta_i^2}.$$

Before we begin the Jacobian computation we notice the following:

1. Recall that if $A$ is a matrix then
$$\frac{\partial A}{\partial y} = \left( \frac{\partial a_{ij}}{\partial y} \right)_{ij}.$$
(This means simply that the partial derivative of a matrix is defined to be the partials of all its entries.)

2. If $A$ and $B$ are matrices then the product rule $\frac{\partial AB}{\partial y} = \frac{\partial A}{\partial y} B + A \frac{\partial B}{\partial y}$ holds true.

3. Since $UU^* = I$ we have that $\frac{\partial UU^*}{\partial p_i} = 0$, or by the product rule $\frac{\partial U}{\partial p_i} U^* + U \frac{\partial U^*}{\partial p_i} = 0$, or
$$\frac{\partial U}{\partial p_i} U^* = -U \frac{\partial U^*}{\partial p_i}.$$
Similarly, since $U^*U = I$,
$$\frac{\partial U^*}{\partial p_i} U = -U^* \frac{\partial U}{\partial p_i}.$$

Define $S^i = U \frac{\partial U^*}{\partial p_i}$ and note that since $U$ is independent of the $\theta$ variables, $S^i$ is also. Now letting $H$ equal $U^*DU$ we find that

$$\frac{\partial H}{\partial p_i} = \frac{\partial U^*DU}{\partial p_i} = \frac{\partial U^*}{\partial p_i} DU + U^* \frac{\partial DU}{\partial p_i} = \frac{\partial U^*}{\partial p_i} DU + U^* \frac{\partial D}{\partial p_i} U + U^* D \frac{\partial U}{\partial p_i} = \frac{\partial U^*}{\partial p_i} DU + U^* D \frac{\partial U}{\partial p_i} \quad \left( \text{since } \frac{\partial D}{\partial p_i} = 0 \right).$$

This implies

$$U \frac{\partial H}{\partial p_i} U^* = U \frac{\partial U^*}{\partial p_i} D + D \frac{\partial U}{\partial p_i} U^* = S^i D - D S^i.$$

Since $D$ is diagonal, the $lk$ entry of the matrix $S^i D - D S^i$ is given by

$$S^i_{lk} (\theta_k - \theta_l)$$

where $S^i_{lk}$ is the $lk$ entry of the $S^i$ matrix. This matrix, remember, does not depend on the $\theta_i$'s.

Also,

$$\frac{\partial H}{\partial \theta_i} = \frac{\partial (U^*DU)}{\partial \theta_i} = \frac{\partial U^*}{\partial \theta_i} DU + U^* \frac{\partial D}{\partial \theta_i} U + U^* D \frac{\partial U}{\partial \theta_i} = U^* \frac{\partial D}{\partial \theta_i} U \quad \left( \text{since } U \text{ depends only on the } p_i\text{'s} \right),$$

and hence $U \frac{\partial H}{\partial \theta_i} U^*$ has $lk$ entry $\delta_{lk}\delta_{ki}$. To sum up, we have two crucial formulas:

1. $\left( U \frac{\partial H}{\partial p_i} U^* \right)_{lk} = S^i_{lk}(\theta_k - \theta_l)$
2. $\left( U \frac{\partial H}{\partial \theta_i} U^* \right)_{lk} = \delta_{lk}\delta_{ki}$.

Our next step is to write expressions for these same matrices using only the formula for matrix multiplication. In terms of components, $U \frac{\partial H}{\partial \theta_i} U^*$ has $lk$ entry

$$\sum_{j=1}^{n} U_{lj} \left( \frac{\partial H}{\partial \theta_i} U^* \right)_{jk} = \sum_{j,m} U_{lj} \left( \frac{\partial H}{\partial \theta_i} \right)_{jm} U^*_{mk}.$$

Note that this last summation can be written as

$$\sum_{j} \left( \frac{\partial H}{\partial \theta_i} \right)_{jj} U_{lj} U^*_{jk} + \sum_{j \ne m} \left( \frac{\partial H}{\partial \theta_i} \right)_{jm} U_{lj} U^*_{mk},$$

and this is the same as

$$\sum_{j} \frac{\partial x_{jj}}{\partial \theta_i}\, U_{lj} U^*_{jk} + \sum_{j < m} \frac{\partial \left( x_{jm}^{(0)} + i x_{jm}^{(1)} \right)}{\partial \theta_i}\, U_{lj} U^*_{mk} + \sum_{j < m} \frac{\partial \left( x_{jm}^{(0)} - i x_{jm}^{(1)} \right)}{\partial \theta_i}\, U_{lm} U^*_{jk}$$

$$= \sum_{j} \frac{\partial x_{jj}}{\partial \theta_i}\, U_{lj} U^*_{jk} + \sum_{j < m} \frac{\partial x_{jm}^{(0)}}{\partial \theta_i} \left( U_{lj} U^*_{mk} + U_{lm} U^*_{jk} \right) + \sum_{j < m} \frac{\partial x_{jm}^{(1)}}{\partial \theta_i} \left( i U_{lj} U^*_{mk} - i U_{lm} U^*_{jk} \right).$$

The $lk$ component of $U \frac{\partial H}{\partial p_i} U^*$ is exactly the same expression, except with the partial with respect to $\theta_i$ replaced by the partial with respect to $p_i$.

Next, think carefully about the Jacobian. It is an $n^2 \times n^2$ matrix. We construct it by placing the variables across the columns, first the $x_{ii}$ terms, then the $x_{ij}^{(0)}$ terms, and then finally the $x_{ij}^{(1)}$ terms. This corresponds to $n^2$ columns. We run the partial derivatives down the rows, first the $\theta$ partials followed by the $p_i$ partials, these last in any order.

We now multiply $J(\theta_i, p_i)$ by a matrix $C$, where $C$ is constructed as follows. It is also $n^2 \times n^2$ and has columns indexed by $lk$.
The $lk$ column will be one where first we have the terms (in rows) $U_{lj}U^*_{jk}$ ($n$ of these), followed by $U_{lj}U^*_{mk} + U_{lm}U^*_{jk}$ ($\frac{n^2-n}{2}$ of these), and then finally $iU_{lj}U^*_{mk} - iU_{lm}U^*_{jk}$. In block form this is

$$C_{lk} = \begin{bmatrix} U_{lj}U^*_{jk} \\ U_{lj}U^*_{mk} + U_{lm}U^*_{jk} \\ iU_{lj}U^*_{mk} - iU_{lm}U^*_{jk} \end{bmatrix}.$$

We are also careful to arrange the columns so that the first $n$ columns are those where $l = k$. In the 2 × 2 case everything looks like this:

$$JC = \begin{bmatrix} \frac{\partial x_{11}}{\partial \theta_1} & \frac{\partial x_{22}}{\partial \theta_1} & \frac{\partial x_{12}^{(0)}}{\partial \theta_1} & \frac{\partial x_{12}^{(1)}}{\partial \theta_1} \\ \frac{\partial x_{11}}{\partial \theta_2} & \frac{\partial x_{22}}{\partial \theta_2} & \frac{\partial x_{12}^{(0)}}{\partial \theta_2} & \frac{\partial x_{12}^{(1)}}{\partial \theta_2} \\ \frac{\partial x_{11}}{\partial p_1} & \frac{\partial x_{22}}{\partial p_1} & \frac{\partial x_{12}^{(0)}}{\partial p_1} & \frac{\partial x_{12}^{(1)}}{\partial p_1} \\ \frac{\partial x_{11}}{\partial p_2} & \frac{\partial x_{22}}{\partial p_2} & \frac{\partial x_{12}^{(0)}}{\partial p_2} & \frac{\partial x_{12}^{(1)}}{\partial p_2} \end{bmatrix} \begin{bmatrix} C_{11} & C_{22} & C_{12} & C_{21} \end{bmatrix}$$

$$= \begin{bmatrix} \left( U \frac{\partial H}{\partial \theta_1} U^* \right)_{11} & \left( U \frac{\partial H}{\partial \theta_1} U^* \right)_{22} & \left( U \frac{\partial H}{\partial \theta_1} U^* \right)_{12} & \left( U \frac{\partial H}{\partial \theta_1} U^* \right)_{21} \\ \left( U \frac{\partial H}{\partial \theta_2} U^* \right)_{11} & \left( U \frac{\partial H}{\partial \theta_2} U^* \right)_{22} & \left( U \frac{\partial H}{\partial \theta_2} U^* \right)_{12} & \left( U \frac{\partial H}{\partial \theta_2} U^* \right)_{21} \\ \left( U \frac{\partial H}{\partial p_1} U^* \right)_{11} & \left( U \frac{\partial H}{\partial p_1} U^* \right)_{22} & \left( U \frac{\partial H}{\partial p_1} U^* \right)_{12} & \left( U \frac{\partial H}{\partial p_1} U^* \right)_{21} \\ \left( U \frac{\partial H}{\partial p_2} U^* \right)_{11} & \left( U \frac{\partial H}{\partial p_2} U^* \right)_{22} & \left( U \frac{\partial H}{\partial p_2} U^* \right)_{12} & \left( U \frac{\partial H}{\partial p_2} U^* \right)_{21} \end{bmatrix}$$

$$= \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & S^1_{12}(\theta_2 - \theta_1) & S^1_{21}(\theta_1 - \theta_2) \\ 0 & 0 & S^2_{12}(\theta_2 - \theta_1) & S^2_{21}(\theta_1 - \theta_2) \end{bmatrix}.$$

Taking the determinant of both sides we have

$$|J|\,|C| = \begin{vmatrix} S^1_{12}(\theta_2 - \theta_1) & S^1_{21}(\theta_1 - \theta_2) \\ S^2_{12}(\theta_2 - \theta_1) & S^2_{21}(\theta_1 - \theta_2) \end{vmatrix} = (\theta_1 - \theta_2)^2 \begin{vmatrix} S^1_{12} & S^1_{21} \\ S^2_{12} & S^2_{21} \end{vmatrix}$$

(the sign from pulling the factors out of the two columns is absorbed, since we take absolute values in the end). Since $C$ and $\begin{vmatrix} S^1_{12} & S^1_{21} \\ S^2_{12} & S^2_{21} \end{vmatrix}$ depend only on $p_1$ and $p_2$, we may write $|\det(J)| = (\theta_1 - \theta_2)^2\, g(p_1, p_2)$, where

$$g(p_1, p_2) = \left| \frac{ \begin{vmatrix} S^1_{12} & S^1_{21} \\ S^2_{12} & S^2_{21} \end{vmatrix} }{ \det(C) } \right|,$$

and (2) becomes

$$c_2 \iiiint_{f(A)} e^{-\theta_1^2 - \theta_2^2} (\theta_1 - \theta_2)^2\, g(p_1, p_2)\, d\theta_1\, d\theta_2\, dp_1\, dp_2.$$

In the general case, $JC$ has the block form

$$\begin{bmatrix} I_n & A \\ B & D \end{bmatrix}$$

where $I_n$ is the identity matrix, $A$ and $B$ are zero matrices, and $D$ is a matrix whose $lk$ columns have entries $S^i_{lk}(\theta_k - \theta_l)$. If we take the determinant of $JC$ now, we have

$$\prod_{j < k} (\theta_j - \theta_k)^2 \times h(p_1, \ldots, p_{n^2-n}).$$

We now integrate with respect to all the $p_i$ variables, and we see we have an integral of the form

$$c_n' \int \cdots \int_B e^{-\sum_{i=1}^{n} \theta_i^2} \prod_{l < k} (\theta_l - \theta_k)^2\, d\theta_1 \cdots d\theta_n.$$

Theorem 1. The Gaussian Unitary Ensemble induces a probability density on the set of eigenvalues given by

$$P_N(\theta_1, \ldots, \theta_n) = c_n' e^{-\sum_{i=1}^{n} \theta_i^2} \prod_{l < k} (\theta_l - \theta_k)^2.$$

The probability of finding a set of eigenvalues within a given set $A \subset \mathbb{R}^n$ is given by the integral

$$c_n' \int \cdots \int_A e^{-(\theta_1^2 + \theta_2^2 + \cdots + \theta_n^2)} \prod_{i < j} (\theta_i - \theta_j)^2\, d\theta_1\, d\theta_2 \cdots d\theta_n.$$
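As a quick sanity check of Theorem 1, one can integrate out the center of mass in the $n = 2$ density $c_2' e^{-\theta_1^2 - \theta_2^2}(\theta_1 - \theta_2)^2$ to find that the spacing $\delta = \theta_2 - \theta_1$ has density proportional to $\delta^2 e^{-\delta^2/2}$. The Monte Carlo sketch below (our own illustration, not from the text) compares a histogram of spacings of sampled 2 × 2 matrices with this prediction.

```python
import numpy as np

rng = np.random.default_rng(1)
spacings = []
for _ in range(200_000):
    # Sample from c_2 exp(-trace(H^2)): diagonal ~ N(0, 1/2), off-diagonal
    # real and imaginary parts ~ N(0, 1/4).
    a, b = rng.normal(0, np.sqrt(1 / 2), 2)
    u, v = rng.normal(0, 0.5, 2)
    H = np.array([[a, u + 1j * v], [u - 1j * v, b]])
    theta = np.linalg.eigvalsh(H)
    spacings.append(theta[1] - theta[0])

hist, edges = np.histogram(spacings, bins=50, range=(0, 4), density=True)
mids = (edges[:-1] + edges[1:]) / 2
# Predicted spacing density: delta^2 exp(-delta^2/2), normalized by sqrt(pi/2).
predicted = mids**2 * np.exp(-mids**2 / 2) / np.sqrt(np.pi / 2)
print(np.max(np.abs(hist - predicted)))   # small (Monte Carlo error only)
```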
From now on we want to think of $n$ as very large, and so we will replace $n$ with $N$; because of a soon to be described connection with orthogonal polynomials and integral operators, we will use $x_i$ notation for the eigenvalue variables. Our next goal will be to prove that, for the $N \times N$ case,

$$P_N(x_1, x_2, \ldots, x_N) = \frac{1}{N!} \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N}$$

where $K_N(x, y) = \sum_{k=0}^{N-1} \varphi_k(x)\varphi_k(y)$, with $\varphi_k(x)$ the normalized Hermite function, $\varphi_k(x) = c_k e^{-\frac{x^2}{2}} H_k(x)$, so that $\int \varphi_{k_1}(x)\varphi_{k_2}(x)\, dx = \delta_{k_1 k_2}$, and $c_N$ the proper normalizing constant. We have that

$$P_N(x_1, x_2, \ldots, x_N) = c_N e^{-(x_1^2 + x_2^2 + \cdots + x_N^2)} \prod_{i < j} (x_i - x_j)^2.$$

Starting from this equation, consider the following steps.

Step 1. We know that

$$\prod_{j < i} (x_i - x_j) = \det \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_N \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{N-1} & x_2^{N-1} & \cdots & x_N^{N-1} \end{bmatrix}.$$

For example, the 2 × 2 case yields $(x_2 - x_1) = \begin{vmatrix} 1 & 1 \\ x_1 & x_2 \end{vmatrix}$. To prove this in general, let $p(x_1, x_2, \ldots, x_N) = \prod_{j < i} (x_i - x_j)$ and

$$q(x_1, x_2, \ldots, x_N) = \det \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_N \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{N-1} & x_2^{N-1} & \cdots & x_N^{N-1} \end{bmatrix}.$$

If $x_i = x_j$ for any $i \ne j$, then it follows that $p(x_1, x_2, \ldots, x_N) = 0$ and $q(x_1, x_2, \ldots, x_N) = 0$, as two of the columns in the determinant will be identical. This implies that $(x_i - x_j)$ is a factor of $q(x_1, x_2, \ldots, x_N)$ for $i \ne j$, so $p$ must divide evenly into $q$. The degree of $q$ is the same as that of $p$, both being $\frac{N(N-1)}{2}$, so $\frac{q}{p}$ must be constant. Checking one term, it is easy to see that they must be the same. This leads to the formula

$$P_N(x_1, x_2, \ldots, x_N) = c_N e^{-(x_1^2 + x_2^2 + \cdots + x_N^2)} \begin{vmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_N \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{N-1} & x_2^{N-1} & \cdots & x_N^{N-1} \end{vmatrix} \begin{vmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_N \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{N-1} & x_2^{N-1} & \cdots & x_N^{N-1} \end{vmatrix}.$$

Step 2. We can show that

$$\begin{vmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_N \\ \vdots & \vdots & \ddots & \vdots \\ x_1^{N-1} & x_2^{N-1} & \cdots & x_N^{N-1} \end{vmatrix} = d_N \begin{vmatrix} H_0(x_1) & H_0(x_2) & \cdots & H_0(x_N) \\ H_1(x_1) & H_1(x_2) & \cdots & H_1(x_N) \\ \vdots & \vdots & \ddots & \vdots \\ H_{N-1}(x_1) & H_{N-1}(x_2) & \cdots & H_{N-1}(x_N) \end{vmatrix}$$

for some constant $d_N$, where $H_n$ is the $n$th Hermite polynomial. Each $H_n$ is an $n$th degree polynomial and can be constructed by a linear combination of the rows in the left-hand matrix. The $d_N$ captures the changes in the leading terms of each $H_n$, recalling that multiplying a row by a constant changes the value of the determinant by the same factor.

Step 3. A factor in a column of a determinant can be factored out to leave the following:

$$e^{-\left( \frac{x_1^2}{2} + \frac{x_2^2}{2} + \cdots + \frac{x_N^2}{2} \right)} \prod_{j < i} (x_i - x_j) = d_N \begin{vmatrix} e^{-\frac{x_1^2}{2}} H_0(x_1) & e^{-\frac{x_2^2}{2}} H_0(x_2) & \cdots & e^{-\frac{x_N^2}{2}} H_0(x_N) \\ e^{-\frac{x_1^2}{2}} H_1(x_1) & e^{-\frac{x_2^2}{2}} H_1(x_2) & \cdots & e^{-\frac{x_N^2}{2}} H_1(x_N) \\ \vdots & \vdots & \ddots & \vdots \\ e^{-\frac{x_1^2}{2}} H_{N-1}(x_1) & e^{-\frac{x_2^2}{2}} H_{N-1}(x_2) & \cdots & e^{-\frac{x_N^2}{2}} H_{N-1}(x_N) \end{vmatrix}.$$

Step 4. Now multiply each row by the necessary normalizing constant and multiply the determinant by its reciprocal, thereby making a new constant $d_N'$, to get

$$e^{-\left( \frac{x_1^2}{2} + \frac{x_2^2}{2} + \cdots + \frac{x_N^2}{2} \right)} \prod_{j < i} (x_i - x_j) = d_N' \begin{vmatrix} \varphi_0(x_1) & \varphi_0(x_2) & \cdots & \varphi_0(x_N) \\ \varphi_1(x_1) & \varphi_1(x_2) & \cdots & \varphi_1(x_N) \\ \vdots & \vdots & \ddots & \vdots \\ \varphi_{N-1}(x_1) & \varphi_{N-1}(x_2) & \cdots & \varphi_{N-1}(x_N) \end{vmatrix}.$$

Step 5. The original equation can now be written

$$P_N(x_1, x_2, \ldots, x_N) = c_N (d_N')^2 \begin{vmatrix} \varphi_0(x_1) & \varphi_0(x_2) & \cdots & \varphi_0(x_N) \\ \varphi_1(x_1) & \varphi_1(x_2) & \cdots & \varphi_1(x_N) \\ \vdots & \vdots & \ddots & \vdots \\ \varphi_{N-1}(x_1) & \varphi_{N-1}(x_2) & \cdots & \varphi_{N-1}(x_N) \end{vmatrix} \begin{vmatrix} \varphi_0(x_1) & \varphi_0(x_2) & \cdots & \varphi_0(x_N) \\ \varphi_1(x_1) & \varphi_1(x_2) & \cdots & \varphi_1(x_N) \\ \vdots & \vdots & \ddots & \vdots \\ \varphi_{N-1}(x_1) & \varphi_{N-1}(x_2) & \cdots & \varphi_{N-1}(x_N) \end{vmatrix} = c_N (d_N')^2 \det(M_{ij})\Big|_{i,j=1}^{N}$$

where $M_{ij} = \sum_{k=0}^{N-1} \varphi_k(x_i)\varphi_k(x_j)$, noting that the determinant is unaffected by transposition.

So all that is left to show is that $c_N (d_N')^2 = \frac{1}{N!}$ to obtain the desired result. This could have been done by carefully keeping track of these constants all along, but that would have required performing the integration over the $p_i$ variables in the formation of $c_N$, which we would like to avoid.

Before finding this constant, there is a useful lemma.

Lemma 1. Suppose $f(x, y)$ is a function of two variables such that $\int_{-\infty}^{\infty} f(x, x)\, dx = c < \infty$ and $\int_{-\infty}^{\infty} f(x, y) f(y, z)\, dy = f(x, z)$. Then

$$\int_{-\infty}^{\infty} \det\left(f(x_i, x_j)\right)\Big|_{i,j=1}^{K}\, dx_K = (c - K + 1) \det\left(f(x_i, x_j)\right)\Big|_{i,j=1}^{K-1}.$$

Proof: Consider the determinant using the permutation definition,

$$\det\left(f(x_i, x_j)\right)\Big|_{i,j=1}^{K} = \sum_{\sigma \in S_K} (-1)^{\sigma} \prod_{l=1}^{K} f\left(x_l, x_{\sigma(l)}\right).$$

The variable $x_K$ will appear in a factor either once, in the form $f(x_K, x_K)$, or twice, in the form $f(x_a, x_K) f(x_K, x_b)$. In the $f(x_K, x_K)$ case $\int_{-\infty}^{\infty} f(x_K, x_K)\, dx_K = c$, and in the other

$$\int \cdots f(x_a, x_K) f(x_K, x_b) \cdots dx_K = \cdots f(x_a, x_b) \cdots.$$

Now break up our original sum as

$$\sum_{\sigma \in S_K} (-1)^{\sigma} \prod_{l=1}^{K} f\left(x_l, x_{\sigma(l)}\right) = \sum_{\sigma \,:\, \sigma(K) = K} (-1)^{\sigma} \prod_{l=1}^{K} f\left(x_l, x_{\sigma(l)}\right) + \sum_{\sigma \,:\, \sigma(K) \ne K} (-1)^{\sigma} \prod_{l=1}^{K} f\left(x_l, x_{\sigma(l)}\right).$$

After integration with respect to $x_K$ the sums yield

$$c \sum_{\sigma \in S_{K-1}} (-1)^{\sigma} \prod_{l=1}^{K-1} f\left(x_l, x_{\sigma(l)}\right) - (K - 1) \sum_{\sigma \in S_{K-1}} (-1)^{\sigma} \prod_{l=1}^{K-1} f\left(x_l, x_{\sigma(l)}\right) = (c - K + 1) \det\left(f(x_i, x_j)\right)\Big|_{i,j=1}^{K-1}.$$

Now the constant $\frac{1}{N!}$ in our expression for $P_N(x_1, x_2, \ldots, x_N)$ can be computed, since $\int_{-\infty}^{\infty} K_N(x, x)\, dx = N$ and $\int_{-\infty}^{\infty} K_N(x, y) K_N(y, z)\, dy = K_N(x, z)$. We have

$$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N}\, dx_1\, dx_2 \cdots dx_N = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} 1 \cdot \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N-1}\, dx_1\, dx_2 \cdots dx_{N-1}$$
$$= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} 1 \cdot 2 \cdot \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N-2}\, dx_1\, dx_2 \cdots dx_{N-2} = \cdots = 1 \cdot 2 \cdot 3 \cdots N = N!$$

Thus the determinant portion of $P_N$ is $N!$ when fully integrated, and since the integral of the probability distribution is 1, $c_N (d_N')^2 = \frac{1}{N!}$. We have established our goal that $P_N(x_1, x_2, \ldots, x_N) = \frac{1}{N!} \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N}$.
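A small numerical check of this identity (our own sketch; it relies on SciPy's `eval_hermite` for the physicists' Hermite polynomials): at random points, the ratio of $\det(K_N(x_i, x_j))$ to $e^{-\sum x_i^2} \prod_{i<j}(x_i - x_j)^2$ should come out the same constant every time.

```python
import numpy as np
from scipy.special import eval_hermite, factorial

def phi(k, x):
    """Normalized Hermite function phi_k(x) = c_k e^{-x^2/2} H_k(x)."""
    c_k = 1.0 / np.sqrt(2.0**k * factorial(k) * np.sqrt(np.pi))
    return c_k * np.exp(-x**2 / 2) * eval_hermite(k, x)

def K(N, x, y):
    """The kernel K_N(x, y) = sum_{k=0}^{N-1} phi_k(x) phi_k(y)."""
    return sum(phi(k, x) * phi(k, y) for k in range(N))

N = 4
rng = np.random.default_rng(2)
for _ in range(3):
    x = rng.normal(size=N)
    det_form = np.linalg.det(K(N, x[:, None], x[None, :]))
    prod_form = np.exp(-np.sum(x**2)) * np.prod(
        [(x[i] - x[j])**2 for i in range(N) for j in range(i)])
    print(det_form / prod_form)   # same constant each time (namely N! c_N)
```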
Define the $n$-point correlation function as

$$R_N(x_1, x_2, \ldots, x_n) = \frac{N!}{(N-n)!} \int \cdots \int P_N(x_1, x_2, \ldots, x_N)\, dx_{n+1}\, dx_{n+2} \cdots dx_N.$$

This leads to the following lemma:

Lemma 2. $R_N(x_1, x_2, \ldots, x_n) = \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{n}$.

Proof:

$$R_N(x_1, x_2, \ldots, x_n) = \frac{N!}{(N-n)!} \frac{1}{N!} \int \cdots \int \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N}\, dx_{n+1}\, dx_{n+2} \cdots dx_N$$
$$= \frac{1}{(N-n)!} \int \cdots \int 1 \cdot \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N-1}\, dx_{n+1}\, dx_{n+2} \cdots dx_{N-1}$$
$$= \frac{1}{(N-n)!} \int \cdots \int 1 \cdot 2 \cdot \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N-2}\, dx_{n+1}\, dx_{n+2} \cdots dx_{N-2}$$
$$= \cdots = \frac{1 \cdot 2 \cdots (N - (n+1) + 1)}{(N-n)!} \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{n} = \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{n}.$$

3 The density of the eigenvalues

In this section we compute the density function for the eigenvalues, for finite $N$ and for $N$ large. For large $N$ this function is approximated by a semi-ellipse. This result, conjectured and proved for the GUE and also for several other ensembles in the literature, is known as Wigner's semi-circle law. In the following, all integrals are assumed to be from minus infinity to infinity unless otherwise indicated.

If $f$ is a symmetric function of all its variables, then the following integral is the expected value of $f$ with respect to $P_N$:

$$\int \cdots \int f(x_1, x_2, \ldots, x_N)\, P_N(x_1, x_2, \ldots, x_N)\, dx_1\, dx_2 \cdots dx_N.$$

If $f(x_1, x_2, \ldots, x_N) = \sum_{i=1}^{N} \chi_A(x_i)$, then the above integral counts the number of eigenvalues that are contained in the set $A$. So we can rewrite this integral as

$$\int \cdots \int \left[ \sum_{i=1}^{N} \chi_A(x_i) \right] P_N(x_1, x_2, \ldots, x_N)\, dx_1\, dx_2 \cdots dx_N = N \int \cdots \int \chi_A(x_1)\, P_N(x_1, x_2, \ldots, x_N)\, dx_1\, dx_2 \cdots dx_N = \int_A K_N(x, x)\, dx.$$

We are using the symmetry of $P_N$ in the last computation. This gives us a quite simple expression for the expected number of eigenvalues in a set $A$, and we see that $K_N(x, x)$ is the density function for the eigenvalues.

Using formulas from the theory of orthogonal polynomials, we have

$$K_N(x, x) = \frac{e^{-x^2}}{2^{N-1}(N-1)!\sqrt{\pi}} \left( H_N'(x) H_{N-1}(x) - H_{N-1}'(x) H_N(x) \right).$$

Then, using the well-known asymptotic expansions for Hermite polynomials, one can show that

$$\lim_{N \to \infty} \sqrt{2/N}\, K_N\left( \sqrt{2N+1}\, x, \sqrt{2N+1}\, x \right) = \begin{cases} (2/\pi)\sqrt{1 - x^2} & |x| \le 1 \\ 0 & |x| \ge 1. \end{cases}$$

This last equation is what is referred to as the "semi-circle law". This tells us that if we are far away from $\sqrt{2N}$ there are on the order of $\sqrt{N}$ eigenvalues in any finite interval.
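A Monte Carlo sketch of the semi-circle law (again our own illustration, with entry variances matching the density $c_N e^{-\operatorname{trace}(H^2)}$): eigenvalues scaled by $\sqrt{2N}$ should be distributed roughly like $(2/\pi)\sqrt{1 - x^2}$ on $(-1, 1)$.

```python
import numpy as np

rng = np.random.default_rng(3)
N, trials = 200, 50
scaled = []
for _ in range(trials):
    A = rng.normal(0, 0.5, (N, N)) + 1j * rng.normal(0, 0.5, (N, N))
    H = (A + A.conj().T) / np.sqrt(2)   # diagonal variance 1/2, off-diagonal parts 1/4
    scaled.append(np.linalg.eigvalsh(H) / np.sqrt(2 * N))
scaled = np.concatenate(scaled)

hist, edges = np.histogram(scaled, bins=40, range=(-1, 1), density=True)
mids = (edges[:-1] + edges[1:]) / 2
semicircle = (2 / np.pi) * np.sqrt(1 - mids**2)
print(np.max(np.abs(hist - semicircle)))   # shrinks as N grows
```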
4 The probability that an interval (a, b) contains no eigenvalues

In this section we find an expression that describes the probability that an interval contains no eigenvalues of a random matrix. We first do this for finite $N$, and later take the limit as $N$ becomes large.

We first recall our definition of a Fredholm determinant. Given a bounded continuous integral linear operator with kernel $K(x, y)$ on $L^2(a, b)$, we defined the Fredholm determinant

$$\det(I - \lambda K) = \sum_{n=0}^{\infty} \frac{(-1)^n \lambda^n}{n!} \int_a^b \cdots \int_a^b \det\left(K(x_i, x_j)\right)\Big|_{i,j=1}^{n}\, dx_1 \cdots dx_n.$$

What is the probability of finding no eigenvalues in an interval $J = (a, b)$? It is exactly the same as finding all of the eigenvalues in the complement of $J$. Thus the probability that all $x_i$ are not in $(a, b)$ is given by

$$\int_{J^c} \cdots \int_{J^c} P_N(x_1, x_2, \ldots, x_N)\, dx_1 \cdots dx_N = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} P_N(x_1, x_2, \ldots, x_N)\, \chi_{J^c}(x_1) \cdots \chi_{J^c}(x_N)\, dx_1 \cdots dx_N.$$

But since $\chi_{J^c}(x) = 1 - \chi_J(x)$, the probability becomes

$$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} P_N(x_1, x_2, \ldots, x_N) \left(1 - \chi_J(x_1)\right) \cdots \left(1 - \chi_J(x_N)\right) dx_1 \cdots dx_N$$
$$= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} P_N(x_1, x_2, \ldots, x_N) \left[ 1 - \sum_{i=1}^{N} \chi_J(x_i) + \sum_{i < j} \chi_J(x_i)\chi_J(x_j) - \cdots + (-1)^N \prod_{i=1}^{N} \chi_J(x_i) \right] dx_1 \cdots dx_N$$
$$= 1 - N \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} P_N(x_1, x_2, \ldots, x_N)\, \chi_J(x_1)\, dx_1 \cdots dx_N + \frac{N!}{2!(N-2)!} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} P_N(x_1, x_2, \ldots, x_N)\, \chi_J(x_1)\chi_J(x_2)\, dx_1 \cdots dx_N - \cdots + (-1)^N \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} P_N(x_1, x_2, \ldots, x_N) \prod_{i=1}^{N} \chi_J(x_i)\, dx_1 \cdots dx_N$$
$$= 1 - \frac{1}{(N-1)!} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N}\, \chi_J(x_1)\, dx_1 \cdots dx_N + \frac{1}{2!(N-2)!} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N}\, \chi_J(x_1)\chi_J(x_2)\, dx_1 \cdots dx_N - \cdots + \frac{(-1)^N}{N!} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N} \prod_{i=1}^{N} \chi_J(x_i)\, dx_1 \cdots dx_N.$$

Using the formula for the $n$-point correlation functions this becomes

$$= 1 - \int_{-\infty}^{\infty} K_N(x_1, x_1)\, \chi_J(x_1)\, dx_1 + \frac{1}{2!} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{2}\, \chi_J(x_1)\chi_J(x_2)\, dx_1\, dx_2 - \cdots + \frac{(-1)^N}{N!} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N} \prod_{i=1}^{N} \chi_J(x_i)\, dx_1 \cdots dx_N$$

$$= 1 - \int_a^b K_N(x_1, x_1)\, dx_1 + \frac{1}{2!} \int_a^b \int_a^b \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{2}\, dx_1\, dx_2 - \cdots + \frac{(-1)^N}{N!} \int_a^b \cdots \int_a^b \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N}\, dx_1 \cdots dx_N.$$

Note that the probability looks just like a Fredholm determinant for the kernel $K_N$, except that this is a finite sum of integrals. The sum agrees with the first $N + 1$ terms of the determinant, and in fact the remaining terms in the Fredholm determinant are zero for this choice of $K_N$. For example, $K_1(x_1, x_2) = \varphi_0(x_1)\varphi_0(x_2)$ and

$$\begin{vmatrix} K_1(x_1, x_1) & K_1(x_1, x_2) \\ K_1(x_2, x_1) & K_1(x_2, x_2) \end{vmatrix} = \begin{vmatrix} \varphi_0(x_1)\varphi_0(x_1) & \varphi_0(x_1)\varphi_0(x_2) \\ \varphi_0(x_2)\varphi_0(x_1) & \varphi_0(x_2)\varphi_0(x_2) \end{vmatrix} = \varphi_0(x_1)\varphi_0(x_2) \begin{vmatrix} \varphi_0(x_1) & \varphi_0(x_1) \\ \varphi_0(x_2) & \varphi_0(x_2) \end{vmatrix} = 0.$$

In any such example where the number of rows in the determinant is greater than the number $N$ of functions $\varphi_k$, the columns can be factored so as to always have a repeated column. To further illustrate this, look at $\det\left(K_2(x_i, x_j)\right)\Big|_{i,j=1}^{3}$:

$$\det\left(K_2(x_i, x_j)\right)\Big|_{i,j=1}^{3} = \begin{vmatrix} \varphi_0(x_1)^2 + \varphi_1(x_1)^2 & \varphi_0(x_1)\varphi_0(x_2) + \varphi_1(x_1)\varphi_1(x_2) & \varphi_0(x_1)\varphi_0(x_3) + \varphi_1(x_1)\varphi_1(x_3) \\ \varphi_0(x_2)\varphi_0(x_1) + \varphi_1(x_2)\varphi_1(x_1) & \varphi_0(x_2)^2 + \varphi_1(x_2)^2 & \varphi_0(x_2)\varphi_0(x_3) + \varphi_1(x_2)\varphi_1(x_3) \\ \varphi_0(x_3)\varphi_0(x_1) + \varphi_1(x_3)\varphi_1(x_1) & \varphi_0(x_3)\varphi_0(x_2) + \varphi_1(x_3)\varphi_1(x_2) & \varphi_0(x_3)^2 + \varphi_1(x_3)^2 \end{vmatrix}$$

$$= \begin{vmatrix} \varphi_0(x_1)^2 & \varphi_0(x_1)\varphi_0(x_2) & \varphi_0(x_1)\varphi_0(x_3) \\ \varphi_0(x_2)\varphi_0(x_1) & \varphi_0(x_2)^2 & \varphi_0(x_2)\varphi_0(x_3) \\ \varphi_0(x_3)\varphi_0(x_1) & \varphi_0(x_3)\varphi_0(x_2) & \varphi_0(x_3)^2 \end{vmatrix} + \begin{vmatrix} \varphi_0(x_1)^2 & \varphi_0(x_1)\varphi_0(x_2) & \varphi_1(x_1)\varphi_1(x_3) \\ \varphi_0(x_2)\varphi_0(x_1) & \varphi_0(x_2)^2 & \varphi_1(x_2)\varphi_1(x_3) \\ \varphi_0(x_3)\varphi_0(x_1) & \varphi_0(x_3)\varphi_0(x_2) & \varphi_1(x_3)^2 \end{vmatrix} + \cdots$$

$$= \varphi_0(x_1)\varphi_0(x_2)\varphi_0(x_3) \begin{vmatrix} \varphi_0(x_1) & \varphi_0(x_1) & \varphi_0(x_1) \\ \varphi_0(x_2) & \varphi_0(x_2) & \varphi_0(x_2) \\ \varphi_0(x_3) & \varphi_0(x_3) & \varphi_0(x_3) \end{vmatrix} + \varphi_0(x_1)\varphi_0(x_2)\varphi_1(x_3) \begin{vmatrix} \varphi_0(x_1) & \varphi_0(x_1) & \varphi_1(x_1) \\ \varphi_0(x_2) & \varphi_0(x_2) & \varphi_1(x_2) \\ \varphi_0(x_3) & \varphi_0(x_3) & \varphi_1(x_3) \end{vmatrix} + \cdots = 0 + 0 + \cdots$$

To see this for general $N$, convince yourself that there are at most $N$ independent columns spanning the columns of the determinant, so if the size of the determinant is bigger than $N$ the determinant will be zero.

From all of this we can say that the probability of finding no eigenvalues in the interval $(a, b)$ is given by the Fredholm determinant $\det(I - K_N)$, where $K_N(x, y) = \sum_{k=0}^{N-1} \varphi_k(x)\varphi_k(y)$ is the kernel of our integral operator on $L^2(a, b)$. You should note that $\det(I - K_N) = \det(I - \lambda K_N)\big|_{\lambda=1}$.

Also note that if $K(x, y)$ is any function satisfying $\int K(x, x)\, dx = N$ and $\int K(x, y) K(y, z)\, dy = K(x, z)$, and if we define the probability distribution $P_N(x_1, \ldots, x_N) = \frac{1}{N!} \det\left(K(x_m, x_n)\right)\big|_{m,n=1}^{N}$, then the probability of finding no eigenvalues in $(a, b)$ is still $\det(I - K)$.
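Because $K_N$ has finite rank, this Fredholm determinant can be computed exactly as an ordinary $N \times N$ determinant: for a kernel $\sum_{k<N} \varphi_k(x)\varphi_k(y)$ acting on $L^2(a, b)$ one has $\det(I - K_N) = \det\left( \delta_{jk} - \int_a^b \varphi_j(x)\varphi_k(x)\, dx \right)$. A sketch of this computation (our own, using SciPy quadrature):

```python
import numpy as np
from scipy.special import eval_hermite, factorial
from scipy.integrate import quad

def phi(k, x):
    c_k = 1.0 / np.sqrt(2.0**k * factorial(k) * np.sqrt(np.pi))
    return c_k * np.exp(-x**2 / 2) * eval_hermite(k, x)

def prob_no_eigenvalues(N, a, b):
    """det(I - K_N) on L^2(a,b), reduced to det(I - G), G_jk = <phi_j, phi_k> on (a,b)."""
    G = np.empty((N, N))
    for j in range(N):
        for k in range(j, N):
            G[j, k] = G[k, j] = quad(lambda x: phi(j, x) * phi(k, x), a, b)[0]
    return np.linalg.det(np.eye(N) - G)

# Probability that a 4 x 4 GUE matrix has no eigenvalue in (-1/2, 1/2):
print(prob_no_eigenvalues(4, -0.5, 0.5))
```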
In particular, if $K$ were any sum of terms involving orthogonal polynomials, the probability is still described by the Fredholm determinant, with simply a different kernel.

The probability of finding exactly one eigenvalue in an interval $(a, b)$ is given by

$$-\frac{d}{d\lambda} \det(I - \lambda K_N)\Big|_{\lambda=1}.$$

This can be seen by the following computation:

$$\Pr\left(\text{exactly one eigenvalue in } J = (a, b)\right) = \int_J \int_{J^c} \cdots \int_{J^c} P_N(x_1, \ldots, x_N)\, dx_1 \cdots dx_N + \int_{J^c} \int_J \int_{J^c} \cdots \int_{J^c} P_N(x_1, \ldots, x_N)\, dx_1 \cdots dx_N + \cdots + \int_{J^c} \cdots \int_{J^c} \int_J P_N(x_1, \ldots, x_N)\, dx_1 \cdots dx_N = N \int_J \int_{J^c} \cdots \int_{J^c} P_N(x_1, \ldots, x_N)\, dx_1 \cdots dx_N,$$

since $P_N(x_1, \ldots, x_N)$ is symmetric with respect to any interchange of variables. Thus, we have

$$N \int_J \int_{J^c} \cdots \int_{J^c} P_N(x_1, \ldots, x_N)\, dx_1 \cdots dx_N = N \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} P_N(x_1, \ldots, x_N)\, \chi_J(x_1) \left(1 - \chi_J(x_2)\right) \cdots \left(1 - \chi_J(x_N)\right) dx_1 \cdots dx_N$$
$$= N \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} P_N(x_1, \ldots, x_N)\, \chi_J(x_1) \left[ 1 - \sum_{i=2}^{N} \chi_J(x_i) + \sum_{2 \le i < j} \chi_J(x_i)\chi_J(x_j) - \cdots + (-1)^{N-1} \prod_{i=2}^{N} \chi_J(x_i) \right] dx_1 \cdots dx_N$$
$$= \int_a^b K_N(x_1, x_1)\, dx_1 - \frac{N(N-1)(N-2)!}{N!} \int_a^b \int_a^b \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{2}\, dx_1\, dx_2 + \cdots + (-1)^{N-1} N \frac{1}{N!} \int_a^b \cdots \int_a^b \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N}\, dx_1 \cdots dx_N$$
$$= \int_a^b K_N(x_1, x_1)\, dx_1 - \int_a^b \int_a^b \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{2}\, dx_1\, dx_2 + \cdots + (-1)^{N-1} \frac{N}{N!} \int_a^b \cdots \int_a^b \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{N}\, dx_1 \cdots dx_N = -\frac{d}{d\lambda} \det(I - \lambda K_N)\Big|_{\lambda=1}.$$

In general, the probability of finding exactly $k$ eigenvalues in $(a, b)$ will be given by the $k$th derivative of the Fredholm determinant evaluated at $\lambda = 1$ (up to the factor $(-1)^k / k!$).

The semi-circle law tells us that

$$K_N(x, x) \sim \begin{cases} \frac{1}{\pi}\sqrt{2N - x^2}, & |x| < \sqrt{2N} \\ 0, & |x| > \sqrt{2N}, \end{cases}$$

implying that $K_N(x, x) \sim \frac{\sqrt{2N}}{\pi}$ for $|x| \ll \sqrt{2N}$. So we would, at least heuristically, expect that in any bounded interval near zero the number of eigenvalues would become very large as $N$ tends to infinity. However, if one looks at an interval of length about $1/\sqrt{2N}$, then one would hope that individual eigenvalues could be detected. So we are going to apply our above computation to a rescaled interval.

Instead of considering the fixed interval $J$ we will consider the interval $\left( \frac{a}{\sqrt{2N}}, \frac{b}{\sqrt{2N}} \right)$. The Fredholm determinant then becomes $\det(I - K_N)$, which is a sum whose $k$th term is given by

$$\frac{(-1)^k}{k!} \int_{a/\sqrt{2N}}^{b/\sqrt{2N}} \cdots \int_{a/\sqrt{2N}}^{b/\sqrt{2N}} \det\left(K_N(x_i, x_j)\right)\Big|_{i,j=1}^{k}\, dx_1 \cdots dx_k.$$

The change of variables $x_i \to \frac{x_i}{\sqrt{2N}}$ is introduced in the integral, obtaining

$$\frac{(-1)^k}{k!} \int_a^b \cdots \int_a^b \det\left( K_N\!\left( \frac{x_i}{\sqrt{2N}}, \frac{x_j}{\sqrt{2N}} \right) \frac{1}{\sqrt{2N}} \right)\Bigg|_{i,j=1}^{k}\, dx_1 \cdots dx_k.$$

Notice that the bounds of integration are back to $J$. What follows is an analysis of the behavior of $K_N$ as $N$ becomes large. We will show

$$\lim_{N \to \infty} \frac{1}{\sqrt{2N}}\, K_N\!\left( \frac{x}{\sqrt{2N}}, \frac{y}{\sqrt{2N}} \right) = \frac{\sin(x - y)}{\pi(x - y)}.$$

This last kernel is called the sine kernel.
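Before deriving this limit, here is a numerical check (our own sketch). To evaluate $K_N$ for large $N$ without overflow, the normalized Hermite functions $\varphi_k$ are generated with their three-term recurrence $\varphi_{k+1}(x) = \sqrt{2/(k+1)}\, x\, \varphi_k(x) - \sqrt{k/(k+1)}\, \varphi_{k-1}(x)$.

```python
import numpy as np

def phis(N, x):
    """Return [phi_0(x), ..., phi_{N-1}(x)] via the stable recurrence."""
    p = np.empty(N)
    p[0] = np.pi**-0.25 * np.exp(-x**2 / 2)
    if N > 1:
        p[1] = np.sqrt(2.0) * x * p[0]
    for k in range(1, N - 1):
        p[k + 1] = np.sqrt(2.0 / (k + 1)) * x * p[k] - np.sqrt(k / (k + 1)) * p[k - 1]
    return p

def K_N(N, x, y):
    return phis(N, x) @ phis(N, y)

N, x, y = 400, 0.7, -0.3
scaled = K_N(N, x / np.sqrt(2 * N), y / np.sqrt(2 * N)) / np.sqrt(2 * N)
sine = np.sin(x - y) / (np.pi * (x - y))
print(scaled, sine)   # close, and closer still as N increases
```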
To see this we use the following asymptotic formula:

$$e^{-\frac{x^2}{2}} H_n(x) = \lambda_n \cos\left( \sqrt{2n+1}\, x - \frac{n\pi}{2} \right) + O\!\left( \frac{1}{n^{1/2}} \right),$$

where

$$\lambda_n = \frac{\Gamma(n+1)}{\Gamma\left( \frac{n}{2} + 1 \right)}$$

for $n$ even, and

$$\lambda_n = \frac{\Gamma(n+2)}{\Gamma\left( \frac{n}{2} + \frac{3}{2} \right)}\, (2n+1)^{-1/2}$$

for $n$ odd. We begin with the expression for $K_N$ and substitute $\frac{x}{\sqrt{2N}}$ and $\frac{y}{\sqrt{2N}}$ for $x$ and $y$, respectively. Using the Christoffel–Darboux formula we have

$$K_N(x, y) = \frac{e^{-\frac{x^2}{2}}\, e^{-\frac{y^2}{2}}}{2^N (N-1)! \sqrt{\pi}} \cdot \frac{H_N(x) H_{N-1}(y) - H_N(y) H_{N-1}(x)}{x - y}.$$

So thus (assuming $N$ is even),

$$K_N'(x, y) = \frac{1}{\sqrt{2N}}\, K_N\!\left( \frac{x}{\sqrt{2N}}, \frac{y}{\sqrt{2N}} \right) = \frac{e^{-\frac{x^2}{4N}}\, e^{-\frac{y^2}{4N}}}{2^N (N-1)! \sqrt{\pi}\, (x - y)} \left( H_N\!\left( \frac{x}{\sqrt{2N}} \right) H_{N-1}\!\left( \frac{y}{\sqrt{2N}} \right) - H_N\!\left( \frac{y}{\sqrt{2N}} \right) H_{N-1}\!\left( \frac{x}{\sqrt{2N}} \right) \right)$$

$$\sim \frac{(2N-1)^{-\frac{1}{2}}\, \Gamma(N+1)\, \Gamma(N+1)}{2^N (N-1)! \sqrt{\pi}\, (x - y)\, \Gamma\left( \frac{N}{2} + 1 \right) \Gamma\left( \frac{N}{2} + 1 \right)} \times \left[ \cos\left( \sqrt{\frac{2N+1}{2N}}\, x - \frac{N\pi}{2} \right) \cos\left( \sqrt{\frac{2N-1}{2N}}\, y - \frac{(N-1)\pi}{2} \right) - \cos\left( \sqrt{\frac{2N+1}{2N}}\, y - \frac{N\pi}{2} \right) \cos\left( \sqrt{\frac{2N-1}{2N}}\, x - \frac{(N-1)\pi}{2} \right) + O\!\left( \frac{1}{\sqrt{N}} \right) \right].$$

Now we use Stirling's formula and take the limit as $N$ goes to infinity. The details are left to the reader. The final result is

$$\lim_{N \to \infty} K_N' = \frac{\sin(x - y)}{\pi(x - y)}.$$

From now on the sine kernel will be denoted by simply $K$ or $K(x, y)$.

5 A closer look at the Fredholm determinant

Our goal in the following pages is to get some information about $\det(I - \lambda K)$, where $K$ is the sine kernel and we think of the operator as defined on $L^2((-s, s))$. In other words, we want information about the probability of finding no eigenvalues in the interval $(-s, s)$. It is useful not to try to compute the determinant directly, but rather the log of the determinant. This is because we have the formula

$$\log \det(I - \lambda K) = \operatorname{trace} \log(I - \lambda K)$$

at our disposal. We will eventually derive a (nonlinear) differential equation that has a connection to the above function (thought of as a function of $s$).

To get started we recall some operator theory. Suppose $\{K(s)\}$ is a family of operators such that $\|K(s)\| < 1$, $K(s)$ is trace class, and $K'(s)$ is defined. Then

$$\frac{d}{ds} \log\left( \det(I - K(s)) \right) = -\operatorname{trace}\left( (I - K(s))^{-1} K'(s) \right).$$

To prove this,

$$\frac{d}{ds} \log\left( \det(I - K(s)) \right) = \frac{d}{ds} \operatorname{trace}\left( \log(I - K(s)) \right) = \operatorname{trace}\left( \frac{d}{ds} \log(I - K(s)) \right) = \operatorname{trace}\left( \frac{d}{ds} \sum_{n=1}^{\infty} \frac{-(K(s))^n}{n} \right)$$
$$= -\sum_{n=1}^{\infty} \operatorname{trace}\left( \frac{d}{ds} \frac{(K(s))^n}{n} \right) = -\sum_{n=1}^{\infty} \operatorname{trace}\left( (K(s))^{n-1} K'(s) \right) = -\operatorname{trace}\left( (I - K(s))^{-1} K'(s) \right),$$

since $(I - K(s))^{-1} = \sum (K(s))^n$ is just a geometric series. Note the above uses the fact that

$$\sum_{n=1}^{\infty} \operatorname{trace}\left( \frac{d}{ds} \frac{(K(s))^n}{n} \right) = \sum_{n=1}^{\infty} \operatorname{trace}\left( (K(s))^{n-1} K'(s) \right).$$

Now let $J$ be the interval $(-s, s)$ and consider the operator $K(s)$ with kernel $K(x, y)\chi_J(y)$, where $K(x, y)$ is the sine kernel. This operator sends $f$ to the function $\int_{-s}^{s} K(x, y) f(y)\, dy$.

To find $K'(s)$ we use the Fundamental Theorem of Calculus. The operator $K'(s)$ sends $f$ to

$$K(x, s) f(s) + K(x, -s) f(-s).$$

Note that the image of $f$ is finite rank and spanned by the two functions $K(x, s)$ and $K(x, -s)$. However, this operator does not make sense for all functions in $L^2$, but only for functions that are continuous. We will ignore this fact for the time being.

Define the operator $\delta_0$ by $\delta_0(f) = f(0)$. This is called the Dirac operator, or point evaluation, on the linear space of continuous functions. One often sees the Dirac operator written as

$$\delta_0(f) = \int \delta_0(x) f(x)\, dx = f(0).$$

This is because one can take a sequence of functions $\{\delta_n(x)\}$ that tend to infinity at zero and have support that tends to zero, such that

$$\lim_{n \to \infty} \int \delta_n(x) f(x)\, dx = f(0).$$

Thus, thinking of the symbol $\delta_0(x)$ as a function that captures the value of a function $f$ at the point where the argument of the delta function is zero, we use $\delta_0(x - y)$ as the kernel of the identity operator, although this is not a function in the ordinary sense. Thus

$$\int \delta_0(x - y) f(y)\, dy = f(x).$$

We can now write the kernel of $K'(s)$ as

$$K(x, y)\,\delta_0(y - s) + K(x, y)\,\delta_0(y + s).$$

To summarize,

$$K'(s) f = \int_{-\infty}^{\infty} \left[ K(x, y)\,\delta_0(y - s) + K(x, y)\,\delta_0(y + s) \right] f(y)\, dy = \int_{-\infty}^{\infty} \left[ K(x, y)\,\delta_0(y - s) f(y) + K(x, y)\,\delta_0(y + s) f(y) \right] dy = K(x, s) f(s) + K(x, -s) f(-s),$$

as before. We are happy to see $K'(s)$ acting as an integral operator just as before, however now acting on continuous functions.

Next we consider the kernel representation of $(I - K(s))^{-1}$, and since we have one for $K'(s)$ we will then be able to find the kernel of $(I - K(s))^{-1} K'(s)$.

Recall that $(I - K(s))^{-1} = \sum_{n=0}^{\infty} K^n$ whenever $\|K\| < 1$. Thus $(I - K(s))^{-1} = I + \sum_{n=1}^{\infty} K^n$. Now $K^n$ has kernel

$$\int_{-s}^{s} \cdots \int_{-s}^{s} K(x, x_1) \cdots K(x_{n-1}, y)\, dx_1 \cdots dx_{n-1}.$$

Thus $\sum_{n=1}^{\infty} K^n$ has a very complicated kernel, which we will call $R(x, y)$. The operator $(I - K)^{-1}$ has kernel

$$\rho(x, y) = \delta_0(x - y) + R(x, y).$$

Note that $(I - K)^{-1} = I + (I - K)^{-1} K$, since

$$\left( I + (I - K)^{-1} K \right)(I - K) = (I - K)\left( I + (I - K)^{-1} K \right) = I.$$

This means that $(I - K)^{-1} K$ has kernel $R(x, y)$, and thus that $R$ composed with $K$ is $R - K$. When $(I - K)^{-1} K'(s)$ operates on $f$ the result is

$$\int_{-s}^{s} \rho(x, y) \left[ K(y, s) f(s) + K(y, -s) f(-s) \right] dy,$$

which can be written as

$$R(x, s) f(s) + R(x, -s) f(-s).$$

In terms of kernels we see that this operator has kernel

$$R(x, s)\,\delta(y - s) + R(x, -s)\,\delta(y + s).$$

This has rank two, just like the image of $K'(s)$. We can also easily compute its trace to see that it is

$$R(s, s) + R(-s, -s).$$

We should point out here that this last computation tells us that we need to know $R(s, s)$ and $R(-s, -s)$. These are the important numbers, and much of what is done in the next few pages is to find a way to describe these quantities.

To summarize the list of kernels:

$$\begin{array}{ll} \text{operator} & \text{kernel} \\ K'(s) & K(x, s)\,\delta(y - s) + K(x, -s)\,\delta(y + s) \\ (I - K)^{-1} & \rho(x, y) = \delta(x - y) + R(x, y) \\ (I - K)^{-1} K'(s) & R(x, s)\,\delta(y - s) + R(x, -s)\,\delta(y + s) \end{array}$$
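These quantities are easy to approximate numerically, which gives a concrete check of the formulas just derived. The sketch below is our own; it uses Gauss–Legendre quadrature (the Nyström method) to discretize the sine kernel on $(-s, s)$, and compares a finite difference of $\log \det(I - K(s))$ with $-(R(s, s) + R(-s, -s))$, where $R$ is computed from the resolvent equation $R = K + KR$.

```python
import numpy as np

def sine_k(x, y):
    d = np.subtract.outer(np.atleast_1d(x), np.atleast_1d(y))
    return np.sinc(d / np.pi) / np.pi      # np.sinc(t) = sin(pi t)/(pi t)

def nodes(s, m=80):
    t, w = np.polynomial.legendre.leggauss(m)
    return s * t, s * w                    # quadrature nodes/weights on (-s, s)

def log_det(s, m=80):
    x, w = nodes(s, m)
    sw = np.sqrt(w)
    M = sw[:, None] * sine_k(x, x) * sw[None, :]
    return np.linalg.slogdet(np.eye(m) - M)[1]

def R_diag(pt, s, m=80):
    """R(pt, pt) for the resolvent kernel R = K + K R on (-s, s)."""
    x, w = nodes(s, m)
    col = np.linalg.solve(np.eye(m) - sine_k(x, x) * w, sine_k(x, pt)[:, 0])
    return sine_k(pt, pt)[0, 0] + (sine_k(pt, x)[0] * w) @ col

s, h = 1.0, 1e-5
lhs = (log_det(s + h) - log_det(s - h)) / (2 * h)
rhs = -(R_diag(s, s) + R_diag(-s, s))
print(lhs, rhs)   # the two agree to high accuracy
```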
We now compute the kernels of some additional operators.

Lemma 3. The operator $\frac{d}{ds}(I - K)^{-1}$ has kernel

$$R(x, s)\,\rho(s, y) + R(x, -s)\,\rho(-s, y) = M(x, y).$$

Proof: From the homework, $\frac{d}{ds}(I - K)^{-1} = (I - K)^{-1} \frac{dK}{ds} (I - K)^{-1}$. The right-hand side has kernel

$$\iint \rho(x, y) \left\{ K(y, s)\,\delta(z - s) + K(y, -s)\,\delta(z + s) \right\} \rho(z, u)\, dy\, dz = R(x, s)\,\rho(s, u) + R(x, -s)\,\rho(-s, u),$$

as desired.

Next we introduce an important lemma about commutators. Recall the commutator of two operators $A$ and $B$ is simply $AB - BA$ and is symbolized by $[A, B]$. Define the operator $D$ by $D : C^1(a, b) \to C^0(a, b)$ and $Df = \frac{df}{dx}$.

Lemma 4. The operator $[D, (I - K)^{-1}]$ has kernel

$$-R(x, s)\,\rho(s, y) + R(x, -s)\,\rho(-s, y).$$

Proof: The commutator satisfies $[D, (I - K)^{-1}] = (I - K)^{-1} [D, K] (I - K)^{-1}$. This is just algebra. Now $[D, K] = DK - KD$. $DK$ has kernel $\frac{\partial K(x, y)}{\partial x}$, but $KD$ is more complicated:

$$KDf = \int_{-s}^{s} K(x, y) f'(y)\, dy = K(x, y) f(y)\Big|_{y=-s}^{y=s} - \int_{-s}^{s} \frac{\partial K(x, y)}{\partial y} f(y)\, dy,$$

so $KD$ has kernel

$$K(x, s)\,\delta(y - s) - K(x, -s)\,\delta(y + s) - \frac{\partial K(x, y)}{\partial y}.$$

Since $K(x, y)$ is a function of $x - y$, $\frac{\partial K}{\partial x} = -\frac{\partial K}{\partial y}$, and thus the kernel of $[D, K]$ is

$$-K(x, s)\,\delta(y - s) + K(x, -s)\,\delta(y + s).$$

To find the kernel of $[D, (I - K)^{-1}]$ we have

$$\iint \rho(x, y) \left[ -K(y, s)\,\delta(z - s) + K(y, -s)\,\delta(z + s) \right] \rho(z, u)\, dy\, dz = \int \left( -R(x, s)\,\delta(z - s)\,\rho(z, u) + R(x, -s)\,\delta(z + s)\,\rho(z, u) \right) dz = -R(x, s)\,\rho(s, u) + R(x, -s)\,\rho(-s, u).$$

We will consider a slightly more general problem in what follows, that is, the probability of finding no eigenvalues in a finite union of intervals. We can think of $K$ as acting on $L^2(I)$, where $I$ is the set $I = (a_1, a_2) \cup (a_3, a_4) \cup \cdots \cup (a_{n-1}, a_n)$. Also notice that

$$\lambda K(x, y) = \frac{\lambda \sin(x - y)}{\pi(x - y)} = \frac{A(x) A'(y) - A(y) A'(x)}{x - y},$$

where $A(x) = \sqrt{\frac{\lambda}{\pi}} \sin x$. We define functions

$$Q(x, \hat{a}) = \left( (I - K)^{-1} A \right)(x), \qquad P(x, \hat{a}) = \left( (I - K)^{-1} A' \right)(x),$$

where $\hat{a}$ is a vector containing each $a_i$. Define the operator $M_x$ (multiplication by $x$) by $M_x f = x f$.

Lemma 5. The operator $\left[ M_x, (I - K)^{-1} \right]$ has kernel

$$Q(x, \hat{a}) \left( (I - K^t)^{-1} A' \right)(y) - P(x, \hat{a}) \left( (I - K^t)^{-1} A \right)(y),$$

where $K^t$ has kernel $K(y, x)$.

Proof: $\left[ M_x, (I - K)^{-1} \right] = (I - K)^{-1} [M_x, K] (I - K)^{-1}$. Now $[M_x, K]$ has kernel