Random Matrices

Estelle Basor

Mathematics Department California Polytechnic State University San Luis Obispo, 93407 Winter 1995 and Spring 2007

1 Contents

2 1 Introduction

These notes are based on a series of lectures given by Estelle Basor in an applied analysis graduate course at Cal Poly first done in Winter of 1996. They were based on notes taken by the students in the class, typed by Mike Robertson, and then finally edited by Estelle Basor and Jon Jacobsen. The notes were again used in Spring of 2007 and corrections to the notes were provided by the students in the class and some other sections were added.

These notes are not meant to be complete or perfect, but rather an informal set of notes describing a new and lively field in mathematics. It is the intention of the author that they provide a heuristic background so that one can begin to understand random matrices and consider future questions in the subject. The notes provide an introduction to the basics of the Gaussian Unitary Ensemble (GUE), a derivation of the semi-circle law, a derivation of the Painlev´eequation connected to the basic probabilites of the GUE and at the end a derivation of the distribution function for linear statistics.

Many of the tools used are provided in the text, while others are missing. In particular, the asymptotics of the relevant orthogonal polynomials are used but not derived. Several properties of Hilbert spaces, operators, operators are used and not proved.

3 4 2 The probability distribution for the eigenvalues of a random hermitian matrix

Given a random Hermitian matrix what can we say about its eigenvalues? Some questions include: What is the largest one? How are they spaced? What is the probability that one does (or does not) lie in a given interval? This section will describe a probability distribution on the space of matrices and show how this distribution induces one on the space of eigenvalues. To begin, we consider a general 2 × 2 Hermitian matrix

" (0) (1) # x11 x12 + ix12 H = (0) (1) . x12 − ix12 x22

(0) (1) The matrix H has four free parameters, namely x11,x12 , x12 , x22. Thus we may identify the space of 2 × 2 Hermitian matrices with R4. That is to say, if

4  (0) (1)  A ⊆ R , then H ∈ A iff x11,x12 , x12 , x22 ∈ A.

It is not hard to see that in a more general situation an n × n Hermitian matrix will have n2 free parameters and we may similarly identify the space of n × n Hermitian matrices with Rn2 . Given A ⊆ Rn2 , we seek the probability that H ∈ A and also information about the eigenvalues of H.

Definition 1. Let H be a Hermitian matrix. We define the probability density P (H) by

−trace(H2) P (H) = cne .

This means that the probability of a random n × n Hermitian matrix occurring in A ⊆ Rn2 is Z Z n Y Y (0) Y (1) cn ... P (H) dxii dxij dxij . (1) A i=1 1≤i

„ “ ”2 “ ”2 « ZZZZ 2 (0) (1) 2 − x11+2 x12 +2 x12 +x22 (0) (1) c2 e dx11dx22dx12 dx12 . (2) A As it stands this contains too much information to be useful. We need to scale our search down to specifically the eigenvalues of H. Our goal is to apply a change of variables to (??) and let the constant absorb some information leaving us with a finer look at the eigenvalues themselves. To do this we will need some general facts.

5 1. The probability given in (??) is independent with respect to any change of basis. Proof: Let H0 = M −1HM for some orthogonal matrix M. Then (H0)2 = M −1HMM −1HM = M −1H2M, and tr(H02) = tr M −1H2M = tr M −1MH2 = tr H2 . So P (H0) = P (H). Thus P (H) is left unchanged by a change in basis, in particular, a unitary change of basis.

2. Every Hermitian matrix is unitarily diagonalizable, that is if H is Hermitian then H may be written as the product H = U −1DU where D is diagonal and U is unitary (U −1 = U ∗).

3. The eigenvalues of a Hermitian matrix are real. Proof: Using the inner product, notice hv, λvi = λ¯ ||v||2 and hv, λvi = hv, Hvi = hHv, vi = hλv, vi = λ ||v||2. Hence λ = λ¯ and is therefore real.

Thus D is a real diagonal matrix containing the eigenvalues of H.

Our goal is to apply a change of variables to (??) corresponding to H ↔ U −1DU. For clarity we will restrict our attention to the 2 × 2 case first and then address the n × n case. A problem arises though that in general the U −1DU representation is not unique as the eigenvalues can be interchanged and there are many choices for the eigenvectors forming U. We need a method of selecting a particular U and D so that they are unique. Since the entries of D are real we may simply order them specifying  θ 0  D = 1 0 θ2 where θ1 ≤ θ2. Since H has four free variables and D has only two, U must depend on two   free parameters. Let U consist of two column vectors U = v1 v2 . We may choose a so that  a  √ 0 ≤ a ≤ 1 and v = . Since U is unitary ||v || = 1 thus b = 1 − a2. And so 1 beiα 1

 a  v = √ . 1 1 − a2eiα

Now since hv1, v2i = 0 and ||v2|| = 1, v2 is completely determined if we also insist that its first  c  non-zero entry is positive. Suppose v = with c ≥ 0. Since ||v || = 1 we have as above 2 deiβ 2  c  √ √ v = √ . Since hv , v i = 0 we have ac + 1 − a2 1 − c2ei(α−β) = 0, 2 1 − c2eiβ 1 2

−ac ei(α−β) = . p(1 − a2) (1 − c2)

Thus 1) ei(α−β) ∈ R

−ac 2) √ = 1. (1−a2)(1−c2)

6 Since a ≥ 0, c ≥ 0 it must be that −ac = −1 p(1 − a2) (1 − c2) √ c 1 − a2 √ = . 1 − c2 a

1−a2 Thus our choice of a determines c. If we let a2 = k then,

c2 = 1 − c2 k k c2 = 1 + k r k c = . 1 + k

Furthermore if ei(α−β) = −1 then given α we can determine β.

Thus we have shown that a, α, θ1, and θ2 completely determine a unique representation for H as H = U −1DU with U and D specified as above. Conversely given a U and D we may find H by multiplication. We have established a one to one correspondence between Hermitian matrices and products of the form U −1DU and can use this correspondence to perform a change of variables

 (0) (1)  f x11,x12 , x12 , x22 = (θ1,θ2, p1, p2) with θ1 ≤ θ2, p1 ∈ [0, 1] , p2 ∈ [0, 2π) .

Now for the general case we cannot be as explicit, but we can say that H = U −1DU where D is diagonal with diagonal elements θ1, . . . , θn satisfying θ1 ≤ θ2 ≤ ... ≤ θn and U unitary with column vectors of length one and with first non-zero coordinate positive. Notice that if u1 is the first column vector of U it has 2n − 2 real free parameters and that u2 has 2n − 4 free parameters and so on so that the variables in U account for n2 − n parameters. We thus have a change of variables given by some function f so that

 (1)  f x11, . . . , x(n−1)n = (θ1, . . . , θn, p1, . . . , pn2−n) .

Using this change of variables we need to transform the integral:

Z Z (1) cn ... P (H) dx11 . . . dx(n−1)n (3) A into one with the new variables: Z Z cn ... P (H) |J (θi, pl)| dθ1 . . . dpn2−n. f(A)

7 Here the expression |J (θi, pl)| stands for the absolute value of the determinant of the Jacobian matrix.

Notice that the P (H) term is easy to convert since

−traceH2 −trace(U ∗DU) −traceD2 − Pn θ2 P (H) = e = e = e = e i=1 1 .

Before we begin the Jacobian computation we notice the following:

1. Recall that if A is a matrix then ∂A ∂a  = ij . ∂y ij ∂y ij (This means simply that the partial derivative of a matrix is defined to be the partials of all its entries.)

∂AB ∂A ∂B 2. If A and B are matrices then the product rule ∂y = ∂y B + A ∂y holds true.

∗ ∗ 3. Since UU ∗ = I we have that ∂UU = 0 or by the product rule ∂U U ∗ + U ∂U = 0 or ∂pi ∂pi ∂pi ∂U ∂U ∗ U ∗ = −U . ∂pi ∂pi Similarly since U ∗U = I ∂U ∗ ∂U U = −U ∗ . ∂pi ∂pi

∗ Define Si = U ∂U and note that since U is independent of the θ variables, Si is also. Now letting ∂pi H equal U ∗DU we find that ∂H ∂U ∗DU ∂U ∗ ∂DU = = DU + U ∗ ∂pi ∂pi ∂pi ∂pi ∂U ∗ ∂D ∂U = DU + U ∗ U + U ∗D ∂pi ∂pi ∂pi ∂U ∗ ∂U  ∂D  = DU + U ∗D since = 0 . ∂pi ∂pi ∂pi

This implies ∂H ∂U ∗ ∂U U U ∗ = U D + D U ∗ = SiD − DSi. ∂pi ∂pi ∂pi

Since D is diagonal the lk entry of the matrix SiD − DSi is given by

i Slk(θk − θl)

i i where Slk is the lk entry of the S matrix. This matrix remember does not depend on the θis.

8 Also ∂H ∂ (U ∗DU) ∂U ∗ ∂D ∂U = = DU + U ∗ U + U ∗D ∂θi ∂θi ∂θi ∂θi ∂θi ∗ ∂D 0  = U U Since U depends only on pis ∂θi and hence U ∂H U ∗ has lk entry δ δ . To sum up we have two crucial formulas ∂θi lk ki

1. (U ∂H U ∗) = Si (θ − θ ) ∂pi lk lk k l 2. (U ∂H U ∗) = δ δ . ∂θi lk lk ki

Our next step is to write expressions for these same matrices using only the formula for matrix multiplication. In terms of components U ∂H U ∗ has lk entry ∂θi

n X ∂H  U U ∗ lj ∂θ j=1 i jk

X ∂H  = U U ∗ lj ∂θ mk j,m i jm X ∂H  = U U ∗ . ∂θ lj mk j,m i jm

Note that this last summation can be written as     X ∂H ∗ X ∂H ∗ UljUjk + UljUmk ∂θi ∂θi j jj j6=m jm and this is the same as

  ∂(x(0) + ix(1) )! X ∂xjj ∗ X jm jm ∗ UljUjk + UljUmk. ∂θi ∂θi j j6=m

But this is also   (0) (1) ! (0) (1) ! X ∂xjj X ∂(xjm + ixjm) X ∂(xjm − ixjm) U U ∗ + U U ∗ + U U ∗ , ∂θ lj jk ∂θ lj mk ∂θ lm lk j i j

  (0) ! (1) ! X ∂xjj X ∂xjm X ∂xjm = U U ∗ + (U U ∗ + U U ∗ ) + (iU U ∗ − iU U ∗ ). ∂θ lj jk ∂θ lj mk lm jk ∂θ lj mk lm jk j i j

9 The lk component of U ∂H U ∗ is exactly the above same expression except with the partial with ∂pi respect to the θi replaced with the partial with respect to the pi.

Next think carefully about the Jacobian. It is a n2 × n2 matrix. We are going to construct it by (0) (1) placing the variables across, first with the xii terms, them the xij terms and then finally the xij terms. This will correspond to n2 columns. We are going to run the partial derivatives down the rows, first the θ partials followed by the pi partials and these last in any order.

2 2 We now multiply J (θi, pi) by a matrix C where C is constructed as follows. It is also n ×n and has ∗ columns indexed by lk. The lk column will be one where first we have the terms (in rows) UljUjk ∗ ∗ n2−n ∗ ∗ (n of these), followed by UljUmk + UlmUjk ( 2 of these) and then finally iUljUmk − iUlmUjk. In block form this is

 ∗  UljUjk ∗ ∗ Clk =  UljUmk + UlmUjk  , ∗ ∗ iUljUmk − iUlmUjk. We are also careful so that we arrange the columns so that the first n columns are where l = k.

In the 2 × 2 case everything looks like this.

 (0) (1)  ∂x11 ∂x22 ∂x12 ∂x12 ∂θ1 ∂θ1 ∂θ1 ∂θ1  (0) (1)   ∂x11 ∂x22 ∂x12 ∂x12   ∂θ2 ∂θ2 ∂θ2 ∂θ2    JC =  (0) (1)  C11 C22 C12 C21  ∂x11 ∂x22 ∂x12 ∂x12   ∂p1 ∂p1 ∂p1 ∂p1   (0) (1)  ∂x11 ∂x22 ∂x12 ∂x12 ∂p2 ∂p2 ∂p2 ∂p2           U ∂H U ∗ U ∂H U ∗ U ∂H U ∗ U ∂H U ∗ ∂θ1 ∂θ1 ∂θ1 ∂θ1   11  22  12  21   U ∂H U ∗ U ∂H U ∗ U ∂H U ∗ U ∂H U ∗   ∂θ2 ∂θ2 ∂θ2 ∂θ2  =   11  22  12  21   U ∂H U ∗ U ∂H U ∗ U ∂H U ∗ U ∂H U ∗   ∂p1 ∂p1 ∂p1 ∂p1    11  22  12  21  U ∂H U ∗ U ∂H U ∗ U ∂H U ∗ U ∂H U ∗ ∂p2 11 ∂p2 22 ∂p2 12 ∂p2 21

  δ11δ11 δ22δ21 δ12δ11 δ21δ21  δ11δ12 δ22δ22 δ12δ12 δ21δ22  =  1 1   0 0 S12 (θ2 − θ1) S21 (θ1 − θ2)  2 2 0 0 S12 (θ2 − θ1) S12 (θ1 − θ2)  1 0 0 0   0 1 0 0  =  1 1  .  0 0 S12 (θ2 − θ1) S21 (θ1 − θ2)  2 2 0 0 S12 (θ2 − θ1) S21 (θ1 − θ2) Taking the determinant of both sides we have 1 1 S12 (θ1 − θ2) S21 (θ2 − θ1) |J| |C| = 2 2 S12 (θ1 − θ2) S21 (θ2 − θ1)

10 1 1 2 S12 S21 = (θ1 − θ2) 2 2 . S12 S21

1 1 S12 S21 2 Since C and 2 2 depend only on p1 and p2 we may write | det (J) | = (θ1 − θ2) g (p1, p2) S12 S21 where

S1 S1 12 21 S2 S2 g (p , p ) = 12 21 1 2 det (C) and (3) becomes ZZZZ −θ2−θ2 2 c2 e 1 2 (θ1 − θ2) g (p1, p2) dθ1dθ2dp1dp2. f(A)

In the general case, JC has a certain block form   In A BD where In is just the identity matrix A and B are the zero matrices and the matrix D is one whose i lk columns are of have entries Slk(θk − θl). If we take the determinant of JC now we have

Y 2 (θj − θk) × h(p1, . . . , pn2−n) j

We now integrate with respect to all the pis variables and we see we have an integral of the form Z Z Pn 2 Y 0 − i=1 θi 2 cn ... e (θl − θk) dθ1 . . . dθn. B l

Theorem 1 Theorem 1 the Gaussian Unitary Ensemble induces a probability density on the set of eigenvalues given by Pn 2 Y 0 − i=1 θi 2 PN (θ1, . . . , θn) = cne (θl − θk) l

The probability of finding a set of eigenvalues within a given set A ⊂ Rn is given by the integral

Z Z 2 2 2 0 −(θ1+θ2+···+θn) Y 2 cn ... e (θi − θj) dθ1dθ2 ··· dθn A j

11 From now on we want to think of n as very large and so we will replace n with N and because of soon to be described connection with orthogonal polynomials and integral operators we will use xi notation for the eigenvalue variables.

Our next goal will be to prove that for the N × N case that 1 P (x , x , . . . x ) = det (K (x , x ))|N N 1 2 N N! N i j i,j=1 PN−1 where KN (x, y) = k=0 ϕk (x) ϕk (y) with ϕk (x) the normalized Hermite function, ϕk (x) = x2 − 2 R cke Hk (x) so that ϕk1 (x) ϕk2 (x) dx = δk1k2 . and cN is the proper normalizing constant.

We have that −(x2+x2+...+x2 ) Y 2 PN (x1, x2, . . . , xN ) = cN e 1 2 N (xi − xj) , i

Starting from this equation, consider the following steps.

Step1

We know that  1 1 ··· 1  Y  x1 x2 ··· xN  (xi − xj) = det  . . . .  .  . . .. .  j

1 1 For example the 2×2 case yields (x2 − x1) = . To prove this in general let p (x1, x2, . . . , xN ) = x1 x2 Q i

 1 1 ··· 1   x1 x2 ··· xN    q (x1, x2, . . . , xN ) = det  . . .. .  .  . . . .  N−1 N−1 N−1 x1 x2 ··· xN

If xi = xj for any i 6= j then it follows that p (x1, x2, . . . , xN ) = 0 and q (x1, x2, . . . , xN ) = 0 as two of the columns in the determinant will be identical. This implies that (xi − xj) is a factor of q (x1, x2, . . . , xN ) for i 6= j so p must divide evenly into q. The degree of q is the same as p, both N(N−1) p being 2 , so q must be constant. Checking one term it is easy to see that they must be the same. This leads to the formula that

1 1 ··· 1 1 1 ··· 1

2 2 2 x1 x2 ··· xN x1 x2 ··· xN −(x1+x2+...+xN ) PN (x1, x2, . . . , xN ) = cN e ...... N−1 N−1 N−1 N−1 N−1 N−1 x1 x2 ··· xN x1 x2 ··· xN

12 Step 2

We can show that

1 1 ··· 1 H0 (x1) H0 (x2) ··· H0 (xN )

x1 x2 ··· xN H1 (x1) H1 (x2) ··· H1 (xN ) = d . . .. . N ...... N−1 N−1 N−1 x1 x2 ··· xN HN−1 (x1) HN−1 (x2) ··· HN−1 (xN )

th th for some constant dN where Hn is the n Hermite polynomial. Each Hn is an n degree polynomial and can be constructed by a linear combination of the rows in the left hand matrix. The dN captures the changes in the leading terms of each Hn, recalling that multiplying a row by a constant changes the value of the determinant by the same factor.

Step 3

A factor in a column of a determinant can be factored out to leave the following.

x2 x2 x2 − 1 − 2 − N e 2 H0 (x1) e 2 H0 (x2) ··· e 2 H0 (xN ) 2 2 2 „ x2 x2 x2 « x x x − 1 + 2 +···+ N − 1 − 2 − N 2 2 2 Y e 2 H1 (x1) e 2 H1 (x2) ··· e 2 H1 (xN ) e (xi − xj) = dN ...... j

Step 4

Now multiply each row by the necessary normalizing constant and multiply the determinant by its 0 reciprocal thereby making a new constant dN to get

ϕ0 (x1) ϕ0 (x2) ··· ϕ0 (xN ) „ x2 x2 x2 « − 1 + 2 +···+ N ϕ (x ) ϕ (x ) ··· ϕ (x ) 2 2 2 Y 1 1 1 2 1 N e (x − x ) = d0 . i j N ...... j

Step 5

The original equation can now be written

PN (x1, x2, . . . , xN )

ϕ0 (x1) ϕ0 (x2) ··· ϕ0 (xN ) ϕ0 (x1) ϕ0 (x2) ··· ϕ0 (xN )

ϕ1 (x1) ϕ1 (x2) ··· ϕ1 (xN ) ϕ1 (x1) ϕ1 (x2) ··· ϕ1 (xN ) = c d0 2 N N ......

ϕN−1 (x1) ϕN−1 (x2) ··· ϕN−1 (xN ) ϕN−1 (x1) ϕN−1 (x2) ··· ϕN−1 (xN ) 0 2 N = cN dN det (Mij)ij=1

13 PN−1 where Mij = k=0 ϕk (xi) ϕk (xj) noting that the determinant is unaffected by transposition.

0 2 1 So all that is left to show is that cN (dN ) is N! to obtain the desired result. This could have been done by carefully keeping track of these constants all along but would have required that we perform the integration of pi variables in the formation of cN which we would like to avoid. Before finding this constant, there is a useful lemma.

R ∞ Lemma 1 Suppose f (x, y) is a function of two variables and that −∞ f (x, x) dx = c < ∞ and R ∞ also −∞ f (x, y) f (y, z) dy = f (x, z) , then Z ∞  K   K−1  det f (xi, xj)|i,j=1 dxK = (c − K + 1) det f (xi, xj)|i,j=1 . −∞

Proof: Consider the determinant using the permutation definition,

K  K  X σ Y  det f (xi, xj)|i,j=1 = (−1) f xl, xσ(l) . σ∈SK l=1

The variable xK will appear in a factor either once in the form f (xK , xK ) or twice in the form R ∞ f (xa, xK ) f (xK , xb) . In the f (xK , xK ) case −∞ f (xK , xK ) dxK = c and in the other Z . . . f (xa, xK ) f (xK , xb) . . . dxK = . . . f (xa, xb) ....

Now break up our original sum as

K K K X σ Y  X σ Y  X σ Y  (−1) f xl, xσ(l) = (−1) f xl, xσ(l) + (−1) f xl, xσ(l) .

σ∈SK l=1 σ|σ(K)=K l=1 σ|σ(K)6=K l=1

After integration with respect to xK the sums yield K−1 K−1 X σ Y  X σ Y  c (−1) f xl, xσ(l) − (K − 1) (−1) f xl, xσ(l)

σ∈SK−1 l=1 σ∈SK−1 l=1 K−1 = (c − K + 1) det (f (xi, xj))|i,j=1 .

1 R ∞ Now, our constant of N! in our expression for PN (x1, x2, . . . , xN ) can be computed since −∞ KN (x, x)dx = R ∞ N and −∞ KN (x, y)KN (y, z)dy = KN (x, z). We have Z ∞ Z ∞ N ... det (KN (xi, xj))|i,j=1 dx1dx2 ··· dxN −∞ −∞ Z ∞ Z ∞ N = ... 1 · det (KN (xi, xj))|i,j=1 dx1dx2 ··· dxN−1 −∞ −∞ Z ∞ Z ∞ N = ... 1 · 2 · det (KN (xi, xj))|i,j=1 dx1dx2 ··· dxN−2 −∞ −∞ . . = 1 · 2 · 3 ··· N = N!

14 Thus the determinant portion of PN is N! when fully integrated and the integration over the 1 probability distribution is 1, cN = N! . We have established our goal that PN (x1, x2, . . . xN ) = 1 N N! det (KN (xi, xj))|i,j=1 .

Define the n-point correlation function as N! Z Z R (x , x , . . . , x ) = ... P (x , x , . . . , x ) dx dx ··· dx . N 1 2 n (N − n)! N 1 2 N n+1 n+2 N

This leads to the following lemma:

Lemma 2 n RN (x1, x2, . . . , xn) = det (KN (xi, xj))|i,j=1 .

Proof: N! Z Z 1 R (x , x , . . . , x ) = ... det (K (x , x ))|N dx dx ··· dx N 1 2 n (N − n)! N! N i j i,j=1 n+1 n+2 N 1 Z Z = ... 1 det (K (x , x ))|N−1 dx dx ··· dx (N − n)! N i j i,j=1 n+1 n+2 N−1 1 Z Z = ... 1 · 2 det (K (x , x ))|N−2 dx dx ··· dx (N − n)! N i j i,j=1 n+1 n+2 N−2 . . 1 · 2 ··· (N − (n + 1) + 1) = det (K (x , x ))|n (N − n)! N i j i,j=1 n = det (KN (xi, xj))|i,j=1 .

15 3 The density of the eigenvalues

In this section we compute the density function for the eigenvalues for finite N and N large. For large N this function is approximated by a semi-ellipse. This result, conjectured and proved in GUE and also for several other ensembles in the literature is known as Wigner’s semi-cirle law.

In the following all integrals are assumed to be from minus infinity to infinity unless otherwise indicated. If f is a symmetric function in all its variables then the following integral is the expected value of f with respect to PN Z Z ... f (x1, x2, . . . xN ) PN (x1, x2, . . . xN ) dx1dx2 ··· dxN .

PN If f (x1, x2, . . . xN ) = i=1 χA (xi) then the above integral counts the number of eigenvalues that are contained in the set A. So we can rewrite this integral as

" N # Z Z X ... χA (xi) PN (x1, x2, . . . xN ) dx1dx2 ··· dxN i=1 Z Z = N ... χA (x1) PN (x1, x2, . . . xN ) dx1dx2 ··· dxN Z = KN (x, x) dx. A

We are using the symmetry of PN in the last computation. This gives us a quite simple expression for the expected number of eigenvalues in a set A, and we see that KN (x, x) is the density function for the eigenvalues.

Using formulas from the theory of orthogonal polynomials, we have

e−x2 K (x, x) = √ H0 (x) H (x) − H0 (x)H (x) . N 2N−1 (N − 1)! π N N−1 N−1 N

Then using the well-known asymptotic expansions for Hermite polynomials one can show that  √ p √ √ (2/π) 1 − x2 |x| ≤ 1 lim 2/NKN ( 2N + 1x, 2N + 1x) = N→∞ 0 |x| ≥ 1

This last equation√ is what is referred to as√ the “semi-circle law”. This tells us that if we are far away from 2N there are on the order of N eigenvalues in any finite interval.

16 4 The probability that an interval (a, b) contains no eigenvalues

In this section we find an expression that describes the probability that an interval contains no eigenvalues of a random matrix. We first do this for finite N and then later take the limit as N becomes large. We first recall our definition for a Fredholm determinant.

Recall that given a bounded continuous with kernel K(x, y) on L2 (a, b) we defined the Fredholm determinant

∞ n X (−1) λn Z b Z b  det (I − λK) = ··· det (K (x , x ))|n dx ··· dx . n! i j i,j=1 1 n n=0 a a

What is the probability of finding no eigenvalues in an interval J = (a, b)? It is exactly the same as finding all of the eigenvalues in the complement of J.

Thus the probability that all xi are not in (a, b) is given by

Z Z = ··· PN (x1, x2, . . . , xN ) dx1 ··· dxN Jc Jc Z ∞ Z ∞ = ... PN (x1, x2, . . . , xN ) χJc (x1) ··· χJc (xN ) dx1 ··· dxN . −∞ −∞

But since χJc (x) = 1 − χJ (x) the probability becomes Z ∞ Z ∞ = ... PN (x1, x2, . . . , xN ) (1 − χJ (x1)) ··· (1 − χJ (xN )) dx1 ··· dxN −∞ −∞  N Z ∞ Z ∞ X X = ... PN (x1, x2, . . . , xN ) 1 − χJ (xi) + χJ (xi) χJ (xj) −∞ −∞ i=1 i

17 Z ∞ Z ∞ = 1 − N ... PN (x1, x2, . . . , xN ) χJ (x1) dx1 ··· dxN −∞ −∞ N! Z ∞ Z ∞ + ... PN (x1, x2, . . . , xN ) χJ (x1) χJ (x2) dx1 ··· dxN − · · · 2! (N − 2)! −∞ −∞ Z ∞ Z ∞ N N Y + (−1) ... PN (x1, x2, . . . , xN ) χJ (xi) dx1 ··· dxN −∞ −∞ i=1 Z ∞ Z ∞ 1 N = 1 − ... det (KN (xi, xj))|i,j=1 χJ (x1) dx1 ··· dxN (N − 1)! −∞ −∞ Z ∞ Z ∞ 1 N + ... det (KN (xi, xj))|i,j=1 χJ (x1) χJ (x2) dx1 ··· dxN − · · · 2! (N − 2)! −∞ −∞ N (−1)N Z ∞ Z ∞ Y + ... det (K (x , x ))|N χ (x ) dx ··· dx . N! N i j i,j=1 J i 1 N −∞ −∞ i=1 Using the formula for the n-point correlation functions this becomes Z ∞ = 1 − KN (x1, x1) χJ (x1) dx1 + −∞ Z ∞ Z ∞ 1 2 det (KN (xi, xj))|i,j=1 χJ (x1) χJ (x2) dx1dx2 − · · · 2! −∞ −∞ N N (−1) Z ∞ Z ∞ Y + ... det (K (x , x ))|N χ (x ) dx ··· dx N! N i j i,j=1 J i 1 N −∞ −∞ i=1 Z b Z b Z b 1 2 = 1 − KN (x1, x1)dx1 + det(KN (xi, xj))|i,j=1dx1dx2 + ... a 2! a a N Z b Z b (−1) N + ... det (KN (xi, xj))|i,j=1 dx1 ··· dxN N! a a

Note that the probability looks just like a Fredholm determinant for the kernel KN , except that this is a finite sum of integrals. The determinant agrees with the first N + 1 terms and in fact the remaining terms in the Fredholm determinant are zero for this choice of KN.

For example, K1 (x1, x2) = ϕ0 (x1) ϕ0 (x2) and

K1 (x1, x1) K1 (x1, x2) ϕ0 (x1) ϕ0 (x1) ϕ0 (x1) ϕ0 (x2) = K1 (x2, x1) K1 (x2, x2) ϕ0 (x2) ϕ0 (x1) ϕ0 (x2) ϕ0 (x2)

ϕ0 (x1) ϕ0 (x1) = ϕ0 (x1) ϕ0 (x2) ϕ0 (x2) ϕ0 (x2) = 0.

In any such example where the number of rows in the determinant are greater than the number of variables N, the columns can be factored so as to always have a repeated column.

18 3 To further illustrate this look at det (K2 (xi, xj))|i,j=1 .

3 det (K2 (xi, xj))|i,j=1 2 2 ϕ0 (x1) + ϕ1 (x1) ϕ0 (x1) ϕ0 (x2) + ϕ1 (x1) ϕ1 (x2) ϕ0 (x1) ϕ0 (x3) + ϕ1 (x1) ϕ1 (x3) 2 2 = ϕ0 (x2) ϕ0 (x1) + ϕ1 (x2) ϕ1 (x1) ϕ0 (x2) + ϕ1 (x2) ϕ0 (x2) ϕ0 (x3) + ϕ1 (x2) ϕ1 (x3) 2 2 ϕ0 (x3) ϕ0 (x1) + ϕ1 (x3) ϕ1 (x1) ϕ0 (x3) ϕ0 (x2) + ϕ1 (x3) ϕ1 (x2) ϕ0 (x3) + ϕ1 (x3) 2 ϕ0 (x1) ϕ0 (x1) ϕ0 (x2) ϕ0 (x1) ϕ0 (x3) 2 = ϕ0 (x2) ϕ0 (x1) ϕ0 (x2) ϕ0 (x2) ϕ0 (x3) 2 ϕ0 (x3) ϕ0 (x1) ϕ0 (x3) ϕ0 (x2) ϕ0 (x3) 2 ϕ0 (x1) ϕ0 (x1) ϕ0 (x2) ϕ1 (x1) ϕ1 (x3) 2 + ϕ0 (x2) ϕ0 (x1) ϕ0 (x2) ϕ1 (x2) ϕ1 (x3) + ··· 2 ϕ0 (x3) ϕ0 (x1) ϕ0 (x3) ϕ0 (x2) ϕ1 (x3)

ϕ0 (x1) ϕ0 (x1) ϕ0 (x1)

= ϕ0 (x1) ϕ0 (x2) ϕ0 (x3) ϕ0 (x2) ϕ0 (x2) ϕ0 (x2)

ϕ0 (x3) ϕ0 (x3) ϕ0 (x3)

ϕ0 (x1) ϕ0 (x1) ϕ1 (x1)

+ ϕ0 (x1) ϕ0 (x2) ϕ1 (x3) ϕ0 (x2) ϕ0 (x2) ϕ1 (x2) + ···

ϕ0 (x3) ϕ0 (x3) ϕ1 (x3) = 0 + 0 + ···

To see this for general N convince yourself that there are at most N independent columns spanning the columns of the determinant so if the size of the determinant is bigger than N the determinant will be zero.

From all of this we can say that probability of finding no eigenvalues in the interval (a, b) is given PN−1 by the Fredholm determinant det (I − KN ) , where KN (x, y) = k=0 ϕk (x) ϕk (y) , the kernel of our integral operator on L2 (a, b) .

You should note that det (I − KN ) = det (I − λKN )|λ=1 . Also note that if K (x, y) is any function satisfying R K (x, x) dx = N and R K (x, y) K (y, z) dy = K (x, z) and if we define the probability 1 N distribution PN (x1, . . . xN ) = N! det (K (xm, xn))|m,n=1 , then the probability of finding no eigen- values in (a, b) is still det(I − K). In particular, if K were any sum of terms involving orthogonal polynomials, the probability is still described by the Fredholm determinant, with simply a different kernel.

The probability of finding exactly one eigenvalue in an interval (a, b) is given by

d − det (I − λKN ) . dλ λ=1 This can be seen by the following computation:

19 Z Z Z Pr (exactly one eigenvalue in J = (a, b)) = ··· PN (x1, . . . xN ) dx1 ··· dxN J Jc Jc Z Z Z Z + ··· PN (x1, . . . xN ) dx1 ··· dxN Jc J Jc Jc Z Z Z Z + ··· + ··· PN (x1, . . . xN ) dx1 ··· dxN Jc Jc Jc J Z Z Z = N ··· PN (x1, . . . xN ) dx1 ··· dxN J Jc Jc since PN (x1, . . . xN ) is symmetric with respect to any interchange of variables.

Thus, we have Z Z Z N ··· PN (x1, . . . , xN ) dx1 ··· dxN J Jc Jc Z ∞ Z ∞ = N ... PN (x1, . . . , xN ) χJ (x1) (1 − χJ (x2)) ··· (1 − χJ (xN )) dx1 ··· dxN −∞ −∞ " N Z ∞ Z ∞ X = N ... PN (x1, . . . , xN ) χJ (x1) 1 − χJ (xi) −∞ −∞ i=2 N N # X N Y + χJ (xi) χJ (xj) − · · · + (−1) χJ (xi) dx1 ··· dxN i,j=2,i

Z b N (N − 1) (N − 2)! Z b Z b = KN (x1, x1) dx1 − det (KN (xi, xj))|i,j=1,2 dx1dx2 a N! a a Z b Z b N 1 N + ··· + (−1) N ··· det (KN (xi, xj))|i,j=1 dx1 ··· dxN N! a a Z b Z b Z b 2 = KN (x1, x1) dx1 − det (KN (xi, xj))|i,j=1 dx1dx2 + ··· a a a Z b Z b N 1 N + (−1) N ··· det (KN (xi, xj))|i,j=1 dx1 ··· dxN N! a a

d = − det (I − λKN ) . dλ λ=1 In general the probability of finding k eigenvalues in (a, b) will be given by the kth derivative of the Fredholm determinant evaluated at λ = 1.

The semi-circle law tell us that √  1 p(2N − x2), |x| < 2N K (x, x) ∼ π √ N 0, |x| > 2N √ 2N 2 implying that KN (x, x) ∼ π for |x| < 2N. So we would, at least, heuristically expect that in any bounded interval near zero that the number of eigenvalues would√ become very large as N tends to infinity. However, If one looks at an interval of length about 1/ 2N then one would hope that individual eigenvalues could be detected. So we are going to apply our above computation to a rescaled interval.

20   Instead of considering the fixed interval J we will consider the interval √a , √b . 2N 2N

th The Fredholm determinant then becomes det (I − KN ) which is a sum whose k term is given by

k Z √b Z √b (−1) 2N 2N k ··· det (KN (xi, xj))|i,j=1 dx1 ··· dxk. k! √a √a 2N 2N

The change of variables x → √xi is introduced in the integral obtaining i 2N

k Z b Z b     k (−1) xi xj 1 ··· det KN √ , √ √ dx1 ··· dxk. k! a a 2N 2N 2N i,j=1 Notice that the bounds of integration are back to J.

What follows is an analysis of the behavior of KN as N becomes large. We will show √  √ √  sin (x − y) lim (1/ 2N)KN x/ 2N, y/ 2N = . N→∞ π (x − y) This last kernel is called the sine kernel. To see this we use the following asymptotic formula:

−x2 Γ(n + 1)  √ nπ   1  e 2 H (x) = λ cos 2n + 1x − + O n n n  1/2 Γ 2 + 1 2 n where Γ(n + 1) λn = n  Γ 2 + 1 for n even and Γ(n + 2) −1/2 λn = n 3 (2n + 1) Γ 2 + 2 for n odd. We begin with the expression for K and substitute √x and √y for x and y respec- N 2N 2N tively. Using the Christoffel-Darboux formula we have

2 2 − x − y e 2 e 2 H (x) H (y) − H (y) H (x) K (x, y) = √ N N−1 N N−1 . N 2N (N − 1)! π x − y

So thus (assuming N is even), √ √ √ 0 KN (x, y) = (1/ 2N)KN (x/ 2N, y/ 2N) „ «2 „ «2 x y √ √     y   y    − 2N − 2N H √x H √ − H √ H √x e 2 e 2  N 2N N−1 2N N 2N N−1 2N  = √ √ N √x − √y 2 (N − 1)! π 2N  2N 2N  2 2 − (x) − (y) e 4N e 4N   x   y   y   x  = √ HN √ HN−1 √ − HN √ HN−1 √ 2N (N − 1)! π (x − y) 2N 2N 2N 2N − 1 ! (2N − 1) 2 Γ(N + 1) Γ (N) ∼ √ 2N (N − 1)! π (x − y) N  N  Γ 2 + 1 Γ 2 + 1

21 " r ! r !! 2N + 1 Nπ 2N − 1 (N − 1) π × cos x − cos y − 2N 2 2N 2 r ! r !! # 2N + 1 Nπ 2N − 1 (N − 1) π  1  − cos y − cos x − + O √ 2N 2 2N 2 N

Now we use Stirlings formula and take the limit as N goes to infinity. The details are left to the reader. The final result is 0 sin(x − y) lim KN = . N→∞ π(x − y) From now on the sine kernel will be denoted by simply K or K(x, y).

22 5 A closer look at the Fredholm determinant

Our goal in the following pages is to get some information about det(I − λK) where K is the sine kernel and we think of the operator as defined on L2((−s, s)). In other words, we want information about the probability of finding no eigenvalues in the interval (−s, s). It is useful not to try to compute the determinant directly, but the log of the determinant. This is because we have the formula log det(I − λK) = trace log(I − λK) at our disposal. We will eventually derive a differential equation (nonlinear) that has a connection to the above function (thought of as a function of s).

To get started we recall some operator theory. Suppose {K (s)} is a family of operators, and that ||K (s)|| < 1,K (s) is trace class and K0(s) is defined. Then

d   (log (det (I − K (s)))) = −trace (I − K (s))−1 · K0 (s) . ds To prove this d d (log (det (I − K (s)))) = (trace (log (I − K (s)))) ds ds  d  = trace (log (I − K (s))) ds ∞ n !! d X − (K (s)) = trace ds n n=1 ∞ n ! X d − (K (s)) = trace ds n n=1 ∞ n ! X d (K (s)) = − trace ds n n=1 ∞ ! X h i = − trace (K (s))n−1 K0 (s) n=1   = −trace (I − K (s))−1 · K0 (s) since (I − K (s))−1 = P (K (s))n is just a geometric series.

Note the above uses the fact that

∞ n ! ∞ ! X d (K (s)) X h i − trace = − trace (K (s))n−1 K0 (s) . ds n n=1 n=1

Now let J be the interval (−s, s) and consider the operator K (s) with kernel K (x, y) χJ (y) where R s K (x, y) is the sine kernel. This operator sends f to the function −s K (x, y) f (y) dy.

23 To find K0 (s) we use the Fundamental Theorem of Calculus. The operator K0 (s) sends f to K (x, s) f (s) + K (x, −s) f (−s) . Note that the image of f is finite rank and spanned by the two vectors, K (x, s) and K (x, −s) . However, this operator does not make sense for all functions in L2 but only for functions that are continuous. We will ignore this fact for the time being.

Define the operator δ0 by δ0 (f) = f (0) . This is called the Dirac operator or point evaluation on the linear space of continuous function. One often sees the Dirac operator written as Z δ0 (f) = δ0(x)f(x)dx = f (0) .

This is because one can take a sequence of functions {δn(x)} that tend to infinity at zero and have support that tends to zero such that Z lim δn(x)f(x)dx = f(0). n→∞

Thus thinking of the symbol δ0(x) as a function that captures the value of a function f at the point where the argument of the delta function is zero, we use δ0 (x − y) as the kernel of the identity operator, although this is not a function in the ordinary sense. Thus Z δ0 (x − y) f (y) dy = f (x) .

0 We can now write the kernel of K (s) as K (x, y) δ0 (y − s) + K (x, y) δ0 (y + s) . To summarize, Z ∞ 0 K (s) f = [K (x, y) δ0 (y − s) + K (x, y) δ0 (y + s)] f (y) dy −∞ Z ∞ = [K (x, y) δ0 (y − s) f (y) + K (x, y) δ0 (y + s) f (y)] dy −∞ = K (x, s) f (s) + K (x, −s) f (−s) as before. We are happy to see K0 (s) acting as an integral operator just as before, however, acting on continuous functions.

Next we consider the kernel representation of (I − K (s))−1 and since we have one for K0 (s) we will be able to also find the kernel of (I − K (s))−1 · K0 (s) .

−1 P∞ n −1 P∞ n Recall that (I − K (s)) = n=0 K whenever ||K|| < 1. Thus (I − K (s)) = I + n=1 K . Now Kn has kernel Z s Z s ··· K (x, x1) ··· K (xn−1, y) dx1 ··· dxn−1. −s −s P∞ n −1 Thus n=1 K has a very complicated kernel which we will call R (x, y) . The operator (I − K) has kernel ρ (x, y) = δ0 (x − y) + R (x, y) .

Note that (I − K)−1 = I + (I − K)−1 K since     I + (I − K)−1 K (I − K) = (I − K) I + (I − K)−1 K = I.

24 This means that (I − K)−1 K has kernel R (x, y) , and thus that R composed with K is R − K.

When (I − K)−1 · K0 (s) operates on f the result is Z s ρ (x, y)[K (y, s) f (s) + K (y, −s) f (−s)] dy −s which can be written as R (x, s) f (s) + R (x, −s) f (−s) .

In terms of kernels we see that this operator has kernel

R (x, s) δ (y − s) + R (x, −s) δ (y + s) .

This has rank two, just like the image of K0(s). We can also easily compute its trace to see that it is R (s, s) + R (−s, −s)

We should point out here that this last computation tells that we need to know R(s, s) and R(−s, −s). These are the important numbers and much of what is done in the next few pages is to to find a way to describe these quantities. To summarize the list of kernels:

operator kernel K0 (s) K (x, s) δ (y − s) + K (x, −s) δ (y + s) (I − K)−1 ρ (x, y) = δ (x − y) + R (x, y) (I − K)−1 · K0 (s) R (x, s) δ (y − s) + R (x, −s) δ (y + s)

We now compute the kernels of some additional operators.

d −1 Lemma 3 The operator ds (I − K) has kernel R (x, s) ρ (s, y) + R (x, −s) ρ (−s, y) = M (x, y) .

d −1 −1 dK −1 Proof: From the homework, ds (I − K) = (I − K) ds (I − K) . The right hand side has kernel ZZ ρ (x, y) {K (y, s) δ (z − s) + K (y, −s) δ (z + s)} ρ (z, u) dydz = R (x, s) ρ (s, u) + R (x, −s) ρ (−s, u) as desired.

Next we introduce an important lemma about commutators. Recall the commutator of two op- erators A and B is simply AB − BA and is symbolized by [A, B] . Define the operator D by 1 0 df D : C (a, b) → C (a, b) and Df = dx .

Lemma 4 The operator [D, (I − K)−1] has kernel − R (x, s) ρ (s, y) + R (x, −s) ρ (−s, y) .

25 h i Proof: The commutator D, (I − K)−1 = (I − K)−1 [D,K](I − K)−1 . This is just algebra. ∂K(x,y) [D,K] = DK − KD.DK has kernel ∂x , but KD is more complicated. Z s Z s 0 y=s ∂K (x, y) KDf = K (x, y) f (y) dy = K (x, y) f (y)|y=−s − f (y) dy −s −s ∂y so KD has kernel ∂K (x, y) K (x, s) δ (y − s) − K (x, −s) δ (y + s) − . ∂y ∂K ∂K Since K (x, y) is a function of x − y, ∂x = − ∂y and thus the kernel of [D,K] is −K (x, s) δ (y − s) + K (x, −s) δ (y + s) .

To find the kernel of [D, (I − K)−1] we have ZZ ρ (x, y)[−K (y, s) δ (z − s) + K (y, −s) δ (z + s)] ρ (z, u) dydz Z = (−R (x, s) δ (z − s) ρ (z, u) + R (x, −s) δ (z + s) ρ (z, u)) dz = −R (x, s) ρ (s, u) + R (x, −s) ρ (−s, u) .

We will consider a slightly more general problem in what follows, that is, the probability for finding no eigenvalues in a finite union of intervals. We can think of K as acting on L2(I) where I is the S S set I = (a1, a2) (a3, a4) ... (an−1, an). Also notice that λ sin (x − y) A (x) A0 (y) − A (y) A0 (x) λK (x, y) = = π (x − y) x − y

q λ where A (x) = π sin x. We define functions   Q (x, aˆ) = (I − K)−1 A (x)   P (x, aˆ) = (I − K)−1 A0 (x) wherea ˆ is a vector containing each ai. Define the operator Mx (Multiplication by x) to be Mxf = xf.

h −1i Lemma 5 The operator Mx, (I − K) has kernel

    Q (x, aˆ) I − Kt−1 A0 (y) − P (x, aˆ) I − Kt−1 A (y) where Kt has kernel K (y, x) .

h −1i −1 −1 Proof: Mx, (I − K) = (I − K) [Mx,K](I − K) . Now [Mx,K] has kernel

A (x) A0 (y) − A0 (x) A (y)

26 since we have the following:

operator kernel  A(x)A0(y)−A0(x)A(y)  K x−y  A(x)A0(y)−A0(x)A(y)  MxK x x−y  A(x)A0(y)−A0(x)A(y)  KMx x−y y.

h −1i One half of the kernel of Mx, (I − K) is given by

Z Z ρ (w, x) A (x) A0 (y) ρ (y, z) dxdy I I Z Z = ρ (w, x) A (x) dx A0 (y) ρ (y, z) dy I I   = Q (x, aˆ) I − Kt−1 A0 (z) while similarly, for the other half Z Z − ρ (w, x) A (y) A0 (x) ρ (y, z) dxdy I I   = −P (x, aˆ) I − Kt−1 A (z) .

h −1i And thus the kernel of Mx, (I − K) is the sum of these

    Q (x, aˆ) I − Kt−1 A0 (z) − P (x, aˆ) I − Kt−1 A (z) .

h −1 i Lemma 6 The operator Mx, (I − K) K has kernel (x − y) R (x, y) .

Proof:

h −1i h −1 i Mx, (I − K) = Mx,I + (I − K) K

h −1 i = [Mx,I] + Mx, (I − K) K

h −1 i = 0 + Mx, (I − K) K and this has kernel (x − y) R (x, y) .

The function R (x, y) can now be written in terms of P and Q.     Q (x, aˆ) I − Kt−1 A0 (y) − P (x, aˆ) I − Kt−1 A (y) R (x, y) = . x − y

But clearly for our sine kernel the Kt is the same as K. So we have thus proved that

27 Q (x, aˆ) P (y, aˆ) − P (x, aˆ) Q (y, aˆ) R (x, y) = . x − y

The function R(x, y) is called the resolvent kernel and notice it has the same form as the sine kernel and now we have reduced our problem once again but this time into finding information about P and Q. And remember we really need R(x, x) so this is what we consider next.

In order to determine R (x, x) consider P (y, aˆ) as y approaches a fixed value for x. We can expand P (y, aˆ) in a Taylor series about the point x as

 d  h i P (y, aˆ) = P (x, aˆ) + P (x, aˆ) (y − x) + O (y − x)2 . dx

Similarly for Q  d  h i Q (y, aˆ) = Q (x, aˆ) + Q (x, aˆ) (y − x) + O (y − x)2 . dx

Substituting this into the expression for R (x, y) ,

R (x, y)  h i Q (x, aˆ)(P (x, aˆ) + P 0 (x, aˆ)(y − x)) − P (x, aˆ) Q (x, aˆ) + Q0 (x, aˆ)(y − x) + O (y − x)2 = x − y = −Q (x, aˆ) P 0 (x, aˆ) + P (x, aˆ) Q0 (x, aˆ) + O (x − y) .

Taking the limit as y → x,

R (x, x) = −Q (x, aˆ) P 0 (x, aˆ) + P (x, aˆ) Q0 (x, aˆ) .

Now let’s examine how R behaves at the endpoints of the intervals in I. We define

qj as Q(x, aˆ)|x=aj and pj as P (x, aˆ)|x=aj .

qj pk−pj qk For ai 6= aj,R (aj, ak) = . But what is R (aj, aj)? We can generalize our previous compu- aj −ak h −1i P2m k tation to see that D, (I − K) has kernel − k=1 (−1) R (x, ak) ρ (ak, y) . So,

Q0 (x, aˆ) = DQ (x, a) = D (I − K)−1 A (x) h i = (I − K)−1 DA (x) + D, (I − K)−1 A (x) Z 2m −1 0 X k = (I − K) A (x) − (−1) R (x, ak) ρ (ak, y) A (y) dy I k=1 2m X k = P (x, aˆ) − (−1) R (x, ak) Q (ak, aˆ) k=1 since R ρ (a , y) A (y) dy = Q (a , aˆ) . Therefore Q0 (x, a)| = p − P2m (−1)k R (a , a ) q . I k k x=aj j k=1 j k k

28 Similarly for P 0 (x, aˆ) ,P 0 (x, a)| = −q − P2m (−1)k R (a , a ) p . x=aj j k=1 j k k

Placing both of these expressions into the formula for R (x, x) yields

2m ! 2m ! X k X k R (aj, aj) = pj pj − (−1) R (aj, ak) qk − qj −qj − (−1) R (aj, ak) pk k=1 k=1 2m 2 2 X k = pj + qj − (−1) R (aj, ak)(pjqk − qjpk) k=1 2m 2 2 X k = pj + qj − (−1) R (aj, ak) R (ak, aj)(ak − aj) . k=1

Now,

∂qj ∂Q (x, aˆ) ∂Q (x, aˆ) = + ∂a ∂x ∂a j x=aj j x=aj 2m X k ∂Q (x, aˆ) = pj − (−1) R (aj, ak) qk + . ∂aj k=1 x=aj

Recalling that Q (x, aˆ) = (I − K)−1 A (x) and doing the same derivative computation as before we see that Z ∂ −1 j (I − K) A (x) = (−1) R (x, aj) ρ (aj, y) A (y) dy ∂aj j = (−1) R (x, aj) qj.

So the expression for ∂qj simplifies to ∂aj

2m ∂qj X k = pj − (−1) R (aj, ak) qk. ∂aj k = 1 k 6= j

In a completely similar fashion, we find that

2m ∂pj X k = −qj − (−1) R (aj, ak) pk. ∂aj k = 1 k 6= j

sin(x−y) Now let’s go back to the case that we really care about where K (x, y) = π(x−y) and where I is (−s, s) . For this interval, K is symmetric with respect to interchange of variables and satisfies K (x, −y) = K (−x, y) .

This same property holds for the kernel ρ. To see this define a flip operator J that sends the function f(x) to g(x) = g(−x). It is easy to check that if K and J commute and that this is

29 equivalent to the kernel satisfying the above property. (Note: this depends on the fact that the interval is symmetric too.) Now (I − K)−1 · J = J · (I − K)−1 since

(I − K)−1 · J(I − K) = J · (I − K)−1(I − K).

This also tells us that R(x, −y) = R(−x, y) and thus R(s, s) = R(−s, −s).

It is also the case that : q1 = −q2.

Proof: Z s q1 = lim Q (x, aˆ) = lim ρ (x, y) A (y) dy x→−s+ x→−s+ −s Z s = lim ρ (−x, y) A (y) dy x→s− −s Z s = lim ρ (x, −y) A (y) dy x→s− −s Z s = lim ρ (x, y) A (−y) dy x→s− −s Z s = − lim ρ (x, y) A (y) dy = −q2. x→s− −s

A similar argument leads to p1 = p2.

We can now obtain the following useful identities:

d 1. ds log det (I − K) = −2R (s, s)

q1p2−p1q2 q1p1 2. R (−s, s) = −2s = − s

3. dq1 = (−1) ∂q1 + ∂q1 = −p − 2R (−s, s) q ds ∂a1 ∂a2 1 1 4. dp1 = (−1) ∂p1 + ∂p1 = q + 2R (−s, s) p ds ∂a1 ∂a2 1 1

2 2 2 5. R (s, s) = q1 + p1 − 2s (R (−s, s))

We are finally very close to finding our differential equation.

Define

a (s) = sR (s, s) b (s) = sR (−s, s) .

First notice that

30 d d (sR (−s, s)) = (−p q ) ds ds 1 1 = −p1 (−p1 − 2R (−s, s) q1) − (q1 + 2R (−s, s) p1) q1 2 2 = p1 − q1.

Using the same sort of computation one can also show that

d (sR (s, s)) = p2 + q2. ds 1 1

If we square both sides of the right-hand side of the last two equations, it is clear that

da2 db 2 = + 4b2. ds ds

It is also the case from the fundamental formulas for a and b that da a − s = −2b2. ds Differentiating this last equation we see that

sa00 = 4bb0.

Inserting this into the previous equation we arrive at the following.

Theorem 2 The function a satisfies the following differential equation:

s2(a00)2 = 8(sa0 − a)(−2(sa0 − a) + (a0)2).

This last equation is called a Painlev´eequation. Equations of the Painlev´etype have been studied extensively and have many interesting properites that we will not discuss here. Since our Fredholm determinant has a derivative (with respect to s) that can be directly written in terms of the function a, we have found an interesting connection between random matrices and this type of second order equation.

Since we have the equation we can at least heuristically investigate the asymptotics of a. Suppose that we assume that a grows at most like a power, that is

a ∼ csk.

If we plug this into the equation, it turns out that the only power that would be consistent with the equation is k = 2. If we retrace our steps we find that our Fredholm determinant would behave

31 as s → ∞ as e−cs2 . Although we will not prove this here (because it is quite hard) this turns out to be true and in fact the full asymptotic formula is given by (here λ = 1)

s2 log s log 2 log det(I − K) = − − + + 3ζ0(−1) + o(1). 2 4 12 This was first conjectured by Freeman Dyson in 1976. The first two terms were proved in 1995 by Harold Widom and the constant term was computed by Torsten Ehrhardt in 2006. See [?, ?, ?].

32 6 Some comments

We did all of the above using a distributional kernel for the derivative of the operator K. Here is a sketch of why we can use the distributional kernel.

Define the operator U : L2(−s, s) → L2(−1, 1) by Uf(x) = g(x) = s−1/2f(sx). Then it is clear that U is invertible if s 6= 0. It is also the case that if we define

sin s(x − y) K (x, y) = s π(x − y) we have that the operator corresponding to the above kernel satisfies

−1 Ks = UKU .

From this we have that det(I − Ks) = det(I − K) since these operators have the same eigenvalues. So we could have computed with the other operator just as easily and it is now the case that the s derivative of Ks makes perfectly good sense and everything works. We can however use our computations because we can compute traces by applying the operators to normalized eigenfunctions, which are smooth in both cases. Remember both derivatives are rank two and all the operators make sense on smooth functions.

One last remark. The Painlev´eequation we derived looks independent of λ but in fact the quantities certainly do depend on λ. This can be reflected in the boundary condition

a(s, λ) = −λs − λ2s2 − ... which can be derived from the small s Neumann expansion of the determinant (I − λK).

33 7 Toeplitz determinants

We are next going to consider the distribution function for a linear statistic, that is, the distribution of values of N X f(λi) i=1 where the λi’s are the eigenvalues of a random Hermitian matrix and f is a function defined on (−∞, ∞).) We will need another theorem about the asymptotics of determinants and since it is a much simpler case to understand, and the analogue of what we are trying to compute we now digress a bit and prove a theorem about the determinants of finite Toeplitz matrices.

We note here that the notes from this section and the remaining sections are taken from the article [?] and more details can be found there.

∞ Consider a sequence of complex numbers {ai}i=−∞ and the associated matrix

n−1 Tn = (ai−j)i,j=0.

The matrix Tn is constant along its diagonals and has the following structure:   a0 a−1 a−2 ··· a−n+1  a1 a0 a−1 ··· a−n+2     a2 a1 a0 ··· a−n+3     . . . .   . . . .  an−1 an−2 an−3 ··· a0

The matrix is called a finite Toeplitz matrix and the basic problem is to determine what happens to Dn = det Tn as n → ∞.

The finite matrix also looks like a “truncation” of the infinite one   a0 a−1 a−2  ..   a1 a0 a−1 .    (4)  ..   a2 a1 a0 .   . . .  ...... and since our finite matrix is growing in size it makes sense to ask if we can somehow get information about determinants from the infinite array. In order to do this, we will try to view the infinite array as an operator on a . We do this because the theory of Fredholm determinants defined for operators on Hilbert spaces is fairly well understood. If we imagine that we are multiplying the infinite matrix on the right by a column vector it is quite natural to choose the Hilbert space as an extension of n-dimensional complex space. We use the Hilbert space of unilateral seqences (usually denoted by l2) ( ∞ ) ∞ X 2 {fk}k=0 | |fk| < ∞ , k=0

34 which we identify with the

2 2 1 H = {f ∈ L (S ) | fk = 0, k < 0}, where Z π 1 iθ −ikθ fk = f(e )e dθ. 2π −π

For this association we think of fk as the kth Fourier coefficient of the function f defined on the circle. In other words, in the L2 sense,

∞ iθ X ikθ f(e ) = fke . k=0

We remind the reader that H2 is a closed subspace of L2, and that the inner product of two functions is given by 1 Z π hf, gi = f(eiθ)g(eiθ)dθ. 2π −π 1/2 The two-norm of f, denoted by kfk2, is given by hf, fi and also known to be equal to

∞ !1/2 X 2 |fk| . k=0

We denote the orthogonal projection of L2 onto H2 by P. The operator P simply takes the Fourier series for f and removes terms with negative index and satisfies P 2 = P ∗ = P. There is one more fact about H2 that will be useful in what follows. Every function in H2 has an analytic extension into the interior of the unit circle given by

∞ X k f(z) = fkz . k=0 When we work with Toeplitz determinants it is often convenient to mix up all these ideas. In other words, sometimes we think of an infinite sequence, sometimes a function on the circle and sometimes the analytic function in the interior of the circle. There is a large body of literature devoted to these topics. For additional information, we refer the reader to [?, ?, ?].

Now let φ ∈ L∞(S1)and define the operator

T (φ): H2 → H2 by T (φ)f = P (φf). ikθ The operator T (φ) is called a Toeplitz operator with symbol φ. Let ek = e . The functions ∞ 2 {ek}k=0 form a Hilbert space basis for H . To find the matrix representation of the operator T (φ) we compute hT (φ)ek, eji = hP (φek), eji = hφek,P (ej)i Z π 1 iθ ikθ −ijθ = hφek, eji = φ(e )e e dθ = φj−k. 2π −π

35 This shows that this operator has exactly the matrix representation of the infinite array given in (??).

It turns out that when we compose Toeplitz operators, as we will soon do, another operator will appear. It is the Hankel operator with symbol φ

H(φ): H2 → H2 defined by H(φ)f = P (φJ(f)) where J(f)(eiθ) = e−iθf(e−iθ). The matrix representation of this operator has a different structure than that of a Toeplitz operator. It is found by computing

hH(φ)ek, eji = hP (φJ(ek)), eji = hφe−k−1,P (ej)i Z π 1 iθ −i(k+j+1)θ = hφe−k−1, eji = φ(e )e dθ = φj+k+1. 2π −π Thus the matrix representation has constants running down the “opposite” diagonals and has the form   φ1 φ2 φ3 ···  φ2 φ3    (5)  φ3   .  . Much is known about the invertibility of Toeplitz operators and their structure. However, we only need to recall a few things about Toeplitz operators before we begin to connect the infinite operators to the finite determinants. Here is a list of them.

Theorem 3

(a) T (φ) is a

∞ T 2 (b) If ψ+ ∈ L H , then T (φψ+) = T (φ)T (ψ+) (c) T (φ)∗ = T (φ¯)

∞ T 2 (d) If ψ− ∈ L H , then T (ψ−φ) = T (ψ−)T (φ)

(e) T (φψ) = T (φ)T (ψ) + H(φ)H(ψe), where ψe(eiθ) = ψ(e−iθ).

To prove (a) notice that

kT (φ)fk2 = kP (φf)k2 ≤ kφfk2 ≤ kφk∞kfk2.

It is actually the case that the norm of T (φ) is equal to kφk∞, although this fact is not essential for our purposes. To prove (b) consider

T (φψ+)f = P (φ(ψ+f)) = P (φP (ψ+f)) = T (φ)T (ψ+)f.

36 2 The middle equality holds since both ψ+ and f are already in H . (Another way to say this is the product of two functions with only non-negative Fourier coefficients has only non-negative Fourier coefficients.) The proof of (c) follows straight from the definition and then using (b) and (c) property (d) holds as well. The proof of property (e) requires a little more work, but is reduced to algebra once we note that the kth Fourier coefficient of φψ is given by

∞ X φlψk−l. l=−∞ Notice that these properties say that Toeplitz operators with symbols that have only non-negative coefficients can be factored on the right and Toeplitz operators with symbols having only non- positive coefficients can be factored on the left. These factorizations are used frequently.

2 2 Next, define Pn : H → H by

Pn(f0, f1, f2, ···) = (f0, f1, f2, ··· , fn−1, 0, 0, ···).

This last definition allows us to identify Tn(φ), the finite Toeplitz matrix generated by the Fourier coefficients of φ as the truncation of T (φ) or as PnT (φ)Pn.

At this point the reader may wonder if T (φ) is I plus trace class. If it were, then limits of finite Toeplitz matrices would be very easy to compute. This is not the case unless φ(eiθ) ≡ 1, (just think about the diagonal) but something very close to this statement is true. If we recall Theorem 1.1 part (e) we see that if φ−1 is a bounded function, then

T (φ)T (φ−1) = I − H(φ)H(φe−1), so the operator T (φ)T (φ−1) will be I+ trace class if both H(φ) and H(φe−1) are Hilbert-Schmidt. From the matrix representation of H(φ) it is very easy to see that H(φ) will be Hilbert-Schmidt if

∞ X 2 |k||φk| < ∞. k=1 Putting this all together we have the following lemma.

Lemma 7 Suppose that φ and φ−1 are bounded functions satisfying

∞ ∞ X 2 X −1 2 |k||φk| < ∞ and |k||(φ )−k| < ∞. k=1 k=1 Then the operator T (φ)(T φ−1) is I + K where K is trace class.

How does knowing the above help us with the determinant of PnT (φ)Pn? Let us proceed informally for awhile. It is certainly the case that if T (φ) is upper or lower triangular, the determinants are easy to compute. This is precisely when φ or φ¯ belong to L∞ T H2. Of course, this is a special case, but it is known that for a fairly large class of functions φ there exist functions φ+ and φ− ∞ T 2 such that both φ+, φ¯− ∈ L H , and φ = φ−φ+. Thus we have

PnT (φ)Pn = PnT (φ−)T (φ+)Pn.

37 Here we have used Theorem 3, parts (b) and (d). It would be nice if we could move the Pn into the middle because then we would have the product of upper and lower triangular matrices. Unfortunately we cannot do this, but notice we could if the factors were reversed. This is because

PnT (φ+) = PnT (φ+)Pn and PnT (φ−)Pn = T (φ−)Pn. Using this motivation, we write

−1 −1 Pn T (φ) Pn = Pn T (φ+) T (φ+ ) T (φ) T (φ− ) T (φ−) Pn −1 −1 = Pn T (φ+) Pn T (φ+ ) T (φ) T (φ− ) Pn T (φ−) Pn.

Now the upper-left blocks of Pn T (φ±) Pn are Tn(φ±), which are triangular matrices with diagonal n n entries (φ±)0. Therefore they have determinant ((φ−)0) ((φ+)0) , so

Dn(φ) = det Tn(φ) equals this product times the determinant of the upper-left block of

−1 −1 Pn T (φ+ ) T (φ) T (φ− ) Pn. To find the limit of this determinant, we notice by using Lemma 1.3 that if φ−1 and φ satisfy the conditions of the lemma then

T (φ)T (φ−1) = I − H(φ)H(φe−1)

−1 and the product of the Hankels is trace class. If we multiply on the left of this equation by T (φ+ ) and on the right by T (φ+) then we have

−1 −1 T (φ+ ) T (φ)T (φ ) T (φ+) = I + K where K is trace class. This is immediate since trace class operators form an ideal. For the expression on the right side of this equality, we have by Thoerem 1.2 (b)

−1 −1 −1 −1 −1 T (φ+ ) T (φ)T (φ ) T (φ+) = T (φ+ ) T (φ)T (φ− φ+ ) T (φ+) −1 −1 = T (φ+ ) T (φ) T (φ− ) and thus −1 −1 T (φ+ ) T (φ) T (φ− ) = I + K.

Therefore using Theorem 1.2 (d) we have that

n n −1 −1 lim det Tn(φ)/((φ−)0) ((φ+)0) = lim det Pn T (φ ) T (φ) T (φ ) Pn n→∞ n→∞ + −

−1 −1 −1 −1 −1 = det(T (φ+ ) T (φ) T (φ− )) = det(T (φ)T (φ− )T (φ+ )) = det(T (φ)T (φ )). This last limit statement is almost the classical form of the Strong Szeg¨oLimit Theorem. It will be in exactly that form once we rewrite the constants. To become completely rigorous we assume that the function φ = φ−φ+ has a logarithm log φ that is bounded such that

log φ = log φ− + log φ+

38 ∞ T 2 where log φ− and log φ+ are in L H , satisfy

∞ X 2 |k| |(log φ−)−k| < ∞ k=1 and ∞ X 2 |k| |(log φ+)k| < ∞. k=1 P∞ 2 Conveniently the bounded functions f satisfying k=−∞ |kkfk| < ∞ form a Banach algebra under a natural norm and for any such f the Hankel matrix (fi+j+1) is the matrix of a Hilbert-Schmidt −1 −1 −1 operator. If log φ± belong to this algebra so do φ−, φ+, (φ+) , (φ−) , φ and φ , and it follows that all associated Hankel operators are Hilbert-Schmidt. With these assumptions, we consider −1 −1 −1 −1 det(T (φ+ ) T (φ) T (φ− )) = det(T (φ+ ) T (φ−) T (φ+)T (φ− )). n n Now we know that (T (log φ±)) = T ((log φ±) ) by Theorem 1. (b) and (d), and thus we can write the above determinant as det eK1 eK2 e−K1 e−K2 where K1 = −T (log φ+),K2 = T (log φ+). Then using det(eK1 eK2 e−K1 e−K2 ) = etr (K1K2−K2K1) yields −1 −1 det(T (φ+ ) T (φ−) T (φ+)T (φ− ) = exp(tr (K1K2 − K2K1). The operator K1K2 − K2K1 = −T (log φ+)T (log φ−) + T (log φ−)T (log φ+) is equal to −T (log φ+)T (log φ−) + T ((log φ−)(log φ+)) by Theorem 1.2 (b) and is equal to

H(log φ−)H(log φ+) by Theorem 1.2 (e). The trace of this product is easily seen to be

∞ X ksks−k, k=1 where sk = (log φ)k.

(Note (log φ)k = ((log φ)+)k for k positive and ((log φ)−)k for k negative.) Finally, the term

n ((φ−)0(φ+)0) is the same as G(φ) = exp(s0)

39 since (φ±)0 = exp(log(φ±))0 and log φ = log φ− + log φ+. The constant G(φ) is called the geometric mean of φ. We collect the above results in this theorem.

Theorem 4 (Strong Szeg¨oLimit Theorem) Suppose the functions log φ± belong to the algebra of P∞ 2 2 bounded functions f satisfying k=−∞ |kkfk| < ∞, and in addition suppose log φ−, log φ+ ∈ H . Let φ = φ−φ+. Then ∞ ! n X lim Dn(φ)/G(φ) = exp ksks−k n→∞ k=1 where G(φ) and sk are as previously defined.

40 8 Wiener-Hopf determinants and linear statistics

We now state the analogue for the previous section for the Wiener-Hopf case. To understand how these results are analogous to what we have described so far, think of the finite Toeplitz matrix as the operator −1 PnT MφT Pn where Mφ is multiplication by the function φ and T is the discrete Fourier transform, associating the Fourier coefficients to the corresponding function in H2.

It turns out for the Gaussian Unitary Ensemble, for the study of linear statistics we define a finite 2 Wiener-Hopf operator, Wα(φ), defined on L (0, α) by

−1 PαF MφFPα where Pα is multiplication by the characteristic function of (0, α), and F is the Fourier transform. For linear statistics the important quantity is

det(I + Wα(φ)) where φ = eiλf − 1. (The above determinant is well-defined for sufficiently nice φ.)

You should convince yourself using convolutions arguments that this is the same Wiener-Hopf operator you are familiar with.

The analogue of the Strong Szeg¨oLimit Theorem says that if φ = eb − 1 then as α → ∞

 α Z ∞ Z ∞  det(I + Wα(φ)) ∼ exp b(x)dx + xˆb(x)ˆb(−x)dx , 2π −∞ 0 where ˆb(x) is the Fourier transform of b.

The proof is essentially the same except we need to worry about operators being trace class, so we are always considering the identity plus something trace class. This makes the proof a little less elegant, but has exactly the same steps.

Now consider a random variable of the form

N X f(xi) i=1 where in all that follows f is a continuous real-valued function belonging to L1(−∞, ∞) and which vanishes at ± ∞.

The mean µN is N Z ∞ Z ∞ X ··· f(xi)PN (x1, . . . , xN )dx1 ··· dxN . (6) −∞ −∞ i=1

41 Recall that the function PN has the important property [?] Z ∞ Z ∞ N! n ··· PN (x1, . . . , xn, xn+1, . . . , xN )dxn+1 ··· dxN = det K(xi, xj) |i,j=1 . (7) (N − n)! −∞ −∞ Thus, (??) is easily seen to be Z ∞ f(x)KN (x, x) dx (8) −∞ √ which, after changing x to x/ 2N, becomes Z ∞ x 1  x y  f(√ )√ KN √ , √ dx −∞ 2N 2N 2N 2N and is asymptotic to Z ∞ x f(√ )K(x, x) dx −∞ 2N or √ 2N Z ∞ f(x) dx. π −∞

A more difficult, yet also straightforward problem, is to find an expression for the distribution function of a random variable of this type. A fundamental formula from probability theory shows that if we call the probability distribution function φN , then ∞ ∞ Z Z PN ˇ ik j=1 f(xj ) φN (k) = ··· e PN (x1, . . . , xN )dx1 ··· dxN . (9) −∞ −∞ Thus, Z ∞ Z ∞ N ˇ Y ikf(xj ) φN (k) = ··· e PN (x1, . . . , xN ) dx1 ··· dxN −∞ −∞ j=1 Z ∞ Z ∞ N Y ikf(xj ) = ··· ((e − 1) + 1)PN (x1, . . . , xN ) dx1 ··· dxN −∞ −∞ j=1 N N Z ∞ Z ∞ X X = ··· {1 + (eikf(xj ) − 1) + (eikf(xj ) − 1)(eikf(xl) − 1) + ...} −∞ −∞ j=1 j

×PN (x1, . . . , xN ) dx1 ··· dxN Z ∞ 1 ikf(x) = 1 + (e − 1)KN (x, x) dx 1! −∞ 1 Z ∞ Z ∞ ikf(x1) ikf(x2) + (e − 1)(e − 1) det(KN (xj, xl)) |1≤j,l≤2 dx1 dx2 2! −∞ −∞ N 1 Z ∞ Z ∞ Y + ··· + ··· (eikf(xj ) − 1)P (x , . . . , x ) dx ··· dx . N! N 1 N 1 N −∞ −∞ j=1 In each integral we rescale to obtain Z ∞ Z ∞ Z ∞ ˇ 1 0 1 0 φN (k) = 1 + K (x1, x1) dx1 + K (x1, x2) dx1 dx2 1! −∞ 2! −∞ −∞ Z ∞ Z ∞ 1 0 + ··· + ··· K (x1, . . . , xN ) dx1 ··· dxN (10) N! −∞ −∞

where
$$K'(x_1, \dots, x_n) = \det\left( \big( e^{ikf(x_j/\sqrt{2N})} - 1 \big)\, K_N\!\left( \frac{x_j}{\sqrt{2N}}, \frac{x_l}{\sqrt{2N}} \right) \frac{1}{\sqrt{2N}} \right)_{1 \le j,l \le n}. \qquad (11)$$
Letting $N \to \infty$ and using the appropriate asymptotics we see this is the formula for the Fredholm determinant $\det(I + K)$ where $K$ has kernel

$$K(x, y) = \big( e^{ikf(x/\sqrt{2N})} - 1 \big)\, \frac{\sin(x - y)}{\pi(x - y)}. \qquad (12)$$

We remark here that the asymptotics to arrive at the above formula are far from trivial. The details of an analogous computation for a slightly different case are found in [?] and the reader may follow more or less the same steps for the above.
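Before the passage to the limit, the finite-$N$ identity behind the expansion above, $\check\phi_N(k) = \det\!\big(I + (e^{ikf} - 1)K_N\big)$, can be tested directly on a small ensemble. The sketch below is not from the notes; $N = 2$, the test function $f$, the value of $k$, and the sample size are illustrative choices, and it assumes the standard Hermite-function form of $K_N$ for the weight $e^{-x^2}$. It reduces the operator determinant to an $N \times N$ determinant (using $\det(I + K_1K_2) = \det(I + K_2K_1)$, part (g) of the theorem in the appendix) and compares it with a Monte Carlo estimate of $E\big[e^{ik\sum_j f(x_j)}\big]$ over $2 \times 2$ matrices drawn from the density $c_N e^{-\operatorname{tr}(H^2)}$.

```python
# Check of the finite-N identity  E[ prod_j e^{ik f(x_j)} ] = det(I + (e^{ikf}-1) K_N)
# for the ensemble with density c_N exp(-tr H^2).  A sketch; N, f, k are arbitrary.
import numpy as np
from math import factorial
from numpy.polynomial.hermite import hermval

rng = np.random.default_rng(1)
N, k = 2, 0.7
f = lambda x: np.exp(-x**2)
g = lambda x: np.exp(1j*k*f(x)) - 1.0

def psi(a, x):
    # orthonormal Hermite functions for the weight exp(-x^2)
    return hermval(x, [0.0]*a + [1.0])*np.exp(-x**2/2)/(np.pi**0.25*2**(a/2)*np.sqrt(factorial(a)))

# Since K_N has rank N, det(I + M_g K_N) = det(delta_{ab} + int g psi_a psi_b dx)
x = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
G = np.array([[np.sum(g(x)*psi(a, x)*psi(b, x))*dx for b in range(N)] for a in range(N)])
det_side = np.linalg.det(np.eye(N) + G)

# Monte Carlo estimate of E[exp(ik sum_j f(x_j))] over the eigenvalues
vals = []
for _ in range(50000):
    upper = rng.normal(0.0, 0.5, (N, N)) + 1j*rng.normal(0.0, 0.5, (N, N))
    H = np.triu(upper, 1)
    H = H + H.conj().T + np.diag(rng.normal(0.0, np.sqrt(0.5), N))
    vals.append(np.exp(1j*k*f(np.linalg.eigvalsh(H)).sum()))

print(det_side, np.mean(vals))   # the two complex numbers should roughly agree
```

The agreement is limited only by the Monte Carlo error; the determinant side is exact up to quadrature error.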

Our final step is to recognize this as a Fredholm determinant of a Wiener-Hopf operator and then apply Szegő's (Kac) Theorem.

This operator can easily be seen to be the product $M_\psi F P F^{-1}$ where $Pg = \chi_{(-1,1)} g$, $M_\psi g = \psi g$, $F$ is the Fourier transform, and $\psi = e^{ikf(x/\sqrt{2N})} - 1$. So we have that the transform of our distribution function is

$$\det(I + M_\psi F P F^{-1}) = \det(I + F^{-1} M_\psi F P) = \det(I + P F^{-1} M_\psi F P).$$

This is then defined on $L^2(-1, 1)$ with a kernel that is the transform of a function that depends on $x/\sqrt{2N}$. But this is unitarily equivalent to the convolution operator on $L^2(-\sqrt{2N}, \sqrt{2N})$ with "$\phi$" equal to $e^{ikf(x)} - 1$. Also note that since convolution is translation invariant this is the same as the operator on $L^2(0, 2\sqrt{2N})$.

Thus from our theorem we know the asymptotics. They are

Theorem 5
$$\det(I + W_\alpha(\phi)) \sim \exp\left( \frac{\sqrt{2N}}{\pi} \int_{-\infty}^{\infty} ikf(x)\, dx \; - \; k^2 \int_{0}^{\infty} x\, \hat f(x) \hat f(-x)\, dx \right).$$

Now the Fourier transform of a Gaussian distribution has the same form.

$$\frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(t-\mu)^2}{2\sigma^2}}\, e^{ikt}\, dt = e^{ik\mu}\, e^{-k^2\sigma^2/2}, \qquad (13)$$
and the mean and the variance can be read off from the transform.
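Formula (13) is simple to confirm numerically; in the sketch below (not from the notes; $\mu$, $\sigma$, and $k$ are arbitrary values) the left-hand side is computed by quadrature and compared with $e^{ik\mu} e^{-k^2\sigma^2/2}$.

```python
# Numerical check of the Gaussian characteristic function, formula (13).
import numpy as np

mu, sigma, k = 0.3, 1.5, 2.0
t = np.linspace(mu - 12*sigma, mu + 12*sigma, 200001)
dt = t[1] - t[0]

lhs = np.sum(np.exp(-(t - mu)**2/(2*sigma**2))*np.exp(1j*k*t))*dt/(sigma*np.sqrt(2*np.pi))
rhs = np.exp(1j*k*mu)*np.exp(-k**2*sigma**2/2)
print(lhs, rhs)   # should agree to high accuracy
```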

Now compare this with what we have. It is clear that asymptotically we have a Gaussian with mean $\frac{\sqrt{2N}}{\pi} \int_{-\infty}^{\infty} f(x)\, dx$ and variance $2 \int_{0}^{\infty} x\, \hat f(x) \hat f(-x)\, dx$. This of course works as long as $f$ is in $L^1$ and $\int_{-\infty}^{\infty} |x|\, |\hat f(x)|^2\, dx$ is finite.

9 Appendix

To get a notion of how to define a determinant for an infinite array, recall that if $M$ is a finite matrix then $\det M$ is the product $\prod_{i=1}^{n} \beta_i$ where the $\beta_i$ are the eigenvalues of $M$. If we extend this to an infinite product $\prod_{i=1}^{\infty} \beta_i$ then we are guaranteed the product will converge if $\beta_i = 1 + \lambda_i$ with
$$\sum_{i=1}^{\infty} |\lambda_i| < \infty.$$

Hence we look for operators of the form $I + K$ where $K$ has a discrete set of eigenvalues $\lambda_i$ which satisfy $\sum_{i=1}^{\infty} |\lambda_i| < \infty$. A class of operators with exactly this property is the set of trace class operators. We say that $K$ is trace class if

$$\|K\|_1 = \sum_{n=1}^{\infty} \big\langle (K K^*)^{1/2} e_n, e_n \big\rangle < \infty \qquad (14)$$
for any set of orthonormal basis vectors $\{e_n\}$ for our Hilbert space. (We are assuming here that our space is separable.) It can be shown that if the above sum is finite with respect to some basis then it is finite for every basis and is independent of the choice. It is not always convenient to check whether or not an operator is trace class from the definition. However, it is fairly straightforward to check whether an operator is Hilbert-Schmidt, and it is well known that a product of two Hilbert-Schmidt operators is trace class. The Hilbert-Schmidt class of operators is the set of $K$ such that the sum
$$\sum_{i,j} |\langle K e_i, e_j \rangle|^2$$
is finite for some choice of orthonormal basis. If it is, then the above sum is independent of the choice of basis and its square root is called the Hilbert-Schmidt norm of the operator. The basic facts about trace class and Hilbert-Schmidt operators are contained in the following theorem. We state this theorem without proof, but refer the reader to [?] or [?] for general results. The last part of this theorem first appeared in [?], which is really the first paper where the idea of trace class operators was used to make the results about Toeplitz operators seem natural. The reader is strongly advised to look at this paper.

Theorem 6

(a) Trace class operators form an ideal in the set of all bounded operators and are closed in the topology given by the trace norm defined in (14).

(b) Hilbert-Schmidt operators form an ideal in the set of all bounded operators and are closed in the topology defined by the Hilbert-Schmidt norm.

(c) The product of two Hilbert-Schmidt operators is trace class.

(d) If $K$ is trace class, then $\det P_n(I + K)P_n \to \det(I + K)$ as $n \to \infty$. Here $P_n$ is the orthogonal projection onto the linear span of the basis elements $\{e_0, e_1, \dots, e_{n-1}\}$. (For the first determinant we think of $P_n(I + K)P_n$ as the finite rank operator defined on the image of $P_n$.)

(e) If $A_n \to A$, $B_n^* \to B^*$ strongly (pointwise in the Hilbert space) and if $K$ is trace class, then $A_n K B_n \to A K B$ in the trace norm.

(f) The functions defined by $\operatorname{tr} K$ and $\det(I + K)$ are continuous on the set of trace class operators with respect to the trace norm.

(g) If $K_1 K_2$ and $K_2 K_1$ are trace class, then $\operatorname{tr}(K_1 K_2) = \operatorname{tr}(K_2 K_1)$ and $\det(I + K_1 K_2) = \det(I + K_2 K_1)$.
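For finite matrices all of these quantities can be computed from the singular values: the trace norm is their sum and the Hilbert-Schmidt norm is the square root of the sum of their squares. The sketch below (not from the notes; the matrices are random examples) illustrates the two norms, the inequality $\|K_1 K_2\|_1 \le \|K_1\|_2 \|K_2\|_2$ that lies behind part (c), and the identities in part (g).

```python
# Finite-dimensional illustration of the trace norm, the Hilbert-Schmidt norm,
# and parts (c) and (g) of the theorem.  Random matrices; a sketch, not from the notes.
import numpy as np

rng = np.random.default_rng(2)
n = 6
K1 = rng.standard_normal((n, n)) + 1j*rng.standard_normal((n, n))
K2 = rng.standard_normal((n, n)) + 1j*rng.standard_normal((n, n))

def trace_norm(K):        # sum of singular values, i.e. tr (K K*)^{1/2}
    return np.linalg.svd(K, compute_uv=False).sum()

def hs_norm(K):           # Hilbert-Schmidt (Frobenius) norm
    return np.sqrt((np.abs(K)**2).sum())

# (c): ||K1 K2||_1 <= ||K1||_2 ||K2||_2, so a product of Hilbert-Schmidt operators is trace class
print(trace_norm(K1 @ K2), hs_norm(K1)*hs_norm(K2))

# (g): tr(K1 K2) = tr(K2 K1) and det(I + K1 K2) = det(I + K2 K1)
I = np.eye(n)
print(np.trace(K1 @ K2), np.trace(K2 @ K1))
print(np.linalg.det(I + K1 @ K2), np.linalg.det(I + K2 @ K1))
```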

Now that we have a way to define an infinite determinant for an operator of the form I + K when K is trace class, the question still remains as to how one can compute the determinant in some concrete way. This will be crucial for our applications to random matrices. There are two basic formulas (among many others) that we will consider. The first is that if

$I + K = e^A$ where $A$ is trace class, then $\det(I + K) = \det e^A = e^{\operatorname{tr} A}$.
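For finite matrices this is the familiar identity $\det e^A = e^{\operatorname{tr} A}$, and, as explained below, the operator statement is obtained from the finite case by a limiting argument. A quick numerical check on an arbitrary random matrix (a sketch, not from the notes):

```python
# Check of det(e^A) = e^{tr A} for a finite matrix.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 5)) + 1j*rng.standard_normal((5, 5))
print(np.linalg.det(expm(A)), np.exp(np.trace(A)))   # the two values should agree
```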

The second is that if operators $K_1$ and $K_2$ satisfy the condition that

$K_1 K_2 - K_2 K_1$ is trace class (note that neither $K_1$ nor $K_2$ need be), then

$$\det\big( e^{K_1} e^{K_2} e^{-K_1} e^{-K_2} \big) = e^{\operatorname{tr}(K_1 K_2 - K_2 K_1)}.$$

The first formula follows from the definition of the exponential, properties (a), (d) and (e) above, and the fact that the formula is true for finite matrices. To see this, compare the expressions $e^{P_n A P_n}$ and $e^A$. The second we shall not prove, but we remark that it follows by using the Baker-Campbell-Hausdorff formula to expand products of exponentials of non-commuting operators. This is shown in [?].

References

[1] E. L. Basor, Y. Chen, H. Widom, Determinants of Hankel matrices, J. Funct. Anal. 179 (2001), no. 1, 214–234.

[2] E. L. Basor, Toeplitz determinants, Fisher-Hartwig symbols, and random matrices. Recent perspectives in random matrix theory and number theory, 309–336, London Math. Soc. Lecture Note Ser., 322, Cambridge Univ. Press, Cambridge, 2005.

[3] A. Böttcher, B. Silbermann, Introduction to Large Truncated Toeplitz Matrices, Springer-Verlag, Berlin, 1998.

[4] F.J. Dyson, Fredholm determinants and inverse scattering problems, Comm. Math. Phys. 47, 171-183 (1976).

[5] P. L. Duren, Theory of $H^p$ Spaces, Academic Press, New York, 1970.

[6] T. Ehrhardt, Dyson's constant in the asymptotics of the Fredholm determinant of the sine kernel, Comm. Math. Phys. 262, 317–341 (2006).

[7] I. C. Gohberg, M. G. Krein, Introduction to the Theory of Linear Nonselfadjoint Operators, Vol. 18, Translations of Mathematical Monographs, Amer. Math. Soc., Rhode Island, 1969.

[8] M. L. Mehta. Random Matrices, Academic Press, San Diego, 1991.

[9] G. Szegő, Orthogonal Polynomials, Vol. 23, Colloquium Publications, Amer. Math. Soc., New York, 1959.

[10] C. A. Tracy, H. Widom. Introduction to random matrices, in Proc. 8th Scheveningen Conf., Springer Lecture Notes in Physics, (1993).

[11] H. Widom, Asymptotics for the Fredholm determinant of the sine kernel on a union of intervals, Comm. Math. Phys. 171, 159-180 (1995).
