<<

THE ANALYTIC THEORY OF MATRIX ORTHOGONAL POLYNOMIALS

DAVID DAMANIK, ALEXANDER PUSHNITSKI, AND BARRY SIMON

Contents

1. Introduction
1.1. Introduction and Overview
1.2. Matrix-Valued Measures
1.3. Matrix Möbius Transformations
1.4. Applications and Examples
2. Matrix Orthogonal Polynomials on the Real Line
2.1. Preliminaries
2.2. Block Jacobi Matrices
2.3. The m-Function
2.4. Second Kind Polynomials
2.5. Solutions to the Difference Equations
2.6. Wronskians and the Christoffel–Darboux Formula
2.7. The CD Kernel
2.8. Christoffel Variational Principle
2.9. Zeros
2.10. Lower Bounds on p and the Stieltjes–Weyl Formula for m
2.11. Wronskians of Vector-Valued Solutions
2.12. The Order of Zeros/Poles of m(z)
2.13. Resolvent of the Jacobi Matrix
3. Matrix Orthogonal Polynomials on the Unit Circle
3.1. Definition of MOPUC
3.2. The Szegő Recursion
3.3. Second Kind Polynomials
3.4. Christoffel–Darboux Formulas
3.5. Zeros of MOPUC

Date: October 22, 2007.
2000 Mathematics Subject Classification. 42C05, 47B36, 30C10.
Key words and phrases. Orthogonal Polynomials, Matrix-Valued Measures, Block Jacobi Matrices, Block CMV Matrices.
D.D. was supported in part by NSF grants DMS–0500910 and DMS–0653720. B.S. was supported in part by NSF grants DMS–0140592 and DMS–0652919 and U.S.–Israel Binational Science Foundation (BSF) Grant No. 2002068.

3.6. Bernstein–Szegő Approximation
3.7. Verblunsky's Theorem
3.8. Matrix POPUC
3.9. Matrix-Valued Carathéodory and Schur Functions
3.10. Coefficient Stripping, the Schur Algorithm, and Geronimus' Theorem
3.11. The CMV Matrix
3.12. The Resolvent of the CMV Matrix
3.13. Khrushchev Theory
4. The Szegő Mapping and the Geronimus Relations
5. Regular MOPRL
5.1. Upper Bound and Definition
5.2. Density of Zeros
5.3. General Asymptotics
5.4. Weak Convergence of the CD Kernel and Consequences
5.5. Widom's Theorem
5.6. A Conjecture
References

1. Introduction

1.1. Introduction and Overview. Orthogonal polynomials on the real line (OPRL) were developed in the nineteenth century, and orthogonal polynomials on the unit circle (OPUC) were initially developed around 1920 by Szegő. Their matrix analogues are of much more recent vintage. They were originally developed in the MOPUC case indirectly in the study of prediction theory [116, 117, 129, 131, 132, 138, 196] in the period 1940–1960. The connection to OPUC in the scalar case was discovered by Krein [131]. Much of the theory since is in the electrical engineering literature [36, 37, 38, 39, 40, 41, 120, 121, 122, 123, 203]; see also [84, 86, 87, 88, 142].

The corresponding real line theory (MOPRL) is still more recent: Following early work of Krein [133] and Berezan'ski [9] on block Jacobi matrices, mainly as applied to self-adjoint extensions, there was a seminal paper of Aptekarev–Nikishin [4] and a flurry of papers since the 1990s [10, 11, 12, 14, 16, 17, 19, 20, 21, 22, 29, 35, 43, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 64, 65, 66, 67, 68, 69, 71, 73, 74, 75, 76, 77, 79, 83, 85, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 137, 139, 140, 143, 144, 145, 148, 149, 150, 155, 156, 157, 154, 161, 162, 179, 186, 198, 200, 201, 202, 204]; see also [7].

There is very little on the subject in monographs; the more classical ones (e.g., [23, 82, 93, 184]) predate most of the subject; see, however, Atkinson [5, Section 6.6]. Ismail [118] has no discussion and Simon [167, 168] has only a single section! Because of the use of MOPRL in [33], we became interested in the subject and, in particular, we needed some basic results for that paper which we could not find in the literature or which, at least, were not very accessible. Thus, we decided to produce this comprehensive review that we hope others will find useful.

As in the scalar case, the subject breaks into two parts, conveniently called the analytic theory (general structure results) and the algebraic theory (the set of non-trivial examples). This survey deals entirely with the analytic theory. We note, however, that one of the striking developments of recent years has been the discovery that there are rich classes of genuinely new MOPRL, even at the classical level of Bochner's theorem; see [20, 55, 70, 72, 102, 109, 110, 111, 112, 113, 156, 161] and the forthcoming monograph [63] for further discussion of this algebraic side.

In this introduction, we will focus mainly on the MOPRL case. For scalar OPRL, a key issue is the passage from measure to monic OPRL, then to normalized OPRL, and finally to Jacobi parameters. There are no choices in going from measure to monic OP, P_n(x). They are determined by

  P_n(x) = x^n + lower order,   ⟨x^j, P_n⟩ = 0,  j = 0, 1, ..., n−1.   (1.1)

However, the basic condition on the orthonormal polynomials, namely,

  ⟨p_n, p_m⟩ = δ_{nm}   (1.2)

does not uniquely determine the p_n(x). The standard choice is

  p_n(x) = P_n(x)/‖P_n‖.

However, if θ_0, θ_1, ... are arbitrary real numbers, then

  p̃_n(x) = e^{iθ_n} P_n(x)/‖P_n‖   (1.3)

also obey (1.2). If the recursion coefficients (aka Jacobi parameters) are defined via

  x p_n = a_{n+1} p_{n+1} + b_{n+1} p_n + a_n p_{n−1}   (1.4)

then the choice (1.3) leads to

  b̃_n = b_n,   ã_n = e^{iθ_n} a_n e^{−iθ_{n−1}}.   (1.5)
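The passage from measure to Jacobi parameters can be made concrete numerically. The following sketch is illustrative only (it is not from the paper): it assumes a discrete measure on twelve equally weighted points, builds orthonormal polynomials by Gram–Schmidt on the monomials, and checks the recursion (1.4); replacing any p_n by −p_n realizes the phase freedom (1.3)/(1.5) with θ_n = π.

```python
import numpy as np

# Assumed discrete measure: 12 equally weighted points in [-1, 1]
# (non-trivial, since the points are distinct).
x = np.linspace(-1.0, 1.0, 12)
w = np.full_like(x, 1.0 / len(x))

def ip(f, g):
    # <f, g> = sum_i w_i f(x_i) g(x_i)
    return float(np.sum(w * f * g))

# Gram-Schmidt on the monomials 1, x, x^2, ... gives p_0, p_1, ...
deg = 5
p = []
for n in range(deg + 1):
    v = x ** n
    for q in p:
        v = v - ip(q, v) * q
    p.append(v / np.sqrt(ip(v, v)))

# Jacobi parameters via (1.4): x p_n = a_{n+1} p_{n+1} + b_{n+1} p_n + a_n p_{n-1}
for n in range(1, deg - 1):
    a_next = ip(p[n + 1], x * p[n])   # a_{n+1}
    b_next = ip(p[n], x * p[n])       # b_{n+1}
    a_prev = ip(p[n - 1], x * p[n])   # a_n
    resid = x * p[n] - (a_next * p[n + 1] + b_next * p[n] + a_prev * p[n - 1])
    assert np.max(np.abs(resid)) < 1e-8
```

Flipping the sign of a single p_n leaves every b_m unchanged but flips the signs of a_n and a_{n+1}, the real-phase instance of (1.5).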

The standard choice is, of course, most natural here; for example, if

  p_n(x) = κ_n x^n + lower order   (1.6)

then a_n > 0 implies κ_n > 0. It would be crazy to make any other choice.

For MOPRL, these choices are less clear. As we will explain in Section 1.2, there are now two matrix-valued "inner products" formally written as

  ⟨⟨f, g⟩⟩_R = ∫ f(x)† dµ(x) g(x)   (1.7)

  ⟨⟨f, g⟩⟩_L = ∫ g(x) dµ(x) f(x)†   (1.8)

where now µ is a matrix-valued measure and † denotes the adjoint, and correspondingly two sets of monic OPRL: P_n^R(x) and P_n^L(x). The orthonormal polynomials are required to obey

  ⟨⟨p_n^R, p_m^R⟩⟩_R = δ_{nm} 1.   (1.9)

The analogue of (1.3) is

  p̃_n^R(x) = P_n^R(x) ⟨⟨P_n^R, P_n^R⟩⟩_R^{−1/2} σ_n   (1.10)

for a unitary σ_n. For the immediately following, use p_n^R to be the choice σ_n ≡ 1. For any such choice, we have a recursion relation,

  x p_n^R(x) = p_{n+1}^R(x) A_{n+1}† + p_n^R(x) B_{n+1} + p_{n−1}^R(x) A_n   (1.11)

with the analogue of (1.5) (comparing σ_n ≡ 1 to general σ_n)

  B̃_n = σ_n† B_n σ_n,   Ã_n = σ_{n−1}† A_n σ_n.   (1.12)

The obvious analogue of the scalar case is to pick σ_n ≡ 1, which makes κ_n in

  p_n^R(x) = κ_n x^n + lower order   (1.13)

obey κ_n > 0. Note that (1.11) implies

  κ_n = κ_{n+1} A_{n+1}†   (1.14)

or, inductively,

  κ_n = (A_n† ⋯ A_1†)^{−1}.   (1.15)

In general, this choice does not lead to A_n positive or even Hermitian. Alternatively, one can pick σ_n so that Ã_n is positive. Besides these two "obvious" choices, κ_n > 0 or A_n > 0, there is a third, that A_n be lower triangular, which, as we will see in Section 1.4, is natural. Thus, in the study of MOPRL one needs to talk about equivalent sets of p_n^R and of Jacobi parameters, and this is a major theme of Chapter 2. Interestingly enough, for MOPUC the commonly picked choice equivalent to A_n > 0 (namely, ρ_n > 0) seems to suffice for applications. So we do not discuss equivalence classes for MOPUC.

Associated to a set of matrix Jacobi parameters is a block Jacobi matrix, that is, a matrix which, when written in l × l blocks, is tridiagonal; see (2.29) below.

In Chapter 2, we discuss the basics of MOPRL while Chapter 3 discusses MOPUC. Chapter 4 discusses the Szegő mapping connection of MOPUC and MOPRL. Finally, Chapter 5 discusses the extension of the theory of regular OPs [180] to MOPRL.

While this is mainly a survey, it does have numerous new results, of which we mention:
(a) The clarification of equivalent Jacobi parameters and several new theorems (Theorems 2.8 and 2.9).
(b) A new result (Theorem 2.28) on the order of poles or zeros of m(z) in terms of eigenvalues of J and the once-stripped J^(1).
(c) Formulas for the resolvent in the MOPRL (Theorem 2.29) and MOPUC (Theorem 3.24) cases.
(d) A theorem on zeros of det(Φ_n^R) (Theorem 3.7) and eigenvalues of a cutoff CMV matrix (Theorem 3.10).
(e) A new proof of the Geronimus relations (Theorem 4.2).
(f) Discussion of regular MOPRL (Chapter 5).

There are numerous open questions and conjectures in this paper, of which we mention:
(1) We prove that type 1 and type 3 Jacobi parameters in the Nevai class have A_n → 1 but do not know if this is true for type 2 and, if so, how to prove it.
(2) Determine which monic matrix polynomials, Φ, can occur as monic MOPUC. We know det(Φ(z)) must have all of its zeros in the unit disk in C, but unlike the scalar case where this is sufficient, we do not know necessary and sufficient conditions.
(3) Generalize Khrushchev theory [125, 126, 101] to MOPUC; see Section 3.13.
(4) Provide a proof of the Geronimus relations for MOPUC that uses the theory of canonical moments [43]; see the discussion at the start of Chapter 4.
(5) Prove Conjecture 5.9, extending a result of Stahl–Totik [180] from OPRL to MOPRL.
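The matrix recursion (1.11) and the normalization (1.9) can be checked numerically. The sketch below is illustrative and not from the paper: it assumes a discrete matrix-valued measure with random positive definite weights, runs a right-module Gram–Schmidt on the monomials x^n·1 (the choice σ_n ≡ 1), and reads off the recursion coefficients via ⟨⟨·,·⟩⟩_R.

```python
import numpy as np

l, npts = 2, 8
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, npts)
w = []                               # positive definite weight at each point
for _ in range(npts):
    B = rng.normal(size=(l, l))
    w.append(B @ B.T + 0.5 * np.eye(l))

def ipR(f, g):
    # Discrete version of (1.7): <<f, g>>_R = sum_i f(x_i)^dag w_i g(x_i)
    return sum(f[i].conj().T @ w[i] @ g[i] for i in range(npts))

def inv_sqrt(m):
    vals, vecs = np.linalg.eigh(m)
    return vecs @ np.diag(vals ** -0.5) @ vecs.conj().T

def rmul(f, c):                      # right multiplication f(x) c, pointwise
    return np.einsum('ijk,kl->ijl', f, c)

# Right Gram-Schmidt on x^n * 1 gives the orthonormal p_0^R, p_1^R, ...
deg = 3
mono = [np.array([x[i] ** n * np.eye(l) for i in range(npts)]) for n in range(deg + 1)]
p = []
for n in range(deg + 1):
    v = mono[n].copy()
    for q in p:
        v = v - rmul(q, ipR(q, v))
    p.append(rmul(v, inv_sqrt(ipR(v, v))))

# (1.9): <<p_j, p_k>>_R = delta_{jk} 1
for j in range(deg + 1):
    for k in range(deg + 1):
        target = np.eye(l) if j == k else np.zeros((l, l))
        assert np.allclose(ipR(p[j], p[k]), target, atol=1e-8)

# (1.11): x p_n = p_{n+1} A_{n+1}^dag + p_n B_{n+1} + p_{n-1} A_n
n = 1
xpn = np.array([x[i] * p[n][i] for i in range(npts)])
Adag = ipR(p[n + 1], xpn)            # A_{n+1}^dagger; not Hermitian in general
Bmat = ipR(p[n], xpn)                # B_{n+1}; always Hermitian
Amat = ipR(p[n - 1], xpn)            # A_n
resid = xpn - (rmul(p[n + 1], Adag) + rmul(p[n], Bmat) + rmul(p[n - 1], Amat))
assert np.max(np.abs(resid)) < 1e-8
assert np.allclose(Bmat, Bmat.conj().T, atol=1e-8)
```

With this choice the leading coefficients κ_n are generally not related to A_n by positivity, which is exactly the equivalence-class issue discussed above.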
It is a pleasure to thank Alexander Aptekarev, Christian Berg, Antonio Durán, Jeff Geronimo, Fritz Gesztesy, Alberto Grünbaum, Paco Marcellán, Ken McLaughlin, Hermann Schulz-Baldes, and Walter Van Assche for useful correspondence.

1.2. Matrix-Valued Measures. Let M_l denote the ring of all l × l complex-valued matrices; we denote by α† the Hermitian conjugate of α ∈ M_l. (Because of the use of * for the Szegő dual in the theory of OPUC, we do not use it for the adjoint.) For α ∈ M_l, we denote by ‖α‖ its Euclidean norm (i.e., the norm of α as a linear operator on C^l with the usual Euclidean norm). Consider the set P of all polynomials in z ∈ C with coefficients from M_l. The set P can be considered either as a right or as a left module over M_l; clearly, conjugation makes the left and right structures isomorphic. For n = 0, 1, ..., P_n will denote those polynomials in P of degree at most n. The set V denotes the set of all polynomials in z ∈ C with coefficients from C^l. The standard inner product in C^l is denoted by ⟨·,·⟩_{C^l}.

A matrix-valued measure, µ, on R (or C) is the assignment of a positive semi-definite l × l matrix µ(X) to every Borel set X which is countably additive. We will usually normalize it by requiring

  µ(R) = 1   (1.16)

(or µ(C) = 1) where 1 is the l × l identity matrix. (We use 1 in general for an identity operator, whether in M_l or in the operators on some other Hilbert space, and 0 for the zero operator or matrix.) Normally, our measures for MOPRL will have compact support and, of course, our measures for MOPUC will be supported on all or part of ∂D (D is the unit disk in C).

Associated to any such measure is a scalar measure

  µ_tr(X) = Tr(µ(X))   (1.17)

the trace (normalized by Tr(1) = l); µ_tr is normalized by µ_tr(R) = l.

Applying the Radon–Nikodym theorem to the matrix elements of µ, we see there is a positive semi-definite matrix function M_ij(x) so

  dµ_ij(x) = M_ij(x) dµ_tr(x).   (1.18)

Clearly, by (1.17),

  Tr(M(x)) = 1   (1.19)

for dµ_tr-a.e. x. Conversely, any scalar measure µ_tr with µ_tr(R) = l and positive semi-definite matrix-valued function M obeying (1.19) define a matrix-valued measure normalized by (1.16).

Given l × l matrix-valued functions f, g, we define the l × l matrix ⟨⟨f, g⟩⟩_R by

  ⟨⟨f, g⟩⟩_R = ∫ f(x)† M(x) g(x) dµ_tr(x)   (1.20)

that is, its (j, k) entry is

  Σ_{n,m} ∫ f̄_{nj}(x) M_{nm}(x) g_{mk}(x) dµ_tr(x).   (1.21)

Since f†Mf ≥ 0, we see that

  ⟨⟨f, f⟩⟩_R ≥ 0.   (1.22)

One might be tempted to think of ⟨⟨f, f⟩⟩_R^{1/2} as some kind of norm, but that is doubtful. Even if µ is supported at a single point, x_0, with M = l^{−1} 1, this "norm" is essentially the absolute value of A = f(x_0), which is known not to obey the triangle inequality! (See [169, Sect. I.1] for an example.) However, if one looks at

  ‖f‖_R = (Tr ⟨⟨f, f⟩⟩_R)^{1/2}   (1.23)

one does have a norm (or, at least, a semi-norm). Indeed,

  ⟨f, g⟩_R = Tr ⟨⟨f, g⟩⟩_R   (1.24)

is a sesquilinear form which is positive semi-definite, so (1.23) is the semi-norm corresponding to an inner product and, of course, one has a Cauchy–Schwarz inequality

  |Tr ⟨⟨f, g⟩⟩_R| ≤ ‖f‖_R ‖g‖_R.   (1.25)

We have not specified which f's and g's can be used in (1.20). We have in mind mainly polynomials in x in the real case and Laurent polynomials in z in the ∂D case although, obviously, continuous functions are okay. Indeed, it suffices that f (and g) be measurable and obey

  ∫ Tr(f†(x) f(x)) dµ_tr(x) < ∞   (1.26)

for the integrals in (1.21) to converge. The set of equivalence classes under f ∼ g if ‖f − g‖_R = 0 defines a Hilbert space, H, and ⟨f, g⟩_R is the inner product on this space.

Instead of (1.20), we use the suggestive shorthand

  ⟨⟨f, g⟩⟩_R = ∫ f(x)† dµ(x) g(x).   (1.27)

The use of R here comes from "right," for if α ∈ M_l,

  ⟨⟨f, gα⟩⟩_R = ⟨⟨f, g⟩⟩_R α   (1.28)

  ⟨⟨fα, g⟩⟩_R = α† ⟨⟨f, g⟩⟩_R   (1.29)

but, in general, ⟨⟨f, αg⟩⟩_R is not related to ⟨⟨f, g⟩⟩_R.
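The right-module relations (1.28)/(1.29) are easy to confirm numerically. The following sketch is illustrative (not from the paper) and assumes a discrete matrix measure with random positive definite weights and random matrix-valued "functions":

```python
import numpy as np

l, npts = 2, 5
rng = np.random.default_rng(1)
w = []
for _ in range(npts):
    B = rng.normal(size=(l, l))
    w.append(B @ B.T + 0.5 * np.eye(l))

def ipR(f, g):
    # Discrete version of (1.27): <<f, g>>_R = sum_i f(x_i)^dag w_i g(x_i)
    return sum(f[i].conj().T @ w[i] @ g[i] for i in range(npts))

f = rng.normal(size=(npts, l, l))
g = rng.normal(size=(npts, l, l))
alpha = rng.normal(size=(l, l))

# (1.28): <<f, g alpha>>_R = <<f, g>>_R alpha
assert np.allclose(ipR(f, g @ alpha), ipR(f, g) @ alpha)
# (1.29): <<f alpha, g>>_R = alpha^dag <<f, g>>_R
assert np.allclose(ipR(f @ alpha, g), alpha.conj().T @ ipR(f, g))
# Left multiplication does NOT factor through <<.,.>>_R in general:
left = ipR(f, np.array([alpha @ g[i] for i in range(npts)]))
assert not np.allclose(left, ipR(f, g) @ alpha)
```

The last assertion shows why only right multiplication is compatible with ⟨⟨·,·⟩⟩_R, which is what makes P a right module in this context.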

While (Tr ⟨⟨f, f⟩⟩_R)^{1/2} is a natural analogue of the norm in the scalar case, it will sometimes be useful to instead consider

  [det ⟨⟨f, f⟩⟩_R]^{1/2}.   (1.30)

Indeed, this is a stronger "norm" in that det > 0 ⇒ Tr > 0 but not vice-versa.

When dµ is a "direct sum," that is, each M(x) is diagonal, one can appreciate the difference. In that case, dµ = dµ_1 ⊕ ⋯ ⊕ dµ_l and the MOPRL are direct sums (i.e., diagonal matrices) of scalar OPRL

  P_n^R(x, dµ) = P_n(x, dµ_1) ⊕ ⋯ ⊕ P_n(x, dµ_l).   (1.31)

Then

  ‖P_n^R‖_R = ( Σ_{j=1}^l ‖P_n(·, dµ_j)‖²_{L²(dµ_j)} )^{1/2}   (1.32)

while

  (det ⟨⟨P_n^R, P_n^R⟩⟩_R)^{1/2} = Π_{j=1}^l ‖P_n(·, dµ_j)‖_{L²(dµ_j)}.   (1.33)

In particular, in terms of extending the theory of regular measures [180], ‖P_n^R‖_R^{1/n} is only sensitive to max_j ‖P_n(·, dµ_j)‖^{1/n}_{L²(dµ_j)} while (det ⟨⟨P_n^R, P_n^R⟩⟩_R)^{1/2} is sensitive to them all. Thus, det will be needed for that theory (see Chapter 5).

There will also be a left inner product and, correspondingly, two sets of MOPRL and MOPUC. We discuss this further in Sections 2.1 and 3.1.

Occasionally, for C^l vector-valued functions f and g, we will want to consider the scalar

  Σ_{k,j} ∫ f̄_k(x) M_{kj}(x) g_j(x) dµ_tr(x)   (1.34)

which we will denote

  ∫ ⟨f(x), dµ(x) g(x)⟩_{C^l}.   (1.35)

We next turn to module Fourier expansions. A set {ϕ_j}_{j=1}^N in H (N may be infinite) is called orthonormal if and only if

  ⟨⟨ϕ_j, ϕ_k⟩⟩_R = δ_{jk} 1.   (1.36)

This natural terminology is an abuse of notation since (1.36) implies orthogonality in ⟨·,·⟩_R but not normalization, and is much stronger than orthogonality in ⟨·,·⟩_R.

Suppose for a moment that N < ∞. For any a_1, ..., a_N ∈ M_l, we can form Σ_{j=1}^N ϕ_j a_j and, by the right multiplication relations (1.28), (1.29), and (1.36), we have

  ⟨⟨ Σ_{j=1}^N ϕ_j a_j, Σ_{j=1}^N ϕ_j b_j ⟩⟩_R = Σ_{j=1}^N a_j† b_j.   (1.37)

We will denote the set of all such Σ_{j=1}^N ϕ_j a_j by H_{(ϕ_j)}; it is a vector subspace of H of dimension (over C) Nl².

Define, for f ∈ H,

  π_{(ϕ_j)}(f) = Σ_{j=1}^N ϕ_j ⟨⟨ϕ_j, f⟩⟩_R.   (1.38)

It is easy to see it is the orthogonal projection in the scalar inner product ⟨·,·⟩_R from H to H_{(ϕ_j)}.

By the standard Hilbert space calculation, taking care to only multiply on the right, one finds the Pythagorean theorem,

  ⟨⟨f, f⟩⟩_R = ⟨⟨f − π_{(ϕ_j)}f, f − π_{(ϕ_j)}f⟩⟩_R + Σ_{j=1}^N ⟨⟨ϕ_j, f⟩⟩_R† ⟨⟨ϕ_j, f⟩⟩_R.   (1.39)

As usual, this proves for infinite N that

  Σ_{j=1}^N ⟨⟨ϕ_j, f⟩⟩_R† ⟨⟨ϕ_j, f⟩⟩_R ≤ ⟨⟨f, f⟩⟩_R   (1.40)

and the convergence of

  Σ_{j=1}^N ϕ_j ⟨⟨ϕ_j, f⟩⟩_R ≡ π_{(ϕ_j)}(f)   (1.41)

allowing the definition of π_{(ϕ_j)} and of H_{(ϕ_j)} ≡ Ran π_{(ϕ_j)} for N = ∞. An orthonormal set is called complete if H_{(ϕ_j)} = H. In that case, equality holds in (1.40) and π_{(ϕ_j)}(f) = f.

For orthonormal bases, we have the Parseval relation from (1.39)

  ⟨⟨f, f⟩⟩_R = Σ_{j=1}^∞ ⟨⟨ϕ_j, f⟩⟩_R† ⟨⟨ϕ_j, f⟩⟩_R   (1.42)

and

  ‖f‖_R² = Σ_{j=1}^∞ Tr(⟨⟨ϕ_j, f⟩⟩_R† ⟨⟨ϕ_j, f⟩⟩_R).   (1.43)
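The module Pythagorean theorem (1.39) and the Bessel-type inequality (1.40) can be checked on a small example. The sketch below is illustrative (not from the paper): it assumes the measure giving equal identity weight to three points, builds an orthonormal set of two elements supported at single points, and verifies both matrix identities for a random f.

```python
import numpy as np

l, npts = 2, 3
rng = np.random.default_rng(2)

def ipR(f, g):
    # Measure: weight (1/npts) * identity at each of npts points.
    return sum(f[i].conj().T @ g[i] for i in range(npts)) / npts

# Orthonormal set in the sense of (1.36): phi_j supported at the j-th
# point, scaled so that <<phi_j, phi_k>>_R = delta_{jk} 1.  N = 2 < npts,
# so the set is not complete.
phi = []
for j in range(2):
    v = np.zeros((npts, l, l))
    v[j] = np.sqrt(npts) * np.eye(l)
    phi.append(v)

f = rng.normal(size=(npts, l, l))
proj = sum(np.einsum('ijk,kl->ijl', phi[j], ipR(phi[j], f)) for j in range(2))

# (1.39): <<f,f>> = <<f - pi f, f - pi f>> + sum_j <<phi_j,f>>^dag <<phi_j,f>>
lhs = ipR(f, f)
rhs = ipR(f - proj, f - proj) + sum(
    ipR(phi[j], f).conj().T @ ipR(phi[j], f) for j in range(2))
assert np.allclose(lhs, rhs)

# (1.40): sum_j <<phi_j,f>>^dag <<phi_j,f>> <= <<f,f>> as matrices
diff = lhs - sum(ipR(phi[j], f).conj().T @ ipR(phi[j], f) for j in range(2))
diff = (diff + diff.conj().T) / 2
assert np.min(np.linalg.eigvalsh(diff)) > -1e-10
```

Note that the coefficients ⟨⟨ϕ_j, f⟩⟩_R always multiply ϕ_j on the right, exactly as the text's warning about right multiplication requires.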

1.3. Matrix Möbius Transformations. Without an understanding of matrix Möbius transformations, the form of the MOPUC Geronimus theorem we will prove in Section 3.10 will seem strange-looking. To set the stage, recall that scalar fractional linear transformations (FLTs) are associated to matrices T = ( a b ; c d ) with det T ≠ 0 via

  f_T(z) = (az + b)/(cz + d).   (1.44)

Without loss, one can restrict to

  det(T) = 1.   (1.45)

Indeed, T ↦ f_T is a 2-to-1 map of SL(2, C) onto the maps of C ∪ {∞} to itself. One advantage of the matrix formalism is that the map is a matrix homomorphism, that is,

  f_{TS} = f_T ∘ f_S   (1.46)

which shows that the group of FLTs is SL(2, C)/{1, −1}. While (1.46) can be checked by direct calculation, a more instructive way is to look at the complex projective line: u, v ∈ C² \ {0} are called equivalent if there is λ ∈ C \ {0} so that u = λv. Let [·] denote equivalence classes. Except for [(1, 0)ᵗ], every equivalence class contains exactly one point of the form (z, 1)ᵗ with z ∈ C. If [(1, 0)ᵗ] is associated with ∞, the set of equivalence classes is naturally associated with C ∪ {∞}. f_T then obeys

  [ T (z, 1)ᵗ ] = [ (f_T(z), 1)ᵗ ]   (1.47)

from which (1.46) is immediate.

By Möbius transformations we will mean those FLTs that map D onto itself. Let

  J = ( 1 0 ; 0 −1 ).   (1.48)

Then [u] = [(z, 1)ᵗ] with |z| = 1 (resp. |z| < 1) if and only if ⟨u, Ju⟩ = 0 (resp. ⟨u, Ju⟩ < 0). From this, it is not hard to show that if det(T) = 1, then f_T maps D invertibly onto D if and only if

  T† J T = J.   (1.49)

If T has the form ( a b ; c d ), this is equivalent to

  |a|² − |c|² = 1,   |b|² − |d|² = −1,   a b̄ − c d̄ = 0.   (1.50)

The set of T's obeying det(T) = 1 and (1.49) is called SU(1,1). It is studied extensively in [168, Sect. 10.4].
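A quick numerical sanity check of (1.46) and (1.49) (an illustrative sketch, not from the paper; the particular matrices are assumed test data built from points of the disk, using the parametrization that appears in (1.51) below):

```python
import numpy as np

def fT(T, z):
    # The FLT (1.44) associated to a 2x2 matrix T
    a, b, c, d = T.ravel()
    return (a * z + b) / (c * z + d)

def disk_element(alpha):
    # An element of SU(1,1) built from alpha in D (cf. (1.51))
    rho = np.sqrt(1 - abs(alpha) ** 2)
    return np.array([[1, alpha], [np.conj(alpha), 1]]) / rho

J = np.diag([1.0, -1.0])
Ta = disk_element(0.3 + 0.4j)
Tb = disk_element(-0.5 + 0.1j)

# Membership in SU(1,1): det T = 1 and T^dag J T = J, i.e. (1.49)
assert np.isclose(np.linalg.det(Ta), 1.0)
assert np.allclose(Ta.conj().T @ J @ Ta, J)

# Homomorphism (1.46): f_{TS} = f_T o f_S
z = 0.2 - 0.6j                        # a point of D
assert np.isclose(fT(Ta @ Tb, z), fT(Ta, fT(Tb, z)))

# D is mapped into D
assert abs(fT(Ta, z)) < 1.0
```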

The self-adjoint elements of SU(1,1) are parametrized by α ∈ D via ρ = (1 − |α|²)^{1/2},

  T_α = (1/ρ) ( 1 α ; ᾱ 1 )   (1.51)

associated to

  f_{T_α}(z) = (z + α)/(1 + ᾱz).   (1.52)

Notice that

  T_α^{−1} = T_{−α}   (1.53)

and that for all z ∈ D there is a unique α such that f_{T_α}(0) = z, namely, α = z. It is a basic theorem that every holomorphic bijection of D to D is an f_T for some T in SU(1,1) (unique up to ±1).

With this in place, we can turn to the matrix case. Let M_l be the space of l × l complex matrices with the Euclidean norm induced by the vector norm ⟨·,·⟩_{C^l}^{1/2}. Let

  D_l = { A ∈ M_l : ‖A‖ < 1 }.   (1.54)

We are interested in holomorphic bijections of D_l to itself, especially via a suitable notion of FLT. There is a huge (and diffuse) literature on the subject, starting with its use in analytic number theory. It has also been studied in connection with electrical engineering filters and indefinite matrix Hilbert spaces. Among the huge literature, we mention [1, 3, 78, 99, 114, 166]. Especially relevant to MOPUC is the book of Bakonyi–Constantinescu [6].

Consider M_l ⊕ M_l = M_l[2] as a right module over M_l. The M_l-projective line is defined by saying [X; Y] ∼ [X′; Y′], both in M_l[2] \ {0}, if and only if there exists an invertible Λ ∈ M_l so that

  X = X′Λ,   Y = Y′Λ.   (1.55)

Let T be a map of M_l[2] of the form

  T = ( A B ; C D )   (1.56)

acting on M_l[2] by

  T (X; Y) = (AX + BY; CX + DY).   (1.57)

Because this acts on the left and Λ-equivalence on the right, T maps equivalence classes to themselves. In particular, if CX + D is invertible,

T maps the equivalence class of [X; 1] to the equivalence class of [f_T[X]; 1], where

  f_T[X] = (AX + B)(CX + D)^{−1}.   (1.58)

So long as CX + D remains invertible, (1.46) remains true.

Let J be the 2l × 2l matrix in l × l block form

  J = ( 1 0 ; 0 −1 ).   (1.59)

Note that (with (X; 1)† = (X† 1))

  (X; 1)† J (X; 1) ≤ 0  ⇔  X†X ≤ 1  ⇔  ‖X‖ ≤ 1.   (1.60)

Therefore, if we define SU(l, l) to be those T's with det T = 1 and

  T† J T = J   (1.61)

then

  T ∈ SU(l, l) ⇒ f_T[D_l] = D_l as a bijection.   (1.62)

If T has the form (1.56), then (1.61) is equivalent to

  A†A − C†C = D†D − B†B = 1,   (1.63)

  A†B = C†D   (1.64)

(the fourth relation, B†A = D†C, is equivalent to (1.64)). This depends on

Proposition 1.1. If T = ( A B ; C D ) obeys (1.61) and ‖X‖ < 1, then CX + D is invertible.

Proof. (1.61) implies that

  T^{−1} = J T† J   (1.65)
       = ( A† −C† ; −B† D† ).   (1.66)

Clearly, (1.61) also implies T^{−1} ∈ SU(l, l). Thus, by (1.63) for T^{−1},

  DD† − CC† = 1.   (1.67)

This implies first that DD† ≥ 1, so D is invertible, and second that

  ‖D^{−1}C‖ ≤ 1.   (1.68)

Thus, ‖X‖ < 1 implies ‖D^{−1}CX‖ < 1, so 1 + D^{−1}CX is invertible, and thus so is D(1 + D^{−1}CX). □

It is a basic result of Cartan [18] (see Helgason [114] and the discussion therein) that

Theorem 1.2. A holomorphic bijection, g, of Dl to itself is either of the form

  g(X) = f_T(X)   (1.69)

for some T ∈ SU(l, l), or

  g(X) = f_T(Xᵗ).   (1.70)

Given α ∈ M_l with ‖α‖ < 1, define

  ρ^L = (1 − α†α)^{1/2},   ρ^R = (1 − αα†)^{1/2}.   (1.71)

Lemma 1.3. We have

  α ρ^L = ρ^R α,   α† ρ^R = ρ^L α†,   (1.72)

  α (ρ^L)^{−1} = (ρ^R)^{−1} α,   α† (ρ^R)^{−1} = (ρ^L)^{−1} α†.   (1.73)

Proof. Let f be analytic in D with Taylor series f(z) = Σ_{n=0}^∞ c_n z^n at z = 0. Since ‖α†α‖ < 1, we have

  f(α†α) = Σ_{n=0}^∞ c_n (α†α)^n   (1.74)

norm convergent, so α(α†α)^n = (αα†)^n α implies

  α f(α†α) = f(αα†) α   (1.75)

which implies the first halves of (1.72) and (1.73). The other halves follow by taking adjoints. □

Theorem 1.4. There is a one-one correspondence between α's in M_l obeying ‖α‖ < 1 and positive self-adjoint elements of SU(l, l) via

  T_α = ( (ρ^R)^{−1}  (ρ^R)^{−1}α ; (ρ^L)^{−1}α†  (ρ^L)^{−1} ).   (1.76)

Proof. A straightforward calculation using Lemma 1.3 proves that T_α is self-adjoint and T_α† J T_α = J. Conversely, if T = ( A B ; C D ) is self-adjoint and in SU(l, l), then T† = T ⇒ A† = A, B† = C, so (1.63) becomes

  AA† − BB† = 1   (1.77)

so if

  α = A^{−1}B   (1.78)

then (1.77) becomes

  A^{−1}(A^{−1})† + αα† = 1.   (1.79)

Since T ≥ 0, A ≥ 0, so (1.79) implies A = (ρ^R)^{−1}, and then (1.78) implies B = (ρ^R)^{−1}α.

By Lemma 1.3,

  C = B† = α†(ρ^R)^{−1} = (ρ^L)^{−1}α†   (1.80)

and then (by D = D†, C† = B, and (1.63)) DD† − CC† = 1 plus D > 0 implies D = (ρ^L)^{−1}. □

Corollary 1.5. For each α ∈ D_l, the map

  f_{T_α}(X) = (ρ^R)^{−1}(X + α)(1 + α†X)^{−1} ρ^L   (1.81)

takes D_l to D_l. Its inverse is given by

  f_{T_α}^{−1}(X) = f_{T_{−α}}(X) = (ρ^R)^{−1}(X − α)(1 − α†X)^{−1} ρ^L.   (1.82)

There is an alternate form for the right side of (1.81).

Proposition 1.6. The following identity holds true for any X with ‖X‖ ≤ 1:

  ρ^R (1 + Xα†)^{−1}(X + α)(ρ^L)^{−1} = (ρ^R)^{−1}(X + α)(1 + α†X)^{−1} ρ^L.   (1.83)

Proof. [TK; proof to be inserted.]

1.4. Applications and Examples. There are a number of simple examples which show that, beyond their intrinsic mathematical interest, MOPRL and MOPUC have wide application.
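Before turning to the examples, the formulas of Section 1.3, including the identity (1.83) whose proof is deferred above, admit a quick numerical sanity check. This sketch is illustrative only (not from the paper) and uses assumed random test matrices with ‖α‖, ‖X‖ < 1:

```python
import numpy as np

l = 3
rng = np.random.default_rng(3)
Z = rng.normal(size=(l, l)) + 1j * rng.normal(size=(l, l))
alpha = 0.6 * Z / np.linalg.norm(Z, 2)       # ||alpha|| = 0.6 < 1

def sqrtm_psd(m):
    # Square root of a positive semi-definite Hermitian matrix
    vals, vecs = np.linalg.eigh(m)
    return vecs @ np.diag(np.sqrt(vals)) @ vecs.conj().T

I = np.eye(l)
rhoL = sqrtm_psd(I - alpha.conj().T @ alpha)  # (1.71)
rhoR = sqrtm_psd(I - alpha @ alpha.conj().T)
iL, iR = np.linalg.inv(rhoL), np.linalg.inv(rhoR)

# (1.76): T_alpha in block form satisfies T^dag J T = J, i.e. (1.61)
T = np.block([[iR, iR @ alpha], [iL @ alpha.conj().T, iL]])
J = np.block([[I, np.zeros((l, l))], [np.zeros((l, l)), -I]])
assert np.allclose(T.conj().T @ J @ T, J)

# (1.81): f_{T_alpha}(X), and it maps D_l into D_l
X = rng.normal(size=(l, l)) + 1j * rng.normal(size=(l, l))
X = 0.5 * X / np.linalg.norm(X, 2)           # ||X|| = 0.5 < 1
fX = iR @ (X + alpha) @ np.linalg.inv(I + alpha.conj().T @ X) @ rhoL
assert np.linalg.norm(fX, 2) < 1.0

# (1.82): f_{T_{-alpha}} inverts f_{T_alpha}
back = iR @ (fX - alpha) @ np.linalg.inv(I - alpha.conj().T @ fX) @ rhoL
assert np.allclose(back, X)

# (1.83): the alternate form agrees with (1.81)
lhs = rhoR @ np.linalg.inv(I + X @ alpha.conj().T) @ (X + alpha) @ iL
assert np.allclose(lhs, fX)
```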

(a) Jacobi matrices on a strip. Let Λ ⊂ Z^ν be a subset (perhaps infinite) of the ν-dimensional lattice Z^ν and let l²(Λ) be the square summable sequences indexed by Λ. Suppose a real α_ij is given for all i, j ∈ Λ with α_ij = 0 unless |i − j| = 1 (nearest neighbors). Let β_i be a real sequence indexed by i ∈ Λ. Suppose

  sup_{i,j} |α_ij| + sup_i |β_i| < ∞.   (1.84)

Define a bounded operator, J, on l²(Λ) by

  (Ju)_i = Σ_j α_ij u_j + β_i u_i.   (1.85)

The sum is finite with at most 2ν elements. The special case Λ = {1, 2, ...} with b_i = β_i, a_i = α_{i,i+1} > 0 corresponds precisely to classical semi-infinite tridiagonal Jacobi matrices.

Now consider the situation where Λ′ ⊂ Z^{ν−1} is a finite set with l elements and

  Λ = { j ∈ Z^ν : j_1 ∈ {1, 2, ...}; (j_2, ..., j_ν) ∈ Λ′ }   (1.86)

a "strip" with cross-section Λ′. J then has a block l × l matrix Jacobi form where (γ, δ ∈ Λ′)

  (B_i)_{γδ} = b_{(i,γ)},  γ = δ,   (1.87)
           = a_{(i,γ)(i,δ)},  γ ≠ δ,   (1.88)
  (A_i)_{γδ} = a_{(i,γ)(i+1,δ)}.   (1.89)

The nearest neighbor condition says (A_i)_{γδ} = 0 if γ ≠ δ. If

  a_{(i,γ)(i+1,γ)} > 0   (1.90)

for all i, γ, then A_i is invertible and we have a block Jacobi matrix of the kind described in Section 2.2 below. By allowing general A_i, B_i, we obtain an obvious generalization of this model, an interpretation of general MOPRL.

Schrödinger operators on strips have been studied in part as approximations to Z^ν; see [31, 95, 130, 134, 151, 164]. From this point of view, it is also natural to allow periodic boundary conditions in the vertical directions. Furthermore, there is closely related work on Schrödinger (and other) operators with matrix-valued potentials; see, for example, [8, 24, 25, 26, 27, 28, 30, 96, 97, 165].
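The block structure (1.87)–(1.89) can be seen directly by assembling a small strip operator. The sketch below is illustrative and assumes random couplings on a width-2, length-4 strip (none of the data is from the paper):

```python
import numpy as np

# Strip of width l = 2 and length L = 4: sites (i, gamma), gamma in {0, 1}.
l, L = 2, 4
rng = np.random.default_rng(4)
b = rng.normal(size=(L, l))                   # on-site terms b_{(i,gamma)}
a_horiz = 1.0 + rng.random(size=(L - 1, l))   # a_{(i,gamma)(i+1,gamma)} > 0, cf. (1.90)
a_vert = rng.normal(size=(L,))                # rung couplings a_{(i,0)(i,1)}

idx = lambda i, g: i * l + g                  # lexicographic ordering of sites
J = np.zeros((L * l, L * l))
for i in range(L):
    for g in range(l):
        J[idx(i, g), idx(i, g)] = b[i, g]
    J[idx(i, 0), idx(i, 1)] = J[idx(i, 1), idx(i, 0)] = a_vert[i]
for i in range(L - 1):
    for g in range(l):
        J[idx(i, g), idx(i + 1, g)] = J[idx(i + 1, g), idx(i, g)] = a_horiz[i, g]

# Partition into l x l blocks: B_i on the diagonal carries the on-site and
# rung terms (1.87)/(1.88); A_i on the super-diagonal is diagonal (1.89).
for i in range(L):
    Bi = J[i * l:(i + 1) * l, i * l:(i + 1) * l]
    assert np.allclose(np.diag(Bi), b[i])
for i in range(L - 1):
    Ai = J[i * l:(i + 1) * l, (i + 1) * l:(i + 2) * l]
    assert np.allclose(Ai, np.diag(np.diag(Ai)))   # diagonal, by nearest neighbors
    assert np.all(np.diag(Ai) > 0)                 # invertible, by (1.90)
# Blocks more than one step apart vanish: J is block tridiagonal.
assert np.allclose(J[0:l, 2 * l:3 * l], 0.0)
```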

(b) Two-sided Jacobi matrices. This example goes back at least to Nikishin [153]. Consider the case ν = 2, Λ′ = {0, 1} ⊂ Z, and Λ as above. Suppose (1.90) holds and, in addition,

  a_{(1,0)(1,1)} > 0,   (1.91)

  a_{(i,0)(i,1)} = 0,  i = 2, 3, ....   (1.92)

Then there are no links between the rungs of the "ladder" {1, 2, ...} × {0, 1} except at the end, and the ladder can be unfolded to Z! Thus, a two-sided Jacobi matrix can be viewed as a special kind of one-sided 2 × 2 matrix Jacobi operator.

It is known that for two-sided Jacobi matrices, the spectral theory is determined by the 2 × 2 matrix

  dµ = ( dµ_00  dµ_01 ; dµ_10  dµ_11 )   (1.93)

where dµ_kl is the measure with

  ⟨δ_k, (J − λ)^{−1} δ_l⟩ = ∫ dµ_kl(x)/(x − λ)   (1.94)

but also that it is very difficult to specify exactly which dµ correspond to two-sided Jacobi matrices.

This difficulty is illuminated by the theory of MOPRL. By Favard's theorem (see Theorem 2.11), every such dµ (given by (1.93) and positive definite and non-trivial in a sense we will describe in Lemma 2.1) yields a unique block Jacobi matrix with A_j > 0 (positive definite). This dµ comes from a two-sided Jacobi matrix if and only if
(a) B_j is diagonal for j = 2, 3, ....
(b) A_j is diagonal for j = 1, 2, ....
(c) B_1 has strictly positive off-diagonal elements.
These are very complicated indirect conditions on dµ!

(c) Banded matrices. Classical Jacobi matrices are semi-infinite symmetric tridiagonal matrices, that is,

  J_{km} = 0 if |k − m| > 1   (1.95)

with

  J_{km} > 0 if |k − m| = 1.   (1.96)

A natural generalization are (2l + 1)-diagonal symmetric matrices, that is,

  J_{km} = 0 if |k − m| > l,   (1.97)
  J_{km} > 0 if |k − m| = l.   (1.98)

Such a matrix can be partitioned into l × l blocks, and it is then tridiagonal in block form. The conditions (1.97) and (1.98) are equivalent to A_k ∈ L, the set of lower triangular matrices; and conversely, A_k ∈ L with A_k, B_k real (and B_k symmetric) correspond precisely to such banded matrices. This is why we introduce type 3 MOPRL.

Banded matrices correspond to certain higher-order difference equations. Unlike the second-order equation (which leads to tridiagonal matrices), where every equation with positive coefficients is equivalent via a variable rescaling to a symmetric matrix, only certain higher-order difference equations correspond to symmetric block Jacobi matrices.
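The correspondence between the band conditions (1.97)/(1.98) and lower triangular off-diagonal blocks is purely combinatorial and easy to exhibit. An illustrative sketch with an assumed random banded matrix (not from the paper):

```python
import numpy as np

l, N = 3, 9                      # a (2l+1)-diagonal symmetric N x N matrix
rng = np.random.default_rng(5)
J = np.zeros((N, N))
for k in range(N):
    for m in range(k, min(k + l + 1, N)):
        if m - k == l:
            J[k, m] = J[m, k] = 1.0 + rng.random()   # (1.98): > 0 on |k-m| = l
        else:
            J[k, m] = J[m, k] = rng.normal()         # free entries inside the band

# Partition into l x l blocks.  An entry (r, c) of the super-diagonal block
# sits at global positions (il + r, (i+1)l + c), i.e. column - row = l + c - r,
# so (1.97) kills exactly the part with c > r: the block is lower triangular,
# and (1.98) makes its diagonal (c = r, distance exactly l) strictly positive.
for i in range(N // l - 1):
    A = J[i * l:(i + 1) * l, (i + 1) * l:(i + 2) * l]
    assert np.allclose(A, np.tril(A))       # A_k is lower triangular
    assert np.all(np.diag(A) > 0)           # with strictly positive diagonal
```

This is precisely the "type 3" normalization: the natural block Jacobi form of a banded matrix has A_k lower triangular rather than positive definite.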

(d) Magic formula. In [33], Damanik, Killip, and Simon studied perturbations of Jacobi and CMV matrices with periodic Jacobi parameters (or Verblunsky coefficients). They proved that if ∆ is the discriminant of a two-sided periodic J_0, then a bounded two-sided J has ∆(J) = S^p + S^{−p} (where (Su)_n ≡ u_{n+1}) if and only if J lies in the isospectral torus of J_0. They call this the magic formula.

This allows the study of perturbations of the isospectral torus by studying ∆(J), which is a polynomial in J of degree p, and so a (2p + 1)-banded matrix. Thus, the study of perturbations of periodic problems is connected to perturbations of S^p + S^{−p} as block Jacobi matrices.

Indeed, it was this connection that stimulated our interest in MOPRL, and [33] uses some of our results here.

(e) Vector-valued prediction theory. As noted in Section 1.1, both prediction theory and filtering theory use OPUC and have natural MOPUC settings that motivated much of the MOPUC literature.

2. Matrix Orthogonal Polynomials on the Real Line

2.1. Preliminaries. OPRL are the most basic and developed of orthogonal polynomials, and so this chapter on the matrix analogue is the most important of this survey. We present the basic formulas, assuming enough familiarity with the scalar case (see [23, 82, 167, 176, 184, 185]) that we do not need to explain why the objects we define are important.

2.1.1. Polynomials, Inner Products, Norms. Let dµ be an l × l matrix-valued Hermitian positive semi-definite finite measure on R with compact support, normalized by µ(R) = 1 ∈ M_l. Define (as in (1.20))

  ⟨⟨f, g⟩⟩_R = ∫ f(x)† dµ(x) g(x),   ‖f‖_R = (Tr ⟨⟨f, f⟩⟩_R)^{1/2},   f, g ∈ P,

  ⟨⟨f, g⟩⟩_L = ∫ g(x) dµ(x) f(x)†,   ‖f‖_L = (Tr ⟨⟨f, f⟩⟩_L)^{1/2},   f, g ∈ P.

Clearly, we have

  ⟨⟨f, g⟩⟩_R† = ⟨⟨g, f⟩⟩_R,   ⟨⟨f, g⟩⟩_L† = ⟨⟨g, f⟩⟩_L,   (2.1)

  ⟨⟨f, g⟩⟩_L = ⟨⟨g†, f†⟩⟩_R,   ‖f‖_L = ‖f†‖_R.   (2.2)

As noted in Section 1.2, we have the left and right analogues of the Cauchy inequality

  |Tr ⟨⟨f, g⟩⟩_R| ≤ ‖f‖_R ‖g‖_R,   |Tr ⟨⟨f, g⟩⟩_L| ≤ ‖f‖_L ‖g‖_L.

Thus, ‖·‖_R and ‖·‖_L are semi-norms on P. Indeed, as noted in Section 1.2, they are associated to an inner product. The sets {f : ‖f‖_R = 0} and {f : ‖f‖_L = 0} are linear subspaces. Let P_R be the completion of P/{f : ‖f‖_R = 0} (viewed as a right module over M_l) with respect to the norm ‖·‖_R. Similarly, let P_L be the completion of P/{f : ‖f‖_L = 0} (viewed as a left module) with respect to the norm ‖·‖_L.

The set V defined in Section 1.2 is a linear space. Let us introduce a semi-norm in V by

  |f| = ( ∫ ⟨f(x), dµ(x) f(x)⟩_{C^l} )^{1/2}.   (2.3)

Let V_0 ⊂ V be the set of all polynomials such that |f| = 0 and let V_∞ be the completion of the quotient space V/V_0 with respect to the norm |·|.

Lemma 2.1. The following are equivalent:
(1) ‖f‖_R > 0 for every non-zero f ∈ P.
(2) For all n, the dimension in P_R of the set of all polynomials of degree at most n is (n + 1)l².
(3) ‖f‖_L > 0 for every non-zero f ∈ P.
(4) For all n, the dimension in P_L of the set of all polynomials of degree at most n is (n + 1)l².
(5) For every non-zero v ∈ V, we have |v| ≠ 0.
(6) For all n, the dimension in V_∞ of all vector-valued polynomials of degree at most n is (n + 1)l.

The measure dµ is called non-trivial if these equivalent conditions hold.

Remark. If l = 1, these are equivalent to the usual non-triviality condition, that is, supp(µ) is infinite. For l > 1, we cannot define triviality in this simple way, as can be seen by looking at the direct sum of a trivial and a non-trivial measure. In that case, the measure is not non-trivial in the above sense but its support is infinite.

Proof. The equivalences (1) ⇔ (2), (3) ⇔ (4), and (5) ⇔ (6) are immediate. The equivalence (1) ⇔ (3) follows from (2.2). Let us prove the equivalence (1) ⇔ (5).

Assume that (1) holds and let v ∈ V be non-zero. Let f ∈ P denote the matrix polynomial that has v as its leftmost column and zero columns otherwise. Then 0 ≠ ‖f‖_R² = Tr ⟨⟨f, f⟩⟩_R = |v|² and hence (5) holds. Now assume that (1) fails and let f ∈ P be non-zero with ‖f‖_R = 0. Then at least one of the column vectors of f is non-zero. Suppose for simplicity that this is the first column and denote this column vector by v. Let t ∈ M_l be the matrix t_ij = δ_i1 δ_j1; then we have

  ‖f‖_R = 0 ⇒ ⟨⟨f, f⟩⟩_R = 0 ⇒ 0 = Tr(t† ⟨⟨f, f⟩⟩_R t) = |v|²

and hence (5) fails. □

Throughout the rest of this chapter, we assume the measure dµ to be non-trivial.

2.1.2. Monic Orthogonal Polynomials.

Lemma 2.2. Let dµ be a non-trivial measure.
(i) There exists a unique monic polynomial P_n^R of degree n which minimizes the norm ‖P_n^R‖_R.

(ii) The polynomial P_n^R can be equivalently defined as the monic polynomial of degree n which satisfies

  ⟨⟨P_n^R, f⟩⟩_R = 0 for any f ∈ P, deg f < n.   (2.4)

(iii) There exists a unique monic polynomial P_n^L of degree n which minimizes the norm ‖P_n^L‖_L.
(iv) The polynomial P_n^L can be equivalently defined as the monic polynomial of degree n which satisfies

  ⟨⟨P_n^L, f⟩⟩_L = 0 for any f ∈ P, deg f < n.   (2.5)

(v) One has P_n^L(x) = P_n^R(x)† for all x ∈ R and

  ⟨⟨P_n^R, P_n^R⟩⟩_R = ⟨⟨P_n^L, P_n^L⟩⟩_L.   (2.6)

Proof. As noted, P has an inner product ⟨·,·⟩_R, so there is an orthogonal projection π_n^(R) onto P_n, discussed in Section 1.2. Then

  P_n^R(x) = x^n − π_{n−1}^(R)(x^n).   (2.7)

As usual in inner product spaces, this uniquely minimizes ‖x^n − Q‖ over all Q ∈ P_{n−1}. It clearly obeys

  Tr(⟨⟨P_n^R, f⟩⟩_R) = 0   (2.8)

for all f ∈ P_{n−1}. But then for any matrix α,

  Tr(⟨⟨P_n^R, f⟩⟩_R α) = Tr(⟨⟨P_n^R, fα⟩⟩_R) = 0

so (2.4) holds. This proves (i) and (ii). (iii) and (iv) are similar.

To prove (v), note that P_n^L(x) = P_n^R(x)† follows from the criteria (2.4), (2.5). The identity (2.6) follows from (2.2). □

Lemma 2.3. Let µ be non-trivial. For any monic polynomial P, we have det ⟨⟨P, P⟩⟩_R ≠ 0 and det ⟨⟨P, P⟩⟩_L ≠ 0.

Proof. Let P be a monic polynomial of degree n such that ⟨⟨P, P⟩⟩_R has a non-trivial kernel. Then one can find α ∈ M_l, α ≠ 0, such that α† ⟨⟨P, P⟩⟩_R α = 0. It follows that ‖Pα‖_R = 0. But since P is monic, the leading coefficient of Pα is α ≠ 0, so Pα ≠ 0, which contradicts the non-triviality assumption. A similar argument works for ⟨⟨P, P⟩⟩_L. □

By the orthogonality of Q_n − P_n^R to P_n^R for any monic polynomial Q_n of degree n, we have

  ⟨⟨Q_n, Q_n⟩⟩_R = ⟨⟨Q_n − P_n^R, Q_n − P_n^R⟩⟩_R + ⟨⟨P_n^R, P_n^R⟩⟩_R   (2.9)

and, in particular,

  ⟨⟨P_n^R, P_n^R⟩⟩_R ≥ ⟨⟨Q_n, Q_n⟩⟩_R ... correction: ⟨⟨P_n^R, P_n^R⟩⟩_R ≤ ⟨⟨Q_n, Q_n⟩⟩_R   (2.10)

with (by non-triviality) equality if and only if $Q_n = P_n^R$. Since $\operatorname{Tr}$ and $\det$ are strictly monotone on strictly positive matrices, we have the following variational principles ((2.11) restates (i) of Lemma 2.2):

Theorem 2.4. For any monic $Q_n$ of degree $n$, we have
$$\|Q_n\|_R \geq \|P_n^R\|_R, \qquad (2.11)$$
$$\det\langle\langle Q_n, Q_n \rangle\rangle_R \geq \det\langle\langle P_n^R, P_n^R \rangle\rangle_R \qquad (2.12)$$
with equality if and only if $P_n^R = Q_n$.

2.1.3. Expansion.

Theorem 2.5. Let $d\mu$ be non-trivial.
(i) We have
$$\langle\langle P_k^R, P_n^R \rangle\rangle_R = \gamma_n\,\delta_{kn} \qquad (2.13)$$
for some positive invertible matrices $\gamma_n$.
(ii) $\{P_k^R\}_{k=0}^n$ are a right-module basis for $\mathcal{P}_n$; indeed, any $f \in \mathcal{P}_n$ has a unique expansion,
$$f = \sum_{j=0}^n P_j^R f_j^R. \qquad (2.14)$$
Indeed, essentially by (1.38),
$$f_j^R = \gamma_j^{-1}\langle\langle P_j^R, f \rangle\rangle_R. \qquad (2.15)$$

Remark. There are similar formulas for $\langle\langle\cdot,\cdot\rangle\rangle_L$. By (2.6),
$$\langle\langle P_k^L, P_n^L \rangle\rangle_L = \gamma_n\,\delta_{kn} \qquad (2.16)$$
(same $\gamma_n$, which is why we use $\gamma_n$ and not $\gamma_n^R$).

Proof. (i) (2.13) for $n \neq k$ follows from (2.4). $\gamma_n \geq 0$ follows from (1.22). By Lemma 2.3, $\det(\gamma_n) \neq 0$, so $\gamma_n$ is invertible.
(ii) Map $(\mathcal{M}_l)^{n+1}$ to $\mathcal{P}_n$ by
$$\langle \alpha_0, \dots, \alpha_n\rangle \mapsto \sum_{j=0}^n P_j^R\alpha_j \equiv X(\alpha_0, \dots, \alpha_n).$$
By (2.13), $\alpha_j = \gamma_j^{-1}\langle\langle P_j^R, X(\alpha_0,\dots,\alpha_n)\rangle\rangle_R$, so the map is one-one. By dimension counting, it is onto. $\square$
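The projection construction (2.7) with expansion coefficients (2.15) can be checked numerically. The following is a minimal sketch, not from the paper: the discrete measure, its weights, and all variable names are our own illustrative choices. It builds $P_2^R$ for a discrete $2\times 2$ matrix measure and tests the orthogonality (2.4) and the variational inequality (2.10) against a perturbed monic competitor.

```python
import numpy as np

# Hypothetical example: a discrete 2x2 matrix measure dmu = sum_i W_i delta_{x_i},
# with <<f,g>>_R = sum_i f(x_i)^* W_i g(x_i).  Build the monic right orthogonal
# polynomial P_2^R by the projection formula (2.7)/(2.15), then test (2.4) and (2.10).
rng = np.random.default_rng(0)
xs = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
Ws = [(lambda M: M.T @ M + 0.5 * np.eye(2))(rng.standard_normal((2, 2))) for _ in xs]

def inner_R(f, g):
    return sum(f(x).conj().T @ W @ g(x) for x, W in zip(xs, Ws))

def mono(n):
    return lambda x: x**n * np.eye(2)

Ps = [mono(0)]
for n in (1, 2):
    prev = list(Ps)
    # expansion coefficients (2.15): c_j = gamma_j^{-1} << P_j, x^n >>_R
    cs = [np.linalg.solve(inner_R(P, P), inner_R(P, mono(n))) for P in prev]
    Ps.append(lambda x, n=n, prev=prev, cs=cs:
              x**n * np.eye(2) - sum(P(x) @ c for P, c in zip(prev, cs)))

# orthogonality (2.4): << P_2^R, P_j^R >>_R = 0 for j < 2
orth = max(np.abs(inner_R(Ps[2], Ps[j])).max() for j in (0, 1))
print(orth)  # ~ 0

# variational principle (2.10): << Q, Q >>_R - << P_2^R, P_2^R >>_R >= 0
# for the monic competitor Q = P_2^R + P_1^R a
Q = lambda x: Ps[2](x) + Ps[1](x) @ np.array([[0.3, 0.0], [0.1, -0.2]])
gap = inner_R(Q, Q) - inner_R(Ps[2], Ps[2])
print(np.linalg.eigvalsh(gap).min() >= -1e-10)  # True
```

Note that coefficients multiply the polynomials on the right, which is the right-module structure used throughout this section.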

2.1.4. Recurrence Relations for Monic Orthogonal Polynomials. Denote by $\zeta_n^R$ (resp. $\zeta_n^L$) the coefficient of $x^{n-1}$ in $P_n^R(x)$ (resp. $P_n^L(x)$), that is,
$$P_n^R(x) = x^n \mathbf{1} + \zeta_n^R x^{n-1} + \text{lower order terms},$$
$$P_n^L(x) = x^n \mathbf{1} + \zeta_n^L x^{n-1} + \text{lower order terms}.$$
Since $P_n^R(x)^\dagger = P_n^L(x)$, we have $(\zeta_n^R)^\dagger = \zeta_n^L$. Using the parameters $\gamma_n$ of (2.13) and $\zeta_n^R$, $\zeta_n^L$, one can write down recurrence relations for $P_n^R(x)$, $P_n^L(x)$.

Lemma 2.6. (i) We have the commutation relation
$$\gamma_{n-1}(\zeta_n^R - \zeta_{n-1}^R) = (\zeta_n^L - \zeta_{n-1}^L)\gamma_{n-1}. \qquad (2.17)$$
(ii) We have the recurrence relations
$$x P_n^R(x) = P_{n+1}^R(x) + P_n^R(x)(\zeta_n^R - \zeta_{n+1}^R) + P_{n-1}^R(x)\gamma_{n-1}^{-1}\gamma_n, \qquad (2.18)$$
$$x P_n^L(x) = P_{n+1}^L(x) + (\zeta_n^L - \zeta_{n+1}^L)P_n^L(x) + \gamma_n\gamma_{n-1}^{-1}P_{n-1}^L(x). \qquad (2.19)$$

Proof. (i) We have
$$P_n^R(x) - x P_{n-1}^R(x) = (\zeta_n^R - \zeta_{n-1}^R)x^{n-1} + \text{lower order terms}$$
and so
$$\begin{aligned}
(\zeta_n^L - \zeta_{n-1}^L)\gamma_{n-1} &= (\zeta_n^R - \zeta_{n-1}^R)^\dagger \langle\langle P_{n-1}^R, P_{n-1}^R\rangle\rangle_R \\
&= (\zeta_n^R - \zeta_{n-1}^R)^\dagger \langle\langle x^{n-1}, P_{n-1}^R\rangle\rangle_R \\
&= \langle\langle P_n^R - xP_{n-1}^R,\, P_{n-1}^R\rangle\rangle_R \\
&= \langle\langle P_n^R, P_{n-1}^R\rangle\rangle_R - \langle\langle xP_{n-1}^R, P_{n-1}^R\rangle\rangle_R \\
&= -\langle\langle xP_{n-1}^R, P_{n-1}^R\rangle\rangle_R \\
&= -\langle\langle P_{n-1}^R, xP_{n-1}^R\rangle\rangle_R \\
&= \langle\langle P_{n-1}^R,\, P_n^R - xP_{n-1}^R\rangle\rangle_R \\
&= \langle\langle P_{n-1}^R,\, x^{n-1}(\zeta_n^R - \zeta_{n-1}^R)\rangle\rangle_R \\
&= \langle\langle P_{n-1}^R, x^{n-1}\rangle\rangle_R\, (\zeta_n^R - \zeta_{n-1}^R) \\
&= \gamma_{n-1}(\zeta_n^R - \zeta_{n-1}^R).
\end{aligned}$$
(ii) By Theorem 2.5,
$$xP_n^R(x) = P_{n+1}^R(x)C_{n+1} + P_n^R(x)C_n + P_{n-1}^R(x)C_{n-1} + \cdots + P_0^R C_0$$
with some matrices $C_0, \dots, C_{n+1}$. It is straightforward that $C_{n+1} = \mathbf{1}$ and $C_n = \zeta_n^R - \zeta_{n+1}^R$. By the orthogonality property (2.4), we find $C_0 = \cdots = C_{n-2} = 0$. Finally, it is easy to calculate $C_{n-1}$:
$$\gamma_n = \langle\langle P_n^R, xP_{n-1}^R\rangle\rangle_R = \langle\langle xP_n^R, P_{n-1}^R\rangle\rangle_R$$

$$= \langle\langle P_{n+1}^R + P_n^R(\zeta_n^R - \zeta_{n+1}^R) + P_{n-1}^R C_{n-1},\, P_{n-1}^R \rangle\rangle_R = C_{n-1}^\dagger\, \gamma_{n-1},$$
and so, taking adjoints and using the self-adjointness of the $\gamma_j$, $C_{n-1} = \gamma_{n-1}^{-1}\gamma_n$. This proves (2.18); the other relation, (2.19), is obtained by conjugation. $\square$
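The recurrence (2.18) can be verified as an identity between coefficient arrays. The following sketch is our own construction, not from the paper; the discrete measure and the coefficient-array representation are illustrative assumptions.

```python
import numpy as np

# Hypothetical numerical check of the monic recurrence (2.18),
#   x P_n^R = P_{n+1}^R + P_n^R (zeta_n^R - zeta_{n+1}^R) + P_{n-1}^R gamma_{n-1}^{-1} gamma_n,
# for a discrete 2x2 matrix measure.  A polynomial is stored as an array C of
# shape (deg+1, 2, 2) with P(x) = sum_k x^k C[k]; coefficients act on the right.
rng = np.random.default_rng(1)
xs = np.linspace(-1.0, 1.0, 6)
Ws = [(lambda M: M.T @ M + 0.3 * np.eye(2))(rng.standard_normal((2, 2))) for _ in xs]

def ev(C, x):
    return sum(x**k * Ck for k, Ck in enumerate(C))

def ip(C, D):                                       # <<f,g>>_R
    return sum(ev(C, x).conj().T @ W @ ev(D, x) for x, W in zip(xs, Ws))

def rmul(C, a):                                     # coefficients of P(x) a
    return np.array([Ck @ a for Ck in C])

P, gamma = [], []
for n in range(4):                                  # monic P_0,...,P_3 via (2.7)/(2.15)
    C = np.zeros((n + 1, 2, 2)); C[n] = np.eye(2)
    for Pj, gj in zip(P, gamma):
        C[:len(Pj)] -= rmul(Pj, np.linalg.solve(gj, ip(Pj, C)))
    P.append(C); gamma.append(ip(C, C))

zeta = [np.zeros((2, 2))] + [P[n][n - 1] for n in (1, 2, 3)]  # x^{n-1} coefficients

n = 2
lhs = np.vstack([np.zeros((1, 2, 2)), P[n]])        # coefficients of x P_n^R
rhs = P[n + 1].copy()
rhs[:n + 1] += rmul(P[n], zeta[n] - zeta[n + 1])
rhs[:n] += rmul(P[n - 1], np.linalg.solve(gamma[n - 1], gamma[n]))
err = np.max(np.abs(lhs - rhs))
print(err)  # ~ 0
```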

2.1.5. Normalized Orthogonal Polynomials. We call $p_n^R \in \mathcal{P}$ a right orthonormal polynomial if $\deg p_n^R \leq n$ and
$$\langle\langle p_n^R, f\rangle\rangle_R = 0 \quad \text{for every } f \in \mathcal{P} \text{ with } \deg f < n, \qquad (2.20)$$
$$\langle\langle p_n^R, p_n^R\rangle\rangle_R = \mathbf{1}. \qquad (2.21)$$
Similarly, we call $p_n^L \in \mathcal{P}$ a left orthonormal polynomial if $\deg p_n^L \leq n$ and
$$\langle\langle p_n^L, f\rangle\rangle_L = 0 \quad \text{for every } f \in \mathcal{P} \text{ with } \deg f < n, \qquad (2.22)$$
$$\langle\langle p_n^L, p_n^L\rangle\rangle_L = \mathbf{1}. \qquad (2.23)$$

Lemma 2.7. Any orthonormal polynomial has the form
$$p_n^R(x) = P_n^R(x)\,\langle\langle P_n^R, P_n^R\rangle\rangle_R^{-1/2}\,\sigma_n, \qquad p_n^L(x) = \tau_n\,\langle\langle P_n^L, P_n^L\rangle\rangle_L^{-1/2}\,P_n^L(x), \qquad (2.24)$$
where $\sigma_n, \tau_n \in \mathcal{M}_l$ are unitaries. In particular, $\deg p_n^R = \deg p_n^L = n$.

Proof. Let $K_n$ be the coefficient of $x^n$ in $p_n^R$. Consider the polynomial $q(x) = P_n^R(x)K_n - p_n^R(x)$. Then $\deg q < n$ and so, from (2.4) and (2.20), it follows that $\langle\langle q, q\rangle\rangle_R = 0$, and so $q(x)$ vanishes identically. Thus, we have
$$\mathbf{1} = \langle\langle p_n^R, p_n^R\rangle\rangle_R = K_n^\dagger\,\langle\langle P_n^R, P_n^R\rangle\rangle_R\, K_n \qquad (2.25)$$
and so $\det(K_n) \neq 0$. From (2.25) we get $(K_n^\dagger)^{-1}K_n^{-1} = \langle\langle P_n^R, P_n^R\rangle\rangle_R$, and so $K_nK_n^\dagger = \langle\langle P_n^R, P_n^R\rangle\rangle_R^{-1}$. From here we get $K_n = \langle\langle P_n^R, P_n^R\rangle\rangle_R^{-1/2}\sigma_n$ with a unitary $\sigma_n$. The proof for $p_n^L$ is similar. $\square$

By Theorem 2.5, the polynomials $p_n^R$ form a right orthonormal module basis in $\mathcal{P}_R$. Thus, for any $f \in \mathcal{P}_R$, we have
$$f(x) = \sum_{m=0}^\infty p_m^R f_m, \qquad f_m = \langle\langle p_m^R, f\rangle\rangle_R, \qquad (2.26)$$
and the Parseval identity
$$\sum_{m=0}^\infty \operatorname{Tr}(f_m f_m^\dagger) = \|f\|_R^2 \qquad (2.27)$$
holds true. Obviously, since $f$ is a polynomial, there are only finitely many non-zero terms in (2.26) and (2.27).

2.2. Block Jacobi Matrices. The study of block Jacobi matrices goes back at least to Krein [133].

2.2.1. Block Jacobi Matrices as Matrix Representations. Suppose that a sequence of unitary matrices $\mathbf{1} = \sigma_0, \sigma_1, \sigma_2, \dots$ is fixed and the $p_n^R$ are defined according to (2.24). As noted above, the $p_n^R$ form a right orthonormal module basis in $\mathcal{P}_R$. The map $f(x) \mapsto xf(x)$ can be considered as a right homomorphism in $\mathcal{P}_R$. Consider the matrix $J_{nm}$ of this homomorphism with respect to the basis $p_n^R$, that is,
$$J_{nm} = \langle\langle p_{n-1}^R,\, x\, p_{m-1}^R\rangle\rangle_R. \qquad (2.28)$$
Following Killip–Simon [128] and Simon [167, 168, 176], our Jacobi matrices are indexed with $n = 1, 2, \dots$ but, of course, $p_n$ has $n = 0, 1, 2, \dots$. That is why (2.28) has $n-1$ and $m-1$.

As in the scalar case, using the orthogonality properties of $p_n^R$, we get that $J_{nm} = 0$ if $|n - m| > 1$. Denote
$$B_n = J_{nn} = \langle\langle p_{n-1}^R,\, x\, p_{n-1}^R\rangle\rangle_R$$
and
$$A_n = J_{n,n+1} = J_{n+1,n}^\dagger = \langle\langle p_{n-1}^R,\, x\, p_n^R\rangle\rangle_R.$$
Then we have
$$J = \begin{pmatrix} B_1 & A_1 & 0 & \cdots \\ A_1^\dagger & B_2 & A_2 & \cdots \\ 0 & A_2^\dagger & B_3 & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix} \qquad (2.29)$$
Applying (2.26) to $f(x) = xp_n^R(x)$, we get the recurrence relation
$$x\, p_n^R(x) = p_{n+1}^R(x)A_{n+1}^\dagger + p_n^R(x)B_{n+1} + p_{n-1}^R(x)A_n, \qquad n = 1, 2, \dots \qquad (2.30)$$
If we set $p_{-1}^R(x) = 0$ and $A_0 = \mathbf{1}$, the relation (2.30) also holds for $n = 0$. By (2.2), we can always pick $p_n^L$ so that for $x$ real, $p_n^L(x) = p_n^R(x)^\dagger$, and thus for complex $z$,
$$p_n^L(z) = p_n^R(\bar{z})^\dagger \qquad (2.31)$$
by analytic continuation. By conjugating (2.30), we get

$$x\, p_n^L(x) = A_{n+1}p_{n+1}^L(x) + B_{n+1}p_n^L(x) + A_n^\dagger p_{n-1}^L(x), \qquad n = 0, 1, 2, \dots \qquad (2.32)$$

Comparing this with the recurrence relations (2.18), (2.19), we get

$$A_n = \sigma_{n-1}^\dagger\, \gamma_{n-1}^{-1/2}\gamma_n^{1/2}\,\sigma_n, \qquad B_n = \sigma_{n-1}^\dagger\, \gamma_{n-1}^{1/2}(\zeta_{n-1}^R - \zeta_n^R)\gamma_{n-1}^{-1/2}\,\sigma_{n-1}. \qquad (2.33)$$
In particular, $\det A_n \neq 0$ for all $n$. Notice that since $\sigma_n$ is unitary, $|\det(\sigma_n)| = 1$, so (2.33) implies $\det(\gamma_n^{1/2}) = \det(\gamma_{n-1}^{1/2})\,|\det(A_n)|$ which, by induction, implies
$$\det\langle\langle P_n^R, P_n^R\rangle\rangle_R = |\det(A_1 \cdots A_n)|^2. \qquad (2.34)$$
Any matrix of the form (2.29) with $B_n = B_n^\dagger$ and $\det A_n \neq 0$ for all $n$ will be called a block Jacobi matrix corresponding to the Jacobi parameters $A_n$ and $B_n$.

2.2.2. Basic Properties of Block Jacobi Matrices. Suppose we are given a block Jacobi matrix $J$ corresponding to Jacobi parameters $A_n$ and $B_n$, where $B_n = B_n^\dagger$ and $\det A_n \neq 0$ for each $n$. Consider the Hilbert space $\mathcal{H}_v = \ell^2(\mathbb{Z}_+, \mathbb{C}^l)$ (here $\mathbb{Z}_+ = \{1, 2, 3, \dots\}$) with inner product

$$\langle f, g\rangle_{\mathcal{H}_v} = \sum_{n=1}^\infty \langle f_n, g_n\rangle_{\mathbb{C}^l}$$
and orthonormal basis $\{e_{k,j}\}_{k \in \mathbb{Z}_+,\, 1 \leq j \leq l}$, where
$$(e_{k,j})_n = \delta_{k,n}\,v_j$$
and $\{v_j\}_{1 \leq j \leq l}$ is the standard basis of $\mathbb{C}^l$. $J$ acts on $\mathcal{H}_v$ via
$$(Jf)_n = A_{n-1}^\dagger f_{n-1} + B_n f_n + A_n f_{n+1}, \qquad f \in \mathcal{H}_v \qquad (2.35)$$
(with $f_0 = 0$) and defines a symmetric operator on this space. Note that, using the invertibility of the $A_n$'s, induction shows
$$\operatorname{span}\{e_{k,j} : 1 \leq k \leq n,\ 1 \leq j \leq l\} = \operatorname{span}\{J^{k-1}e_{1,j} : 1 \leq k \leq n,\ 1 \leq j \leq l\} \qquad (2.36)$$
for every $n \geq 1$. We want to emphasize that elements of $\mathcal{H}_v$ and $\mathcal{H}$ are vector-valued and matrix-valued, respectively. For this reason, we will be interested in both matrix- and vector-valued solutions of the basic difference equations.

We will consider only bounded block Jacobi matrices, that is, those corresponding to Jacobi parameters satisfying
$$\sup_n \operatorname{Tr}(A_n^\dagger A_n + B_n^\dagger B_n) < \infty. \qquad (2.37)$$
Equivalently,

$$\sup_n \bigl(\|A_n\| + \|B_n\|\bigr) < \infty. \qquad (2.38)$$

In this case, $J$ is a bounded self-adjoint operator. This is equivalent to $\mu$ having compact support.

We call two Jacobi matrices $J$ and $\tilde{J}$ equivalent if there exists a sequence of unitaries $u_n \in \mathcal{M}_l$, $n \geq 1$, with $u_1 = \mathbf{1}$, such that $\tilde{J}_{nm} = u_n^\dagger J_{nm}u_m$. From Lemma 2.7 it is clear that if $p_n^R$, $\tilde{p}_n^R$ are two sequences of normalized orthogonal polynomials corresponding to the same measure (but having different normalizations), then the Jacobi matrices $J_{nm} = \langle\langle p_{n-1}^R, x p_{m-1}^R\rangle\rangle_R$ and $\tilde{J}_{nm} = \langle\langle \tilde{p}_{n-1}^R, x\tilde{p}_{m-1}^R\rangle\rangle_R$ are equivalent ($u_n = \sigma_{n-1}^\dagger\tilde{\sigma}_{n-1}$). Thus,
$$\tilde{B}_n = u_n^\dagger B_n u_n, \qquad \tilde{A}_n = u_n^\dagger A_n u_{n+1}. \qquad (2.39)$$
Therefore, we have a map
$$\Phi: \mu \mapsto \{J : J_{nm} = \langle\langle p_{n-1}^R, x p_{m-1}^R\rangle\rangle_R,\ p_n^R \text{ correspond to } d\mu \text{ for some normalization}\} \qquad (2.40)$$
from the set of all Hermitian positive semi-definite non-trivial compactly supported measures to the set of all equivalence classes of bounded block Jacobi matrices. Below, we will see how to invert this map.
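The passage from a measure to its Jacobi parameters can be made concrete. The following sketch is our own construction (not the paper's code): for a discrete $2\times 2$ matrix measure, it builds the type-2 normalized polynomials, reads off $A_n$, $B_n$ via (2.28), and verifies the recurrence (2.30).

```python
import numpy as np

# Hypothetical sketch: discrete 2x2 matrix measure, normalized so <<1,1>>_R = 1,
# type-2 normalization p_n^R = P_n^R gamma_n^{-1/2} (sigma_n = 1).  Verify (2.30),
#   x p_1^R = p_2^R A_2^+ + p_1^R B_2 + p_0^R A_1,
# pointwise on the support of the measure.
rng = np.random.default_rng(3)
xs = np.linspace(-2.0, 2.0, 5)
Ws = [(lambda M: M.T @ M + 0.3 * np.eye(2))(rng.standard_normal((2, 2))) for _ in xs]

def inv_sqrt(G):                 # G^{-1/2} for positive definite G
    w, V = np.linalg.eigh(G)
    return V @ np.diag(w ** -0.5) @ V.conj().T

T = inv_sqrt(sum(Ws))
Ws = [T @ W @ T for W in Ws]     # now << 1, 1 >>_R = 1

def ip(f, g):
    return sum(f(x).conj().T @ W @ g(x) for x, W in zip(xs, Ws))

def mono(n):
    return lambda x: x**n * np.eye(2)

ps = [mono(0)]                   # p_0^R = 1
for n in (1, 2):
    prev = list(ps)
    cs = [ip(p, mono(n)) for p in prev]    # prev is orthonormal, so gamma_j = 1
    Pn = lambda x, n=n, prev=prev, cs=cs: \
        x**n * np.eye(2) - sum(p(x) @ c for p, c in zip(prev, cs))
    N = inv_sqrt(ip(Pn, Pn))               # gamma_n^{-1/2}
    ps.append(lambda x, Pn=Pn, N=N: Pn(x) @ N)

B2 = ip(ps[1], lambda x: x * ps[1](x))     # (2.28): B_2
A1 = ip(ps[0], lambda x: x * ps[1](x))     # A_1
A2 = ip(ps[1], lambda x: x * ps[2](x))     # A_2
err = max(np.max(np.abs(x * ps[1](x)
          - (ps[2](x) @ A2.conj().T + ps[1](x) @ B2 + ps[0](x) @ A1)))
          for x in xs)
print(err)  # ~ 0
```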

2.2.3. Special Representatives of the Equivalence Classes. Let $J$ be a block Jacobi matrix with Jacobi parameters $A_n$, $B_n$. We say that $J$ is:
• of type 1, if $A_n > 0$ for all $n$;
• of type 2, if $A_1A_2 \cdots A_n > 0$ for all $n$;
• of type 3, if $A_n \in \mathcal{L}$ for all $n$.
Here, $\mathcal{L}$ is the class of all lower triangular matrices with strictly positive elements on the diagonal. Type 3 is of interest because such matrices correspond precisely to bounded Hermitian matrices with $2l+1$ non-vanishing diagonals with the extreme diagonals strictly positive; see Section 1.4(c). Type 2 is the case where the leading coefficients of $p_n^R$ are strictly positive definite.

Theorem 2.8. (i) Each equivalence class of block Jacobi matrices contains exactly one element each of type 1, type 2, and type 3.
(ii) Let $J$ be a block Jacobi matrix corresponding to a sequence of polynomials $p_n^R$ as in (2.24). Then $J$ is of type 2 if and only if $\sigma_n = \mathbf{1}$ for all $n$.

Proof. The proof is based on the following two well-known facts:
(a) For any $t \in \mathcal{M}_l$ with $\det(t) \neq 0$, there exists a unique unitary $u \in \mathcal{M}_l$ such that $tu$ is Hermitian positive semi-definite: $tu \geq 0$.

(b) For any $t \in \mathcal{M}_l$ with $\det(t) \neq 0$, there exists a unique unitary $u \in \mathcal{M}_l$ such that $tu \in \mathcal{L}$.

We first prove that every equivalence class of block Jacobi matrices contains at least one element of type 1. For a given sequence $A_n$, let us construct a sequence $u_1 = \mathbf{1}, u_2, u_3, \dots$ of unitaries such that $u_n^\dagger A_nu_{n+1} \geq 0$. By the existence part of (a), we find $u_2$ such that $A_1u_2 \geq 0$, then find $u_3$ such that $u_2^\dagger A_2u_3 \geq 0$, etc. This, together with (2.39), proves the statement. In order to prove the uniqueness part, suppose we have $A_n \geq 0$ and $u_n^\dagger A_nu_{n+1} \geq 0$ for all $n$. Then, by the uniqueness part of (a), $A_1 \geq 0$ and $A_1u_2 \geq 0$ imply $u_2 = \mathbf{1}$; next, $A_2 \geq 0$ and $u_2^\dagger A_2u_3 = A_2u_3 \geq 0$ imply $u_3 = \mathbf{1}$, etc.

The statement (i) concerning type 3 can be proven in the same way, using (b) instead of (a). The statement (i) concerning type 2 can be proven similarly. Existence: find $u_2$ such that $A_1u_2 \geq 0$, then $u_3$ such that $(A_1u_2)(u_2^\dagger A_2u_3) = A_1A_2u_3 \geq 0$, etc. Uniqueness: if $A_1 \cdots A_n \geq 0$ and $A_1 \cdots A_nu_{n+1} \geq 0$, then $u_{n+1} = \mathbf{1}$.

By (2.33), we have $A_1A_2 \cdots A_n = \gamma_n^{1/2}\sigma_n$, and the statement (ii) follows from the positivity of $\gamma_n^{1/2}$. $\square$

We say that a block Jacobi matrix $J$ belongs to the Nevai class if $B_n \to 0$ and $A_n^\dagger A_n \to \mathbf{1}$ as $n \to \infty$. It is clear that $J$ is in the Nevai class if and only if all equivalent Jacobi matrices belong to the Nevai class.

Theorem 2.9. If $J$ belongs to the Nevai class and is of type 1 or type 3, then $A_n \to \mathbf{1}$ as $n \to \infty$.

Proof. If $J$ is of type 1, then $A_n^\dagger A_n = A_n^2 \to \mathbf{1}$ clearly implies $A_n \to \mathbf{1}$ since the square root is continuous on positive Hermitian matrices.

Suppose $J$ is of type 3. We shall prove that $A_n \to \mathbf{1}$ by considering the rows of the matrix $A_n$ one by one, starting from the $l$th row. Denote $(A_n)_{jk} = a_{j,k}^{(n)}$. We have $(A_n^\dagger A_n)_{ll} = (a_{l,l}^{(n)})^2 \to 1$, and so $a_{l,l}^{(n)} \to 1$. Then, for any $k < l$, we have $(A_n^\dagger A_n)_{lk} = \overline{a_{l,l}^{(n)}}\,a_{l,k}^{(n)} \to 0$, and so $a_{l,k}^{(n)} \to 0$. Next, consider the $(l-1)$st row. We have
$$(A_n^\dagger A_n)_{l-1,l-1} = (a_{l-1,l-1}^{(n)})^2 + |a_{l,l-1}^{(n)}|^2 \to 1$$

and so, using the previous step, $a_{l-1,l-1}^{(n)} \to 1$ as $n \to \infty$. Then for all $k < l-1$, we have
$$(A_n^\dagger A_n)_{l-1,k} = \overline{a_{l-1,l-1}^{(n)}}\,a_{l-1,k}^{(n)} + \overline{a_{l,l-1}^{(n)}}\,a_{l,k}^{(n)} \to 0$$
and so, using the previous steps, $a_{l-1,k}^{(n)} \to 0$. Continuing this way, we get $a_{j,k}^{(n)} \to \delta_{j,k}$, as required. $\square$

It is an interesting open question whether this result also applies to the type 2 case.
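The two normalization facts (a) and (b) used in the proof of Theorem 2.8 can be realized concretely. This is our own sketch (variable names and the random test matrix are illustrative assumptions), using the polar decomposition for (a) and a QR decomposition for (b).

```python
import numpy as np

# Hypothetical sketch of the facts behind Theorem 2.8: for an invertible t,
# (a) a unitary u with t u Hermitian positive semi-definite (via SVD), and
# (b) a unitary u with t u lower triangular with positive diagonal (via QR of t^+).
rng = np.random.default_rng(4)
t = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))

# (a) t = U S V^+  =>  t (V U^+) = U S U^+ >= 0
U, S, Vh = np.linalg.svd(t)
u_a = Vh.conj().T @ U.conj().T
H = t @ u_a
print(np.max(np.abs(H - H.conj().T)), np.linalg.eigvalsh(H).min() > 0)

# (b) t^+ = Q R with R upper triangular  =>  t Q = R^+ is lower triangular;
# adjust phases so that the diagonal becomes positive
Q, R = np.linalg.qr(t.conj().T)
ph = np.diag(np.diag(R) / np.abs(np.diag(R)))
Q = Q @ ph                        # still unitary; now t Q = (ph^+ R)^+
Lmat = t @ Q
print(np.max(np.abs(np.triu(Lmat, 1))), np.min(np.real(np.diag(Lmat))) > 0)
```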

2.2.4. Favard's Theorem. Here we construct an inverse of the mapping $\Phi$ (defined by (2.40)). Thus, $\Phi$ sets up a bijection between non-trivial measures of compact support and equivalence classes of bounded block Jacobi matrices. Before proceeding to do this, let us prove:

Lemma 2.10. The mapping $\Phi$ is injective.

Proof. Let $\mu$ and $\tilde{\mu}$ be two Hermitian positive semi-definite non-trivial compactly supported measures, and suppose that $\Phi(\mu) = \Phi(\tilde{\mu})$. Let $p_n^R$ and $\tilde{p}_n^R$ be normalized orthogonal polynomials corresponding to $\mu$ and $\tilde{\mu}$. Suppose that the normalization both for $p_n^R$ and for $\tilde{p}_n^R$ has been chosen such that $\sigma_n = \mathbf{1}$ (see (2.24)), that is, type 2. From Theorem 2.8 and the assumption $\Phi(\mu) = \Phi(\tilde{\mu})$, it follows that the corresponding Jacobi matrices coincide, that is, $\langle\langle p_n^R, x p_m^R\rangle\rangle_R = \langle\langle \tilde{p}_n^R, x\tilde{p}_m^R\rangle\rangle_R$ for all $n$ and $m$. Together with the recurrence relation (2.30), this yields $p_n^R = \tilde{p}_n^R$ for all $n$.

For any $n \geq 0$, we can represent $x^n$ as
$$x^n\,\mathbf{1} = \sum_{k=0}^n p_k^R(x)\,C_k^{(n)} = \sum_{k=0}^n \tilde{p}_k^R(x)\,\tilde{C}_k^{(n)}.$$
The coefficients $C_k^{(n)}$ and $\tilde{C}_k^{(n)}$ are completely determined by the coefficients of the polynomials $p_n^R$ and $\tilde{p}_n^R$, and so $C_k^{(n)} = \tilde{C}_k^{(n)}$ for all $n$ and $k$. For the moments of the measure $\mu$, we have
$$\int x^n\, d\mu(x) = \langle\langle \mathbf{1}, x^n\rangle\rangle_R = \sum_{k=0}^n \langle\langle \mathbf{1}, p_k^R C_k^{(n)}\rangle\rangle_R = \langle\langle \mathbf{1}, \mathbf{1}\rangle\rangle_R\, C_0^{(n)} = C_0^{(n)}.$$
Since the same calculation is valid for the measure $\tilde{\mu}$, we get

$$\int x^n\, d\mu(x) = \int x^n\, d\tilde{\mu}(x)$$
for all $n$. It follows that
$$\int f(x)\, d\mu(x)\, g(x) = \int f(x)\, d\tilde{\mu}(x)\, g(x)$$
for all matrix-valued polynomials $f$ and $g$, and so the measures $\mu$ and $\tilde{\mu}$ coincide. $\square$

We can now construct the inverse of the map $\Phi$. Let a block Jacobi matrix $J$ be given. By a version of the spectral theorem for self-adjoint operators with finite multiplicity (see, e.g., [2, Sect. 72]), there exists a matrix-valued measure $d\mu$ with

$$\langle e_{1,j},\, f(J)e_{1,k}\rangle_{\mathcal{H}_v} = \int f(x)\, d\mu_{j,k}(x) \qquad (2.41)$$
and a unitary operator
$$R: \mathcal{H}_v \to L^2(\mathbb{R}, d\mu; \mathbb{C}^l)$$
such that (recall that $\{v_j\}$ is the standard basis in $\mathbb{C}^l$)
$$[Re_{1,j}](x) = v_j, \qquad 1 \leq j \leq l, \qquad (2.42)$$
and, for any $g \in \mathcal{H}_v$, we have
$$(RJg)(x) = x\,(Rg)(x). \qquad (2.43)$$
If the Jacobi matrices $J$ and $\tilde{J}$ are equivalent, then we have $\tilde{J} = U^*JU$ for some $U = \oplus_{n=1}^\infty u_n$, $u_1 = \mathbf{1}$. Thus,
$$\langle e_{1,j}, f(\tilde{J})e_{1,k}\rangle_{\mathcal{H}_v} = \langle Ue_{1,j}, f(J)Ue_{1,k}\rangle_{\mathcal{H}_v} = \langle e_{1,j}, f(J)e_{1,k}\rangle_{\mathcal{H}_v}$$
and so the measures corresponding to $J$ and $\tilde{J}$ coincide. Thus, we have a map
$$\Psi: \{\tilde{J} : \tilde{J} \text{ is equivalent to } J\} \mapsto \mu \qquad (2.44)$$
from the set of all equivalence classes of bounded block Jacobi matrices to the set of all Hermitian positive semi-definite compactly supported measures.

Theorem 2.11. (i) All measures in the image of the map $\Psi$ are non-degenerate. (ii) $\Phi \circ \Psi = \mathrm{id}$. (iii) $\Psi \circ \Phi = \mathrm{id}$.

Proof. (i) To put things in the right context, we first recall that $\|\cdot\|_{\mathcal{H}_v}$ is a norm (rather than a semi-norm), whereas $\|\cdot\|$ on $\mathcal{V}$ (cf. (2.3)) is, in general, a semi-norm. Using the assumption that $\det(A_k) \neq 0$ for all $k$ (which is included in our definition of a block Jacobi matrix), we will prove that $\|\cdot\|$ is in fact a norm. More precisely, we will prove that $\|p\| > 0$ for any polynomial $p \in \mathcal{V}$; by Lemma 2.1 this will imply that $\mu$ is non-degenerate.

Let $p \in \mathcal{V}$ be a non-zero polynomial, $\deg p = n$. Notice that (2.42) and (2.43) give
$$[RJ^k e_{1,j}](x) = x^k v_j \qquad (2.45)$$
for every $k \geq 0$ and $1 \leq j \leq l$. This shows that $p$ can be represented as $p = Rg$, where $g = \sum_{k=0}^n J^kf_k$ and $f_0, \dots, f_n$ are vectors in $\mathcal{H}_v$ such that $\langle f_i, e_{j,k}\rangle_{\mathcal{H}_v} = 0$ for all $i = 0, \dots, n$, $j \geq 2$, $k = 1, \dots, l$ (i.e., the only non-zero components of the $f_i$ are in the first $\mathbb{C}^l$ in $\mathcal{H}_v$). The assumption $\deg p = n$ means $f_n \neq 0$.

Since $R$ is isometric, we have $\|p\| = \|g\|_{\mathcal{H}_v}$, and so we have to prove that $g \neq 0$. Indeed, suppose that $g = 0$. Using the assumption $\det(A_k) \neq 0$ and the tri-diagonal nature of $J$, we see that $\sum_{k=0}^n J^kf_k = 0$ yields $f_n = 0$, contrary to our assumption.

(ii) Consider the elements $Re_{n,k} \in L^2(\mathbb{R}, d\mu; \mathbb{C}^l)$. First note that, by (2.36) and (2.45), $Re_{n,k}$ is a polynomial of degree at most $n-1$. Next, by the unitarity of $R$, we have

$$\langle Re_{n,k},\, Re_{m,j}\rangle_{L^2(\mathbb{R},d\mu;\mathbb{C}^l)} = \delta_{m,n}\,\delta_{k,j}. \qquad (2.46)$$
Let us construct matrix-valued polynomials $q_n(x)$, using $Re_{n,1}, Re_{n,2}, \dots, Re_{n,l}$ as columns of $q_{n-1}(x)$:

$$[q_{n-1}(x)]_{j,k} = [Re_{n,k}(x)]_j.$$
We have $\deg q_n \leq n$ and $\langle\langle q_m, q_n\rangle\rangle_R = \delta_{m,n}\,\mathbf{1}$; the last relation is just a reformulation of (2.46). Hence the $q_n$'s are right normalized orthogonal polynomials with respect to the measure $d\mu$. We find
$$\begin{aligned}
J_{nm} &= [\langle e_{n,j}, Je_{m,k}\rangle_{\mathcal{H}_v}]_{1\leq j,k\leq l} \\
&= [\langle Re_{n,j}, RJe_{m,k}\rangle_{L^2(\mathbb{R},d\mu;\mathbb{C}^l)}]_{1\leq j,k\leq l} \\
&= [\langle Re_{n,j}, xRe_{m,k}\rangle_{L^2(\mathbb{R},d\mu;\mathbb{C}^l)}]_{1\leq j,k\leq l} \\
&= [\langle [q_{n-1}(x)]_{\cdot,j}, x[q_{m-1}(x)]_{\cdot,k}\rangle_{L^2(\mathbb{R},d\mu;\mathbb{C}^l)}]_{1\leq j,k\leq l} \\
&= \langle\langle q_{n-1}, xq_{m-1}\rangle\rangle_R,
\end{aligned}$$
as required.
(iii) Follows from (ii) and from Lemma 2.10. $\square$

2.3. The $m$-Function.

2.3.1. The Definition of the $m$-Function. We denote the Borel transform of $d\mu$ by $m$:
$$m(z) = \int \frac{d\mu(x)}{x - z}, \qquad \operatorname{Im} z > 0. \qquad (2.47)$$
It is a matrix-valued Herglotz function, that is, it is analytic and obeys $\operatorname{Im} m(z) > 0$. For information on matrix-valued Herglotz functions, see [98] and references therein. Extensions to operator-valued Herglotz functions can be found in [94].

Lemma 2.12. Suppose $d\mu$ is given, $p_n^R$ are right normalized orthogonal polynomials, and $J$ is the associated block Jacobi matrix. Then,
$$m(z) = \langle\langle p_0^R, (x-z)^{-1}p_0^R\rangle\rangle_R \qquad (2.48)$$
and
$$m(z) = \langle e_{1,\cdot}, (J-z)^{-1}e_{1,\cdot}\rangle_{\mathcal{H}_v}. \qquad (2.49)$$

Proof. Since $p_0^R = \mathbf{1}$, (2.48) is just a way of rewriting the definition of $m$. The second identity, (2.49), is a consequence of (2.41) and Theorem 2.11(iii). $\square$

2.3.2. Coefficient Stripping. If $J$ is a block Jacobi matrix corresponding to invertible $A_n$'s and Hermitian $B_n$'s, we denote by $J^{(k)}$ the $k$-times stripped block Jacobi matrix, corresponding to $\{A_{k+n}, B_{k+n}\}_{n\geq 1}$. That is,
$$J^{(k)} = \begin{pmatrix} B_{k+1} & A_{k+1} & 0 & \cdots \\ A_{k+1}^\dagger & B_{k+2} & A_{k+2} & \cdots \\ 0 & A_{k+2}^\dagger & B_{k+3} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$
The $m$-function corresponding to $J^{(k)}$ will be denoted by $m^{(k)}$. Note that, in particular, $J^{(0)} = J$ and $m^{(0)} = m$.

Proposition 2.13. Let $J$ be a block Jacobi matrix with $\sigma_{\mathrm{ess}}(J) \subseteq [a,b]$. Then, for every $\varepsilon > 0$, there is $k_0 \geq 0$ such that for $k \geq k_0$, we have $\sigma(J^{(k)}) \subseteq [a-\varepsilon, b+\varepsilon]$.

Proof.
This is an immediate consequence of (the proof of) [42, Lemma 1]. $\square$

Proposition 2.14 (Due to Aptekarev–Nikishin [4]). We have that
$$m^{(k)}(z)^{-1} = B_{k+1} - z - A_{k+1}\,m^{(k+1)}(z)\,A_{k+1}^\dagger$$
for $\operatorname{Im} z > 0$ and $k \geq 0$.

Proof. It suffices to handle the case $k = 0$. Given (2.49), this is a special case of a general formula for $2\times 2$ block operator matrices, due to Schur [163], that reads
$$\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1} = \begin{pmatrix} (A - BD^{-1}C)^{-1} & -A^{-1}B(D - CA^{-1}B)^{-1} \\ -D^{-1}C(A - BD^{-1}C)^{-1} & (D - CA^{-1}B)^{-1} \end{pmatrix}$$
and which is readily verified. Here $A = B_1 - z$, $B = A_1$, $C = A_1^\dagger$, and $D = J^{(1)} - z$. $\square$
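For a finite block Jacobi matrix the stripping formula is exactly the Schur complement identity, so it can be tested with no truncation error. The following finite-dimensional stand-in is our own construction, not from the paper.

```python
import numpy as np

# Hypothetical finite-dimensional check of Proposition 2.14 at k = 0: with
# m(z) the top-left l x l block of (J - z)^{-1} and m^(1)(z) the same object
# for the once-stripped matrix, the Schur complement formula gives exactly
#   m(z)^{-1} = B_1 - z - A_1 m^(1)(z) A_1^+.
rng = np.random.default_rng(5)
l, nb = 2, 3
B = [(lambda M: M + M.conj().T)(rng.standard_normal((l, l))) for _ in range(nb)]
A = [rng.standard_normal((l, l)) + np.eye(l) for _ in range(nb - 1)]

J = np.zeros((nb * l, nb * l))
for n in range(nb):
    J[n*l:(n+1)*l, n*l:(n+1)*l] = B[n]
for n in range(nb - 1):
    J[n*l:(n+1)*l, (n+1)*l:(n+2)*l] = A[n]
    J[(n+1)*l:(n+2)*l, n*l:(n+1)*l] = A[n].conj().T

z = 0.7 + 0.5j
m0 = np.linalg.inv(J - z * np.eye(nb * l))[:l, :l]
m1 = np.linalg.inv(J[l:, l:] - z * np.eye((nb - 1) * l))[:l, :l]
err = np.max(np.abs(np.linalg.inv(m0)
                    - (B[0] - z * np.eye(l) - A[0] @ m1 @ A[0].conj().T)))
print(err)  # ~ 0
```

Since $\operatorname{Im} z > 0$, both resolvents and the Schur complement are invertible, so the check is well defined.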

2.4. Second Kind Polynomials. Define the second kind polynomials by $q_{-1}^R(z) = -\mathbf{1}$,
$$q_n^R(z) = \int d\mu(x)\, \frac{p_n^R(z) - p_n^R(x)}{z - x}, \qquad n = 0, 1, 2, \dots$$
As is easy to see, for $n \geq 1$, $q_n^R$ is a polynomial of degree $n - 1$. For future reference, let us display the first several polynomials $p_n^R$ and $q_n^R$:
$$p_{-1}^R(x) = 0, \qquad p_0^R(x) = \mathbf{1}, \qquad p_1^R(x) = (x - B_1)(A_1^\dagger)^{-1}, \qquad (2.50)$$
$$q_{-1}^R(x) = -\mathbf{1}, \qquad q_0^R(x) = 0, \qquad q_1^R(x) = (A_1^\dagger)^{-1}. \qquad (2.51)$$
The polynomials $q_n^R$ satisfy an equation of the same form as (2.30):

$$x\, q_n^R(x) = q_{n+1}^R(x)A_{n+1}^\dagger + q_n^R(x)B_{n+1} + q_{n-1}^R(x)A_n, \qquad n = 0, 1, 2, \dots \qquad (2.52)$$
For $n = 0$, this can be checked by a direct substitution of (2.51). For $n \geq 1$, as in the scalar case, this can be checked by taking (2.30) for $x$ and for $z$, subtracting, dividing by $x - z$, integrating over $d\mu$, and taking into account the orthogonality relation

$$\int d\mu(x)\, p_n^R(x) = 0, \qquad n \geq 1.$$
Finally, let us define

$$\psi_n^R(z) = q_n^R(z) + m(z)\,p_n^R(z).$$

According to the definition of $q_n^R$, we have
$$\psi_n^R(z) = \langle\langle f_z, p_n^R\rangle\rangle_R, \qquad f_z(x) = (x - \bar{z})^{-1}.$$
By the Parseval identity, this shows that for all $\operatorname{Im} z > 0$, the sequence $\psi_n^R(z)$ is in $\ell^2$, that is,

$$\sum_{n=0}^\infty \operatorname{Tr}\bigl(\psi_n^R(z)^\dagger\psi_n^R(z)\bigr) < \infty. \qquad (2.53)$$
In the same way, we define $q_{-1}^L(z) = -\mathbf{1}$,
$$q_n^L(z) = \int \frac{p_n^L(z) - p_n^L(x)}{z - x}\, d\mu(x), \qquad n = 0, 1, 2, \dots$$
and $\psi_n^L(z) = q_n^L(z) + p_n^L(z)m(z)$.
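For a discrete measure, the defining integral for $q_n^R$ becomes a finite sum, and the recurrence (2.52) can be tested directly. This sketch is our own construction, not from the paper.

```python
import numpy as np

# Hypothetical check of the second kind polynomials: for a discrete 2x2
# matrix measure with int dmu = 1, compute
#   q_n^R(z) = sum_i W_i (p_n^R(z) - p_n^R(x_i)) / (z - x_i)
# and verify (2.52) at n = 1 (the q_0^R A_1 term is absent since q_0^R = 0):
#   z q_1^R(z) = q_2^R(z) A_2^+ + q_1^R(z) B_2.
rng = np.random.default_rng(6)
xs = np.linspace(-2.0, 2.0, 5)
Ws = [(lambda M: M.T @ M + 0.3 * np.eye(2))(rng.standard_normal((2, 2))) for _ in xs]

def inv_sqrt(G):
    w, V = np.linalg.eigh(G)
    return V @ np.diag(w ** -0.5) @ V.conj().T

T = inv_sqrt(sum(Ws))
Ws = [T @ W @ T for W in Ws]          # normalize: << 1, 1 >>_R = 1

def ip(f, g):
    return sum(f(x).conj().T @ W @ g(x) for x, W in zip(xs, Ws))

def mono(n):
    return lambda x: x**n * np.eye(2)

ps = [mono(0)]
for n in (1, 2):                       # type-2 normalized p_1^R, p_2^R
    prev = list(ps)
    cs = [ip(p, mono(n)) for p in prev]
    Pn = lambda x, n=n, prev=prev, cs=cs: \
        x**n * np.eye(2) - sum(p(x) @ c for p, c in zip(prev, cs))
    ps.append(lambda x, Pn=Pn, N=inv_sqrt(ip(Pn, Pn)): Pn(x) @ N)

def q(n, z):                           # second kind polynomial q_n^R(z)
    return sum(W @ (ps[n](z) - ps[n](x)) / (z - x) for x, W in zip(xs, Ws))

B2 = ip(ps[1], lambda x: x * ps[1](x))
A2 = ip(ps[1], lambda x: x * ps[2](x))
err = max(np.max(np.abs(z * q(1, z) - (q(2, z) @ A2.conj().T + q(1, z) @ B2)))
          for z in (0.37, 1.93, -2.6))
print(err)  # ~ 0
```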

2.5. Solutions to the Difference Equations. For $\operatorname{Im} z > 0$, consider the solutions to the equations

$$z\, u_n(z) = \sum_{m=1}^\infty u_m(z)\,J_{mn}, \qquad n = 2, 3, \dots \qquad (2.54)$$

$$z\, v_n(z) = \sum_{m=1}^\infty J_{nm}\,v_m(z), \qquad n = 2, 3, \dots \qquad (2.55)$$
Clearly, $u_n(z)$ solves (2.54) if and only if $v_n(z) = (u_n(\bar{z}))^\dagger$ solves (2.55). In the above, we normally intend $z$ as a fixed parameter, although it can also be used as a variable. That is, $z$ is fixed and $u_n(z)$ is a fixed sequence, not a $z$-dependent function. A statement like $v_n(z) = (u_n(\bar{z}))^\dagger$ means: if $u_n$ is a sequence solving (2.54) for $z = \bar{z}_0$, then $v_n$ obeys (2.55) for $z = z_0$. Of course, if $u_n(z)$ is a function of $z$ in a region, we can apply our estimates to all $z$ in the region.

For any solution $\{u_n(z)\}_{n=1}^\infty$ of (2.54), let us define
$$u_0(z) = z\,u_1(z) - u_1(z)B_1 - u_2(z)A_1^\dagger. \qquad (2.56)$$
With this definition, the equation (2.54) for $n = 1$ is equivalent to $u_0(z) = 0$. In the same way, we set
$$v_0(z) = z\,v_1(z) - B_1v_1(z) - A_1v_2(z).$$

Lemma 2.15. Let $\operatorname{Im} z > 0$ and suppose $\{u_n(z)\}_{n=0}^\infty$ solves (2.54) (for $n \geq 2$) and (2.56) and belongs to $\ell^2$. Then
$$(\operatorname{Im} z)\sum_{n=1}^\infty \operatorname{Tr}\bigl(u_n(z)^\dagger u_n(z)\bigr) = -\operatorname{Im}\operatorname{Tr}\bigl(u_1(z)u_0(z)^\dagger\bigr). \qquad (2.57)$$
In particular, $u_n(z) = \alpha p_{n-1}^R(z)$ is in $\ell^2$ only if $\alpha = 0$.

Proof. Denote $s_n = \operatorname{Tr}\bigl(u_n(z)A_{n-1}^\dagger u_{n-1}(z)^\dagger\bigr)$, where $A_0 = \mathbf{1}$. Multiplying (2.54) for $n \geq 2$ and (2.56) for $n = 1$ by $u_n(z)^\dagger$ on the right, taking traces, and summing over $n$, we get
$$z\sum_{n=1}^N \operatorname{Tr}\bigl(u_n(z)u_n(z)^\dagger\bigr) = \sum_{n=1}^N \bar{s}_n + \sum_{n=1}^N \operatorname{Tr}\bigl(u_n(z)B_nu_n(z)^\dagger\bigr) + \sum_{n=2}^{N+1} s_n.$$
Taking imaginary parts and letting $N \to \infty$, we obtain (2.57): the middle sum is real and, since $\operatorname{Im}\bar{s}_n = -\operatorname{Im} s_n$, the outer sums cancel up to the boundary terms $-\operatorname{Im} s_1 = -\operatorname{Im}\operatorname{Tr}(u_1(z)u_0(z)^\dagger)$ and $\operatorname{Im} s_{N+1} \to 0$.

Applying (2.57) to $u_n(z) = \alpha p_{n-1}^R(z)$, we get zero on the right-hand side:
$$(\operatorname{Im} z)\sum_{n=1}^\infty \operatorname{Tr}\bigl(\alpha\, p_{n-1}^R(z)p_{n-1}^R(z)^\dagger\alpha^\dagger\bigr) = 0$$
and hence $\alpha = 0$ since $p_0^R = \mathbf{1}$. $\square$

Theorem 2.16. Let $\operatorname{Im} z > 0$.
(i) Any solution $\{u_n(z)\}_{n=0}^\infty$ of (2.54) (for $n \geq 2$) can be represented as
$$u_n(z) = a\,p_{n-1}^R(z) + b\,q_{n-1}^R(z) \qquad (2.58)$$
for suitable $a, b \in \mathcal{M}_l$. In fact, $a = u_1(z)$ and $b = -u_0(z)$.
(ii) A sequence (2.58) satisfies (2.54) for all $n \geq 1$ if and only if $b = 0$.
(iii) A sequence (2.58) belongs to $\ell^2$ if and only if $u_n(z) = c\,\psi_{n-1}^R(z)$ for some $c \in \mathcal{M}_l$. Equivalently, a sequence (2.58) belongs to $\ell^2$ if and only if $u_1(z) + u_0(z)m(z) = 0$.

Proof. (i) Let $u_n(z)$ be a solution to (2.54). Consider
$$\tilde{u}_n(z) = u_n(z) - u_1(z)\,p_{n-1}^R(z) + u_0(z)\,q_{n-1}^R(z).$$
Then $\tilde{u}_n(z)$ also solves (2.54) and $\tilde{u}_0(z) = \tilde{u}_1(z) = 0$. It follows that $\tilde{u}_n(z) = 0$ for all $n$. This proves (i).
(ii) A direct substitution of (2.58) into (2.54) for $n = 1$ yields the statement.
(iii) We already know that $c\psi_{n-1}^R$ is an $\ell^2$ solution. Suppose that $u_n(z)$ is an $\ell^2$ solution to (2.54). Rewrite (2.58) as
$$u_n(z) = \bigl(a - b\,m(z)\bigr)p_{n-1}^R(z) + b\,\psi_{n-1}^R(z).$$
Since $\psi_n^R$ is in $\ell^2$ and $cp_n^R$ is not in $\ell^2$ unless $c = 0$, we get $a = b\,m(z)$, which is equivalent to $u_1(z) + u_0(z)m(z) = 0$, or to $u_n(z) = b\,\psi_{n-1}^R(z)$. $\square$

By conjugation, we obtain:

Theorem 2.17. Let $\operatorname{Im} z > 0$.
(i) Any solution $\{v_n(z)\}_{n=0}^\infty$ of (2.55) (for $n \geq 2$) can be represented as
$$v_n(z) = p_{n-1}^L(z)\,a + q_{n-1}^L(z)\,b. \qquad (2.59)$$
In fact, $a = v_1(z)$ and $b = -v_0(z)$.
(ii) A sequence (2.59) satisfies (2.55) for all $n \geq 1$ if and only if $b = 0$.
(iii) A sequence (2.59) belongs to $\ell^2$ if and only if $v_n(z) = \psi_{n-1}^L(z)\,c$ for some $c \in \mathcal{M}_l$. Equivalently, a sequence (2.59) belongs to $\ell^2$ if and only if $v_1(z) + m(z)v_0(z) = 0$.

2.6. Wronskians and the Christoffel–Darboux Formula. For any two $\mathcal{M}_l$-valued sequences $u_n$, $v_n$, define the Wronskian by
$$W_n(u, v) = u_nA_nv_{n+1} - u_{n+1}A_n^\dagger v_n. \qquad (2.60)$$
Note that $W_n(u, v)^\dagger = -W_n(v^\dagger, u^\dagger)$. If $u_n(z)$ and $v_n(z)$ are solutions to (2.54) and (2.55), respectively, then, by a direct calculation, we see that $W_n(u(z), v(z))$ is independent of $n$. Put differently, if both $u_n(z)$ and $v_n(z)$ are solutions to (2.54), then $W_n(u(z), v(\bar{z})^\dagger)$ is independent of $n$.

Or, if both $u_n(z)$ and $v_n(z)$ are solutions to (2.55), then $W_n(u(\bar{z})^\dagger, v(z))$ is independent of $n$. In particular, by a direct evaluation at $n = 0$, we get
$$W_n\bigl(p^R_{\cdot-1}(z),\, p^R_{\cdot-1}(\bar{z})^\dagger\bigr) = W_n\bigl(q^R_{\cdot-1}(z),\, q^R_{\cdot-1}(\bar{z})^\dagger\bigr) = 0,$$
$$W_n\bigl(p^L_{\cdot-1}(\bar{z})^\dagger,\, p^L_{\cdot-1}(z)\bigr) = W_n\bigl(q^L_{\cdot-1}(\bar{z})^\dagger,\, q^L_{\cdot-1}(z)\bigr) = 0,$$
$$W_n\bigl(p^R_{\cdot-1}(z),\, q^R_{\cdot-1}(\bar{z})^\dagger\bigr) = W_n\bigl(p^L_{\cdot-1}(\bar{z})^\dagger,\, q^L_{\cdot-1}(z)\bigr) = \mathbf{1}.$$
Let both $u(z)$ and $v(z)$ be solutions to (2.54) of the type (2.58), namely,
$$u_n(z) = a\,p_{n-1}^R(z) + b\,q_{n-1}^R(z), \qquad v_n(z) = c\,p_{n-1}^R(z) + d\,q_{n-1}^R(z).$$
Then the above calculation implies
$$W_n\bigl(u(z), v(\bar{z})^\dagger\bigr) = ad^\dagger - bc^\dagger.$$

Theorem 2.18 (CD Formula). For any $x, y \in \mathbb{C}$ and $n \geq 1$, one has
$$(x - y)\sum_{k=0}^n p_k^R(x)\,p_k^L(y) = -W_{n+1}\bigl(p^R_{\cdot-1}(x),\, p^L_{\cdot-1}(y)\bigr). \qquad (2.61)$$

Proof. Multiplying (2.30) by $p_n^L(y)$ on the right and (2.32) (with $y$ in place of $x$) by $p_n^R(x)$ on the left and subtracting, we get
$$(x - y)\,p_n^R(x)p_n^L(y) = W_n\bigl(p^R_{\cdot-1}(x), p^L_{\cdot-1}(y)\bigr) - W_{n+1}\bigl(p^R_{\cdot-1}(x), p^L_{\cdot-1}(y)\bigr).$$
Summing over $n$ and noting that $W_0(p^R_{\cdot-1}(x), p^L_{\cdot-1}(y)) = 0$, we get the required statement. $\square$

2.7. The CD Kernel. The CD kernel is defined for $z, w \in \mathbb{C}$ by
$$K_n(z, w) = \sum_{k=0}^n p_k^R(z)\,p_k^R(\bar{w})^\dagger \qquad (2.62)$$
$$= \sum_{k=0}^n p_k^L(\bar{z})^\dagger\, p_k^L(w). \qquad (2.63)$$
(2.63) follows from (2.62) and (2.31). Notice that $K_n$ is independent of the choices $\sigma_n$, $\tau_n$ in (2.24) and that (2.61) can be written
$$(z - w)\,K_n(z, w) = -W_{n+1}\bigl(p^R_{\cdot-1}(z),\, p^R_{\cdot-1}(\bar{w})^\dagger\bigr). \qquad (2.64)$$
The independence of $K_n$ of $\sigma$, $\tau$ can be understood by noting that if $f_m$ is given by (2.26), then
$$\int K_n(z, w)\, d\mu(w)\, f(w) = \sum_{m=0}^n p_m^R(z)\,f_m, \qquad (2.65)$$
so $K_n$ is the kernel of the orthogonal projection onto polynomials of degree up to $n$, and so is intrinsic. Similarly, if $f_m^{(L)} = \langle\langle f, p_m^L\rangle\rangle_L$, so that
$$f(x) = \sum_{m=0}^\infty f_m^{(L)}\,p_m^L(x), \qquad (2.66)$$
then, by (2.63),
$$\int f(z)\, d\mu(z)\, K_n(z, w) = \sum_{m=0}^n f_m^{(L)}\,p_m^L(w). \qquad (2.67)$$
One has

$$\int K_n(z, w)\, d\mu(w)\, K_n(w, \zeta) = K_n(z, \zeta), \qquad (2.68)$$
as can be checked directly, and which is an expression of the fact that the map in (2.65) is a projection, and so its own square.

We will let $\pi_n$ be the map of $L^2(d\mu)$ to itself given by (2.65) (or by (2.67)). (2.64) can then be viewed (for $z, w \in \mathbb{R}$) as an expression of the integral kernel of the commutator $[J, \pi_n]$, which leads to another proof of it [175].

Let $J_{n;F}$ be the finite $nl \times nl$ matrix obtained from $J$ by taking the top leftmost $n^2$ blocks. It is the matrix of $\pi_{n-1}M_x\pi_{n-1}$ where $M_x$ is multiplication by $x$ in the $\{p_j^R\}_{j=0}^{n-1}$ basis. For $y \in \mathbb{C}$ and $\gamma \in \mathbb{C}^l$, let $\varphi_{n,\gamma}(y)$ be the vector whose components are
$$(\varphi_{n,\gamma}(y))_j = p_{j-1}^L(y)\,\gamma \qquad (2.69)$$
for $j = 1, 2, \dots, n$. We claim that
$$[(J_{n;F} - y)\varphi_{n,\gamma}(y)]_j = -\delta_{jn}\,A_n\,p_n^L(y)\,\gamma, \qquad (2.70)$$
as follows immediately from (2.32). This is intimately related to (2.61) and (2.64). For, recalling that $J$ is the matrix of $M_x$ in the $p^R$ basis, $\varphi_{n,\gamma}(y)$ corresponds to the function

$$\sum_{j=1}^{n} p_{j-1}^R(x)\,(\varphi_{n,\gamma}(y))_j = K_{n-1}(x, y)\,\gamma.$$
As we will see in the next two sections, (2.70) has important consequences.
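Both the reproducing property (2.68) and the relation (2.70) are finite-dimensional statements for a discrete measure and can be tested directly. This sketch is our own construction, not from the paper.

```python
import numpy as np

# Hypothetical check of (2.68) and (2.70) for a discrete 2x2 matrix measure
# normalized to << 1, 1 >>_R = 1, with type-2 normalized polynomials.
rng = np.random.default_rng(7)
xs = np.linspace(-2.0, 2.0, 5)
Ws = [(lambda M: M.T @ M + 0.3 * np.eye(2))(rng.standard_normal((2, 2))) for _ in xs]

def inv_sqrt(G):
    w, V = np.linalg.eigh(G)
    return V @ np.diag(w ** -0.5) @ V.conj().T

T = inv_sqrt(sum(Ws))
Ws = [T @ W @ T for W in Ws]

def ip(f, g):
    return sum(f(x).conj().T @ W @ g(x) for x, W in zip(xs, Ws))

def mono(n):
    return lambda x: x**n * np.eye(2)

ps = [mono(0)]
for n in (1, 2):
    prev = list(ps)
    cs = [ip(p, mono(n)) for p in prev]
    Pn = lambda x, n=n, prev=prev, cs=cs: \
        x**n * np.eye(2) - sum(p(x) @ c for p, c in zip(prev, cs))
    ps.append(lambda x, Pn=Pn, N=inv_sqrt(ip(Pn, Pn)): Pn(x) @ N)

def K(z, w):                        # K_2(z,w) for real arguments, (2.62)
    return sum(ps[k](z) @ ps[k](w).conj().T for k in range(3))

z0, zeta = 0.4, -0.9
repro = sum(K(z0, x) @ W @ K(x, zeta) for x, W in zip(xs, Ws))
err1 = np.max(np.abs(repro - K(z0, zeta)))            # (2.68)

# (2.70) with n = 2: (J_{2;F} - y) phi_{2,gamma}(y) = (0, -A_2 p_2^L(y) gamma)
B1 = ip(ps[0], lambda x: x * ps[0](x)); B2 = ip(ps[1], lambda x: x * ps[1](x))
A1 = ip(ps[0], lambda x: x * ps[1](x)); A2 = ip(ps[1], lambda x: x * ps[2](x))
JF = np.block([[B1, A1], [A1.conj().T, B2]])
y, gamma = 0.25, np.array([1.0, -2.0])
pL = lambda k: ps[k](y).conj().T                      # p_k^L(y) = p_k^R(y)^+, (2.31)
phi = np.concatenate([pL(0) @ gamma, pL(1) @ gamma])
err2 = np.max(np.abs((JF - y * np.eye(4)) @ phi
                     - np.concatenate([np.zeros(2), -A2 @ pL(2) @ gamma])))
print(err1, err2)  # ~ 0  ~ 0
```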

2.8. Christoffel Variational Principle. There is a matrix version of the Christoffel variational principle (see Nevai [152] for a discussion of uses in the scalar case; the matrix case is discussed by Duran–Polo [76]):

Theorem 2.19. For any non-trivial $l \times l$ matrix-valued measure $d\mu$ on $\mathbb{R}$, any $n$, any $x_0 \in \mathbb{R}$, and any matrix polynomial $Q_n(x)$ of degree at most $n$ with

$$Q_n(x_0) = \mathbf{1}, \qquad (2.71)$$
we have that
$$\langle\langle Q_n, Q_n\rangle\rangle_R \geq K_n(x_0, x_0)^{-1} \qquad (2.72)$$
with equality if and only if

$$Q_n(x) = K_n(x, x_0)\,K_n(x_0, x_0)^{-1}. \qquad (2.73)$$

Remark. (2.72) also holds for $\langle\langle\cdot,\cdot\rangle\rangle_L$, but the minimizer is then $K_n(x_0, x_0)^{-1}K_n(x, x_0)$.

Proof. Let $Q_n^{(0)}$ denote the right-hand side of (2.73). Then for any polynomial $R_n$ of degree at most $n$, we have

$$\langle\langle Q_n^{(0)}, R_n\rangle\rangle_R = K_n(x_0, x_0)^{-1}R_n(x_0) \qquad (2.74)$$
because of (2.65). Since $Q_n(x_0) = Q_n^{(0)}(x_0) = \mathbf{1}$, we conclude

$$\langle\langle Q_n - Q_n^{(0)},\, Q_n - Q_n^{(0)}\rangle\rangle_R = \langle\langle Q_n, Q_n\rangle\rangle_R - K_n(x_0, x_0)^{-1}, \qquad (2.75)$$
from which (2.72) is immediate and, given the supposed non-triviality, so is the uniqueness of the minimizer. $\square$
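Theorem 2.19 can be observed numerically. The following sketch is our own construction, not from the paper: the minimizer (2.73) attains equality in (2.72), and a perturbed competitor with $Q(x_0) = \mathbf{1}$ exceeds $K_n(x_0,x_0)^{-1}$ in the matrix order.

```python
import numpy as np

# Hypothetical check of the Christoffel variational principle for a discrete
# 2x2 matrix measure (normalized so << 1, 1 >>_R = 1), with n = 2.
rng = np.random.default_rng(8)
xs = np.linspace(-2.0, 2.0, 5)
Ws = [(lambda M: M.T @ M + 0.3 * np.eye(2))(rng.standard_normal((2, 2))) for _ in xs]

def inv_sqrt(G):
    w, V = np.linalg.eigh(G)
    return V @ np.diag(w ** -0.5) @ V.conj().T

T = inv_sqrt(sum(Ws))
Ws = [T @ W @ T for W in Ws]

def ip(f, g):
    return sum(f(x).conj().T @ W @ g(x) for x, W in zip(xs, Ws))

def mono(n):
    return lambda x: x**n * np.eye(2)

ps = [mono(0)]
for n in (1, 2):
    prev = list(ps)
    cs = [ip(p, mono(n)) for p in prev]
    Pn = lambda x, n=n, prev=prev, cs=cs: \
        x**n * np.eye(2) - sum(p(x) @ c for p, c in zip(prev, cs))
    ps.append(lambda x, Pn=Pn, N=inv_sqrt(ip(Pn, Pn)): Pn(x) @ N)

def K(z, w):
    return sum(ps[k](z) @ ps[k](w).conj().T for k in range(3))

x0 = 0.6
Kinv = np.linalg.inv(K(x0, x0))
Q0 = lambda x: K(x, x0) @ Kinv                       # minimizer (2.73)
err = np.max(np.abs(ip(Q0, Q0) - Kinv))              # equality case of (2.72)

Q = lambda x: Q0(x) + (x - x0) * np.array([[0.4, 0.1], [0.0, -0.3]])
gap = ip(Q, Q) - Kinv                                # (2.72): gap >= 0
print(err, np.linalg.eigvalsh(gap).min() >= -1e-10)  # ~0  True
```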

With this, one easily gets an extension of a result of Máté–Nevai [146] to MOPRL. (They had it for scalar OPUC. For OPRL, it is in Máté–Nevai–Totik [147] on $[-1, 1]$ and in Totik [187] for general OPRL. The result below can be proven using polynomial mappings à la Totik [188] or Jost solutions à la Simon [174].)

Theorem 2.20. Let $d\mu$ be a non-trivial $l \times l$ matrix-valued measure on $\mathbb{R}$ with compact support, $E$. Let $I = (a, b)$ be an open interval with $I \subset E$. Then for Lebesgue a.e. $x \in I$,
$$\limsup_{n\to\infty}\,(n + 1)\,K_n(x, x)^{-1} \leq w(x), \qquad (2.76)$$
where $w$ is the a.c. part of $d\mu$, that is, $d\mu = w(x)\,dx + d\mu_s$.

Remark. This is intended in terms of expectations in any fixed vector.

We state this explicitly since we will need it in Section 5.4, but we note that the more detailed results of Máté–Nevai–Totik [147], Lubinsky [141], Simon [173], and Totik [189] also extend.

2.9. Zeros. We next look at the zeros of $\det(P_n^L(z))$, which, as we will prove shortly, is also $\det(P_n^R(z))$. Following [66, 177], we will identify these zeros with the eigenvalues of $J_{n;F}$. It is not obvious a priori that these zeros are real and, unlike the scalar situation, where the classical arguments on the zeros rely on orthogonality, we do not know how to get reality just from that (but see the remark after Theorem 2.25).

Lemma 2.21. Let $C(z)$ be an $l \times l$ matrix-valued function analytic near $z = 0$. Let
$$k = \dim\bigl(\ker(C(0))\bigr). \qquad (2.77)$$
Then $\det(C(z))$ has a zero at $z = 0$ of order at least $k$.

Remarks. 1. Even in the $1 \times 1$ case, where $k = 1$, the zero can clearly be of higher order than $k$ since $c_{11}(z)$ can have a zero of any order!
2. The temptation to take products of eigenvalues will lead at best to a complicated proof, as the cases $C(z) = \begin{pmatrix} 0 & z \\ 1 & 0 \end{pmatrix}$ and $C(z) = \begin{pmatrix} 0 & z^2 \\ 1 & 0 \end{pmatrix}$ illustrate.

Proof. Let $e_1, \dots, e_l$ be an orthonormal basis with $e_1, \dots, e_k \in \ker(C(0))$. By Hadamard's inequality (see Bhatia [13]),
$$|\det(C(z))| \leq \|C(z)e_1\| \cdots \|C(z)e_l\| \leq c\,|z|^k,$$
since $\|C(z)e_j\| \leq c\,|z|$ for $j = 1, \dots, k$ and $\|C(z)e_j\| \leq c$ for $j = k+1, \dots, l$ (for a suitable constant $c$ and small $|z|$). $\square$

The following goes back at least to [34]; see also [66, 165, 177, 178].

Theorem 2.22. We have that
$$\det{}_{\mathbb{C}^l}\bigl(P_n^L(z)\bigr) = \det{}_{\mathbb{C}^{nl}}\bigl(z - J_{n;F}\bigr). \qquad (2.78)$$

Proof. By (2.70), if $\gamma$ is such that $p_n^L(y)\gamma = 0$, then $\varphi_{n,\gamma}(y)$ is an eigenvector of $J_{n;F}$ with eigenvalue $y$. Conversely, if $\varphi$ is an eigenvector and $\gamma$ is defined as that vector in $\mathbb{C}^l$ whose components are the first $l$ components of $\varphi$, then a simple inductive argument shows $\varphi = \varphi_{n,\gamma}(y)$ and then, by (2.70) and the fact that $A_n$ is invertible, we see that $p_n^L(y)\gamma = 0$. This shows that, for any $y$,
$$\dim\ker\bigl(P_n^L(y)\bigr) = \dim\ker\bigl(J_{n;F} - y\bigr). \qquad (2.79)$$
By Lemma 2.21, if $y$ is any eigenvalue of $J_{n;F}$ of multiplicity $k$, then $\det(P_n^L(z))$ has a zero of order at least $k$ at $y$. Now let us consider the polynomials in $z$ on the left and right in (2.78). Since $J_{n;F}$ is Hermitian, the sum of the multiplicities of the zeros on the right is $nl$.
Since the polynomial on the left is of degree $nl$, a counting argument shows that it has the same set of zeros, with the same multiplicities, as the polynomial on the right. Since both polynomials are monic, they are equal. $\square$
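Theorem 2.22 can be observed numerically. The following sketch is our own construction, not from the paper, comparing $\det P_2^L(z)$ with the characteristic polynomial of $J_{2;F}$ for a discrete measure with real data (so that $\det P_2^L = \det P_2^R$).

```python
import numpy as np

# Hypothetical check of (2.78) for n = 2, l = 2:
#   det(P_2^L(z)) = det(z - J_{2;F}).
rng = np.random.default_rng(9)
xs = np.linspace(-2.0, 2.0, 5)
Ws = [(lambda M: M.T @ M + 0.3 * np.eye(2))(rng.standard_normal((2, 2))) for _ in xs]

def ip(f, g):
    return sum(f(x).conj().T @ W @ g(x) for x, W in zip(xs, Ws))

def inv_sqrt(G):
    w, V = np.linalg.eigh(G)
    return V @ np.diag(w ** -0.5) @ V.conj().T

def mono(n):
    return lambda x: x**n * np.eye(2)

# monic P_0, P_1, P_2 and their normalized versions p_k = P_k gamma_k^{-1/2}
Ps = [mono(0)]
for n in (1, 2):
    prev = list(Ps)
    cs = [np.linalg.solve(ip(P, P), ip(P, mono(n))) for P in prev]
    Ps.append(lambda x, n=n, prev=prev, cs=cs:
              x**n * np.eye(2) - sum(P(x) @ c for P, c in zip(prev, cs)))
ps = [(lambda x, P=P, N=inv_sqrt(ip(P, P)): P(x) @ N) for P in Ps]

B1 = ip(ps[0], lambda x: x * ps[0](x)); B2 = ip(ps[1], lambda x: x * ps[1](x))
A1 = ip(ps[0], lambda x: x * ps[1](x))
JF = np.block([[B1, A1], [A1.conj().T, B2]])

err = max(abs(np.linalg.det(Ps[2](z)) - np.linalg.det(z * np.eye(4) - JF))
          for z in (0.3, -1.2, 2.5))
print(err)  # ~ 0
```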

Corollary 2.23. All the zeros of $\det(P_n^L(z))$ are real. Moreover,
$$\det\bigl(P_n^R(z)\bigr) = \det\bigl(P_n^L(z)\bigr). \qquad (2.80)$$

Proof. Since $J_{n;F}$ is Hermitian, all its eigenvalues are real, so (2.78) implies that the zeros of $\det(P_n^L(z))$ are real. Thus, since the polynomial is monic,
$$\overline{\det\bigl(P_n^L(\bar{z})\bigr)} = \det\bigl(P_n^L(z)\bigr). \qquad (2.81)$$
By Lemma 2.2(v), we have
$$P_n^R(z) = P_n^L(\bar{z})^\dagger \qquad (2.82)$$
since both sides are analytic and agree if $z$ is real. Thus,

$$\det\bigl(P_n^R(z)\bigr) = \det\bigl(P_n^L(\bar{z})^\dagger\bigr) = \overline{\det\bigl(P_n^L(\bar{z})\bigr)} = \det\bigl(P_n^L(z)\bigr),$$
proving (2.80). $\square$

The following appeared before in [178]; see also [165].

Corollary 2.24. Let $\{x_{n,j}\}_{j=1}^{nl}$ be the zeros of $\det(P_n^L(x))$, counting multiplicity, ordered by
$$x_{n,1} \leq x_{n,2} \leq \cdots \leq x_{n,nl}. \qquad (2.83)$$
Then
$$x_{n+1,j} \leq x_{n,j} \leq x_{n+1,j+l}. \qquad (2.84)$$

Remarks. 1. This is interlacing if $l = 1$ and replaces it for general $l$.

2. Using the invertibility of $A_n$, one can show that the inequalities in (2.84) are strict.

Proof. The min-max principle [159] says that

$$x_{n,j} = \max_{\substack{L \subset \mathbb{C}^{nl} \\ \dim(L) \leq j-1}}\ \min_{\substack{f \in L^\perp \\ \|f\| = 1}} \langle f, J_{n;F}f\rangle_{\mathbb{C}^{nl}}. \qquad (2.85)$$
If $P: \mathbb{C}^{(n+1)l} \to \mathbb{C}^{nl}$ is the natural projection, then
$$\langle Pf, J_{n+1;F}Pf\rangle_{\mathbb{C}^{(n+1)l}} = \langle Pf, J_{n;F}Pf\rangle_{\mathbb{C}^{nl}}$$
(regarding $Pf$ as an element of $\mathbb{C}^{(n+1)l}$ in the first expression), and as $L$ runs through all subspaces of $\mathbb{C}^{(n+1)l}$ of dimension at most $j-1$, $P[L]$ runs through all subspaces of $\mathbb{C}^{nl}$ of dimension at most $j-1$, so (2.85) implies $x_{n+1,j} \leq x_{n,j}$.

Using the same argument on $-J_{n;F}$ and $-J_{n+1;F}$ shows $x_j(-J_{n;F}) \geq x_j(-J_{n+1;F})$. But $x_j(-J_{n;F}) = -x_{nl+1-j}(J_{n;F})$ and $x_j(-J_{n+1;F}) = -x_{(n+1)l+1-j}(J_{n+1;F})$. That yields the other inequality. $\square$
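The interlacing bounds (2.84) can be observed numerically via the eigenvalue identification of Theorem 2.22. This sketch is our own construction, not from the paper.

```python
import numpy as np

# Hypothetical check of (2.84), x_{n+1,j} <= x_{n,j} <= x_{n+1,j+l}, comparing
# the eigenvalues of J_{2;F} and J_{3;F} (l = 2) for a discrete 2x2 matrix
# measure; by Theorem 2.22 these are the zeros of det(P_2^L) and det(P_3^L).
rng = np.random.default_rng(10)
xs = np.linspace(-2.0, 2.0, 6)
Ws = [(lambda M: M.T @ M + 0.3 * np.eye(2))(rng.standard_normal((2, 2))) for _ in xs]

def ip(f, g):
    return sum(f(x).conj().T @ W @ g(x) for x, W in zip(xs, Ws))

def inv_sqrt(G):
    w, V = np.linalg.eigh(G)
    return V @ np.diag(w ** -0.5) @ V.conj().T

def mono(n):
    return lambda x: x**n * np.eye(2)

ps = [(lambda x, N=inv_sqrt(ip(mono(0), mono(0))): np.eye(2) @ N)]
for n in (1, 2, 3):
    prev = list(ps)
    cs = [ip(p, mono(n)) for p in prev]
    Pn = lambda x, n=n, prev=prev, cs=cs: \
        x**n * np.eye(2) - sum(p(x) @ c for p, c in zip(prev, cs))
    ps.append(lambda x, Pn=Pn, N=inv_sqrt(ip(Pn, Pn)): Pn(x) @ N)

def block(i, j):                  # J_{i+1, j+1} = << p_i, x p_j >>_R, cf. (2.28)
    return ip(ps[i], lambda x: x * ps[j](x))

J3 = np.block([[block(0, 0), block(0, 1), np.zeros((2, 2))],
               [block(1, 0), block(1, 1), block(1, 2)],
               [np.zeros((2, 2)), block(2, 1), block(2, 2)]])
x2 = np.sort(np.linalg.eigvalsh(J3[:4, :4]))          # zeros for n = 2
x3 = np.sort(np.linalg.eigvalsh(J3))                  # zeros for n = 3
ok = all(x3[j] <= x2[j] + 1e-12 and x2[j] <= x3[j + 2] + 1e-12 for j in range(4))
print(ok)  # True
```

Note that the inequality here is just Cauchy interlacing for a Hermitian matrix and its principal $4\times 4$ corner, which is exactly how the proof above proceeds.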

2.10. Lower Bounds on $p$ and the Stieltjes–Weyl Formula for $m$. Next, we want to exploit (2.70) to prove uniform lower bounds on $\|p_n^L(y)\gamma\|$ when $y \notin \mathrm{cvh}(\sigma(J))$, the convex hull of the support of $J$, and thereby uniform bounds on $\|p_n^L(y)^{-1}\|$. We will then use that to prove that, for $z \notin \sigma(J)$, we have
$$m(z) = -\lim_{n\to\infty} p_n^L(z)^{-1} q_n^L(z), \eqno(2.86)$$
the matrix analogue of a formula that spectral theorists associate with Weyl's definition of the $m$-function [193], although for the discrete case it goes back at least to Stieltjes [181]. We begin by mimicking an argument from [172]. Let $H = \mathrm{cvh}(\sigma(J)) = [c-D, c+D]$ with
$$D = \tfrac{1}{2}\,\mathrm{diam}(H). \eqno(2.87)$$

By the definition of $A_n$,
$$\|A_n\| = \|\langle\langle p_{n-1}^R, (x-c)p_n^R\rangle\rangle_R\| \le D. \eqno(2.88)$$
Suppose $y \notin H$ and let
$$d = \mathrm{dist}(y, H). \eqno(2.89)$$
By the spectral theorem, for any vector $\varphi \in \mathcal{H}_v$,
$$|\langle \varphi, (J-y)\varphi\rangle_{\mathcal{H}_v}| \ge d\,\|\varphi\|^2. \eqno(2.90)$$
By (2.70), with $\varphi = \varphi_{n,\gamma}(y)$,
$$|\langle \varphi, (J-y)\varphi\rangle| \le \|A_n\|\,\|p_n^L(y)\gamma\|\,\|p_{n-1}^L(y)\gamma\| \eqno(2.91)$$
while
$$\|\varphi\|^2 = \sum_{j=0}^{n-1} \|p_j^L(y)\gamma\|^2. \eqno(2.92)$$
As in [172, Prop. 2.2], we get:

Theorem 2.25. If $y \notin H$, for any $\gamma$,
$$\|p_n^L(y)\gamma\| \ge \frac{d}{D}\left(1 + \frac{d^2}{D^2}\right)^{(n-1)/2}\|\gamma\|. \eqno(2.93)$$
In particular,
$$\|p_n^L(y)^{-1}\| \le \frac{D}{d}. \eqno(2.94)$$

Remark. (2.93) implies $\det(p_n^L(y)) \ne 0$ if $\operatorname{Im} y > 0$, providing another proof that its zeros are real.
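For a concrete scalar sanity check of (2.93) and (2.94) (an illustration only; the parameters are ours, not the text's), take the free Jacobi matrix with $a_n \equiv 1$, $b_n \equiv 0$, so that $H = [-2, 2]$ and $D = 2$.

```python
# Scalar check of the lower bound (2.93) and the bound (2.94) for the
# free Jacobi matrix (a_n = 1, b_n = 0), where H = [-2, 2], so D = 2.
# Test point y = 3 gives d = dist(y, H) = 1.
D, y = 2.0, 3.0
d = y - 2.0
r = d / D

ps = [1.0, y]                             # p_0 = 1, p_1(y) = y
for k in range(1, 30):
    ps.append(y * ps[k] - ps[k - 1])      # p_{k+1} = y p_k - p_{k-1}

for n in range(1, 31):
    assert abs(ps[n]) >= r * (1 + r ** 2) ** ((n - 1) / 2)   # (2.93), ||gamma|| = 1
    assert 1 / abs(ps[n]) <= D / d                           # (2.94)
print("bounds (2.93) and (2.94) hold up to n = 30")
```

The orthonormal polynomials here grow like $((y+\sqrt{y^2-4})/2)^n$, comfortably above the guaranteed geometric rate $(1+d^2/D^2)^{n/2}$.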

By the discussion after (2.52), if $\operatorname{Im} z > 0$, $q_n^L(z) + p_n^L(z)m(z)$ is in $\ell^2$, so it goes to zero. Since $p_n^L(z)^{-1}$ is bounded, we obtain:

Corollary 2.26. For any $z \in \mathbb{C}_+ = \{z \mid \operatorname{Im} z > 0\}$,
$$m(z) = -\lim_{n\to\infty} p_n^L(z)^{-1} q_n^L(z). \eqno(2.95)$$

Remark. This also holds for $z \notin H$.

Taking adjoints using (2.82) and $m(z)^\dagger = m(\bar z)$, we see that
$$m(z) = -\lim_{n\to\infty} q_n^R(z)\, p_n^R(z)^{-1}. \eqno(2.96)$$

2.11. Wronskians of Vector-Valued Solutions. Let $\alpha, \beta$ be two vector-valued solutions (in $\mathbb{C}^l$) of (2.55) for $n = 2, 3, \dots$. Define their scalar Wronskian as (with the Euclidean inner product on $\mathbb{C}^l$)
$$W_n(\alpha, \beta) = \langle \alpha_n, A_n \beta_{n+1}\rangle - \langle A_n \alpha_{n+1}, \beta_n\rangle \eqno(2.97)$$
for $n = 2, 3, \dots$. One can obtain two matrix solutions by using $\alpha$ or $\beta$ for one column and $0$ for the other columns. The scalar Wronskian is just a matrix element of the resulting matrix Wronskian, so $W_n$ is constant (as can also be seen by direct calculation). Here is an application:

Theorem 2.27. Let $z_0 \in \mathbb{R} \setminus \sigma_{\mathrm{ess}}(J)$. For $k = 0, 1$, let $q_k$ be the multiplicity of $z_0$ as an eigenvalue of $J^{(k)}$. Then $q_0 + q_1 \le l$.

Proof. If $\tilde\beta$ is an eigenfunction for $J^{(1)}$ and we define $\beta$ by
$$\beta_n = \begin{cases} 0, & n = 1, \\ \tilde\beta_{n-1}, & n \ge 2, \end{cases} \eqno(2.98)$$
then $\beta$ solves (2.55) for $n \ge 2$. If $\alpha$ is an eigenfunction for $J = J^{(0)}$, it also solves (2.55). Since $\alpha_n \to 0$, $\beta_n \to 0$, and $A_n$ is bounded, $W_n(\alpha, \beta) \to 0$ as $n \to \infty$, and so it is identically zero. But since $\beta_1 = 0$,
$$0 = W_1(\alpha, \beta) = \langle \alpha_1, A_1\beta_2\rangle = \langle \alpha_1, A_1\tilde\beta_1\rangle. \eqno(2.99)$$
Let $V^{(k)}$ be the set of values of eigenfunctions of $J^{(k)}$ at $n = 1$. (2.99) says
$$V^{(0)} \subset [A_1 V^{(1)}]^\perp. \eqno(2.100)$$
Since $q_k = \dim(V^{(k)})$ and $A_1$ is non-singular, (2.100) implies that $q_0 \le l - q_1$. $\square$

2.12. The Order of Zeros/Poles of $m(z)$.

Theorem 2.28. Let $z_0 \in \mathbb{R} \setminus \sigma_{\mathrm{ess}}(J)$. For $k = 0, 1$, let $q_k$ be the multiplicity of $z_0$ as an eigenvalue of $J^{(k)}$. If $q_1 - q_0 \ge 0$, then $\det(m(z))$ has a zero at $z = z_0$ of order $q_1 - q_0$. If $q_1 - q_0 < 0$, then $\det(m(z))$ has a pole at $z = z_0$ of order $q_0 - q_1$.

Remarks. 1. To say $\det(m(z))$ has a zero at $z = z_0$ of order $0$ means it is finite and non-vanishing at $z_0$!

2. Where $d\mu$ is a direct sum of scalar measures, so is $m(z)$, and $\det(m(z))$ is then a product. In the scalar case, $m(z)$ has a pole at $z_0$ if $J^{(0)}$ has $z_0$ as eigenvalue and a zero at $z_0$ if $J^{(1)}$ has $z_0$ as eigenvalue. In the direct sum case, we see there can be cancellations, which helps explain why $q_1 - q_0$ occurs.

3. Formally, one can understand this theorem as follows. Cramer's rule suggests $\det(m(z)) = \det(J^{(1)} - z)/\det(J^{(0)} - z)$. Even though $\det(J^{(k)} - z)$ is not well-defined in the infinite case, we expect a cancellation of zeros of order $q_1$ and $q_0$. For $z_0 \notin H$, the convex hull of $\sigma(J^{(0)})$, one can use (2.95) to prove the theorem following this intuition. Spurious zeros in gaps of $\sigma(J^{(0)})$ make this strategy difficult in gaps.

4. Unlike in Lemma 2.21, we can write $m$ as a product of eigenvalues and analyze that directly because $m(x)$ is self-adjoint for $x$ real, which sidesteps some of the problems associated with non-trivial Jordan normal forms.

5. This proof gives another demonstration that $q_0 + q_1 \le l$.

Proof. $m(z)$ has a simple pole at $z = z_0$ with a residue which is of rank $q_0$ and strictly negative definite on its range. Let $f(z) = (z - z_0)m(z)$. $f$ is analytic near $z_0$ and self-adjoint for $z$ real and near $z_0$. Thus, by eigenvalue perturbation theory [124, 159], $f(z)$ has $l$ eigenvalues $\rho_1(z), \dots, \rho_l(z)$ analytic near $z_0$ with $\rho_1, \rho_2, \dots, \rho_{q_0}$ non-zero at $z_0$ and $\rho_{q_0+1}, \dots, \rho_l$ zero at $z_0$.

Thus, $m(z)$ has $l$ eigenvalues near $z_0$, $\lambda_j(z) = \rho_j(z)/(z - z_0)$, where $\lambda_1, \dots, \lambda_{q_0}$ have simple poles and the others are regular.

By Proposition 2.14, $m(z)^{-1}$ has a simple pole at $z_0$ with residue of rank $q_1$ (because $A_1$ is non-singular), so $m(z)^{-1}$ has $q_1$ eigenvalues with poles. That means $q_1$ of $\lambda_{q_0+1}, \dots, \lambda_l$ have simple zeros at $z_0$ and the others are non-zero. Thus, $\det(m(z)) = \prod_{j=1}^l \lambda_j(z)$ has a pole/zero of order $q_0 - q_1$. $\square$

2.13. Resolvent of the Jacobi Matrix. Consider the matrix
$$G_{nm}(z) = \langle\langle p_{n-1}^R, (x-z)^{-1} p_{m-1}^R\rangle\rangle_R.$$

Theorem 2.29. One has
$$G_{nm}(z) = \begin{cases} \psi_{n-1}^L(z)\, p_{m-1}^R(z), & n \ge m, \\ p_{n-1}^L(z)\, \psi_{m-1}^R(z), & n \le m. \end{cases} \eqno(2.101)$$

Proof. We have
$$\sum_{m=1}^\infty G_{km}(z) J_{mn} = z\, G_{kn}(z), \quad n \ne k, \qquad \sum_{m=1}^\infty J_{nm} G_{mk}(z) = z\, G_{nk}(z), \quad n \ne k.$$
Fix $k \ge 1$ and let $u_m(z) = G_{km}(z)$. Then $u_m(z)$ satisfies the equation (2.54) for $n \ne k$, and so we can use Theorem 2.16 to describe this solution. First suppose $k > 1$. As $u_m$ is an $\ell^2$ solution and $u_m$ satisfies (2.54) for $n = 1$, we have
$$G_{km}(z) = \begin{cases} a_k(z)\, p_{m-1}^R(z), & m \le k, \\ b_k(z)\, \psi_{m-1}^R(z), & m \ge k. \end{cases} \eqno(2.102)$$
If $k = 1$, (2.102) also holds true. For $m \ge k$, this follows by the same argument, and for $m = k = 1$, this is a trivial statement.

Next, similarly, let us consider $v_m(z) = G_{mk}(z)$. Then $v_m(z)$ solves (2.55) and so, using Theorem 2.17, we obtain
$$G_{mk}(z) = \begin{cases} p_{m-1}^L(z)\, c_k(z), & m \le k, \\ \psi_{m-1}^L(z)\, d_k(z), & m \ge k. \end{cases} \eqno(2.103)$$
Comparing (2.102) and (2.103), we find
$$a_k(z)\, p_{m-1}^R(z) = \psi_{k-1}^L(z)\, d_m(z), \qquad b_k(z)\, \psi_{m-1}^R(z) = p_{k-1}^L(z)\, c_m(z).$$
As $p_0^R = p_0^L = 1$, it follows that
$$a_k(z) = \psi_{k-1}^L(z)\, d_1(z), \qquad c_m(z) = b_1(z)\, \psi_{m-1}^R(z),$$
and so we obtain
$$G_{nm}(z) = \begin{cases} \psi_{n-1}^L(z)\, d_1(z)\, p_{m-1}^R(z), & n \ge m, \\ p_{n-1}^L(z)\, b_1(z)\, \psi_{m-1}^R(z), & n \le m. \end{cases} \eqno(2.104)$$
It remains to prove that
$$b_1(z) = d_1(z) = 1. \eqno(2.105)$$
Consider the case $m = n = 1$. By the definition of the resolvent,
$$G_{11}(z) = \langle\langle p_0^R, (J-z)^{-1} p_0^R\rangle\rangle_R = \int \frac{d\mu(x)}{x - z} = m(z).$$
On the other hand, by (2.104),
$$G_{11}(z) = \psi_0^L(z)\, d_1(z)\, p_0^R(z) = m(z)\, d_1(z), \qquad G_{11}(z) = p_0^L(z)\, b_1(z)\, \psi_0^R(z) = b_1(z)\, m(z),$$
which proves (2.105). $\square$

3. Matrix Orthogonal Polynomials on the Unit Circle

3.1. Definition of MOPUC. In this chapter, $\mu$ is an $l \times l$ matrix-valued measure on $\partial\mathbb{D}$. $\langle\langle\cdot,\cdot\rangle\rangle_R$ and $\langle\langle\cdot,\cdot\rangle\rangle_L$ are defined as in the MOPRL case. Non-triviality is defined as for MOPRL. We will always assume $\mu$ is non-trivial. We define monic matrix polynomials $\Phi_n^R, \Phi_n^L$ by applying Gram–Schmidt to $\{1, z1, \dots\}$; that is, $\Phi_n^R$ is the unique matrix polynomial $z^n 1\, +$ lower order with
$$\langle\langle z^k 1, \Phi_n^R\rangle\rangle_R = 0, \qquad k = 0, 1, \dots, n-1. \eqno(3.1)$$
We will define the normalized MOPUC shortly. We will only consider the analogue of what we called type 1 for MOPRL because only those appear to be useful. Unlike in the scalar case, the monic polynomials do not appear much because it is for the normalized, but not monic, polynomials that the left and right Verblunsky coefficients are the same.

3.2. The Szegő Recursion. The scalar Szegő recursion appeared for the first time in Szegő [184]. It seems likely that Geronimus had it independently shortly after Szegő. Not knowing of the connection with this work, Levinson [138] rederived the recursion, but with matrix coefficients! So the results of this section go back to 1947.

For a matrix polynomial $P_n$ of degree $n$, we define the reversed polynomial $P_n^*$ by
$$P_n^*(z) = z^n P_n(1/\bar z)^\dagger. \eqno(3.2)$$
Notice that
$$(P_n^*)^* = P_n \eqno(3.3)$$
and, for any $\alpha \in \mathcal{M}_l$,
$$(\alpha P_n)^* = P_n^*\alpha^\dagger, \qquad (P_n\alpha)^* = \alpha^\dagger P_n^*. \eqno(3.4)$$

Lemma 3.1. We have
$$\langle\langle f, g\rangle\rangle_L = \langle\langle g, f\rangle\rangle_L^\dagger, \qquad \langle\langle f, g\rangle\rangle_R = \langle\langle g, f\rangle\rangle_R^\dagger \eqno(3.5)$$
and
$$\langle\langle f^*, g^*\rangle\rangle_L = \langle\langle f, g\rangle\rangle_R^\dagger, \qquad \langle\langle f^*, g^*\rangle\rangle_R = \langle\langle f, g\rangle\rangle_L^\dagger. \eqno(3.6)$$

Proof. The first and second identities follow immediately from the definition. The third identity is derived as follows:

$$\begin{aligned}
\langle\langle f^*, g^*\rangle\rangle_L &= \int e^{in\theta} g(\theta)^\dagger\, d\mu(\theta)\, (e^{in\theta} f(\theta)^\dagger)^\dagger \\
&= \int e^{in\theta} g(\theta)^\dagger\, d\mu(\theta)\, e^{-in\theta} f(\theta) \\
&= \int g(\theta)^\dagger\, d\mu(\theta)\, f(\theta) \\
&= \langle\langle g, f\rangle\rangle_R \\
&= \langle\langle f, g\rangle\rangle_R^\dagger.
\end{aligned}$$
The proof of the last identity is analogous. $\square$

Lemma 3.2. If $P_n$ has degree $n$ and is left-orthogonal with respect to $z1, \dots, z^n 1$, then $P_n = c\,(\Phi_n^R)^*$ for some suitable matrix $c$.

Proof. By assumption,
$$0 = \langle\langle P_n, z^j 1\rangle\rangle_L = \langle\langle (z^j 1)^*, P_n^*\rangle\rangle_R = \langle\langle z^{n-j} 1, P_n^*\rangle\rangle_R \quad \text{for } 1 \le j \le n.$$
Thus, $P_n^*$ is right-orthogonal with respect to $1, z1, \dots, z^{n-1}1$ and hence it is a right-multiple of $\Phi_n^R$. Consequently, $P_n$ is a left-multiple of $(\Phi_n^R)^*$. $\square$

Let us define normalized polynomials by
$$\varphi_0^L = \varphi_0^R = 1, \qquad \varphi_n^L = \kappa_n^L \Phi_n^L, \qquad \varphi_n^R = \Phi_n^R \kappa_n^R,$$
where the $\kappa$'s are defined according to the normalization condition
$$\langle\langle \varphi_n^R, \varphi_m^R\rangle\rangle_R = \delta_{nm} 1, \qquad \langle\langle \varphi_n^L, \varphi_m^L\rangle\rangle_L = \delta_{nm} 1,$$
along with (a type 1 condition)
$$\kappa_{n+1}^L (\kappa_n^L)^{-1} > 0 \quad \text{and} \quad (\kappa_n^R)^{-1} \kappa_{n+1}^R > 0. \eqno(3.7)$$
Notice that the $\kappa_n^L$ are determined by the normalization condition up to multiplication on the left by unitaries; these unitaries can always be uniquely chosen so as to satisfy (3.7). Now define
$$\rho_n^L = \kappa_n^L (\kappa_{n+1}^L)^{-1} \quad \text{and} \quad \rho_n^R = (\kappa_{n+1}^R)^{-1} \kappa_n^R.$$
Notice that, as inverses of positive matrices, $\rho_n^L > 0$ and $\rho_n^R > 0$. In particular, we have that
$$\kappa_n^L = (\rho_{n-1}^L \cdots \rho_0^L)^{-1} \quad \text{and} \quad \kappa_n^R = (\rho_0^R \cdots \rho_{n-1}^R)^{-1}.$$

Theorem 3.3 (Szegő Recursion). (a) For suitable matrices $\alpha_n^{L,R}$, one has
$$z\varphi_n^L - \rho_n^L \varphi_{n+1}^L = (\alpha_n^L)^\dagger \varphi_n^{R,*}, \eqno(3.8)$$
$$z\varphi_n^R - \varphi_{n+1}^R \rho_n^R = \varphi_n^{L,*} (\alpha_n^R)^\dagger. \eqno(3.9)$$

(b) The matrices $\alpha_n^L$ and $\alpha_n^R$ are equal and will henceforth be denoted by $\alpha_n$.

(c) $\rho_n^L = (1 - \alpha_n^\dagger \alpha_n)^{1/2}$ and $\rho_n^R = (1 - \alpha_n \alpha_n^\dagger)^{1/2}$.

Proof. (a) The matrix polynomial $z\varphi_n^L$ has leading term $z^{n+1}\kappa_n^L$. On the other hand, the matrix polynomial $\rho_n^L \varphi_{n+1}^L$ has leading term $z^{n+1}\rho_n^L \kappa_{n+1}^L$. By the definition of $\rho_n^L$, these terms are equal. Consequently, $z\varphi_n^L - \rho_n^L \varphi_{n+1}^L$ is a matrix polynomial of degree at most $n$. Notice that it is left-orthogonal with respect to $z1, \dots, z^n 1$ since
$$\langle\langle z\varphi_n^L - \rho_n^L \varphi_{n+1}^L, z^j 1\rangle\rangle_L = \langle\langle \varphi_n^L, z^{j-1} 1\rangle\rangle_L - \langle\langle \rho_n^L \varphi_{n+1}^L, z^j 1\rangle\rangle_L = 0 - 0 = 0.$$
Now apply Lemma 3.2. The other claim is proved in the same way.

(b) By part (a) and identities established earlier,
$$\begin{aligned}
(\alpha_n^L)^\dagger &= 0 + (\alpha_n^L)^\dagger\, 1 \\
&= \langle\langle \varphi_n^{R,*}, \rho_n^L \varphi_{n+1}^L\rangle\rangle_L + (\alpha_n^L)^\dagger \langle\langle \varphi_n^R, \varphi_n^R\rangle\rangle_R \\
&= \langle\langle \varphi_n^{R,*}, \rho_n^L \varphi_{n+1}^L\rangle\rangle_L + (\alpha_n^L)^\dagger \langle\langle \varphi_n^{R,*}, \varphi_n^{R,*}\rangle\rangle_L && \text{by (3.6)} \\
&= \langle\langle \varphi_n^{R,*}, \rho_n^L \varphi_{n+1}^L + (\alpha_n^L)^\dagger \varphi_n^{R,*}\rangle\rangle_L \\
&= \langle\langle \varphi_n^{R,*}, z\varphi_n^L\rangle\rangle_L \\
&= \langle\langle z\varphi_n^R, \varphi_n^{L,*}\rangle\rangle_R^\dagger && \text{(using the $(n+1)$-degree $*$)} \\
&= \langle\langle \varphi_{n+1}^R \rho_n^R + \varphi_n^{L,*}(\alpha_n^R)^\dagger, \varphi_n^{L,*}\rangle\rangle_R^\dagger \\
&= \langle\langle \varphi_{n+1}^R \rho_n^R, \varphi_n^{L,*}\rangle\rangle_R^\dagger + \langle\langle \varphi_n^{L,*}(\alpha_n^R)^\dagger, \varphi_n^{L,*}\rangle\rangle_R^\dagger \\
&= 0 + \langle\langle \varphi_n^{L,*}, \varphi_n^{L,*}(\alpha_n^R)^\dagger\rangle\rangle_R \\
&= \langle\langle \varphi_n^{L,*}, \varphi_n^{L,*}\rangle\rangle_R\, (\alpha_n^R)^\dagger \\
&= \langle\langle \varphi_n^L, \varphi_n^L\rangle\rangle_L\, (\alpha_n^R)^\dagger \\
&= (\alpha_n^R)^\dagger.
\end{aligned}$$

(c) Using parts (a) and (b), we see that
$$\begin{aligned}
1 &= \langle\langle z\varphi_n^L, z\varphi_n^L\rangle\rangle_L \\
&= \langle\langle \rho_n^L \varphi_{n+1}^L + \alpha_n^\dagger \varphi_n^{R,*}, \rho_n^L \varphi_{n+1}^L + \alpha_n^\dagger \varphi_n^{R,*}\rangle\rangle_L \\
&= \rho_n^L \langle\langle \varphi_{n+1}^L, \varphi_{n+1}^L\rangle\rangle_L (\rho_n^L)^\dagger + \alpha_n^\dagger \langle\langle \varphi_n^{R,*}, \varphi_n^{R,*}\rangle\rangle_L \alpha_n \\
&= (\rho_n^L)^2 + \alpha_n^\dagger \langle\langle \varphi_n^R, \varphi_n^R\rangle\rangle_R \alpha_n \\
&= (\rho_n^L)^2 + \alpha_n^\dagger \alpha_n.
\end{aligned}$$
A similar calculation yields the other claim. $\square$

The matrices $\alpha_n$ will henceforth be called the Verblunsky coefficients associated with the measure $d\mu$. Since $\rho_n^L$ is invertible, we have
$$\|\alpha_n\| < 1. \eqno(3.10)$$
We will eventually see (Theorem 3.12) that any set of $\alpha_n$'s obeying (3.10) occurs as the set of Verblunsky coefficients of a unique non-trivial measure.

Note that the Szegő recursion for the monic orthogonal polynomials is
$$z\Phi_n^L - \Phi_{n+1}^L = (\kappa_n^L)^{-1} \alpha_n^\dagger (\kappa_n^R)^\dagger \Phi_n^{R,*}, \qquad z\Phi_n^R - \Phi_{n+1}^R = \Phi_n^{L,*} (\kappa_n^L)^\dagger \alpha_n^\dagger (\kappa_n^R)^{-1}, \eqno(3.11)$$
so the coefficients in the $L$ and $R$ equations are not equal and depend on all the $\alpha_j$, $j = 1, \dots, n$.

Let us write the Szegő recursion in matrix form, starting with left-orthogonal polynomials. By Theorem 3.3,
$$\varphi_{n+1}^L = (\rho_n^L)^{-1}[z\varphi_n^L - \alpha_n^\dagger \varphi_n^{R,*}], \qquad \varphi_{n+1}^R = [z\varphi_n^R - \varphi_n^{L,*}\alpha_n^\dagger](\rho_n^R)^{-1},$$
which implies that
$$\varphi_{n+1}^L = z(\rho_n^L)^{-1}\varphi_n^L - (\rho_n^L)^{-1}\alpha_n^\dagger \varphi_n^{R,*}, \eqno(3.12)$$
$$\varphi_{n+1}^{R,*} = (\rho_n^R)^{-1}\varphi_n^{R,*} - z(\rho_n^R)^{-1}\alpha_n \varphi_n^L. \eqno(3.13)$$
In other words,
$$\begin{pmatrix} \varphi_{n+1}^L \\ \varphi_{n+1}^{R,*} \end{pmatrix} = A^L(\alpha_n, z) \begin{pmatrix} \varphi_n^L \\ \varphi_n^{R,*} \end{pmatrix} \eqno(3.14)$$
where
$$A^L(\alpha, z) = \begin{pmatrix} z(\rho^L)^{-1} & -(\rho^L)^{-1}\alpha^\dagger \\ -z(\rho^R)^{-1}\alpha & (\rho^R)^{-1} \end{pmatrix}$$
and $\rho^L = (1 - \alpha^\dagger\alpha)^{1/2}$, $\rho^R = (1 - \alpha\alpha^\dagger)^{1/2}$. Note that, for $z \ne 0$, the inverse of $A^L(\alpha, z)$ is given by
$$A^L(\alpha, z)^{-1} = \begin{pmatrix} z^{-1}(\rho^L)^{-1} & z^{-1}(\rho^L)^{-1}\alpha^\dagger \\ (\rho^R)^{-1}\alpha & (\rho^R)^{-1} \end{pmatrix}$$
which gives rise to the inverse Szegő recursion (first emphasized in the scalar and matrix cases by Delsarte et al. [37])
$$\varphi_n^L = z^{-1}(\rho_n^L)^{-1}\varphi_{n+1}^L + z^{-1}(\rho_n^L)^{-1}\alpha_n^\dagger \varphi_{n+1}^{R,*},$$
$$\varphi_n^{R,*} = (\rho_n^R)^{-1}\alpha_n \varphi_{n+1}^L + (\rho_n^R)^{-1}\varphi_{n+1}^{R,*}.$$
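The algebra behind the transfer matrix $A^L(\alpha, z)$ and its stated inverse can be verified numerically. In this sketch (our own illustration; $\alpha$ and $z$ are arbitrary test values), the square roots $\rho^{L,R}$ are computed by diagonalizing the Hermitian matrices $1 - \alpha^\dagger\alpha$ and $1 - \alpha\alpha^\dagger$.

```python
import numpy as np

# Build A^L(alpha, z) from (3.14) and check the displayed inverse.
rng = np.random.default_rng(1)
l = 2
M = rng.standard_normal((l, l)) + 1j * rng.standard_normal((l, l))
alpha = 0.4 * M / np.linalg.norm(M, 2)       # strict contraction, ||alpha|| < 1
z = np.exp(1j * 0.7)                         # a point on the unit circle

def herm_sqrt_inv(H):
    """Inverse square root of a positive Hermitian matrix."""
    w, V = np.linalg.eigh(H)
    return V @ np.diag(w ** -0.5) @ V.conj().T

I = np.eye(l)
rL_inv = herm_sqrt_inv(I - alpha.conj().T @ alpha)   # (rho^L)^{-1}
rR_inv = herm_sqrt_inv(I - alpha @ alpha.conj().T)   # (rho^R)^{-1}

AL = np.block([[z * rL_inv, -rL_inv @ alpha.conj().T],
               [-z * rR_inv @ alpha, rR_inv]])
AL_inv = np.block([[rL_inv / z, rL_inv @ alpha.conj().T / z],
                   [rR_inv @ alpha, rR_inv]])
print(np.allclose(AL @ AL_inv, np.eye(2 * l)))
```

The check implicitly uses the intertwining relation $\alpha\,\rho^L = \rho^R\alpha$, which is what makes the off-diagonal blocks of the product cancel.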

For right-orthogonal polynomials, we find the following matrix formulas. By Theorem 3.3,
$$\varphi_{n+1}^R = z\varphi_n^R(\rho_n^R)^{-1} - \varphi_n^{L,*}\alpha_n^\dagger(\rho_n^R)^{-1}, \eqno(3.15)$$
$$\varphi_{n+1}^{L,*} = \varphi_n^{L,*}(\rho_n^L)^{-1} - z\varphi_n^R \alpha_n (\rho_n^L)^{-1}. \eqno(3.16)$$
In other words,
$$\begin{pmatrix} \varphi_{n+1}^R & \varphi_{n+1}^{L,*} \end{pmatrix} = \begin{pmatrix} \varphi_n^R & \varphi_n^{L,*} \end{pmatrix} A^R(\alpha_n, z)$$
where
$$A^R(\alpha, z) = \begin{pmatrix} z(\rho^R)^{-1} & -z\alpha(\rho^L)^{-1} \\ -\alpha^\dagger(\rho^R)^{-1} & (\rho^L)^{-1} \end{pmatrix}.$$
For $z \ne 0$, the inverse of $A^R(\alpha, z)$ is given by
$$A^R(\alpha, z)^{-1} = \begin{pmatrix} z^{-1}(\rho^R)^{-1} & (\rho^R)^{-1}\alpha \\ z^{-1}(\rho^L)^{-1}\alpha^\dagger & (\rho^L)^{-1} \end{pmatrix}$$
and hence
$$\varphi_n^R = z^{-1}\varphi_{n+1}^R(\rho_n^R)^{-1} + z^{-1}\varphi_{n+1}^{L,*}(\rho_n^L)^{-1}\alpha_n^\dagger, \eqno(3.17)$$
$$\varphi_n^{L,*} = \varphi_{n+1}^R(\rho_n^R)^{-1}\alpha_n + \varphi_{n+1}^{L,*}(\rho_n^L)^{-1}. \eqno(3.18)$$

3.3. Second Kind Polynomials. In the scalar case, second kind polynomials go back to Geronimus [89, 91, 92]. For $n \ge 1$, let us introduce the second kind polynomials $\psi_n^{L,R}$ by
$$\psi_n^L(z) = \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\, (\varphi_n^L(e^{i\theta}) - \varphi_n^L(z))\, d\mu(\theta), \eqno(3.19)$$
$$\psi_n^R(z) = \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\, d\mu(\theta)\, (\varphi_n^R(e^{i\theta}) - \varphi_n^R(z)). \eqno(3.20)$$
For $n = 0$, let us set $\psi_0^L(z) = \psi_0^R(z) = 1$. For future reference, let us display the first polynomials of each series:
$$\varphi_1^L(z) = (\rho_0^L)^{-1}(z - \alpha_0^\dagger), \qquad \varphi_1^R(z) = (z - \alpha_0^\dagger)(\rho_0^R)^{-1}, \eqno(3.21)$$
$$\psi_1^L(z) = (\rho_0^L)^{-1}(z + \alpha_0^\dagger), \qquad \psi_1^R(z) = (z + \alpha_0^\dagger)(\rho_0^R)^{-1}. \eqno(3.22)$$
We will also need formulas for $\psi_n^{L,*}$ and $\psi_n^{R,*}$, $n \ge 1$. These formulas follow directly from the above definition and from
$$\overline{\left(\frac{e^{i\theta} + 1/\bar z}{e^{i\theta} - 1/\bar z}\right)} = -\frac{e^{i\theta} + z}{e^{i\theta} - z}.$$
Indeed, we have
$$\psi_n^{L,*}(z) = z^n \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\, d\mu(\theta)\, (\varphi_n^L(1/\bar z)^\dagger - \varphi_n^L(e^{i\theta})^\dagger), \eqno(3.23)$$

$$\psi_n^{R,*}(z) = z^n \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\, (\varphi_n^R(1/\bar z)^\dagger - \varphi_n^R(e^{i\theta})^\dagger)\, d\mu(\theta). \eqno(3.24)$$

Proposition 3.4. The second kind polynomials obey the recurrence relations
$$\psi_{n+1}^L(z) = (\rho_n^L)^{-1}(z\psi_n^L(z) + \alpha_n^\dagger \psi_n^{R,*}(z)), \eqno(3.25)$$
$$\psi_{n+1}^{R,*}(z) = (\rho_n^R)^{-1}(z\alpha_n \psi_n^L(z) + \psi_n^{R,*}(z)), \eqno(3.26)$$
and
$$\psi_{n+1}^R(z) = (z\psi_n^R(z) + \psi_n^{L,*}(z)\alpha_n^\dagger)(\rho_n^R)^{-1}, \eqno(3.27)$$
$$\psi_{n+1}^{L,*}(z) = (\psi_n^{L,*}(z) + z\psi_n^R(z)\alpha_n)(\rho_n^L)^{-1} \eqno(3.28)$$
for $n \ge 0$.

Proof. 1. Let us check (3.25) for $n \ge 1$. Denote the right-hand side of (3.25) by $\tilde\psi_{n+1}^L(z)$. Using the recurrence relations for $\varphi_n^L$, $\varphi_n^{R,*}$ and the definition (3.19) of $\psi_n^L$, we find
$$\psi_{n+1}^L(z) - \tilde\psi_{n+1}^L(z) = \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\, A_n(\theta, z)\, d\mu(\theta)$$
where
$$\begin{aligned}
A_n(\theta, z) &= \varphi_{n+1}^L(e^{i\theta}) - \varphi_{n+1}^L(z) - (\rho_n^L)^{-1}[z\varphi_n^L(e^{i\theta}) - z\varphi_n^L(z) + \alpha_n^\dagger z^n \varphi_n^R(1/\bar z)^\dagger - \alpha_n^\dagger z^n \varphi_n^R(e^{i\theta})^\dagger] \\
&= (\rho_n^L)^{-1}[e^{i\theta}\varphi_n^L(e^{i\theta}) - \alpha_n^\dagger \varphi_n^{R,*}(e^{i\theta}) - z\varphi_n^L(z) + \alpha_n^\dagger \varphi_n^{R,*}(z) \\
&\qquad\qquad - z\varphi_n^L(e^{i\theta}) + z\varphi_n^L(z) - \alpha_n^\dagger \varphi_n^{R,*}(z) + \alpha_n^\dagger z^n \varphi_n^R(e^{i\theta})^\dagger] \\
&= (\rho_n^L)^{-1}[(e^{i\theta} - z)\varphi_n^L(e^{i\theta}) + \alpha_n^\dagger (z^n e^{-in\theta} - 1)\varphi_n^{R,*}(e^{i\theta})].
\end{aligned}$$
Using the orthogonality relations
$$\int \varphi_n^L(e^{i\theta})\, d\mu(\theta)\, e^{-im\theta} = \int \varphi_n^{R,*}(e^{i\theta})\, d\mu(\theta)\, e^{-i(m+1)\theta} = 0, \qquad m = 0, 1, \dots, n-1, \eqno(3.29)$$
and the formula
$$\frac{e^{in\theta} - z^n}{e^{i\theta} - z} = e^{i(n-1)\theta} + e^{i(n-2)\theta}z + \cdots + z^{n-1},$$
we obtain
$$\rho_n^L \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\, A_n(\theta, z)\, d\mu(\theta) = \int (e^{i\theta}\varphi_n^L(e^{i\theta}) - \alpha_n^\dagger \varphi_n^{R,*}(e^{i\theta}))\, d\mu(\theta) = \rho_n^L \int \varphi_{n+1}^L(e^{i\theta})\, d\mu(\theta) = 0.$$

2. Let us check (3.26) for $n \ge 1$. Denote the right-hand side of (3.26) by $\tilde\psi_{n+1}^{R,*}(z)$. Similarly to the argument above, we find
$$\psi_{n+1}^{R,*}(z) - \tilde\psi_{n+1}^{R,*}(z) = \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\, B_n(\theta, z)\, d\mu(\theta),$$
where
$$\begin{aligned}
B_n(\theta, z) &= z^{n+1}\varphi_{n+1}^R(1/\bar z)^\dagger - z^{n+1}\varphi_{n+1}^R(e^{i\theta})^\dagger - (\rho_n^R)^{-1}[z\alpha_n \varphi_n^L(e^{i\theta}) - z\alpha_n \varphi_n^L(z) + z^n \varphi_n^R(1/\bar z)^\dagger - z^n \varphi_n^R(e^{i\theta})^\dagger] \\
&= (\rho_n^R)^{-1}[\varphi_n^{R,*}(z) - z\alpha_n \varphi_n^L(z) - z^{n+1}e^{-i(n+1)\theta}\varphi_n^{R,*}(e^{i\theta}) + z^{n+1}e^{-in\theta}\alpha_n \varphi_n^L(e^{i\theta}) \\
&\qquad\qquad - z\alpha_n \varphi_n^L(e^{i\theta}) + z\alpha_n \varphi_n^L(z) - \varphi_n^{R,*}(z) + z^n e^{-in\theta}\varphi_n^{R,*}(e^{i\theta})] \\
&= (\rho_n^R)^{-1}[z\alpha_n (z^n e^{-in\theta} - 1)\varphi_n^L(e^{i\theta}) + z^n e^{-in\theta}(1 - ze^{-i\theta})\varphi_n^{R,*}(e^{i\theta})].
\end{aligned}$$
Using the orthogonality relations (3.29), we get
$$\rho_n^R \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\, B_n(\theta, z)\, d\mu(\theta) = z^{n+1} \int (e^{-i(n+1)\theta}\varphi_n^{R,*}(e^{i\theta}) - \alpha_n e^{-in\theta}\varphi_n^L(e^{i\theta}))\, d\mu(\theta) = z^{n+1} \rho_n^R \int e^{-i(n+1)\theta}\varphi_{n+1}^{R,*}(e^{i\theta})\, d\mu(\theta) = 0.$$

3. Relations (3.25) and (3.26) can be checked for $n = 0$ by a direct substitution of (3.21) and (3.22).

4. We obtain (3.27) and (3.28) from (3.25) and (3.26) by applying the $*$-operation. $\square$

Writing the above recursion in matrix form, we get
$$\begin{pmatrix} \psi_{n+1}^L \\ \psi_{n+1}^{R,*} \end{pmatrix} = A^L(-\alpha_n, z) \begin{pmatrix} \psi_n^L \\ \psi_n^{R,*} \end{pmatrix}, \qquad \begin{pmatrix} \psi_0^L \\ \psi_0^{R,*} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$
for left-orthogonal polynomials and
$$\begin{pmatrix} \psi_{n+1}^R & \psi_{n+1}^{L,*} \end{pmatrix} = \begin{pmatrix} \psi_n^R & \psi_n^{L,*} \end{pmatrix} A^R(-\alpha_n, z), \qquad \begin{pmatrix} \psi_0^R & \psi_0^{L,*} \end{pmatrix} = \begin{pmatrix} 1 & 1 \end{pmatrix}$$
for right-orthogonal polynomials. Equivalently,
$$\begin{pmatrix} \psi_{n+1}^L \\ -\psi_{n+1}^{R,*} \end{pmatrix} = A^L(\alpha_n, z) \begin{pmatrix} \psi_n^L \\ -\psi_n^{R,*} \end{pmatrix}, \qquad \begin{pmatrix} \psi_0^L \\ -\psi_0^{R,*} \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix} \eqno(3.30)$$
and

$$\begin{pmatrix} \psi_{n+1}^R & -\psi_{n+1}^{L,*} \end{pmatrix} = \begin{pmatrix} \psi_n^R & -\psi_n^{L,*} \end{pmatrix} A^R(\alpha_n, z), \qquad \begin{pmatrix} \psi_0^R & -\psi_0^{L,*} \end{pmatrix} = \begin{pmatrix} 1 & -1 \end{pmatrix}.$$

In particular, we see that the second kind polynomials $\psi_n^{L,R}$ correspond to Verblunsky coefficients $\{-\alpha_n\}$. We have the following Wronskian-type relations:

Proposition 3.5. For $n \ge 0$ and $z \in \mathbb{C}$, we have
$$2z^n 1 = \varphi_n^L(z)\psi_n^{L,*}(z) + \psi_n^L(z)\varphi_n^{L,*}(z), \eqno(3.31)$$
$$2z^n 1 = \psi_n^{R,*}(z)\varphi_n^R(z) + \varphi_n^{R,*}(z)\psi_n^R(z), \eqno(3.32)$$
$$0 = \varphi_n^L(z)\psi_n^R(z) - \psi_n^L(z)\varphi_n^R(z), \eqno(3.33)$$
$$0 = \psi_n^{R,*}(z)\varphi_n^{L,*}(z) - \varphi_n^{R,*}(z)\psi_n^{L,*}(z). \eqno(3.34)$$

Proof. We prove this by induction. The four identities clearly hold for $n = 0$. Suppose (3.31)–(3.34) hold for some $n \ge 0$. Then,
$$\begin{aligned}
\varphi_{n+1}^L \psi_{n+1}^{L,*} + \psi_{n+1}^L \varphi_{n+1}^{L,*} &= (\rho_n^L)^{-1}[(z\varphi_n^L - \alpha_n^\dagger \varphi_n^{R,*})(\psi_n^{L,*} + z\psi_n^R \alpha_n) + (z\psi_n^L + \alpha_n^\dagger \psi_n^{R,*})(\varphi_n^{L,*} - z\varphi_n^R \alpha_n)](\rho_n^L)^{-1} \\
&= (\rho_n^L)^{-1}[z(\varphi_n^L \psi_n^{L,*} + \psi_n^L \varphi_n^{L,*}) - z\alpha_n^\dagger(\psi_n^{R,*}\varphi_n^R + \varphi_n^{R,*}\psi_n^R)\alpha_n \\
&\qquad\qquad + \alpha_n^\dagger(\psi_n^{R,*}\varphi_n^{L,*} - \varphi_n^{R,*}\psi_n^{L,*}) + z^2(\varphi_n^L \psi_n^R - \psi_n^L \varphi_n^R)\alpha_n](\rho_n^L)^{-1} \\
&= (\rho_n^L)^{-1}[2z^{n+1}(1 - \alpha_n^\dagger \alpha_n)](\rho_n^L)^{-1} = 2z^{n+1} 1,
\end{aligned}$$
where we used (3.31)–(3.34) for $n$ in the third step. Thus, (3.31) holds for $n+1$. For (3.32), we note that
$$\begin{aligned}
\psi_{n+1}^{R,*}\varphi_{n+1}^R + \varphi_{n+1}^{R,*}\psi_{n+1}^R &= (\rho_n^R)^{-1}[(z\alpha_n \psi_n^L + \psi_n^{R,*})(z\varphi_n^R - \varphi_n^{L,*}\alpha_n^\dagger) + (\varphi_n^{R,*} - z\alpha_n \varphi_n^L)(z\psi_n^R + \psi_n^{L,*}\alpha_n^\dagger)](\rho_n^R)^{-1} \\
&= (\rho_n^R)^{-1}[z(\psi_n^{R,*}\varphi_n^R + \varphi_n^{R,*}\psi_n^R) - z\alpha_n(\psi_n^L \varphi_n^{L,*} + \varphi_n^L \psi_n^{L,*})\alpha_n^\dagger \\
&\qquad\qquad + z^2 \alpha_n(\psi_n^L \varphi_n^R - \varphi_n^L \psi_n^R) - (\psi_n^{R,*}\varphi_n^{L,*} - \varphi_n^{R,*}\psi_n^{L,*})\alpha_n^\dagger](\rho_n^R)^{-1} \\
&= (\rho_n^R)^{-1}\, 2z^{n+1}(1 - \alpha_n \alpha_n^\dagger)\, (\rho_n^R)^{-1} = 2z^{n+1} 1,
\end{aligned}$$
again using (3.31)–(3.34) for $n$ in the third step. Next,
$$\begin{aligned}
\varphi_{n+1}^L \psi_{n+1}^R - \psi_{n+1}^L \varphi_{n+1}^R &= (\rho_n^L)^{-1}[(z\varphi_n^L - \alpha_n^\dagger \varphi_n^{R,*})(z\psi_n^R + \psi_n^{L,*}\alpha_n^\dagger) - (z\psi_n^L + \alpha_n^\dagger \psi_n^{R,*})(z\varphi_n^R - \varphi_n^{L,*}\alpha_n^\dagger)](\rho_n^R)^{-1} \\
&= (\rho_n^L)^{-1}[z^2(\varphi_n^L \psi_n^R - \psi_n^L \varphi_n^R) - \alpha_n^\dagger(\varphi_n^{R,*}\psi_n^{L,*} - \psi_n^{R,*}\varphi_n^{L,*})\alpha_n^\dagger \\
&\qquad\qquad - z\alpha_n^\dagger(\varphi_n^{R,*}\psi_n^R + \psi_n^{R,*}\varphi_n^R) + z(\varphi_n^L \psi_n^{L,*} + \psi_n^L \varphi_n^{L,*})\alpha_n^\dagger](\rho_n^R)^{-1} \\
&= 0,
\end{aligned}$$
which implies first (3.33) for $n+1$ and then, by applying the $*$-operation of order $2n+2$, also (3.34) for $n+1$. This concludes the proof of the proposition. $\square$

3.4. Christoffel–Darboux Formulas.

Proposition 3.6. (a) (CD, left orthogonal)
$$(1 - \bar\xi z)\sum_{k=0}^n \varphi_k^L(\xi)^\dagger \varphi_k^L(z) = \varphi_n^{R,*}(\xi)^\dagger \varphi_n^{R,*}(z) - \bar\xi z\, \varphi_n^L(\xi)^\dagger \varphi_n^L(z) = \varphi_{n+1}^{R,*}(\xi)^\dagger \varphi_{n+1}^{R,*}(z) - \varphi_{n+1}^L(\xi)^\dagger \varphi_{n+1}^L(z).$$
(b) (CD, right orthogonal)
$$(1 - \bar\xi z)\sum_{k=0}^n \varphi_k^R(z)\varphi_k^R(\xi)^\dagger = \varphi_n^{L,*}(z)\varphi_n^{L,*}(\xi)^\dagger - \bar\xi z\, \varphi_n^R(z)\varphi_n^R(\xi)^\dagger = \varphi_{n+1}^{L,*}(z)\varphi_{n+1}^{L,*}(\xi)^\dagger - \varphi_{n+1}^R(z)\varphi_{n+1}^R(\xi)^\dagger.$$
(c) (Mixed CD, left orthogonal)
$$(1 - \bar\xi z)\sum_{k=0}^n \psi_k^L(\xi)^\dagger \varphi_k^L(z) = 2\cdot 1 - \psi_n^{R,*}(\xi)^\dagger \varphi_n^{R,*}(z) - \bar\xi z\, \psi_n^L(\xi)^\dagger \varphi_n^L(z) = 2\cdot 1 - \psi_{n+1}^{R,*}(\xi)^\dagger \varphi_{n+1}^{R,*}(z) - \psi_{n+1}^L(\xi)^\dagger \varphi_{n+1}^L(z).$$
(d) (Mixed CD, right orthogonal)
$$(1 - \bar\xi z)\sum_{k=0}^n \varphi_k^R(z)\psi_k^R(\xi)^\dagger = 2\cdot 1 - \varphi_n^{L,*}(z)\psi_n^{L,*}(\xi)^\dagger - \bar\xi z\, \varphi_n^R(z)\psi_n^R(\xi)^\dagger = 2\cdot 1 - \varphi_{n+1}^{L,*}(z)\psi_{n+1}^{L,*}(\xi)^\dagger - \varphi_{n+1}^R(z)\psi_{n+1}^R(\xi)^\dagger.$$

Remark. Since the $\psi$'s are themselves MOPUC, the analogue of (a) and (b), with all $\varphi$'s replaced by $\psi$'s, holds.

Proof. (a) Write
$$F_n^L(z) = \begin{pmatrix} \varphi_n^L(z) \\ \varphi_n^{R,*}(z) \end{pmatrix}, \qquad J = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \tilde J = \begin{pmatrix} \bar\xi z\, 1 & 0 \\ 0 & -1 \end{pmatrix}.$$
Then $F_{n+1}^L(z) = A^L(\alpha_n, z)F_n^L(z)$ and
$$\begin{aligned}
A^L(\alpha, \xi)^\dagger J A^L(\alpha, z) &= \begin{pmatrix} \bar\xi(\rho^L)^{-1} & -\bar\xi\alpha^\dagger(\rho^R)^{-1} \\ -\alpha(\rho^L)^{-1} & (\rho^R)^{-1} \end{pmatrix} J \begin{pmatrix} z(\rho^L)^{-1} & -(\rho^L)^{-1}\alpha^\dagger \\ -z(\rho^R)^{-1}\alpha & (\rho^R)^{-1} \end{pmatrix} \\
&= \begin{pmatrix} \bar\xi z[(\rho^L)^{-2} - \alpha^\dagger(\rho^R)^{-2}\alpha] & \bar\xi[-(\rho^L)^{-2}\alpha^\dagger + \alpha^\dagger(\rho^R)^{-2}] \\ z[-\alpha(\rho^L)^{-2} + (\rho^R)^{-2}\alpha] & \alpha(\rho^L)^{-2}\alpha^\dagger - (\rho^R)^{-2} \end{pmatrix} = \begin{pmatrix} \bar\xi z\, 1 & 0 \\ 0 & -1 \end{pmatrix} = \tilde J.
\end{aligned}$$
Thus,
$$F_{n+1}^L(\xi)^\dagger J F_{n+1}^L(z) = F_n^L(\xi)^\dagger A^L(\alpha_n,\xi)^\dagger J A^L(\alpha_n,z) F_n^L(z) = F_n^L(\xi)^\dagger \tilde J F_n^L(z)$$
and hence
$$\varphi_{n+1}^L(\xi)^\dagger \varphi_{n+1}^L(z) - \varphi_{n+1}^{R,*}(\xi)^\dagger \varphi_{n+1}^{R,*}(z) = \bar\xi z\, \varphi_n^L(\xi)^\dagger \varphi_n^L(z) - \varphi_n^{R,*}(\xi)^\dagger \varphi_n^{R,*}(z),$$
which shows that the last two expressions in (a) are equal. Denote their common value by $Q_n^L(z,\xi)$. Then,
$$Q_n^L(z,\xi) - Q_{n-1}^L(z,\xi) = \varphi_n^{R,*}(\xi)^\dagger \varphi_n^{R,*}(z) - \bar\xi z\, \varphi_n^L(\xi)^\dagger \varphi_n^L(z) - \varphi_n^{R,*}(\xi)^\dagger \varphi_n^{R,*}(z) + \varphi_n^L(\xi)^\dagger \varphi_n^L(z) = (1 - \bar\xi z)\varphi_n^L(\xi)^\dagger \varphi_n^L(z).$$
Summing over $n$ completes the proof since $Q_{-1}^L(z,\xi) = 0$.

(b) The proof is analogous to (a): write $F_n^R(z) = \begin{pmatrix} \varphi_n^R(z) & \varphi_n^{L,*}(z) \end{pmatrix}$. Then $F_{n+1}^R(z) = F_n^R(z)A^R(\alpha_n,z)$ and $A^R(\alpha,z)JA^R(\alpha,\xi)^\dagger = \tilde J$. Thus,
$$F_{n+1}^R(z)JF_{n+1}^R(\xi)^\dagger = F_n^R(z)A^R(\alpha_n,z)JA^R(\alpha_n,\xi)^\dagger F_n^R(\xi)^\dagger = F_n^R(z)\tilde J F_n^R(\xi)^\dagger$$
and hence
$$\varphi_{n+1}^R(z)\varphi_{n+1}^R(\xi)^\dagger - \varphi_{n+1}^{L,*}(z)\varphi_{n+1}^{L,*}(\xi)^\dagger = \bar\xi z\,\varphi_n^R(z)\varphi_n^R(\xi)^\dagger - \varphi_n^{L,*}(z)\varphi_n^{L,*}(\xi)^\dagger,$$
which shows that the last two expressions in (b) are equal. Denote their common value by $Q_n^R(z,\xi)$. Then,
$$Q_n^R(z,\xi) - Q_{n-1}^R(z,\xi) = (1 - \bar\xi z)\varphi_n^R(z)\varphi_n^R(\xi)^\dagger$$
and the assertion follows as before.

(c) Write
$$\tilde F_n^L(z) = \begin{pmatrix} \psi_n^L(z) \\ -\psi_n^{R,*}(z) \end{pmatrix}$$
with the second kind polynomials $\psi_n^{L,R}$. As in (a), we see that
$$\tilde F_{n+1}^L(\xi)^\dagger J F_{n+1}^L(z) = \tilde F_n^L(\xi)^\dagger A^L(\alpha_n,\xi)^\dagger J A^L(\alpha_n,z) F_n^L(z) = \tilde F_n^L(\xi)^\dagger \tilde J F_n^L(z)$$
and hence
$$\psi_{n+1}^L(\xi)^\dagger \varphi_{n+1}^L(z) + \psi_{n+1}^{R,*}(\xi)^\dagger \varphi_{n+1}^{R,*}(z) = \bar\xi z\, \psi_n^L(\xi)^\dagger \varphi_n^L(z) + \psi_n^{R,*}(\xi)^\dagger \varphi_n^{R,*}(z).$$
Denote
$$\tilde Q_n^L(z,\xi) = 2\cdot 1 - \psi_{n+1}^{R,*}(\xi)^\dagger \varphi_{n+1}^{R,*}(z) - \psi_{n+1}^L(\xi)^\dagger \varphi_{n+1}^L(z).$$
Then,
$$\tilde Q_n^L(z,\xi) - \tilde Q_{n-1}^L(z,\xi) = -\psi_n^{R,*}(\xi)^\dagger \varphi_n^{R,*}(z) - \bar\xi z\, \psi_n^L(\xi)^\dagger \varphi_n^L(z) + \psi_n^{R,*}(\xi)^\dagger \varphi_n^{R,*}(z) + \psi_n^L(\xi)^\dagger \varphi_n^L(z) = (1 - \bar\xi z)\psi_n^L(\xi)^\dagger \varphi_n^L(z)$$
and the assertion follows as before.

(d) Write $\tilde F_n^R(z) = \begin{pmatrix} \psi_n^R(z) & -\psi_n^{L,*}(z) \end{pmatrix}$. As in (b), we see that
$$F_{n+1}^R(z)J\tilde F_{n+1}^R(\xi)^\dagger = F_n^R(z)A^R(\alpha_n,z)JA^R(\alpha_n,\xi)^\dagger \tilde F_n^R(\xi)^\dagger = F_n^R(z)\tilde J \tilde F_n^R(\xi)^\dagger$$
and hence
$$\varphi_{n+1}^R(z)\psi_{n+1}^R(\xi)^\dagger + \varphi_{n+1}^{L,*}(z)\psi_{n+1}^{L,*}(\xi)^\dagger = \bar\xi z\, \varphi_n^R(z)\psi_n^R(\xi)^\dagger + \varphi_n^{L,*}(z)\psi_n^{L,*}(\xi)^\dagger.$$
With $\tilde Q_n^R(z,\xi) = 2\cdot 1 - \varphi_{n+1}^{L,*}(z)\psi_{n+1}^{L,*}(\xi)^\dagger - \varphi_{n+1}^R(z)\psi_{n+1}^R(\xi)^\dagger$, we have
$$\tilde Q_n^R(z,\xi) - \tilde Q_{n-1}^R(z,\xi) = (1 - \bar\xi z)\varphi_n^R(z)\psi_n^R(\xi)^\dagger$$
and we conclude as in (c). $\square$

3.5. Zeros of MOPUC. Our main result in this section is:

Theorem 3.7. All the zeros of $\det(\varphi_n^R(z))$ lie in $\mathbb{D} = \{z : |z| < 1\}$.

We will also prove:

Theorem 3.8. For each $n$,
$$\det(\varphi_n^R(z)) = \det(\varphi_n^L(z)). \eqno(3.35)$$

The scalar analogue of Theorem 3.7 has seven proofs in [167]! The simplest is due to Landau [135], and its MOPUC analogue is Theorem 2.13.7 of [167]. There is also a proof in Delsarte et al. [37], who attribute the theorem to Whittle [194]. We provide two more proofs here, not only for their intrinsic interest: we need our first proof because it depends only on the recursion relation (it is related to the proof of Delsarte et al. [37]). The second proof is here since it relates zeros to eigenvalues of a cutoff CMV matrix.

Theorem 3.9. We have
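Theorem 3.7 is easy to observe numerically in the scalar case $l = 1$, where the Szegő recursion for the monic polynomials reads $\Phi_{n+1}(z) = z\Phi_n(z) - \bar\alpha_n\Phi_n^*(z)$. This is only an illustrative sketch; the Verblunsky coefficients below are arbitrary test values.

```python
import numpy as np

# Scalar (l = 1) check: with |alpha_j| < 1, all zeros of the monic
# OPUC Phi_n lie in the open unit disk.
alphas = [0.5, -0.3 + 0.4j, 0.2j, -0.6]

Phi = np.array([1.0 + 0j])            # Phi_0 = 1, coefficients high -> low
for a in alphas:
    Phi_star = np.conj(Phi[::-1])     # Phi_n^*(z) = z^n conj(Phi_n(1/conj(z)))
    zPhi = np.append(Phi, 0.0)        # multiply Phi_n by z
    Phi = zPhi - np.conj(a) * np.append([0.0], Phi_star)

print(np.max(np.abs(np.roots(Phi))) < 1.0)
```

The coefficient-array manipulations mirror the recursion exactly: appending a trailing zero multiplies by $z$, and conjugate reversal implements the $*$ map.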

(i) For $z \in \partial\mathbb{D}$, all of $\varphi_n^{R,*}(z)$, $\varphi_n^{L,*}(z)$, $\varphi_n^R(z)$, $\varphi_n^L(z)$ are invertible.
(ii) For $z \in \partial\mathbb{D}$, $\varphi_n^L(z)(\varphi_n^{R,*}(z))^{-1}$ and $(\varphi_n^{L,*}(z))^{-1}\varphi_n^R(z)$ are unitary.
(iii) For $z \in \mathbb{D}$, $\varphi_n^{R,*}(z)$ and $\varphi_n^{L,*}(z)$ are invertible.
(iv) For $z \in \mathbb{D}$, $\varphi_n^L(z)(\varphi_n^{R,*}(z))^{-1}$ and $(\varphi_n^{L,*}(z))^{-1}\varphi_n^R(z)$ are of norm at most $1$ and, for $n \ge 1$, strictly less than $1$.
(v) All zeros of $\det(\varphi_n^{R,*}(z))$ and $\det(\varphi_n^{L,*}(z))$ lie in $\mathbb{C} \setminus \overline{\mathbb{D}}$.
(vi) All zeros of $\det(\varphi_n^R(z))$ and $\det(\varphi_n^L(z))$ lie in $\mathbb{D}$.

Remark. (vi) is our first proof of Theorem 3.7.

Proof. All these results are trivial for $n = 0$, so we can hope to use an inductive argument. So suppose we have the result for $n - 1$. By (3.13),
$$\varphi_n^{R,*} = (\rho_{n-1}^R)^{-1}(1 - z\alpha_{n-1}\varphi_{n-1}^L(\varphi_{n-1}^{R,*})^{-1})\varphi_{n-1}^{R,*}. \eqno(3.36)$$
Since $\|\alpha_{n-1}\| < 1$, if $|z| \le 1$, each factor on the right of (3.36) is invertible. This proves (i) and (iii) for $\varphi_n^{R,*}$, and a similar argument works for $\varphi_n^{L,*}$. If $z = e^{i\theta}$, $\varphi_n^R(e^{i\theta}) = e^{in\theta}\varphi_n^{R,*}(e^{i\theta})^\dagger$ is also invertible, so we have (i) and (iii) for $n$.

Next, we claim that if $z \in \partial\mathbb{D}$, then
$$\varphi_n^{R,*}(z)^\dagger \varphi_n^{R,*}(z) = \varphi_n^L(z)^\dagger \varphi_n^L(z). \eqno(3.37)$$
This follows from taking $z = \xi \in \partial\mathbb{D}$ in Proposition 3.6(a). Given that $\varphi_n^{R,*}(z)$ is invertible, this implies
$$1 = (\varphi_n^L(z)\varphi_n^{R,*}(z)^{-1})^\dagger (\varphi_n^L(z)\varphi_n^{R,*}(z)^{-1}), \eqno(3.38)$$
proving the first part of (ii) for $n$. The second part of (ii) is proven similarly by using Proposition 3.6(b). For $z \in \mathbb{D}$, let
$$F(z) = \varphi_n^L(z)\varphi_n^{R,*}(z)^{-1}.$$
Then $F$ is analytic in $\mathbb{D}$, continuous in $\overline{\mathbb{D}}$, and $\|F(z)\| = 1$ on $\partial\mathbb{D}$, so (iv) follows from the maximum principle. Since $\varphi_n^{R,*}(z)$ is invertible on $\overline{\mathbb{D}}$, its determinant is non-zero there, proving (v). (vi) then follows from
$$\det(\varphi_n^R(z)) = z^{nl}\, \overline{\det(\varphi_n^{R,*}(1/\bar z))}. \eqno(3.39)$$
$\square$

Let $\mathcal{V}$ be the $\mathbb{C}^l$-valued functions on $\partial\mathbb{D}$ and $\mathcal{V}_n$ the span of the $\mathbb{C}^l$-valued polynomials of degree at most $n$, so $\dim(\mathcal{V}_n) = l(n+1)$. Let $\mathcal{V}_\infty$ be the set $\cup_n \mathcal{V}_n$ of all $\mathbb{C}^l$-valued polynomials. Let $\pi_n$ be the projection onto $\mathcal{V}_n$ in the inner product (1.35).

It is easy to see that
$$\mathcal{V}_n \cap \mathcal{V}_{n-1}^\perp = \{\Phi_n^R(z)v : v \in \mathbb{C}^l\} \eqno(3.40)$$
since $\langle z^j, \Phi_n^R(z)v\rangle = 0$ for $j = 0, \dots, n-1$ and the dimensions of the left and right sides of (3.40) coincide. $\mathcal{V}_n \cap \mathcal{V}_{n-1}^\perp$ can also be described as the set of $(v^\dagger \Phi_n^L(z))^\dagger$ for $v \in \mathbb{C}^l$.

We define $M_z\colon \mathcal{V}_{n-1} \to \mathcal{V}_n$ or $\mathcal{V}_\infty \to \mathcal{V}_\infty$ as the operator of multiplication by $z$.

Theorem 3.10. For all $n$, we have
$$\det{}_{\mathbb{C}^l}(\Phi_n^R(z)) = \det{}_{\mathcal{V}_{n-1}}(z1 - \pi_{n-1}M_z\pi_{n-1}). \eqno(3.41)$$

Remarks. 1. Since $\|M_z\| \le 1$, (3.41) immediately implies that the zeros of $\det(\varphi_n^R(z))$ lie in $\overline{\mathbb{D}}$, and a small additional argument proves they lie in $\mathbb{D}$. As we will see, this also implies Theorem 3.8.

2. Of course, $\pi_{n-1}M_z\pi_{n-1}$ is a cutoff CMV matrix if written in a CMV basis.

Proof. If $Q \in \mathcal{V}_{n-k}$, then by (3.40),
$$\pi_{n-1}[(z - z_0)^k Q] = 0 \iff (z - z_0)^k Q = \Phi_n^R(z)v \eqno(3.42)$$
for some $v \in \mathbb{C}^l$. Thus, writing $\det(\Phi_n^R(z)) = \Phi_n^R(z)v_1 \wedge \cdots \wedge \Phi_n^R(z)v_l$ in a Jordan basis for $\Phi_n^R(z)$, we see that the order of the zero of $\det(\Phi_n^R(z))$ at $z_0$ is exactly the order of $z_0$ as an algebraic eigenvalue of $\pi_{n-1}M_z\pi_{n-1}$, that is, the order of $z_0$ as a zero of the right side of (3.41). Since both sides of (3.41) are monic polynomials of degree $nl$ and their zeros, including multiplicity, are the same, we have proven (3.41). $\square$

Proof of Theorem 3.8. On the right side of (3.42), we can put $(v^\dagger \Phi_n^L(z))^\dagger$ and so conclude that (3.41) holds with $\Phi_n^L$ on the left. This proves (3.35) if $\varphi$ is replaced by $\Phi$. Since $\alpha_j^\dagger \alpha_j$ and $\alpha_j \alpha_j^\dagger$ are unitarily equivalent, $\det(\rho_j^L) = \det(\rho_j^R)$. Thus, $\det(\kappa_n^L) = \det(\kappa_n^R)$, and we obtain (3.35) for $\varphi$. $\square$

It is a basic fact (Theorem 1.7.5 of [167]) that for the scalar case, any set of $n$ zeros in $\mathbb{D}$ is the set of zeros of a unique OPUC $\Phi_n$, and any monic polynomial with all its zeros in $\mathbb{D}$ is a monic OPUC. It is easy to see that any set of $nl$ zeros in $\mathbb{D}$ is the set of zeros of a monic MOPUC $\Phi_n$, but certainly not of a unique one. It is an interesting open question to clarify which matrix monic OPs are monic MOPUCs.

3.6. Bernstein–Szegő Approximation. Given $\{\alpha_j\}_{j=0}^{n-1} \in \mathbb{D}^n$, we use the Szegő recursion to define polynomials $\varphi_j^R, \varphi_j^L$ for $j = 0, 1, \dots, n$. We define a measure $d\mu_n$ on $\partial\mathbb{D}$ by
$$d\mu_n(\theta) = [\varphi_n^R(e^{i\theta})\varphi_n^R(e^{i\theta})^\dagger]^{-1}\, \frac{d\theta}{2\pi}. \eqno(3.43)$$
Notice that (3.37) can be rewritten
$$\varphi_n^R(e^{i\theta})\varphi_n^R(e^{i\theta})^\dagger = \varphi_n^L(e^{i\theta})^\dagger \varphi_n^L(e^{i\theta}). \eqno(3.44)$$
We use here and below the fact that the proof of Theorem 3.9 only depends on the Szegő recursion and not on the a priori existence of a measure. That theorem also shows the inverse in (3.43) exists. Thus,
$$d\mu_n(\theta) = [\varphi_n^L(e^{i\theta})^\dagger \varphi_n^L(e^{i\theta})]^{-1}\, \frac{d\theta}{2\pi}. \eqno(3.45)$$

Theorem 3.11. The measure $d\mu_n$ is normalized (i.e., $\mu_n(\partial\mathbb{D}) = 1$) and its right MOPUC for $j = 0, \dots, n$ are $\{\varphi_j^R\}_{j=0}^n$, and for $j > n$,
$$\varphi_j^R(z) = z^{j-n}\varphi_n^R(z). \eqno(3.46)$$

The Verblunsky coefficients for $d\mu_n$ are
$$\alpha_j(d\mu_n) = \begin{cases} \alpha_j, & j \le n-1, \\ 0, & j \ge n. \end{cases} \eqno(3.47)$$

Remarks. 1. In the scalar case, one can multiply by a constant and renormalize, and then prove the constant is $1$. Because of commutativity issues, we need a different argument here.

2. Of course, using (3.45), the $\varphi_n^L$ are left MOPUC for $d\mu_n$.

3. Our proof owes something to the scalar proof in [80].

Proof. Let $\langle\langle\cdot,\cdot\rangle\rangle_R$ be the inner product associated with $\mu_n$. By a direct computation, $\langle\langle \varphi_n^R, \varphi_n^R\rangle\rangle_R = 1$, and for $j = 0, 1, \dots, n-1$,
$$\langle\langle z^j, \varphi_n^R\rangle\rangle_R = \frac{1}{2\pi}\int_0^{2\pi} e^{-ij\theta}(\varphi_n^R(e^{i\theta})^\dagger)^{-1}\, d\theta = \frac{1}{2\pi i}\oint z^{n-j-1}(\varphi_n^{R,*}(z))^{-1}\, dz = 0$$
by analyticity of $\varphi_n^{R,*}(z)^{-1}$ in $\mathbb{D}$ (and continuity in $\overline{\mathbb{D}}$). This proves that $\varphi_n^R$ is a MOPUC for $d\mu_n$ (and a similar calculation works for the right side of (3.46) if $j \ge n$). By the inverse Szegő recursion and induction downwards, $\{\varphi_j^R\}_{j=0}^{n-1}$ are also OPs, and by the Szegő recursion, they are normalized. In particular, since $\varphi_0^R \equiv 1$ is normalized, $d\mu_n$ is normalized. $\square$

3.7. Verblunsky's Theorem. We can now prove the analogue of Favard's theorem for MOPUC; the scalar version is called Verblunsky's theorem in [167] after [192]. A history and other proofs can be found in [167]. The proof below is essentially the matrix analogue of that of Geronimus [90] (rediscovered in [37, 80]). Delsarte et al. [37] presented their proof in the MOPUC case, and they seem to have been the first with a matrix Verblunsky theorem. One can extend the CMV and the Geronimus theorem proofs from the scalar case to get alternate proofs of the theorem below.

Theorem 3.12 (Verblunsky's Theorem for MOPUC). Any sequence $\{\alpha_j\}_{j=0}^\infty \in \mathbb{D}^\infty$ is the set of Verblunsky coefficients of a unique measure.

Proof. Uniqueness is easy, since the $\alpha$'s determine the $\varphi_j^R$'s, and so the $\Phi_j^R$'s, which determine the moments. Given a set $\{\alpha_j\}_{j=0}^\infty$, let $d\mu_n$ be the measures of the last section. By compactness of the set of $l \times l$ matrix-valued probability measures on $\partial\mathbb{D}$, they have a weak limit. By using limits, $\{\varphi_j^R\}_{j=0}^\infty$ are the right MOPUC for $d\mu$, and they determine the proper Verblunsky coefficients. $\square$

3.8. Matrix POPUC. Analogously to the scalar case (see [15, 100, 119, 171, 197]), given any unitary $\beta$ in $\mathcal{M}_l$, we define
$$\varphi^R(z;\beta) = z\varphi_{n-1}^R(z) - \varphi_{n-1}^{L,*}(z)\beta^\dagger. \eqno(3.48)$$
As in the scalar case, this is related to the secular determinant of unitary extensions of the cutoff CMV matrix. Moreover,

Theorem 3.13. Fix $\beta$. All the zeros of $\det(\varphi^R(z;\beta))$ lie on $\partial\mathbb{D}$.

Proof. If $|z| < 1$, $\varphi_{n-1}^{L,*}(z)$ is invertible and
$$\varphi^R(z;\beta) = -\varphi_{n-1}^{L,*}(z)\beta^\dagger(1 - z\beta\varphi_{n-1}^{L,*}(z)^{-1}\varphi_{n-1}^R(z))$$
is invertible, since the last factor differs from $1$ by a strict contraction. A similar argument shows invertibility if $|z| > 1$. Thus, the only zeros of $\det(\,\cdot\,)$ lie in $\partial\mathbb{D}$. $\square$

3.9. Matrix-Valued Carathéodory and Schur Functions. An analytic matrix-valued function $F$ defined on $\mathbb{D}$ is called a (matrix-valued) Carathéodory function if $F(0) = 1$ and $\operatorname{Re} F(z) \equiv \frac{1}{2}(F(z) + F(z)^\dagger) \ge 0$ for every $z \in \mathbb{D}$. The following result can be found in [44, Thm. 2.2.2].

Theorem 3.14 (Riesz–Herglotz). If $F$ is a matrix-valued Carathéodory function, then there exists a unique positive semi-definite matrix measure $d\mu$ such that
$$F(z) = \int \frac{e^{i\theta} + z}{e^{i\theta} - z}\, d\mu(\theta). \eqno(3.49)$$

The measure $d\mu$ is given by the unique weak limit of the measures $d\mu_r(\theta) = \operatorname{Re} F(re^{i\theta})\, \frac{d\theta}{2\pi}$ as $r \uparrow 1$. Moreover,
$$F(z) = c_0 + 2\sum_{n=1}^\infty c_n z^n$$
where
$$c_n = \int e^{-in\theta}\, d\mu(\theta).$$
Conversely, if $d\mu$ is a positive semi-definite matrix measure, then (3.49) defines a matrix-valued Carathéodory function.

An analytic matrix-valued function $f$ defined on $\mathbb{D}$ is called a (matrix-valued) Schur function if $f(z)^\dagger f(z) \le 1$ for every $z \in \mathbb{D}$. This condition is equivalent to $f(z)f(z)^\dagger \le 1$ for every $z \in \mathbb{D}$ and to $\|f(z)\| \le 1$ for every $z \in \mathbb{D}$. By the maximum principle, if $f$ is not constant, the inequalities are strict. The following can be found in [167, Prop. 4.5.3]:

Proposition 3.15. The association
$$f(z) = z^{-1}(F(z) - 1)(F(z) + 1)^{-1}, \eqno(3.50)$$
$$F(z) = (1 + zf(z))(1 - zf(z))^{-1} \eqno(3.51)$$
sets up a one-one correspondence between matrix-valued Carathéodory functions and matrix-valued Schur functions.

Proposition 3.16. For $z \in \mathbb{D}$, we have
$$\operatorname{Re} F(z) = (1 - \bar z f(z)^\dagger)^{-1}(1 - |z|^2 f(z)^\dagger f(z))(1 - zf(z))^{-1} \eqno(3.52)$$
and the non-tangential boundary values $\operatorname{Re} F(e^{i\theta})$ and $f(e^{i\theta})$ exist for Lebesgue almost every $\theta$. Write $d\mu(\theta) = w(\theta)\frac{d\theta}{2\pi} + d\mu_s$. Then, for almost every $\theta$,
$$w(\theta) = \operatorname{Re} F(e^{i\theta}) \eqno(3.53)$$
and, for a.e. $\theta$, $\det(w(\theta)) \ne 0$ if and only if $f(e^{i\theta})^\dagger f(e^{i\theta}) < 1$.

Proof. The identity (3.52) follows from (3.51). The existence of the boundary values of $f$ follows by application of the scalar result to the individual entries of $f$. Then (3.51) gives the boundary values of $F$. We also used the following fact: away from a set of zero Lebesgue measure, $\det(1 - zf(z))$ has non-zero boundary values by general $H^\infty$ theory.

(3.53) holds for $\langle \eta, F(z)\eta\rangle_{\mathbb{C}^l}$ and $\langle \eta, d\mu\,\eta\rangle_{\mathbb{C}^l}$ for any $\eta \in \mathbb{C}^l$ by the scalar result. We get (3.53) by polarization. From
$$w(\theta) = (1 - e^{-i\theta}f(e^{i\theta})^\dagger)^{-1}(1 - f(e^{i\theta})^\dagger f(e^{i\theta}))(1 - e^{i\theta}f(e^{i\theta}))^{-1}$$
it follows immediately that $f(e^{i\theta})^\dagger f(e^{i\theta}) < 1$ implies $\det(w(\theta)) > 0$.
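The correspondence (3.50)–(3.51) can be checked numerically at a single point: pick a strict matrix contraction as the value $f(z)$, form $F$, and verify the round trip and the positivity of $\operatorname{Re} F$. All values below are arbitrary test data, not from the text.

```python
import numpy as np

# Round-trip check of (3.50)/(3.51) for matrix values, plus Re F > 0.
rng = np.random.default_rng(2)
l = 3
M = rng.standard_normal((l, l)) + 1j * rng.standard_normal((l, l))
f0 = 0.7 * M / np.linalg.norm(M, 2)   # ||f0|| < 1 (plays the role of f(z))
z = 0.35 - 0.2j                       # |z| < 1
I = np.eye(l)

F = (I + z * f0) @ np.linalg.inv(I - z * f0)      # (3.51)
f_back = (F - I) @ np.linalg.inv(F + I) / z       # (3.50)
ReF = (F + F.conj().T) / 2
print(np.allclose(f_back, f0), bool(np.all(np.linalg.eigvalsh(ReF) > 0)))
```

Note that $F + 1 = 2(1 - zf)^{-1}$ is automatically invertible when $\|zf\| < 1$, which is why the inversion in (3.50) is well defined; the same fact is used in the proof of Theorem 3.18 below.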
Conversely, if $f(e^{i\theta})^\dagger f(e^{i\theta}) \le 1$ but not $f(e^{i\theta})^\dagger f(e^{i\theta}) < 1$, then $\det(1-f(e^{i\theta})^\dagger f(e^{i\theta})) = 0$, and by our earlier arguments $\det(1-e^{-i\theta}f(e^{i\theta})^\dagger)^{-1}$ and $\det(1-e^{i\theta}f(e^{i\theta}))^{-1}$ exist and are finite; hence $\det(w(\theta)) = 0$. All previous statements are true away from suitable sets of zero Lebesgue measure. $\square$

3.10. Coefficient Stripping, the Schur Algorithm, and Geronimus' Theorem. The matrix version of Geronimus' theorem goes back at least to the book of Bakonyi–Constantinescu [6]. Let $F(z)$ be the matrix-valued Carathéodory function (3.49) (with the same measure $\mu$ as the one used in the definition of $\langle\langle\cdot,\cdot\rangle\rangle$). Let us denote
$$u^L_n(z) = \psi^L_n(z) + \varphi^L_n(z)F(z),$$
$$u^R_n(z) = \psi^R_n(z) + F(z)\varphi^R_n(z).$$
We also define
$$u^{L,*}_n(z) = \psi^{L,*}_n(z) - F(z)\varphi^{L,*}_n(z),$$
$$u^{R,*}_n(z) = \psi^{R,*}_n(z) - \varphi^{R,*}_n(z)F(z).$$

Proposition 3.17. For any $|z|<1$, the sequences $u^L_n(z)$, $u^R_n(z)$, $u^{L,*}_n(z)$, $u^{R,*}_n(z)$ are square summable.

Proof. Denote
$$f(\theta) = \frac{e^{-i\theta}+\bar z}{e^{-i\theta}-\bar z}, \qquad g(\theta) = \frac{e^{i\theta}+z}{e^{i\theta}-z}.$$
By the definitions (3.19)–(3.24), we have
$$u^L_n(z) = \langle\langle f, \varphi^L_n\rangle\rangle_L, \quad u^{R,*}_n(z) = -z^n\langle\langle \varphi^R_n, g\rangle\rangle_R, \quad u^{L,*}_n(z) = -z^n\langle\langle \varphi^L_n, g\rangle\rangle_L, \quad u^R_n(z) = \langle\langle f, \varphi^R_n\rangle\rangle_R.$$
Using the Bessel inequality and the fact that $|z|<1$, we obtain the required statements. $\square$

Next we will consider sequences defined by
$$\begin{pmatrix} s_n \\ t_n \end{pmatrix} = A^L(\alpha_{n-1},z)\cdots A^L(\alpha_0,z)\begin{pmatrix} s_0 \\ t_0 \end{pmatrix} \qquad(3.54)$$
where $s_n, t_n \in \mathcal{M}_l$. Similarly, we will consider the sequences
$$(s_n, t_n) = (s_0, t_0)\,A^R(\alpha_0,z)\cdots A^R(\alpha_{n-1},z). \qquad(3.55)$$
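In the scalar case ($l=1$), the transfer products (3.54) are easy to generate numerically. The explicit $2\times 2$ form of $A^L(\alpha,z)$ used below is read off from the proof of Theorem 3.19 later in this section, specialized to $l=1$; treat this normalization as an assumption of the sketch. With initial condition $(s_0,t_0)=(1,1)$ the product generates the pair $(\varphi_n, \varphi^*_n)$, which we can cross-check against the reversal relation $\varphi^*_n(z)=z^n\overline{\varphi_n(1/\bar z)}$.

```python
import numpy as np

def transfer_L(alpha, z):
    # scalar (l = 1) transfer matrix A^L(alpha, z); assumed form, read off
    # from the proof of Theorem 3.19 (not quoted verbatim from the paper)
    rho = np.sqrt(1 - abs(alpha)**2)
    return np.array([[z, -np.conj(alpha)],
                     [-alpha * z, 1.0]], dtype=complex) / rho

def phi_pair(alphas, z):
    # propagate (s_0, t_0) = (1, 1); step 3 of the proof of Theorem 3.18
    # identifies this solution with (phi_n, phi*_n) in the scalar case
    v = np.array([1.0, 1.0], dtype=complex)
    out = [v]
    for a in alphas:
        v = transfer_L(a, z) @ v
        out.append(v)
    return out

rng = np.random.default_rng(0)
alphas = 0.3 * np.exp(2j * np.pi * rng.random(6)) * rng.random(6)
z = 0.4 + 0.3j
pairs = phi_pair(alphas, z)
pairs_rev = phi_pair(alphas, 1 / np.conj(z))
# check the reversal relation phi*_n(z) = z^n * conj(phi_n(1/conj(z)))
for n, (s, t) in enumerate(pairs):
    assert abs(t - z**n * np.conj(pairs_rev[n][0])) < 1e-10
```

The reversal check is a useful consistency test: it holds exactly for the Szegő recursion, so any sign or conjugation error in the assumed transfer matrix would break it.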

Theorem 3.18. Let $z\in\mathbb{D}$ and let $f$ be the Schur function associated with $d\mu$ via (3.49) and (3.50). Then:
(i) A solution of (3.54) is square summable if and only if the initial condition is of the form
$$\begin{pmatrix} s_0 \\ t_0 \end{pmatrix} = \begin{pmatrix} c \\ zf(z)c \end{pmatrix}$$
for some matrix $c$ in $\mathcal{M}_l$.
(ii) A solution of (3.55) is square summable if and only if the initial condition is of the form
$$(s_0, t_0) = (c, czf(z))$$
for some matrix $c$.

Proof. We shall prove (i); the proof of (ii) is similar.
1. By Proposition 3.17 and (3.14), (3.30), we have the square summable solution
$$\begin{pmatrix} s_n \\ t_n \end{pmatrix} = \begin{pmatrix} u^L_n(z) \\ -u^{R,*}_n(z) \end{pmatrix}, \qquad \begin{pmatrix} s_0 \\ t_0 \end{pmatrix} = \begin{pmatrix} d \\ zf(z)d \end{pmatrix}, \qquad d = F(z)+\mathbf{1}. \qquad(3.56)$$
The matrix $d = F(z)+\mathbf{1}$ is invertible. Thus, multiplying the above solution on the right by $d^{-1}c$ for any given matrix $c$, we get the "if" part of the theorem.
2. Let us check that $\varphi^{R,*}_n c$ is not square summable for any matrix $c \ne 0$. By the CD formula with $\xi = z$, we have
$$(1-|z|^2)\sum_{k=0}^{n-1} \varphi^L_k(z)^\dagger \varphi^L_k(z) + \varphi^L_n(z)^\dagger \varphi^L_n(z) = \varphi^{R,*}_n(z)^\dagger \varphi^{R,*}_n(z)$$
and so
$$\varphi^{R,*}_n(z)^\dagger \varphi^{R,*}_n(z) \ge (1-|z|^2)\,\varphi^L_0(z)^\dagger \varphi^L_0(z) = (1-|z|^2)\mathbf{1}.$$
Thus, we get $\|\varphi^{R,*}_n c\|_R^2 \ge (1-|z|^2)\operatorname{Tr}(c^\dagger c) > 0$.
3. Let $\binom{s_n}{t_n}$ be any square summable solution of (3.54). Let us write this solution as
$$\begin{pmatrix} s_n \\ t_n \end{pmatrix} = \begin{pmatrix} \varphi^L_n a \\ \varphi^{R,*}_n a \end{pmatrix} + \begin{pmatrix} \psi^L_n b \\ -\psi^{R,*}_n b \end{pmatrix}, \qquad a = \frac{s_0+t_0}{2}, \quad b = \frac{s_0-t_0}{2}. \qquad(3.57)$$
Multiplying the solution (3.56) by $b$ and subtracting from (3.57), we get a square summable solution
$$\begin{pmatrix} \varphi^L_n(z)(a-F(z)b) \\ \varphi^{R,*}_n(z)(a-F(z)b) \end{pmatrix}.$$
It follows that $a = F(z)b$, which proves the "only if" part. $\square$

The Schur function $f$ is naturally associated with $d\mu$ and hence with the Verblunsky coefficients $\alpha_0, \alpha_1, \alpha_2, \dots$. The Schur functions obtained by coefficient stripping will be denoted by $f_1, f_2, f_3, \dots$, that is, $f_n$ corresponds to Verblunsky coefficients $\alpha_n, \alpha_{n+1}, \alpha_{n+2}, \dots$. We also write $f_0 \equiv f$.

Theorem 3.19 (Schur Algorithm and Geronimus' Theorem). For the Schur functions $f_0, f_1, f_2, \dots$ associated with Verblunsky coefficients $\alpha_0, \alpha_1, \alpha_2, \dots$, the following relations hold:
$$f_{n+1}(z) = z^{-1}(\rho^R_n)^{-1}[f_n(z)-\alpha_n]\,[1-\alpha_n^\dagger f_n(z)]^{-1}\rho^L_n, \qquad(3.58)$$
$$f_n(z) = (\rho^R_n)^{-1}[zf_{n+1}(z)+\alpha_n]\,[1+z\alpha_n^\dagger f_{n+1}(z)]^{-1}\rho^L_n. \qquad(3.59)$$

Remarks. 1. See (1.81) and Theorem 1.4 to understand this result.
2. (1.83) provides an alternate way to write (3.58) and (3.59).

Proof. It clearly suffices to consider the case $n=0$. Consider the solution of (3.54) with initial condition $\binom{\mathbf{1}}{zf_0(z)}$. By Theorem 3.18, there exists a matrix $c$ such that
$$\begin{pmatrix} c \\ zf_1(z)c \end{pmatrix} = A^L(\alpha_0,z)\begin{pmatrix} \mathbf{1} \\ zf_0(z) \end{pmatrix} = \begin{pmatrix} z(\rho^L_0)^{-1} - z(\rho^L_0)^{-1}\alpha_0^\dagger f_0(z) \\ -z(\rho^R_0)^{-1}\alpha_0 + z(\rho^R_0)^{-1}f_0(z) \end{pmatrix}.$$
From this we can compute $zf_1(z)$:
$$zf_1(z) = [-z(\rho^R_0)^{-1}\alpha_0 + z(\rho^R_0)^{-1}f_0(z)]\,[z(\rho^L_0)^{-1} - z(\rho^L_0)^{-1}\alpha_0^\dagger f_0(z)]^{-1} = (\rho^R_0)^{-1}[f_0(z)-\alpha_0]\,[1-\alpha_0^\dagger f_0(z)]^{-1}\rho^L_0,$$
which is (3.58). Similarly, we can express $f_0$ in terms of $f_1$. From
$$\begin{pmatrix} \mathbf{1} \\ zf_0(z) \end{pmatrix} = A^L(\alpha_0,z)^{-1}\begin{pmatrix} c \\ zf_1(z)c \end{pmatrix} = \begin{pmatrix} z^{-1}(\rho^L_0)^{-1} & z^{-1}(\rho^L_0)^{-1}\alpha_0^\dagger \\ (\rho^R_0)^{-1}\alpha_0 & (\rho^R_0)^{-1} \end{pmatrix}\begin{pmatrix} c \\ zf_1(z)c \end{pmatrix} = \begin{pmatrix} z^{-1}(\rho^L_0)^{-1}c + (\rho^L_0)^{-1}\alpha_0^\dagger f_1(z)c \\ (\rho^R_0)^{-1}\alpha_0 c + (\rho^R_0)^{-1}zf_1(z)c \end{pmatrix}$$
we find that
$$zf_0(z) = [(\rho^R_0)^{-1}\alpha_0 c + (\rho^R_0)^{-1}zf_1(z)c]\,[z^{-1}(\rho^L_0)^{-1}c + (\rho^L_0)^{-1}\alpha_0^\dagger f_1(z)c]^{-1} = (\rho^R_0)^{-1}[\alpha_0 + zf_1(z)]\,[z^{-1}\mathbf{1} + \alpha_0^\dagger f_1(z)]^{-1}\rho^L_0,$$
which gives (3.59). $\square$

3.11. The CMV Matrix. In this section and the next, we discuss CMV matrices for MOPUC. This was discussed first by Simon in [170], which also has the involved history in the scalar case.
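Before turning to CMV matrices, the coefficient-stripping relations of Theorem 3.19 can be sanity-checked numerically in the scalar case, where the $\rho$ factors cancel. The two maps (3.58) and (3.59) are algebraic inverses of one another, and running the algorithm on the explicit Schur function $f(z)=z/2$ produces the classical iterate sequence $f_1\equiv\tfrac12$, $f_2\equiv 0$.

```python
import numpy as np

def strip(f, alpha, z):
    # scalar form of (3.58): f_{n+1} from f_n (the rho factors cancel for l = 1)
    return (f - alpha) / (z * (1 - np.conj(alpha) * f))

def dress(f1, alpha, z):
    # scalar form of (3.59): f_n from f_{n+1}
    return (z * f1 + alpha) / (1 + z * np.conj(alpha) * f1)

rng = np.random.default_rng(2)
z = 0.5 * np.exp(1j * rng.uniform(0, 2 * np.pi, 50))          # points in D \ {0}
f = rng.uniform(0, 0.95, 50) * np.exp(1j * rng.uniform(0, 2 * np.pi, 50))
alpha = 0.4 + 0.2j
# (3.58) and (3.59) are mutually inverse maps
assert np.allclose(dress(strip(f, alpha, z), alpha, z), f)

# Schur iterates of f(z) = z/2: alpha_0 = 0, f_1 = 1/2, alpha_1 = 1/2, f_2 = 0
f1 = strip(z / 2, 0.0, z)
assert np.allclose(f1, 0.5)
f2 = strip(f1, 0.5, z)
assert np.allclose(f2, 0.0)
```

So $f(z)=z/2$ has Verblunsky coefficients $(0,\tfrac12,0,0,\dots)$, read off as $\alpha_n = f_n(0)$.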
Most of the results in this section appear already in [170]; the results of the next section are new here—they parallel the discussion in [167, Sect. 4.4] where these results first appeared in the scalar case.

3.11.1. The CMV basis. Consider the two sequences $\chi_n, x_n \in \mathcal{H}$, defined by
$$\chi_{2k}(z) = z^{-k}\varphi^{L,*}_{2k}(z), \qquad \chi_{2k-1}(z) = z^{-k+1}\varphi^R_{2k-1}(z),$$
$$x_{2k}(z) = z^{-k}\varphi^R_{2k}(z), \qquad x_{2k-1}(z) = z^{-k}\varphi^{L,*}_{2k-1}(z).$$
For an integer $k\ge 0$, let us introduce the following notation: $i_k$ is the $(k+1)$th term of the sequence $0, 1, -1, 2, -2, 3, -3, \dots$, and $j_k$ is the $(k+1)$th term of the sequence $0, -1, 1, -2, 2, -3, 3, \dots$. Thus, for example, $i_1 = 1$, $j_1 = -1$.

We use the right module structure of $\mathcal{H}$. For a set of functions $\{f_k(z)\}_{k=0}^n \subset \mathcal{H}$, its module span is the set of all sums $\sum_k f_k(z)a_k$ with $a_k\in\mathcal{M}_l$.

Proposition 3.20. (i) For any $n\ge 1$, the module span of $\{\chi_k\}_{k=0}^n$ coincides with the module span of $\{z^{i_k}\}_{k=0}^n$, and the module span of $\{x_k\}_{k=0}^n$ coincides with the module span of $\{z^{j_k}\}_{k=0}^n$.
(ii) The sequences $\{\chi_k\}_{k=0}^\infty$ and $\{x_k\}_{k=0}^\infty$ are orthonormal:
$$\langle\langle \chi_k, \chi_m\rangle\rangle_R = \langle\langle x_k, x_m\rangle\rangle_R = \delta_{km}. \qquad(3.60)$$

Proof. (i) Recall that
$$\varphi^R_n(z) = \kappa^R_n z^n + \text{linear combination of } \{1,\dots,z^{n-1}\},$$
$$\varphi^{L,*}_n(z) = (\kappa^L_n)^\dagger + \text{linear combination of } \{z,\dots,z^n\},$$
where both $\kappa^R_n$ and $(\kappa^L_n)^\dagger$ are invertible matrices. It follows that
$$\chi_n(z) = \gamma_n z^{i_n} + \text{linear combination of } \{z^{i_0},\dots,z^{i_{n-1}}\},$$
$$x_n(z) = \delta_n z^{j_n} + \text{linear combination of } \{z^{j_0},\dots,z^{j_{n-1}}\},$$
where $\gamma_n, \delta_n$ are invertible matrices. This proves (i).

(ii) By the definition of $\varphi^L_n$ and $\varphi^R_n$, we have
$$\langle\langle \varphi^R_n, \varphi^R_m\rangle\rangle_R = \langle\langle \varphi^{L,*}_n, \varphi^{L,*}_m\rangle\rangle_R = \delta_{nm}, \qquad(3.61)$$
$$\langle\langle \varphi^R_n, z^m\rangle\rangle_R = 0, \quad m = 0,\dots,n-1; \qquad \langle\langle \varphi^{L,*}_n, z^m\rangle\rangle_R = 0, \quad m = 1,\dots,n. \qquad(3.62)$$
From (3.61) with $n=m$, we get
$$\langle\langle \chi_n, \chi_n\rangle\rangle_R = \langle\langle x_n, x_n\rangle\rangle_R = \mathbf{1}.$$
Considering separately the cases of even and odd $n$, it is easy to prove that
$$\langle\langle \chi_n, z^m\rangle\rangle_R = 0, \qquad m = i_0, i_1, \dots, i_{n-1}, \qquad(3.63)$$
$$\langle\langle x_n, z^m\rangle\rangle_R = 0, \qquad m = j_0, j_1, \dots, j_{n-1}. \qquad(3.64)$$
For example, for $n = 2k$, $m \in \{i_0,\dots,i_{2k-1}\}$ we have $m+k \in \{1,2,\dots,2k\}$ and so, by (3.62),
$$\langle\langle \chi_n, z^m\rangle\rangle_R = \langle\langle z^{-k}\varphi^{L,*}_{2k}, z^m\rangle\rangle_R = \langle\langle \varphi^{L,*}_{2k}, z^{m+k}\rangle\rangle_R = 0.$$
The other three cases are considered similarly. From (3.63), (3.64), and (i), we get (3.60) for $k \ne m$. $\square$

From the above proposition and the $\|\cdot\|_\infty$-density of Laurent polynomials in $C(\partial\mathbb{D})$, it follows that $\{\chi_k\}_{k=0}^\infty$ and $\{x_k\}_{k=0}^\infty$ are right orthonormal module bases in $\mathcal{H}$, that is, any element $f\in\mathcal{H}$ can be represented as
$$f = \sum_{k=0}^\infty \chi_k\langle\langle \chi_k, f\rangle\rangle_R = \sum_{k=0}^\infty x_k\langle\langle x_k, f\rangle\rangle_R. \qquad(3.65)$$

3.11.2. The CMV matrix. Consider the matrix of the right homomorphism $f(z)\mapsto zf(z)$ with respect to the basis $\{\chi_k\}$. Denote $\mathcal{C}_{nm} = \langle\langle \chi_n, z\chi_m\rangle\rangle_R$. The matrix $\mathcal{C}$ is unitary in the following sense:
$$\sum_{k=0}^\infty \mathcal{C}_{kn}^\dagger\mathcal{C}_{km} = \sum_{k=0}^\infty \mathcal{C}_{nk}\mathcal{C}_{mk}^\dagger = \delta_{nm}\mathbf{1}.$$
The proof follows from (3.65):
$$\delta_{nm}\mathbf{1} = \langle\langle z\chi_n, z\chi_m\rangle\rangle_R = \Bigl\langle\Bigl\langle \sum_{k=0}^\infty \chi_k\langle\langle \chi_k, z\chi_n\rangle\rangle_R,\, z\chi_m\Bigr\rangle\Bigr\rangle_R = \sum_{k=0}^\infty \mathcal{C}_{kn}^\dagger\mathcal{C}_{km},$$
$$\delta_{nm}\mathbf{1} = \langle\langle \bar z\chi_n, \bar z\chi_m\rangle\rangle_R = \Bigl\langle\Bigl\langle \sum_{k=0}^\infty \chi_k\langle\langle \chi_k, \bar z\chi_n\rangle\rangle_R,\, \bar z\chi_m\Bigr\rangle\Bigr\rangle_R = \sum_{k=0}^\infty \mathcal{C}_{nk}\mathcal{C}_{mk}^\dagger.$$
We note an immediate consequence:

Lemma 3.21. Let $|z|\le 1$. Then, for every $m\ge 0$,
$$\sum_{n=0}^\infty \chi_n(z)\,\mathcal{C}_{nm} = z\chi_m(z), \qquad \sum_{n=0}^\infty \mathcal{C}_{mn}\,\chi_n(1/\bar z)^\dagger = z\chi_m(1/\bar z)^\dagger.$$

Proof. First note that the above series contain only finitely many non-zero terms. Expanding $f(z) = z\chi_n$ according to (3.65), we see that
$$z\chi_n(z) = \sum_{k=0}^\infty \chi_k(z)\langle\langle \chi_k, z\chi_n\rangle\rangle_R = \sum_{k=0}^\infty \chi_k(z)\,\mathcal{C}_{kn}$$
which is the first identity. Next, taking adjoints, we get
$$\bar z\chi_n(z)^\dagger = \sum_{k=0}^\infty \mathcal{C}_{kn}^\dagger\,\chi_k(z)^\dagger$$
which yields
$$\bar z\sum_{n=0}^\infty \mathcal{C}_{mn}\chi_n(z)^\dagger = \sum_{k=0}^\infty\Bigl(\sum_{n=0}^\infty \mathcal{C}_{mn}\mathcal{C}_{kn}^\dagger\Bigr)\chi_k(z)^\dagger = \sum_{k=0}^\infty \delta_{mk}\chi_k(z)^\dagger = \chi_m(z)^\dagger.$$
Replacing $z$ by $1/\bar z$, we get the required statement. $\square$

3.11.3. The $\mathcal{LM}$-representation. Using (3.65) for $f = \chi_m$, we obtain:
$$\mathcal{C}_{nm} = \langle\langle \chi_n, z\chi_m\rangle\rangle_R = \sum_{k=0}^\infty \langle\langle \chi_n, zx_k\rangle\rangle_R\,\langle\langle x_k, \chi_m\rangle\rangle_R = \sum_{k=0}^\infty \mathcal{L}_{nk}\mathcal{M}_{km}. \qquad(3.66)$$
Denote by $\Theta(\alpha)$ the $2l\times 2l$ matrix
$$\Theta(\alpha) = \begin{pmatrix} \alpha^\dagger & \rho^L \\ \rho^R & -\alpha \end{pmatrix}.$$
Using the Szegő recursion formulas (3.15) and (3.18), we get
$$z\varphi^R_n = \varphi^R_{n+1}\rho^R_n + \varphi^{L,*}_n\alpha_n^\dagger, \qquad(3.67)$$
$$\varphi^{L,*}_{n+1} = \varphi^{L,*}_n\rho^L_n - \varphi^R_{n+1}\alpha_n. \qquad(3.68)$$
Taking $n = 2k$ and multiplying by $z^{-k}$, we get
$$zx_{2k} = \chi_{2k}\alpha_{2k}^\dagger + \chi_{2k+1}\rho^R_{2k}, \qquad zx_{2k+1} = \chi_{2k}\rho^L_{2k} - \chi_{2k+1}\alpha_{2k}.$$

It follows that the matrix $\mathcal{L}$ has the structure
$$\mathcal{L} = \Theta(\alpha_0)\oplus\Theta(\alpha_2)\oplus\Theta(\alpha_4)\oplus\cdots.$$
Taking $n = 2k-1$ in (3.67), (3.68) and multiplying by $z^{-k}$, we get
$$\chi_{2k-1} = x_{2k-1}\alpha_{2k-1}^\dagger + x_{2k}\rho^R_{2k-1}, \qquad \chi_{2k} = x_{2k-1}\rho^L_{2k-1} - x_{2k}\alpha_{2k-1}.$$
It follows that the matrix $\mathcal{M}$ has the structure
$$\mathcal{M} = \mathbf{1}\oplus\Theta(\alpha_1)\oplus\Theta(\alpha_3)\oplus\cdots. \qquad(3.69)$$
Substituting this into (3.66), we obtain:
$$\mathcal{C} = \begin{pmatrix}
\alpha_0^\dagger & \rho^L_0\alpha_1^\dagger & \rho^L_0\rho^L_1 & 0 & 0 & \cdots \\
\rho^R_0 & -\alpha_0\alpha_1^\dagger & -\alpha_0\rho^L_1 & 0 & 0 & \cdots \\
0 & \alpha_2^\dagger\rho^R_1 & -\alpha_2^\dagger\alpha_1 & \rho^L_2\alpha_3^\dagger & \rho^L_2\rho^L_3 & \cdots \\
0 & \rho^R_2\rho^R_1 & -\rho^R_2\alpha_1 & -\alpha_2\alpha_3^\dagger & -\alpha_2\rho^L_3 & \cdots \\
0 & 0 & 0 & \alpha_4^\dagger\rho^R_3 & -\alpha_4^\dagger\alpha_3 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}. \qquad(3.70)$$
We note that the analogous formula in [170], namely, (4.30), is incorrect! The order of the factors below the diagonal is wrong there.

3.12. The Resolvent of the CMV Matrix. We begin by studying solutions to the equations
$$\sum_{k=0}^\infty \mathcal{C}_{mk}w_k = zw_m, \qquad m\ge 2, \qquad(3.71)$$
$$\sum_{k=0}^\infty \tilde w_k\mathcal{C}_{km} = z\tilde w_m, \qquad m\ge 1. \qquad(3.72)$$
Let us introduce the following functions:
$$\tilde x_n(z) = \chi_n(1/\bar z)^\dagger,$$
$$\Upsilon_{2n}(z) = -z^{-n}\psi^{L,*}_{2n}(z), \qquad \Upsilon_{2n-1}(z) = z^{-n+1}\psi^R_{2n-1}(z),$$
$$y_{2n}(z) = -\Upsilon_{2n}(1/\bar z)^\dagger = z^{-n}\psi^L_{2n}(z),$$
$$y_{2n-1}(z) = -\Upsilon_{2n-1}(1/\bar z)^\dagger = -z^{-n}\psi^{R,*}_{2n-1}(z),$$
$$p_n(z) = y_n(z) + \tilde x_n(z)F(z),$$

$$\pi_n(z) = \Upsilon_n(z) + F(z)\chi_n(z).$$

Proposition 3.22. Let $z\in\mathbb{D}\setminus\{0\}$.

(i) For each $n\ge 0$, a pair of values $(\tilde w_{2n}, \tilde w_{2n+1})$ uniquely determines a solution $\tilde w_n$ of (3.72). Also, for any pair of values $(\tilde w_{2n}, \tilde w_{2n+1})$ in $\mathcal{M}_l$, there exists a solution $\tilde w_n$ of (3.72) with these values at $(2n, 2n+1)$.
(ii) The set of solutions $\tilde w_n$ of (3.72) coincides with the set of sequences
$$\tilde w_n(z) = a\chi_n(z) + b\pi_n(z) \qquad(3.73)$$
where $a, b$ range over $\mathcal{M}_l$.
(iii) A solution (3.73) is in $\ell^2$ if and only if $a = 0$.
(iv) A solution (3.73) obeys (3.72) for all $m\ge 0$ if and only if $b = 0$.

Proposition 3.23. Let $z\in\mathbb{D}\setminus\{0\}$.
(i) For each $n\ge 1$, a pair of values $(w_{2n-1}, w_{2n})$ uniquely determines a solution $w_n$ of (3.71). Also, for any pair of values $(w_{2n-1}, w_{2n})$ in $\mathcal{M}_l$, there exists a solution $w_n$ of (3.71) with these values at $(2n-1, 2n)$.
(ii) The set of solutions $w_n$ of (3.71) coincides with the set of sequences
$$w_n(z) = \tilde x_n(z)a + p_n(z)b \qquad(3.74)$$

where $a, b$ range over $\mathcal{M}_l$.
(iii) A solution (3.74) is in $\ell^2$ if and only if $a = 0$.
(iv) A solution (3.74) obeys (3.71) for all $m\ge 0$ if and only if $b = 0$.

Proof of Proposition 3.22. (i) The matrix $\mathcal{C}-z$ can be written in the form
$$\mathcal{C}-z = \begin{pmatrix} A_0 & B_0 & 0 & 0 & \cdots \\ 0 & A_1 & B_1 & 0 & \cdots \\ 0 & 0 & A_2 & B_2 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix} \qquad(3.75)$$
where
$$A_0 = \begin{pmatrix} \alpha_0^\dagger - z \\ \rho^R_0 \end{pmatrix}, \qquad A_n = \begin{pmatrix} \alpha_{2n}^\dagger\rho^R_{2n-1} & -\alpha_{2n}^\dagger\alpha_{2n-1} - z \\ \rho^R_{2n}\rho^R_{2n-1} & -\rho^R_{2n}\alpha_{2n-1} \end{pmatrix},$$
$$B_n = \begin{pmatrix} \rho^L_{2n}\alpha_{2n+1}^\dagger & \rho^L_{2n}\rho^L_{2n+1} \\ -\alpha_{2n}\alpha_{2n+1}^\dagger - z & -\alpha_{2n}\rho^L_{2n+1} \end{pmatrix}.$$
Define $\widetilde W_n = (\tilde w_{2n}, \tilde w_{2n+1})$ for $n = 0, 1, 2, \dots$. Then (3.72) for $m = 2n+1, 2n+2$ is equivalent to
$$\widetilde W_n B_n + \widetilde W_{n+1}A_{n+1} = 0.$$

It remains to prove that the $2l\times 2l$ matrices $A_j$, $B_j$ are invertible. Suppose that for some $x, y\in\mathbb{C}^l$, $A_n\binom{x}{y} = \binom{0}{0}$. This is equivalent to the system
$$\alpha_{2n}^\dagger\rho^R_{2n-1}x - \alpha_{2n}^\dagger\alpha_{2n-1}y - zy = 0,$$
$$\rho^R_{2n}\rho^R_{2n-1}x - \rho^R_{2n}\alpha_{2n-1}y = 0.$$
The second equation of this system yields $\rho^R_{2n-1}x = \alpha_{2n-1}y$ (since $\rho^R_{2n}$ is invertible), and upon substitution into the first equation, we get $y = x = 0$. Thus, $\ker(A_n) = \{0\}$. In a similar way, one proves that $\ker(B_n) = \{0\}$.

(ii) First note that $\tilde w_n = \chi_n$ is a solution of (3.72) by Lemma 3.21. Let us check that $\tilde w_n = \Upsilon_n$ is also a solution. If $U_{km} = (-1)^k\delta_{km}$, then $(U\mathcal{C}U)_{km}$ for $m\ge 1$ coincides with the CMV matrix corresponding to the coefficients $\{-\alpha_n\}$. Recall that $\psi^{L,R}_n$ are the orthogonal polynomials $\varphi^{L,R}_n$ corresponding to the coefficients $\{-\alpha_n\}$. Taking into account the minus signs in the definition of $\Upsilon_n$, we see that $\tilde w_n = \Upsilon_n$ solves (3.72) for $m\ge 1$. It follows that any $\tilde w_n$ of the form (3.73) is a solution of (3.72).

Let us check that any solution of (3.72) can be represented as (3.73). By (i), it suffices to show that for any $\tilde w_0$, $\tilde w_1$, there exist $a, b\in\mathcal{M}_l$ such that
$$a\chi_0(z) + b\pi_0(z) = \tilde w_0,$$
$$a\chi_1(z) + b\pi_1(z) = \tilde w_1.$$

Recalling that $\chi_0 = \mathbf{1}$, $\Upsilon_0 = -\mathbf{1}$, $\Upsilon_1(z) = (z+\alpha_0^\dagger)(\rho^R_0)^{-1}$, $\chi_1(z) = (z-\alpha_0^\dagger)(\rho^R_0)^{-1}$, we see that the above system can be easily solved for $a, b$ if $z\ne 0$.

(iii) Let us prove that the solution $\pi_n$ is square summable. We will consider separately the sequences $\pi_{2n}$ and $\pi_{2n-1}$ and prove that they both belong to $\ell^2$. By (3.20) and (3.23), we have
$$\psi^R_n(z) + F(z)\varphi^R_n(z) = \int \frac{e^{i\theta}+z}{e^{i\theta}-z}\, d\mu(\theta)\,\varphi^R_n(e^{i\theta}), \qquad(3.76)$$
$$\psi^{L,*}_n(z) - F(z)\varphi^{L,*}_n(z) = -z^n\int \frac{e^{i\theta}+z}{e^{i\theta}-z}\, d\mu(\theta)\,\varphi^L_n(e^{i\theta})^\dagger. \qquad(3.77)$$
Taking $n = 2k$ in (3.77) and $n = 2k-1$ in (3.76), we get
$$\pi_{2k}(z) = z^k\int \frac{e^{i\theta}+z}{e^{i\theta}-z}\, d\mu(\theta)\,\varphi^L_{2k}(e^{i\theta})^\dagger, \qquad(3.78)$$
$$\pi_{2k-1}(z) = z^{-k+1}\int \frac{e^{i\theta}+z}{e^{i\theta}-z}\, d\mu(\theta)\,\varphi^R_{2k-1}(e^{i\theta}). \qquad(3.79)$$
As $\varphi^L_{2k}$ is an orthonormal sequence, using the Bessel inequality, from (3.78) we immediately get that $\pi_{2k}$ is in $\ell^2$.

Consider the odd terms $\pi_{2k-1}$. We claim that
$$z^{-k+1}\int \frac{e^{i\theta}+z}{e^{i\theta}-z}\, d\mu(\theta)\,\varphi^R_{2k-1}(e^{i\theta}) = \int \frac{e^{i\theta}+z}{e^{i\theta}-z}\, d\mu(\theta)\, e^{i(-k+1)\theta}\varphi^R_{2k-1}(e^{i\theta}). \qquad(3.80)$$
Indeed, using the right orthogonality of $\varphi^R_{2k-1}$ to $e^{im\theta}$, $m = 0, 1, \dots, 2k-2$, we get
$$\int \frac{e^{i\theta}+z}{e^{i\theta}-z}\, d\mu(\theta)\,\varphi^R_{2k-1}(e^{i\theta}) = \Bigl\langle\Bigl\langle 1+2\sum_{m=1}^\infty \bar z^m e^{im\theta},\, \varphi^R_{2k-1}\Bigr\rangle\Bigr\rangle_R = \Bigl\langle\Bigl\langle 2\sum_{m=2k-1}^\infty \bar z^m e^{im\theta},\, \varphi^R_{2k-1}\Bigr\rangle\Bigr\rangle_R$$
and
$$z^{k-1}\int e^{i(-k+1)\theta}\,\frac{e^{i\theta}+z}{e^{i\theta}-z}\, d\mu(\theta)\,\varphi^R_{2k-1}(e^{i\theta}) = \Bigl\langle\Bigl\langle \bar z^{k-1}e^{i(k-1)\theta}\Bigl(1+2\sum_{m=1}^\infty \bar z^m e^{im\theta}\Bigr),\, \varphi^R_{2k-1}\Bigr\rangle\Bigr\rangle_R = \Bigl\langle\Bigl\langle 2\sum_{m=2k-1}^\infty \bar z^m e^{im\theta},\, \varphi^R_{2k-1}\Bigr\rangle\Bigr\rangle_R$$
which proves (3.80). The identities (3.80) and (3.79) yield
$$\pi_{2k-1}(z) = \Bigl\langle\Bigl\langle \frac{e^{-i\theta}+\bar z}{e^{-i\theta}-\bar z},\, \chi_{2k-1}\Bigr\rangle\Bigr\rangle_R$$
and, since $\chi_{2k-1}$ is a right orthonormal sequence, the Bessel inequality ensures that $\pi_{2k-1}(z)$ is in $\ell^2$. Thus, $\pi_k(z)$ is in $\ell^2$.

Next, as in the proof of Theorem 3.18, using the CD formula, we check that the sequence $\|\varphi^{L,*}_n(z)\|_R$ is bounded below and therefore the sequence $\chi_{2n}(z)$ is not in $\ell^2$. This proves statement (iii).

(iv) By Lemma 3.21, the solution $\chi_n(z)$ obeys (3.72) for all $m\ge 0$. It is easy to check directly that the solution $\pi_n(z)$ does not obey (3.72) for $m = 0$ if $z\ne 0$. This proves the required statement. $\square$

Proof of Proposition 3.23. (i) For $j = 1, 2, \dots$, define $W_j = (w_{2j-1}, w_{2j})$. Then, using the block structure (3.75), we can rewrite (3.71) for $m = 2j, 2j+1$ as $A_jW_j + B_jW_{j+1} = 0$. By the proof of Proposition 3.22, the matrices $A_j$ and $B_j$ are invertible, which proves (i).

(ii) Lemma 3.21 ensures that $\tilde x_n(z)$ is a solution of (3.71). As in the proof of Proposition 3.22, by considering the matrix $(U\mathcal{C}U)_{km}$, one checks that $y_n(z)$ is also a solution of (3.71). Let us prove that any solution of (3.71) can be represented in the form (3.74). By (i), it suffices to show that for any $w_1, w_2$, there exist $a, b\in\mathcal{M}_l$ such that
$$\tilde x_1(z)a + p_1(z)b = w_1,$$
$$\tilde x_2(z)a + p_2(z)b = w_2.$$
We claim that this system of equations can be solved for $a, b$. Here are the main steps. Substituting the definitions of $\tilde x_n(z)$ and $p_n(z)$, we rewrite this system as
$$\varphi^{R,*}_1(a + F(z)b) - \psi^{R,*}_1(z)b = zw_1,$$
$$\varphi^L_2(a + F(z)b) + \psi^L_2(z)b = zw_2.$$
Using the Szegő recurrence, we can substitute the expressions for $\varphi^L_2$, $\psi^L_2$, which helps rewrite our system as
$$\varphi^{R,*}_1(a + F(z)b) - \psi^{R,*}_1(z)b = zw_1,$$
$$\varphi^L_1(a + F(z)b) + \psi^L_1(z)b = \rho^L_1 w_2 + \alpha_1^\dagger w_1.$$
Substituting explicit formulas for $\varphi^{R,*}_1$, $\varphi^L_1$, $\psi^{R,*}_1$, $\psi^L_1$, and expressing $F(z)$ in terms of $f(z)$, we can rewrite this as
$$(\rho^R_0)^{-1}(1-\alpha_0 z)a + 2z(\rho^R_0)^{-1}(f(z)-\alpha_0)(1-zf(z))^{-1}b = zw_1,$$
$$(\rho^L_0)^{-1}(z-\alpha_0^\dagger)a + 2z(\rho^L_0)^{-1}(1-\alpha_0^\dagger f(z))(1-zf(z))^{-1}b = \rho^L_1 w_2 + \alpha_1^\dagger w_1.$$
Denote
$$a_1 = (\rho^R_0)^{-1}(1-\alpha_0 z)a, \qquad b_1 = 2z(\rho^L_0)^{-1}(1-\alpha_0^\dagger f(z))(1-zf(z))^{-1}b.$$
Then in terms of $a_1, b_1$, our system can be rewritten as
$$a_1 + X_1 b_1 = zw_1,$$
$$X_2 a_1 + b_1 = \rho^L_1 w_2 + \alpha_1^\dagger w_1,$$
where
$$X_1 = (\rho^R_0)^{-1}(f(z)-\alpha_0)(1-\alpha_0^\dagger f(z))^{-1}\rho^L_0,$$
$$X_2 = (\rho^L_0)^{-1}(z-\alpha_0^\dagger)(1-\alpha_0 z)^{-1}\rho^R_0.$$
Since $\|f(z)\| < 1$ and $|z| < 1$, we can apply Corollary 1.5, which yields $\|X_1\| < 1$ and $\|X_2\| < 1$; hence the system can be solved for $a_1, b_1$, and therefore for $a, b$.

(iii) As $p_n(z) = -\pi_n(1/\bar z)^\dagger$, by Proposition 3.22, we get that $p_n(z)$ is in $\ell^2$. In the same way, as $\tilde x_n(z) = \chi_n(1/\bar z)^\dagger$, we get that $\tilde x_n(z)$ is not in $\ell^2$.
(iv) By Lemma 3.21, the solution $\tilde x_n(z)$ obeys (3.71) for all $m\ge 0$. Using the explicit formula for $y_n(z)$, one easily checks that the solution $y_n(z)$ does not obey (3.71) for $m = 0, 1$. $\square$

Theorem 3.24. We have for $z\in\mathbb{D}$,
$$[(\mathcal{C}-z)^{-1}]_{k,l} = \begin{cases} (2z)^{-1}\tilde x_k(z)\pi_l(z), & l > k \text{ or } k = l \text{ even}, \\ (2z)^{-1}p_k(z)\chi_l(z), & k > l \text{ or } k = l \text{ odd}. \end{cases}$$

Proof. Fix $z\in\mathbb{D}$. Write $G_{k,l}(z) = [(\mathcal{C}-z)^{-1}]_{k,l}$. Then $G_{\cdot,l}(z)$ is equal to $(\mathcal{C}-z)^{-1}\delta_l$, which means that $G_{k,l}(z)$ solves (3.71) for $m\ne l$. Since $G_{\cdot,l}(z)$ is $\ell^2$ at infinity and obeys the equation at $m = 0$, we see that it is a right-multiple of $p$ for large $k$ and a right-multiple of $\tilde x$ for small $k$. Thus,
$$G_{k,l}(z) = \begin{cases} \tilde x_k(z)a_l(z), & k < l \text{ or } k = l \text{ even}, \\ p_k(z)b_l(z), & k > l \text{ or } k = l \text{ odd}. \end{cases}$$
Similarly,
$$G_{k,l}(z) = \begin{cases} \tilde b_k(z)\pi_l(z), & k < l \text{ or } k = l \text{ even}, \\ \tilde a_k(z)\chi_l(z), & k > l \text{ or } k = l \text{ odd}. \end{cases}$$
Equating the two expressions, we find
$$\tilde x_k(z)a_l(z) = \tilde b_k(z)\pi_l(z), \qquad k < l \text{ or } k = l \text{ even}, \qquad(3.81)$$
$$p_k(z)b_l(z) = \tilde a_k(z)\chi_l(z), \qquad k > l \text{ or } k = l \text{ odd}. \qquad(3.82)$$

Putting $k = 0$ in (3.81) and setting $\tilde b_0(z) = c_1(z)$, we find $a_l(z) = c_1(z)\pi_l(z)$. Putting $l = 0$ in (3.82) and setting $b_0(z) = c_2(z)$, we find $p_k(z)c_2(z) = \tilde a_k(z)$. Thus,
$$G_{k,l}(z) = \begin{cases} \tilde x_k(z)c_1(z)\pi_l(z), & k < l \text{ or } k = l \text{ even}, \\ p_k(z)c_2(z)\chi_l(z), & k > l \text{ or } k = l \text{ odd}. \end{cases}$$
We claim that $c_1(z) = c_2(z) = (2z)^{-1}\mathbf{1}$. Consider the case $k = l = 0$. Then, on the one hand, by the definition,
$$G_{0,0}(z) = \int \frac{1}{e^{i\theta}-z}\, d\mu(\theta) = (2z)^{-1}\int\Bigl(\frac{e^{i\theta}+z}{e^{i\theta}-z}-1\Bigr)d\mu(\theta) = (2z)^{-1}(F(z)-\mathbf{1}) \qquad(3.83)$$
and on the other hand,
$$G_{0,0}(z) = \tilde x_0(z)c_1(z)\pi_0(z) = c_1(z)(F(z)-\mathbf{1}).$$
This shows $c_1(z) = (2z)^{-1}\mathbf{1}$. Next, consider the case $k = 1$, $l = 0$. Then, on the one hand, by the definition,
$$G_{1,0}(z) = \langle\langle \chi_1, (e^{i\theta}-z)^{-1}\chi_0\rangle\rangle_R$$
and on the other hand,
$$G_{1,0}(z) = p_1(z)c_2(z)\chi_0(z).$$
Let us calculate the expressions on the right-hand side. We have
$$p_1(z)c_2(z)\chi_0(z) = (\rho^R_0)^{-1}(-z^{-1}-\alpha_0+(z^{-1}-\alpha_0)F(z))\,c_2(z) \qquad(3.84)$$
and
$$\langle\langle \chi_1, (e^{i\theta}-z)^{-1}\chi_0\rangle\rangle_R = (\rho^R_0)^{-1}\int (e^{-i\theta}-\alpha_0)\, d\mu(\theta)\,(e^{i\theta}-z)^{-1}$$
$$= (\rho^R_0)^{-1}\int [z^{-1}(e^{i\theta}-z)^{-1} - z^{-1}e^{-i\theta} - \alpha_0(e^{i\theta}-z)^{-1}]\, d\mu(\theta)$$
$$= (\rho^R_0)^{-1}\Bigl[\frac{1}{2z^2}(F(z)-\mathbf{1}) - \frac{1}{2z}\alpha_0(F(z)-\mathbf{1}) - \frac{1}{z}\int e^{-i\theta}\, d\mu(\theta)\Bigr].$$
Taking into account the identity
$$\int e^{-i\theta}\, d\mu(\theta) = \alpha_0$$
(which can be obtained, e.g., by expanding $\langle\langle \varphi^R_1, \varphi^R_0\rangle\rangle_R = 0$), we get
$$\langle\langle \chi_1, (e^{i\theta}-z)^{-1}\chi_0\rangle\rangle_R = \frac{1}{2z}(\rho^R_0)^{-1}(-z^{-1}-\alpha_0+(z^{-1}-\alpha_0)F(z)).$$
Comparing this with (3.84), we get $c_2(z) = (2z)^{-1}\mathbf{1}$. $\square$

As an immediate corollary, evaluating the kernel on the diagonal for even and odd indices, we obtain the formulas
$$\int \varphi^L_{2n}(e^{i\theta})\,\frac{d\mu(\theta)}{e^{i\theta}-z}\,\varphi^L_{2n}(e^{i\theta})^\dagger = -\frac{1}{2z^{2n+1}}\,\varphi^L_{2n}(z)u^{L,*}_{2n}(z), \qquad(3.85)$$
$$\int \varphi^R_{2n-1}(e^{i\theta})^\dagger\,\frac{d\mu(\theta)}{e^{i\theta}-z}\,\varphi^R_{2n-1}(e^{i\theta}) = -\frac{1}{2z^{2n}}\,u^{R,*}_{2n-1}(z)\varphi^R_{2n-1}(z). \qquad(3.86)$$
Combining this with (3.31) and (3.32), we find
$$u^L_n(z)\varphi^{L,*}_n(z) + \varphi^L_n(z)u^{L,*}_n(z) = 2z^n, \qquad(3.87)$$
$$\varphi^{R,*}_n(z)u^R_n(z) + u^{R,*}_n(z)\varphi^R_n(z) = 2z^n. \qquad(3.88)$$

3.13. Khrushchev Theory. Among the deepest and most elegant methods in OPUC are those of Khrushchev [125, 126, 101]. We have not been able to extend them to MOPUC! We regard their extension as an important open question; we present the first very partial steps here.

Let $\Omega = \{\theta : \det w(\theta) > 0\}$.

Theorem 3.25. For every $n\ge 0$,
$$\{\theta : f_n(e^{i\theta})^\dagger f_n(e^{i\theta}) < 1\} = \Omega$$
up to a set of zero Lebesgue measure. Consequently,
$$\int \|f_n(e^{i\theta})\|\,\frac{d\theta}{2\pi} \ge 1 - \frac{|\Omega|}{2\pi}. \qquad(3.89)$$

Proof. Recall that, by Proposition 3.16, up to a set of zero Lebesgue measure,
$$\{\theta : f_0(e^{i\theta})^\dagger f_0(e^{i\theta}) < 1\} = \{\theta : \det w(\theta) > 0\}$$
so, by induction, it suffices to show that, up to a set of zero Lebesgue measure,
$$\{\theta : f_0(e^{i\theta})^\dagger f_0(e^{i\theta}) < 1\} = \{\theta : f_1(e^{i\theta})^\dagger f_1(e^{i\theta}) < 1\}.$$
This in turn follows from the fact that the Schur algorithm, which relates the two functions, preserves the property $g^\dagger g < 1$.

Notice that away from $\Omega$, $f_n(e^{i\theta})$ has norm one and therefore,

$$\int \|f_n(e^{i\theta})\|\,\frac{d\theta}{2\pi} \ge \int_{\Omega^c} \|f_n(e^{i\theta})\|\,\frac{d\theta}{2\pi} = \int_{\Omega^c} \frac{d\theta}{2\pi}$$
which yields (3.89). $\square$

Define
$$b_n(z; d\mu) = \varphi^L_n(z; d\mu)\,\varphi^{R,*}_n(z; d\mu)^{-1}.$$

Proposition 3.26. (a) $b_{n+1} = (\rho^L_n)^{-1}(zb_n - \alpha_n^\dagger)(1 - z\alpha_n b_n)^{-1}\rho^R_n$.
(b) The Verblunsky coefficients of $b_n$ are $(-\alpha_{n-1}^\dagger, -\alpha_{n-2}^\dagger, \dots, -\alpha_0^\dagger, \mathbf{1})$.

Proof. (a) By the Szegő recursion, we have that
$$b_{n+1} = \varphi^L_{n+1}(\varphi^{R,*}_{n+1})^{-1}$$
$$= \bigl((\rho^L_n)^{-1}z\varphi^L_n - (\rho^L_n)^{-1}\alpha_n^\dagger\varphi^{R,*}_n\bigr)\bigl((\rho^R_n)^{-1}\varphi^{R,*}_n - z(\rho^R_n)^{-1}\alpha_n\varphi^L_n\bigr)^{-1}$$
$$= (\rho^L_n)^{-1}\bigl(z\varphi^L_n - \alpha_n^\dagger\varphi^{R,*}_n\bigr)\bigl(\varphi^{R,*}_n - z\alpha_n\varphi^L_n\bigr)^{-1}\rho^R_n$$
$$= (\rho^L_n)^{-1}\bigl(z\varphi^L_n(\varphi^{R,*}_n)^{-1} - \alpha_n^\dagger\bigr)\bigl(1 - z\alpha_n\varphi^L_n(\varphi^{R,*}_n)^{-1}\bigr)^{-1}\rho^R_n$$
$$= (\rho^L_n)^{-1}(zb_n - \alpha_n^\dagger)(1 - z\alpha_n b_n)^{-1}\rho^R_n.$$

(b) It follows from part (a) that the first Verblunsky coefficient of $b_n$ is $-\alpha_{n-1}^\dagger$ and that its first Schur iterate is $b_{n-1}$; compare Theorem 3.19. This gives the claim by induction and the fact that $b_0 = \mathbf{1}$. $\square$
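To close the chapter, the $\mathcal{LM}$-factorization of Section 3.11 is easy to exercise numerically. The following scalar ($l=1$) sketch builds a finite truncation of $\mathcal{C}=\mathcal{LM}$ from random Verblunsky coefficients; the final identity block appended to $\mathcal{M}$ is a truncation artifact that keeps the finite product exactly unitary, and the spot-checked entries are the scalar specializations of (3.70) (for $l=1$ the $L/R$ distinction and factor order are invisible).

```python
import numpy as np

def theta(alpha):
    # scalar (l = 1) version of the 2x2 block Theta(alpha) of Section 3.11.3
    rho = np.sqrt(1 - abs(alpha)**2)
    return np.array([[np.conj(alpha), rho], [rho, -alpha]], dtype=complex)

def block_diag(*blocks):
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n), dtype=complex)
    i = 0
    for b in blocks:
        out[i:i+b.shape[0], i:i+b.shape[1]] = b
        i += b.shape[0]
    return out

rng = np.random.default_rng(3)
a = 0.3 * np.exp(2j * np.pi * rng.random(6)) * rng.random(6)   # |alpha_n| < 1
one = np.eye(1, dtype=complex)
# L = Theta(a0)+Theta(a2)+Theta(a4);  M = 1+Theta(a1)+Theta(a3)+1
# (the trailing 1 in M is the truncation artifact mentioned above)
L = block_diag(theta(a[0]), theta(a[2]), theta(a[4]))
M = block_diag(one, theta(a[1]), theta(a[3]), one)
C = L @ M
rho = lambda al: np.sqrt(1 - abs(al)**2)
assert np.allclose(C.conj().T @ C, np.eye(6))                  # unitarity
# spot-check the five-diagonal pattern (3.70)
assert np.isclose(C[0, 0], np.conj(a[0]))
assert np.isclose(C[1, 0], rho(a[0]))
assert np.isclose(C[0, 1], rho(a[0]) * np.conj(a[1]))
assert np.isclose(C[3, 1], rho(a[2]) * rho(a[1]))
assert np.isclose(C[2, 2], -np.conj(a[2]) * a[1])
```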

4. The Szegő Mapping and the Geronimus Relations

In this chapter, we present the matrix analogue of the Szegő mapping and the resulting Geronimus relations. This establishes a correspondence between certain matrix-valued measures on the unit circle and matrix-valued measures on the interval $[-2,2]$ and, consequently, a correspondence between Verblunsky coefficients and Jacobi parameters. Throughout this chapter, we will denote measures on the circle by $d\mu_C$ and measures on the interval by $d\mu_I$.

The scalar versions of these objects are due to Szegő [182] and Geronimus [90]. There are four proofs that we know of: the original argument of Geronimus [90] based on Szegő's formula in [182], a proof of Damanik–Killip [32] using Schur functions, a proof of Killip–Nenciu [127] using CMV matrices, and a proof of Faybusovich–Gekhtman [81] using canonical moments.

The matrix version of these objects was studied by Yakhlef–Marcellán [199] who proved Theorem 4.2 below using the Geronimus–Szegő approach. Our proof uses the Killip–Nenciu–CMV approach. In comparing our formula with [199], one needs the following dictionary (their objects on the left of the equal sign and ours on the right):

$$H_n = -\alpha_{n+1}^\dagger, \qquad D_n = A_n, \qquad E_n = B_{n+1}.$$
Dette–Studden [43] have extended the theory of canonical moments from OPRL to MOPRL. It would be illuminating to use this to extend the proof that Faybusovich–Gekhtman [81] gave of the Geronimus relations for scalar OPUC to MOPUC.

Suppose $d\mu_C$ is a non-trivial positive semi-definite measure on the unit circle that is invariant under $\theta\mapsto-\theta$ (i.e., $z\mapsto\bar z = z^{-1}$). Then we define the measure $d\mu_I$ on the interval $[-2,2]$ by

$$\int f(x)\, d\mu_I(x) = \int f(2\cos\theta)\, d\mu_C(\theta)$$
for $f$ measurable on $[-2,2]$. The map
$$\mathrm{Sz} : d\mu_C \mapsto d\mu_I$$
is called the Szegő mapping. The Szegő mapping can be inverted as follows. Suppose $d\mu_I$ is a non-degenerate positive semi-definite matrix measure on $[-2,2]$. Then we define the measure $d\mu_C$ on the unit circle which is invariant under $\theta\mapsto-\theta$ by
$$\int g(\theta)\, d\mu_C(\theta) = \int g(\arccos(x/2))\, d\mu_I(x)$$
for $g$ measurable on $\partial\mathbb{D}$ with $g(\theta) = g(-\theta)$.

We first show that for the measures on the circle of interest in this section, the Verblunsky coefficients are always Hermitian.

Lemma 4.1. Suppose $d\mu_C$ is a non-trivial positive semi-definite Hermitian matrix measure on the unit circle. Denote the associated Verblunsky coefficients by $\{\alpha_n\}$. Then $d\mu_C$ is invariant under $\theta\mapsto-\theta$ if and only if $\alpha_n = \alpha_n^\dagger$ for every $n$.

Proof. For a polynomial $P$, denote $\widetilde P(z) = P(\bar z)^\dagger$.
1. Suppose that $d\mu_C$ is invariant under $\theta\mapsto-\theta$. Then we have
$$\langle\langle f, g\rangle\rangle_L = \langle\langle \tilde g, \tilde f\rangle\rangle_R$$
for all $f, g$. Inspecting the orthogonality conditions which define $\Phi^L_n$ and $\Phi^R_n$, we see that
$$\widetilde\Phi^L_n = \Phi^R_n \quad\text{and}\quad \langle\langle \Phi^L_n, \Phi^L_n\rangle\rangle_L = \langle\langle \Phi^R_n, \Phi^R_n\rangle\rangle_R. \qquad(4.1)$$
Next, we claim that
$$\kappa^L_n = \kappa^{R,\dagger}_n. \qquad(4.2)$$
Indeed, recall the definition of $\kappa^L_n$, $\kappa^R_n$:
$$\kappa^L_n = u_n\langle\langle \Phi^L_n, \Phi^L_n\rangle\rangle_L^{-1/2}, \quad u_n \text{ unitary}, \quad \kappa^L_{n+1}(\kappa^L_n)^{-1} > 0, \quad \kappa^L_0 = \mathbf{1},$$
$$\kappa^R_n = \langle\langle \Phi^R_n, \Phi^R_n\rangle\rangle_R^{-1/2}v_n, \quad v_n \text{ unitary}, \quad (\kappa^R_n)^{-1}\kappa^R_{n+1} > 0, \quad \kappa^R_0 = \mathbf{1}.$$
Using this definition and (4.1), one can easily prove by induction that $v_n = u_n^\dagger$ and therefore (4.2) holds true. Next, taking $z = 0$ in (3.11), we get
$$\alpha_n = -(\kappa^R_n)^{-1}\Phi^L_{n+1}(0)^\dagger(\kappa^L_n)^\dagger,$$
$$\alpha_n = -(\kappa^R_n)^\dagger\Phi^R_{n+1}(0)^\dagger(\kappa^L_n)^{-1}.$$
From here and (4.1), (4.2), we get $\alpha_n^\dagger = \alpha_n$.
2. Assume $\alpha_n = \alpha_n^\dagger$ for all $n$. Then, by Theorem 3.3(c), we have $\rho^L_n = \rho^R_n$. It follows that in this case the Szegő recurrence relation is

invariant with respect to the change $\varphi^L_n \mapsto \tilde\varphi^R_n$, $\varphi^R_n \mapsto \tilde\varphi^L_n$. It follows that $\varphi^L_n = \tilde\varphi^R_n$, $\varphi^R_n = \tilde\varphi^L_n$. In particular, we get
$$\langle\langle \varphi^L_n, \varphi^L_m\rangle\rangle_L = \langle\langle \tilde\varphi^R_m, \tilde\varphi^R_n\rangle\rangle_R. \qquad(4.3)$$
Now let $f$ and $g$ be any polynomials; we have
$$f(z) = \sum_n f_n\varphi^L_n(z), \qquad \tilde f(z) = \sum_n \tilde\varphi^L_n(z)f_n^\dagger,$$
and a similar expansion for $g$. Using these expansions and (4.3), we get $\langle\langle f, g\rangle\rangle_L = \langle\langle \tilde g, \tilde f\rangle\rangle_R$ for all polynomials $f, g$. From here it follows that the measure $d\mu_C$ is invariant under $\theta\mapsto-\theta$. $\square$

Now consider two measures $d\mu_C$ and $d\mu_I = \mathrm{Sz}(d\mu_C)$ and the associated CMV and Jacobi matrices. What are the relations between the parameters of these matrices?

Theorem 4.2. Given $d\mu_C$ and $d\mu_I = \mathrm{Sz}(d\mu_C)$ as above, the coefficients of the associated CMV and Jacobi matrices satisfy the Geronimus relations:
$$B_{k+1} = \sqrt{1-\alpha_{2k-1}}\;\alpha_{2k}\;\sqrt{1-\alpha_{2k-1}} - \sqrt{1+\alpha_{2k-1}}\;\alpha_{2k-2}\;\sqrt{1+\alpha_{2k-1}}, \qquad(4.4)$$
$$A_{k+1} = \sqrt{1-\alpha_{2k-1}}\;\sqrt{1-\alpha_{2k}^2}\;\sqrt{1+\alpha_{2k+1}}. \qquad(4.5)$$

Remarks. 1. For these formulas to hold for $k = 0$, we set $\alpha_{-1} = -\mathbf{1}$.
2. There are several proofs of the Geronimus relations in the scalar case. We follow the proof given by Killip and Nenciu in [127].
3. These $A$'s are, in general, not of type 1 or 2 or 3.

Proof. For a Hermitian $l\times l$ matrix $\alpha$ with $\|\alpha\| < 1$, define the unitary $2l\times 2l$ matrix $S(\alpha)$ by
$$S(\alpha) = \frac{1}{\sqrt2}\begin{pmatrix} -\sqrt{1-\alpha} & \sqrt{1+\alpha} \\ \sqrt{1+\alpha} & \sqrt{1-\alpha} \end{pmatrix}.$$
Since $\alpha^\dagger = \alpha$, the associated $\rho^L$ and $\rho^R$ coincide and will be denoted by $\rho$. We therefore have
$$\Theta(\alpha) = \begin{pmatrix} \alpha & \rho \\ \rho & -\alpha \end{pmatrix}$$
and hence, by a straightforward calculation,
$$S(\alpha)\Theta(\alpha)S(\alpha)^{-1} = \frac12\begin{pmatrix} -\sqrt{1-\alpha} & \sqrt{1+\alpha} \\ \sqrt{1+\alpha} & \sqrt{1-\alpha} \end{pmatrix}\begin{pmatrix} \alpha & \rho \\ \rho & -\alpha \end{pmatrix}\begin{pmatrix} -\sqrt{1-\alpha} & \sqrt{1+\alpha} \\ \sqrt{1+\alpha} & \sqrt{1-\alpha} \end{pmatrix}$$

$$= \begin{pmatrix} -\mathbf{1} & 0 \\ 0 & \mathbf{1} \end{pmatrix}.$$
Thus, if we define
$$\mathcal{S} = \mathbf{1}\oplus S(\alpha_1)\oplus S(\alpha_3)\oplus\cdots$$
and
$$\mathcal{R} = \mathbf{1}\oplus(-\mathbf{1})\oplus\mathbf{1}\oplus(-\mathbf{1})\oplus\cdots,$$
it follows that (see (3.69))
$$\mathcal{S}\mathcal{M}\mathcal{S}^\dagger = \mathcal{R}.$$
The matrix $\mathcal{LM}+\mathcal{ML}$ is unitarily equivalent to
$$\mathcal{A} = \mathcal{S}(\mathcal{LM}+\mathcal{ML})\mathcal{S}^\dagger = \mathcal{S}\mathcal{L}\mathcal{S}^\dagger\mathcal{R} + \mathcal{R}\,\mathcal{S}\mathcal{L}\mathcal{S}^\dagger.$$
Observe that $\mathcal{A}$ is the direct sum of two block Jacobi matrices. Indeed, it follows quickly from the explicit form of $\mathcal{R}$ that the even-odd and odd-even block entries of $\mathcal{A}$ vanish. Consequently, $\mathcal{A}$ is the direct sum of its odd-odd and even-even block entries. We will call these two block Jacobi matrices $J$ and $\tilde J$, respectively.

Consider $\mathcal{C}$ as an operator on $\mathcal{H}_v$. Then $d\mu_C$ is the spectral measure of $\mathcal{C}$ in the following sense:
$$[\mathcal{C}^m]_{0,0} = \int e^{im\theta}\, d\mu(\theta);$$
see (3.83). Then, by the spectral theorem, $d\mu_C(-\theta)$ is the spectral measure of $\mathcal{C}^{-1}$ and so $d\mu_I$ is the spectral measure of $\mathcal{C}+\mathcal{C}^{-1} = \mathcal{LM}+(\mathcal{LM})^{-1} = \mathcal{LM}+\mathcal{ML}$. Since $\mathcal{S}$ leaves $(\mathbf{1}\;0\;0\;0\;\cdots)^t$ invariant, we see that $d\mu_I$ is the spectral measure of $J$.

To determine the block entries of $J$, we only need to compute the odd-odd block entries of $\mathcal{A}$. For $k\ge 0$, we have
$$\mathcal{A}_{2k+1,2k+1} = \begin{pmatrix} \sqrt{1+\alpha_{2k-1}} & \sqrt{1-\alpha_{2k-1}} \end{pmatrix}\begin{pmatrix} -\alpha_{2k-2} & 0 \\ 0 & \alpha_{2k} \end{pmatrix}\begin{pmatrix} \sqrt{1+\alpha_{2k-1}} \\ \sqrt{1-\alpha_{2k-1}} \end{pmatrix}$$
$$= \sqrt{1-\alpha_{2k-1}}\;\alpha_{2k}\;\sqrt{1-\alpha_{2k-1}} - \sqrt{1+\alpha_{2k-1}}\;\alpha_{2k-2}\;\sqrt{1+\alpha_{2k-1}}$$
and
$$\mathcal{A}_{2k+1,2k+3} = \begin{pmatrix} \sqrt{1+\alpha_{2k-1}} & \sqrt{1-\alpha_{2k-1}} \end{pmatrix}\begin{pmatrix} 0 & 0 \\ \rho_{2k} & 0 \end{pmatrix}\begin{pmatrix} \sqrt{1+\alpha_{2k+1}} \\ \sqrt{1-\alpha_{2k+1}} \end{pmatrix} = \sqrt{1-\alpha_{2k-1}}\;\sqrt{1-\alpha_{2k}^2}\;\sqrt{1+\alpha_{2k+1}}.$$
The result follows. $\square$

As in [127, 168], one can also use this to get Geronimus relations for the second Szegő map.
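The Geronimus relations (4.4)–(4.5) are easy to implement in the scalar case, where all factors commute. A useful sanity check: $\alpha_n \equiv 0$ (Lebesgue measure on the circle) maps to the arcsine measure on $[-2,2]$ with $B_k = 0$, $A_1 = \sqrt2$, and $A_k = 1$ for $k\ge 2$; and for any real $\alpha$'s in $(-1,1)$ the resulting truncated Jacobi matrix has spectrum inside $[-2,2]$, since $\operatorname{supp}(d\mu_I)\subset[-2,2]$.

```python
import numpy as np

def geronimus(alphas):
    # scalar Geronimus relations (4.4)-(4.5), with alpha_{-1} = -1; the
    # alpha_{2k-2} term at k = 0 is multiplied by sqrt(1+alpha_{-1}) = 0,
    # so alpha_{-2} = 0 is a harmless placeholder
    a = {-1: -1.0, -2: 0.0}
    a.update(enumerate(alphas))
    K = (len(alphas) - 1) // 2
    B = np.array([np.sqrt(1 - a[2*k-1]) * a[2*k] * np.sqrt(1 - a[2*k-1])
                  - np.sqrt(1 + a[2*k-1]) * a[2*k-2] * np.sqrt(1 + a[2*k-1])
                  for k in range(K)])
    A = np.array([np.sqrt(1 - a[2*k-1]) * np.sqrt(1 - a[2*k]**2)
                  * np.sqrt(1 + a[2*k+1]) for k in range(K)])
    return A, B

# free case: alpha_n = 0 gives the Chebyshev (arcsine) measure on [-2, 2]
A, B = geronimus([0.0] * 11)
assert np.allclose(B, 0) and np.isclose(A[0], np.sqrt(2)) and np.allclose(A[1:], 1)

# generic real alphas: the resulting Jacobi truncation has spectrum in [-2, 2]
rng = np.random.default_rng(1)
A, B = geronimus(rng.uniform(-0.9, 0.9, 21))
J = np.diag(B) + np.diag(A[:-1], 1) + np.diag(A[:-1], -1)
ev = np.linalg.eigvalsh(J)
assert ev.min() > -2 - 1e-9 and ev.max() < 2 + 1e-9
```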

5. Regular MOPRL

The theory of regular (scalar) OPs was developed by Stahl–Totik [180] generalizing a definition of Ullman [190] for $[-1,1]$. (See Simon [172] for a review and more about the history.) Here we develop the basics for MOPRL; it is not hard to do the same for MOPUC.

5.1. Upper Bound and Definition.

Theorem 5.1. Let $d\mu$ be a non-trivial $l\times l$ matrix-valued measure on $\mathbb{R}$ with $E = \operatorname{supp}(d\mu)$ compact. Then (with $C(E)$ = logarithmic capacity of $E$)
$$\limsup_{n\to\infty} |\det(A_1\cdots A_n)|^{1/n} \le C(E)^l. \qquad(5.1)$$

Remarks. 1. $|\det(A_1\cdots A_n)|$ is constant over equivalent Jacobi parameters.
2. For the scalar case, this is a result of Widom [195] (it might be older) whose proof extends to the matrix case.

Proof. Let $T_n$ be the Chebyshev polynomials for $E$ (see [172, Appendix B] for a definition) and let $T_n^{(l)}$ be $T_n\otimes\mathbf{1}$, that is, the $l\times l$ matrix polynomial obtained by multiplying $T_n(x)$ by $\mathbf{1}$. $T_n^{(l)}$ is monic so, by (2.12) and (2.34),
$$|\det(A_1\cdots A_n)|^{1/n} \le \Bigl(\int |\det T_n^{(l)}(x)|^2\, d\mu(x)\Bigr)^{1/2n} \le \sup_x |T_n(x)|^{l/n}.$$
By a theorem of Szegő [183], $\sup_x |T_n(x)|^{1/n} \to C(E)$, so (5.1) follows. $\square$
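Both steps of this proof can be illustrated numerically in the scalar case on $E=[-2,2]$, where $C(E)=1$. For the arcsine measure $d\mu(x)=dx/(\pi\sqrt{4-x^2})$ the Jacobi parameters are $a_1=\sqrt2$ and $a_n=1$ for $n\ge 2$ (a standard fact, assumed here), so $|a_1\cdots a_n|^{1/n}\to 1 = C(E)$ and (5.1) holds with equality; the monic Chebyshev polynomial of $E$ is $T_n(x)=2\cos(n\arccos(x/2))$ with sup norm $2$.

```python
import numpy as np

# E = [-2, 2], C(E) = 1; arcsine measure has a_1 = sqrt(2), a_n = 1 (n >= 2)
n = 40
prod_a = np.sqrt(2.0)                      # a_1 * ... * a_n, independent of n
assert abs(prod_a**(1.0 / n) - 1.0) < 0.01  # -> C(E) = 1: equality in (5.1)

# Chebyshev step: the L^2(mu) norm of the monic Chebyshev polynomial
# T_n(x) = 2 cos(n arccos(x/2)) equals a_1...a_n, and is <= sup|T_n| = 2;
# quadrature via x = 2 cos(theta), d(mu) = d(theta)/pi on (0, pi)
theta = (np.arange(20000) + 0.5) * np.pi / 20000
l2 = np.sqrt(np.mean((2 * np.cos(n * theta))**2))
assert abs(l2 - prod_a) < 1e-3
assert l2 <= 2.0
```

The middle inequality of the proof ($a_1\cdots a_n \le \|T_n^{(l)}\|_{L^2(d\mu)}$) is here an equality because the monic Chebyshev polynomial is the monic orthogonal polynomial for the arcsine measure.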

Definition. Let $d\mu$ be a non-trivial $l\times l$ matrix-valued measure with $E = \operatorname{supp}(d\mu)$ compact. We say $\mu$ is regular if equality holds in (5.1).

5.2. Density of Zeros. The following is a simple extension of the scalar results (see [172, Sect. 2]):

Theorem 5.2. Let $d\mu$ be a regular measure with $E = \operatorname{supp}(d\mu)$. Let $d\nu_n$ be the zero counting measure for $\det(p^L_n(x))$, that is, if $\{x_j^{(n)}\}_{j=1}^{nl}$ are the zeros of this determinant (counting degenerate zeros multiply), then
$$d\nu_n = \frac{1}{nl}\sum_{j=1}^{nl}\delta_{x_j^{(n)}}. \qquad(5.2)$$
Then $d\nu_n$ converges weakly to $d\rho_E$, the equilibrium measure for $E$.

Remark. For a discussion of capacity, equilibrium measure, quasi-every (q.e.), etc., see [172, Appendix A] and [115, 136, 158].

Proof. By (2.80) and (2.93),
$$|\det(p^R_n(x))| \ge \Bigl(\frac{d}{D}\Bigr)^l\Bigl(1+\frac{d^2}{D^2}\Bigr)^{(n-1)l/2} \qquad(5.3)$$
so, in particular,
$$\liminf_n |\det(p^R_n(x))|^{1/nl} \ge \Bigl(1+\frac{d^2}{D^2}\Bigr)^{1/2} \ge 1. \qquad(5.4)$$
But
$$|\det(p^R_n(x))| = |\det(A_1\cdots A_n)|^{-1}\exp(-nl\,\Phi_{\nu_n}(x)) \qquad(5.5)$$
where $\Phi_\nu$ is the potential of the measure $\nu$. Let $\nu_\infty$ be a limit point of $\nu_n$ and use equality in (5.1) and (5.4) to conclude, for $x\notin\operatorname{cvh}(E)$,
$$\exp(-\Phi_{\nu_\infty}(x)) \ge C(E)$$
which, as in the proof of Theorem 2.4 of [172], implies that $\nu_\infty = \rho_E$. $\square$

The analogue of the almost converse of this last theorem has an extra subtlety relative to the scalar case:

Theorem 5.3. Let $d\mu$ be a non-trivial $l\times l$ matrix-valued measure on $\mathbb{R}$ with $E = \operatorname{supp}(d\mu)$ compact. If $d\nu_n\to d\rho_E$, then either $\mu$ is regular or else, with $d\mu = M(x)\, d\mu_{\mathrm{tr}}(x)$, there is a set $S$ of capacity zero, so that $\det(M(x)) = 0$ for $d\mu_{\mathrm{tr}}$-a.e. $x\notin S$.

Remark. By taking direct sums of regular point measures, it is easy to find regular measures where $\det(M(x)) = 0$ for $d\mu_{\mathrm{tr}}$-a.e. $x$.

Proof. For a.e. $x$ with $\det(M(x)) \ne 0$, we have (see Lemma 5.7 below)
$$\|p^R_n(x)\| \le C(n+1).$$
The theorem then follows from the proof of Theorem 2.5 of [172]. $\square$

5.3. General Asymptotics. The following generalizes Theorem 1.10 of [172] from OPRL to MOPRL—it is the matrix analogue of a basic result of Stahl–Totik [180]. Its proof is essentially the same as in [172]. By $\sigma_{\mathrm{ess}}(\mu)$, we mean the essential spectrum of the block Jacobi matrix associated to $\mu$.

Theorem 5.4. Let $E\subset\mathbb{R}$ be compact and let $\mu$ be an $l\times l$ matrix-valued measure of compact support with $\sigma_{\mathrm{ess}}(\mu) = E$. Then the following are equivalent:
(i) $\mu$ is regular, that is, $\lim_{n\to\infty}|\det(A_1\cdots A_n)|^{1/n} = C(E)^l$.
(ii) For all $z$ in $\mathbb{C}$, uniformly on compacts,
$$\limsup |\det(p^R_n(z))|^{1/n} \le e^{lG_E(z)}. \qquad(5.6)$$

(iii) For q.e. $z$ in $E$, we have
$$\limsup |\det(p^R_n(z))|^{1/n} \le 1. \qquad(5.7)$$
Moreover, if (i)–(iii) hold, then
(iv) For every $z\in\mathbb{C}\setminus\operatorname{cvh}(\operatorname{supp}(d\mu))$, we have
$$\lim_{n\to\infty}|\det(p^R_n(z))|^{1/n} = e^{lG_E(z)}. \qquad(5.8)$$
(v) For q.e. $z\in\partial\Omega$,
$$\limsup_{n\to\infty}|\det(p^R_n(z))|^{1/n} = 1. \qquad(5.9)$$

Remarks. 1. $G_E$, the potential theorists' Green's function for $E$, is defined by
$$G_E(z) = -\log(C(E)) - \Phi_{\rho_E}(z).$$
2. There is missing here one condition from Theorem 1.10 of [172] involving general polynomials. Since the determinant of a sum can be much larger than the sum of the determinants, it is not obvious how to extend this result.

5.4. Weak Convergence of the CD Kernel and Consequences. The results of Simon in [174] extend to the matrix case. The basic result is:

Theorem 5.5. The measures $d\nu_n$ and $\frac{1}{(n+1)l}\operatorname{Tr}(K_n(x,x))\,d\mu(x)$ have the same weak limits. In particular, if $d\mu$ is regular,
\[
\frac{1}{(n+1)l}\operatorname{Tr}(K_n(x,x))\,d\mu(x) \xrightarrow{\;w\;} d\rho_E. \tag{5.10}
\]

As in [174], $(\pi_n M_x \pi_n)^j$ and $\pi_n M_x^j \pi_n$ have a difference of traces which is bounded as $n \to \infty$, and this implies the result. Once one has this, combining it with Theorem 2.20 leads to:

Theorem 5.6. Let $I = [a,b] \subset E \subset \mathbb{R}$ with $E$ compact. Let $\sigma_{\mathrm{ess}}(d\mu) = E$ for an $l \times l$ matrix-valued measure $\mu$, and suppose $\mu$ is regular for $E$ and
\[
d\mu = W(x)\,dx + d\mu_s \tag{5.11}
\]
where $d\mu_s$ is singular and $\det(W(x)) > 0$ for a.e. $x \in I$. Then
\[
(1)\quad \int_I p_n^R(x)^\dagger \, d\mu_s(x)\, p_n^R(x) \to 0 \quad \text{as } n \to \infty, \tag{5.12}
\]
\[
(2)\quad \int_I \Bigl\| \frac{1}{n+1}\sum_{j=0}^{n} p_j^R(x)^\dagger W(x) p_j^R(x) - \rho_E(x)\mathbf{1} \Bigr\| \, dx \to 0. \tag{5.13}
\]
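The weak convergence in (5.10) can be probed numerically in the scalar case $l = 1$. A sketch, assuming $d\mu = (2\pi)^{-1}\sqrt{4-x^2}\,dx$ on $[-2,2]$ (the free Jacobi measure, which is regular, with $p_j(2\cos t) = \sin((j+1)t)/\sin t$); $d\rho_E$ is then the arcsine law $dx/(\pi\sqrt{4-x^2})$, whose second moment equals $2$. The function name `cd_avg` and the quadrature grid are illustrative:

```python
import math

def cd_avg(theta, n):
    """(n+1)^{-1} K_n(x,x) at x = 2*cos(theta) for the free Jacobi matrix."""
    s = math.sin(theta)
    return sum(math.sin((j + 1) * theta) ** 2 for j in range(n + 1)) / ((n + 1) * s * s)

n, m = 100, 4000
total = 0.0
for i in range(m):
    t = (i + 0.5) * math.pi / m                      # midpoint rule in theta
    x = 2.0 * math.cos(t)
    w = math.sqrt(4.0 - x * x) / (2.0 * math.pi)     # semicircle weight of d(mu)
    # integrate x^2 against (n+1)^{-1} K_n(x,x) d(mu), with dx = 2 sin(t) dt
    total += (x ** 2) * cd_avg(t, n) * w * 2.0 * math.sin(t) * (math.pi / m)
print(total)   # should be close to the arcsine second moment, 2
```

The computed moment approaches $\int x^2\,d\rho_E = 2$ as $n$ grows, as (5.10) predicts.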


5.5. Widom’s Theorem.

Lemma 5.7. Let $d\mu$ be an $l \times l$ matrix-valued measure supported on a compact $E \subset \mathbb{R}$ and let $d\eta$ be a scalar measure on $\mathbb{R}$ so that
\[
d\mu(x) = W(x)\,d\eta(x) + d\mu_s(x) \tag{5.14}
\]
where $d\mu_s$ is $d\eta$-singular. Suppose for $\eta$-a.e. $x$,
\[
\det(W(x)) > 0. \tag{5.15}
\]
Then for $\eta$-a.e. $x$, there is a positive real function $C(x)$ so that
\[
\|p_n^R(x)\| \le C(x)(n+1). \tag{5.16}
\]
In particular,
\[
|\det(p_n^R(x))| \le C(x)^l (n+1)^l. \tag{5.17}
\]

Proof. Since $\|p_n^R\|_R = 1$, we have
\[
\sum_{n=0}^{\infty} (n+1)^{-2} \|p_n^R\|_R^2 < \infty \tag{5.18}
\]
so
\[
\sum_{n=0}^{\infty} (n+1)^{-2} \operatorname{Tr}(p_n^R(x)^\dagger W(x) p_n^R(x)) < \infty
\]
for $\eta$-a.e. $x$. Since (5.15) holds, for a.e. $x$,
\[
W(x) \ge b(x)\mathbf{1}
\]
for some positive scalar function $b(x)$. Thus, for a.e. $x$,
\[
\sum_{n=0}^{\infty} (n+1)^{-2} \operatorname{Tr}(p_n^R(x)^\dagger p_n^R(x)) \le C(x)^2.
\]
Since $\|A\|^2 \le \operatorname{Tr}(A^\dagger A)$, we find (5.16), which in turn implies (5.17). □
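The last step of the proof uses the elementary inequality $\|A\|^2 \le \operatorname{Tr}(A^\dagger A)$: the operator norm is dominated by the Frobenius norm, since the largest singular value squared is at most the sum of all singular values squared. A quick numerical check (numpy assumed available):

```python
import numpy as np

rng = np.random.default_rng(0)
# a random complex 4x4 matrix A
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

op_norm_sq = np.linalg.norm(A, ord=2) ** 2    # largest singular value, squared
trace_val = np.trace(A.conj().T @ A).real     # sum of all singular values squared
assert op_norm_sq <= trace_val + 1e-12
print(op_norm_sq, trace_val)
```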

This lemma replaces Lemma 4.1 of [172]; the proof of Theorem 1.12 there then extends to give a matrix version of a theorem of Widom [195]:

Theorem 5.8. Let $d\mu$ be an $l \times l$ matrix-valued measure with $\sigma_{\mathrm{ess}}(d\mu) = E \subset \mathbb{R}$ compact. Suppose
\[
d\mu(x) = W(x)\,d\rho_E(x) + d\mu_s(x) \tag{5.19}
\]
with $d\mu_s$ singular with respect to $d\rho_E$. Suppose for $d\rho_E$-a.e. $x$, $\det(W(x)) > 0$. Then $\mu$ is regular.

5.6. A Conjecture. We end our discussion of regular MOPRL with a conjecture, an analog of a theorem of Stahl–Totik [180]; see also Theorem 1.13 of [172] for a proof and references. We expect the key will be some kind of matrix Remez inequality. For direct sums, this conjecture follows from Theorem 1.13 of [172].

Conjecture 5.9. Let $E$ be a finite union of disjoint closed intervals in $\mathbb{R}$. Suppose $\mu$ is an $l \times l$ matrix-valued measure on $\mathbb{R}$ with $\sigma_{\mathrm{ess}}(d\mu) = E$. For each $\eta > 0$ and $m = 1, 2, \dots$, define
\[
S_{m,\eta} = \bigl\{ x : \mu\bigl([x - \tfrac{1}{m},\, x + \tfrac{1}{m}]\bigr) \ge e^{-\eta m}\mathbf{1} \bigr\}. \tag{5.20}
\]
Suppose that for each $\eta$ (with $|\cdot| =$ Lebesgue measure)
\[
\lim_{m\to\infty} |E \setminus S_{m,\eta}| = 0.
\]
Then $\mu$ is regular.
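To see what the sets $S_{m,\eta}$ look like, take the scalar example $d\mu = dx$ restricted to $E = [0,1]$: then $\mu([x - \tfrac{1}{m}, x + \tfrac{1}{m}]) \ge 1/m$ for every $x \in E$, which eventually dominates $e^{-\eta m}$, so $|E \setminus S_{m,\eta}| \to 0$ and the hypothesis of the conjecture holds. A sketch (the grid and names are illustrative, not from the text):

```python
import math

def mass(x, m):
    """mu([x - 1/m, x + 1/m]) for mu = Lebesgue measure restricted to [0, 1]."""
    return max(0.0, min(x + 1.0 / m, 1.0) - max(x - 1.0 / m, 0.0))

eta, m = 0.1, 100
threshold = math.exp(-eta * m)                 # e^{-eta*m}, the cutoff in (5.20)
grid = [i / 1000 for i in range(1001)]         # sample points of E = [0, 1]
in_S = [mass(x, m) >= threshold for x in grid]
print(all(in_S))   # every sampled x lies in S_{m,eta} once m is large
```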

References [1] R. Ackner, H. Lev-Ari, and T. Kailath, The Schur algorithm for matrix-valued meromorphic functions, SIAM J. Matrix Anal. Appl. 15 (1994), 140–150. [2] N. Akhiezer and I. Glazman, Theory of Linear Operators in Hilbert Space, Dover Publications, New York (1993). [3] D. Alpay, The Schur Algorithm, Reproducing Kernel Spaces and System Theory, SMF/AMS Texts and Monographs 5, American Mathematical Society, Providence, R.I.; Société Mathématique de France, Paris (2001). [4] A. Aptekarev and E. Nikishin, The scattering problem for a discrete Sturm–Liouville operator, Mat. Sb. 121(163) (1983), 327–358. [5] F. V. Atkinson, Discrete and Continuous Boundary Problems, Mathematics in Science and Engineering 8, Academic Press, New York-London (1964). [6] M. Bakonyi and T. Constantinescu, Schur's Algorithm and Several Applications, Pitman Research Notes in Math. 261, Longman, Essex, U.K. (1992). [7] S. Basu and N. K. Bose, Matrix Stieltjes series and network models, SIAM J. Math. Anal. 14 (1983), 209–222. [8] E. D. Belokolos, F. Gesztesy, K. A. Makarov, and L. A. Sakhnovich, Matrix-valued generalizations of the theorems of Borg and Hochstadt, Evolution Equations (G. Ruiz Goldstein et al., eds.), pp. 1–34, Lecture Notes in Pure and Applied Mathematics 234, Marcel Dekker, New York (2003). [9] Ju. M. Berezans'ki, Expansions in Eigenfunctions of Selfadjoint Operators, Translations of Mathematical Monographs 17, American Mathematical Society, R.I. (1968). [10] C. Berg, The matrix moment problem, to appear in Coimbra Lecture Notes on Orthogonal Polynomials (A. Foulquié Moreno and Amilcar Branquinho, eds.). [11] C. Berg and A. J. Durán, Measures with finite index of determinacy or a mathematical model for Dr. Jekyll and Mr. Hyde, Proc. Amer. Math. Soc. 125 (1997), 523–530.

[12] C. Berg and A. J. Durán, Orthogonal polynomials and analytic functions associated to positive definite matrices, J. Math. Anal. Appl. 315 (2006), 54–67. [13] R. Bhatia, Matrix Analysis, Graduate Texts in Mathematics 169, Springer-Verlag, New York (1997). [14] M. J. Cantero, M. P. Ferrer, L. Moral, and L. Velázquez, A connection between orthogonal polynomials on the unit circle and matrix orthogonal polynomials on the real line, J. Comput. Appl. Math. 154 (2003), 247–272. [15] M. J. Cantero, L. Moral, and L. Velázquez, Measures and para-orthogonal polynomials on the unit circle, East J. Approx. 8 (2002), 447–464. [16] M. J. Cantero, L. Moral, and L. Velázquez, Differential properties of matrix orthogonal polynomials, J. Concr. Appl. Math. 3 (2005), 313–334. [17] M. J. Cantero, L. Moral, and L. Velázquez, Matrix orthogonal polynomials whose derivatives are also orthogonal, J. Approx. Theory 146 (2007), 174–211. [18] E. Cartan, Sur les domaines bornés homogènes de l'espace de n variables complexes, Abh. Math. Sem. Univ. Hamburg 11 (1935), 116–162. [19] M. Castro and F. A. Grünbaum, Orthogonal matrix polynomials satisfying first order differential equations: a collection of instructive examples, J. Nonlinear Math. Physics 12 (2005), 63–76. [20] M. Castro and F. A. Grünbaum, The algebra of differential operators associated to a family of matrix valued orthogonal polynomials: five instructive examples, Int. Math. Res. Not. 2006:7 (2006), 1–33. [21] M. Castro and F. A. Grünbaum, The noncommutative bispectral problem for operators of order one, to appear in Constr. Approx. [22] G. Chen and Y. Hu, The truncated Hamburger matrix moment problems in the nondegenerate and degenerate cases, and matrix continued fractions, Linear Algebra Appl. 277 (1998), 199–236. [23] T. S. Chihara, An Introduction to Orthogonal Polynomials, Mathematics and Its Applications 13, Gordon and Breach, New York-London-Paris (1978). [24] S. Clark and F.
Gesztesy, Weyl-Titchmarsh M-function asymptotics for matrix-valued Schrödinger operators, Proc. London Math. Soc. 82 (2001), 701–724. [25] S. Clark and F. Gesztesy, Weyl–Titchmarsh M-function asymptotics, local uniqueness results, trace formulas, and Borg-type theorems for Dirac operators, Trans. Amer. Math. Soc. 354 (2002), 3475–3534. [26] S. Clark and F. Gesztesy, On Povzner–Wienholtz-type self-adjointness results for matrix-valued Sturm–Liouville operators, Proc. Math. Soc. Edinburgh 133A (2003), 747–758. [27] S. Clark and F. Gesztesy, On Weyl–Titchmarsh theory for singular finite difference Hamiltonian systems, J. Comput. Appl. Math. 171 (2004), 151–184. [28] S. Clark, F. Gesztesy, H. Holden, and B. M. Levitan, Borg-type theorems for matrix-valued Schrödinger operators, J. Diff. Eqs. 167 (2000), 181–210. [29] S. Clark, F. Gesztesy, and W. Renger, Trace formulas and Borg-type theorems for matrix-valued Jacobi and Dirac finite difference operators, J. Diff. Eqs. 219 (2005), 144–182.

[30] S. Clark, F. Gesztesy, and M. Zinchenko, Weyl–Titchmarsh theory and Borg–Marchenko-type uniqueness results for CMV operators with matrix-valued Verblunsky coefficients, to appear in Operators and Matrices. [31] W. Craig and B. Simon, Log Hölder continuity of the integrated density of states for stochastic Jacobi matrices, Comm. Math. Phys. 90 (1983), 207–218. [32] D. Damanik and R. Killip, Half-line Schrödinger operators with no bound states, Acta Math. 193 (2004), 31–72. [33] D. Damanik, R. Killip, and B. Simon, Perturbations of orthogonal polynomials with periodic recursion coefficients, preprint. [34] P. Dean and J. L. Martin, A method for determining the frequency spectra of disordered lattices in two-dimensions, Proc. Roy. Soc. (London) A 259 (1960), 409–418. [35] A. M. Delgado, J. S. Geronimo, P. Iliev, and F. Marcellán, Two variable orthogonal polynomials and structured matrices, SIAM J. Matrix Anal. Appl. 28 (2006), 118–147 [electronic]. [36] P. Delsarte and Y. V. Genin, On a generalization of the Szegő–Levinson recurrence and its application in lossless inverse scattering, IEEE Trans. Inform. Theory 38 (1992), 104–110. [37] P. Delsarte, Y. V. Genin, and Y. G. Kamp, Orthogonal polynomial matrices on the unit circle, IEEE Trans. Circuits and Systems CAS-25 (1978), 149–160. [38] P. Delsarte, Y. V. Genin, and Y. G. Kamp, Planar least squares inverse polynomials, I. Algebraic properties, IEEE Trans. Circuits and Systems CAS-26 (1979), 59–66. [39] P. Delsarte, Y. V. Genin, and Y. G. Kamp, The Nevanlinna–Pick problem for matrix-valued functions, SIAM J. Appl. Math. 36 (1979), 47–61. [40] P. Delsarte, Y. V. Genin, and Y. G. Kamp, Schur parametrization of positive definite block-Toeplitz systems, SIAM J. Appl. Math. 36 (1979), 34–46. [41] P. Delsarte, Y. V. Genin, and Y. G. Kamp, Generalized Schur representation of matrix-valued functions, SIAM J. Algebraic Discrete Methods 2 (1981), 94–107. [42] S.
Denisov, On Rakhmanov's theorem for Jacobi matrices, Proc. Amer. Math. Soc. 132 (2004), 847–852. [43] H. Dette and W. J. Studden, Matrix measures, moment spaces and Favard's theorem for the interval [0, 1] and [0, ∞), Linear Algebra Appl. 345 (2002), 169–193. [44] V. Dubovoj, B. Fritzsche, and B. Kirstein, Matricial Version of the Classical Schur Problem, Teubner Texts in Mathematics 129, Teubner Verlag, Stuttgart (1992). [45] A. J. Durán, A generalization of Favard's theorem for polynomials satisfying a recurrence relation, J. Approx. Theory 74 (1993), 83–109. [46] A. J. Durán, On orthogonal polynomials with respect to a positive definite matrix of measures, Canad. J. Math. 47 (1995), 88–112. [47] A. J. Durán, Markov's theorem for orthogonal matrix polynomials, Canad. J. Math. 48 (1996), 1180–1195. [48] A. J. Durán, Matrix inner product having a matrix symmetric second order differential operator, Rocky Mountain J. Math. 27 (1997), 585–600.

[49] A. J. Durán, Ratio asymptotics for orthogonal matrix polynomials, J. Approx. Theory 100 (1999), 304–344. [50] A. J. Durán, How to find weight matrices having symmetric second order differential operators with matrix leading coefficient, preprint. [51] A. J. Durán, Generating weight matrices having symmetric second order differential operators from a trio of upper triangular matrices, preprint. [52] A. J. Durán and E. Daneri, Ratio asymptotics for orthogonal matrix polynomials with unbounded recurrence coefficients, J. Approx. Theory 110 (2001), 1–17. [53] A. J. Durán and E. Daneri, Weak convergence for orthogonal matrix polynomials, Indag. Math. (N.S.) 13 (2002), 47–62. [54] A. J. Durán and E. Defez, Orthogonal matrix polynomials and quadrature formulas, Linear Algebra Appl. 345 (2002), 71–84. [55] A. J. Durán and F. A. Grünbaum, Orthogonal matrix polynomials satisfying second order differential equations, Int. Math. Res. Not. 2004:10 (2004), 461–484. [56] A. J. Durán and F. A. Grünbaum, A survey on orthogonal matrix polynomials satisfying second order differential equations, J. Comput. Appl. Math. 178 (2005), 169–190. [57] A. J. Durán and F. A. Grünbaum, Structural formulas for orthogonal matrix polynomials satisfying second order differential equations, I, Constr. Approx. 22 (2005), 255–271. [58] A. J. Durán and F. A. Grünbaum, Orthogonal matrix polynomials, scalar type Rodrigues' formulas and Pearson equations, J. Approx. Theory 134 (2005), 267–280. [59] A. J. Durán and F. A. Grünbaum, A characterization for a class of weight matrices with orthogonal matrix polynomials satisfying second order differential equations, Int. Math. Res. Not. 23 (2005), 1371–1390. [60] A. J. Durán and F. A. Grünbaum, P. A. M. Dirac meets M. G. Krein: matrix orthogonal polynomials and Dirac's equation, J. Phys. A: Math. Gen. 39 (2006), 3655–3662. [61] A. J. Durán and F. A.
Grünbaum, Matrix orthogonal polynomials satisfying second order differential equations: coping without help from group representation theory, to appear in J. Approx. Theory. [62] A. J. Durán and F. A. Grünbaum, Matrix differential equations and scalar polynomials satisfying higher order recursions, preprint. [63] A. Durán, F. A. Grünbaum, I. Pacharoni, and J. A. Tirao, Matrix orthogonal polynomials and differential operators, with examples and applications, in preparation. [64] A. J. Durán and M. D. de la Iglesia, Some examples of orthogonal matrix polynomials satisfying odd order differential equations, to appear in J. Approx. Theory. [65] A. J. Durán and M. D. de la Iglesia, Second order differential operators having several families of orthogonal matrix polynomials as eigenfunctions, preprint. [66] A. J. Durán and P. López-Rodríguez, Orthogonal matrix polynomials: zeros and Blumenthal's theorem, J. Approx. Theory 84 (1996), 96–118.

[67] A. J. Durán and P. López-Rodríguez, The Lp space of a positive definite matrix of measures and density of matrix polynomials in L1, J. Approx. Theory 90 (1997), 299–318. [68] A. J. Durán and P. López-Rodríguez, Density questions for the truncated matrix moment problem, Canad. J. Math. 49 (1997), 708–721. [69] A. J. Durán and P. López-Rodríguez, N-extremal matrices of measures for an indeterminate matrix moment problem, J. Funct. Anal. 174 (2000), 301–321. [70] A. J. Durán and P. López-Rodríguez, The matrix moment problem, Margarita Mathematica en memoria de José Javier Guadalupe (L. Español and J. L. Varona, eds.), pp. 333–348, Universidad de La Rioja, Logroño (2001). [71] A. J. Durán and P. López-Rodríguez, Orthogonal matrix polynomials, Laredo's SIAG Lecture Notes (R. Álvarez-Nodarse et al., eds.), pp. 11–43, Advances in the Theory of Special Functions and Orthogonal Polynomials, Nova Science Publishers 1 (2003). [72] A. J. Durán and P. López-Rodríguez, Structural formulas for orthogonal matrix polynomials satisfying second order differential equations, I, Constr. Approx. 26 (2007), 29–47. [73] A. J. Durán and P. López-Rodríguez, Structural formulas for orthogonal matrix polynomials satisfying second order differential equations, II, Constr. Approx. 26 (2007), 29–47. [74] A. J. Durán, P. López-Rodríguez, and E. B. Saff, Zero asymptotic behaviour for orthogonal matrix polynomials, J. d'Analyse Math. 78 (1999), 37–60. [75] A. J. Durán and B. Polo, Gaussian quadrature formulae for matrix weights, Linear Algebra Appl. 355 (2002), 119–146. [76] A. J. Durán and B. Polo, Matrix Christoffel functions, Constr. Approx. 20 (2004), 353–376. [77] A. J. Durán and W. Van Assche, Orthogonal matrix polynomials and higher-order recurrence relations, Linear Algebra Appl. 219 (1995), 261–280. [78] H. Dym, J Contractive Matrix Functions, Reproducing Kernel Hilbert Spaces and Interpolation, CBMS Regional Conference Series in Math.
71, American Mathematical Society, Providence, R.I. (1989). [79] Yu. M. Dyukarev, Deficiency numbers of symmetric operators generated by block Jacobi matrices, Sb. Math. 197 (2006), 1177–1203. [80] T. Erdélyi, P. Nevai, J. Zhang, and J. Geronimo, A simple proof of "Favard's theorem" on the unit circle, Atti Sem. Mat. Fis. Univ. Modena 39 (1991), 551–556. [81] L. Faybusovich and M. Gekhtman, On Schur flows, J. Phys. A 32 (1999), 4671–4680. [82] G. Freud, Orthogonal Polynomials, Pergamon Press, Oxford-New York (1971). [83] P. A. Fuhrmann, Orthogonal matrix polynomials and system theory, Rend. Sem. Mat. Univ. Politec. Torino 1987, Special Issue, 68–124. [84] J. S. Geronimo, Matrix orthogonal polynomials on the unit circle, J. Math. Phys. 22 (1981), 1359–1365. [85] J. S. Geronimo, Scattering theory and matrix orthogonal polynomials on the real line, Circuits Systems Signal Process 1 (1982), 471–495. [86] J. S. Geronimo, Polynomials orthogonal on the unit circle with random recurrence coefficients, Methods of Approximation Theory in Complex Analysis

and Mathematical Physics (Leningrad, 1991), pp. 43–61, Lecture Notes in Math. 1550, Springer, Berlin (1993). [87] J. S. Geronimo and K. M. Case, Scattering theory and polynomials orthogonal on the unit circle, J. Math. Phys. 20 (1979), 299–310. [88] J. S. Geronimo and H. Woerdeman, Two variable orthogonal polynomials on the bicircle and structured matrices, SIAM J. Matrix Anal. Appl. 29 (2007), 796–825. [89] Ya. L. Geronimus, On polynomials orthogonal on the circle, on trigonometric moment problem, and on allied Carathéodory and Schur functions, Mat. Sb. 15 (1944), 99–130 [Russian]. [90] Ya. L. Geronimus, On the trigonometric moment problem, Ann. of Math. (2) 47 (1946), 742–761. [91] Ya. L. Geronimus, Polynomials Orthogonal on a Circle and Their Applications, Amer. Math. Soc. Translation 1954 (1954), no. 104, 79 pp. [92] Ya. L. Geronimus, Orthogonal Polynomials: Estimates, Asymptotic Formulas, and Series of Polynomials Orthogonal on the Unit Circle and on an Interval, Consultants Bureau, New York (1961). [93] Ya. L. Geronimus, Orthogonal polynomials, Engl. translation of the appendix to the Russian translation of Szegő's book [184], in "Two Papers on Special Functions," Amer. Math. Soc. Transl., Ser. 2, Vol. 108, pp. 37–130, American Mathematical Society, Providence, R.I. (1977). [94] F. Gesztesy, N. J. Kalton, K. A. Makarov, and E. Tsekanovskii, Some applications of operator-valued Herglotz functions, Operator Theory, System Theory and Related Topics (Beer-Sheva/Rehovot, 1997), pp. 271–321, Oper. Theory Adv. Appl. 123, Birkhäuser, Basel (2001). [95] F. Gesztesy, A. Kiselev, and K. Makarov, Uniqueness results for matrix-valued Schrödinger, Jacobi, and Dirac-type operators, Math. Nachr. 239/240 (2002), 103–145. [96] F. Gesztesy and L. A. Sakhnovich, A class of matrix-valued Schrödinger operators with prescribed finite-band spectra, Reproducing Kernel Hilbert Spaces, Positivity, System Theory and Related Topics (D. Alpay, ed.), pp.
213–253, Operator Theory: Advances and Applications 143, Birkhäuser, Basel (2003). [97] F. Gesztesy and B. Simon, On local Borg–Marchenko uniqueness results, Comm. Math. Phys. 211 (2000), 273–287. [98] F. Gesztesy and E. Tsekanovskii, On matrix-valued Herglotz functions, Math. Nachr. 218 (2000), 61–138. [99] I. Gohberg and M. Krein, Introduction to the Theory of Linear Nonselfadjoint Operators, Translations of Mathematical Monographs 18, American Mathematical Society, Providence, R.I. (1969). [100] L. Golinskii, Quadrature formula and zeros of para-orthogonal polynomials on the unit circle, Acta Math. Hungar. 96 (2002), 169–186. [101] L. Golinskii and S. Khrushchev, Cesàro asymptotics for orthogonal polynomials on the unit circle and classes of measures, J. Approx. Theory 115 (2002), 187–237. [102] F. A. Grünbaum, Matrix valued Jacobi polynomials, Bull. Sciences Math. 127 (2003), 207–214. [103] F. A. Grünbaum, Random walks and orthogonal polynomials: some challenges, preprint.

[104] F. A. Grünbaum and M. D. de la Iglesia, Matrix valued orthogonal polynomials related to SU(N + 1), their algebras of differential operators and the corresponding curves, Exp. Math. 16 (2007), 189–207. [105] F. A. Grünbaum and M. D. de la Iglesia, Matrix valued orthogonal polynomials arising from group representation theory and a family of quasi-birth-and-death processes, preprint. [106] F. A. Grünbaum and P. Iliev, A noncommutative version of the bispectral problem, J. Comput. Appl. Math. 161 (2003), 99–118. [107] F. A. Grünbaum, I. Pacharoni, and J. A. Tirao, A matrix valued solution to Bochner's problem, J. Phys. A: Math. Gen. 34 (2001), 10647–10656. [108] F. A. Grünbaum, I. Pacharoni, and J. A. Tirao, Matrix valued spherical functions associated to the three dimensional hyperbolic space, Internat. J. of Math. 13 (2002), 727–784. [109] F. A. Grünbaum, I. Pacharoni, and J. A. Tirao, Matrix valued spherical functions associated to the complex projective plane, J. Funct. Anal. 188 (2002), 350–441. [110] F. A. Grünbaum, I. Pacharoni, and J. A. Tirao, Matrix valued orthogonal polynomials of the Jacobi type, Indag. Math. 14 (2003), 353–366. [111] F. A. Grünbaum, I. Pacharoni, and J. A. Tirao, An invitation to matrix-valued spherical functions: Linearization of products in the case of complex projective space P2(C), Modern Signal Processing, pp. 147–160, Math. Sci. Res. Inst. Publ. 46, Cambridge University Press, Cambridge (2004). [112] F. A. Grünbaum, I. Pacharoni, and J. A. Tirao, Matrix valued orthogonal polynomials of the Jacobi type: The role of group representation theory, Ann. Inst. Fourier (Grenoble) 55 (2005), 2051–2068. [113] F. A. Grünbaum and J. A. Tirao, The algebra of differential operators associated to a weight matrix, Integr. Equ. Oper. Th. 58 (2007), 449–475. [114] S.
Helgason, Differential Geometry, Lie Groups, and Symmetric Spaces, corrected reprint of the 1978 original, Graduate Studies in Mathematics 34, American Mathematical Society, Providence, R.I. (2001). [115] L. L. Helms, Introduction to Potential Theory, Pure and Applied Mathematics 22, Wiley-Interscience, New York (1969). [116] H. Helson and D. Lowdenslager, Prediction theory and Fourier series in several variables, Acta Math. 99 (1958), 165–202. [117] H. Helson and D. Lowdenslager, Prediction theory and Fourier series in several variables, II, Acta Math. 106 (1961), 175–213. [118] M. E. H. Ismail, Classical and Quantum Orthogonal Polynomials in One Variable, Encyclopedia of Mathematics and its Applications 98, Cambridge University Press, Cambridge (2005). [119] W. B. Jones, O. Njåstad, and W. J. Thron, Moment theory, orthogonal polynomials, quadrature, and continued fractions associated with the unit circle, Bull. London Math. Soc. 21 (1989), 113–152. [120] T. Kailath, A view of three decades of linear filtering theory, IEEE Trans. Inform. Theory IT-20 (1974), 146–181. [121] T. Kailath, Signal processing applications of some moment problems, Moments in Mathematics (San Antonio, Tex., 1987), Proc. Sympos. Appl. Math. 37, pp. 71–109, American Mathematical Society, Providence, R.I. (1987).

[122] T. Kailath, Norbert Wiener and the development of mathematical engineering, The Legacy of Norbert Wiener: A Centennial Symposium, Proc. Sympos. Pure Math. 60, pp. 93–116, American Mathematical Society, Providence, R.I. (1997). [123] T. Kailath, A. Vieira, and M. Morf, Inverses of Toeplitz operators, innovations, and orthogonal polynomials, SIAM Rev. 20 (1978), 106–119. [124] T. Kato, Perturbation Theory for Linear Operators, second edition, Grundlehren der Mathematischen Wissenschaften 132, Springer, Berlin-New York (1976). [125] S. Khrushchev, Schur's algorithm, orthogonal polynomials, and convergence of Wall's continued fractions in L2(T), J. Approx. Theory 108 (2001), 161–248. [126] S. Khrushchev, Classification theorems for general orthogonal polynomials on the unit circle, J. Approx. Theory 116 (2002), 268–342. [127] R. Killip and I. Nenciu, Matrix models for circular ensembles, Int. Math. Res. Not. 2004:50 (2004), 2665–2701. [128] R. Killip and B. Simon, Sum rules for Jacobi matrices and their applications to spectral theory, Ann. of Math. 158 (2003), 253–321. [129] A. N. Kolmogorov, Stationary sequences in Hilbert space, Bull. Univ. Moscow 2 (1941), 40 pp. [Russian]. [130] S. Kotani and B. Simon, Stochastic Schrödinger operators and Jacobi matrices on the strip, Comm. Math. Phys. 119 (1988), 403–429. [131] M. G. Krein, On a generalization of some investigations of G. Szegő, W. M. Smirnov, and A. N. Kolmogorov, Dokl. Akad. Nauk SSSR 46 (1945), 91–94. [132] M. G. Krein, On a problem of extrapolation of A. N. Kolmogorov, Dokl. Akad. Nauk SSSR 46 (1945), 306–309. [133] M. G. Krein, Infinite J-matrices and a matrix moment problem, Dokl. Akad. Nauk SSSR 69 (1949), 125–128 [Russian]. [134] J. Lacroix, The random Schrödinger operator in a strip, Probability Measures on Groups, VII (Oberwolfach, 1983), pp. 280–297, Lecture Notes in Math. 1064, Springer, Berlin (1984). [135] H. J. Landau, Maximum entropy and the moment problem, Bull. Amer. Math. Soc.
16 (1987), 47–77. [136] N. S. Landkof, Foundations of Modern Potential Theory, Springer-Verlag, Berlin-New York (1972). [137] L. Lerer and A. C. M. Ran, A new inertia theorem for Stein equations, inertia of invertible Hermitian block Toeplitz matrices and matrix orthogonal polynomials, Integr. Equ. Oper. Th. 47 (2003), 339–360. [138] N. Levinson, The Wiener RMS (root-mean square) error criterion in filter design and prediction, J. Math. Phys. Mass. Inst. Tech. 25 (1947), 261–278. [139] P. López-Rodríguez, Riesz's theorem for orthogonal matrix polynomials, Constr. Approx. 15 (1999), 135–151. [140] P. López-Rodríguez, Nevanlinna parametrization for a matrix moment problem, Math. Scand. 89 (2001), 245–267. [141] D. S. Lubinsky, A new approach to universality limits involving orthogonal polynomials, to appear in Ann. of Math.

[142] F. Marcellán and I. Rodriguez, A class of matrix orthogonal polynomials on the unit circle, Linear Algebra Appl. 121 (1989), 233–241. [143] F. Marcellán and G. Sansigre, On a class of matrix orthogonal polynomials on the real line, Linear Algebra Appl. 181 (1993), 97–109. [144] F. Marcellán and H. O. Yakhlef, Recent trends on analytic properties of matrix orthonormal polynomials, Orthogonal Polynomials, Approximation Theory, and Harmonic Analysis (Inzel, 2000), Electron. Trans. Numer. Anal. 14 (2002), 127–141 [electronic]. [145] F. Marcellán and S. M. Zagorodnyuk, On the basic set of solutions of a high-order linear difference equation, J. Diff. Equ. Appl. 12 (2006), 213–228. [146] A. Máté and P. Nevai, Bernstein's inequality in Lp for 0 < p < 1 and (C, 1) bounds for orthogonal polynomials, Ann. of Math. 111 (1980), 145–154.

[162] M. Rosenberg, The square integrability of matrix valued functions with respect to a non-negative Hermitian measure, Duke Math. J. 31 (1964), 291–298. [163] I. Schur, Über Potenzreihen, die im Innern des Einheitskreises beschränkt sind, I, II, J. Reine Angew. Math. 147 (1917), 205–232; 148 (1918), 122–145; English translation in "I. Schur Methods in Operator Theory and Signal Processing" (I. Gohberg, ed.), pp. 31–59, 66–88, Operator Theory: Advances and Applications 18, Birkhäuser, Basel (1986). [164] H. Schulz-Baldes, Perturbation theory for Lyapunov exponents of an Anderson model on a strip, Geom. Funct. Anal. 14 (2004), 1089–1117. [165] H. Schulz-Baldes, Rotation numbers for Jacobi matrices with matrix entries, to appear in Math. Phys. Electron. J. [166] B. Schwarz and A. Zaks, Matrix Möbius transformations, Comm. Algebra 9 (1981), 1913–1968. [167] B. Simon, Orthogonal Polynomials on the Unit Circle. Part 1. Classical Theory, Colloquium Publications 54.1, American Mathematical Society, Providence, R.I. (2005). [168] B. Simon, Orthogonal Polynomials on the Unit Circle. Part 2. Spectral Theory, Colloquium Publications 54.2, American Mathematical Society, Providence, R.I. (2005). [169] B. Simon, Trace Ideals and Their Applications, second edition, Mathematical Surveys and Monographs 120, American Mathematical Society, Providence, R.I. (2005). [170] B. Simon, CMV matrices: Five years after, J. Comput. Appl. Math. 208 (2007), 120–154. [171] B. Simon, Rank one perturbations and the zeros of paraorthogonal polynomials on the unit circle, J. Math. Anal. Appl. 329 (2007), 376–382. [172] B. Simon, Equilibrium measures and capacities in spectral theory, to appear in Inverse Problems and Imaging. [173] B. Simon, Two extensions of Lubinsky's universality theorem, to appear in J. Anal. Math. [174] B. Simon, Weak convergence of CD kernels and applications, preprint. [175] B. Simon, The Christoffel–Darboux kernel, in preparation, to appear in Maz'ya Birthday volume.
[176] B. Simon, Szegő's Theorem and Its Descendants: Spectral Theory for L2 Perturbations of Orthogonal Polynomials, in preparation; to be published by Princeton University Press. [177] A. Sinap, Gaussian quadrature for matrix valued functions on the real line, J. Comput. Appl. Math. 65 (1995), 369–385. [178] A. Sinap and W. Van Assche, Orthogonal matrix polynomials and applications, J. Comput. Appl. Math. 66 (1996), 27–52. [179] V. N. Sorokin and J. Van Iseghem, Matrix continued fractions, J. Approx. Theory 96 (1999), 237–257. [180] H. Stahl and V. Totik, General Orthogonal Polynomials, Encyclopedia of Mathematics and its Applications 43, Cambridge University Press, Cambridge (1992). [181] T. Stieltjes, Recherches sur les fractions continues, Ann. Fac. Sci. Univ. Toulouse 8 (1894–1895), J76–J122; ibid. 9, A5–A47.

[182] G. Szegő, Über den asymptotischen Ausdruck von Polynomen, die durch eine Orthogonalitätseigenschaft definiert sind, Math. Ann. 86 (1922), 114–139. [183] G. Szegő, Bemerkungen zu einer Arbeit von Herrn M. Fekete: Über die Verteilung der Wurzeln bei gewissen algebraischen Gleichungen mit ganzzahligen Koeffizienten, Math. Z. 21 (1924), 203–208. [184] G. Szegő, Orthogonal Polynomials, Amer. Math. Soc. Colloq. Publ. 23, American Mathematical Society, Providence, R.I. (1939); third edition (1967). [185] G. Teschl, Jacobi Operators and Completely Integrable Nonlinear Lattices, Mathematical Surveys and Monographs 72, American Mathematical Society, Providence, R.I. (2000). [186] J. A. Tirao, The matrix valued hypergeometric equation, Proc. Nat. Acad. Sci. U.S.A. 100 (2003), 8138–8141. [187] V. Totik, Asymptotics for Christoffel functions for general measures on the real line, J. Anal. Math. 81 (2000), 283–303. [188] V. Totik, Polynomial inverse images and polynomial inequalities, Acta Math. 187 (2001), 139–160. [189] V. Totik, Universality and fine zero spacing on general sets, in preparation. [190] J. L. Ullman, On the regular behaviour of orthogonal polynomials, Proc. London Math. Soc. (3) 24 (1972), 119–148. [191] W. Van Assche, Rakhmanov's theorem for orthogonal matrix polynomials on the unit circle, J. Approx. Theory 146 (2007), 227–242. [192] S. Verblunsky, On positive harmonic functions: A contribution to the algebra of Fourier series, Proc. London Math. Soc. (2) 38 (1935), 125–157. [193] H. Weyl, Über gewöhnliche Differentialgleichungen mit Singularitäten und die zugehörigen Entwicklungen willkürlicher Funktionen, Math. Ann. 68 (1910), 220–269. [194] P. Whittle, On the fitting of multivariate autoregressions and the approximate canonical factorization of a spectral density matrix, Biometrika 50 (1963), 129–134. [195] H. Widom, Polynomials associated with measures in the complex plane, J. Math. Mech. 16 (1967), 997–1013. [196] N.
Wiener, Extrapolation, Interpolation, and Smoothing of Stationary Time Series. With Engineering Applications, The Technology Press of the Massachusetts Institute of Technology, Cambridge, Mass. (1949). [197] M.-W. L. Wong, First and second kind paraorthogonal polynomials and their zeros, J. Approx. Theory 146 (2007), 282–293. [198] H. O. Yakhlef, Relative asymptotics for orthogonal matrix polynomials with unbounded recurrence coefficients, Integral Transforms Spec. Funct. 18 (2007), 39–57. [199] H. O. Yakhlef and F. Marcellán, Orthogonal matrix polynomials, connection between recurrences on the unit circle and on a finite interval, Approximation, Optimization and Mathematical Economics (Pointe-à-Pitre, 1999), pp. 369–382, Physica, Heidelberg (2001). [200] H. O. Yakhlef and F. Marcellán, Relative asymptotics for orthogonal matrix polynomials with respect to a perturbed matrix measure on the unit circle, Approx. Theory Appl. (N.S.) 18 (2002), 1–19.

[201] H. O. Yakhlef, F. Marcellán, and M. A. Piñar, Perturbations in the Nevai matrix class of orthogonal matrix polynomials, Linear Algebra Appl. 336 (2001), 231–254. [202] H. O. Yakhlef, F. Marcellán, and M. A. Piñar, Relative asymptotics for orthogonal matrix polynomials with convergent recurrence coefficients, J. Approx. Theory 111 (2001), 1–30. [203] D. C. Youla and N. N. Kazanjian, Bauer-type factorization of positive matrices and the theory of matrix polynomials orthogonal on the unit circle, IEEE Trans. Circuits and Systems CAS-25 (1978), 57–69. [204] M. Zygmunt, Matrix Chebyshev polynomials and continued fractions, Linear Algebra Appl. 340 (2002), 155–168.

Department of Mathematics, Rice University, Houston, TX 77005, USA
E-mail address: [email protected]

Department of Mathematics, King's College London, Strand, London WC2R 2LS, UK
E-mail address: [email protected]

Mathematics 253–37, California Institute of Technology, Pasadena, CA 91125, USA
E-mail address: [email protected]