<<

MAJORIZATION AND THE SCHUR-HORN THEOREM

A Thesis Submitted to the Faculty of Graduate Studies and Research In Partial Fulfillment of the Requirements for the Degree of Master of Science In University of Regina

By Maram Albayyadhi Regina, Saskatchewan January 2013

c Copyright 2013: Maram Albayyadhi

UNIVERSITY OF REGINA

FACULTY OF GRADUATE STUDIES AND RESEARCH

SUPERVISORY AND EXAMINING COMMITTEE

Maram Albayyadhi, candidate for the degree of Master of Science in Mathematics, has presented a thesis titled, Majorization and the Schur-Horn Theorem, in an oral examination held on December 18, 2012. The following committee members have found the thesis acceptable in form and content, and that the candidate demonstrated satisfactory knowledge of the subject material.

External Examiner: Dr. Daryl Hepting, Department of Computer Science

Supervisor: Dr. Martin Argerami, Department of Mathematics and

Committee Member: Dr. Douglas Farenick, Department of Mathematics and Statistics

Chair of Defense: Dr. Sandra Zilles, Department of Computer Science

Abstract

We study majorization in Rn and some of its properties. The concept of ma- jorization plays an important role in analysis by producing several useful re- lationships. We find out that there is a strong relationship between majorization and doubly stochastic matrices; this relation has been perfectly described in Birkhoff’s Theorem. On the other hand, majorization characterizes the connection between the eigenvalues and the diagonal elements of self adjoint matrices. This relation is summarized in the Schur-Horn Theorem. Using this theorem, we prove versions of Kadison’s Carpenter’s Theorem. We discuss A. Neumann’s extension of the concept of majorization to infinite dimension to that provides a Schur-Horn Theorem in this context. Finally, we detail the work of W. Arveson and R.V. Kadison in proving a strict Schur-Horn Theorem for positive trace-class operators.

i Acknowledgments

Throughout my studying, I could have not done my project without the support of my professors whom I insist on thanking even though my words cannot adequately express my gratitude. Dr. Martin Argerami, I would like to thank you from the bottom of my heart for all the support and the guidance that you provided for me. Dr. Shaun Fallat and Dr. Remus Floricel, if it were not for your classes, I would not have learned as much about my field. I’m also honored to thank all my amazing colleagues in the math department, especially my friend Angshuman Bhattacharya. To my father Ibrahim Albayyadhi and my Mother Suad Bakkari, words cannot express my love for you. Your prayers, belief in me and encouragement are the main reasons for my success. If I kept on thanking you all of my life, I could not pay you back. To my husband, Dr. Hadi Mufti, you always make it easier for me whenever I face obstacles; you have always been the wind beneath my wings. Finally, I would like to thank the one who kept wiping my tears on the hard days and saying, “Mom ... don’t give up” – my son Yazan.

ii Contents

Abstract i

Acknowledgments ii

Table of Contents iii

1 Preliminaries 1 1.1 Majorization ...... 1 1.2 Doubly Stochastic Matrices ...... 4 1.3 Doubly Stochastic Matrices and Majorization ...... 6

2 The Schur-Horn Theorem in the Finite Dimensional Case 14 2.1 The Pythagorean Theorem in Finite Dimension ...... 14 2.2 The Schur-Horn Theorem in the Finite Dimensional Case ...... 20 2.3 A Pythagorean Theorem for Finite Doubly Stochastic Matrices . . . . 24

3 The Carpenter Theorem in the Infinite Dimensional Case 27 3.1 The Subspaces K and K⊥ both have Infinite Dimension ...... 28 3.2 One of the Subspaces K,K⊥ has Finite Dimension ...... 45 3.3 A Pythagorean Theorem for Infinite Doubly Stochastic Matrices . . 47

iii 4 A Schur-Horn Theorem in Infinite Dimensional Case 52 4.1 Majorization in Infinite Dimension ...... 52 4.2 Neumann’s Schur-Horn Theorem ...... 55 4.3 A Strict Schur-Horn Theorem for Positive Trace-Class Operators . . . 57

5 Conclusion and Future Work 66

iv Chapter 1

Preliminaries

In this chapter we provide some basic information about majorization and some of it properties, which we will use later. The material in this chapter is basic and can be found in many matrix analysis books [4, 7].

1.1 Majorization

n ↑ ↑ ↑ ↓ ↓ ↓ Let x = (x1, ··· , xn) ∈ R . Let x = (x1, ··· , xn) and x = (x1, ··· , xn) denote the vector x with it’s coordinates rearranged in increasing and decreasing orders respectively. Then

↑ ↓ xi = xn−i+1, 1 ≤ i ≤ n.

Definition 1.1. Given x, y in Rn, we say x is majorized by y, denoted by x ≺ y,

1 if

k k X ↓ X ↓ xi ≤ yi , 1 ≤ k ≤ n (1.1) i=1 i=1 and n n X ↓ X ↓ xi = yi . (1.2) i=1 i=1

Pn Example 1.2. If xi ∈ [0, 1], and i=1 xi = 1, then we have

1 1 ( , ··· . ) ≺ (x , ··· , x ) ≺ (1, 0, ··· , 0). n n 1 n

Proposition 1.3. If x ≺ y and y ≺ x, then there exists a P such that y = Px.

Proof. Assume first that x, y are rearranged in decreasing order,i.e. x1 ≥ x2 ≥ · · · ≥ xn, y1 ≥ y2 · · · ≥ yn. We proceed by induction on k. For k = 1 we get x1 ≤ y1, and y1 ≤ x1 from the majorization equation (1.1), which implies x1 = y1. By induction hypothesis we assume that xi = yi, for i = 1, ··· , k and we will show that it works up to k + 1. From ( 1.1) x1 + x2 + ··· + xk + xk+1 ≤ y1 + y2 + ··· + yk + yk+1, and by our assumption x1 + x2 + ··· + xk = y1 + y2 + ··· + yk, so if we cancel them we get xk+1 ≤ yk+1, as the roles of x and y can be reversed yk+1 ≤ xk+1, and this implies xk+1 = yk+1. In general, if we write x↓ and y↓ for the non-increasing rearrangements, there exist

↓ ↓ permutations P1 and P2 such that x = P1x, y = P2y. By the first part of the proof,

↓ ↓ −1 x = y , i.e. x = P1 P2y.

Although majorization is defined through non-increasing rearrangements, it can

2 also be done by non-decreasing ones:

Proposition 1.4. If x, y ∈ Rn, then x ≺ y if and only if

k k X ↑ X ↑ xi ≥ yi , 1 ≤ k ≤ n (1.3) i=1 i=1 and n n X ↑ X ↑ xi = yi (1.4) i=1 i=1

Pn ↓ Pn ↓ Pn ↑ Proof. For equation(1.4), we have i=1 xi = i=1 yi , since x ≺ y, but i=1 xi = Pn ↓ Pn ↑ Pn ↓ Pn ↓ Pn ↑ Pn ↑ Pn ↑ i=1 xi , so i=1 xi = i=1 xi = i=1 yi = i=1 yi , and i=1 xi = i=1 yi . And for equation (1.3), we know that

↑ ↓ xi = xn−i+1, 1 ≤ i ≤ n.

So k k X ↑ X ↓ xi = xn−i+1. i=1 i=1 Then

k n n n−k X ↑ X ↓ X X ↓ xi = xl = xl − xl i=1 l=n−k+1 l=1 l=1 n−k X ↓ = tr (x) − xl l=1 n−k X ↓ ≥ tr (y) − yl l=1 n n−k n k X X ↓ X ↓ X ↑ = yl − yl = yl = yi . l=1 l=1 l=n−k+1 i=1

3 1.2 Doubly Stochastic Matrices

There is a deep relation between majorization and doubly stochastic matrices; we will discuss that relation in this section.

Definition 1.5. Let B = (bij) be a . We say B is doubly stochastic if :

• bij ≥ 0 ∀i, j,

Pn • i=1 bij = 1 ∀j,

Pn • j=1 bij = 1 ∀i.

Proposition 1.6. The set of square doubly stochastic matrices is a convex set and it is closed under multiplication and the adjoint operation. But it is not a group.

Pr Proof. If t1, ··· , tr ∈ [0, 1] with j=1 tj = 1 and P1, ··· ,Pr are permutations, let Pr A = j=1 tjPj. Clearly every entry of A is non-negative. Also,

n n r X X X Akl = tj(Pj)kl k=1 k=1 j=1 r n X X = tj (Pj)kl j=1 k=1 r X = tj = 1, ∀l. j=1

Pn A similar computation shows that l=1 Akl = 1 for all k. If A, B are doubly stochastic matrices then AB is also doubly stochastic. Indeed,

4 the sum over the rows is

n n n X X X (AB)ij = AikBkj j=1 j=1 k=1 n n X X = Aik Bkj k=1 j=1 n X = Aik = 1. k=1

And the same thing can be done for the columns, which shows that AB is also doubly stochastic. Also it is clear that if A doubly stochastic, then so is its adjoint A∗. The class of doubly stochastic matrices is not a group since not every doubly is invertible; for example if we take the 2 × 2 matrix, with all its

1 entries equal to 2 , this matrix is doubly stochastic and its determinant is zero.

Proposition 1.7. Every permutation matrix is doubly stochastic and is an extreme point of the convex set of all doubly stochastic matrices.

Proof. A permutation matrix has exactly one entry +1 in each row and in each column and all other entries are zero. So it is doubly stochastic.

Now let A = α1B + α2C, with A a permutation matrix such that α1, α2 ∈ (0, 1),

α1 + α2 = 1, and B,C are doubly stochastic matrices. Then every entry of B,C that corresponds to the zero element aij = 0 of A must be zero; indeed, 0 = aij =

α1bij + α2cij. As α1, α2 both are non zero, and B,C are both nonnegative, we have bij = cij = 0. Hence, nonzero entries must be all +1, A = B = C. This shows that every permutation matrix is an extreme point of the set of doubly stochastic matrices.

5 1.3 Doubly Stochastic Matrices and Majorization

Theorem 1.8. A matrix A ∈ Mn(R) is doubly stochastic if and only if Ax ≺ x, for all x ∈ Rn.

Proof. For the implication (⇐), assume Ax ≺ x for all vectors x. Then

n n X X Ax = x. i=1 i=1

From the definition of majorization. If we choose x to be ej, where ej is the vector ej = (0, ··· , 0, 1, 0, ··· , 0), 1 ≤ j ≤ n, then

        a11 ··· a1n 0 a1j 0          . .     .     . .  × 1 =  .  ≺ 1 .                 an1 ··· ann 0 anj 0

Then min{a1j, ··· , anj} ≥ min{1, 0} = 0, so akj ≥ 0 for all k. Also this implies, as j was arbitrary, that the sum over each column is 1. To show the sum over the Pn rows is also 1 we use the vector e = j=1 ej; we get

        Pn a11 ··· a1n 1 a1,j 1      j=1     . .  .  .  .  . .  × . =  .  ≺ . .             Pn    an1 ··· ann 1 j=1 an,j 1

Pn Pn Pn Then j=1 aij = 1 for all i, since max{ j=1 aij : j} ≤ 1, min{ j=1 aij; j} ≥ 1. For the other direction (⇒), let A be doubly stochastic, and let y = Ax. To prove

6 y ≺ x, we first show that we can assume x and y have their entries in non-increasing order; this because x = Px↓, and y = Qy↓ for some permutation matrices P and Q, so

Qy↓ = APx↓

y↓ = Q−1APx↓,

y↓ = Bx↓. where Q−1AP = B is doubly stochastic, since the permutation matrices are doubly stochastic, and the product of doubly stochastic matrices is doubly stochastic by Proposition 1.6. For any k ∈ {1, ··· , n} we have

k k n X X X yj = bjixi. (1.5) j=1 j=1 i=1

Pk Pn Let si = j=1 bji, then 0 ≤ si ≤ 1, i=1 si = k and

k n X X yj = sixi. j=1 i=1

Then k k n k X X X X yj − xj = sixi − xi. (1.6) j=1 j=1 i=1 i=1

7 Pn By adding and subtracting i=1 sixk from equation (1.6),

k k n k n n X X X X X X yj − xj = sixi − xi + ( si − si)xk j=1 j=1 i=1 i=1 i=1 i=1 n k n n X X X X = sixi − xi + (k − si)xk, since k = si i=1 i=1 i=1 i=1 k n k k n X X X X X = sixi + sixi − xi + xk − sixk i=1 i=k+1 i=1 i=1 i=1 k n k k k n X X X X X X = sixi + sixi − xi + xk − sixk − sixk i=1 i=k+1 i=1 i=1 i=1 i=k+1 k n k X X X = − (1 − si)xi + (xi − xk)si + (1 − si)xk i=1 i=k+1 i=1 k k X X = (si − 1)(xi − xk) + (xi − xk)si i=1 i=k+1 ≤ 0.

Pk Pk So j=1 yj ≤ j=1 xj for all k. When k = n,

n n n X X X yj = bjixj j=1 j=1 i=1 X X = ( bi1)x1 + ··· + ( bin)xn i i n X = xj. j=1

Definition 1.9. Let A ∈ Rn be a linear map. We say that A is a T-transform if there exists a ∈ [0, 1] and j, k such that

Ay = (y1, ··· , yj−1, ayj + (1 − a)yk, yj+1, ··· , (1 − a)yj + ayk, yk+1, ··· , yn),

8 i.e. A = aI + (1 − a)P, where P is the transposition (jk).

Theorem 1.10. Given x, y ∈ Rn, the following statements are equivalent:

1. x ≺ y.

2. x = T y, where T is a product of T -transforms.

3. x ∈ conv Sy = conv{Py : P is a permutation}.

4. x = Ay , where A is doubly stochastic matrix.

Proof. (1) =⇒ (2):

We want to show that if x ≺ y, then x = (Tr ··· T1)y for some T-transforms

n T1, ··· ,Tr. We will show this is true for any n by induction. Let x, y ∈ R . We can assume that x, y have their coordinates in decreasing order by permuting them, and each of these permutations is a product of T -transforms (because transpositions are T - transforms, and permutations are products of transpositions). So when x ≺ y, we have yn ≤ x1 ≤ y1. If we take k ≤ n such that yk ≤ x1 ≤ yk−1, then x1 = ty1 +(1−t)yk for some t ∈ [0, 1], and we define T1y = (ty1 + (1 − t)yk, y2, ··· , yk−1, (1 − t)y1 + tyk, yk+1, ··· , yn). Let

0 0 n−1 x = (x2, ··· , xn), y = (y2, ··· , yk−1, (1 − t)y1 + tyk, yk+1, ··· , yn) ∈ R .

Note that the first coordinate of T1y is x1. If we take off x1 from x and T1y, then

0 0 0 0 x = x and T1y = y . We will show that x ≺ y . For m such that 2 ≤ m ≤ k − 1,

m m m X X X yj ≥ x1 ≥ xj. j=2 j=2 j=2

9 And for k ≤ m ≤ n we have

m−1 k−1 m X 0 X X yj = yj + [(1 − t)y1 + tyk] + yj j=1 j=2 j=k+1 k−1 m X X = yj + y1 − ty1 + tyk + yj j=2 j=k+1 m X = yj − ty1 − (1 − t)yk, by adding and subtracting yk j=1 m X = yj − x1 j=1 m m−1 X X 0 ≥ xj − x1 = xj j=1 j=1

The last inequality is an equality when m = n since x ≺ y and so x0 ≺ y0. By

n−1 induction hypothesis there exist a finite number of T-transforms T2, ··· ,Tr on R

0 0 n such that x = (Tr, ··· ,T2)y . We can regard each of them as T-transform on R if we don’t touch the first coordinate of any vector. Then we will have

0 0 (Tr ··· T1)y = (Tr ··· T2)(x1, y ) = (x, x ) = x.

(2)=⇒(3): It is clear that each T -transform is a convex combination of permutations. Now we have to show that a product of two convex combinations of permutations is a convex combination of permutations. For PjQj where each of them is a permutation; tjsj ≥ 0, ∀j, k. We have

l ! m ! l m X X X X tjPj skQk = tjskPjQk. j=1 k=1 j=1 k=1

10 Where l m l m l X X X X X tjsk = tj sk = tj = 1. j=1 k=1 j=1 k=1 j=1

(3) =⇒(4): Is trivial, since we have x ∈ conv(Py), and from Proposition 1.6 we know that a convex combination of permutation matrices is doubly stochastic. (4)=⇒(1): This is Theorem 1.8.

Definition 1.11. If B is a square matrix, and P is a permutation, then we call the set { b1P(1), b2P(2), ··· , bnP(n)} a diagonal of B.

Each element in the diagonal of B has one entry from each column and each row.

Theorem 1.12. (The K¨onig-Frobenius Theorem) Given a square matrix B, then every diagonal of B contains a zero element if and only if B has an i × j submatrix with all entries zero for some i, j such that i + j > n.

Proof. This is equivalent to Hall’s Theorem.

Theorem 1.13. (Birkhoff’s Theorem) The set of n × n doubly stochastic matrices is a convex set whose extreme points are the permutation matrices.

Proof. To prove this we have to show two things. First, that a convex combination of doubly stochastic matrices is doubly stochastic; this was proven in Proposition 1.7. Second we have to show that every extreme point is a permutation matrix, and for this we will show that each doubly stochastic matrix is a convex combination of a permutation matrix. This can be proved by induction on the number of nonnegative

11 entries of the matrix. When A has n positive entries, if A is doubly stochastic, then A is a permutation matrix. Let A be doubly stochastic. Then A has at least one diagonal with no zero entry; indeed, let [0k×l] be a submatrix of zeros that A might have. In such case we can find permutation matrices Q1, Q2 such that

  0 B   Q1AQ2 =   ; 0 is a k × l submatrix with all entries zero. CD

Q1AQ2 is doubly stochastic which means the sum of the rows of B is 1 and the sum of the columns of C is 1. i.e.

n−l X bhi = 1, ∀h = 1, ··· , k, i=1 and n−k X cih = 1, ∀h = 1, ··· , l. i=1 Also, looking at the rows and columns that intersect D,

l n−l k n−k X X X X chi + dhi = 1, bhi + dhi = 1. i=1 i=1 h=1 h=1

Let k n−l l n−k X X X X k = bhi, and l = cih. i=1 i=1 i=1 i=1

12 Then

k n−l n−l k X X X X k = bji = bji j=1 i=1 i=1 j=1 n−l n−k ! X X = 1 − dji i=1 j=1 = n − l − d, where d is a positive number.

Hence , k + l ≤ n. So by The K¨onig-Frobenius Theorem 1.12 at least one diagonal of A must have all its entries positive. Now suppose A is a doubly stochastic matrix with n + k non-zero entries. By the previous paragraph A has a “never zero” diagonal, given by some permutation Q. Let a be the minimum element of this diagonal. Clearly a < 1, because otherwise A would have a 1 in each row and column, making it a permutation, with only n non-zero entries. Let A − aQ B = . 1 − a

Then B is doubly stochastic, and the entry in B corresponding to the location of a is zero, so B has at most n + k − 1 non-zero entries. By induction hypothesis B is a convex combination of permutation matrices. Since A = (1 − a)B + aQ, it is clear that A is a convex combination of a permutation matrices too.

13 Chapter 2

The Schur-Horn Theorem in the Finite Dimensional Case

In this chapter we study some variants of the Pythagorean Theorem. The Py- thagorean Theorem plays an important role in describing the relation between the three sides of the right triangle in Euclidean geometry. Among the variations of the Pythagorean Theorem that we will consider, some are trivial while others are not. We will find that these can be solved by using the Schur-Horn Theorem.

2.1 The Pythagorean Theorem in Finite Dimension

In the following we will represent the Pythagorean Theorem (PT) in different dimensions, beginning with the classical variant. Also we will formulate the converse of the Pythagorean Theorem, which we call the Carpenter Theorem (CT)[8].

Theorem 2.1. (PT-1) If we have right triangle, such that its two sides are x,y and the angle between them

π 2 2 2 is θ = 2 , then x + y = z .

14 Although less known, the converse of (PT-1) holds. We call this the Carpenter Theorem (CT).

Theorem 2.2. (CT-1)

2 2 2 π If we have a triangle with sides x, y, z, such that x + y = z , then θ = 2 , i.e. we have a right triangle.

2 2 Let {e1, e2} be an orthonormal basis for R . Then for x ∈ R , we can write x as linear combination of {e1, e2}, and in this case we can re-write Theorem 2.1 as

2 2 Theorem 2.3. (PT-2) If x = t1e1 + t2e2, and kxk = 1, then |t1| + |t2| = 1.

Proof. Since the norm of x is one we have

2 2 2 1 = kxk = kt1e1k + kt2e2k

2 2 2 2 = |t1| ke1k + |t2| ke2k

2 2 = |t1| + |t2| .

Note that (PT-2) is Parseval’s equality, which says that if {ej : j ∈ J} is an orthonormal basis in H, then for every x ∈ H the following equality holds:

2 X 2 kxk = |hx, eji| . j∈J

In what follows, we denote by PK x the orthogonal projection of x onto the sub- space K.

Theorem 2.4. (CT-2):

+ 2 2 2 If t1, t2 ∈ R , and t1 + t2 = 1, then there exists x ∈ R such that kxk = 1 and kPRe1 xk = t1, kPRe2 xk = t2.

15 Proof. Let x = t1e1 + t2e2. Then

2 2 kxk = kt1e1 + t2e2k

2 2 = kt1e1k + kt2e2k

2 2 = t1 + t2 = 1.

As PRe1 x = hx, e1ie1,

2 2 2 X kPRe1 xk = |hx, e1i| , where x = tiei i=1 2 X 2 = |h tiei, e1i| i=1 2 = t1.

2 2 And the same thing for kPRe2 xk = t2.

2 2 Since kxk = ke1k = 1, then kPRe1 xk = |hx, e1i| = |he1, xi| = kPRxe1k. From this point of view, we can rephrase the (PT-2) and (CT-2) as,

Theorem 2.5. (PT-3):

2 2 2 If K is a one-dimensional subspace of R , then kPK e1k + kPK e2k = 1.

Theorem 2.6. (CT-3):

+ 2 If t1, t2 ∈ R , and t1 + t2 = 1, then there exists one-dimensional K ⊂ R , such that

2 2 kPK e1k = t1, kPK e2k = t2.

Next we see that the same results hold in Rn :

Theorem 2.7. (PT-4):

16 n n If K is a one-dimensional subspace of R , and {ej}j=1 an orthonormal basis, then Pn 2 j=1 kPK ejk = 1.

P n Proof. We choose x = tjej to be a unit vector for K ⊂ R , then it spans K and

2 2 kPK eik = khei, xixk

2 2 = |hei, xi| kxk

2 = |hei, xi| .

P 2 P 2 2 Then j kPK ejk = j |hej, xi| = kxk by Parseval’s equality.

Theorem 2.8. (CT-4): Pn If t1, ··· , tn ∈ [0, 1], and j=1 tj = 1, then there exists a one-dimensional subspace

n 2 K ⊂ R such that kPK ejk = tj, j = 1, ··· , n.

1 Pn 2 Proof. Let x = j=1 tj ej and put K = Cx. Then

PK ei = hei, xix n 1 X 2 = hei, tj ejix j=1 n 1 X 2 = tj hei, ejix j=1 1 2 = ti x.

1 2 2 2 So kPK eik = kti xk = ti.

In the following we are going to generalize the Pythagorean Theorem in Rn, by allowing K to have different dimensions.

17 Theorem 2.9. (PT-5):

n Pn 2 If K is an m-dimensional subspace of R , then i=1 kPK eik = m.

n Proof. If we choose f1, ··· , fm to be an orthonormal basis for K ⊂ R , then the projection of ei onto K is m X PK ei = hei, fjifj. j=1 So

n n m X 2 X X 2 kPK eik = k hei, fjifjk i=1 i=1 j=1 n m X X 2 = |hei, fji| i=1 j=1 m n X X 2 = |hei, fji| j=1 i=1 m m X 2 X = kfjk = 1 = m. j=1 j=1

The converse of (PT-5) would be

Theorem 2.10. (CT-5):

n Pn If {ti}i=1 ⊂ [0, 1], and i=1 ti = m, then there exists an m-dimensional subspace K

n 2 of R , such that kPK eik = ti, i = 1, ··· , n.

Suddenly, its is not so obvious how to construct K. So first we will attempt to reformulate the theorem.

n n If K ⊂ R , PK is the orthogonal projection of R on K, and e1, ··· , en is an

18 orthonormal basis with (tij) the matrix of PK , then

2 kPK ejk = hPK ej,PK eji

= hPK ej, eji

= tjj,

2 ∗ Pn 2 Pn since PK = PK = PK . Then i=1k kPK eik = i=1 ti. Which we can write as Pn 2 i=1 kPK eik = tr (PK ). With this in mind, we can rewrite (PT,CT-5) as:

Theorem 2.11. (PT-6):

n If K is an m-dimensional subspace of R , then tr (PK ) = m.

Theorem 2.12. (CT-6):

Pn n If t1, ··· , tn ∈ [0, 1], and j=1 tj = m, then there is K ⊂ R , such that the diagonal of PK is (t1, . . . , tn).

This formulation of (CT-6) makes it clear that its proof is not going to be trivial as the previous results of (PT-CT). If we have these numbers t1, ··· , tn ∈ [0, 1],

n 2 such that their sum is m ∈ N and we want to look for K ∈ R with kPK eik = ti for all i; in short, we want to form a matrix of PK with diagonal of ti, such that

∗ 2 PK = PK = PK . It is not obvious that such a thing is even possible. If we try to find n(n+1) n(n−1) a projection in that way we will get 2 equations with 2 variables, as we see in the next example.

∗ 2 Example 2.13. Take PK to be a 2 × 2 matrix such that PK = PK = PK , and such

19 that the diagonal of PK is (t, 1 − t) for a fixed t ∈ [0, 1]. So

    t x t2 + x2 x P =   ,P 2 =   . K   K   x 1 − t x x2 + (1 − t)2

As these two should be equal, we get 2 equations t = t2 + x, 1 − t = x2 + (1 − t)2 √ with the single unknown x. In this particular case one can check that x = 1 − t2 gives a solution. But for a 10 × 10 matrix we will have 55 equations in 45 unknowns. The bigger the projection, the more equations we have to deal with, and the systems will always be over-determined. This is an issue, because such systems may have no solution. We will soon see, however, that this problem can be solved in general and that the Schur-Horn Theorem is the way to go.

2.2 The Schur-Horn Theorem in the Finite Dimensional Case

The Schur-Horn theorem characterizes the relation between the eigenvalues and the diagonal elements of selfadjoint matrix by using majorization.

Theorem 2.14. (Schur 1923 [14])

sa If A ∈ Mn(C) , then diag(A) ≺ λ(A), where diag(A) is the diagonal of A and λ(A) is the eigenvalue list of A.

Proof. Let A = UDU ∗, where D is and U is a unitary. Then the

20 diagonal of A is given by

X ∗ akk = DhlUkhUlk h,l X = λlUklUkl (2.7) l X 2 = λl|Ukl| . l

2 ∗ ∗ Define a matrix T by Tkl = |Ukl| . The fact that U U = UU = I implies that T is doubly stochastic. Equation 2.7 shows that diag(A) = T λ(A). By Theorem 1.10, diag(A) ≺ λ(A).

Horn [6] proved in 1954 the converse of Schur’s Theorem 2.14. We offer a proof following ideas of Kadison [8, Theorem 6]. A very similar proof appears in Arveson- Kadison [3, Theorem 2.1], but using only results from Kadison with no acknowledge- ment whatsoever of the well-known results in majorization theory that we outlined in Chapter 1. The following lemma contains Kadison’s key idea.

Lemma 2.15. Let A ∈ Mn(C) with diagonal y. Let T be a T -transform. Then there

∗ exists a unitary U ∈ Mn(C) such that UAU has diagonal T y.

21 Proof. Let A be an n × n matrix. Define a unitary U by

i j

1 p p  .  .. 0 0   p p   1 p p  i   --- ξ sin θ --- − cos θ ---  1  U =  p p   ..   0 p . p 0  .    p 1 p  --- ξ cos θ - - - sin θ --- j    p p 1   ..   0 p 0 p .  p p 1

Where ξ ∈ C, with |ξ| = 1, and bijξ = −bijξ. Then a straightforward computa-

∗ 2 tion shows that diag(UAU ) = t . y+(1−t) . yσ, Where t = sin θ, σ = (i j) ∈ Sn.

Theorem 2.16. (Horn 1954 [6])

n sa If x, y ∈ R , and x ≺ y, then ∃ A ∈ Mn(C) such that

diag(A) = x, λ(A) = y

Proof. Let x = (x1, ··· , xn), y = (y1, ··· , yn) such that x ≺ y. By Theorem 1.10

x ≺ y ⇐⇒ x = (Tr ··· T1)y, where T1, ··· ,Tr are T − transform.

Let A1 ∈ Mn(R) with diagonal y and zeroes elsewhere. By Lemma 2.15, there exists

∗ a unitary V1 such that A2 = V1A1V1 has diagonal T1y. Similarly, there exists a

∗ unitary V2 such that A3 = V2A2V2 has diagonal T2(T1y) = T2T1y. Repeating this,

22 ∗ ∗ after r steps, we will have unitaries V1, ··· ,Vr such that A = Vr ··· V1A1V1 ··· Vr has diagonal Tr ··· T1y = x. As unitary conjugation preserves the spectrum, A has spectrum y and diagonal x.

We can rephrase Schur’s result by saying that, that for every x ∈ Rn,

∗ {Mx : x ≺ y} ⊇ D{UMyU : U ∈ U(n)},

where Mx,My are diagonal matrices that have x, y at the diagonal. So if we conjugate

My with the U we still have the same vector y at the diagonal of My.

And Horn proved the other inclusion, i.e. for every x ∈ Rn,

∗ {Mx : x ≺ y} ⊆ D{UMyU : U ∈ U(n)}.

So we can rephrase both theorems together as follows :

Theorem 2.17. (Schur-Horn Theorem)

For every x ∈ Rn,

∗ {Mx : x ≺ y} = D{UMyU : U ∈ U(n)}.

Now we can prove (CT-6) as follows:

n Pn Proof of (CT-6). If a = (a1, ··· , an) ∈ [0, 1] and j=1 aj = m, then a ≺ m times (1z, ···}| , 1{, 0, ··· , 0). By the Schur-Horn Theorem, there exists a self adjoint matrix m times z }| { P ∈ Mn(C) with diagonal a and eigenvalues (1, ··· , 1, 0, ··· , 0). The minimal poly- nomial of P is f(t) = t(1 − t). So P (I − P ) = 0, i.e. P = P 2. Thus, P is a

23 projection.

2.3 A Pythagorean Theorem for Finite Doubly Stochastic

Matrices

In this section we consider, following Kadison [8, 9], certain differences of sums of entries of doubly stochastic matrices. We will use results here to generalize Theorem 2.18 below to the infinite dimensional case (Chapter 3).

Theorem 2.18. Let K be an m-dimensional subspace of H, and e1, ··· , en an or-

Pr 2 Pn 2 thonormal basis for H. If a = i=1 kPK eik , b = i=r+1 kPK⊥ eik , then

a − b = m − n + r.

Proof. As PK⊥ = I − PK , we have

2 kPK⊥ eik = hPK⊥ ei, eii

= h(I − PK )ei, eii

2 = 1 − hPK ei, eii = 1 − kPK eik .

Pr Pn 2 So a = i=1 ai, b = i=r+1 1 − ai, where ai = kPK eik , and thus (using Theorem

24 2.9)

r n X X a − b = ai − ( 1 − ai) i=1 i=r+1 r n n X X X = ai + ai − 1 i=1 i=r+1 r+1 n X = ai − (n − r) i=1 = tr (PK ) − (n − r)

= m − n + r.

Definition 2.19. Let A ∈ Mm,n(R). Fix subsets K ⊂ {1, ··· , m}, L ⊂ {1, ··· , n}. Then we can construct the block or submatrix B by taking only the rows in K and the columns in L of A. The complement of the block B is the matrix B0 with remaining rows and columns of A. The sum of all entries of the block B is the weight of the block B, and we write it as w(B).

Definition 2.20. A doubly stochastic matrix A = (aij) is said to be Pythagorean if there is a Hilbert space H with {ei}, {fj} orthonormal basis, such that

2 aij = |hei, fji| , for all i, j.

The following result can be seen as a “Pythagorean Theorem” for doubly stochas- tic matrices.

Theorem 2.21. If A is a doubly stochastic matrix and B is a block in A with p rows and q columns, then w(B) − w(B0) = p − q + n.

25 Proof.

0 X X w(B) − w(B ) = ajl − ajl j∈K j∈K⊥ l∈L l∈L0 ! ! X X X X = 1 − ajl − 1 − ajl j∈K l∈L0 l∈L0 j∈k   X  0 X  = |K| − ajl − |L | − ajl j∈K j∈K l∈L0 l∈L0 = |K| − |L0|

= |K| − (n − |L|)

= |K| + |L| − n

We can use Theorem 2.21 to give another proof of Theorem 2.18:

Proof. Let K be an m-dimensional subspace of H, and K⊥ its orthogonal comple- ment. Choose {e1, ··· , en}, {f1, ··· , fm}, {fm+1, ··· , fn} to be orthonormal bases

⊥ for H, K and K . Let A = (ajk) be the n × n doubly stochastic matrix given

2 by ajk = |hej, fki| . Let B be the submatrix of A given by the first r rows and m Pm columns of A. If Pk is the projection onto K, then Pkei = j=1hei, fjifj, (1−Pk)ei = Pn 2 Pm 2 Pm Pr 2 j=m+1hei, fjifj and kPK eik = j=1 |hei, fji| = j=1 aij, so i=1 kPK eik = Pr Pm i=1 j=1 aij = w(B). 2 Pn 2 Pn Pn 2 Similarly, kPK⊥ eik = j=m+1 |hei, fji| = j=m+1 aij, so i=r+1 kPK⊥ eik = Pn Pn 0 i=r+1 j=m+1 aij = w(B ). By Theorem 2.21,

r n X 2 X 2 0 kPK eik − kPK eik = w(B) − w(B ) = m − n + r. i=1 i=r+1

26 Chapter 3

The Carpenter Theorem in the Infinite Dimensional Case

In the second chapter we dealt with the finite dimensional space Rn, and we showed many cases of the Pythagorean Theorem. Here we will deal with infinite dimensional Hilbert spaces and we will discuss two cases of the Carpenter Theorem. The first case when the subspace K ⊂ H and its orthogonal complement K⊥ have infinite dimension, and the second case when one of the subspaces K,K⊥ has finite dimension [9]. We include here several definitions that we will need to refer to operators on an infinite-dimensional Hilbert space.

Definition 3.1. For an operator A ∈ B(H) we say A is positive if A = A∗, and the spectrum of A consists of positive real numbers. i.e. if (Ax, x) ≥ 0, ∀x ∈ H.

Definition 3.2. Let {ei} be an orthonormal basis in B(H) and A ∈ B(H) be a

27 positive operator. Then we say A is a trace-class operator if

∞ X tr (A) = hAei, eii < ∞. i=1

If the sum is finite for one orthonormal basis, then it is finite and has the same value for any other orthonormal basis. For an arbitrary A, we say it is trace-class if

∗ 1 (A A) 2 is trace-class.

3.1 The Subspaces K and K⊥ both have Infinite Dimension

We will start with some facts that we are going to use later.

Lemma 3.3. If P ∈ B(H) is a projection that has p1, p2, ··· as diagonal elements, and σ : N → N is any bijection, then there exists a projection P0 ∈ B(H) with pσ(1), pσ(2), ··· as diagonal elements.

Proof. We have P ∈ B(H) is a projection with diagonal {pi}i∈{1,2,··· } i.e.

pi = hPei, eii,

where {ei} is an orthonormal basis. We define a unitary U by Uei −→ eσ(i). Then we define a projection P0 = U ∗PU. The ith diagonal element of this projection is given

28 by

0 ∗ hP ei, eii = hU PU ei, eii

= hP Uei, Ueii

= hPeσ(i), eσ(i)i

= Pσ(i).

Lemma 3.4. Let α1, α2, ··· , αn, β ∈ [0, 1], such that α1+α2+···+αn = β+m, m ∈ . Then N m z }| { (α1, α2, ··· , αn) ≺ (β, 1,..., 1, 0).

Proof. Since αj ∈ [0, 1] for all j, we have

α1 ≤ 1

α1 + α2 ≤ 1 + 1 . .

α1 + ··· + αm ≤ m.

As α1 + α2 + ··· + αn = β + m, we have for any k ∈ {m + 1, ··· , n},

α1 + ··· + αk ≤ α1 + α2 + ··· + αn = m + β.

The following lemma is proven in [5]. While the result certainly looks obvious —as expected from the finite-dimensional analogue— and its proof is not very hard, it is not elementary either. We will generalize it in the proof of Theorem 3.8.

29 Lemma 3.5. Let P,Q ∈ B(H) be two orthogonal projections such that P − Q is trace-class. Then tr (P − Q) ∈ Z.

Lemma 3.6. Let T ∈ B(H) be a trace class operator. Let R1,R2, ··· be pairwise P orthogonal finite-rank projections with k Rk = I. Then

X tr (T ) = tr (TRk). k

P P P Proof. k(TRk) = T k(Rk) = TI. This implies T = k TRk. So, if we construct an orthonormal basis {ei} by joining orthonormal bases corresponding to each RkH, we get

X tr (T ) = hT ei, eii i X = h TRkei, eii i,k X X = hTRkei, eii

k ei∈ rang Rk X = tr (TRk). k

Let K be subspace in H and consider the orthogonal projection QK onto K.

Let q1, q2, ··· be the diagonal of QK . If QK has finite rank (i.e. if dim K < ∞), P P then j qj = tr (QK ) ∈ N. But what happens if j qj = ∞ (i.e. dim K = ∞)?

Since QK is a projection, then I − QK is going to be a projection too with diagonal P 1 − q1, 1 − q2, ··· . Therefore tr (I − QK ) = j(1 − qj) will have to be an integer if finite (this is an elementary exercise in Functional Analysis, or it can be obtained

1 from Effros’ Lemma 3.5). For example, if qj = 1 − j2 , then the diagonal of I − QK P 1 π2 1 would be j j2 = 6 , not an integer. So no projection with diagonal {1 − j2 }

30 exists. That is, when K is finite-dimensional or it is orthogonal complement K⊥ is, we obtain an obstruction to what the possible diagonals of QK are. We will address this case in Section 3.2.

⊥ P P What about when K,K are both infinite-dimensional (i.e. j qj = j(1−qj) = ∞)? The next Theorem (3.8) will illustrate this condition and give us the complete idea of the whole situation. The proof of Theorem 3.8 in the original paper [9] is kind of complicated, so we tried to simplify it as much as we could.

Definition 3.7. Let {en} be an orthonormal basis. Then we define the “ matrix units” {Emn}m,n associated to {en} as the rank-one operators

Emnx = hx, eniem, x ∈ H.

The following is the main result in this thesis.

Theorem 3.8. Given {ej}j∈N an orthonormal basis of H and {aj}j∈N ⊂ [0, 1], the following statements are equivalent:

1. There exists an infinite dimensional subspace K ⊂ H with infinite dimensional

2 complement such that kPkejk = aj ∀j ∈ N.

2. P a = ∞ and P (1 − a ) = ∞; and either (i) or (ii) holds: j∈N j j∈N j (i) a = ∞ or b = ∞;

(ii) a < ∞, b < ∞, and a − b ∈ Z, where a = P a , b = P a . aj ≤1/2 j aj >1/2 j

Proof. 2(i) =⇒ (1): First we will show that when a = ∞, then there exists a projection P with diagonal

31 {aj}. Let N0 = {j : aj = 0 or aj = 1}, K = span{ej : j ∈ N0}. Then for

⊥ j∈ / N0, aj ∈ (0, 1). If we find P0 on B(K ) with diagonal {aj : j∈ / N0}, then P = P + P satisfies 1, where P = P a E . So we will assume that a ∈ (0, 1) 0 1 1 j∈N0 j jj j for all j.

00 S 0 00 1 1 We consider a decomposition {aj} = {aj } {aj}, where 0 ≤ aj ≤ 2 , and 2 < 0 P∞ 00 P∞ 0 00 0 aj ≤ 1, so a, b will be j=1 aj , j=1 1 − aj. Let aj = aγ(j), aj = aδ(j), and

0 00 0 00 00 N = {δ(n): n ∈ N}, N = {γ(n): n ∈ N}. Let n(1) = min{n : a1+a1 +···+an ≥ 3};

00 1 0 notice that each aj ≤ 2 , a1 < 1 so n(1) ≥ 5. Let

00 b1 = a1

00 00 ↓ (b2, ··· , bn(1)) = (a2, ··· , an(1)) .

0 Let m(1) = min{n : a1 + b1 + ··· + bn ≥ 3}; then 5 ≤ m(1) ≤ n(1). Let

m(1)−1 0 X aˇ = 3 − a1 − bj 1 0 bm(1)−1 = bm(1)−1 +a ˇ

0 bm(1) = bm(1) − a.ˇ

0 Pm(1)−1 0 Pm(1)−2 0 As a1 + m(1)−2 bj < 3, a1 + 1 bj + bm(1)−1 = 3, we havea ˇ ≥ 0 and

0 0 0 ≤ bm(1) < bm(1) ≤ bm(1)−1 < bm(1)−1 ≤ 1.

Let = {δ(1), γσ (1), ··· , γσ (m(1) − 1)}, where b = a00 = a for certain N1 1 1 j σ1(j) γσ1(j) permutation σ1. Let j(1) = γσ1(m(1)), j(2) = δ(2), and {j(n)}n≥3 an increasing

00 0 0 Pn enumeration of N \(N1 ∪{j(1), j(2)}). Let n(2) = min{n : a2 +bm(1) + 3 aj(k) ≥ 3}.

32 Let

0 c1 = bm(1)

0 c2 = a2

c3 = aj(3)

↓ (c4, ··· , cn(2)) = (aj(4), ··· , aj(n(2))) .

Pm Pm(2)−1 Let m(2) = min{m : j=1 cj ≥ 3}; then j=1 cj < 3, 6 ≤ m(2) ≤ n(2). Define

m(2)−1 ˇ X b = 3 − cj j=1 0 ˇ cm(2)−1 = cm(2)−1 + b

0 ˇ cm(2) = cm(2) − b.

ˇ 0 0 Then 0 ≤ b ≤ cm(2)−1, and 0 ≤ cm(2) ≤ cm(2) ≤ cm(2)−1 ≤ cm(2)−1 ≤ 1. Let N2 =

{j(1), j(2)} ∪ {n : ∃k, 3 ≤ k ≤ m(2) − 1 with ck = an}. Let k(1) be such that ak(1) is cn(2), k(2) = δ(3), write

00 N \ (N1 ∪ N2) = {k(1), k(2), ···} with {k(3), k(4), ···} in ascending order.

33 0 0 Pn Let n(3) = min{n : a3 + cm(2) + r=3 ak(r) ≥ 3} and let

0 d1 = cm(2)

0 d2 = a3

d3 = ak(3)

↓ (d4, ··· , dn(3)) = (ak(4), ··· , ak(n(3))) ,

Pm Pm Pm(3)−1 and let m(3) = min{n : j=1 dj ≥ 3}. Then j=1 dj ≥ 3, j=1 dj ≤ 3, and 6 ≤ m(3) ≤ n(3). Let

m(3)−1 X cˇ = 3 − dj, j=1 0 dm(3)−1 = dm(3)−1 +c, ˇ

0 dm(3) = dm(3) − c,ˇ

1 0 0 so that 0 < cˇ ≤ dm(3) ≤ 2 , and 0 ≤ dm(3) < dm(3) ≤ dm(3)−1 < dm(3)−1 ≤ 1.

By repeating these processes we will build pairwise disjoint subsets N1, N2, ··· of

N such that N1 ∪ N2 ∪ · · · = N. We can write Nj = {pj(1), ··· , pj(m(j) − 1)}, then

m(j)−2 0 0 X 0 bm(j−1) + aj + apj (k) + bm(j)−1 = 3. (3.8) k=3

By Theorem 2.12 we will get a self adjoint projection Ej with diagonal

0 0 0 {bm(j−1), aj, apj (3), ··· , apj (m(j)−2), bm(j)−1}.

34 We also have

0 0 bm(1) + bm(j)−1 = bm(1) + bm(j)−1, (3.9) and

0 0 0 ≤ bm(j) ≤ bm(j) ≤ bm(j)−1 ≤ bm(j)−1. (3.10)

∞ M If we write P = Ej, we get j=1

  a0  1     a 0 0   p1(2)     ..   .       ap (m(1)−2)   1     b0   m(1)−1     0   bm(1)       a0   2  P =   .  a   0 p2(2) 0     ..   .       b0   m(2)−1     b0   m(2)           .   0 0 ..    

The projection P has all the aj in its diagonal with the exception of the pairs

35 0 0 bm(j)−1, bm(j) in place of bm(j)−1, bm(j). We will now construct a unitary operator

0 0 that will conjugate bm(j)−1, bm(j) into bm(j)−1, bm(j). So let U be

  1    .   .. 0 0       1         sin θ1 − cos θ1       cos θ1 sin θ1       1       ..  U =  0 . 0  ,      1       sin θ − cos θ   2 2       cos θ2 sin θ2           .   ..   0 0   

∗ where θ1, θ2, ··· are to be determined. Let Q = UPU . Then every entry of Q outside the 2 × 2 blocks agrees with P . In the 2 × 2 blocks,

     ∗ 0 sin θj − cos θj b 0 sin θj − cos θj    m(1)−1    =    0    cos θj sin θj 0 bm(1) cos θj sin θj

36   b0 sin2 θ + b0 cos2 θ ∗  m(1)−1 m(1)   0 2 0 2  ∗ bm(1) sin θ + bm(1)−1 cos θ

The conditions 3.9 and 3.10 guarantee that for each j there exists tj ∈ [0, 1] such that

0 0 0 0 bm(j) = tjbm(j) + (1 − t)bm(j)−1, bm(j)−1 = (1 − tj)bm(j) + tjbm(j)−1.

Choosing θj so that tj = sin θj, we get the desired diagonal for Q. This proves the case a = ∞. When b = ∞, we can repeat the proof for the coefficients bi = 1 − ai.

That way we obtain a projection with E with diagonal 1 − aj. Then I − E is the projection we are looking for. 2(ii) =⇒ 1:

We will show that if a < ∞, b < ∞, a − b ∈ Z, then there exists a projection P with diagonal {aj}. By using Lemma 3.3, we can reorder the numbers {aj} ∈ [0, 1]

00 1 0 00 S 0 as needed. Again write 0 ≤ aj ≤ 2 < aj ≤ 1, where {aj } {aj} = {aj}. Let 00 0 00 0 aj = aγ(j), aj = aδ(j) such that N = {γ(n): n ∈ N}, N = {δ(n): n ∈ N}. Then P 00 P 0 a = 00 a , b = 0 1 − a . Since a, b are finite, we can get finite subsets aj ∈N j aj ∈N j 00 00 0 0 N1 ⊂ N , N1 ⊂ N such that

X 00 X 0 γ1 = aj < 1, δ1 = 1 − aj < γ1. 00 00 0 0 N \N1 N \N1

37 We are given a − b ∈ Z, so

X X a − b = aj + γ1 − ( (1 − aj) + δ1) 00 0 N1 N1 X X 0 = aj + aj + γ1 − δ1 − |N1| 00 0 N1 N1 X 0 = aj − |N1| + γ1 − δ1 ∈ Z. 00 0 N1 ∪N1

P 00 Then m = 00 0 aj + γ1 − δ1 ∈ . Write N ∪ N1 = {k1, ··· , kr}. Since N1 ∪N1 Z 1

0 ≤ γ1 − δ1 < 1, 0 ≤ aj ≤ 1, and

m = ak1 + ··· + akr + (γ1 − δ1), we get from Lemma 3.4 that

m times r+1−m times z }| { z }| { (ak1 , ··· , akr , (δ1 − γ1)) ≺ (1, ··· , 1, 0, ··· , 0) .

By Theorem 2.12 (CT-6), there exists a projection P0 such that

  ak1    .   .. *    P0 =   .  a   kr    * γ1 − δ1

38 By adding the element 1 to the diagonal of P0 we get another projection P1

  a  k1   .   ..   * 0       akr       * δ1 − γ1 0 0    P1 =   .    0 1 0       0 0 0       0 0     .  ..

00 00 Now we will pay attention only to the 3 × 3 block at the middle. Let N \ N1 =

0 0 {l1, l2, ···}, N \ N1 = {m1, m2, ···}. We know 0 ≤ ali < 1, 0 ≤ ami < δ1 ∀i ∈ N. Let

γ2 = γ1 − al1 , δ2 = δ1 − (1 − am1 ), then

γ2 − δ2 = γ1 − δ1 − (al1 + am1 ) + 1 (3.11)

γ2 − δ2 + al1 + am1 = γ1 − δ1 + 1 (3.12)

From Lemma 3.4,

(al1 , am1 , γ2 − δ2) ≺ (γ1 − δ1, 1, 0).

By the Schur-Horn Theorem 2.17 there exists a 3 × 3 self adjoint matrix U1 such that

39     al 0 0 γ1 − δ1 0 0  1        ∗  0 a 0  = diag (U1  0 1 0 U ).  m1    1     0 0 γ2 − δ2 0 0 0

So   ak  1   .   .. 0       a ∗   kr       al1    ∗   U1P1U =  ∗ a  . 1  m1       γ2 − δ2       0       0 0      ...

Now we define P2 as

40   ak  1   .   .. 0       a   kr       al1      P2 =  a   m1       γ2 − δ2       1       0 0     .  ..

∗ i.e. P2 = U1P1U1 + Er+4,r+4. Since 0 ≤ al2 ≤ γ2, 0 ≤ 1 − am2 ≤ δ2, and by letting

γ3 = γ2 − al2 , δ3 = δ2 − 1 − am2 , then again by Lemma 3.4,

(al2 , am2 , γ3 − δ3) ≺ (γ2 − δ2, 1, 0).

If we keep repeating the process we will end up with a sequence of projections {Pn} ∈ B(H) such that

diag(PN ) = ak1 , ··· , akr , al1 , ··· , aln , am1 , ··· , amn , γn+1 − δn+1, 0, ··· .

As the unitary Uj+1 is the identity except possibly in the first r + 3j + 3 basis

41 elements,

PiQr+3j+4 = PjQr+3j+4, i ≥ j (3.13)

Ps where is Qs = h=1 Ehh. This shows that the sequence {Pn} converges strongly, Indeed, for any i ≥ j, ξ ∈ H,

k(Pi − Pj)ξk = k(Pi − Pj)Iξk

= k(Pi − Pj)[(I − Qr+3j+4) + Qr+3j+4]ξk

≤ k(Pi − Pj)(I − Qr+3j+4)ξk + k(Pi − Pj)Qr+3j+4ξk

≤ (kPik + kPjk)k(I − Qr+3j+4)ξk = 2k(I − Qr+3j+4)ξk.

When h → ∞, Qh → I strongly, so we find out that the sequence {Pnξ} is Cauchy in

H, and so convergent. This shows {Pn} ∈ B(H) is strongly convergent to a projection

P. As PiQr+3j+4 = PjQr+3j+4, the diagonal of P is {aj}. (1) =⇒ (2):

⊥ P P The fact that K,K are infinite-dimensional imply that j aj = j 1−aj = ∞. So we need to show that either 2(i) or 2(ii) holds. If 2(i) holds, we are done. Otherwise we want to show that if there is a projection P ∈ B(H) such that diag(P ) = an, then a − b ∈ Z, where

X X a = ai < ∞, and b = (1 − ai) < ∞. 0 00 ai∈N ai∈N

The proof of this is inspired by Effros’ proof of Lemma 3.5. Let Q ∈ B(H) be the

42 projection Q = P E . Then n∈N0 nn

X tr (QP Q) = tr ( EnnPEnn) n∈N0 X = tr ( anEnn) n∈N0 X = an n∈N0 = a, and

⊥ ⊥ ⊥ X tr (Q P Q ) = tr ( Enn(I − P )Enn) n∈N00 X = tr ( (1 − an)Enn) n∈N00 X = (1 − an) n∈N00 = b.

This implies that both QP Q, Q⊥P ⊥Q⊥ are trace class, since they are positive. We now notice that P − Q⊥ is Hilbert-Schmidt, and so in particular it is compact.

43 Indeed, using that P = P 2, (1 − P )2 = (1 − P ), we have

⊥ 2 X ⊥ tr ((P − Q ) ) = [(P − Q )]hh h X X ⊥ 2 = |Phk − Qhk| (no problem exchanging the sums h k X X ⊥ 2 = |Phk − Qhk| since every term is non-negative) k h X X 2 X 2 X 2 = |Phk| + (1 − Pkk) + | − Phk| k∈N0 h k∈N00 n6=k X X = Pkk + (1 − Pkk) k∈N0 k∈N00 = a + b < ∞.

⊥ 2 ⊥ 2 P Since (P − Q ) is positive and compact, then we can write (P − Q ) = k λkRk, where Rk = R1,R2, ··· are pairwise orthogonal finite rank projections and {λi}i∈N arranged strictly in decreasing order and converging to zero. The points {λi} are isolated points in the spectrum of (P − Q⊥)2, so there exist continuous functions

⊥ 2 ⊥ 2 ⊥ 2 f1, f2, ··· such that Ek = fk((P − Q ) ). Since P (P − Q ) = (P − Q ) P , Q(P −

⊥ 2 ⊥ 2 Q ) = (P −Q ) Q (from a direct computation), we deduce that PRk = RkP, QRk =

RkQ, ∀ k; in particular PRk, QRk are both finite rank projections. By keeping in

44 mind that QP Q, Q⊥P ⊥Q⊥ are trace class and using Lemma 3.6,

a − b = tr (QP Q) − tr (Q⊥P ⊥Q⊥)

= tr (QP Q − Q⊥P ⊥Q⊥)

X ⊥ ⊥ ⊥ = tr [(QP Q − Q P Q )Rk] k X ⊥ ⊥ ⊥ = tr (QP QRk) − tr (Q P Q Rk) k X ⊥ ⊥ ⊥ = tr (QP RkQRk) − tr (Q P RkQ Rk) k X ⊥ ⊥ = tr (PRkQRk − P RkQ Rk) k X = tr (PRkQRk − (Rk − RkQRk − PRk + PRkQRk)) k X = tr (PRk + QRk − Rk). k

PRk, QRk, and Rk are finite rank projections and their traces are integers, so

tr (PRk + QRk − Rk) ∈ Z, ∀k.

As every term in the series is an integer, we conclude that they are eventually zero, and that a − b ∈ Z.

3.2 One of the Subspaces K,K⊥ has Finite Dimension

In this context we will show the Carpenter’s theorem for the finite dimensional subspaces K,K⊥ of an infinite dimensional Hilbert space, where the sum of the diago- nal elements of the projection Pk is finite and integer. The proof comes as consequence of Theorem 3.8.

45 Theorem 3.9. If {ej}j∈J is an orthonormal basis for an infinite-dimensional Hilbert space H, and {tj}j∈J ⊂ [0, 1], then the following statements are equivalent:

2 1. There exists m-dimensional K ⊂ H, such that kPkejk = tj.

P 2. j∈J tj = m.

Proof. First we will show (2) ⇒ (1):

0 00 0 1 1 00 As in Theorem 3.8, we write {tj} = {tj} ∪ {tj }, 0 ≤ tj ≤ 2 , and 2 < tj ≤ 1. As P 2 00 00 00 00 j tj < ∞, tj → 0, so {tj } is necessarily finite, i.e. {tj } = {t1, ··· , tk}. Let

∞ k X 0 X 00 a = tj, b = (1 − tj ). 1 1

Then

∞ k X 0 X 00 a − b = tj − (1 − tj ) 1 1 ∞ X = tj − k 1 = m − k ∈ Z.

Since a − b ∈ Z we apply Theorem 3.8 to get a projection PK such that the diagonal P∞ P∞ of PK is t1, t2, ··· . And dimK = tr (PK ) = j=1hPK ej, eji = j=1 tj = m.

For the other direction (1) ⇒ (2): Let PK be the orthogonal projection onto K. As dim K = m, we get

X X 2 X m = tr (PK ) = hPK ej, eji = kPK ejk = tj. j j j

When K has finite co-dimension, we can apply Theorem 3.9 to its orthogonal

46 ⊥ complement K to obtain a projection PK with prescribed diagonal {sj} and PK⊥ =

I − PK . So the diagonal of PK⊥ is {1 − sj}. We get thus the following result:

Theorem 3.10. If {ej} is an orthonormal basis of H, and {tj}j∈J ⊂ [0, 1], then the following statements are equivalent:

2 1. There exists K ⊂ H, of co-dimension m, such that kPkejk = {tj}j∈J .

P 2. j∈J (1 − tj) = m.

3.3 A Pythagorean Theorem for Infinite Doubly Stochastic

Matrices

In chapter 2 we defined Pythagorean matrix (2.20), then we studied finite doubly stochastic matrices A, and we calculated the weight of some block B on the matrix A and the weight of it complementary block. Then we proved a Pythagorean Theorem

(Theorem 2.21) for finite doubly stochastic matrices, where w(B) − w(B0) ∈ Z. For an infinite doubly stochastic matrix we can’t do that immediately because we are dealing with infinite rows and columns. Kadison defines the weight for an infinite doubly stochastic matrix, by taking the sum as a limit. The sets of the rows and the columns are infinite sets, so he takes all the family of the finite subsets, that ordered by inclusion gives us a net, and then he takes the limit over the net. By using Theorem 3.8 we will show a Pythagorean Theorem for infinite Pythagorean matrix when it has infinite complementary blocks, each of them with finite weight.

Definition 3.11. We say that a doubly stochastic matrix A is orthostochastic if

2 A = |Uij| , where U is unitary matrix [4].

47 Lemma 3.12. If A is a Pythagorean matrix, then it is doubly stochastic.

Proof. To show that a Pythagorean matrix is doubly stochastic we will show that it

2 is orthostochastic. Let ei, fj be orthonormal bases with A = |hei, fji| . We know

2 2 2 that there exists a unitary U, with Ufi = ei. So |hei, fji| = |hUfi, fji| = |Uji| .

How can an infinite doubly stochastic matrix that is Pythagorean have infinite complementary blocks with finite weights? To make that clear we will discuss an example. Let

  i  2 if i ∈ Z− ai = −i  (1 − 2 ) if i ∈ Z+

and let a = P a = 1, b = P 1 − a = 1. Write = ∪ . By using i∈Z− i i∈Z+ i Z0 Z+ Z− Theorem 3.8, since a−b ∈ Z, there exists an infinite dimensional K ⊂ H, with infinite

⊥ dimensional orthogonal complement K such that the diagonal of the projection PK

2 2 −j is {ai}i∈Z0 . When j ∈ Z+, k(I − PK )ejk = 1 − kPK ejk = 1 − aj = 2 . Let

⊥ 2 {fj}j∈Z+ , {fj}j∈Z− be orthonormal bases for K,K respectively. Let aij = |hei, fji| ; then A = (aij) is an infinite Pythagorean matrix.

X X 2 2 0 aij = |hei, fji| = kPK eik = ai, ∀i ∈ Z j∈Z+ j∈Z+ and

X 2 aij = k(I − Pk)eik = 1 − ai, ∀i, j∈Z− so

48 X X X aij = ai = 1 i∈Z− j∈Z+ i∈Z− and

X X X aij = aj = 1. j∈Z+ i∈Z− j∈Z+ Thus the weight of both complementary blocks is finite, and the difference is an integer.

Theorem 3.13. If A is an infinite Pythagorean matrix, and B,B0 are a block and its complement in A, and both their weights are finite, then w(B) − w(B0) ∈ Z.

Proof. Since both blocks have finite weight, then both B,B0 are infinite (because the complement of a finite block has infinite weight). Let us write A = (aij), ∀i, j ∈ Z0,

0 where B = (aij)i,j∈Z− , and B = (aij)i,j∈Z+ . As A is Pythagorean, there exist

2 {ei}i∈Z0 , {fj}j∈Z0 orthonormal bases for H such that |hei, fji| = aij ∀i, j ∈ Z0. Let

K ⊂ H be the subspace that is spanned by {fj}j∈Z− . Then

X X X 2 w(B) = aij = |hei, fji| i,j∈Z− i∈Z− j∈Z− X X = hhei, fjifj, eii i∈Z− j∈Z− X X 2 = hPK ei, eii = kPK eik . i∈Z− i∈Z−

Similarly,

0 X X X 2 X X 2 X 2 w(B ) = aij = |hei, fji| = k(I − PK )eik = 1 − kPK eik . i,j∈Z+ i∈Z+ j∈Z+ i∈Z+ j∈Z+ i∈Z+

49 As both weights are finite, (i) ⇒ (ii) in Theorem 3.8 gives us w(B) − w(B0) ∈ Z.

If we compare the finite and infinite dimensional versions of the Pythagorean Theorem for doubly stochastic matrices 2.21 and 3.13, we note that in the finite- dimensional case the result holds for any doubly stochastic matrix, while the infinite dimensional case requires an additional hypothesis (i.e. Pythagorean). We don’t know if Theorem 3.13 holds for arbitrary doubly stochastic infinite matrices. We note below that not every doubly stochastic matrix is Pythagorean, starting with dimension 3. For example if we take any 2 × 2 doubly stochastic matrix,

  a 1 − a   A =   1 − a a

and then construct 2 orthonormal bases e1, e2, and f1, f2, where

√ √ √ √ f1 = a e1 + 1 − a e2, f2 = − 1 − a e1 + a e2,

2 then Aij = |hei, fji| , so A is Pythagorean. What about 3 × 3 doubly stochastic matrices? Consider   1 1 0   1   A = 1 0 1 . 2     0 1 1

If A were Pythagorean, n there would exist orthonormal bases {e1, e2, e3}, {f1, f2, f3}

50 2 with Aij = |hei, fji| . Then

2 0 = A31 = |he3, f1i| , so f1 = se1 + re2,

2 0 = A22 = |he2, f2i| , so f2 = pe1 + qe3.

Also,

1 = A = |he , f i|2 = |s|2, so s 6= 0 2 11 1 1 1 = A = |he , f i|2 = |p|2, so p 6= 0. 2 12 1 2

But then hf1, f2i = sp 6= 0, contradicting the fact that {f1, f2, f3} is an orthonormal basis. So every 2 × 2 doubly stochastic matrix is Pythagorean, but bigger doubly stochastic matrices will not necessarily be Pythagorean.

51 Chapter 4

A Schur-Horn Theorem in Infinite Dimensional Case

In this chapter we will see a generalization for Schur-Horn Theorem in infinite dimension by A. Neumann [11]. After that we will give an explanation for this result. After that, we will consider W. Arveson and R. Kadison’s Schur-Horn theorem for positive trace-class operators on an infinite dimensional Hilbert space [3].

4.1 Majorization in Infinite Dimension

To begin, first we will define majorization in infinite dimension and for that we will use `∞(N) ⊂ B(H) instead of using Rn, such that `∞(N) has real entries. We chose `∞(N) ⊂ B(H) here because the diagonal of a self adjoint operator as a matrix is a bounded sequence of real numbers, so we can think here about `∞(N) as “diagonal self adjoint matrices” inside B(H). We defined majorization in finite dimension as follows:

52 For any two finite vectors x, y ∈ Rn, we say x is majorized by y, denoted x ≺ y, if

k k X ↓ X ↓ xi ≤ yi for k < n, i=1 i=1 and n n X X xi = yi. i=1 i=1

One can try to generalize this, naively, by saying that for x, y ∈ `∞(N), x is majorized by y if

k k X ↓ X ↓ xi ≤ yi for k < ∞, i=1 i=1 and ∞ ∞ X X xi = yi. i=1 i=1

But such “majorization” would fail to generalize many of the properties that finite- dimensional majorization enjoys. For instance in finite dimension (Proposition 1.4), we have that x ≺ y if and only if

k k X ↓ X ↓ yi ≤ xi ∀ k < n, i=1 i=1 and n n X X xi = yi. i=1 i=1

∞ ↓ 1 But in ` (N), the numbers xn are not defined if x = 1 − n , and neither are ↑ 1 xn if xn = n . Also, looking at Theorem 1.10, the sequences x = (1, 1, 1, ··· ) and

53 y = (2, 2, 2, ··· ) would satisfy x ≺ y, but x∈ / convPy (not even in the closure). What A. Neumann does to solve this problem, instead of taking these sums

k k X ↓ X ↓ yi ≤ xi ∀ k < n, i=1 i=1

P he defines Uk(x) = sup{ i∈F xi : |F | = k} i.e. he takes all possible sums of cardinality k in x, and then he takes the supremum. But this is not enough, we can see the reason in the following example.

Example 4.1. Let x, y ∈ `∞(N) such that

x = (1, 1, 1, 1, ··· ) and y = (1, 0, 1, 0, ··· ), then

Uk(x) = k and Uk(y) = k.

So both vectors would majorize each other, and having double majorization one would expect both vectors to be permutations of each other, as in Theorem 1.3. So we would have x ≺ y, y ≺ x ⇐⇒ x = Py, which in this example is clear that can’t happen. Note that if we have

k k k k X ↓ X ↓ X ↓ X ↓ xi ≤ yi , and yi ≤ xi ∀ k i=1 i=1 i=1 i=1

54 then n n X X xi = yi. i=1 i=1 Neumann uses this idea to (implicitly) define majorization in infinite dimension as follows [11]:

Definition 4.2. Let x, y ∈ `∞(N), then we say x ≺ y if ∀ k ∈ N,

Uk(x) ≤ Uk(y)

and

Lk(x) ≥ Lk(y), where X X Uk(x) = sup{ xi : |F | = k},Lk(x) = inf{ xi : |F | = k}. i∈F i∈F

So, in Example 4.1, Uk(x) = Lk(x) = k, Uk(y) = k, Lk(y) = 0; then x ≺ y is true, but y ⊀ x.

4.2 Neumann’s Schur-Horn Theorem

We start by defining some terminology to be used later [15].

Definition 4.3. A subalgebra A ⊂ B(H) is maximal abelian if any two elements from A commute, and A is not properly contained in any other commutative subalgebra A0 ⊂ B(H).

Definition 4.4. A von Neumann algebra M on H is a ∗-subalgebra of B(H) such that M = M 00, where M 00 is the double commutant of M.

55 Definition 4.5. We say that a von Neumann M is atomic when every nonzero projection majorizes a nonzero minimal projection.

Definition 4.6. Let A ⊂ M be subalgebra a von Neumann algebra M. Then we say a linear map E is a conditional expectation when E : M −→ A is onto, E = E2, and kEk = 1. Moreover, we say that E is trace preserving when

tr ◦ E = tr .

In the finite dimensional case, Schur-Horn Theorem 2.17 states for every x, y ∈ Rn we have

∗ {Mx ∈ x : x ≺ y} = D{UMyU : U ∈ U(H)}.

If we want to do the analog thing for infinite dimensional case, take x, y ∈ `∞(N), where we see `∞(N) as the diagonal operators for a fixed orthonormal basis. If we write E for the projection onto the diagonal, then we expect

∗ {Mx : x ≺ y} = E{UMyU : U ∈ U(H)}.

This equality cannot hold without norm closure: if we go back to our Example 4.1, were x = (1, 1, 1, 1, ··· ) and y = (1, 0, 1, 0, ··· ), then x ≺ y but

∗ ∗ Mx = I/∈ E{UMyU : U ∈ U(H)}, while Mx = I ∈ E{UMyU : U ∈ U(H)}.

So the closure seems to be necessary to satisfy the equality of Schur-Horn Theorem in the infinite dimensional case, and this is what Neumann does to form the next theorem ([11], Corollary 2.18, and Theorem 3.13).

56 Theorem 4.7 ([11]). For x, y ∈ l∞(N) we have

∗ {Mx : x ≺ y} = E{UMyU : U ∈ U(H)}.

4.3 A Strict Schur-Horn Theorem for Positive Trace-Class

Operators

We will define positive trace-class operators and then we will form a Schur-Horn Theorem for positive trace-class operator on infinite dimensional Hilbert space. We call this theorem “strict”, because no closure is required after projecting onto the diagonal. We refer to the definitions of positive and trace-class operators at the beginning of Chapter 3.

Definition 4.8. For a trace-class operator A, its one-norm is

∗ 1 kAk1 = tr ((A A) 2 ).

Definition 4.9. Let A, B ∈ B(H) be two trace-class operators. We say A and B

1 ∞ are L equivalent if there exist a sequence of unitary operators {Ui}i=1 such that

∗ kA − UnBUnk1 → 0, when n → ∞.

57 The trace-class operators are compact, and thus their spectrum consists of a se- quence of eigenvalues that converge to zero; if λ is the eigenvalue list of A, then

X kAk1 = |λj|. j

Proposition 4.10. Let A ∈ B(H) be a positive trace-class operator with eigenvalue list λ, Oλ be the set of all positive trace-class operators having λ as their eigenvalue k.k list, and O(A) = {UAU ∗ : U ∈ U(H)} 1 , where U(H) the set of unitary operators. Then

Oλ = O(A).

Proof. We have to show that when two trace-class operators are L1 equivalent then their eigenvalue lists are equal, and also the converse. So first we are going to show that if we have A, B two equivalent trace-class operators with eigenvalues lists λ, µ, then λ = µ. We will use the inequality [12]

∞ X |λn − µn| ≤ kA − Bk1 i=1

∗ to show that there exist unitary operators Um such that kUmAUm − Bk1 → 0. Note

∗ that UmAUm has the same eigenvalue list as A. So for a given  > 0 choose m such ∗ P∞ that kUmAUm − Bk1 < , and so 1 |λn − µn| < . As  is arbitrary this implies P∞ 1 |λn − µn| = 0. Thus λ = µ for all n. For the reverse implication we are going to show that trace-class operators A, B

58 having the same eigenvalues list then are equivalent. Since A, B are trace class oper- ators they are compact, so we can write A as

n ∞ X X A = An + Rn, where An = λiPi,Rn = λiPi, i=1 i=n+1

with {Pi} pairwise orthogonal rank-one projections, similarly,

n ∞ X 0 X 0 B = Bn + Sn, where Bn = λiPi ,Sn = λiPi . i=1 i=n+1

Note that kRnk1 → 0, kSnk1 → 0. Since An,Bn are finite-rank operators with the

∗ same eigenvalues, there exists a unitary Un with UnAnUn = Bn. Then

∗ ∗ kB − UnAUnk1 = kBn + Sn − Un(An + Rn)Unk1

∗ ∗ = kBn + Sn − UnAnUn − UnRnUnk1

∗ = kSn − UnRnUnk1 ≤ kSnk1 + kRnk1 → 0.

∗ So kB − UnAUnk → 0.

The next two theorems 4.11, and 4.12, are about majorization of positive compact operators. We will use them to deduce the Schur-Horn Theorem for the positive trace- class operators (Theorem 4.13).

Theorem 4.11 ([3]). Let $M \subseteq B(H)$ be a discrete maximal abelian algebra admitting a conditional expectation $E : B(H) \to M$. Let $A$ be a positive compact operator in $B(H)$ with eigenvalue list $\lambda = (\lambda_1 \ge \lambda_2 \ge \cdots)$, and let $B \in M$ be a positive compact operator with eigenvalue list $\mu$. Then the following are equivalent:

• there exists a contraction $L \in B(H)$ such that $E(L^*AL) = B$;

• $\mu_1 + \mu_2 + \cdots + \mu_n \le \lambda_1 + \lambda_2 + \cdots + \lambda_n$, for $n = 1, 2, \dots$.
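The implication from the first condition to the second is easy to probe numerically. Here is a minimal sketch (our own illustration; a finite dimension stands in for the compact setting, and $E$ is the expectation onto the diagonal masa):

    import numpy as np

    rng = np.random.default_rng(2)
    N = 8

    # Positive operator with a prescribed decreasing eigenvalue list.
    lam = np.sort(rng.random(N))[::-1]
    Q = np.linalg.qr(rng.standard_normal((N, N)))[0]
    A = Q @ np.diag(lam) @ Q.T

    # A contraction L: rescale a random matrix so that ||L|| <= 1.
    L = rng.standard_normal((N, N))
    L /= np.linalg.norm(L, 2)

    # E compresses to the diagonal masa; B = E(L*AL).
    B = np.diag(np.diag(L.T @ A @ L))
    mu = np.sort(np.linalg.eigvalsh(B))[::-1]

    # Partial-sum inequalities of Theorem 4.11.
    print(np.all(np.cumsum(mu) <= np.cumsum(lam) + 1e-12))   # True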

Theorem 4.12 ([3]). Let $A \in B(H)$ be a positive compact operator with eigenvalue list $\lambda$, and let $\mathcal{P}_n \subseteq B(H)$ be the set of all $n$-dimensional projections. Then

\sup_{P \in \mathcal{P}_n} \mathrm{tr}(AP) = \max_{P \in \mathcal{P}_n} \mathrm{tr}(AP) = \lambda_1 + \cdots + \lambda_n. \qquad (4.14)

The maximum is achieved by an $n$-dimensional projection whose range is spanned by eigenvectors corresponding to $\lambda_1, \dots, \lambda_n$.
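A quick numerical sanity check of (4.14), in the same finite-dimensional spirit as above (our own illustration): random $n$-dimensional projections never beat the top-$n$ eigenvalue sum, and the eigenprojection attains it.

    import numpy as np

    rng = np.random.default_rng(3)
    N, n = 10, 3

    lam = np.sort(rng.random(N))[::-1]
    Q = np.linalg.qr(rng.standard_normal((N, N)))[0]
    A = Q @ np.diag(lam) @ Q.T

    def random_projection():
        # Projection onto a random n-dimensional subspace.
        W = np.linalg.qr(rng.standard_normal((N, n)))[0]
        return W @ W.T

    best = max(np.trace(A @ random_projection()) for _ in range(2000))
    P_top = Q[:, :n] @ Q[:, :n].T   # projection onto top-n eigenvectors

    print(best <= lam[:n].sum() + 1e-12)                    # True
    print(np.isclose(np.trace(A @ P_top), lam[:n].sum()))   # True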

The following is Arveson-Kadison’s strict Schur-Horn Theorem for positive trace-class operators. Note that, as opposed to Neumann’s result (Theorem 4.7), no closure is needed outside $E(\mathcal{O}_\lambda)$. Of course, Neumann’s result applies to all bounded sequences, while Arveson-Kadison’s requires the sequences to be in $\ell^1$.

Theorem 4.13. Let $M \subseteq B(H)$ be a discrete maximal abelian von Neumann algebra with trace-preserving conditional expectation

E : B(H) \to M.

Let the eigenvalue list $\lambda \in \ell^1$ be arranged in decreasing order, with positive terms only. Then $E(\mathcal{O}_\lambda)$ consists exactly of the positive trace-class operators $B \in M$ whose eigenvalue list $\mu$ satisfies $\mu \prec \lambda$.

Proof. By Proposition 4.10, we have to show that, for a fixed $A$ with eigenvalue list $\lambda$,

E\{UAU^* : U \in \mathcal{U}(H)\} = \{B : \mu \prec \lambda, \text{ where } \mu \text{ is the eigenvalue list of } B\}.

For the implication (⇒) we show that for a positive trace-class operator $A \in B(H)$ with eigenvalue list $\lambda$, the eigenvalue list $\mu$ of $B = E(A)$ satisfies

\sum_{i=1}^{n} \mu_i \le \sum_{i=1}^{n} \lambda_i \quad \text{for all } n, \qquad (4.15)

and

\sum_{i=1}^{\infty} \mu_i = \sum_{i=1}^{\infty} \lambda_i. \qquad (4.16)

Let $\{e_k\}$ be an orthonormal basis such that $B$ is diagonal with respect to it, i.e. $Be_k = \mu_k e_k$ for $k = 1, 2, \dots$, arranged so that $\mu_1 \ge \mu_2 \ge \cdots$. Let $P_n$ be the projection onto the span of $\{e_i\}_{i=1}^{n}$. Since $A$ is a positive trace-class operator, it is compact, so we can apply Theorem 4.12 to get

\sum_{i=1}^{n} \mu_i = \sum_{k=1}^{n} \langle Be_k, e_k \rangle
= \mathrm{tr}(BP_n)
= \mathrm{tr}(E(A)P_n)
= \mathrm{tr}(E(AP_n)) \qquad (\text{since } P_n \in M)
= \mathrm{tr}(AP_n) \qquad (\text{since } E \text{ is trace-preserving})
\le \lambda_1 + \cdots + \lambda_n,

and

\sum_{i=1}^{\infty} \mu_i = \sum_{i=1}^{\infty} \langle Be_i, e_i \rangle
= \mathrm{tr}(B)
= \mathrm{tr}(E(A))
= \mathrm{tr}(A)
= \sum_{i=1}^{\infty} \lambda_i.
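Conditions (4.15) and (4.16) say precisely that the diagonal of a positive operator is majorized by its eigenvalue list. A minimal numerical check, in finite dimension (our own illustration):

    import numpy as np

    rng = np.random.default_rng(4)
    N = 8

    lam = np.sort(rng.random(N))[::-1]
    Q = np.linalg.qr(rng.standard_normal((N, N)))[0]
    A = Q @ np.diag(lam) @ Q.T      # positive, eigenvalue list lam

    # B = E(A) is the diagonal part; mu is its eigenvalue list.
    mu = np.sort(np.diag(A))[::-1]

    print(np.all(np.cumsum(mu) <= np.cumsum(lam) + 1e-12))  # (4.15)
    print(np.isclose(mu.sum(), lam.sum()))                  # (4.16)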

Now we show the other implication (⇐). Assume that the eigenvalue lists $\mu, \lambda$ of $B, A$ satisfy the majorization conditions (4.15) and (4.16). Then $B \in B(H)$ is a compact positive operator that is diagonal with respect to some orthonormal basis $\{e_k\}$, so $Be_k = \mu_k e_k$. By conjugating $A$ with a unitary, we may assume without loss of generality that $Ae_k = \lambda_k e_k$ for all $k$. Let $P$ be the projection onto the closure of the range of $A$,

PH = \overline{\operatorname{span}}\{e_k : \lambda_k \neq 0\}.

Then $A = AP + AP^{\perp} = AP$. By Theorem 4.11 there exists a contraction $L$, $\|L\| \le 1$, such that $E(L^*AL) = B$. Replacing $L$ by $PL$ leaves this equality unchanged (as $PAP = A$), so we may assume $PL = L$. Since $L$ is a contraction, $LL^* = PLL^*P \le \|LL^*\|P \le P$, and hence $P(I - LL^*)P = P - LL^* \ge 0$. We will show that this positive operator is in fact zero. By hypothesis, $\mathrm{tr}(A) = \mathrm{tr}(B)$, so $\mathrm{tr}(L^*AL) = \mathrm{tr}(E(L^*AL)) = \mathrm{tr}(B) = \mathrm{tr}(A)$. Then

0 = \mathrm{tr}(A) - \mathrm{tr}(L^*AL)
= \mathrm{tr}(A^{1/2}PA^{1/2}) - \mathrm{tr}(A^{1/2}LL^*A^{1/2})
= \mathrm{tr}\big(A^{1/2}(P - LL^*)A^{1/2}\big).

As $A^{1/2}(P - LL^*)A^{1/2} \ge 0$ and its trace vanishes, we deduce that

A^{1/2}(P - LL^*)A^{1/2} = 0.

As $A^{1/2}(P - LL^*)A^{1/2} = \big((P - LL^*)^{1/2}A^{1/2}\big)^{*}\big((P - LL^*)^{1/2}A^{1/2}\big)$, we conclude that

(P - LL^*)^{1/2}A^{1/2} = 0,

and so $(P - LL^*)A^{1/2} = 0$. As $Ae_k = \lambda_k e_k$ for all $k$, we have $A^{1/2}e_k = \lambda_k^{1/2}e_k$. So, whenever $\lambda_k \neq 0$,

(P - LL^*)e_k = \lambda_k^{-1/2}(P - LL^*)A^{1/2}e_k = 0,

while for $\lambda_k = 0$ we have $Pe_k = 0$ and $LL^*e_k = PLL^*Pe_k = 0$ as well. Hence $P - LL^* = 0$, and therefore $P = LL^*$, which implies that $L$ is a partial isometry. In particular, $L^*L$ is a projection. Now let us define a unitary operator

U : H \longrightarrow PH \oplus \ker L, \qquad \xi \mapsto L\xi \oplus Q\xi,

where $Q$ is the orthogonal projection onto $\ker L$. Note that $PH = LH$, since $LL^* = P$. For any $\xi \in PH$, $\eta \in \ker L$,

U(L^*\xi + \eta) = L(L^*\xi + \eta) \oplus Q(L^*\xi + \eta)
= LL^*\xi \oplus (QL^*\xi + \eta)
= P\xi \oplus \eta
= \xi \oplus \eta,

using that $L\eta = 0$, $Q\eta = \eta$, and $QL^*\xi = 0$ because $L^*\xi \in \operatorname{ran} L^* = (\ker L)^{\perp}$.

So U is onto. To show that U is unitary we have to show it is isometric. Indeed

\|U\xi\|^2 = \|L\xi \oplus Q\xi\|^2
= \|L\xi\|^2 + \|Q\xi\|^2
= \langle L\xi, L\xi \rangle + \langle Q\xi, Q\xi \rangle
= \langle L^*L\xi, \xi \rangle + \langle Q\xi, \xi \rangle
= \langle (L^*L + Q)\xi, \xi \rangle
= \langle \xi, \xi \rangle
= \|\xi\|^2,

where we used that $L^*L + Q = I$, since $L^*L$ is the projection onto $(\ker L)^{\perp}$;

so $U$ is a surjective isometry, and thus it is unitary. Now let $A_0 = A|_{PH}$; then for any $k$,

\langle (A_0 \oplus 0)Ue_k, Ue_k \rangle = \langle (A_0 \oplus 0)(Le_k \oplus Qe_k), Le_k \oplus Qe_k \rangle
= \langle APLe_k, Le_k \rangle
= \langle L^*PAPLe_k, e_k \rangle
= \langle L^*ALe_k, e_k \rangle.

So $U^*(A_0 \oplus 0)U = L^*AL$. Then

E(U^*(A_0 \oplus 0)U) = E(L^*AL) = B.

The eigenvalue list of $U^*(A_0 \oplus 0)U$ is the same as the eigenvalue list of $A$, so $U^*(A_0 \oplus 0)U \in \mathcal{O}_\lambda = \mathcal{O}(A)$ by Proposition 4.10, and therefore $B = E(U^*(A_0 \oplus 0)U) \in E(\mathcal{O}_\lambda)$.

Chapter 5

Conclusion and Future Work

The goal of this thesis was to better understand the role of majorization in the Schur-Horn Theorem. On the way to understanding this theorem and its generalizations, we studied the relation between majorization and doubly stochastic matrices, including Birkhoff’s Theorem 1.13. We then saw that majorization characterizes the relation between the diagonal of a self-adjoint matrix and its eigenvalues: this is the classical Schur-Horn Theorem 2.17. We used the Schur-Horn Theorem to prove the Carpenter Theorem in both the finite (CT-6, Theorem 2.12) and the infinite-dimensional cases (Theorem 3.8). Our main contribution is a new proof of Theorem 3.8, which is shorter and, we think, conceptually clearer than Kadison’s.

In the infinite-dimensional setting, we considered A. Neumann’s extension of the notion of majorization to $\ell^\infty(\mathbb{N})$. With the right generalization of majorization, one obtains Neumann’s result, Theorem 4.7. This result opened the way for extensions in several directions. In particular, W. Arveson and R.V. Kadison obtained a Schur-Horn Theorem for positive trace-class operators, Theorem 4.13, in which the closure on the left-hand side can be dispensed with. Arveson and

Kadison also studied the Carpenter Theorem in II1-factors [2], but were not able to

prove an analog of their trace-class result. Their conjecture was first considered by Argerami and Massey [1], who proved a weak version (i.e., with closure, à la Neumann), and it was finally settled a few months ago by M. Ravichandran [13]. Possible future work in this area would be to extend Theorem 3.8 to a Schur-Horn Theorem, both in the atomic case (directly generalizing Theorem 3.8) and in the

II1-case (using the techniques in Ravichandran’s recent proof [13]).

Bibliography

[1] M. Argerami and P. Massey. A Schur-Horn theorem in II1 factors. Indiana Univ. Math. J., 56(5):2051–2059, 2007.

[2] William Arveson. Diagonals of normal operators with finite spectrum. Proc. Natl. Acad. Sci. USA, 104(4):1152–1158 (electronic), 2007.

[3] William Arveson and Richard V. Kadison. Diagonals of self-adjoint operators. In Operator theory, operator algebras, and applications, volume 414 of Contemp. Math., pages 247–263. Amer. Math. Soc., Providence, RI, 2006.

[4] Rajendra Bhatia. Matrix analysis, volume 169 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1997.

[5] Edward G. Effros. Why the circle is connected: an introduction to quantized topology. Math. Intelligencer, 11(1):27–34, 1989.

[6] Alfred Horn. Doubly stochastic matrices and the diagonal of a rotation matrix. Amer. J. Math., 76:620–630, 1954.

[7] Roger A. Horn and Charles R. Johnson. Matrix analysis. Cambridge University Press, Cambridge, 1990. Corrected reprint of the 1985 original.

[8] Richard V. Kadison. The Pythagorean theorem. I. The finite case. Proc. Natl. Acad. Sci. USA, 99(7):4178–4184 (electronic), 2002.

[9] Richard V. Kadison. The Pythagorean theorem. II. The infinite discrete case. Proc. Natl. Acad. Sci. USA, 99(8):5217–5222 (electronic), 2002.

[10] Richard V. Kadison. Non-commutative conditional expectations and their ap- plications. In Operator algebras, quantization, and noncommutative geometry, volume 365 of Contemp. Math., pages 143–179. Amer. Math. Soc., Providence, RI, 2004.

[11] Andreas Neumann. An infinite-dimensional version of the Schur-Horn convexity theorem. J. Funct. Anal., 161(2):418–451, 1999.

[12] Robert T. Powers and Derek W. Robinson. An index for continuous semigroups of ∗-endomorphisms of B(H). J. Funct. Anal., 84(1):85–96, 1989.

[13] Mohan Ravichandran. Preprint, arXiv:1209.0909, 2012.

[14] I. Schur. Über eine Klasse von Mittelbildungen mit Anwendungen auf die Determinantentheorie. S.-Ber. Berliner Math. Ges., 22:9–20, 1923.

[15] M. Takesaki. Theory of operator algebras. I, volume 124 of Encyclopaedia of Mathematical Sciences. Springer-Verlag, Berlin, 2002. Reprint of the first (1979) edition, Operator Algebras and Non-commutative Geometry, 5.
